cli,server: add --exclude-log-severities flag to debug zip #165802

Merged
trunk-io[bot] merged 1 commit into cockroachdb:master from aa-joshi:introduce_log_filters
Mar 17, 2026

@aa-joshi aa-joshi commented Mar 16, 2026

Add an --exclude-log-severities flag to debug zip that filters out
log entries by severity server-side. For large clusters, INFO-level
logs can be extremely voluminous and often aren't needed for debugging.
This flag allows operators to exclude specific severity levels
(e.g. --exclude-log-severities=INFO) to reduce both network transfer
and zip file size.

The implementation adds a new exclude_severities repeated field to
the LogFileRequest protobuf message. The LogFile() RPC handler
builds an exclusion set and skips matching entries during the decode
loop. On the CLI side, severity names are validated early in
runDebugZip and converted to proto values before being passed in
each per-node LogFileRequest.
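
The server-side filtering described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: `Severity`, `Entry`, and `filterEntries` are hypothetical stand-ins for the logpb types and the decode loop in the `LogFile()` handler. It assumes the exclusion set is built once per request and that an empty list means no filtering.

```go
package main

import "fmt"

// Severity is a hypothetical stand-in for the logpb severity enum.
type Severity int

const (
	Unknown Severity = iota
	Info
	Warning
	Error
	Fatal
)

// Entry is a hypothetical stand-in for a decoded log entry.
type Entry struct {
	Severity Severity
	Message  string
}

// filterEntries drops entries whose severity appears in exclude.
// An empty exclude list returns the input unchanged, which is what a
// request from an old client (field never set) would produce.
func filterEntries(entries []Entry, exclude []Severity) []Entry {
	if len(exclude) == 0 {
		return entries
	}
	// Build the exclusion set once so each lookup in the loop is O(1).
	excluded := make(map[Severity]struct{}, len(exclude))
	for _, s := range exclude {
		excluded[s] = struct{}{}
	}
	kept := make([]Entry, 0, len(entries))
	for _, e := range entries {
		if _, skip := excluded[e.Severity]; skip {
			continue // severity is excluded, skip this entry
		}
		kept = append(kept, e)
	}
	return kept
}

func main() {
	entries := []Entry{
		{Info, "node started"},
		{Warning, "slow disk"},
		{Info, "range split"},
		{Error, "lease transfer failed"},
	}
	for _, e := range filterEntries(entries, []Severity{Info}) {
		fmt.Println(e.Message)
	}
}
```

With `Info` excluded, only the WARNING and ERROR messages are printed.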

Rolling upgrade compatibility is maintained without a version gate:
old servers ignore the unknown protobuf field (graceful degradation),
and old clients send an empty field (no filtering).

Epic: none
Fixes: CRDB-61602

Release note (cli change): Added --exclude-log-severities flag to
cockroach debug zip that filters log entries by severity server-side.
For example, --exclude-log-severities=INFO excludes all INFO-level log
entries from the collected log files, which can significantly reduce
zip file size for large clusters. Valid severity names are INFO,
WARNING, ERROR, and FATAL. The flag accepts a comma-delimited list or
can be specified multiple times.
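
One way a flag can accept both a comma-delimited list and repeated occurrences is sketched below. This is only an illustration built on the standard library `flag` package; the actual CLI wiring in the PR is not shown in this conversation, and `severityList` and `parseArgs` are hypothetical names.

```go
package main

import (
	"flag"
	"fmt"
	"strings"
)

// severityList is a hypothetical flag value that collects severity names.
type severityList []string

// String renders the collected names; it is required by flag.Value.
func (s *severityList) String() string { return strings.Join(*s, ",") }

// Set is called once per occurrence of the flag. It appends every
// comma-separated name, so repeated flags accumulate.
func (s *severityList) Set(v string) error {
	for _, name := range strings.Split(v, ",") {
		if name = strings.TrimSpace(name); name != "" {
			*s = append(*s, name)
		}
	}
	return nil
}

// parseArgs parses args and returns the severity names to exclude.
func parseArgs(args []string) ([]string, error) {
	var names severityList
	fs := flag.NewFlagSet("debug-zip-sketch", flag.ContinueOnError)
	fs.Var(&names, "exclude-log-severities", "log severities to exclude")
	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	return names, nil
}

func main() {
	names, err := parseArgs([]string{
		"--exclude-log-severities=INFO,WARNING",
		"--exclude-log-severities=ERROR",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(names)
}
```

Both forms end up in the same slice, so later validation can treat them uniformly.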

trunk-io bot commented Mar 16, 2026

😎 Merged successfully - details.

@cockroach-teamcity

This change is Reviewable

@aa-joshi aa-joshi force-pushed the introduce_log_filters branch from 14e4cbb to 94b867b Compare March 16, 2026 15:28
@aa-joshi aa-joshi marked this pull request as ready for review March 16, 2026 15:29
@aa-joshi aa-joshi requested review from a team as code owners March 16, 2026 15:29
@aa-joshi aa-joshi requested review from Abhinav1299, arjunmahishi and kyle-a-wong and removed request for a team March 16, 2026 15:29
Comment on lines +221 to +230
    // Validate excluded log severities.
    for _, name := range zipCtx.excludeLogSeverities {
        if _, ok := logpb.SeverityByName(name); !ok {
            return errors.Newf(
                "unknown log severity %q; valid values are INFO, WARNING, ERROR, FATAL",
                name,
            )
        }
    }

Contributor

NIT: We parse the severity names again in zip_per_node.go to convert them to their proto values. Can we do the conversion in one place only? Doing it here, right after validation, seems logical, since the conversion in zip_per_node.go runs once for every node. We can store the converted severities in ZipCtx.

Contributor Author

I thought about it. With your suggestion, ZipCtx would carry a redundant declaration: one array of strings and another array of log severities. I am trying to avoid that here.

Comment on lines +223 to +224
    if _, ok := logpb.SeverityByName(name); !ok {
        return errors.Newf(
Contributor

Should we also handle case sensitivity in this validation? A user can pass --exclude-log-severity=info or --exclude-log-severity=INFO.

Contributor Author

SeverityByName converts the name to upper case and then performs the validation.
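
The behavior described in the reply above can be illustrated with a small sketch. It is not the real `logpb.SeverityByName`: `lookupSeverity`, `Severity`, and the name table are hypothetical, and the only assumption taken from the reply is that the name is upper-cased before the lookup.

```go
package main

import (
	"fmt"
	"strings"
)

// Severity is a hypothetical stand-in for the logpb severity enum.
type Severity int

const (
	Info Severity = iota + 1
	Warning
	Error
	Fatal
)

// severityNames maps canonical upper-case names to severity values.
var severityNames = map[string]Severity{
	"INFO":    Info,
	"WARNING": Warning,
	"ERROR":   Error,
	"FATAL":   Fatal,
}

// lookupSeverity is a hypothetical helper, not the real
// logpb.SeverityByName. It upper-cases the input before the lookup,
// so "info", "Info", and "INFO" all resolve to the same value.
func lookupSeverity(name string) (Severity, bool) {
	s, ok := severityNames[strings.ToUpper(name)]
	return s, ok
}

func main() {
	for _, name := range []string{"info", "Warning", "FATAL", "verbose"} {
		_, ok := lookupSeverity(name)
		fmt.Printf("%-8s valid=%t\n", name, ok)
	}
}
```

Under this scheme only "verbose" is rejected; the mixed-case spellings of known severities are accepted.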

Comment on lines +1891 to +1893
<PRE>

</PRE>
Contributor

This PRE block seems to be empty. Was that intentional?

Contributor Author

It is intentional, to avoid text wrapping in the --help output.

}

    ZipExcludeLogSeverity = FlagInfo{
        Name: "exclude-log-severity",
Contributor

@Abhinav1299 Abhinav1299 Mar 17, 2026

NIT: Can we change the flag name to exclude-log-severities?

Contributor Author

DONE

    if _, ok := logpb.SeverityByName(name); !ok {
        return errors.Newf(
            "unknown log severity %q; valid values are INFO, WARNING, ERROR, FATAL",
            name,
Contributor

There are currently 7 possible severity levels. Should we add a check to reject unwanted severities, such as UNKNOWN and NONE?

Contributor Author

UNKNOWN and NONE are log severities used for unexpected and rare events, where the severity of an entry cannot be determined. In that situation, it makes sense to always include entries with those severities.

@aa-joshi aa-joshi force-pushed the introduce_log_filters branch from 94b867b to f3e946d Compare March 17, 2026 08:18
@aa-joshi aa-joshi left a comment

@aa-joshi made 5 comments and resolved 1 discussion.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on Abhinav1299, arjunmahishi, and kyle-a-wong).

@Abhinav1299 Abhinav1299 left a comment

LGTM
A few NITs:

  • Can we update the commit and PR description to use --exclude-log-severities instead of --exclude-log-severity?
  • Can you check the failing CI check?

@aa-joshi aa-joshi changed the title cli,server: add --exclude-log-severity flag to debug zip cli,server: add --exclude-log-severities flag to debug zip Mar 17, 2026
@aa-joshi aa-joshi left a comment

TFTR!

@aa-joshi made 1 comment.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on Abhinav1299, arjunmahishi, and kyle-a-wong).

@trunk-io trunk-io bot merged commit fa98ab3 into cockroachdb:master Mar 17, 2026
51 of 56 checks passed