cli,server: add --exclude-log-severities flag to debug zip#165802
cli,server: add --exclude-log-severities flag to debug zip#165802trunk-io[bot] merged 1 commit intocockroachdb:masterfrom
Conversation
|
😎 Merged successfully - details. |
14e4cbb to
94b867b
Compare
| // Validate excluded log severities. | ||
| for _, name := range zipCtx.excludeLogSeverities { | ||
| if _, ok := logpb.SeverityByName(name); !ok { | ||
| return errors.Newf( | ||
| "unknown log severity %q; valid values are INFO, WARNING, ERROR, FATAL", | ||
| name, | ||
| ) | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
NIT: We are doing the same parsing again in zip_per_node.go file for converting severity names to its proto values. Can we convert it at one place only? Just after validation here seems logical as other conversion is done in zip_per_node file which is for every node. We can store the severity detail in ZipCtx here.
There was a problem hiding this comment.
I thought about it. With your suggestion, the ZipCtx would have redundant declaration: one for array of string and another one for array of logSeverity. I am trying to avoid it here.
| if _, ok := logpb.SeverityByName(name); !ok { | ||
| return errors.Newf( |
There was a problem hiding this comment.
Should we also add a validation for case sensitivity here? User can input --exclude-log-severity=info or --exclude-log-severity=INFO
There was a problem hiding this comment.
SeverityByName func is converting to upper case and then performing validation.
| <PRE> | ||
|
|
||
| </PRE> |
There was a problem hiding this comment.
This PRE block seems to be empty, was it intentional?
There was a problem hiding this comment.
It is intentional to avoid text wrapping in --help command.
pkg/cli/cliflags/flags.go
Outdated
| } | ||
|
|
||
| ZipExcludeLogSeverity = FlagInfo{ | ||
| Name: "exclude-log-severity", |
There was a problem hiding this comment.
NIT: can we change the flag name to exclude-log-severities
| if _, ok := logpb.SeverityByName(name); !ok { | ||
| return errors.Newf( | ||
| "unknown log severity %q; valid values are INFO, WARNING, ERROR, FATAL", | ||
| name, |
There was a problem hiding this comment.
Currently there are 7 severity levels possible, should we add a check to limit unwanted severities? Like UNKNOWN and NONE ?
There was a problem hiding this comment.
The UNKNOWN and NONE log severities for unexpected and rare events. It means that we can not determine the log severity. In such situation, it make sense to include log of such severities all the time.
Add an `--exclude-log-severities` flag to `debug zip` that filters out log entries by severity server-side. For large clusters, INFO-level logs can be extremely voluminous and often aren't needed for debugging. This flag allows operators to exclude specific severity levels (e.g. `--exclude-log-severities=INFO`) to reduce both network transfer and zip file size. The implementation adds a new `exclude_severities` repeated field to the `LogFileRequest` protobuf message. The `LogFile()` RPC handler builds an exclusion set and skips matching entries during the decode loop. On the CLI side, severity names are validated early in `runDebugZip` and converted to proto values before being passed in each per-node `LogFileRequest`. Rolling upgrade compatibility is maintained without a version gate: old servers ignore the unknown protobuf field (graceful degradation), and old clients send an empty field (no filtering). Epic: none Fixes: CRDB-61602 Release note (cli change): Added `--exclude-log-severities` flag to `cockroach debug zip` that filters log entries by severity server-side. For example, `--exclude-log-severities=INFO` excludes all INFO-level log entries from the collected log files, which can significantly reduce zip file size for large clusters. Valid severity names are INFO, WARNING, ERROR, and FATAL. The flag accepts a comma-delimited list or can be specified multiple times.
94b867b to
f3e946d
Compare
aa-joshi
left a comment
There was a problem hiding this comment.
@aa-joshi made 5 comments and resolved 1 discussion.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on Abhinav1299, arjunmahishi, and kyle-a-wong).
| if _, ok := logpb.SeverityByName(name); !ok { | ||
| return errors.Newf( |
There was a problem hiding this comment.
SeverityByName func is converting to upper case and then performing validation.
| if _, ok := logpb.SeverityByName(name); !ok { | ||
| return errors.Newf( | ||
| "unknown log severity %q; valid values are INFO, WARNING, ERROR, FATAL", | ||
| name, |
There was a problem hiding this comment.
The UNKNOWN and NONE log severities for unexpected and rare events. It means that we can not determine the log severity. In such situation, it make sense to include log of such severities all the time.
| // Validate excluded log severities. | ||
| for _, name := range zipCtx.excludeLogSeverities { | ||
| if _, ok := logpb.SeverityByName(name); !ok { | ||
| return errors.Newf( | ||
| "unknown log severity %q; valid values are INFO, WARNING, ERROR, FATAL", | ||
| name, | ||
| ) | ||
| } | ||
| } | ||
|
|
There was a problem hiding this comment.
I thought about it. With your suggestion, the ZipCtx would have redundant declaration: one for array of string and another one for array of logSeverity. I am trying to avoid it here.
pkg/cli/cliflags/flags.go
Outdated
| } | ||
|
|
||
| ZipExcludeLogSeverity = FlagInfo{ | ||
| Name: "exclude-log-severity", |
| <PRE> | ||
|
|
||
| </PRE> |
There was a problem hiding this comment.
It is intentional to avoid text wrapping in --help command.
Abhinav1299
left a comment
There was a problem hiding this comment.
LGTM
few NITs:
- Can we update the commit and PR description to have
--exclude-log-severitiesinstead of--exclude-log-severity. - Can you check the failing CI check.
aa-joshi
left a comment
There was a problem hiding this comment.
TFTR!
@aa-joshi made 1 comment.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on Abhinav1299, arjunmahishi, and kyle-a-wong).
Add an
--exclude-log-severitiesflag todebug zipthat filters outlog entries by severity server-side. For large clusters, INFO-level
logs can be extremely voluminous and often aren't needed for debugging.
This flag allows operators to exclude specific severity levels
(e.g.
--exclude-log-severities=INFO) to reduce both network transferand zip file size.
The implementation adds a new
exclude_severitiesrepeated field tothe
LogFileRequestprotobuf message. TheLogFile()RPC handlerbuilds an exclusion set and skips matching entries during the decode
loop. On the CLI side, severity names are validated early in
runDebugZipand converted to proto values before being passed ineach per-node
LogFileRequest.Rolling upgrade compatibility is maintained without a version gate:
old servers ignore the unknown protobuf field (graceful degradation),
and old clients send an empty field (no filtering).
Epic: none
Fixes: CRDB-61602
Release note (cli change): Added
--exclude-log-severitiesflag tocockroach debug zipthat filters log entries by severity server-side.For example,
--exclude-log-severities=INFOexcludes all INFO-level logentries from the collected log files, which can significantly reduce
zip file size for large clusters. Valid severity names are INFO,
WARNING, ERROR, and FATAL. The flag accepts a comma-delimited list or
can be specified multiple times.