-
Notifications
You must be signed in to change notification settings - Fork 573
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cloud_storage/ducktape: allow bad logs in scope #14143
cloud_storage/ducktape: allow bad logs in scope #14143
Conversation
/ci-repeat 10 |
f325a59
to
d21bc72
Compare
/ci-repeat 10 |
d21bc72
to
226650f
Compare
f026336
to
a648865
Compare
a648865
to
a8ea3ed
Compare
leaving this PR up as the tooling could be useful, but removed the changes for #14132 because it might need change in the C++ code. |
Usually a test will ignore bad log strings throughout the test run. However there are instances where we know exactly when the bad log lines are acceptable but not elsewhere. This change adds a bad log line matcher which is initialized with the expression to match but no timestamps. Later the test can use the new decorator added here to mark the time ranges where the expression is acceptable. When searching for bad log lines to ignore, both the expression and the datetime range must match the log entry for it to be allowed.
a8ea3ed
to
109a1de
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mixed feelings. Tests that make assumptions about time/clocks are generally flaky and should be avoided. This one makes even more assumptions. I.e. that clocks are synchronised to millisecond precision across machines (in CDT ducktape runs on distinct machine from Redpanda). Truncating time and adding tolerances I think will make it just worse. I'd prefer the following alternatives over this one:
- Ignore specific error loglines for the entire test
- Change log level to WARN, quoting our internal logging guide (Logging in Redpanda Core on the wiki): "Issues with external systems (S3, Kafka Clients) are not considered errors: they are warnings."
Sorry, I missed the review. The idea here was to add some sort of tolerance to a specific error in a scope, which is unfortunately tied to time. This is useful in cases where for example we toggle a feature on and off which causes an error. So this is better than the alternative, instead of saying the error is acceptable for the entire run of the test, we are saying it is only acceptable for a shorter time span. In worst case scenario we may get false negatives due to the time differences. However I do not see the utility of this any more as the original issue will not be served by this change, so I will close the PR. |
A utility is added which allows logs to be ignored in a specific time frame.
The allow list is added to the cluster decorator:
Then within that test:
Backports Required
Release Notes