TRT-2673: Filter NoExecuteTaintManager disruption from backend-disruption.json #31248
@smg247: This pull request references TRT-2673 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "5.0.0" version, but no target version was set.
Walkthrough: This PR extracts disruption interval filtering logic into a shared module to eliminate duplication.
The NoExecuteTaintManager serial test applies NoExecute taints to worker nodes, evicting pods without matching tolerations. This causes ~20s of expected disruption for backends like image-registry when both replicas happen to be on the tainted nodes.

PR openshift#30855 added a filter to exclude this disruption from JUnit test evaluation, but the filter was not applied to the disruption serializer that writes backend-disruption.json. This file feeds disruption dashboards, so the expected disruption continued to appear there.

Extract FilterOutKnownDisruptiveTestIntervals into a new disruptionfilter package (to avoid import cycles with the utility package) and apply it in the disruption serializer's WriteContentToStorage before computing backend-disruption.json. Disruption that overlaps with NoExecuteTaintManager test windows will no longer appear in the serialized data or on dashboards.

This means disruption during known-disruptive serial tests is fully excluded from both JUnit evaluation and dashboard data. If finer-grained visibility is desired in the future (e.g. showing expected disruption in a different color on the Sippy intervals chart), that would be a separate effort to annotate rather than filter these intervals.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
c81a4a6 to 7e8ff9d
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: neisw, smg247
/retest-required
Scheduling required tests:
/retest-required
/verified bypass
@smg247: all tests passed!
The NoExecuteTaintManager serial test applies NoExecute taints to worker nodes, evicting pods without matching tolerations. This causes ~20s of expected disruption for backends like image-registry when both replicas happen to be on the tainted nodes.
#30855 added a filter to exclude this disruption from JUnit test evaluation, but the filter was not applied to the disruption serializer that writes backend-disruption.json. This file feeds disruption dashboards, so the expected disruption continued to appear there.
Move FilterOutKnownDisruptiveTestIntervals to the shared utility package and apply it in the disruption serializer's WriteContentToStorage before computing backend-disruption.json. Disruption that overlaps with NoExecuteTaintManager test windows will no longer appear in the serialized data or on dashboards.
This means disruption during known-disruptive serial tests is fully excluded from both JUnit evaluation and dashboard data. If finer-grained visibility is desired in the future (e.g. showing expected disruption in a different color on the Sippy intervals chart), that would be a separate effort to annotate rather than filter these intervals.
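For readers unfamiliar with the filtering step, the idea can be sketched roughly as follows. This is a minimal illustration, not the actual implementation in this PR: the `Interval` type, the `"Disruption"` and `"E2ETest"` source labels, and the helper names `overlaps`, `filterOutDisruptionDuring`, and `sampleIntervals` are all hypothetical stand-ins, and the real FilterOutKnownDisruptiveTestIntervals may match test windows differently.

```go
package main

import (
	"fmt"
	"time"
)

// Interval is a hypothetical, simplified stand-in for a monitor interval.
type Interval struct {
	Source string // "Disruption" or "E2ETest" (illustrative labels only)
	Name   string // backend name or test name
	From   time.Time
	To     time.Time
}

// overlaps reports whether two intervals share any span of time.
func overlaps(a, b Interval) bool {
	return a.From.Before(b.To) && b.From.Before(a.To)
}

// filterOutDisruptionDuring drops disruption intervals that overlap any run
// window of the named known-disruptive test. All other intervals, including
// the test windows themselves, pass through unchanged.
func filterOutDisruptionDuring(intervals []Interval, disruptiveTest string) []Interval {
	var windows []Interval
	for _, i := range intervals {
		if i.Source == "E2ETest" && i.Name == disruptiveTest {
			windows = append(windows, i)
		}
	}

	kept := make([]Interval, 0, len(intervals))
	for _, i := range intervals {
		drop := false
		if i.Source == "Disruption" {
			for _, w := range windows {
				if overlaps(i, w) {
					drop = true
					break
				}
			}
		}
		if !drop {
			kept = append(kept, i)
		}
	}
	return kept
}

// sampleIntervals builds made-up data: one test window, one disruption inside
// it, and one disruption well after it.
func sampleIntervals() []Interval {
	t0 := time.Date(2025, 1, 1, 10, 0, 0, 0, time.UTC)
	return []Interval{
		{Source: "E2ETest", Name: "NoExecuteTaintManager", From: t0, To: t0.Add(5 * time.Minute)},
		{Source: "Disruption", Name: "image-registry", From: t0.Add(1 * time.Minute), To: t0.Add(80 * time.Second)},
		{Source: "Disruption", Name: "kube-api", From: t0.Add(10 * time.Minute), To: t0.Add(605 * time.Second)},
	}
}

func main() {
	for _, i := range filterOutDisruptionDuring(sampleIntervals(), "NoExecuteTaintManager") {
		fmt.Println(i.Source, i.Name)
	}
}
```

In this sketch the filter would run on the interval list before anything is serialized, so the same filtered view could feed both JUnit evaluation and backend-disruption.json. Dropping the intervals outright, rather than annotating them, matches the trade-off described above.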
It is still unclear why we are only noticing this on 4.22 in GCP, but this should keep it from appearing at all...