Fix flake for volume detach metrics #53664
Conversation
@gnufied: Adding do-not-merge/release-note-label-needed because the release note process has not been followed. One of the following labels is required: "release-note", "release-note-action-required", or "release-note-none". Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/release-note-none
test/e2e/storage/volume_metrics.go
@@ -100,7 +100,7 @@ var _ = SIGDescribe("[Serial] Volume metrics", func() {
 	backoff := wait.Backoff{
 		Duration: 10 * time.Second,
 		Factor:   1.2,
-		Steps:    3,
+		Steps:    8,
this gives ~40s for detach, is it enough to stop flakes? ebs is given max 21 steps.
isn't that ~80s @rootfs ?
Actually, because it is exponentially backing off, the actual time given is 260 seconds.
Plugging these numbers into ExponentialBackoff, the max duration is 36s and the entire duration is 154s. This should be sufficient for most cases. But since test flakes are at the mercy of cloud providers, let's sync with the same max steps as ebs.
ack done.
// Usually a pod deletion does not mean immediate volume detach
// we will have to retry to verify volume_detach metrics
_, detachMetricFound := updatedStorageMetrics["volume_detach"]
if metricCount < 3 || !detachMetricFound {
Is the number 3 obvious? It isn't to me, but I am not knowledgeable in metrics. Perhaps this should be a descriptive constant?
In the test below I am measuring 3 metrics, and hence I expect at least 3 metrics to be present in the result. But because the volume_detach metric is generated last, I am separately checking for it in case the code generates the other metrics first.
So 3 isn't really a magic number - just the number of metrics the test is checking. I can document it.
2cbaa3a to 6f0c98b
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: gnufied, rootfs. Associated issue: 53596. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these OWNERS Files:
You can indicate your approval by writing
/test pull-kubernetes-e2e-gce-bazel
Automatic merge from submit-queue (batch tested with PRs 51677, 53690, 53025, 53717, 53664). If you want to cherry-pick this change to another branch, please follow the instructions here.
Fixes #53596
cc @kubernetes/sig-storage-pr-reviews @msau42