Fix flake for volume detach metrics #53664

Merged

gnufied
Member

@gnufied gnufied commented Oct 10, 2017

Fixes #53596

cc @kubernetes/sig-storage-pr-reviews @msau42

@k8s-ci-robot
Contributor

@gnufied: Adding do-not-merge/release-note-label-needed because the release note process has not been followed.

One of the following labels is required "release-note", "release-note-action-required", or "release-note-none".
Please see: https://github.com/kubernetes/community/blob/master/contributors/devel/pull-requests.md#write-release-notes-if-needed.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Oct 10, 2017
@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 10, 2017
@gnufied
Member Author

gnufied commented Oct 10, 2017

/release-note-none

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Oct 10, 2017
@@ -100,7 +100,7 @@ var _ = SIGDescribe("[Serial] Volume metrics", func() {
 backoff := wait.Backoff{
 	Duration: 10 * time.Second,
 	Factor: 1.2,
-	Steps: 3,
+	Steps: 8,
Contributor

This gives ~40s for detach. Is that enough to stop the flakes? EBS is given a max of 21 steps.

Member Author

Isn't that ~80s, @rootfs?

Member Author

Actually, because the backoff is exponential, the actual time given is 260 seconds.

Contributor

Plugging these numbers into ExponentialBackoff, the max duration is 36s and the entire duration is 154s. This should be sufficient for most cases. But since test flakes are at the mercy of cloud providers, let's sync with the same max steps as in EBS.

Member Author

Ack, done.

// Usually a pod deletion does not mean immediate volume detach
// we will have to retry to verify volume_detach metrics
_, detachMetricFound := updatedStorageMetrics["volume_detach"]
if metricCount < 3 || !detachMetricFound {
Contributor

Is the number 3 obvious? Not to me, but I am not knowledgeable in metrics. Perhaps this should be a descriptive constant?

Member Author

In the test below I am measuring 3 metrics, and hence I expect at least 3 metrics to be present in the result. But because the volume_detach metric is generated last, I check for it separately in case the code generates other metrics.

So 3 isn't really a magic number, just the number of metrics the test is checking. I can document it.
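
The reviewer's suggestion of a descriptive constant could look something like the sketch below. It assumes the metrics arrive as a map from metric name to count and that `metricCount` in the test is the size of that map. `expectedMetricCount`, `detachMetricName`, and `detachMetricsReady` are hypothetical names, not identifiers from the test file, and the sample metric names in `main` other than `volume_detach` are illustrative.

```go
package main

import "fmt"

// expectedMetricCount is the number of storage metrics the test verifies.
// Hypothetical name; the test under review uses the literal 3.
const expectedMetricCount = 3

// detachMetricName is the metric that is generated last, so its presence
// is checked separately from the overall count.
const detachMetricName = "volume_detach"

// detachMetricsReady reports whether enough metrics are present and the
// detach metric is among them. It is the negation of the retry condition
// in the diff (metricCount < 3 || !detachMetricFound).
func detachMetricsReady(metrics map[string]int64) bool {
	_, detachFound := metrics[detachMetricName]
	return len(metrics) >= expectedMetricCount && detachFound
}

func main() {
	partial := map[string]int64{"volume_provision": 1, "volume_attach": 1}
	fmt.Println(detachMetricsReady(partial)) // prints false

	complete := map[string]int64{
		"volume_provision": 1,
		"volume_attach":    1,
		"volume_detach":    1,
	}
	fmt.Println(detachMetricsReady(complete)) // prints true
}
```

Naming the count keeps the retry condition readable and puts the explanation of why the detach metric is checked separately next to the code that relies on it.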

@gnufied gnufied force-pushed the fix-volume-detach-metric-flake branch from 2cbaa3a to 6f0c98b Compare October 11, 2017 13:45
@rootfs
Contributor

rootfs commented Oct 11, 2017

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 11, 2017
@k8s-github-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: gnufied, rootfs

Associated issue: 53596

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@gnufied
Member Author

gnufied commented Oct 11, 2017

/test pull-kubernetes-e2e-gce-bazel

@k8s-github-robot

Automatic merge from submit-queue (batch tested with PRs 51677, 53690, 53025, 53717, 53664). If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit 34ceb5b into kubernetes:master Oct 11, 2017
@jpbetz jpbetz added this to the v1.8 milestone Oct 11, 2017
@jpbetz jpbetz added cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. cherrypick-candidate labels Oct 11, 2017
@jpbetz jpbetz removed cherry-pick-approved Indicates a cherry-pick PR into a release branch has been approved by the release branch manager. cherrypick-candidate labels Oct 11, 2017
@jpbetz jpbetz removed this from the v1.8 milestone Oct 11, 2017
k8s-github-robot pushed a commit that referenced this pull request Oct 13, 2017
…-origin-release-1.8

Automatic merge from submit-queue.

Automated cherry pick of #53664

Cherry pick of #53664 on release-1.8.

#53664: Fix flake for volume detach metrics
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files.

6 participants