Update test/e2e/storage for new GetReadySchedulableNodes stuff #83480

Merged
merged 1 commit into from
Oct 9, 2019

danwinship
Contributor

What type of PR is this?
/kind cleanup

What this PR does / why we need it:
Spinoff of #82291 with just the changes to test/e2e/storage/, which were causing pull-kubernetes-e2e-gce-storage-slow to fail for mysterious reasons...

Does this PR introduce a user-facing change?:

NONE

/cc

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Oct 3, 2019
@k8s-ci-robot
Contributor

@danwinship: GitHub didn't allow me to request PR reviews from the following users: danwinship.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.


@k8s-ci-robot k8s-ci-robot added kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. area/test sig/testing Categorizes an issue or PR as relevant to SIG Testing. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Oct 3, 2019
@danwinship
Contributor Author

/retest

@danwinship
Contributor Author

/test pull-kubernetes-e2e-gce-storage-slow

@oomichi
Member

oomichi commented Oct 4, 2019

/cc @oomichi

Member

@oomichi oomichi left a comment

Curious why pull-kubernetes-e2e-gce-storage-slow failed.

/test pull-kubernetes-e2e-gce-storage-slow

framework.Skipf("Requires at least %d node", 1)
}
-nodeInfo = TestContext.NodeMapper.GetNodeInfo(nodes.Items[0].Name)
+nodeInfo = GetReadySchedulableRandomNodeInfo()
Member

nit: The original code skipped the test if there were no nodes, whereas the changed code fails on gomega.Expect(nodesInfo).NotTo(gomega.BeEmpty()) in GetReadySchedulableRandomNodeInfo().
I think the new behavior is fine because it matches the other test code.
This is just a note in case somebody runs into the new behavior.
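
To make the skip-versus-fail difference concrete, here is a minimal, self-contained Go sketch. It does not use the real e2e framework: `oldSelect`, `newSelect`, and the `outcome` type are hypothetical stand-ins, with `skipped` modeling `framework.Skipf` and `failed` modeling a failed `gomega.Expect(...).NotTo(gomega.BeEmpty())` assertion.

```go
package main

import (
	"fmt"
	"math/rand"
)

// outcome is a hypothetical stand-in for how a test case ends.
type outcome string

const (
	passed  outcome = "passed"
	skipped outcome = "skipped" // models framework.Skipf
	failed  outcome = "failed"  // models a failed gomega assertion
)

// oldSelect models the original pattern: skip when there are no
// usable nodes, otherwise always take the first node in the list.
func oldSelect(nodes []string) (string, outcome) {
	if len(nodes) < 1 {
		return "", skipped
	}
	return nodes[0], passed
}

// newSelect models the new pattern: an empty node list is treated as
// a failure, otherwise a random node is returned.
func newSelect(nodes []string) (string, outcome) {
	if len(nodes) == 0 {
		return "", failed
	}
	return nodes[rand.Intn(len(nodes))], passed
}

func main() {
	_, before := oldSelect(nil)
	_, after := newSelect(nil)
	fmt.Println("no nodes, old code:", before)
	fmt.Println("no nodes, new code:", after)
}
```

On a cluster with at least one ready node both variants proceed normally; the only observable difference in this sketch is how an empty node list is reported.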

@@ -72,7 +72,6 @@ var _ = utils.SIGDescribe("vsphere cloud provider stress [Feature:vsphere]", fun
gomega.Expect(instances > len(scNames)).To(gomega.BeTrue(), "VCP_STRESS_INSTANCES should be greater than 3 to utilize all 4 types of storage classes")

iterations = GetAndExpectIntEnvVar(VCPStressIterations)
framework.ExpectNoError(err, "Error Parsing VCP_STRESS_ITERATIONS")
Member

nice catch

-maxLen := len(nodes.Items)
-if maxLen > maxNodes {
-	maxLen = maxNodes
-}
Member

Is it fine to remove this capping logic?

Contributor Author

The new code uses the new GetBoundedReadySchedulableNodes, which does the capping for you
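
As a rough illustration of what "does the capping for you" means, here is a standalone sketch of a bounded helper. `boundedNodes` is a hypothetical function operating on plain strings rather than the real node list type, and shuffling before truncating is an assumption about how such a helper might avoid always returning the same nodes; it is not copied from the framework.

```go
package main

import (
	"fmt"
	"math/rand"
)

// boundedNodes is a hypothetical helper that returns at most maxNodes
// entries from nodes. maxNodes is assumed to be non-negative.
func boundedNodes(nodes []string, maxNodes int) []string {
	// Work on a copy so the caller's slice is never reordered.
	out := append([]string(nil), nodes...)
	if len(out) > maxNodes {
		// Shuffle first so the cap does not always keep the same nodes.
		rand.Shuffle(len(out), func(i, j int) { out[i], out[j] = out[j], out[i] })
		out = out[:maxNodes]
	}
	return out
}

func main() {
	nodes := []string{"node-a", "node-b", "node-c", "node-d"}
	fmt.Println(len(boundedNodes(nodes, 2)))  // list is capped
	fmt.Println(len(boundedNodes(nodes, 10))) // list is returned whole
}
```

With the cap inside the helper, callers no longer need their own `maxLen` bookkeeping and can simply range over the returned list.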

@danwinship
Contributor Author

Curious why pull-kubernetes-e2e-gce-storage-slow failed.

Yeah, it seems to time out, but I couldn't figure out why. (Also, it seems that it doesn't always time out; it passed on the first run.)

@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Oct 5, 2019
@danwinship
Contributor Author

/test pull-kubernetes-e2e-gce-storage-slow

@danwinship
Contributor Author

@oomichi is it possible that pull-kubernetes-e2e-gce-storage-slow is just super flaky? It seems like there are lots of failures in other PR's runs as well...

@oomichi
Member

oomichi commented Oct 8, 2019

@danwinship

@oomichi is it possible that pull-kubernetes-e2e-gce-storage-slow is just super flaky?
It seems like there are lots of failures in other PR's runs as well...

Ah, that makes sense.
I am checking the state of the job in #83623.

@oomichi
Member

oomichi commented Oct 8, 2019

Rerunning to check the current state of the job:

/test pull-kubernetes-e2e-gce-storage-slow

Member

@oomichi oomichi left a comment

/assign @oomichi

-nodeList := framework.GetReadySchedulableNodesOrDie(f.ClientSet)
-node := nodeList.Items[0]
+node, err := e2enode.GetRandomReadySchedulableNode(f.ClientSet)
+framework.ExpectNoError(err)
Member

I think I see why the pull-kubernetes-e2e-gce-storage-slow job failed.
At original line 71, nodeName is always the first node in the list,
and the label is added to that first node at original line 82.
In addition, flexvolume is installed on the node that is selected here.

After this change, the nodes at original lines 71 and 128 are each selected randomly,
so if they differ, the test fails.
The original code always selected the same node at both of these lines.

So it would be better to change this line and line 69 to something like:

nodeList, err := e2enode.GetReadySchedulableNodes(f.ClientSet)
framework.ExpectNoError(err)
node := nodeList.Items[0]

Contributor Author

Aha! I originally suspected it had something to do with the introduction of node randomization, but I missed that there was a dependency between the two GetReadySchedulableNodes calls in this file.

I kept the use of GetRandomReadySchedulableNode in the BeforeEach and just made it save the selected node and use the same one again in the test.
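
The shape of that fix can be sketched in isolation. The code below is a simplified model, not the actual test: `suite`, `beforeEach`, `runTest`, and `runTestRepick` are hypothetical names, `randomNode` stands in for `e2enode.GetRandomReadySchedulableNode`, and the `prepared` map stands in for the labeling and flexvolume installation done during setup.

```go
package main

import (
	"fmt"
	"math/rand"
)

// randomNode is a hypothetical stand-in for a random node selector.
func randomNode(nodes []string) string {
	return nodes[rand.Intn(len(nodes))]
}

// suite models state shared between setup and the test body.
type suite struct {
	nodes    []string
	node     string          // selected once in beforeEach and reused
	prepared map[string]bool // nodes that setup has labeled/installed on
}

// beforeEach picks one random node, prepares it, and saves the choice.
func (s *suite) beforeEach() {
	s.node = randomNode(s.nodes)
	s.prepared = map[string]bool{s.node: true}
}

// runTest reuses the saved node, so it always lands on a prepared node.
func (s *suite) runTest() bool {
	return s.prepared[s.node]
}

// runTestRepick models the flaky variant: it selects a node again and
// may land on one that setup never prepared.
func (s *suite) runTestRepick() bool {
	return s.prepared[randomNode(s.nodes)]
}

func main() {
	s := &suite{nodes: []string{"node-a", "node-b", "node-c"}}
	misses := 0
	for i := 0; i < 1000; i++ {
		s.beforeEach()
		if !s.runTest() {
			panic("saved node should always be prepared")
		}
		if !s.runTestRepick() {
			misses++
		}
	}
	fmt.Println("re-pick variant missed the prepared node", misses, "times out of 1000")
}
```

In this model the re-pick variant only succeeds when both random selections happen to agree, which matches a job that times out on some runs and passes on others.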

Member

Thanks for updating, that makes sense :-)

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 9, 2019
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Oct 9, 2019
@danwinship danwinship changed the title WIP Update test/e2e/storage for new GetReadySchedulableNodes stuff Update test/e2e/storage for new GetReadySchedulableNodes stuff Oct 9, 2019
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 9, 2019
@danwinship
Contributor Author

/test pull-kubernetes-e2e-gce-storage-slow

@oomichi
Member

oomichi commented Oct 9, 2019

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Oct 9, 2019
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danwinship, oomichi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 9, 2019
@k8s-ci-robot k8s-ci-robot merged commit 0a98ccb into kubernetes:master Oct 9, 2019
@k8s-ci-robot k8s-ci-robot added this to the v1.17 milestone Oct 9, 2019
@danwinship danwinship deleted the getnodes-storage branch October 9, 2019 19:15
ohsewon pushed a commit to ohsewon/kubernetes that referenced this pull request Oct 16, 2019
Update test/e2e/storage for new GetReadySchedulableNodes stuff
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. release-note-none Denotes a PR that doesn't merit a release note. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.

3 participants