Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

try to deflake CR watches in tests #46536

Merged
merged 1 commit into from
Jun 1, 2017

Conversation

deads2k
Copy link
Contributor

@deads2k deads2k commented May 26, 2017

Fixes #46446

I've added a comment trying to explain the reasoning in the code. Without being able to expose the RV of the cache, I can't think of a reliable way to do it. Even if you tried experimenting with a watch, it essentially does this since you'd be waiting to not get an error.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label May 26, 2017
@k8s-github-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@k8s-github-robot k8s-github-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. release-note-label-needed labels May 26, 2017
@deads2k
Copy link
Contributor Author

deads2k commented May 26, 2017

@k8s-bot pull-kubernetes-e2e-kops-aws test this

@@ -127,6 +127,24 @@ func CreateNewCustomResourceDefinition(customResourceDefinition *apiextensionsv1
if err != nil {
return nil, err
}

// the REST storage isn't created until you hit the endpoint (the list does this),
// at which point the storage cacher for the CR starts asynchronously.
Copy link
Member

@liggitt liggitt May 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it not be better to inject the delay at the point where we build the rest storage? we'd at least have the possibility of fixing the coordination between the list and the cache server-side in the future and removing the delay hack instead of making multiple clients add first-use special casing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it not be better to inject the delay at the point where we build the rest storage? we'd at least have the possibility of fixing the coordination between the list and the cache server-side in the future and removing the delay hack instead of making multiple clients add first-use special casing

No preference. If you want it there I can do it there.

// at which point the storage cacher for the CR starts asynchronously.
// The REST APi list gets the RV of etcd, but the storage cacher's reflector's list
// gets a different RV because etcd can be touched in between.
// Later, you can issue a a watch with the REST apis list.RV and end up earlier than the storage cacher.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a a

@sttts sttts added release-note-none Denotes a PR that doesn't merit a release note. and removed release-note-label-needed labels May 29, 2017
@sttts
Copy link
Contributor

sttts commented May 29, 2017

Not pretty, but if it helps...

@deads2k
Copy link
Contributor Author

deads2k commented May 30, 2017

switched to where we build the REST storage and start the cacher.

@sttts ptal.

// REST API operations return like list use the RV of etcd, but the storage cacher's reflector's list
// can get a different RV because etcd can be touched in between the initial list operation (if that's what you're doing first)
// and the storage cache reflector starting.
// Later, you can issue a watch with the REST apis list.RV and end up earlier than the storage cacher.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now with fixed typo

@sttts
Copy link
Contributor

sttts commented May 30, 2017

@deads2k something is missing. forgot to push?

@liggitt
Copy link
Member

liggitt commented May 30, 2017

LGTM, @sttts what were you expecting?

@sttts
Copy link
Contributor

sttts commented May 30, 2017

@liggitt it's alright. Didn't notice the modified file changed. lgtm

@deads2k deads2k added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 31, 2017
@deads2k
Copy link
Contributor Author

deads2k commented May 31, 2017

@k8s-bot pull-kubernetes-e2e-gce-etcd3 test this

@cblecker
Copy link
Member

@deads2k etcd failure is related to #46713

@deads2k
Copy link
Contributor Author

deads2k commented May 31, 2017

@k8s-bot pull-kubernetes-unit test this

@deads2k
Copy link
Contributor Author

deads2k commented Jun 1, 2017

@k8s-bot pull-kubernetes-e2e-gce-etcd3 test this

@deads2k
Copy link
Contributor Author

deads2k commented Jun 1, 2017

@k8s-bot pull-kubernetes-e2e-kops-aws test this

@deads2k deads2k added the kind/flake Categorizes issue or PR as related to a flaky test. label Jun 1, 2017
@deads2k
Copy link
Contributor Author

deads2k commented Jun 1, 2017

flake fix, bumping priority.

@deads2k deads2k added this to the v1.7 milestone Jun 1, 2017
@deads2k
Copy link
Contributor Author

deads2k commented Jun 1, 2017

The flake needs to be fixed.

@fejta
Copy link
Contributor

fejta commented Jun 1, 2017

@k8s-bot pull-kubernetes-federation-e2e-gce test this
ref kubernetes/test-infra#2931

@fejta
Copy link
Contributor

fejta commented Jun 1, 2017

@k8s-bot pull-kubernetes-e2e-kops-aws test this
ref: kubernetes/test-infra#2932

@fejta
Copy link
Contributor

fejta commented Jun 1, 2017

hey bot

@k8s-github-robot
Copy link

Automatic merge from submit-queue

@k8s-github-robot k8s-github-robot merged commit dd897a5 into kubernetes:master Jun 1, 2017
@caesarxuchao
Copy link
Member

@deads2k
Copy link
Contributor Author

deads2k commented Jun 2, 2017

a test started at around 6pm still failed: https://k8s-gubernator.appspot.com/build/kubernetes-jenkins/logs/ci-kubernetes-test-go/6169/

Well that's a bummer. The issue is re-opened.

@deads2k deads2k deleted the crd-07-deflake branch August 3, 2017 20:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/flake Categorizes issue or PR as related to a flaky test. lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note-none Denotes a PR that doesn't merit a release note. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

9 participants