Fix a race condition in SharedInformer #59828

Merged
merged 2 commits into from Feb 14, 2018

@krousey
Member

krousey commented Feb 13, 2018

What this PR does / why we need it:

This fixes a race condition that can occur in the sharedIndexInformer.

Which issue(s) this PR fixes:
Fixes #59822

Release note:

Fixed a race condition in k8s.io/client-go/tools/cache.SharedInformer that could violate the sequential delivery guarantee and cause panics on shutdown.

krousey added some commits Feb 13, 2018

Add started state to the processor to protect against double starts
This prevents a race condition where the sharedIndexInformer was
causing the processorListener's run and pop methods to be started
twice. That violated the SharedInformer's interface guarantee of
sequential delivery and also caused panics on shutdown.
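
The idea in this commit message can be illustrated with a minimal sketch. It is not the client-go code: the types `processor` and `listener` and the functions `add`, `run`, and `raceOnce` are hypothetical names chosen for illustration. The sketch assumes that registering a listener and starting the processor can happen concurrently, and shows how a `started` flag read and written under one lock leaves exactly one of the two paths responsible for starting any given listener.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

// listener is a hypothetical stand-in for a processorListener.
type listener struct {
	starts int32 // how many times start was called; must never exceed 1
}

// start records the call and launches the listener's goroutine.
func (l *listener) start(wg *sync.WaitGroup) {
	atomic.AddInt32(&l.starts, 1)
	wg.Add(1)
	go func() {
		defer wg.Done()
		// A real listener would deliver notifications here.
	}()
}

// processor is a hypothetical stand-in for the shared processor.
type processor struct {
	mu        sync.Mutex
	started   bool // the "started state" guarded by mu
	listeners []*listener
	wg        sync.WaitGroup
}

// add registers a listener and starts it only if run has already happened.
func (p *processor) add(l *listener) {
	p.mu.Lock()
	defer p.mu.Unlock()
	p.listeners = append(p.listeners, l)
	if p.started {
		l.start(&p.wg)
	}
}

// run starts every listener registered so far and marks the processor
// as started while still holding the lock.
func (p *processor) run() {
	p.mu.Lock()
	defer p.mu.Unlock()
	for _, l := range p.listeners {
		l.start(&p.wg)
	}
	p.started = true
}

// raceOnce calls add and run concurrently and reports how many times
// the single listener was started.
func raceOnce() int32 {
	p := &processor{}
	l := &listener{}
	var callers sync.WaitGroup
	callers.Add(2)
	go func() { defer callers.Done(); p.add(l) }()
	go func() { defer callers.Done(); p.run() }()
	callers.Wait()
	p.wg.Wait()
	return atomic.LoadInt32(&l.starts)
}

func main() {
	for i := 0; i < 1000; i++ {
		if n := raceOnce(); n != 1 {
			panic(fmt.Sprintf("listener started %d times", n))
		}
	}
	fmt.Println("every listener started exactly once")
}
```

Whichever goroutine takes the lock first, the listener is started once: if `add` wins, `run` starts it later; if `run` wins, `add` sees `started` set and starts it itself. Without the flag, both paths could start the same listener.
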
@krousey

Member

krousey commented Feb 13, 2018

Without the fix, I was able to use golang.org/x/tools/cmd/stress to generate about 76 panics in 31,000 runs with the following:

> go test -c
> stress cache.test -test.run=TestSharedInformerInitializationRace

After the fix in the second commit, I got 0 failures in 870,000 runs.

@jennybuckley

Contributor

jennybuckley commented Feb 13, 2018

/lgtm

@k8s-ci-robot

Contributor

k8s-ci-robot commented Feb 13, 2018

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jennybuckley, krousey

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these OWNERS Files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-merge-robot

Contributor

k8s-merge-robot commented Feb 13, 2018

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-merge-robot

Contributor

k8s-merge-robot commented Feb 14, 2018

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-merge-robot k8s-merge-robot merged commit 6590ea6 into kubernetes:master Feb 14, 2018

13 of 14 checks passed

Submit Queue Required Github CI test is not green: pull-kubernetes-verify
cla/linuxfoundation krousey authorized
pull-kubernetes-bazel-build Job succeeded.
pull-kubernetes-bazel-test Job succeeded.
pull-kubernetes-bazel-test-canary Job succeeded.
pull-kubernetes-cross Skipped
pull-kubernetes-e2e-gce Job succeeded.
pull-kubernetes-e2e-gce-device-plugin-gpu Job succeeded.
pull-kubernetes-e2e-gke Skipped
pull-kubernetes-e2e-kops-aws Job succeeded.
pull-kubernetes-kubemark-e2e-gce Job succeeded.
pull-kubernetes-node-e2e Job succeeded.
pull-kubernetes-unit Job succeeded.
pull-kubernetes-verify Job succeeded.

@smarterclayton

Contributor

smarterclayton commented Feb 14, 2018

How far does this need to be backported? @kubernetes/sig-api-machinery-bugs

@krousey

Member

krousey commented Feb 14, 2018

@smarterclayton I came upon this while testing a controller using the 1.8 release libraries. I would like it backported at least that far to make my life easier.

@wojtek-t

Member

wojtek-t commented Feb 14, 2018

@krousey - great finding! LGTM

@krousey krousey added this to the v1.8 milestone Feb 14, 2018

k8s-merge-robot added a commit that referenced this pull request Feb 15, 2018

Merge pull request #59891 from jpbetz/automated-cherry-pick-of-#59828-…
…origin-release-1.8

Automatic merge from submit-queue.

Automated cherry pick of #59828: Add a test case for the race in #59822

Cherry pick of #59828 on release-1.8.

#59828: Add a test case for the race in #59822

@krousey krousey removed this from the v1.8 milestone Feb 15, 2018

@krousey krousey added this to the v1.9 milestone Feb 15, 2018

@mbohlool

Member

mbohlool commented Feb 15, 2018

@k8s-ci-robot

Contributor

k8s-ci-robot commented Feb 15, 2018

@mbohlool: GitHub didn't allow me to request PR reviews from the following users: wenjiaswe.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

/cc @roycaihw @wenjiaswe

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot requested a review from roycaihw Feb 15, 2018

@roycaihw

Member

roycaihw commented Feb 16, 2018

/lgtm

@krousey

Member

krousey commented Feb 20, 2018

@mbohlool can we approve this for the 1.9 branch yet?

@k8s-merge-robot

Contributor

k8s-merge-robot commented Mar 16, 2018

Commit found in the "release-1.9" branch appears to be this PR. Removing the "cherrypick-candidate" label. If this is an error, find help to get your PR picked.

@rmmh

Contributor

rmmh commented Mar 27, 2018

re-adding cherrypick-candidate for debugging

k8s-merge-robot added a commit that referenced this pull request Mar 27, 2018

Merge pull request #61751 from maisem/automated-cherry-pick-of-#59828-…
…upstream-release-1.9

Automatic merge from submit-queue.

Automated cherry pick of #59828 upstream release 1.9

Cherry pick of #59828 on release-1.9.

#59828 : Fix a race condition in SharedInformer