fix data race in kubelet volume test: add lock for ut #104069

Merged
merged 1 commit into from Sep 10, 2021

Conversation

pacoxu
Member

@pacoxu pacoxu commented Aug 2, 2021

/kind flake
Fixes #104057

NONE

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. kind/flake Categorizes issue or PR as related to a flaky test. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. area/kubelet labels Aug 2, 2021
@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Aug 2, 2021
@ehashman ehashman added this to Triage in SIG Node CI/Test Board Aug 2, 2021
@SergeyKanzhelev SergeyKanzhelev moved this from Triage to PRs - Needs Reviewer in SIG Node CI/Test Board Aug 4, 2021
@SergeyKanzhelev
Member

/assign @manugupt1

@k8s-ci-robot
Contributor

@SergeyKanzhelev: GitHub didn't allow me to assign the following users: manugupt1.

Note that only kubernetes members, repo collaborators and people who have commented on this issue/PR can be assigned. Additionally, issues/PRs can only have 10 assignees at the same time.
For more information please see the contributor guide

In response to this:

/assign @manugupt1

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@@ -105,6 +105,11 @@ func (f *fakePodWorkers) ShouldPodRuntimeBeRemoved(uid types.UID) bool {
 	defer f.statusLock.Unlock()
 	return f.removeRuntime[uid]
 }
+func (f *fakePodWorkers) SetPodRuntimeBeRemoved(uid types.UID) {
Contributor

The PodWorkers interface does not declare SetPodRuntimeBeRemoved.
Does it make sense for this to be a private (unexported) method instead?

f.setPodRuntimeBeRemoved

Another question, for my own understanding: what happens in the original implementation? Does it wait for volumes to be unmounted? (I understand it's in the comments, but I wanted to double-check.)
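
To illustrate the suggestion above, here is a minimal sketch of a test double that satisfies an interface while keeping its test-only setter unexported. The names `UID`, `podWorkers`, and `fakeWorkers` are hypothetical stand-ins, not the actual kubelet types, and the interface is trimmed to a single method for brevity.

```go
package main

import (
	"fmt"
	"sync"
)

// UID is a hypothetical stand-in for types.UID.
type UID string

// podWorkers is a hypothetical, trimmed-down interface with one method.
type podWorkers interface {
	ShouldPodRuntimeBeRemoved(uid UID) bool
}

// fakeWorkers is a hypothetical test double.
type fakeWorkers struct {
	statusLock    sync.Mutex
	removeRuntime map[UID]bool
}

// Compile-time check: the fake satisfies the interface even though it
// carries an extra unexported helper.
var _ podWorkers = (*fakeWorkers)(nil)

func (f *fakeWorkers) ShouldPodRuntimeBeRemoved(uid UID) bool {
	f.statusLock.Lock()
	defer f.statusLock.Unlock()
	return f.removeRuntime[uid] // reading a nil map yields false
}

// setPodRuntimeBeRemoved is unexported because only tests in the same
// package call it; it is not part of the interface.
func (f *fakeWorkers) setPodRuntimeBeRemoved(uid UID) {
	f.statusLock.Lock()
	defer f.statusLock.Unlock()
	f.removeRuntime = map[UID]bool{uid: true}
}

func main() {
	f := &fakeWorkers{}
	var w podWorkers = f
	fmt.Println(w.ShouldPodRuntimeBeRemoved("pod-1")) // false
	f.setPodRuntimeBeRemoved("pod-1")
	fmt.Println(w.ShouldPodRuntimeBeRemoved("pod-1")) // true
}
```

Callers that hold only the interface value cannot reach the setter, which keeps the helper from looking like part of the production contract.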

Member Author

@pacoxu pacoxu Aug 16, 2021

To fix this properly, I think we would need something like waitForVolumeUnmount to wait for the fake unmount.

SetPodRuntimeBeRemoved is more like the test itself marking the runtime as removed.

I only fixed the thread-safety problem in the current logic.

Member Author

Updated.

@matthyx matthyx moved this from PRs - Needs Reviewer to PRs Waiting on Author in SIG Node CI/Test Board Aug 12, 2021
@manugupt1
Contributor

/lgtm

@k8s-ci-robot
Contributor

@manugupt1: changing LGTM is restricted to collaborators

In response to this:

/lgtm

Contributor

@adisky adisky left a comment

This should fix the race condition.
/lgtm

WARNING: DATA RACE
Write at 0x00c00080eae8 by goroutine 113:
  k8s.io/kubernetes/pkg/kubelet.TestVolumeUnmountAndDetachControllerDisabled()
      /home/sharmaad/go/src/k8s.io/kubernetes/pkg/kubelet/kubelet_volumes_test.go:319 +0xc73
  testing.tRunner()
      /usr/local/go/src/testing/testing.go:1193 +0x202

Previous read at 0x00c00080eae8 by goroutine 132:
  k8s.io/kubernetes/pkg/kubelet.(*fakePodWorkers).ShouldPodRuntimeBeRemoved()
      /home/sharmaad/go/src/k8s.io/kubernetes/pkg/kubelet/pod_workers_test.go:106 +0xbc
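
The report above shows a write from the test goroutine racing with a read inside ShouldPodRuntimeBeRemoved on another goroutine. The sketch below reproduces that access pattern with both sides going through the same mutex, which is the general shape of the fix. `removalFlags` and its methods are hypothetical names for this example; running it with `go run -race` should not report a race, whereas assigning to the map field directly from the writer side would.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// removalFlags is a hypothetical stand-in for the fake's shared state.
type removalFlags struct {
	mu      sync.Mutex
	removed map[string]bool
}

// shouldRemove is the reader side; it takes the lock before reading.
func (r *removalFlags) shouldRemove(uid string) bool {
	r.mu.Lock()
	defer r.mu.Unlock()
	return r.removed[uid]
}

// markRemoved is the writer side; it takes the same lock before writing.
func (r *removalFlags) markRemoved(uid string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.removed = map[string]bool{uid: true}
}

// readWhileWriting starts a goroutine that polls the flag, sets the flag
// from the calling goroutine, and waits until the poller has observed it.
func readWhileWriting(r *removalFlags, uid string) bool {
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		for !r.shouldRemove(uid) {
			runtime.Gosched() // yield so the writer can make progress
		}
	}()
	r.markRemoved(uid)
	wg.Wait()
	return r.shouldRemove(uid)
}

func main() {
	fmt.Println(readWhileWriting(&removalFlags{}, "pod-1"))
}
```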

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Aug 20, 2021
@adisky adisky moved this from PRs Waiting on Author to PRs - Needs Approver in SIG Node CI/Test Board Aug 20, 2021
@pacoxu
Member Author

pacoxu commented Sep 1, 2021

/assign @mrunalp
for approval

Signed-off-by: Paco Xu <paco.xu@daocloud.io>
Co-authored-by: Jian Zeng <zengjian.zj@bytedance.com>
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 1, 2021
@pacoxu pacoxu requested a review from adisky September 1, 2021 09:14
@aojea
Member

aojea commented Sep 1, 2021

/lgtm
Thanks

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 1, 2021
@pacoxu pacoxu added this to Needs Approver in SIG Node PR Triage Sep 2, 2021
@pacoxu
Member Author

pacoxu commented Sep 2, 2021

/triage accepted
/priority important-longterm

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Sep 2, 2021
Member

@SergeyKanzhelev SergeyKanzhelev left a comment

/lgtm

@manugupt1
Contributor

/assign mrunalp
as it has enough lgtm

@SergeyKanzhelev SergeyKanzhelev moved this from PRs - Needs Approver to Archive-it in SIG Node CI/Test Board Sep 8, 2021
@SergeyKanzhelev SergeyKanzhelev moved this from Archive-it to PRs - Needs Approver in SIG Node CI/Test Board Sep 8, 2021
@mrunalp
Contributor

mrunalp commented Sep 10, 2021

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mrunalp, pacoxu, SergeyKanzhelev

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 10, 2021
@k8s-ci-robot k8s-ci-robot merged commit 5724484 into kubernetes:master Sep 10, 2021
SIG Node CI/Test Board automation moved this from PRs - Needs Approver to Done Sep 10, 2021
@k8s-ci-robot k8s-ci-robot added this to the v1.23 milestone Sep 10, 2021
@pacoxu pacoxu deleted the fix-data-race-104057 branch May 10, 2022 06:30
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/kubelet cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/flake Categorizes issue or PR as related to a flaky test. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-longterm Important over the long term, but may not be staffed and/or may need multiple releases to complete. release-note-none Denotes a PR that doesn't merit a release note. sig/node Categorizes an issue or PR as relevant to SIG Node. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
Archived in project
SIG Node PR Triage
Needs Approver
Development

Successfully merging this pull request may close these issues.

The test TestVolumeUnmountAndDetachControllerDisabled is failing from kubelet_volumes_test.go
7 participants