Fix deadlock in ready test #116251

Merged
k8s-ci-robot merged 1 commit into kubernetes:master on Mar 3, 2023

Conversation

wojtek-t
Member

@wojtek-t wojtek-t commented Mar 3, 2023

Fix #116225

NONE

/kind flake
/sig api-machinery
/priority important-soon

/assign @aojea

@k8s-ci-robot k8s-ci-robot added the release-note-none Denotes a PR that doesn't merit a release note. label Mar 3, 2023
@k8s-ci-robot k8s-ci-robot added kind/flake Categorizes issue or PR as related to a flaky test. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Mar 3, 2023
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: wojtek-t

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 3, 2023
@aojea
Member

aojea commented Mar 3, 2023

yeah, it can deadlock, great catch
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 3, 2023
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: ca84027d46482b36b29888ae52b43a99c9224382

@aojea
Member

aojea commented Mar 3, 2023

/hold

this still can race

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 3, 2023
@aojea
Member

aojea commented Mar 3, 2023

If there is a deadlock and the context is canceled, ready.wait returns an error; that error makes the test fail, so the test stays flaky.

We need to ensure that ready ends up set to true. Check this diff, I think that this will f

EDIT - the code was completely wrong

EDIT2 -- this works

diff --git a/staging/src/k8s.io/apiserver/pkg/storage/cacher/ready_test.go b/staging/src/k8s.io/apiserver/pkg/storage/cacher/ready_test.go
index e8720e2ad32..67dcb1a6a1c 100644
--- a/staging/src/k8s.io/apiserver/pkg/storage/cacher/ready_test.go
+++ b/staging/src/k8s.io/apiserver/pkg/storage/cacher/ready_test.go
@@ -18,6 +18,7 @@ package cacher
 
 import (
        "context"
+       "sync"
        "testing"
        "time"
 )
@@ -82,21 +83,25 @@ func Test_newReadyRacy(t *testing.T) {
        ready := newReady()
        ready.set(false)
 
-       ctx, cancel := context.WithTimeout(context.Background(), 5 * time.Second)
-       defer cancel()
-
+       var wg sync.WaitGroup
        for i := 0; i < concurrency; i++ {
+               wg.Add(2)
                go func() {
-                       errCh <- ready.wait(ctx)
+                       errCh <- ready.wait(context.Background())
                }()
                go func() {
+                       defer wg.Done()
                        ready.set(false)
                }()
                go func() {
+                       defer wg.Done()
                        ready.set(true)
                }()
        }
+       // last one has to set ready to true
+       wg.Wait()
        ready.set(true)

stress ./cacher.test -test.run Test_newReadyRacy
5s: 1116 runs so far, 0 failures
10s: 2275 runs so far, 0 failures
15s: 3419 runs so far, 0 failures
20s: 4548 runs so far, 0 failures
25s: 5704 runs so far, 0 failures
30s: 6859 runs so far, 0 failures
35s: 8041 runs so far, 0 failures
40s: 9186 runs so far, 0 failures
45s: 10353 runs so far, 0 failures
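
The pattern in the diff above can be tried outside the Kubernetes tree with a small standalone sketch. The `ready` type here is a hypothetical stand-in written only for illustration; it is an assumption and not the cacher's actual implementation. `runRacy` is likewise an invented helper that mirrors the shape of the revised test: the racing setters are tracked by a `sync.WaitGroup`, and the final `set(true)` runs only after all of them have finished, so no waiter can be left blocked.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// ready is a hypothetical stand-in for the cacher's ready type.
// The real implementation in k8s.io/apiserver may differ.
type ready struct {
	mu   sync.Mutex
	ok   bool
	wake chan struct{} // closed and replaced on every state change
}

func newReady() *ready { return &ready{wake: make(chan struct{})} }

// set records the new state and wakes every goroutine blocked in wait.
func (r *ready) set(ok bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.ok = ok
	close(r.wake)
	r.wake = make(chan struct{})
}

// wait blocks until the state is true or ctx is done.
func (r *ready) wait(ctx context.Context) error {
	for {
		r.mu.Lock()
		ok, wake := r.ok, r.wake
		r.mu.Unlock()
		if ok {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-wake:
		}
	}
}

// runRacy starts waiters and racing setters, then sets ready to true
// only after every setter has returned.
func runRacy(concurrency int) []error {
	errCh := make(chan error, concurrency)
	r := newReady()
	r.set(false)

	var wg sync.WaitGroup
	for i := 0; i < concurrency; i++ {
		wg.Add(2)
		go func() { errCh <- r.wait(context.Background()) }()
		go func() { defer wg.Done(); r.set(false) }()
		go func() { defer wg.Done(); r.set(true) }()
	}
	wg.Wait()
	r.set(true) // the last write is guaranteed to be true

	errs := make([]error, 0, concurrency)
	for i := 0; i < concurrency; i++ {
		errs = append(errs, <-errCh)
	}
	return errs
}

func main() {
	for _, err := range runRacy(100) {
		if err != nil {
			fmt.Println("unexpected error:", err)
			return
		}
	}
	fmt.Println("all waiters released")
}
```

Without the `wg.Wait()`, a late `set(false)` goroutine could run after the final `set(true)` and leave waiters blocked forever, which is the deadlock discussed in this thread.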

@wojtek-t
Member Author

wojtek-t commented Mar 3, 2023

if there is a deadlock and the context is canceled it will return an error,

ok - good catch, forgot about it.

Yeah - what you have makes sense - changing.
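
To illustrate the point being acknowledged here, the following is a minimal sketch of why a timeout context alone does not fix the flake. It reuses a hypothetical `ready` type (an assumption for illustration, not the cacher's real code), and `waitAfter` and `isDeadline` are invented helpers. If the last write is `set(false)`, the waiter no longer hangs, but it returns `context.DeadlineExceeded`, which a test that expects a nil error would still report as a failure.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sync"
	"time"
)

// ready is a hypothetical stand-in for the cacher's ready type.
type ready struct {
	mu   sync.Mutex
	ok   bool
	wake chan struct{} // closed and replaced on every state change
}

func newReady() *ready { return &ready{wake: make(chan struct{})} }

func (r *ready) set(ok bool) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.ok = ok
	close(r.wake)
	r.wake = make(chan struct{})
}

func (r *ready) wait(ctx context.Context) error {
	for {
		r.mu.Lock()
		ok, wake := r.ok, r.wake
		r.mu.Unlock()
		if ok {
			return nil
		}
		select {
		case <-ctx.Done():
			return ctx.Err()
		case <-wake:
		}
	}
}

// waitAfter simulates the end state of the race: `last` is the value
// written by whichever setter happened to run last. The waiter uses a
// short timeout (50ms here, chosen only to keep the demo fast).
func waitAfter(last bool) error {
	r := newReady()
	r.set(last)
	ctx, cancel := context.WithTimeout(context.Background(), 50*time.Millisecond)
	defer cancel()
	return r.wait(ctx)
}

// isDeadline reports whether err is a context deadline error.
func isDeadline(err error) bool {
	return errors.Is(err, context.DeadlineExceeded)
}

func main() {
	fmt.Println("last write false, deadline exceeded:", isDeadline(waitAfter(false)))
	fmt.Println("last write true, error:", waitAfter(true))
}
```

The timeout turns a hang into an error, so the test would fail instead of deadlocking. Making the final write deterministic removes the cause rather than the symptom.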

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 3, 2023
@k8s-ci-robot k8s-ci-robot requested a review from aojea March 3, 2023 15:47
@aojea
Member

aojea commented Mar 3, 2023

/lgtm
/hold cancel

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. and removed do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. labels Mar 3, 2023
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 585146b78161861f4ac3d832ef83fe96e9c4d754

@k8s-ci-robot k8s-ci-robot merged commit a1b12e4 into kubernetes:master Mar 3, 2023
@k8s-ci-robot k8s-ci-robot added this to the v1.27 milestone Mar 3, 2023
@fedebongio
Contributor

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Mar 7, 2023
k8s-ci-robot added a commit that referenced this pull request Jul 6, 2023
…1-upstream-release-1.24

Automated cherry pick of #116251: Fix deadlock in ready test
k8s-ci-robot added a commit that referenced this pull request Jul 6, 2023
…1-upstream-release-1.25

Automated cherry pick of #116251: Fix deadlock in ready test
k8s-ci-robot added a commit that referenced this pull request Jul 6, 2023
…1-upstream-release-1.26

Automated cherry pick of #116251: Fix deadlock in ready test
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/flake Categorizes issue or PR as related to a flaky test. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note-none Denotes a PR that doesn't merit a release note. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Development

Successfully merging this pull request may close these issues.

[Flake] Test_newReadyRacy flaking
4 participants