Shorten waits in TestWatchStreamSeparation #123888
base: master
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED. This pull-request has been approved by: serathius. The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
 err = cacher.storage.RequestWatchProgress(metadata.NewOutgoingContext(context.Background(), contextMetadata))
 if err != nil {
 	t.Fatal(err)
 }
 // Give time for bookmark to arrive
-time.Sleep(time.Second)
+time.Sleep(100 * time.Millisecond)
Would it make sense to stress the test to make sure it is not flaky?
I ran the test a couple of times, but I forgot to use stress. Let me test it again.
(stress with a binary compiled with and without -race is most helpful)
Done:
kubernetes $ rm cacher.test
kubernetes $ go test ./staging/src/k8s.io/apiserver/pkg/storage/cacher/ --run TestWatchStreamSeparation -c -race
kubernetes $ stress ./cacher.test --test.run TestWatchStreamSeparation
5s: 0 runs so far, 0 failures
10s: 12 runs so far, 0 failures
15s: 12 runs so far, 0 failures
20s: 24 runs so far, 0 failures
25s: 29 runs so far, 0 failures
30s: 36 runs so far, 0 failures
35s: 46 runs so far, 0 failures
40s: 48 runs so far, 0 failures
45s: 59 runs so far, 0 failures
50s: 63 runs so far, 0 failures
55s: 72 runs so far, 0 failures
1m0s: 78 runs so far, 0 failures
1m5s: 84 runs so far, 0 failures
1m10s: 94 runs so far, 0 failures
1m15s: 98 runs so far, 0 failures
1m20s: 108 runs so far, 0 failures
1m25s: 116 runs so far, 0 failures
1m30s: 122 runs so far, 0 failures
1m35s: 129 runs so far, 0 failures
1m40s: 135 runs so far, 0 failures
1m45s: 143 runs so far, 0 failures
^C
@@ -2396,7 +2396,7 @@ func TestWatchStreamSeparation(t *testing.T) {
 	defer cacher.watchCache.RUnlock()
 	return cacher.watchCache.resourceVersion
 }
-waitContext, cancel := context.WithTimeout(context.Background(), 2*time.Second)
+waitContext, cancel := context.WithTimeout(context.Background(), time.Second)
This (ctx) only matters when something goes wrong and the test fails, right?
Right.
Agreed, this is the upper bound we will wait before failing the test by returning a blank RV. I don't think shortening this will actually make the test pass faster, it just increases the possibility of failing.
It will, because there are subtests where we expect not to get a bookmark.
Ah, I see. Are we confident a second is long enough to not flake on the tests where we do expect a bookmark event?
> It will, because there are subtests where we expect not to get a bookmark.
Actually, if we choose our wait time differently for these two types of tests, we can leave the wait as-is (2 seconds) when we do expect to receive a bookmark, and shorten the wait more aggressively (500ms? 250ms?) when we don't expect to receive a bookmark.
Doesn't such asymmetry make the test less accurate? I understand that we can lower the timeout when we don't expect the bookmark, since we don't need to wait, but on the other hand it makes it easy to miss a regression introduced later. In my opinion, for a comparison test like this we should keep the setup symmetrical to be sure.
I don't think we'd lower it so much that it would be easy to miss a regression. We should pick a time large enough that we would consistently receive an incorrect bookmark event and fail the test.
xref: #123685
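To illustrate the behavior discussed above, here is a minimal sketch (a hypothetical helper, not the test's actual code) of a wait bounded by waitContext: the deadline is only paid when the expected resource version never shows up, in which case a blank RV is returned and the caller's comparison fails.

package cacher_test // illustrative placement only

import (
	"context"
	"time"
)

// waitForResourceVersion polls until getRV reports at least the wanted
// resource version, or the context expires. On timeout it returns 0 (a blank
// RV), so the test's subsequent comparison fails; a passing run returns as
// soon as the bookmark has been applied, regardless of the deadline.
func waitForResourceVersion(ctx context.Context, getRV func() uint64, want uint64) uint64 {
	for {
		if rv := getRV(); rv >= want {
			return rv
		}
		select {
		case <-ctx.Done():
			return 0
		case <-time.After(10 * time.Millisecond):
		}
	}
}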
 err = cacher.storage.RequestWatchProgress(metadata.NewOutgoingContext(context.Background(), contextMetadata))
 if err != nil {
 	t.Fatal(err)
 }
 // Give time for bookmark to arrive
-time.Sleep(time.Second)
+time.Sleep(100 * time.Millisecond)
Why do we need to sleep at all if the next thing we do is call waitForEtcdBookmark(), which blocks until the event is received?
This is for the bookmark in the watch cache.
But... waitForEtcdBookmark() will wait until we receive it, right? Why do we need to sleep?
Not sure if this is related, but 100ms wasn't enough for #123926.
Right, increased to 200ms
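For readers following the thread, here is a rough recap of the sequence under discussion (simplified from the test, with waitForEtcdBookmark's signature abbreviated): the short sleep gives the cacher's watch cache time to apply its progress bookmark, while waitForEtcdBookmark only blocks on the bookmark coming from the separate etcd watch.

// Simplified recap of the ordering discussed above; not the test's exact code.
err = cacher.storage.RequestWatchProgress(metadata.NewOutgoingContext(context.Background(), contextMetadata))
if err != nil {
	t.Fatal(err)
}
// The watch cache applies its bookmark asynchronously and has no signal to
// block on here, hence the short sleep (100ms originally, raised to 200ms).
time.Sleep(200 * time.Millisecond)
// This only blocks until the bookmark arrives on the direct etcd watch.
waitForEtcdBookmark()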
/triage accepted
Force-pushed from a80adcc to 741a58d
Historically, millisecond-level timing has proven flaky in resource-constrained CI environments. I really don't think we should try to time these sleeps that tightly.
For positive tests, where we expect to receive an event, can we eliminate the sleep entirely and switch to event-based or poll-based checks (on a short interval, e.g. 10-50ms)?
For negative tests, where we expect not to receive an event, can we pick a sleep period ~10x the normal time it takes to receive the event, and wait that long to ensure we didn't get an unexpected event?
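One possible shape for that suggestion, sketched below with hypothetical helper names (watchCacheRV and typicalBookmarkLatency are illustrative, not from the test) and assuming the wait package from k8s.io/apimachinery: the positive case polls on a short interval and only pays the long deadline on failure, while the negative case sleeps roughly 10x the typical delivery time before asserting nothing arrived.

package cacher_test // illustrative placement only

import (
	"context"
	"testing"
	"time"

	"k8s.io/apimachinery/pkg/util/wait"
)

// typicalBookmarkLatency is a hypothetical estimate of how quickly a bookmark
// normally arrives; it would need to be measured for the real test.
const typicalBookmarkLatency = 100 * time.Millisecond

// expectBookmark polls instead of sleeping: a passing run returns as soon as
// the watch cache reaches the expected resource version, and the 2s deadline
// is only paid when the test is going to fail anyway.
func expectBookmark(t *testing.T, watchCacheRV func() uint64, want uint64) {
	t.Helper()
	err := wait.PollUntilContextTimeout(context.Background(), 10*time.Millisecond, 2*time.Second, true,
		func(ctx context.Context) (bool, error) {
			return watchCacheRV() >= want, nil
		})
	if err != nil {
		t.Fatalf("watch cache never reached resource version %d: %v", want, err)
	}
}

// expectNoBookmark waits roughly 10x the typical delivery time and then checks
// that the resource version did not advance: long enough to catch a regression,
// short enough to keep the negative subtests fast.
func expectNoBookmark(t *testing.T, watchCacheRV func() uint64, before uint64) {
	t.Helper()
	time.Sleep(10 * typicalBookmarkLatency)
	if got := watchCacheRV(); got != before {
		t.Fatalf("unexpected bookmark advanced watch cache resource version from %d to %d", before, got)
	}
}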
PR needs rebase. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/kind cleanup
Ref #123850
Before
After