#50102 Task 3: Until, backed by retry watcher #67350

Merged on Feb 27, 2019 (3 commits)

Conversation

tnozicka
Contributor

@tnozicka tnozicka commented Aug 13, 2018

What this PR does / why we need it:
This is split off from #50102 so that it can land in smaller pieces.

Introduces Until based on RetryWatcher. It can survive closed watches as long as the last-read ResourceVersion is still present in etcd.
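The retry idea described above can be sketched roughly as follows. This is a simplified, standard-library-only illustration and not the client-go implementation: `event`, `watchFunc`, `errGone`, `retryWatch` and `fakeWatch` are hypothetical names, and resource versions are modelled as plain integers, whereas real ResourceVersions are opaque strings.

```go
package main

import (
	"errors"
	"fmt"
)

// event is a hypothetical, simplified stand-in for a watch event that
// carries only the resource version (RV) of the changed object.
type event struct {
	rv int
}

// watchFunc is a hypothetical stand-in for starting a watch from a given RV.
// The returned channel is closed when the underlying watch ends.
type watchFunc func(fromRV int) (<-chan event, error)

// errGone models the case where the requested RV is no longer available.
var errGone = errors.New("resource version too old")

// retryWatch forwards events to handle until handle returns true.
// Whenever the watch closes, it starts a new one from the last RV it has seen.
func retryWatch(startRV int, watch watchFunc, handle func(event) bool) (int, error) {
	lastRV := startRV
	for {
		ch, err := watch(lastRV)
		if err != nil {
			// For example errGone: retrying cannot help, the caller must recover.
			return lastRV, err
		}
		for ev := range ch {
			if ev.rv <= lastRV {
				continue // skip events with a lower or equal RV
			}
			lastRV = ev.rv
			if handle(ev) {
				return lastRV, nil
			}
		}
		// The channel was closed: loop around and resume from lastRV.
	}
}

// fakeWatch serves one batch of RVs per watch call and closes the channel
// after each batch to simulate a dropped watch. starts counts the calls.
func fakeWatch(batches [][]int, starts *int) watchFunc {
	return func(fromRV int) (<-chan event, error) {
		if *starts >= len(batches) {
			return nil, errGone
		}
		batch := batches[*starts]
		*starts++
		ch := make(chan event, len(batch))
		for _, rv := range batch {
			ch <- event{rv: rv}
		}
		close(ch)
		return ch, nil
	}
}

func main() {
	starts := 0
	watch := fakeWatch([][]int{{1, 2}, {2, 3}, {5}}, &starts)
	last, err := retryWatch(0, watch, func(ev event) bool { return ev.rv >= 5 })
	fmt.Println(last, err, starts)
}
```

In this sketch the condition is met on the third watch, after two simulated closures, and the repeated RV 2 in the second batch is skipped rather than delivered twice.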

Fixes: #31345

Requires:

/hold

Release note:

watch.Until now works for long durations.

/priority important-soon
/kind bug
(a bug, as with the main PR this is split from)

@k8s-ci-robot k8s-ci-robot added release-note-none Denotes a PR that doesn't merit a release note. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. kind/bug Categorizes issue or PR as related to a bug. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Aug 13, 2018
@tnozicka tnozicka changed the title [WIP] - #50102 Task 3: Until backed by retry watcher [WIP] - #50102 Task 3: Until, backed by retry watcher Aug 13, 2018
@spiffxp
Member

spiffxp commented Oct 29, 2018

ping, curious if any progress is being planned for this release cycle

@spiffxp
Member

spiffxp commented Oct 29, 2018

/skip
trying to clear the Submit Queue status

@k8s-ci-robot k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. area/apiserver area/kubectl sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/cli Categorizes an issue or PR as relevant to SIG CLI. sig/testing Categorizes an issue or PR as relevant to SIG Testing. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Jan 3, 2019
@tnozicka tnozicka force-pushed the retry-watcher branch 3 times, most recently from 6dd7303 to 363ae00 Compare January 4, 2019 11:30
@tnozicka tnozicka mentioned this pull request Jan 4, 2019
5 tasks
@wojtek-t wojtek-t self-assigned this Jan 4, 2019
@tnozicka
Contributor Author

tnozicka commented Jan 4, 2019

/milestone v1.14

@tnozicka tnozicka force-pushed the retry-watcher branch 2 times, most recently from 0854316 to f048308 Compare February 26, 2019 14:15
@liggitt
Member

liggitt commented Feb 26, 2019

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 26, 2019
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: liggitt, tnozicka

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 26, 2019
@tnozicka
Contributor Author

@liggitt thank you for reviewing this in detail!

@tnozicka
Contributor Author

Moving the ResultChan close confused a unit test; fixed:

diff --git a/staging/src/k8s.io/client-go/tools/watch/retrywatcher_test.go b/staging/src/k8s.io/client-go/tools/watch/retrywatcher_test.go
index 7e09fa953e..cd57e51c52 100644
--- a/staging/src/k8s.io/client-go/tools/watch/retrywatcher_test.go
+++ b/staging/src/k8s.io/client-go/tools/watch/retrywatcher_test.go
@@ -541,8 +541,10 @@ func TestRetryWatcher(t *testing.T) {
 			// but have to tolerate some delay. Given this is best effort detection we can use short duration.
 			// It also makes sure that for 0 events the watchFunc has time to be called.
 			select {
-			case event := <-watcher.ResultChan():
-				t.Error(spew.Errorf("Unexpected event received after reading all the expected ones: %#+v", event))
+			case event, ok := <-watcher.ResultChan():
+				if ok {
+					t.Error(spew.Errorf("Unexpected event received after reading all the expected ones: %#+v", event))
+				}
 			case <-time.After(10 * time.Millisecond):
 				break
 			}

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 26, 2019
@liggitt
Member

liggitt commented Feb 26, 2019

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 26, 2019
@fejta-bot

/retest
This bot automatically retries jobs that failed/flaked on approved PRs (send feedback to fejta).

Review the full test history for this PR.

Silence the bot with an /lgtm cancel or /hold comment for consistent failures.

@k8s-ci-robot k8s-ci-robot merged commit 9059021 into kubernetes:master Feb 27, 2019
Deflaking kubernetes e2e tests automation moved this from Inbox (unprioritized) to Done Feb 27, 2019
@tnozicka tnozicka deleted the retry-watcher branch February 27, 2019 06:34
@nikhita
Member

nikhita commented Feb 27, 2019

Is this a flake? #56876 (comment)

(can't reproduce locally)

Edit: just to be clear, this is not flaking on k/k's master... https://testgrid.k8s.io/sig-release-master-blocking#integration-master is green :)

@nikhita
Member

nikhita commented Feb 27, 2019

cc @sttts @dims
re above ^ question, because it is a publishing-bot failure

@liggitt
Member

liggitt commented Feb 27, 2019

yes, looks like there are some races:

go test ./vendor/k8s.io/client-go/tools/watch -run TestRetryWatcher -count 1000 -race -v -failfast

produced at least these failures:

--- FAIL: TestRetryWatcher/survives_4_closed_watches_and_reads_4_items_for_nonconsecutive,_spread_RVs_and_skips_those_with_lower_or_equal_RV (0.03s)
    retrywatcher_test.go:554: expected 5 watcher starts, but it has started 4 times
--- FAIL: TestRetryWatcher/survives_1_closed_watch_and_reads_1_item (0.02s)
    retrywatcher_test.go:554: expected 2 watcher starts, but it has started 1 times
--- FAIL: TestRetryWatcher/survives_2_closed_watches_and_reads_2_items (0.04s)
    retrywatcher_test.go:554: expected 3 watcher starts, but it has started 2 times

@tnozicka
Contributor Author

I'll send a fix shortly. It's a race in the unit tests when counting the watch calls (the last one is theoretically optional, depending on timing). Looks like I used too small a -count when testing it.
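One way a test could tolerate an optional final watch start is sketched below. This is only an illustration of the idea and is not taken from the actual fix; `startCounter` and `checkStarts` are hypothetical helpers.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// startCounter is a hypothetical helper that counts watch starts safely
// across goroutines.
type startCounter struct {
	n int32
}

// inc records one more watch start.
func (c *startCounter) inc() {
	atomic.AddInt32(&c.n, 1)
}

// get returns the number of watch starts recorded so far.
func (c *startCounter) get() int {
	return int(atomic.LoadInt32(&c.n))
}

// checkStarts accepts either the full number of expected starts or one fewer,
// because the final restart may not have happened yet when the test checks.
func checkStarts(got, want int) error {
	if got == want || got == want-1 {
		return nil
	}
	return fmt.Errorf("expected %d or %d watcher starts, but it has started %d times", want-1, want, got)
}

func main() {
	var c startCounter
	for i := 0; i < 4; i++ {
		c.inc()
	}
	// Four starts are accepted when five are nominally expected.
	fmt.Println(checkStarts(c.get(), 5))
}
```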

@tnozicka
Contributor Author

fix is in #74663

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/apiserver area/deflake Issues or PRs related to deflaking kubernetes tests area/kubectl cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/bug Categorizes issue or PR as related to a bug. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/auth Categorizes an issue or PR as relevant to SIG Auth. sig/cli Categorizes an issue or PR as relevant to SIG CLI. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

Using wait.Until doesn't work for long durations
8 participants