Terminate custom resource watches when storage is destroyed #78029

liggitt · 2019-05-17T14:05:46Z

What type of PR is this?
/kind bug

What this PR does / why we need it:
When a watch cache is stopped, drop all active watchers.

For normal resources, this only happens on apiserver shutdown. For custom resources, this happens whenever a CRD spec is changed, which currently strands active watchers.
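To illustrate the behavior described above, here is a minimal sketch of a cache that closes every active watcher's channel when it is stopped. It is not the code from `cacher.go`; the `watchCache` type and its `Watch`, `Dispatch`, and `Stop` methods are hypothetical names invented for this example.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// event is a hypothetical stand-in for a watch event.
type event struct{ name string }

// watchCache is a simplified, illustrative cache that tracks active watchers.
// It is an assumption for this sketch, not the real apiserver Cacher.
type watchCache struct {
	mu       sync.Mutex
	stopped  bool
	watchers []chan event
}

func newWatchCache() *watchCache { return &watchCache{} }

// Watch registers a new watcher and returns its result channel.
func (c *watchCache) Watch() (<-chan event, error) {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.stopped {
		return nil, errors.New("watch cache is stopped")
	}
	ch := make(chan event, 10)
	c.watchers = append(c.watchers, ch)
	return ch, nil
}

// Dispatch delivers an event to every active watcher without blocking.
func (c *watchCache) Dispatch(e event) {
	c.mu.Lock()
	defer c.mu.Unlock()
	for _, ch := range c.watchers {
		select {
		case ch <- e:
		default: // watcher buffer is full; this sketch simply drops the event
		}
	}
}

// Stop marks the cache as stopped and terminates all active watchers by
// closing their channels, so that no client is left waiting forever.
func (c *watchCache) Stop() {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.stopped {
		return
	}
	c.stopped = true
	for _, ch := range c.watchers {
		close(ch)
	}
	c.watchers = nil
}

func main() {
	cache := newWatchCache()
	events, err := cache.Watch()
	if err != nil {
		fmt.Println("watch failed:", err)
		return
	}
	cache.Dispatch(event{name: "created"})
	cache.Stop() // without closing watchers here, the loop below would hang
	for e := range events {
		fmt.Println("event:", e.name)
	}
	fmt.Println("watch terminated")
}
```

Holding the same mutex in `Dispatch` and `Stop` is one simple way to avoid sending on a closed channel; the real implementation may coordinate this differently.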

Which issue(s) this PR fixes:
Fixes #74105
Fixes #71138

Does this PR introduce a user-facing change?:

Active watches of custom resources now terminate properly if the CRD is modified.

/sig api-machinery
/cc @jpbetz @sttts

liggitt · 2019-05-17T14:06:30Z

/priority critical-urgent

This results in hung watch clients with O(hour) timeouts when CRDs change.

k8s-ci-robot · 2019-05-17T14:06:40Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: liggitt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~staging/src/k8s.io/apiextensions-apiserver/OWNERS~~ [liggitt]
~~staging/src/k8s.io/apiserver/pkg/storage/OWNERS~~ [liggitt]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

staging/src/k8s.io/apiserver/pkg/storage/cacher/cacher.go

staging/src/k8s.io/apiserver/pkg/storage/cacher/cacher_whitebox_test.go

sttts · 2019-05-17T14:14:54Z

one nit.

Otherwise lgtm.

sttts · 2019-05-17T14:32:58Z

/lgtm

smarterclayton · 2019-05-17T14:37:58Z

Impacted how far back?

liggitt · 2019-05-17T14:47:29Z

> Impacted how far back?

The watch cache is affected at least as far back as 1.12 (I'd expect even further). Custom resources are affected as far back as they have used the watch cache (I'm not sure how far that is).

k8s-ci-robot · 2019-05-17T17:34:27Z

New changes are detected. LGTM label has been removed.

liggitt · 2019-05-17T17:34:44Z

fixed gofmt issue

jpbetz · 2019-05-17T21:04:57Z

Looks right. Thanks for finding and fixing.

/lgtm

staging/src/k8s.io/apiserver/pkg/storage/cacher/cacher_whitebox_test.go

yliaog · 2019-05-17T22:58:25Z

/lgtm

k8s-ci-robot requested review from jpbetz and sttts May 17, 2019 14:05

k8s-ci-robot added area/apiserver priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. and removed needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels May 17, 2019

k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 17, 2019

liggitt changed the title ~~Terminate CRD watches when storage is destroyed~~ Terminate custom resource watches when storage is destroyed May 17, 2019

liggitt added the area/custom-resources label May 17, 2019

liggitt added this to Required for GA, in progress in Custom Resource Definitions May 17, 2019

This was referenced May 17, 2019

Changing CRD validation causes existing WATCH requests to silently hang #71138

Closed

Make dynamic shared informer stoppable #77480

Closed

Watch not terminated on CRD deletion #74105

Closed

sttts reviewed May 17, 2019

staging/src/k8s.io/apiserver/pkg/storage/cacher/cacher.go (resolved)

sttts reviewed May 17, 2019

staging/src/k8s.io/apiserver/pkg/storage/cacher/cacher_whitebox_test.go (outdated, resolved)

Terminate watchers when watch cache is destroyed

d304c9e

liggitt force-pushed the crd-watch branch from dc38a95 to 47ade3b Compare May 17, 2019 14:22

k8s-ci-robot assigned sttts May 17, 2019

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 17, 2019

liggitt mentioned this pull request May 17, 2019

Automated cherry pick of #78029: Terminate watchers when watch cache is destroyed #78034

Merged

10 tasks

This was referenced May 17, 2019

Automated cherry pick of #78029: Terminate watchers when watch cache is destroyed #78035

Merged

Automated cherry pick of #78029: Terminate watchers when watch cache is destroyed #78036

Merged

liggitt added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed priority/critical-urgent Highest priority. Must be actively worked on as someone's top priority right now. labels May 17, 2019

Add CRD integration test for dropping watches

ea46423

liggitt force-pushed the crd-watch branch from 47ade3b to ea46423 Compare May 17, 2019 17:34

k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 17, 2019

liggitt added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label May 17, 2019

k8s-ci-robot assigned jpbetz May 17, 2019

yliaog reviewed May 17, 2019

staging/src/k8s.io/apiserver/pkg/storage/cacher/cacher_whitebox_test.go (resolved)

caesarxuchao mentioned this pull request May 17, 2019

Switched to use dynamic shared informer for Garbage Collector. #74440

Merged

k8s-ci-robot assigned yliaog May 17, 2019

k8s-ci-robot merged commit 0f8009b into kubernetes:master May 18, 2019

liggitt moved this from Required for GA, in progress to Complete in Custom Resource Definitions May 20, 2019

liggitt deleted the crd-watch branch May 20, 2019 13:41

k8s-ci-robot added a commit that referenced this pull request May 23, 2019

Merge pull request #78036 from liggitt/automated-cherry-pick-of-#7802…

e09f5c4

…9-upstream-release-1.12 Automated cherry pick of #78029: Terminate watchers when watch cache is destroyed

k8s-ci-robot added a commit that referenced this pull request May 31, 2019

Merge pull request #78034 from liggitt/automated-cherry-pick-of-#7802…

9150633

…9-upstream-release-1.14 Automated cherry pick of #78029: Terminate watchers when watch cache is destroyed

k8s-ci-robot added a commit that referenced this pull request May 31, 2019

Merge pull request #78035 from liggitt/automated-cherry-pick-of-#7802…

578e284

…9-upstream-release-1.13 Automated cherry pick of #78029: Terminate watchers when watch cache is destroyed

liggitt mentioned this pull request Jun 4, 2019

CSINodeInfo lister not syncing existing objects #71052

Closed

This was referenced Jun 24, 2019

Backport 77816 78029 openshift/origin#23248

Merged

Bug 1723869: Terminate custom resource watches when storage is destroyed openshift/origin#23267

Merged

liggitt mentioned this pull request Feb 26, 2020

Watch channel does not get closed ever kubernetes/client-go#755

Closed

l1b0k mentioned this pull request Nov 17, 2022

updating CRD causes connection loss with active watches of custom resources #113966

Open
