Storageversion API for CRDs (using workqueue) #123999

Open
richabanker wants to merge 3 commits into master from sv-single-update

Conversation

richabanker
Contributor

richabanker commented Mar 19, 2024

What type of PR is this?

/kind feature

What this PR does / why we need it:

This PR is based off of #120582.

This is an attempt to drive the StorageVersion API toward beta by addressing the outstanding requirements and concerns.

Which issue(s) this PR fixes:

Fixes kubernetes/enhancements#2339

Special notes for your reviewer:

Does this PR introduce a user-facing change?


Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

- [KEP]: https://github.com/kubernetes/enhancements/tree/master/keps/sig-api-machinery/2339-storageversion-api-for-ha-api-servers
- [Logic explained]: https://docs.google.com/document/d/1-wiczjRXdWcJJPoBsxJkpvTkn3iGbJsSeAgzW4UsapM/edit?tab=t.0#heading=h.6fil3z77boj8

@k8s-ci-robot
Contributor

Adding the "do-not-merge/release-note-label-needed" label because no release-note block was detected, please follow our release note process to remove it.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

k8s-ci-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. kind/feature Categorizes issue or PR as related to a new feature. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Mar 19, 2024
k8s-ci-robot added area/apiserver area/test sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/testing Categorizes an issue or PR as relevant to SIG Testing. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Mar 19, 2024
richabanker changed the title from "Sv single update" to "Storageversion API for CRDs" on Mar 19, 2024
richabanker changed the title from "Storageversion API for CRDs" to "Storageversion API for CRDs (using workqueue)" on Mar 19, 2024
richabanker force-pushed the sv-single-update branch 6 times, most recently from 16d7158 to 36c0eb1, on March 19, 2024 at 23:39
Comment on lines 174 to 177
m.updateStorageVersion(ctx, latestSVUpdateInfo.crd, latestSVUpdateInfo.updateChannels.teardownFinishedCh, latestSVUpdateInfo.updateChannels.processedCh, latestSVUpdateInfo.updateChannels.errCh)
markUpdateAsProcessed(latestSVUpdateInfo)
return true
}
Contributor

Should we re-enqueue? I'm unclear on how the SV update is coordinated when the SV update request times out.

Loosely related: If the CRD update succeeds, and the cluster has only 1 apiserver, and that apiserver crashes after the CRD update but before the SV update, what happens?

Contributor Author

Should we re-enqueue?

That might be a good suggestion. We do retries on just the storageversion update here, but there can be failures outside of this function. For example, if the teardown of the old storage times out for any reason, do we want to try the SV update (along with the teardown) again? Open to suggestions. If we re-enqueue, then for how long after the first failure do we keep doing that?
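
For reference, a minimal sketch of bounded re-enqueueing with the client-go rate-limiting workqueue; names such as maxSVUpdateRetries and handleSVUpdateErr are illustrative and not part of this PR:

package storageversion // illustrative package, not the PR's file

import (
	"fmt"

	utilruntime "k8s.io/apimachinery/pkg/util/runtime"
	"k8s.io/client-go/util/workqueue"
)

// maxSVUpdateRetries is an illustrative cap; the real bound would be a tuning decision.
const maxSVUpdateRetries = 5

// handleSVUpdateErr shows the usual client-go retry pattern: forget on success,
// re-enqueue with exponential backoff on failure, and give up after the cap.
func handleSVUpdateErr(queue workqueue.RateLimitingInterface, key string, err error) {
	if err == nil {
		queue.Forget(key) // success: reset the backoff history for this key
		return
	}
	if queue.NumRequeues(key) < maxSVUpdateRetries {
		queue.AddRateLimited(key) // retry later with backoff
		return
	}
	queue.Forget(key) // stop retrying this key
	utilruntime.HandleError(fmt.Errorf("giving up on storageversion update for %q: %w", key, err))
}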

Contributor Author

If the CRD update succeeds, and the cluster has only 1 apiserver, and that apiserver crashes after the CRD update but before the SV update, what happens?

From my understanding, the old handler and old storage would be torn down. When the apiserver comes up again:

  1. new handler (using new storageversion) is created when a CR request arrives
  2. the new CR request triggers waitForStorageVersionUpdate() which tries to fetch the latest entry from the storageVersionUpdateInfoMap
  3. finds nothing there, and returns nil
  4. CR is served using the new storage, but the storageversion API doesn't reflect that. The storageversion migrator wouldn't trigger the migration unless a new CRD update happens.

Maybe we should change the behavior at step 3. We can queue the SV update here instead of returning nil. Though the SV update might not be processed instantaneously, we will at least have it tracked.
But we should also be careful that we don't queue an SV update for a deleted CRD.

cc @roycaihw to verify my understanding and share his thoughts.
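
A rough sketch of the "queue it, but never for a deleted CRD" idea; the helper name is hypothetical, it reuses the Manager's crdInformer lister and Enqueue shown elsewhere in this PR, and assumes apierrors is k8s.io/apimachinery/pkg/api/errors:

// Sketch only: when no entry is found for the CRD, enqueue an SV update instead
// of returning nil, but skip CRDs that no longer exist.
func (m *Manager) enqueueSVUpdateIfMissing(crdName string) error {
	crd, err := m.crdInformer.Lister().Get(crdName)
	if apierrors.IsNotFound(err) {
		return nil // CRD was deleted; nothing to publish
	}
	if err != nil {
		return err
	}
	m.Enqueue(crd.Name) // tracked now, processed asynchronously by the SV worker
	return nil
}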

Contributor Author

If we do the above, it also means that whenever a CR write arrives after the server is restarted, we will queue the relevant CRD for an SV update regardless of whether its SV was already updated (i.e., before the server crashed).

Member

When a server reboots, it reconciles all the CRDs and updates SVs accordingly. New CR write requests should be blocked until the corresponding SV is up to date.

Contributor Author

I added requeuing for whenever a CRD is not found in the storageversionUpdateInfoMap, to accommodate server bootstrap.

Contributor Author

Note: the requeuing would only happen when a CR write is observed after the server restarts, which means that the SV for a CRD would not be reconciled until a corresponding CR write is made. This doesn't seem very ideal to me. Maybe we should instead have a poststarthook that lists all available CRDs and queues SV updates for every CRD found in the list? @roycaihw @jpbetz

Contributor Author

I have modified the existing poststarthook (which previously only waited for the CRD informer to sync) to also wait until all CRDs are enqueued for SV updates.
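
A minimal sketch of such a hook, assuming the generic apiserver's PostStartHookContext, client-go's cache.WaitForCacheSync, and the Manager fields used elsewhere in this PR; the method name is illustrative:

// Sketch only: wait for the CRD informer to sync, then enqueue every known CRD
// for an SV update so reconciliation does not depend on a CR write arriving.
func (m *Manager) enqueueAllCRDs(ctx genericapiserver.PostStartHookContext) error {
	if !cache.WaitForCacheSync(ctx.StopCh, m.crdInformer.Informer().HasSynced) {
		return fmt.Errorf("CRD informer did not sync")
	}
	crds, err := m.crdInformer.Lister().List(labels.Everything())
	if err != nil {
		return err
	}
	for _, crd := range crds {
		m.Enqueue(crd.Name)
	}
	return nil
}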

}

func (m *Manager) updateLatestStorageVersionUpdateInfo(crdUID types.UID, svUpdateInfo *storageVersionUpdateInfo) {
latestStorageVersionUpdateInfo.Store(crdUID, svUpdateInfo)
Contributor

should entries be deleted from this map? It looks like it only ever grows.

Contributor Author

Yeah, that part is still pending. Entries are made per CRD (UID) and should be deleted only when a CRD is deleted, I think. Are there any other cases where you think we can delete the entry?

Contributor Author

Added functionality to delete entries from the map.
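
For illustration, one place the deletion could be wired up is the CRD informer's delete handler, assuming latestStorageVersionUpdateInfo is a sync.Map keyed by CRD UID as the Store call above suggests (the wiring shown here is a sketch, not the PR's code):

// Sketch only: remove the cached SV-update info when its CRD is deleted.
crdInformer.Informer().AddEventHandler(cache.ResourceEventHandlerFuncs{
	DeleteFunc: func(obj interface{}) {
		if tombstone, ok := obj.(cache.DeletedFinalStateUnknown); ok {
			obj = tombstone.Obj // the informer may hand us a tombstone for missed deletes
		}
		if crd, ok := obj.(*apiextensionsv1.CustomResourceDefinition); ok {
			latestStorageVersionUpdateInfo.Delete(crd.UID)
		}
	},
})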

richabanker force-pushed the sv-single-update branch 2 times, most recently from 61765b0 to 6370889, on March 20, 2024 at 23:53
richabanker force-pushed the sv-single-update branch 2 times, most recently from 44312ba to c0f169e, on April 6, 2024 at 06:42

// alwaysServe is set to true when a CRD's storageversion is published.
// All CR writes for a CRD are allowed if this is set to true.
alwaysServe bool
Member

Shall we use atomic.Bool?

Contributor Author

Done. Thanks for the suggestion!
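
For readers following the thread, the resulting pattern looks roughly like this (a trimmed-down, illustrative crdInfo with hypothetical helper methods; atomic.Bool comes from sync/atomic, Go 1.19+):

// Sketch only: atomic.Bool lets the serving path read the flag and the SV
// worker flip it from different goroutines without a mutex.
type crdInfo struct {
	// alwaysServe is set to true once the CRD's storageversion is published;
	// CR writes for the CRD are allowed while it is true.
	alwaysServe atomic.Bool
}

func (i *crdInfo) markPublished() { i.alwaysServe.Store(true) }

func (i *crdInfo) canServeWrites() bool { return i.alwaysServe.Load() }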

func isAllowedToServe(crdHandler *crdHandler, crdInfo *crdInfo, crd *apiextensionsv1.CustomResourceDefinition, requestInfo *apirequest.RequestInfo, w http.ResponseWriter, req *http.Request) error {
if !utilfeature.DefaultFeatureGate.Enabled(features.StorageVersionAPI) ||
!utilfeature.DefaultFeatureGate.Enabled(features.APIServerIdentity) {
klog.V(4).Infof("not checking if handler is allowed to serve since StorageVersionAPI and/or APIServerIdentity feature are disabled.")
Member

This would produce a lot of logs (one for every CR request) if the verbosity is turned up to 4 or above.

Can we only log once at server startup?

Contributor Author

Removed this logline.

return false, activeTeardownsCount
}

func (m *Manager) AlreadyPublishedLatestSV(crd *apiextensionsv1.CustomResourceDefinition) (bool, error) {
Member

This function is not checking the "latest". It's checking whether the storage version in the given CRD is reflected in the SV. Could you rename the function and document it?

Contributor Author

Renamed to StorageVersionFoundInCacheFor()

@@ -571,6 +664,35 @@ func (r *crdHandler) tearDown(oldInfo *crdInfo) {
// destroy only the main storage. Those for the subresources share cacher and etcd clients.
storage.CustomResource.DestroyFunc()
}

r.updateActiveTeardownsCount(oldInfo.name, -1)
Member

Consider moving this to the top of the function and using defer.

Contributor Author

Done
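
The suggested shape, sketched with the names already visible in this diff (the teardown body is elided; this is an illustration, not the final code):

func (r *crdHandler) tearDown(oldInfo *crdInfo) {
	// Registering the decrement up front guarantees it runs on every return
	// path, even if an early return is added to the teardown later.
	defer r.updateActiveTeardownsCount(oldInfo.name, -1)

	// ... existing teardown of the handler and storage ...
}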

crdName := key.(string)
crd, err := m.crdInformer.Lister().Get(crdName)
if err != nil {
klog.V(4).Infof("crd: %s not found in CRD cache, abandoning storageversion update", crdName)
Member

Will there be errors other than NotFound? Can we distinguish them? If the CRD doesn't exist, we don't need to write the SV.

Contributor Author

Added separate handling for the NotFound error, i.e. we do not requeue the SV update when that happens.
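
A sketch of that split; updateStorageVersionFor is a hypothetical stand-in for the existing update logic, and apierrors is k8s.io/apimachinery/pkg/api/errors:

// Sketch only: a deleted CRD is terminal (drop the key); any other lister error
// is treated as transient so the caller can requeue with backoff.
func (m *Manager) syncOne(crdName string) (requeue bool, err error) {
	crd, err := m.crdInformer.Lister().Get(crdName)
	if apierrors.IsNotFound(err) {
		return false, nil // CRD no longer exists; no storageversion to publish
	}
	if err != nil {
		return true, err // transient error; retry later
	}
	if err := m.updateStorageVersionFor(crd); err != nil { // hypothetical helper
		return true, err
	}
	return false, nil
}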

richabanker force-pushed the sv-single-update branch 2 times, most recently from 745f498 to 94e9a45, on April 10, 2024 at 01:35
}

// TODO: indefinitely requeue on error?
m.Enqueue(crd.Name)
Contributor

This smells like a hot loop. What prevents this from immediately de-queuing on the m.queue.Get()? Perhaps structuring it with a separate syncHandler, like our example in https://github.com/kubernetes/community/blob/master/contributors/devel/sig-api-machinery/controllers.md#rough-structure, will help.
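
For comparison, the rough structure from that doc looks like this (illustrative names; the key point is that a failing key only re-enters the queue through the rate limiter, so it backs off instead of spinning):

func (m *Manager) runWorker() {
	for m.processNextWorkItem() {
	}
}

func (m *Manager) processNextWorkItem() bool {
	key, quit := m.queue.Get()
	if quit {
		return false
	}
	defer m.queue.Done(key)

	if err := m.syncHandler(key.(string)); err != nil {
		m.queue.AddRateLimited(key) // back off rather than retrying immediately
		return true
	}
	m.queue.Forget(key)
	return true
}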


// alwaysServe is set to true when a CRD's storageversion is published.
// All CR writes for a CRD are allowed if this is set to true.
alwaysServe atomic.Bool
Contributor

nit: is this allowMutationRequests?

err := apierrors.NewMethodNotSupported(schema.GroupResource{Group: requestInfo.APIGroup, Resource: requestInfo.Resource}, requestInfo.Verb)
err.ErrStatus.Message = fmt.Sprintf("%v not allowed while custom resource definition is terminating", requestInfo.Verb)
responsewriters.ErrorNegotiated(err, Codecs, schema.GroupVersion{Group: requestInfo.APIGroup, Version: requestInfo.APIVersion}, w, req)
return nil
}

if err := isAllowedToServe(r, crdInfo, crd, requestInfo, w, req); err != nil {
return nil
Contributor

What writes the response to the client?

Contributor

What writes the response to the client?

Oh, having an isFoo function write a response is unexpected. Can it be refactored to make it practical to write the response here?
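
One possible shape for that refactor, sketched under the assumption that isAllowedToServe is changed to return an error without touching the ResponseWriter; ErrorNegotiated, Codecs, and the GroupVersion construction are the same calls this handler already uses:

if err := isAllowedToServe(r, crdInfo, crd, requestInfo); err != nil {
	// The caller owns the response: encode the error for this group/version.
	responsewriters.ErrorNegotiated(
		apierrors.NewServiceUnavailable(err.Error()),
		Codecs,
		schema.GroupVersion{Group: requestInfo.APIGroup, Version: requestInfo.APIVersion},
		w, req,
	)
	return nil
}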

ok, err := crdHandler.storageVersionManager.StorageVersionFoundInCacheFor(crd)
if err != nil {
klog.V(4).Infof("handler not allowed to serve this request, err: %v", err)
err = apierrors.NewServiceUnavailable(err.Error())
Contributor

Seems like a 429 would be more seamless.
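
A sketch of that change; apierrors.NewTooManyRequests produces a 429 with a Retry-After hint (the message and one-second retry interval here are illustrative):

err = apierrors.NewTooManyRequests(
	fmt.Sprintf("storageversion for %s is not yet up to date; retry the request", crd.Name),
	1, // Retry-After in seconds
)
responsewriters.ErrorNegotiated(err, Codecs,
	schema.GroupVersion{Group: requestInfo.APIGroup, Version: requestInfo.APIVersion}, w, req)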

return nil
}

ok, err := crdHandler.storageVersionManager.StorageVersionFoundInCacheFor(crd)
Contributor

Given an arbitrarily latent list/watch for StorageVersions, how does this prevent serving in a case of

  1. steady state CRD storing v1
  2. steady state StorageVersion encoding as v1
  3. update CRD to storing v2
  4. update StorageVersion encoding as v2, BUT watch event is delayed by five minutes
  5. update CRD to storing v1
  6. CR create arrives and is stored as v1, but StorageVersion indicates v2.

Member

Good catch. I think this is not handled today.

Contributor Author

Here's my understanding of this case:

  1. SV was written to v1 correctly, SV cache reflects that.
  2. CRD updated to v2, SV updated to v2, but SV cache is still at v1.
  3. CRD updated to v1 again; a CR request is received, finds the CRD's SV to match the SV cache (v1), and stores in v1.
  4. the SV worker, after step 3, will find that the CRD cache is at v1 and the SV cache is at v1, so it will not process the v1 update again. Thus the SV will be stuck at v2.

So to confirm, is the concern that

  1. the storageversion API will show v2 which is wrong since it should show v1?
  2. CR was allowed to be written in v1 before the SV was updated to v1?

Contributor

So to confirm, is the concern that

1. the storageversion API will show v2 which is wrong since it should show v1?
2. CR was allowed to be written in v1 before the SV was updated to v1?

Well, a fix could come in either direction, but I guess the more specific problem is that the CR was allowed to be written in v1 before the storageversion was updated to v1.

Member

I had a proposal to address this issue (#124543 (review), doc). @deads2k, interested to hear your opinion on it.

func (r *crdHandler) tearDown(oldInfo *crdInfo) {
// TODO: Do better than multiple bool params.
func (r *crdHandler) tearDownAndQueueSVUpdate(oldInfo *crdInfo, queueSVUpdate bool, deleteTeardownCountInfo bool) {
r.updateActiveTeardownsCount(oldInfo.name, 1)
Contributor

Why is this update done inside of a gofunc instead of under the lock?

richabanker force-pushed the sv-single-update branch 4 times, most recently from a3253d7 to 9f7d63b, on April 19, 2024 at 00:50
@richabanker
Contributor Author

richabanker commented Apr 23, 2024

Pausing efforts on this for now. The approach I have followed in this PR is summarized in this doc. Hope it benefits anyone who picks this up at a later point!

k8s-ci-robot added the needs-rebase label (Indicates a PR cannot be merged because it has merge conflicts with HEAD.) on Apr 23, 2024
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: richabanker
Once this PR has been reviewed and has the lgtm label, please assign deads2k for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot removed the needs-rebase label (Indicates a PR cannot be merged because it has merge conflicts with HEAD.) on Apr 26, 2024
richabanker force-pushed the sv-single-update branch 2 times, most recently from 9c5aa95 to 9aaaa67, on April 26, 2024 at 02:03
…st SV updates async

Signed-off-by: Andrew Sy Kim <andrewsy@google.com>
Co-authored-by: Haowei Cai (Roy) <haoweic@google.com>
@k8s-ci-robot
Contributor

@richabanker: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name | Commit | Details | Required | Rerun command
pull-kubernetes-typecheck | 729c800 | link | true | /test pull-kubernetes-typecheck
pull-kubernetes-verify-lint | 729c800 | link | true | /test pull-kubernetes-verify-lint
pull-kubernetes-integration | 729c800 | link | true | /test pull-kubernetes-integration

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@richabanker
Contributor Author

richabanker commented Apr 26, 2024

Also posting a summary of all the approaches we have tried so far for this issue here.

Labels
area/apiserver area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. kind/feature Categorizes issue or PR as related to a new feature. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Successfully merging this pull request may close these issues.

StorageVersion API for HA API servers
7 participants