Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update AnyVolumeDataSource feature gate to beta #108736

Merged
merged 1 commit into from Mar 30, 2022

Conversation

bswartz
Copy link
Contributor

@bswartz bswartz commented Mar 16, 2022

What type of PR is this?

/kind feature

What this PR does / why we need it:

Default AnyVolumeDataSource feature gate to enabled, as the feature is moving to beta.

Which issue(s) this PR fixes:

Fixes #99955

Special notes for your reviewer:

Does this PR introduce a user-facing change?

The AnyVolumeDataSource feature is now beta, and the feature gate is enabled by default. In order to provide user feedback on PVCs with data sources, deployers must install the VolumePopulators CRD and the data-source-validator controller.

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:

[KEP]: 1495

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. kind/feature Categorizes issue or PR as related to a new feature. size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Mar 16, 2022
@k8s-ci-robot
Copy link
Contributor

@bswartz: This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-priority Indicates a PR lacks a `priority/foo` label and requires one. label Mar 16, 2022
@mehabhalodiya
Copy link

/milestone 1.24

@k8s-ci-robot
Copy link
Contributor

@mehabhalodiya: You must be a member of the kubernetes/milestone-maintainers GitHub team to set the milestone. If you believe you should be able to issue the /milestone command, please contact your and have them propose you as an additional delegate for this responsibility.

In response to this:

/milestone 1.24

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@xing-yang
Copy link
Contributor

/sig storage

@k8s-ci-robot k8s-ci-robot added sig/storage Categorizes an issue or PR as relevant to SIG Storage. and removed do-not-merge/needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels Mar 23, 2022
@xing-yang
Copy link
Contributor

This needs to be updated to "beta" as well:
https://github.com/kubernetes/kubernetes/blob/v1.24.0-alpha.4/pkg/apis/core/types.go#L475

@bswartz bswartz force-pushed the any-volume-data-source-beta branch from 9ab237f to a6b893d Compare March 23, 2022 14:40
@bswartz
Copy link
Contributor Author

bswartz commented Mar 23, 2022

This needs to be updated to "beta" as well: https://github.com/kubernetes/kubernetes/blob/v1.24.0-alpha.4/pkg/apis/core/types.go#L475

This is updated now

@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. area/code-generation kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/apps Categorizes an issue or PR as relevant to SIG Apps. and removed size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. labels Mar 23, 2022
@k8s-triage-robot
Copy link

This PR may require API review.

If so, when the changes are ready, complete the pre-review checklist and request an API review.

Status of requested reviews is tracked in the API Review project.

@bswartz
Copy link
Contributor Author

bswartz commented Mar 23, 2022

/retest-required

@bswartz
Copy link
Contributor Author

bswartz commented Mar 24, 2022

/label api-review

@fedebongio
Copy link
Contributor

/assign @kubernetes/api-reviewers
/remove-sig api-machinery

@k8s-ci-robot k8s-ci-robot removed the sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. label Mar 24, 2022
@liggitt
Copy link
Member

liggitt commented Mar 28, 2022

API comment, feature gate change, and KEP graduation references lgtm. Routing to thockin for the question on default-on-read behavior

I saw an earlier reference to an issue with default-on-read for Services, and I wasn't able to see how the dataSourceRef field could have the same problems, but I'd be grateful to have another set of eyes on it.

tl;dr is that the decorator doesn't run on the old object read from etcd when doing an update, so if your PrepareForUpdate or update validation would be unhappy with the new object being defaulted and the old object not being defaulted, you'll need to default the old object explicitly in those paths

@bswartz
Copy link
Contributor Author

bswartz commented Mar 28, 2022

tl;dr is that the decorator doesn't run on the old object read from etcd when doing an update, so if your PrepareForUpdate or update validation would be unhappy with the new object being defaulted and the old object not being defaulted, you'll need to default the old object explicitly in those paths

Is there an easy workaround that I can copy? I'll take another look at how this was fixed for the Service object to see if I missed something.

@liggitt
Copy link
Member

liggitt commented Mar 28, 2022

it looks like persistentvolumeclaimStrategy#PrepareForUpdate is already calling the same default-on-read normalization as defaultOnReadPvc on the new PVC:

// PrepareForUpdate sets the Status field which is not allowed to be set by end users on update
func (persistentvolumeclaimStrategy) PrepareForUpdate(ctx context.Context, obj, old runtime.Object) {
newPvc := obj.(*api.PersistentVolumeClaim)
oldPvc := old.(*api.PersistentVolumeClaim)
newPvc.Status = oldPvc.Status
pvcutil.DropDisabledFields(&newPvc.Spec)
// We need to use similar logic to PrepareForCreate here both to preserve backwards
// compatibility with the old behavior (ignoring of garbage dataSources at both create
// and update time) and also for compatibility with older clients, that might omit
// the dataSourceRef field which we filled in automatically, so we have to fill it
// in again here.
pvcutil.EnforceDataSourceBackwardsCompatibility(&newPvc.Spec, &oldPvc.Spec)
pvcutil.NormalizeDataSources(&newPvc.Spec)
}

But it's not touching the old PVC. I don't know if validation is counting on the old PVC having been normalized by the default-on-read path or not, but it is not currently being normalized.

@bswartz
Copy link
Contributor Author

bswartz commented Mar 29, 2022

it looks like persistentvolumeclaimStrategy#PrepareForUpdate is already calling the same default-on-read normalization as defaultOnReadPvc on the new PVC:
But it's not touching the old PVC. I don't know if validation is counting on the old PVC having been normalized by the default-on-read path or not, but it is not currently being normalized.

I understood that that function treated the "oldPvc" as a read-only object and that newPvc was the one to be updated. How could anyone be interested in the contents of the oldPvc object after calling PrepareForUpdate? Maybe I'm not familiar enough the the API server guts, but I don't see any benefit to also fixing the oldPvc in that function (if it isn't already accurate)

@liggitt
Copy link
Member

liggitt commented Mar 29, 2022

Some update validation disallows specific transitions or enforces immutable fields based on comparing the old and new objects (I don't know if PVC validation does, I didn't check). If the new object had normalization applied and the old object didn't, the user could get an error saying a field was not allowed to be changed in some way that would make no sense to them (since from their perspective they weren't changing the field)

@bswartz
Copy link
Contributor Author

bswartz commented Mar 29, 2022

I see what you're saying. You're not worried about bogus output from the function but the possibility that a bogus input could cause it to misbehave. I'll look for that specifically.

@bswartz
Copy link
Contributor Author

bswartz commented Mar 29, 2022

Okay I re-reviewed the code and this is a case that we considered extensively when we first designed this (with a huge amount of help from @thockin). If you follow the PrepareForUpdate() function logic, there are several function calls but they're all very small. There's exactly one place where the contents of the old PVC is considered, and it's written to look at both the old field and the new field, such that it makes the correct decision whether the old object is upgraded or not.

if oldPVCSpec.DataSource != nil || oldPVCSpec.DataSourceRef != nil {

The huge comment for EnforceDataSourceBackwardsCompatibility() tries to explain exactly what's going on here so that future versions of ourselves can remember why this code is the way it is. I'm confident there are no issues if someone tries to update an un-upgraded object.

@liggitt
Copy link
Member

liggitt commented Mar 29, 2022

The last commit in https://github.com/liggitt/kubernetes/commits/any-volume-data-source-beta adds an integration test that demonstrates the problems if someone tries to update a non-upgraded PVC object. All three update methods fail the PVC spec immutability check when the feature gate is enabled:

  1. write original content that just sets dataSource
  2. send a no-op patch of {} that lets the server compute the updated object internally based on the persisted object
  3. get from the server and put the content without changing it

All three update methods work with the AnyVolumeDataSource feature gate disabled.

        upgrade_test.go:90: PVC stored in etcd {"kind":"PersistentVolumeClaim","apiVersion":"v1","metadata":{"name":"test-old-pvc","namespace":"old-pvc-ns","uid":"08675309-9376-9376-9376-086753099999","creationTimestamp":"2022-03-29T13:36:09Z"},"spec":{"accessModes":["ReadWriteOnce"],"resources":{"requests":{"storage":"10G"}},"dataSource":{"apiGroup":null,"kind":"PersistentVolumeClaim","name":"foo"}},"status":{}}
        upgrade_test.go:96: write of original content failed: PersistentVolumeClaim "test-old-pvc" is invalid: spec: Forbidden: spec is immutable after creation except resources.requests for bound claims
              core.PersistentVolumeClaimSpec{
              	... // 5 identical fields
              	VolumeMode:    &"Filesystem",
              	DataSource:    &{Kind: "PersistentVolumeClaim", Name: "foo"},
            - 	DataSourceRef: nil,
            + 	DataSourceRef: &core.TypedLocalObjectReference{Kind: "PersistentVolumeClaim", Name: "foo"},
              }
        upgrade_test.go:104: no-op patch failed: PersistentVolumeClaim "test-old-pvc" is invalid: spec: Forbidden: spec is immutable after creation except resources.requests for bound claims
              core.PersistentVolumeClaimSpec{
              	... // 5 identical fields
              	VolumeMode:    &"Filesystem",
              	DataSource:    &{Kind: "PersistentVolumeClaim", Name: "foo"},
            - 	DataSourceRef: nil,
            + 	DataSourceRef: &core.TypedLocalObjectReference{Kind: "PersistentVolumeClaim", Name: "foo"},
              }
        upgrade_test.go:116: no-op get/put failed: PersistentVolumeClaim "test-old-pvc" is invalid: spec: Forbidden: spec is immutable after creation except resources.requests for bound claims
              core.PersistentVolumeClaimSpec{
              	... // 5 identical fields
              	VolumeMode:    &"Filesystem",
              	DataSource:    &{Kind: "PersistentVolumeClaim", Name: "foo"},
            - 	DataSourceRef: nil,
            + 	DataSourceRef: &core.TypedLocalObjectReference{Kind: "PersistentVolumeClaim", Name: "foo"},
              }

@bswartz bswartz force-pushed the any-volume-data-source-beta branch from a6b893d to 856b3ae Compare March 29, 2022 15:55
@k8s-ci-robot k8s-ci-robot added do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. area/test sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. sig/testing Categorizes an issue or PR as relevant to SIG Testing. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Mar 29, 2022
@bswartz bswartz force-pushed the any-volume-data-source-beta branch from 856b3ae to cdefa1a Compare March 29, 2022 15:58
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/invalid-commit-message Indicates that a PR should not merge because it has an invalid commit message. label Mar 29, 2022
Default to enabled
Fix validation of null-updates/patches when the "old" PVC was persisted by
an older version. Add upgrade integration tests written by liggitt.
@bswartz bswartz force-pushed the any-volume-data-source-beta branch from cdefa1a to 08948ca Compare March 29, 2022 17:39
Copy link
Member

@thockin thockin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like the right fix. It's an unfortunate wart, but .

@@ -111,6 +112,10 @@ func (persistentvolumeclaimStrategy) PrepareForUpdate(ctx context.Context, obj,
// in again here.
pvcutil.EnforceDataSourceBackwardsCompatibility(&newPvc.Spec, &oldPvc.Spec)
pvcutil.NormalizeDataSources(&newPvc.Spec)

// We also normalize the data source fields of the old PVC, so that objects saved
// from an earlier version will pass validation.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This could also document that it should be called AFTER EnforceDataSourceBackwardsCompatibility (I went and read that code, and I am pretty sure that must run first :)

I don't want anyone "fixing" it later.

@thockin
Copy link
Member

thockin commented Mar 29, 2022

Thanks!

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 29, 2022
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bswartz, thockin

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 29, 2022
@cici37
Copy link
Contributor

cici37 commented Mar 29, 2022

/remove-sig api-machinery

@k8s-ci-robot k8s-ci-robot removed the sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. label Mar 29, 2022
@k8s-ci-robot k8s-ci-robot merged commit 9fe98d8 into kubernetes:master Mar 30, 2022
@k8s-ci-robot k8s-ci-robot added this to the v1.24 milestone Mar 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-review Categorizes an issue or PR as actively needing an API review. approved Indicates a PR has been approved by an approver from all required OWNERS files. area/code-generation area/test cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API kind/feature Categorizes issue or PR as related to a new feature. lgtm "Looks good to me", indicates that a PR is ready to be merged. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/apps Categorizes an issue or PR as relevant to SIG Apps. sig/storage Categorizes an issue or PR as relevant to SIG Storage. sig/testing Categorizes an issue or PR as relevant to SIG Testing. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Update AnyVolumeDataSource feature gate to beta
10 participants