WIP - introduce ratcheting validation mechanism #64907

liggitt · 2018-06-08T06:59:18Z

What this PR does / why we need it:

Introduces a ratcheting validation mechanism, as described in #64841 (comment)

Tightening validation on existing data cannot be done in a way that prevents existing stored objects from being updated.

This PR introduces the following mechanism:

* identify granular additional validations (usually to fix fatal errors that always result in persisted, yet broken objects)
* apply those validations on create to prevent new broken objects from being created
* apply those validations on update only if the existing object also passes the validation
  - this prevents valid objects from becoming invalid
  - this avoids breaking the ability to update existing invalid objects
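
The create/update rule above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `podSpec`, `check`, `validFlexDriver`, `validateOnCreate`, and `validateOnUpdate` are hypothetical names, and the "no spaces in the driver name" rule is an invented stand-in for a real tightening check.

```go
package main

import (
	"fmt"
	"strings"
)

// podSpec is a pared-down, hypothetical stand-in for the real pod spec type.
type podSpec struct {
	FlexDriver string
}

// check is one granular tightening validation; it returns the problems it finds.
type check func(spec *podSpec) []string

// validFlexDriver is an invented rule for illustration: no spaces in the driver name.
func validFlexDriver(spec *podSpec) []string {
	if strings.Contains(spec.FlexDriver, " ") {
		return []string{"flexVolume driver name is invalid"}
	}
	return nil
}

// validateOnCreate applies every tightening check to the new object.
func validateOnCreate(newSpec *podSpec, checks []check) []string {
	var errs []string
	for _, c := range checks {
		errs = append(errs, c(newSpec)...)
	}
	return errs
}

// validateOnUpdate applies a check to the new object only if the old object passes it.
func validateOnUpdate(newSpec, oldSpec *podSpec, checks []check) []string {
	var errs []string
	for _, c := range checks {
		if len(c(oldSpec)) > 0 {
			continue // old object is already broken this way: do not block the update
		}
		errs = append(errs, c(newSpec)...)
	}
	return errs
}

func main() {
	checks := []check{validFlexDriver}
	good := &podSpec{FlexDriver: "vendor/driver"}
	bad := &podSpec{FlexDriver: "bad name"}

	fmt.Println(validateOnCreate(bad, checks))       // rejected: new objects must be valid
	fmt.Println(validateOnUpdate(bad, good, checks)) // rejected: a valid object may not become invalid
	fmt.Println(validateOnUpdate(bad, bad, checks))  // accepted: an already-invalid object stays updatable
}
```

Each check is evaluated independently against the old object, so an old object that fails one check is still held to all the others.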

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #64841
Fixes #57510 - flex driver name
Fixes #58477 - duplicate envvar names
Fixes #54567, #64011 - invalid memory quantities (xref #63426)
xref #52936 - storage medium
xref #60934 - duplicate pvc claimName

Special notes for your reviewer:
/assign @lavalamp @deads2k

Release note:

Validation has tightened when creating new pods (or new objects containing pod templates):
* flex volume driver names must now be valid
* emptyDir medium must now be valid
* duplicate envvar names are not allowed
* duplicate volumes referencing the same PVC claimName are not allowed
* memory limits and requests must be integer values

liggitt · 2018-06-08T07:01:00Z

@kubernetes/sig-api-machinery-pr-reviews

liggitt · 2018-06-08T07:02:30Z

would like feedback on the mechanism first, then will tag various teams on the specific validation commits (those were mostly to illustrate how the mechanism could be used)

liggitt · 2018-06-08T15:20:18Z

/retest

lavalamp · 2018-06-08T20:19:53Z

pkg/apis/core/validation/ratcheting.go

+		spec := data.(*core.PodSpec)
+		for i, c := range spec.InitContainers {
+			names := sets.NewString()
+			for j, e := range c.Env {


nit: seems like these loops could be a subroutine.

liggitt · 2018-07-04

cleaned up the loops
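
For readers without the diff at hand, the kind of subroutine suggested in the nit might look like the sketch below. It is an assumption about the shape of the cleanup, not the code that landed in the branch: `envVar`, `container`, `duplicateEnvNames`, and `checkContainers` are hypothetical, and a plain map stands in for `sets.NewString()` to keep the example self-contained.

```go
package main

import "fmt"

// envVar and container are pared-down stand-ins for the core API types.
type envVar struct{ Name string }

type container struct {
	Name string
	Env  []envVar
}

// duplicateEnvNames returns each name that occurs more than once in env,
// reported once, in the order the first repetition is seen.
func duplicateEnvNames(env []envVar) []string {
	seen := map[string]int{}
	var dups []string
	for _, e := range env {
		seen[e.Name]++
		if seen[e.Name] == 2 {
			dups = append(dups, e.Name)
		}
	}
	return dups
}

// checkContainers runs the same helper over any container list,
// so init containers and regular containers share one code path.
func checkContainers(kind string, cs []container) []string {
	var errs []string
	for i, c := range cs {
		for _, name := range duplicateEnvNames(c.Env) {
			errs = append(errs, fmt.Sprintf("%s[%d].env: duplicate name %q", kind, i, name))
		}
	}
	return errs
}

func main() {
	initCs := []container{{Name: "init", Env: []envVar{{"A"}, {"A"}}}}
	cs := []container{{Name: "app", Env: []envVar{{"A"}, {"B"}}}}

	errs := append(checkContainers("initContainers", initCs), checkContainers("containers", cs)...)
	for _, e := range errs {
		fmt.Println(e)
	}
}
```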

lavalamp · 2018-06-08T20:22:38Z

Looks good overall.

I do wish there were less setup code: folks have to remember to call this everywhere there's a pod template type. But it is fairly straightforward, maybe that is better than a complicated automated system. We can always automate when we find people making errors that the automation would catch.

lavalamp · 2018-06-08T20:23:37Z

We should think about how to solve this problem for CRD validation, too, when we add it. Perhaps there all validations should be ratcheting.

sttts · 2018-06-11T08:33:48Z

> We should think about how to solve this problem for CRD validation, too, when we add it. Perhaps there all validations should be ratcheting.

This is getting interesting. For native types we only have positive validation predicates A(old,new) and B(old,new) and C(old,new). So we can easily declare some as ratcheting (call it A', B', C'), making the complete validation predicate A'(old,new) and B'(old,new) and C'(old,new) weaker than the original. With CRDs and their OpenAPI based validation, we don't have positivity, e.g. ((not A(old,new)) or B(old,new)) and C(old,new) is expressible. Making A ratcheting as A' gives a stronger predicate than before, not what we want. So ratcheting has to be restricted to positive predicates.
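
A small worked example of the positivity point, under simplifying assumptions: each predicate is reduced to a boolean "passes on old" / "passes on new", and `ratchet`, `original`, and `withRatchetedA` are hypothetical helper names. Ratcheting A makes A weaker, but under a negation that makes the overall predicate stronger, so an update that used to be accepted is now rejected.

```go
package main

import "fmt"

// ratchet enforces a check on the new object only if the old object passed it.
// The result is never stricter than the plain check on the new object.
func ratchet(passOld, passNew bool) bool {
	return !passOld || passNew
}

// original is ((not A) or B) and C, evaluated on the new object.
func original(aNew, bNew, cNew bool) bool {
	return (!aNew || bNew) && cNew
}

// withRatchetedA replaces A by its ratcheted form A' inside the same formula.
func withRatchetedA(aOld, aNew, bNew, cNew bool) bool {
	return (!ratchet(aOld, aNew) || bNew) && cNew
}

func main() {
	// Hypothetical update: A fails on old and new, B fails on new, C passes on new.
	aOld, aNew, bNew, cNew := false, false, false, true

	fmt.Println("original accepts:", original(aNew, bNew, cNew))               // true
	fmt.Println("ratcheted accepts:", withRatchetedA(aOld, aNew, bNew, cNew)) // false
}
```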

Another speciality for CRDs: for defaulting we have to apply the OpenAPI schema before we even have an old object to check against. We could get away with tagging certain OpenAPI properties as ratcheting, dropping them on (coercing) schema validation for defaulting, and only applying them on strategy validation.

But even this way, it is not obvious what a ratcheting validation with an old and a new object would look like. The validation algorithm and the whole OpenAPI schema semantics do not know the concept of validating differences between two objects on update. If we come up with such semantics though, this would also be very interesting for formulating "read-only fields", which cannot be expressed today in OpenAPI at all.

lavalamp · 2018-06-11T16:32:41Z

@sttts I assumed we would be writing a validator that takes both old and new objects, and only enforces a validation check on the new object if the old one passes (or is missing the relevant field). I am talking about value validation and not schema validation, of course. And yes, we will definitely be implementing read-only/immutable fields, for built-ins and CRs. Take a look at some of the stuff @apelisse and @seans3 have been doing in kube-openapi if you haven't seen it already.

liggitt · 2018-06-11T17:32:24Z

> @sttts I assumed we would be writing a validator that takes both old and new objects, and only enforces a validation check on the new object if the old one passes (or is missing the relevant field). I am talking about value validation and not schema validation, of course.

One of the reasons the approach in this PR works well is that the tightening validations are separable. Take the following scenario:

I have base validation V, and tightening validations A, B, C.
I have an old object that passes V, A, B, and fails C

Any update would also be required to pass V, A, B. If V, A, B, and C could not be checked individually, the old object failing C means we'd have to skip all validation on the updated object, which would open us up to objects with much bigger validation problems.
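
To make the scenario concrete, here is a rough sketch contrasting separable checks with a single inseparable unit. Everything in it is hypothetical: `results` records which named checks an object passes, and `granular` and `monolithic` are illustrative helpers rather than anything in this PR.

```go
package main

import "fmt"

// results records which named checks an object passes (missing means failing).
type results map[string]bool

var (
	base       = []string{"V"}           // always enforced
	tightening = []string{"A", "B", "C"} // ratcheted individually
)

// granular enforces base checks always, and each tightening check
// only when the old object passes that same check.
func granular(oldObj, newObj results) []string {
	var failed []string
	for _, name := range base {
		if !newObj[name] {
			failed = append(failed, name)
		}
	}
	for _, name := range tightening {
		if oldObj[name] && !newObj[name] {
			failed = append(failed, name)
		}
	}
	return failed
}

// monolithic treats V, A, B, and C as one inseparable unit: if the old
// object fails anywhere, the whole unit has to be skipped on update.
func monolithic(oldObj, newObj results) []string {
	all := append(append([]string{}, base...), tightening...)
	for _, name := range all {
		if !oldObj[name] {
			return nil
		}
	}
	var failed []string
	for _, name := range all {
		if !newObj[name] {
			failed = append(failed, name)
		}
	}
	return failed
}

func main() {
	oldObj := results{"V": true, "A": true, "B": true} // fails only C
	newObj := results{"B": true}                       // now also fails V and A

	fmt.Println("granular rejects on:", granular(oldObj, newObj))
	fmt.Println("monolithic rejects on:", monolithic(oldObj, newObj))
}
```

With separable checks the update is still rejected for V and A, while the inseparable version lets it through because the old object's failure on C disables everything.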

@sttts is the declarative validation able to be broken down to the level of individual checks?

mbohlool · 2018-06-11T17:44:49Z

/sub

sttts · 2018-06-11T17:59:03Z

> @sttts is the declarative validation able to be broken down to the level of individual checks?

No, in general it is not. But one knows which parts of a schema can be "broken apart" in the way you describe. So I think we can mark certain sub-schemata as ratcheting and the CRD type validation will check those markers. E.g. in ((not A(old,new)) or B(old,new)) and C(old,new) only B and C are positive and therefore can be ratcheting (or some sub-schema of them).

sttts · 2018-06-11T18:33:14Z

> And yes, we will definitely be implementing read-only/immutable fields, for built-ins and CRs. Take a look at some of the stuff @apelisse and @seans3 have been doing in kube-openapi if you haven't seen it already.

@lavalamp If you have a link, that would be awesome. This sounds interesting. I saw the extension PRs. Do you mean those to express new kube-specific properties?

We can certainly make this work using the restriction to positive sub-schemata (see post above). I am not yet completely convinced, though, that this always gives the semantics we want and that users expect, especially because schema and value validation are not separable: some value sub-schemata are required to select the right branches of AllOf/AnyOf/OneOf quantifiers. This needs some more thought.

yliaog · 2018-06-11T20:06:44Z

/cc @yliaog

bgrant0607 · 2018-07-17T01:42:43Z

Sorry for not noticing this earlier.

I commented on #64841

liggitt · 2018-07-17T01:47:39Z

/hold

bgrant0607 · 2018-07-17T02:02:07Z

@liggitt Is the idea that every release the ratcheting validations would be moved to standard ones, and potentially new ratcheting validations would be introduced?

liggitt · 2018-07-17T02:21:26Z

> Is the idea that every release the ratcheting validations would be moved to standard ones, and potentially new ratcheting validations would be introduced?

No, they couldn't be moved to be standard validations until we could guarantee the API servers would never encounter persisted etcd data that would fail those validations.

tallclair · 2018-07-17T02:40:20Z

/lgtm cancel
See #64841 (comment)

k8s-ci-robot · 2018-07-17T02:40:27Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: liggitt

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~pkg/apis/core/OWNERS~~ [liggitt]
~~pkg/registry/OWNERS~~ [liggitt]
~~staging/src/k8s.io/apimachinery/pkg/OWNERS~~ [liggitt]
~~test/e2e/apps/OWNERS~~ [liggitt]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot · 2018-07-17T21:13:58Z

@liggitt: PR needs rebase.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

erictune · 2018-07-20T22:40:24Z

Hi, sorry to be late to the party, but here is some feedback on this approach:

* Yes, we should have CRD support for ratcheting, and it should ideally be very similar to the built-in support.
* Annotating or modifying the CRD schema to express RatchetingValidations seems to me like it scales poorly.
* Instead of having special RatchetingValidation functions, one could introduce a new API version with stronger validation. Then one can set the desired storage version (as CRDs do) to lazily move over objects, or run a "storage version upgrade script" procedure to eagerly move over objects.
* To address the complaint that we don't want to move pods to v2, we could introduce a "microversion", with validation using both apiVersion and metadata.microversion to compose a tuple that selects which functions/schema are used for validation.
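
One way to picture the "microversion" idea is the sketch below. It is purely illustrative: no such field exists in the API, and `versionKey`, `pod`, `registry`, `validate`, and both validators are hypothetical names. The integer-memory rule is borrowed from the release note only as an example of a stricter check.

```go
package main

import (
	"errors"
	"fmt"
	"math"
)

// versionKey is the (apiVersion, microversion) tuple used to pick validators.
type versionKey struct {
	APIVersion   string
	Microversion int
}

// pod is a pared-down stand-in; Microversion models a hypothetical metadata.microversion.
type pod struct {
	APIVersion   string
	Microversion int
	MemoryBytes  float64 // simplified stand-in for a resource quantity
}

type validator func(p pod) error

func baseValidation(p pod) error {
	if p.MemoryBytes < 0 {
		return errors.New("memory must be non-negative")
	}
	return nil
}

func integerMemory(p pod) error {
	if p.MemoryBytes != math.Trunc(p.MemoryBytes) {
		return errors.New("memory must be an integer number of bytes")
	}
	return nil
}

// registry maps each tuple to the validators that apply to it.
var registry = map[versionKey][]validator{
	{"v1", 0}: {baseValidation},
	{"v1", 1}: {baseValidation, integerMemory},
}

// validate looks up the validators for the object's tuple and runs them in order.
func validate(p pod) error {
	vs, ok := registry[versionKey{p.APIVersion, p.Microversion}]
	if !ok {
		return fmt.Errorf("no validators registered for %s microversion %d", p.APIVersion, p.Microversion)
	}
	for _, v := range vs {
		if err := v(p); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	fmt.Println(validate(pod{"v1", 0, 0.5})) // old microversion: only base validation applies
	fmt.Println(validate(pod{"v1", 1, 0.5})) // new microversion: the stricter check also applies
}
```

The sketch leaves open how an object that fails the stricter check would ever be moved to the newer microversion.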

liggitt · 2018-07-21T14:07:10Z

> Instead of having special RatchetingValidation functions, one could introduce a new API version with stronger validation. Then one can set the desired storage version (as CRDs do) to lazily move over objects, or run a "storage version upgrade script" procedure to eagerly move over objects.

What would the storage migration do with existing objects that would not pass the new stricter validation?

As long as the previous API version is still usable (or data persisted via the previous API version is still able to be encountered), we have to handle round-tripping and updating that object via the new API version. Making an update of an unrelated field (like an annotation or a finalizer) fail for validation reasons is problematic.

bgrant0607 · 2018-07-23T20:29:54Z

Quick note: We don't have versioned validation yet: #8550

CaoShuFeng · 2018-08-01T05:58:50Z

pkg/registry/batch/cronjob/strategy.go

+	allErrs := validation.ValidateCronJob(cronJob)
+	allErrs = append(allErrs, apimachineryvalidation.ValidateRatchetingCreate(
+		&cronJob.Spec.JobTemplate.Spec.Template.Spec,
+		field.NewPath("spec", "template", "spec", "template", "spec"),


nit: field.NewPath("spec", "jobTemplate", "spec", "template", "spec")

mariantalla · 2018-08-21T09:54:17Z

@hoegaarden and I were also wondering what you thought about the following:

* The capability of also applying preprogrammed "migrations" to invalid objects. In other words, the next step after a failed validation could be to apply some logic to automatically reconcile it / make it compatible with the new API version.
* The validators being extracted from the body of the API server. This is perhaps more of a design consideration, but could make it easier to:
  - maintain validator collections in future versions of the API server
  - maybe even allow operators to define their own additional validators and configure the API server with those. Maybe admission webhooks already offer the tools for that use case though. 🤔

dims · 2018-08-23T01:42:58Z

@liggitt should we shoot for v1.12?

liggitt · 2018-08-24T22:25:39Z

> should we shoot for v1.12?

This isn't a top priority for the next week, so no

fejta-bot · 2019-01-27T15:22:39Z

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

smarterclayton · 2019-02-19T01:23:54Z

Fate of this?

lavalamp · 2019-02-19T16:31:34Z

@smarterclayton It's solvable but with much more difficulty-- read from this comment to the end of the issue.

liggitt · 2019-02-19T16:44:51Z

> @smarterclayton It's solvable but with much more difficulty-- read from this comment to the end of the issue.

If we limit ourselves to ratcheting in cases where the created objects were fatally flawed, I don't think we need to continue to accept invalid objects. I mostly closed this because I wasn't actively working on it.

k8s-ci-robot added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Jun 8, 2018

k8s-ci-robot assigned lavalamp Jun 8, 2018

k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Jun 8, 2018

k8s-ci-robot requested review from caesarxuchao and dchen1107 June 8, 2018 06:59

k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 8, 2018

k8s-ci-robot added the sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. label Jun 8, 2018

liggitt assigned deads2k Jun 8, 2018

This was referenced Jun 8, 2018

Validation Tightening is not Generally Possible #64841

Open

add pod template validation for statefulset #62987

Closed

Missing validation in HPA #60511

Open

validate PersistentVolumeClaim in StatefulSet #60635

Closed

liggitt force-pushed the ratcheting-validation branch from ae9aefa to 23262bd Compare June 8, 2018 07:46

liggitt added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Jun 8, 2018

liggitt force-pushed the ratcheting-validation branch from 23262bd to 78b3e95 Compare June 8, 2018 14:00

lavalamp reviewed Jun 8, 2018

k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 16, 2018

k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jul 17, 2018

k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jul 17, 2018

k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jul 17, 2018

liggitt mentioned this pull request Jul 27, 2018

Fix empty string reclaim policy being considered valid in StorageClass #66682

Closed

CaoShuFeng mentioned this pull request Aug 1, 2018

refuse serviceaccount projection volume request when pod has no servceaccount bounded #66698

Merged

CaoShuFeng reviewed Aug 1, 2018

liggitt changed the title ~~introduce ratcheting validation mechanism~~ WIP - introduce ratcheting validation mechanism Oct 29, 2018

k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Oct 29, 2018

k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 27, 2019

MaciekPytel mentioned this pull request Jan 30, 2019

Adding missing validation in HPA #72047

Closed

liggitt closed this Feb 9, 2019

damemi mentioned this pull request Jul 21, 2020

Add label selector value validation #92714

Closed
