New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validate that cronjob names are 52 characters or less #52733

Merged
merged 1 commit into from Sep 23, 2017

Conversation

@julia-stripe
Contributor

julia-stripe commented Sep 19, 2017

What this PR does / why we need it:

Right now when you create a cronjob with a name longer than 52 characters, creation will succeed but the cronjob controller will create Job objects with names longer than 63 characters. Jobs cannot have names longer than 63 characters, so the cronjob will never be able to run any jobs.

Which issue this PR fixes : Fixes #50850

Special notes for your reviewer:

Release note:

Validate that cronjob names are 52 characters or less
@k8s-ci-robot

This comment has been minimized.

Show comment
Hide comment
@k8s-ci-robot

k8s-ci-robot Sep 19, 2017

Contributor

Hi @julia-stripe. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

Contributor

k8s-ci-robot commented Sep 19, 2017

Hi @julia-stripe. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@julia-stripe

This comment has been minimized.

Show comment
Hide comment
@julia-stripe
Contributor

julia-stripe commented Sep 19, 2017

@soltysh

This comment has been minimized.

Show comment
Hide comment
@soltysh

soltysh Sep 19, 2017

Contributor

/ok-to-test
/assign

Contributor

soltysh commented Sep 19, 2017

/ok-to-test
/assign

Show outdated Hide outdated pkg/apis/batch/validation/validation.go Outdated
Show outdated Hide outdated pkg/apis/batch/validation/validation.go Outdated
@julia-stripe

This comment has been minimized.

Show comment
Hide comment
@julia-stripe

julia-stripe Sep 19, 2017

Contributor

@soltysh thanks, made those changes! take another look?

Contributor

julia-stripe commented Sep 19, 2017

@soltysh thanks, made those changes! take another look?

@julia-stripe

This comment has been minimized.

Show comment
Hide comment
@julia-stripe

julia-stripe Sep 19, 2017

Contributor

/retest

Contributor

julia-stripe commented Sep 19, 2017

/retest

@soltysh

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm label Sep 19, 2017

@soltysh

This comment has been minimized.

Show comment
Hide comment
@soltysh

soltysh Sep 19, 2017

Contributor

@julia-stripe thank you!

Contributor

soltysh commented Sep 19, 2017

@julia-stripe thank you!

@k8s-merge-robot k8s-merge-robot removed the lgtm label Sep 19, 2017

@julia-stripe

This comment has been minimized.

Show comment
Hide comment
@julia-stripe

julia-stripe Sep 19, 2017

Contributor

It changed because I forgot to run go fmt before

Contributor

julia-stripe commented Sep 19, 2017

It changed because I forgot to run go fmt before

@julia-stripe

This comment has been minimized.

Show comment
Hide comment
@julia-stripe

julia-stripe Sep 19, 2017

Contributor

/retest

Contributor

julia-stripe commented Sep 19, 2017

/retest

@soltysh

This comment has been minimized.

Show comment
Hide comment
@soltysh

soltysh Sep 20, 2017

Contributor

/lgtm

Contributor

soltysh commented Sep 20, 2017

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm label Sep 20, 2017

@soltysh soltysh added the queue/fix label Sep 20, 2017

@soltysh soltysh added this to the v1.8 milestone Sep 20, 2017

@soltysh

This comment has been minimized.

Show comment
Hide comment
@soltysh

soltysh Sep 20, 2017

Contributor

Eric mind approving this one, this is hardening the cronjob name validation, but that's better than failing to create a job.

/assign @erictune

Contributor

soltysh commented Sep 20, 2017

Eric mind approving this one, this is hardening the cronjob name validation, but that's better than failing to create a job.

/assign @erictune

@thockin

This comment has been minimized.

Show comment
Hide comment
@thockin

thockin Sep 20, 2017

Member
Member

thockin commented Sep 20, 2017

@@ -162,6 +163,13 @@ func ValidateCronJob(scheduledJob *batch.CronJob) field.ErrorList {
// CronJobs and rcs have the same name validation
allErrs := apivalidation.ValidateObjectMeta(&scheduledJob.ObjectMeta, true, apivalidation.ValidateReplicationControllerName, field.NewPath("metadata"))
allErrs = append(allErrs, ValidateCronJobSpec(&scheduledJob.Spec, field.NewPath("spec"))...)
if len(scheduledJob.ObjectMeta.Name) > apimachineryvalidation.DNS1035LabelMaxLength-11 {

This comment has been minimized.

@thockin

thockin Sep 20, 2017

Member

Job creates Pods which have further suffixes, right? And Pod names also are limited in size.

@thockin

thockin Sep 20, 2017

Member

Job creates Pods which have further suffixes, right? And Pod names also are limited in size.

This comment has been minimized.

@julia-stripe

julia-stripe Sep 20, 2017

Contributor

Pod names' size limits are 255 characters, like most Kubernetes resources. As far as I can tell only job names have a size limit of 63 characters (because job names get copied in to a label, which have a lower size limit than Kubernetes resource names).

@julia-stripe

julia-stripe Sep 20, 2017

Contributor

Pod names' size limits are 255 characters, like most Kubernetes resources. As far as I can tell only job names have a size limit of 63 characters (because job names get copied in to a label, which have a lower size limit than Kubernetes resource names).

@dims

This comment has been minimized.

Show comment
Hide comment
@dims

dims Sep 21, 2017

Member

@thockin go or no-go for 1.8 please - #52733

Member

dims commented Sep 21, 2017

@thockin go or no-go for 1.8 please - #52733

@dims

This comment has been minimized.

Show comment
Hide comment
@dims

dims Sep 22, 2017

Member

@julia-stripe @soltysh @thockin @erictune - ok to move this out of 1.8?

Member

dims commented Sep 22, 2017

@julia-stripe @soltysh @thockin @erictune - ok to move this out of 1.8?

@thockin

This comment has been minimized.

Show comment
Hide comment
@thockin

thockin Sep 22, 2017

Member

I'll OK this for now, but I'd like to revisit this topic as part of 1.9. Can I count on you @julia-stripe to bring up back up?

@erictune @kow3ns

Member

thockin commented Sep 22, 2017

I'll OK this for now, but I'd like to revisit this topic as part of 1.9. Can I count on you @julia-stripe to bring up back up?

@erictune @kow3ns

@thockin

This comment has been minimized.

Show comment
Hide comment
@thockin

thockin Sep 22, 2017

Member

/approve

Member

thockin commented Sep 22, 2017

/approve

@k8s-merge-robot

This comment has been minimized.

Show comment
Hide comment
@k8s-merge-robot

k8s-merge-robot Sep 22, 2017

Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: julia-stripe, soltysh, thockin

Associated issue: 50850

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

Contributor

k8s-merge-robot commented Sep 22, 2017

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: julia-stripe, soltysh, thockin

Associated issue: 50850

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@abgworrall

This comment has been minimized.

Show comment
Hide comment
@abgworrall

abgworrall Sep 23, 2017

Contributor

This missed the final ffwd for release-1.8, so will need to be cherrypicked in afterwards. It will not be in the release-candidate for 1.8; ping me on slack now if that is a problem ...

Contributor

abgworrall commented Sep 23, 2017

This missed the final ffwd for release-1.8, so will need to be cherrypicked in afterwards. It will not be in the release-candidate for 1.8; ping me on slack now if that is a problem ...

// The cronjob controller appends a 11-character suffix to the cronjob (`-$TIMESTAMP`) when
// creating a job. The job name length limit is 63 characters.
// Therefore cronjob names must have length <= 63-11=52. If we don't validate this here,
// then job creation will fail later.

This comment has been minimized.

@liggitt

liggitt Sep 23, 2017

Member

Does this prevent updating an existing cronjob? That can block addition/removal of finalizers during deletion

@liggitt

liggitt Sep 23, 2017

Member

Does this prevent updating an existing cronjob? That can block addition/removal of finalizers during deletion

@k8s-merge-robot

This comment has been minimized.

Show comment
Hide comment
@k8s-merge-robot

k8s-merge-robot Sep 23, 2017

Contributor

/test all [submit-queue is verifying that this PR is safe to merge]

Contributor

k8s-merge-robot commented Sep 23, 2017

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-merge-robot

This comment has been minimized.

Show comment
Hide comment
@k8s-merge-robot

k8s-merge-robot Sep 23, 2017

Contributor

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here..

Contributor

k8s-merge-robot commented Sep 23, 2017

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here..

@k8s-merge-robot k8s-merge-robot merged commit d85f4c2 into kubernetes:master Sep 23, 2017

11 of 13 checks passed

Submit Queue Required Github CI test is not green: pull-kubernetes-bazel-build
Details
pull-kubernetes-e2e-kubeadm-gce Parent Job Status Changed: Job triggered.
cla/linuxfoundation julia-stripe authorized
Details
pull-kubernetes-bazel-build Job succeeded.
Details
pull-kubernetes-bazel-test Job succeeded.
Details
pull-kubernetes-e2e-gce-bazel Job succeeded.
Details
pull-kubernetes-e2e-gce-etcd3 Jenkins job succeeded.
Details
pull-kubernetes-e2e-gce-gpu Job succeeded.
Details
pull-kubernetes-e2e-kops-aws Jenkins job succeeded.
Details
pull-kubernetes-kubemark-e2e-gce Jenkins job succeeded.
Details
pull-kubernetes-node-e2e Jenkins job succeeded.
Details
pull-kubernetes-unit Jenkins job succeeded.
Details
pull-kubernetes-verify Jenkins job succeeded.
Details
@abgworrall

This comment has been minimized.

Show comment
Hide comment
@abgworrall

abgworrall Sep 23, 2017

Contributor

Just to followup - this was merged into master, but not into release-1.8. You'll need to cherrypick it into 1.8; it'll get approved, as it was just bad timing that meant it missed the 1.8 boat.

Contributor

abgworrall commented Sep 23, 2017

Just to followup - this was merged into master, but not into release-1.8. You'll need to cherrypick it into 1.8; it'll get approved, as it was just bad timing that meant it missed the 1.8 boat.

@liggitt

This comment has been minimized.

Show comment
Hide comment
@liggitt

liggitt Sep 23, 2017

Member

Just to followup - this was merged into master, but not into release-1.8. You'll need to cherrypick it into 1.8; it'll get approved, as it was just bad timing that meant it missed the 1.8 boat.

need to double-check this doesn't prevent update/delete of existing cronjobs with longer-than-52-char names before picking

Member

liggitt commented Sep 23, 2017

Just to followup - this was merged into master, but not into release-1.8. You'll need to cherrypick it into 1.8; it'll get approved, as it was just bad timing that meant it missed the 1.8 boat.

need to double-check this doesn't prevent update/delete of existing cronjobs with longer-than-52-char names before picking

@julia-stripe

This comment has been minimized.

Show comment
Hide comment
@julia-stripe

julia-stripe Sep 25, 2017

Contributor

@liggitt I just tested updating and deleting existing cronjobs with this patch. The results were:

  • Deleting the cronjob succeeds. (cronjob "100000000010000000001000000000100000000010000000001000000000" deleted)
  • Updating the cronjob fails (The CronJob "100000000010000000001000000000100000000010000000001000000000" is invalid: metadata.name: Invalid value: "100000000010000000001000000000100000000010000000001000000000": must be no more than 52 characters)

My view is that it's okay that cronjob updates fail (and in fact preferable). If I have an invalid cronjob in my cluster, I'd prefer to be notified when updating it that it's invalid than continue having a job that I don't know is broken. Previously we had cronjobs in our cluster that were not running because of this issue and we didn't know about it which was a pretty scary state to be in.

Contributor

julia-stripe commented Sep 25, 2017

@liggitt I just tested updating and deleting existing cronjobs with this patch. The results were:

  • Deleting the cronjob succeeds. (cronjob "100000000010000000001000000000100000000010000000001000000000" deleted)
  • Updating the cronjob fails (The CronJob "100000000010000000001000000000100000000010000000001000000000" is invalid: metadata.name: Invalid value: "100000000010000000001000000000100000000010000000001000000000": must be no more than 52 characters)

My view is that it's okay that cronjob updates fail (and in fact preferable). If I have an invalid cronjob in my cluster, I'd prefer to be notified when updating it that it's invalid than continue having a job that I don't know is broken. Previously we had cronjobs in our cluster that were not running because of this issue and we didn't know about it which was a pretty scary state to be in.

@liggitt

This comment has been minimized.

Show comment
Hide comment
@liggitt

liggitt Sep 25, 2017

Member

thanks for checking.

Deleting the cronjob succeeds.

if updating fails, it is likely that deleting with a foreground propagation policy would fail as well (this adds a finalizer to the object, which requires doing an update)

Updating the cronjob fails

no-op updates (get bytes/put bytes) are used to migrate storage (json->protobuf, v1beta1->v1, etc) during cluster upgrades, and should never fail

Member

liggitt commented Sep 25, 2017

thanks for checking.

Deleting the cronjob succeeds.

if updating fails, it is likely that deleting with a foreground propagation policy would fail as well (this adds a finalizer to the object, which requires doing an update)

Updating the cronjob fails

no-op updates (get bytes/put bytes) are used to migrate storage (json->protobuf, v1beta1->v1, etc) during cluster upgrades, and should never fail

@julia-stripe

This comment has been minimized.

Show comment
Hide comment
@julia-stripe

julia-stripe Sep 25, 2017

Contributor

no-op updates (get bytes/put bytes) are used to migrate storage (json->protobuf, v1beta1->v1, etc) during cluster upgrades, and should never fail

Makes sense. How do you recommend dealing with this?

Contributor

julia-stripe commented Sep 25, 2017

no-op updates (get bytes/put bytes) are used to migrate storage (json->protobuf, v1beta1->v1, etc) during cluster upgrades, and should never fail

Makes sense. How do you recommend dealing with this?

@liggitt

This comment has been minimized.

Show comment
Hide comment
@liggitt

liggitt Sep 25, 2017

Member

Validation-wise, I'd recommend just validating the shorter name restriction on creation (names are immutable once created, so this would keep new bad data from entering the system, while allowing migrating/deleting old bad data):

  1. make a ValidateCronJobUpdate (we really should have had this already)

    func ValidateCronJobUpdate(job, oldJob *batch.CronJob) field.ErrorList {
    	allErrs := apivalidation.ValidateObjectMetaUpdate(&job.ObjectMeta, &oldJob.ObjectMeta,     field.NewPath("metadata"))
    	allErrs = append(allErrs, ValidateCronJobSpec(&job.Spec, field.NewPath("spec"))...)
    	return allErrs
    }
  2. change cronJobStrategy#ValidateUpdate to call ValidateCronJobUpdate instead

longer-term, we should work to make failing cronjobs more visible. there are lots of potential reasons cronjobs could fail, not just invalid names, so we need to surface this sort of case much more visibly (better error status on the cronjob object, events in namespace, etc)

Member

liggitt commented Sep 25, 2017

Validation-wise, I'd recommend just validating the shorter name restriction on creation (names are immutable once created, so this would keep new bad data from entering the system, while allowing migrating/deleting old bad data):

  1. make a ValidateCronJobUpdate (we really should have had this already)

    func ValidateCronJobUpdate(job, oldJob *batch.CronJob) field.ErrorList {
    	allErrs := apivalidation.ValidateObjectMetaUpdate(&job.ObjectMeta, &oldJob.ObjectMeta,     field.NewPath("metadata"))
    	allErrs = append(allErrs, ValidateCronJobSpec(&job.Spec, field.NewPath("spec"))...)
    	return allErrs
    }
  2. change cronJobStrategy#ValidateUpdate to call ValidateCronJobUpdate instead

longer-term, we should work to make failing cronjobs more visible. there are lots of potential reasons cronjobs could fail, not just invalid names, so we need to surface this sort of case much more visibly (better error status on the cronjob object, events in namespace, etc)

@liggitt

This comment has been minimized.

Show comment
Hide comment
@liggitt

liggitt Sep 25, 2017

Member

cc @soltysh for visibility on failing cronjobs

Member

liggitt commented Sep 25, 2017

cc @soltysh for visibility on failing cronjobs

@liggitt

This comment has been minimized.

Show comment
Hide comment
@liggitt

liggitt Sep 25, 2017

Member

Validation-wise, I'd recommend just validating the shorter name restriction on creation

FYI, opened #52967 to do this. If this is picked into 1.8, I think that should be as well

Member

liggitt commented Sep 25, 2017

Validation-wise, I'd recommend just validating the shorter name restriction on creation

FYI, opened #52967 to do this. If this is picked into 1.8, I think that should be as well

k8s-merge-robot added a commit that referenced this pull request Sep 29, 2017

Merge pull request #52967 from liggitt/cronjob-validate-update
Automatic merge from submit-queue (batch tested with PRs 53263, 52967, 53262, 52654, 53187). If you want to cherry-pick this change to another branch, please follow the instructions <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md">here</a>.

Limit 52-character cronjob name validation to create

Follow up to #52733
Related to #50850

Needed to allow old cronjobs to be updated/migrated/deleted (with foregroundPropagation)
@jpbetz

This comment has been minimized.

Show comment
Hide comment
@jpbetz

jpbetz Oct 25, 2017

Contributor

@liggitt Is this ready to be cherry picked to 1.8? Or do we need to wait on #52967?

Contributor

jpbetz commented Oct 25, 2017

@liggitt Is this ready to be cherry picked to 1.8? Or do we need to wait on #52967?

@liggitt

This comment has been minimized.

Show comment
Hide comment
@liggitt

liggitt Oct 26, 2017

Member

if this is picked to 1.8, #52967 should be as well

Member

liggitt commented Oct 26, 2017

if this is picked to 1.8, #52967 should be as well

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment