Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"tutor k8s" blows up on Tutor version update #531

Closed
fghaas opened this issue Nov 18, 2021 · 9 comments
Closed

"tutor k8s" blows up on Tutor version update #531

fghaas opened this issue Nov 18, 2021 · 9 comments
Labels
bug Bugs will be investigated and fixed as quickly as possible.

Comments

@fghaas
Copy link
Contributor

fghaas commented Nov 18, 2021

Bug description

As one moves from one version of Tutor to another (I just saw this going from Tutor 12.1.6 to 12.1.7), tutor k8s attempts to update the app.kubernetes.io/version label selector on all resources. As explained in kubernetes/client-go#508 (and probably elsewhere), this is not allowed for Deployments, causing tutor k8s start to break if applied to any Open edX environment previously deployed with an earlier version of Tutor. (Even though it will happily update the labels on the PVCs, ConfigMaps, and Services it touches before it gets to Deployments.)

How to reproduce

  • Install Tutor 12.1.6
  • Run tutor k8s quickstart
  • Install Tutor 12.1.7
  • Run tutor k8s start (or quickstart)
  • You should now get an error like the following for all your Deployments:
    Deployment.apps "redis" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app.kubernetes.io/instance":"openedx-JIONBLbtByCGUYgHgr4tDWu1", "app.kubernetes.io/managed-by":"tutor", "app.kubernetes.io/name":"redis", "app.kubernetes.io/part-of":"openedx", "app.kubernetes.io/version":"12.1.7"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable
    

Environment

Tutor 12.1.6 and 12.1.7, running against Kubernetes 1.18.16.

Additional context

Removing the offending line from tutor/templates/kustomization.yml would remove this problem for newly created deployments. That is to say, any Open edX environment that's deployed to Kubernetes after this line has been removed from the Kustomization template will not have the label set, and won't break on Tutor updates. But any running cluster that does have it set it now, will break on a Tutor version update even after the line is removed, because Kustomization then tries to remove the label — which is also not allowed.

Thus, removing the line from tutor/templates/kustomization.yml would constitute a breaking change: users with existing Tutor-deployed Open edX environments will need to throw away all of their Deployments and then re-create them.

@fghaas
Copy link
Contributor Author

fghaas commented Nov 18, 2021

It should probably be noted that if I understand correctly, the thing that's breaking here is not the attempted change to labels (in .spec.templates.metadata) but to matchLabels (in .spec.selector).

@regisb regisb added the bug Bugs will be investigated and fixed as quickly as possible. label Nov 22, 2021
@regisb
Copy link
Contributor

regisb commented Nov 22, 2021

Hey Florian, thanks for your feedback on your usage of Kubernetes. I have seen the issues, pull requests and discussion topics you created -- sorry if I did not find the time to answer until now.

I think it's great that you are giving us feedback on actual usage of Tutor running on k8s; this does not happen very frequently. Thus, I'd like to take this opportunity to allocate some time to improve the tutor k8s integration, based on your feedback. In particular, I want to make Tutor follow Kubernetes best practices more.

Regarding the specific issue of common labels that we are discussing here:

  • The idea of adding the tutor version to the common labels comes from here: https://kubernetes.io/docs/concepts/overview/working-with-objects/common-labels/ If it does not allow end users to upgrade in-place, then it was not such a great idea... Thus, I agree with you that we should remove this label.
  • Unless I'm mistaken, end users can bypass the issue by deleting the corresponding deployment objects, right? For instance with tutor k8s stop. I understand that deleting resources is not a satisfying, permanent fix, but it should be enough for the moment.
  • It's fine to introduce breaking changes, but it's better for end users if we introduce them in major releases only. Tutor v13 is scheduled for December 9th, so maybe we can add it then?

Based on these observations, can I suggest that you open a PR on top of the nightly branch of Tutor?

@fghaas
Copy link
Contributor Author

fghaas commented Nov 22, 2021

I think the best approach here is to use a labelTransformer instead of the commonLabels directive, as the aforementioned post suggests. What do you think?

@regisb
Copy link
Contributor

regisb commented Nov 22, 2021

Right, thanks for bearing with me.

I'm not too excited by the perspective of adding a labelTransformer object -- which would resolve our issue, but introduce yet another layer of complexity.

Instead, could we move the tutor version from the commonLabels to the commonAnnotations? If we don't use the tutor version for querying, then maybe it makes more sense to consider it as an annotation?

@fghaas
Copy link
Contributor Author

fghaas commented Nov 22, 2021

Right, if you don't actually want to query or reference anything by Tutor version, then I suppose it could also be an annotation. (I haven't used this before, mind you, but it does look like a sensible idea to me. :) )

@regisb
Copy link
Contributor

regisb commented Nov 22, 2021

Great. Would you have the time to try this and open a PR? If not I can do it myself.

fghaas added a commit to fghaas/tutor that referenced this issue Nov 22, 2021
…n Tutor version changes

Through the commonLabels directive in kustomization.yml, all resources
get a label named "app.kubernetes.io/version", which is being set to
the Tutor version at the time of initial deployment.

When the user then subsequently progresses to a new Tutor version,
Kubernetes attempts to update this label — but for Deployment,
ReplicaSet, and DaemonSet resources, this is no longer allowed as of
kubernetes/kubernetes#50808. This causes
"tutor k8s start" (at the "kubectl apply --kustomize" step) to break
with errors such as:

Deployment.apps "redis" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app.kubernetes.io/instance":"openedx-JIONBLbtByCGUYgHgr4tDWu1", "app.kubernetes.io/managed-by":"tutor", "app.kubernetes.io/name":"redis", "app.kubernetes.io/part-of":"openedx", "app.kubernetes.io/version":"12.1.7"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable

Simply removing the app.kubernetes.io/version label from
kustomization.yml will permanently fix this issue for newly created
Kubernetes deployments, which will "survive" any future Tutor version
changes thereafter.

However, *existing* production Open edX deployments will need to throw
the affected Deployments away, and re-create them.

Also, add the Tutor version as a resource annotation instead, using
the commonAnnotations directive.

See also:
kubernetes/client-go#508
https://kubectl.docs.kubernetes.io/references/kustomize/kustomization/commonlabels/
https://kubectl.docs.kubernetes.io/references/kustomize/kustomization/commonannotations/

Fixes overhangio#531.
@fghaas
Copy link
Contributor Author

fghaas commented Nov 22, 2021

I don't have bandwidth to test this today, but I've created a PR against nightly in #533. Feel free to either run a version upgrade test yourself, or else mark the PR as a Draft and I'll open it for review when I've been able to test more.

@fghaas
Copy link
Contributor Author

fghaas commented Nov 22, 2021

So, I've not been able to test with nightly, but I did

  • manually make the change to kustomization.yml setting the version specified in commonAnnotations to 12.1.6 (and removing the version number from `commonLabels),
  • recreate my Deployments by deleting them with kubectl delete pod and then running tutor k8s start from Tutor 1.12.6,
  • upgrade to Tutor 1.12.7,
  • run tutor config save,
  • re-apply the manual kustomization.yml change, setting the version number annotations to 1.12.6,
  • run tutor k8s start again,

and the upgrade went through without a hitch.

fghaas added a commit to fghaas/tutor that referenced this issue Nov 23, 2021
…n Tutor version changes

Through the commonLabels directive in kustomization.yml, all resources
get a label named "app.kubernetes.io/version", which is being set to
the Tutor version at the time of initial deployment.

When the user then subsequently progresses to a new Tutor version,
Kubernetes attempts to update this label — but for Deployment,
ReplicaSet, and DaemonSet resources, this is no longer allowed as of
kubernetes/kubernetes#50808. This causes
"tutor k8s start" (at the "kubectl apply --kustomize" step) to break
with errors such as:

Deployment.apps "redis" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app.kubernetes.io/instance":"openedx-JIONBLbtByCGUYgHgr4tDWu1", "app.kubernetes.io/managed-by":"tutor", "app.kubernetes.io/name":"redis", "app.kubernetes.io/part-of":"openedx", "app.kubernetes.io/version":"12.1.7"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable

Simply removing the app.kubernetes.io/version label from
kustomization.yml will permanently fix this issue for newly created
Kubernetes deployments, which will "survive" any future Tutor version
changes thereafter.

However, *existing* production Open edX deployments will need to throw
the affected Deployments away, and re-create them.

Also, add the Tutor version as a resource annotation instead, using
the commonAnnotations directive.

See also:
kubernetes/client-go#508
https://kubectl.docs.kubernetes.io/references/kustomize/kustomization/commonlabels/
https://kubectl.docs.kubernetes.io/references/kustomize/kustomization/commonannotations/

Fixes overhangio#531.
regisb pushed a commit that referenced this issue Nov 25, 2021
…n Tutor version changes

Through the commonLabels directive in kustomization.yml, all resources
get a label named "app.kubernetes.io/version", which is being set to
the Tutor version at the time of initial deployment.

When the user then subsequently progresses to a new Tutor version,
Kubernetes attempts to update this label — but for Deployment,
ReplicaSet, and DaemonSet resources, this is no longer allowed as of
kubernetes/kubernetes#50808. This causes
"tutor k8s start" (at the "kubectl apply --kustomize" step) to break
with errors such as:

Deployment.apps "redis" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app.kubernetes.io/instance":"openedx-JIONBLbtByCGUYgHgr4tDWu1", "app.kubernetes.io/managed-by":"tutor", "app.kubernetes.io/name":"redis", "app.kubernetes.io/part-of":"openedx", "app.kubernetes.io/version":"12.1.7"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable

Simply removing the app.kubernetes.io/version label from
kustomization.yml will permanently fix this issue for newly created
Kubernetes deployments, which will "survive" any future Tutor version
changes thereafter.

However, *existing* production Open edX deployments will need to throw
the affected Deployments away, and re-create them.

Also, add the Tutor version as a resource annotation instead, using
the commonAnnotations directive.

See also:
kubernetes/client-go#508
https://kubectl.docs.kubernetes.io/references/kustomize/kustomization/commonlabels/
https://kubectl.docs.kubernetes.io/references/kustomize/kustomization/commonannotations/

Fixes #531.
@regisb
Copy link
Contributor

regisb commented Nov 25, 2021

Closed by #533.

@regisb regisb closed this as completed Nov 25, 2021
@regisb regisb moved this from Backlog to Done in Tutor project management (obsolete) Nov 25, 2021
regisb pushed a commit that referenced this issue Dec 20, 2021
…n Tutor version changes

Through the commonLabels directive in kustomization.yml, all resources
get a label named "app.kubernetes.io/version", which is being set to
the Tutor version at the time of initial deployment.

When the user then subsequently progresses to a new Tutor version,
Kubernetes attempts to update this label — but for Deployment,
ReplicaSet, and DaemonSet resources, this is no longer allowed as of
kubernetes/kubernetes#50808. This causes
"tutor k8s start" (at the "kubectl apply --kustomize" step) to break
with errors such as:

Deployment.apps "redis" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app.kubernetes.io/instance":"openedx-JIONBLbtByCGUYgHgr4tDWu1", "app.kubernetes.io/managed-by":"tutor", "app.kubernetes.io/name":"redis", "app.kubernetes.io/part-of":"openedx", "app.kubernetes.io/version":"12.1.7"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable

Simply removing the app.kubernetes.io/version label from
kustomization.yml will permanently fix this issue for newly created
Kubernetes deployments, which will "survive" any future Tutor version
changes thereafter.

However, *existing* production Open edX deployments will need to throw
the affected Deployments away, and re-create them.

Also, add the Tutor version as a resource annotation instead, using
the commonAnnotations directive.

See also:
kubernetes/client-go#508
https://kubectl.docs.kubernetes.io/references/kustomize/kustomization/commonlabels/
https://kubectl.docs.kubernetes.io/references/kustomize/kustomization/commonannotations/

Fixes #531.
regisb pushed a commit that referenced this issue Dec 20, 2021
…n Tutor version changes

Through the commonLabels directive in kustomization.yml, all resources
get a label named "app.kubernetes.io/version", which is being set to
the Tutor version at the time of initial deployment.

When the user then subsequently progresses to a new Tutor version,
Kubernetes attempts to update this label — but for Deployment,
ReplicaSet, and DaemonSet resources, this is no longer allowed as of
kubernetes/kubernetes#50808. This causes
"tutor k8s start" (at the "kubectl apply --kustomize" step) to break
with errors such as:

Deployment.apps "redis" is invalid: spec.selector: Invalid value: v1.LabelSelector{MatchLabels:map[string]string{"app.kubernetes.io/instance":"openedx-JIONBLbtByCGUYgHgr4tDWu1", "app.kubernetes.io/managed-by":"tutor", "app.kubernetes.io/name":"redis", "app.kubernetes.io/part-of":"openedx", "app.kubernetes.io/version":"12.1.7"}, MatchExpressions:[]v1.LabelSelectorRequirement(nil)}: field is immutable

Simply removing the app.kubernetes.io/version label from
kustomization.yml will permanently fix this issue for newly created
Kubernetes deployments, which will "survive" any future Tutor version
changes thereafter.

However, *existing* production Open edX deployments will need to throw
the affected Deployments away, and re-create them.

Also, add the Tutor version as a resource annotation instead, using
the commonAnnotations directive.

See also:
kubernetes/client-go#508
https://kubectl.docs.kubernetes.io/references/kustomize/kustomization/commonlabels/
https://kubectl.docs.kubernetes.io/references/kustomize/kustomization/commonannotations/

Fixes #531.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Bugs will be investigated and fixed as quickly as possible.
Projects
Development

No branches or pull requests

2 participants