
`kubectl apply deployment -f` doesn't accept label/selector changes #26202

Closed · lavalamp opened this issue May 24, 2016 · 95 comments

@lavalamp (Member) commented May 24, 2016:

I got errors complaining about the selector and labels not matching even if I explicitly specified both. Showed @janetkuo.

@janetkuo (Member) commented May 24, 2016:

To reproduce:

First, create a Deployment via apply -f:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  template:
    metadata:
      labels:
        app: nginx
        test: abcd
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80

Then remove one of the labels from the file (and optionally specify the selector), and apply -f again:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  selector:
    matchLabels:
      app: nginx
  replicas: 3
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80

Error message:

proto: no encoder for TypeMeta unversioned.TypeMeta [GetProperties]
proto: tag has too few fields: "-"
proto: no coders for struct *reflect.rtype
proto: no encoder for sec int64 [GetProperties]
proto: no encoder for nsec int32 [GetProperties]
proto: no encoder for loc *time.Location [GetProperties]
proto: no encoder for Time time.Time [GetProperties]
proto: no encoder for InitContainers []v1.Container [GetProperties]
proto: no coders for intstr.Type
proto: no encoder for Type intstr.Type [GetProperties]
The Deployment "nginx-deployment" is invalid.
spec.template.metadata.labels: Invalid value: {"app":"nginx"}: `selector` does not match template `labels`

@kubernetes/kubectl @adohe

@smarterclayton (Contributor) commented May 24, 2016:

Is there a bug already open for all that proto crap? I had not seen that showing up before - I would not expect it.


@janetkuo (Member) commented May 24, 2016:

The problem is that the apply annotation didn't record .spec.selector when it was defaulted from .spec.template.labels, which caused problems when calculating the diff:

kubectl.kubernetes.io/last-applied-configuration: '{"kind":"Deployment","apiVersion":"extensions/v1beta1","metadata":{"name":"nginx-deployment","creationTimestamp":null},"spec":{"replicas":3,"template":{"metadata":{"creationTimestamp":null,"labels":{"app":"nginx","test":"abcd"}},"spec":{"containers":[{"name":"nginx","image":"nginx:1.7.9","ports":[{"containerPort":80}],"resources":{}}]}},"strategy":{}},"status":{}}'
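
For reference, you can inspect exactly what apply recorded with something like this (a sketch; the only non-obvious part is escaping the dots in the annotation key for jsonpath):

```
# Option 1: dump the whole object and read metadata.annotations
kubectl get deployment nginx-deployment -o yaml

# Option 2: print just the recorded config (dots in the annotation key escaped for jsonpath)
kubectl get deployment nginx-deployment \
  -o jsonpath='{.metadata.annotations.kubectl\.kubernetes\.io/last-applied-configuration}'
```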

@adohe (Member) commented May 25, 2016:

That's because the last-applied-configuration annotation is just the resource content read from user input when apply creates the resource. Also, the rule that derives .spec.selector from .spec.template.labels is not applied on the client side.

@adohe (Member) commented May 25, 2016:

@janetkuo I am considering adding the defaulted .spec.selector to the last-applied-configuration annotation. WDYT?

@therc (Contributor) commented May 26, 2016:

We've been bitten by this for a service:

The Service "labeler" is invalid.

* metadata.resourceVersion: Invalid value: "": must be specified for an update
* spec.clusterIP: Invalid value: "": field is immutable

I just nuked the last-applied-configuration annotation and the change applied successfully.
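
For anyone else hitting this, the annotation can be removed with kubectl annotate (a minimal sketch, assuming the Service is named labeler as in the error above):

```
# A trailing "-" on the key removes the annotation, so the next apply
# diffs against the live object instead of the stale recorded config.
kubectl annotate service labeler kubectl.kubernetes.io/last-applied-configuration-
```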

@therc (Contributor) commented May 26, 2016:

And before that error, there was the very same sequence of proto errors.

@lavalamp (Member, Author) commented May 26, 2016:

Please note if you want to get defaulted stuff, it's super important to let the server do the defaulting.

@adohe (Member) commented May 27, 2016:

@lavalamp yes, we should do this on the server side.

@adohe (Member) commented May 27, 2016:

@bgrant0607 @janetkuo @lavalamp I thought about this carefully. I think the root cause of both this issue and #24198 is that the last-applied-configuration annotation doesn't include fields that have default values, e.g. spec.selector and spec.strategy. Users may not specify these fields in the original resource file, but we set them on the client side. I have opened PR #26405 to demonstrate this problem.

@bgrant0607 (Member) commented May 27, 2016:

@adohe We should not add default values to last-applied-configuration. Unfortunately the design hasn't been written up yet, but there's some discussion on #1702, I think.

@bgrant0607 (Member) commented May 27, 2016:

Changing selectors has other problems, also. For instance, it's a map, and maps are merged by default.

Also note that a default selector won't be changed automatically when pod template labels are changed.

@bgrant0607 (Member) commented May 27, 2016:

I wrote a brief explanation here: #15894 (comment)

@metral (Contributor) commented May 27, 2016:

Ran into this myself. In the meantime, is there a workaround that doesn't require blowing away last-applied-configuration?

@bgrant0607 (Member) commented May 27, 2016:

Accidental orphaning is another problem: #24888

@bgrant0607 (Member) commented May 27, 2016:

@metral What problem did you encounter?

@bgrant0607 (Member) commented May 27, 2016:

@lavalamp We capture the last-applied-configuration prior to defaulting, so it shouldn't matter whether the defaulting happens in the client or server, though obviously I'd prefer the latter.

@bgrant0607 (Member) commented May 27, 2016:

Controlled pods are identified by labels, not by name.

In general, changing the labels and selector on a controller isn't expected to work. To do it properly requires a multi-step dance. In any case, controllers don't update running pods -- for the most part, not even Deployment does that.

We should strongly recommend that users not do that, unless they use kubectl replace --force, which does a cascading delete and re-create.
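
For completeness, the delete-and-recreate path looks roughly like this (a sketch; nginx-deployment.yaml stands in for whatever manifest was applied, and the pods do go down while they are recreated):

```
# Deletes the Deployment (cascading to its ReplicaSets/Pods) and recreates it
# from the file, so new labels and a new selector can take effect together.
kubectl replace --force -f nginx-deployment.yaml
```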

@metral (Contributor) commented May 27, 2016:

@bgrant0607 I'm hitting the same issue @janetkuo described when I try changing a label. In my case it's a label I use for the revision/commit of the current build, which gets altered in the deployment manifest upon a successful build so that I can apply the new deployment.

Oddly enough, this same strategy works fine on 2 other projects/deployments I have in the system, but for some reason I'm hitting this issue on my 3rd one. I'm sure it's something on my end that I haven't caught onto, but it's strange nevertheless that updating the rev and applying the new deployment works for some and not others.

@bgrant0607 (Member) commented May 27, 2016:

@metral Specify a selector that omits the label you plan to change.
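
For example, something like the following keeps the selector pinned to a stable label so a per-build label in the pod template can change freely (a sketch; the revision label is made up for illustration):

```
kubectl apply -f - <<'EOF'
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx            # stable label only; omit anything that changes per build
  template:
    metadata:
      labels:
        app: nginx
        revision: abc123    # hypothetical per-build label, safe to change because
                            # it is not part of the selector
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
EOF
```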

@metral (Contributor) commented May 27, 2016:

@bgrant0607 that makes sense - I'll give that a shot when I'm back at my computer.

Just odd that I've never specified the selector in my working deployments, only the labels in the metadata, and changing the revs on those never caused an issue until now with my new project's deployment.

@bgrant0607 (Member) commented May 28, 2016:

I think the longer-term solution will be to move to a defaulting approach similar to what we use for Job:
https://github.com/kubernetes/kubernetes/blob/master/docs/design/selector-generation.md

@bgrant0607 (Member) commented May 28, 2016:

@therc How did you create the service in the case described by #26202 (comment)? Did you use apply or create?

@bgrant0607 (Member) commented May 28, 2016:

This is basically a dupe of #24888.

@therc (Contributor) commented Jun 15, 2016:

@bgrant0607 I didn't create it myself. I suspect my colleague used apply.

@therc (Contributor) commented Jun 16, 2016:

I think the problem is that in my case I used kubectl edit on the service to add a couple of annotations. That seems to have been responsible for persisting resourceVersion and clusterIP in the last-applied-configuration annotation.

@rphillips (Member) commented Sep 15, 2017:

Is this issue considered fixed with the new spec.selector behavior being released in 1.8?

Reference: Draft Release Notes

@liggitt (Member) commented Sep 18, 2017:

In apps/v1beta2, defaults are no longer set on spec.selector.

However, the spec.selector field is also no longer allowed to mutate at all in apps/v1beta2, as of #50719.
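
Concretely, in apps/v1beta2 (and later apps/v1) the selector has to be spelled out and then left alone (a sketch):

```
kubectl apply -f - <<'EOF'
apiVersion: apps/v1beta2     # selector is required here and cannot be changed later
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx             # must be set explicitly; no defaulting from template labels
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
EOF
```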

@tonglil (Contributor) commented Nov 15, 2017:

It seems like this is still an issue for me with extensions/v1beta1, even though it should work based on the release notes from #50719:

For Deployment, ReplicaSet, and DaemonSet, selectors are now immutable when updating via the new apps/v1beta2 API. For backward compatibility, selectors can still be changed when updating via apps/v1beta1 or extensions/v1beta1.

@kow3ns (Member) commented Nov 29, 2017:

@tonglil We can't modify the behavior of extensions/v1beta1 without breaking backward compatibility.

dlebrero added a commit to akvo/akvo-flow-maps that referenced this issue Nov 30, 2017

@liggitt (Member) commented Jan 14, 2018:

This was resolved in apps/v1beta2 and forward by requiring selectors to be explicitly defined and disallowing mutation of those fields.

@ltupin commented Oct 24, 2018:

> disallowing mutation on those fields

Why do we want this to be immutable?

@janetkuo (Member) commented Oct 26, 2018:

@ltupin Because it's not safe to mutate a controller's selector. For example, a Deployment finds its child resources (ReplicaSets) by label/selector. Changing a Deployment's selector is dangerous: it could cause the Deployment to be unable to find its existing children and therefore cause disruptions to your running workloads. We don't want to encourage that.
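
You can see that relationship directly; a Deployment's children are found purely by labels, not by name (a sketch, using the nginx example from the top of this issue):

```
# ReplicaSets and Pods owned by the nginx Deployment are selected by label;
# a changed selector would silently stop matching these existing objects.
kubectl get replicasets,pods --selector=app=nginx
```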

@Sergey80 commented Dec 29, 2018:

Still reproducible in apps/v1beta2:

apiVersion: apps/v1beta2
kind: Deployment
metadata:
  name: $APP_NAME-deployment
  labels:
    app: $APP_NAME-k8s ...

Or maybe I did not get @liggitt's point; what exactly should I do with these fields?

@liggitt (Member) commented Dec 29, 2018:

Mutation of label selectors on deployments, replicasets, and daemonsets is not allowed in apps/v1beta2 and forward

@liggitt (Member) commented Dec 29, 2018:

See #50808 for details

timcrall added a commit to IGNW/hello-nodejs that referenced this issue Feb 5, 2019

mehdime pushed a commit to mehdime/microservices-demo that referenced this issue Feb 25, 2019

mehdime pushed a commit to mehdime/microservices-demo that referenced this issue Feb 25, 2019

mehdime added a commit to mehdime/microservices-demo that referenced this issue Feb 25, 2019

mehdime added a commit to mehdime/microservices-demo that referenced this issue Feb 25, 2019

mehdime added a commit to mehdime/microservices-demo that referenced this issue Feb 25, 2019

ahmetb added a commit to GoogleCloudPlatform/microservices-demo that referenced this issue Mar 4, 2019

kubernetes: use apps/v1 for Deployment (#166)
In certain situations (see details below), the deployment to Kubernetes fails with:

> "The Deployment [DEPLOYMENT_OBJECT] is invalid: [...] `selector` does not match template `labels`".

This is caused by the K8S Deployment manifests missing an explicit `selector` value.

This commit:
* adds explicit `selector` values for all Deployment objects.
* bumps the K8S API from the deprecated `extensions/v1beta1` version to the stable `apps/v1` version. This version made the `selector` property of the Deployment a required value, preventing any issues with missing selectors in the future.

This change is backwards compatible with existing deployments of the microservices demo app. I.e. you should be able to pull this change and run `skaffold run` against an existing deployment of the app without issues.

This will not, however, resolve the issue for existing deployments. Selectors are immutable and will therefore retain their current defaulted value. You should run `skaffold delete` followed by `skaffold run` after having pulled this change to do a clean re-deployment of the app, which will resolve the issue.

**The nitty-gritty details**

In the `extensions/v1beta1` version of the K8S API (the version that was used by this project), the `selector` property of a Deployment object is optional and is defaulted to the labels used in the pod template. This can cause subtle issues leading to deployment failures. This project, where Deployment selectors were omitted, is a good example of what can go wrong with defaulted selectors.

Consider this:

1. Run `skaffold run` to build locally with Docker and deploy.

Since the Deployment specs don't have explicit selectors, they will be defaulted to the pod template labels. And since skaffold adds additional labels to the pod template, like `skaffold-builder` and `skaffold-deployer`, the end result will be a selector that looks like this:

> app=cartservice,cleanup=true,docker-api-version=1.39,skaffold-builder=local,skaffold-deployer=kubectl,skaffold-tag-policy=git-commit,tail=true

So far, so good.

2. Now run `skaffold run -p gcb --default-repo=your-gcr-repo` to build on Google Cloud Build instead of building locally.

This will blow up when attempting to deploy to Kubernetes with an error similar to:

> The Deployment "cartservice" is invalid: spec.template.metadata.labels: Invalid value: map[string]string{"skaffold-builder":"google-cloud-build", "profiles"="gcb", "skaffold-deployer":"kubectl", "skaffold-tag-policy":"git-commit", "docker-api-version":"1.39", "tail":"true", "app":"cartservice", "cleanup":"true"}: `selector` does not match template `labels`

(and the same error for every other deployment object)

This is because the skaffold labels that were automatically added to the pod template have changed to include references to Google Cloud Build. That normally shouldn't be an issue.

But without explicit Deployment selectors, the defaulted selectors for our Deployment objects have also changed, which means the new versions of our Deployment objects now manage different sets of Pods. This is thankfully caught by kubectl before the deployment happens (otherwise it would have resulted in orphaned pods).

In this commit, we explicitly set the `selector` value of all Deployment objects, which fixes this issue. We also bump the K8S API version to the stable `apps/v1`, which makes the `selector` property a required value and will avoid accidentally forgetting selectors in the future.

More details if you're curious:

* Why defaulted Deployment selectors cause problems: kubernetes/kubernetes#26202
* Why Deployment selectors should be (and were made) immutable: kubernetes/kubernetes#50808
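
For illustration, the shape of the fix described in this commit looks roughly like the following (a sketch; the name, image, and replica count are placeholders):

```
kubectl apply -f - <<'EOF'
apiVersion: apps/v1            # selector is required and immutable in apps/v1
kind: Deployment
metadata:
  name: cartservice
spec:
  replicas: 1
  selector:
    matchLabels:
      app: cartservice         # pin the selector to the stable app label only
  template:
    metadata:
      labels:
        app: cartservice       # skaffold may inject extra labels here at deploy time;
                               # that is fine, since the selector only matches "app"
    spec:
      containers:
      - name: server
        image: cartservice:placeholder   # placeholder image for illustration
EOF
```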
@quinn commented Jul 24, 2019:

All I'm seeing is k8s members saying this is "discouraged". What if I have to do this? In production? Do I have to allocate downtime to destroy the deployment and replace it with a new one?
