[flux v0.26.0-2] Kustomization tries to modify immutable fields #2386

Open
Legion2 opened this issue Feb 4, 2022 · 60 comments

@Legion2
Contributor

Legion2 commented Feb 4, 2022

Describe the bug

I updated to Flux 0.26.1 and then observed a reconciliation error in a Deployment. I deleted the Deployment, and now there is a problem with the PVC of that Deployment.

PersistentVolumeClaim/monitoring/influxdb-volume dry-run failed, reason: Invalid, error: PersistentVolumeClaim "influxdb-volume" is invalid: spec: Forbidden: spec is immutable after creation except resources.requests for bound claims
  core.PersistentVolumeClaimSpec{
        AccessModes:      {"ReadWriteOnce"},
        Selector:         nil,
        Resources:        {Requests: {s"storage": {i: {...}, s: "10Gi", Format: "BinarySI"}}},
-       VolumeName:       "",
+       VolumeName:       "pvc-c7f9929e-2741-43ce-b690-ed00816092ad",
        StorageClassName: &"aws-gp2-dynamic",
        VolumeMode:       &"Filesystem",
        DataSource:       nil,
  }

I tried downgrading the kustomize-controller, but that did not resolve the issue.

Steps to reproduce

  1. Install flux 0.25.3
  2. create a PVC
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: influxdb-volume
labels:
  kustomize.toolkit.fluxcd.io/prune: disabled
spec:
storageClassName: aws-gp2-dynamic
accessModes:
  - ReadWriteOnce
volumeMode: Filesystem
resources:
  requests:
    storage: 10Gi
  3. Update flux to 0.26.1

Expected behavior

Should work after the update

Screenshots and recordings

No response

OS / Distro

Ubuntu 21

Flux version

flux: v0.26.1

Flux check

► checking prerequisites
✔ Kubernetes 1.20.11-eks-f17b81 >=1.20.6-0
► checking controllers
✔ helm-controller: deployment ready
► ghcr.io/fluxcd/helm-controller:v0.16.0
✔ image-automation-controller: deployment ready
► ghcr.io/fluxcd/image-automation-controller:v0.20.0
✔ image-reflector-controller: deployment ready
► ghcr.io/fluxcd/image-reflector-controller:v0.16.0
✔ kustomize-controller: deployment ready
► ghcr.io/fluxcd/kustomize-controller:v0.19.1
✔ notification-controller: deployment ready
► ghcr.io/fluxcd/notification-controller:v0.21.0
✔ source-controller: deployment ready
► ghcr.io/fluxcd/source-controller:v0.21.1
✔ all checks passed

Git provider

Gitlab

Container Registry provider

Gitlab

Additional context

No response

Code of Conduct

  • I agree to follow this project's Code of Conduct
@kingdonb
Member

kingdonb commented Feb 4, 2022

I'm going to try to reproduce this, but first could you please add some more information about how you created the PVC?

Is it just, create a PVC (without volumeName) through Flux 0.25.3, then upgrade to a different version?

(I did not run into this issue in my testing, possibly because I am always creating PVCs with volumeName set)

@Legion2
Contributor Author

Legion2 commented Feb 4, 2022

I updated the issue. I use PVCs with dynamic provisioning via storage classes on AWS. As a workaround, I removed the PVC from the Kustomization.

@kingdonb
Member

kingdonb commented Feb 4, 2022

Can you please post the output from:

kubectl get pvc influxdb-volume -oyaml --show-managed-fields

@kingdonb
Member

kingdonb commented Feb 4, 2022

Also, there are some serious formatting errors in the YAML that you posted, and it's missing a namespace. This would not work in Flux. Can you post the YAML exactly as it was committed to your Flux repository?

@kingdonb
Member

kingdonb commented Feb 4, 2022

I tried to replicate your scenario here:

https://github.com/kingdonb/fleet-testing/blob/main/apps/keycloak/example-pvc.yaml

Installed Flux 0.25.3, added a PVC and a pod bound to it so the PV would be created... omitted volumeName from my spec, as it appears you have (and most people would do)

Upgraded to Flux 0.26.1

I am not seeing any errors across the Kubernetes versions in my test matrix. If the issue is resolved for you, and since I cannot reproduce it, we'll have to close this unless you can provide more information (or somebody else reports the same issue).

Thanks again for your report.

@Legion2
Contributor Author

Legion2 commented Feb 4, 2022

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  annotations:
    pv.kubernetes.io/bind-completed: "yes"
    pv.kubernetes.io/bound-by-controller: "yes"
    volume.beta.kubernetes.io/storage-provisioner: kubernetes.io/aws-ebs
    volume.kubernetes.io/selected-node: ip-192-168-41-199.eu-central-1.compute.internal
    volume.kubernetes.io/storage-resizer: kubernetes.io/aws-ebs
  creationTimestamp: "2020-10-02T16:43:43Z"
  finalizers:
  - kubernetes.io/pvc-protection
  labels:
    kustomize.toolkit.fluxcd.io/name: infrastructure
    kustomize.toolkit.fluxcd.io/namespace: flux-system
    kustomize.toolkit.fluxcd.io/prune: disabled
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:pv.kubernetes.io/bind-completed: {}
          f:pv.kubernetes.io/bound-by-controller: {}
          f:volume.beta.kubernetes.io/storage-provisioner: {}
          f:volume.kubernetes.io/selected-node: {}
          f:volume.kubernetes.io/storage-resizer: {}
        f:finalizers:
          .: {}
          v:"kubernetes.io/pvc-protection": {}
        f:labels:
          .: {}
          f:kustomize.toolkit.fluxcd.io/namespace: {}
          f:kustomize.toolkit.fluxcd.io/prune: {}
      f:spec:
        f:accessModes: {}
        f:resources:
          f:requests:
            .: {}
            f:storage: {}
        f:storageClassName: {}
        f:volumeMode: {}
        f:volumeName: {}
      f:status:
        f:accessModes: {}
        f:capacity:
          .: {}
          f:storage: {}
        f:phase: {}
    manager: kustomize-controller
    operation: Apply
    time: "2021-10-27T11:39:29Z"
  name: influxdb-volume
  namespace: monitoring
  resourceVersion: "456920544"
  uid: c7f9929e-2741-43ce-b690-ed00816092ad
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 10Gi
  storageClassName: aws-gp2-dynamic
  volumeMode: Filesystem
  volumeName: pvc-c7f9929e-2741-43ce-b690-ed00816092ad
status:
  accessModes:
  - ReadWriteOnce
  capacity:
    storage: 10Gi
  phase: Bound

I have a kustomization.yaml file which defines the namespace for all resources

@kingdonb
Member

kingdonb commented Feb 4, 2022

        f:volumeName: {}

This suggests that the Flux YAML in your git repo contains the volumeName field in its spec. Is that the case?

@Legion2
Contributor Author

Legion2 commented Feb 4, 2022

no

@Legion2
Contributor Author

Legion2 commented Feb 4, 2022

Maybe the kustomization adds this field somehow?

@kingdonb
Member

kingdonb commented Feb 4, 2022

It is possible the Flux manager took over the field when it shouldn't have; I'm not sure.

In my cluster, I have this:

- apiVersion: v1
  fieldsType: FieldsV1
  fieldsV1:
    f:metadata:
      f:annotations:
        f:pv.kubernetes.io/bind-completed: {}
        f:pv.kubernetes.io/bound-by-controller: {}
        f:volume.beta.kubernetes.io/storage-provisioner: {}
        f:volume.kubernetes.io/storage-provisioner: {}
    f:spec:
      f:volumeName: {}
  manager: kube-controller-manager
  operation: Update
  time: "2022-02-04T15:35:52Z"

If flux finds fields managed by kube-controller-manager, it does not manage or take them over as I understand it.

My cluster is kind + local-path-provisioner; I can try a different storage class provisioner, as this might not be representative.

(The best bet is probably for me to try replicating this on EKS next, since that's your environment...)
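
For example, something like this lists each managedFields entry's manager and operation on the claim (a minimal sketch; the PVC name and namespace are taken from the report above):

# print one "manager <tab> operation" line per managedFields entry
kubectl -n monitoring get pvc influxdb-volume -o jsonpath='{range .metadata.managedFields[*]}{.manager}{"\t"}{.operation}{"\n"}{end}'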

@Legion2
Contributor Author

Legion2 commented Feb 4, 2022

I have Kubernetes 1.20 and the volume is from 2020; back then it did not use server-side apply.

@kingdonb
Member

kingdonb commented Feb 4, 2022

I have a test suite that runs on Flux 0.17.2 and upgrades it to the current version; I can use that to try to replicate the issue. If your volumes are that old, they might behave differently: we resolved a number of issues like that for the 0.26.0 release, which you can only see if you started with Flux before server-side apply and upgraded through 0.18.x-0.25.x.

Like this issue:

I might actually need to start with a Kubernetes version from before server-side apply to reliably reproduce all of these kinds of issues. If kube-controller-manager uses server-side apply, or if its ownership gets captured in managedFields however that happens nowadays, then it won't matter which version of Flux initially created the PVC in the cluster; I won't see the issue reproduced...

@kingdonb
Member

kingdonb commented Feb 4, 2022

I confirmed that Kubernetes 1.20 with Flux 0.17.2 together with local-path-provisioner produces a PVC that looks like this:

kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/provisioned-by: rancher.io/local-path
  creationTimestamp: "2022-02-04T17:52:28Z"
  finalizers:
  - kubernetes.io/pv-protection
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        f:phase: {}
    manager: kube-controller-manager
    operation: Update
    time: "2022-02-04T17:52:28Z"

In other words, K8s 1.20.x is not old enough to satisfy the required test conditions. I'm looking into trying to create a K8s cluster @1.15.x, install Flux 0.17.x on it, ... but that will not work, since Flux hasn't supported K8s versions < 1.16.0 since Flux2 v0.0.7

This is an extreme case :)

I don't think we should balk at it though... I think we'll need to upgrade a cluster from K8s 1.15.x, to know for sure what happens when the cluster has resources that were initially created before server side apply had even reached beta.

SSA was marked beta in 1.16.x. It might be easier and just as effective to test against a cluster with 1.16.x, and just ensure that beta SSA feature is turned off before the volume is created. (That way, I don't also need to start with Flux v1...)

I've spent too much time on this today, but I think you have likely got something here. We may need to issue an advisory for clusters that were in production before a certain threshold date. Hopefully there's a way we can reconcile this more broadly.
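
For reference, a minimal sketch of a kind config for that kind of test, with the beta SSA feature gate turned off (the node image tag is an assumption and needs to match an image your kind release actually supports):

# hypothetical kind config: K8s 1.16 with the beta ServerSideApply feature gate disabled cluster-wide
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
featureGates:
  ServerSideApply: false
nodes:
- role: control-plane
  image: kindest/node:v1.16.15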

@Legion2
Contributor Author

Legion2 commented Feb 4, 2022

kind: PersistentVolume
metadata:
  annotations:
    pv.kubernetes.io/provisioned-by: rancher.io/local-path
  creationTimestamp: "2022-02-04T17:52:28Z"
  finalizers:
  - kubernetes.io/pv-protection
  managedFields:
  - apiVersion: v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        f:phase: {}
    manager: kube-controller-manager
    operation: Update
    time: "2022-02-04T17:52:28Z"

@kingdonb this is the PV, not the PVC.

@kingdonb
Member

kingdonb commented Feb 5, 2022

Whoops, you're right... I already tore down the test scaffold, so I'll have to repeat the test later. Sorry for the noise.

@somtochiama
Member

@Legion2 is it possible for you to paste the managedFields of the PVC from before you updated Flux?

@Legion2
Contributor Author

Legion2 commented Feb 5, 2022

No, I don't have that data.

@stefanprodan
Member

stefanprodan commented Feb 5, 2022

Removing f:volumeName from the managedFields with a kubectl patch should unblock you. But we need to figure out how kustomize-controller ended up owning that field. Was this PVC created with Flux v1 or with kubectl?

@Legion2
Contributor Author

Legion2 commented Feb 5, 2022

We used Flux from the beginning and used it for everything in the cluster, so I think it was created with Flux v1. However, we have since migrated things multiple times and had to fix things manually, so I'm not 100% sure kubectl wasn't involved here.

@Legion2
Contributor Author

Legion2 commented Feb 7, 2022

@stefanprodan I'm not familiar with the kubectl patch command; could you give an example of how to remove a managed field?

@stefanprodan
Member

stefanprodan commented Feb 7, 2022

Create a YAML file containing only the metadata and managed fields, remove the volumeName entry, then apply it with kubectl patch; docs here: https://kubernetes.io/docs/tasks/manage-kubernetes-objects/update-api-object-kubectl-patch/
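
For this particular PVC, the patch might look something like this (a sketch only; it assumes the kustomize-controller entry is at index 0 of managedFields, as in the YAML posted above):

# drop kustomize-controller's ownership of spec.volumeName on the claim
kubectl -n monitoring patch pvc influxdb-volume --type=json -p='[{"op": "remove", "path": "/metadata/managedFields/0/fieldsV1/f:spec/f:volumeName"}]'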

@monotek

monotek commented Feb 15, 2022

We had the same problem with a lot of Kustomizations and immutable clusterIP, volumeName and storageClassName fields when updating from kustomize-controller 0.14.1 to 0.20.2.

flux-system   backstage                   False   Service/backstage/patroni-metrics dry-run failed, reason: Invalid, error: Service "patroni-metrics" is invalid: [spec.clusterIPs[0]: Invalid value: []string(nil): primary clusterIP can not be unset, spec.ipFamilies[0]: Invalid value: []core.IPFamily(nil): primary ipFamily can not be unset]...   

Removing the managed fields via patch helps, though.

kubectl -n namespace patch svc service-name --type=json -p='[{"op": "remove", "path": "/metadata/managedFields/0/fieldsV1/f:spec/f:clusterIP"}]'
kubectl -n namespace patch pvc pvc-name --type=json -p='[{"op": "remove", "path": "/metadata/managedFields/0/fieldsV1/f:spec/f:volumeName"}]'
kubectl -n namespace patch storageclass storageclass-name --type=json -p='[{"op": "remove", "path": "/metadata/managedFields/0/fieldsV1/f:spec/f:storageClassName"}]'

@ludovicm67

I was facing the same issue.

Using the following commands was helpful for me:

kubectl -n NAMESPACE patch pvc PVC_NAME --type=json -p='[{"op": "remove", "path": "/metadata/managedFields/0/fieldsV1/f:spec/f:storageClassName"}]'
kubectl -n NAMESPACE patch pvc PVC_NAME --type=json -p='[{"op": "remove", "path": "/metadata/managedFields/0/fieldsV1/f:spec/f:volumeName"}]'

@SStorm

SStorm commented Feb 16, 2022

I've got the same problem migrating from Flux v1 to Flux v2 (0.26.1), with ingress-nginx.

Flux complains with:

❯ flux reconcile kustomization infra --with-source
► annotating GitRepository flux-system in flux-system namespace
✔ GitRepository annotated
◎ waiting for GitRepository reconciliation
✔ fetched revision aks1-westeurope/4bcf2a7eb50d2f869d217f850fa59955754d6375
► annotating Kustomization infra in flux-system namespace
✔ Kustomization annotated
◎ waiting for Kustomization reconciliation
✗ Kustomization reconciliation failed: Service/ingress-nginx/ingress-nginx dry-run failed, reason: Invalid, error: Service "ingress-nginx" is invalid: spec.clusterIPs[0]: Invalid value: []string(nil): primary clusterIP can not be unset

Here's what's in k8s:

apiVersion: v1
kind: Service
metadata:
  annotations:
    external-dns.alpha.kubernetes.io/hostname: [snip].
  creationTimestamp: "2020-10-22T16:20:39Z"
  finalizers:
  - service.kubernetes.io/load-balancer-cleanup
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
    kustomize.toolkit.fluxcd.io/name: infra
    kustomize.toolkit.fluxcd.io/namespace: flux-system
  name: ingress-nginx
  namespace: ingress-nginx
  resourceVersion: "192626589"
  uid: a79bc74c-aa57-4223-9f80-0b25249f13b9
spec:
  clusterIP: 10.113.45.120
  clusterIPs:
  - 10.113.45.120
  externalTrafficPolicy: Local
  healthCheckNodePort: 32210
  ports:
  - name: http
    nodePort: 32695
    port: 80
    protocol: TCP
    targetPort: http
  - name: https
    nodePort: 30120
    port: 443
    protocol: TCP
    targetPort: https
  selector:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
  sessionAffinity: None
  type: LoadBalancer
status:
  loadBalancer:
    ingress:
    - ip: [snip]

And here's what we always had in git:

kind: Service
apiVersion: v1
metadata:
  name: ingress-nginx
  namespace: ingress-nginx
  annotations:
    external-dns.alpha.kubernetes.io/hostname: ingress.aks1.westeurope.azure.cratedb.net.
  labels:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx
spec:
  type: LoadBalancer
  externalTrafficPolicy: Local
  ports:
  - name: http
    port: 80
    targetPort: http
  - name: https
    port: 443
    targetPort: https
  selector:
    app.kubernetes.io/name: ingress-nginx
    app.kubernetes.io/part-of: ingress-nginx

Banging my head on the table with this. Any help much appreciated.

@monotek

monotek commented Feb 16, 2022

If you want to see the actual problematic fields you have to add "--show-managed-fields" to your kubectl command.

See my post above for a workaround: #2386 (comment)

kubectl -n ingress-nginx patch svc ingress-nginx --type=json -p='[{"op": "remove", "path": "/metadata/managedFields/0/fieldsV1/f:spec/f:clusterIP"}]'

@SStorm

SStorm commented Feb 16, 2022

Thank you! I did not know that managed fields are no longer shown by default, so I assumed it was a different issue. Problem solved.

@kingdonb
Member

kingdonb commented Feb 16, 2022

If you have not tried upgrading to Flux v0.26.3 or greater, it has breaking changes around this area:

The issue it fixes is here:

This could easily be the same issue, and I think it most likely is. I see reports mentioning version v0.26.1, but I do not see anyone who has mentioned this issue on a version >= v0.26.3, which included the fix above (and another important one that @stefanprodan mentions below).

Flux now takes over all of the managed fields, so if you have edits that you expect to remain in the cluster but that are not in git, they will need their own field manager to avoid being overwritten by Flux. So I want to be careful about advising this upgrade, though it is important and it makes Flux work more as advertised (so I do not want to caution anyone away from it either).

https://fluxcd.io/docs/faq/#why-are-kubectl-edits-rolled-back-by-flux

If we can confirm this issue is still present in later versions of Flux, I will be glad to investigate. The kubectl patch described above should no longer be necessary after the upgrade. If anyone is still struggling with this, please let us know. 🙏
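
For out-of-band edits that need to survive (per the note above about setting a manager), one possible approach, not official Flux guidance, is to apply them server-side under a dedicated field manager instead of plain kubectl; --field-manager is a standard kubectl flag, and manual-edits / my-manual-change.yaml are placeholder names:

# apply the manual change under its own field manager so Flux does not claim the kubectl manager's fields
kubectl apply --server-side --field-manager=manual-edits -f my-manual-change.yaml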

@stefanprodan
Member

I think those who upgraded to v0.26.0, v0.26.1 or v0.26.2 will have this issue. In v0.26.3 we found a better way to take over the fields managed by kubectl. We'll need to point people who upgraded to v0.26.0-2 to this issue, as patching the managedFields to remove the immutable ones is the only way to fix this.

@stefanprodan stefanprodan changed the title Kustomization tries to modify immutable field VolumeName of PVC [flux v0.26.0-2] Kustomization tries to modify immutable fields Feb 16, 2022
@stefanprodan stefanprodan pinned this issue Feb 16, 2022
@mhsh64

mhsh64 commented Mar 1, 2022

But I am also wondering why there was no issue in our previous flux version?

Shouldn't we upgrade the flux version because of this issue?

@stefanprodan
Member

I'm not aware of any better way to solve this for now, you have to delete and recreate those services.

To avoid deleting the Service, a patch can be used to remove the ClusterIP from the managed fields:

kubectl patch svc my-svc --type=json -p='[{"op": "remove", "path": "/metadata/managedFields/0/fieldsV1/f:spec/f:clusterIP"}]'

Note that the index must match the kustomize-controller manager entry; the example above is for index 0.
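
To find the right index for a given object, something like this prints the position of the kustomize-controller entry (a sketch; it assumes jq is available and my-svc is a placeholder):

# outputs the array index of the managedFields entry whose manager is kustomize-controller
kubectl get svc my-svc -o json | jq '[.metadata.managedFields[].manager] | index("kustomize-controller")'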

@tshak

tshak commented Mar 1, 2022

I believe that I've reproduced this issue when migrating from Flux v1 to Flux v2. A Knative Serving resource manages all annotations, and its validating webhook prevents modification of these annotations. The YAML in git does not contain any annotations. Everything applies fine with Flux v1. When applying with Flux v2, I get the following error:

BadRequest, error: admission webhook "validation.webhook.serving.knative.dev" denied the request: validation failed: annotation value is immutable: metadata.annotations.serving.knative.dev/creator

Note that metadata.annotations is a managed field owned by Knative Serving. Is it fair to assume that v0.27.3 will fix this issue without a workaround? Please let me know if you need full debug info.

@tshak

tshak commented Mar 2, 2022

I'm still reproducing my issue with v0.27.3. I misread the previous comment about the fixes being in 0.26.3, not 0.27.3, so this is not surprising. It's not clear to me why the patch would mutate existing annotations, since 1) our resource in git contains no annotations and 2) the annotations are managed by another controller, as specified in managedFields. This is a Flux v1 -> Flux v2 migration blocker for us.

@tshak

tshak commented Mar 2, 2022

I've tested using flux build kustomization... | kubectl apply -f - and I can apply the manifest without any issues. This leads me to believe that the issue is with how the kustomize controller is applying the manifests.

@kingdonb
Member

kingdonb commented Mar 2, 2022

If knative updates annotations in resources that Flux applies, then what do you see in managedFields before Flux overwrites it?

kubectl get [resource that errored] -oyaml --show-managed-fields

@tshak the problem is likely that Flux doesn't allow arbitrary writes from arbitrary writers to persist; that is considered drift and is reverted. This is new behavior: https://fluxcd.io/docs/faq/#why-are-kubectl-edits-rolled-back-by-flux

Changes to the way Flux applies resources were required after server-side apply was implemented, because of some issues with Kubernetes. Many users reported that Flux was allowing drift to persist in their clusters even though they had attempted to overwrite it in git, or had already deleted that section of config, yet it still persisted in the cluster.

Flux has adjusted the approach to honor the expectations of GitOps users everywhere that expect configuration to reflect what is in git, and not some drift introduced from an unknown source. But some resources are not consistently putting their (intentional, expected, according-to-specifications) drift in places where it can work with GitOps that behaves this way.

Instead of using kubectl apply -f, use kubectl apply --server-side -f and see what happens. That's more similar to what Flux is doing now than kubectl apply -f, which is a client-side apply. Also, please provide the content of the managed fields; it will help us better understand what Knative is doing and whether there is any workaround we can build into Flux to resolve this now.
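
Roughly, a comparison closer to what the controller does might look like this (a sketch; "apps" and ./apps are placeholder kustomization name and path, and --field-manager=kustomize-controller only approximates the controller's apply):

# render the kustomization like the controller would, then server-side apply it under the controller's manager name
flux build kustomization apps --path ./apps | kubectl apply --server-side --field-manager=kustomize-controller -f -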

@tshak

tshak commented Mar 3, 2022

Thank you for the detailed explanation. Here is an example Knative Service that is failing to apply:

Git:

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  name: emojivoto
  namespace: demo
spec:
  template:
    spec:
      containers:
      - name: emojivoto
        image: buoyantio/emojivoto-web:v6
        env:
        - name: EMOJISVC_HOST
          value: emojivoto-emoji-svc.demo.svc.cluster.local:80
        - name: VOTINGSVC_HOST
          value: emojivoto-voting-svc.demo.svc.cluster.local:80
        - name: INDEX_BUNDLE
          value: dist/index_bundle.js
        - name: WEB_PORT
          value: "80"
        ports:
        - containerPort: 80
          protocol: TCP

Server (status redacted for brevity):

apiVersion: serving.knative.dev/v1
kind: Service
metadata:
  annotations:
    serving.knative.dev/creator: system:serviceaccount:flux:flux
    serving.knative.dev/lastModifier: system:serviceaccount:flux:flux
  creationTimestamp: "2021-12-09T12:21:32Z"
  generation: 5
  labels:
    kustomize.toolkit.fluxcd.io/name: apps-kubernetes-state
    kustomize.toolkit.fluxcd.io/namespace: flux-system
  managedFields:
  - apiVersion: serving.knative.dev/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations: {}
        f:labels:
          .: {}
          f:kustomize.toolkit.fluxcd.io/name: {}
          f:kustomize.toolkit.fluxcd.io/namespace: {}
      f:spec:
        .: {}
        f:template:
          .: {}
          f:spec:
            .: {}
            f:containers: {}
    manager: kustomize-controller
    operation: Apply
    time: "2021-12-09T15:49:33Z"
  - apiVersion: serving.knative.dev/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        .: {}
        f:address:
          .: {}
          f:url: {}
        f:conditions: {}
        f:latestCreatedRevisionName: {}
        f:latestReadyRevisionName: {}
        f:observedGeneration: {}
        f:traffic: {}
        f:url: {}
    manager: controller
    operation: Update
    time: "2021-12-09T15:40:00Z"
  name: emojivoto
  namespace: demo
  resourceVersion: "533389058"
  uid: d367349c-aa69-4e82-96f8-f88804108c27
spec:
  template:
    metadata:
      creationTimestamp: null
    spec:
      containerConcurrency: 0
      containers:
      - env:
        - name: EMOJISVC_HOST
          value: emojivoto-emoji-svc.demo.svc.cluster.local:80
        - name: VOTINGSVC_HOST
          value: emojivoto-voting-svc.demo.svc.cluster.local:80
        - name: INDEX_BUNDLE
          value: dist/index_bundle.js
        - name: WEB_PORT
          value: "80"
        image: buoyantio/emojivoto-web:v6
        name: emojivoto
        ports:
        - containerPort: 80
          protocol: TCP
        readinessProbe:
          successThreshold: 1
          tcpSocket:
            port: 0
        resources:
          limits:
            cpu: "1"
            memory: 2G
          requests:
            cpu: 50m
            memory: 50M
      enableServiceLinks: false
      timeoutSeconds: 30
  traffic:
  - latestRevision: true
    percent: 100

@tshak

tshak commented Mar 3, 2022

I was able to successfully perform a server-side apply, but it required --force-conflicts due to the managed field .spec.template.spec.containers. This is expected, as Knative manages this field (although I'm not sure why). When I set force: true for the Kustomization resource, I still see the annotation error:

Service/demo/emojivoto dry-run failed, reason: BadRequest, error: admission webhook "validation.webhook.serving.knative.dev" denied the request: validation failed: annotation value is immutable: metadata.annotations.serving.knative.dev/creator

Interestingly, a flux diff... using the exact same arguments fails with this error as well.

@stefanprodan
Member

@tshak try to remove the annotations with kubectl -n demo patch service emojivoto --type=json -p='[{"op": "remove", "path": "/metadata/managedFields/0/fieldsV1/f:metadata/f:annotations"}]'

@tshak

tshak commented Mar 3, 2022

After patching the resource I can still repro the issue. I don't believe that the root cause is due to managed fields or server-side apply behaviour. Here is a recap of what I've tried (all apply commands are using --server-side):

  1. kubectl apply... the standalone yaml from the git repo that contains only the offending resource (emojivoto knative service) ✅
  2. flux build... | kubectl apply... the Kustomization which contains the offending resource ✅
  3. flux diff... the Kustomization which contains the offending resource ❌
  4. Kustomization applied by the kustomize-controller ❌

In all ❌ cases, the error is the same annotation value is immutable error from the knative admission webhook.

@stefanprodan
Member

Flux performs a server-side apply, and since it manages f:annotations it tries to remove them, as they are not in Git. After you've run the kubectl patch, can you confirm that the annotations are no longer managed by kustomize-controller?
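
To verify, something like this should show whether the kustomize-controller entry still lists f:annotations (ksvc is the Knative Serving shortname; if it is not registered in your cluster, use the full service.serving.knative.dev resource name):

# inspect the live object's managedFields for the annotations entry
kubectl -n demo get ksvc emojivoto -o yaml --show-managed-fields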

@tshak

tshak commented Mar 3, 2022

I have confirmed that f:annotations: {} is no longer present in the offending resource. However, the error persists.

@stefanprodan
Member

So if you do kubectl apply --server-side, does it fail the same way?

@tshak

tshak commented Mar 3, 2022

In this case it fails but as expected:

error: Apply failed with 1 conflict: conflict with "kustomize-controller": .spec.template.spec.containers

It passes by adding --force-conflicts.

@stefanprodan
Member

I see "serving.knative.dev/creator: system:serviceaccount:flux:flux" in the annotations, but there is no such service account in the cluster. Have you changed the SA?

@tshak

tshak commented Mar 3, 2022

This resource was created with FluxV1. I'm testing a FluxV1->FluxV2 migration.

@stefanprodan
Member

stefanprodan commented Mar 3, 2022

@tshak there is nothing we can do in Flux about this; if Knative decided to make annotations immutable, then you can't reconcile this with anything other than the flux SA. I have no clue why Knative would do such a crazy thing...

@tshak

tshak commented Mar 3, 2022

I think that it's just these specific knative annotations that are immutable. And again, even a flux build... | kubectl apply --server-side... works.

@stefanprodan
Member

stefanprodan commented Mar 3, 2022

Using kubectl works because there is no SA involved, but when you apply things from inside the cluster, Knative wants the flux SA. I see no way around this other than deleting the service; it will then be recreated under the kustomize-controller SA.

@tshak

tshak commented Mar 3, 2022

There was an error on my end. The managedFields patch did in fact work. The error I was getting was pointing to a similarly named emojivoto service. This means that we'll need to patch all knative services prior to upgrading to Flux v2. Thank you for taking the time to help me debug this issue. 🙏

@mrtndwrd

I'm also running into this issue. The system we're running doesn't really enable us to manually run patches. Is there any way we can apply these patches through flux to roll it out to all the servers tracking our repository?

@mdnfiras

I'm running into this problem after updating Flux from 0.24.1 to 0.28.5:

PersistentVolumeClaim/my-db/my-pvc dry-run failed, reason: Invalid, error: PersistentVolumeClaim "my-pvc" is invalid: spec: Forbidden: spec is immutable after creation except resources.requests for bound claims
  core.PersistentVolumeClaimSpec{
        AccessModes:      {"ReadWriteOnce"},
        Selector:         nil,
        Resources:        {Requests: {s"storage": {i: {...}, s: "20Gi", Format: "BinarySI"}}},
-       VolumeName:       "",
+       VolumeName:       "pvc-xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxxx",
-       StorageClassName: nil,
+       StorageClassName: &"standard",
        VolumeMode:       &"Filesystem",
        DataSource:       nil,
  }

I'm running on GKE 1.20.

@snahelou

snahelou commented Apr 26, 2022

Hello

Patch did the job 👍

kubectl -n demo patch pvc test --type=json -p='[{"op": "remove", "path": "/metadata/managedFields/0/fieldsV1/f:spec/f:volumeMode"}]'
kubectl -n demo patch pvc test --type=json -p='[{"op": "remove", "path": "/metadata/managedFields/0/fieldsV1/f:spec/f:volumeName"}]'

Flux: 0.28.5 / K8s: 1.22

Thanks

@uclaeaslam

I've run into this issue too. I upgraded from Flux v1 to v2 (latest, 0.29.3). Everything appeared to be working fine (after modifying all the YAMLs to use the default namespace, since v2 requires a namespace). A few days later, I noticed that I was getting reconciliation errors about the immutable fields.

@stefanprodan
Member

stefanprodan commented May 3, 2022

@uclaeaslam it seems that Flux v1 took ownership of fields it didn't manage (I suspect this is a kubectl bug). To fix this you need to remove those fields with a patch; please see the examples in this thread. If the problem persists after patching, please open an issue in the kustomize-controller repo.

@uclaeaslam

Patching did the trick

The reason I thought it didn't work was that Flux wasn't rerunning the reconciliation. I had to run:
flux suspend kustomization flux-system
flux resume kustomization flux-system
After that, it retried.

@stefanprodan
Member

@uclaeaslam the reconciliation interval is set to 10 minutes. You can trigger it with a git commit or with the CLI:

flux reconcile ks flux-system --with-source

@stefanprodan stefanprodan unpinned this issue Jun 8, 2022
@zzvara

zzvara commented Jun 10, 2022

Unfortunately, this is still an issue with Flux and Knative. We cannot use Knative Services in a HelmChart, so we are looking into Argo CD and whether it would work there.
