Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sync is not working with ignoreDifferences and RespectIgnoreDifferences=true #9678

Closed
3 tasks done
Hax7 opened this issue Jun 15, 2022 · 67 comments · Fixed by #17693
Closed
3 tasks done

sync is not working with ignoreDifferences and RespectIgnoreDifferences=true #9678

Hax7 opened this issue Jun 15, 2022 · 67 comments · Fixed by #17693
Labels
bug Something isn't working

Comments

@Hax7
Copy link

Hax7 commented Jun 15, 2022

Checklist:

  • I've searched in the docs and FAQ for my answer: https://bit.ly/argocd-faq.
  • I've included steps to reproduce the bug.
  • I've pasted the output of argocd version.

Describe the bug

I have an application where I want to ignore the image of the deployment and ignore it from being synced, I can see that the image is ignored and there is new env variables to be added in app diff but once I sync it it succeeds without applying the env variables like it's ignored with the image itself and I still see them in app diff!. This is the application:

project: default
source:
  repoURL: 'git@bitbucket.org:x/x.git'
  path: kubernetes/production
  targetRevision: master
  plugin:
    name: argocd-vault-plugin
destination:
  server: 'https://x.x.x.x'
syncPolicy:
  syncOptions:
    - RespectIgnoreDifferences=true
ignoreDifferences:
  - group: apps
    kind: Deployment
    name: service
    jqPathExpressions:
      - >-
        .spec.template.spec.containers[] | select(.name == "service").image

To Reproduce

Apply an application with ignoreDifferences like the application above and add new changes inside the same container specs
The new specs will not be applied after sync.

Expected behavior

I expect the image to be ignored and the new env variables to be applied/synced

Version

❯ argocd version --grpc-web
argocd: v2.2.7+b8e154f
  BuildDate: 2022-03-09T01:07:22Z
  GitCommit: b8e154f767412172bb0c2b41131460c34d2a0c73
  GitTreeState: clean
  GoVersion: go1.16.11
  Compiler: gc
  Platform: linux/amd64
argocd-server: v2.3.4+ac8b7df
  BuildDate: 2022-05-18T11:41:37Z
  GitCommit: ac8b7df9467ffcc0920b826c62c4b603a7bfed24
  GitTreeState: clean
  GoVersion: go1.17.10
  Compiler: gc
  Platform: linux/amd64
  Ksonnet Version: v0.13.1
  Kustomize Version: v4.4.1 2021-11-11T23:36:27Z
  Helm Version: v3.8.0+gd141386
  Kubectl Version: v0.23.1
  Jsonnet Version: v0.18.0
@Hax7 Hax7 added the bug Something isn't working label Jun 15, 2022
@bmarick
Copy link

bmarick commented Oct 11, 2022

I am also experiencing this issue

@si-c613
Copy link
Contributor

si-c613 commented Oct 27, 2022

We are also seeing the issue, we have two ignore differences on Deployments, one for spec.replicas using jsonPointers as these can be changed by HPA and other external processes, and one for the image .spec.template.spec.containers[].image using a jqPathExpression due to the array of containers (see below).

Through experimentation I've found that if you change .spec.template.spec.containers[0].resources.memory.limits to a different value, then OutOfSync appears, yet when you click Sync with RespectIgnoreDifferences=true the Sync task sees nothing to change and marks as complete but the App still shows OutOfSync. When auto sync is enabled this causes infinite syncs.
I then revert the change in the repo.
If I change .spec.template.metadata.labels by adding a new label, then OutOfSync appears and when I press Sync with RespectIgnoreDifferences=true the Sync applies the change and maintains the image.

So my assumption is that using an array, potentially only a 'wildcard' array, within JQ breaks the Sync logic

  ignoreDifferences:
  - group: apps
    jsonPointers:
    - /spec/replicas
    kind: Deployment
  - group: apps
    jqPathExpressions:
    - .spec.template.spec.containers[].image
    kind: Deployment

@si-c613
Copy link
Contributor

si-c613 commented Oct 27, 2022

Did a further experiment and found that it doesn't appear to be just the JQ expression but in fact anything with an array.
Knowing we only tend to have single container Pods I changed the jqPathExpressions to jsonPointers using /0 in place of the [] and found an incomplete sync when changing the memory limit as above

      ignoreDifferences:
      - group: apps
        jsonPointers:
        - /spec/replicas
        kind: Deployment
      - group: apps
        jsonPointers:
        - /spec/template/spec/containers/0/image

@Cr4mble
Copy link

Cr4mble commented Nov 8, 2022

Facing also the same issue. Here is a simple PoC of the error to reproduce it:

apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: ingress-nginx-testing
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  source:
    repoURL: 'https://charts.bitnami.com/bitnami'
    targetRevision: 13.2.13
    chart: nginx
    helm:
      version: v3
      values: |
        image:
          tag: "1.23.2-debian-11-r2"
  destination:
    namespace: default
    server: https://kubernetes.default.svc
  syncPolicy:
    syncOptions:
      - ApplyOutOfSyncOnly=true
      - RespectIgnoreDifferences=true
  ignoreDifferences:
    - group: apps
      kind: Deployment
      jqPathExpressions:
        - .spec.template.spec.containers[] | select(.name=="nginx") | .env[] | select(.name=="BITNAMI_DEBUG") | .value

If you change the image tag from 1.23.2-debian-11-r2 to 1.23.2-debian-11-r1 the app will be shown as out-of-sync. If you sync now the pod will not be replaced and after the sync the diff is still there:
Screenshot

This occurs on Argo 2.4.8 and 2.4.14, also on the newest version 2.5.2.

@QuingKhaos
Copy link
Contributor

QuingKhaos commented Dec 12, 2022

Same behavior on arrays if managedFieldsManagers is used. ArgoCD 2.4.14

      syncPolicy:
        syncOptions:
          - RespectIgnoreDifferences=true
      ignoreDifferences:
        - group: apps
          kind: Deployment
          managedFieldsManagers:
            - openshift-controller-manager

OpenShift allows triggering image stream updates directly on the Deployment and updates the container image. Was easier to use the manager name, so the difference is only ignored, if the trigger is used in our apps.

@cehoffman
Copy link
Contributor

cehoffman commented Dec 15, 2022

Confirmed this is also not working with 2.5.2. We are have an ignore like this

ignoreDifferences:
- group: certmanager.io
  kind: Certificate
  jsonPointers:
  - /spec/dnsNames/2

We see the ignore works correctly, but the 3rd dns name is removed when syncing with RespectIgnoreDifferences. We also tried /spec/dnsNames in case there is an oddity with arrays and had the same behavior. Our source file has this for the item being removed for live state:

spec:
  dnsNames:
    - in git
    - also in git
    - manually added and needs to be ignored

And desired state

spec:
  dnsNames:
    - in git
    - also in git

@kaiquerass
Copy link

any updates on this issue?
got the same problem :/

@diogokiss
Copy link

I'm facing the same issue with the list of environment variables for a container. Any updates on this issue? Or even a work around? 🙁

@Len4i
Copy link

Len4i commented Jan 25, 2023

We are using some operator that is changing the env vars of the pod in the deployment. So now, because of this issue, argo rolling back operator changes to its defaults on every sync (and then operator changes it again).
So we have a restart of the deployment on every change of the application, even when it is not related to mentioned deployments at all.

@QuingKhaos
Copy link
Contributor

@Len4i did you forget to configure ignoreDifferences in your apps so the operator changes are at least ignored with RespectIgnoreDifferences=true?

@Len4i
Copy link

Len4i commented Jan 26, 2023

@QuingKhaos, I'll elaborate
We have deployment applied with argo, on the app configured ignoreDifferences

      ignoreDifferences:
      - group: apps
        kind: Deployment
        name: some-deployment
        jqPathExpressions:
         - '.spec.template.spec.containers[] | select(.name == "some-container").env[] | select(.name == "_JAVA_OPTIONS")'

at this stage behavior is following:
on every change in the app, argocd does "full" sync, overriding env var to what it see in the git, as it happens, the operator sees changes and applies its patch to env var. Argocd according to ignoreDifferences not doing anything with this change. Until a change in any other place of the app.

After adding RespectIgnoreDifferences=true behaviors is the following:
argocd not triggering sync at all when change is related to anything inside '.spec.template.spec.containers[].
the workaround of syncing from UI with removing V from RespectIgnoreDifferences is working, but it is "not the GitOps that we want" :)

@QuingKhaos
Copy link
Contributor

the workaround of syncing from UI with removing V from RespectIgnoreDifferences is working, but it is "not the GitOps that we want" :)

unfortunately this is the only viable workaround currently, to not end in a sync loop.

@QuingKhaos
Copy link
Contributor

QuingKhaos commented Jan 26, 2023

Same behavior on arrays if managedFieldsManagers is used. ArgoCD 2.4.14

      syncPolicy:
        syncOptions:
          - RespectIgnoreDifferences=true
      ignoreDifferences:
        - group: apps
          kind: Deployment
          managedFieldsManagers:
            - openshift-controller-manager

OpenShift allows triggering image stream updates directly on the Deployment and updates the container image. Was easier to use the manager name, so the difference is only ignored, if the trigger is used in our apps.

In my use case, if we update some value in .spec.template.spec.containers[], like .resources.limits.cpu the diff for the change is not applied. But if we remove .resources.limits.cpu, then the diff for the removal is applied correctly.

Problem still exists in ArgoCD v2.4.40

@Len4i
Copy link

Len4i commented Jan 26, 2023

what are the chances that this issue is related?
i just saw that argo using gojq and there was bug in compiler related to arrays that was fixed in 0.12.11
and argo 2.5 uses 0.12.9 at the latest

@RamazanKara
Copy link

RamazanKara commented Feb 15, 2023

same issue on latest version

@kahirokunn
Copy link

same issue on latest version

@llavaud
Copy link

llavaud commented Apr 6, 2023

same here with the latest version :(

I have RespectIgnoreDifferences=true and the following ignoreDifferences:

  ignoreDifferences:
    - group: apps
      kind: Deployment
      jqPathExpressions:
        - .spec.template.spec.containers[] | select(.name == "container1").resources

With this configuration if I add a new container in my deployment, application stay OutOfSync and never update the Deployment with the new container.

Same thing occurs with jsonPointers:

      jsonPointers:
        - /spec/template/spec/containers/0/resources

Interesting things occurs with managedFieldsManagers:

      managedFieldsManagers:
        - myManager

With this managedFieldsManagers configuration, on a fresh deployment, if I add a container in my spec, it is added and synced correctly, then I make a change in spec.template.spec.containers using my custom manager and after that if I try to add another container the problem occurs, app stay OutOfSync.
(If I make a change in another field than spec.template.spec.containers, there is no problem)

Syncing from the UI do not work until I disable the RespectIgnoreDifferences option.

@dinukarajapaksha
Copy link

Is there any update on this? Having this issue on the latest version

{
    "Version": "v2.7.3+e7891b8.dirty",
    "BuildDate": "2023-05-24T15:05:34Z",
    "GitCommit": "e7891b899a35dca06ae94965ea5ae2a86b344848",
    "GitTreeState": "dirty",
    "GoVersion": "go1.19.6",
    "Compiler": "gc",
    "Platform": "linux/amd64",
    "KustomizeVersion": "v5.0.1 2023-03-14T01:32:48Z",
    "HelmVersion": "v3.11.2+g912ebc1",
    "KubectlVersion": "v0.24.2",
    "JsonnetVersion": "v0.19.1"
}

@thombohlk
Copy link

thombohlk commented Jun 14, 2023

I had a similar issue where I couldn't sync because of a difference in an immutable field which I wanted to ignore. However, setting ServerSideApply=true fixed the issue for me:

  syncPolicy:
    syncOptions:
    - ServerSideApply=true
    - RespectIgnoreDifferences=true
  ignoreDifferences:
  - group: '*'
    kind: PersistentVolumeClaim
    jqPathExpressions:
    - .spec.volumeName

@ryan-dyer-sp
Copy link

ryan-dyer-sp commented Jun 15, 2023

Same here with 2.7.4

ignoreDifferences:
  - group: admissionregistration.k8s.io
    kind: MutatingWebhookConfiguration
    name: aws-load-balancer-webhook
    namespace: aws-lb-controller
    jqPathExpressions:
      - '.webhooks[]?.clientConfig.caBundle'

Still showing differences on the caBundle.

I see the gojq library in question was recently updated in main but doesnt appear to have made it into the 2.7.4 release. So hopefully it will go into the 2.7.5 release.

@asaf400
Copy link

asaf400 commented Jun 28, 2023

Same here with 2.7.6:

ignoreDifferences:
    - group: batch
      kind: CronJob
      jqPathExpressions:
      - .spec.jobTemplate.spec.template.spec.containers[] | select(.image|test(":0.19.*")).image
  syncPolicy:
    automated:
    syncOptions:
      - RespectIgnoreDifferences=true
      - ApplyOutOfSyncOnly=true

Upon changing container command arguments, ArgoCD sync succeeds with unchanged resource,
we are using argoproj-labs/argocd-image-updater,
that particular jq is about ignoring our double tagged images in our registry ( we do both minor 'latest' and patch update upon deployment, similar to how you would have python:3.11 equivalent to python:3.11.4 as of writing this )

the workaround of syncing from UI with removing V from RespectIgnoreDifferences is working, but it is "not the GitOps that we want" :)

indeed works, and the resource is updated..
this is not GitOps.. 🫤

@asaf400 asaf400 mentioned this issue Jul 4, 2023
@crenshaw-dev
Copy link
Member

There may be more than one problem, but I think most of them can be traced back to this function:

func intersectMap(templateMap, valueMap map[string]interface{}) map[string]interface{} {

Imagine I want to modify a CronJob to add a new container.

To construct the patch, Argo CD takes the git state (templateMap) and the live state (valueMap) and merges them, using this intersectMap function.

When intersectMap encounters an array, it loops over the array from git (say, an array with two containers) and for each item merges the same array from the live object (say, an array with only one container). If the array in the live object is shorter than the array in the git object, intersectMap just drops any remaining items from the git state. So the patch contains only the first container.

That's the answer for the specific case of "adding a new container." The general answer is: this patch function is custom and is not Kubernetes-aware. A Kubernetes-aware patch function would have merged the array on a merge key (like container name) if present.

@crenshaw-dev
Copy link
Member

That's the boring explanation of "what seems to be wrong." In order to fix everyone's use cases, I need details about what is failing:

  1. live state of the resource, in yaml
  2. desired state of the resource, in yaml
  3. ignoreDifferences rules

We have those for a few use cases above. Those can inform the unit tests we write for the fix. But the more cases we can cover, the better.

@FredrikAugust
Copy link
Contributor

@blakepettersson @crenshaw-dev

I'm trying to add/update the liveness/readiness probes on a container, and have the following ignore "statement":

      ignoreDifferences:
      - group: argoproj.io
        kind: "Rollout"
        jqPathExpressions:
        - ".spec.template.spec.containers[] | select(.name == \"init\") | .image"

This is set on an ApplicationSet, but I don't think that should matter.

init is the name of our application — not to be confused with init containers 😄

In this example, I have changed the initialDelaySeconds from 20 to 15 seconds which is correctly detected in the app diff

image

The application is also marked as "OutOfSync", which is correct, but when I sync the application with "Respect Ignore Differences" it loads for a split second and then does nothing. Hope that was a sufficient example.

@blakepettersson
Copy link
Member

@FredrikAugust could you post the full (or somewhat redacted) live state of the Rollout object, as well as the full (or somewhat redacted) desired state of said object?

@FredrikAugust
Copy link
Contributor

@blakepettersson Sure!

I've posted the live state to a gist here: https://gist.github.com/FredrikAugust/8ba73d9665b60b168ab96d3f5ec69949

The desired state, if I understand the UI correctly should be that, but with the initialDelaySeconds: 15 instead of : 20. I don't have the desired state at ready right now as I've manually applied the changes, but it should be easy to reproduce.

@blakepettersson
Copy link
Member

@FredrikAugust indeed it was easy to reproduce, and it also seems like it's currently failing locally. I'll look into that and fix it, and push the update to #14602.

@davidmontoyago
Copy link

hi folks, update on my comment above: #9678 (comment)

for those seeing this issue with Secrets, in our case this was caused by the target Secret resource not having attribute data set. we set the data field to a foo value (foo: 'bar') and it began respecting the ignoreDifferences. so it's unable to match on the fields when a null value is coming from the source manifest AND there are out-of-band modifications to the live object. this hack only works for key-value fields like data though.

@crenshaw-dev
Copy link
Member

Sounds like a good case to unit test and add to #14602 :-)

blakepettersson added a commit to blakepettersson/argo-cd that referenced this issue Sep 27, 2023
There's an issue with `intersectMap` not appending items for array
properties where the live array is smaller than the desired array.

Hopefully fixes argoproj#9678.

Signed-off-by: Blake Pettersson <blake.pettersson@gmail.com>
@zonzamas
Copy link

I think I've hit the same bug. Is there any progress or estimation on this?

@akorzy-pl
Copy link
Contributor

Hello, we are experiencing this issue on ArgoCD v2.8.1

@yevhenvolchenko
Copy link

ArgoCD v2.9.2 - same issue

@cirano-eusebi
Copy link

would this also apply when ignoring map keys (like in labels/annotations)?
I have a global ignore:

  resource.customizations.ignoreDifferences.all: |
    managedFieldsManagers:
    - kube-controller-manager
    jsonPointers:
    - /spec/replicas
    - /metadata/annotations/reloader.stakater.com~1last-reloaded-from
    - /spec/template/metadata/annotations/reloader.stakater.com~1last-reloaded-from

And then a live manifest:

spec:
  template:
    metadata:
      annotations:
        reloader.stakater.com/last-reloaded-from: >-
          {"type":"CONFIGMAP","name":"REDACTED","namespace":"REDACTED","hash":"4a665ab5539934f5d6ccd39b90976ab28786bdcc","containerRefs":["app"],"observedAt":1703227429}

And a desired manifest:

spec:
  template:
    metadata:
      annotations: {}

This creates the following diff

template:
  metadata:
- annotations: {}

The application gets stuck in Healthy, OutOfSync, Sync OK

@asaf400
Copy link

asaf400 commented Dec 27, 2023

I like how I'm following this issue,
and every week or so I basically get an email notification with a *joke XD

Thanks for making me laugh once again @cirano-eusebi

** I'll be ruthlessly truthful, The joke is this 'tool' and that more and more people are being affected by this issue.. (I'm on your team)
I tend to try and not to spam a meaningful conversations, but 4 months have past since my last outburst, and also, my Argo instance malfunctioned today [got stuck refreshing], as expected at this point..

#9678 (comment)
Sorry 🙏

@saumeya
Copy link
Contributor

saumeya commented Jan 16, 2024

@QuingKhaos, I'll elaborate We have deployment applied with argo, on the app configured ignoreDifferences

      ignoreDifferences:
      - group: apps
        kind: Deployment
        name: some-deployment
        jqPathExpressions:
         - '.spec.template.spec.containers[] | select(.name == "some-container").env[] | select(.name == "_JAVA_OPTIONS")'

at this stage behavior is following: on every change in the app, argocd does "full" sync, overriding env var to what it see in the git, as it happens, the operator sees changes and applies its patch to env var. Argocd according to ignoreDifferences not doing anything with this change. Until a change in any other place of the app.

After adding RespectIgnoreDifferences=true behaviors is the following: argocd not triggering sync at all when change is related to anything inside '.spec.template.spec.containers[]. the workaround of syncing from UI with removing V from RespectIgnoreDifferences is working, but it is "not the GitOps that we want" :)

Hi @Len4i what do you mean by removing V from RespectIgnoreDifferences ? Do you mean disabling it?

@Len4i
Copy link

Len4i commented Jan 16, 2024

Hi @Len4i what do you mean by removing V from RespectIgnoreDifferences ? Do you mean disabling it?

Hi @saumeya,
Yes, disabling RespectIgnoreDifferences for this specific sync triggered from UI.

@druskus20
Copy link

I confirm that the original issue still persists, however, this workaround seemed to do the trick for me

  - kind: Secret
    name: aws-load-balancer-tls
    jsonPointers:
    - /data/ca.crt
    - /data/tls.crt
    - /data/tls.key
  - group: admissionregistration.k8s.io
    kind: MutatingWebhookConfiguration
    name: aws-load-balancer-webhook
    jsonPointers:
    - /webhooks/0/clientConfig/caBundle
    - /webhooks/1/clientConfig/caBundle
    - /webhooks/2/clientConfig/caBundle
  - group: admissionregistration.k8s.io
    kind: ValidatingWebhookConfiguration
    name: aws-load-balancer-webhook
    jsonPointers:
    - /webhooks/0/clientConfig/caBundle
    - /webhooks/1/clientConfig/caBundle
    - /webhooks/2/clientConfig/caBundle

@gerardocorea
Copy link
Contributor

gerardocorea commented Jan 31, 2024

I am working on testing this out but want to get some clarification on my usage. if I have a jq expression for ignoring an array in a resource spec. If I was to delete the entire argo application without cascading and have it recreated, would having RespectIgnoreDifferences=true in my syncOptions for that application avoid it applying the desired state for the array from git on that resource again if the resource exists already? or would it still initially sync everything on the resource again and then going forward ignore.

Just trying to better understand the documentation at https://argo-cd.readthedocs.io/en/stable/user-guide/sync-options/#respect-ignore-difference-configs

@FredrikAugust
Copy link
Contributor

Hi. I'm wondering if there are any updates on this issue? We're still affected by this, and have to use a workaround to get our CD pipelines to function as expected.

We would like to have the image of one of our containers be ignored so we can update that from the pipeline, and then manage everything else with ArgoCD. This would work very well, but due to this issue it doesn't, and we ended up managing the image tag in the manifests in git and running sync from the CD pipeline. This feels a bit hacky, and as a result all other scheduled changes on the apps will be applied as well. If anyone has any suggestions as to other ways you could do CD with argocd that would be greatly appreciated.

@crenshaw-dev You mentioned in the linked PR that this should be solved by improving the diffing/patch generation functionality by using the strategic patch merge. Is this on the roadmap, or is there any estimate of when it will be addressed?

@si-c613
Copy link
Contributor

si-c613 commented Jan 31, 2024

There's no more updates than what is above. I believe that Blake would welcome contributions to the MR.

As for your 'workaround' I'm not sure what the issue is. ArgoCD is a GitOps CD tool. It feels like what you are trying to achieve is something different to GitOps and why not just do a kubectl apply in the pipeline?

What you describe is more or less exactly what our pipelines do by design, except we have auto sync on so anything that goes to the main branch gets synced immediately. We still use sync from the pipelines to report status back to the merger and manage tests/promotions.

Two of the major goals we had were removing privileged access from pipelines and DR, this pipeline design solves both. The pipeline only needs read access to the cluster and if we ever lose the cluster everything is in Git.

@gerardocorea
Copy link
Contributor

gerardocorea commented Jan 31, 2024

My issue is as follows.

  1. I have an argo application syncing a helm chart to a cluster.
  2. I apply the following ignoreDifferences configuration
ignoreDifferences:
- group: install.istio.io
  kind: IstioOperator
  namespace: istio-system
  name: redacted
  jqPathExpressions:
  - .spec.components.ingressGateways[0].k8s.service.ports[]

Along with

syncPolicy:
  syncOptions:
    - RespectIgnoreDifferences=true

The resourc ports arrays looks like below to begin (initial sync and creation) and a k8s operator appends and removes ports based off some business logic we instrument, We need argo to not label it as out of sync and also to not remove those ports added during a sync (which its currently doing even with the ignoreDifference config provided above). My other test case is also deleting the entirety of the argo application

apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  labels:
    argocd.argoproj.io/instance: redacted
  name: redacted
  namespace: istio-system
spec:
  components:
    ingressGateways:
      - enabled: true
        k8s:
          service:
            ports:
              - name: http2
                port: 80
                targetPort: 8080
              - name: https
                port: 443
              - name: status-port
                port: 15021
                targetPort: 15021

@zswanson
Copy link

zswanson commented Feb 4, 2024

The last 'update' is effectively that the PR from blake isn't the best approach to take and it has be re-thought.
#14602 (comment)

@blakepettersson
Copy link
Member

blakepettersson commented Mar 11, 2024

With #14602, trying to fix this quickly got hairy and had a bunch of edge cases which didn't actually work.

I took another look at this issue and think that a more viable approach would be to rethink ignoreDifferences in terms of Server-Side Apply / Server-Side Diff. This wouldn't be a quick fix, but IMO would solve the problem more elegantly.

Here's an example using kubectl apply --server-side, with one of the test cases @si-c613 provided.

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: client
  name: client
spec:
  replicas: 1
  selector:
    matchLabels:
      app: client
  strategy: {}
  template:
    metadata:
      labels:
        app: client
    spec:
      containers:
      - image: alpine:3
        name: alpine
        resources: {}
kubectl apply -f minimal-image-replicas-deployment.yaml  --server-side
deployment.apps/client serverside-applied

The way to make ignoreDifference work in an SSA context would be to make the ignore fields managed by a separate field manager. In this example we want to ignore changes to the image name, expressed by the JQ expression .spec.template.spec.containers[].image. To do this with SSA, we apply only the fields that we want to ignore.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: client
spec:
  template:
    spec:
      containers:
        - image: alpine:3
          name: alpine

Then we apply this subset with another field manager than the one used in the previous SSA. The naming is arbitrary but we call it ignoredifferences for this example.

kubectl apply -f ignore.yaml  --server-side --field-manager=ignoredifferences --validate=false
deployment.apps/client serverside-applied

Then we try to update the deployment with a bunch of changed fields, as well as attempting to change the image tag (from alpine:3 to alpine:2).

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: client
    appProcess: web
  name: client
spec:
  replicas: 2
  selector:
    matchLabels:
      app: client
  strategy: {}
  template:
    metadata:
      labels:
        app: client
        appProcess: web
    spec:
      containers:
      - image: alpine:2
        name: alpine
        resources:
          requests:
            cpu: 400m
        env:
          - name: EV
            value: here

This will fail.

kubectl apply -f additional-image-replicas-deployment.yaml  --server-side 
Apply failed with 1 conflict: conflict with "ignoredifferences": .spec.template.spec.containers[name="alpine"].image

Sadly kubectl (and by extension presumably the client-go library) doesn't have a way of saying "ignore any changes to fields managed by other field managers" so we need to take the JSON path given, remove it and try again.

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: client
    appProcess: web
  name: client
spec:
  replicas: 2
  selector:
    matchLabels:
      app: client
  strategy: {}
  template:
    metadata:
      labels:
        app: client
        appProcess: web
    spec:
      containers:
      # This time without the image tag
      - name: alpine
        resources:
          requests:
            cpu: 400m
        env:
          - name: EV
            value: here
kubectl apply -f additional-image-replicas-deployment.yaml  --server-side 
deployment.apps/client serverside-applied

The reason I bring up server-side diff is that in many cases server-side diff will show the desired changes without the ignored fields, so could potentially save a round-trip to the API server (preventing the need to retry with removed fields). This example didn't work while having the image tag changed using server-side diff though, but it did work for other examples I tried.

@FredrikAugust
Copy link
Contributor

FredrikAugust commented Mar 13, 2024

@blakepetterson Thanks for the great writeup. I'm sadly not too familiar with the diffing ecosystem, but I am able to contribute if there is any development that needs to be done if I'm shown the general direction. The problem caused by this issue is blocking us from running the pipelines how we'd prefer so I'd be able to dedicate quite some time to resolving this if you want some assistance.

Am I also reading this correctly in that this would require changes to Kubernetes (or the underlying library for kubectl)? Or would it be possible to implement this in ArgoCD/GitOpsEngine?

After reading the Field management documentation from Kubernetes, would it be possible to use a field manager for the ignored fields, and filter out the conflicting fields during applies:

If the applier doesn't care about the value of the field any more, the applier can remove it from their local model of the resource, and make a new request with that particular field omitted. This leaves the value unchanged, and causes the field to be removed from the applier's entry in managedFields.

@blakepettersson
Copy link
Member

blakepettersson commented Mar 13, 2024

Thanks for the great writeup. I'm sadly not too familiar with the diffing ecosystem, but I am able to contribute if there is any development that needs to be done if I'm shown the general direction

I think the way to go would be to create a formal proposal for this. Since Argo CD is going in the direction of only having SSA in the future, I think this would fit in neatly. Before writing up something more formal it would be good to get some feedback from either @crenshaw-dev or @leoluz to see if this is something that sounds agreeable (perhaps a topic for tomorrow's contributors' meeting?).

Am I also reading this correctly in that this would require changes to Kubernetes (or the underlying library for kubectl)? Or would it be possible to implement this in ArgoCD/GitOpsEngine?

No, I think this can be resolved in Argo CD / gitops-engine, having this done natively in k8s would have made things a bit more convenient though.

After reading the Field management documentation from Kubernetes, would it be possible to use a field manager for the ignored fields, and filter out the conflicting fields during applies:

I was thinking something along the lines of fetching the fields from the ignore field manager and filtering out those fields before an apply could be an option, but haven't given this too much thought. I think in most cases doing a server-side diff and applying only that patch during a sync should work though (and in the cases it doesn't it shouldn't be too difficult to remove the offending field from the diff).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet