
Clean install of a public helm chart results in numerous Modified / Unknown resources #189

Closed
virtuman opened this issue Dec 20, 2020 · 7 comments

Comments

@virtuman

I already tried raising this question in Slack and got confirmation that this is not working as expected and should be filed as an issue.

All of the resources are stuck in a Modified / Unknown state.
Can anyone explain why everything is stuck in this condition, and what the error in the Conditions tab means?
This is my first attempt at installing the bitnami/thanos Helm chart.

[Screenshots: Fleet UI showing the resources in Modified / Unknown state and the error in the Conditions tab]

Here's a workaround that worked in 2.5.3; now some of the resources are highlighted as Modified again, and the GitRepo is stuck out of sync indefinitely:

defaultNamespace: thanos
helm:
  releaseName: thanos
  repo: https://charts.bitnami.com/bitnami
  chart: thanos
  values:
    existingServiceAccount: thanos
    bucketweb:
      enabled: true
      resources:
        requests:
          cpu: 15m
          memory: 40Mi
    compactor:
      enabled: true
      logLevel: info
      persistence:
        accessModes:
        - ReadWriteOnce
        enabled: true
        size: 100Gi
      resources:
        requests:
          cpu: 100m
          memory: 60Mi
    metrics:
      enabled: true
      serviceMonitor:
        enabled: true
    objstoreConfig: |-
      type: GCS
      config:
        bucket: thanos-prometheus-metrics
    query:
      autoscaling:
        enabled: false
        minReplicas: 1
        maxReplicas: 5
      dnsDiscovery:
        enabled: true
      enabled: true
      stores:
        - dns+prometheus.example.com:10901
        - prometheus-operated.cattle-monitoring-system.svc:10901
      ingress:
        certManager: false
        enabled: true
        extraTls:
        - hosts:
          - thanos.trgdev.com
          secretName: trgdev.com-tls
        hostname: thanos.trgdev.com
      logLevel: info
      resources:
        requests:
          cpu: 50m
          memory: 100Mi
    queryFrontend:
      autoscaling:
        enabled: false
        minReplicas: 1
        maxReplicas: 5
      resources:
        requests:
          cpu: 10m
          memory: 20Mi
    ruler:
      enabled: false
      logLevel: info
    storegateway:
      autoscaling:
        enabled: false
        minReplicas: 1
        maxReplicas: 5
      enabled: true
      logLevel: info
      persistence:
        size: 50Gi
      resources:
        limits:
          memory: 8Gi
        requests:
          cpu: 70m
          memory: 5Gi
diff:
  comparePatches:
  - apiVersion: apps/v1
    kind: Deployment
    name: thanos-bucketweb
    namespace: thanos
    operations:
    - {"op": "remove", "path": "/spec/strategy" }
    - {"op": "remove", "path": "/spec/template/spec/containers/0/resources/limits"}
    - {"op": "remove", "path": "/spec/template/spec/containers/0/resources/limits/cpu" }
    - {"op": "remove", "path": "/spec/template/spec/containers/0/resources/limits/memory" }
  - apiVersion: apps/v1
    kind: Deployment
    name: thanos-compactor
    namespace: thanos
    operations:
    - {"op": "remove", "path": "/spec/strategy" }
    - {"op": "remove", "path": "/spec/template/spec/containers/0/resources/limits"}
  - apiVersion: apps/v1
    kind: Deployment
    name: thanos-query
    namespace: thanos
    operations:
    - {"op": "remove", "path": "/spec/strategy" }
    - {"op": "remove", "path": "/spec/template/spec/containers/0/resources/limits"}
    - {"op": "remove", "path": "/spec/template/spec/containers/0/volumeMounts"}
    - {"op": "remove", "path": "/spec/template/spec/volumes"}
  - apiVersion: apps/v1
    kind: Deployment
    name: thanos-query-frontend
    namespace: thanos
    operations:
    - {"op": "remove", "path": "/spec/strategy" }
    - {"op": "remove", "path": "/spec/template/spec/containers/0/resources/limits"}
    - {"op": "remove", "path": "/spec/template/spec/containers/0/volumeMounts"}
    - {"op": "remove", "path": "/spec/template/spec/volumes"}
  - apiVersion: apps/v1
    kind: StatefulSet
    name: thanos-storegateway
    namespace: thanos
    operations:
    - {"op": "remove", "path": "/spec/strategy" }
    - {"op": "remove", "path": "/spec/accessModes"}
    - {"op": "remove", "path": "/spec/resources/requests/storage"}
    - {"op": "remove", "path": "/spec/template/spec/volumes"}
    - {"op": "remove", "path": "/spec/volumeClaimTemplates/0"}
    - {"op": "remove", "path": "/spec/volumeClaimTemplates/0/metadata/name"}
    - {"op": "remove", "path": "/spec/volumeClaimTemplates/0/spec/accessModes"}
    - {"op": "remove", "path": "/spec/volumeClaimTemplates/0/spec/resources/requests/storage"}
  - apiVersion: v1
    kind: Service
    name: thanos-bucketweb
    namespace: thanos
    operations:
    - {"op": "remove", "path": "/spec/ports/0/nodePort"}
  - apiVersion: v1
    kind: Service
    name: thanos-compactor
    namespace: thanos
    operations:
    - {"op": "remove", "path": "/spec/ports/0/nodePort"}
  - apiVersion: v1
    kind: Service
    name: thanos-query
    namespace: thanos
    operations:
    - {"op": "remove", "path": "/spec/ports/0/nodePort"}
    - {"op": "remove", "path": "/spec/ports/1/nodePort"}
  - apiVersion: v1
    kind: Service
    name: thanos-query-frontend
    namespace: thanos
    operations:
    - {"op": "remove", "path": "/spec/ports/0/nodePort"}
  - apiVersion: v1
    kind: Service
    name: thanos-storegateway
    namespace: thanos
    operations:
    - {"op": "remove", "path": "/spec/ports/0/nodePort"}
    - {"op": "remove", "path": "/spec/ports/1/nodePort"}

It seems crazy that I have to do all of this just to get Fleet to go green; it just doesn't seem right.

I don't override anything in my install; I'm just supplying values via values.yaml.
One thing I noticed: whenever resources: is left unspecified/empty, the empty object seems to be converted into {} and then doesn't match the null default value (or vice versa), so Fleet complains that "resources" is modified. The rest is odd too: the Service object expects a nodePort value even for type ClusterIP, every app controller expects a "type" field, and so on.
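
To illustrate what I mean (a simplified sketch, not copied from the chart), the rendered manifest and the live object can end up disagreeing only on how "empty" is spelled, and Fleet reports that as a modification:

# Rendered by the chart when the value is left empty (illustrative):
resources:
  limits:          # key present with a null value
---
# What ends up on the live object (illustrative):
resources:
  limits: {}       # normalized to an empty map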

Basically, this is what I had to do to get bitnami/thanos to install properly and turn green in Fleet. It seems excessive, and I seriously doubt I'm doing things correctly, but it was the only way I could get everything to validate and work. Most of our workloads just won't work well if we have to create such long lists of comparison exclusions, especially since they tend to validate differently across Rancher versions (as in this case, where things mostly started working in v2.5.3 and stopped working again in v2.5.4-rc*).

Thank you for any explanation, and perhaps a roadmap of how this is actually supposed to work. Using my sample values: section, it is easy to reproduce the same behavior in any Kubernetes cluster.

@virtuman
Author

Just an FYI: Rancher 2.5.4-rc5 generates this type of modification set. Note that nodeAffinity and podAffinity were never supplied in the original install and are not listed when I run kubectl get deploy -o yaml XXXXX:

    modifiedStatus:
    - apiVersion: apps/v1
      kind: Deployment
      name: thanos-bucketweb
      namespace: thanos
      patch: '{"spec":{"template":{"spec":{"affinity":{"nodeAffinity":null,"podAffinity":null}}}}}'
    - apiVersion: apps/v1
      kind: Deployment
      name: thanos-compactor
      namespace: thanos
      patch: '{"spec":{"template":{"spec":{"affinity":{"nodeAffinity":null,"podAffinity":null}}}}}'
    - apiVersion: apps/v1
      kind: Deployment
      name: thanos-query
      namespace: thanos
      patch: '{"spec":{"template":{"spec":{"affinity":{"nodeAffinity":null,"podAffinity":null}}}}}'
    - apiVersion: apps/v1
      kind: Deployment
      name: thanos-query-frontend
      namespace: thanos
      patch: '{"spec":{"template":{"spec":{"affinity":{"nodeAffinity":null,"podAffinity":null}}}}}'
    - apiVersion: apps/v1
      kind: StatefulSet
      name: thanos-storegateway
      namespace: thanos
      patch: '{"spec":{"template":{"spec":{"affinity":{"nodeAffinity":null,"podAffinity":null}}}}}'
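
Following the same comparePatches pattern as in my first comment, I could presumably hide these affinity diffs with something like the snippet below; I have not verified that dropping the whole /spec/template/spec/affinity path is the right granularity, so treat it as a guess:

diff:
  comparePatches:
  - apiVersion: apps/v1
    kind: Deployment
    name: thanos-bucketweb
    namespace: thanos
    operations:
    # assumption: removing the whole affinity block from the comparison hides the null nodeAffinity/podAffinity diff
    - {"op": "remove", "path": "/spec/template/spec/affinity"}
  # ...with the same operation repeated for the other Deployments and the StatefulSet listed above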

@atsai1220

Here is a related issue for posterity: #124

@janeczku

janeczku commented Jan 7, 2021

This issue can easily be reproduced by installing, with Fleet, a YAML manifest that contains null values for mapping keys.

The Helm chart mentioned by the OP renders a manifest like the one below (simplified). Note the null values under spec.affinity:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: foo
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: bar
  template:
    metadata:
      labels:
        app.kubernetes.io/name: bar
    spec:
      affinity:
        podAffinity:
        podAntiAffinity:
      containers:
        - name: baz
          image: nginx

Installing this manifest results in the resource being stuck in a Modified state:

[Screenshot: Fleet UI showing the resource stuck in Modified state]

Rancher issue ref: rancher/rancher#30696
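
If the chart itself could be changed (which is not the case for the upstream Bitnami chart), a chart-side workaround would be to avoid emitting null mapping values at all; assuming the nulls are indeed the trigger, rendering an empty map (or omitting the block entirely) avoids the diff:

    spec:
      # instead of:
      #   affinity:
      #     podAffinity:
      #     podAntiAffinity:
      affinity: {}
      containers:
        - name: baz
          image: nginx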

@paulchoi

paulchoi commented Feb 3, 2021

I'm running into the same issue while trying to install Longhorn using Fleet. It would be a very simple Helm-based install, but it's stuck in Modified due to PodSecurityPolicy differences as well as the Service (a nodePort is set after the service is created).

The error is:

Modified(2) [Bundle longhorn-longhorn]; podsecuritypolicy.policy longhorn-psp modified {"spec":{"hostIPC":false,"hostNetwork":false}}; service.v1 longhorn-system/longhorn-frontend modified {"spec":{"$setElementOrder/ports":[{"port":80}],"ports":[{"nodePort":null,"port":80}]}}

My fleet.yaml is:

defaultNamespace: longhorn-system

helm:
  chart: longhorn
  repo: https://charts.longhorn.io
  version: 1.1.0
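
Based on the comparePatches workaround earlier in this thread, I would expect something like the following to hide these diffs; the apiVersion and paths are my guesses from the error message and I have not confirmed them:

diff:
  comparePatches:
  - apiVersion: policy/v1beta1
    kind: PodSecurityPolicy
    name: longhorn-psp
    operations:
    # assumed paths, taken from the fields named in the Modified message
    - {"op": "remove", "path": "/spec/hostIPC"}
    - {"op": "remove", "path": "/spec/hostNetwork"}
  - apiVersion: v1
    kind: Service
    name: longhorn-frontend
    namespace: longhorn-system
    operations:
    - {"op": "remove", "path": "/spec/ports/0/nodePort"}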

@tyzbit

tyzbit commented Feb 4, 2021

I think this is also happening to me for a resource that has a creationTimestamp of null. I've made sure to specify the rest of the volumeClaimTemplates object.

[Screenshot: Fleet UI showing the electrumx StatefulSet as Modified]

$ kubectl get statefulsets.apps electrumx -o yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  annotations:
    field.cattle.io/publicEndpoints: '[{"addresses":["192.168.1.63"],"port":50002,"protocol":"TCP","serviceName":"default:electrumx","allNodes":false}]'
    meta.helm.sh/release-name: apps
    meta.helm.sh/release-namespace: default
    objectset.rio.cattle.io/id: default-apps
  creationTimestamp: "2021-02-04T22:21:54Z"
  generation: 1
  labels:
    app: electrumx
    app.kubernetes.io/managed-by: Helm
    objectset.rio.cattle.io/hash: 48a9d76eea6dc292a0244df36318f37481348de9
  managedFields:
  - apiVersion: apps/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          .: {}
          f:meta.helm.sh/release-name: {}
          f:meta.helm.sh/release-namespace: {}
          f:objectset.rio.cattle.io/id: {}
        f:labels:
          .: {}
          f:app: {}
          f:app.kubernetes.io/managed-by: {}
          f:objectset.rio.cattle.io/hash: {}
      f:spec:
        f:podManagementPolicy: {}
        f:replicas: {}
        f:revisionHistoryLimit: {}
        f:selector:
          f:matchLabels:
            .: {}
            f:app: {}
        f:serviceName: {}
        f:template:
          f:metadata:
            f:labels:
              .: {}
              f:app: {}
          f:spec:
            f:containers:
              k:{"name":"electrumx"}:
                .: {}
                f:env:
                  .: {}
                  k:{"name":"COIN"}:
                    .: {}
                    f:name: {}
                    f:value: {}
                  k:{"name":"DAEMON_URL"}:
                    .: {}
                    f:name: {}
                    f:valueFrom:
                      .: {}
                      f:secretKeyRef:
                        .: {}
                        f:key: {}
                        f:name: {}
                f:image: {}
                f:imagePullPolicy: {}
                f:name: {}
                f:ports:
                  .: {}
                  k:{"containerPort":50002,"protocol":"TCP"}:
                    .: {}
                    f:containerPort: {}
                    f:name: {}
                    f:protocol: {}
                f:resources:
                  .: {}
                  f:limits:
                    .: {}
                    f:memory: {}
                  f:requests:
                    .: {}
                    f:memory: {}
                f:terminationMessagePath: {}
                f:terminationMessagePolicy: {}
                f:volumeMounts:
                  .: {}
                  k:{"mountPath":"/data"}:
                    .: {}
                    f:mountPath: {}
                    f:name: {}
            f:dnsPolicy: {}
            f:restartPolicy: {}
            f:schedulerName: {}
            f:securityContext: {}
            f:terminationGracePeriodSeconds: {}
        f:updateStrategy:
          f:rollingUpdate:
            .: {}
            f:partition: {}
          f:type: {}
        f:volumeClaimTemplates: {}
    manager: Go-http-client
    operation: Update
    time: "2021-02-04T22:21:54Z"
  - apiVersion: apps/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:metadata:
        f:annotations:
          f:field.cattle.io/publicEndpoints: {}
    manager: rancher
    operation: Update
    time: "2021-02-04T22:21:54Z"
  - apiVersion: apps/v1
    fieldsType: FieldsV1
    fieldsV1:
      f:status:
        f:collisionCount: {}
        f:currentReplicas: {}
        f:currentRevision: {}
        f:observedGeneration: {}
        f:readyReplicas: {}
        f:replicas: {}
        f:updateRevision: {}
        f:updatedReplicas: {}
    manager: kube-controller-manager
    operation: Update
    time: "2021-02-04T22:21:56Z"
  name: electrumx
  namespace: default
  resourceVersion: "69597335"
  selfLink: /apis/apps/v1/namespaces/default/statefulsets/electrumx
  uid: 60264bca-cc03-442c-ab7c-810af99b4924
spec:
  podManagementPolicy: OrderedReady
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: electrumx
  serviceName: electrumx
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: electrumx
    spec:
      containers:
      - env:
        - name: COIN
          value: BitcoinSegwit
        - name: DAEMON_URL
          valueFrom:
            secretKeyRef:
              key: DAEMON_URL
              name: electrumx
        image: lukechilds/electrumx
        imagePullPolicy: Always
        name: electrumx
        ports:
        - containerPort: 50002
          name: electrumx
          protocol: TCP
        resources:
          limits:
            memory: 3072M
          requests:
            memory: 3072M
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
        volumeMounts:
        - mountPath: /data
          name: electrumx-data
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
  updateStrategy:
    rollingUpdate:
      partition: 0
    type: RollingUpdate
  volumeClaimTemplates:
  - apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      creationTimestamp: null
      name: electrumx-data
    spec:
      accessModes:
      - ReadWriteOnce
      resources:
        requests:
          storage: 50Gi
      storageClassName: nfs-client
      volumeMode: Filesystem
    status:
      phase: Pending
status:
  collisionCount: 0
  currentReplicas: 1
  currentRevision: electrumx-79568c448c
  observedGeneration: 1
  readyReplicas: 1
  replicas: 1
  updateRevision: electrumx-79568c448c
  updatedReplicas: 1
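
If the null creationTimestamp in the volume claim template metadata really is what triggers the diff (that's my assumption from the UI, not something I've confirmed), the comparePatches approach from earlier in this thread would presumably look like this for my bundle:

diff:
  comparePatches:
  - apiVersion: apps/v1
    kind: StatefulSet
    name: electrumx
    namespace: default
    operations:
    # assumed path; drops the null creationTimestamp from the comparison
    - {"op": "remove", "path": "/spec/volumeClaimTemplates/0/metadata/creationTimestamp"}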

@MG2R

MG2R commented Mar 4, 2021

I'm having this exact same issue with 3 PSPs in dynatrace-oneagent-operator.

 Modified(2) [Cluster fleet-default/c-6vzrb];
podsecuritypolicy.policy dynatrace-oneagent-operator modified {"spec":{"hostIPC":false,"hostNetwork":false,"hostPID":false,"privileged":false}};
podsecuritypolicy.policy dynatrace-oneagent-unprivileged modified {"spec":{"privileged":false}};
podsecuritypolicy.policy dynatrace-oneagent-webhook modified {"spec":{"hostIPC":false,"hostNetwork":false,"hostPID":false,"privileged":false}} 

@artsiom-abakumov

This blocks proper installation of simple Helm charts such as opa-gatekeeper and kube-prometheus-stack.

@kkaempf closed this as not planned (won't fix / can't repro / duplicate / stale) on Dec 12, 2022