EMQX Config patches take precedence over statefulset changes #1027

Closed
ddellarocca opened this issue Mar 26, 2024 · 1 comment · Fixed by #1034
Labels
bug Something isn't working

Comments

@ddellarocca

Describe the bug
If the EMQX CRD is updated with changes that affect both the StatefulSet and the EMQX config, the operator applies the config first and updates the StatefulSet last. This blocks the update whenever the new config references something that only the StatefulSet change provides. See the following example.

To Reproduce
Preconditions: the EMQX Operator is up and running in a Kubernetes cluster.

  1. Apply the following manifest and wait for the cluster to be ready
apiVersion: apps.emqx.io/v2beta1
kind: EMQX
metadata:
  name: emqx
  namespace: emqx-operated
spec:
  image: emqx:5.5.1
  config:
    data: |-
      log {
        file_handlers {
          enable = false
        }

        console_handler {
          enable = true
          level = debug
          formatter = json
        }
      }

      cluster {
        autoclean = "5m"
      }

  coreTemplate:
    spec:
      replicas: 2
      resources:
        limits:
          cpu: 1
          memory: 4Gi
        requests:
          cpu: 1
          memory: 4Gi
      ports:
        - containerPort: 8883
          name: mqttssl
          protocol: TCP
        - containerPort: 1883
          name: mqtt
          protocol: TCP
  listenersServiceTemplate:
    spec:
      type: LoadBalancer
  dashboardServiceTemplate:
    spec:
      type: LoadBalancer
  updateStrategy:
    initialDelaySeconds: 10
    type: Recreate
  2. Add an extra volume mount and change the config to use it (in this case, adding a file-based ACL authorization source)
apiVersion: apps.emqx.io/v2beta1
kind: EMQX
metadata:
  name: emqx
  namespace: emqx-operated
spec:
  image: emqx:5.5.1
  config:
    data: |-
      log {
        file_handlers {
          enable = false
        }

        console_handler {
          enable = true
          level = debug
          formatter = json
        }
      }

      cluster {
        autoclean = "5m"
      }

      authorization {
        cache {
          enable = true
          ttl = "5m"
        }
        deny_action = "ignore"
        no_match = "allow"
        sources = [
          {
            type = "file"
            enable = true

            path = "/opt/emqx/data/authz/acl/acl.conf"
          }
        ]
      }

  coreTemplate:
    spec:
      replicas: 2
      resources:
        limits:
          cpu: 1
          memory: 4Gi
        requests:
          cpu: 1
          memory: 4Gi
      ports:
        - containerPort: 8883
          name: mqttssl
          protocol: TCP
        - containerPort: 1883
          name: mqtt
          protocol: TCP
      extraVolumeMounts:
        - name: authz-acl-file
          mountPath: /opt/emqx/data/authz/acl
      extraVolumes:
        - name: authz-acl-file
          configMap:
            name: authz-acl-file
  listenersServiceTemplate:
    spec:
      type: LoadBalancer
  dashboardServiceTemplate:
    spec:
      type: LoadBalancer
  updateStrategy:
    initialDelaySeconds: 10
    type: Recreate
  3. Apply the manifest again
  4. EMQX and the EMQX Operator both report that the file is missing:
{"time":1711446652610981,"level":"alert","msg":"failed_to_read_acl_file","mfa":"emqx_authz_file:validate/1(99)","explain":"No such file or directory","path":"/opt/emqx/data/authz/acl/acl.conf","pid":"<0.4476.0>"}
{"level":"error","ts":"2024-03-26T09:52:14Z","msg":"Reconciler error","controller":"emqx","controllerGroup":"apps.emqx.io","controllerKind":"EMQX","eMQX":{"name":"emqx","namespace":"emqx-operated"},"namespace":"emqx-operated","name":"emqx","reconcileID":"71da2fe2-2e7e-4fda-b1be-5083c61ea4ba","error":"failed to put emqx config: failed to put API http://10.244.2.15:18083/api/v5/configs?mode=merge, status : 400 Bad Request, body: {\"authorization\":{\"reason\":\"failed_to_read_acl_file\",\"value\":\"/opt/emqx/data/authz/acl/acl.conf\",\"path\":\"authorization.sources.1.path\",\"kind\":\"validation_error\",\"matched_type\":\"authz:file\"}}","errorVerbose":"failed to put API http://10.244.2.15:18083/api/v5/configs?mode=merge, status : 400 Bad Request, body: {\"authorization\":{\"reason\":\"failed_to_read_acl_file\",\"value\":\"/opt/emqx/data/authz/acl/acl.conf\",\"path\":\"authorization.sources.1.path\",\"kind\":\"validation_error\",\"matched_type\":\"authz:file\"}}\ngithub.com/emqx/emqx-operator/controllers/apps/v2beta1.putEMQXConfigsByAPI\n\t/workspace/controllers/apps/v2beta1/sync_emqx_config.go:135\ngithub.com/emqx/emqx-operator/controllers/apps/v2beta1.(*syncConfig).reconcile\n\t/workspace/controllers/apps/v2beta1/sync_emqx_config.go:77\ngithub.com/emqx/emqx-operator/controllers/apps/v2beta1.(*EMQXReconciler).Reconcile\n\t/workspace/controllers/apps/v2beta1/emqx_controller.go:134\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.3/pkg/internal/controller/controller.go:121\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.3/pkg/internal/controller/controller.go:320\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.3/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.3/pkg/internal/controller/controller.go:234\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598\nfailed to put emqx config\ngithub.com/emqx/emqx-operator/controllers/apps/v2beta1.(*syncConfig).reconcile\n\t/workspace/controllers/apps/v2beta1/sync_emqx_config.go:78\ngithub.com/emqx/emqx-operator/controllers/apps/v2beta1.(*EMQXReconciler).Reconcile\n\t/workspace/controllers/apps/v2beta1/emqx_controller.go:134\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.3/pkg/internal/controller/controller.go:121\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.3/pkg/internal/controller/controller.go:320\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.3/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.3/pkg/internal/controller/controller.go:234\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1598","stacktrace":"sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.3/pkg/internal/controller/controller.go:273\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2\n\t/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.12.3/pkg/internal/controller/controller.go:234"}
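
For readers skimming the operator log: the body of the 400 response is itself JSON, keyed by the config root that failed validation. A minimal Go sketch that decodes it is shown here. The struct and function names are hypothetical, and the field set is taken only from the body quoted in this log, not from the EMQX API documentation.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// validationError mirrors the fields visible in the 400 body above.
// Hypothetical type: EMQX may return more or different fields.
type validationError struct {
	Reason      string `json:"reason"`
	Value       string `json:"value"`
	Path        string `json:"path"`
	Kind        string `json:"kind"`
	MatchedType string `json:"matched_type"`
}

// body is the response body from the operator log, with the log's
// JSON string escaping removed.
const body = `{"authorization":{"reason":"failed_to_read_acl_file","value":"/opt/emqx/data/authz/acl/acl.conf","path":"authorization.sources.1.path","kind":"validation_error","matched_type":"authz:file"}}`

// parseConfigError decodes the body into a map keyed by config root
// (here "authorization").
func parseConfigError(raw string) (map[string]validationError, error) {
	var out map[string]validationError
	if err := json.Unmarshal([]byte(raw), &out); err != nil {
		return nil, err
	}
	return out, nil
}

func main() {
	errs, err := parseConfigError(body)
	if err != nil {
		panic(err)
	}
	for root, e := range errs {
		fmt.Printf("%s: %s (%s = %s)\n", root, e.Reason, e.Path, e.Value)
	}
}
```

Read this way, the rejected setting is `authorization.sources.1.path`: the config references the ACL file before any pod has the volume mounted.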

Expected behavior
The operator should update the StatefulSet first when it needs to be redeployed and apply the EMQX config afterwards, or alternatively change the EMQX config ConfigMap first and then update the StatefulSet.
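
To make the ordering concrete, here is a toy Go model of the two steps. It is only a sketch under stated assumptions: `cluster`, `step`, `updateStatefulSet`, `applyConfig` and `reconcile` are hypothetical names invented for illustration and do not mirror the operator's real code. The model assumes only what the logs above show, namely that applying a config fails when it references a path that is not mounted, and that a failed step stops the reconcile pass.

```go
package main

import (
	"fmt"
)

// aclPath is the file referenced by the authorization source in the manifest.
const aclPath = "/opt/emqx/data/authz/acl/acl.conf"

// cluster is a toy stand-in for the running EMQX pods (hypothetical type).
type cluster struct {
	mounted map[string]bool // paths available inside the pods
	applied []string        // paths referenced by the accepted config
}

func newCluster() *cluster {
	return &cluster{mounted: map[string]bool{}}
}

// step is one stage of a reconcile pass.
type step struct {
	name string
	run  func() error
}

// updateStatefulSet models rolling out a pod template with new mounts.
func (c *cluster) updateStatefulSet(paths ...string) step {
	return step{name: "update statefulset", run: func() error {
		for _, p := range paths {
			c.mounted[p] = true
		}
		return nil
	}}
}

// applyConfig models pushing the config to EMQX, which rejects any
// referenced path that is not mounted yet.
func (c *cluster) applyConfig(paths ...string) step {
	return step{name: "apply EMQX config", run: func() error {
		for _, p := range paths {
			if !c.mounted[p] {
				return fmt.Errorf("failed_to_read_acl_file: %s", p)
			}
		}
		c.applied = append(c.applied, paths...)
		return nil
	}}
}

// reconcile runs the steps in order and stops at the first error, so a
// failing early step prevents every later step from running.
func reconcile(steps ...step) error {
	for _, s := range steps {
		if err := s.run(); err != nil {
			return fmt.Errorf("%s: %w", s.name, err)
		}
	}
	return nil
}

func main() {
	// Reported order: config first, so the StatefulSet is never updated.
	a := newCluster()
	fmt.Println("config first:", reconcile(a.applyConfig(aclPath), a.updateStatefulSet(aclPath)))

	// Expected order: StatefulSet first, then the config is accepted.
	b := newCluster()
	fmt.Println("statefulset first:", reconcile(b.updateStatefulSet(aclPath), b.applyConfig(aclPath)))
}
```

In the config-first run the error returned by the first step keeps the mount from ever being rolled out, which matches the stuck update described in this report.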

Anything else we need to know?:
If the EMQX resource is deleted and then applied again, it starts successfully with the ACL configured, so the issue is not related to the CRD.

Environment details:

  • Kubernetes version: 1.21.14
  • Cloud-provider/provisioner: local kind
  • emqx-operator version: 2.2.14
  • Install method: helm
@ddellarocca ddellarocca added the bug Something isn't working label Mar 26, 2024
@Rory-Z
Member

Rory-Z commented Mar 26, 2024

Yes, this is a defect. Thanks for the feedback; let me fix it.
