
Upgrade appears to be broken #360

Closed
drewwells opened this issue Jun 21, 2023 · 4 comments · Fixed by #366

Comments

@drewwells
Contributor

Repro steps:

    ${HELM} upgrade \
        --debug --wait --namespace spire-server \
        --create-namespace --timeout 20m \
        -i spire ${SPIRE_CHART} \
        -f spire-box-3-values.yaml ${HELM_SETFLAGS}
  1. Repeat the above command.
  2. Notice that pods are now missing and the pre-upgrade job fails to run.

Expected results:
Pods that need changes are recreated.

I haven't figured out why this happens, but the only way I've found to recover is to helm delete the release.
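
For anyone hitting this, a rough sketch of how to inspect the failed hook and recover (the hook job name is a placeholder; release name and namespace match the command above):

    kubectl get jobs,pods -n spire-server                    # look for a failed or stuck hook job
    kubectl describe job <pre-upgrade-job> -n spire-server   # events usually show why the hook didn't run
    helm get hooks spire -n spire-server                     # list the hooks defined by the release

    # Recovery as described above (destructive: removes and re-installs the release)
    helm uninstall spire -n spire-server
    helm upgrade -i spire ${SPIRE_CHART} --namespace spire-server \
        --create-namespace -f spire-box-3-values.yaml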

@faisal-memon
Contributor

@drewwells I'm sorry this happened. Would you be able to provide the values files and any other info we could use to reproduce the failure?

@drewwells
Contributor Author

# -- The log level, valid values are "debug", "info", "warn", and "error"
logLevel: info

# -- Set the name of the Kubernetes cluster. (`kubeadm init --service-dns-domain`)
clusterName: bloxisabox

# -- Set the trust domain to be used for the SPIFFE identifiers
trustDomain: infoblox.com

bundleConfigMap: spire-bundle

# -- This is the value of your clusters `kubeadm init --service-dns-domain` flag
clusterDomain: cluster.local

global:
  spire:
    trustDomain: "infoblox.com"

spire-server:
  # -- The JWT issuer domain
  jwtIssuer: "auth.infoblox.com"
  namespaceOverride: ''
  trustDomain: "infoblox.com"
  federation:
    enabled: true
    bundleEndpoint:
      port: 8443
      address: "0.0.0.0"

    ingress:
      enabled: true
      className: ""
      annotations:
        kubernetes.io/ingress.class: nginx
        kubernetes.io/tls-acme: "true"
        nginx.ingress.kubernetes.io/backend-protocol: "HTTPS"
        # If Profile Type == https_spiffe:
        nginx.ingress.kubernetes.io/ssl-passthrough: "true"
        cert-manager.io/cluster-issuer: cert-manager-issuer-letsencrypt
      hosts:
        - host: spire-box-4.eng.test.infoblox.com
          paths:
            - path: /
              pathType: Prefix
      tls:
       - hosts:
           - spire-box-4.eng.test.infoblox.com
         secretName: spire-server-tls

  ca_subject:
    country: US
    organization: Infoblox
    common_name: infoblox.com

  upstreamAuthority:
    disk:
      enabled: false
      secret:
        # -- If disabled requires you to create a secret with the given keys (certificate, key and optional bundle) yourself.
        create: true
        # -- If secret creation is disabled, the secret with this name will be used.
        name: "spiffe-upstream-ca"
        # -- If secret creation is enabled, will create a secret with following certificate info
        data:
          certificate: ""
          key: ""
          bundle: ""
    certManager:
      enabled: true
      rbac:
        create: true
      issuer_name: "spire-server-ca"
      issuer_kind: "Issuer"
      issuer_group: "cert-manager.io"
      # -- Specify to use a namespace other than the one the chart is installed into
      namespace: ""
      kube_config_file: ""
      createCA: true
      spec:
        selfSigned: {}
    spire:
      enabled: false
      server:
        address: ""
        port: 8081

  # notifier:
  #   k8sbundle:
  #     # -- Namespace to push the bundle into, if blank will default to SPIRE Server namespace
  #     namespace: ""

  controllerManager:
    enabled: true

spire-agent:
  logLevel: debug
  trustDomain: infoblox.com
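
As a sketch, one way to preview what the second upgrade would change with these values, assuming the helm-diff plugin (a separate install, not part of core Helm); the chart path and values file name are taken from the command in the original report:

    helm plugin install https://github.com/databus23/helm-diff   # once, if not already installed
    helm diff upgrade spire ${SPIRE_CHART} \
        --namespace spire-server -f spire-box-3-values.yaml

    # Without the plugin, rendering the chart locally also works for eyeballing the hook job:
    helm template spire ${SPIRE_CHART} --namespace spire-server \
        -f spire-box-3-values.yaml > rendered.yaml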

@drewwells
Contributor Author

I can't reproduce this any longer. I'm going to close unless I can find the culprit.

@kfox1111 reopened this Jun 22, 2023
@kfox1111
Contributor

Ok, I found the issue. The test didn't fully run to check the wait code, which is why we didn't detect it. You're using the older flag, which does enable the wait container, which is why it broke.
Fixing the flag in #366.
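
A rough way to double-check the behavior once the fix lands is to render the chart and eyeball the pre-upgrade hook; the grep pattern here is only a heuristic, not the chart's exact manifest layout:

    helm template spire ${SPIRE_CHART} --namespace spire-server \
        -f spire-box-3-values.yaml \
        | grep -B2 -A10 'pre-upgrade'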
