Release cluedin-platform-2.0.0

@dervalp dervalp released this 07 Feb 15:01

Chart 2.0.0 upgrade

Stage 1: Prepare custom values

First, migrate your existing configuration to be compatible with v2. Be aware of the following known breaking changes.

Breaking changes

Neo4j

| Before | After | Comment |
| --- | --- | --- |
| infrastructure.neo4j.authEnabled | infrastructure.neo4j.config.dbms.security.auth_enabled | |
| infrastructure.neo4j.password | infrastructure.neo4j.neo4j.password | |
| infrastructure.neo4j.image | infrastructure.neo4j.cluedinExtensions.image.[registry/repository] | Now split into two fields |
| infrastructure.neo4j.imageTag | infrastructure.neo4j.cluedinExtensions.image.tag | |
| infrastructure.neo4j.imagePullSecret | infrastructure.neo4j.image.imagePullSecrets | Now a list |
| infrastructure.neo4j.priorityClassName | infrastructure.neo4j.podSpec.priorityClassName | |
| infrastructure.neo4j.tolerations | infrastructure.neo4j.podSpec.tolerations | |
| infrastructure.neo4j.serviceAccount.name | infrastructure.neo4j.podSpec.serviceAccountName | |
| infrastructure.neo4j.dbms.memory.use_memrec | - | Not used anymore, as Neo4j does it out of the box |
| infrastructure.neo4j.core.standalone | - | Not used anymore |
| infrastructure.neo4j.core.numberOfServers | - | Not used anymore |
| infrastructure.neo4j.core.discoveryService | - | Not used anymore |
| infrastructure.neo4j.core.persistentVolume.enabled | infrastructure.neo4j.volumes.data.mode | Equivalent. Persistence can be disabled using a custom template |
| infrastructure.neo4j.core.persistentVolume.size | infrastructure.neo4j.volumes.data.defaultStorageClass.requests.storage | |
| - | infrastructure.neo4j.cluedinExtensions.podCommand | |
| infrastructure.neo4j.core.pluginInstallers | infrastructure.neo4j.cluedinExtensions.pluginInstallers | |
| infrastructure.neo4j.core.resources.requests | infrastructure.neo4j.neo4j.resources.requests | |

Note that some settings use neo4j.neo4j. This is not a mistake: the same key has to be nested again.
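To make the renames above more concrete, here is a minimal sketch that writes a v2-style Neo4j fragment to a file. The key paths come from the breaking-changes table; the file name and every value (password, secret name, sizes, class and account names) are placeholders, and the `cpu`/`memory` keys under `requests` are an assumption based on the usual Kubernetes resource shape.

```shell
# Write a hypothetical v2-style Neo4j values fragment.
# Key paths follow the breaking-changes table; all values are placeholders.
cat > neo4j-v2-fragment.yaml <<'EOF'
infrastructure:
  neo4j:
    neo4j:                       # nested on purpose: neo4j.neo4j
      password: "change-me"      # was infrastructure.neo4j.password
      resources:
        requests:                # was infrastructure.neo4j.core.resources.requests
          cpu: "1"               # assumed shape, placeholder value
          memory: 2Gi            # assumed shape, placeholder value
    image:
      imagePullSecrets:          # was imagePullSecret; now a list
        - my-registry-secret
    podSpec:
      priorityClassName: my-priority-class      # was neo4j.priorityClassName
      serviceAccountName: my-service-account    # was neo4j.serviceAccount.name
    volumes:
      data:
        defaultStorageClass:
          requests:
            storage: 32Gi        # was neo4j.core.persistentVolume.size
EOF
```

The fragment is only meant to show where each old key ends up; merge the relevant parts into your own custom values file rather than using it as-is.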

Stage 2: Apply upgrade values

Run helm upgrade with your custom values using the latest v2 Helm chart. On top of your values, apply the platform-upgrade-v2-stage1.yml values file.

platform-upgrade-v2-stage1.yml
global:
  containerImages:
    initSql:
      enabled: false
    initNeo4J:
      enabled: false
    initCluedIn:
      enabled: false

application:
  annotation:
    replicas: 0
  cluedin:
    roles:
      crawling:
        count: 0
      main:
        count: 0
      processing:
        count: 0
  cluedincontroller:
    enabled: false
  datasource:
    replicas: 0
  gql:
    replicas: 0
  libpostal:
    enabled: false
  monitoring:
    enabled: false
  oauth2:
    enabled: false
  openrefine:
    enabled: false
  prepare:
    replicas: 0
  strategy:
    replicas: 0
  submitter:
    replicas: 0
  ui:
    replicas: 0

infrastructure:
  cert-manager:
    enabled: false
  haproxy-ingress:
    enabled: false
  elasticsearch:
    enabled: false
  monitoring:
    enabled: false
  mssql:
    enabled: false
  neo4j:
    enabled: false
  rabbitmq:
    enabled: false
  redis:
    enabled: false

Example:

helm upgrade -n cluedin my-name cluedin/cluedin-platform \
    --values my-custom-values.yaml \
    --values platform-upgrade-v2-stage1.yml

⚠ Upgrade values should be applied at the end to override any other values
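Helm gives priority to the last (right-most) `--values` file, which is why the upgrade values go at the end. A small wrapper can make that ordering hard to get wrong. The sketch below is illustrative only: the function name `staged_upgrade` is hypothetical, and the release name, namespace and chart reference are taken from the example above.

```shell
# Hypothetical wrapper around the documented helm upgrade call.
# Arguments: release name, custom values file, stage values file.
staged_upgrade() {
  release="$1"
  custom_values="$2"
  stage_values="$3"

  # Fail early if either values file is missing.
  for f in "$custom_values" "$stage_values"; do
    [ -f "$f" ] || { echo "missing values file: $f" >&2; return 1; }
  done

  # Later --values files take precedence, so the stage file goes last.
  helm upgrade -n cluedin "$release" cluedin/cluedin-platform \
    --values "$custom_values" \
    --values "$stage_values"
}
```

It could be called as `staged_upgrade my-name my-custom-values.yaml platform-upgrade-v2-stage1.yml`, and again later with the stage 2 file.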

❗ Wait until deployment finishes. Make sure that all pods are healthy and all jobs are completed.

As a result, most of the resources should be undeployed, but your volumes and claims should remain.
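One way to confirm this state is to list any pods that are still unhealthy and check that the persistent volume claims are still present. The helper below is a sketch under assumptions: the name `stage1_check` is hypothetical, `kubectl` is assumed to be configured for the target cluster, and the namespace defaults to `cluedin` as in the examples.

```shell
# Hypothetical helper: summarize what is left in the namespace after stage 1.
stage1_check() {
  ns="${1:-cluedin}"

  echo "Pods that are neither Running nor Succeeded:"
  kubectl get pods -n "$ns" \
    --field-selector=status.phase!=Running,status.phase!=Succeeded

  echo "Persistent volume claims (expected to still be present):"
  kubectl get pvc -n "$ns"
}
```

Running `stage1_check` should ideally print no pods in the first list, while the second list still shows your existing claims.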

Stage 3: Apply upgrade-2 values for data-upgrade

Run helm upgrade with your custom values using the latest v2 Helm chart. On top of your values, apply the platform-upgrade-v2-stage2.yml values file.

platform-upgrade-v2-stage2.yml
global:
  containerImages:
    initSql:
      enabled: true
    initNeo4J:
      enabled: true
    initCluedIn:
      enabled: false # Don't run nuget installers for now, focus on infrastructure upgrade

# Disable services which should not run during upgrade
application:
  annotation:
    replicas: 0
  cluedin:
    roles:
      crawling:
        count: 0
      main:
        count: 0
      processing:
        count: 0
  cluedincontroller:
    enabled: false
  datasource:
    replicas: 0
  gql:
    replicas: 0
  libpostal:
    enabled: false
  monitoring:
    enabled: false
  oauth2:
    enabled: false
  openrefine:
    enabled: false
  prepare:
    replicas: 0
  strategy:
    replicas: 0
  submitter:
    replicas: 0
  ui:
    replicas: 0

  system:
    runDatabaseJobsOnUpgrade: true # Will run the init jobs to migrate databases

infrastructure:
  cert-manager:
    enabled: false
  haproxy-ingress:
    enabled: false
  elasticsearch:
    enabled: false
  monitoring:
    enabled: false
  redis:
    enabled: false

  mssql:
    command:
    - /usr/bin/stdbuf
    - -i0
    - -o0
    - -e0
    - /bin/bash
    - -c
    - |
      ### LOGGING SETUP
      UPGRADE_LOG_FILE_PATH=$MSSQL_DATA_DIR/upgrade_2017-2022.upgrade.log

      counter=1
      UPGRADE_LOG_FILE_PATH_CURRENT="${UPGRADE_LOG_FILE_PATH%.upgrade.log}.$((counter++)).upgrade.log"
      while [ -e "$UPGRADE_LOG_FILE_PATH_CURRENT" ]; do
        UPGRADE_LOG_FILE_PATH_CURRENT="${UPGRADE_LOG_FILE_PATH%.upgrade.log}.$((counter++)).upgrade.log"
      done

      exec > >(tee $${UPGRADE_LOG_FILE_PATH_CURRENT}) 2>&1
      echo "✍ Upgrade log file path: $UPGRADE_LOG_FILE_PATH_CURRENT"
      ### END LOGGING SETUP

      echo "⌛ [UPGRADE] Changing database file permissions to mssql"

      for env_var in MSSQL_DATA_DIR MSSQL_LOG_DIR MSSQL_BACKUP_DIR MSSQL_MASTER_DATA_FILE MSSQL_MASTER_LOG_FILE; do
          if [ "${!env_var}" ]; then
              chown -R mssql -v ${!env_var}
              echo "[UPGRADE] ✔ Changed owner to mssql for ${env_var}"
          fi
      done

      for env_var in MSSQL_MASTER_DATA_FILE MSSQL_MASTER_LOG_FILE; do
          if [ "${!env_var}" ] && [ -f "${!env_var}" ]; then
              parentdir="$(dirname "${!env_var}")"
              chown -R mssql -v ${parentdir}
              echo "[UPGRADE] ✔ Changed owner to mssql for ${env_var} parent directory"
          fi
      done

      echo "[UPGRADE] ✅ Changed database file permissions to mssql"

      echo "[UPGRADE] 🎁 - Starting MS SQL as mssql user.."
      exec su -c "/opt/mssql/bin/sqlservr --accept-eula" mssql
    # Run as root, as we need to change database file owner
    securityContext:
      runAsNonRoot: false
      runAsUser: 0
      runAsGroup: 0
  
  rabbitmq:
    initContainers:
      - name: rabbit-3-11-enable-feature-flags
        image: "{{ printf \"%s/%s:%s\"  $.Values.image.registry $.Values.image.repository \"3.11.20-debian-11-r25\" }}"
        command:
        - /usr/bin/stdbuf
        - -i0
        - -o0
        - -e0
        - /bin/bash
        - -c
        - |
          ### LOGGING SETUP
          counter=1
          UPGRADE_LOG_FILE_PATH_CURRENT="${UPGRADE_LOG_FILE_PATH%.upgrade.log}.$((counter++)).upgrade.log"
          while [ -e "$UPGRADE_LOG_FILE_PATH_CURRENT" ]; do
            UPGRADE_LOG_FILE_PATH_CURRENT="${UPGRADE_LOG_FILE_PATH%.upgrade.log}.$((counter++)).upgrade.log"
          done

          exec > >(tee $${UPGRADE_LOG_FILE_PATH_CURRENT}) 2>&1
          echo "✍ Upgrade log file path: $UPGRADE_LOG_FILE_PATH_CURRENT"
          ### END LOGGING SETUP

          # Function to handle the interrupt signal and exit the script gracefully
          exit_upon_signal() {
              echo "❌ Received interrupt signal. Exiting..."
              exit 1
          }
          trap exit_upon_signal SIGINT SIGTERM

          echo "[UPGRADE] 🟢 Running RabbitMQ 3.11 Upgrade script"

          if [[ -z "${UPGRADE_LOCK_FILE_PATH}" ]]; then
              echo "[UPGRADE] ⛔ - Env variable UPGRADE_LOCK_FILE_PATH is not set"
              exit 1
          fi

          if [ -f $UPGRADE_LOCK_FILE_PATH ]; then
            echo '[UPGRADE] 👋 - Upgrade stage was already completed (lock file already exists). Exiting..'
            exit
          fi

          echo "[UPGRADE] 🎁 Running server in background: /opt/bitnami/scripts/rabbitmq/entrypoint.sh /opt/bitnami/scripts/rabbitmq/run.sh"
          /opt/bitnami/scripts/rabbitmq/entrypoint.sh /opt/bitnami/scripts/rabbitmq/run.sh &
          pid=$!

          while true; do
              echo "[UPGRADE] 🖊 - Attempting to enable all features using rabbitmqctl.."
              rabbitmqctl enable_feature_flag all

              if [ $? -eq 0 ]; then
                  break
              fi

              sleep 3
          done

          echo "[UPGRADE] ✅ - Enabled all RabbitMQ features. Sending stop signal to RabbitMQ.."
          kill $pid

          echo "[UPGRADE] ⏳ - Waiting for RabbitMQ (PID: $pid) to stop.."
          # || true is used to suppress exit code.
          wait $pid || true

          echo "[UPGRADE] ✅ - RabbitMQ stopped"

          touch $UPGRADE_LOCK_FILE_PATH
          echo "[UPGRADE] 👋 - Upgrade is completed. Wrote lock file: $UPGRADE_LOCK_FILE_PATH. Exiting.."

          exit 
        volumeMounts:
          # - name: configuration
          #   mountPath: /bitnami/rabbitmq/conf
          - name: data
            mountPath: "{{ $.Values.persistence.mountPath }}"
        env:
          - name: RABBITMQ_NODE_NAME
            value: rabbit@cluedin-rabbitmq-0.cluedin-rabbitmq-headless.cluedin.svc.cluster.local  
          - name: RABBITMQ_USE_LONGNAME
            value: "true"
          - name: UPGRADE_LOCK_FILE_PATH
            value: "{{ $.Values.persistence.mountPath }}/upgrade-rabbitmq-3.11-completed.lock"
          - name: UPGRADE_LOG_FILE_PATH
            value: "{{ $.Values.persistence.mountPath }}/upgrade-rabbitmq-3.11.upgrade.log"

  neo4j:
    volumes:
      old-data:
        mode: "volume"
        volume:
          persistentVolumeClaim:
            claimName: "datadir-cluedin-neo4j-core-0"
        disableSubPathExpr: true

    podSpec:
      initContainers:
        - name: upgrade-4-intro
          command:
            - /usr/bin/stdbuf
            - -i0
            - -o0
            - -e0
            - "/bin/bash"
            - "-c"
            - |
              ### LOGGING SETUP
              counter=1
              UPGRADE_LOG_FILE_PATH_CURRENT="${UPGRADE_LOG_FILE_PATH%.upgrade.log}.$((counter++)).upgrade.log"
              while [ -e "$UPGRADE_LOG_FILE_PATH_CURRENT" ]; do
                UPGRADE_LOG_FILE_PATH_CURRENT="${UPGRADE_LOG_FILE_PATH%.upgrade.log}.$((counter++)).upgrade.log"
              done

              exec > >(tee $${UPGRADE_LOG_FILE_PATH_CURRENT}) 2>&1
              echo "✍ Upgrade log file path: $UPGRADE_LOG_FILE_PATH_CURRENT"
              ### END LOGGING SETUP

              /upgrade-intro.sh
          volumeMounts:
            - name: old-data
              mountPath: /data
          # Run as root, as we need to change database file owner
          securityContext:
            runAsNonRoot: false
            runAsUser: 0
            runAsGroup: 0
          env:
            - name: NEO4J_dbms_directories_data
              value: /data
            - name: NEO4J_dbms_allow__upgrade
              value: "true"
            - name: UPGRADE_LOCK_FILE_PATH
              value: /data/neo4j_4_intro_upgrade_complete.lock
            - name: UPGRADE_LOG_FILE_PATH
              value: /data/neo4j_4_intro.upgrade.log
            - name: UPGRADE_SECRETS_NEO4J_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: "cluedin-neo4j-secrets"
                  key: "neo4j-password"
          cluedinExtensions:
            image:
              repository: cluedin/neo4j-upgrade-4
              # tag: <use platform version>
        - name: upgrade-4-outro
          command:
            - /usr/bin/stdbuf
            - -i0
            - -o0
            - -e0
            - "/bin/bash"
            - "-c"
            - |
              ### LOGGING SETUP
              counter=1
              UPGRADE_LOG_FILE_PATH_CURRENT="${UPGRADE_LOG_FILE_PATH%.upgrade.log}.$((counter++)).upgrade.log"
              while [ -e "$UPGRADE_LOG_FILE_PATH_CURRENT" ]; do
                UPGRADE_LOG_FILE_PATH_CURRENT="${UPGRADE_LOG_FILE_PATH%.upgrade.log}.$((counter++)).upgrade.log"
              done

              exec > >(tee $${UPGRADE_LOG_FILE_PATH_CURRENT}) 2>&1
              echo "✍ Upgrade log file path: $UPGRADE_LOG_FILE_PATH_CURRENT"
              ### END LOGGING SETUP

              /upgrade-outro.sh
          volumeMounts:
            - name: old-data
              mountPath: /data
          env:
            - name: NEO4J_dbms_directories_data
              value: /data
            - name: UPGRADE_DB_DUMP_DIR
              value: /data/db_dumps
            - name: UPGRADE_LOCK_FILE_PATH
              value: /data/neo4j_4_outro_upgrade_complete.lock
            - name: UPGRADE_LOG_FILE_PATH
              value: /data/neo4j_4_outro.upgrade.log
          cluedinExtensions:
            image:
              repository: cluedin/neo4j-upgrade-4
              # tag: <use platform version>
            inheritSecurityContext: true
        - name: upgrade-5-intro
          command:
            - /usr/bin/stdbuf
            - -i0
            - -o0
            - -e0
            - "/bin/bash"
            - "-c"
            - |
              ### LOGGING SETUP
              counter=1
              UPGRADE_LOG_FILE_PATH_CURRENT="${UPGRADE_LOG_FILE_PATH%.upgrade.log}.$((counter++)).upgrade.log"
              while [ -e "$UPGRADE_LOG_FILE_PATH_CURRENT" ]; do
                UPGRADE_LOG_FILE_PATH_CURRENT="${UPGRADE_LOG_FILE_PATH%.upgrade.log}.$((counter++)).upgrade.log"
              done

              exec > >(tee $${UPGRADE_LOG_FILE_PATH_CURRENT}) 2>&1
              echo "✍ Upgrade log file path: $UPGRADE_LOG_FILE_PATH_CURRENT"
              ### END LOGGING SETUP

              # Copy previous upgrade logs, as they live on different volume
              cp -f $UPGRADE_PREV_LOG_FILE_DIR/*.upgrade.log "$(dirname $UPGRADE_LOG_FILE_PATH)/"
              echo "📝 Copied previous upgrade logs from: $UPGRADE_PREV_LOG_FILE_DIR"

              /upgrade-intro.sh
          volumeMounts:
            - name: old-data
              mountPath: /old-data
            - name: data
              mountPath: /data
              subPathExpr: data
          env:
            - name: NEO4J_server_directories_data
              value: /data
            - name: UPGRADE_DB_DUMP_DIR
              value: /old-data/db_dumps
            - name: UPGRADE_LOCK_FILE_PATH
              value: /old-data/neo4j_5_intro_upgrade_complete.lock
            - name: UPGRADE_LOG_FILE_PATH
              value: /data/neo4j_5_intro.upgrade.log
            - name: UPGRADE_PREV_LOG_FILE_DIR
              value: /old-data
            - name: UPGRADE_SECRETS_NEO4J_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: "cluedin-neo4j-secrets"
                  key: "neo4j-password"
          cluedinExtensions:
            image:
              repository: cluedin/neo4j-upgrade-5
              # tag: <use platform version>
            inheritSecurityContext: true
    cluedinExtensions:
      pluginInstallersEnabled: false
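
The "LOGGING SETUP" preamble repeated in the scripts above picks the first unused numbered log file, so each run of an upgrade container writes a new log instead of overwriting the previous one. The sketch below restates that naming logic as a standalone function for illustration; the function name `next_upgrade_log` is hypothetical, and the loop is written in a POSIX-compatible form that yields the same file names as the preamble.

```shell
# Hypothetical restatement of the log-naming logic from the LOGGING SETUP preamble.
# Given ".../name.upgrade.log", print the first ".../name.N.upgrade.log" that does not exist yet.
next_upgrade_log() {
  base="${1%.upgrade.log}"   # strip the ".upgrade.log" suffix
  counter=1
  while [ -e "$base.$counter.upgrade.log" ]; do
    counter=$((counter + 1))
  done
  printf '%s\n' "$base.$counter.upgrade.log"
}
```

For a path such as `/data/neo4j_4_intro.upgrade.log`, the first run resolves to `/data/neo4j_4_intro.1.upgrade.log`, and a rerun resolves to `/data/neo4j_4_intro.2.upgrade.log` once the first file exists.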

Example:

helm upgrade -n cluedin my-name cluedin/cluedin-platform \
    --values my-custom-values.yaml \
    --values platform-upgrade-v2-stage2.yml

⚠ Upgrade values should be applied at the end to override any other values

❗ Wait until deployment finishes. Make sure that all pods are healthy and all jobs are completed.

❗ Don't proceed further if any of the pods have health issues. Investigate the issues first.
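
When investigating, the init container logs are a reasonable first stop. The sketch below waits for the jobs to complete and then prints the logs of the upgrade init containers named in the stage 2 values file. It rests on assumptions: the function name `stage2_check` is hypothetical, the 30 minute timeout is arbitrary, the pod name `cluedin-rabbitmq-0` is inferred from the RabbitMQ node name in the values file, and `cluedin-neo4j-0` is inferred from the default volume name, so adjust both to your deployment.

```shell
# Hypothetical helper: wait for jobs, then show the upgrade init container logs.
# Pod names are assumptions; container names come from the stage 2 values file.
stage2_check() {
  ns="${1:-cluedin}"

  # Block until every job in the namespace reports the "complete" condition.
  kubectl wait --for=condition=complete job --all -n "$ns" --timeout=30m || return 1

  kubectl logs -n "$ns" cluedin-rabbitmq-0 -c rabbit-3-11-enable-feature-flags

  for c in upgrade-4-intro upgrade-4-outro upgrade-5-intro; do
    kubectl logs -n "$ns" cluedin-neo4j-0 -c "$c"
  done
}
```

The same output is also written to the `*.upgrade.log` files on the data volumes, which can help if a container has already been replaced.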

Stage 4: Continue upgrade using new chart without upgrade values

Continue the upgrade using the new chart, this time without the upgrade values file. For example, this could
include running the chart with your custom values and application.system.runNugetFullRestore enabled.

Example:

helm upgrade -n cluedin my-name cluedin/cluedin-platform \
    --values my-custom-values.yaml \
    --set application.system.runNugetFullRestore=true

Perform all the remaining upgrade steps and clean up. At this point you should have a fully working solution with your existing data.

Stage 5: Remove obsolete data

The following volumes can be safely removed, as they contain old data:

  • datadir-cluedin-neo4j-core-0
    Neo4j databases were migrated to a new data-cluedin-neo4j-0 (default name) volume,
    so the old one can be safely removed to free up space.
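
As a sketch of the clean-up, the old claim can be deleted with `kubectl` once the upgraded system has been verified. The helper name `remove_old_neo4j_volume` is hypothetical and the namespace defaults to `cluedin` as in the earlier examples. Depending on the reclaim policy of the storage class, the underlying volume may need to be removed separately.

```shell
# Hypothetical helper: delete the old Neo4j claim after confirming it exists.
remove_old_neo4j_volume() {
  ns="${1:-cluedin}"
  pvc="datadir-cluedin-neo4j-core-0"

  # Stop if the claim is not there (already removed, or a different namespace).
  kubectl get pvc "$pvc" -n "$ns" || return 1

  kubectl delete pvc "$pvc" -n "$ns"
}
```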

Features

  • [cluedin-infrastructure] neo4j upgraded to version 5.12.x
  • [cluedin-infrastructure] rabbitmq upgraded to version 12.0.x
  • [cluedin-infrastructure] ms-sql upgraded to version 2022-latest
  • [cluedin-application] Dotnet v6.x upgrade
  • [cluedin-application] Added new datasource micro-services
  • [cluedin-application] Added shared file storage for new micro-services datasource
  • [cluedin-application] Added option for cert-manager removal
  • [cluedin-application] Added keyvault integration for custom certs and keys
  • [cluedin-application] Added security for grafana
  • [cluedin-application] Added new grafana profiling