# Vault Setup in Kubernetes (EKS)

This demo shows how a Vault cluster can be configured on Kubernetes.  It also shows how Auto Unseal can be configured using the Vault Transit Engine.

<img src="images/vault-demo-vault-cluster-on-K8s.png">

## Summary of solution

This setup was tested on macOS and is meant to simulate a distributed setup.  This demo walks through the following steps:
- Set up a kind K8s cluster (https://kind.sigs.k8s.io/)
- Install and configure a 3-node Vault cluster using the Vault Helm Chart
- Expose the Vault nodes using a NodePort
- Demonstrate how automated snapshots are configured and how manual snapshots are taken and restored
- Test the High Availability of the Vault cluster
- Set up Transit Auto-Unseal to simplify Vault server restarts
- Upgrade a Vault cluster using integrated storage autopilot

## Requirements to Run This Demo
You will need Visual Studio Code installed with the Jupyter plugin.  To run this notebook in VS Code, choose the Jupyter kernel and then Bash.
- To run the current cell, use Ctrl + Enter.
- To run the current cell and advance to the next, use Shift+Enter.

# Setup Pre-requisites (One-time)

This assumes you have Docker and Homebrew installed.

- https://docs.docker.com/desktop/install/mac-install/
- https://brew.sh/

In [None]:
# Install kind
brew install kind

In [None]:
# Install Kubectl CLI
brew install kubernetes-cli

In [None]:
# Install Helm CLI.  This is used to install the Vault helm chart.
brew install helm

In [None]:
# Install k9s.  This is a nice console UI for K8s.  https://k9scli.io/
brew install k9s

# Setup K8s cluster

In [None]:
# Start a kind cluster.  We need 3 nodes for the Vault cluster and 1 node for the Transit Auto-Unseal Vault.
# We will set up 6 worker nodes because the autopilot upgrade demo later on scales the Vault cluster to 6 pods.
# Note that the Vault helm chart's default affinity settings spread the Vault pods across different host nodes.
# We will be using a NodePort on port 30000, so kind needs extraPortMappings to expose port 30000 to the host.
kind create cluster --name vault --image kindest/node:v1.28.0 --config - <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  extraPortMappings:
  - containerPort: 30000
    hostPort: 30000
    listenAddress: "0.0.0.0" # Optional, defaults to "0.0.0.0"
    protocol: tcp # Optional, defaults to tcp
- role: worker
- role: worker
- role: worker
- role: worker
- role: worker
- role: worker
EOF

In [None]:
# Verify kind containers are running
docker ps
echo
# Show the nodes in our K8s cluster (1 control-plane node and 6 worker nodes)
kubectl get nodes
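
Nodes can take a little while to become Ready after `kind create cluster` returns. As a minimal sketch, the wait could be scripted with `kubectl wait`; the helper name `wait_for_nodes` and the default timeout are assumptions for illustration and are not part of the original demo.

```shell
# Block until every node in the current kube context reports Ready.
# wait_for_nodes is a hypothetical helper; the default timeout is arbitrary.
wait_for_nodes() {
  timeout="${1:-180s}"
  kubectl wait --for=condition=Ready nodes --all --timeout="$timeout"
}

# Usage:
#   wait_for_nodes 180s
```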

In [None]:
# Create a new K8s namespace for this demo
## Specify the K8s namespace for the Vault setup
export KUBENAMESPACE=vault-ns

## Delete namespace if it exists
#kubectl delete ns $KUBENAMESPACE

echo "Creating K8s namespace: $KUBENAMESPACE"
kubectl create ns $KUBENAMESPACE

In [None]:
# Setup Vault Enterprise License in a K8s secret.  Update the path to your license file.
export VAULT_LICENCE=$(cat ../../vault-enterprise/vault_local/data/vault.hclic)
#kubectl delete secret vault-ent-license -n $KUBENAMESPACE
kubectl create secret generic vault-ent-license --from-literal="license=${VAULT_LICENCE}" -n $KUBENAMESPACE
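
If the license path is wrong, `cat` fails but the secret is still created with an empty value, and the Vault pods then fail later. A small guard, sketched below under the assumption that an empty or missing file should stop the cell, makes the failure visible earlier. `read_license` is a hypothetical helper name.

```shell
# Print the license file contents, or fail if the file is missing or empty.
# read_license is a hypothetical helper, not part of the original notebook.
read_license() {
  license_path="$1"
  if [ ! -s "$license_path" ]; then
    echo "License file not found or empty: $license_path" >&2
    return 1
  fi
  cat "$license_path"
}

# Usage (assign first, then export, so a failure is not masked by export):
#   VAULT_LICENCE=$(read_license ../../vault-enterprise/vault_local/data/vault.hclic) && export VAULT_LICENCE
```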

In [None]:
# We will be exposing the vault nodes using a NodePort on port 30000
# vault-active: "true" is commented out.  If included, it will only route to the leader node
kubectl apply -n $KUBENAMESPACE -f - <<EOF
kind: Service
apiVersion: v1
metadata:
  name: port-vault-svc
spec:
  type: NodePort 
  selector:
    app.kubernetes.io/name: "vault"
    app.kubernetes.io/instance: "vault"
    component: server
    #vault-active: "true"
  ports:
    - nodePort: 30000
      port: 8200
      targetPort: 8200
EOF

In [None]:
# Configure my host to connect to the NodePort for Vault
export VAULT_ADDR=http://localhost:30000

In [None]:
# Add the HashiCorp repo (Only required for the first time)
helm repo add hashicorp https://helm.releases.hashicorp.com

In [None]:
# Optional.  Update the repo (Only required when new versions are released)
helm repo update

In [None]:
# Optional.  This allows you to view the helm charts for vault
helm search repo hashicorp/vault -l

# Setting up a new 3 node Vault Cluster

In [None]:
# Install a 3 node Vault cluster using the Vault helm chart.  
# This will configure the raft database on PersistentVolumes and also configure raft auto join between the 3 Vault pods.
# For demo purposes, we will be using HTTP.
# See https://developer.hashicorp.com/vault/docs/platform/k8s/helm/configuration for options
helm install vault hashicorp/vault --version 0.27.0 -n $KUBENAMESPACE -f - <<EOF
injector:
  enabled: false
server:
  image:
    repository: hashicorp/vault-enterprise
    tag: latest
  enterpriseLicense:
    secretName: vault-ent-license
  logLevel: trace
  auditStorage:
    enabled: true
  ha:
    enabled: true
    replicas: 3
    raft:
      enabled: true
      setNodeId: true
      config: |
        disable_mlock = true
        ui = true
        listener "tcp" {
          tls_disable = 1
          address = "[::]:8200"
          cluster_address = "[::]:8201"
        }
        storage "raft" {
          # PVC Volume to keep Vault data
          path = "/vault/data"
          # For auto-join to the raft cluster
          retry_join {
            leader_api_addr = "http://vault-0.vault-internal:8200"
          }
          retry_join {
            leader_api_addr = "http://vault-1.vault-internal:8200"
          }
          retry_join {
            leader_api_addr = "http://vault-2.vault-internal:8200"
          } 
        }
EOF



In [None]:
# View installed charts
helm list -A

In [None]:
# View Vault pods in Vault namespace
#kubectl get pods -n $KUBENAMESPACE -o wide

# Show resources in Vault namespace
kubectl -n $KUBENAMESPACE get all

# Make sure all Vault pods are in Running status before continuing

# Note:
# The containers should start in less than a minute.  If the containers are stuck in ContainerCreating for a long time without any errors,
# there could be throttling issues on the DockerHub side.  You might want to delete and recreate the kind cluster and try again.
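
Rather than re-running the cell by hand, the wait for Running pods could be polled. The sketch below reuses the label selector from the NodePort service defined earlier; the helper names, retry count, and interval are assumptions. It checks the pod phase rather than readiness because sealed Vault pods typically show as Running but not Ready.

```shell
# Count Vault server pods whose phase is Running in namespace $1.
# The label selector mirrors the one used by the NodePort service.
running_vault_pods() {
  kubectl get pods -n "$1" \
    -l app.kubernetes.io/name=vault,app.kubernetes.io/instance=vault,component=server \
    -o jsonpath='{.items[*].status.phase}' | tr ' ' '\n' | grep -c '^Running$'
}

# Poll until at least $2 pods are Running (hypothetical helper).
# $3 = number of attempts (default 30), $4 = seconds between attempts (default 5).
wait_for_vault_pods() {
  ns="$1"; want="$2"; tries="${3:-30}"; interval="${4:-5}"
  while [ "$tries" -gt 0 ]; do
    [ "$(running_vault_pods "$ns")" -ge "$want" ] && return 0
    tries=$((tries - 1))
    sleep "$interval"
  done
  return 1
}

# Usage:
#   wait_for_vault_pods "$KUBENAMESPACE" 3
```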

In [None]:
# On first time setup, verify that all Vault nodes are sealed and not initialized.  Initialized = false & Sealed = true
kubectl exec -ti vault-0 -n $KUBENAMESPACE -- vault status
echo
kubectl exec -ti vault-1 -n $KUBENAMESPACE -- vault status
echo
kubectl exec -ti vault-2 -n $KUBENAMESPACE -- vault status
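
The seal state can also be read from the exit code of `vault status`, which is documented as 0 for unsealed, 2 for sealed, and 1 for an error. The sketch below assumes that `kubectl exec` passes the remote command's exit code through; `seal_state` is a hypothetical helper name.

```shell
# Map the exit code of `vault status` inside a pod to a word.
# Exit codes: 0 = unsealed, 2 = sealed, anything else = error.
# seal_state is a hypothetical helper; $1 = pod name, $2 = namespace.
seal_state() {
  pod="$1"; ns="$2"
  rc=0
  kubectl exec "$pod" -n "$ns" -- vault status >/dev/null 2>&1 || rc=$?
  case "$rc" in
    0) echo "unsealed" ;;
    2) echo "sealed" ;;
    *) echo "error" ;;
  esac
}

# Usage:
#   for i in 0 1 2; do echo "vault-$i: $(seal_state "vault-$i" "$KUBENAMESPACE")"; done
```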

In [None]:
# Initialize vault-0 pod.  For demo purposes, we will just be generating 1 unseal key.
kubectl exec -ti vault-0 -n $KUBENAMESPACE -- vault operator init -format=json -key-shares=1 -key-threshold=1 > init.json

In [None]:
# Show the init.json
cat init.json | jq

# Store the Unseal Key and Root Token for use later
export UNSEAL_KEY=$(jq -r '.unseal_keys_b64[]' init.json)
export VAULT_TOKEN=$(jq -r '.root_token' init.json)
echo
echo "Vault Unseal Key: $UNSEAL_KEY"
echo "Vault Root Token: $VAULT_TOKEN"
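
Because init.json holds the unseal key and root token, it may be worth keeping the file readable by the current user only, even in a demo. A minimal sketch, assuming a hypothetical helper named `write_private` that writes its standard input to a file with owner-only permissions:

```shell
# Write stdin to file $1 with owner-only read/write permissions.
# write_private is a hypothetical helper, not part of the original notebook.
write_private() {
  ( umask 077; cat > "$1" ) && chmod 600 "$1"
}

# Usage (instead of redirecting straight to init.json):
#   kubectl exec -i vault-0 -n "$KUBENAMESPACE" -- vault operator init -format=json \
#     -key-shares=1 -key-threshold=1 | write_private init.json
```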

In [None]:
# Unseal vault-0 pod.  You should see Sealed = false.  Re-run the command if Sealed is true.
echo "Vault Unseal Key: $UNSEAL_KEY"
kubectl exec -ti vault-0 -n $KUBENAMESPACE -- vault operator unseal $UNSEAL_KEY

In [None]:
# Note: This step is only required if the retry_join setting is not in the Vault config.
# We are skipping this step but note that you can do manual joining if you don't specify retry_join in the Vault config.
# Join vault-1 pod to the cluster.
#kubectl exec -ti vault-1 -n $KUBENAMESPACE -- vault operator raft join http://vault-0.vault-internal:8200
# Join vault-2 pod to the cluster.
#kubectl exec -ti vault-2 -n $KUBENAMESPACE -- vault operator raft join http://vault-0.vault-internal:8200

In [None]:
# Unseal vault-1 pod.  You should see Sealed = false.  Re-run the command if Sealed is true.
echo "Vault Unseal Key: $UNSEAL_KEY"
kubectl exec -ti vault-1 -n $KUBENAMESPACE -- vault operator unseal $UNSEAL_KEY

In [None]:
# Unseal vault-2 pod.  You should see Sealed = false.  Re-run the command if Sealed is true.
echo "Vault Unseal Key: $UNSEAL_KEY"
kubectl exec -ti vault-2 -n $KUBENAMESPACE -- vault operator unseal $UNSEAL_KEY

In [None]:
# Verify that I can access the vault cluster from the node port
vault secrets list
echo
# Test logging in as root on vault-0 and verify that you can also access vault from the pod
kubectl exec -ti vault-0 -n $KUBENAMESPACE -- vault login $VAULT_TOKEN
echo
kubectl exec -ti vault-0 -n $KUBENAMESPACE -- vault secrets list
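
Another way to check what the NodePort is routing to is Vault's unauthenticated `sys/health` endpoint, which reports node state through its HTTP status code. The sketch below maps the default documented codes to words; the helper names are assumptions. Since the NodePort service is not restricted to the active node, repeated calls may return different states.

```shell
# Translate the HTTP status of GET /v1/sys/health into a node state.
# Default documented codes: 200 active, 429 standby, 472 DR secondary,
# 473 performance standby, 501 not initialized, 503 sealed.
health_state() {
  case "$1" in
    200) echo "active" ;;
    429) echo "standby" ;;
    472) echo "dr-secondary" ;;
    473) echo "performance-standby" ;;
    501) echo "uninitialized" ;;
    503) echo "sealed" ;;
    *)   echo "unknown ($1)" ;;
  esac
}

# Query the endpoint through VAULT_ADDR and print the state (hypothetical helper).
vault_health() {
  code=$(curl -s -o /dev/null -w '%{http_code}' "$VAULT_ADDR/v1/sys/health")
  health_state "$code"
}

# Usage:
#   vault_health
```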

## Vault Backup and Restore

In [None]:
# Configure automated snapshots every 24 hours.  The snapshots are stored locally under the path prefix
# "/vault", and up to 7 snapshots are retained before the oldest is deleted to make room for the next one.
# The local disk space available for snapshots is 1GB, or 1073741824 bytes.  This means that up to 7 snapshots
# or 1GB of data are retained, whichever limit is reached first.
# Note that storage_type can also be configured to point to cloud storage on AWS, Azure, or GCP.
# Ref:
# https://developer.hashicorp.com/vault/tutorials/raft/raft-storage#automated-snapshots
# https://developer.hashicorp.com/vault/api-docs/system/storage/raftautosnapshots
vault write sys/storage/raft/snapshot-auto/config/daily interval="24h" retain=7 \
  path_prefix="/vault" storage_type="local" local_max_space=1073741824
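
The `local_max_space` value is given in bytes. The arithmetic behind 1073741824 (1 GiB = 1024 x 1024 x 1024 bytes) can be reproduced with shell arithmetic; `gib_to_bytes` is a hypothetical helper name used only for this sketch.

```shell
# Convert a whole number of GiB to bytes, e.g. for local_max_space.
# gib_to_bytes is a hypothetical helper.
gib_to_bytes() {
  echo $(( $1 * 1024 * 1024 * 1024 ))
}

gib_to_bytes 1   # prints 1073741824
```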


In [None]:
# View the automated snapshot configuration
vault read sys/storage/raft/snapshot-auto/config/daily

In [None]:
# You can also do a manual snapshot.  Note: Needs to be executed on the leader node.
kubectl exec -ti vault-0 -n $KUBENAMESPACE -- vault login $VAULT_TOKEN
echo
kubectl exec -ti vault-0 -n $KUBENAMESPACE -- vault operator raft snapshot save /vault/demo.snapshot

In [None]:
# Let's make a modification after the snapshot: enable the Transit engine.
kubectl exec -ti vault-0 -n $KUBENAMESPACE --  vault secrets enable transit

In [None]:
# View that the transit engine is now in the list of secret engines.
kubectl exec -ti vault-0 -n $KUBENAMESPACE -- vault secrets list

In [None]:
# Restore a snapshot.  Note: Needs to be executed on the leader node.
kubectl exec -ti vault-0 -n $KUBENAMESPACE -- vault operator raft snapshot restore /vault/demo.snapshot

In [None]:
# Now view the list of secret engines and you will see that it has been restored to the original state without the transit engine
kubectl exec -ti vault-0 -n $KUBENAMESPACE -- vault secrets list

In [None]:
# View raft status of your Vault cluster.  Verify that the cluster is still healthy after the restore.
vault operator raft autopilot state

## Testing High Availability

In [None]:
# View raft status of your Vault cluster and verify the leader is vault-0
vault operator raft list-peers
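
To pick the leader out of the peer list without reading the table by eye, the output can be filtered with awk. This sketch assumes the default table layout of `vault operator raft list-peers` (columns Node, Address, State, Voter); `raft_leader` is a hypothetical helper name.

```shell
# Print the node ID of the current raft leader.
# Assumes the table columns are: Node, Address, State, Voter.
raft_leader() {
  vault operator raft list-peers | awk '$3 == "leader" { print $1 }'
}

# Usage:
#   echo "Current leader: $(raft_leader)"
```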

In [None]:
# Let's try deleting the vault-0 pod
kubectl delete pod vault-0 -n $KUBENAMESPACE
# See that the pod gets recreated
kubectl get pods -n $KUBENAMESPACE

In [None]:
# View raft status of your Vault cluster and verify that another leader has taken over
vault operator raft list-peers
echo
# Verify that vault-0 is sealed
kubectl exec -ti vault-0 -n $KUBENAMESPACE -- vault status

In [None]:
# Verify that I can still access vault secrets even though vault-0 is sealed
vault secrets list

In [None]:
# View raft status of your Vault cluster.  Verify that Healthy is false as the new Vault node is still sealed.
vault operator raft autopilot state

In [None]:
# Unseal vault-0 pod.  You should see Sealed = false.  Re-run the command if Sealed is true.
echo "Vault Unseal Key: $UNSEAL_KEY"
kubectl exec -ti vault-0 -n $KUBENAMESPACE -- vault operator unseal $UNSEAL_KEY

In [None]:
# View raft status of your Vault cluster.  Verify that Healthy is now true.
vault operator raft autopilot state
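
The cluster-level health flag can be extracted from the autopilot output for use in scripts. This sketch assumes the default text output, where the first unindented `Healthy:` line is the cluster-wide value and the per-server `Healthy:` lines are indented; `autopilot_healthy` is a hypothetical helper name.

```shell
# Print the cluster-wide Healthy value (true/false) from autopilot state.
# Only the unindented "Healthy:" line is matched; per-server lines are indented.
autopilot_healthy() {
  vault operator raft autopilot state | awk '/^Healthy:/ { print $2; exit }'
}

# Usage:
#   [ "$(autopilot_healthy)" = "true" ] && echo "cluster is healthy"
```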

In [None]:
# The next demo will be showing Transit Auto Unseal.  Clearing the existing setup.
# Delete the Vault cluster
helm delete vault -n $KUBENAMESPACE

In [None]:
# Clear Vault PVCs
kubectl -n $KUBENAMESPACE delete pvc --all 
echo
# Verify that all PVCs are cleared
kubectl -n $KUBENAMESPACE get pvc  

# Configure Vault Cluster for Transit Auto Unseal

Auto-unseal can be done with AWS KMS, Azure Key Vault, GCP Cloud KMS, HSM devices via PKCS#11, and lastly Vault's Transit Engine.

Ref: https://developer.hashicorp.com/vault/tutorials/auto-unseal

This section will demonstrate how we can do auto-unsealing of the Vault cluster nodes using Transit Auto Unseal.

Note that an existing cluster can be moved to auto-unseal using the seal migration process.

Ref: https://developer.hashicorp.com/vault/docs/concepts/seal#seal-migration

For demo purposes, we will be doing a fresh Vault cluster setup.

## Configure a standalone Vault server to provide the auto unseal keys using Transit Engine

In [None]:
# Install a new Vault pod to provide the transit engine for auto-unseal
helm install vault-transit hashicorp/vault --version 0.27.0 -n $KUBENAMESPACE -f - <<EOF
injector:
  enabled: false
server:
  image:
    repository: hashicorp/vault-enterprise
    tag: latest
  enterpriseLicense:
    secretName: vault-ent-license
  logLevel: trace
  auditStorage:
    enabled: true
  ha:
    enabled: true
    replicas: 1
    raft:
      enabled: true
      setNodeId: true
EOF


In [None]:
# Show resources in Vault namespace.  Verify that vault-transit-0 is Running before continuing.
kubectl -n $KUBENAMESPACE get all

In [None]:
# Initialize vault-transit-0 pod.  For demo purposes, we will just be generating 1 unseal key.
kubectl exec -ti vault-transit-0 -n $KUBENAMESPACE -- vault operator init -format=json -key-shares=1 -key-threshold=1 > transit-init.json

In [None]:
# Show the transit-init.json
cat transit-init.json | jq

# Store the vault-transit-0 Unseal Key and Root Token for use later
export TRANSIT_UNSEAL_KEY=$(jq -r '.unseal_keys_b64[]' transit-init.json)
export TRANSIT_VAULT_TOKEN=$(jq -r '.root_token' transit-init.json)
echo
echo "Transit Vault Unseal Key: $TRANSIT_UNSEAL_KEY"
echo "Transit Vault Root Token: $TRANSIT_VAULT_TOKEN"

In [None]:
# Unseal vault-transit-0 pod.  You should see Sealed = false
echo "Transit Vault Unseal Key: $TRANSIT_UNSEAL_KEY"
kubectl exec -ti vault-transit-0 -n $KUBENAMESPACE -- vault operator unseal $TRANSIT_UNSEAL_KEY

In [None]:
# Configure the Transit Engine for use for Auto Unseal
echo "Transit Vault Root Token: $TRANSIT_VAULT_TOKEN"
# Login to Vault and enable the transit engine
kubectl exec -ti vault-transit-0 -n $KUBENAMESPACE -- vault login $TRANSIT_VAULT_TOKEN
echo
kubectl exec -ti vault-transit-0 -n $KUBENAMESPACE -- vault secrets enable transit

In [None]:
# Create an auto-unseal encryption key
kubectl exec -ti vault-transit-0 -n $KUBENAMESPACE -- vault write -f transit/keys/autounseal

In [None]:
# Create an auto-unseal policy to access the autounseal key
kubectl exec -ti vault-transit-0 -n $KUBENAMESPACE -- vault policy write autounseal - <<EOF
path "transit/encrypt/autounseal" {
    capabilities = [ "update" ]
}
path "transit/decrypt/autounseal" {
    capabilities = [ "update" ]
}
EOF
echo
# Verify the policy is written
kubectl exec -ti vault-transit-0 -n $KUBENAMESPACE -- vault policy read autounseal

In [None]:
# Create a periodic orphan token that will be auto-renewed by the transit auto-unseal (it tries to renew once 2/3 of the period has elapsed)
# Ref:
# https://developer.hashicorp.com/vault/tutorials/auto-unseal/autounseal-transit
# https://developer.hashicorp.com/vault/docs/commands/token/create#period
kubectl exec -ti vault-transit-0 -n $KUBENAMESPACE -- vault token create -orphan -policy="autounseal" \
    -period=24h -format=json > token.json


In [None]:
# Show token.json
cat token.json | jq

# Store the Auto Unseal Token, this will be used to configure the Vault cluster for Auto Unseal
export AUTOUNSEAL_TOKEN=$(jq -r .auth.client_token token.json)
echo
echo "Auto Unseal Token: $AUTOUNSEAL_TOKEN"

In [None]:
# Verify that the created token is valid and has the "autounseal" policy
echo "Transit Vault Auto Unseal Token: $AUTOUNSEAL_TOKEN"
kubectl exec -ti vault-transit-0 -n $KUBENAMESPACE -- vault login $AUTOUNSEAL_TOKEN

In [None]:
# We will be showing the Vault upgrade process later
# Ref: https://hub.docker.com/r/hashicorp/vault-enterprise/tags
# Install latest minor release for Vault Enterprise 1.14 for now
# Later we will be testing the upgrade to the latest minor release for Vault Enterprise 1.15
export INITIAL_VAULT_VERSION=1.14-ent
export TARGET_VAULT_VERSION=1.15-ent

# Install the 3 node Vault cluster with the Vault helm chart.  Note the extra seal stanza.  
# We will be setting the autounseal vault token as an environment variable.
# See https://developer.hashicorp.com/vault/docs/platform/k8s/helm/configuration for options

# The retry_join stanza is updated for 6 nodes to demo the autopilot upgrade use case below
helm install vault hashicorp/vault --version 0.27.0 -n $KUBENAMESPACE -f - <<EOF
injector:
  enabled: false
server:
  image:
    repository: hashicorp/vault-enterprise
    tag: $INITIAL_VAULT_VERSION
  enterpriseLicense:
    secretName: vault-ent-license
  logLevel: trace
  auditStorage:
    enabled: true
  extraEnvironmentVars:
    VAULT_TOKEN: $AUTOUNSEAL_TOKEN
  ha:
    enabled: true
    replicas: 3
    raft:
      enabled: true
      setNodeId: true
      config: |
        disable_mlock = true
        ui = true
        listener "tcp" {
          tls_disable = 1
          address = "[::]:8200"
          cluster_address = "[::]:8201"
        }
        storage "raft" {
          # PVC Volume to keep Vault data
          path = "/vault/data"
          # For auto-join to the raft cluster
          retry_join {
            leader_api_addr = "http://vault-0.vault-internal:8200"
          }
          retry_join {
            leader_api_addr = "http://vault-1.vault-internal:8200"
          }
          retry_join {
            leader_api_addr = "http://vault-2.vault-internal:8200"
          } 
          retry_join {
            leader_api_addr = "http://vault-3.vault-internal:8200"
          }
          retry_join {
            leader_api_addr = "http://vault-4.vault-internal:8200"
          }
          retry_join {
            leader_api_addr = "http://vault-5.vault-internal:8200"
          } 
        }
        seal "transit" {
          address = "http://vault-transit-0.vault-transit-internal:8200"
          #token = $AUTOUNSEAL_TOKEN
          disable_renewal = "false"
          key_name = "autounseal"
          mount_path = "transit/"
          tls_skip_verify = "true"
        } 
EOF
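
Passing the token through `extraEnvironmentVars` leaves it in plain text in the StatefulSet spec. One alternative, sketched here as an assumption rather than as part of the original demo, is to store the token in a K8s secret and reference it with the chart's `server.extraSecretEnvironmentVars` setting. The secret name `vault-autounseal-token`, the key `token`, and both helper names are hypothetical.

```shell
# Store the auto-unseal token in a K8s secret.
# $1 = token, $2 = namespace.  Secret and key names are hypothetical.
create_autounseal_secret() {
  kubectl create secret generic vault-autounseal-token \
    --from-literal="token=$1" -n "$2"
}

# Print a values fragment that could replace the extraEnvironmentVars entry.
autounseal_values_fragment() {
  cat <<'EOF'
server:
  extraSecretEnvironmentVars:
    - envName: VAULT_TOKEN
      secretName: vault-autounseal-token
      secretKey: token
EOF
}

# Usage:
#   create_autounseal_secret "$AUTOUNSEAL_TOKEN" "$KUBENAMESPACE"
#   autounseal_values_fragment
```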

In [None]:
# Show resources in Vault namespace
kubectl -n $KUBENAMESPACE get all

# Make sure all Vault pods are in Running status before continuing

In [None]:
# Initialize vault-0 pod.  Note that the flags we use are for recovery keys as this is an auto unseal setup.
# For demo purposes, we will just be generating 1 recovery key.
kubectl exec -ti vault-0 -n $KUBENAMESPACE -- vault operator init -format=json -recovery-shares=1 -recovery-threshold=1 > init.json
cat init.json | jq

In [None]:
# Verify that all Vault nodes are already unsealed before going to the next step.  Sealed = false
kubectl exec -ti vault-0 -n $KUBENAMESPACE -- vault status
echo
kubectl exec -ti vault-1 -n $KUBENAMESPACE -- vault status
echo
kubectl exec -ti vault-2 -n $KUBENAMESPACE -- vault status

In [None]:
# View raft status of your Vault cluster.  Verify that Healthy is now true.
export VAULT_TOKEN=$(jq -r '.root_token' init.json)
echo "Vault Root Token: $VAULT_TOKEN"
echo
vault operator raft autopilot state

In [None]:
# Let's try deleting the vault-0 pod
kubectl delete pod vault-0 -n $KUBENAMESPACE
# See that the pod gets recreated
kubectl get pods -n $KUBENAMESPACE

In [None]:
# Verify that the vault-0 pod got recreated
kubectl get pods -o=wide -n $KUBENAMESPACE


In [None]:
# Verify that recreated Vault node is already unsealed.  Sealed = false
kubectl exec -ti vault-0 -n $KUBENAMESPACE -- vault status

# Upgrading a Vault Cluster on K8s with Integrated Storage Autopilot

This will demonstrate the upgrade process for a Vault cluster on K8s and how integrated storage autopilot automatically promotes the new K8s pods running the newer Vault version to leader/followers.
- Update the helm chart and scale out the Vault cluster to 6 nodes with the target Vault version.  i.e. 3 current nodes + 3 newer nodes
- Let autopilot handle the automatic promotion of the new nodes to be the leader/followers.
- Once the autopilot process is complete, delete the 3 old Vault pods.  They will be recreated on the target Vault version to form a 6 node cluster.
- Remove Vault nodes 4-6 from the raft setup.  i.e. Demote back to a 3 Vault node cluster.
- Scale the cluster back down from 6 nodes to 3 nodes to remove Vault nodes 4-6.

Ref:
- https://developer.hashicorp.com/vault/tutorials/kubernetes/kubernetes-raft-deployment-guide#upgrading-vault-on-kubernetes
- https://developer.hashicorp.com/vault/tutorials/raft/raft-autopilot

The Vault StatefulSet uses the OnDelete update strategy. It is critical to use OnDelete instead of RollingUpdate because standbys must be updated before the active primary. A failover to an older version of Vault must always be avoided.

Important: For a Kubernetes StatefulSet with N replicas, note that pods are deleted in the reverse order that they are created. i.e. From N-1 to 0.

Ref: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#deployment-and-scaling-guarantees
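
The reverse ordering can be illustrated with a few lines of shell. This is only a sketch of the N-1 to 0 rule as it applies when scaling down from 6 to 3 replicas; `scale_down_order` is a hypothetical helper and does not talk to the cluster.

```shell
# Print the pods a StatefulSet removes, in order, when scaling
# from $2 replicas down to $3 replicas.  $1 = StatefulSet name.
scale_down_order() {
  name="$1"; from="$2"; to="$3"
  i=$((from - 1))
  while [ "$i" -ge "$to" ]; do
    printf '%s-%d\n' "$name" "$i"
    i=$((i - 1))
  done
}

scale_down_order vault 6 3   # prints vault-5, vault-4, vault-3 (one per line)
```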

Note: We will be using the previous Vault 3-node cluster with transit auto unseal that was configured earlier. So the previous setup needs to be executed before running this section.

In [None]:
# View raft status of your Vault cluster
vault operator raft list-peers

In [None]:
# We will be using the same yaml, the only change is the server.image.tag and server.ha.replicas values.
# We will be scaling to 6 nodes.  3 nodes (current Vault version) + 3 nodes (new Vault version).
helm upgrade vault hashicorp/vault --version 0.27.0 -n $KUBENAMESPACE -f - <<EOF
injector:
  enabled: false
server:
  image:
    repository: hashicorp/vault-enterprise
    tag: $TARGET_VAULT_VERSION
  enterpriseLicense:
    secretName: vault-ent-license
  logLevel: trace
  auditStorage:
    enabled: true
  extraEnvironmentVars:
    VAULT_TOKEN: $AUTOUNSEAL_TOKEN
  ha:
    enabled: true
    replicas: 6
    raft:
      enabled: true
      setNodeId: true
      config: |
        disable_mlock = true
        ui = true
        listener "tcp" {
          tls_disable = 1
          address = "[::]:8200"
          cluster_address = "[::]:8201"
        }
        storage "raft" {
          # PVC Volume to keep Vault data
          path = "/vault/data"
          # For auto-join to the raft cluster
          retry_join {
            leader_api_addr = "http://vault-0.vault-internal:8200"
          }
          retry_join {
            leader_api_addr = "http://vault-1.vault-internal:8200"
          }
          retry_join {
            leader_api_addr = "http://vault-2.vault-internal:8200"
          } 
          retry_join {
            leader_api_addr = "http://vault-3.vault-internal:8200"
          }
          retry_join {
            leader_api_addr = "http://vault-4.vault-internal:8200"
          }
          retry_join {
            leader_api_addr = "http://vault-5.vault-internal:8200"
          } 
        }
        seal "transit" {
          address = "http://vault-transit-0.vault-transit-internal:8200"
          #token = $AUTOUNSEAL_TOKEN
          disable_renewal = "false"
          key_name = "autounseal"
          mount_path = "transit/"
          tls_skip_verify = "true"
        } 
EOF

In [None]:
# View raft status of your Vault cluster and see that autopilot is slowly promoting the new nodes and demoting the old nodes
# Re-run this command to see the following changes happening
# - new nodes getting added as non-voters
# - autopilot kicks in and new nodes get promoted to voters
# - leader is transferred to the new nodes
# - the old nodes are demoted to non-voters
vault operator raft list-peers
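
To follow the promotion without scanning the whole table each time, the current voters can be listed with awk. As with any parsing of CLI text, this sketch assumes the default table layout (two header lines, then Node, Address, State, Voter); `raft_voters` is a hypothetical helper name.

```shell
# Print the node IDs that are currently raft voters.
# Skips the two header lines and keeps rows whose Voter column is "true".
raft_voters() {
  vault operator raft list-peers | awk 'NR > 2 && $4 == "true" { print $1 }'
}

# Usage:
#   raft_voters
```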

In [None]:
# View raft status of your Vault cluster.  Verify that Healthy is true and the voter nodes are the new Vault nodes.
# The older version Vault nodes are now non-voters.
vault operator raft autopilot state

In [None]:
# Now delete vault-0, vault-1, vault-2 pods and get them to upgrade to the target version
kubectl delete pod vault-0 vault-1 vault-2 -n $KUBENAMESPACE

In [None]:
# See that the vault-0, vault-1, and vault-2 pods get recreated.  When all the pods are in running state, go to the next step
kubectl get pods -n $KUBENAMESPACE

In [None]:
# View raft status of your Vault cluster and verify that the recreated Vault nodes are coming back as voter nodes
vault operator raft list-peers

In [None]:
# View raft status of your Vault cluster.  Verify that all nodes are on the target version now.
# Note down which is the current leader node.
vault operator raft autopilot state
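
Once the leader is known, the order in which to remove nodes (followers first, leader last) can be worked out mechanically. The sketch below only computes the order and does not remove anything; `removal_order` is a hypothetical helper, and the leader name is passed in by hand.

```shell
# Print the given nodes with followers first and the leader last.
# $1 = current leader node ID, remaining arguments = nodes to remove.
removal_order() {
  leader="$1"; shift
  for node in "$@"; do
    [ "$node" != "$leader" ] && echo "$node"
  done
  for node in "$@"; do
    [ "$node" = "$leader" ] && echo "$node"
  done
  return 0
}

# Usage (assuming vault-4 was noted as the leader):
#   removal_order vault-4 vault-3 vault-4 vault-5
```

If the leader is not one of the listed nodes, the order is returned unchanged.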

In [None]:
# For a Kubernetes StatefulSet with N replicas, note that pods are deleted in the reverse order that they are created.
# i.e. From N-1 to 0.
# Ref: https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/#deployment-and-scaling-guarantees
# Now we will be scaling back to our original 3 nodes.  We will be removing vault-3, vault-4, vault-5 from the raft cluster.

# Tip:
# The removal of the leader node takes a bit of time for the re-election to occur.
# So for vault-3, vault-4, vault-5, one of them is the current leader node.
# We do not want any of these removed nodes to be re-elected.
# So run the following removals in the correct sequence accordingly.
# i.e. Remove the follower nodes first before removing the leader node.  

# Remove vault-3 from raft cluster
vault operator raft remove-peer vault-3

In [None]:
# Remove vault-4 from raft cluster
vault operator raft remove-peer vault-4


In [None]:
# Remove vault-5 from raft cluster
vault operator raft remove-peer vault-5

In [None]:
# View raft status of your Vault cluster and verify that it now only has the original first three nodes.
# i.e. vault-0, vault-1, vault-2
vault operator raft list-peers

In [None]:
# Now scale back our helm chart to 3 replicas
# server.image.tag is left at the target version and server.ha.replicas is changed from 6 to 3.
helm upgrade vault hashicorp/vault --version 0.27.0 -n $KUBENAMESPACE -f - <<EOF
injector:
  enabled: false
server:
  image:
    repository: hashicorp/vault-enterprise
    tag: $TARGET_VAULT_VERSION
  enterpriseLicense:
    secretName: vault-ent-license
  logLevel: trace
  auditStorage:
    enabled: true
  extraEnvironmentVars:
    VAULT_TOKEN: $AUTOUNSEAL_TOKEN
  ha:
    enabled: true
    replicas: 3
    raft:
      enabled: true
      setNodeId: true
      config: |
        disable_mlock = true
        ui = true
        listener "tcp" {
          tls_disable = 1
          address = "[::]:8200"
          cluster_address = "[::]:8201"
        }
        storage "raft" {
          # PVC Volume to keep Vault data
          path = "/vault/data"
          # For auto-join to the raft cluster
          retry_join {
            leader_api_addr = "http://vault-0.vault-internal:8200"
          }
          retry_join {
            leader_api_addr = "http://vault-1.vault-internal:8200"
          }
          retry_join {
            leader_api_addr = "http://vault-2.vault-internal:8200"
          } 
          retry_join {
            leader_api_addr = "http://vault-3.vault-internal:8200"
          }
          retry_join {
            leader_api_addr = "http://vault-4.vault-internal:8200"
          }
          retry_join {
            leader_api_addr = "http://vault-5.vault-internal:8200"
          } 
        }
        seal "transit" {
          address = "http://vault-transit-0.vault-transit-internal:8200"
          #token = $AUTOUNSEAL_TOKEN
          disable_renewal = "false"
          key_name = "autounseal"
          mount_path = "transit/"
          tls_skip_verify = "true"
        } 
EOF

In [None]:
# You will now see that we have gone back to the original 3 Vault nodes configuration
kubectl get pods -n $KUBENAMESPACE

In [None]:
# Note that the PVCs for Vault nodes 4-6 are still there
kubectl -n $KUBENAMESPACE get pvc

In [None]:
# Delete the PVCs used by Vault nodes 4-6
kubectl -n $KUBENAMESPACE delete pvc audit-vault-3 audit-vault-4 audit-vault-5 data-vault-3 data-vault-4 data-vault-5

# Verify that Vault nodes 4-6 PVCs are removed
kubectl -n $KUBENAMESPACE get pvc

# This completes the autopilot upgrade process

## Clean Up

In [None]:
# Clean up temp files
rm init.json
rm transit-init.json
rm token.json

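# Uninstall metrics server (only needed if you installed it from the Appendix)
helm delete metrics-server -n kube-system

# Disable file audit device (only needed if you enabled it from the Appendix)
vault audit disable file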

# Remove the NodePort
kubectl delete svc port-vault-svc -n $KUBENAMESPACE 

# Delete Vault cluster
helm delete vault -n $KUBENAMESPACE

# Delete the Vault for Transit Auto Unseal
helm delete vault-transit -n $KUBENAMESPACE

# Clear Vault PVCs
kubectl -n $KUBENAMESPACE delete pvc --all 

# Delete kind cluster
kind delete cluster --name vault

# Appendix - Other Useful Commands

In [None]:
# Optional: Turn on the file audit device, this allows you to keep a detailed log of all requests to Vault
vault audit enable file file_path=/vault/audit/vault_audit.log

In [None]:
# Optional: view pod logs
kubectl logs vault-0 -n $KUBENAMESPACE

In [None]:
# Optional: view pod details
kubectl describe pod vault-0 -n $KUBENAMESPACE

In [None]:
# Optional: Add metrics-server to be able to view CPU and memory usage
helm repo add metrics-server https://kubernetes-sigs.github.io/metrics-server/
helm repo update
helm upgrade --install --set args={--kubelet-insecure-tls} metrics-server metrics-server/metrics-server --namespace kube-system

In [None]:
# Optional: You can use k9s to view your pods.
# You can also use the following commands to see the utilization on your nodes/pods
kubectl top nodes
echo
kubectl top pod -n $KUBENAMESPACE

In [None]:
# To get a shell into a Vault pod
kubectl exec -ti vault-0 -n $KUBENAMESPACE -- /bin/sh
#kubectl exec -ti vault-0 -n vault-ns -- /bin/sh

In [None]:
# Show ConfigMap resources for Vault
kubectl get configmap -n $KUBENAMESPACE -o=yaml

In [None]:
# Show vault-config details
kubectl describe configmaps vault-config -n $KUBENAMESPACE

In [None]:
# Show Vault pod details
kubectl describe pod vault-0 -n $KUBENAMESPACE

In [None]:
# Show Persistent Volume Claims in use by Vault
kubectl get pvc -n $KUBENAMESPACE -o=yaml