Active/Active XSite fencing. Resolves keycloak/keycloak#29303
- User alert routing enabled on ROSA clusters

- PrometheusRule used to trigger AWS Lambda webhook in the event of a
  split-brain so that only a single site remains in the global accelerator endpoints

- Global Accelerator scripts refactored to use OpenTofu when creating
  AWS resources

- Task created to deploy/undeploy Active/Active

- Task created to simulate split-brain scenarios

- 'active-active' flag added to GH actions to differentiate between
  active/passive and active/active deployments

Signed-off-by: Ryan Emerson <remerson@redhat.com>
ryanemerson committed May 30, 2024
1 parent 6daa6a8 commit 88742c9
Showing 37 changed files with 899 additions and 193 deletions.
39 changes: 38 additions & 1 deletion .github/workflows/rosa-cluster-auto-provision-on-schedule.yml
@@ -53,11 +53,48 @@ jobs:
createCluster: false
secrets: inherit

run-scaling-benchmark-with-peristent-sessions:
run-scaling-benchmark-with-persistent-sessions:
needs: keycloak-deploy-with-persistent-sessions
uses: ./.github/workflows/rosa-scaling-benchmark.yml
with:
clusterName: gh-keycloak-a # ${{ env.CLUSTER_PREFIX }}-a -- unfortunately 'env.' doesn't work here ${{ env.CLUSTER_PREFIX }}-a
skipCreateDataset: true
outputArchiveSuffix: 'persistent-sessions'
secrets: inherit

keycloak-undeploy-with-persistent-sessions:
needs: run-scaling-benchmark-with-persistent-sessions
name: Undeploy Keycloak deployment on the multi-az cluster
if: github.event_name != 'schedule' || github.repository == 'keycloak/keycloak-benchmark'
uses: ./.github/workflows/rosa-multi-az-cluster-undeploy.yml
with:
clusterPrefix: gh-keycloak # ${{ env.CLUSTER_PREFIX }} -- unfortunately 'env.' doesn't work here
skipAuroraDeletion: true
secrets: inherit

keycloak-deploy-active-active:
needs: keycloak-undeploy-with-persistent-sessions
name: ROSA Scheduled Create Active/Active cluster with Persistent Sessions
if: github.event_name != 'schedule' || github.repository == 'keycloak/keycloak-benchmark'
uses: ./.github/workflows/rosa-multi-az-cluster-create.yml
with:
clusterPrefix: gh-keycloak # ${{ env.CLUSTER_PREFIX }} -- unfortunately 'env.' doesn't work here
enablePersistentSessions: true
createCluster: false
activeActive: true
secrets: inherit

run-functional-tests-active-active:
needs: keycloak-deploy-active-active
uses: ./.github/workflows/rosa-run-crossdc-func-tests.yml
with:
clusterPrefix: gh-keycloak # ${{ env.CLUSTER_PREFIX }} -- unfortunately 'env.' doesn't work here
secrets: inherit

run-scaling-benchmark-active-active:
needs: run-functional-tests-active-active
uses: ./.github/workflows/rosa-scaling-benchmark.yml
with:
clusterName: gh-keycloak-a # ${{ env.CLUSTER_PREFIX }}-a -- unfortunately 'env.' doesn't work here ${{ env.CLUSTER_PREFIX }}-a
outputArchiveSuffix: 'active-active'
secrets: inherit
63 changes: 58 additions & 5 deletions .github/workflows/rosa-multi-az-cluster-create.yml
@@ -16,6 +16,10 @@ on:
keycloakRepository:
description: 'The repository to deploy Keycloak from. If not set nightly image is used'
type: string
activeActive:
description: 'When true deploy an Active/Active Keycloak deployment'
type: boolean
default: false
enablePersistentSessions:
description: 'To enable Persistent user and client sessions to the DB'
type: boolean
@@ -32,16 +36,20 @@ on:
description: 'The AWS region to create both clusters in. Defaults to "vars.AWS_DEFAULT_REGION" if omitted.'
type: string
createCluster:
description: 'Check to Create Cluster'
description: 'Check to Create Cluster.'
type: boolean
default: true
keycloakRepository:
description: 'The repository to deploy Keycloak from. If not set nightly image is used'
type: string
activeActive:
description: 'When true deploy an Active/Active Keycloak deployment'
type: boolean
default: false
enablePersistentSessions:
description: 'To enable Persistent user and client sessions to the DB'
type: boolean
default: false
keycloakRepository:
description: 'The repository to deploy Keycloak from. If not set nightly image is used'
type: string
keycloakBranch:
description: 'The branch to deploy Keycloak from. If not set nightly image is used'
type: string
@@ -109,6 +117,11 @@ jobs:
- name: Checkout repository
uses: actions/checkout@v4

- name: Setup OpenTofu
uses: opentofu/setup-opentofu@v1
with:
tofu_wrapper: false

- name: Setup ROSA CLI
uses: ./.github/actions/rosa-cli-setup
with:
Expand Down Expand Up @@ -140,6 +153,7 @@ jobs:
ROSA_CLUSTER_NAME_2: ${{ env.CLUSTER_PREFIX }}-b

- name: Create Route53 Loadbalancer
if: ${{ !inputs.activeActive }}
working-directory: provision/rosa-cross-dc
run: |
task route53 > route53
@@ -150,10 +164,49 @@ jobs:
ROSA_CLUSTER_NAME_1: ${{ env.CLUSTER_PREFIX }}-a
ROSA_CLUSTER_NAME_2: ${{ env.CLUSTER_PREFIX }}-b

- name: Deploy
- name: Deploy Active/Passive
if: ${{ !inputs.activeActive }}
working-directory: provision/rosa-cross-dc
run: task
env:
AURORA_CLUSTER: ${{ env.CLUSTER_PREFIX }}
AURORA_REGION: ${{ env.REGION }}
ROSA_CLUSTER_NAME_1: ${{ env.CLUSTER_PREFIX }}-a
ROSA_CLUSTER_NAME_2: ${{ env.CLUSTER_PREFIX }}-b
KC_ACTIVE_ACTIVE: ${{ inputs.activeActive }}
KC_CPU_REQUESTS: 6
KC_INSTANCES: 3
KC_DISABLE_STICKY_SESSION: true
KC_PERSISTENT_SESSIONS: ${{ env.KC_PERSISTENT_SESSIONS }}
KC_MEMORY_REQUESTS_MB: 3000
KC_MEMORY_LIMITS_MB: 4000
KC_DB_POOL_INITIAL_SIZE: 30
KC_DB_POOL_MAX_SIZE: 30
KC_DB_POOL_MIN_SIZE: 30
KC_DATABASE: "aurora-postgres"
MULTI_AZ: "true"
KC_REPOSITORY: ${{ inputs.keycloakRepository }}
KC_BRANCH: ${{ inputs.keycloakBranch }}

- name: Create Accelerator Loadbalancer
if: ${{ inputs.activeActive }}
working-directory: provision/rosa-cross-dc
run: |
task global-accelerator-create 2>&1 | tee accelerator
echo "ACCELERATOR_DNS=$(grep -Po 'ACCELERATOR DNS: \K.*' accelerator)" >> $GITHUB_ENV
echo "ACCELERATOR_WEBHOOK=$(grep -Po 'ACCELERATOR WEBHOOK: \K.*' accelerator)" >> $GITHUB_ENV
env:
ACCELERATOR_NAME: ${{ env.CLUSTER_PREFIX }}
ROSA_CLUSTER_NAME_1: ${{ env.CLUSTER_PREFIX }}-a
ROSA_CLUSTER_NAME_2: ${{ env.CLUSTER_PREFIX }}-b

- name: Deploy Active/Active
if: ${{ inputs.activeActive }}
working-directory: provision/rosa-cross-dc
run: task active-active
env:
ACCELERATOR_DNS: ${{ env.ACCELERATOR_DNS }}
ACCELERATOR_WEBHOOK_URL: ${{ env.ACCELERATOR_WEBHOOK }}
AURORA_CLUSTER: ${{ env.CLUSTER_PREFIX }}
AURORA_REGION: ${{ env.REGION }}
ROSA_CLUSTER_NAME_1: ${{ env.CLUSTER_PREFIX }}-a
19 changes: 16 additions & 3 deletions .github/workflows/rosa-multi-az-cluster-delete.yml
@@ -11,7 +11,7 @@ on:
type: string

jobs:
route53:
loadbalancer:
runs-on: ubuntu-latest
steps:
- name: Checkout repository
@@ -40,19 +40,32 @@ jobs:
echo "SUBDOMAIN=$(echo $KEYCLOAK_URL | grep -oP '(?<=client.).*?(?=.keycloak-benchmark.com)')" >> $GITHUB_ENV
- name: Delete Route53 Records
run: |
./provision/aws/route53/route53_delete.sh
run: ./provision/aws/route53/route53_delete.sh
env:
SUBDOMAIN: ${{ env.SUBDOMAIN }}

- name: Set ACCELERATOR_DNS env variable for Global Accelerator processing
run: |
echo "ACCELERATOR_DNS=${KEYCLOAK_URL#"https://"}" >> $GITHUB_ENV
- name: Delete Global Accelerator
run: ./provision/aws/global-accelerator/accelerator_multi_az_delete.sh
env:
ACCELERATOR_DNS: ${{ env.ACCELERATOR_DNS }}
CLUSTER_1: ${{ inputs.clusterPrefix }}-a
CLUSTER_2: ${{ inputs.clusterPrefix }}-b
KEYCLOAK_NAMESPACE: runner-keycloak

cluster1:
needs: loadbalancer
uses: ./.github/workflows/rosa-cluster-delete.yml
with:
clusterName: ${{ inputs.clusterPrefix }}-a
deleteAll: no
secrets: inherit

cluster2:
needs: loadbalancer
uses: ./.github/workflows/rosa-cluster-delete.yml
with:
clusterName: ${{ inputs.clusterPrefix }}-b
60 changes: 51 additions & 9 deletions .github/workflows/rosa-run-crossdc-func-tests.yml
@@ -6,12 +6,20 @@ on:
clusterPrefix:
description: 'The prefix used when creating the Cross DC clusters'
type: string
activeActive:
description: 'Must be true when testing against an Active/Active Keycloak deployment'
type: boolean
default: false

workflow_dispatch:
inputs:
clusterPrefix:
description: 'The prefix used when creating the Cross DC clusters'
type: string
activeActive:
description: 'Must be true when testing against an Active/Active Keycloak deployment'
type: boolean
default: false

concurrency:
# Only run once for the latest commit per ref and cancel other (previous) runs.
@@ -32,6 +40,7 @@ jobs:
distribution: 'temurin'
java-version: '17'
cache: 'maven'

- name: Cache Maven Wrapper
uses: actions/cache@v4
with:
@@ -40,37 +49,70 @@ jobs:
key: ${{ runner.os }}-maven-wrapper-${{ hashFiles('**/maven-wrapper.properties') }}
restore-keys: |
${{ runner.os }}-maven-wrapper-
- name: Setup ROSA CLI
uses: ./.github/actions/rosa-cli-setup
with:
aws-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
aws-default-region: ${{ vars.AWS_DEFAULT_REGION }}
rosa-token: ${{ secrets.ROSA_TOKEN }}

- name: Login to OpenShift cluster A
uses: ./.github/actions/oc-keycloak-login
with:
clusterName: ${{ inputs.clusterPrefix }}-a
- name: Get DC1 URLs

- name: Get DC1 Infinispan URLs
shell: bash
run: |
KEYCLOAK_DC1_URL=https://$(kubectl get routes -n "${{ env.PROJECT }}" aws-health-route -o jsonpath='{.spec.host}')
ISPN_DC1_URL=https://$(kubectl get routes -n "${{ env.PROJECT }}" -l app=infinispan-service-external -o jsonpath='{.items[*].spec.host}')
echo "ISPN_DC1_URL=$ISPN_DC1_URL" >> "$GITHUB_ENV"
- name: Get DC1 Active/Passive URLs
if: ${{ !inputs.activeActive }}
shell: bash
run: |
KEYCLOAK_DC1_URL=https://$(kubectl get routes -n "${{ env.PROJECT }}" aws-health-route -o jsonpath='{.spec.host}')
echo "KEYCLOAK_DC1_URL=$KEYCLOAK_DC1_URL" >> "$GITHUB_ENV"
LOAD_BALANCER_URL=https://$(kubectl get routes -n "${{ env.PROJECT }}" -l app=keycloak -o jsonpath='{.items[*].spec.host}')
echo "LOAD_BALANCER_URL=$LOAD_BALANCER_URL" >> "$GITHUB_ENV"
ISPN_DC1_URL=https://$(kubectl get routes -n "${{ env.PROJECT }}" -l app=infinispan-service-external -o jsonpath='{.items[*].spec.host}')
echo "ISPN_DC1_URL=$ISPN_DC1_URL" >> "$GITHUB_ENV"
- name: Get DC1 Active/Active URLs
if: inputs.activeActive
shell: bash
run: |
KEYCLOAK_DC1_URL=https://$(kubectl get svc -n "${{ env.PROJECT }}" accelerator-loadbalancer -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
echo "KEYCLOAK_DC1_URL=$KEYCLOAK_DC1_URL" >> "$GITHUB_ENV"
LOAD_BALANCER_URL=https://$(kubectl get routes -n "${{ env.PROJECT }}" -l app=keycloak -o jsonpath='{.items[*].spec.host}')
echo "LOAD_BALANCER_URL=$LOAD_BALANCER_URL" >> "$GITHUB_ENV"
- name: Login to OpenShift cluster B
uses: ./.github/actions/oc-keycloak-login
with:
clusterName: ${{ inputs.clusterPrefix }}-b
- name: Get DC2 URLs

- name: Get DC2 Infinispan URLs
shell: bash
run: |
KEYCLOAK_DC2_URL=https://$(kubectl get routes -n "${{ env.PROJECT }}" aws-health-route -o jsonpath='{.spec.host}')
echo "KEYCLOAK_DC2_URL=$KEYCLOAK_DC2_URL" >> "$GITHUB_ENV"
ISPN_DC2_URL=https://$(kubectl get routes -n "${{ env.PROJECT }}" -l app=infinispan-service-external -o jsonpath='{.items[*].spec.host}')
echo "ISPN_DC2_URL=$ISPN_DC2_URL" >> "$GITHUB_ENV"
- name: Run CrossDC functional tests
- name: Get DC2 Active/Passive URLs
if: ${{ !inputs.activeActive }}
shell: bash
run: |
./provision/rosa-cross-dc/keycloak-benchmark-crossdc-tests/run-crossdc-tests.sh
KEYCLOAK_DC2_URL=https://$(kubectl get routes -n "${{ env.PROJECT }}" aws-health-route -o jsonpath='{.spec.host}')
echo "KEYCLOAK_DC2_URL=$KEYCLOAK_DC2_URL" >> "$GITHUB_ENV"
- name: Get DC2 Active/Active URLs
if: inputs.activeActive
shell: bash
run: |
KEYCLOAK_DC2_URL=https://$(kubectl get svc -n "${{ env.PROJECT }}" accelerator-loadbalancer -o jsonpath='{.status.loadBalancer.ingress[0].hostname}')
echo "KEYCLOAK_DC2_URL=$KEYCLOAK_DC2_URL" >> "$GITHUB_ENV"
- name: Run CrossDC functional tests
run: ./provision/rosa-cross-dc/keycloak-benchmark-crossdc-tests/run-crossdc-tests.sh
env:
ACTIVE_ACTIVE: ${{ inputs.activeActive }}
1 change: 1 addition & 0 deletions .gitignore
@@ -101,3 +101,4 @@ provision/environment_data.json
**/*.tfstate*
**/*.terraform*
!**/*.terraform.lock.hcl
provision/opentofu/modules/aws/accelerator/builds/*
@@ -76,9 +76,10 @@ oc login https://api.**<domain name>**:6443 -u **<username>**

NOTE: The session will expire approximately once a day, and you'll need to re-login.

== Enable user workload monitoring
== Enable alert routing for user-defined projects

By default, OpenShift HCP doesn't enable alert routing for user-defined projects.

By default, OpenShift doesn't monitor user workloads.
Apply the following ConfigMap link:{github-files}/provision/openshift/cluster-monitoring-config.yaml[cluster-monitoring-config.yaml] which is located in the `/provision/openshift` folder to OpenShift:

[source,bash]
@@ -93,14 +94,11 @@ After this has been deployed, several new pods spin up in the *openshift-user-wo
kubectl get pods -n openshift-user-workload-monitoring
----

The metrics and targets are then available in the menu entry *Observe* in the OpenShift console.

Additional steps are necessary to enable persistent volumes for the recorded metrics.
Alerts defined in `PrometheusRule` CRs are then available to view under the menu entry *Observe->Alerting* in the OpenShift console.
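
As an illustration of the kind of rule that would then show up under *Observe->Alerting*, the following is a minimal sketch of a `PrometheusRule` CR. The rule name, the namespace, the metric `example_site_reachable`, and the labels are hypothetical placeholders and are not taken from this repository.

```yaml
# Minimal sketch of a PrometheusRule; all names and the metric below are hypothetical.
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: example-site-offline        # hypothetical rule name
  namespace: runner-keycloak        # assumed namespace of the monitored workload
spec:
  groups:
    - name: example.rules
      rules:
        - alert: ExampleSiteOffline
          # Placeholder expression: fires while the hypothetical gauge reports 0.
          expr: min by (namespace) (example_site_reachable) == 0
          for: 1m
          labels:
            severity: critical
          annotations:
            summary: "Remote site is not reachable"
```

A resource like this could be applied with `kubectl apply -f <file>`; the expression would need to be replaced with a metric that the deployment actually exposes.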

Further reading:

* https://docs.openshift.com/container-platform/4.12/monitoring/configuring-the-monitoring-stack.html[Configure OpenShift monitoring stack]
* https://docs.openshift.com/container-platform/4.12/monitoring/enabling-monitoring-for-user-defined-projects.html[Enabling monitoring for user-defined projects]
* https://docs.openshift.com/rosa/observability/monitoring/enabling-alert-routing-for-user-defined-projects.html[Enabling alert routing for user-defined projects]

[#switching-between-different-kubernetes-clusters]
== Switching between different Kubernetes clusters