Cheetah development infrastructure

This repository is used to set up infrastructure when developing locally with Kafka/OpenSearch.

The repository consists of a set of docker-compose files, all of which are referenced in the .env file. This allows invoking docker compose up <service-name> on a service defined in any of the docker-compose files from the root of the repository.
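
This works because Docker Compose merges every file listed in the COMPOSE_FILE variable. A minimal sketch of what the relevant .env entries could look like (the actual file list in this repository will differ):

# Illustrative .env entries; see the repository's .env for the real values
COMPOSE_PATH_SEPARATOR=:
COMPOSE_FILE=docker-compose/keycloak.yaml:docker-compose/kafka.yaml:docker-compose/opensearch.yaml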

See also: https://docs.cheetah.trifork.dev/reference/development-infrastructure

Start infrastructure

docker compose up --quiet-pull

Prerequisites

  1. Follow: https://docs.cheetah.trifork.dev/getting-started/guided-tour/prerequisites#run-standard-jobs
  2. Run docker network create cheetah-infrastructure

Resource requirements

The infrastructure requires a lot of resources, especially memory, when running all services at once.

Here is some basic profiling done while running through WSL2 with 16GB RAM:

# See if your docker supports memory limits
docker info --format '{{json .MemoryLimit}}'
# Get total memory for docker
docker info --format '{{json .MemTotal}}' | numfmt --from=auto --to=iec
# Get total CPUs for docker
docker info --format '{{json .NCPU}}'
| Profile    | MEM USAGE / LIMIT |
| ---------- | ----------------- |
| core       | 2.4GB / 4.4GB     |
| kafka      | 1.3GB / 2.2GB     |
| opensearch | 1.9GB / 2.9GB     |
| full       | 2.9GB / 5.2GB     |

Estimated requirements:

| Profile     | CPUs | Docker available memory (RAM) | Disk space (Images) |
| ----------- | ---- | ----------------------------- | ------------------- |
| Minimum     | 2    | 4GB                           | >6.6GB              |
| Recommended | 8    | 8GB                           | >20GB               |
| Best        | 16   | 16GB                          | >40GB               |

Security model

The development infrastructure follows the Reference Security Model.
For local development, we use Keycloak (docker-compose/keycloak.yaml) as a local IDP.

See sections below for details on security model configuration.

Kafka

The Kafka setup consists of the following services:

  • kafka - Strimzi Kafka with the Cheetah Kafka Authorizer
  • zookeeper - Strimzi Zookeeper
  • redpanda - Redpanda Console provides a user interface to manage multiple Kafka Connect clusters. https://docs.redpanda.com/docs/manage/console/
  • kafka-setup - A bash script which sets up a Kafka user for Redpanda to use when connecting to Kafka, as well as some predefined topics. The topics to be created are determined by the environment variable INITIAL_KAFKA_TOPICS, which can be set in the .env file or overridden in your local environment (see the example after this list).
  • schema-registry - Schema registry
  • kafka-minion - Kafka Prometheus exporter
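
For example, here is a sketch of overriding the topic list for a single run, assuming INITIAL_KAFKA_TOPICS takes a comma-separated list of topic names (check the .env file for the exact format):

# Illustrative override; the expected format of INITIAL_KAFKA_TOPICS may differ
INITIAL_KAFKA_TOPICS="my-input-topic,my-output-topic" docker compose --profile=kafka up -d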

Running Kafka and its associated services

Run:

docker compose --profile=kafka --profile=oauth --profile=schemaregistry --profile=redpanda up -d

When all of the services are running, you can go to:

Listeners

Five different listeners are set up for Kafka on different internal and external ports (see server.properties for the configuration):

  • localhost:9092 - Used for connecting to Kafka with OAuth2 authentication from outside the docker environment.
  • localhost:9093 - Used for connecting to Kafka without authentication from outside the docker environment.
  • kafka:19092 - Used for connecting to Kafka with OAuth2 authentication from a docker container in the cheetah-infrastructure docker network.
  • kafka:19093 - Used for connecting to Kafka without authentication from a docker container in the cheetah-infrastructure docker network.
  • kafka:19094 - Only used by Redpanda, since it does not support OAuth2.
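
As a quick way to verify connectivity on the unauthenticated listener from outside docker, you can use kcat (not part of this repository; install it separately):

# List brokers and topics via the unauthenticated external listener
kcat -b localhost:9093 -L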

Authentication

To require OAuth2 authentication when connecting to Kafka, remove ;User:ANONYMOUS from the super.users property in server.properties.
This causes all connections from unauthenticated sources to be rejected by the CheetahKafkaAuthorizer.
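
For illustration, the change could look like this in server.properties (the actual list of super users may differ):

# Before: anonymous connections are treated as super users
super.users=User:admin;User:ANONYMOUS
# After: only authenticated principals remain
super.users=User:admin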

OpenSearch

The OpenSearch setup consists of the following services:

  • OpenSearch - OpenSearch data storage solution
  • OpenSearch-Dashboard - Dashboard solution for interacting with the OpenSearch API
  • OpenSearch Configurer - Uses the OpenSearch Template Configuration Script to set up index templates and more.

Files placed in any subdirectory of config/opensearch-configurer/ are automatically applied to the OpenSearch instance.
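
As a sketch, such a file might contain a standard OpenSearch index template definition like the one below; the exact schema the configurer expects is defined by the OpenSearch Template Configuration Script, so treat this as illustrative:

{
  "index_patterns": ["my-index-*"],
  "template": {
    "settings": { "number_of_shards": 1 },
    "mappings": {
      "properties": { "timestamp": { "type": "date" } }
    }
  }
}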

Running OpenSearch and its associated services

Run:

docker compose --profile=opensearch --profile=opensearch_dashboard up -d

When all of the services are running, you can go to:

Authentication

Services should connect using the OAuth2 protocol.
You can choose to set DISABLE_SECURITY_DASHBOARDS_PLUGIN=true and DISABLE_SECURITY_PLUGIN=true to disable security completely.
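
For example, assuming the compose files read these variables from the shell environment, you could start OpenSearch with security disabled like this:

# Start OpenSearch and its dashboard with the security plugins turned off
DISABLE_SECURITY_PLUGIN=true DISABLE_SECURITY_DASHBOARDS_PLUGIN=true \
  docker compose --profile=opensearch --profile=opensearch_dashboard up -d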

Basic auth access

When working locally, you can use the admin:admin user and query OpenSearch like this:

curl -k -s -u "admin:admin" $OPENSEARCH_URL/_cat/indices

OAuth2 token

If you do not want to use basic auth locally, you can get a token using this curl command:

ACCESS_TOKEN=$(curl -s -X POST $OPENSEARCH_TOKEN_URL \
     -H "Content-Type: application/x-www-form-urlencoded" \
     -d "grant_type=client_credentials&client_id=$OPENSEARCH_CLIENT_ID&client_secret=$OPENSEARCH_CLIENT_SECRET&scope=$OPENSEARCH_SCOPE" \
     | jq -r '.access_token')
# If jq is not available, replace the last line with:
#     | grep -o '"access_token":"[^"]*' | grep -o '[^"]*$')

And query OpenSearch like this:

curl -k -s -H "Authorization: Bearer $ACCESS_TOKEN" $OPENSEARCH_URL/_cat/indices

Timescale

The Timescale setup consists of the following services:

  • TimescaleDB - PostgreSQL with the timescale extension
  • PgAdmin - GUI for managing TimescaleDB

Running TimescaleDB and its associated services

Run:

docker compose --profile=timescale up -d

When all of the services are running, you can go to:

Authentication

By default a single user is set up:

  • Username: postgres, Password: admin
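
As a sketch of connecting from the host with psql, assuming TimescaleDB is published on the default PostgreSQL port 5432 (check the timescale compose file for the actual port mapping):

# Connect as the default user; enter the password "admin" when prompted
psql -h localhost -p 5432 -U postgres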

List of all profiles in docker compose

List of profiles:

  • full
  • core
  • kafka
  • opensearch
  • observability
  • timescale

The table below shows which services each profile starts.

| Images / profiles     | kafka-core | opensearch-core | schema-registry-core | core | kafka | opensearch | observability | timescale | full |
| --------------------- | ---------- | --------------- | -------------------- | ---- | ----- | ---------- | ------------- | --------- | ---- |
| Keycloak              | x          | x               | x                    | x    | x     | x          | x             |           | x    |
| Kafka                 | x          |                 | x                    | x    | x     |            | x             |           | x    |
| Redpanda console      |            |                 |                      |      | x     |            |               |           | x    |
| Opensearch            |            | x               |                      | x    |       | x          |               |           | x    |
| Opensearch dashboard  |            |                 |                      |      |       | x          |               |           | x    |
| Opensearch configurer |            | x               |                      | x    |       | x          |               |           | x    |
| Schema registry       |            |                 | x                    | x    | x     |            |               |           | x    |
| Prometheus            |            |                 |                      |      |       |            | x             |           | x    |
| Grafana               |            |                 |                      |      |       |            | x             |           | x    |
| Timescale             |            |                 |                      |      |       |            |               | x         |      |

Keycloak

Keycloak is used as a local identity provider, to be able to mimic a production security model with service-to-service authentication.

Useful urls:

Default clients:

A set of default clients has been defined, covering the most common use cases.

All roles are mapped to the roles claim in the JWT. This configuration is defined in local-development.json and is applied to Keycloak by the keycloak-setup service. To modify the configuration, either go to the admin console (Username: admin, Password: admin) or edit local-development.json following this guide. An example token request is shown after the client list below.

  • Default access
    • Description: Read and write access to all data Kafka, OpenSearch and Schema registry
    • client_id: default-access
    • client_secret: default-access-secret
    • default_scopes: [ ]
    • optional_scopes:
      • kafka
        • Roles:
          • Kafka_*_all
      • opensearch
        • Roles:
          • opensearch_default_access
      • schema-registry
        • Roles:
          • sr-producer
  • Default write
    • Description: Write access to all data in Kafka, OpenSearch and Schema registry
    • client_id: default-write
    • client_secret: default-write-secret
    • default_scopes: [ ]
    • optional_scopes:
      • kafka
        • Roles:
          • Kafka_*_write
      • opensearch
        • Roles:
          • opensearch_default_write
      • schema-registry
        • Roles:
          • sr-producer
  • Default read
    • Description: Read access to all data in Kafka, OpenSearch and Schema registry
    • client_id: default-read
    • client_secret: default-read-secret
    • default_scopes: [ ]
    • optional_scopes:
      • kafka
        • Roles:
          • Kafka_*_read
      • opensearch
        • Roles:
          • opensearch_default_read
  • Users
    • Description: User login via browser such as OpenSearch Dashboard (See Users for user details)
    • client_id: users
    • client_secret: users-secret
    • default_scopes: [ ]
    • optional_scopes:
      • kafka
      • opensearch
      • schema-registry
  • Custom client
    • Description: A custom client which can be configured using Environment variables. Useful for pipelines where services require custom roles.
    • client_id: $DEMO_CLIENT_NAME
    • client_secret: $DEMO_CLIENT_SECRET
    • default_scopes:
      • custom-client
          • Roles: $DEMO_CLIENT_ROLES - Should be a comma-separated list, e.g. my_view_role,my_edit_role,my_admin_role
    • optional_scopes: [ ]
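
To illustrate how these clients are used, here is a sketch of requesting a token for the default-access client with its optional kafka scope. $KEYCLOAK_TOKEN_URL is a hypothetical variable; substitute the token endpoint of the local Keycloak instance:

# Request a client_credentials token carrying the Kafka roles
curl -s -X POST $KEYCLOAK_TOKEN_URL \
     -H "Content-Type: application/x-www-form-urlencoded" \
     -d "grant_type=client_credentials&client_id=default-access&client_secret=default-access-secret&scope=kafka" \
     | jq -r '.access_token'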

Users:

  • developer
    • Username: developer
    • Password: developer
    • Roles:
      • opensearch_developer
      • opensearch_default_read
      • Kafka_*_all
      • sr-producer