After the Raft cluster forms and a leader is elected, the leader retrieves the test configuration from a centralized remote source rather than requiring identical env vars or YAML files on every node. The leader then distributes the parsed config to all followers via gRPC (Issue #46).
Cluster mode must be enabled
This feature only applies when CLUSTER_ENABLED=true. In standalone mode (the default) config continues to come from env vars / YAML files — no change to existing behaviour.
Config sources
GCS Bucket — GCP deployments
Leader downloads a YAML config object from GCS after election:
- Uses GCP Application Default Credentials (ADC) — no extra auth config needed on GCP VMs or container-as-a-service; the metadata server provides credentials automatically
- Config object is the same YAML schema already supported by the config system
- Uses the google-cloud-storage Rust crate (or reqwest against the GCS JSON API) for the download
CLUSTER_CONFIG_SOURCE=gcs
GCS_CONFIG_BUCKET=my-loadtest-configs
GCS_CONFIG_OBJECT=configs/prod-test.yaml
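If the download ends up going through reqwest rather than the crate, the leader would hit the GCS JSON API's media endpoint. A minimal sketch of building that URL — the function name is illustrative, and only a subset of percent-encoding is shown:

```rust
// Sketch: build the GCS JSON API media-download URL for a config object.
// The object path is a single path segment in the JSON API, so '/' must be
// percent-encoded as %2F. Function name is illustrative, not the real code.
fn gcs_media_url(bucket: &str, object: &str) -> String {
    // Minimal percent-encoding covering characters common in object paths;
    // a real implementation would encode every reserved character.
    let encoded: String = object
        .bytes()
        .map(|b| match b {
            b'/' => "%2F".to_string(),
            b' ' => "%20".to_string(),
            _ => (b as char).to_string(),
        })
        .collect();
    format!(
        "https://storage.googleapis.com/storage/v1/b/{}/o/{}?alt=media",
        bucket, encoded
    )
}

fn main() {
    // With the env vars above:
    //   GCS_CONFIG_BUCKET=my-loadtest-configs
    //   GCS_CONFIG_OBJECT=configs/prod-test.yaml
    let url = gcs_media_url("my-loadtest-configs", "configs/prod-test.yaml");
    println!("{}", url);
    // https://storage.googleapis.com/storage/v1/b/my-loadtest-configs/o/configs%2Fprod-test.yaml?alt=media
}
```

The actual GET would carry an `Authorization: Bearer` token obtained via ADC from the metadata server.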
Consul KV — local / dev deployments
Leader reads config from Consul KV store after election. The Consul agent is already running for service discovery and DNS (Issue #47) — the KV lookup uses the same agent, no additional connection needed.
- Key value must be a YAML string matching the existing config schema
- Consul blocking queries can be used for optional hot-reload when the KV value is updated while a test is running
- The leader's Consul health tag transitions to leader (Issue #47) before the KV fetch, so operators can verify the right node is fetching config
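Since the lookup goes through the local agent's HTTP API, the leader's read amounts to a single GET. A sketch of the request URL — `?raw` makes Consul return the stored value directly instead of a JSON envelope with a base64-encoded Value field; the agent address shown is Consul's default and the function name is illustrative:

```rust
// Sketch: the Consul KV read the leader performs after election, expressed
// as the agent HTTP API URL. `?raw` returns the raw stored value, so no
// base64 decoding is needed. Names are illustrative.
fn consul_kv_url(agent: &str, key: &str) -> String {
    format!("http://{}/v1/kv/{}?raw", agent, key)
}

fn main() {
    // With CONSUL_CONFIG_KEY left at its default of loadtest/config:
    let url = consul_kv_url("127.0.0.1:8500", "loadtest/config");
    println!("{}", url);
}
```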
CLUSTER_CONFIG_SOURCE=consul-kv
CONSUL_CONFIG_KEY=loadtest/config # KV path (default: loadtest/config)
Write config to Consul KV before starting a test:
consul kv put loadtest/config @my-test-config.yaml
Flow
1. CLUSTER_ENABLED=true → nodes start, discover peers
- Consul: query loadtest-cluster.service.consul for peers
- Static: read CLUSTER_NODES list
2. Raft cluster forms, leader elected (Issue #47)
3. Leader updates Consul tag to leader and /health/cluster returns state=leader
4. Leader reads CLUSTER_CONFIG_SOURCE env var
5. Leader fetches config:
- gcs: GCS download via ADC
- consul-kv: Consul KV GET at CONSUL_CONFIG_KEY
6. Leader parses YAML → existing Config struct (no new schema needed)
7. Leader pushes TestConfig proto to all followers via gRPC DistributeConfig (Issue #46)
8. Followers acknowledge → health check returns cluster_ready=true on all nodes
9. Leader issues StartTest with coordinated start_at timestamp (Issue #48)
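Steps 4–5 above boil down to mapping CLUSTER_CONFIG_SOURCE onto a fetch strategy. A sketch of that dispatch, assuming hypothetical type and function names (in the real binary the inputs would come from std::env::var):

```rust
// Sketch of steps 4-5: the leader resolves CLUSTER_CONFIG_SOURCE into a
// fetch strategy, validating the source-specific env vars. All names are
// illustrative, not the actual implementation.
#[derive(Debug, PartialEq)]
enum ConfigSource {
    Gcs { bucket: String, object: String },
    ConsulKv { key: String },
}

fn resolve_source(
    source: Option<&str>,
    bucket: Option<&str>,
    object: Option<&str>,
    key: Option<&str>,
) -> Result<ConfigSource, String> {
    match source {
        Some("gcs") => Ok(ConfigSource::Gcs {
            bucket: bucket.ok_or("GCS_CONFIG_BUCKET is required for gcs")?.to_string(),
            object: object.ok_or("GCS_CONFIG_OBJECT is required for gcs")?.to_string(),
        }),
        Some("consul-kv") => Ok(ConfigSource::ConsulKv {
            // CONSUL_CONFIG_KEY defaults to loadtest/config
            key: key.unwrap_or("loadtest/config").to_string(),
        }),
        Some(other) => Err(format!("unknown CLUSTER_CONFIG_SOURCE: {}", other)),
        None => Err("CLUSTER_CONFIG_SOURCE is required when CLUSTER_ENABLED=true".to_string()),
    }
}

fn main() {
    let src = resolve_source(Some("consul-kv"), None, None, None).unwrap();
    println!("{:?}", src);
}
```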
Error handling
- GCS object not found or ADC credentials fail → leader logs error and aborts (config problem, not a leader problem — re-election will not help)
- Consul KV key missing → abort with a clear error suggesting: consul kv put loadtest/config @config.yaml
- Followers that do not receive config within CLUSTER_CONFIG_TIMEOUT_SECS (default 30s) log a warning; the leader retries delivery up to 3 times before aborting
- Config parse error → abort with line/field details from the YAML parser
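The leader-side retry described above could be a small generic helper around the gRPC call. A sketch under assumed names, with a closure standing in for the DistributeConfig call (a real loop would also log each failure and back off between attempts):

```rust
// Sketch of the leader's delivery retry: attempt a fallible operation up
// to `max_attempts` times, returning the last error if all attempts fail.
// The closure stands in for the gRPC DistributeConfig call; names are
// illustrative.
fn retry<T, E>(max_attempts: u32, mut attempt: impl FnMut() -> Result<T, E>) -> Result<T, E> {
    let mut last_err = None;
    for _ in 0..max_attempts {
        match attempt() {
            Ok(v) => return Ok(v),
            Err(e) => last_err = Some(e), // real code: log warning, back off
        }
    }
    Err(last_err.expect("max_attempts must be > 0"))
}

fn main() {
    // Simulate a follower that only acknowledges on the third attempt.
    let mut calls = 0;
    let result = retry(3, || {
        calls += 1;
        if calls < 3 { Err("follower not ready") } else { Ok("ack") }
    });
    println!("{:?} after {} calls", result, calls);
}
```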
Standalone mode (CLUSTER_ENABLED=false — the default)
No change. Config comes from env vars and/or local YAML file as it does today. GCS and Consul KV are not consulted.
Full env var reference
CLUSTER_CONFIG_SOURCE=gcs # gcs | consul-kv (required when CLUSTER_ENABLED=true)
GCS_CONFIG_BUCKET= # GCS bucket name (gcs source only)
GCS_CONFIG_OBJECT= # GCS object path e.g. configs/test.yaml (gcs source only)
CONSUL_CONFIG_KEY=loadtest/config # Consul KV path (consul-kv source only)
CLUSTER_CONFIG_TIMEOUT_SECS=30 # max seconds to wait for config distribution to followers
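A sketch of applying the documented defaults for the optional vars — names are illustrative, inputs would come from std::env::var, and treating a malformed timeout as the default (rather than aborting) is an assumption, not specified behaviour:

```rust
// Sketch: apply the defaults documented above when the optional env vars
// are unset. Falling back to 30 on a malformed timeout is an assumption
// of this sketch; names are illustrative.
fn consul_config_key(raw: Option<&str>) -> String {
    raw.unwrap_or("loadtest/config").to_string()
}

fn config_timeout_secs(raw: Option<&str>) -> u64 {
    raw.and_then(|s| s.parse().ok()).unwrap_or(30)
}

fn main() {
    println!("{}", consul_config_key(None));         // loadtest/config
    println!("{}", config_timeout_secs(Some("45"))); // 45
    println!("{}", config_timeout_secs(None));       // 30
}
```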