Horizontal Scaling

openWCS scales horizontally: every service's request path is stateless (all durable state lives in Postgres or Kafka), so you can run multiple replicas behind a load balancer. The two places that previously assumed a single instance — scheduled jobs and the conveyor-loop capacity check — were made replication-safe in the 9920fba release.

Full detail: docs/SCALING.md. Ready-to-apply Kubernetes manifests: deploy/k8s/.

What scales, and the guarantees

Area	Mechanism	Scales?
REST request path (all services)	stateless; state in Postgres	✅ replicas / HPA
Outbox relays — order-management, txlog	ShedLock `@SchedulerLock` — one replica drains per tick	✅
Off-peak jobs — slotting velocity/replenishment/reslot, counting sweep, host webhook	ShedLock `@SchedulerLock`	✅
Conveyor loop capacity — flow-orchestrator	pessimistic row lock on the loop row makes count-and-enter atomic	✅
Stock reservation — inventory / allocation	already serialized by a pessimistic lock on `AVAILABLE` stock rows	✅
txlog→stock projection (inventory), velocity learner (slotting)	Kafka consumer group + idempotent apply on `event_id`	✅ up to topic partitions
conveyor-sniffer	TCP telegram stream per controller cannot be split across replicas	⚠️ single instance — see below

ShedLock (scheduled-job leader election)

order-management, txlog, slotting, counting, and integration-host each carry ShedLock (JDBC). A @SchedulerLock-annotated @Scheduled method acquires a cluster-wide lock — a row in that service's own <schema>.shedlock table (added by a Flyway migration; see each service's ShedLockConfig) — so it runs on only one replica per fire. Outbox relays use a short lockAtMostFor (1 min) for fast failover if a holder dies; daily sweep jobs use the service default.

Operational must-dos

Kafka partition count caps consumer scaling. The inventory stock projection (group.id = inventory-stock-projection) and the slotting velocity learner (slotting-velocity-learner) scale only up to the partition count of txlog.stream. Provision that topic with at least as many partitions as the desired max replica count; replicas beyond the partition count idle.
Do not load-balance the conveyor-sniffer. It accepts a long-lived TCP telegram stream per controller; splitting that stream across replicas fragments telegrams. It is pinned to one replica (replicas: 1, Recreate strategy in deploy/k8s/adapters.yaml). To scale sniffing, run one instance per controller/site (distinct SNIFFER_LISTEN/ALLOWED_IPS) or use a sticky (session- affinity) L4 load balancer — not round-robin.

Kubernetes starter manifests

deploy/k8s/ contains plain-YAML starter manifests for running the whole estate on Kubernetes:

deploy/k8s/
├── namespace.yaml   # the openwcs namespace
├── config.yaml      # shared ConfigMap (service DNS, Kafka, security) + DB Secret placeholder
├── services.yaml    # Deployment + Service for every Java service
├── adapters.yaml    # Deployment + Service for the Go adapters (sniffer pinned to replicas: 1)
└── hpa.yaml         # HorizontalPodAutoscalers for the high-traffic services (2→10 on CPU)

These are a starting point — set your image registry, secrets, ingress, and resource sizing for your cluster before relying on them. See Deployment Guide for cloud-specific recipes (GKE, AKS, ECS Fargate) and the deploy/k8s/README.md for conventions baked into the manifests.

# After editing config.yaml (image registry, DB/Kafka endpoints, secrets):
kubectl apply -f deploy/k8s/

Inter-service URLs use Kubernetes DNS (http://master-data:8081, …) — identical to the Docker Compose service names, so no application config changes are needed.

Known throughput tuning (not required for correctness)

Scan hot path: the routing fast path (RoutingGraphCache + reverse-Dijkstra next-hop tables memoised per target) replaced the per-scan full edge fetch + Dijkstra. Warm scans answer in low single-digit milliseconds with two synchronous DB ops: one indexed route-plan read (deliberately not cached — dynamic state shared across stateless replicas) and one route-position UPDATE (loop occupancy + plan progression are read from it). Counters and SCANNED trace writes are async (ScanSideEffects). At very high rates the remaining bottleneck is the route-position UPDATE; adding replicas helps (each carries its own graph snapshot, evicted on topology writes or the 60 s TTL). Monitor the ~10 ms per-scan budget via GET /api/flow/reports/decision-latency.
Outbox relays are single-writer by ShedLock. If relay throughput becomes the limit, switch to a partitioned FOR UPDATE SKIP LOCKED drain keyed per stream (preserves per-stream order while allowing parallel workers).

openWCS

Design

Flows

Reporting & Dashboards

Operations

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Horizontal Scaling

Horizontal Scaling

What scales, and the guarantees

ShedLock (scheduled-job leader election)

Operational must-dos

Kubernetes starter manifests

Known throughput tuning (not required for correctness)

See also

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

openWCS

Clone this wiki locally