Nexcast is an autoscaler that forecasts demand and turns that forecast into replica recommendations. It can operate with either Docker or Kubernetes. Traffic demand is derived from per-service traffic metrics plus the capacity settings defined in `services.yaml`.
Nexcast supports two backends:
- `docker` — for scaling locally managed Docker containers
- `kubernetes` — for scaling existing Kubernetes Deployments
- `beta` is the service's CPU cost per request-rate unit; a higher `beta` means each extra unit of traffic consumes more CPU.
- `a` is a fixed utilization offset that accounts for baseline overhead or inefficiency before useful traffic work is done.
- `utilization_target` is the desired safe operating utilization for the service, usually kept below 1.0 to leave headroom.
- `cores_instance` is the effective CPU capacity one replica can contribute.
`beta` and `a` should ideally come from load testing or production observations for each service. If they are estimated poorly, traffic-based scaling becomes noisy and less reliable.
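As a sketch of how these coefficients could combine, the snippet below assumes a linear CPU model (cores needed = `a` + `beta` × RPS, with each replica contributing `utilization_target` × `cores_instance` of safe capacity). This is an illustration of the sizing idea, not Nexcast's exact formula.

```go
package main

import (
	"fmt"
	"math"
)

// recommendReplicas sizes a service from observed RPS under an assumed
// linear CPU model: coresNeeded = a + beta*rps. Each replica is credited
// with utilizationTarget*coresInstance of safely usable CPU.
func recommendReplicas(rps, beta, a, utilizationTarget, coresInstance float64, minReplicas, maxReplicas int) int {
	coresNeeded := a + beta*rps
	safeCapacity := utilizationTarget * coresInstance // usable cores per replica
	n := int(math.Ceil(coresNeeded / safeCapacity))
	// Clamp to the configured replica bounds.
	if n < minReplicas {
		n = minReplicas
	}
	if n > maxReplicas {
		n = maxReplicas
	}
	return n
}

func main() {
	// With the sample coefficients beta=0.02, a=0.10,
	// utilization_target=0.75, cores_instance=0.50 at 120 RPS:
	fmt.Println(recommendReplicas(120, 0.02, 0.10, 0.75, 0.50, 1, 10)) // prints 7
}
```

Note how a poorly estimated `beta` feeds directly into `coresNeeded`, which is why noisy coefficients produce noisy recommendations.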
Make sure Go is installed, then fetch dependencies and verify the project builds:
```bash
go mod download
go build .
```

Run it locally with:

```bash
go run .
```

Nexcast loads `.env` automatically if present.
Run it as a service with:
```bash
sudo cp nexcast.service /etc/systemd/system/
sudo mkdir -p /etc/nexcast
sudo cp .env /etc/nexcast/nexcast.env
sudo systemctl daemon-reload
sudo systemctl enable --now nexcast
```

What the autoscaler does:
- Loads runtime configuration and the shared `services.yaml` inventory
- Starts an HTTP API (`/nodeInfo`, `/servicesState`, `/history`) for the dashboard
- Collects local service state
- Posts one observation per service per reconcile cycle to the observation collector (optional)
- Calculates replica recommendations locally from current traffic and service capacity settings
- Applies replica changes through the selected backend
- Persists a rolling history snapshot for the dashboard charts
If a service exposes traffic metrics and includes capacity coefficients in `services.yaml`, Nexcast scrapes its current RPS and converts that demand into replica recommendations locally.
```bash
kubectl apply -f nextcast.yaml
kubectl rollout restart deployment/nextcast -n default
kubectl rollout status deployment/nextcast -n default
kubectl get pods -n default -l app=nextcast -o wide
```

```bash
kubectl apply -f nextcast.yaml
kubectl get deploy,pods -n default -l app=nextcast -o wide
```

Build the sample app image:
```bash
docker build -t example-server:latest ./example/docker
```

Build the example image, then apply the manifests from `example/kubernetes/`:

```bash
docker build -t example-server:latest ./example/docker
kubectl apply -f example/kubernetes/kubernetes.yaml
```

Create a shared service inventory in `services.yaml` on every node:
```yaml
services:
  - name: api
    system_id: 0
    image_name: example-server:latest
    container_prefix: nextcast-api
    port_base: 18080
    metrics_path: /metrics
    min_replicas: 1
    max_replicas: 10
    target_per_node: 65.0
    scale_up_step: 2
    scale_down_step: 1
    beta: 0.02
    utilization_target: 0.75
    a: 0.10
    cores_instance: 0.50
```

Example:
```bash
BACKEND=docker
LISTEN_ADDR=:8081
SERVICES_FILE=services.yaml
CHECK_INTERVAL=20s
COOLDOWN=60s
OBSERVATION_URL=http://localhost:8000/observations
```

Create a Kubernetes inventory in `services.yaml` on every Nexcast peer:
```yaml
services:
  - name: api
    system_id: 0
    namespace: default
    deployment_name: nextcast-example
    min_replicas: 1
    max_replicas: 10
    target_per_node: 65.0
    scale_up_step: 2
    scale_down_step: 1
```

Example:

```bash
BACKEND=kubernetes
LISTEN_ADDR=:8081
SERVICES_FILE=/etc/nexcast/services.yaml
K8S_NAMESPACE=default
METRICS_FALLBACK_POLICY=scale-up-only
CHECK_INTERVAL=20s
COOLDOWN=60s
OBSERVATION_URL=http://predictor.default.svc.cluster.local:8000/observations
```

Metrics behavior:
- If the Metrics API is available, Nexcast computes CPU and memory utilization from pod usage versus pod resource requests
- If metrics are unavailable, Nexcast falls back to replica-count-only mode and, by default, only allows scale-up decisions while holding steady on scale-down recommendations
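The default `scale-up-only` fallback described above could be gated as in this sketch. The function name and signature are hypothetical; only the behavior (increases pass through, decreases hold the current replica count when metrics are missing) comes from the description.

```go
package main

import "fmt"

// applyFallbackPolicy sketches the assumed "scale-up-only" fallback:
// with metrics available the recommendation is applied as-is; without
// metrics, only increases are allowed and decreases hold steady.
func applyFallbackPolicy(current, recommended int, metricsAvailable bool) int {
	if metricsAvailable {
		return recommended
	}
	if recommended > current {
		return recommended // scale-up still permitted
	}
	return current // hold steady instead of scaling down
}

func main() {
	fmt.Println(applyFallbackPolicy(5, 3, false)) // prints 5 (scale-down held)
	fmt.Println(applyFallbackPolicy(5, 8, false)) // prints 8 (scale-up allowed)
}
```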
Training data behavior:
- Nexcast emits one observation per service on every reconcile cycle, even when no scale action is applied
- Observations can be forwarded to any external collector that accepts the JSON payload
The Kubernetes backend uses the in-cluster API by default. Override the connection with these environment variables when needed:
- `K8S_API_SERVER`
- `K8S_BEARER_TOKEN` or `K8S_TOKEN_FILE`
- `K8S_CA_FILE`
- `K8S_INSECURE_SKIP_TLS_VERIFY=true`
Traffic metrics behavior:
- Docker mode scrapes each managed container via its mapped host port and `metrics_path`
- Kubernetes mode scrapes each pod via `podIP:metrics_port` + `metrics_path`
- The built-in example app exposes `GET /metrics` with a rolling `rps` field
- Nexcast uses recent observed `rps` samples to smooth demand before sizing replicas
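The smoothing step could, under the assumption of a simple moving average over the most recent samples, look like this sketch. Nexcast's actual smoothing method is not specified here; a moving average stands in for it.

```go
package main

import "fmt"

// smoothRPS averages the most recent `window` rps samples — a simple
// moving average standing in for whatever smoothing Nexcast applies.
func smoothRPS(samples []float64, window int) float64 {
	if len(samples) == 0 || window <= 0 {
		return 0
	}
	if window > len(samples) {
		window = len(samples) // not enough history yet; use what exists
	}
	sum := 0.0
	for _, v := range samples[len(samples)-window:] {
		sum += v
	}
	return sum / float64(window)
}

func main() {
	// Average of the last two samples (30 and 40).
	fmt.Println(smoothRPS([]float64{10, 20, 30, 40}, 2)) // prints 35
}
```

Smoothing over a short window damps per-scrape jitter so that a single noisy `rps` reading does not trigger a replica change on its own.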
