Merged
13 changes: 13 additions & 0 deletions api/v1alpha1/cluster_types.go
@@ -47,6 +47,15 @@ type ClusterSpec struct {

// Environment variables to set in the gNMIc pods
Env []corev1.EnvVar `json:"env,omitempty"`

// The target distribution configuration
TargetDistribution *TargetDistributionConfig `json:"targetDistribution,omitempty"`
}

type TargetDistributionConfig struct {
// The capacity per pod for distributing targets
// Intended for use with the Horizontal Pod Autoscaler (HPA).
PodCapacity int `json:"podCapacity,omitempty"`
}

type APIConfig struct {
@@ -107,6 +116,9 @@ type ClusterStatus struct {
PipelinesCount int32 `json:"pipelinesCount"`
// The number of targets referenced by the pipelines
TargetsCount int32 `json:"targetsCount"`
// The number of targets that could not be assigned to any pod due to capacity limits.
// Non-zero when the total target count exceeds replicas × podCapacity.
UnassignedTargets int32 `json:"unassignedTargets"`
// The number of subscriptions referenced by the pipelines
SubscriptionsCount int32 `json:"subscriptionsCount"`
// The number of inputs referenced by the pipelines
@@ -124,6 +136,7 @@ type ClusterStatus struct {
// +kubebuilder:printcolumn:name="Ready",type=integer,JSONPath=`.status.readyReplicas`
// +kubebuilder:printcolumn:name="Pipelines",type=integer,JSONPath=`.status.pipelinesCount`
// +kubebuilder:printcolumn:name="Targets",type=integer,JSONPath=`.status.targetsCount`
// +kubebuilder:printcolumn:name="Unassigned",type=integer,JSONPath=`.status.unassignedTargets`
// +kubebuilder:printcolumn:name="Subs",type=integer,JSONPath=`.status.subscriptionsCount`
// +kubebuilder:printcolumn:name="Inputs",type=integer,JSONPath=`.status.inputsCount`
// +kubebuilder:printcolumn:name="Outputs",type=integer,JSONPath=`.status.outputsCount`
20 changes: 20 additions & 0 deletions api/v1alpha1/zz_generated.deepcopy.go


19 changes: 19 additions & 0 deletions config/crd/bases/operator.gnmic.dev_clusters.yaml
@@ -30,6 +30,9 @@ spec:
- jsonPath: .status.targetsCount
name: Targets
type: integer
- jsonPath: .status.unassignedTargets
name: Unassigned
type: integer
- jsonPath: .status.subscriptionsCount
name: Subs
type: integer
@@ -396,6 +399,15 @@ spec:
More info: https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/
type: object
type: object
targetDistribution:
description: The target distribution configuration
properties:
podCapacity:
description: |-
The capacity per pod for distributing targets.
Intended for use with the Horizontal Pod Autoscaler (HPA).
type: integer
type: object
required:
- image
type: object
@@ -486,6 +498,12 @@ spec:
description: The number of targets referenced by the pipelines
format: int32
type: integer
unassignedTargets:
description: |-
The number of targets that could not be assigned to any pod due to capacity limits.
Non-zero when the total target count exceeds replicas × podCapacity.
format: int32
type: integer
required:
- inputsCount
- outputsCount
@@ -494,6 +512,7 @@
- selector
- subscriptionsCount
- targetsCount
- unassignedTargets
type: object
type: object
served: true
198 changes: 141 additions & 57 deletions docs/content/docs/advanced/scaling.md
@@ -28,25 +28,33 @@ spec:

### Scale Up (3 → 5 pods)

1. Kubernetes creates new pods (`gnmic-3`, `gnmic-4`).
2. Operator waits for pods to be ready.
3. Operator recomputes the distribution plan. Existing target assignments are
preserved — only unassigned targets or targets displaced by capacity limits
are placed on the new pods.
4. Configuration is applied to all pods.

### Scale Down (5 → 3 pods)

1. Operator recomputes the distribution plan for the reduced replica count.
Targets from removed pods flow through rendezvous hashing onto surviving
pods, bounded by each pod's capacity.
2. Configuration is applied to remaining pods.
3. Kubernetes terminates pods (`gnmic-4`, `gnmic-3`).

## Target Redistribution

The operator uses **bounded load rendezvous hashing** to distribute targets.
See [Target Distribution](../target-distribution/) for a detailed explanation
of the algorithm.

Key properties:

- **Stable**: Targets stay on their current pod unless forced to move.
- **Even**: No pod exceeds its capacity.
- **Current-assignment aware**: The operator reads each target's current pod
from its status and feeds this as input to the algorithm, minimizing churn.
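The placement loop described above can be sketched in Go. This is a simplified model under stated assumptions, not the operator's actual code: the function names (`assign`, `score`) and the FNV hash choice are illustrative.

```go
package main

import (
	"fmt"
	"hash/fnv"
	"sort"
)

// score ranks a (target, pod) pair for rendezvous (highest-random-weight) hashing.
func score(target, pod string) uint64 {
	h := fnv.New64a()
	h.Write([]byte(target + "/" + pod))
	return h.Sum64()
}

// assign places targets on pods: current assignments are kept when the pod
// still exists and has room; everything else goes to the highest-scoring pod
// with spare capacity. Targets that fit nowhere are left out of the result.
func assign(targets, pods []string, current map[string]string, capacity int) map[string]string {
	out := make(map[string]string, len(targets))
	load := make(map[string]int, len(pods))
	alive := make(map[string]bool, len(pods))
	for _, p := range pods {
		alive[p] = true
	}

	// Pass 1: preserve current assignments (the "stable" property).
	var unplaced []string
	for _, t := range targets {
		if p, ok := current[t]; ok && alive[p] && load[p] < capacity {
			out[t] = p
			load[p]++
		} else {
			unplaced = append(unplaced, t)
		}
	}

	// Pass 2: rendezvous-hash the rest, bounded by per-pod capacity.
	for _, t := range unplaced {
		ranked := append([]string(nil), pods...)
		sort.Slice(ranked, func(i, j int) bool {
			return score(t, ranked[i]) > score(t, ranked[j])
		})
		for _, p := range ranked {
			if load[p] < capacity {
				out[t] = p
				load[p]++
				break
			}
		}
	}
	return out
}

func main() {
	plan := assign(
		[]string{"leaf1", "leaf2", "spine1"},
		[]string{"gnmic-0", "gnmic-1"},
		map[string]string{"leaf1": "gnmic-0"}, // leaf1 stays on its current pod
		2,
	)
	fmt.Println(plan)
}
```

Because pass 1 runs before any hashing, scaling up never moves a target that already fits on its current pod; only unplaced or displaced targets flow through pass 2.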

### Example Distribution

@@ -56,10 +64,10 @@
Pod 0: [target1, target5, target8] (3 targets)
Pod 1: [target2, target4, target9] (3 targets)
Pod 2: [target3, target6, target7, target10] (4 targets)

# After scaling to 4 pods — existing assignments are preserved
Pod 0: [target1, target5, target8] (3 targets) - unchanged
Pod 1: [target2, target4] (2 targets) - target9 moved to new pod
Pod 2: [target3, target7, target10] (3 targets) - target6 moved to new pod
Pod 3: [target6, target9] (2 targets) - new pod
```

@@ -105,56 +113,38 @@ gnmic_target_status{cluster="my-cluster"}

## Horizontal Pod Autoscaler

The operator's Cluster resource supports the `scale` subresource, allowing you
to use the Horizontal Pod Autoscaler (HPA) for automatic scaling.

> HPA scales the **Cluster CR**, not the StatefulSet directly. This ensures the
> operator remains in control of target redistribution and configuration rollout.

### Scaling based on target count (recommended)

Target count is the most accurate scaling signal for gNMIc — CPU/memory don't
reliably reflect the load from long-lived gRPC streaming connections.

gNMIc pods export per-target metrics:

```
gnmic_target_up{name="default/leaf1"} 0
gnmic_target_up{name="default/leaf2"} 0
gnmic_target_up{name="default/spine1"} 1
```

A value of `1` indicates that the target is present; `0` that it is absent.

With [Prometheus Adapter](https://github.com/kubernetes-sigs/prometheus-adapter),
aggregate these into a per-pod metric:

```promql
sum(gnmic_target_up{namespace!="",pod!=""} == 1) by (namespace, pod)
```

> You can assign `namespace` and `pod` labels to metrics using scrape
> configurations or relabeling.

Example Prometheus Adapter rule:

```yaml
apiVersion: v1
Expand All @@ -181,9 +171,8 @@ data:
sum(gnmic_target_up{<<.LabelMatchers>>} == 1) by (namespace, pod)
```

The corresponding HPA resource — scale Cluster `c1` up when the average number
of targets per pod exceeds 75:

```yaml
apiVersion: autoscaling/v2
@@ -204,8 +193,104 @@ spec:
name: gnmic_targets_present
target:
type: AverageValue
averageValue: "75"
```

### Threshold vs Capacity

When using HPA, the Cluster CR's `spec.targetDistribution.podCapacity` acts
as a hard assignment ceiling — the operator never assigns more than
`podCapacity` targets to a single pod. The HPA **averageValue** (the scaling
threshold) should be set **lower** than capacity to create a buffer zone that
gives new pods time to start:

```
0 ─────── HPA threshold ─────── Capacity
(scale trigger) (assignment stops)
```

1. When the average target count crosses the HPA threshold, HPA increases
`.spec.replicas`.
2. While the new pod is starting, existing pods continue receiving targets up
to `capacity`.
3. If all pods reach `capacity` before the new pod is ready, overflow targets
remain unassigned until the next reconciliation. The Cluster status reports
the count via `status.unassignedTargets` and the `CapacityExhausted`
condition.
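The overflow in step 3 is simple arithmetic. A sketch, assuming the computation works directly from `status.targetsCount`, `spec.replicas`, and `spec.targetDistribution.podCapacity` (the function name is illustrative):

```go
package main

import "fmt"

// unassignedTargets models the status computation described above: targets
// beyond the cluster's total assignment capacity remain unassigned.
func unassignedTargets(targetsCount, replicas, podCapacity int32) int32 {
	overflow := targetsCount - replicas*podCapacity
	if overflow < 0 {
		return 0
	}
	return overflow
}

func main() {
	// 100 targets on 3 pods with an assumed podCapacity of 32:
	fmt.Println(unassignedTargets(100, 3, 32)) // → 4
}
```

With these assumed numbers the result matches the `UNASSIGNED` column shown in the Monitoring Capacity section.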

**Sizing guidance** — set the HPA threshold to ~70–80% of capacity:

| Cluster Capacity | HPA averageValue | Headroom per pod |
|---|---|---|
| 50 | 35 | 15 (30%) |
| 100 | 75 | 25 (25%) |
| 200 | 150 | 50 (25%) |

For bursty workloads (e.g., many targets appearing at once via
`TunnelTargetPolicy`), use a wider buffer (lower threshold-to-capacity ratio).
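Applying the sizing guidance is straightforward arithmetic; a sketch (helper names are illustrative, not part of any API):

```go
package main

import "fmt"

// hpaThreshold derives an HPA averageValue from podCapacity and a
// threshold-to-capacity ratio (e.g. 0.75 for the middle of the 70-80% band).
func hpaThreshold(podCapacity int, ratio float64) int {
	return int(float64(podCapacity) * ratio)
}

// minReplicas is the smallest replica count whose total capacity covers
// the expected number of targets (ceiling division).
func minReplicas(totalTargets, podCapacity int) int {
	return (totalTargets + podCapacity - 1) / podCapacity
}

func main() {
	fmt.Println(hpaThreshold(100, 0.75)) // → 75, the table's second row
	fmt.Println(minReplicas(250, 100))   // → 3 pods needed for 250 targets
}
```

Pick `ratio` lower for bursty workloads to widen the buffer between the scale trigger and the assignment ceiling.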

### Monitoring Capacity

When targets exceed the total cluster capacity, the Cluster status makes this
visible:

```bash
kubectl get clusters
```

```
NAME IMAGE REPLICAS READY PIPELINES TARGETS UNASSIGNED SUBS INPUTS OUTPUTS
c1 ... 3 3 2 100 4 5 2 3
```

The `CapacityExhausted` condition provides detail:

```bash
kubectl describe cluster c1
```

```
Conditions:
Type Status Reason Message
CapacityExhausted True InsufficientCapacity 4 targets could not be assigned, all pods at capacity
```

Once HPA scales up and all targets are assigned, the condition clears
automatically.

### Scaling based on CPU/Memory

You can also use resource-based metrics:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
name: gnmic-c1-hpa
spec:
scaleTargetRef:
apiVersion: operator.gnmic.dev/v1alpha1
kind: Cluster
name: c1
minReplicas: 1
maxReplicas: 10
metrics:
- type: Resource
resource:
name: cpu
target:
type: Utilization
averageUtilization: 70
```

> **Note:** You must install the Kubernetes metrics server for CPU/Memory-based HPA:
>
> ```shell
> kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/latest/download/components.yaml
> ```

Target-count-based scaling is recommended over CPU/Memory because gRPC
streaming connections don't always correlate with CPU utilization.

## Considerations

Expand All @@ -222,4 +307,3 @@ gNMIc pods are stateless by design:
- No persistent volumes required
- Configuration comes from operator via REST API
- Targets can move between pods without data loss
