Introducing "Service Scaler", a Kubernetes operator that proactively monitors and controls the HPA object of a corresponding deployment, enabling gradual scaling of workloads based on a time-based configuration.
"Time-based" scaling is controlled by a custom configuration that looks like:
    apiVersion: scaler.udaan.io/v1
    kind: ServiceScaler
    metadata:
      name: dummy-acorn-service
      namespace: prod
    spec:
      hpa:
        maxReplicas: 8
        minReplicas: 4
        targetCPUUtilization: 50
        targetMemoryUtilization: 75
      timeRangeSpec:
        - kind: ZonedTime
          from: 16:00+05:30
          to: 00:00+05:30
          replicaSpec:
            hpa:
              minReplicas: 3
              targetMemoryUtilization: 0
        - kind: ZonedTime
          from: 00:00+05:30
          to: 08:00+05:30
          replicaSpec:
            hpa:
              minReplicas: 2
              targetMemoryUtilization: 0
- What does the above configuration mean?
  - Between 16:00 IST and 00:00 IST, minReplicas is overridden to 3 and targetMemoryUtilization is removed.
  - Between 00:00 IST and 08:00 IST, minReplicas is overridden to 2 and targetMemoryUtilization is removed.
  - Defaults under hpa: are applied if no time range matches.
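The override behaviour described above can be sketched in a few lines of Rust. This is only an illustration under assumptions: `HpaSpec` and `effective_hpa` are hypothetical names, not the operator's actual types, and the sketch assumes that any field missing from an override falls back to the default and that a target of 0 drops that metric.

```rust
/// HPA parameters as written under `hpa:`; `None` means "not specified".
/// Hypothetical type for illustration only.
#[derive(Debug, Clone, Default, PartialEq)]
struct HpaSpec {
    min_replicas: Option<u32>,
    max_replicas: Option<u32>,
    target_cpu_utilization: Option<u32>,
    target_memory_utilization: Option<u32>,
}

/// Overlay `overrides` (from a matching time range, if any) on `defaults`.
/// A utilization target of 0 is treated as "remove this metric".
fn effective_hpa(defaults: &HpaSpec, overrides: Option<&HpaSpec>) -> HpaSpec {
    let none = HpaSpec::default();
    let o = overrides.unwrap_or(&none);
    // Prefer the override value, otherwise fall back to the default.
    let pick = |over: Option<u32>, def: Option<u32>| over.or(def);
    // Turn a target of 0 into "no target".
    let drop_zero = |v: Option<u32>| v.filter(|&n| n != 0);
    HpaSpec {
        min_replicas: pick(o.min_replicas, defaults.min_replicas),
        max_replicas: pick(o.max_replicas, defaults.max_replicas),
        target_cpu_utilization: drop_zero(pick(
            o.target_cpu_utilization,
            defaults.target_cpu_utilization,
        )),
        target_memory_utilization: drop_zero(pick(
            o.target_memory_utilization,
            defaults.target_memory_utilization,
        )),
    }
}

fn main() {
    // Defaults and the 16:00 to 00:00 override from the configuration above.
    let defaults = HpaSpec {
        min_replicas: Some(4),
        max_replicas: Some(8),
        target_cpu_utilization: Some(50),
        target_memory_utilization: Some(75),
    };
    let evening = HpaSpec {
        min_replicas: Some(3),
        target_memory_utilization: Some(0),
        ..HpaSpec::default()
    };
    println!("{:?}", effective_hpa(&defaults, Some(&evening)));
}
```

With the evening override, the sketch keeps maxReplicas and the CPU target from the defaults, lowers minReplicas to 3, and drops the memory target.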
- hpa parameters
  - minReplicas
  - maxReplicas
  - targetCPUUtilization (0 means CPU-based scaling is removed)
  - targetMemoryUtilization (0 means memory-based scaling is removed)
- Defaults go under the hpa: section.
- Overrides go under timeRangeSpec:. Specify overrides for any of the above parameters; they are applied during the specified time range.
- Time range controls for from: and to:
  - ZonedTime: HH:MM<tz-offset> Ex: 08:00+05:30
  - ZonedDateTime: RFC 3339 format Ex: 2023-01-11T08:00:00+05:30
- Defaults are applied when no time range matches.
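As a rough illustration of how a ZonedTime range could be matched, the sketch below parses `HH:MM<tz-offset>` into minutes past midnight UTC and checks whether a given UTC minute falls inside a range, including ranges that cross midnight. The function names are hypothetical, and the half-open interval (`from` inclusive, `to` exclusive) is an assumption rather than documented operator behaviour.

```rust
/// Parse "HH:MM" with exactly two digits per field into minutes past midnight.
fn parse_hhmm(s: &str) -> Option<i32> {
    let (h, m) = s.split_once(':')?;
    let two_digits = |p: &str| p.len() == 2 && p.bytes().all(|b| b.is_ascii_digit());
    if !two_digits(h) || !two_digits(m) {
        return None;
    }
    let (h, m): (i32, i32) = (h.parse().ok()?, m.parse().ok()?);
    if h < 24 && m < 60 {
        Some(h * 60 + m)
    } else {
        None
    }
}

/// Parse `HH:MM<tz-offset>` (e.g. "08:00+05:30") into minutes past midnight UTC.
fn parse_zoned_time(s: &str) -> Option<u32> {
    // Layout: "HH:MM" (5 bytes), sign (1 byte), "HH:MM" (5 bytes).
    if s.len() != 11 || !s.is_ascii() {
        return None;
    }
    let local = parse_hhmm(&s[0..5])?;
    let sign = match &s[5..6] {
        "+" => 1,
        "-" => -1,
        _ => return None,
    };
    let offset = parse_hhmm(&s[6..11])?;
    // UTC = local time minus the offset, wrapped into one day.
    Some((local - sign * offset).rem_euclid(1440) as u32)
}

/// True if `now` lies in [from, to), where the range may wrap past midnight.
fn in_range(from: u32, to: u32, now: u32) -> bool {
    if from <= to {
        from <= now && now < to
    } else {
        now >= from || now < to
    }
}

fn main() {
    let from = parse_zoned_time("16:00+05:30").expect("valid ZonedTime");
    let to = parse_zoned_time("00:00+05:30").expect("valid ZonedTime");
    // 17:00 IST is 11:30 UTC, i.e. minute 690 of the UTC day.
    println!("match at 17:00 IST: {}", in_range(from, to, 690));
}
```

Normalising both ends to UTC keeps the comparison independent of the offset used in the configuration; a real implementation would more likely rely on a date-time crate than hand-rolled parsing.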
For those rare instances when things might not go as planned, a kill switch has been crafted. By adding a simple annotation to the HPA, the Service Scaler can be bypassed, putting control back in the hands of the user.
    apiVersion: autoscaling/v1
    kind: HorizontalPodAutoscaler
    metadata:
      annotations:
        service-scaler.kubernetes.io/managed: "false" # <-- THIS LINE
      name: dummy-acorn-service
      namespace: prod
    spec:
      maxReplicas: 8
      minReplicas: 4
      scaleTargetRef:
        apiVersion: apps/v2beta2
        kind: Deployment
        name: dummy-acorn-service
      targetCPUUtilizationPercentage: 50
Once the above annotation is added, time-based scaling is disabled for dummy-acorn-service, and users are expected to set the HPA parameters of their choice manually.
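A minimal sketch of the kill-switch check, assuming the operator reads the annotation map of the HPA (k8s-openapi exposes annotations as an optional `BTreeMap<String, String>`). The helper `is_managed` is a hypothetical name, and the sketch assumes that only the exact value "false" disables management; the operator's real handling of other values may differ.

```rust
use std::collections::BTreeMap;

/// Annotation key taken from the HPA example above.
const MANAGED_ANNOTATION: &str = "service-scaler.kubernetes.io/managed";

/// Returns false only when the annotation is present and set to "false".
/// A missing annotation, or any other value, leaves the HPA managed (assumption).
fn is_managed(annotations: Option<&BTreeMap<String, String>>) -> bool {
    annotations
        .and_then(|a| a.get(MANAGED_ANNOTATION))
        .map_or(true, |v| v.as_str() != "false")
}

fn main() {
    let mut annotations = BTreeMap::new();
    annotations.insert(MANAGED_ANNOTATION.to_string(), "false".to_string());
    // With the kill switch set, the operator would skip this HPA.
    println!("managed: {}", is_managed(Some(&annotations)));
}
```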
The status block of the service scaler object shows the following:
- What was the last active configuration of the scaler object?
- When was the scaler object last updated?
- Is there a time range spec match? (considering the current timestamp)
    status:
      lastKnownConfig:
        maxReplicas: 8
        minReplicas: 4
        targetCPUUtilization: 50
        targetMemoryUtilization: 75
      lastObservedGeneration: 1
      lastUpdatedTime: 2024-01-19T11:40Z+0530
      timeRangeMatch: false
- Have a Kubernetes cluster up and running.
- Install the CRD: kubectl --context=<context> create -f servicescaler.scaler.udaan.io.yaml
- Ensure that RBAC is set up (refer to the rbac template).
- Build using: cargo build
- Run using: RUST_LOG=info cargo run
- To watch only a subset of HPAs, set the LABEL_SELECTOR environment variable.
After installing the CRD and running the operator, to see the service scaler in action, let's create a sample deployment called dummy-bee-service with a service scaler object with the following specification:
- default: 3 replicas
- 16:00 - 00:00: 2 replicas
- 00:00 - 08:00: 1 replica

Apply the example and check whether the replicas of dummy-bee-service follow the overrides:

    kubectl --context=<context> apply -f example.yaml
- Do not specify “overlapping” time ranges as this will result in undefined behaviour.
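
Since overlapping ranges lead to undefined behaviour, it can help to validate a configuration before applying it. The following is a sketch of one way to do that, not something the operator is known to provide: ranges are given as (from, to) minutes past midnight in a single time zone, may wrap past midnight, and are treated as half-open. All names are hypothetical.

```rust
const MINUTES_PER_DAY: u32 = 1440;

/// Split a possibly wrapping range into non-wrapping [start, end) segments.
fn segments(from: u32, to: u32) -> Vec<(u32, u32)> {
    if from < to {
        vec![(from, to)]
    } else if from > to {
        // Wraps past midnight: the tail of one day plus the head of the next.
        vec![(from, MINUTES_PER_DAY), (0, to)]
    } else {
        // from == to is treated as an empty range in this sketch.
        vec![]
    }
}

/// True if the two ranges share at least one minute.
fn overlaps(a: (u32, u32), b: (u32, u32)) -> bool {
    segments(a.0, a.1).iter().any(|&(s1, e1)| {
        segments(b.0, b.1)
            .iter()
            .any(|&(s2, e2)| s1 < e2 && s2 < e1)
    })
}

/// Indices of the first overlapping pair of ranges, if any.
fn first_overlap(ranges: &[(u32, u32)]) -> Option<(usize, usize)> {
    for i in 0..ranges.len() {
        for j in (i + 1)..ranges.len() {
            if overlaps(ranges[i], ranges[j]) {
                return Some((i, j));
            }
        }
    }
    None
}

fn main() {
    // 16:00 to 00:00 and 00:00 to 08:00, as in the configuration above.
    let ok = [(960, 0), (0, 480)];
    // Adding 07:00 to 09:00 collides with the second range.
    let bad = [(960, 0), (0, 480), (420, 540)];
    println!("{:?} {:?}", first_overlap(&ok), first_overlap(&bad));
}
```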
- Refer to the architecture diagram to understand the mechanics of the operator.
- Battle-tested on Kubernetes 1.16 and 1.22.
- For newer Kubernetes clusters (Ex: 1.30):
  - pin the following versions for kube and k8s-openapi:

        kube = { version = "0.93.1", default-features = true, features = ["derive", "runtime", "config"] }
        k8s-openapi = { version = "0.22.0", features = ["latest"] }

  - migrate from autoscaling/v2beta2 to autoscaling/v2
  - migrate from apps/v2beta2 to apps/v1
- Build the docker image.
- Push the image to a container registry.
- Set up a service account and the corresponding rolebinding objects with the required permissions.
- Create a deployment object with the pushed image.
- Helmify the operator for easier deployment.
- Capability to "hibernate" services.