Service Scaler

Introducing "Service Scaler", a Kubernetes operator that proactively monitors and controls the HPA object of a corresponding deployment, enabling gradual scaling of workloads based on a time-based configuration.

The Configuration (CRD)

“Time-based” scaling is controlled by a custom configuration which looks like:

    apiVersion: scaler.udaan.io/v1
    kind: ServiceScaler
    metadata:
      name: dummy-acorn-service
      namespace: prod
    spec:
      hpa:
        maxReplicas: 8
        minReplicas: 4
        targetCPUUtilization: 50
        targetMemoryUtilization: 75
      timeRangeSpec:
      - kind: ZonedTime
        from: 16:00+05:30
        to: 00:00+05:30
        replicaSpec:
          hpa:
            minReplicas: 3
            targetMemoryUtilization: 0
      - kind: ZonedTime
        from: 00:00+05:30
        to: 08:00+05:30
        replicaSpec:
          hpa:
            minReplicas: 2
            targetMemoryUtilization: 0

What does the above configuration mean?
- Between 16:00 IST and 00:00 IST, minReplicas is overridden to 3 and targetMemoryUtilization is removed (it is set to 0).
- Between 00:00 IST and 08:00 IST, minReplicas is overridden to 2 and targetMemoryUtilization is removed (it is set to 0).
- The defaults under hpa: are applied if no time range matches.
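
The override semantics described above can be sketched in a few lines of Rust. This is a minimal illustration, not the operator's actual code: the type and function names (HpaConfig, HpaOverride, apply_override) are hypothetical, and the only behaviour taken from this README is that absent fields fall back to the defaults and that a utilization target of 0 removes the metric.

```rust
/// Effective HPA knobs; `None` means "no target for this metric".
#[derive(Debug, Clone, PartialEq)]
struct HpaConfig {
    min_replicas: u32,
    max_replicas: u32,
    target_cpu_utilization: Option<u32>,
    target_memory_utilization: Option<u32>,
}

/// A time-range override (hypothetical type): absent fields keep the default.
#[derive(Debug, Clone, Default)]
struct HpaOverride {
    min_replicas: Option<u32>,
    max_replicas: Option<u32>,
    target_cpu_utilization: Option<u32>,
    target_memory_utilization: Option<u32>,
}

/// Applies an override on top of the defaults.
fn apply_override(defaults: &HpaConfig, ov: &HpaOverride) -> HpaConfig {
    // 0 is the "remove this metric" sentinel; any other value replaces the default.
    fn merge_target(default: Option<u32>, ov: Option<u32>) -> Option<u32> {
        match ov {
            Some(0) => None,
            Some(v) => Some(v),
            None => default,
        }
    }
    HpaConfig {
        min_replicas: ov.min_replicas.unwrap_or(defaults.min_replicas),
        max_replicas: ov.max_replicas.unwrap_or(defaults.max_replicas),
        target_cpu_utilization: merge_target(
            defaults.target_cpu_utilization,
            ov.target_cpu_utilization,
        ),
        target_memory_utilization: merge_target(
            defaults.target_memory_utilization,
            ov.target_memory_utilization,
        ),
    }
}
```

Applied to the configuration above, the 16:00 to 00:00 override leaves maxReplicas at 8 and the CPU target at 50, lowers minReplicas to 3, and drops the memory target.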

The Control knobs

hpa parameters
- minReplicas
- maxReplicas
- targetCPUUtilization (0 removes CPU-based scaling)
- targetMemoryUtilization (0 removes memory-based scaling)

Defaults go under the hpa: section.

Overrides go under timeRangeSpec:. Specify any of the above parameters there; the overrides are applied during the specified time range.

Time range controls for from: and to:
- ZonedTime: HH:MM<tz-offset>, e.g. 08:00+05:30
- ZonedDateTime: RFC 3339 format, e.g. 2023-01-11T08:00:00+05:30

Defaults are applied when no time range matches.
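
To make the ZonedTime format concrete, here is a small std-only sketch that parses HH:MM<tz-offset> into minutes since midnight UTC and checks whether a given minute falls inside a range. The function names are hypothetical, and two details are assumptions rather than documented behaviour of the operator: ranges are treated as half-open (from inclusive, to exclusive), and a range whose to is not later than its from is taken to wrap past midnight. ZonedDateTime values are not covered.

```rust
/// Parses a ZonedTime such as "16:00+05:30" into minutes since midnight UTC.
/// Returns None when the input does not follow HH:MM<+|->HH:MM.
fn parse_zoned_time(s: &str) -> Option<u32> {
    // Fixed-width layout: "HH:MM" (5 bytes), a sign, then "HH:MM".
    if s.len() != 11 || !s.is_ascii() {
        return None;
    }
    // Converts a strict two-digit "HH:MM" into minutes.
    let hhmm = |part: &str| -> Option<i32> {
        let (h, m) = part.split_once(':')?;
        let all_digits = h.bytes().chain(m.bytes()).all(|b| b.is_ascii_digit());
        if h.len() != 2 || m.len() != 2 || !all_digits {
            return None;
        }
        let (h, m) = (h.parse::<i32>().ok()?, m.parse::<i32>().ok()?);
        if h < 24 && m < 60 { Some(h * 60 + m) } else { None }
    };
    let local = hhmm(&s[0..5])?;
    let offset = hhmm(&s[6..11])?;
    let signed_offset = match &s[5..6] {
        "+" => offset,
        "-" => -offset,
        _ => return None,
    };
    // Local time minus the offset is UTC; wrap the result into 0..1440.
    Some((local - signed_offset).rem_euclid(24 * 60) as u32)
}

/// True when `now` lies in [from, to); a range with to <= from wraps past midnight.
/// All three arguments are minutes since midnight in the same time zone.
fn in_range(from: u32, to: u32, now: u32) -> bool {
    if from < to {
        from <= now && now < to
    } else {
        now >= from || now < to
    }
}
```

For example, 16:00+05:30 parses to minute 630 (10:30 UTC) and 00:00+05:30 to minute 1110 (18:30 UTC), so a reconcile at 11:40 UTC (minute 700) would fall inside that range.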

The Kill Switch

For those rare instances when things might not go as planned, a kill switch has been crafted. By adding a simple annotation to the HPA, the Service Scaler can be bypassed, putting control back in the hands of the user.

    apiVersion: autoscaling/v1
    kind: HorizontalPodAutoscaler
    metadata:
      annotations:
        service-scaler.kubernetes.io/managed: "false" # <-- THIS LINE
      name: dummy-acorn-service
      namespace: prod
    spec:
      maxReplicas: 8
      minReplicas: 4
      scaleTargetRef:
        apiVersion: apps/v2beta2
        kind: Deployment
        name: dummy-acorn-service
      targetCPUUtilizationPercentage: 50

Once the above annotation is added, time-based scaling is disabled for dummy-acorn-service, and users are expected to set the HPA parameters of their choice manually.
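
A kill-switch check of this kind can be sketched as follows. The annotation key comes from the example above; the function name is hypothetical, and treating a missing annotation (or any value other than "false") as "managed" is an assumption, not something this README states.

```rust
use std::collections::BTreeMap;

/// Annotation key shown in the HPA example above.
const MANAGED_ANNOTATION: &str = "service-scaler.kubernetes.io/managed";

/// Returns false only when the annotation is present with the value "false".
/// Assumption: HPAs without the annotation stay under the operator's control.
fn is_managed(annotations: Option<&BTreeMap<String, String>>) -> bool {
    annotations
        .and_then(|a| a.get(MANAGED_ANNOTATION))
        .map_or(true, |v| v.as_str() != "false")
}
```

A reconcile loop built this way would simply skip any HPA for which is_managed returns false, leaving its spec untouched.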

The “status” sub resource

The status block of the service scaler object answers the following:

- What was the last active configuration of the scaler object?
- When was the scaler object last updated?
- Is there a time range spec match (considering the current timestamp)?

    status:
      lastKnownConfig:
        maxReplicas: 8
        minReplicas: 4
        targetCPUUtilization: 50
        targetMemoryUtilization: 75
      lastObservedGeneration: 1
      lastUpdatedTime: 2024-01-19T11:40Z+0530
      timeRangeMatch: false

Installation

- Have a Kubernetes cluster up and running.
- Install the CRD:

    kubectl --context=<context> create -f servicescaler.scaler.udaan.io.yaml

- Ensure that RBAC is set up (refer to the rbac template).
- Build using cargo build.
- Run using RUST_LOG=info cargo run.
- To watch only a subset of HPAs, set the LABEL_SELECTOR environment variable.
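
As a rough illustration of what a LABEL_SELECTOR filter means, the sketch below reads the variable and evaluates a simple equality-only selector against a set of labels. This is an assumption-laden sketch: the function names and the example labels in the usage note are hypothetical, only key=value terms joined by commas are handled, and a real operator would more likely hand the selector string to the Kubernetes API and let the server do the filtering.

```rust
use std::collections::BTreeMap;
use std::env;

/// Reads LABEL_SELECTOR; an unset or blank variable is taken to mean "watch every HPA".
fn label_selector_from_env() -> Option<String> {
    env::var("LABEL_SELECTOR")
        .ok()
        .filter(|s| !s.trim().is_empty())
}

/// Checks labels against an equality-only selector such as "team=payments,tier=backend".
/// Every term must match; an empty selector matches everything.
fn matches_selector(selector: &str, labels: &BTreeMap<String, String>) -> bool {
    selector
        .split(',')
        .map(str::trim)
        .filter(|term| !term.is_empty())
        .all(|term| match term.split_once('=') {
            Some((k, v)) => labels.get(k.trim()).map(String::as_str) == Some(v.trim()),
            // Set-based or malformed terms are out of scope for this sketch.
            None => false,
        })
}
```

With labels team=payments and tier=backend on an HPA, the selector "team=payments" matches while "team=search" does not.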

Example

After installing the CRD and running the operator, let's see the service scaler in action. Create a sample deployment called dummy-bee-service with a service scaler object with the following specification:

- default: 3 replicas
- 16:00 - 00:00: 2 replicas
- 00:00 - 08:00: 1 replica

Apply the example and check whether the replicas of dummy-bee-service follow the overrides.

    kubectl --context=<context> apply -f example.yaml

Points to note

- Do not specify overlapping time ranges, as this results in undefined behaviour.
- Refer to the architecture diagram to understand the mechanics of the operator.
- Battle-tested on Kubernetes 1.16 and 1.22.
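
Because overlapping ranges are undefined behaviour, it can help to check a configuration before applying it. The following is a hypothetical pre-flight check, not part of the operator: it assumes each ZonedTime range has already been converted to (from, to) minutes since midnight in one common time zone, treats ranges as half-open, and treats a range with to not later than from as wrapping past midnight. ZonedDateTime ranges are not covered.

```rust
/// Splits a possibly midnight-wrapping range into linear [start, end) segments.
fn segments(from: u32, to: u32) -> Vec<(u32, u32)> {
    const DAY: u32 = 24 * 60;
    if from < to {
        vec![(from, to)]
    } else {
        // to <= from wraps past midnight, e.g. 22:00 -> 02:00.
        vec![(from, DAY), (0, to)]
    }
}

/// Returns the indices of the first pair of ranges that overlap, if any.
fn first_overlap(ranges: &[(u32, u32)]) -> Option<(usize, usize)> {
    for i in 0..ranges.len() {
        for j in (i + 1)..ranges.len() {
            let a = segments(ranges[i].0, ranges[i].1);
            let b = segments(ranges[j].0, ranges[j].1);
            // Two half-open segments overlap when each starts before the other ends.
            let clash = a
                .iter()
                .any(|&(s1, e1)| b.iter().any(|&(s2, e2)| s1 < e2 && s2 < e1));
            if clash {
                return Some((i, j));
            }
        }
    }
    None
}
```

The two ranges from the configuration section, 16:00 to 00:00 and 00:00 to 08:00, pass this check because they only touch at midnight; adding a third range of 07:00 to 10:00 would be reported as clashing with the second.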

For newer Kubernetes clusters (e.g. 1.30)

- Pin the following versions for kube and k8s-openapi:

    kube = { version = "0.93.1", default-features = true, features = ["derive", "runtime", "config"]}
    k8s-openapi = { version = "0.22.0", features = ["latest"]}

- Migrate from autoscaling/v2beta2 to autoscaling/v2.
- Migrate from apps/v2beta2 to apps/v1.

Deployment Strategy (k8s)

- Build the docker image.
- Push the image to a container registry.
- Set up a service account with the corresponding rolebinding objects and the required permissions.
- Create a deployment object with the pushed image.

Future Work

- Helmify the operator for easier deployment.
- Capability to "hibernate" services.
