# Autoscaling Seldon Deployments


## Prerequisites
 
- The cluster should have `metrics-server` running in the `kube-system` namespace
- Seldon Core installed with Istio


In [1]:
!kubectl create namespace seldon

Error from server (AlreadyExists): namespaces "seldon" already exists


In [2]:
!kubectl config set-context $(kubectl config current-context) --namespace=seldon

Context "kind-ansible" modified.


## Create model with autoscaler

To create a model with a HorizontalPodAutoscaler (HPA) there are two steps:


  1. If you scale on a standard metric such as CPU or memory, ensure the container has a resource request for that metric, e.g.:
  
```
          resources:
            requests:
              cpu: '0.5'
     
```
     
  1. Add an HPA Spec referring to this Deployment, e.g.:
  
```
       - hpaSpec:
           maxReplicas: 3
           metrics:
           - resource:
               name: cpu
               targetAverageUtilization: 10
             type: Resource
           minReplicas: 1
```
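
To see why a 10% CPU target scales this model quickly, the calculation can be sketched as below. It uses the replica formula from the Kubernetes HPA documentation, `ceil(currentReplicas * currentUtilization / targetUtilization)`, clamped to `minReplicas` and `maxReplicas`. This is a simplified illustration, not the controller's implementation: it ignores the tolerance band, pod readiness, and stabilization windows. The function names and the 295m usage figure are hypothetical; the 500m request and the 59% utilization correspond to values shown elsewhere in this notebook.

```python
import math


def cpu_utilization_percent(usage_millicores: float, request_millicores: float) -> float:
    """CPU utilization as a percentage of the pod's CPU *request*."""
    return 100.0 * usage_millicores / request_millicores


def desired_replicas(
    current_replicas: int,
    current_utilization: float,
    target_utilization: float,
    min_replicas: int,
    max_replicas: int,
) -> int:
    """Simplified HPA replica calculation, clamped to the configured bounds."""
    raw = math.ceil(current_replicas * current_utilization / target_utilization)
    return max(min_replicas, min(max_replicas, raw))


# Hypothetical usage of 295m against the 500m request (cpu: '0.5').
print(cpu_utilization_percent(295, 500))  # 59.0
# One replica at 59% against a 10% target wants 6 replicas, capped at maxReplicas=3.
print(desired_replicas(1, 59, 10, min_replicas=1, max_replicas=3))  # 3
```

Because utilization is measured against the request, a missing CPU request leaves the HPA with nothing to compute a percentage from, which is why step 1 is required.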

The full SeldonDeployment spec is shown below.

In [5]:
!pygmentize model_with_hpa.yaml

apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: seldon-model
spec:
  name: test-deployment
  predictors:
  - componentSpecs:
    - hpaSpec:
        maxReplicas: 3
        metrics:
        - resource:
            name: cpu
            target:
              type: AverageValue
              averageUtilization: 10
          type: Resource
        minReplicas: 1
      spec:
        containers:
        - image: seldonio/mock_classifier:1.5.0-dev
          imagePullPolicy: IfNotPresent
          name: classifier
          resources:
            requests

In [6]:
!kubectl create -f model_with_hpa.yaml

seldondeployment.machinelearning.seldon.io/seldon-model created


In [7]:
!kubectl rollout status deploy/$(kubectl get deploy -l seldon-deployment-id=seldon-model -o jsonpath='{.items[0].metadata.name}')

Waiting for deployment "seldon-model-example-0-classifier" rollout to finish: 0 of 1 updated replicas are available...
deployment "seldon-model-example-0-classifier" successfully rolled out


## Create Load

We label nodes with `role=locust` for the load tester. On Kind, the first node listed is the control-plane (master) node, so we attempt to label the first two nodes.

In [9]:
!kubectl label nodes $(kubectl get nodes -o jsonpath='{.items[0].metadata.name}') role=locust

error: 'role' already has a value (locust), and --overwrite is false


In [10]:
!helm install loadtester ../../../helm-charts/seldon-core-loadtesting  \
    --set locust.host=http://seldon-model-example:8000 \
    --set oauth.enabled=false \
    --set locust.hatchRate=1 \
    --set locust.clients=1 \
    --set loadtest.sendFeedback=0 \
    --set locust.minWait=0 \
    --set locust.maxWait=0 \
    --set replicaCount=1

NAME: loadtester
LAST DEPLOYED: Sun Jun 26 15:04:22 2022
NAMESPACE: seldon
STATUS: deployed
REVISION: 1
TEST SUITE: None


After a few minutes you should see the deployment `seldon-model-example-0-classifier` scaled to 3 replicas.

In [11]:
import json
import time


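def getNumberPods():
    # Read the Deployment as JSON and return its current replica count.
    dp = !kubectl get deployment seldon-model-example-0-classifier -o json
    dp = json.loads("".join(dp))
    return dp["status"]["replicas"]


# Poll every 5 seconds, for up to 5 minutes, until the HPA has scaled up.
scaled = False
for i in range(60):
    pods = getNumberPods()
    print(pods)
    if pods > 1:
        scaled = True
        break
    time.sleep(5)
assert scaled, "deployment did not scale beyond 1 replica"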

1
1
1
1
1
1
1
1
1
1
1
1
3
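
The cell above relies on IPython's `!` syntax, so it only runs inside a notebook. As a rough sketch, the same check could be written for a plain Python script using `subprocess`. The helper names `get_replicas` and `wait_for_replicas` are hypothetical, and the sketch assumes `kubectl` is on the `PATH` and pointed at the `seldon` namespace.

```python
import json
import subprocess
import time
from typing import Callable


def get_replicas(deployment: str) -> int:
    """Return .status.replicas for a Deployment via kubectl (needs cluster access)."""
    result = subprocess.run(
        ["kubectl", "get", "deployment", deployment, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)["status"].get("replicas", 0)


def wait_for_replicas(
    read_replicas: Callable[[], int],
    more_than: int = 1,
    attempts: int = 60,
    delay: float = 5.0,
) -> bool:
    """Poll read_replicas() until it exceeds `more_than` or attempts run out."""
    for _ in range(attempts):
        if read_replicas() > more_than:
            return True
        time.sleep(delay)
    return False


# Example (requires a running cluster):
# scaled = wait_for_replicas(lambda: get_replicas("seldon-model-example-0-classifier"))
# assert scaled
```

Passing the reader in as a callable keeps the polling loop independent of `kubectl`, so it can be exercised with a stub before running against a cluster.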


In [12]:
!kubectl get pods,deployments,hpa

NAME                                                     READY   STATUS    RESTARTS   AGE
pod/locust-master-1-qlrrq                                1/1     Running   0          86s
pod/locust-slave-1-7lqhj                                 1/1     Running   0          86s
pod/seldon-model-example-0-classifier-76949b4669-5ts2h   0/2     Pending   0          20s
pod/seldon-model-example-0-classifier-76949b4669-bmbx2   0/2     Running   0          20s
pod/seldon-model-example-0-classifier-76949b4669-sgmzs   2/2     Running   0          3m5s

NAME                                                READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/seldon-model-example-0-classifier   1/3     3            1           3m5s

NAME                                                                    REFERENCE                                      TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
horizontalpodautoscaler.autoscaling/seldon-model-example-0-classifier   Deployment/seldon-model-example

## Remove Load

After 5-10 minutes you should see the deployment's replicas decrease to 1.

In [13]:
!helm delete loadtester -n seldon

release "loadtester" uninstalled


In [14]:
!kubectl get pods,deployments,hpa

NAME                                                     READY   STATUS    RESTARTS   AGE
pod/seldon-model-example-0-classifier-76949b4669-5ts2h   0/2     Pending   0          32s
pod/seldon-model-example-0-classifier-76949b4669-bmbx2   2/2     Running   0          32s
pod/seldon-model-example-0-classifier-76949b4669-sgmzs   2/2     Running   0          3m17s

NAME                                                READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/seldon-model-example-0-classifier   2/3     3            2           3m17s

NAME                                                                    REFERENCE                                      TARGETS   MINPODS   MAXPODS   REPLICAS   AGE
horizontalpodautoscaler.autoscaling/seldon-model-example-0-classifier   Deployment/seldon-model-example-0-classifier   59%/10%   1         3         3          3m17s
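
The output above still reports three replicas because it was captured seconds after the load was removed. Besides the lag in reported metrics, the HPA applies a scale-down stabilization window (5 minutes by default in upstream Kubernetes): it acts on the highest replica recommendation seen within that window. The sketch below is a simplified model of that behaviour, not the controller's code; the function name and the sample timeline are hypothetical.

```python
from collections import deque
from typing import List, Tuple


def stabilized_replicas(
    recommendations: List[Tuple[int, int]], window_seconds: int = 300
) -> List[int]:
    """Apply a scale-down stabilization window.

    `recommendations` is a time-ordered list of (timestamp_seconds, desired_replicas).
    At each step the applied value is the maximum recommendation still inside the window.
    """
    window = deque()
    applied = []
    for timestamp, desired in recommendations:
        window.append((timestamp, desired))
        # Drop recommendations that have aged out of the window.
        while window[0][0] <= timestamp - window_seconds:
            window.popleft()
        applied.append(max(replicas for _, replicas in window))
    return applied


# Hypothetical timeline: load stops just after t=0, so every later recommendation is 1.
timeline = [(0, 3), (60, 1), (120, 1), (180, 1), (240, 1), (300, 1), (360, 1)]
print(stabilized_replicas(timeline))  # [3, 3, 3, 3, 3, 1, 1]
```

In this model the replica count only drops once the last high recommendation is 5 minutes old, which is consistent with the 5-10 minute wait mentioned in this section.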


In [15]:
!kubectl delete -f model_with_hpa.yaml

seldondeployment.machinelearning.seldon.io "seldon-model" deleted
