# Autoscaling Seldon Deployments


## Prerequisites
 
- The cluster should have `heapster` and `metrics-server` running in the `kube-system` namespace
- For Kind, install `../../testing/scripts/metrics.yaml` (see https://github.com/kubernetes-sigs/kind/issues/398)
- For Minikube run:
    
    ```
    minikube addons enable metrics-server
    minikube addons enable heapster
    ```
    

## Setup Seldon Core

Use the setup notebook to [Setup Cluster](https://docs.seldon.io/projects/seldon-core/en/latest/examples/seldon_core_setup.html#Setup-Cluster) with [Ambassador Ingress](https://docs.seldon.io/projects/seldon-core/en/latest/examples/seldon_core_setup.html#Ambassador) and [Install Seldon Core](https://docs.seldon.io/projects/seldon-core/en/latest/examples/seldon_core_setup.html#Install-Seldon-Core). Instructions [also online](https://docs.seldon.io/projects/seldon-core/en/latest/examples/seldon_core_setup.html).

In [1]:
!kubectl create namespace seldon

Error from server (AlreadyExists): namespaces "seldon" already exists


In [2]:
!kubectl config set-context $(kubectl config current-context) --namespace=seldon

Context "gke_airy-berm-306315_us-central1-c_ambassador-1" modified.


## Create model with autoscaler

To create a model with a HorizontalPodAutoscaler (HPA) there are two steps:


  1. Ensure you have a resource request for the metric you want to scale on if it is a standard metric such as CPU or memory, e.g.:
  
```
          resources:
            requests:
              cpu: '0.5'
```
     
  2. Add an `hpaSpec` to the same `componentSpecs` entry as the pod spec it should scale, e.g.:
  
```
    - hpaSpec:
        maxReplicas: 3
        metrics:
        - resource:
            name: cpu
            targetAverageUtilization: 10
          type: Resource
        minReplicas: 1

```
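To make the interaction between the CPU request and `targetAverageUtilization` concrete, here is a minimal sketch of the replica calculation described in the Kubernetes HPA documentation: utilization is measured as a percentage of the request, and the desired replica count is `ceil(currentReplicas * currentUtilization / targetUtilization)`, clamped to `minReplicas` and `maxReplicas`. The function names are hypothetical, the 0.1 tolerance is assumed to be the cluster default, and the real controller also handles details (unready pods, missing metrics) that are left out here.

```python
import math


def parse_cpu(quantity: str) -> float:
    """Convert a Kubernetes CPU quantity such as '0.5' or '500m' to cores."""
    quantity = quantity.strip()
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000
    return float(quantity)


def cpu_utilization(usage_cores: float, request: str) -> float:
    """Average CPU usage expressed as a percentage of the CPU request."""
    return 100 * usage_cores / parse_cpu(request)


def desired_replicas(current, utilization, target, min_replicas, max_replicas,
                     tolerance=0.1):
    """Simplified HPA rule: scale by the usage/target ratio, then clamp."""
    ratio = utilization / target
    # Assumed default: no scaling while the ratio is within 10% of 1.0
    if abs(ratio - 1.0) <= tolerance:
        return current
    wanted = math.ceil(current * ratio)
    return max(min_replicas, min(max_replicas, wanted))


# One pod using 0.25 cores against a '0.5' request is at 50% utilization.
# With a 10% target the rule asks for 5 replicas, which maxReplicas caps at 3.
util = cpu_utilization(0.25, "0.5")
print(util, desired_replicas(1, util, target=10, min_replicas=1, max_replicas=3))
# prints: 50.0 3
```

With a target as low as 10%, even a light load on a single pod pushes the ratio well above 1, which is why this example scales up quickly.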

The full SeldonDeployment spec is shown below.

In [3]:
!pygmentize model_with_hpa.yaml

apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: seldon-model
spec:
  name: test-deployment
  predictors:
  - componentSpecs:
    - hpaSpec:
        maxReplicas: 3
        metrics:
        - resource:
            name: cpu
            targetAverageUtilization: 10
          type: Resource
        minReplicas: 1
      spec:
        containers:
        - image: seldonio/mock_classifier:1.5.0-dev
          imagePullPolicy: IfNotPresent
          name: classifier
          resources:
            requests:
              cpu: '0.5'

In [4]:
!kubectl create -f model_with_hpa.yaml

seldondeployment.machinelearning.seldon.io/seldon-model created
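
A resource-based HPA has nothing to compute utilization against if a container lacks a request for the scaled resource, and that mismatch is easy to miss. The sketch below checks a SeldonDeployment, given as a plain Python dict, for that case. The helper name `missing_requests` is hypothetical, the dict only mirrors the fields visible in the spec above, and it inspects only the containers listed in the spec (not any sidecars added at deploy time).

```python
def missing_requests(sdep: dict) -> list:
    """Return (container, resource) pairs an hpaSpec scales on without a request."""
    problems = []
    for predictor in sdep.get("spec", {}).get("predictors", []):
        for comp in predictor.get("componentSpecs", []):
            hpa = comp.get("hpaSpec")
            if not hpa:
                continue  # no autoscaler on this component
            # Names of standard resources the HPA scales on, e.g. 'cpu'
            wanted = [
                m["resource"]["name"]
                for m in hpa.get("metrics", [])
                if m.get("type") == "Resource"
            ]
            for container in comp.get("spec", {}).get("containers", []):
                requests = container.get("resources", {}).get("requests", {})
                for name in wanted:
                    if name not in requests:
                        problems.append((container.get("name"), name))
    return problems


# Dict form of the fields shown in model_with_hpa.yaml
SPEC = {
    "spec": {
        "predictors": [{
            "componentSpecs": [{
                "hpaSpec": {
                    "minReplicas": 1,
                    "maxReplicas": 3,
                    "metrics": [{
                        "type": "Resource",
                        "resource": {"name": "cpu", "targetAverageUtilization": 10},
                    }],
                },
                "spec": {
                    "containers": [{
                        "name": "classifier",
                        "resources": {"requests": {"cpu": "0.5"}},
                    }],
                },
            }],
        }],
    },
}

print(missing_requests(SPEC))  # prints: []
```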


In [5]:
!kubectl rollout status deploy/$(kubectl get deploy -l seldon-deployment-id=seldon-model -o jsonpath='{.items[0].metadata.name}')

Waiting for deployment "seldon-model-example-0-classifier" rollout to finish: 0 of 1 updated replicas are available...
deployment "seldon-model-example-0-classifier" successfully rolled out


## Create Load

Label a node for the load tester. The cell below labels the first node listed. On Kind the first node shown is the control-plane (master) node, so uncomment the second command to label the second node as well.

In [6]:
!kubectl label nodes $(kubectl get nodes -o jsonpath='{.items[0].metadata.name}') role=locust
# !kubectl label nodes $(kubectl get nodes -o jsonpath='{.items[1].metadata.name}') role=locust

node/gke-ambassador-1-default-pool-fc670239-v256 labeled


In [8]:
!helm install loadtester ../../../helm-charts/seldon-core-loadtesting  \
    --set locust.host=http://seldon-model-example:8000 \
    --set oauth.enabled=false \
    --set locust.hatchRate=1 \
    --set locust.clients=1 \
    --set loadtest.sendFeedback=0 \
    --set locust.minWait=0 \
    --set locust.maxWait=0 \
    --set replicaCount=1

NAME: loadtester
LAST DEPLOYED: Wed Jan 19 15:20:46 2022
NAMESPACE: seldon
STATUS: deployed
REVISION: 1
TEST SUITE: None


After a few minutes you should see the deployment `seldon-model-example-0-classifier` scaled to 3 replicas.

In [9]:
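import json
import time


def getNumberPods():
    # Read the Deployment as JSON and return its current replica count
    dp = !kubectl get deployment seldon-model-example-0-classifier -o json
    dp = json.loads("".join(dp))
    return dp["status"]["replicas"]


# Poll every 5 seconds, for up to 5 minutes, until the HPA scales past 1 replica
scaled = False
for _ in range(60):
    pods = getNumberPods()
    print(pods)
    if pods > 1:
        scaled = True
        break
    time.sleep(5)
assert scaled, "deployment did not scale above 1 replica within 5 minutes"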

3


In [None]:
!kubectl get pods,deployments,hpa

## Remove Load

After 5-10 minutes you should see the deployment's replicas decrease to 1.

In [9]:
!helm delete loadtester -n seldon

release "loadtester" uninstalled
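
Scale-down is slower than scale-up because the HPA waits for a stabilization window (5 minutes, assuming cluster defaults) before removing replicas. Rather than checking by hand, the wait can be scripted. The sketch below is one way to do it outside of notebook `!` magics: `kubectl_replicas` and `wait_for_replicas` are hypothetical helper names, and the deployment name and namespace are taken from the cells above.

```python
import json
import subprocess
import time


def kubectl_replicas(deployment: str, namespace: str = "seldon") -> int:
    """Return the replica count reported in the Deployment's status."""
    out = subprocess.run(
        ["kubectl", "get", "deployment", deployment, "-n", namespace, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return json.loads(out)["status"].get("replicas", 0)


def wait_for_replicas(get_replicas, target, timeout_s=900, interval_s=15,
                      sleep=time.sleep):
    """Poll get_replicas() until it returns target; False if timeout_s elapses."""
    waited = 0
    while True:
        if get_replicas() == target:
            return True
        if waited >= timeout_s:
            return False
        sleep(interval_s)
        waited += interval_s


# Example call against a live cluster (commented out so the sketch has no
# side effects when run on its own):
# ok = wait_for_replicas(
#     lambda: kubectl_replicas("seldon-model-example-0-classifier"), target=1
# )
# assert ok, "deployment did not scale back down to 1 replica"
```

Passing the replica getter and the `sleep` function as arguments keeps the polling logic testable without a cluster.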


In [10]:
!kubectl get pods,deployments,hpa

NAME                                                     READY   STATUS    RESTARTS   AGE
pod/ambassador-6747c68887-2rddl                          1/1     Running   0          22h
pod/jaeger-5cb557b89d-khfb8                              1/1     Running   0          22h
pod/jaeger-operator-67777ffc99-m25fp                     1/1     Running   0          22h
pod/locust-master-1-6sbss                                1/1     Running   0          125m
pod/locust-slave-1-nlwgv                                 1/1     Running   0          125m
pod/seldon-model-example-0-classifier-7cf4bd7485-fvn7f   2/2     Running   0          126m
pod/seldon-model-example-0-classifier-7cf4bd7485-jlsjg   2/2     Running   0          124m
pod/seldon-model-example-0-classifier-7cf4bd7485-p9j4w   0/2     Pending   0          124m

NAME                                                READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/ambassador                          1/1     1            1          

In [11]:
!kubectl delete -f model_with_hpa.yaml

seldondeployment.machinelearning.seldon.io "seldon-model" deleted
