# Throughput Benchmarking Seldon-Core on GCP Kubernetes

This notebook benchmarks seldon-core for maximum throughput. We run a stub model and test it using REST and gRPC predictions. This gives a theoretical maximum throughput for model deployment in the given infrastructure scenario:
  
   * 1 replica of the model running on n1-standard-16 GCP node
   
For a real model the throughput would be lower. Future benchmarks will test realistic model scenarios.


## Create Cluster

Create a cluster of 4 nodes of machine type n1-standard-16. You can use the GKE console or the `gcloud` command line.
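
For the `gcloud` route, a minimal sketch of the command is assembled below. The cluster name `standard-cluster-1` is inferred from the node names shown later in this notebook, and the zone `europe-west1-b` is purely a placeholder assumption; substitute your own values. The helper `gcloud_create_cluster_cmd` is hypothetical and only builds the argument list, it does not run anything.

```python
import shlex


def gcloud_create_cluster_cmd(name, zone, machine_type="n1-standard-16", num_nodes=4):
    """Return the argv list for creating a GKE cluster with gcloud."""
    return [
        "gcloud", "container", "clusters", "create", name,
        "--zone", zone,                    # assumption: pick your own zone
        "--machine-type", machine_type,    # machine type used in this benchmark
        "--num-nodes", str(num_nodes),     # 1 model node + 3 load-test nodes
    ]


# Cluster name and zone are illustrative assumptions, not values from the benchmark run.
cmd = gcloud_create_cluster_cmd("standard-cluster-1", "europe-west1-b")
print(shlex.join(cmd))
```

The printed line can be pasted into a terminal, or passed to `subprocess.run(cmd, check=True)` if you prefer to stay inside the notebook.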

## Install helm

In [20]:
!kubectl -n kube-system create sa tiller
!kubectl create clusterrolebinding tiller --clusterrole cluster-admin --serviceaccount=kube-system:tiller
!helm init --service-account tiller

serviceaccount/tiller created
clusterrolebinding.rbac.authorization.k8s.io/tiller created
$HELM_HOME has been configured at /home/clive/.helm.

Tiller (the Helm server-side component) has been installed into your Kubernetes Cluster.

Please note: by default, Tiller is deployed with an insecure 'allow unauthenticated users' policy.
To prevent this, run `helm init` with the --tiller-tls-verify flag.
For more information on securing your installation see: https://docs.helm.sh/using_helm/#securing-your-helm-installation
Happy Helming!


In [21]:
!kubectl rollout status deploy/tiller-deploy -n kube-system

deployment "tiller-deploy" successfully rolled out


## Cordon off loadtest nodes

In [22]:
!kubectl get nodes

NAME                                                STATUS   ROLES    AGE     VERSION
gke-standard-cluster-1-default-pool-88be49f0-5533   Ready    <none>   8m54s   v1.12.8-gke.10
gke-standard-cluster-1-default-pool-88be49f0-gbtg   Ready    <none>   8m53s   v1.12.8-gke.10
gke-standard-cluster-1-default-pool-88be49f0-k4js   Ready    <none>   9m2s    v1.12.8-gke.10
gke-standard-cluster-1-default-pool-88be49f0-qhvj   Ready    <none>   8m53s   v1.12.8-gke.10


We cordon off the first 3 nodes so that seldon-core and the model can only be scheduled on the 1 remaining node.

In [23]:
!kubectl cordon $(kubectl get nodes -o jsonpath='{.items[0].metadata.name}')
!kubectl cordon $(kubectl get nodes -o jsonpath='{.items[1].metadata.name}')
!kubectl cordon $(kubectl get nodes -o jsonpath='{.items[2].metadata.name}')

node/gke-standard-cluster-1-default-pool-88be49f0-5533 cordoned
node/gke-standard-cluster-1-default-pool-88be49f0-gbtg cordoned
node/gke-standard-cluster-1-default-pool-88be49f0-k4js cordoned


Label the nodes so they can be used by locust.

In [24]:
!kubectl label nodes $(kubectl get nodes -o jsonpath='{.items[0].metadata.name}') role=locust
!kubectl label nodes $(kubectl get nodes -o jsonpath='{.items[1].metadata.name}') role=locust
!kubectl label nodes $(kubectl get nodes -o jsonpath='{.items[2].metadata.name}') role=locust

node/gke-standard-cluster-1-default-pool-88be49f0-5533 labeled
node/gke-standard-cluster-1-default-pool-88be49f0-gbtg labeled
node/gke-standard-cluster-1-default-pool-88be49f0-k4js labeled
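
The cordon and label cells above repeat one command per node index. As a rough sketch, the same commands can be generated from the list of node names instead. The helpers `split_nodes` and `node_commands` are hypothetical names introduced for illustration; the node names are copied from the `kubectl get nodes` output above, and the code only builds the commands rather than running them.

```python
def split_nodes(names_output, n_loadtest=3):
    """Split whitespace-separated node names into (load-test nodes, model nodes)."""
    names = names_output.split()
    return names[:n_loadtest], names[n_loadtest:]


def node_commands(loadtest_nodes):
    """Build the cordon and label commands for each load-test node."""
    cmds = []
    for node in loadtest_nodes:
        cmds.append(["kubectl", "cordon", node])
        cmds.append(["kubectl", "label", "nodes", node, "role=locust"])
    return cmds


# On a live cluster this string could come from:
#   kubectl get nodes -o jsonpath='{.items[*].metadata.name}'
names_output = (
    "gke-standard-cluster-1-default-pool-88be49f0-5533 "
    "gke-standard-cluster-1-default-pool-88be49f0-gbtg "
    "gke-standard-cluster-1-default-pool-88be49f0-k4js "
    "gke-standard-cluster-1-default-pool-88be49f0-qhvj"
)

loadtest_nodes, model_nodes = split_nodes(names_output)
for cmd in node_commands(loadtest_nodes):
    print(" ".join(cmd))
```

Working from names rather than from `.items[N]` indices also avoids relying on the node list being returned in the same order on every call.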


## Start seldon-core

In [28]:
!helm install ../helm-charts/seldon-core-operator --name seldon-core --set usageMetrics.enabled=true --namespace seldon-system    

NAME:   seldon-core
E0625 13:28:54.390314    3430 portforward.go:363] error copying from remote stream to local connection: readfrom tcp4 127.0.0.1:41235->127.0.0.1:33400: write tcp4 127.0.0.1:41235->127.0.0.1:33400: write: broken pipe
LAST DEPLOYED: Tue Jun 25 13:28:53 2019
NAMESPACE: seldon-system
STATUS: DEPLOYED

RESOURCES:
==> v1/ClusterRole
NAME                          AGE
seldon-operator-manager-role  1s

==> v1/ClusterRoleBinding
NAME                                 AGE
seldon-operator-manager-rolebinding  1s

==> v1/ConfigMap
NAME                     DATA  AGE
seldon-spartakus-config  3     1s

==> v1/Pod(related)
NAME                                         READY  STATUS             RESTARTS  AGE
seldon-operator-controller-manager-0         0/1    ContainerCreating  0         0s
seldon-spartakus-volunteer-6954cffb89-dzqmv  0/1    ContainerCreating  0         0s

==> v1/Secret
NAME                                   TYPE    DATA  AGE
seldon-operator-webhook-server-secret  Opaq

In [29]:
!kubectl rollout status statefulset.apps/seldon-operator-controller-manager -n seldon-system

Waiting for 1 pods to be ready...
partitioned roll out complete: 1 new pods have been updated...


## Create Stub Deployment

In [30]:
!pygmentize resources/loadtest_simple_model.json

{
    "apiVersion": "machinelearning.seldon.io/v1alpha2",
    "kind": "SeldonDeployment",
    "metadata": {
        "labels": {
            "app": "seldon"
        },
        "name": "seldon-core-loadtest"
    },
    "spec": {
        "annotations": {
            "project_name": "loadtest",
            "deployment_version": "v1"
        },
        "name": "loadtest",
        "oauth_key": "oauth-key",
        "oauth_secret": "oauth-secret",
        "predictors": [
            {
                "componentSpecs": [{
                    "spe

In [31]:
!kubectl apply -f resources/loadtest_simple_model.json

seldondeployment.machinelearning.seldon.io/seldon-core-loadtest created


Wait for the deployment to be running.

In [34]:
!kubectl rollout status deployment.apps/loadtest-loadtest-9eecb7d

deployment "loadtest-loadtest-9eecb7d" successfully rolled out


## Run benchmark

Uncordon the first 3 nodes so they can be used to schedule locust.

In [35]:
!kubectl uncordon $(kubectl get nodes -o jsonpath='{.items[0].metadata.name}')
!kubectl uncordon $(kubectl get nodes -o jsonpath='{.items[1].metadata.name}')
!kubectl uncordon $(kubectl get nodes -o jsonpath='{.items[2].metadata.name}')

node/gke-standard-cluster-1-default-pool-88be49f0-5533 uncordoned
node/gke-standard-cluster-1-default-pool-88be49f0-gbtg uncordoned
node/gke-standard-cluster-1-default-pool-88be49f0-k4js uncordoned


## gRPC
Start the locust load test for gRPC.

In [36]:
!helm install ../helm-charts/seldon-core-loadtesting --name loadtest  \
    --set locust.host=loadtest-seldon-core-loadtest:5001 \
    --set locust.script=predict_grpc_locust.py \
    --set oauth.enabled=false \
    --set oauth.key=oauth-key \
    --set oauth.secret=oauth-secret \
    --set locust.hatchRate=1 \
    --set locust.clients=256 \
    --set loadtest.sendFeedback=0 \
    --set locust.minWait=0 \
    --set locust.maxWait=0 \
    --set replicaCount=64 

NAME:   loadtest
LAST DEPLOYED: Tue Jun 25 13:31:19 2019
NAMESPACE: default
STATUS: DEPLOYED

RESOURCES:
==> v1/Pod(related)
NAME                   READY  STATUS             RESTARTS  AGE
locust-master-1-qxgr6  0/1    ContainerCreating  0         1s
locust-slave-1-25sgw   0/1    Pending            0         0s
locust-slave-1-2x4dg   0/1    Pending            0         0s
locust-slave-1-64r2t   0/1    Pending            0         0s
locust-slave-1-968hb   0/1    Pending            0         0s
locust-slave-1-bm47q   0/1    ContainerCreating  0         1s
locust-slave-1-cskhh   0/1    ContainerCreating  0         1s
locust-slave-1-lrspz   0/1    Pending            0         0s
locust-slave-1-m24xl   0/1    Pending            0         0s
locust-slave-1-n4499   0/1    ContainerCreating  0         1s
locust-slave-1-p59d5   0/1    Pending            0         1s
locust-slave-1-qddf4   0/1    Pending            0         0s
locust-slave-1-qq44k   0/1    ContainerCreating  0         1s
locust
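
The load-test settings imply a ramp-up period before the full load is reached. The arithmetic below is a sketch under two assumptions that the chart itself does not confirm here: that `locust.clients` is the total number of simulated users and `locust.hatchRate` is the number of users started per second (the usual locust meaning of these terms), and that `replicaCount` is the number of locust slave pods with users spread evenly across them.

```python
def ramp_up_seconds(clients, hatch_rate):
    """Time until all simulated users are running, at hatch_rate users per second."""
    return clients / hatch_rate


def clients_per_slave(clients, slaves):
    """Users per slave, assuming the master spreads them evenly."""
    return clients / slaves


# Values taken from the helm install command above.
clients, hatch_rate, slaves = 256, 1, 64

print(ramp_up_seconds(clients, hatch_rate))  # 256.0 seconds
print(clients_per_slave(clients, slaves))    # 4.0 users per slave
```

If these assumptions hold, request rates read from the logs in the first few minutes of a run reflect a partially ramped load rather than the steady state.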

To download the stats, use the following script:

```bash
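# Require exactly two arguments: an experiment name and the protocol tested.
if [ "$#" -ne 2 ]; then
    echo "Illegal number of parameters: <experiment> <rest|grpc>"
    exit 1
fi

EXPERIMENT=$1
TYPE=$2

# Look up the name of the locust master pod.
MASTER=$(kubectl get pod -l name=locust-master-1 -o jsonpath='{.items[0].metadata.name}')

# Copy the stats CSV files from the master pod to the local directory.
kubectl cp "${MASTER}:stats_distribution.csv" "${EXPERIMENT}_${TYPE}_stats_distribution.csv"
kubectl cp "${MASTER}:stats_requests.csv" "${EXPERIMENT}_${TYPE}_stats_requests.csv"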
```

You can get live stats by viewing the logs of the locust master:

In [37]:
!kubectl logs $(kubectl get pod -l name=locust-master-1 -o jsonpath='{.items[0].metadata.name}') --tail=10

 grpc loadtest-seldon-core-loadtest:5001                      13998858     0(0.00%)      10       0     525  |       9 5800.80
--------------------------------------------------------------------------------------------------------------------------------------------
 Total                                                        13998858     0(0.00%)                                    5800.80

 Name                                                          # reqs      # fails     Avg     Min     Max  |  Median   req/s
--------------------------------------------------------------------------------------------------------------------------------------------
 grpc loadtest-seldon-core-loadtest:5001                      14013158     0(0.00%)      10       0     525  |       9 5729.30
--------------------------------------------------------------------------------------------------------------------------------------------
 Total                                                       
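
To track throughput over time, the per-endpoint rows of this log can be parsed into numbers. The following is a minimal sketch that assumes the column layout shown in the header above (`# reqs`, `# fails`, `Avg`, `Min`, `Max`, `Median`, `req/s`); the function `parse_stats_line` is a hypothetical helper, not part of locust. Lines that do not fit the layout, such as the header, separators and `Total` rows, return `None`.

```python
import re

# One per-endpoint stats row: name, request count, failures with percentage,
# avg/min/max latency, a "|" separator, median latency and requests per second.
STATS_LINE = re.compile(
    r"^\s*(?P<name>.+?)\s+(?P<reqs>\d+)\s+(?P<fails>\d+)\((?P<fail_pct>[\d.]+)%\)"
    r"\s+(?P<avg>\d+)\s+(?P<min>\d+)\s+(?P<max>\d+)"
    r"\s+\|\s+(?P<median>\d+)\s+(?P<rps>[\d.]+)\s*$"
)


def parse_stats_line(line):
    """Parse one stats row into a dict, or return None if it does not match."""
    m = STATS_LINE.match(line)
    if m is None:
        return None
    row = m.groupdict()
    for key in ("reqs", "fails", "avg", "min", "max", "median"):
        row[key] = int(row[key])
    for key in ("fail_pct", "rps"):
        row[key] = float(row[key])
    return row


# Row copied from the gRPC log output above.
line = (" grpc loadtest-seldon-core-loadtest:5001"
        "                      13998858     0(0.00%)      10       0     525  |       9 5800.80")
row = parse_stats_line(line)
print(row["name"], row["reqs"], row["rps"])
```

The same helper matches the `POST predictions` rows of the REST run, since they follow the same layout.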

In [38]:
!helm delete loadtest --purge

release "loadtest" deleted


## REST 
Run the REST benchmark.

In [39]:
!helm install ../helm-charts/seldon-core-loadtesting --name loadtest  \
    --set locust.host=http://loadtest-seldon-core-loadtest:8000 \
    --set oauth.enabled=false \
    --set oauth.key=oauth-key \
    --set oauth.secret=oauth-secret \
    --set locust.hatchRate=1 \
    --set locust.clients=256 \
    --set loadtest.sendFeedback=0 \
    --set locust.minWait=0 \
    --set locust.maxWait=0 \
    --set replicaCount=64

NAME:   loadtest
LAST DEPLOYED: Tue Jun 25 14:14:35 2019
NAMESPACE: default
STATUS: DEPLOYED

RESOURCES:
==> v1/Pod(related)
NAME                   READY  STATUS             RESTARTS  AGE
locust-master-1-6dp2w  0/1    Pending            0         0s
locust-slave-1-2s5d4   0/1    Pending            0         0s
locust-slave-1-4rqb9   0/1    Pending            0         0s
locust-slave-1-5b5tm   0/1    Pending            0         0s
locust-slave-1-5bnsn   0/1    Pending            0         0s
locust-slave-1-769zb   0/1    Pending            0         0s
locust-slave-1-7f468   0/1    Pending            0         0s
locust-slave-1-8f2ng   0/1    ContainerCreating  0         0s
locust-slave-1-8vzd6   0/1    Pending            0         0s
locust-slave-1-b7lh2   0/1    ContainerCreating  0         0s
locust-slave-1-dkpp4   0/1    Pending            0         0s
locust-slave-1-fxjp6   0/1    Pending            0         0s
locust-slave-1-gl4jg   0/1    Pending            0         0s
locust

Download the stats as for the gRPC test, or monitor the live stats from the locust master logs:

In [40]:
!kubectl logs $(kubectl get pod -l name=locust-master-1 -o jsonpath='{.items[0].metadata.name}') --tail=10

 POST predictions                                               93406     0(0.00%)      19       3     393  |      16 4083.30
--------------------------------------------------------------------------------------------------------------------------------------------
 Total                                                          93406     0(0.00%)                                    4083.30

 Name                                                          # reqs      # fails     Avg     Min     Max  |  Median   req/s
--------------------------------------------------------------------------------------------------------------------------------------------
 POST predictions                                              105895     0(0.00%)      19       3     393  |      16 4289.00
--------------------------------------------------------------------------------------------------------------------------------------------
 Total                                                         1
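
As a rough comparison, the request rates from the two log excerpts can be averaged and compared. The numbers below are copied from the gRPC and REST outputs in this notebook. Treat the result as indicative only: each excerpt contains just two snapshots, and the REST snapshot was taken after about 100 thousand requests whereas the gRPC one was taken after about 14 million.

```python
from statistics import mean

# req/s values from the locust master logs shown in this notebook.
grpc_rps = [5800.80, 5729.30]
rest_rps = [4083.30, 4289.00]

grpc_mean = mean(grpc_rps)
rest_mean = mean(rest_rps)
ratio = grpc_mean / rest_mean

print(f"gRPC mean: {grpc_mean:.2f} req/s")
print(f"REST mean: {rest_mean:.2f} req/s")
print(f"gRPC / REST: {ratio:.2f}")  # about 1.38 for these snapshots
```

For a firmer comparison, use the downloaded `stats_requests.csv` files from complete runs of similar duration instead of log snapshots.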

In [41]:
!helm delete loadtest --purge

release "loadtest" deleted


In [42]:
!kubectl cordon $(kubectl get nodes -o jsonpath='{.items[0].metadata.name}')
!kubectl cordon $(kubectl get nodes -o jsonpath='{.items[1].metadata.name}')
!kubectl cordon $(kubectl get nodes -o jsonpath='{.items[2].metadata.name}')

node/gke-standard-cluster-1-default-pool-88be49f0-5533 cordoned
node/gke-standard-cluster-1-default-pool-88be49f0-gbtg cordoned
node/gke-standard-cluster-1-default-pool-88be49f0-k4js cordoned


## Tear Down

In [43]:
!kubectl delete -f resources/loadtest_simple_model.json

seldondeployment.machinelearning.seldon.io "seldon-core-loadtest" deleted


In [44]:
!helm delete seldon-core --purge

release "seldon-core" deleted
