# Deploying Machine Learning Models using kubectl on Minikube (no RBAC)
This demo shows how to interact directly with Kubernetes using kubectl to create and manage runtime machine learning models. It uses Minikube as the target Kubernetes cluster.
<img src="images/deploy-graph.png" alt="predictor with canary" title="ml graph"/>

## Prerequisites
You will need:
 - [Git clone of Seldon Core](https://github.com/SeldonIO/seldon-core)
 - [Helm](https://github.com/kubernetes/helm)
 - [Minikube](https://github.com/kubernetes/minikube) version v0.24.0 or greater
 - [python grpc tools](https://grpc.io/docs/quickstart/python.html)
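
Before starting, it can help to confirm that the command-line tools used in this notebook are on your `PATH`. The snippet below is a minimal sketch, assuming Python 3 (for `shutil.which`); the helper name `missing_tools` and the `REQUIRED_TOOLS` list are illustrative and not part of Seldon Core.

```python
import shutil

# Command-line tools this notebook shells out to.
REQUIRED_TOOLS = ["minikube", "kubectl", "helm"]


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the entries of `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    print("Missing tools:", ", ".join(missing) if missing else "none")
```

An empty result only means the executables were found; it does not check their versions.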

Start Minikube with custom resource validation enabled and about 5 GB of memory.

In [None]:
!minikube start --memory=5000 --feature-gates=CustomResourceValidation=true

Initialize Helm (this installs Tiller into the cluster).

In [None]:
!helm init

Label the node to allow load testing to run on it

In [None]:
!kubectl label nodes `kubectl get nodes -o jsonpath='{.items[0].metadata.name}'` role=locust --overwrite

## Start seldon-core

Install the custom resource definition

In [None]:
!helm install ../helm-charts/seldon-core-crd --name seldon-core-crd \
    --set usage_metrics.enabled=true \
    --set rbac.enabled=false

In [None]:
!kubectl create namespace seldon

In [None]:
!helm install ../helm-charts/seldon-core --name seldon-core --namespace seldon \
    --set rbac.enabled=false

Install prometheus and grafana for analytics

In [None]:
!helm install ../helm-charts/seldon-core-analytics --name seldon-core-analytics \
    --set grafana_prom_admin_password=password \
    --set persistence.enabled=false \
    --set rbac.enabled=false \
    --namespace seldon

Check all services are running before proceeding.

In [None]:
!kubectl get pods -n seldon
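
Rather than reading the pod table by eye, you could check readiness programmatically. The sketch below parses the output of `kubectl get pods -o json` and lists pods that are not yet `Running` with every container ready. The helper names `pods_not_ready` and `fetch_pods` are hypothetical; the fields used (`status.phase`, `status.containerStatuses[].ready`) are standard Kubernetes Pod fields.

```python
import json
import subprocess


def pods_not_ready(pod_list):
    """Return names of pods that are not Running with all containers ready.

    `pod_list` is the parsed output of `kubectl get pods -o json`.
    """
    not_ready = []
    for pod in pod_list.get("items", []):
        status = pod.get("status", {})
        containers = status.get("containerStatuses", [])
        running = status.get("phase") == "Running"
        # A pod with no reported container statuses is treated as not ready.
        ready = bool(containers) and all(c.get("ready", False) for c in containers)
        if not (running and ready):
            not_ready.append(pod["metadata"]["name"])
    return not_ready


def fetch_pods(namespace="seldon"):
    """Ask kubectl for the pods in `namespace` and parse the JSON it prints."""
    out = subprocess.check_output(
        ["kubectl", "get", "pods", "-n", namespace, "-o", "json"])
    return json.loads(out.decode("utf-8"))
```

Calling `pods_not_ready(fetch_pods())` should return an empty list once everything is up. Note that this sketch would also flag pods that have run to completion (phase `Succeeded`), which may or may not matter for your cluster.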

## Set up REST and gRPC methods

Install gRPC modules for the prediction protos.

In [None]:
!cp ../proto/prediction.proto ./proto
!python -m grpc.tools.protoc -I. --python_out=. --grpc_python_out=. ./proto/prediction.proto

The helper functions below illustrate both REST and gRPC prediction requests.

In [None]:
import requests
from requests.auth import HTTPBasicAuth
from proto import prediction_pb2
from proto import prediction_pb2_grpc
import grpc
try:
    from commands import getoutput # python 2
except ImportError:
    from subprocess import getoutput # python 3


NAMESPACE='seldon'
MINIKUBE_IP=getoutput('minikube ip')
MINIKUBE_HTTP_PORT=getoutput("kubectl get svc -n "+NAMESPACE+" -l app=seldon-apiserver-container-app -o jsonpath='{.items[0].spec.ports[0].nodePort}'")
MINIKUBE_GRPC_PORT=getoutput("kubectl get svc -n "+NAMESPACE+" -l app=seldon-apiserver-container-app -o jsonpath='{.items[0].spec.ports[1].nodePort}'")

def get_token():
    payload = {'grant_type': 'client_credentials'}
    response = requests.post(
                "http://"+MINIKUBE_IP+":"+MINIKUBE_HTTP_PORT+"/oauth/token",
                auth=HTTPBasicAuth('oauth-key', 'oauth-secret'),
                data=payload)
    print(response.text)
    token =  response.json()["access_token"]
    return token

def rest_request():
    token = get_token()
    headers = {'Authorization': 'Bearer '+token}
    payload = {"data":{"names":["a","b"],"tensor":{"shape":[2,2],"values":[0,0,1,1]}}}
    response = requests.post(
                "http://"+MINIKUBE_IP+":"+MINIKUBE_HTTP_PORT+"/api/v0.1/predictions",
                headers=headers,
                json=payload)
    print(response.text)
    
def grpc_request():
    token = get_token()
    datadef = prediction_pb2.DefaultData(
            names = ["a","b"],
            tensor = prediction_pb2.Tensor(
                shape = [3,2],
                values = [1.0,1.0,2.0,3.0,4.0,5.0]
                )
            )
    request = prediction_pb2.SeldonMessage(data = datadef)
    channel = grpc.insecure_channel(MINIKUBE_IP+":"+MINIKUBE_GRPC_PORT)
    stub = prediction_pb2_grpc.SeldonStub(channel)
    metadata = [('oauth_token', token)]
    response = stub.Predict(request=request,metadata=metadata)
    print(response)
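
The REST payload above hard-codes the tensor `shape` and the flattened `values`. As a minimal sketch, a helper like the hypothetical `make_tensor_payload` below could build the same structure from a list of rows, so the shape and values cannot drift apart. It assumes values are flattened row by row and that there is one name per column, which matches the examples in this notebook but is not confirmed here as a general rule.

```python
def make_tensor_payload(names, rows):
    """Build a REST payload with a `tensor` from a list of equal-length rows.

    Assumption: values are flattened row by row, one name per column.
    """
    if not rows:
        raise ValueError("rows must not be empty")
    n_cols = len(rows[0])
    if any(len(row) != n_cols for row in rows):
        raise ValueError("all rows must have the same length")
    if len(names) != n_cols:
        raise ValueError("expected one name per column")
    values = [value for row in rows for value in row]
    return {
        "data": {
            "names": list(names),
            "tensor": {"shape": [len(rows), n_cols], "values": values},
        }
    }
```

For example, `make_tensor_payload(["a", "b"], [[0, 0], [1, 1]])` produces the same dictionary as the `payload` literal in `rest_request()`.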


# Integrating with Kubernetes API

## Validation

Using an OpenAPI schema, certain basic validation can be done before the custom resource is accepted. The resource below is intentionally invalid, so kubectl should reject it.

In [None]:
!kubectl create -f resources/model_invalid1.json -n seldon

## Normal Operation
A simple example is shown below, using a single prepacked model for illustration. The spec contains a set of predictors, each of which contains a ***componentSpec***, which is a Kubernetes [PodTemplateSpec](https://kubernetes.io/docs/api-reference/v1.9/#podtemplatespec-v1-core), alongside a ***graph***, which describes how the components fit together.

In [None]:
!pygmentize resources/model.json
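
The structure described above (a spec with predictors, each holding a `componentSpec` and a `graph`) can be checked locally before submitting the resource. The following is a rough sketch rather than Seldon's own validation; the function name `predictor_problems` is hypothetical, and the use of a `name` key on each predictor for labelling is an assumption.

```python
import json


def predictor_problems(deployment):
    """Return human-readable problems found in the predictors of a deployment dict."""
    problems = []
    predictors = deployment.get("spec", {}).get("predictors", [])
    if not predictors:
        problems.append("spec.predictors is missing or empty")
    for index, predictor in enumerate(predictors):
        # Fall back to the list position when the predictor has no name.
        label = predictor.get("name", "predictors[%d]" % index)
        for key in ("componentSpec", "graph"):
            if key not in predictor:
                problems.append("%s: missing %s" % (label, key))
    return problems


def load_deployment(path="resources/model.json"):
    """Load a SeldonDeployment resource from a JSON file."""
    with open(path) as handle:
        return json.load(handle)
```

Running `predictor_problems(load_deployment())` should return an empty list for a well-formed resource. This only checks that the two keys are present, which is far less than the server-side schema validation does.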

## Create Seldon Deployment

Deploy the runtime graph to kubernetes.

In [None]:
!kubectl apply -f resources/model.json -n seldon

In [None]:
!kubectl get seldondeployments -n seldon

In [None]:
!kubectl describe seldondeployments seldon-deployment-example -n seldon

Get the status of the SeldonDeployment. **When ready, `replicasAvailable` should be 1**.

In [None]:
!kubectl get seldondeployments seldon-deployment-example -o jsonpath='{.status}' -n seldon
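
If you fetch the resource with `-o json` instead of the jsonpath expression, the readiness condition can be tested in Python. This is a minimal sketch: the function name `deployment_ready` is hypothetical, and it assumes the status holds a `predictorStatus` list whose entries carry `replicasAvailable`; check the actual status output above and adjust the key names if they differ.

```python
def deployment_ready(deployment, expected=1):
    """Return True when every predictor reports at least `expected` available replicas.

    Assumption: status.predictorStatus is a list of dicts with `replicasAvailable`.
    A deployment with no predictor status yet is treated as not ready.
    """
    statuses = deployment.get("status", {}).get("predictorStatus", [])
    if not statuses:
        return False
    return all(s.get("replicasAvailable", 0) >= expected for s in statuses)
```

Here `deployment` would be the dictionary obtained by parsing the output of `kubectl get seldondeployments seldon-deployment-example -o json -n seldon` with `json.loads`.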

## Get predictions

#### REST Request

In [None]:
rest_request()

#### gRPC Request

In [None]:
grpc_request()

## Update deployment with canary

We will change the deployment to add a "canary" deployment. This illustrates:
 - Updating a deployment with no downtime
 - Adding an extra predictor to run alongside the existing predictor.

You could manage different traffic levels by controlling the number of replicas of each.
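
To make the replica arithmetic concrete, the sketch below estimates the share of traffic each predictor would see under the assumption that requests are spread evenly across all replicas. That even spread is an assumption made for illustration, not a documented guarantee, and the predictor names in the usage example are hypothetical.

```python
def expected_traffic_share(replicas):
    """Map predictor name to its expected fraction of requests.

    `replicas` maps predictor name to replica count. Assumes requests are
    distributed evenly over all replicas of all predictors.
    """
    total = sum(replicas.values())
    if total <= 0:
        raise ValueError("at least one replica is required")
    return {name: count / float(total) for name, count in replicas.items()}
```

For instance, `expected_traffic_share({"main": 3, "canary": 1})` gives `{"main": 0.75, "canary": 0.25}`, so three main replicas to one canary replica would send roughly a quarter of requests to the canary.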

In [None]:
!pygmentize resources/model_with_canary.json

In [None]:
!kubectl apply -f resources/model_with_canary.json -n seldon

Check the status of the deployments. Note: **You might need to run this several times until `replicasAvailable` is 1 for both predictors**.

In [None]:
!kubectl get seldondeployments seldon-deployment-example -o jsonpath='{.status}' -n seldon
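
Instead of re-running the cell by hand, a small polling loop could repeat the check until it succeeds or a timeout expires. The helper `wait_until` below is a generic sketch and not part of Seldon Core; `check` is any zero-argument function you supply that returns `True` once both predictors report `replicasAvailable` of 1.

```python
import time


def wait_until(check, timeout=120.0, interval=5.0, sleep=time.sleep, clock=time.time):
    """Call `check()` repeatedly until it returns True or `timeout` seconds pass.

    Returns True if the check succeeded, False if the timeout expired first.
    `sleep` and `clock` are injectable so the loop can be exercised without waiting.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

A `check` function could, for example, run the kubectl command above with `-o json`, parse the result, and inspect the status. The timeout and interval defaults are arbitrary and may need adjusting for slower machines.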

#### REST Request

In [None]:
rest_request()

#### gRPC Request

In [None]:
grpc_request()

## Load test

Start a load test which will post REST requests at 10 requests per second.

In [None]:
!helm install seldon-core-loadtesting --name loadtest  \
    --set oauth.key=oauth-key \
    --set oauth.secret=oauth-secret \
    --namespace seldon \
    --repo https://storage.googleapis.com/seldon-charts

Port-forward the Grafana dashboard:

```bash
kubectl port-forward $(kubectl get pods -n seldon -l app=grafana-prom-server -o jsonpath='{.items[0].metadata.name}') -n seldon 3000:3000
```

You can then view an analytics dashboard inside the cluster at http://localhost:3000/dashboard/db/prediction-analytics?refresh=5s&orgId=1. Your IP address may be different; get it via `minikube ip`. Log in with:
 - Username: admin
 - Password: password (as set when installing seldon-core-analytics above)

The dashboard should look like the one below:

<img src="images/dashboard.png" alt="predictor with canary" title="ml graph"/>

# Tear down

In [None]:
!helm delete loadtest --purge

In [None]:
!kubectl delete -f resources/model_with_canary.json -n seldon

In [None]:
!helm delete seldon-core-analytics --purge

In [None]:
!helm delete seldon-core --purge

In [None]:
!helm delete seldon-core-crd --purge