# Launch a Seldon Deployment
> Get an ML endpoint up and running on your cluster!

- toc: true 
- badges: true
- comments: true
- categories: [kubernetes, docker]

### Reqs
* Access to a Kubernetes cluster
    * If you are coming from [Launch a local kubernetes cluster](https://ntorba.github.io/writing/jupyter/2020/07/17/local-kubernetes.html), you are ready to follow this example. If not, you can work through that post first and then come back here!

### Goal
* Launch a first Seldon deployment with a gRPC or REST endpoint

We will do this by following these steps:
1. Define the Python component
2. Write a Dockerfile and requirements.txt, then build the image
3. Run a container based on the new image to test the endpoint
4. Define the SeldonDeployment yaml file
5. `kubectl apply` the SeldonDeployment to the cluster

### Define Python Component
I'm taking this example code directly from the [seldon-core irisClassifier example](https://github.com/SeldonIO/seldon-core/blob/master/examples/models/sklearn_iris/sklearn_iris.ipynb). 
This is a classic sklearn example that we can get up and running quickly. 

In [4]:
#hide_output
!mkdir iris_classifier

mkdir: iris_classifier: File exists


In [5]:
%%writefile iris_classifier/train_iris.py
#collapse_show
#hide_output
import joblib
from sklearn.pipeline import Pipeline
from sklearn.linear_model import LogisticRegression
from sklearn import datasets


OUTPUT_FILE = "iris_classifier/IrisClassifier.sav"


def main(X, y):
    # Wrap the classifier in a single-step pipeline, fit it, and persist it.
    clf = LogisticRegression(solver="liblinear", multi_class="ovr")
    p = Pipeline([("clf", clf)])
    print("Training model...")
    p.fit(X, y)
    print("Model trained!")

    print(f"Saving model in {OUTPUT_FILE}")
    joblib.dump(p, OUTPUT_FILE)
    print("Model saved!")


if __name__ == "__main__":
    print("Loading iris data set...")
    iris = datasets.load_iris()
    X, y = iris.data, iris.target
    print("Dataset loaded!")

    main(X, y)

Overwriting iris_classifier/train_iris.py


In [6]:
#hide_output
!python iris_classifier/train_iris.py

Loading iris data set...
Dataset loaded!
Training model...
Model trained!
Saving model in iris_classifier/IrisClassifier.sav
Model saved!


In [7]:
%%writefile iris_classifier/IrisClassifier.py
#collapse_show
#hide_output
import joblib

class IrisClassifier(object):

    def __init__(self):
        self.model = joblib.load('IrisClassifier.sav')

    def predict(self,X,features_names):
        return self.model.predict_proba(X)

Overwriting iris_classifier/IrisClassifier.py
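
Before building an image, it can help to exercise the component contract on its own. The sketch below is a stand-in, not Seldon code: `StubModel` and `StubClassifier` are hypothetical names, and the stub returns fixed probabilities so the example runs without scikit-learn or the saved `.sav` file. It only illustrates the shape of the call, assuming the wrapper invokes `predict(X, features_names)` as the signature above suggests: a 2-D batch goes in, and one row of class probabilities per input row comes out.

```python
class StubModel:
    """Hypothetical stand-in for the trained pipeline (no scikit-learn needed)."""

    def predict_proba(self, X):
        # Return one fixed three-class probability row per input row.
        return [[0.9, 0.09, 0.01] for _ in X]


class StubClassifier:
    """Mirrors the method signature of IrisClassifier above."""

    def __init__(self, model=None):
        # The real component loads IrisClassifier.sav here instead.
        self.model = model or StubModel()

    def predict(self, X, features_names):
        return self.model.predict_proba(X)


# Two rows of four iris measurements each.
batch = [[5.964, 4.006, 2.081, 1.031], [6.1, 2.8, 4.7, 1.2]]
probabilities = StubClassifier().predict(batch, features_names=None)
print(probabilities)
```

Swapping `StubModel` for the pipeline loaded with `joblib.load` should give the same shape of result with real probabilities.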


I'm going to differ slightly from their example and use a Dockerfile to create the docker image for this component instead of s2i. Feel free to use s2i directly from their example instead! 

In [8]:
%%writefile iris_classifier/requirements.txt
#hide_output
sklearn
seldon-core

Overwriting iris_classifier/requirements.txt


In [9]:
%%writefile iris_classifier/Dockerfile
#collapse_show
#hide_output
FROM python:3.7-slim
COPY . /app
WORKDIR /app
RUN pip install -r requirements.txt
EXPOSE 5000

# Define environment variable
ENV MODEL_NAME IrisClassifier 
ENV API_TYPE GRPC
ENV SERVICE_TYPE MODEL 
ENV PERSISTENCE 0

# seldon-core-microservice is a command line tool installed with the seldon-core python library. You can use this locally as well!
CMD exec seldon-core-microservice $MODEL_NAME $API_TYPE --service-type $SERVICE_TYPE --persistence $PERSISTENCE

Overwriting iris_classifier/Dockerfile
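
To make the `CMD` line concrete, here is a small sketch of how it expands with the `ENV` values above. It uses Python's `string.Template`, whose `$NAME` syntax happens to match the shell's in this simple case; the `env` dict is just a copy of the values set in the Dockerfile.

```python
from string import Template

# Values copied from the ENV lines in the Dockerfile.
env = {
    "MODEL_NAME": "IrisClassifier",
    "API_TYPE": "GRPC",
    "SERVICE_TYPE": "MODEL",
    "PERSISTENCE": "0",
}

# The CMD line without the leading `exec`, which is a shell builtin.
cmd = Template(
    "seldon-core-microservice $MODEL_NAME $API_TYPE "
    "--service-type $SERVICE_TYPE --persistence $PERSISTENCE"
)

expanded = cmd.substitute(env)
print(expanded)
# seldon-core-microservice IrisClassifier GRPC --service-type MODEL --persistence 0
```

Running the expanded command from inside `iris_classifier/` should start the same server outside Docker, provided `seldon-core` is installed in your environment.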


To test this example, let's build and run the docker image! 

In [10]:
#hide_output
!docker build iris_classifier/ -t localhost:5000/iris_ex:latest

Sending build context to Docker daemon  11.78kB
Step 1/10 : FROM python:3.7-slim
 ---> b386e7420fc3
Step 2/10 : COPY . /app
 ---> 8043c32e806b
Step 3/10 : WORKDIR /app
 ---> Running in 42a51b1c659a
Removing intermediate container 42a51b1c659a
 ---> e394066644bf
Step 4/10 : RUN pip install -r requirements.txt
 ---> Running in 78caa2bd9937
Collecting sklearn
  Downloading sklearn-0.0.tar.gz (1.1 kB)
Collecting seldon-core
  Downloading seldon_core-1.2.2-py3-none-any.whl (108 kB)
Collecting scikit-learn
  Downloading scikit_learn-0.23.2-cp37-cp37m-manylinux1_x86_64.whl (6.8 MB)
Collecting gunicorn<20.1.0,>=19.9.0
  Downloading gunicorn-20.0.4-py2.py3-none-any.whl (77 kB)
Collecting flatbuffers<2.0.0
  Downloading flatbuffers-1.12-py2.py3-none-any.whl (15 kB)
Collecting Flask-cors<4.0.0
  Downloading Flask_Cors-3.0.8-py2.py3-none-any.whl (14 kB)
Collecting protobuf<4.0.0
  Downloading protobuf-3.12.4-cp37-cp37m-manylinux1_x86_64.whl (1.3 MB)
Collecting opentracing<2.4.0,>=2.2.0
  Downloadi

In [11]:
!docker run --name "iris_predictor" -d --rm -p 5001:5000 localhost:5000/iris_ex:latest

37d79dbbf6061564bb0e2db262db1dd9833da7baea988b8265a9a6441de8a9ea


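You could also remove the `-d` argument from the above command and run it in a separate window to see the log output while sending requests to the endpoint. Test the endpoint with the gRPC request below! The commented-out curl command is the REST form of the same request, for images built with a REST `API_TYPE`. 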

In [27]:
import numpy as np
import grpc 
from seldon_core.proto import prediction_pb2
from seldon_core.proto import prediction_pb2_grpc

# !curl -s http://localhost:5001/predict -H "Content-Type: application/json" -d '{"data":{"ndarray":[[5.964,4.006,2.081,1.031]]}}'

data = np.array([[5.964,4.006,2.081,1.031]])

datadef = prediction_pb2.DefaultData(
    tensor=prediction_pb2.Tensor(shape=data.shape, values=data.flatten())
)
request = prediction_pb2.SeldonMessage(data=datadef)
with grpc.insecure_channel("localhost:5001") as channel:
    stub = prediction_pb2_grpc.ModelStub(channel)
    response = stub.Predict(request=request)
print(response)

meta {
}
data {
  names: "t:0"
  names: "t:1"
  names: "t:2"
  tensor {
    shape: 1
    shape: 3
    values: 0.9548873249364059
    values: 0.04505474761562512
    values: 5.792744796895372e-05
  }
}
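
The response carries the prediction as a flat `values` list plus a `shape`, the same layout the request used. Below is a minimal sketch of unpacking it with plain Python. The class-name order is the one scikit-learn's iris dataset uses (`setosa`, `versicolor`, `virginica`); `reshape` and `most_likely` are helper names of my own, not part of seldon-core.

```python
CLASS_NAMES = ["setosa", "versicolor", "virginica"]  # order of iris.target_names


def reshape(values, shape):
    """Rebuild rows from a flat tensor with a 2-D shape [rows, cols]."""
    rows, cols = shape
    if rows * cols != len(values):
        raise ValueError("shape does not match the number of values")
    return [values[r * cols:(r + 1) * cols] for r in range(rows)]


def most_likely(row):
    """Return (class name, probability) for the highest-probability class."""
    idx = max(range(len(row)), key=row.__getitem__)
    return CLASS_NAMES[idx], row[idx]


# Values and shape copied from the response printed above.
values = [0.9548873249364059, 0.04505474761562512, 5.792744796895372e-05]
rows = reshape(values, [1, 3])
predictions = [most_likely(row) for row in rows]
print(predictions)
# [('setosa', 0.9548873249364059)]
```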



If you see successful output, you have your first seldon-core-microservice up and running! Now, we will deploy this as a simple inference graph on our kubernetes cluster. 
First, let's take down the running docker container:

In [13]:
!docker container rm iris_predictor --force

iris_predictor

Next, we need to define our deployment configuration file. Here is a Seldon config file for our deployment: 

In [14]:
%%writefile iris_classifier/sklearn_iris_deployment.yaml
#hide_output
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: seldon-deployment-example
spec:
  name: sklearn-iris-deployment
  predictors:
  - componentSpecs:
    - spec:
        containers:
        - image: seldonio/sklearn-iris:0.1
          imagePullPolicy: IfNotPresent
          name: sklearn-iris-classifier
    graph:
      children: []
      endpoint:
        type: REST
      name: sklearn-iris-classifier
      type: MODEL
    name: sklearn-iris-predictor
    replicas: 1

Overwriting iris_classifier/sklearn_iris_deployment.yaml
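
Since `kubectl apply -f` also accepts JSON, the same manifest can be generated from Python when you want to swap in a different image or endpoint type. This is a sketch rather than part of the original notebook: `build_manifest` is a hypothetical helper, and the field values are copied from the YAML above. The example call plugs in the image built earlier in this post together with its gRPC API type.

```python
import json


def build_manifest(image="seldonio/sklearn-iris:0.1", endpoint_type="REST"):
    """Return the SeldonDeployment above as a dict, with two fields adjustable."""
    container_name = "sklearn-iris-classifier"
    return {
        "apiVersion": "machinelearning.seldon.io/v1alpha2",
        "kind": "SeldonDeployment",
        "metadata": {"name": "seldon-deployment-example"},
        "spec": {
            "name": "sklearn-iris-deployment",
            "predictors": [
                {
                    "componentSpecs": [
                        {
                            "spec": {
                                "containers": [
                                    {
                                        "image": image,
                                        "imagePullPolicy": "IfNotPresent",
                                        "name": container_name,
                                    }
                                ]
                            }
                        }
                    ],
                    "graph": {
                        "children": [],
                        "endpoint": {"type": endpoint_type},
                        # Must match the container name in componentSpecs.
                        "name": container_name,
                        "type": "MODEL",
                    },
                    "name": "sklearn-iris-predictor",
                    "replicas": 1,
                }
            ],
        },
    }


manifest = build_manifest(image="localhost:5000/iris_ex:latest", endpoint_type="GRPC")
print(json.dumps(manifest, indent=2))
```

Writing the printed JSON to a file and passing it to `kubectl apply -f` would be one way to use it.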


Some important notes about the deployment config: 
* apiVersion: routes our request to the appropriate endpoint of the Kubernetes API, which was installed by helm earlier in this tutorial
* kind: tells Kubernetes what kind of resource to create
* metadata: adds labels, like name, to the deployment
* spec: 
    * predictors: this is a list of predictors to deploy. It is a list because you have the option to create multiple inference graphs in the same spec. This is useful for things like canary deployments, where you only want a new graph to receive a small percentage of traffic
        * componentSpecs: information about the containers that need to be pulled to create our graph. In our case, we only need a single container to serve our model. If we were creating a more complex inference graph (maybe with a transformer, a router, and another model), we would need to include the docker containers that house them in this section. Note that the config above pulls the prebuilt `seldonio/sklearn-iris:0.1` image; to serve the image built in this post, set `image` to `localhost:5000/iris_ex:latest`
        * graph: this is where you define the flow of components. This is easy in our case: there is only one component, so we define one endpoint with no children. If there were more components, we would list them in the children attribute of the head of the graph. Seldon graphs are built implicitly through the children attribute of each node. The `endpoint.type` of a node (`REST` here) should match the API its container serves, so an image built with `API_TYPE GRPC` needs `type: GRPC`
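
To see how `children` defines the graph, the sketch below walks a dict in the same nested form as the `graph` section and lists the components from the head of the graph downward. The two-node graph is a made-up example (the `feature-transformer` name is hypothetical) and is not part of this deployment.

```python
def walk(node, depth=0):
    """Yield (depth, name, type) for a node and all of its descendants."""
    yield depth, node["name"], node["type"]
    for child in node.get("children", []):
        yield from walk(child, depth + 1)


# The single-node graph used in this post.
single = {"name": "sklearn-iris-classifier", "type": "MODEL", "children": []}

# A hypothetical two-node graph: a transformer placed in front of the model.
chained = {"name": "feature-transformer", "type": "TRANSFORMER", "children": [single]}

for depth, name, kind in walk(chained):
    print("  " * depth + f"{name} ({kind})")
# feature-transformer (TRANSFORMER)
#   sklearn-iris-classifier (MODEL)
```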
        
There is one last step before we can deploy our graph: we must push our docker image to a registry! I am running a local registry with my kind cluster, thanks to the script given [here](https://kind.sigs.k8s.io/docs/user/local-registry/). You can also push to DockerHub. 

In [15]:
!docker push localhost:5000/iris_ex:latest

The push refers to repository [localhost:5000/iris_ex]

cf9a18a9: Preparing
092e542c: Preparing
63f2d025: Preparing
f01300cf: Preparing
a0be9040: Preparing
1a837902: Preparing
cf9a18a9: Pushed

With our docker image in a registry, it is available to our cluster, so we can deploy!

In [16]:
!kubectl apply -f iris_classifier/sklearn_iris_deployment.yaml
from time import sleep
sleep(5)  # give the cluster some time to get the deployment running before checking the rollout

seldondeployment.machinelearning.seldon.io/seldon-deployment-example created
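
A fixed `sleep(5)` is a guess at how long the cluster needs. A more robust pattern is to poll until a condition holds or a timeout expires. The sketch below is generic: `wait_until` is a hypothetical helper, and the demo predicate is a fake that becomes true on its third check. In practice the predicate could shell out to the `kubectl get deploy -l seldon-deployment-id=seldon-deployment-example` query used in the next cell and check that it returns a deployment.

```python
import time


def wait_until(predicate, timeout=60.0, interval=1.0):
    """Poll predicate() until it returns True or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Demo with a fake condition that succeeds on the third check.
calls = {"n": 0}


def fake_ready():
    calls["n"] += 1
    return calls["n"] >= 3


ready = wait_until(fake_ready, timeout=5.0, interval=0.01)
print(ready, calls["n"])
# True 3
```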


You can check the status of your deployment. 

In [17]:
!kubectl rollout status deploy/$(kubectl get deploy -l seldon-deployment-id=seldon-deployment-example \
                                 -o jsonpath='{.items[0].metadata.name}')

Waiting for deployment "seldon-92a927e5e90d7602e08ba9b9304f70e8" rollout to finish: 0 of 1 updated replicas are available...
deployment "seldon-92a927e5e90d7602e08ba9b9304f70e8" successfully rolled out


Once the deployment is ready, you will need to port-forward the pod to your localhost in order to send it a request. That can be done with the `kubectl port-forward` command:
```bash 
kubectl port-forward $(kubectl get pods -l seldon-app=seldon-deployment-example-sklearn-iris-predictor -o jsonpath='{.items[0].metadata.name}') 9000:9000
```

You must run this command in a separate window because it needs to keep running while we send requests to the endpoint. 
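
Because the deployment config above declares a REST endpoint, the commented-out curl command in the next cell can also be written with Python's standard library. This is a sketch under assumptions: it targets `http://localhost:9000/predict` (the URL from that curl command), it needs the port-forward to be running, and `build_payload` and `post_predict` are hypothetical helper names. Only the payload is built when the block runs; the network call is left for you to invoke.

```python
import json
import urllib.request


def build_payload(rows):
    """Encode a batch of feature rows as a Seldon `ndarray` JSON body."""
    return json.dumps({"data": {"ndarray": rows}}).encode("utf-8")


def post_predict(rows, url="http://localhost:9000/predict", timeout=10):
    """POST the rows to the endpoint and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=build_payload(rows),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


payload = build_payload([[5.964, 4.006, 2.081, 1.031]])
print(payload.decode("utf-8"))
```

With the port-forward active, `post_predict([[5.964, 4.006, 2.081, 1.031]])` would send the same request as the curl command.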

In [45]:
# dir(prediction_pb2_grpc) 

In [47]:
import numpy as np
import grpc 
from seldon_core.proto import prediction_pb2
from seldon_core.proto import prediction_pb2_grpc

# !curl -s http://localhost:9000/predict -H "Content-Type: application/json" -d '{"data":{"ndarray":[[5.964,4.006,2.081,1.031]]}}'

data = np.array([[5.964,4.006,2.081,1.031]])

datadef = prediction_pb2.DefaultData(tensor=prediction_pb2.Tensor(shape=data.shape, values=data.flatten()))
request = prediction_pb2.SeldonMessage(data=datadef)
with grpc.insecure_channel("localhost:9000") as channel:  # a gRPC target is host:port, with no URL path
    stub = prediction_pb2_grpc.ModelStub(channel)
    print(dir(stub))
    response = stub.Predict(request=request)
print(response)

['Predict', 'SendFeedback', '__class__', '__delattr__', '__dict__', '__dir__', '__doc__', '__eq__', '__format__', '__ge__', '__getattribute__', '__gt__', '__hash__', '__init__', '__init_subclass__', '__le__', '__lt__', '__module__', '__ne__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', '__setattr__', '__sizeof__', '__str__', '__subclasshook__', '__weakref__']


_InactiveRpcError: <_InactiveRpcError of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "failed to connect to all addresses"
	debug_error_string = "{"created":"@1596626969.097455000","description":"Failed to pick subchannel","file":"src/core/ext/filters/client_channel/client_channel.cc","file_line":3962,"referenced_errors":[{"created":"@1596626969.097454000","description":"failed to connect to all addresses","file":"src/core/ext/filters/client_channel/lb_policy/pick_first/pick_first.cc","file_line":394,"grpc_status":14}]}"
>

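If the request returns a prediction like the one from the local container, you have successfully created a Seldon endpoint on Kubernetes! The `UNAVAILABLE` error above means the gRPC client could not connect. Check that the port-forward is still running, and remember that this deployment declares a REST endpoint, so the commented-out curl request is the one to try against it. 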

In [48]:
## Cleanup
!kubectl delete -f iris_classifier/sklearn_iris_deployment.yaml


seldondeployment.machinelearning.seldon.io "seldon-deployment-example" deleted


### Conclusion 
In this quick example, we scratched the surface of seldon-core by deploying a simple model endpoint on kubernetes. 
If you are hungry for more, check out more of the posts in the [Seldon Super Series](). There, you can find notebooks similar to this one that deploy more complex inference graphs, or dive into the underlying Kubernetes concepts that Seldon uses to make this possible! 