# Deploy First Microservice
> Deploy your first microservice on kubernetes!

- toc: true 
- badges: true
- comments: true
- categories: [kubernetes, docker]

### Pre-reqs
* [Build a Seldon-core Microservice](https://ntorba.github.io/writing/seldon-core/docker/2020/07/30/first-seldon-core-microservice.html)
* [Launch a local kubernetes cluster](https://ntorba.github.io/writing/jupyter/2020/07/17/local-kubernetes.html)
    * or any other kubernetes cluster readily available


### Goal
* Deploy our docker image from [Build a Seldon-core Microservice](https://ntorba.github.io/writing/seldon-core/docker/2020/07/30/first-seldon-core-microservice.html) onto kubernetes!

#### Steps
1. Define SeldonDeployment yaml file 
2. `kubectl apply` SeldonDeployment to the kubernetes cluster. 

### Define SeldonDeployment yaml file
First, I show a completed SeldonDeployment configuration file. Below the file, I walk through the important details to take note of.
In this example, we are deploying a custom Python component. Seldon-core also provides [pre-packaged model servers](https://docs.seldon.io/projects/seldon-core/en/latest/servers/overview.html), which get models up faster, but the concepts don't transfer as well to more complex inference graphs with other components.

In [None]:
%%writefile iris_classifier/sklearn_iris_deployment.yaml
#hide_output
apiVersion: machinelearning.seldon.io/v1alpha2
kind: SeldonDeployment
metadata:
  name: seldon-deployment-example
spec:
  name: sklearn-iris-deployment
  predictors:
  - componentSpecs:
    - spec:
        containers:
        - image: seldonio/sklearn-iris:0.1
          imagePullPolicy: IfNotPresent
          name: sklearn-iris-classifier
    graph:
      children: []
      endpoint:
        type: REST
      name: sklearn-iris-classifier
      type: MODEL
    name: sklearn-iris-predictor
    replicas: 1

Some important notes about the deployment config:
* apiVersion: selects the API group and version of the Kubernetes API that handles this request. This one was added to the cluster when we installed seldon-core with helm earlier in this tutorial
    * learn more about this in our [kubernetes posts]()
* kind: tells Kubernetes what kind of resource to create.
    * Because we installed seldon-core on our cluster, it recognizes SeldonDeployment as a custom resource
* metadata: identifying information, such as the name, for the deployment
* spec:
    * predictors: this is a list of predictors to deploy. It is a list because you have the option to create multiple inference graphs in the same spec. This is useful for things like canary deployments, where you only want a new graph to receive a small percentage of traffic
        * componentSpecs: add information about the containers that need to be pulled to create our graph. In our case, we only need a single container to serve our model. If we were creating a more complex inference graph (maybe with a transformer, a router, and another model), then we would need to include the docker containers that house them in this section
        * graph: this is where you define the flow of components. This is easy in our case: there is only one component, so we define one endpoint with no children. If there were more components, we would list the child components in the children attribute of the head of the graph. Seldon graphs are built implicitly through the children attribute of each node in the graph.
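
To make the `children` idea concrete, here is a minimal Python sketch that treats a predictor as a plain dict and walks its graph. The `feature-transformer` node and the helper function names are hypothetical, made up for illustration, and are not part of this post's deployment. The sketch also assumes that each graph node name should match a container name under `componentSpecs`, which is how the example YAML above is laid out.

```python
def graph_node_names(node):
    """Return every node name in a graph dict, parent before children."""
    names = [node["name"]]
    for child in node.get("children", []):
        names.extend(graph_node_names(child))
    return names


def container_names(predictor):
    """Collect the container names declared under a predictor's componentSpecs."""
    return [
        container["name"]
        for component in predictor.get("componentSpecs", [])
        for container in component["spec"]["containers"]
    ]


def nodes_without_container(predictor):
    """List graph nodes that have no container with the same name."""
    declared = set(container_names(predictor))
    return [n for n in graph_node_names(predictor["graph"]) if n not in declared]


# The predictor from this post, written as a Python dict.
iris_predictor = {
    "name": "sklearn-iris-predictor",
    "componentSpecs": [
        {"spec": {"containers": [
            {"name": "sklearn-iris-classifier", "image": "seldonio/sklearn-iris:0.1"},
        ]}}
    ],
    "graph": {
        "name": "sklearn-iris-classifier",
        "type": "MODEL",
        "endpoint": {"type": "REST"},
        "children": [],
    },
}

# Hypothetical two-step graph: a transformer at the head, with our model as its child.
two_step_graph = {
    "name": "feature-transformer",
    "type": "TRANSFORMER",
    "children": [
        {"name": "sklearn-iris-classifier", "type": "MODEL", "children": []},
    ],
}

print(graph_node_names(iris_predictor["graph"]))
print(graph_node_names(two_step_graph))
print(nodes_without_container(iris_predictor))
```

The head of the graph comes first and each child follows its parent, which mirrors how requests flow through the components.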
        
There is one last step before we can deploy our graph: we must push our docker image to a registry! I am running a local registry with my kind cluster, thanks to the script given [here](https://kind.sigs.k8s.io/docs/user/local-registry/). You can also push to DockerHub.

In [None]:
!docker push localhost:5000/iris_ex:latest

With our docker image in a registry, it is available to our cluster, so we can deploy!

In [None]:
!kubectl apply -f iris_classifier/sklearn_iris_deployment.yaml
from time import sleep
sleep(5)  # give the cluster some time to create the deployment before checking the rollout

You can check the status of your deployment. 

In [None]:
!kubectl rollout status deploy/$(kubectl get deploy -l seldon-deployment-id=seldon-deployment-example \
                                 -o jsonpath='{.items[0].metadata.name}')
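
The fixed `sleep(5)` after `kubectl apply` is a guess at how long the cluster needs. One alternative is to poll until the deployment shows up. The sketch below is only one way to do that: `wait_until` and `deployment_exists` are hypothetical helper names, and the label selector is copied from the rollout command above.

```python
import subprocess
import time


def wait_until(check, timeout=60.0, interval=2.0, sleep=time.sleep, clock=time.monotonic):
    """Call check() until it returns True or `timeout` seconds have passed.

    Returns True if check() succeeded, False if we gave up.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def deployment_exists(selector="seldon-deployment-id=seldon-deployment-example"):
    """Return True once kubectl lists at least one deployment matching the label selector."""
    result = subprocess.run(
        ["kubectl", "get", "deploy", "-l", selector,
         "-o", "jsonpath={.items[*].metadata.name}"],
        capture_output=True,
        text=True,
    )
    return result.returncode == 0 and result.stdout.strip() != ""
```

Calling `wait_until(deployment_exists)` in place of the fixed sleep returns as soon as the deployment object exists, and returns `False` after the timeout if it never appears. The `kubectl rollout status` command is still what waits for the pods to become ready.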

Once the deployment is ready, you will need to port-forward the pod to your localhost in order to send it a request. That can be done with the `kubectl port-forward` command:
```bash 
kubectl port-forward $(kubectl get pods -l seldon-app=seldon-deployment-example-sklearn-iris-predictor -o jsonpath='{.items[0].metadata.name}') 9000:9000
```

You must run this command in a separate window because it will need to run while we curl the endpoint. 

In [None]:
import numpy as np
import grpc 
from seldon_core.proto import prediction_pb2
from seldon_core.proto import prediction_pb2_grpc


### Test REST endpoint
res = !curl -s http://localhost:9000/predict -H "Content-Type: application/json" -d '{"data":{"ndarray":[[5.964,4.006,2.081,1.031]]}}'
print(res)
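
If you would rather stay in Python than shell out to curl, the same request can be sketched with the standard library. This is a minimal sketch, not the only way to call the endpoint: `build_payload` and `predict` are hypothetical helper names, and the URL assumes the port-forward from above is still running on port 9000.

```python
import json
import urllib.request


def build_payload(rows):
    """Wrap a list of feature rows in the same JSON body the curl command sends."""
    return json.dumps({"data": {"ndarray": rows}})


def predict(rows, url="http://localhost:9000/predict"):
    """POST the rows to the model endpoint and return the decoded JSON response."""
    request = urllib.request.Request(
        url,
        data=build_payload(rows).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the port-forward running in another window, `predict([[5.964, 4.006, 2.081, 1.031]])` sends the same row as the curl command.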

In [None]:
## Cleanup
!kubectl delete -f iris_classifier/sklearn_iris_deployment.yaml


### Conclusion 
In this quick example, we scratched the surface of seldon-core by deploying a simple model endpoint on kubernetes. 
If you are hungry for more, check out more of the posts in the [Seldon Super Series](). There, you can find notebooks similar to this one that deploy more complex inference graphs, or dive into the underlying Kubernetes concepts that seldon runs on top of!

### Next Up
* other seldon components 
* seldon graph construction 
* multi-component inference graph
* operators and custom resources 