# E2E example with logging

In a real production setup a lot of other concerns need to be handled as well:

- hyperparameter optimization
- high availability
- fast serving
- authorization
- logging
- monitoring of quality over time

Here, no standard has evolved yet.

Kubernetes looks like the winner for hosting any modern IT workload. However, it is still unclear which serving infrastructure tailored to ML use cases will prevail on top of k8s.
Two frameworks that look interesting:
- https://www.kubeflow.org
- http://clipper.ai/

> Do not get intimidated by the ecosystem: https://landscape.cncf.io 

Have a look at them after the course.
Here, I will show you something a bit simpler that still contains some of the most important components. For a more detailed explanation see https://www.youtube.com/watch?v=K0hg6o9MWKQ

It is based on:

- https://github.com/ThoughtWorksInc/ml-cd-starter-kit
- https://github.com/ThoughtWorksInc/ml-app-template


## app template

The template can be found at https://github.com/lecturing/DHBW-DS101-ml-app-template and locally at:

In [None]:
%ls ../advanced/ml-app-template/

In a more complex case, additional logging or serving customizations might be required.

Train the model inside `../advanced/ml-app-template/`:

```bash
SHOULD_USE_MLFLOW=false python src/train.py
```

Then just look at the code from Jupyter (actually running it only works well from the command line):

In [None]:
%load ../advanced/ml-app-template/src/app.py

To start the custom server locally execute:

```bash
python src/app_logging_simple.py
```

Alternatively, via Docker:

```bash
docker build . -t ml-app-template
docker run -it  -v "$(pwd)":/home/ml-app-template \
                -p 5555:5555 \
                -p 8888:8888 \
                ml-app-template bash
# inside the container shell:
python src/app_logging_simple.py
```
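
To give a feeling for what such a server does, here is a minimal sketch using only the Python standard library. It is not the code of `src/app_logging_simple.py`: the handler class, the `dummy_predict` placeholder and the `prediction` key in the response are assumptions for illustration. Only the hello-world response on `/`, the `/predict` route and the port 5555 are taken from this notebook.

```python
import json
import logging
from http.server import BaseHTTPRequestHandler, HTTPServer

logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")
logger = logging.getLogger("ml-app-sketch")


def dummy_predict(features):
    """Hypothetical stand-in for a trained model: mean of the numeric inputs."""
    values = [float(v) for v in features.values()]
    return sum(values) / len(values) if values else 0.0


class PredictionHandler(BaseHTTPRequestHandler):
    def _send_json(self, status, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        # health check, same response as described in this notebook
        if self.path == "/":
            self._send_json(200, {"response": "hello world!"})
        else:
            self._send_json(404, {"error": "not found"})

    def do_POST(self):
        if self.path != "/predict":
            self._send_json(404, {"error": "not found"})
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            features = json.loads(self.rfile.read(length))
            prediction = dummy_predict(features)
        except (ValueError, TypeError, AttributeError) as exc:
            logger.warning("bad request: %s", exc)
            self._send_json(400, {"error": "invalid JSON payload"})
            return
        # log inputs and output so predictions can be inspected later
        logger.info("features=%s prediction=%s", features, prediction)
        self._send_json(200, {"prediction": prediction})

    def log_message(self, format, *args):
        # route the default access log through the logging module
        logger.info("%s - %s", self.address_string(), format % args)


def make_server(host="127.0.0.1", port=5555):
    """Create the HTTP server; call .serve_forever() on the result to run it."""
    return HTTPServer((host, port), PredictionHandler)
```

Calling `make_server().serve_forever()` would start the sketch on port 5555. Logging every request together with its prediction is the part that matters here, since it is the basis for monitoring quality over time.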

Now you can query the model for predictions:

- to test that it works, open http://localhost:5555; you should see: `{"response": "hello world!"}`
- to get an actual prediction, send an HTTP POST request to your API
- do not forget to look at the logs and the LIME interpretation

In [None]:
!curl --request POST "http://localhost:5555/predict" \
     --header "Content-Type: application/json" \
     --data '{ "AGE": 65.2, "B": 396.9, "CHAS": 0, "CRIM": 0.00632, "DIS": 4.09, "INDUS": 2.31, "LSTAT": 4.98, "NOX": 0.538, "PTRATIO": 15.3, "RAD": 1.0, "RM": 16.575, "TAX": 296, "ZN": 18}'
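
The same request can also be sent from Python. The following is a minimal sketch using only the standard library; the helper names `build_predict_request` and `predict` are made up for this example, and it assumes the server from above is listening on `localhost:5555` and answers with JSON.

```python
import json
import urllib.request

# same feature values as in the curl call above
EXAMPLE_FEATURES = {
    "AGE": 65.2, "B": 396.9, "CHAS": 0, "CRIM": 0.00632, "DIS": 4.09,
    "INDUS": 2.31, "LSTAT": 4.98, "NOX": 0.538, "PTRATIO": 15.3,
    "RAD": 1.0, "RM": 16.575, "TAX": 296, "ZN": 18,
}


def build_predict_request(features, base_url="http://localhost:5555"):
    """Build (but do not send) the POST request for the /predict endpoint."""
    body = json.dumps(features).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(features, base_url="http://localhost:5555", timeout=5.0):
    """Send the request and return the decoded JSON answer of the server."""
    request = build_predict_request(features, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server running, `predict(EXAMPLE_FEATURES)` should return the parsed response. Splitting building and sending makes it easy to inspect the request without a running server.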

## running it - creating a prediction

```bash
# password: <<xx>>
# hostname: <<xx>>
ssh student@wwidscluster.dhbw-stuttgart.de \
    -L 8001:localhost:8001 \
    -L 42147:localhost:42147 \
    -L 5000:10.96.201.0:5000
```

Then open:

- http://localhost:8001 # dashboard
    - http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/
    - token: `kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep admin-user | awk '{print $1}')`
- http://localhost:5000 # mlflow
- http://localhost:8099 # grafana
    - user: `admin` password: ``
- http://localhost:4439 # kibana
    - not ready yet
    
- http://localhost:32035 # the APP
    - example to create prediction below
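
Before opening the URLs in a browser, it can help to check which of the ports are actually reachable on your machine. This is a small sketch under the assumption that the services are exposed on `localhost` with the ports from the list above; the function names are made up for this example.

```python
import socket

# ports as listed above
FORWARDED_SERVICES = {
    "dashboard": 8001,
    "mlflow": 5000,
    "grafana": 8099,
    "kibana": 4439,
    "app": 32035,
}


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(services, host="localhost"):
    """Map each service name to whether its port is reachable."""
    return {name: is_port_open(host, port) for name, port in services.items()}
```

`check_services(FORWARDED_SERVICES)` returns a dictionary such as `{"dashboard": True, ...}`. A `False` entry usually means that the port is not forwarded by the ssh command or that the service is not running.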

## setup (not relevant for the course)

## minikube (mac)

- connect to the VPN (maybe not required when directly on campus)
  - on a mac: https://tunnelblick.net
  - https://www.dhbw-stuttgart.de/themen/einrichtungen/itservice-center/informationen-fuer-studierende/wlan-vpn-zugang/

On a mac install:

- homebrew https://brew.sh/index_de
- minikube `brew cask install minikube`
- hypervisor `brew install hyperkit`
- helm `brew install kubernetes-helm`

```bash
# on a mac (hyperkit VM):
minikube start --kubernetes-version v1.15.4 --vm-driver=hyperkit --cpus 6 --memory 8192 --bootstrapper=kubeadm --extra-config=apiserver.authorization-mode=RBAC
# alternatively without a VM (the none driver runs directly on a Linux host):
minikube start --kubernetes-version v1.15.4 --vm-driver=none --bootstrapper=kubeadm --extra-config=apiserver.authorization-mode=RBAC

# see that it is working
minikube dashboard

# install and configure helm
kubectl --namespace kube-system create serviceaccount tiller
kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller

helm init

kubectl create clusterrolebinding default-cluster-rule --clusterrole=cluster-admin --serviceaccount=default:default

# first you need to disable some services if you do not have enough resources
# https://github.com/ThoughtWorksInc/ml-cd-starter-kit/blob/master/docs/minikube.md
# helm install --name ml-cd-starter-kit .

kubectl get services

minikube tunnel
```

## server

- https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/install-kubeadm/
- https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/

```bash
yum install -y kubelet-1.15.4 kubeadm-1.15.4 kubectl-1.15.4 --disableexcludes=kubernetes

# fix various bugs (with warnings)
kubeadm init

# pre-pull the control plane images
kubeadm config images pull --kubernetes-version=v1.15.4


kubeadm init --kubernetes-version=v1.15.4 --pod-network-cidr=10.244.0.0/16
kubectl apply -f https://raw.githubusercontent.com/coreos/flannel/2140ac876ef134e0ed5af15c65e414cf26827915/Documentation/kube-flannel.yml

kubectl taint nodes --all node-role.kubernetes.io/master-

#######################
# does not work:
# kubectl create clusterrolebinding kubernetes-dashboard --clusterrole=cluster-admin --serviceaccount=kube-system:kubernetes-dashboard
#
# works: create a dedicated admin user as described in
# https://medium.com/@kanrangsan/creating-admin-user-to-access-kubernetes-dashboard-723d6c9764e4
#######################

kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-beta4/aio/deploy/recommended.yaml

kubectl -n kube-system -l=k8s-app=kube-dns get pods
kubectl get svc --all-namespaces

kubectl -n kubernetes-dashboard describe secret $(kubectl -n kubernetes-dashboard get secret | grep admin-user | awk '{print $1}')


kubectl proxy

# then open:
# http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/

# ----------------
# create the admin user: dashboard-adminuser.yml
cat <<'EOF' > dashboard-adminuser.yml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: admin-user
  namespace: kube-system
EOF
kubectl apply -f dashboard-adminuser.yml

# bind it to the cluster-admin role: admin-role-binding.yml
cat <<'EOF' > admin-role-binding.yml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: admin-user
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: cluster-admin
subjects:
- kind: ServiceAccount
  name: admin-user
  namespace: kube-system
EOF
kubectl apply -f admin-role-binding.yml

# print the login token of the admin user
kubectl -n kube-system describe secret $(kubectl -n kube-system get secret | grep admin-user | awk '{print $1}')
# ----------------
# then open:
# http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/
```
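
The two manifests for the `admin-user` can also be generated programmatically, which is handy when the name or namespace should be a parameter. The sketch below builds both objects as plain dictionaries and prints them as one JSON list; the helper function names are made up for this example, the field values are the ones from the manifests above.

```python
import json


def service_account(name, namespace):
    """ServiceAccount manifest as a plain dictionary."""
    return {
        "apiVersion": "v1",
        "kind": "ServiceAccount",
        "metadata": {"name": name, "namespace": namespace},
    }


def cluster_admin_binding(name, namespace):
    """ClusterRoleBinding that grants cluster-admin to the service account."""
    return {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "ClusterRoleBinding",
        "metadata": {"name": name},
        "roleRef": {
            "apiGroup": "rbac.authorization.k8s.io",
            "kind": "ClusterRole",
            "name": "cluster-admin",
        },
        "subjects": [
            {"kind": "ServiceAccount", "name": name, "namespace": namespace}
        ],
    }


manifests = [
    service_account("admin-user", "kube-system"),
    cluster_admin_binding("admin-user", "kube-system"),
]
# wrap both objects in a List so that they can be applied in one go
print(json.dumps({"apiVersion": "v1", "kind": "List", "items": manifests}, indent=2))
```

Assuming the script is saved as `make_admin_user.py` (a hypothetical file name), the output could be piped into `kubectl apply -f -`, since kubectl accepts JSON as well as YAML.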

helm setup

```bash

# install and configure helm
kubectl --namespace kube-system create serviceaccount tiller
kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller

helm init

kubectl create clusterrolebinding default-cluster-rule --clusterrole=cluster-admin --serviceaccount=default:default
```

debugging tiller

```bash
helm status

kubectl -n kube-system get po
```

installing the ML things:

```bash
kubectl patch deploy --namespace kube-system tiller-deploy -p '{"spec":{"template":{"spec":{"serviceAccount":"tiller"}}}}'
helm install --name ml-cd-starter-kit .

# look up the external IP of the server; it is used as externalIPs in the patches below
curl ifconfig.me

kubectl patch svc ml-cd-starter-kit-elasticsearch-client -n default -p '{"spec": {"type": "LoadBalancer", "externalIPs":["141.31.111.79"]}}'
kubectl patch svc ml-cd-starter-kit-fluentd -n default -p '{"spec": {"type": "LoadBalancer", "externalIPs":["141.31.111.79"]}}'
kubectl patch svc ml-cd-starter-kit-gocd-server -n default -p '{"spec": {"type": "LoadBalancer", "externalIPs":["141.31.111.79"]}}'
kubectl patch svc ml-cd-starter-kit-grafana -n default -p '{"spec": {"type": "LoadBalancer", "externalIPs":["141.31.111.79"]}}'
kubectl patch svc ml-cd-starter-kit-kibana -n default -p '{"spec": {"type": "LoadBalancer", "externalIPs":["141.31.111.79"]}}'
kubectl patch svc ml-cd-starter-kit-ml-app-template -n default -p '{"spec": {"type": "LoadBalancer", "externalIPs":["141.31.111.79"]}}'
kubectl patch svc ml-cd-starter-kit-mlflow -n default -p '{"spec": {"type": "LoadBalancer", "externalIPs":["141.31.111.79"]}}'


```
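
The seven patch commands differ only in the service name, so they can be generated instead of typed by hand. The following sketch builds the JSON payload with `json.dumps`, which avoids quoting mistakes; the function name `patch_command` is made up for this example, while the service names, namespace and IP are the ones used above.

```python
import json
import shlex

SERVICES = [
    "elasticsearch-client", "fluentd", "gocd-server", "grafana",
    "kibana", "ml-app-template", "mlflow",
]


def patch_command(service, external_ip, release="ml-cd-starter-kit",
                  namespace="default"):
    """Return the kubectl argument list that exposes one service externally."""
    payload = json.dumps(
        {"spec": {"type": "LoadBalancer", "externalIPs": [external_ip]}}
    )
    return ["kubectl", "patch", "svc", f"{release}-{service}",
            "-n", namespace, "-p", payload]


# print the commands so that they can be reviewed before running them
for service in SERVICES:
    print(shlex.join(patch_command(service, "141.31.111.79")))
```

The argument lists could be passed to `subprocess.run(..., check=True)` instead of being printed. `shlex.join` requires Python 3.8 or newer.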

The local cluster works, but does not have persistent volumes configured.
For the sake of brevity, I will switch to Google Cloud instead of using the local one.

setup in gcloud:

```bash
brew cask install google-cloud-sdk

# create a new project in the UI
# enable kubernetes API
gcloud init
# select that project

# provision cluster on GCP
gcloud container clusters create my-cluster --zone europe-west3-a
# note: 
# - you may have to enable Kubernetes Engine API for your project in the GCP console. If you have not done so, running the command above will provide a link for you to do so.
# - if you are new to Google Cloud Platform, you might need to upgrade your account when prompted.

# create tiller service account and give tiller access to default namespace
kubectl --namespace kube-system create serviceaccount tiller
kubectl create clusterrolebinding tiller-cluster-rule --clusterrole=cluster-admin --serviceaccount=kube-system:tiller

# initialize helm on k8s cluster (install tiller into the cluster)
helm init --service-account tiller

# give gocd service account access to run kubectl commands to deploy to staging and prod
kubectl create clusterrolebinding default-cluster-rule --clusterrole=cluster-admin --serviceaccount=default:default

# wait for tiller-deploy pod to be ready
kubectl get pods,services --all-namespaces
# mac users can `brew install watch` and run:
# watch kubectl get pods,services --all-namespaces # (hit Ctrl+C to exit)

helm install --name ml-cd-starter-kit .
# note: you can replace ml-cd-starter-kit with the name of your release if you want.
# if you do that, you have to replace `ml-cd-starter-kit` with the name of your release in ./values.yaml: `elasticsearch.url: http://YOUR_RELEASE_NAME-elasticsearch-client:9200`
# also, you'll need to replace `ml-cd-starter-kit` with your release name in ml-app-template/ci.gocd.yaml

watch kubectl get pods,services --all-namespaces # (hit Ctrl+C to exit)

# create a new release for our 'production' app (ml-app-template)
# the first helm install command installed the app as our 'staging' app
cd charts/ml-app-template
helm install --name ml-cd-starter-kit-prod -f values.yaml .


# dashboard
kubectl apply -f https://raw.githubusercontent.com/kubernetes/dashboard/v2.0.0-beta4/aio/deploy/recommended.yaml

gcloud container clusters get-credentials my-cluster --zone europe-west3-a --project dhbw-ds101-19
gcloud config config-helper --format=json | jq -r '.credential.access_token'

kubectl proxy
# then open:
# http://localhost:8001/api/v1/namespaces/kubernetes-dashboard/services/https:kubernetes-dashboard:/proxy/


### cleanup

helm delete ml-cd-starter-kit
kubectl delete pvc -l release=ml-cd-starter-kit,component=data
gcloud container clusters delete my-cluster --zone europe-west3-a
```
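
If `jq` is not installed, the access token can be extracted with a few lines of Python instead. This is a sketch that mirrors the `gcloud config config-helper --format=json | jq -r '.credential.access_token'` call from above; the function names are made up for this example, and `fetch_access_token` assumes that `gcloud` is installed and initialized.

```python
import json
import subprocess


def extract_access_token(config_helper_json):
    """Equivalent of the jq filter '.credential.access_token'."""
    return json.loads(config_helper_json)["credential"]["access_token"]


def fetch_access_token():
    """Run gcloud and return the access token of the active account."""
    result = subprocess.run(
        ["gcloud", "config", "config-helper", "--format=json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return extract_access_token(result.stdout)
```

Keeping the parsing in its own function allows testing it with a saved JSON document, without calling `gcloud`.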