# DS in Production
In this notebook we will show how to turn a scikit-learn model into a REST API hosted in a Docker container.

## But, why? 
Docker lets us package a model and its API into a single image that, if it runs on your laptop, will run anywhere Docker is available, regardless of whether the host runs Windows, Linux, or macOS, or whether it has 1 CPU or 32. In other words, it gives us an ideal starting point for testing a model in production.

## Prerequisites
Install:
* Docker (https://www.docker.com/products/docker-desktop)
* Anaconda, or a Python environment with:
    * scikit-learn
    * flask


Tutorials:
* https://docs.docker.com/get-started/
* https://scikit-learn.org/stable/tutorial/index.html
* http://flask.pocoo.org/docs/1.0/tutorial/

## Overview of this Notebook
We'll start by introducing the three components we are going to use:
* Scikit-learn
* Flask
* Docker


We will start by building a simple model. Scikit-learn is a library which makes it really easy to do so. Next, we need a way to expose this model to the world. To do this, we will build a REST API using Flask, so that we can serve predictions over HTTP. Finally, we will use Docker to build a package (container) of our solution, readying it for deployment in the cloud.

# Scikit-Learn Iris
Iris is a dataset included in Scikit-learn which is used in many tutorials. It's a basic dataset that contains measurements of three different species of Iris. With it, we will train a k-nearest neighbors (KNN) classifier that takes a set of measurements as input and predicts which Iris species it is likely to be.

Let's start by loading the dataset:

In [None]:
from sklearn import datasets

iris = datasets.load_iris()
dir(iris)

As you can see, the iris dataset object exposes several attributes (the exact list depends on your scikit-learn version). Let's print `feature_names` and `target_names`.

In [None]:
print(iris.feature_names)
print(iris.target_names)

We'll need to build the KNN based on these 4 features: sepal length/width and petal length/width.
With it, we predict which of the three Iris species a plant belongs to.

OK, let's look at the first 10 rows of the data and the target labels.

In [None]:
print(iris.data[:10])

In [None]:
print(iris.target[:10])

We call the data X, and the target y. 
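To make the X and y naming concrete, here is a minimal sketch that inspects both arrays. The variable names `X` and `y` are used here only for illustration; the rest of the notebook calls them `iris_X` and `iris_y`.

```python
from sklearn import datasets

iris = datasets.load_iris()
X, y = iris.data, iris.target

# 150 samples with 4 features each, and one integer label per sample
print(X.shape)  # (150, 4)
print(y.shape)  # (150,)

# The labels are the integers 0, 1, 2, which index into iris.target_names
print(sorted(set(y.tolist())))  # [0, 1, 2]
```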

Next, let's fit the KNN.
For more information about the KNN classifier, have a look here: https://scikit-learn.org/stable/modules/generated/sklearn.neighbors.KNeighborsClassifier.html

In [None]:
# Split the iris data into train and test data,
# using a random permutation so the split is random
import numpy as np
np.random.seed(0)

iris_X = iris.data
iris_y = iris.target

# Generate some random indices
indices = np.random.permutation(len(iris_X))

# And use them to hold out the last 10 shuffled rows for testing
iris_X_train = iris_X[indices[:-10]]
iris_y_train = iris_y[indices[:-10]]
iris_X_test = iris_X[indices[-10:]]
iris_y_test = iris_y[indices[-10:]]

# Create and fit a nearest-neighbor classifier
from sklearn.neighbors import KNeighborsClassifier
knn = KNeighborsClassifier()
knn.fit(iris_X_train, iris_y_train)
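The manual permutation above can also be expressed with scikit-learn's `train_test_split` helper. The following is a sketch of that alternative, not the approach used in the rest of this notebook. The name `knn_alt` is made up for this example, and the helper will not select the same 10 test rows as the permutation above, so scores can differ.

```python
from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

iris = datasets.load_iris()

# Hold out 10 rows for testing; random_state makes the split reproducible
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, test_size=10, random_state=0
)

# KNeighborsClassifier uses 5 neighbors by default
knn_alt = KNeighborsClassifier()
knn_alt.fit(X_train, y_train)

print(X_train.shape, X_test.shape)
```

Passing an integer as `test_size` requests an absolute number of test rows, which matches the "10 rows for testing" used above.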

So now we have a KNN. Let's use the test set to verify whether the model is any good.

In [None]:
iris_y_pred = knn.predict(iris_X_test)
print(iris_y_pred)

Compare those predictions with the true labels of our test set:

In [None]:
print(iris_y_test)

Scikit-learn comes with scoring built in. Calling `score` gives the mean accuracy on the test set:

In [None]:
knn.score(iris_X_test, iris_y_test)

Good enough for our first deployment. Let's continue to Flask.

# My First Flask App
Flask is a small framework that lets you create a Python web service in only a few lines.
Let's look at this example.

Once it is running, go to http://localhost:8080/

When you're done testing, be sure to click the square (stop) button at the top of the screen, because `app.run` is a blocking call and will never finish on its own.

In [None]:
from flask import Flask
app = Flask(__name__)

# tell flask to map this method to /
@app.route("/")
def hello():
    return "Hello World!"

# this is a blocking call, 
# you need to interrupt the kernel to stop
app.run(port=8080)

As you can see, Flask only requires a couple of lines to get a web service up and running. Its built-in server is not suitable for big production workloads, but for testing a model on a relatively small scale it is more than enough.
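If you would rather check a route without blocking the kernel, Flask's built-in test client can send requests to the app in-process. The sketch below rebuilds the hello-world app under the made-up name `demo_app` so that it stands on its own.

```python
from flask import Flask

demo_app = Flask(__name__)

@demo_app.route("/")
def hello():
    return "Hello World!"

# The test client calls the app directly; no port is opened and nothing blocks
client = demo_app.test_client()
response = client.get("/")
print(response.status_code, response.get_data(as_text=True))
```

This is only a convenience for quick checks; it does not replace opening the URL against a running server.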

Let's combine Flask and scikit-learn and see how we can expose our trained KNN to the world.

# Combining Flask + Scikit

In [None]:
def train_model():
    from sklearn import datasets

    iris = datasets.load_iris()

    # Split the iris data into train and test data,
    # using a random permutation so the split is random
    import numpy as np
    np.random.seed(0)

    iris_X = iris.data
    iris_y = iris.target

    # Generate some random indices
    indices = np.random.permutation(len(iris_X))

    # And use them to use 10 rows for testing
    iris_X_train = iris_X[indices[:-10]]
    iris_y_train = iris_y[indices[:-10]]
    iris_X_test = iris_X[indices[-10:]]
    iris_y_test = iris_y[indices[-10:]]

    # Create and fit a nearest-neighbor classifier
    from sklearn.neighbors import KNeighborsClassifier
    knn = KNeighborsClassifier()
    knn.fit(iris_X_train, iris_y_train)
    
    return knn

This is the same code as we used before, but wrapped in a small function. Next, we extend the Flask code so it can serve predictions from the model.

Same as before, stop with the square icon.

Test it out with the following url:
http://localhost:8080/5.9/3.2/4.8/1.8

In [None]:
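from flask import Flask, jsonify
from sklearn import datasets

app = Flask(__name__)
knn = train_model()
# Load the dataset here for its label names, so this cell
# does not depend on the `iris` variable from an earlier cell
iris = datasets.load_iris()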

@app.route("/")
def index():
    return "<html>Welcome to the worlds best Iris predictor</html>"

@app.route("/<float:s_length>/<float:s_width>/<float:p_length>/<float:p_width>")
def predict(s_length, s_width, p_length, p_width):
    result = knn.predict([[s_length, s_width, p_length, p_width]])
    return jsonify({"prediction": str(iris.target_names[int(result[0])])})

# this is a blocking call, 
# you need to interrupt the kernel to stop
app.run(port=8080)
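The prediction URL can also be called from Python instead of a browser. Below is a minimal client sketch using only the standard library. The helper names `build_url` and `predict_remote` are made up for this example, and because `app.run` blocks the notebook kernel, the client has to run from a separate notebook or Python process. One detail to be aware of: Flask's `<float:...>` converter only matches values written with a decimal point, so `/5/3/4/1` would give a 404. The sketch therefore formats every value as a float.

```python
import json
from urllib.request import urlopen

def build_url(s_length, s_width, p_length, p_width, base="http://localhost:8080"):
    """Build the prediction URL for four measurements (hypothetical helper)."""
    values = (s_length, s_width, p_length, p_width)
    # float() followed by repr() turns 5 into "5.0", as the float converter requires
    return base + "/" + "/".join(repr(float(v)) for v in values)

def predict_remote(s_length, s_width, p_length, p_width):
    """Call the running API and return the predicted label as a string."""
    url = build_url(s_length, s_width, p_length, p_width)
    with urlopen(url, timeout=5) as resp:
        return json.load(resp)["prediction"]

print(build_url(5.9, 3.2, 4.8, 1.8))  # http://localhost:8080/5.9/3.2/4.8/1.8
print(build_url(5, 3, 4, 1))          # http://localhost:8080/5.0/3.0/4.0/1.0
```

With the server running, `predict_remote(5.9, 3.2, 4.8, 1.8)` returns the value of the `"prediction"` field from the JSON response.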

# Small intro into Docker
Next up, let's wrap it into a Docker container.

First, let's see if we have docker installed:

In [None]:
!docker -v
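The `!` prefix only works inside a notebook. If you want the same check from a plain Python script, a sketch along the following lines should work; the function name `docker_version` is made up for this example.

```python
import shutil
import subprocess

def docker_version():
    """Return the output of `docker -v`, or None if docker is not usable."""
    # shutil.which returns None when the executable is not on the PATH
    if shutil.which("docker") is None:
        return None
    result = subprocess.run(["docker", "-v"], capture_output=True, text=True)
    return result.stdout.strip() if result.returncode == 0 else None

print(docker_version() or "docker was not found on the PATH")
```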

List all running containers

In [None]:
!docker ps

Run a prebuilt container

In [None]:
!docker run debian:latest ls -lh /

Building your own container

In [None]:
%%writefile Dockerfile

# This new container is based on the latest version of debian
FROM debian:latest

# Let's create a directory inside this new container
RUN mkdir /my_container

In [None]:
!docker build -t my_container .

Let's run our own container and see its result

In [None]:
!docker run -t my_container ls -lh /

## Back to the tutorial: let's install our Flask API into a container
First we create a file called `main.py`, which contains the Flask API code

In [None]:
%%writefile main.py
import numpy as np

from sklearn import datasets
from flask import Flask, jsonify

def train_model():
    iris = datasets.load_iris()
    iris_X = iris.data
    iris_y = iris.target

    # Split the iris data into train and test data,
    # using a random permutation so the split is random
    np.random.seed(0)
    indices = np.random.permutation(len(iris_X))
    iris_X_train = iris_X[indices[:-10]]
    iris_y_train = iris_y[indices[:-10]]
    iris_X_test = iris_X[indices[-10:]]
    iris_y_test = iris_y[indices[-10:]]
    # Create and fit a nearest-neighbor classifier
    from sklearn.neighbors import KNeighborsClassifier
    knn = KNeighborsClassifier()
    knn.fit(iris_X_train, iris_y_train)
    
    return knn, iris.target_names

app = Flask(__name__)
knn, labels = train_model()

@app.route("/")
def index():
    return "<html>Welcome to the worlds best Iris predictor</html>"

@app.route("/<float:s_length>/<float:s_width>/<float:p_length>/<float:p_width>")
def predict(s_length, s_width, p_length, p_width):
    result = knn.predict([[s_length, s_width, p_length, p_width]])
    return jsonify({"prediction": str(labels[int(result[0])])})

# This is a blocking call that keeps the container running.
# Binding to 0.0.0.0 makes the server reachable from outside the container.
app.run(port=8080, host="0.0.0.0")

In [None]:
%%writefile Dockerfile_model

# Extend the python 3.6 container
FROM python:3.6-slim
    
# Install flask and scikit-learn
RUN pip install flask scikit-learn

# Copy main.py
ADD main.py /app/main.py

# And set the default command to main.py
CMD ["python", "/app/main.py"]

Now we can build this container. Note that we pass `-f Dockerfile_model`, because we saved the file as `Dockerfile_model` rather than under the default name `Dockerfile`.

In [None]:
!docker build -t my_model_container -f Dockerfile_model .

Start the container, and expose it on port 8080

Go to http://localhost:8080/5.9/3.2/4.8/1.8 to see the results

In [None]:
lines = !docker run -d -p 8080:8080 -t my_model_container
docker_id = lines[0]
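Because `docker run -d` returns immediately, the API may need a moment (the model is trained at startup) before it answers. The sketch below polls the index page until it responds. The function name `wait_until_up` and its default retry settings are assumptions made for this example, not part of Docker or Flask.

```python
import time
from urllib.request import urlopen

def wait_until_up(url, attempts=20, delay=0.5):
    """Poll url until it answers with HTTP 200 or the attempts run out."""
    for _ in range(attempts):
        try:
            with urlopen(url, timeout=2) as resp:
                if resp.status == 200:
                    return True
        except OSError:
            # Covers refused connections and HTTP errors; retry after a pause
            pass
        time.sleep(delay)
    return False
```

For example, `wait_until_up("http://localhost:8080/")` returns `True` once the container serves the welcome page, and `False` if it does not respond within the given number of attempts.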

Finally, let's stop the container

In [None]:
!docker rm -f $docker_id