# Build an API for our Machine Learning Model
## June 8, 2020 
## About RIHAD VARIAWA
> As a Data Scientist and former head of global fintech research at Malastare.ai, I find fulfillment in tackling challenges and solving complex problems using data

![](https://media.giphy.com/media/U4FkC2VqpeNRHjTDQ5/giphy.gif)

As a data scientist, I want to make an impact with my machine learning models. However, this is easier said than done. A new project usually starts with exploring the data in a **Jupyter notebook**. Once you fully understand the data you’re dealing with and have aligned with the client on what steps to take, one of the outcomes can be a predictive ML model

You get excited and go back to your notebook to make the best model possible. The model and the results are presented and everyone is happy. The client then wants to run the model in their own infrastructure to test whether it really creates the expected impact. Once people can use the model, you also get the feedback needed to improve it step by step. But how can we do this quickly, given that the client has some complicated infrastructure that you might not be familiar with?

For this purpose you need a tool that can fit in their complicated infrastructure, preferably in a language that you’re familiar with. This is where you can use **Flask**

> Flask is a micro web framework written in Python. It can create a *REST API* that allows you to send data, and receive a prediction as a response

## Create the model

Let me show you how this works. For the purpose of demonstration, I will train a simple DecisionTreeClassifier model on an example dataset which can be loaded from the scikit-learn package

In [None]:
import pandas as pd
import numpy as np
from sklearn import tree
from sklearn.datasets import load_wine
from sklearn.metrics import accuracy_score

In [None]:
wine = load_wine()

In [None]:
df = pd.DataFrame(data=np.c_[wine['data'], wine['target']], columns=wine['feature_names'] + ['target'])

df.head()

| | alcohol | malic_acid | ash | alcalinity_of_ash | magnesium | total_phenols | flavanoids | nonflavanoid_phenols | proanthocyanins | color_intensity | hue | od280/od315_of_diluted_wines | proline | target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 14.23 | 1.71 | 2.43 | 15.6 | 127.0 | 2.8 | 3.06 | 0.28 | 2.29 | 5.64 | 1.04 | 3.92 | 1065.0 | 0.0 |
| 1 | 13.2 | 1.78 | 2.14 | 11.2 | 100.0 | 2.65 | 2.76 | 0.26 | 1.28 | 4.38 | 1.05 | 3.4 | 1050.0 | 0.0 |
| 2 | 13.16 | 2.36 | 2.67 | 18.6 | 101.0 | 2.8 | 3.24 | 0.3 | 2.81 | 5.68 | 1.03 | 3.17 | 1185.0 | 0.0 |
| 3 | 14.37 | 1.95 | 2.5 | 16.8 | 113.0 | 3.85 | 3.49 | 0.24 | 2.18 | 7.8 | 0.86 | 3.45 | 1480.0 | 0.0 |
| 4 | 13.24 | 2.59 | 2.87 | 21.0 | 118.0 | 2.8 | 2.69 | 0.39 | 1.82 | 4.32 | 1.04 | 2.93 | 735.0 | 0.0 |


In [None]:
X_train = df[:-20]
X_test = df[-20:]

y_train = X_train.target
y_test = X_test.target

# Pass the column by keyword; the positional axis argument no longer works in pandas 2.x
X_train = X_train.drop(columns='target')
X_test = X_test.drop(columns='target')

In [None]:
clf = tree.DecisionTreeClassifier()
clf = clf.fit(X_train, y_train)

In [None]:
y_pred = clf.predict(X_test)

In [None]:
print('accuracy_score: %.2f' % accuracy_score(y_test, y_pred))

accuracy_score: 0.85
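
One caveat with the split above: `load_wine` returns its rows ordered by class, so the last 20 rows all belong to a single class and the reported accuracy only reflects that class. A minimal sketch of a shuffled, stratified alternative, assuming scikit-learn's `train_test_split`; the `test_size` and `random_state` values are arbitrary choices for illustration and are not the settings behind the result above:

```python
import numpy as np
from sklearn import tree
from sklearn.datasets import load_wine
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

wine = load_wine()

# Shuffle and stratify so every class appears in both the training and the test set.
# test_size and random_state are illustrative assumptions.
X_train, X_test, y_train, y_test = train_test_split(
    wine.data, wine.target, test_size=0.2, stratify=wine.target, random_state=0
)

clf = tree.DecisionTreeClassifier(random_state=0)
clf.fit(X_train, y_train)

print('classes in test set:', np.unique(y_test).tolist())
print('accuracy_score: %.2f' % accuracy_score(y_test, clf.predict(X_test)))
```

The score from this sketch will differ from the one above, because the test set now contains all three classes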


In [1]:
import pickle
# Save the trained model; the model/ directory must already exist
with open('model/final_prediction.pickle', 'wb') as f:
    pickle.dump(clf, f)

Once the client is happy with the model you have created, you can save it as a pickle file. You can then load this pickle file later and call the model’s `predict` method to get a prediction for new input data. This is exactly what we will do in *Flask*
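
The save-and-reload step can be sketched as follows. This is a self-contained illustration, not the notebook's exact code: it trains a stand-in model and writes it to a temporary directory instead of `model/final_prediction.pickle`, and the names `restored` and `sample` are hypothetical:

```python
import os
import pickle
import tempfile

from sklearn import tree
from sklearn.datasets import load_wine

wine = load_wine()
clf = tree.DecisionTreeClassifier(random_state=0).fit(wine.data, wine.target)

# Write the model to a temporary file (a stand-in for model/final_prediction.pickle)
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'final_prediction.pickle')
    with open(path, 'wb') as f:
        pickle.dump(clf, f)
    # Later, for example inside the API process: load the file and reuse the model
    with open(path, 'rb') as f:
        restored = pickle.load(f)

# The restored model should behave exactly like the original one
sample = wine.data[:3]
same = bool((restored.predict(sample) == clf.predict(sample)).all())
print('restored model gives the same predictions:', same)
```

Only unpickle files you trust, since loading a pickle can execute arbitrary code. It is also safest to load the model with the same scikit-learn version that created it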

## Run Flask
 
*Flask* runs on a server. This can be in the environment of the client or on a different server, depending on the client’s requirements. When you run `python app.py`, it first loads the pickle file created earlier. Once the model is loaded, you can start making predictions

~~~
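# Code needed for a simple API in Flask (app.py)

from flask import Flask, request, jsonify
import numpy as np
import pickle as p


app = Flask(__name__)


@app.route('/api/', methods=['POST'])
def makecalc():
    # Parse the JSON body into a nested list: one inner list per sample
    data = request.get_json()
    # Predict, then turn the NumPy array into a string so it can be serialised
    prediction = np.array2string(model.predict(data))

    return jsonify(prediction)

if __name__ == '__main__':
    # Load the pickled model once at startup; makecalc() reads it as a module-level global
    modelfile = 'model/final_prediction.pickle'
    with open(modelfile, 'rb') as f:
        model = p.load(f)
    # debug=True and host='0.0.0.0' are meant for local experiments only
    app.run(debug=True, host='0.0.0.0')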
~~~

## Request predictions
 
Predictions are made by sending a POST request with a JSON body to the Flask web server, which listens on port 5000 by default. In `app.py` the request is received and the prediction is produced by the `predict` method of the already loaded model. The prediction is returned in JSON format

```
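# Test our API (request.py)

import requests
import json

# The server binds to 0.0.0.0 (all interfaces); a client on the same machine connects via 127.0.0.1
url = 'http://127.0.0.1:5000/api/'

# One sample with the 13 features in the same order as the training columns
data = [[14.34, 1.68, 2.7, 25.0, 98.0, 2.8, 1.31, 0.53, 2.7, 13.0, 0.57, 1.96, 660.0]]
j_data = json.dumps(data)
headers = {'content-type': 'application/json', 'Accept-Charset': 'UTF-8'}
r = requests.post(url, data=j_data, headers=headers)
print(r, r.text)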
```
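
If no server is running yet, the same kind of request can be tried in-process with Flask's built-in test client. The following is a minimal sketch under stated assumptions: it trains a stand-in model directly on the wine data instead of loading `model/final_prediction.pickle`, so the returned class label may be formatted differently from the pickled model's output:

```python
import numpy as np
from flask import Flask, jsonify, request
from sklearn import tree
from sklearn.datasets import load_wine

# Stand-in model trained in place so the sketch runs without the pickle file
wine = load_wine()
model = tree.DecisionTreeClassifier(random_state=0).fit(wine.data, wine.target)

app = Flask(__name__)


@app.route('/api/', methods=['POST'])
def makecalc():
    data = request.get_json()
    prediction = np.array2string(model.predict(data))
    return jsonify(prediction)


data = [[14.34, 1.68, 2.7, 25.0, 98.0, 2.8, 1.31, 0.53, 2.7, 13.0, 0.57, 1.96, 660.0]]

# The test client sends the request in-process, so no server has to be running
with app.test_client() as client:
    response = client.post('/api/', json=data)

print(response.status_code, response.get_json())
```

Passing `json=data` serialises the list and sets the JSON content type in one step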

Now all you need to do is call the web server with data points in the correct format, which matches the feature columns of the original dataset, and you get your predictions back as a JSON response. For example:

`python request.py` -> `<Response [200]> '[1.]'`

For the data we sent, the model predicted class 1. All you are really doing is converting an array of data to JSON and sending it to an endpoint. The endpoint reads the JSON from the POST request and converts it back into the original array
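
That round trip can be sketched without any web server at all. The variable names `received` and `features` are hypothetical, and `json.loads` stands in for what `request.get_json()` does with the request body on the server side:

```python
import json

import numpy as np

data = [[14.34, 1.68, 2.7, 25.0, 98.0, 2.8, 1.31, 0.53, 2.7, 13.0, 0.57, 1.96, 660.0]]

# Client side: serialise the nested list to a JSON string
j_data = json.dumps(data)

# Server side: parse the JSON string back into a nested list
received = json.loads(j_data)

# The model accepts the nested list directly; converting to an array makes the shape explicit
features = np.array(received)

print(type(j_data).__name__, features.shape)
print('round trip preserved the values:', received == data)
```

Going the other way needs one extra step: a NumPy array is not JSON serialisable as it is, which is why the API converts the prediction with `np.array2string` before returning it (calling `.tolist()` on the array is another option)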

With these simple steps you can easily let other people use your machine learning model and quickly make a big impact

## Conclusion
 
In this post, I didn’t account for any errors in the data or other exceptions. This article shows how to get started simply and learn from the model’s output, but the solution needs a lot of improvements before it is ready to be put into production. It can be made scalable by packaging the API in a Docker image and hosting it on Kubernetes, so you can balance the load across different machines. But these are all steps to take when going from a proof of concept to a production environment
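
As one example of such an improvement, the endpoint could validate its input before predicting. The following is a minimal sketch, not a production-ready implementation. The error messages, the `N_FEATURES` constant and the `{"prediction": [...]}` response shape are assumptions made for illustration, and a stand-in model is trained in place of the pickled one:

```python
import numpy as np
from flask import Flask, jsonify, request
from sklearn import tree
from sklearn.datasets import load_wine

# Stand-in model trained in place so the sketch runs without the pickle file
wine = load_wine()
model = tree.DecisionTreeClassifier(random_state=0).fit(wine.data, wine.target)
N_FEATURES = wine.data.shape[1]  # hypothetical constant: features expected per sample

app = Flask(__name__)


@app.route('/api/', methods=['POST'])
def makecalc():
    # silent=True returns None instead of raising when the body is not valid JSON
    data = request.get_json(silent=True)
    if data is None:
        return jsonify(error='request body must be JSON'), 400
    try:
        features = np.asarray(data, dtype=float)
    except (TypeError, ValueError):
        return jsonify(error='all values must be numeric'), 400
    if features.ndim != 2 or features.shape[1] != N_FEATURES:
        return jsonify(error='expected rows of %d values' % N_FEATURES), 400
    # tolist() converts the NumPy array to plain Python numbers for JSON
    return jsonify(prediction=model.predict(features).tolist())


# Exercise both paths in-process with Flask's test client
with app.test_client() as client:
    bad = client.post('/api/', json=[[1.0, 2.0]])
    good = client.post('/api/', json=[wine.data[0].tolist()])

print(bad.status_code, bad.get_json())
print(good.status_code, good.get_json())
```

Returning a 400 status with a short message lets the caller tell a malformed request apart from a server-side failure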

![](https://drive.google.com/uc?export=view&id=1i7fzIUxz-oEs8V4uMdoZCQUl51NMrbVz)
