# Use Flask to serve a machine learning model as a RESTful API

### Overview 

Most of us know how to:

* Write machine learning models
* Train them
* Test them

But how do we deploy them for production? 

In this repo I explain how to deploy a machine learning model to production using [Flask](http://flask.pocoo.org/) (a micro web framework written in Python) and how to serve it as a RESTful API (a web service). I build a deliberately simple model so the walkthrough can focus on the essentials.

### App architecture 

Let's create a Flask app that serves a simple random forest model as a RESTful API. The model is trained on scikit-learn's built-in iris dataset. The app lets a user send a request to the server to predict whether a given flower is Setosa, Versicolour or Virginica.

![](ml.png)

### Import required libraries 

In [6]:
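from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
# sklearn.cross_validation was removed in scikit-learn 0.20; use model_selection instead
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score
from sklearn.metrics import classification_report
import cloudpickle as pickle
import numpy as np  # used later to reshape the prediction request
import json
import requests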

### Dataset 

In [8]:
# load iris dataset
iris = load_iris()
# features (predictors)
print (iris.feature_names)
# target 
print (iris.target_names)
# features sample 
print (iris.data[0:5])
# target sample
print (iris.target[0:5])
# full description of the iris dataset 
print (iris.DESCR)

['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']
['setosa' 'versicolor' 'virginica']
[[5.1 3.5 1.4 0.2]
 [4.9 3.  1.4 0.2]
 [4.7 3.2 1.3 0.2]
 [4.6 3.1 1.5 0.2]
 [5.  3.6 1.4 0.2]]
[0 0 0 0 0]
Iris Plants Database

Notes
-----
Data Set Characteristics:
    :Number of Instances: 150 (50 in each of three classes)
    :Number of Attributes: 4 numeric, predictive attributes and the class
    :Attribute Information:
        - sepal length in cm
        - sepal width in cm
        - petal length in cm
        - petal width in cm
        - class:
                - Iris-Setosa
                - Iris-Versicolour
                - Iris-Virginica
    :Summary Statistics:

                    Min  Max   Mean    SD   Class Correlation
    sepal length:   4.3  7.9   5.84   0.83    0.7826
    sepal width:    2.0  4.4   3.05   0.43   -0.4194
    petal length:   1.0  6.9   3.76   1.76    0.9490  (high!)
    petal width:    0.1  2.5   1.20  0.76     0.9565  (high!)

   

### Training and testing datasets 

In [9]:
# get the data 
x = iris.data
y = iris.target
# split the data into training and testing datasets
x_train, x_test, y_train, y_test = train_test_split(x,y)
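
Because `train_test_split` shuffles at random, the scores reported further down will vary slightly from run to run. Here is a minimal sketch, assuming you want a repeatable, class-balanced split. `random_state=42` and `stratify=y` are illustrative choices, not what the notebook used:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

iris = load_iris()
x, y = iris.data, iris.target

# test_size=0.25 is the library default, written out here for clarity.
# random_state and stratify are assumptions added for reproducibility.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=42, stratify=y
)
print(x_train.shape, x_test.shape)
```

With a fixed seed, the same rows land in the test set on every run, so accuracy figures can be compared across runs.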

### Building the model 

In [10]:
# A simple random forest classifier 
rfc = RandomForestClassifier(n_estimators=100,n_jobs=2)

In [11]:
# Train the model 
rfc.fit(x_train,y_train)

RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
            max_depth=None, max_features='auto', max_leaf_nodes=None,
            min_impurity_decrease=0.0, min_impurity_split=None,
            min_samples_leaf=1, min_samples_split=2,
            min_weight_fraction_leaf=0.0, n_estimators=100, n_jobs=2,
            oob_score=False, random_state=None, verbose=0,
            warm_start=False)

In [13]:
# Model accuracy and performance 
print ("Accuracy = %0.2f" % accuracy_score(y_test, rfc.predict(x_test)))
print (classification_report(y_test, rfc.predict(x_test)))

Accuracy = 0.97
             precision    recall  f1-score   support

          0       1.00      1.00      1.00        11
          1       1.00      0.91      0.95        11
          2       0.94      1.00      0.97        16

avg / total       0.98      0.97      0.97        38



### Model serialization / marshalling 

Here we use **pickle** (imported above via `cloudpickle`) to save the trained model to disk for later use. This avoids retraining the model every time we want to use it: we train it once, save it, then load it as many times as needed.

In [42]:
# Save the model to disk
with open("rfc.pkl", "wb") as f:
    pickle.dump(rfc, f)

In [97]:
# Load the random forest model back
with open("rfc.pkl", "rb") as f:
    random_forest = pickle.load(f)

# Sample payload; in the Flask app this would come from request.get_json(force=True)
data = {"sl": 5.84, "sw": 3.0, "pl": 3.75, "pw": 1.1}

# Convert the JSON fields to a 2-D numpy array of shape (1, 4)
predict_request = [data['sl'], data['sw'], data['pl'], data['pw']]
predict_request = np.reshape(np.array(predict_request), (1, len(predict_request)))

# Predict using the loaded model (not the in-memory rfc)
y = random_forest.predict(predict_request)

# Return prediction
output = [y[0]]
print(output[0])
     

1
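
The printed `1` is a class index, not a species name. Below is a small sketch of how the index could be mapped back to a name, using the order shown by `iris.target_names` earlier. The `species_name` helper is hypothetical and not part of the original notebook:

```python
# Order copied from the iris.target_names output shown in the Dataset section.
TARGET_NAMES = ("setosa", "versicolor", "virginica")


def species_name(class_index):
    """Translate a predicted class index (0, 1 or 2) into its species name."""
    index = int(class_index)  # also accepts numpy integer types
    if not 0 <= index < len(TARGET_NAMES):
        raise ValueError(f"unknown class index: {class_index}")
    return TARGET_NAMES[index]


print(species_name(1))  # versicolor
```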


In [17]:
# Use it 
print (classification_report(y_test,random_forest.predict(x_test)))

             precision    recall  f1-score   support

          0       1.00      1.00      1.00        11
          1       1.00      0.91      0.95        11
          2       0.94      1.00      0.97        16

avg / total       0.98      0.97      0.97        38



### Start Flask app

First run the [Flask app](https://github.com/a-djebali/flask-machine-learning-resful/blob/master/app.py) in the terminal

```shell
$ python app.py
```

Once the app is running, it is ready to accept prediction requests:

```
 * Running on http://127.0.0.1:9000/ (Press CTRL+C to quit)
 * Restarting with stat
 * Debugger is active!
 * Debugger pin code: 580-987-602
```
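
The linked `app.py` is not reproduced in this README. As a rough guide, a service of this kind might look like the sketch below. The route `/predict_api`, the keys `sl`, `sw`, `pl`, `pw`, the `get_json(force=True)` call and port 9000 come from this README. The `create_app` factory, the `FEATURES` tuple, the `{"prediction": ...}` response shape and the 400 error are assumptions made for illustration, so the real file may differ:

```python
from flask import Flask, jsonify, request

# Key names taken from the client request shown in this README.
FEATURES = ("sl", "sw", "pl", "pw")


def create_app(model):
    """Build a Flask app around any object with a scikit-learn style predict()."""
    app = Flask(__name__)

    @app.route("/predict_api", methods=["POST"])
    def predict_api():
        # force=True parses the body as JSON even without a JSON Content-Type
        # header, which suits a client that posts a raw json.dumps() string.
        payload = request.get_json(force=True)
        try:
            row = [float(payload[name]) for name in FEATURES]
        except (KeyError, TypeError, ValueError):
            # Hypothetical error shape: the original app may respond differently.
            return jsonify(error="expected numeric sl, sw, pl, pw"), 400
        # predict() expects a 2-D input: one row with four features.
        label = int(model.predict([row])[0])
        return jsonify(prediction=label)

    return app


# To serve the pickled model from this README, something along these lines:
#   with open("rfc.pkl", "rb") as f:
#       app = create_app(pickle.load(f))
#   app.run(port=9000, debug=True)
```

Passing the model into a factory, rather than loading the pickle at import time, lets the route be exercised with Flask's test client and a stub model, without training anything.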

Let's ask the random forest model, through the web service, which class the following flower belongs to. The flower's features are:

* sepal length: 5.84
* sepal width: 3.0
* petal length: 3.75
* petal width: 1

In [None]:
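# Request the model
url = "http://127.0.0.1:9000/predict_api"
# Read the four features as floats so the JSON carries numbers, not strings
sl = float(input("Enter sl: "))
sw = float(input("Enter sw: "))
pl = float(input("Enter pl: "))
pw = float(input("Enter pw: "))
# Example values: 'sl': 4.9, 'sw': 3.0, 'pl': 1.4, 'pw': 0.2
data = json.dumps({'sl': sl, 'sw': sw, 'pl': pl, 'pw': pw})

r = requests.post(url, data=data)
print(r.json())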

**The flower's class is Versicolour** ... Mission completed ;)