# Exercise 2: Deploy a Model as a Web API
In this exercise, we will learn how to deploy a pre-trained machine learning model as a Web API that accepts HTTP POST requests and predicts the breast cancer class for a given patient.

Note
The dataset used for this exercise is the Breast Cancer Detection dataset shared by Dr. William H. Wolberg from the University of Wisconsin Hospitals. The attribute information can be found here - https://archive.ics.uci.edu/ml/datasets/breast+cancer+wisconsin+(original)

The dataset can also be found in our repository here - https://raw.githubusercontent.com/PacktWorkshops/The-Data-Science-Workshop/master/Chapter11/dataset/breast-cancer-wisconsin.data


1. Open a new Colab notebook

2. Import the packages pandas and joblib, RandomForestClassifier from sklearn.ensemble and train_test_split from sklearn.model_selection

In [0]:
import pandas as pd
import joblib
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

3. Assign the link to the Breast Cancer dataset to a variable called 'file_url'

In [0]:
file_url = 'https://raw.githubusercontent.com/PacktWorkshops/The-Data-Science-Workshop/master/Chapter11/dataset/breast-cancer-wisconsin.data'

4. Create a list called 'col_names' with the following names: 'Sample code number','Clump Thickness','Uniformity of Cell Size','Uniformity of Cell Shape','Marginal Adhesion','Single Epithelial Cell Size',
'Bare Nuclei','Bland Chromatin','Normal Nucleoli','Mitoses','Class'

In [0]:
col_names = ['Sample code number','Clump Thickness','Uniformity of Cell Size','Uniformity of Cell Shape','Marginal Adhesion','Single Epithelial Cell Size',
'Bare Nuclei','Bland Chromatin','Normal Nucleoli','Mitoses','Class']

5. Load the dataset into a DataFrame using pd.read_csv() with the following parameters: header=None, names=col_names, na_values='?'

In [0]:
df = pd.read_csv(file_url, header=None, names=col_names, na_values='?')

6. Print the first 5 rows using the method .head()

In [0]:
df.head()

,Sample code number,Clump Thickness,Uniformity of Cell Size,Uniformity of Cell Shape,Marginal Adhesion,Single Epithelial Cell Size,Bare Nuclei,Bland Chromatin,Normal Nucleoli,Mitoses,Class
0,1000025,5,1,1,1,2,1.0,3,1,1,2
1,1002945,5,4,4,5,7,10.0,3,2,1,2
2,1015425,3,1,1,1,2,2.0,3,1,1,2
3,1016277,6,8,8,1,3,4.0,3,7,1,2
4,1017023,4,1,1,3,2,1.0,3,1,1,2


7. Replace all missing values with 0 using the method .fillna()

In [0]:
df.fillna(0, inplace=True)
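To see what na_values='?' and .fillna(0) do together, here is a minimal sketch on a small in-memory sample. The three rows below are made up for illustration and are not taken from the dataset; only the two column names are borrowed from 'col_names'.

```python
import io
import pandas as pd

# Hypothetical three-row sample; '?' marks a missing value, as in the real file.
sample_csv = "5,1\n3,?\n8,10\n"

sample_df = pd.read_csv(
    io.StringIO(sample_csv),
    header=None,
    names=["Clump Thickness", "Bare Nuclei"],
    na_values="?",
)

# The '?' is parsed as NaN, which turns the whole column into floats.
missing_before = int(sample_df["Bare Nuclei"].isna().sum())

# fillna(0) returns a copy here; the notebook uses inplace=True instead.
filled_df = sample_df.fillna(0)
missing_after = int(filled_df["Bare Nuclei"].isna().sum())

print(missing_before, missing_after)
print(filled_df["Bare Nuclei"].tolist())
```

This is also why 'Bare Nuclei' is displayed with decimals (1.0, 10.0) in the .head() output while the other columns stay as integers.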

8. Extract the response variable 'Class' using the method .pop()

In [0]:
y = df.pop('Class')

9. Remove the column 'Sample code number' using the method .drop() with axis=1 as parameter to specify we are dropping columns and not rows. Save the result into a DataFrame called 'X'

In [0]:
X = df.drop('Sample code number', axis=1)
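The two methods used in steps 8 and 9 behave differently: .pop() modifies the DataFrame in place, while .drop() returns a new one. A minimal sketch on a made-up two-row DataFrame (the names 'toy_df', 'toy_y' and 'toy_X' are illustrative only):

```python
import pandas as pd

# Hypothetical toy data, not from the exercise dataset.
toy_df = pd.DataFrame({"id": [1, 2], "feature": [5, 3], "Class": [2, 4]})

# .pop() removes 'Class' from toy_df itself and returns it as a Series.
toy_y = toy_df.pop("Class")

# .drop() returns a new DataFrame; toy_df still contains the 'id' column.
toy_X = toy_df.drop("id", axis=1)

print(list(toy_df.columns))
print(list(toy_X.columns))
```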

10. Print the first 5 rows using the method .head()

In [0]:
X.head()

,Clump Thickness,Uniformity of Cell Size,Uniformity of Cell Shape,Marginal Adhesion,Single Epithelial Cell Size,Bare Nuclei,Bland Chromatin,Normal Nucleoli,Mitoses
0,5,1,1,1,2,1.0,3,1,1
1,5,4,4,5,7,10.0,3,2,1
2,3,1,1,1,2,2.0,3,1,1
3,6,8,8,1,3,4.0,3,7,1
4,4,1,1,3,2,1.0,3,1,1


11. Split into training and testing sets using the function 'train_test_split' with the parameters test_size=0.33, random_state=888

In [0]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.33, random_state=888)
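As a rough illustration of what the two parameters control, the sketch below splits a made-up array of 100 rows twice with the same test_size and random_state. The 'toy_' variables are assumptions for this example and are kept separate from the notebook's own X and y.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 100 rows with 2 features each, and alternating labels.
toy_X = np.arange(200).reshape(100, 2)
toy_y = np.arange(100) % 2

toy_X_train, toy_X_test, toy_y_train, toy_y_test = train_test_split(
    toy_X, toy_y, test_size=0.33, random_state=888
)

# Repeating the call with the same random_state gives the same partition.
_, toy_X_test_again, _, _ = train_test_split(
    toy_X, toy_y, test_size=0.33, random_state=888
)
same_split = np.array_equal(toy_X_test, toy_X_test_again)

print(len(toy_X_train), len(toy_X_test), same_split)
```

Roughly a third of the rows end up in the testing set, and fixing random_state makes the split reproducible between runs.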

12. Instantiate a RandomForestClassifier with random_state=1 and save it into a new variable called 'rf_model'

In [0]:
rf_model = RandomForestClassifier(random_state=1)

13. Train the RandomForest model with X_train and y_train

In [0]:
rf_model.fit(X_train, y_train)



RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
                       max_depth=None, max_features='auto', max_leaf_nodes=None,
                       min_impurity_decrease=0.0, min_impurity_split=None,
                       min_samples_leaf=1, min_samples_split=2,
                       min_weight_fraction_leaf=0.0, n_estimators=10,
                       n_jobs=None, oob_score=False, random_state=1, verbose=0,
                       warm_start=False)

14. Predict the outcome for the first record of X_test using the method .predict()

In [0]:
rf_model.predict([X_test.iloc[0,]])

array([2])
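.predict() expects a two-dimensional input (rows of features), which is why the single record is wrapped in a list above. One alternative, sketched here on made-up data, is to select the row with double brackets so that it stays a one-row DataFrame. The toy model and its column names 'a' and 'b' are assumptions for this example.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

# Hypothetical toy data with two features and the same class labels (2 and 4).
toy_X = pd.DataFrame({"a": [1, 2, 9, 10], "b": [1, 1, 8, 9]})
toy_y = pd.Series([2, 2, 4, 4])

toy_model = RandomForestClassifier(n_estimators=10, random_state=1)
toy_model.fit(toy_X, toy_y)

# iloc[[0]] (double brackets) keeps a 2D, one-row DataFrame with column names.
first_row = toy_X.iloc[[0]]
toy_pred = toy_model.predict(first_row)

print(first_row.shape, toy_pred.shape)
```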

15. Save the RandomForest model as a separate file called 'model.pkl' using joblib.dump()

In [0]:
joblib.dump(rf_model, "model.pkl") 

['model.pkl']
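A quick way to check that a saved model survives the round trip is to reload it and compare predictions. The sketch below does this with a small made-up model written to a temporary directory, so it does not touch 'model.pkl'; all 'toy_' names are assumptions for illustration.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Hypothetical toy data and model, standing in for rf_model.
toy_X = np.array([[1, 1], [2, 1], [9, 8], [10, 9]])
toy_y = np.array([2, 2, 4, 4])
toy_model = RandomForestClassifier(n_estimators=10, random_state=1).fit(toy_X, toy_y)

with tempfile.TemporaryDirectory() as tmp_dir:
    toy_path = os.path.join(tmp_dir, "toy_model.pkl")
    joblib.dump(toy_model, toy_path)        # serialise the fitted model
    restored_model = joblib.load(toy_path)  # read it back into memory

# The restored model should give exactly the same predictions.
round_trip_ok = np.array_equal(
    toy_model.predict(toy_X), restored_model.predict(toy_X)
)
print(round_trip_ok)
```

Files written by joblib are pickle-based, so they should only be loaded from sources you trust.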

16. Import the socket, threading, requests, json and numpy packages and the classes Flask, jsonify and request from the package flask

In [0]:
import socket
import threading
import requests
import json
from flask import Flask, jsonify, request
import numpy as np

17. Save the host IP address into a new variable called 'ip_address' using the methods .gethostbyname() and .gethostname(). Display the value of this new variable

In [0]:
ip_address = socket.gethostbyname(socket.gethostname())
ip_address

'172.28.0.2'

18. Create a Flask app and save it into a new variable called 'app'

In [0]:
app = Flask(__name__)

19. Load the pre-trained model using joblib.load()

In [0]:
trained_model = joblib.load("model.pkl")

20. Create an API endpoint for the path '/api' that accepts only POST requests and calls a function named predict(). This function reads the JSON payload with request.get_json(), predicts the outcome with 'trained_model', converts the prediction from a NumPy array to a string with np.array2string() and returns it as JSON with jsonify()

In [0]:
@app.route('/api', methods=['POST'])
def predict():
  data = request.get_json()
  prediction = trained_model.predict(data)
  str_pred = np.array2string(prediction)
  return jsonify(str_pred)
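An endpoint like this one can be tried out without starting a server by using Flask's built-in test client. The sketch below is a simplified stand-in, not the notebook's app: 'StubModel' is a hypothetical class that always predicts class 2, and 'sketch_app' and 'sketch_predict' are illustrative names.

```python
import numpy as np
from flask import Flask, jsonify, request


class StubModel:
    """Hypothetical stand-in for the trained model: always predicts class 2."""

    def predict(self, rows):
        return np.full(len(rows), 2)


sketch_app = Flask(__name__)
stub_model = StubModel()


@sketch_app.route('/api', methods=['POST'])
def sketch_predict():
    data = request.get_json()                     # parsed JSON body (list of rows)
    prediction = stub_model.predict(data)         # NumPy array of class labels
    return jsonify(np.array2string(prediction))   # e.g. the JSON string "[2]"


client = sketch_app.test_client()

# A POST with one row of nine feature values is accepted.
post_response = client.post('/api', json=[[5, 1, 1, 1, 2, 1, 3, 1, 1]])

# A GET is rejected because the route only lists POST.
get_response = client.get('/api')

print(post_response.status_code, post_response.get_json())
print(get_response.status_code)
```

Restricting the route to POST means any other HTTP method receives a 405 (Method Not Allowed) response.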

21. Create a new thread for running your Flask app using the class threading.Thread with the following parameters: target=app.run, kwargs={'host':'0.0.0.0','port':80}. Then start it with the method .start()

In [0]:
flask_thread = threading.Thread(target=app.run, kwargs={'host':'0.0.0.0','port':80})
flask_thread.start()

 * Serving Flask app "__main__" (lazy loading)
 * Environment: production
   Use a production WSGI server instead.
 * Debug mode: off


Exception in thread Thread-6:
Traceback (most recent call last):
  File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 990, in run
    run_simple(host, port, self, **options)
  File "/usr/local/lib/python3.6/dist-packages/werkzeug/serving.py", line 1010, in run_simple
    inner()
  File "/usr/local/lib/python3.6/dist-packages/werkzeug/serving.py", line 963, in inner
    fd=fd,
  File "/usr/local/lib/python3.6/dist-packages/werkzeug/serving.py", line 806, in make_server
    host, port, app, request_handler, passthrough_errors, ssl_context, fd=fd
  File "/usr/local/lib/python3.6/dist-packages/werkzeug/serving.py", line 699, in __init__
    HTTPServer.__init__(self, server_address, handler)
  File "/usr/lib/python3.6/socketserver.py", line 456, in __init__
    self.server_bind()
  File "
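A thread started with target=app.run has no simple way to be stopped, which can be inconvenient when a cell is executed more than once. As one possible alternative, and not the approach used in this exercise, the sketch below builds the server with werkzeug's make_server(), lets the operating system pick a free port and shuts the server down cleanly. The app 'ping_app' and its '/ping' route are hypothetical.

```python
import http.client
import threading

from flask import Flask, jsonify
from werkzeug.serving import make_server

ping_app = Flask(__name__)


@ping_app.route('/ping')
def ping():
    return jsonify('pong')


# Port 0 asks the OS for any free port, avoiding clashes with a server
# that is still running from an earlier execution.
server = make_server('127.0.0.1', 0, ping_app)
port = server.server_port

# daemon=True lets the interpreter exit even if the server is still running.
server_thread = threading.Thread(target=server.serve_forever, daemon=True)
server_thread.start()

# Send one request to confirm the background server is answering.
conn = http.client.HTTPConnection('127.0.0.1', port, timeout=5)
conn.request('GET', '/ping')
resp = conn.getresponse()
status = resp.status
body = resp.read().decode()
conn.close()

# Stop the serving loop and wait for the thread to finish.
server.shutdown()
server_thread.join(timeout=5)
stopped = not server_thread.is_alive()

print(status, stopped)
```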

22. Convert the first record of X_test into a list and print its content

In [0]:
record = X_test.iloc[0,].to_list()
record

[2.0, 3.0, 1.0, 1.0, 5.0, 1.0, 1.0, 1.0, 1.0]

23. Create a variable called j_data that holds this record converted to JSON using json.dumps(). Wrap the record in a list so that the payload is a list of rows, which is the shape .predict() expects

In [0]:
j_data = json.dumps([record])
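The extra pair of brackets matters because the model expects a list of rows rather than a single flat list. A minimal sketch using only the standard library, with the record values shown in the previous step (the names 'sample_record', 'flat_json' and 'batch_json' are illustrative):

```python
import json

# Same values as the record printed in the previous step.
sample_record = [2.0, 3.0, 1.0, 1.0, 5.0, 1.0, 1.0, 1.0, 1.0]

flat_json = json.dumps(sample_record)     # one flat list (1D)
batch_json = json.dumps([sample_record])  # a list containing one row (2D)

# Decoding the batch gives back 1 row with 9 feature values.
decoded = json.loads(batch_json)
print(len(decoded), len(decoded[0]))
```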

24. Create a dictionary called headers with the following key-value pairs: 'content-type': 'application/json', 'Accept-Charset': 'UTF-8'

In [0]:
headers = {'content-type': 'application/json', 'Accept-Charset': 'UTF-8'}

25. Send an HTTP POST request to the server using the method requests.post() with the HTTP URL of the endpoint, j_data and headers as its parameters and print its .text attribute

In [0]:
r = requests.post(f"http://{ip_address}/api", data=j_data, headers=headers)
r.text

172.28.0.2 - - [03/Nov/2019 20:56:13] "POST /api HTTP/1.1" 200 -


'"[2]"\n'
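The response body is a JSON string that itself contains the text form of a NumPy array, so a client needs two steps to get back to numbers. The helper below, 'parse_prediction', is a hypothetical function written for this sketch; it assumes the predictions are integer class labels, as in this exercise.

```python
import json


def parse_prediction(response_text):
    """Turn the endpoint's response body into a list of ints."""
    # Step 1: decode the JSON layer, e.g. '"[2]"\n' becomes the string '[2]'.
    as_string = json.loads(response_text)
    # Step 2: np.array2string separates elements with spaces, not commas.
    return [int(token) for token in as_string.strip('[]').split()]


print(parse_prediction('"[2]"\n'))
print(parse_prediction('"[2 4 2]"\n'))
```

A possible design alternative would be to have the endpoint return prediction.tolist() through jsonify(), which would give clients a regular JSON array and make this parsing step unnecessary.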

Excellent! We just deployed our pre-trained machine learning model as a Web API. In a real-world project, you would deploy it on a separate server within your organisation and configure the network settings so that authorised systems or services can send requests to this API.