<a href="https://colab.research.google.com/github/Amanjyot62/Data-Science-Using-Pyhton/blob/master/Deploying_a_machine_Learning_Model.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Deploying a Machine Learning Model

Building a web API is one of the most common ways to expose a machine learning model in a production environment. Before diving into this, we need to understand what a web API is.

A web API (also known as a web service) is a programming interface that allows web communication between a client and a server. It is usually composed of one or more endpoints that expose server-side resources to external clients. A web API relies on a request-response mechanism: the client sends a request to an endpoint, and the server processes it and sends back a response.
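To make the request-response mechanism concrete, here is a minimal sketch that uses only the Python standard library. The `/ping` endpoint, the `PingHandler` class, and the `ping_once()` helper are hypothetical names chosen for illustration; they are not part of the model deployment that follows.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class PingHandler(BaseHTTPRequestHandler):
    """Hypothetical handler exposing a single endpoint, GET /ping."""

    def do_GET(self):
        if self.path == "/ping":
            body = json.dumps({"status": "ok"}).encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # silence the default per-request logging


def ping_once():
    """Start the server on a free local port, send one request, return the reply."""
    server = HTTPServer(("127.0.0.1", 0), PingHandler)  # port 0: pick any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        # Client side: send the request and read the server's response
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request("GET", "/ping")
        response = conn.getresponse()
        result = response.status, json.loads(response.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()


status, payload = ping_once()
print(status, payload)
```

The server side only decides how to answer a given path, and the client side only knows the URL and the format of the reply. The same division of roles applies when the endpoint returns model predictions.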

But we need to train a model first. Let's build a classifier with the RandomForest algorithm on the Bank Marketing dataset.

First, we need to import the required packages and then load the dataset into a DataFrame:



In [0]:
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
file_url = 'https://raw.githubusercontent.com/PacktWorkshops/The-Data-Science-Workshop/master/Chapter03/bank-full.csv'
df = pd.read_csv(file_url, sep=';')

Then we will extract the response variable, which is the y column in this dataset, using the .pop() method from pandas. After this, we need to one-hot encode the categorical variables using the get_dummies() function from pandas:

In [0]:
y = df.pop('y')
df_dummies = pd.get_dummies(df)
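As a small illustration of what one-hot encoding does, the sketch below applies `pd.get_dummies()` to a toy DataFrame. The values are invented for the example and are not taken from the Bank Marketing dataset.

```python
import pandas as pd

# Toy data invented for illustration only
toy = pd.DataFrame({
    "age": [30, 45, 52],
    "job": ["admin", "technician", "admin"],
})

# Numeric columns are kept as they are; each category of a text column
# becomes its own indicator column named <column>_<category>
encoded = pd.get_dummies(toy)
encoded_columns = list(encoded.columns)
print(encoded_columns)
print(encoded.shape)
```

The numeric `age` column passes through unchanged, while `job` is replaced by one indicator column per distinct value. This is why the encoded DataFrame usually has more columns than the original.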


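The final step before modeling is to split the data into training and testing sets. To do so, we will use the train_test_split() function from sklearn. We then fit a RandomForestClassifier on the training set and predict the outcome for the testing set: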

In [0]:
X_train, X_test, y_train, y_test = train_test_split(df_dummies, y, test_size=0.33, random_state=42)
rf_model = RandomForestClassifier(random_state=8)
rf_model.fit(X_train, y_train)
rf_model.predict(X_test)
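The effect of `test_size` and `random_state` can be checked on a small synthetic array. This is only a sketch: the data is generated for the example, and the variable names are hypothetical.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic data: 100 rows with 2 features each, and an alternating binary label
X = np.arange(200).reshape(100, 2)
y = np.arange(100) % 2

# Same settings as in the cell above: a third of the rows are held out
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.33, random_state=42)
print(X_tr.shape, X_te.shape)

# Reusing the same random_state reproduces exactly the same split
X_tr2, X_te2, y_tr2, y_te2 = train_test_split(X, y, test_size=0.33, random_state=42)
same_split = np.array_equal(X_te, X_te2)
print(same_split)
```

Fixing `random_state` matters for the rest of this walkthrough, because the record at position 3776 of the testing set is only the same record on every run if the split is reproducible.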

We can also predict the outcome for a single record from the test set. sklearn models expect a 2-dimensional input (one row per record), so we need to wrap our record in a list:

In [0]:
rf_model.predict([X_test.iloc[3776,]])
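The 2-dimensional requirement can be seen on a small model trained on synthetic data. This is a sketch under that assumption; the data and the variable names are made up for the example.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic data: 50 records with 4 features, label depends on the first feature
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
y = (X[:, 0] > 0).astype(int)
model = RandomForestClassifier(n_estimators=10, random_state=8).fit(X, y)

record = X[0]  # a single record has shape (4,), which is 1-dimensional

# Passing the 1-dimensional record directly is rejected
try:
    model.predict(record)
    rejected = False
except ValueError:
    rejected = True

# Two equivalent ways to turn the record into a batch of one row
as_row = record.reshape(1, -1)          # shape (1, 4)
single_pred = model.predict(as_row)
wrapped_pred = model.predict([record])  # wrapping in a list, as in the cell above
print(rejected, single_pred.shape)
```

Either form produces an array with one prediction, since the model always returns one value per input row.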


Note: We are not interested in improving the performance of this model. We just need a trained model that we can deploy in a Flask app (Flask is a lightweight Python web framework).



Before adding our model to the Flask app, we need to save it as a file. We will use the dump() function from the joblib package:



In [0]:
import joblib
joblib.dump(rf_model, "model.pkl")


The model is now saved on the filesystem under the filename model.pkl. To load it back, we can use the load() function from joblib:



In [0]:
saved_model = joblib.load("model.pkl")
saved_model.predict([X_test.iloc[3776,]])
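A quick way to gain confidence in a saved model is to check that the reloaded copy gives the same predictions as the original. The sketch below does this with a small model trained on synthetic data and a temporary directory, so it does not touch the model.pkl file created above; the data and names are assumptions made for the example.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic data invented for the example
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
model = RandomForestClassifier(n_estimators=10, random_state=8).fit(X, y)

# Save the model and load it back from a temporary location
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model.pkl")
    joblib.dump(model, path)
    restored = joblib.load(path)

# The restored model should reproduce the original predictions exactly
same_predictions = np.array_equal(model.predict(X), restored.predict(X))
print(same_predictions)
```

Files written by joblib are based on pickle, so they should only be loaded from sources you trust, and ideally with the same scikit-learn version that was used to save them.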


Now we can create a Flask app with a new API endpoint called /predict that will predict the outcome using this model on the data it receives as input. Within the API function, we need to read the input data, perform the prediction with our pre-loaded model, convert the prediction into a string using the array2string() function from numpy, and finally convert it to JSON using jsonify().

In [0]:
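import numpy as np
from flask import Flask, request, jsonify

# Create the Flask application that will host the endpoint
app = Flask(__name__)

@app.route('/predict', methods=['POST'])
def rf_predict():
  data = request.get_json()  # parse the JSON body of the request
  prediction = saved_model.predict(data)
  str_pred = np.array2string(prediction)  # convert the numpy array to a string
  return jsonify(str_pred)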
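An endpoint like this can be exercised without starting a server by using Flask's built-in test client. The sketch below is a self-contained variant of the cell above: `StubModel` is a hypothetical stand-in for `saved_model` that always predicts "no", and the input rows are invented.

```python
import numpy as np
from flask import Flask, jsonify, request


class StubModel:
    """Hypothetical stand-in for saved_model: always predicts 'no'."""

    def predict(self, rows):
        return np.array(["no"] * len(rows))


app = Flask(__name__)
stub_model = StubModel()


@app.route("/predict", methods=["POST"])
def rf_predict():
    data = request.get_json()              # list of records sent by the client
    prediction = stub_model.predict(data)  # one prediction per record
    return jsonify(np.array2string(prediction))


# The test client sends the request directly to the app, with no network involved
client = app.test_client()
response = client.post("/predict", json=[[1, 0, 3], [4, 1, 6]])
status_code = response.status_code
body = response.get_json()
print(status_code, body)
```

Because the prediction array is converted with `array2string()`, the client receives a single JSON string rather than a JSON list, and has to parse that string if it needs the individual labels.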

Now we need to send a POST request with the record we want a prediction for. We will use the same example as previously: record number 3776. First, we need to convert it into a list by using the .to_list() method from pandas, and then serialize it to JSON with json.dumps():

In [0]:
import json

record = X_test.iloc[3776,].to_list()
record
# Wrap the record in a list so that the payload is 2-dimensional
j_data = json.dumps([record])
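The shape of the resulting payload is easier to see on a short record. In this sketch the values are invented and do not correspond to record 3776; the point is only how a wrapped record is serialized and decoded again.

```python
import json

# Invented values standing in for one encoded record
record = [58, 2143, 5, 261, 1, -1, 0, True, False]

# Wrapping the record in a list yields a JSON array that contains one row
j_data = json.dumps([record])
print(j_data)

# On the server side, decoding the payload gives back a list of rows
decoded = json.loads(j_data)
print(len(decoded), len(decoded[0]))
```

The decoded payload is a list with one inner list, which is the 2-dimensional structure the model's predict() method expects.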


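Finally, we can send a POST request with this converted record. This assumes that the Flask app is running and reachable at the address used below:

In [0]:
import requests

# headers is not defined in an earlier cell; a JSON content type is assumed
# here so that request.get_json() can parse the body on the server side
headers = {'Content-Type': 'application/json'}
r = requests.post("http://172.28.0.2/predict", data=j_data, headers=headers)
r.text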

Great! We got the exact same prediction as before, but this time we got it from our model deployed as a Flask app. As you can see, it is relatively simple to expose a machine learning model as a web API.