# How to Deploy Machine Learning Algorithms Into the Production Environment

## A Step-by-Step Guide for Deploying Machine Learning Algorithms into Production Systems

1. Develop the ML algorithm
2. Develop a function to make individual predictions
3. Develop a web service wrapper to call individual predictions
4. Deploy the solution onto a production server

The data used in this example (``drug200.csv``) is available from Kaggle: https://www.kaggle.com/abohelal/random-forest-99-accuracy-knn-96-accuracy/data

In [57]:
import pandas as pd
from sklearn.tree import DecisionTreeClassifier
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import KFold, cross_val_score
import pickle
import requests
from ipywidgets import Label, Dropdown, BoundedIntText, BoundedFloatText, Button, Output, VBox

### Stage 1: Develop The Machine Learning Algorithm
The first stage is to develop your machine learning model. In the real world this step could take many months of development and iteration through the stages of the data science pipeline, but for this example the model has been kept as simple as possible so that the focus can be on how to deploy it in the later stages.

#### Step 1.1: Read the data

In [20]:
df_drug = pd.read_csv("drug200.csv")
df_drug.head()

Unnamed: 0,Age,Sex,BP,Cholesterol,Na_to_K,Drug
0,23,F,HIGH,HIGH,25.355,DrugY
1,47,M,LOW,HIGH,13.093,drugC
2,47,M,LOW,HIGH,10.114,drugC
3,28,F,NORMAL,HIGH,7.798,drugX
4,61,F,LOW,HIGH,18.043,DrugY


In [21]:
df_drug.describe()

Unnamed: 0,Age,Na_to_K
count,200.0,200.0
mean,44.315,16.084485
std,16.544315,7.223956
min,15.0,6.269
25%,31.0,10.4455
50%,45.0,13.9365
75%,58.0,19.38
max,74.0,38.247


#### Step 1.2: Encode the categorical features

In [6]:
label_encoder = LabelEncoder()

categorical_features = [feature for feature in df_drug.columns if df_drug[feature].dtypes == 'O']
for feature in categorical_features:
    df_drug[feature] = label_encoder.fit_transform(df_drug[feature])

df_drug.head()

Unnamed: 0,Age,Sex,BP,Cholesterol,Na_to_K,Drug
0,23,0,0,0,25.355,0
1,47,1,1,0,13.093,3
2,47,1,1,0,10.114,3
3,28,0,2,0,7.798,4
4,61,0,1,0,18.043,0


#### Step 1.3: Train the model

In [7]:
X = df_drug.drop("Drug", axis=1)
y = df_drug["Drug"]

model = DecisionTreeClassifier(criterion="entropy")
model.fit(X, y)

DecisionTreeClassifier(criterion='entropy')

#### Step 1.4: Evaluate the model performance

In [8]:
kfold = KFold(random_state=42, shuffle=True)
cv_results = cross_val_score(model, X, y, cv=kfold, scoring="accuracy")
print(cv_results.mean(), cv_results.std())

0.99 0.012247448713915901
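To read those two numbers: `KFold` uses 5 splits by default, so each validation fold holds 40 of the 200 rows and every fold accuracy is a multiple of 1/40. One set of fold scores consistent with the printed output is three perfect folds and two folds with a single mistake each, i.e. 2 misclassified patients out of 200. The error counts below are inferred from the printed mean and standard deviation; they are not values taken from the original run, and the fold order is arbitrary.

```python
from statistics import fmean, pstdev

n_samples = 200
n_folds = 5                         # KFold's default n_splits
fold_size = n_samples // n_folds    # 40 rows in each validation fold

# Assumed error counts per fold (inferred, not taken from the original run).
errors_per_fold = [0, 0, 1, 0, 1]
fold_scores = [1 - errors / fold_size for errors in errors_per_fold]

mean_score = fmean(fold_scores)
std_score = pstdev(fold_scores)     # population std, like NumPy's default .std()

print(mean_score, std_score)        # roughly 0.99 and 0.0122
```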


#### Step 1.5: Save the model
The final step of stage 1 is to save the state of the model using ``pickle`` so that it can be re-used in the production deployment without having to re-train it ...

In [9]:
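# Open in write mode ('wb') so that re-running this cell replaces the file
# rather than appending a second pickle to it.
with open('model.pkl', 'wb') as pickle_file:
    pickle.dump(model, pickle_file)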
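The file mode matters when saving a model. In append mode (`'ab'`) every run of the save cell adds another pickle to the end of the file, and `pickle.load` returns only the first one, which would be the oldest model. The sketch below illustrates this with plain strings standing in for models; the temporary file name is made up for the demonstration.

```python
import os
import pickle
import tempfile

path = os.path.join(tempfile.mkdtemp(), "demo.pkl")

# Simulate running a "save" cell twice in append mode.
for version in ("model-v1", "model-v2"):
    with open(path, "ab") as f:
        pickle.dump(version, f)

with open(path, "rb") as f:
    loaded_after_append = pickle.load(f)    # first object in the file

# The same two saves in write mode: the file is truncated on each open.
for version in ("model-v1", "model-v2"):
    with open(path, "wb") as f:
        pickle.dump(version, f)

with open(path, "rb") as f:
    loaded_after_write = pickle.load(f)     # only object in the file

print(loaded_after_append)   # model-v1 (stale)
print(loaded_after_write)    # model-v2
```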

### Stage 2: Make Individual Predictions from the Model
Now that we have a trained model with the state saved using ``pickle`` we need to develop a function that can make individual predictions.

This assumes that our production system needs individual predictions rather than batch predictions. For our example we are going to assume that the production system is a doctor's desktop application that needs to predict a drug matching a patient's features.

#### Step 2.1: Re-read the raw data and review the categorical features and the numerical values

In [10]:
df_drug = pd.read_csv("drug200.csv")

label_encoder = LabelEncoder()

categorical_features = [feature for feature in df_drug.columns if df_drug[feature].dtypes == 'O']
for feature in categorical_features:
    print(df_drug[feature].unique())
    print(label_encoder.fit_transform(df_drug[feature].unique()))

['F' 'M']
[0 1]
['HIGH' 'LOW' 'NORMAL']
[0 1 2]
['HIGH' 'NORMAL']
[0 1]
['DrugY' 'drugC' 'drugX' 'drugA' 'drugB']
[0 3 4 1 2]
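The codes above follow from how `LabelEncoder` works: it sorts the distinct labels and numbers them in that order. Uppercase letters sort before lowercase ones, which is why `DrugY` gets code 0 ahead of `drugA`. The plain-Python sketch below reproduces the same mappings; `label_codes` is a hypothetical helper written for illustration and is not scikit-learn's implementation.

```python
def label_codes(values):
    """Map each distinct label to its position in sorted order."""
    return {label: code for code, label in enumerate(sorted(set(values)))}

drug_codes = label_codes(["DrugY", "drugC", "drugX", "drugA", "drugB"])
bp_codes = label_codes(["HIGH", "LOW", "NORMAL"])

print(drug_codes)  # {'DrugY': 0, 'drugA': 1, 'drugB': 2, 'drugC': 3, 'drugX': 4}
print(bp_codes)    # {'HIGH': 0, 'LOW': 1, 'NORMAL': 2}
```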


#### Step 2.2: Develop a function to make an individual prediction
We are now going to develop a function that makes a prediction for a single patient, assuming that the data is "raw", i.e. that "F" = Female and "M" = Male for gender, and so on.

In [12]:
gender_map = {"F": 0, "M": 1}
bp_map = {"HIGH": 0, "LOW": 1, "NORMAL": 2}
cholesterol_map = {"HIGH": 0, "NORMAL": 1}
drug_map = {0: "DrugY", 3: "drugC", 4: "drugX", 1: "drugA", 2: "drugB"}

def predict_drug(Age, 
                 Sex, 
                 BP, 
                 Cholesterol, 
                 Na_to_K):

    # 1. Read the machine learning model from its saved state ...
    with open('model.pkl', 'rb') as pickle_file:
        model = pickle.load(pickle_file)

    # 2. Transform the "raw data" passed into the function to the encoded / numerical values using the maps / dictionaries
    Sex = gender_map[Sex]
    BP = bp_map[BP]
    Cholesterol = cholesterol_map[Cholesterol]

    # 3. Make an individual prediction for this set of data
    y_predict = model.predict([[Age, Sex, BP, Cholesterol, Na_to_K]])[0]

    # 4. Return the "raw" version of the prediction i.e. the actual name of the drug rather than the numerical encoded version
    return drug_map[y_predict]
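The dictionary lookups in `predict_drug` raise a bare `KeyError` when a category is not one of the values seen in training (for example `"Low"` instead of `"LOW"`). A caller may prefer a clearer message before the model is touched. The following is a minimal sketch of a hypothetical `validate_patient` helper; the allowed categories come from the Step 2.1 output, while the "must be positive" checks on the numeric fields are an assumption, not a rule from the dataset.

```python
ALLOWED_VALUES = {
    "Sex": {"F", "M"},
    "BP": {"HIGH", "LOW", "NORMAL"},
    "Cholesterol": {"HIGH", "NORMAL"},
}

def validate_patient(Age, Sex, BP, Cholesterol, Na_to_K):
    """Raise ValueError with a readable message if any input looks invalid."""
    categorical = {"Sex": Sex, "BP": BP, "Cholesterol": Cholesterol}
    for name, value in categorical.items():
        if value not in ALLOWED_VALUES[name]:
            allowed = sorted(ALLOWED_VALUES[name])
            raise ValueError(f"{name} must be one of {allowed}, got {value!r}")

    # Assumed sanity checks: both numeric features must be positive.
    if Age <= 0:
        raise ValueError(f"Age must be positive, got {Age!r}")
    if Na_to_K <= 0:
        raise ValueError(f"Na_to_K must be positive, got {Na_to_K!r}")
```

Calling `validate_patient(...)` with the same arguments just before `predict_drug(...)` is one way to use it.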

#### Step 2.3: Test the function ...

In [13]:
predict_drug(47, "F", "LOW",  "HIGH", 14)

'drugC'

In [14]:
predict_drug(60, "F", "LOW",  "HIGH", 20)

'DrugY'

### Stage 3: Develop a Web Service Wrapper
At this stage we have everything we need to deploy the machine learning model into a production system, provided that system is written in Python and that we are able to change it.

If these conditions hold we can simply add the ``predict_drug`` function and put a copy of the ``model.pkl`` file in the same directory and everything will work.

But what if the production system is written in Java, C# or another non-Python language? In that case we cannot use our solution directly because it is written in Python, and even if we translated the code, those languages have no ``scikit-learn`` library and cannot read a ``pickle`` file, so the translated code would not run.

If this is the case then we will need to provide a web-service wrapper for our prediction function ...
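The service code itself is not listed in this notebook, so the following is only a minimal sketch of what such a wrapper could look like in Flask. The `/drug` route, the query parameter names and the `recommended_drug` JSON key are taken from the example URLs and the client code elsewhere in this article; the `create_app` factory and its `predict_fn` argument are hypothetical names chosen so that the sketch does not depend on `model.pkl` being present.

```python
from flask import Flask, jsonify, request

def create_app(predict_fn):
    """Build a Flask app that exposes predict_fn at GET /drug."""
    app = Flask(__name__)

    @app.route("/drug", methods=["GET"])
    def drug():
        try:
            # Query-string values arrive as text, so convert the numeric ones.
            recommended = predict_fn(
                int(request.args["Age"]),
                request.args["Sex"],
                request.args["BP"],
                request.args["Cholesterol"],
                float(request.args["Na_to_K"]),
            )
        except (KeyError, ValueError) as exc:
            # Missing parameter, unknown category or non-numeric value.
            return jsonify({"error": f"missing or invalid parameter: {exc}"}), 400
        return jsonify({"recommended_drug": recommended})

    return app
```

To serve the real model locally, something like `create_app(predict_drug).run()` would start Flask's development server on port 5000; a production deployment would normally sit behind a proper WSGI server instead.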

https://programminghistorian.org/en/lessons/creating-apis-with-python-and-flask

https://code.visualstudio.com/docs/python/tutorial-flask

If you would like to know how to set up a Flask app inside VS Code so that you can debug it and step through the code, check out my article on the subject: https://grahamharrison-86487.medium.com/how-to-debug-flask-applications-in-vs-code-c65c9bdbef21


https://graham-harrison68-web02.azurewebsites.net/drug?Age=47&Sex=F&BP=LOW&Cholesterol=HIGH&Na_to_K=14

https://graham-harrison68-web02.azurewebsites.net/drug?Age=60&Sex=F&BP=LOW&Cholesterol=HIGH&Na_to_K=20

https://www.nylas.com/blog/use-python-requests-module-rest-apis/

curl -X GET "http://127.0.0.1:5000/drug?Age=60&Sex=F&BP=LOW&Cholesterol=HIGH&Na_to_K=20"
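The same call can be made from Python. The sketch below builds the query string with `urlencode` so that values are escaped correctly, and adds a timeout and an HTTP status check around `requests.get`. The function names and the 10-second timeout are assumptions for illustration; the default base URL is the local address from the `curl` command above.

```python
from urllib.parse import urlencode

import requests

def build_drug_url(base_url, age, sex, bp, cholesterol, na_to_k):
    """Return the full /drug URL for one patient."""
    query = urlencode({
        "Age": age,
        "Sex": sex,
        "BP": bp,
        "Cholesterol": cholesterol,
        "Na_to_K": na_to_k,
    })
    return f"{base_url}/drug?{query}"

def get_recommended_drug(age, sex, bp, cholesterol, na_to_k,
                         base_url="http://127.0.0.1:5000", timeout=10):
    """Call the web service and return the recommended drug name."""
    url = build_drug_url(base_url, age, sex, bp, cholesterol, na_to_k)
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()   # raise on 4xx / 5xx instead of parsing an error body
    return response.json()["recommended_drug"]
```

With the service running locally, `get_recommended_drug(60, "F", "LOW", "HIGH", 20)` sends the same request as the `curl` command.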

In [63]:
prescribe_label = Label('Drug prescription prediction for age, gender, bp, cholesterol and "Na to K"')
age_text = BoundedIntText(min=16, max=100, value=47, description="Age:", disabled=False)
gender_dropdown = Dropdown(options=['F', 'M'], description='Gender:', disabled=False)
bp_dropdown = Dropdown(options=['HIGH', 'LOW', 'NORMAL'], value="LOW", description='BP:', disabled=False)
cholesterol_dropdown = Dropdown(options=['HIGH', 'NORMAL'], description='Cholesterol:', disabled=False)
na_to_k_text = BoundedFloatText(min=0.0, max=50.0, value=14, description="Na to K", disabled=False)
prescribe_button = Button(description="Prescribe")
prescribe_output = Output()

# Button click event handlers ...
def prescribe_button_on_click(b):
    
    request_url = f"https://graham-harrison68-web03.azurewebsites.net/drug?Age={age_text.value}&Sex={gender_dropdown.value}&BP={bp_dropdown.value}&Cholesterol={cholesterol_dropdown.value}&Na_to_K={na_to_k_text.value}"
    response = requests.get(request_url)
    recommended_drug = response.json()["recommended_drug"]

    prescribe_output.clear_output()
    with prescribe_output:

        print(f"The recommended drug is {recommended_drug}")
        
prescribe_button.on_click(prescribe_button_on_click)

vbox_prescribe = VBox([prescribe_label, age_text, gender_dropdown, bp_dropdown, cholesterol_dropdown, na_to_k_text, prescribe_button, prescribe_output])

vbox_prescribe

VBox(children=(Label(value='Drug prescription prediction for age, gender, bp, cholesterol and "Na to K"'), Bou…