# Operationalising ML

![](https://www.researchgate.net/profile/Philipp-Hartlieb/publication/361258805/figure/fig1/AS:1168965330042880@1655714461815/Main-phases-of-the-ML-life-cycle-targeted-at-operationalising-ML-models-in-production.ppm)

In [4]:
import pandas as pd
df3 = pd.read_csv('datasets/drug200.csv')

In [5]:
df3

Unnamed: 0,Age,Sex,BP,Cholesterol,Na_to_K,Drug
0,23,F,HIGH,HIGH,25.355,DrugY
1,47,M,LOW,HIGH,13.093,drugC
2,47,M,LOW,HIGH,10.114,drugC
3,28,F,NORMAL,HIGH,7.798,drugX
4,61,F,LOW,HIGH,18.043,DrugY
...,...,...,...,...,...,...
195,56,F,LOW,HIGH,11.567,drugC
196,16,M,LOW,HIGH,12.006,drugC
197,52,M,NORMAL,HIGH,9.894,drugX
198,23,M,NORMAL,NORMAL,14.020,drugX


# Where we left it off

In [6]:
# Making it work for 2 inputs. This is your task
test_data = [[28, 'F', 'NORMAL', 'HIGH', 7.798], [61, 'F', 'LOW', 'HIGH', 18.043]]

In [8]:
df3.iloc[3:5,:]

Unnamed: 0,Age,Sex,BP,Cholesterol,Na_to_K,Drug
3,28,F,NORMAL,HIGH,7.798,drugX
4,61,F,LOW,HIGH,18.043,DrugY


# Answer

In [23]:
from joblib import load, dump

def get_prediction(model_path, encoder_path, label_encoder_path, user_input):
    
    # Let's load our model
    clf = load(model_path) # load and reuse the model
    print('Model Successfully Loaded')

    # Let's load our encoder
    enc = load(encoder_path) # load and reuse the model
    print('Encoder Successfully Loaded')
 
    # Let's load my label encoder
    le = load(label_encoder_path) # load and reuse the model
    print('Label Encoder Successfully Loaded')
    
    # 1. Firstly, create a DataFrame out of the user input
    
    # Now, i know i have told you that you can do this with dictionary comprehensions. 
    # But i was just sneakily trying to teach you about them. 
    # The truth is, you could have easily done the same with pd.DataFrame. Look !!!
    # Hey, don't be mad at me, we all learned something here. 
    
    pd_columns = ['Age', 'Sex', 'BP', 'Cholesterol', 'Na_to_K']

    # Creating DataFrame
    df_temp = pd.DataFrame(user_input, columns = pd_columns)
    print("DataFrame Successfully created from user data.")
    
    # 2. Get your categorical df with df3[['Sex', 'BP', 'Cholesterol']]
    cat_data = df3[['Sex', 'BP', 'Cholesterol']]
    
    # 3. Get your numerical df with df3[['Age', 'Na_to_K']]
    num_data = df3[['Age', 'Na_to_K']]
    
    # 4. Encode your categorical columns
    enc.transform(cat_data) # This will encode, but give me a sparse matrix in result
    enc.transform(cat_data).toarray() # This will give me the array i want
    
    # Let's create a DataFrame now
    pd.DataFrame(enc.transform(cat_data).toarray(), columns = enc.get_feature_names_out())
    print("DataFrame Successfully Encoded.")
    
    # 5. Save your encoded df.
    df_encoded = pd.DataFrame(enc.transform(cat_data).toarray(), columns = enc.get_feature_names_out())
    
    # 6. Combine your encoded df with your df_num
    df_X = num_data.join(df_encoded)
    print("Data Successfully Transformed to desired format.")
    
    # This about this
    # 7. At step 7, your data looks exactly like the data you used to train your model
    
    # ---------------------- This is Done now ----------------------
    
    # 8. Can you not just do clf.predict(yourdata)
    clf.predict(df_X) # This gives me a label as a prediction in an array
    
    prediction = clf.predict(df_X) # Saving my prediction
    print("Successfully captured prediction.")

    
    # 9. This will give your a label.
    le.inverse_transform(prediction) # This will again give me the output saved in an array. I will extract the 0'th item and return that.
    output = le.inverse_transform(prediction) # Saving my output
    print("Label Generated")
    
    print('------------------------------------------------------')
    
    # 10. You will have to convert that encoded label to the actual label, you can do that with you label encoder. 
    return output

# Let's see if this works now

In [24]:
# Let's see
get_prediction('my_df3_decision_tree.joblib', 'my_df3_encoder.joblib', 'my_df3_label_encoder.joblib', [[28, 'F', 'NORMAL', 'HIGH', 7.798], [61, 'F', 'LOW', 'HIGH', 18.043]])

Model Successfully Loaded
Encoder Successfully Loaded
Label Encoder Successfully Loaded
DataFrame Successfully created from user data.
DataFrame Successfully Encoded.
Data Successfully Transformed to desired format.
Successfully captured prediction.
Label Generated
------------------------------------------------------


'DrugY'

In [25]:
get_prediction('my_df3_decision_tree.joblib', 'my_df3_encoder.joblib', 'my_df3_label_encoder.joblib', [[28, 'F', 'NORMAL', 'HIGH', 7.798], [61, 'F', 'LOW', 'HIGH', 18.043]])

Model Successfully Loaded
Encoder Successfully Loaded
Label Encoder Successfully Loaded
DataFrame Successfully created from user data.
DataFrame Successfully Encoded.
Data Successfully Transformed to desired format.
Successfully captured prediction.
Label Generated
------------------------------------------------------


'DrugY'

### An explanation, of how i used it 

In [None]:
# Import pandas library
import pandas as pd

In [30]:
# initialize list of lists
data = [['SomeText', 20], ['Sometext2', 5], ['some_text3', 100]]

In [31]:
# Create the pandas DataFrame
df = pd.DataFrame(data, columns = ['Name', 'Age'])

In [33]:
# Let's look at the dataframe.
df

Unnamed: 0,Name,Age
0,SomeText,20
1,Sometext2,5
2,some_text3,100


### Now, let's try to do this on our DataFrame

In [35]:
user_data = [[28, 'F', 'NORMAL', 'HIGH', 7.798], [61, 'F', 'LOW', 'HIGH', 18.043]]

In [36]:
user_data

[[28, 'F', 'NORMAL', 'HIGH', 7.798], [61, 'F', 'LOW', 'HIGH', 18.043]]

In [37]:
pd_columns = ['Age', 'Sex', 'BP', 'Cholesterol', 'Na_to_K']

In [38]:
pd_columns

['Age', 'Sex', 'BP', 'Cholesterol', 'Na_to_K']

In [39]:
pd.DataFrame(user_data, columns=pd_columns)

Unnamed: 0,Age,Sex,BP,Cholesterol,Na_to_K
0,28,F,NORMAL,HIGH,7.798
1,61,F,LOW,HIGH,18.043


### Well
Isn't this exactly what we were looking for. Job done 

-------

# What if we want to read from a csv. 

That would make my life much easier

In [1]:
import pandas as pd
df3 = pd.read_csv('datasets/drug200.csv')

In [2]:
df3

Unnamed: 0,Age,Sex,BP,Cholesterol,Na_to_K,Drug
0,23,F,HIGH,HIGH,25.355,DrugY
1,47,M,LOW,HIGH,13.093,drugC
2,47,M,LOW,HIGH,10.114,drugC
3,28,F,NORMAL,HIGH,7.798,drugX
4,61,F,LOW,HIGH,18.043,DrugY
...,...,...,...,...,...,...
195,56,F,LOW,HIGH,11.567,drugC
196,16,M,LOW,HIGH,12.006,drugC
197,52,M,NORMAL,HIGH,9.894,drugX
198,23,M,NORMAL,NORMAL,14.020,drugX


In [41]:
df3[['Sex', 'BP' , 'Cholesterol']].drop_duplicates().index

Int64Index([0, 1, 3, 4, 8, 9, 11, 17, 27, 28, 35, 36], dtype='int64')

In [42]:
index = df3[['Sex', 'BP' , 'Cholesterol']].drop_duplicates().index

In [44]:
df3.iloc[index]

Unnamed: 0,Age,Sex,BP,Cholesterol,Na_to_K,Drug
0,23,F,HIGH,HIGH,25.355,DrugY
1,47,M,LOW,HIGH,13.093,drugC
3,28,F,NORMAL,HIGH,7.798,drugX
4,61,F,LOW,HIGH,18.043,DrugY
8,60,M,NORMAL,HIGH,15.171,DrugY
9,43,M,LOW,NORMAL,19.368,DrugY
11,34,F,HIGH,NORMAL,19.199,DrugY
17,43,M,HIGH,HIGH,13.972,drugA
27,49,F,NORMAL,NORMAL,9.381,drugX
28,39,F,LOW,NORMAL,22.697,DrugY


In [45]:
df_temp = df3.iloc[index]

In [46]:
df_temp

Unnamed: 0,Age,Sex,BP,Cholesterol,Na_to_K,Drug
0,23,F,HIGH,HIGH,25.355,DrugY
1,47,M,LOW,HIGH,13.093,drugC
3,28,F,NORMAL,HIGH,7.798,drugX
4,61,F,LOW,HIGH,18.043,DrugY
8,60,M,NORMAL,HIGH,15.171,DrugY
9,43,M,LOW,NORMAL,19.368,DrugY
11,34,F,HIGH,NORMAL,19.199,DrugY
17,43,M,HIGH,HIGH,13.972,drugA
27,49,F,NORMAL,NORMAL,9.381,drugX
28,39,F,LOW,NORMAL,22.697,DrugY


In [47]:
df_temp.reset_index()

Unnamed: 0,index,Age,Sex,BP,Cholesterol,Na_to_K,Drug
0,0,23,F,HIGH,HIGH,25.355,DrugY
1,1,47,M,LOW,HIGH,13.093,drugC
2,3,28,F,NORMAL,HIGH,7.798,drugX
3,4,61,F,LOW,HIGH,18.043,DrugY
4,8,60,M,NORMAL,HIGH,15.171,DrugY
5,9,43,M,LOW,NORMAL,19.368,DrugY
6,11,34,F,HIGH,NORMAL,19.199,DrugY
7,17,43,M,HIGH,HIGH,13.972,drugA
8,27,49,F,NORMAL,NORMAL,9.381,drugX
9,28,39,F,LOW,NORMAL,22.697,DrugY


In [48]:
df_temp = df_temp.reset_index()

In [49]:
df_temp

Unnamed: 0,index,Age,Sex,BP,Cholesterol,Na_to_K,Drug
0,0,23,F,HIGH,HIGH,25.355,DrugY
1,1,47,M,LOW,HIGH,13.093,drugC
2,3,28,F,NORMAL,HIGH,7.798,drugX
3,4,61,F,LOW,HIGH,18.043,DrugY
4,8,60,M,NORMAL,HIGH,15.171,DrugY
5,9,43,M,LOW,NORMAL,19.368,DrugY
6,11,34,F,HIGH,NORMAL,19.199,DrugY
7,17,43,M,HIGH,HIGH,13.972,drugA
8,27,49,F,NORMAL,NORMAL,9.381,drugX
9,28,39,F,LOW,NORMAL,22.697,DrugY


In [50]:
df_temp.shape[0]

12

In [51]:
import numpy as np

In [59]:
np.zeros(12)

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.])

In [60]:
df_temp['Drug'] = np.zeros(12)

In [61]:
df_temp['Drug']

0     0.0
1     0.0
2     0.0
3     0.0
4     0.0
5     0.0
6     0.0
7     0.0
8     0.0
9     0.0
10    0.0
11    0.0
Name: Drug, dtype: float64

In [67]:
df_temp

Unnamed: 0,index,Age,Sex,BP,Cholesterol,Na_to_K,Drug
0,0,23,F,HIGH,HIGH,25.355,0.0
1,1,47,M,LOW,HIGH,13.093,0.0
2,3,28,F,NORMAL,HIGH,7.798,0.0
3,4,61,F,LOW,HIGH,18.043,0.0
4,8,60,M,NORMAL,HIGH,15.171,0.0
5,9,43,M,LOW,NORMAL,19.368,0.0
6,11,34,F,HIGH,NORMAL,19.199,0.0
7,17,43,M,HIGH,HIGH,13.972,0.0
8,27,49,F,NORMAL,NORMAL,9.381,0.0
9,28,39,F,LOW,NORMAL,22.697,0.0


In [68]:
df_temp.to_csv('my_predictions.csv',index=False)

# What will we do now. 
- Load the models 
- Read the csv
- Make the predictions
- Save the predictions

# Answer

In [64]:
from joblib import load, dump

def get_prediction(model_path, encoder_path, label_encoder_path, user_file_path):
    
    # Let's load our model
    clf = load(model_path) # load and reuse the model
    print('Model Successfully Loaded')

    # Let's load our encoder
    enc = load(encoder_path) # load and reuse the model
    print('Encoder Successfully Loaded')
 
    # Let's load my label encoder
    le = load(label_encoder_path) # load and reuse the model
    print('Label Encoder Successfully Loaded')
    
    # Let's load the user file
    df3 = pd.read_csv(user_file_path)
    print('User File successfully loaded')
    
    # 2. Get your categorical df with df3[['Sex', 'BP', 'Cholesterol']]
    cat_data = df3[['Sex', 'BP', 'Cholesterol']]
    
    # 3. Get your numerical df with df3[['Age', 'Na_to_K']]
    num_data = df3[['Age', 'Na_to_K']]
    
    # 4. Encode your categorical columns
    enc.transform(cat_data) # This will encode, but give me a sparse matrix in result
    enc.transform(cat_data).toarray() # This will give me the array i want
    
    # Let's create a DataFrame now
    pd.DataFrame(enc.transform(cat_data).toarray(), columns = enc.get_feature_names_out())
    print("DataFrame Successfully Encoded.")
    
    # 5. Save your encoded df.
    df_encoded = pd.DataFrame(enc.transform(cat_data).toarray(), columns = enc.get_feature_names_out())
    
    # 6. Combine your encoded df with your df_num
    df_X = num_data.join(df_encoded)
    print("Data Successfully Transformed to desired format.")
    
    # This about this
    # 7. At step 7, your data looks exactly like the data you used to train your model
    
    # ---------------------- This is Done now ----------------------
    
    # 8. Can you not just do clf.predict(yourdata)
    clf.predict(df_X) # This gives me a label as a prediction in an array
    
    prediction = clf.predict(df_X) # Saving my prediction
    print("Successfully captured prediction.")

    
    # 9. This will give your a label.
    le.inverse_transform(prediction) # This will again give me the output saved in an array. I will extract the 0'th item and return that.
    output = le.inverse_transform(prediction) # Saving my output
    print("Label Generated")
    # 10. You will have to convert that encoded label to the actual label, you can do that with you label encoder. 
    
    # 11. Now that we have our predictions, time to save our predictions to our new columns
    df3['Drug'] = output
    
    # 12. Saving our DataFrame with our predictions
    df3.to_csv('predictions.csv')
    
    print('------------------------------------------------------')

    
    return output

# Let's see if this works now

In [70]:
# Let's see
get_prediction('my_df3_decision_tree.joblib', 'my_df3_encoder.joblib', 'my_df3_label_encoder.joblib', 'my_predictions.csv')

Model Successfully Loaded
Encoder Successfully Loaded
Label Encoder Successfully Loaded
User File successfully loaded
DataFrame Successfully Encoded.
Data Successfully Transformed to desired format.
Successfully captured prediction.
Label Generated
------------------------------------------------------


array(['DrugY', 'DrugY'], dtype=object)

# One Last Problem.
How will the user know which all things are possible.
- Time to solve for the same.
- By you !!!

# Now, we move onto scripting