# Model Deployment Example in Python

This notebook demonstrates one way to deploy a model in production. The steps are:
* Create a SQLite database.
* Load the Iris data into that database.
* Generate Train, Dev and Test datasets.
* Use Train and Dev to check model performance. This step can be repeated regularly with new data for model maintenance.
* Use Test as the new dataset for which we want predictions.

In [179]:
import os

In [173]:
from sklearn import datasets
import sqlite3   

In [141]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
import pickle

### SQLite database creation

In [197]:
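def getDatabaseConnection():
    # sqlite3.connect creates 'example.db' if it does not exist yet,
    # so no separate os.open call (which left a file descriptor open) is needed.
    conn = sqlite3.connect('example.db')
    return conn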

### Create table for storing Iris data

In [201]:
def createTable():
    conn = getDatabaseConnection()
    c = conn.cursor()

    # Create the table; IF NOT EXISTS lets this cell be re-run without an error
    c.execute('''CREATE TABLE IF NOT EXISTS iris
                 (sepal_length real, sepal_width real, petal_length real, petal_width real, target int)''')
    conn.commit()
    conn.close()

    print("....Database Created....")

### Load Iris data into database

In [202]:
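def loadIrisToDatabase():
    conn = getDatabaseConnection()
    c = conn.cursor()

    iris = datasets.load_iris()

    # Pair every feature row (starting at index 0) with its label.
    # tolist() yields plain Python floats and ints, and the ? placeholders
    # keep the decimal values; the %d format would truncate them to integers.
    rows = [features + [label] for features, label in zip(iris.data.tolist(), iris.target.tolist())]
    c.executemany("INSERT INTO iris VALUES (?,?,?,?,?)", rows)

    conn.commit()
    conn.close()
    print('.... Data Loaded....')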
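
As a small check of how values reach the table, here is a sketch that compares `%d` string formatting with `?` placeholders. It uses a throwaway in-memory database and a hypothetical table named `demo`, neither of which is part of the notebook's `example.db`.

```python
import sqlite3

# Throwaway in-memory database; the table name `demo` is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE demo (how text, sepal_length real)")

value = 5.1
# String formatting with %d converts the float to an integer before SQLite sees it.
conn.execute("INSERT INTO demo VALUES ('format', %d)" % value)
# A ? placeholder hands the value to SQLite unchanged.
conn.execute("INSERT INTO demo VALUES ('placeholder', ?)", (value,))

# Read both rows back as {how: sepal_length}.
rows = dict(conn.execute("SELECT how, sepal_length FROM demo"))
conn.close()
print(rows)
```

The row written with `%d` comes back as 5.0, while the placeholder row keeps 5.1. Placeholders also avoid building SQL from strings, which matters once the values come from outside the notebook.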

### Create Datasets as Train, Dev and Test

Here a single query fetches all the data. In a production system, the train and dev data would live in their own table, and the test set would be supplied only when actual predictions are required.

In [163]:
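def createDatasets():
    conn = getDatabaseConnection()
    df = pd.read_sql_query('SELECT * FROM iris', conn)
    conn.close()
    # iloc[:, :4] excludes the target; label-based loc[:, :4] is end-inclusive and would include it
    x, y = df.iloc[:, :4], df.iloc[:, 4]
    traindevx, testx, traindevy, testy = train_test_split(x, y, test_size=0.2)
    trainx, devx, trainy, devy = train_test_split(traindevx, traindevy, test_size=0.3)
    # Return the train split, not train+dev, so the dev rows stay unseen during training
    return (trainx, trainy, devx, devy, testx, testy)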
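
The production layout described above, with labelled train/dev rows in one table and new rows arriving separately, might look roughly like the sketch below. The table names `iris_labeled` and `iris_incoming` are hypothetical, the 80/20 hold-out only simulates "new" data, and an in-memory database stands in for a real one.

```python
import sqlite3

import pandas as pd
from sklearn import datasets
from sklearn.model_selection import train_test_split

iris = datasets.load_iris()
features = ["sepal_length", "sepal_width", "petal_length", "petal_width"]
df = pd.DataFrame(iris.data, columns=features)
df["target"] = iris.target

# Hold out 20% of the rows to play the role of data that arrives later.
labeled, incoming = train_test_split(df, test_size=0.2, random_state=0)

conn = sqlite3.connect(":memory:")
# Labelled rows (features + target) are what train and dev are drawn from.
labeled.to_sql("iris_labeled", conn, index=False)
# Incoming rows carry features only; the label is what we want to predict.
incoming[features].to_sql("iris_incoming", conn, index=False)

train_dev = pd.read_sql_query("SELECT * FROM iris_labeled", conn)
new_rows = pd.read_sql_query("SELECT * FROM iris_incoming", conn)
conn.close()

print(train_dev.shape, new_rows.shape)
```

Keeping the target column out of the incoming table makes it harder to leak labels into the prediction step by accident.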

### Train model on Train data set

Train and Dev are used to validate model performance over time. This step is repeated on a schedule to keep the model up to date with new data.<br>
We use pickle here to store the model as a file on disk. The file is loaded again later whenever the model is needed.

In [142]:
def trainModel(trainx, trainy):
    model = RandomForestClassifier()
    model.fit(trainx, trainy)
    # Persist the fitted model; the with block closes the file after writing
    with open('saved_model', 'wb') as f:
        pickle.dump(model, f)

### Predict Method

We load the saved pickle file as the model and use it to return predictions.

In [145]:
def predictModel(testx):
    # Load the model written by trainModel; the with block closes the file
    with open('saved_model', 'rb') as f:
        model = pickle.load(f)
    pred = model.predict(testx)
    return pred
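
To see that the save and load steps give back an equivalent model, the following sketch does a round trip through a temporary file and compares predictions. The temporary directory, `n_estimators=10` and `random_state=0` are choices made for this illustration only.

```python
import os
import pickle
import tempfile

from sklearn import datasets
from sklearn.ensemble import RandomForestClassifier

iris = datasets.load_iris()
model = RandomForestClassifier(n_estimators=10, random_state=0)
model.fit(iris.data, iris.target)

# Write the model to a temporary file and read it straight back.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "saved_model")
    with open(path, "wb") as f:
        pickle.dump(model, f)
    with open(path, "rb") as f:
        restored = pickle.load(f)

# The restored model should predict exactly what the original predicts.
same = bool((model.predict(iris.data) == restored.predict(iris.data)).all())
print(same)
```

Unpickling can execute arbitrary code, so a model file should only be loaded if it comes from a trusted source.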

### Model Accuracy

In this step we print the confusion matrix, precision, recall and accuracy of the model.

In [151]:
from sklearn.metrics import confusion_matrix, precision_score, recall_score, accuracy_score

def checkAccuracy(predy, actualy):
    # scikit-learn metrics take the true labels first, then the predictions
    con_mat = confusion_matrix(actualy, predy)
    print(con_mat)
    precision = precision_score(actualy, predy, average=None)
    recall = recall_score(actualy, predy, average=None)
    accuracy = accuracy_score(actualy, predy)
    print('Precision is {} Recall is {} Accuracy is {}'.format(precision, recall, accuracy))
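
To make the metrics concrete, here is a small worked example on six made-up labels (the `y_true` and `y_pred` values are invented for illustration and are not Iris results). In the confusion matrix, rows are true classes and columns are predicted classes.

```python
from sklearn.metrics import accuracy_score, confusion_matrix, precision_score, recall_score

# Invented labels: two samples per class, four of six predicted correctly.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]

cm = confusion_matrix(y_true, y_pred)
# Per-class precision: correct predictions of a class / all predictions of that class.
precision = precision_score(y_true, y_pred, average=None)
# Per-class recall: correct predictions of a class / all true members of that class.
recall = recall_score(y_true, y_pred, average=None)
accuracy = accuracy_score(y_true, y_pred)

print(cm)
print(precision, recall, accuracy)
```

Class 1 is predicted three times but is correct only twice, so its precision is 2/3 while its recall is 1.0. Overall accuracy is 4/6.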

## The steps above in action

In [203]:
createTable()
loadIrisToDatabase()

....Database Created....
.... Data Loaded....


In [204]:
data_sets = createDatasets()


* data_sets[0], data_sets[1] - Train dataset
* data_sets[2], data_sets[3] - Dev dataset
* data_sets[4], data_sets[5] - Test data

In [None]:
trainModel(data_sets[0], data_sets[1])

## Check Accuracy on Dev set

In [205]:
pred = predictModel(data_sets[2])
checkAccuracy(pred, data_sets[3])

[[12  0  0]
 [ 0 11  0]
 [ 0  0 13]]
Precision is [ 1.  1.  1.] Recall is [ 1.  1.  1.] Accuracy is 1.0


## Check Accuracy on Test set

In [206]:
pred = predictModel(data_sets[4])
checkAccuracy(pred, data_sets[5])

[[ 7  0  0]
 [ 0 14  0]
 [ 0  0  9]]
Precision is [ 1.  1.  1.] Recall is [ 1.  1.  1.] Accuracy is 1.0
