## AccelerateAI Data Science Global Bootcamp

### Saving a Trained Model for Later Use

In production, we typically do not train the model; we load a model that has already been trained and use it for prediction.
There are several ways to save a trained model for use in production.

***

In [None]:
# Let's train a simple model
import pandas as pd
from sklearn.linear_model import LinearRegression

# Read the stock price dataset 
stock_df = pd.read_csv("MLR_Q10_StockPrice.csv")       

# Train a Linear regression model 
Y = stock_df["Stock Price"]
X = stock_df[["ROE", "Dividend"]]

lr_model = LinearRegression()
lr_model.fit(X,Y)

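# Predict for a new observation (ROE = 15, Dividend = 2.5).
# A DataFrame with the training column names avoids scikit-learn's feature-name warning.
x = pd.DataFrame([[15, 2.5]], columns=["ROE", "Dividend"])
lr_model.predict(x)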

## 1. Save as Pickle 
The Python `pickle` module serializes and deserializes Python object structures.
- Most Python objects, including scikit-learn models, can be pickled and saved to disk.
- Pickling converts a Python object into a byte stream.
- This byte stream contains all the information necessary to reconstruct the object later.
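
As a minimal sketch of this round trip, the snippet below uses `pickle.dumps` and `pickle.loads` to serialize to memory instead of a file. The `toy_model` dictionary is a made-up stand-in for a trained model and is not part of the stock price example.

```python
import pickle

# Hypothetical stand-in for a trained model; any picklable object behaves the same way.
toy_model = {"coef": [2.0, 3.0], "intercept": 1.0}

# Serialize the object to an in-memory stream.
payload = pickle.dumps(toy_model)

# Rebuild a new object from that stream.
restored = pickle.loads(payload)

print(type(payload).__name__)   # the stream is a bytes object
print(restored == toy_model)    # same contents as the original
print(restored is toy_model)    # but a distinct object
```

Writing to a file with `pickle.dump` follows the same pattern; the stream simply goes to disk.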

In [None]:
# loading library
import pickle

# open the file model_pkl in binary write mode and serialize the model into it
with open('model_pkl', 'wb') as files:
    pickle.dump(lr_model, files)
    
# load saved model
with open('model_pkl' , 'rb') as f:
    lr = pickle.load(f)

# check prediction
lr.predict(x)

## 2. Save using Joblib

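Joblib is a set of tools that provide lightweight pipelining in Python. For saving models, its `dump` and `load` functions work like pickle but are more efficient for objects that carry large NumPy arrays. Joblib also offers:
- transparent disk caching of functions and lazy re-evaluation
- simple parallel computing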
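
The caching and parallel features can be sketched as follows. This is an illustrative example only: `slow_square` is a hypothetical function, and the cache is written to a temporary directory chosen for the demo.

```python
import math
import tempfile

from joblib import Memory, Parallel, delayed

# Disk caching: results are stored under cache_dir (a temporary folder here).
cache_dir = tempfile.mkdtemp()
memory = Memory(location=cache_dir, verbose=0)

@memory.cache
def slow_square(n):
    # Stand-in for an expensive computation.
    return n * n

first = slow_square(4)    # computed and written to the cache
second = slow_square(4)   # same argument, so joblib can reuse the stored result

# Parallel computing: apply math.sqrt to several inputs using two workers.
roots = Parallel(n_jobs=2)(delayed(math.sqrt)(v) for v in [1, 4, 9, 16])

print(first, second, roots)
```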

In [None]:
from joblib import dump, load

# Save the model: dump(object, filename)
dump(lr_model, 'model_jlib.joblib')

# Load the model back from model_jlib.joblib
m_jlib = load('model_jlib.joblib')

# check prediction
m_jlib.predict(x)

## 3. PMML - Predictive Modeling Markup Language

PMML provides a way for analytic applications to describe and exchange predictive models produced by data mining and machine learning algorithms. 
- PMML is developed by the Data Mining Group (DMG), a consortium of commercial and open-source data mining companies. 
- PMML is XML-based. The standard was first developed in the late 1990s; version 4.1 was released in December 2011, and newer versions have been published since.
- PMML can help move predictive models from one statistical software package to another.
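
To make the XML structure concrete, here is a minimal sketch that parses a hand-written PMML regression model with Python's standard library and evaluates it. The document below is a simplified toy, its intercept and coefficients are invented for illustration, and the `predict` helper is hypothetical; real PMML files produced by tools contain more detail and are normally scored with a dedicated PMML engine.

```python
import xml.etree.ElementTree as ET

# Hand-written toy PMML document; the numbers are made up for illustration.
PMML_TEXT = """<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <Header description="Toy linear regression"/>
  <DataDictionary numberOfFields="3">
    <DataField name="ROE" optype="continuous" dataType="double"/>
    <DataField name="Dividend" optype="continuous" dataType="double"/>
    <DataField name="StockPrice" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="ROE"/>
      <MiningField name="Dividend"/>
      <MiningField name="StockPrice" usageType="target"/>
    </MiningSchema>
    <RegressionTable intercept="1.0">
      <NumericPredictor name="ROE" coefficient="2.0"/>
      <NumericPredictor name="Dividend" coefficient="3.0"/>
    </RegressionTable>
  </RegressionModel>
</PMML>"""

# PMML elements live in a versioned XML namespace.
NS = {"pmml": "http://www.dmg.org/PMML-4_4"}

root = ET.fromstring(PMML_TEXT)
table = root.find("pmml:RegressionModel/pmml:RegressionTable", NS)

# Read the model parameters straight from the XML attributes.
intercept = float(table.get("intercept"))
coefficients = {
    p.get("name"): float(p.get("coefficient"))
    for p in table.findall("pmml:NumericPredictor", NS)
}

def predict(row):
    """Evaluate intercept + sum(coefficient * value) for one input row."""
    return intercept + sum(coefficients[name] * row[name] for name in coefficients)

prediction = predict({"ROE": 15, "Dividend": 2.5})
print(coefficients, prediction)
```

Because the model is plain XML, any tool that understands the PMML schema can read the same parameters, which is what makes the format portable.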

In [None]:
# Using sklearn2pmml - 2 step process: fit a PMMLPipeline, then export it
# (sklearn2pmml needs a Java runtime for the conversion step)
from sklearn import datasets
from sklearn.tree import DecisionTreeClassifier
from sklearn2pmml import sklearn2pmml
from sklearn2pmml.pipeline import PMMLPipeline

iris = datasets.load_iris()

# Step 1: wrap the estimator in a PMMLPipeline and fit it
pipeline = PMMLPipeline([("classifier", DecisionTreeClassifier())])
pipeline.fit(iris.data, iris.target)

# Step 2: convert the fitted pipeline to a PMML file
sklearn2pmml(pipeline, "DecisionTree.pmml", with_repr=True)

In [None]:
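# Using nyoka - 1 step process
from nyoka import skl_to_pmml

features = iris.feature_names   # input column names
target = "species"              # target field name (illustrative choice)

skl_to_pmml(pipeline, features, target, "DecisionTree.pmml")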

After the PMML file is created, the accuracy of its predictions can be compared with that of the scikit-learn model by using the pypmml library.

In [None]:
from pypmml import Model
import pandas as pd

model = Model.fromFile("DecisionTree.pmml") 

## 4. TensorFlow

A SavedModel contains a complete TensorFlow program, including trained parameters (i.e., `tf.Variable` objects) and computation.

We can save and load a model in the SavedModel format using the following APIs:
- Low-level `tf.saved_model` API:
  - Save: `tf.saved_model.save(model, path_to_dir)`
  - Load: `model = tf.saved_model.load(path_to_dir)`
- High-level `tf.keras.Model` API:
  - Save: `model.save(path)`
  - Load: `model = tf.keras.models.load_model(path)`
- Keras also supports saving the model to a single HDF5 file (`.h5`).
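
The low-level API can be sketched with a small `tf.Module`. This is a minimal illustration under stated assumptions: the `Scaler` class is a hypothetical toy model, and the export directory is a temporary folder created for the demo.

```python
import tempfile

import tensorflow as tf

class Scaler(tf.Module):
    """Hypothetical toy model: multiplies its input by a single variable."""

    def __init__(self):
        super().__init__()
        self.weight = tf.Variable(3.0)

    # A fixed input signature lets the function be traced and exported.
    @tf.function(input_signature=[tf.TensorSpec(shape=[None], dtype=tf.float32)])
    def __call__(self, x):
        return self.weight * x

export_dir = tempfile.mkdtemp()

# Save the variable and the traced computation, then load them back.
tf.saved_model.save(Scaler(), export_dir)
restored = tf.saved_model.load(export_dir)

outputs = restored(tf.constant([1.0, 2.0])).numpy().tolist()
print(outputs)
```

The restored object carries both the variable value and the traced function, so it can be called without the original `Scaler` class definition.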

In [None]:
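import tensorflow as tf

# Create and train a new model instance.
# create_model, train_images and train_labels are assumed to be defined in earlier cells.
model = create_model()
model.fit(train_images, train_labels, epochs=5)

# Save the entire model as a SavedModel.
# Note: saving to a path without an extension works with Keras 2 (bundled with older TF 2.x);
# Keras 3 expects a filename ending in .keras or .h5.
!mkdir -p saved_model
model.save('saved_model/my_model')

# Load the saved model back
new_model = tf.keras.models.load_model('saved_model/my_model')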

# Check its architecture
new_model.summary()

In [None]:
# Calling `save("my_h5_model.h5")` creates an HDF5 file `my_h5_model.h5`.
model.save("my_h5_model.h5")

# It can be used to reconstruct the model identically.
reconstructed_model = tf.keras.models.load_model("my_h5_model.h5")