<a href="https://www.kaggle.com/code/tasnimniger/model-serialization-using-pickle-and-joblib?scriptVersionId=157734301" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# Serializing a Trained Model in Python

**Serialization** is a technique that transforms an object into a format that can be stored or transmitted. Later, we can reconstruct (deserialize) the object from that data. The reconstructed object has the same structure and contents as the original, so we can reuse it instead of creating it from scratch. For a trained model, this means we do not have to repeat the training.
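The round trip described above can be sketched with the standard-library pickle module and a plain dictionary. The dictionary and its keys are made up for illustration and are not part of the notebook.

```python
import pickle

# A hypothetical object to serialize; any picklable Python object would do.
original = {"name": "demo", "scores": [0.91, 0.93], "shape": (150, 4)}

# Serialize: turn the object into a bytes payload that can be stored or sent.
payload = pickle.dumps(original)

# Deserialize: rebuild an equivalent object from the payload.
restored = pickle.loads(payload)

print(type(payload).__name__)  # the payload is a bytes object
print(restored == original)    # same structure and contents
print(restored is original)    # but a separate object in memory
```

The restored object compares equal to the original, yet it is a new object, which is what makes serialization useful for moving data between processes or sessions.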

Here, we will explore two common serialization libraries in Python, namely pickle and joblib.

In [1]:
# import python libraries
import sklearn
import sklearn.datasets
import sklearn.ensemble
import sklearn.model_selection

Create a simple classification model using the well-known Iris dataset.

In [2]:
# Load the Iris dataset
iris = sklearn.datasets.load_iris()

In [3]:
# Split the dataset into training and testing subsets
train, test, labels_train, labels_test = sklearn.model_selection.train_test_split(iris.data, iris.target, train_size=0.80)

In [4]:
# Train an ensemble model: a random forest classifier
rf = sklearn.ensemble.RandomForestClassifier(n_estimators=500)
rf.fit(train, labels_train)
rf

In [5]:
#Changing working directory
import os
os.chdir('/kaggle/working/')


In [6]:
#Checking the path to the current working directory
os.getcwd()

'/kaggle/working'

# Saving trained model with pickle

The Python **pickle** module serializes and deserializes Python objects. Pickling converts a Python object hierarchy into a byte stream, and unpickling reverses the process. Because pickle is part of the Python standard library, we can save a machine learning model as a pickle file without installing anything extra.

To save the model, we choose a filename, conventionally with a ‘.sav’ or ‘.pkl’ extension, and call the dump() function from pickle or joblib.

Choose a filename and dump the model to a pickle file.

In [7]:
import pickle

# Save the Iris classification model as a pickle file.
filename = 'model_pickle.pkl'
# The with-block closes the file handle once the model has been written.
with open(filename, 'wb') as f:
    pickle.dump(rf, f)

# Load model with Pickle

In [8]:
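# Load the model from the pickle file; the with-block closes the file afterwards.
with open(filename, 'rb') as f:
    loaded_model = pickle.load(f)
# Mean accuracy of the reloaded model on the held-out test set.
result = loaded_model.score(test, labels_test)
print(result)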

0.9333333333333333
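A reloaded model should behave exactly like the one that was saved. The following is a minimal, self-contained sketch of how that could be checked. It retrains a smaller forest so it runs on its own, and the names `model`, `restored`, `model_check.pkl`, and the `random_state=0` seeds are assumptions made for this illustration, not part of the notebook.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Rebuild a small model so the sketch is self-contained (seeds are assumptions).
iris = load_iris()
X_train, X_test, y_train, y_test = train_test_split(
    iris.data, iris.target, train_size=0.80, random_state=0
)
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

# Save and reload through a temporary directory to keep the working dir clean.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model_check.pkl")  # hypothetical filename
    with open(path, "wb") as f:
        pickle.dump(model, f)
    with open(path, "rb") as f:
        restored = pickle.load(f)

# The fitted trees are stored in the file, so predictions should match exactly.
same_predictions = np.array_equal(model.predict(X_test), restored.predict(X_test))
print(same_predictions)
```

Comparing predictions is a stricter check than comparing accuracy scores, since two different models can reach the same score by chance.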


# Saving trained model with Joblib

**Joblib** is a Python library that provides lightweight pipelining capabilities. It focuses mainly on disk caching, memoization, and parallel computing. It is optimized for objects that hold large NumPy arrays, which makes it a good choice for many machine learning models.

For larger models, use joblib. It is a third-party package rather than part of the standard library, but it is installed as a dependency of scikit-learn.

In [9]:
import joblib
# Save the Iris classification model with joblib; dump() accepts a path directly.
filename = 'model_joblib.sav'
joblib.dump(rf, filename)

['model_joblib.sav']

# Load model with Joblib

In [10]:
loaded_model = joblib.load(filename)
result = loaded_model.score(test, labels_test)
print(result)

0.9333333333333333
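Because joblib is built with NumPy arrays in mind, its `dump()` function also accepts a `compress` argument, which can shrink files for large models. The sketch below illustrates this with a deliberately compressible array of zeros rather than the notebook's model. The file names and the compression level 3 are assumptions chosen for illustration, and real models usually compress far less than this array does.

```python
import os
import tempfile

import joblib
import numpy as np

# A highly compressible stand-in for model data (not the notebook's model).
data = np.zeros((1000, 100))

with tempfile.TemporaryDirectory() as tmp_dir:
    raw_path = os.path.join(tmp_dir, "raw.joblib")           # hypothetical name
    small_path = os.path.join(tmp_dir, "compressed.joblib")  # hypothetical name

    joblib.dump(data, raw_path)                 # default: no compression
    joblib.dump(data, small_path, compress=3)   # compression level 3

    raw_size = os.path.getsize(raw_path)
    small_size = os.path.getsize(small_path)

    # Loading works the same way whether or not the file was compressed.
    restored = joblib.load(small_path)

print(raw_size, small_size)
print(np.array_equal(restored, data))
```

Compression trades smaller files for extra time when saving and loading, so it is worth measuring both on the actual model before adopting it.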
