## What is Pickle?
Pickle is a module in the Python standard library. It is a common choice for tasks such as

* Serialization (also called marshalling)
* Deserialization

In Python, these two operations are usually called pickling and unpickling.

**Pickling**
Pickling is the process of converting a Python object into a stream of bytes, following the hierarchy of the object being converted.
**Unpickling**
Unpickling is the process of converting the pickled stream of bytes back into the original Python object, again following the object hierarchy.
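
The round trip described above can be sketched in memory with `pickle.dumps` and `pickle.loads`, which work on bytes rather than files. The `record` dictionary is a made-up example object, not data from this notebook.

```python
import pickle

# Hypothetical nested object, used only for illustration.
record = {"name": "patient-1", "features": [6, 148, 72], "label": 1}

# Pickling: walk the object hierarchy and produce bytes.
payload = pickle.dumps(record)

# Unpickling: rebuild a new object from those bytes.
restored = pickle.loads(payload)

print(type(payload).__name__)  # the pickled form is a bytes object
print(restored == record)      # same contents
print(restored is record)      # but a distinct object
```

The restored object compares equal to the original but is a separate copy, so later changes to one do not affect the other.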

In [7]:
import pandas as pd
from sklearn import model_selection
from sklearn.linear_model import LogisticRegression
import pickle
import sys
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
dataframe = pd.read_csv(r"C:\Users\J554696\Desktop\diabetic_data.csv",names=names)
dataframe.head()

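|   | preg | plas | pres | skin | test | mass | pedi | age | class |
|---|------|------|------|------|------|------|-------|-----|-------|
| 0 | 6 | 148 | 72 | 35 | 0 | 33.6 | 0.627 | 50 | 1 |
| 1 | 1 | 85 | 66 | 29 | 0 | 26.6 | 0.351 | 31 | 0 |
| 2 | 8 | 183 | 64 | 0 | 0 | 23.3 | 0.672 | 32 | 1 |
| 3 | 1 | 89 | 66 | 23 | 94 | 28.1 | 0.167 | 21 | 0 |
| 4 | 0 | 137 | 40 | 35 | 168 | 43.1 | 2.288 | 33 | 1 |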


In [10]:
array = dataframe.values
X = array[:,0:8]
Y = array[:,8]
test_size = 0.33
seed = 7
X_train, X_test, Y_train, Y_test = model_selection.train_test_split(X, Y, test_size=test_size, random_state=seed)
# Fit the model on the training split (67% of the data; the other 33% is held out for testing)
model = LogisticRegression()
model.fit(X_train, Y_train)

LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,
          intercept_scaling=1, max_iter=100, multi_class='ovr', n_jobs=1,
          penalty='l2', random_state=None, solver='liblinear', tol=0.0001,
          verbose=0, warm_start=False)

In [11]:
# save the model to disk
filename = 'finalized_model.sav'
# the with block closes the file once the model has been written
with open(filename, 'wb') as f:
    pickle.dump(model, f)

In [12]:
# load the model from disk
with open(filename, 'rb') as f:
    loaded_model = pickle.load(f)
result = loaded_model.score(X_test, Y_test)
print(result)

0.7559055118110236
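
The save and load steps above can also be written with `with` blocks, so the file is closed as soon as pickle has finished with it. This is a minimal sketch that uses only the standard library: `stand_in_model` is a hypothetical placeholder for a fitted model, and the file goes to a temporary directory instead of the working directory.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a fitted model; any picklable object works.
stand_in_model = {"coef": [0.12, -0.03, 0.4], "intercept": -1.5}

# Use a temporary directory so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "finalized_model.sav")

    # The with block closes the file even if pickle.dump raises.
    with open(path, "wb") as f:
        pickle.dump(stand_in_model, f)

    with open(path, "rb") as f:
        loaded = pickle.load(f)

    # After leaving the with block the handle is already closed.
    file_closed = f.closed

print(loaded == stand_in_model)
print(file_closed)
```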


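In [None]:
# The loaded model is a LogisticRegression object and has no close() method.
# Only the file objects passed to pickle.dump and pickle.load need closing.
print(type(loaded_model).__name__)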

## joblib
Joblib is part of the SciPy ecosystem and provides utilities for pipelining Python jobs.

The Joblib API provides utilities for efficiently saving and loading Python objects that make use of NumPy data structures. It may be faster than pickle for very large models.

Install joblib with **pip install joblib**

The API looks a lot like the pickle API, for example, you may save your trained model as follows:

In [13]:
import joblib
joblib.dump(model, "pima.joblib.dat")

You can later load the model from file and use it to make predictions as follows:

In [None]:
# load model from file
loaded_model = joblib.load("pima.joblib.dat")
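
The dump and load calls above can be tried end to end without a trained model. The sketch below assumes joblib is installed. `stand_in_model` is a hypothetical placeholder object, and the file is written to a temporary directory.

```python
import os
import tempfile

import joblib

# Hypothetical stand-in for a trained model; any picklable object works.
stand_in_model = {"weights": [0.5, -1.25, 3.0], "bias": 0.1}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "pima.joblib.dat")

    # joblib.dump returns the list of file names it wrote.
    written = joblib.dump(stand_in_model, path)

    # joblib.load rebuilds the object from the file.
    restored = joblib.load(path)

print(written == [path])
print(restored == stand_in_model)
```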

## References

http://dataaspirant.com/2017/02/13/save-scikit-learn-models-with-python-pickle/

https://machinelearningmastery.com/save-gradient-boosting-models-xgboost-python/