# **Why Do We Need to Save a Machine Learning Model?**

After training a machine learning model, we can save it so that the same model can be reused later, for example to test it on a new dataset or to compare its performance with other models. A saved model can also be deployed in the final application.

There are many ways to save a machine learning model. If you have never saved one before, the section below shows how to save and reuse a machine learning model using the Python programming language.

# **Steps to Save a Machine Learning Model**

Below are the steps for saving a machine learning model:

1. First train a model
2. Then convert it into a byte stream
3. Save the byte stream as a binary file


Those are the steps for saving a machine learning model. In Python, converting an object such as a machine learning model into a byte stream is known as **pickling**. To use the saved model again, you convert the byte stream back into a usable machine learning model, which is known as **unpickling**.
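To make the idea of a byte stream concrete, here is a minimal sketch that pickles and unpickles an object entirely in memory. The dictionary `model_params` and its values are hypothetical stand-ins for a trained model; a real scikit-learn estimator can be passed to the same `pickle.dumps` and `pickle.loads` calls.

```python
import pickle

# Hypothetical stand-in for a trained model: just its learned parameters
model_params = {"coef": [0.15, 0.98], "intercept": -1.9}

# Pickling: convert the Python object into a byte stream
byte_stream = pickle.dumps(model_params)
print(type(byte_stream))  # <class 'bytes'>

# Unpickling: rebuild an equivalent object from the byte stream
restored_params = pickle.loads(byte_stream)
print(restored_params == model_params)  # True
```

Writing `byte_stream` to a file opened in binary mode gives the saved model file described in step 3. Only unpickle data from sources you trust, because loading a pickle can execute arbitrary code.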

Now let’s see how to save a machine learning model. Here I will first train a linear regression model and then save it using Python’s pickle module:

In [2]:
import zipfile
import os

In [3]:
!wget --no-check-certificate \
    "https://github.com/hussain0048/Projects-/archive/refs/heads/master.zip" \
    -O "/tmp/Projects-.zip"


--2021-07-03 04:43:05--  https://github.com/hussain0048/Projects-/archive/refs/heads/master.zip
Resolving github.com (github.com)... 140.82.114.4
Connecting to github.com (github.com)|140.82.114.4|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://codeload.github.com/hussain0048/Projects-/zip/refs/heads/master [following]
--2021-07-03 04:43:06--  https://codeload.github.com/hussain0048/Projects-/zip/refs/heads/master
Resolving codeload.github.com (codeload.github.com)... 140.82.114.9
Connecting to codeload.github.com (codeload.github.com)|140.82.114.9|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [application/zip]
Saving to: ‘/tmp/Projects-.zip’

/tmp/Projects-.zip      [  <=>               ]   1.11M  2.61MB/s    in 0.4s    

2021-07-03 04:43:06 (2.61 MB/s) - ‘/tmp/Projects-.zip’ saved [1168951]



In [4]:
# Open the zip file in read mode and extract its contents into /tmp;
# the with-block closes the file automatically
with zipfile.ZipFile('/tmp/Projects-.zip', 'r') as zip_ref:
    zip_ref.extractall('/tmp')

In [6]:
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.utils import shuffle

In [7]:
data = pd.read_csv("/tmp/Projects--master/Datasets/student-mat.csv")

In [8]:
data = data[["G1", "G2", "G3", "studytime", "failures", "absences"]]


In [9]:
predict = "G3"


In [10]:
# Features: every column except the target; labels: the target column
x = np.array(data.drop(columns=[predict]))
y = np.array(data[predict])

In [11]:
from sklearn.model_selection import train_test_split
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.2)


In [12]:
linear_regression = LinearRegression()
linear_regression.fit(xtrain, ytrain)

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)

In [13]:
import pickle
with open("pickle_model", "wb") as file:
    pickle.dump(linear_regression, file)

The model is now saved: a new file named “pickle_model” has been created in the current working directory. To use the saved model to predict values on the test set, run the code below:

In [14]:
with open("pickle_model", "rb") as file:
    model = pickle.load(file)

predictions = model.predict(xtest)

# Print each prediction with its input features and the actual grade
for i in range(len(predictions)):
    print(predictions[i], xtest[i], [ytest[i]])
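A quick sanity check after reloading is to confirm that the restored model gives the same predictions as the original. The sketch below illustrates this with the standard library only: the dictionary `original`, the helper `predict`, and the sample `rows` are hypothetical stand-ins for the fitted `linear_regression` object and `xtest`, and a temporary directory is used so that no file is left behind.

```python
import os
import pickle
import tempfile


def predict(params, rows):
    """Apply a linear model to each row: intercept + sum(coef_i * feature_i)."""
    return [
        params["intercept"] + sum(c * v for c, v in zip(params["coef"], row))
        for row in rows
    ]


# Hypothetical stand-ins for a fitted model and a few test rows
original = {"coef": [0.2, 0.8], "intercept": 1.0}
rows = [[10, 12], [8, 9], [15, 14]]

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "pickle_model")

    # Save the model as a binary file
    with open(path, "wb") as file:
        pickle.dump(original, file)

    # Load it back from the same file
    with open(path, "rb") as file:
        restored = pickle.load(file)

# The restored model should reproduce the original predictions exactly
same = predict(original, rows) == predict(restored, rows)
print(same)
```

With the scikit-learn model from this post, the equivalent check would compare `model.predict(xtest)` against `linear_regression.predict(xtest)`, for instance with `np.allclose`.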

# **References**

[How to Save a Machine Learning Model?](https://thecleverprogrammer.com/2021/05/13/how-to-save-a-machine-learning-model/)