### How to Save a Machine Learning Model

When we want to solve a problem we have solved before, on a new dataset with the same features, we can reuse the machine learning model we trained earlier. To reuse that model, we first need to save it.

In this article, we will see how to save a machine learning model using Python.

### Why Do We Need to Save a Machine Learning Model?

After training a machine learning model, we can save it so that the same model can be used again later, either to test it on a new dataset or to compare its performance with other models. We can also deploy a saved machine learning model in the final application.

There are many methods of saving a machine learning model. In this section, we will learn an easy way to save and reuse a machine learning model.
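As one example of another method, the joblib library (installed alongside scikit-learn) provides `joblib.dump` and `joblib.load`. The sketch below is only an illustration: the tiny synthetic dataset, the temporary directory, and the file name `joblib_model.joblib` are assumptions, not part of the example used in the rest of this article.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LinearRegression

# Tiny synthetic dataset (an assumption for illustration): y = x0 + 2*x1
X = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0]])
y = np.array([5.0, 4.0, 13.0, 10.0])
model = LinearRegression().fit(X, y)

# Save and reload the model inside a throwaway directory
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "joblib_model.joblib")  # hypothetical file name
    joblib.dump(model, path)
    loaded = joblib.load(path)

# The reloaded model should predict the same values as the original
same_predictions = np.allclose(model.predict(X), loaded.predict(X))
print(same_predictions)
```

The remainder of this article uses Python's built-in `pickle` module instead, which needs no extra dependency.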

### Steps to Save a Machine Learning Model

Below are the steps for saving a machine learning model:

1. First train a model
2. Then convert it into a byte stream
3. Save the byte stream as a binary file

In Python, converting an object such as a machine learning model into a byte stream is known as **pickling**. To use the saved model again, we convert the byte stream back into a usable machine learning model, which is known as **unpickling**.
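Pickling and unpickling can be sketched entirely in memory with `pickle.dumps` and `pickle.loads`, which makes the byte stream itself visible. This is a minimal sketch on a tiny synthetic dataset (an assumption for illustration), not the student dataset used below.

```python
import pickle

import numpy as np
from sklearn.linear_model import LinearRegression

# Tiny synthetic dataset (an assumption for illustration): y = 2*x + 1
X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = 2 * X.ravel() + 1

# Step 1: train a model
model = LinearRegression().fit(X, y)

# Step 2: pickling, i.e. converting the model into a byte stream
byte_stream = pickle.dumps(model)
print(type(byte_stream))  # <class 'bytes'>

# Unpickling: converting the byte stream back into a usable model
restored = pickle.loads(byte_stream)

# The restored model should give the same predictions as the original
same = np.allclose(model.predict(X), restored.predict(X))
print(same)
```

Step 3, saving the byte stream as a binary file, is what `pickle.dump` does when it is given a file opened in `"wb"` mode.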

Let’s see how to save a machine learning model. Here we first train a linear regression model and then save it using Python's pickle module:

In [1]:
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

In [2]:
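# Load the student performance data. Note: the original UCI copy of this file
# is semicolon-separated; if yours is too, pass sep=";" to read_csv.
data = pd.read_csv("student-mat.csv")

# Keep only the grade columns and a few numeric attributes
data = data[["G1", "G2", "G3", "studytime", "failures", "absences"]]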

In [3]:
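# G3 is the target column; the remaining columns are the features
predict = "G3"
# Use the columns= keyword: a positional axis argument is rejected by recent pandas
x = np.array(data.drop(columns=[predict]))
y = np.array(data[predict])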

In [4]:
from sklearn.model_selection import train_test_split
# Hold out 20% of the rows for testing
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.2)

# Fit a linear regression model on the training split
linear_regression = LinearRegression()
linear_regression.fit(xtrain, ytrain)

LinearRegression()

In [5]:
import pickle
with open("pickle_model", "wb") as file:
    pickle.dump(linear_regression, file)

We have now saved the model. A new file named **pickle_model** appears in the current working directory, which is usually the directory that contains our Python file. To use the saved model to predict values on the test set, we can run the code below:

In [7]:
with open("pickle_model", "rb") as file:
    model = pickle.load(file)

predictions = model.predict(xtest)

# Print each prediction next to its input features and the actual value
for i in range(len(predictions)):
    print(predictions[i], xtest[i], ytest[i])

This is how a machine learning model can be easily saved as a byte stream so that we can use it in the future on a new dataset.
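To check that a saved model really behaves like the original, we can compare both on the same test split and then apply the loaded model to new rows with the same features. The sketch below is self-contained and makes several assumptions: it uses randomly generated data in place of `student-mat.csv`, writes to a temporary directory, and the names `new_data`, `r2_original`, and `r2_loaded` are introduced only for illustration.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the real data: 5 features and a noisy linear target
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X @ np.array([1.0, 2.0, 0.5, -1.0, 0.0]) + rng.normal(scale=0.1, size=200)

xtrain, xtest, ytrain, ytest = train_test_split(
    X, y, test_size=0.2, random_state=0
)
model = LinearRegression().fit(xtrain, ytrain)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "pickle_model")
    # Save the trained model as a binary file
    with open(path, "wb") as file:
        pickle.dump(model, file)
    saved_ok = os.path.exists(path)
    # Load it back into a new object
    with open(path, "rb") as file:
        loaded = pickle.load(file)

# The loaded model should score the same as the original on the test split
r2_original = model.score(xtest, ytest)
r2_loaded = loaded.score(xtest, ytest)
print(np.isclose(r2_original, r2_loaded))

# Reuse the loaded model on "new" rows that have the same five features
new_data = rng.normal(size=(3, 5))
predictions = loaded.predict(new_data)
for features, prediction in zip(new_data, predictions):
    print(prediction, features)
```

Two practical points apply when reusing pickled models: `pickle.load` can execute arbitrary code, so only load files from a source you trust, and a model is most reliably loaded with the same scikit-learn version that was used to save it.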