When a new dataset has the same features as one we have already modeled, we can reuse the machine learning model we trained earlier instead of training it again. To reuse a model, we first need to save it.

## Why do We Need to Save a Machine Learning Model?

After training a machine learning model, we can save it so that we can later test it on a new dataset or compare its performance with other models. A saved model can also be deployed in the final application.

There are many ways to save a machine learning model. If you have never saved a model before, the sections below show how to save and reuse one using the Python programming language.

## Steps to Save a Machine Learning Model

Below are the steps for saving a machine learning model:

- First train a model
- Then convert it into a byte stream
- Save the byte stream as a binary file

In Python, converting an object such as a trained machine learning model into a byte stream is known as pickling. To use the saved model again, the byte stream is converted back into a usable model, which is known as unpickling.
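The two directions can be sketched in memory, without writing a file, using `pickle.dumps` and `pickle.loads`. The dictionary of parameters below is a hypothetical stand-in for a trained model, with made-up values, used only so the example runs with the standard library alone:

```python
import pickle

# Hypothetical stand-in for a trained model: made-up learned parameters
trained = {"coef": [0.15, 0.98], "intercept": -1.2}

# Pickling: convert the Python object into a byte stream
byte_stream = pickle.dumps(trained)

# Unpickling: rebuild an equivalent object from the byte stream
restored = pickle.loads(byte_stream)

print(type(byte_stream).__name__)  # bytes
print(restored == trained)         # True
```

The restored object is a separate copy with the same contents. Only unpickle data you trust, because unpickling can execute arbitrary code.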

Now let’s see how to save a machine learning model. Here I will first train a linear regression model and then save it using Python’s pickle module:

import pickle

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

data = pd.read_csv("https://raw.githubusercontent.com/amankharwal/Website-data/master/student-mat.csv")

# Keep the grade columns and a few numeric features
data = data[["G1", "G2", "G3", "studytime", "failures", "absences"]]

# G3 is the target; the remaining columns are the features
predict = "G3"
x = np.array(data.drop(columns=[predict]))
y = np.array(data[predict])

# Hold out 20% of the rows as a test set
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.2)

linear_regression = LinearRegression()
linear_regression.fit(xtrain, ytrain)

# Pickle the trained model: write it to a binary file
with open("pickle_model", "wb") as file:
    pickle.dump(linear_regression, file)



The model is now saved: a new file named “pickle_model” appears in the directory from which you ran the code. To use the saved model to predict values on the test set, run the code below:

# Unpickle: load the saved model back from the binary file
with open("pickle_model", "rb") as file:
    model = pickle.load(file)

predictions = model.predict(xtest)

# Print each prediction next to its input features and the actual grade
for i in range(len(predictions)):
    print(predictions[i], xtest[i], ytest[i])

Each printed line shows the predicted G3 value, then the five input features (G1, G2, studytime, failures, absences), then the actual G3 value. The exact numbers differ from run to run because train_test_split shuffles the rows randomly.
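To confirm that a model survives the save-and-load round trip unchanged, its predictions can be compared before and after. The sketch below is a minimal check under stated assumptions: it trains on small synthetic data (not the student dataset) and writes to a temporary directory so that it leaves no files behind.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data, assumed purely for illustration
rng = np.random.default_rng(0)
features = rng.normal(size=(50, 3))
targets = features @ np.array([2.0, -1.0, 0.5]) + 3.0

original = LinearRegression().fit(features, targets)

with tempfile.TemporaryDirectory() as folder:
    model_path = Path(folder) / "pickle_model"

    # Save the trained model as a binary file
    with open(model_path, "wb") as file:
        pickle.dump(original, file)

    # Load it back into a new object
    with open(model_path, "rb") as file:
        restored = pickle.load(file)

# The restored model should predict exactly what the original predicts
same_predictions = np.allclose(
    original.predict(features), restored.predict(features)
)
print(same_predictions)
```

A pickled scikit-learn model is best loaded with the same scikit-learn version that saved it, since compatibility across versions is not guaranteed.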

## Summary

This is how a machine learning model can be saved as a byte stream so that it can be reused later on a new dataset.