# **Save and Load Machine Learning Models in Python with scikit-learn**

There are two main methods:
1. **Using Python’s pickle module**
    * Standard way to serialize Python objects.
    * Use pickle.dump() to save a trained model object to a file.
    * Use pickle.load() to load it back and make predictions.
    * Example: train a LinearRegression, save it to 'linear_model.sav', load it back, then compute the RMSE on the test data (RMSE ≈ 72.12).

    import pickle\
    ... train model ...\
    with open('model.sav', 'wb') as f: pickle.dump(model, f)\
    with open('model.sav', 'rb') as f: loaded = pickle.load(f)\
    preds = loaded.predict(X_test)

2. **Using joblib (preferred for models with large NumPy arrays)**
    * Designed for performance with large arrays.
    * Use joblib.dump() and joblib.load() the same way as the pickle functions; both also accept a file path directly.
    * Example: the exact same workflow with LinearRegression yields RMSE ≈ 72.12.

    import joblib\
    ... train model ...\
    joblib.dump(model, 'model_j.sav')\
    loaded = joblib.load('model_j.sav')\
    preds = loaded.predict(X_test)
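
The two methods above can be sketched end to end as follows. This is a minimal illustration, not the notebook's own workflow: the synthetic data, the temporary directory, and the variable names `from_pickle` and `from_joblib` are assumptions made so the block runs on its own without `headbrain1.csv`.

```python
import os
import pickle
import tempfile

import joblib
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data (an assumption for illustration): y = 3x + 2 plus small noise.
rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 3.0 * X[:, 0] + 2.0 + rng.normal(0, 0.1, size=100)

model = LinearRegression().fit(X, y)

with tempfile.TemporaryDirectory() as tmp:
    pickle_path = os.path.join(tmp, "model.sav")
    joblib_path = os.path.join(tmp, "model_j.sav")

    # pickle: open the file explicitly; the with block closes it afterwards.
    with open(pickle_path, "wb") as f:
        pickle.dump(model, f)
    with open(pickle_path, "rb") as f:
        from_pickle = pickle.load(f)

    # joblib: pass the path and let joblib manage the file.
    joblib.dump(model, joblib_path)
    from_joblib = joblib.load(joblib_path)

# Compare the restored models' predictions with the original model's.
same_pickle = np.array_equal(model.predict(X), from_pickle.predict(X))
same_joblib = np.array_equal(model.predict(X), from_joblib.predict(X))
print(same_pickle, same_joblib)
```

Comparing predictions before and after the round trip is a cheap sanity check that the saved file really contains the fitted model.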


In [None]:
# import packages
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn import metrics
import pickle
import joblib

# import the dataset
dataset = pd.read_csv('headbrain1.csv')

# features: every column except the last; target: the last column
X = dataset.iloc[:, :-1].values
Y = dataset.iloc[:, -1].values

# train test split
X_train, X_test, y_train, y_test = train_test_split(
    X, Y, test_size=0.2, random_state=0)

# create a linear regression model
regressor = LinearRegression()
regressor.fit(X_train, y_train)


# PICKLE method:
# save the model (the with block closes the file when done)
filename = 'linear_model.sav'
with open(filename, 'wb') as f:
    pickle.dump(regressor, f)

# load the model
with open(filename, 'rb') as f:
    load_model = pickle.load(f)

y_pred = load_model.predict(X_test)
print('root mean squared error : ', np.sqrt(
    metrics.mean_squared_error(y_test, y_pred)))


# JOBLIB method:
# save the model (joblib accepts a file path directly)
filename = 'linear_model_2.sav'
joblib.dump(regressor, filename)

# load the model
load_model = joblib.load(filename)

y_pred = load_model.predict(X_test)
print('root mean squared error : ', np.sqrt(
    metrics.mean_squared_error(y_test, y_pred)))
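
The RMSE printed by both methods is the square root of the mean of the squared differences between true and predicted values. The sketch below works through that arithmetic by hand and compares it with the `np.sqrt(metrics.mean_squared_error(...))` call used in the cell. The three values in `y_true` and `y_hat` are made up for illustration and have nothing to do with `headbrain1.csv`.

```python
import numpy as np
from sklearn import metrics

# Made-up values for illustration only; they are not from headbrain1.csv.
y_true = np.array([3.0, 5.0, 7.0])
y_hat = np.array([2.0, 5.0, 9.0])

# Step by step: residuals -> squared residuals -> mean -> square root.
residuals = y_true - y_hat              # [ 1.,  0., -2.]
mse_manual = np.mean(residuals ** 2)    # (1 + 0 + 4) / 3
rmse_manual = np.sqrt(mse_manual)

# The same quantity computed the way the cell above does it.
rmse_sklearn = np.sqrt(metrics.mean_squared_error(y_true, y_hat))

print(rmse_manual, rmse_sklearn)
```

Because RMSE is in the same units as the target, it can be read directly as a typical prediction error size.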