# Pickle

pickle is a module in Python's standard library that serializes and deserializes Python objects. In simple terms:

Serialization (Pickling): Converting a Python object (like a model, list, or dictionary) into a byte stream (a format that can be saved to a file or sent over a network).

Deserialization (Unpickling): Reconstructing the object from the byte stream back into Python.
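
The two definitions above can be illustrated with an in-memory round trip. This is a minimal sketch: the `record` dictionary is a made-up example object, and `pickle.dumps` / `pickle.loads` are the in-memory counterparts of the file-based `pickle.dump` / `pickle.load` used later in this notebook.

```python
import pickle

# A hypothetical object to serialize; any picklable Python object works.
record = {"name": "house_prices", "sizes": [1000, 1500, 2000], "trained": True}

# Pickling: Python object -> byte stream
payload = pickle.dumps(record)

# Unpickling: byte stream -> a new, equal Python object
restored = pickle.loads(payload)

print(type(payload).__name__)  # bytes
print(restored == record)      # True
print(restored is record)      # False: unpickling builds a copy
```

The restored object compares equal to the original but is a separate object, which is what makes it safe to hand to another program or session.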


🔹 Why Use Pickle?

1. Save trained ML models to disk.

2. Share Python objects across different programs or sessions.

3. Avoid recomputation by storing intermediate results.
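
Point 3 can be sketched as a small disk cache. Everything named here (`CACHE_PATH`, `expensive_computation`, `load_or_compute`) is hypothetical and only stands in for a real slow step; the sketch assumes the cached result is picklable and that the cache file comes from a trusted source.

```python
import os
import pickle
import tempfile

# Hypothetical cache location; a temporary directory keeps the sketch self-contained.
CACHE_PATH = os.path.join(tempfile.mkdtemp(), "squares_cache.pickle")


def expensive_computation():
    # Stand-in for a slow step such as feature extraction or model training.
    return {n: n * n for n in range(5)}


def load_or_compute(path):
    """Return the cached result if the file exists, otherwise compute and cache it."""
    if os.path.exists(path):
        with open(path, "rb") as f:
            return pickle.load(f), "loaded from cache"
    result = expensive_computation()
    with open(path, "wb") as f:
        pickle.dump(result, f)
    return result, "computed"


first, first_source = load_or_compute(CACHE_PATH)
second, second_source = load_or_compute(CACHE_PATH)

print(first_source)     # computed
print(second_source)    # loaded from cache
print(first == second)  # True
```

The first call does the work and writes the file; the second call skips the computation and reads the stored result instead.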

#### 1. Save Model

In [1]:
# Predict house prices based on the square footage of homes.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.metrics import r2_score

import pickle


# Step 1: Define data
 
X = np.array([1000, 1500, 2000, 2500, 3000]).reshape(-1, 1)  # Square feet; reshape(-1, 1) turns the 1D array into a 2D column vector, the shape scikit-learn expects
Y = np.array([200000, 250000, 300000, 350000, 400000])      # Price


# Step 2: Create and train the model

Model = LinearRegression()
Model.fit(X,Y)


# Step 3: Make predictions

Y_pred = Model.predict(X)
print("Predicted Price for X :", Y_pred)


# Step 4: Output coefficients 

print("Coefficient m:", Model.coef_)
print("Intercept b:", Model.intercept_)


# Step 5: Predict price for 2200 sq ft

Pred = Model.predict([[2200]])
print("Predicted Price for 2200 sq.ft :", Pred)



# Save the model

with open('LinearRegression_Model.pickle', 'wb') as f:
    pickle.dump(Model, f)


Predicted Price for X : [200000. 250000. 300000. 350000. 400000.]
Coefficient m: [100.]
Intercept b: 100000.0
Predicted Price for 2200 sq.ft : [320000.]


#### 2. Load Model

In [5]:
# Load the model

with open('LinearRegression_Model.pickle', 'rb') as f:

    LR_pkl = pickle.load(f)


# Predict price for 3000 sq ft

Pred_loaded = LR_pkl.predict([[3000]])

print("Predicted Price for 3000 sq.ft :", Pred_loaded)


# Output coefficients of the loaded model

print("Coefficient m:", LR_pkl.coef_)
print("Intercept b:", LR_pkl.intercept_)


Predicted Price for 3000 sq.ft : [400000.]
Coefficient m: [100.]
Intercept b: 100000.0


# Joblib

joblib is a third-party Python library (a dependency of scikit-learn) designed for:

🔹Saving and loading large data or machine learning models efficiently

🔹Parallel processing (a separate feature, not covered here)

It is optimized for objects that contain large numerical arrays (such as NumPy arrays), which makes it a good fit for scikit-learn models and other objects that hold large datasets.

🧠 Key Features of joblib:

1. Often faster and more memory-efficient than pickle for objects that hold large NumPy arrays

2. Optional file compression through the compress argument of joblib.dump (off by default)

3. Easy syntax for saving/loading objects
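
The compression feature can be tried on a small example. This is a sketch under stated assumptions: the `data` dictionary and file names are made up, an all-zeros array is chosen because it compresses very well, and real models will usually shrink far less. `compress=3` is one possible level in the 0 to 9 range that `joblib.dump` accepts.

```python
import os
import tempfile

import joblib
import numpy as np

# Hypothetical payload: a dict holding a large, highly compressible array.
data = {"weights": np.zeros((1000, 1000)), "label": "demo"}

out_dir = tempfile.mkdtemp()
raw_path = os.path.join(out_dir, "data_raw.joblib")
small_path = os.path.join(out_dir, "data_compressed.joblib")

joblib.dump(data, raw_path)                 # default: no compression
joblib.dump(data, small_path, compress=3)   # compressed, level 3 of 0-9

# joblib.load detects the compression on its own; no extra argument is needed.
restored = joblib.load(small_path)

print(os.path.getsize(raw_path), os.path.getsize(small_path))
print(np.array_equal(restored["weights"], data["weights"]))
```

Compression trades some save and load time for a smaller file, so it tends to pay off mainly when disk space or transfer size matters.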

#### 1. Save Model

In [7]:
# Predict house prices based on the square footage of homes.

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error
from sklearn.metrics import r2_score

import joblib


# Step 1: Define data
 
X = np.array([1000, 1500, 2000, 2500, 3000]).reshape(-1, 1)  # Square feet; reshape(-1, 1) turns the 1D array into a 2D column vector, the shape scikit-learn expects
Y = np.array([200000, 250000, 300000, 350000, 400000])      # Price


# Step 2: Create and train the model

Model = LinearRegression()
Model.fit(X,Y)


# Step 3: Make predictions

Y_pred = Model.predict(X)
print("Predicted Price for X :", Y_pred)


# Step 4: Output coefficients 

print("Coefficient m:", Model.coef_)
print("Intercept b:", Model.intercept_)


# Step 5: Predict price for 2200 sq ft

Pred = Model.predict([[2200]])
print("Predicted Price for 2200 sq.ft :", Pred)



# Save the model

joblib.dump(Model, 'LinearRegression_Model.joblib')


Predicted Price for X : [200000. 250000. 300000. 350000. 400000.]
Coefficient m: [100.]
Intercept b: 100000.0
Predicted Price for 2200 sq.ft : [320000.]


['LinearRegression_Model.joblib']

#### 2. Load Model

In [11]:
# Load the model

LR_jb = joblib.load('LinearRegression_Model.joblib')


# Predict price for 3000 sq ft

Pred_loaded_1 = LR_jb.predict([[3000]])
print("Predicted Price for 3000 sq.ft :", Pred_loaded_1)

# Output coefficients of the loaded model

print("Coefficient m:", LR_jb.coef_)
print("Intercept b:", LR_jb.intercept_)


Predicted Price for 3000 sq.ft : [400000.]
Coefficient m: [100.]
Intercept b: 100000.0


![image.png](attachment:42e7526f-00f1-4d21-96d0-244f434742d3.png)

![image.png](attachment:eb8347eb-4d95-4124-bf86-09e64e04dc2b.png)