# Deserialize
---
In this notebook, we walk through deserializing the two artifacts saved in the serialization notebook: the full pickled model (`iris_model.pt`) and its state dict (`iris_model_state_dict.pt`).

In [0]:
import numpy as np
import pandas as pd
import torch
import torch.nn as nn
import torch.optim as optim
from torch.nn import functional as F

import warnings
warnings.filterwarnings('ignore')

In [3]:
# Mount Google Drive so that we can load the models saved there

from google.colab import drive
drive.mount('/content/gdrive', force_remount=True)

Mounted at /content/gdrive


In [4]:
# Check that the model artifacts are in the expected location
! ls /content/gdrive/My\ Drive/MLOPS/hands_on/serialization/models/

iris_model.pt  iris_model_state_dict.pt


In [0]:
# Here we create a model architecture identical to the trained model from the previous notebook.
# The only difference is that we have changed the class name slightly: IrisNet -> IrisNetNew.

input_size = 4
output_size = 3
hidden_size = 30

class IrisNetNew(nn.Module):
    def __init__(self):
        super(IrisNetNew, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.fc2 = nn.Linear(hidden_size, hidden_size)
        self.fc3 = nn.Linear(hidden_size, output_size)

    def forward(self, X):
        X = torch.sigmoid(self.fc1(X))
        X = torch.sigmoid(self.fc2(X))
        X = self.fc3(X)

        return F.log_softmax(X, dim=-1)

# Load Model

Loading the full model fails on the first attempt in this section. When we saved the whole model, Python's `pickle` library did not store the code of the model class. It stored only a reference to it: the import path of the module that defines the class, plus the class name. The serialized object is therefore tied to both the class name and the project's directory structure, so renaming a class, file, or directory can be a breaking change.

Here, `torch.load` looks for an `IrisNet` class in `__main__`, and no such class exists yet in this notebook.
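
The same behavior can be reproduced with plain `pickle`, without PyTorch. The sketch below is only an illustration: `toy_models_demo` and `TinyModel` are hypothetical names that stand in for the file and class that defined the saved model. It shows that the pickle payload contains the module and class names, and that loading fails if either one changes.

```python
import pickle
import sys
import types

# Hypothetical module standing in for the file that defines the model class.
toy = types.ModuleType("toy_models_demo")
sys.modules["toy_models_demo"] = toy

# Build a class that reports toy_models_demo as its home module.
TinyModel = type("TinyModel", (), {"__module__": "toy_models_demo"})
toy.TinyModel = TinyModel

obj = TinyModel()
obj.weights = [0.1, 0.2, 0.3]
payload = pickle.dumps(obj)

# The payload records where to find the class, not the class definition.
has_module_name = b"toy_models_demo" in payload
has_class_name = b"TinyModel" in payload


def try_load(data):
    """Return the loaded object, or the exception raised while loading."""
    try:
        return pickle.loads(data)
    except (AttributeError, ImportError) as exc:
        return exc


# Case 1: the class is renamed, so the lookup by class name fails.
del toy.TinyModel
toy.TinyModelNew = TinyModel
renamed_result = try_load(payload)

# Case 2: the module is moved or renamed, so the import itself fails.
del sys.modules["toy_models_demo"]
moved_result = try_load(payload)

# Case 3: module and class name are back where the pickle expects them.
sys.modules["toy_models_demo"] = toy
toy.TinyModel = TinyModel
restored = try_load(payload)

print(has_module_name, has_class_name)
print(type(renamed_result).__name__)
print(type(moved_result).__name__)
print(restored.weights)
```

Case 1 corresponds to what happens in this notebook: the saved model refers to `IrisNet`, but only `IrisNetNew` has been defined so far.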

In [6]:
model_name = 'iris_model.pt'
model_path = f"/content/gdrive/My Drive/MLOPS/hands_on/serialization/models/{model_name}" 

# Load the model that was saved with torch.save
model = torch.load(model_path)

AttributeError: ignored

In [0]:
# Let's define the network again, this time with the same architecture and the same class name (IrisNet) as the saved model.
# With this class in place, loading the model works.

input_size = 4
output_size = 3
hidden_size = 30

class IrisNet(nn.Module):
    def __init__(self):
        super(IrisNet, self).__init__()
        self.fc1 = nn.Linear(input_size, hidden_size)
        self.fc2 = nn.Linear(hidden_size, hidden_size)
        self.fc3 = nn.Linear(hidden_size, output_size)

    def forward(self, X):
        X = torch.sigmoid(self.fc1(X))
        X = torch.sigmoid(self.fc2(X))
        X = self.fc3(X)

        return F.log_softmax(X, dim=-1)

In [0]:
# The model now loads successfully because IrisNet is defined in __main__

model_name = 'iris_model.pt'
model_path = f"/content/gdrive/My Drive/MLOPS/hands_on/serialization/models/{model_name}" 

model = torch.load(model_path)

In [10]:
model

IrisNet(
  (fc1): Linear(in_features=4, out_features=30, bias=True)
  (fc2): Linear(in_features=30, out_features=30, bias=True)
  (fc3): Linear(in_features=30, out_features=3, bias=True)
)

In [11]:
model.eval()
example = torch.tensor([5.1, 3.5, 1.4, 0.2])
pred = model(example)
print(torch.argmax(pred))

tensor(0)


# Load State Dict

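Here we can see why serializing the `state_dict` is the more robust approach. A state dict holds only the model's parameters, keyed by layer name, so it is decoupled from the class name and from the file that defines the class. We can load it into `IrisNetNew` even though the parameters were trained in `IrisNet`; the layer names and shapes only need to match. This is why saving the `state_dict` is recommended over saving the whole model object with `torch.save`.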
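
As a minimal sketch of what this decoupling means in practice, the example below saves a state dict to an in-memory buffer and loads it into a class with a different name. `SavedNet`, `RenamedNet`, and `DifferentLayerNames` are hypothetical classes written for this illustration; they are not the models from this notebook.

```python
import io

import torch
import torch.nn as nn


class SavedNet(nn.Module):
    """Hypothetical stand-in for the class used at training time."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 30)
        self.fc2 = nn.Linear(30, 3)

    def forward(self, x):
        return self.fc2(torch.sigmoid(self.fc1(x)))


class RenamedNet(nn.Module):
    """Same layers under a different class name."""

    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(4, 30)
        self.fc2 = nn.Linear(30, 3)

    def forward(self, x):
        return self.fc2(torch.sigmoid(self.fc1(x)))


class DifferentLayerNames(nn.Module):
    """Same shapes, but different attribute names (only used for the key check)."""

    def __init__(self):
        super().__init__()
        self.hidden = nn.Linear(4, 30)
        self.out = nn.Linear(30, 3)


source = SavedNet().eval()

# Serialize only the parameters; an in-memory buffer stands in for a file.
buffer = io.BytesIO()
torch.save(source.state_dict(), buffer)
buffer.seek(0)
state = torch.load(buffer)

# The state dict is a mapping from parameter names to tensors.
keys = list(state.keys())

# A differently named class with matching layer names accepts the weights.
target = RenamedNet().eval()
target.load_state_dict(state)

x = torch.tensor([5.1, 3.5, 1.4, 0.2])
with torch.no_grad():
    outputs_match = torch.equal(source(x), target(x))

# Mismatched layer names are rejected by the default strict loading.
try:
    DifferentLayerNames().load_state_dict(state)
    mismatch_error = None
except RuntimeError as exc:
    mismatch_error = exc

print(keys)
print(outputs_match)
print(type(mismatch_error).__name__)
```

The class name is free to change, but the coupling does not disappear entirely: the attribute names of the layers and the tensor shapes still have to line up with the saved keys.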


In [0]:
model_2 = IrisNetNew()

In [17]:
model_name = 'iris_model_state_dict.pt'
model_path = f"/content/gdrive/My Drive/MLOPS/hands_on/serialization/models/{model_name}" 

model_2.load_state_dict(torch.load(model_path))

<All keys matched successfully>

In [18]:
model_2.eval()
example = torch.tensor([5.1, 3.5, 1.4, 0.2])
pred = model_2(example)
print(torch.argmax(pred))

tensor(0)
