# Save and Load Model
- In this notebook we will see how to save and load a **model checkpoint** during training.
- We will also see how to save only the best model.

# Steps
- Get data
- Build Model
- Train Model
- Evaluate Model
- Save the Model

In [1]:
import torch
from torch import nn
from torch.utils.data import DataLoader # for data loading
from torchvision import datasets # prebuild datasets
import torchvision
from torchvision.transforms import ToTensor # Transformations

# Load Dataset
- We will use a prebuilt dataset from torchvision so that we can train our custom model.
- We will use the Fashion-MNIST dataset, which has 10 classes.
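
The 10 classes are stored as integer labels from 0 to 9. As a small sketch, the mapping below turns a label into a readable name. The list name `FASHION_MNIST_CLASSES` and the helper `label_to_name` are hypothetical and used only for illustration; the names follow the standard Fashion-MNIST label order.

```python
# Label index -> class name for Fashion-MNIST (standard label order).
# FASHION_MNIST_CLASSES and label_to_name are illustrative names, not part of the notebook.
FASHION_MNIST_CLASSES = [
    "T-shirt/top", "Trouser", "Pullover", "Dress", "Coat",
    "Sandal", "Shirt", "Sneaker", "Bag", "Ankle boot",
]


def label_to_name(label: int) -> str:
    """Return the human-readable class name for an integer label (0-9)."""
    return FASHION_MNIST_CLASSES[label]


print(len(FASHION_MNIST_CLASSES))
print(label_to_name(9))
```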

In [2]:
# Use the prebuilt Fashion-MNIST dataset (training split)
train_data = torchvision.datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor()
)

# Test data
test_data = torchvision.datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor()
)

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to data/FashionMNIST/raw/train-images-idx3-ubyte.gz


100%|██████████| 26.4M/26.4M [00:02<00:00, 12.4MB/s]


Extracting data/FashionMNIST/raw/train-images-idx3-ubyte.gz to data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw/train-labels-idx1-ubyte.gz


100%|██████████| 29.5k/29.5k [00:00<00:00, 297kB/s]


Extracting data/FashionMNIST/raw/train-labels-idx1-ubyte.gz to data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz


100%|██████████| 4.42M/4.42M [00:00<00:00, 4.95MB/s]


Extracting data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz to data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz


100%|██████████| 5.15k/5.15k [00:00<00:00, 13.9MB/s]


Extracting data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz to data/FashionMNIST/raw



In [3]:
train_data

Dataset FashionMNIST
    Number of datapoints: 60000
    Root location: data
    Split: Train
    StandardTransform
Transform: ToTensor()

In [4]:
test_data

Dataset FashionMNIST
    Number of datapoints: 10000
    Root location: data
    Split: Test
    StandardTransform
Transform: ToTensor()

# Observation
- The data was downloaded successfully.
- Now we can create data loaders so that the data is loaded efficiently in batches.

In [5]:
# Train Loader
train_loader = DataLoader(train_data, batch_size=32, shuffle=True)

# Test Loader
test_loader = DataLoader(test_data, batch_size=32, shuffle=False)

In [6]:
# View the shape of one batch
for x, y in train_loader:
    print(x.shape)
    print(y.shape)
    break

torch.Size([32, 1, 28, 28])
torch.Size([32])
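
The batch shape above is `[batch, channels, height, width]`. To show how batch size relates to the number of batches, here is a minimal sketch on a synthetic dataset (random tensors standing in for images, which is an assumption made so that nothing needs to be downloaded). The last lines apply the same arithmetic to the real split sizes of 60000 and 10000 images.

```python
import math

import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for an image dataset: 100 grayscale 28x28 images, 10 classes.
images = torch.rand(100, 1, 28, 28)
labels = torch.randint(0, 10, (100,))
dataset = TensorDataset(images, labels)

loader = DataLoader(dataset, batch_size=32, shuffle=False)

# With drop_last=False (the default) the loader yields ceil(num_samples / batch_size)
# batches, and the last batch holds the remainder.
num_batches = len(loader)
batch_sizes = [x.shape[0] for x, _ in loader]
print(num_batches, batch_sizes)

# The same arithmetic for the real splits used in this notebook.
train_batches = math.ceil(60000 / 32)
test_batches = math.ceil(10000 / 32)
print(train_batches, test_batches)
```

These counts match the lengths of the progress bars printed during training and evaluation.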


# Build Model
- Now we can build a simple fully connected model with 3 linear layers.

In [17]:
class MyCustomModel(nn.Module):
    def __init__(self):
        super(MyCustomModel,self).__init__()
        self.flatten = nn.Flatten()

        self.linear_seq = nn.Sequential(
            nn.Linear(28*28,512),
            nn.ReLU(),
            nn.Linear(512,512),
            nn.ReLU(),
            nn.Linear(512,10)
        )

    def forward(self,x):
        x= self.flatten(x)
        x= self.linear_seq(x)
        return x

# Observation
- We built a simple fully connected model with `3 linear layers` and ReLU activations in between; it contains no convolutional layers.
- Our goal is not to build the perfect model.
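
To get a feel for the size of this network, the sketch below rebuilds the same layer stack with `nn.Sequential` (an equivalent stand-in for `MyCustomModel`, used here only so the snippet runs on its own), counts the trainable parameters, and checks the output shape for one batch.

```python
import torch
from torch import nn

# Same layers as the model above, written as a single Sequential for brevity.
mlp = nn.Sequential(
    nn.Flatten(),
    nn.Linear(28 * 28, 512), nn.ReLU(),
    nn.Linear(512, 512), nn.ReLU(),
    nn.Linear(512, 10),
)

# Each Linear layer has in_features * out_features weights plus out_features biases.
total_params = sum(p.numel() for p in mlp.parameters())

# A batch of 32 single-channel 28x28 images produces 32 rows of 10 class logits.
logits = mlp(torch.rand(32, 1, 28, 28))
print(total_params, tuple(logits.shape))
```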

In [18]:
# Set Optimizer and Loss
model = MyCustomModel()

loss_fn = nn.CrossEntropyLoss()

optimizer = torch.optim.SGD(model.parameters(), lr=0.0001)

model, loss_fn, optimizer

(MyCustomModel(
   (flatten): Flatten(start_dim=1, end_dim=-1)
   (linear_seq): Sequential(
     (0): Linear(in_features=784, out_features=512, bias=True)
     (1): ReLU()
     (2): Linear(in_features=512, out_features=512, bias=True)
     (3): ReLU()
     (4): Linear(in_features=512, out_features=10, bias=True)
   )
 ),
 CrossEntropyLoss(),
 SGD (
 Parameter Group 0
     dampening: 0
     differentiable: False
     foreach: None
     fused: None
     lr: 0.0001
     maximize: False
     momentum: 0
     nesterov: False
     weight_decay: 0
 ))
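
Both the model and the optimizer expose a `state_dict()`, and these are what a checkpoint stores. As a rough illustration on a tiny model (the names `tiny_model` and `tiny_optimizer` are hypothetical), the sketch below shows what each one contains.

```python
import torch
from torch import nn

# A tiny stand-in model and optimizer, used only to inspect their state dicts.
tiny_model = nn.Linear(4, 2)
tiny_optimizer = torch.optim.SGD(tiny_model.parameters(), lr=0.0001)

# The model state dict maps parameter names to tensors.
model_keys = list(tiny_model.state_dict().keys())

# The optimizer state dict holds per-parameter state and the hyperparameter groups.
optimizer_keys = sorted(tiny_optimizer.state_dict().keys())
saved_lr = tiny_optimizer.state_dict()["param_groups"][0]["lr"]

print(model_keys, optimizer_keys, saved_lr)
```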

# Save Model Checkpoint
- Now we can write a function that saves the model checkpoint.
- Note that a checkpoint is saved at the end of an epoch only when the model is the best one so far.

In [20]:
def save_checkpoint(state, filename="Best_Model.pth.tar"):
    print(f"Saving Model checkoint {filename}")
    torch.save(state, filename)
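
As a quick sanity check of this save step, the sketch below writes a checkpoint dictionary for a tiny model to a temporary directory and reads it back. The file name `demo_checkpoint.pth.tar` and the metadata values are made up for the example.

```python
import os
import tempfile

import torch
from torch import nn

tiny_model = nn.Linear(4, 2)
tiny_optimizer = torch.optim.SGD(tiny_model.parameters(), lr=0.0001)

# A checkpoint is a plain dictionary; the epoch and loss values here are placeholders.
state = {
    "state_dict": tiny_model.state_dict(),
    "optimizer": tiny_optimizer.state_dict(),
    "epoch": 3,
    "loss": 1.25,
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo_checkpoint.pth.tar")
    torch.save(state, path)
    # map_location="cpu" lets the file load on a machine without a GPU.
    restored = torch.load(path, map_location="cpu")

restored_keys = sorted(restored.keys())
weights_match = torch.equal(
    restored["state_dict"]["weight"], tiny_model.state_dict()["weight"]
)
print(restored_keys, restored["epoch"], weights_match)
```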

# Training Loop

In [12]:
from tqdm import tqdm
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cpu'

In [21]:
model = model.to(device)
epochs = 15
# Track the best loss across all epochs, so initialise it once before the loop
best_loss = float('inf')
for epoch in range(epochs):
    train_loss = 0
    num_correct = 0
    num_samples = 0
    for image, label in tqdm(train_loader):
        image = image.to(device)
        label = label.to(device)

        # do forward pass
        output = model(image)

        # Calculate the loss
        loss = loss_fn(output,label)

        # Reset the gradients left over from the previous step
        optimizer.zero_grad()

        # do backward
        loss.backward()

        optimizer.step()

        train_loss +=loss.item()

        # calculate the accuracy
        _, prediction = output.max(1)
        # .item() keeps the running count (and the saved accuracy) a plain Python number
        num_correct += (prediction == label).sum().item()
        num_samples += prediction.size(0)

    # calculate the accuracy
    accuracy = num_correct/num_samples

    # Save best model checkpoint
    if train_loss < best_loss:
        best_loss = train_loss
        checkpoint = {
            "state_dict": model.state_dict(),
            "optimizer": optimizer.state_dict(),
            "epoch": epoch,
            "loss": best_loss,
            "accuracy": accuracy,
        }
        save_checkpoint(checkpoint)
    print(f"Epoch: {epoch + 1}/{epochs}, Train Loss: {train_loss / len(train_loader):.4f}, Train Accuracy: {accuracy * 100:.2f}%")

100%|██████████| 1875/1875 [00:19<00:00, 95.98it/s]


Saving Model checkoint Best_Model.pth.tar
Epoch: 1/15, Train Loss: 2.2952, Train Accuracy: 15.32%


100%|██████████| 1875/1875 [00:19<00:00, 94.92it/s] 


Saving Model checkoint Best_Model.pth.tar
Epoch: 2/15, Train Loss: 2.2685, Train Accuracy: 22.35%


100%|██████████| 1875/1875 [00:19<00:00, 96.48it/s]


Saving Model checkoint Best_Model.pth.tar
Epoch: 3/15, Train Loss: 2.2428, Train Accuracy: 26.12%


100%|██████████| 1875/1875 [00:19<00:00, 93.76it/s] 


Saving Model checkoint Best_Model.pth.tar
Epoch: 4/15, Train Loss: 2.2156, Train Accuracy: 33.83%


100%|██████████| 1875/1875 [00:19<00:00, 94.80it/s]


Saving Model checkoint Best_Model.pth.tar
Epoch: 5/15, Train Loss: 2.1853, Train Accuracy: 41.42%


100%|██████████| 1875/1875 [00:19<00:00, 94.38it/s] 


Saving Model checkoint Best_Model.pth.tar
Epoch: 6/15, Train Loss: 2.1507, Train Accuracy: 48.10%


100%|██████████| 1875/1875 [00:19<00:00, 95.98it/s]


Saving Model checkoint Best_Model.pth.tar
Epoch: 7/15, Train Loss: 2.1106, Train Accuracy: 52.30%


100%|██████████| 1875/1875 [00:20<00:00, 92.84it/s] 


Saving Model checkoint Best_Model.pth.tar
Epoch: 8/15, Train Loss: 2.0642, Train Accuracy: 54.89%


100%|██████████| 1875/1875 [00:19<00:00, 97.43it/s]


Saving Model checkoint Best_Model.pth.tar
Epoch: 9/15, Train Loss: 2.0108, Train Accuracy: 56.90%


100%|██████████| 1875/1875 [00:22<00:00, 85.20it/s] 


Saving Model checkoint Best_Model.pth.tar
Epoch: 10/15, Train Loss: 1.9501, Train Accuracy: 58.28%


100%|██████████| 1875/1875 [00:19<00:00, 94.60it/s]


Saving Model checkoint Best_Model.pth.tar
Epoch: 11/15, Train Loss: 1.8824, Train Accuracy: 59.33%


100%|██████████| 1875/1875 [00:23<00:00, 79.98it/s]


Saving Model checkoint Best_Model.pth.tar
Epoch: 12/15, Train Loss: 1.8090, Train Accuracy: 60.03%


100%|██████████| 1875/1875 [00:22<00:00, 81.52it/s]


Saving Model checkoint Best_Model.pth.tar
Epoch: 13/15, Train Loss: 1.7326, Train Accuracy: 60.79%


100%|██████████| 1875/1875 [00:24<00:00, 76.93it/s]


Saving Model checkoint Best_Model.pth.tar
Epoch: 14/15, Train Loss: 1.6561, Train Accuracy: 61.37%


100%|██████████| 1875/1875 [00:22<00:00, 83.40it/s]

Saving Model checkoint Best_Model.pth.tar
Epoch: 15/15, Train Loss: 1.5822, Train Accuracy: 62.24%
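
In the log above the training loss falls every epoch, so every epoch writes a new checkpoint. Best-only saving makes a difference only when the loss rises in some epoch. The sketch below isolates that logic with made-up loss values; it assumes the best loss is tracked across epochs, so it is initialised once before the loop. The function name `epochs_that_save` is hypothetical.

```python
def epochs_that_save(epoch_losses):
    """Return the (0-based) epochs at which a best-model checkpoint would be written."""
    best_loss = float("inf")  # initialised once, before the epoch loop
    saved = []
    for epoch, loss in enumerate(epoch_losses):
        if loss < best_loss:
            best_loss = loss
            saved.append(epoch)  # this is where save_checkpoint(...) would be called
    return saved


# Hypothetical losses: epochs 2 and 4 are worse than the best so far, so they are skipped.
print(epochs_that_save([2.30, 2.10, 2.20, 1.90, 1.95]))  # prints [0, 1, 3]
```

If `best_loss` were reset inside the epoch loop instead, the comparison would always succeed and every epoch would overwrite the checkpoint.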






# Model Evaluation

In [22]:
model.eval()
with torch.no_grad():
    test_loss = 0
    num_correct = 0
    num_samples = 0
    for image, label in tqdm(test_loader):
        image = image.to(device)
        label = label.to(device)

        # do forward pass
        output = model(image)

        # calculate the loss
        loss = loss_fn(output,label)

        test_loss += loss.item()

        # calculate the accuracy
        _,prediction = output.max(1)
        num_correct += (prediction == label).sum()
        num_samples += prediction.size(0)

    accuracy = num_correct/num_samples
    print(f"Test Loss: {test_loss / len(test_loader):.4f}, Test Accuracy: {accuracy * 100:.2f}%")

100%|██████████| 313/313 [00:02<00:00, 151.24it/s]

Test Loss: 1.5507, Test Accuracy: 61.61%
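
Since the same evaluation loop is needed again after the checkpoint is loaded, it could be wrapped in a helper. The following is one possible sketch, not the notebook's own code: the function name `evaluate` and the synthetic one-hot data are assumptions for illustration.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def evaluate(model, loader, loss_fn, device="cpu"):
    """Return (average loss per batch, accuracy) of model over loader."""
    model.eval()
    total_loss, num_correct, num_samples = 0.0, 0, 0
    with torch.no_grad():
        for inputs, targets in loader:
            inputs, targets = inputs.to(device), targets.to(device)
            outputs = model(inputs)
            total_loss += loss_fn(outputs, targets).item()
            num_correct += (outputs.argmax(dim=1) == targets).sum().item()
            num_samples += targets.size(0)
    return total_loss / len(loader), num_correct / num_samples


# Synthetic check: the inputs are already strong one-hot logits, so an identity
# "model" predicts every label correctly.
targets = torch.arange(10).repeat(4)  # 40 labels, classes 0-9
inputs = nn.functional.one_hot(targets, num_classes=10).float() * 10.0
demo_loader = DataLoader(TensorDataset(inputs, targets), batch_size=8)

avg_loss, accuracy = evaluate(nn.Identity(), demo_loader, nn.CrossEntropyLoss())
print(f"loss={avg_loss:.4f}, accuracy={accuracy:.2f}")
```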





# Observation
- Now we can load the model checkpoint and run predictions.

# Load the model

In [23]:
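# Initialize a fresh model on the target device
model = MyCustomModel().to(device)

# Recreate the optimizer so that it tracks the new model's parameters
optimizer = torch.optim.SGD(model.parameters(), lr=0.0001)

# Load the checkpoint and restore the model and optimizer state
checkpoint = torch.load("/content/Best_Model.pth.tar", map_location=device)
model.load_state_dict(checkpoint['state_dict'])
optimizer.load_state_dict(checkpoint['optimizer'])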
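
Because the checkpoint also stores the optimizer state and the epoch number, it can be used to resume training and not only for inference. The sketch below shows the idea on a tiny model; the file name, the epoch value, and the `total_epochs` setting are made up for the example.

```python
import os
import tempfile

import torch
from torch import nn

torch.manual_seed(0)

# Pretend this model was trained for a while and checkpointed after epoch 4 (0-based).
trained_model = nn.Linear(4, 2)
trained_optimizer = torch.optim.SGD(trained_model.parameters(), lr=0.1, momentum=0.9)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "resume_demo.pth.tar")
    torch.save(
        {
            "state_dict": trained_model.state_dict(),
            "optimizer": trained_optimizer.state_dict(),
            "epoch": 4,
        },
        path,
    )
    checkpoint = torch.load(path, map_location="cpu")

# Build a fresh model and an optimizer bound to its parameters, then restore both.
resumed_model = nn.Linear(4, 2)
resumed_optimizer = torch.optim.SGD(resumed_model.parameters(), lr=0.1, momentum=0.9)
resumed_model.load_state_dict(checkpoint["state_dict"])
resumed_optimizer.load_state_dict(checkpoint["optimizer"])

# Continue from the epoch after the one that was saved.
start_epoch = checkpoint["epoch"] + 1
total_epochs = 8
remaining_epochs = list(range(start_epoch, total_epochs))
weights_restored = torch.equal(resumed_model.weight, trained_model.weight)
print(start_epoch, remaining_epochs, weights_restored)
```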



# Test Model Again on Test Data

In [24]:
model.eval()
with torch.no_grad():
    test_loss = 0
    num_correct = 0
    num_samples = 0
    for image, label in tqdm(test_loader):
        image = image.to(device)
        label = label.to(device)

        # do forward pass
        output = model(image)

        # calculate the loss
        loss = loss_fn(output,label)

        test_loss += loss.item()

        # calculate the accuracy
        _,prediction = output.max(1)
        num_correct += (prediction == label).sum()
        num_samples += prediction.size(0)

    accuracy = num_correct/num_samples
    print(f"Test Loss: {test_loss / len(test_loader):.4f}, Test Accuracy: {accuracy * 100:.2f}%")

100%|██████████| 313/313 [00:02<00:00, 145.53it/s]

Test Loss: 1.5507, Test Accuracy: 61.61%



