1. Our models are underperforming (not fitting the data well). What are 3 methods for preventing underfitting? Write them down and explain each with a sentence.

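* Increase model capacity: if the model is too simple for the dataset it will underfit, so add more layers or more hidden units.
* Train for longer: more epochs (or a better-tuned learning rate) give the model more chances to learn the patterns in the data.
* Use less regularisation: regularisation penalises the loss function to discourage overly complex solutions, so too much of it can hold the model back and cause underfitting.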

2. Recreate the data loading functions we built in sections 1, 2, 3 and 4. You should have train and test DataLoaders ready to use.

In [1]:
from pathlib import Path 

DATA_PATH= Path('../data')
IMAGE_PATH= DATA_PATH/'pizza_steak_sushi_20_percent'

train_dir= IMAGE_PATH/'train'
test_dir= IMAGE_PATH/'test'

In [2]:
from torch.utils.data import DataLoader
from torchvision import datasets, transforms
import os

############### TRANSFORM FUNCTION ###############
train_transform= transforms.Compose([
    transforms.Resize(size=(64, 64)),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ToTensor()
])

test_transform= transforms.Compose([
    transforms.Resize(size=(64, 64)),
    transforms.ToTensor()
])

############### DATASETS ###############
train_data= datasets.ImageFolder(root=train_dir,
                                 transform=train_transform,
                                 target_transform=None)

test_data= datasets.ImageFolder(root=test_dir,
                                transform=test_transform,
                                target_transform=None)

BATCH_SIZE= 32
NUM_WORKER= os.cpu_count()

############### DATALOADERS ###############
train_dataloader= DataLoader(dataset= train_data,
                             batch_size= BATCH_SIZE,
                             shuffle= True,
                             num_workers= NUM_WORKER)

test_dataloader= DataLoader(dataset= test_data,
                            batch_size= BATCH_SIZE,
                            shuffle= False,
                            num_workers= NUM_WORKER)
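
The number of batches each DataLoader yields per epoch follows directly from the dataset size and `BATCH_SIZE`. The sketch below is plain Python rather than PyTorch; `num_batches` is a hypothetical helper, and the image counts are taken from the directory listing printed later in this notebook (450 training and 150 test images).

```python
import math

def num_batches(num_samples: int, batch_size: int, drop_last: bool = False) -> int:
    """Batches per epoch for a dataset of num_samples items."""
    if drop_last:
        return num_samples // batch_size
    return math.ceil(num_samples / batch_size)

# Image counts from the directory listing later in this notebook
train_images = 154 + 146 + 150  # pizza + steak + sushi
test_images = 46 + 58 + 46

train_batches = num_batches(train_images, 32)
test_batches = num_batches(test_images, 32)

# Size of the final, partially filled training batch
last_train_batch = train_images - (train_batches - 1) * 32

print(train_batches, test_batches, last_train_batch)
```

With `drop_last` left at its default of `False`, the last training batch holds only a couple of images, which matters when accuracy is averaged per batch.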

In [3]:
image_batch, label_batch = next(iter(train_dataloader))

In [4]:
class_names= train_data.classes
len(class_names)
print(image_batch)


tensor([[[[0.0863, 0.0902, 0.0863,  ..., 0.0588, 0.0588, 0.0627],
          [0.0863, 0.0863, 0.0902,  ..., 0.0784, 0.0706, 0.0706],
          [0.0863, 0.0824, 0.0824,  ..., 0.2863, 0.1098, 0.0784],
          ...,
          [0.7725, 0.7686, 0.7647,  ..., 0.8588, 0.8471, 0.8118],
          [0.5961, 0.6627, 0.7137,  ..., 0.6706, 0.5882, 0.5294],
          [0.2235, 0.2902, 0.3882,  ..., 0.4039, 0.3765, 0.3882]],

         [[0.0863, 0.0902, 0.0863,  ..., 0.0549, 0.0510, 0.0549],
          [0.0863, 0.0863, 0.0902,  ..., 0.0784, 0.0627, 0.0627],
          [0.0863, 0.0824, 0.0824,  ..., 0.2863, 0.0980, 0.0667],
          ...,
          [0.2314, 0.2235, 0.2235,  ..., 0.3608, 0.4039, 0.4471],
          [0.2039, 0.2118, 0.2196,  ..., 0.4549, 0.4118, 0.3725],
          [0.1059, 0.1137, 0.1333,  ..., 0.2863, 0.2627, 0.2627]],

         [[0.1255, 0.1216, 0.1176,  ..., 0.0706, 0.0627, 0.0667],
          [0.1255, 0.1176, 0.1216,  ..., 0.0863, 0.0745, 0.0745],
          [0.1255, 0.1137, 0.1137,  ..., 0

3. Recreate `model_0` we built in section 7.

In [5]:
from torch import nn

class TinyVGG(nn.Module):
    def __init__(self,
                 input_shape: int,
                 output_shape: int,
                 hidden_units: int= 10) -> None:
        super().__init__()
        self.covo_block_1= nn.Sequential(
            nn.Conv2d(in_channels= input_shape,
                      out_channels= hidden_units,
                      kernel_size=3,
                      stride=1,
                      padding=0),
            nn.ReLU(),
            nn.Conv2d(in_channels= hidden_units,
                      out_channels= hidden_units,
                      kernel_size= 3,
                      stride=1,
                      padding=0),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2)
        )
        self.covo_block_2= nn.Sequential(
            nn.Conv2d(in_channels= hidden_units,
                      out_channels= hidden_units,
                      kernel_size=3,
                      stride=1,
                      padding=0),
            nn.ReLU(),
            nn.Conv2d(in_channels= hidden_units,
                      out_channels= hidden_units,
                      kernel_size= 3,
                      stride=1,
                      padding=0),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2)
        )
        self.classifier= nn.Sequential(
            nn.Flatten(),
            nn.Linear(in_features= hidden_units*13*13,
                      out_features= output_shape)
        )

    def forward(self, x):
        x= self.covo_block_1(x)
        # print(x)
        x= self.covo_block_2(x)
        # print(x)
        x= self.classifier(x)
        return x


In [6]:
from torchinfo import summary
import torch

model_0= TinyVGG(input_shape=3,
                 output_shape=len(class_names),
                 hidden_units=10)

In [7]:
summary(model= model_0, input_size= [1, 3, 64, 64])

Layer (type:depth-idx)                   Output Shape              Param #
TinyVGG                                  [1, 3]                    --
├─Sequential: 1-1                        [1, 10, 30, 30]           --
│    └─Conv2d: 2-1                       [1, 10, 62, 62]           280
│    └─ReLU: 2-2                         [1, 10, 62, 62]           --
│    └─Conv2d: 2-3                       [1, 10, 60, 60]           910
│    └─ReLU: 2-4                         [1, 10, 60, 60]           --
│    └─MaxPool2d: 2-5                    [1, 10, 30, 30]           --
├─Sequential: 1-2                        [1, 10, 13, 13]           --
│    └─Conv2d: 2-6                       [1, 10, 28, 28]           910
│    └─ReLU: 2-7                         [1, 10, 28, 28]           --
│    └─Conv2d: 2-8                       [1, 10, 26, 26]           910
│    └─ReLU: 2-9                         [1, 10, 26, 26]           --
│    └─MaxPool2d: 2-10                   [1, 10, 13, 13]           --
├─Sequentia
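
The `hidden_units*13*13` in the classifier and the parameter counts in the summary can be checked by hand. The sketch below is plain Python arithmetic, not PyTorch; `conv_out` is a hypothetical helper that applies the usual output-size formula for a convolution or pooling layer with dilation 1.

```python
def conv_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Output height/width of a conv or pooling layer (dilation 1, floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1

size = 64
trace = [size]
for _ in range(2):  # two convolutional blocks
    size = conv_out(size, kernel=3)            # first 3x3 convolution
    trace.append(size)
    size = conv_out(size, kernel=3)            # second 3x3 convolution
    trace.append(size)
    size = conv_out(size, kernel=2, stride=2)  # 2x2 max pooling
    trace.append(size)

hidden_units, num_classes = 10, 3

# weights + biases for each layer type
conv_first = 3 * hidden_units * 3 * 3 + hidden_units
conv_rest = hidden_units * hidden_units * 3 * 3 + hidden_units
linear = hidden_units * size * size * num_classes + num_classes
total_params = conv_first + 3 * conv_rest + linear

print(trace)
print(conv_first, conv_rest, linear, total_params)
```

The trace matches the spatial sizes in the summary (62, 60, 30, 28, 26, 13), and the per-layer counts match the 280 and 910 shown for the convolutions.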

In [8]:
device= 'cuda' if torch.cuda.is_available() else 'cpu'
device

'cuda'

4. Create training and testing functions for `model_0`

In [9]:
def train_step(model: nn.Module,
               dataloader: DataLoader,
               loss_fn: nn.Module,
               optimizer: torch.optim.Optimizer,
               device: torch.device= 'cpu'):
    
    # Setting model to training mode
    model.to(device).train()

    # Accumulating train loss and accuracy across batches
    train_loss, train_acc= 0, 0

    for batch, (X,y) in enumerate(dataloader):
        X,y= X.to(device), y.to(device)

        # Forward pass
        y_pred_logit= model(X)

        # Loss calculation
        loss = loss_fn(y_pred_logit, y)
        train_loss += loss.item()

        # Optimizer Zero Grad
        optimizer.zero_grad()

        # Loss backward
        loss.backward()

        # Optimizer Step
        optimizer.step()

        # Calculating Accuracy
        y_pred_label= torch.argmax(torch.softmax(y_pred_logit, dim=1), dim=1)
        train_acc += (y_pred_label==y).sum().item()/len(y_pred_logit)

    train_loss /= len(dataloader)
    train_acc /= len(dataloader)
    return train_loss, train_acc

In [10]:
def test_step(model: nn.Module,
              dataloader: DataLoader,
              loss_fn: nn.Module,
              device: torch.device= 'cpu'):
    
    # Setting model to evaluation mode
    model.to(device).eval()

    # Accumulating test loss and accuracy across batches
    test_loss, test_acc= 0, 0

    with torch.inference_mode():
        for batch, (X,y) in enumerate(dataloader):
            # Setting all variables to same device
            X, y= X.to(device), y.to(device)

            # Forward Pass
            test_pred_logits= model(X)
            
            # Loss Calculation
            loss = loss_fn(test_pred_logits, y)
            test_loss += loss.item()

            # Accuracy Calculation
            test_pred_labels= torch.argmax(torch.softmax(test_pred_logits, dim=1), dim=1)
            test_acc+= (test_pred_labels==y).sum().item()/len(test_pred_labels)

        test_loss /= len(dataloader)
        test_acc /= len(dataloader)
    return test_loss, test_acc
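
Both step functions average the per-batch accuracy, so a small final batch counts as much as a full one. The sketch below, in plain Python with made-up numbers of correct predictions, compares that average with the accuracy over all samples. The two helper functions are hypothetical and only illustrate the difference.

```python
def mean_of_batch_accuracies(correct_per_batch, batch_sizes):
    """Average of per-batch accuracies, as train_step and test_step compute it."""
    per_batch = [c / n for c, n in zip(correct_per_batch, batch_sizes)]
    return sum(per_batch) / len(per_batch)

def sample_weighted_accuracy(correct_per_batch, batch_sizes):
    """Accuracy over all samples, where every image counts equally."""
    return sum(correct_per_batch) / sum(batch_sizes)

# 450 training images with batch size 32: 14 full batches and one batch of 2
batch_sizes = [32] * 14 + [2]
# Hypothetical results: half right in each full batch, both right in the last
correct = [16] * 14 + [2]

batch_mean = mean_of_batch_accuracies(correct, batch_sizes)
overall = sample_weighted_accuracy(correct, batch_sizes)
print(f"{batch_mean:.4f} vs {overall:.4f}")
```

The gap is small here, but it explains why the reported percentages are not exact multiples of one over the dataset size.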

5. Try training the model you made in exercise 3 for 5, 20 and 50 epochs, what happens to the results?
    * Use `torch.optim.Adam()` with a learning rate of 0.001 as the optimizer.

In [11]:
from tqdm import tqdm
def train(model: nn.Module,
          train_dataloader: DataLoader,
          test_dataloader: DataLoader,
          epochs: int,
          loss_fn: nn.Module,
          optimizer: torch.optim.Optimizer,
          device: torch.device= 'cpu'):
    
    results={'train_loss': [],
             'train_acc': [],
             'test_loss': [],
             'test_acc': []}
    
    for epoch in tqdm(range(epochs)):

        # Training
        train_loss, train_acc= train_step(model= model,
                                          dataloader= train_dataloader,
                                          loss_fn= loss_fn,
                                          optimizer= optimizer,
                                          device= device)

        # Testing
        test_loss, test_acc= test_step(model=model,
                                       dataloader= test_dataloader,
                                       loss_fn= loss_fn,
                                       device= device)

        print(f"Epoch: {epoch} | Train Loss: {train_loss:.3f} | Train Accuracy: {100*train_acc:.2f}% | Test Loss: {test_loss:.3f} | Test Accuracy: {100*test_acc:.2f}%")
        
        # Storing results
        results['train_loss'].append(train_loss)
        results['train_acc'].append(train_acc)
        results['test_loss'].append(test_loss)
        results['test_acc'].append(test_acc)
        
    return results      

In [12]:
EPOCHS= 5
loss_fn = nn.CrossEntropyLoss()
optimizer= torch.optim.Adam(params= model_0.parameters(), lr= 0.001)

model_0_results_5_epochs= train(model= model_0,
                                train_dataloader= train_dataloader,
                                test_dataloader= test_dataloader,
                                epochs= EPOCHS,
                                loss_fn= loss_fn,
                                optimizer= optimizer,
                                device= device)

 20%|██        | 1/5 [00:17<01:08, 17.16s/it]

Epoch: 0 | Train Loss: 1.100 | Train Accuracy: 36.04% | Test Loss: 1.082 | Test Accuracy: 48.35%


 40%|████      | 2/5 [00:36<00:55, 18.57s/it]

Epoch: 1 | Train Loss: 1.071 | Train Accuracy: 42.92% | Test Loss: 1.025 | Test Accuracy: 47.95%


 60%|██████    | 3/5 [00:54<00:36, 18.30s/it]

Epoch: 2 | Train Loss: 0.991 | Train Accuracy: 51.25% | Test Loss: 0.961 | Test Accuracy: 52.73%


 80%|████████  | 4/5 [01:09<00:16, 16.83s/it]

Epoch: 3 | Train Loss: 1.019 | Train Accuracy: 50.00% | Test Loss: 0.999 | Test Accuracy: 52.39%


100%|██████████| 5/5 [01:26<00:00, 17.29s/it]

Epoch: 4 | Train Loss: 1.045 | Train Accuracy: 43.96% | Test Loss: 0.971 | Test Accuracy: 52.67%





In [13]:
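# Note: model_0 is not re-created here, so this run continues from the
# weights learned in the 5-epoch run above rather than starting fresh.
EPOCHS= 20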
loss_fn = nn.CrossEntropyLoss()
optimizer= torch.optim.Adam(params= model_0.parameters(), lr= 0.001)

model_0_results_20_epochs= train(model= model_0,
                                train_dataloader= train_dataloader,
                                test_dataloader= test_dataloader,
                                epochs= EPOCHS,
                                loss_fn= loss_fn,
                                optimizer= optimizer,
                                device= device)

  5%|▌         | 1/20 [00:16<05:18, 16.78s/it]

Epoch: 0 | Train Loss: 0.953 | Train Accuracy: 48.96% | Test Loss: 0.932 | Test Accuracy: 52.78%


 10%|█         | 2/20 [00:31<04:41, 15.66s/it]

Epoch: 1 | Train Loss: 0.886 | Train Accuracy: 56.04% | Test Loss: 0.908 | Test Accuracy: 57.33%


 15%|█▌        | 3/20 [00:47<04:30, 15.92s/it]

Epoch: 2 | Train Loss: 0.914 | Train Accuracy: 52.08% | Test Loss: 0.898 | Test Accuracy: 57.78%


 20%|██        | 4/20 [01:02<04:06, 15.41s/it]

Epoch: 3 | Train Loss: 0.904 | Train Accuracy: 58.13% | Test Loss: 0.892 | Test Accuracy: 57.78%


 25%|██▌       | 5/20 [01:15<03:40, 14.67s/it]

Epoch: 4 | Train Loss: 0.849 | Train Accuracy: 62.50% | Test Loss: 0.911 | Test Accuracy: 56.70%


 30%|███       | 6/20 [01:29<03:19, 14.22s/it]

Epoch: 5 | Train Loss: 0.856 | Train Accuracy: 62.08% | Test Loss: 0.895 | Test Accuracy: 58.69%


 35%|███▌      | 7/20 [01:43<03:04, 14.16s/it]

Epoch: 6 | Train Loss: 0.812 | Train Accuracy: 64.17% | Test Loss: 0.901 | Test Accuracy: 60.06%


 40%|████      | 8/20 [02:03<03:11, 15.98s/it]

Epoch: 7 | Train Loss: 0.842 | Train Accuracy: 58.96% | Test Loss: 0.909 | Test Accuracy: 58.75%


 45%|████▌     | 9/20 [02:15<02:44, 14.94s/it]

Epoch: 8 | Train Loss: 0.812 | Train Accuracy: 63.54% | Test Loss: 0.879 | Test Accuracy: 60.34%


 50%|█████     | 10/20 [02:30<02:28, 14.80s/it]

Epoch: 9 | Train Loss: 0.753 | Train Accuracy: 66.67% | Test Loss: 0.896 | Test Accuracy: 60.68%


 55%|█████▌    | 11/20 [02:45<02:13, 14.88s/it]

Epoch: 10 | Train Loss: 0.749 | Train Accuracy: 68.54% | Test Loss: 0.886 | Test Accuracy: 60.34%


 60%|██████    | 12/20 [02:59<01:56, 14.62s/it]

Epoch: 11 | Train Loss: 0.773 | Train Accuracy: 65.21% | Test Loss: 0.926 | Test Accuracy: 54.26%


 65%|██████▌   | 13/20 [03:12<01:38, 14.05s/it]

Epoch: 12 | Train Loss: 0.776 | Train Accuracy: 64.38% | Test Loss: 0.869 | Test Accuracy: 57.39%


 70%|███████   | 14/20 [03:24<01:21, 13.64s/it]

Epoch: 13 | Train Loss: 0.785 | Train Accuracy: 60.00% | Test Loss: 0.855 | Test Accuracy: 61.59%


 75%|███████▌  | 15/20 [03:37<01:06, 13.35s/it]

Epoch: 14 | Train Loss: 0.720 | Train Accuracy: 70.00% | Test Loss: 0.880 | Test Accuracy: 58.75%


 80%|████████  | 16/20 [03:50<00:52, 13.11s/it]

Epoch: 15 | Train Loss: 0.677 | Train Accuracy: 69.79% | Test Loss: 0.907 | Test Accuracy: 57.56%


 85%|████████▌ | 17/20 [04:02<00:38, 12.97s/it]

Epoch: 16 | Train Loss: 0.745 | Train Accuracy: 65.62% | Test Loss: 0.917 | Test Accuracy: 55.68%


 90%|█████████ | 18/20 [04:16<00:26, 13.21s/it]

Epoch: 17 | Train Loss: 0.815 | Train Accuracy: 61.67% | Test Loss: 0.878 | Test Accuracy: 58.13%


 95%|█████████▌| 19/20 [04:31<00:13, 13.91s/it]

Epoch: 18 | Train Loss: 0.709 | Train Accuracy: 71.04% | Test Loss: 0.859 | Test Accuracy: 55.34%


100%|██████████| 20/20 [04:46<00:00, 14.33s/it]

Epoch: 19 | Train Loss: 0.722 | Train Accuracy: 65.42% | Test Loss: 0.858 | Test Accuracy: 58.47%





In [14]:
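# Note: model_0 again keeps its weights from the earlier runs, so these
# 50 epochs come on top of the 5 and 20 epochs already trained.
EPOCHS= 50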
loss_fn = nn.CrossEntropyLoss()
optimizer= torch.optim.Adam(params= model_0.parameters(), lr= 0.001)

model_0_results_50_epochs= train(model= model_0,
                                train_dataloader= train_dataloader,
                                test_dataloader= test_dataloader,
                                epochs= EPOCHS,
                                loss_fn= loss_fn,
                                optimizer= optimizer,
                                device= device)

  2%|▏         | 1/50 [00:14<11:36, 14.22s/it]

Epoch: 0 | Train Loss: 0.768 | Train Accuracy: 66.04% | Test Loss: 0.869 | Test Accuracy: 60.40%


  4%|▍         | 2/50 [00:30<12:12, 15.26s/it]

Epoch: 1 | Train Loss: 0.676 | Train Accuracy: 67.29% | Test Loss: 0.888 | Test Accuracy: 59.38%


  6%|▌         | 3/50 [00:43<11:11, 14.29s/it]

Epoch: 2 | Train Loss: 0.690 | Train Accuracy: 68.33% | Test Loss: 0.906 | Test Accuracy: 57.56%


  8%|▊         | 4/50 [00:56<10:29, 13.69s/it]

Epoch: 3 | Train Loss: 0.643 | Train Accuracy: 71.25% | Test Loss: 0.950 | Test Accuracy: 60.28%


 10%|█         | 5/50 [01:09<10:04, 13.43s/it]

Epoch: 4 | Train Loss: 0.690 | Train Accuracy: 70.00% | Test Loss: 0.915 | Test Accuracy: 57.56%


 12%|█▏        | 6/50 [01:21<09:40, 13.19s/it]

Epoch: 5 | Train Loss: 0.588 | Train Accuracy: 78.54% | Test Loss: 0.902 | Test Accuracy: 58.81%


 14%|█▍        | 7/50 [01:34<09:20, 13.02s/it]

Epoch: 6 | Train Loss: 0.733 | Train Accuracy: 74.17% | Test Loss: 0.931 | Test Accuracy: 57.27%


 16%|█▌        | 8/50 [01:47<09:04, 12.97s/it]

Epoch: 7 | Train Loss: 0.581 | Train Accuracy: 74.79% | Test Loss: 0.919 | Test Accuracy: 60.34%


 18%|█▊        | 9/50 [01:59<08:45, 12.81s/it]

Epoch: 8 | Train Loss: 0.591 | Train Accuracy: 75.42% | Test Loss: 0.939 | Test Accuracy: 56.99%


 20%|██        | 10/50 [02:12<08:30, 12.75s/it]

Epoch: 9 | Train Loss: 0.598 | Train Accuracy: 74.17% | Test Loss: 0.958 | Test Accuracy: 59.60%


 22%|██▏       | 11/50 [02:25<08:16, 12.72s/it]

Epoch: 10 | Train Loss: 0.512 | Train Accuracy: 79.17% | Test Loss: 0.954 | Test Accuracy: 58.81%


 24%|██▍       | 12/50 [02:37<08:05, 12.76s/it]

Epoch: 11 | Train Loss: 0.520 | Train Accuracy: 80.21% | Test Loss: 1.112 | Test Accuracy: 58.30%


 26%|██▌       | 13/50 [02:51<07:58, 12.92s/it]

Epoch: 12 | Train Loss: 0.511 | Train Accuracy: 78.96% | Test Loss: 0.982 | Test Accuracy: 57.27%


 28%|██▊       | 14/50 [03:04<07:50, 13.07s/it]

Epoch: 13 | Train Loss: 0.476 | Train Accuracy: 80.21% | Test Loss: 1.032 | Test Accuracy: 52.90%


 30%|███       | 15/50 [03:17<07:34, 12.97s/it]

Epoch: 14 | Train Loss: 0.452 | Train Accuracy: 82.08% | Test Loss: 1.049 | Test Accuracy: 54.43%


 32%|███▏      | 16/50 [03:30<07:20, 12.94s/it]

Epoch: 15 | Train Loss: 0.443 | Train Accuracy: 84.79% | Test Loss: 1.145 | Test Accuracy: 56.88%


 34%|███▍      | 17/50 [03:43<07:11, 13.09s/it]

Epoch: 16 | Train Loss: 0.405 | Train Accuracy: 82.92% | Test Loss: 1.235 | Test Accuracy: 57.27%


 36%|███▌      | 18/50 [03:57<07:05, 13.30s/it]

Epoch: 17 | Train Loss: 0.473 | Train Accuracy: 77.50% | Test Loss: 1.136 | Test Accuracy: 55.11%


 38%|███▊      | 19/50 [04:12<07:06, 13.77s/it]

Epoch: 18 | Train Loss: 0.441 | Train Accuracy: 82.08% | Test Loss: 1.177 | Test Accuracy: 56.88%


 40%|████      | 20/50 [04:26<06:52, 13.74s/it]

Epoch: 19 | Train Loss: 0.407 | Train Accuracy: 84.17% | Test Loss: 1.225 | Test Accuracy: 58.69%


 42%|████▏     | 21/50 [04:39<06:37, 13.70s/it]

Epoch: 20 | Train Loss: 0.396 | Train Accuracy: 84.17% | Test Loss: 1.222 | Test Accuracy: 57.16%


 44%|████▍     | 22/50 [04:54<06:33, 14.05s/it]

Epoch: 21 | Train Loss: 0.338 | Train Accuracy: 88.54% | Test Loss: 1.272 | Test Accuracy: 52.56%


 46%|████▌     | 23/50 [05:07<06:12, 13.81s/it]

Epoch: 22 | Train Loss: 0.331 | Train Accuracy: 85.00% | Test Loss: 1.323 | Test Accuracy: 53.52%


 48%|████▊     | 24/50 [05:20<05:52, 13.54s/it]

Epoch: 23 | Train Loss: 0.338 | Train Accuracy: 85.83% | Test Loss: 1.373 | Test Accuracy: 55.62%


 50%|█████     | 25/50 [05:33<05:31, 13.26s/it]

Epoch: 24 | Train Loss: 0.336 | Train Accuracy: 86.25% | Test Loss: 1.477 | Test Accuracy: 50.68%


 52%|█████▏    | 26/50 [05:45<05:14, 13.09s/it]

Epoch: 25 | Train Loss: 0.322 | Train Accuracy: 88.75% | Test Loss: 1.471 | Test Accuracy: 54.72%


 54%|█████▍    | 27/50 [05:58<04:57, 12.92s/it]

Epoch: 26 | Train Loss: 0.483 | Train Accuracy: 78.54% | Test Loss: 1.485 | Test Accuracy: 56.25%


 56%|█████▌    | 28/50 [06:13<04:58, 13.55s/it]

Epoch: 27 | Train Loss: 0.382 | Train Accuracy: 83.75% | Test Loss: 1.323 | Test Accuracy: 54.37%


 58%|█████▊    | 29/50 [06:26<04:43, 13.49s/it]

Epoch: 28 | Train Loss: 0.308 | Train Accuracy: 88.96% | Test Loss: 1.291 | Test Accuracy: 55.68%


 60%|██████    | 30/50 [06:40<04:31, 13.55s/it]

Epoch: 29 | Train Loss: 0.248 | Train Accuracy: 91.25% | Test Loss: 1.519 | Test Accuracy: 51.99%


 62%|██████▏   | 31/50 [06:54<04:17, 13.55s/it]

Epoch: 30 | Train Loss: 0.278 | Train Accuracy: 90.42% | Test Loss: 1.459 | Test Accuracy: 56.19%


 64%|██████▍   | 32/50 [07:07<04:01, 13.42s/it]

Epoch: 31 | Train Loss: 0.226 | Train Accuracy: 93.54% | Test Loss: 1.673 | Test Accuracy: 55.06%


 66%|██████▌   | 33/50 [07:21<03:50, 13.55s/it]

Epoch: 32 | Train Loss: 0.268 | Train Accuracy: 90.62% | Test Loss: 1.746 | Test Accuracy: 52.90%


 68%|██████▊   | 34/50 [07:34<03:34, 13.40s/it]

Epoch: 33 | Train Loss: 0.232 | Train Accuracy: 91.67% | Test Loss: 1.659 | Test Accuracy: 52.90%


 70%|███████   | 35/50 [07:47<03:20, 13.34s/it]

Epoch: 34 | Train Loss: 0.203 | Train Accuracy: 94.38% | Test Loss: 1.760 | Test Accuracy: 55.06%


 72%|███████▏  | 36/50 [08:01<03:09, 13.55s/it]

Epoch: 35 | Train Loss: 0.265 | Train Accuracy: 88.54% | Test Loss: 1.796 | Test Accuracy: 55.68%


 74%|███████▍  | 37/50 [08:14<02:53, 13.36s/it]

Epoch: 36 | Train Loss: 0.218 | Train Accuracy: 91.25% | Test Loss: 1.702 | Test Accuracy: 52.90%


 76%|███████▌  | 38/50 [08:27<02:38, 13.25s/it]

Epoch: 37 | Train Loss: 0.192 | Train Accuracy: 92.71% | Test Loss: 1.872 | Test Accuracy: 53.52%


 78%|███████▊  | 39/50 [08:40<02:25, 13.25s/it]

Epoch: 38 | Train Loss: 0.141 | Train Accuracy: 95.21% | Test Loss: 1.965 | Test Accuracy: 54.72%


 80%|████████  | 40/50 [08:53<02:11, 13.15s/it]

Epoch: 39 | Train Loss: 0.132 | Train Accuracy: 96.67% | Test Loss: 2.108 | Test Accuracy: 53.81%


 82%|████████▏ | 41/50 [09:06<01:57, 13.03s/it]

Epoch: 40 | Train Loss: 0.111 | Train Accuracy: 97.08% | Test Loss: 2.152 | Test Accuracy: 51.65%


 84%|████████▍ | 42/50 [09:18<01:43, 12.96s/it]

Epoch: 41 | Train Loss: 0.101 | Train Accuracy: 97.29% | Test Loss: 2.288 | Test Accuracy: 54.09%


 86%|████████▌ | 43/50 [09:31<01:30, 12.89s/it]

Epoch: 42 | Train Loss: 0.100 | Train Accuracy: 97.50% | Test Loss: 2.486 | Test Accuracy: 56.19%


 88%|████████▊ | 44/50 [09:44<01:16, 12.80s/it]

Epoch: 43 | Train Loss: 0.377 | Train Accuracy: 87.92% | Test Loss: 2.188 | Test Accuracy: 56.76%


 90%|█████████ | 45/50 [09:56<01:03, 12.70s/it]

Epoch: 44 | Train Loss: 0.350 | Train Accuracy: 86.04% | Test Loss: 1.810 | Test Accuracy: 51.59%


 92%|█████████▏| 46/50 [10:09<00:50, 12.71s/it]

Epoch: 45 | Train Loss: 0.395 | Train Accuracy: 82.50% | Test Loss: 1.722 | Test Accuracy: 58.13%


 94%|█████████▍| 47/50 [10:22<00:38, 12.70s/it]

Epoch: 46 | Train Loss: 0.442 | Train Accuracy: 85.21% | Test Loss: 1.802 | Test Accuracy: 52.22%


 96%|█████████▌| 48/50 [10:34<00:25, 12.67s/it]

Epoch: 47 | Train Loss: 0.276 | Train Accuracy: 89.17% | Test Loss: 1.619 | Test Accuracy: 52.56%


 98%|█████████▊| 49/50 [10:48<00:13, 13.10s/it]

Epoch: 48 | Train Loss: 0.183 | Train Accuracy: 92.29% | Test Loss: 1.739 | Test Accuracy: 53.52%


100%|██████████| 50/50 [11:03<00:00, 13.27s/it]

Epoch: 49 | Train Loss: 0.153 | Train Accuracy: 95.83% | Test Loss: 1.937 | Test Accuracy: 55.34%
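
Across the 50-epoch log, train loss keeps falling while test loss climbs, which is the usual sign of overfitting. The sketch below, in plain Python, measures that gap using six rows copied from the log above (epochs 0, 9, 19, 29, 39 and 49). `summarise` is a hypothetical helper that works on any dictionary shaped like the one `train()` returns.

```python
def summarise(results):
    """Return the index of the lowest test loss and the test-minus-train gap per entry."""
    test_loss = results["test_loss"]
    best_index = min(range(len(test_loss)), key=test_loss.__getitem__)
    gaps = [te - tr for tr, te in zip(results["train_loss"], test_loss)]
    return best_index, gaps

# Rows copied from the 50-epoch log (epochs 0, 9, 19, 29, 39, 49)
sampled = {
    "train_loss": [0.768, 0.598, 0.407, 0.248, 0.132, 0.153],
    "test_loss": [0.869, 0.958, 1.225, 1.519, 2.108, 1.937],
}

best_index, gaps = summarise(sampled)
print(best_index, [round(g, 3) for g in gaps])
```

On these rows the lowest test loss is the very first one and the gap widens almost every step, so the extra epochs mostly memorise the training set.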





6. Double the number of hidden units in your model and train it for 20 epochs, what happens to the results?

In [15]:
model_1= TinyVGG(input_shape=3,
                 output_shape=len(class_names),
                 hidden_units=20)

In [16]:
EPOCHS= 20
loss_fn = nn.CrossEntropyLoss()
optimizer= torch.optim.Adam(params= model_1.parameters(), lr= 0.001)

model_1_results= train(model= model_1, 
                       train_dataloader= train_dataloader,
                       test_dataloader= test_dataloader,
                       epochs= EPOCHS,
                       loss_fn= loss_fn,
                       optimizer= optimizer,
                       device= device)

  5%|▌         | 1/20 [00:13<04:12, 13.28s/it]

Epoch: 0 | Train Loss: 1.099 | Train Accuracy: 31.46% | Test Loss: 1.096 | Test Accuracy: 34.72%


 10%|█         | 2/20 [00:27<04:14, 14.12s/it]

Epoch: 1 | Train Loss: 1.092 | Train Accuracy: 31.67% | Test Loss: 1.068 | Test Accuracy: 41.59%


 15%|█▌        | 3/20 [00:40<03:49, 13.52s/it]

Epoch: 2 | Train Loss: 1.029 | Train Accuracy: 45.42% | Test Loss: 0.980 | Test Accuracy: 47.50%


 20%|██        | 4/20 [00:53<03:33, 13.32s/it]

Epoch: 3 | Train Loss: 0.955 | Train Accuracy: 48.33% | Test Loss: 0.944 | Test Accuracy: 48.41%


 25%|██▌       | 5/20 [01:07<03:19, 13.29s/it]

Epoch: 4 | Train Loss: 0.869 | Train Accuracy: 58.13% | Test Loss: 0.922 | Test Accuracy: 54.32%


 30%|███       | 6/20 [01:19<03:03, 13.09s/it]

Epoch: 5 | Train Loss: 0.894 | Train Accuracy: 53.12% | Test Loss: 0.928 | Test Accuracy: 55.80%


 35%|███▌      | 7/20 [01:32<02:49, 13.01s/it]

Epoch: 6 | Train Loss: 0.918 | Train Accuracy: 60.42% | Test Loss: 0.910 | Test Accuracy: 60.62%


 40%|████      | 8/20 [01:46<02:39, 13.31s/it]

Epoch: 7 | Train Loss: 0.846 | Train Accuracy: 64.38% | Test Loss: 0.902 | Test Accuracy: 55.17%


 45%|████▌     | 9/20 [02:08<02:56, 16.05s/it]

Epoch: 8 | Train Loss: 0.834 | Train Accuracy: 61.67% | Test Loss: 0.892 | Test Accuracy: 60.06%


 50%|█████     | 10/20 [02:22<02:32, 15.27s/it]

Epoch: 9 | Train Loss: 0.805 | Train Accuracy: 63.75% | Test Loss: 0.887 | Test Accuracy: 61.31%


 55%|█████▌    | 11/20 [02:36<02:13, 14.85s/it]

Epoch: 10 | Train Loss: 0.792 | Train Accuracy: 65.21% | Test Loss: 0.921 | Test Accuracy: 59.09%


 60%|██████    | 12/20 [02:48<01:52, 14.02s/it]

Epoch: 11 | Train Loss: 0.794 | Train Accuracy: 64.17% | Test Loss: 0.884 | Test Accuracy: 63.18%


 65%|██████▌   | 13/20 [03:00<01:33, 13.38s/it]

Epoch: 12 | Train Loss: 0.737 | Train Accuracy: 69.58% | Test Loss: 0.878 | Test Accuracy: 58.35%


 70%|███████   | 14/20 [03:12<01:17, 12.96s/it]

Epoch: 13 | Train Loss: 0.741 | Train Accuracy: 62.71% | Test Loss: 0.892 | Test Accuracy: 59.15%


 75%|███████▌  | 15/20 [03:24<01:03, 12.74s/it]

Epoch: 14 | Train Loss: 0.691 | Train Accuracy: 70.00% | Test Loss: 0.941 | Test Accuracy: 54.32%


 80%|████████  | 16/20 [03:36<00:50, 12.57s/it]

Epoch: 15 | Train Loss: 0.711 | Train Accuracy: 69.38% | Test Loss: 0.950 | Test Accuracy: 56.93%


 85%|████████▌ | 17/20 [03:48<00:37, 12.41s/it]

Epoch: 16 | Train Loss: 0.712 | Train Accuracy: 67.71% | Test Loss: 0.934 | Test Accuracy: 56.93%


 90%|█████████ | 18/20 [04:00<00:24, 12.27s/it]

Epoch: 17 | Train Loss: 0.651 | Train Accuracy: 72.92% | Test Loss: 0.961 | Test Accuracy: 59.77%


 95%|█████████▌| 19/20 [04:13<00:12, 12.52s/it]

Epoch: 18 | Train Loss: 0.673 | Train Accuracy: 67.92% | Test Loss: 0.906 | Test Accuracy: 57.50%


100%|██████████| 20/20 [04:26<00:00, 13.35s/it]

Epoch: 19 | Train Loss: 0.625 | Train Accuracy: 73.96% | Test Loss: 0.956 | Test Accuracy: 58.47%





7. Double the data you're using with your model and train it for 20 epochs, what happens to the results?
    * **Note**: You can use the custom data creation notebook to scale up your Food101 dataset.
    * You can also find the already formatted double data (20% instead of 10% subset) dataset on GitHub, you will need to write download code like in exercise 2 to get it into this notebook.

### Downloading More data from Food101 Dataset

In [17]:
# Setup data directory
import pathlib
data_dir = pathlib.Path("../data")

In [18]:
# Get training data
train_data = datasets.Food101(root=data_dir,
                              split="train",
                              # transform=transforms.ToTensor(),
                              download=True)

# Get testing data
test_data = datasets.Food101(root=data_dir,
                             split="test",
                             # transform=transforms.ToTensor(),
                             download=True)

In [19]:
train_data

Dataset Food101
    Number of datapoints: 75750
    Root location: ..\data
    split=train

In [20]:
class_names = train_data.classes
class_names[:10]

['apple_pie',
 'baby_back_ribs',
 'baklava',
 'beef_carpaccio',
 'beef_tartare',
 'beet_salad',
 'beignets',
 'bibimbap',
 'bread_pudding',
 'breakfast_burrito']

In [21]:
import random 
import pathlib

# Setup data paths
data_path = DATA_PATH / 'food-101'/ 'images'
target_classes= ['pizza', 'steak', 'sushi']

# Change amount to get (0.2 = random 20%)
amount_to_get= 0.2

def get_subset(image_path= data_path,
               data_splits= ['train', 'test'],
               target_classes=['pizza', 'steak', 'sushi'],
               amount= amount_to_get,
               seed= 42):
    
    random.seed(seed)
    label_splits= {}

    # Get labels 
    for data_split in data_splits:
        print(f"[INFO] Creating image split for {data_split}...")
        label_path= DATA_PATH/ 'food-101'/'meta'/f'{data_split}.txt'
        with open(label_path, 'r') as f:
            labels = [line.strip("\n") for line in f.readlines() if line.split("/")[0] in target_classes] 
        
        # Get random subset of target classes image ID's
        number_to_sample = round(amount * len(labels))
        print(f"[INFO] Getting random subset of {number_to_sample} images for {data_split}...")
        sampled_images = random.sample(labels, k=number_to_sample)
        
        # Apply full paths
        image_paths = [pathlib.Path(str(image_path / sample_image) + ".jpg") for sample_image in sampled_images]
        label_splits[data_split] = image_paths
    return label_splits
        
label_splits = get_subset(amount=amount_to_get)
label_splits["train"][:10]

[INFO] Creating image split for train...
[INFO] Getting random subset of 450 images for train...
[INFO] Creating image split for test...
[INFO] Getting random subset of 150 images for test...


[WindowsPath('../data/food-101/images/pizza/3269634.jpg'),
 WindowsPath('../data/food-101/images/pizza/1524655.jpg'),
 WindowsPath('../data/food-101/images/steak/2825100.jpg'),
 WindowsPath('../data/food-101/images/steak/225990.jpg'),
 WindowsPath('../data/food-101/images/steak/1839481.jpg'),
 WindowsPath('../data/food-101/images/pizza/38349.jpg'),
 WindowsPath('../data/food-101/images/pizza/3018077.jpg'),
 WindowsPath('../data/food-101/images/sushi/93139.jpg'),
 WindowsPath('../data/food-101/images/pizza/2702825.jpg'),
 WindowsPath('../data/food-101/images/sushi/200025.jpg')]
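
The subset is reproducible because the random generator is seeded before sampling. The sketch below isolates that step in plain Python. `sample_labels` is a hypothetical helper and the label lines are made up in the same `class_name/image_id` format. Unlike `get_subset`, it uses a local `random.Random` instance instead of the global `random.seed`.

```python
import random

def sample_labels(lines, target_classes, amount, seed=42):
    """Keep lines whose class is wanted, then draw a seeded random fraction of them."""
    labels = [line.strip() for line in lines if line.split("/")[0] in target_classes]
    rng = random.Random(seed)  # local generator, so global random state is untouched
    return rng.sample(labels, k=round(amount * len(labels)))

# Made-up label lines; "ramen" stands in for a class we do not want
lines = [f"{cls}/{i}\n" for cls in ("pizza", "steak", "sushi", "ramen") for i in range(10)]
wanted = ["pizza", "steak", "sushi"]

first = sample_labels(lines, wanted, amount=0.2)
second = sample_labels(lines, wanted, amount=0.2)
print(len(first), first == second)
```

Calling the helper twice with the same seed returns the same images in the same order, so the train and test subsets can be rebuilt exactly.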

In [22]:
# Create target directory path
target_dir_name = f"../data/pizza_steak_sushi_{str(int(amount_to_get*100))}_percent"
print(f"Creating directory: '{target_dir_name}'")

# Setup the directories
target_dir = pathlib.Path(target_dir_name)

# Make the directories
target_dir.mkdir(parents=True, exist_ok=True)

Creating directory: '../data/pizza_steak_sushi_20_percent'


In [23]:
import shutil

for image_split in label_splits.keys():
    for image_path in label_splits[str(image_split)]:
        dest_dir = target_dir / image_split / image_path.parent.stem / image_path.name
        if not dest_dir.parent.is_dir():
            dest_dir.parent.mkdir(parents=True, exist_ok=True)
        print(f"[INFO] Copying {image_path} to {dest_dir}...")
        shutil.copy2(image_path, dest_dir)

[INFO] Copying ..\data\food-101\images\pizza\3269634.jpg to ..\data\pizza_steak_sushi_20_percent\train\pizza\3269634.jpg...
[INFO] Copying ..\data\food-101\images\pizza\1524655.jpg to ..\data\pizza_steak_sushi_20_percent\train\pizza\1524655.jpg...
[INFO] Copying ..\data\food-101\images\steak\2825100.jpg to ..\data\pizza_steak_sushi_20_percent\train\steak\2825100.jpg...
[INFO] Copying ..\data\food-101\images\steak\225990.jpg to ..\data\pizza_steak_sushi_20_percent\train\steak\225990.jpg...
[INFO] Copying ..\data\food-101\images\steak\1839481.jpg to ..\data\pizza_steak_sushi_20_percent\train\steak\1839481.jpg...
[INFO] Copying ..\data\food-101\images\pizza\38349.jpg to ..\data\pizza_steak_sushi_20_percent\train\pizza\38349.jpg...
[INFO] Copying ..\data\food-101\images\pizza\3018077.jpg to ..\data\pizza_steak_sushi_20_percent\train\pizza\3018077.jpg...
[INFO] Copying ..\data\food-101\images\sushi\93139.jpg to ..\data\pizza_steak_sushi_20_percent\train\sushi\93139.jpg...
[INFO] Copying ..\
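
The copy loop builds each destination as `target_dir / split / class_name / file_name`, the layout `ImageFolder` expects. The sketch below reproduces that path logic in a temporary directory with a dummy file, so it can be tried without the Food101 download. `copy_into_split` is a hypothetical helper, not part of the notebook.

```python
import shutil
import tempfile
from pathlib import Path

def copy_into_split(image_path: Path, target_dir: Path, split: str) -> Path:
    """Copy image_path to target_dir/split/<class folder>/<file name>."""
    dest = target_dir / split / image_path.parent.name / image_path.name
    dest.parent.mkdir(parents=True, exist_ok=True)  # create split/class folders on demand
    shutil.copy2(image_path, dest)
    return dest

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    src = root / "images" / "pizza" / "123.jpg"
    src.parent.mkdir(parents=True)
    src.write_bytes(b"placeholder bytes, not a real image")

    copied = copy_into_split(src, root / "subset", "train")
    relative = copied.relative_to(root).as_posix()
    exists = copied.is_file()

print(relative, exists)
```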

In [24]:
# Check lengths of directories
def walk_through_dir(dir_path):
  """
  Walks through dir_path returning its contents.
  Args:
    dir_path (str): target directory
  
  Returns:
    A print out of:
      number of subdirectories in dir_path
      number of images (files) in each subdirectory
      name of each subdirectory
  """
  import os
  for dirpath, dirnames, filenames in os.walk(dir_path):
    print(f"There are {len(dirnames)} directories and {len(filenames)} images in '{dirpath}'.")
    
walk_through_dir(target_dir)

There are 2 directories and 0 images in '..\data\pizza_steak_sushi_20_percent'.
There are 3 directories and 0 images in '..\data\pizza_steak_sushi_20_percent\test'.
There are 0 directories and 46 images in '..\data\pizza_steak_sushi_20_percent\test\pizza'.
There are 0 directories and 58 images in '..\data\pizza_steak_sushi_20_percent\test\steak'.
There are 0 directories and 46 images in '..\data\pizza_steak_sushi_20_percent\test\sushi'.
There are 3 directories and 0 images in '..\data\pizza_steak_sushi_20_percent\train'.
There are 0 directories and 154 images in '..\data\pizza_steak_sushi_20_percent\train\pizza'.
There are 0 directories and 146 images in '..\data\pizza_steak_sushi_20_percent\train\steak'.
There are 0 directories and 150 images in '..\data\pizza_steak_sushi_20_percent\train\sushi'.


In [25]:
# Zip pizza_steak_sushi images
zip_file_name = data_dir / f"pizza_steak_sushi_{str(int(amount_to_get*100))}_percent"
shutil.make_archive(zip_file_name, 
                    format="zip", 
                    root_dir=target_dir)

'c:\\Users\\Kushagra\\Desktop\\PyTorch_Course\\data\\pizza_steak_sushi_20_percent.zip'

In [26]:
walk_through_dir("../data/pizza_steak_sushi_20_percent")

There are 2 directories and 0 images in '../data/pizza_steak_sushi_20_percent'.
There are 3 directories and 0 images in '../data/pizza_steak_sushi_20_percent\test'.
There are 0 directories and 46 images in '../data/pizza_steak_sushi_20_percent\test\pizza'.
There are 0 directories and 58 images in '../data/pizza_steak_sushi_20_percent\test\steak'.
There are 0 directories and 46 images in '../data/pizza_steak_sushi_20_percent\test\sushi'.
There are 3 directories and 0 images in '../data/pizza_steak_sushi_20_percent\train'.
There are 0 directories and 154 images in '../data/pizza_steak_sushi_20_percent\train\pizza'.
There are 0 directories and 146 images in '../data/pizza_steak_sushi_20_percent\train\steak'.
There are 0 directories and 150 images in '../data/pizza_steak_sushi_20_percent\train\sushi'.


### Updated the directory paths above and retrained the model on the new data

8. Make a prediction on your own custom image of pizza/steak/sushi (you could even download one from the internet) and share your prediction.
* Does the model you trained in exercise 7 get it right?
* If not, what do you think you could do to improve it?

* The model improved by almost 20 percent. To get better accuracy, we could do the following:
    1. Get more training data
    2. Add hidden units
    3. Apply regularisation to the loss function
    4. Train for more epochs (**not recommended since it may lead to overfitting after a while**)
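
For the custom-image prediction itself, the last step is turning the model's raw outputs (logits) into a class name. The sketch below shows that step in plain Python. The logits are made up for illustration, and the class order assumes the alphabetical ordering that `ImageFolder` assigns to the three folders.

```python
import math

def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

class_names = ["pizza", "steak", "sushi"]  # assumed alphabetical folder order
logits = [0.2, 1.5, -0.3]                  # hypothetical model output for one image

probs = softmax(logits)
pred_index = max(range(len(probs)), key=probs.__getitem__)
pred_label = class_names[pred_index]

print(pred_label, [round(p, 3) for p in probs])
```

Softmax preserves the ordering of the logits, so the predicted index is the same whether the argmax is taken before or after it. The probabilities are mainly useful for reporting how confident the model is.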