
## Task 1 (10 Points)

Select padding sizes so that each convolution produces the output shape given in the comment above it:

In [8]:
import torch

N = 4
C = 3
C_out = 10
H = 8
W = 16

x = torch.ones((N, C, H, W))

# torch.Size([4, 10, 8, 16])
out1 = torch.nn.Conv2d(C, C_out, kernel_size=(3, 3), padding=(1, 1))(x)
#print(out1.shape) # for self-test

# torch.Size([4, 10, 8, 16])
out2 = torch.nn.Conv2d(C, C_out, kernel_size=(5, 5), padding=(2, 2))(x)
#print(out2.shape) # for self-test

# torch.Size([4, 10, 8, 16])
out3 = torch.nn.Conv2d(C, C_out, kernel_size=(7, 7), padding=(3, 3))(x)
#print(out3.shape) # for self-test

# torch.Size([4, 10, 8, 16])
out4 = torch.nn.Conv2d(C, C_out, kernel_size=(9, 9), padding=(4, 4))(x)
#print(out4.shape) # for self-test

# torch.Size([4, 10, 8, 16])
out5 = torch.nn.Conv2d(C, C_out, kernel_size=(3, 5), padding=(1, 2))(x)
#print(out5.shape) # for self-test

# torch.Size([4, 10, 22, 30])
out6 = torch.nn.Conv2d(C, C_out, kernel_size=(3, 3), padding=(8, 8))(x)
#print(out6.shape) # for self-test

# torch.Size([4, 10, 7, 15])
out7 = torch.nn.Conv2d(C, C_out, kernel_size=(4, 4), padding=(1, 1))(x)
#print(out7.shape) # for self-test

# torch.Size([4, 10, 9, 17])
out8 = torch.nn.Conv2d(C, C_out, kernel_size=(2, 2), padding=(1, 1))(x)
#print(out8.shape) # for self-test
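
The padding values above can be checked by hand with the output-size formula documented for `torch.nn.Conv2d`. The sketch below is plain Python and does not call PyTorch; the helper name `conv_out_size` is an assumption made for illustration, and stride and dilation are taken to be 1, as in the cell above.

```python
def conv_out_size(size, kernel, padding, stride=1, dilation=1):
    """Output length along one spatial axis of a convolution."""
    return (size + 2 * padding - dilation * (kernel - 1) - 1) // stride + 1

H, W = 8, 16  # input height and width used in Task 1

# (kernel_size, padding) pairs copied from out1, out5, out6, out7 and out8
cases = {
    "out1": ((3, 3), (1, 1)),
    "out5": ((3, 5), (1, 2)),
    "out6": ((3, 3), (8, 8)),
    "out7": ((4, 4), (1, 1)),
    "out8": ((2, 2), (1, 1)),
}

shapes = {
    name: (conv_out_size(H, k[0], p[0]), conv_out_size(W, k[1], p[1]))
    for name, (k, p) in cases.items()
}

for name, shape in shapes.items():
    print(name, shape)
```

For an odd kernel size `k` with stride 1, a padding of `(k - 1) // 2` keeps the spatial size unchanged, which is why `out1` to `out5` all stay at 8×16.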

## Task 2 (40 Points)

Implement the architecture described in the article.
To verify that it works, train it on any suitable dataset.

### Architectural Design Strategies
**Strategy 1.** Replace 3×3 filters with 1×1 filters
Given a budget of a certain number of convolution filters, we can choose to make the majority of these filters 1×1, since a 1×1 filter has 9× fewer parameters than a 3×3 filter.

**Strategy 2.** Decrease the number of input channels to 3×3 filters
Consider a convolution layer that is comprised entirely of 3×3 filters. The total quantity of parameters in this layer is:
(number of input channels) × (number of filters) × (3×3)
We can decrease the number of input channels to the 3×3 filters by using squeeze layers, which are described in the next section.
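
The parameter formula above is easy to evaluate directly. The following sketch counts weights only (biases are ignored); the helper name `conv_weight_count` and the channel counts are illustrative assumptions, not values from the article.

```python
def conv_weight_count(in_channels, num_filters, kernel_size):
    """Weights in a square-kernel convolution layer, biases excluded."""
    return in_channels * num_filters * kernel_size * kernel_size

in_channels, num_filters = 128, 128  # illustrative values only

# Strategy 1: the same layer built from 3x3 filters versus 1x1 filters
weights_3x3 = conv_weight_count(in_channels, num_filters, 3)
weights_1x1 = conv_weight_count(in_channels, num_filters, 1)
print(weights_3x3, weights_1x1, weights_3x3 // weights_1x1)

# Strategy 2: feed the 3x3 filters a quarter of the input channels
weights_3x3_squeezed = conv_weight_count(in_channels // 4, num_filters, 3)
print(weights_3x3_squeezed)
```

The count scales linearly with the number of input channels, so reducing the channels that reach the 3×3 filters reduces that layer's weights by the same factor.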

**Strategy 3.** Downsample late in the network so that convolution layers have large activation maps
The intuition is that large activation maps (due to delayed downsampling) can lead to higher classification accuracy.

### Fire Module
![](https://miro.medium.com/v2/resize:fit:930/format:webp/1*ONk0HfLLjDcUhUjuu8iq1w.png)
A Fire module is comprised of: a squeeze convolution layer (which has only 1×1 filters), feeding into an expand layer that has a mix of 1×1 and 3×3 convolution filters.

There are three tunable dimensions (hyperparameters) in a Fire module: s1×1, e1×1, and e3×3.

s1×1: The number of 1×1 filters in the squeeze layer.

e1×1 and e3×3: The number of 1×1 and 3×3 filters in the expand layer.

When we use Fire modules, we set s1×1 to be less than (e1×1 + e3×3), so the squeeze layer helps to limit the number of input channels to the 3×3 filters, as per Strategy 2 in the previous section.
To me, it closely resembles the Inception module.
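
To make the effect of the squeeze layer concrete, the sketch below compares the weight count of one Fire module with that of a plain 3×3 convolution producing the same number of output channels. The hyperparameters (96 input channels, s1×1 = 16, e1×1 = e3×3 = 64) are those of the first Fire module in the implementation later in this notebook; the helper name `fire_weight_count` is an assumption, and biases are ignored.

```python
def fire_weight_count(in_channels, s1x1, e1x1, e3x3):
    """Weights in a Fire module (squeeze + both expand branches), biases excluded."""
    squeeze = in_channels * s1x1          # 1x1 squeeze filters
    expand_1x1 = s1x1 * e1x1              # 1x1 expand filters
    expand_3x3 = s1x1 * e3x3 * 3 * 3      # 3x3 expand filters
    return squeeze + expand_1x1 + expand_3x3

in_channels, s1x1, e1x1, e3x3 = 96, 16, 64, 64

fire_weights = fire_weight_count(in_channels, s1x1, e1x1, e3x3)

# The two expand branches are concatenated along the channel axis
out_channels = e1x1 + e3x3

# A single 3x3 convolution with the same input and output channel counts
plain_weights = in_channels * out_channels * 3 * 3

print(out_channels, fire_weights, plain_weights)
```

With these values the Fire module uses roughly nine times fewer weights than the plain 3×3 layer.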

![](https://miro.medium.com/v2/resize:fit:1400/format:webp/1*y87bqk95D-IndWdHM_K9-g.png)
![](https://miro.medium.com/v2/resize:fit:4800/format:webp/1*XQGAKZb8kjoF_1lSXeIQxg.png)

## Step 0. Data preparation.

In [6]:
from google.colab import files
files.upload()

Saving kaggle.json to kaggle.json


{'kaggle.json': b'{"username":"bobojonm","key":"<redacted>"}'}

In [7]:
!rm -rf ~/.kaggle
!mkdir -p ~/.kaggle
!mv ./kaggle.json ~/.kaggle/
!chmod 600 ~/.kaggle/kaggle.json

In [8]:
!kaggle datasets download -d jessicali9530/stanford-cars-dataset

stanford-cars-dataset.zip: Skipping, found more recently modified local copy (use --force to force download)


In [10]:
import zipfile

# Extract the downloaded archive; the context manager closes the file automatically
with zipfile.ZipFile('stanford-cars-dataset.zip', 'r') as zip_ref:
    zip_ref.extractall('/stanford-cars')

In [11]:
import os
print(os.listdir('/stanford-cars'))

['cars_annos.mat', 'cars_train', 'cars_test']


In [1]:
import torch
from torchvision import datasets, transforms
from torch.utils.data import random_split, DataLoader

# Define transformations
transform = transforms.Compose([
    transforms.Resize((224, 224)), # Resizing to the same size
    transforms.ToTensor(),         # Convert PIL image to PyTorch tensor
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))  # Map each channel from [0, 1] to [-1, 1]
])

# Load datasets using ImageFolder (it expects one subdirectory per class under each root)
train_dataset = datasets.ImageFolder(root='/stanford-cars/cars_train/', transform=transform)
test_dataset = datasets.ImageFolder(root='/stanford-cars/cars_test/', transform=transform)

# Split the training dataset into training and validation sets
train_len = int(0.8 * len(train_dataset))
val_len = len(train_dataset) - train_len

train_data, val_data = random_split(train_dataset, [train_len, val_len])

# Create DataLoaders
train_loader = DataLoader(train_data, batch_size=32, shuffle=True)
val_loader = DataLoader(val_data, batch_size=32, shuffle=False)
test_loader = DataLoader(test_dataset, batch_size=32, shuffle=False)
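
The `Normalize` step in the transform above applies `(value - mean) / std` to every channel. A minimal sketch of that arithmetic in plain Python, with a hypothetical helper named `normalize`, shows how a mean and standard deviation of 0.5 map the `[0, 1]` range produced by `ToTensor` onto `[-1, 1]`:

```python
def normalize(value, mean=0.5, std=0.5):
    """Per-channel arithmetic performed by Normalize: (value - mean) / std."""
    return (value - mean) / std

# Darkest, middle and brightest pixel values after ToTensor
for pixel in (0.0, 0.5, 1.0):
    print(pixel, "->", normalize(pixel))
```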


## Step 1. Neural network architecture

In [2]:
import torch.nn as nn

class Fire(nn.Module):
    def __init__(self, in_channels, s1x1, e1x1, e3x3):
        super(Fire, self).__init__()

        # Squeeze layer: 1x1 convolution, no padding so the spatial size is preserved
        self.s1x1 = nn.Conv2d(in_channels, s1x1, kernel_size=1)
        self.ac1 = nn.ReLU(inplace=True)

        self.e1x1 = nn.Conv2d(s1x1, e1x1, kernel_size=1)
        self.ac2 = nn.ReLU(inplace=True)

        self.e3x3 = nn.Conv2d(s1x1, e3x3, kernel_size=3, padding=1)
        self.ac3 = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.ac1(self.s1x1(x))
        return torch.cat([
            self.ac2(self.e1x1(x)),
            self.ac3(self.e3x3(x))
        ], 1)


class SqueezeNet(nn.Module):
    def __init__(self, num_classes=1000):
        super(SqueezeNet, self).__init__()

        self.conv1 = nn.Conv2d(3, 96, kernel_size=7, stride=2, padding=3)
        self.ac1 = nn.ReLU(inplace=True)
        self.pool1 = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)

        self.fire1 = Fire(96, 16, 64, 64)
        self.fire2 = Fire(128, 16, 64, 64)
        self.fire3 = Fire(128, 32, 128, 128)
        self.pool2 = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.fire4 = Fire(256, 32, 128, 128)
        self.fire5 = Fire(256, 48, 192, 192)
        self.fire6 = Fire(384, 48, 192, 192)
        self.fire7 = Fire(384, 64, 256, 256)
        self.pool3 = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.fire8 = Fire(512, 64, 256, 256)

        self.conv2 = nn.Conv2d(512, num_classes, kernel_size=1)
        self.ac2 = nn.ReLU(inplace=True)
        self.avpool = nn.AdaptiveAvgPool2d((1, 1))

    def forward(self, x):
        x = self.conv1(x)
        x = self.ac1(x)
        x = self.pool1(x)

        x = self.fire1(x)
        x = self.fire2(x)
        x = self.fire3(x)
        x = self.pool2(x)
        x = self.fire4(x)
        x = self.fire5(x)
        x = self.fire6(x)
        x = self.fire7(x)

        x = self.pool3(x)
        x = self.fire8(x)

        x = self.conv2(x)
        x = self.ac2(x)
        x = self.avpool(x)

        return torch.flatten(x, 1)

model = SqueezeNet()
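
As a sanity check on Strategy 3 (late downsampling), the spatial size of the activation maps can be traced through the strided layers by hand. The sketch below is plain Python and assumes a 224×224 input (the size set in the data preparation step) and that every Fire module preserves height and width, which holds only when the 1×1 squeeze convolution uses no padding. The helper name `out_size` is hypothetical.

```python
def out_size(size, kernel, stride, padding):
    """Output length along one axis for Conv2d / MaxPool2d with dilation 1."""
    return (size + 2 * padding - kernel) // stride + 1

size = 224  # input height and width after Resize((224, 224))
trace = [("input", size)]

# (name, kernel_size, stride, padding) of the layers that change the spatial size
for name, kernel, stride, padding in [
    ("conv1", 7, 2, 3),
    ("pool1", 3, 2, 1),
    ("pool2", 3, 2, 1),
    ("pool3", 3, 2, 1),
]:
    size = out_size(size, kernel, stride, padding)
    trace.append((name, size))

for name, value in trace:
    print(f"{name}: {value}x{value}")
```

Under these assumptions the last Fire module and the final 1×1 convolution operate on 14×14 maps before the adaptive average pooling reduces them to 1×1.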

In [3]:
device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
model = model.to(device)

## Step 2. Loss Function

In [4]:
loss = nn.CrossEntropyLoss()

## Step 3. Optimizer

In [5]:
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)

## Step 4. Train Loop

In [None]:
num_epochs = 20
for epoch in range(num_epochs):

    running_loss = 0.0
    total_trained = 0

    for batch_idx, (inputs, labels) in enumerate(train_loader):
        inputs, labels = inputs.to(device), labels.to(device)

        optimizer.zero_grad()

        outputs = model(inputs)

        loss_value = loss(outputs, labels)

        loss_value.backward()
        optimizer.step()

        running_loss += loss_value.item() * inputs.size(0)
        total_trained += labels.size(0)

        # Print statistics every 10 batches
        if (batch_idx + 1) % 10 == 0:
            print(f'Epoch [{epoch+1}/{num_epochs}], Step [{batch_idx+1}/{len(train_loader)}], Loss: {loss_value.item():.4f}')

    # Print the average loss for this epoch
    avg_loss = running_loss / total_trained
    print(f'Epoch [{epoch+1}/{num_epochs}], Average Loss: {avg_loss:.4f}')

print("Training completed!")



Epoch [1/20], Step [10/204], Loss: 6.9078
Epoch [1/20], Step [20/204], Loss: 6.9078
Epoch [1/20], Step [30/204], Loss: 6.9078
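
The logged loss does not change between steps, and its value matches ln(1000) ≈ 6.9078. That is the cross-entropy obtained when all 1000 logits (the default `num_classes`) are equal, so the softmax assigns probability 1/1000 to every class. The sketch below reproduces the number in plain Python; it illustrates the arithmetic only and is not a diagnosis of why the network might produce constant outputs.

```python
import math

num_classes = 1000            # default of SqueezeNet(num_classes=1000)
logits = [0.0] * num_classes  # identical logits for every class

# Cross-entropy for a single sample: log-sum-exp of the logits minus the target logit
target = 0
log_sum_exp = math.log(sum(math.exp(z) for z in logits))
loss = log_sum_exp - logits[target]

print(f"{loss:.4f}")
```

A loss that stays at this value suggests checking whether the model's outputs vary across inputs and whether the labels span the expected number of classes.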
