# <font color='blue' size='5px'>CIFAR10 Project</font>

##  Problem Statement

The CIFAR10 project aims to classify the 60,000 32x32 color images of the CIFAR-10 dataset (50,000 for training and 10,000 for testing) into 10 classes: airplane, automobile, bird, cat, deer, dog, frog, horse, ship, and truck. The challenge is to build an image classification model that identifies the object in each image accurately despite variations in lighting, color, and orientation.

## Packages

In [None]:
%pip install torch torchvision torchsummary scikit-learn



In [None]:
import torch
import torchvision
import torchvision.transforms as transforms  # Image transformations

## Dataloader Pipeline

### Data Transforms
Define the transformations that preprocess the data. Common steps include resizing, converting images to PyTorch tensors, and normalizing pixel values. For example:

In [None]:
transform = transforms.Compose([
    transforms.Resize((32, 32)),           # Resize to 32x32 pixels (CIFAR-10 images already have this size)
    transforms.ToTensor(),                # Convert images to PyTorch tensors with values in [0, 1]
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))  # Normalize each channel from [0, 1] to [-1, 1]
])
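
To see what the `Normalize` step does numerically, here is a minimal sketch in plain Python, assuming pixel values have already been scaled to [0, 1] by `ToTensor`. The helper name `normalize_value` is illustrative and not part of torchvision; it applies the same per-channel formula, `(x - mean) / std`.

```python
def normalize_value(x, mean=0.5, std=0.5):
    """Apply the per-channel normalization formula (x - mean) / std."""
    return (x - mean) / std

# With mean=0.5 and std=0.5, the range [0, 1] is mapped onto [-1, 1].
darkest = normalize_value(0.0)
mid_gray = normalize_value(0.5)
brightest = normalize_value(1.0)
print(darkest, mid_gray, brightest)
```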

### Load Dataset

You can use PyTorch's torchvision.datasets module to download and load common datasets easily. For this example, we'll use CIFAR-10:

In [None]:
# Download and load the CIFAR-10 training dataset
train_dataset = torchvision.datasets.CIFAR10(root='./data', train=True, transform=transform, download=True)

# Download and load the CIFAR-10 test dataset
test_dataset = torchvision.datasets.CIFAR10(root='./data', train=False, transform=transform, download=True)


Downloading https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz to ./data/cifar-10-python.tar.gz


100%|██████████| 170498071/170498071 [00:06<00:00, 27883953.60it/s]


Extracting ./data/cifar-10-python.tar.gz to ./data
Files already downloaded and verified


### Create DataLoaders

Data loaders let you iterate over the dataset in batches during training. You choose the batch size and, for training, enable shuffling so that each epoch sees the samples in a different order.

Use the `torch.utils.data.DataLoader` class to create data loaders.

- Set the **num_workers** argument to load data in multiple worker processes, which can speed up loading considerably on multi-core CPUs.
- Set the shuffle argument to **True** for the training data loader to randomize the order of samples during training. For the test data loader, set shuffle to **False**.

In [None]:
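# Create data loaders
batch_size = 64
num_workers = 2  # Number of worker processes used to load batches in parallel

train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size, shuffle=True, num_workers=num_workers)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=batch_size, shuffle=False, num_workers=num_workers)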
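
As a quick check on what these loaders produce, the sketch below computes the number of batches per epoch in plain Python. It assumes the standard CIFAR-10 split of 50,000 training and 10,000 test images and the default `drop_last=False`; the helper name `batches_per_epoch` is illustrative.

```python
import math

def batches_per_epoch(num_samples, batch_size):
    """Return (number of batches, size of the final batch) when no batch is dropped."""
    num_batches = math.ceil(num_samples / batch_size)
    last_batch = num_samples - (num_batches - 1) * batch_size
    return num_batches, last_batch

train_batches, train_last = batches_per_epoch(50_000, 64)
test_batches, test_last = batches_per_epoch(10_000, 64)
print(train_batches, train_last)
print(test_batches, test_last)
```

Under these assumptions, the training loader yields 782 batches per epoch, and the final batch holds 16 images rather than 64.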


In [None]:
# Iterate through the training data loader
for images, labels in train_loader:
    # Your training code here
    print("Batch of images shape:", images.shape)
    print("Batch of labels:", labels)
    break  # Break after processing the first batch for this example


Batch of images shape: torch.Size([64, 3, 32, 32])
Batch of labels: tensor([2, 1, 6, 7, 2, 0, 4, 4, 2, 3, 5, 0, 4, 6, 4, 8, 0, 5, 2, 2, 1, 3, 8, 6,
        4, 2, 3, 7, 5, 1, 3, 4, 9, 8, 0, 6, 7, 8, 7, 3, 1, 3, 1, 3, 0, 2, 8, 3,
        8, 3, 0, 5, 8, 1, 6, 1, 2, 9, 4, 0, 7, 3, 4, 1])


##  Model Selection & Training

### Model Building

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim

- **Model Overview:**

  The following neural network consists of two convolutional layers (`conv1` and `conv2`), each followed by a ReLU activation and a 2x2 max-pooling step, and two fully connected layers. `conv1` takes an input tensor of shape `(batch_size, 3, 32, 32)` and produces an output of shape `(batch_size, 16, 32, 32)`, which pooling reduces to `(batch_size, 16, 16, 16)`. `conv2` takes that `(batch_size, 16, 16, 16)` tensor and produces `(batch_size, 32, 16, 16)`, which pooling reduces to `(batch_size, 32, 8, 8)`.

  The pooled output of `conv2` is then flattened with the `view` method so that each sample becomes a vector of length `32 x 8 x 8 = 2048`. These vectors pass through two fully connected layers (`fc1` and `fc2`), which map them to 10 output values, one score per class.

  Each layer learns to extract and transform features from its input to produce a higher-level representation of the data. Chaining layers in this way lets the network learn increasingly complex representations that are useful for the task at hand (here, image classification on the CIFAR10 dataset).

- **`view` in the Model**

  - The `view` method in PyTorch reshapes a tensor while preserving its total number of elements. In `x.view(-1, 32 * 8 * 8)`, the `-1` tells PyTorch to infer the size of that dimension from the other dimension and the total number of elements.

  - After the second pooling step, the tensor has shape `(batch_size, 32, 8, 8)`, where `batch_size` is the number of samples in the batch. `x.view(-1, 32 * 8 * 8)` reshapes it to `(batch_size, 2048)`: each sample is flattened into a vector of length `32 * 8 * 8`, and the `-1` resolves to the batch size.

  - This step is necessary because the fully connected layers (`fc1` and `fc2`) expect **one feature vector per sample**, that is, an input whose last dimension equals `in_features`. Flattening ensures that the pooled output of `conv2` can be fed into them.

  - `x.view(-1, 32 * 8 * 8)` in PyTorch is equivalent to `tf.reshape(x, [-1, 32 * 8 * 8])` in TensorFlow: **both flatten each sample into a vector while preserving the total number of elements**.
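
The shapes discussed above can be traced with a little arithmetic. The following is a minimal sketch in plain Python, assuming the layer settings used in this notebook (3x3 convolutions with padding 1, 2x2 max pooling with stride 2). The helpers `conv2d_out` and `resolve_view_shape` are hypothetical names written for illustration; they mimic the standard output-size formula and the way a `-1` dimension is inferred, and they are not PyTorch functions.

```python
def conv2d_out(size, kernel_size, padding=0, stride=1):
    """Spatial output size of a convolution or pooling layer (no dilation)."""
    return (size + 2 * padding - kernel_size) // stride + 1

def resolve_view_shape(num_elements, shape):
    """Fill in a single -1 entry so that the shape holds exactly num_elements values."""
    known = 1
    for dim in shape:
        if dim != -1:
            known *= dim
    if num_elements % known != 0:
        raise ValueError("shape is incompatible with the number of elements")
    return tuple(num_elements // known if dim == -1 else dim for dim in shape)

size = 32
after_conv1 = conv2d_out(size, kernel_size=3, padding=1)         # 3x3 conv keeps the size
after_pool1 = conv2d_out(after_conv1, kernel_size=2, stride=2)   # 2x2 pooling halves it
after_conv2 = conv2d_out(after_pool1, kernel_size=3, padding=1)
after_pool2 = conv2d_out(after_conv2, kernel_size=2, stride=2)

flat_features = 32 * after_pool2 * after_pool2  # channels * height * width
batch_size = 64
flat_shape = resolve_view_shape(batch_size * flat_features, (-1, flat_features))
print(after_conv1, after_pool1, after_conv2, after_pool2, flat_features, flat_shape)
```

The spatial size goes 32, 16, 16, 8 through the two convolution and pooling stages, which is why the flattened vector has `32 * 8 * 8 = 2048` entries per sample.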



In [None]:
class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()

        self.conv1 = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)


        self.conv2 = nn.Conv2d(in_channels=16, out_channels=32, kernel_size=3, padding=1)


        ## in_features is the number of input features per sample. After the second pooling step
        ## the tensor has shape (batch_size, 32, 8, 8), where batch_size is the number of samples in the batch,
        ## 32 is the number of output channels of conv2,
        ## and 8 x 8 is the spatial size of each feature map.
        ## The forward pass flattens each sample into a vector of length 32 x 8 x 8 = 2048, so in_features is 32 * 8 * 8.
        self.fc1 = nn.Linear(in_features=32 * 8 * 8, out_features=64)
        self.fc2 = nn.Linear(in_features=64, out_features=10)

    def forward(self, x):
        ## torch.relu is a function that applies the ReLU activation element-wise,
        ## while self.pool is an nn.MaxPool2d module instance that is called like a function to apply max pooling.
        x = torch.relu(self.conv1(x))
        x = self.pool(x)

        x = torch.relu(self.conv2(x))
        x = self.pool(x)

        ## view flattens each sample here, much like a Flatten layer
        x = x.view(-1, 32 * 8 * 8)

        ## self.fc1 is an nn.Linear instance; calling it (self.fc1(x)) applies the linear transformation to the tensor.
        x = torch.relu(self.fc1(x))

        ## self.fc2 produces the 10 class scores (logits); no activation is applied because nn.CrossEntropyLoss expects raw logits.
        x = self.fc2(x)
        return x

In [None]:
model = MyModel()

In [None]:
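from torchsummary import summary

# input_size excludes the batch dimension: (channels, height, width)
# device="cpu" matches the model, which has not been moved to a GPU at this point
summary(model, (3, 32, 32), device="cpu")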
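
The summary lists a parameter count for each layer. As a cross-check, the sketch below derives those counts by hand in plain Python, assuming bias terms are enabled (the PyTorch default for `nn.Conv2d` and `nn.Linear`). The helper names `conv2d_params` and `linear_params` are illustrative.

```python
def conv2d_params(in_channels, out_channels, kernel_size):
    """Weights (out * in * k * k) plus one bias per output channel."""
    return out_channels * in_channels * kernel_size * kernel_size + out_channels

def linear_params(in_features, out_features):
    """Weights (in * out) plus one bias per output feature."""
    return in_features * out_features + out_features

param_counts = {
    "conv1": conv2d_params(3, 16, 3),
    "conv2": conv2d_params(16, 32, 3),
    "fc1": linear_params(32 * 8 * 8, 64),
    "fc2": linear_params(64, 10),
}
total_params = sum(param_counts.values())
print(param_counts, total_params)
```

Almost all of the parameters sit in `fc1`, because it connects every one of the 2048 flattened features to each of its 64 outputs.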

In [None]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

### Optimizer & Loss Function

In [None]:
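# Define the loss function and optimizer.
# This creates a fresh MyModel instance on the CPU, replacing the one created above.
model = MyModel()
criterion = nn.CrossEntropyLoss()                      # Expects raw logits and integer class labels
optimizer = optim.Adam(model.parameters(), lr=0.001)  # Adam with a learning rate of 1e-3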
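
To make the loss concrete, here is a minimal plain-Python sketch of the quantity `nn.CrossEntropyLoss` computes for a single sample: the negative log of the softmax probability assigned to the correct class (PyTorch averages this over the batch by default). The function name `cross_entropy` and the example logits are made up for illustration.

```python
import math

def cross_entropy(logits, target):
    """Negative log of the softmax probability assigned to the target class."""
    max_logit = max(logits)  # Subtract the maximum for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum_exp - logits[target]

# A model that scores all 10 classes equally gets a loss of ln(10).
uniform_loss = cross_entropy([0.0] * 10, target=3)
# A confident, correct prediction drives the loss toward zero.
confident_loss = cross_entropy([0.0] * 9 + [10.0], target=9)
print(round(uniform_loss, 3), confident_loss)
```

The value ln(10), roughly 2.303, is a useful reference point for a 10-class problem: a training loss well below it indicates the model is doing better than guessing uniformly.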

In [None]:
# Number of full passes over the training set
num_epochs = 10

### Train Model

In [None]:
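model.train()  # Put the model in training mode
# Send each batch to whichever device the model's parameters live on
model_device = next(model.parameters()).device

for epoch in range(num_epochs):
    running_loss = 0.0
    for i, (inputs, labels) in enumerate(train_loader):
        inputs, labels = inputs.to(model_device), labels.to(model_device)

        optimizer.zero_grad()              # Clear gradients from the previous step
        outputs = model(inputs)            # Forward pass
        loss = criterion(outputs, labels)  # Cross-entropy loss for the batch
        loss.backward()                    # Backpropagate
        optimizer.step()                   # Update the weights

        running_loss += loss.item()
        if i % 100 == 99:  # Report the average loss every 100 batches
            print('[Epoch %d, Batch %5d] loss: %.3f' %
                  (epoch + 1, i + 1, running_loss / 100))
            running_loss = 0.0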

[Epoch 1, Batch   100] loss: 1.944
[Epoch 1, Batch   200] loss: 1.657
[Epoch 1, Batch   300] loss: 1.503
[Epoch 1, Batch   400] loss: 1.454
[Epoch 1, Batch   500] loss: 1.383
[Epoch 1, Batch   600] loss: 1.344
[Epoch 1, Batch   700] loss: 1.287
[Epoch 2, Batch   100] loss: 1.209
[Epoch 2, Batch   200] loss: 1.173
[Epoch 2, Batch   300] loss: 1.137
[Epoch 2, Batch   400] loss: 1.135
[Epoch 2, Batch   500] loss: 1.127
[Epoch 2, Batch   600] loss: 1.107
[Epoch 2, Batch   700] loss: 1.090
[Epoch 3, Batch   100] loss: 1.019
[Epoch 3, Batch   200] loss: 1.000
[Epoch 3, Batch   300] loss: 0.982
[Epoch 3, Batch   400] loss: 0.990
[Epoch 3, Batch   500] loss: 0.976
[Epoch 3, Batch   600] loss: 0.992
[Epoch 3, Batch   700] loss: 0.960
[Epoch 4, Batch   100] loss: 0.886
[Epoch 4, Batch   200] loss: 0.893
[Epoch 4, Batch   300] loss: 0.888
[Epoch 4, Batch   400] loss: 0.901
[Epoch 4, Batch   500] loss: 0.905
[Epoch 4, Batch   600] loss: 0.900
[Epoch 4, Batch   700] loss: 0.884
[Epoch 5, Batch   10

## Prediction

## Evaluation

In [None]:
# Test the model
model.eval()
model_device = next(model.parameters()).device  # Device the model's parameters live on
correct_test = 0
total_test = 0
y_true = []
y_pred = []
with torch.no_grad():
    for inputs, labels in test_loader:
        inputs, labels = inputs.to(model_device), labels.to(model_device)
        outputs = model(inputs)
        predicted = outputs.argmax(dim=1)  # Index of the highest logit per sample
        y_true += labels.tolist()
        y_pred += predicted.tolist()
        total_test += labels.size(0)
        correct_test += (predicted == labels).sum().item()

# Compute and print the accuracy once, after all batches have been processed
acc_test = correct_test / total_test
print('Accuracy on test set: %.3f' % acc_test)
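
The loop accumulates correct and total counts across batches instead of averaging per-batch accuracies. The sketch below, in plain Python with made-up batch results, shows why this matters when the final batch is smaller than the rest.

```python
# Each tuple is (number of samples in the batch, number predicted correctly).
# The numbers are made up for illustration.
batches = [(64, 48), (16, 4)]

total = sum(n for n, _ in batches)
correct = sum(c for _, c in batches)
pooled_accuracy = correct / total  # 52 correct out of 80 samples
mean_of_batch_accuracies = sum(c / n for n, c in batches) / len(batches)
print(pooled_accuracy, mean_of_batch_accuracies)
```

Pooling the counts weights every sample equally, whereas averaging per-batch accuracies would give the 16 samples of the short batch as much influence as the 64 samples of the full one.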

In [None]:
# Compute the confusion matrix and classification report
from sklearn.metrics import confusion_matrix, classification_report
classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')
cm = confusion_matrix(y_true, y_pred)
report = classification_report(y_true, y_pred, target_names=classes)

print("Confusion Matrix:")
print(cm)
print("Classification Report:")
print(report)

Confusion Matrix:
[[725  22  55  18  19  10  15  10  87  39]
 [ 23 777   9  11   7   6  14   9  29 115]
 [ 62   4 579  43  89  62  89  43  16  13]
 [ 18   7  93 441  74 181 107  39  28  12]
 [ 15   3  91  43 645  39  79  64  18   3]
 [  9   4  62 141  56 623  33  51  14   7]
 [  4   2  41  26  31  28 846   8  10   4]
 [ 11   4  35  27  81  61   8 748   5  20]
 [ 50  36   9   9  16   9   8   3 833  27]
 [ 30  76  11  17  15   6  14  24  45 762]]
Classification Report:
              precision    recall  f1-score   support

       plane       0.77      0.72      0.74      1000
         car       0.83      0.78      0.80      1000
        bird       0.59      0.58      0.58      1000
         cat       0.57      0.44      0.50      1000
        deer       0.62      0.65      0.63      1000
         dog       0.61      0.62      0.62      1000
        frog       0.70      0.85      0.76      1000
       horse       0.75      0.75      0.75      1000
        ship       0.77      0.83      0.
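
The per-class figures in the report follow directly from the confusion matrix. As a minimal sketch in plain Python, the block below recomputes recall (diagonal entry divided by its row sum), precision (diagonal entry divided by its column sum), and overall accuracy from the matrix printed above. It relies on the scikit-learn convention that rows are true classes and columns are predicted classes.

```python
cm = [
    [725, 22, 55, 18, 19, 10, 15, 10, 87, 39],
    [23, 777, 9, 11, 7, 6, 14, 9, 29, 115],
    [62, 4, 579, 43, 89, 62, 89, 43, 16, 13],
    [18, 7, 93, 441, 74, 181, 107, 39, 28, 12],
    [15, 3, 91, 43, 645, 39, 79, 64, 18, 3],
    [9, 4, 62, 141, 56, 623, 33, 51, 14, 7],
    [4, 2, 41, 26, 31, 28, 846, 8, 10, 4],
    [11, 4, 35, 27, 81, 61, 8, 748, 5, 20],
    [50, 36, 9, 9, 16, 9, 8, 3, 833, 27],
    [30, 76, 11, 17, 15, 6, 14, 24, 45, 762],
]
classes = ('plane', 'car', 'bird', 'cat',
           'deer', 'dog', 'frog', 'horse', 'ship', 'truck')

n = len(cm)
row_sums = [sum(row) for row in cm]                              # True samples per class
col_sums = [sum(cm[r][c] for r in range(n)) for c in range(n)]   # Predictions per class

recall = [cm[i][i] / row_sums[i] for i in range(n)]
precision = [cm[i][i] / col_sums[i] for i in range(n)]
accuracy = sum(cm[i][i] for i in range(n)) / sum(row_sums)

for name, p, r in zip(classes, precision, recall):
    print(f"{name:>6}  precision {p:.2f}  recall {r:.2f}")
print(f"accuracy {accuracy:.4f}")
```

For the `cat` class, for example, 441 of the 1,000 true cats are recovered (recall 0.44), and 441 of the 776 images predicted as cats are correct (precision 0.57), matching the report.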