**Building a CNN-based architecture using PyTorch**

In [30]:
import torch
from torch import nn
from torch.utils.data import TensorDataset, Dataset, DataLoader
from torch.optim import SGD, Adam
from torchvision import datasets
from torch.nn import functional as F
import numpy as np
import matplotlib.pyplot as plt
from torchsummary import summary
from tqdm import tqdm


device = 'cuda' if torch.cuda.is_available() else 'cpu'

**Dataset**

In [2]:
X_train = torch.tensor([[[[1,2,3,4],[2,3,4,5],[5,6,7,8],[1,3,4,5]]],[[[-1,2,3,-4],[2,-3,4,5],[-5,6,-7,8],[-1,-3,-4,-5]]]]).to(device).float()
X_train /= 8
y_train = torch.tensor([0,1]).to(device).float()

**Model Architecture**

In [3]:
def get_model():
    model = nn.Sequential(
        nn.Conv2d(1, 1, kernel_size=3),
        nn.MaxPool2d(2),
        nn.ReLU(),
        nn.Flatten(),
        nn.Linear(1, 1),
        nn.Sigmoid()
    ).to(device)

    loss_fn = nn.BCELoss()
    optimizer = Adam(model.parameters(), lr=1e-3)
    return model, loss_fn, optimizer

In [4]:
model, loss_fn, optimizer = get_model()
summary(model, X_train, verbose=0)

Layer (type:depth-idx)                   Output Shape              Param #
├─Conv2d: 1-1                            [-1, 1, 2, 2]             10
├─MaxPool2d: 1-2                         [-1, 1, 1, 1]             --
├─ReLU: 1-3                              [-1, 1, 1, 1]             --
├─Flatten: 1-4                           [-1, 1]                   --
├─Linear: 1-5                            [-1, 1]                   2
├─Sigmoid: 1-6                           [-1, 1]                   --
Total params: 12
Trainable params: 12
Non-trainable params: 0
Total mult-adds (M): 0.00
Input size (MB): 0.00
Forward/backward pass size (MB): 0.00
Params size (MB): 0.00
Estimated Total Size (MB): 0.00
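
The output shapes and parameter counts in the summary can be reproduced by hand. The sketch below assumes the defaults used by the model above (convolution with stride 1 and no padding, pooling stride equal to the pooling kernel size); `out_size` is a hypothetical helper introduced only for this illustration.

```python
import torch
from torch import nn

def out_size(in_size, kernel, stride=1):
    """Output side length of a conv/pool layer with no padding and no dilation."""
    return (in_size - kernel) // stride + 1

conv_out = out_size(4, kernel=3)                    # 4x4 image, 3x3 kernel -> 2
pool_out = out_size(conv_out, kernel=2, stride=2)   # 2x2 map, 2x2 pool -> 1

conv = nn.Conv2d(1, 1, kernel_size=3)
linear = nn.Linear(1, 1)
conv_params = sum(p.numel() for p in conv.parameters())      # 3*3 weights + 1 bias
linear_params = sum(p.numel() for p in linear.parameters())  # 1 weight + 1 bias

print(conv_out, pool_out, conv_params, linear_params)  # 2 1 10 2
```

Because pooling leaves a single value per image, `nn.Flatten()` produces one feature, which is why the linear layer is declared as `nn.Linear(1, 1)`.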

**Train on Batches of Data**

In [5]:
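def train_batch(x, y, model, opt, loss_fn):
    model.train()
    prediction = model(x)                           # shape: (batch, 1)
    # Drop the trailing dimension so predictions match the labels' shape (batch,)
    batch_loss = loss_fn(prediction.squeeze(1), y)
    batch_loss.backward()                           # compute gradients
    opt.step()                                      # update weights with the optimizer passed in
    opt.zero_grad()                                 # clear gradients for the next batch
    return batch_loss.item()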

In [6]:
trn_dl = DataLoader(TensorDataset(X_train, y_train))

`Batch:` A subset of the dataset used in one iteration of training. Training on batches lets the model update its weights more often and keeps memory use bounded, which matters most for large datasets.

`TensorDataset:` A PyTorch utility that wraps tensors sharing the same first dimension into a dataset. Here it pairs each input in `X_train` with its label in `y_train`.

`trn_dl:` The `DataLoader` instance that yields batches from the dataset. No `batch_size` is passed, so it uses the default of one sample per batch.
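
To make the pairing and batching concrete, here is a small sketch on made-up data (the `features` and `labels` tensors are hypothetical and unrelated to `X_train`). It shows what indexing a `TensorDataset` returns and how `batch_size` changes the number of batches.

```python
import torch
from torch.utils.data import TensorDataset, DataLoader

# Hypothetical toy data: 6 samples with 3 features each, one label per sample.
features = torch.arange(18, dtype=torch.float32).reshape(6, 3)
labels = torch.tensor([0., 1., 0., 1., 0., 1.])

dataset = TensorDataset(features, labels)
x0, y0 = dataset[0]   # indexing returns one (features, label) pair

default_loader = DataLoader(dataset)                # batch_size defaults to 1
batched_loader = DataLoader(dataset, batch_size=4)  # last batch holds the remaining 2

batch_shapes = [tuple(x.shape) for x, _ in batched_loader]
print(len(default_loader), len(batched_loader), batch_shapes)
# 6 2 [(4, 3), (2, 3)]
```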

In [13]:
def calculate_accuracy(loader, model, device):
    model.eval()  # Set model to evaluation mode
    correct = 0
    total = 0
    with torch.no_grad():
        for x, y in loader:
            x, y = x.to(device), y.to(device)
            outputs = model(x)
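            # torch.max(outputs, 1) returns (values, indices); with a single output
            # column the index is always 0, so every sample is predicted as class 0
            _, predicted = torch.max(outputs, 1)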
            total += y.size(0)
            correct += (predicted == y).sum().item()
    return correct / total * 100

In [15]:
for epoch in range(2000):
    epoch_loss = 0
    for ix, batch in enumerate(trn_dl):
        x, y = batch
        batch_loss = train_batch(x, y, model, optimizer, loss_fn)
        epoch_loss += batch_loss

train_accuracy = calculate_accuracy(trn_dl, model, device)
avg_loss = epoch_loss / len(trn_dl)
print(f'Final Epoch {epoch + 1}/2000 - Avg Loss: {avg_loss:.4f}, Train Accuracy: {train_accuracy:.2f}%')

Final Epoch 2000/2000 - Avg Loss: 0.0164, Train Accuracy: 50.00%
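
The low loss next to a 50% accuracy suggests the accuracy computation does not fit a single-output sigmoid model. `torch.max(outputs, 1)` takes the argmax across one column, which is always index 0, so it matches exactly one of the two training labels. A minimal sketch of an alternative is to threshold the probability at 0.5. `binary_accuracy` is a hypothetical helper and the probabilities below are made up for illustration, not taken from the trained model.

```python
import torch

def binary_accuracy(probs, labels, threshold=0.5):
    """Percentage of samples whose thresholded probability matches the label."""
    predicted = (probs.squeeze(1) >= threshold).float()
    return (predicted == labels).float().mean().item() * 100

# Made-up sigmoid outputs of shape (batch, 1) and their labels.
probs = torch.tensor([[0.03], [0.98]])
labels = torch.tensor([0., 1.])

# Argmax over a single column is always index 0, whatever the probabilities are.
_, argmax_pred = torch.max(probs, 1)
print(argmax_pred)                      # tensor([0, 0])
print(binary_accuracy(probs, labels))   # 100.0
```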


In [16]:
model(X_train[:1])

tensor([[0.0321]], grad_fn=<SigmoidBackward0>)

### Forward Propagating the output
The following cells show, step by step, how the trained CNN processes an input image:

1. Extract the weights and biases of the convolutional layer (`cnn_w`, `cnn_b`) and of the linear layer (`lin_w`, `lin_b`) from the trained model.

2. Apply the convolution manually to the input image (`X_train[0]`): slide the 3×3 filter over the image and, at each position, sum the element-wise products and add the bias.

3. Store the result of the convolution in the `sumprod` tensor.

4. Apply the ReLU activation to `sumprod` with `clamp_min_(0)`, which sets all negative values to zero.

5. Simulate max pooling by taking the maximum of `sumprod`. The 2×2 feature map is covered by a single 2×2 pooling window, so pooling reduces to one `torch.max` call. (The model pools before applying ReLU; because ReLU never changes the ordering of values, either order gives the same result.)

6. Pass the pooled value through the linear layer: multiply by `lin_w` and add `lin_b`.

7. Apply the sigmoid activation to the result.

This "unrolls" the forward pass of the CNN and shows how the input is transformed by each layer to produce the final output.
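
The same unrolling can be written compactly with PyTorch's functional API and checked against the model's own output. This is a sketch under stated assumptions: `net` is an untrained stand-in with the same layer layout as the model above, and `x` is a random image rather than `X_train[0]`.

```python
import torch
from torch import nn
from torch.nn import functional as F

torch.manual_seed(0)
# Untrained stand-in with the same layer layout as the model above.
net = nn.Sequential(
    nn.Conv2d(1, 1, kernel_size=3), nn.MaxPool2d(2), nn.ReLU(),
    nn.Flatten(), nn.Linear(1, 1), nn.Sigmoid(),
)
x = torch.randn(1, 1, 4, 4)   # one random 4x4 single-channel image

conv, linear = net[0], net[4]
with torch.no_grad():
    h = F.conv2d(x, conv.weight, conv.bias)                 # (1, 1, 2, 2)
    h = F.max_pool2d(h, kernel_size=2)                      # (1, 1, 1, 1)
    h = F.relu(h)
    h = F.linear(h.flatten(1), linear.weight, linear.bias)  # (1, 1)
    manual = torch.sigmoid(h)
    expected = net(x)

print(torch.allclose(manual, expected))  # True
```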

**Extract various layers of the model**

In [18]:
list(model.children())

[Conv2d(1, 1, kernel_size=(3, 3), stride=(1, 1)),
 MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False),
 ReLU(),
 Flatten(start_dim=1, end_dim=-1),
 Linear(in_features=1, out_features=1, bias=True),
 Sigmoid()]

* Extract the layers among all the layers of the model that have the `weight` attribute associated with them

In [20]:
(cnn_w, cnn_b), (lin_w, lin_b) = [(layer.weight.data, layer.bias.data) for layer in list(model.children()) if hasattr(layer, 'weight')]

In [21]:
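h_im, w_im = X_train.shape[2:]
h_conv, w_conv = cnn_w.shape[2:]
sumprod = torch.zeros((h_im - h_conv + 1, w_im - w_conv + 1))

In [22]:
# Drop the channel dimensions of the single 3x3 kernel
model_filter = cnn_w.reshape(h_conv, w_conv)
for i in range(h_im - h_conv + 1):
    for j in range(w_im - w_conv + 1):
        # Window of the first image under the kernel at position (i, j)
        img_subset = X_train[0, 0, i:(i + h_conv), j:(j + w_conv)]
        # Sum of element-wise products plus the bias
        val = torch.sum(img_subset * model_filter) + cnn_b
        sumprod[i, j] = val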

In [23]:
sumprod

tensor([[-1.8059, -2.2239],
        [-0.9672, -1.4943]])

**Perform the ReLU operation on the convolution output**

In [24]:
sumprod.clamp_min_(0)

tensor([[0., 0.],
        [0., 0.]])

* The output of the pooling layer can be calculated like so:

In [27]:
pooling_layer_output = torch.max(sumprod)
pooling_layer_output

tensor(0.)
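
The walkthrough applies ReLU before pooling, while the model applies `MaxPool2d` before `ReLU`. The two orders agree because ReLU is non-decreasing, so the maximum of the clamped values equals the clamped maximum. A quick check on random data (hypothetical input, not the training images):

```python
import torch
from torch.nn import functional as F

torch.manual_seed(0)
x = torch.randn(8, 1, 4, 4)   # random batch of 4x4 single-channel maps

relu_then_pool = F.max_pool2d(F.relu(x), kernel_size=2)
pool_then_relu = F.relu(F.max_pool2d(x, kernel_size=2))

print(torch.equal(relu_then_pool, pool_then_relu))  # True
```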

* Pass the pooled output through the linear layer

In [29]:
intermediate_output_value = pooling_layer_output * lin_w + lin_b
intermediate_output_value

tensor([[-3.4069]])

* Pass the linear layer's output through the sigmoid function

In [31]:
print(torch.sigmoid(intermediate_output_value))  # same result as F.sigmoid

tensor([[0.0321]])
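
This value matches the model's output for `X_train[:1]` shown earlier, and it can be confirmed by hand from the sigmoid definition σ(z) = 1 / (1 + e^(-z)). The sketch below uses plain Python and the rounded linear-layer output printed above, so it agrees only to the displayed precision.

```python
import math

def sigmoid(z):
    """Logistic function: 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

z = -3.4069        # linear-layer output printed above (rounded to 4 decimals)
p = sigmoid(z)
print(round(p, 4))  # 0.0321
```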
