## Import Required Libraries

This section imports the libraries used below: `torch.nn` for building the image classification models and `torchinfo` for printing layer-by-layer model summaries.

In [8]:
import torch.nn as nn
from torchinfo import summary

## SimpleCNN Model Architecture

The `SimpleCNN` class defines a straightforward convolutional neural network for grayscale image classification. Below are the key architectural details:

- **Version:** 1.0
- **Input:** Grayscale images of size 256x256 (1 channel)
- **Layers:**
    - **Convolutional Layer:**  
        - `nn.Conv2d` with 1 input channel, 1 output channel, kernel size 6x6, stride 4, no padding  
        - Followed by `nn.ReLU` activation
        - Followed by `nn.MaxPool2d` with kernel size 3x3, stride 3
    - **Flatten Layer:**  
        - Flattens the output from the convolutional block to a 1D tensor
    - **Fully Connected Layer:**  
        - `nn.Linear` with 441 input features (1 × 21 × 21) and 14 output classes
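
The 441 input features follow from the layer settings listed above. As a minimal sketch, the standard output-size formula for convolution and pooling layers (dilation 1, floor rounding) can be applied by hand; the helper name `conv_output_size` is hypothetical and not part of the notebook.

```python
def conv_output_size(size: int, kernel: int, stride: int, padding: int = 0) -> int:
    """Spatial output size of a conv or pooling layer (dilation 1, floor rounding)."""
    return (size + 2 * padding - kernel) // stride + 1


# 256x256 input -> Conv2d(kernel 6, stride 4, no padding)
after_conv = conv_output_size(256, kernel=6, stride=4)
# -> MaxPool2d(kernel 3, stride 3)
after_pool = conv_output_size(after_conv, kernel=3, stride=3)
# One output channel, so the flattened vector has 1 * H * W entries
fc_inputs = 1 * after_pool * after_pool

print(after_conv, after_pool, fc_inputs)  # 63 21 441
```

If the input resolution or any kernel or stride value changes, the `in_features` argument of the fully connected layer has to be recomputed the same way.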

### Forward Pass

1. The input image passes through the convolutional block (`layer1`).
2. The output is flattened into a vector.
3. The flattened vector is passed through a fully connected layer (`fc1`) to produce class logits.
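
The three steps above can be traced with a dummy input. This is a sketch that rebuilds the same layers outside the class so that each intermediate shape can be inspected; the random tensor stands in for a real image and the weights are untrained.

```python
import torch
import torch.nn as nn

# Same layers as SimpleCNN v1.0, built standalone for inspection.
layer1 = nn.Sequential(
    nn.Conv2d(in_channels=1, out_channels=1, kernel_size=6, stride=4, padding=0),
    nn.ReLU(),
    nn.MaxPool2d(kernel_size=3, stride=3),
)
flatten = nn.Flatten()
fc1 = nn.Linear(1 * 21 * 21, 14)

x = torch.rand(1, 1, 256, 256)  # dummy batch: one grayscale 256x256 image

with torch.no_grad():
    features = layer1(x)      # step 1: convolutional block
    flat = flatten(features)  # step 2: flatten to a vector per sample
    logits = fc1(flat)        # step 3: class logits

print(features.shape)  # torch.Size([1, 1, 21, 21])
print(flat.shape)      # torch.Size([1, 441])
print(logits.shape)    # torch.Size([1, 14])
```

The logits are raw scores; a softmax or a loss such as `nn.CrossEntropyLoss` would be applied outside the model.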

### Usage

This model is suitable for simple grayscale image classification tasks with 14 output classes. The architecture is intentionally minimal for educational or prototyping purposes.

In [9]:
class SimpleCNN(nn.Module):
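    def __init__(self):
        super().__init__()
        self.version = '1.0'
        # Convolutional block: (N, 1, 256, 256) -> (N, 1, 63, 63) -> (N, 1, 21, 21)
        self.layer1 = nn.Sequential(
            nn.Conv2d(in_channels=1, out_channels=1, kernel_size=6, stride=4, padding=0),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=3),
        )
        # Flatten (N, 1, 21, 21) to (N, 441) before the classifier
        self.flatten = nn.Flatten()
        # 441 input features -> 14 class logits
        self.fc1 = nn.Linear(1 * 21 * 21, 14)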

    def forward(self, x):
        out = self.layer1(x)
        out = self.flatten(out)
        out = self.fc1(out)
        return out

summary(
    SimpleCNN(), 
    input_size=(1, 1, 256, 256), 
    col_names=[
        "kernel_size", "input_size", "output_size", 
        "num_params", "mult_adds", "trainable"
    ], 
    row_settings=["var_names", "depth"],
)

Layer (type (var_name):depth-idx)        Kernel Shape              Input Shape               Output Shape              Param #                   Mult-Adds                 Trainable
SimpleCNN (SimpleCNN)                    --                        [1, 1, 256, 256]          [1, 14]                   --                        --                        True
├─Sequential (layer1): 1-1               --                        [1, 1, 256, 256]          [1, 1, 21, 21]            --                        --                        True
│    └─Conv2d (0): 2-1                   [6, 6]                    [1, 1, 256, 256]          [1, 1, 63, 63]            37                        146,853                   True
│    └─ReLU (1): 2-2                     --                        [1, 1, 63, 63]            [1, 1, 63, 63]            --                        --                        --
│    └─MaxPool2d (2): 2-3                3                         [1, 1, 63, 63]            [1, 1, 21, 21]          

## Model Version Update and Input Adaptation

The `SimpleCNN` model has been upgraded to **version 1.1**. The primary change is in the input layer, which now expects **3-channel (RGB) images** instead of single-channel (grayscale) images. This update enables the model to process color images, expanding its applicability to a wider variety of image classification tasks.

### Key Details:

- **Input Layer:**  
    - Now configured for 3 input channels (`in_channels=3`), suitable for RGB images.
    - Previous version (1.0) accepted only grayscale images (`in_channels=1`).

- **Architecture:**  
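    - Apart from `in_channels`, the convolutional layer is unchanged (1 output channel, 6x6 kernel, stride 4, no padding), as are the pooling, flatten, and fully connected layers. The convolution's parameter count rises from 37 to 109 because each of the three input channels has its own 6x6 kernel slice.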
    - The model outputs predictions for **14 classes**.

- **Image Preprocessing:**  
    - The transformation pipeline (`trans`) resizes images to 256x256 pixels and converts them to tensors.
    - The current sample image (`img`) is loaded in grayscale mode (`mode='L'`), which is incompatible with the new model expecting RGB input.
    - **Action Required:** To use the updated model, ensure images are loaded in RGB mode:
        ```python
        img = Image.open("path/to/image.jpg").convert('RGB')
        ```

- **Summary:**  
    - This version is better suited for real-world datasets where color information is important.
    - Always verify that your input data matches the model's expected input shape and channel configuration.

> **Note:** If you continue to use grayscale images, you must either convert them to RGB (by duplicating the single channel) or revert the model to accept single-channel input.
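
The channel duplication mentioned in the note can be done at the tensor level. A minimal sketch, assuming the grayscale image has already been converted to a tensor of shape `(N, 1, H, W)`; the random tensor below is a stand-in for real data.

```python
import torch

gray = torch.rand(1, 1, 256, 256)   # dummy grayscale batch: (N, 1, H, W)

# Copy the single channel three times along the channel dimension.
rgb_like = gray.repeat(1, 3, 1, 1)  # (N, 3, H, W)

print(rgb_like.shape)  # torch.Size([1, 3, 256, 256])
```

All three channels carry identical values, so this only satisfies the input shape of version 1.1; it does not add color information.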

In [10]:
class SimpleCNN(nn.Module):
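    def __init__(self):
        super().__init__()
        self.version = '1.1'
        # Convolutional block: (N, 3, 256, 256) -> (N, 1, 63, 63) -> (N, 1, 21, 21)
        self.layer1 = nn.Sequential(
            # Only change from v1.0: in_channels=3 to accept RGB input
            nn.Conv2d(in_channels=3, out_channels=1, kernel_size=6, stride=4, padding=0),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=3),
        )
        self.flatten = nn.Flatten()
        # Still 441 input features, since the convolution outputs 1 channel
        self.fc1 = nn.Linear(1 * 21 * 21, 14)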

    def forward(self, x):
        out = self.layer1(x)
        out = self.flatten(out)
        out = self.fc1(out)
        return out

summary(
    SimpleCNN(), 
    input_size=(1, 3, 256, 256), 
    col_names=[
        "kernel_size", "input_size", "output_size", 
        "num_params", "mult_adds", "trainable"
    ], 
    row_settings=["var_names", "depth"],
)

Layer (type (var_name):depth-idx)        Kernel Shape              Input Shape               Output Shape              Param #                   Mult-Adds                 Trainable
SimpleCNN (SimpleCNN)                    --                        [1, 3, 256, 256]          [1, 14]                   --                        --                        True
├─Sequential (layer1): 1-1               --                        [1, 3, 256, 256]          [1, 1, 21, 21]            --                        --                        True
│    └─Conv2d (0): 2-1                   [6, 6]                    [1, 3, 256, 256]          [1, 1, 63, 63]            109                       432,621                   True
│    └─ReLU (1): 2-2                     --                        [1, 1, 63, 63]            [1, 1, 63, 63]            --                        --                        --
│    └─MaxPool2d (2): 2-3                3                         [1, 1, 63, 63]            [1, 1, 21, 21]          

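## Model Version Update and Output Channel Update

The `SimpleCNN` model has been upgraded to **version 1.2**. It returns to single-channel (grayscale) input and increases the convolutional layer's output from 1 to 10 channels. The points below summarize how the image, the transformation pipeline, and the model fit together.
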
- **Image (`img`)**:  
    - Loaded as a grayscale image (`mode='L'`) with dimensions 256x256 pixels.
    - Suitable for models expecting single-channel input.

- **Transformation Pipeline (`trans`)**:  
    - Resizes images to 256x256 pixels.
    - Converts images to PyTorch tensors, scaling pixel values to [0, 1].
    - Ensures consistent input size and format for the model.

- **Model (`SimpleCNN`)**:  
    - Multiple versions are defined in the notebook:
        - **Version 1.0:** Expects grayscale input (1 channel), outputs 1 channel from the convolutional layer, and uses a fully connected layer with 441 input features.
        - **Version 1.1:** Accepts RGB input (3 channels); the rest of the architecture is identical to version 1.0, including the 441-feature fully connected layer.
        - **Version 1.2 (Current):** Expects grayscale input (1 channel), but the first convolutional layer outputs 10 channels. The fully connected layer is updated to accept 4410 input features (10 × 21 × 21), with 14 output classes.
    - The currently active model (version 1.2) matches the grayscale image input and transformation pipeline.
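
The differences between the three versions listed above can be cross-checked with a little arithmetic. The sketch below counts trainable parameters using the usual formulas for `nn.Conv2d` and `nn.Linear` with bias; the helper names are hypothetical and not part of the notebook.

```python
def conv2d_params(in_ch: int, out_ch: int, kernel: int) -> int:
    """Weights (out_ch * in_ch * k * k) plus one bias per output channel."""
    return out_ch * (in_ch * kernel * kernel + 1)


def linear_params(in_features: int, out_features: int) -> int:
    """Weights (out * in) plus one bias per output feature."""
    return out_features * (in_features + 1)


# version -> (conv in_channels, conv out_channels)
versions = {"1.0": (1, 1), "1.1": (3, 1), "1.2": (1, 10)}

totals = {}
for version, (in_ch, out_ch) in versions.items():
    conv = conv2d_params(in_ch, out_ch, kernel=6)
    fc = linear_params(out_ch * 21 * 21, 14)  # flattened 21x21 maps -> 14 classes
    totals[version] = (conv, fc, conv + fc)
    print(f"v{version}: conv={conv}, fc={fc}, total={conv + fc}")
```

The convolution counts (37, 109, and 370) match the `Param #` column of the `torchinfo` summaries. In every version most of the parameters sit in the fully connected layer, and version 1.2 multiplies that layer's size by ten.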

**Conclusion:**  
The current workflow is consistent: the image is grayscale, the transformation pipeline preserves the single channel, and the active model (`SimpleCNN` v1.2) expects single-channel input and produces logits for 14 classes. Versions 1.0 and 1.1 remain in the notebook for reference, and version 1.1 can be reused if RGB input is needed later.

In [11]:
class SimpleCNN(nn.Module):
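    def __init__(self):
        super().__init__()
        self.version = '1.2'
        # Convolutional block: (N, 1, 256, 256) -> (N, 10, 63, 63) -> (N, 10, 21, 21)
        self.layer1 = nn.Sequential(
            # Change from v1.0: out_channels=10 instead of 1
            nn.Conv2d(in_channels=1, out_channels=10, kernel_size=6, stride=4, padding=0),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=3, stride=3),
        )
        self.flatten = nn.Flatten()
        # 10 feature maps of 21x21 -> 4410 input features -> 14 class logits
        self.fc1 = nn.Linear(10 * 21 * 21, 14)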

    def forward(self, x):
        out = self.layer1(x)
        out = self.flatten(out)
        out = self.fc1(out)
        return out

summary(
    SimpleCNN(), 
    input_size=(1, 1, 256, 256), 
    col_names=[
        "kernel_size", "input_size", "output_size", 
        "num_params", "mult_adds", "trainable"
    ], 
    row_settings=["var_names", "depth"],
)

Layer (type (var_name):depth-idx)        Kernel Shape              Input Shape               Output Shape              Param #                   Mult-Adds                 Trainable
SimpleCNN (SimpleCNN)                    --                        [1, 1, 256, 256]          [1, 14]                   --                        --                        True
├─Sequential (layer1): 1-1               --                        [1, 1, 256, 256]          [1, 10, 21, 21]           --                        --                        True
│    └─Conv2d (0): 2-1                   [6, 6]                    [1, 1, 256, 256]          [1, 10, 63, 63]           370                       1,468,530                 True
│    └─ReLU (1): 2-2                     --                        [1, 10, 63, 63]           [1, 10, 63, 63]           --                        --                        --
│    └─MaxPool2d (2): 2-3                3                         [1, 10, 63, 63]           [1, 10, 21, 21]         