### By Nour Mohamed Raafat, Nada Abdelfatah

# Overview

This script performs experiments with Convolutional Neural Networks (CNNs) on the Fashion MNIST dataset using PyTorch. It trains multiple models with different configurations and evaluates their performance on the test set. Let's break down the key components of the code:

## Data Loading and Preprocessing

The Fashion MNIST dataset is loaded and preprocessed using torchvision transforms. It is then split into batches and loaded into DataLoader objects for training.

## Model Definition

A function `build_cnn()` is defined to construct CNN models with customizable parameters such as the number of input channels, hidden layers, activation functions, pooling methods, optimizer, learning rate, and dropout probability.

## Training and Evaluation

A function `train_and_evaluate()` is defined to train the model and evaluate its performance on the test set. It iterates over the specified number of epochs, computes the loss, and updates the model parameters using the chosen optimizer. After training, it evaluates the model accuracy on the test set.

## Experimentation

The script performs two types of experiments:

1. **Single Model Experiment (TRY NUM : 1):**
   - It defines a specific configuration for a single model.
   - The model is trained and evaluated with the given configuration.

2. **Multiple Model Experiments (Configurations):**
   - It defines multiple configurations, each representing a unique combination of activation functions, pooling methods, optimizers, learning rates, dropout probabilities, and data augmentation.
   - For each configuration, a CNN model is built, trained, and evaluated.

## Professional Overview

This script takes a systematic approach to comparing CNN architectures and hyperparameters for image classification. The code is split into a model-building function and a training/evaluation function, so every configuration runs through the same pipeline. DataLoader objects handle batching and shuffling during training, and PyTorch's built-in modules provide the layers, the loss function, and the optimizers.


# Import necessary libraries

In [None]:
import torch
import torchvision
import torchvision.transforms as transforms
import torch.nn as nn
import torch.optim as optim

# Normalizing Data in Fashion MNIST Dataset

In machine learning, preprocessing data is often crucial for effective model training. In the provided code snippet, the Fashion MNIST dataset is loaded and normalized using PyTorch's torchvision library. Let's break down the steps involved:

## Transform Definition

A transform is a set of operations applied to each sample in the dataset during loading. In this case, the `transforms.Compose` function is used to create a sequence of transformations. Two transformations are applied:

1. **ToTensor**: This transformation converts a PIL image (or NumPy array) into a PyTorch tensor, a multi-dimensional array suitable for processing by neural networks, and scales the pixel values from [0, 255] to [0.0, 1.0].

2. **Normalize**: Normalization is a common preprocessing step in deep learning. With a mean of 0.5 and a standard deviation of 0.5, it computes `(x - 0.5) / 0.5` for every pixel, which maps the [0, 1] range produced by `ToTensor` to [-1, 1]. Centering the inputs around zero helps stabilize training.
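
As a quick check of the arithmetic, `Normalize` applies `(x - mean) / std` to each channel. The sketch below reproduces that formula in plain Python; the helper name `normalize` is illustrative only and is not part of torchvision.

```python
def normalize(x, mean=0.5, std=0.5):
    """Apply the per-channel formula used by transforms.Normalize."""
    return (x - mean) / std

# ToTensor yields values in [0, 1]; check the two extremes and the midpoint.
for pixel in (0.0, 0.5, 1.0):
    print(pixel, "->", normalize(pixel))
# 0.0 -> -1.0
# 0.5 -> 0.0
# 1.0 -> 1.0
```

The normalized pixels therefore lie in [-1, 1] rather than having a mean and standard deviation of 0.5.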

## Loading Fashion MNIST Dataset

The Fashion MNIST dataset is a collection of grayscale images of clothing items belonging to 10 different categories. The `torchvision.datasets.FashionMNIST` class is used to load the dataset. Parameters such as the root directory, whether to download the dataset if not found locally, and the specified transform are provided to the constructor.

## DataLoader Configuration

After loading the dataset, it is split into batches using the `torch.utils.data.DataLoader` class. This class provides an iterable over the dataset, handling batching, shuffling, and multiprocessing for data loading. In this case, each batch contains 4 samples, and the data is shuffled during loading. The `num_workers` parameter specifies the number of subprocesses used for data loading.

With these transformations and the DataLoader in place, the Fashion MNIST training set is ready to be fed to the models in shuffled mini-batches.


In [None]:
# Define transform to normalize the data
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

# Load Fashion MNIST dataset
trainset = torchvision.datasets.FashionMNIST(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=4,
                                          shuffle=True, num_workers=2)

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-images-idx3-ubyte.gz to ./data/FashionMNIST/raw/train-images-idx3-ubyte.gz


100%|██████████| 26421880/26421880 [00:03<00:00, 6699465.36it/s] 


Extracting ./data/FashionMNIST/raw/train-images-idx3-ubyte.gz to ./data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/train-labels-idx1-ubyte.gz to ./data/FashionMNIST/raw/train-labels-idx1-ubyte.gz


100%|██████████| 29515/29515 [00:00<00:00, 112460.75it/s]


Extracting ./data/FashionMNIST/raw/train-labels-idx1-ubyte.gz to ./data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-images-idx3-ubyte.gz to ./data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz


100%|██████████| 4422102/4422102 [00:01<00:00, 2248719.41it/s]


Extracting ./data/FashionMNIST/raw/t10k-images-idx3-ubyte.gz to ./data/FashionMNIST/raw

Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz
Downloading http://fashion-mnist.s3-website.eu-central-1.amazonaws.com/t10k-labels-idx1-ubyte.gz to ./data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz


100%|██████████| 5148/5148 [00:00<00:00, 4586295.03it/s]

Extracting ./data/FashionMNIST/raw/t10k-labels-idx1-ubyte.gz to ./data/FashionMNIST/raw






# CNN Model Function Explanation

The provided code defines a function called `build_cnn()` which is responsible for constructing a Convolutional Neural Network (CNN) model with customizable parameters. Let's go through the details of this function:

## Function Purpose

The purpose of this function is to create a CNN model tailored to specific requirements such as input image properties, architecture of hidden layers, activation functions, pooling methods, optimizer choice, learning rate, and dropout probability.

## Function Arguments

- **in_channels**: Specifies the number of input channels, typically 1 for grayscale images.
- **hidden_layers**: A list of tuples specifying the configuration of hidden convolutional layers. Each tuple contains the number of filters and filter size for the respective layer.
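- **activation**: Name of the activation class applied to hidden layers, such as 'Sigmoid', 'Tanh', or 'ReLU'. The string is looked up with `getattr(nn, activation)`, so it must match the `torch.nn` class name exactly.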
- **pooling**: Pooling function to be used, either 'max' for max pooling or 'avg' for average pooling.
- **optimizer**: Optimizer to use for training the model, including options like 'adam', 'sgd', or 'rmsprop'.
- **lr**: Learning rate, which controls the step size during optimization.
- **dropout_prob**: Probability for dropout regularization, with a default value of 0.

## Model Construction Steps

1. **Input Layer**:
   - Adds a convolutional layer with the specified number of input channels, 32 output channels, kernel size of 3x3, and padding of 1.

2. **Hidden Convolutional Layers**:
   - Iterates through the list of hidden layers, adding convolutional layers with the specified number of filters and kernel size.
   - Applies the specified activation function dynamically using `getattr(nn, activation)()`.
   - Incorporates either max or average pooling based on the provided pooling function.

3. **Dropout Layer**:
   - Adds a dropout layer after the pooling step of each hidden block if the dropout probability is greater than 0.

4. **Flatten Layer**:
   - Flattens the output of convolutional layers to prepare for input to fully connected layers.

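5. **Fully Connected Layers**:
   - Adds a fully connected layer with 64 neurons and applies the specified activation function. Its input size is hardcoded as `prev_channels * 7 * 7`, which matches 28x28 inputs passed through exactly two pooled hidden layers with 3x3 kernels.
   - Adds the output layer with 10 neurons, one per Fashion MNIST class.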
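
The `7 * 7` factor in the first fully connected layer can be checked with the standard output-size formulas for convolution and pooling. The sketch below is plain Python arithmetic, not part of the original script; the helper names `conv_out`, `pool_out`, and `flattened_features` are illustrative.

```python
def conv_out(size, kernel_size, padding=0, stride=1):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel_size) // stride + 1

def pool_out(size, kernel_size=2, stride=2):
    """Spatial output size of a pooling layer along one dimension."""
    return (size - kernel_size) // stride + 1

def flattened_features(image_size, hidden_layers):
    """Return (channels, spatial size, flattened length) for build_cnn's layout."""
    size = conv_out(image_size, 3, padding=1)  # input conv: 3x3 kernel, padding 1
    channels = 32
    for filters, kernel_size in hidden_layers:
        size = conv_out(size, kernel_size, padding=1)  # hidden conv
        size = pool_out(size)                          # 2x2 pooling, stride 2
        channels = filters
    return channels, size, channels * size * size

print(flattened_features(28, [(64, 3), (64, 3)]))  # (64, 7, 3136)
```

With a third `(64, 3)` layer the same helper returns a flattened length of 576, so the hardcoded `7 * 7` would no longer match and the linear layer would raise a shape error.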

## Optimizer Selection

- The function selects the optimizer based on the provided optimizer argument and initializes it with the model parameters and learning rate.

## Error Handling

- Raises a `ValueError` if an invalid pooling function or optimizer is provided.
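
The optimizer selection and its error handling can also be written as a dictionary lookup. This is one possible alternative to the `if`/`elif` chain, not the notebook's own implementation; `OPTIMIZERS` and `make_optimizer` are hypothetical names.

```python
import torch
import torch.nn as nn

# Map the string names accepted by build_cnn to optimizer classes.
OPTIMIZERS = {
    'adam': torch.optim.Adam,
    'sgd': torch.optim.SGD,
    'rmsprop': torch.optim.RMSprop,
}

def make_optimizer(name, parameters, lr):
    """Return an optimizer for `parameters`, or raise ValueError for unknown names."""
    try:
        optimizer_cls = OPTIMIZERS[name]
    except KeyError:
        raise ValueError("Invalid optimizer: {}".format(name)) from None
    return optimizer_cls(parameters, lr=lr)

layer = nn.Linear(4, 2)
opt = make_optimizer('sgd', layer.parameters(), lr=0.01)
print(type(opt).__name__)  # SGD
```

Adding another optimizer then only requires a new dictionary entry.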

This function provides a flexible and customizable way to create CNN models suited to various image classification tasks, allowing researchers and practitioners to experiment with different architectures and configurations.


# Model Building

In [None]:
# Define the CNN model function with configurable arguments
def build_cnn(in_channels, hidden_layers, activation, pooling, optimizer, lr, dropout_prob=0):
  """
  Builds a CNN model with customizable parameters.

  Args:
      in_channels: Number of input channels (usually 1 for grayscale images).
      hidden_layers: List of tuples specifying (number of filters, filter size) for convolutional layers.
      activation: Activation function for hidden layers (e.g., 'Sigmoid', 'Tanh', 'ReLU').
      pooling: Pooling function ('max' or 'avg').
      optimizer: Optimizer (e.g., 'adam', 'sgd', 'rmsprop').
      lr: Learning rate.
      dropout_prob: Probability for dropout regularization (default: 0).

  Returns:
      A tuple (model, optimizer): the PyTorch CNN and the optimizer bound to its parameters.
  """

  layers = []

  # Input layer (adjust based on input image size)
  layers.append(nn.Conv2d(in_channels, 32, kernel_size=3, padding=1))

  # Add hidden convolutional layers
  prev_channels = 32  # Initially, the number of input channels is 32
  for filters, kernel_size in hidden_layers:
    layers.append(nn.Conv2d(prev_channels, filters, kernel_size=kernel_size, padding=1))
    layers.append(getattr(nn, activation)())  # Use getattr for dynamic activation selection

    if pooling == 'max':
      layers.append(nn.MaxPool2d(2, 2))  # Max pooling
    elif pooling == 'avg':
      layers.append(nn.AvgPool2d(2, 2))  # Average pooling
    else:
      raise ValueError("Invalid pooling function: {}".format(pooling))

    # Add dropout layer if specified
    if dropout_prob > 0:
      layers.append(nn.Dropout(dropout_prob))

    prev_channels = filters  # Update the number of input channels for the next layer

  # Flatten for fully connected layers
  layers.append(nn.Flatten())

  # Fully connected layers. The 7 * 7 spatial size assumes 28x28 inputs and
  # exactly two hidden layers (two 2x2 poolings) with 3x3 kernels.
  layers.append(nn.Linear(prev_channels * 7 * 7, 64))
  layers.append(getattr(nn, activation)())  # Apply activation to fully connected layers

  layers.append(nn.Linear(64, 10))  # Output layer for 10 fashion classes

  model = nn.Sequential(*layers)

  # Select optimizer based on argument
  if optimizer == 'adam':
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
  elif optimizer == 'sgd':
    optimizer = torch.optim.SGD(model.parameters(), lr=lr)
  elif optimizer == 'rmsprop':
    optimizer = torch.optim.RMSprop(model.parameters(), lr=lr)
  else:
    raise ValueError("Invalid optimizer: {}".format(optimizer))

  return model, optimizer

# Training and Evaluation Function Explanation

The provided code defines a function called `train_and_evaluate()` which serves the purpose of training a given CNN model and evaluating its performance on a test dataset. Let's break down the details of this function:

## Function Purpose

This function is responsible for training a CNN model using the provided optimizer and DataLoader for training data. After training, it evaluates the trained model's accuracy on a separate test dataset.

## Function Arguments

- **model**: The CNN model to be trained and evaluated.
- **optimizer**: The optimizer used for training the model.
- **train_loader**: DataLoader containing the training data.
- **test_loader**: DataLoader containing the test data.
- **epochs**: Number of training epochs.

## Training Loop

- The function iterates through each epoch, during which it trains the model using the provided optimizer and training data.
- Within each epoch, it iterates through batches of data from the training DataLoader (`train_loader`).
- For each batch, it performs a forward pass through the model, calculates the loss using the specified loss function (`nn.CrossEntropyLoss()`), computes gradients, and updates the model parameters using the optimizer.
- It also tracks the running loss for each epoch and prints the average loss every 2000 batches.
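
The logging cadence explains why the training output below stops at batch 14000 in every epoch. With 60,000 training images and a batch size of 4 there are 15,000 batches per epoch, and the loop prints only when the zero-based index satisfies `i % 2000 == 1999`. A small plain-Python sketch of that arithmetic (the helper name `logged_batches` is illustrative):

```python
import math

def logged_batches(num_samples, batch_size, log_every=2000):
    """Return the number of batches and the 1-based batch counts that get printed."""
    batches = math.ceil(num_samples / batch_size)
    # The training loop prints when i % log_every == log_every - 1 (i is zero-based).
    printed = [i + 1 for i in range(batches) if i % log_every == log_every - 1]
    return batches, printed

batches, printed = logged_batches(60000, 4)
print(batches)  # 15000
print(printed)  # [2000, 4000, 6000, 8000, 10000, 12000, 14000]
```

The loss of the last 1,000 batches of each epoch is accumulated but never reported, because `running_loss` is reset at the start of the next epoch.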

## Evaluation

- After completing training for all epochs, the function evaluates the model's performance on the test dataset.
- It switches the model to evaluation mode using `model.eval()`, which disables dropout and makes any batch normalization layers use their running statistics.
- Then, it iterates through batches of data from the test DataLoader (`test_loader`).
- For each batch, it performs a forward pass through the model to obtain predictions.
- It compares the predicted labels with the ground truth labels and calculates the total number of correct predictions.
- Finally, it computes the accuracy of the model on the test dataset and prints the result.
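
The effect of evaluation mode on dropout can be observed on a standalone `nn.Dropout` layer. The sketch below is independent of the notebook's model and only illustrates the train/eval difference.

```python
import torch
import torch.nn as nn

drop = nn.Dropout(p=0.5)
x = torch.ones(1000)

drop.train()
y_train = drop(x)  # each element is either 0 or scaled to 1 / (1 - p) = 2

drop.eval()
y_eval = drop(x)   # dropout is the identity in evaluation mode

print(sorted(set(y_train.tolist())))
print(torch.equal(y_eval, x))
```

In training mode the surviving activations are rescaled so their expected value is unchanged, which is why no extra scaling is needed at evaluation time.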

## Return Value

The function returns the accuracy (as a percentage) of the model after the final epoch, measured once on `test_loader`.
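
The accuracy value is derived from the predicted class indices. The sketch below repeats that computation on a toy batch; the logits and labels are made-up values used only for illustration.

```python
import torch

# Toy batch: 4 samples, 3 classes (made-up logits and labels).
logits = torch.tensor([[2.0, 0.1, -1.0],
                       [0.3, 0.2, 1.5],
                       [0.0, 3.0, 0.5],
                       [1.0, 0.9, 0.8]])
labels = torch.tensor([0, 2, 1, 2])

# torch.max along dim 1 returns (values, indices); the indices are the predictions.
_, predicted = torch.max(logits, 1)
correct = (predicted == labels).sum().item()
total = labels.size(0)
accuracy = 100 * correct / total

print(predicted.tolist())  # [0, 2, 1, 0]
print(accuracy)            # 75.0
```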

This function encapsulates the process of training and evaluating a CNN model, allowing users to easily monitor training progress and assess model accuracy on unseen data.


In [None]:
# Define training and evaluation function
def train_and_evaluate(model, optimizer, train_loader, test_loader, epochs):
  """
  Trains the model and evaluates performance on the test set.

  Args:
      model: The CNN model to train.
      optimizer: The optimizer to use.
      train_loader: DataLoader for training data.
      test_loader: DataLoader for test data.
      epochs: Number of training epochs.

  Returns:
      The accuracy (in percent) on test_loader after the final epoch.
  """

  criterion = nn.CrossEntropyLoss()  # Loss function

  for epoch in range(epochs):
    model.train()
    running_loss = 0.0
    for i, data in enumerate(train_loader, 0):
      inputs, labels = data
      optimizer.zero_grad()
      outputs = model(inputs)
      loss = criterion(outputs, labels)
      loss.backward()
      optimizer.step()

      running_loss += loss.item()
      if i % 2000 == 1999:
        print('[%d, %5d] loss: %.3f' % (epoch + 1, i + 1, running_loss / 2000))
        running_loss = 0.0

  print('Finished Training')

  # Evaluation on test set
  model.eval()
  correct = 0
  total = 0
  with torch.no_grad():
    for data in test_loader:
      images, labels = data
      outputs = model(images)
      _, predicted = torch.max(outputs.data, 1)
      total += labels.size(0)
      correct += (predicted == labels).sum().item()

  print('Accuracy of the network on the test images: %d %%' % (100 * correct / total))
  return 100 * correct / total

# TRY NUM : 1

Args:

* in_channels: 1
* hidden_layers: [(64, 3), (64, 3)]
* activation: 'ReLU'
* pooling: 'max'
* optimizer: 'adam'
* lr: 0.001
* dropout_prob: 0.5

In [None]:
# Define parameters for CNN model
in_channels = 1
hidden_layers = [(64, 3), (64, 3)]  # Two hidden conv layers: 64 filters, 3x3 kernels
activation = 'ReLU'  # Must match the torch.nn class name exactly
pooling = 'max'
optimizer = 'adam'
lr = 0.001
dropout_prob = 0.5
epochs = 5

# Build the CNN model
model, optimizer = build_cnn(in_channels, hidden_layers, activation, pooling, optimizer, lr, dropout_prob)

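# Train and evaluate the model
# NOTE: trainloader is also passed as test_loader, so the reported accuracy is
# measured on the training images rather than on held-out data.
test_accuracy = train_and_evaluate(model, optimizer, trainloader, trainloader, epochs)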

[1,  2000] loss: 0.763
[1,  4000] loss: 0.558
[1,  6000] loss: 0.505
[1,  8000] loss: 0.476
[1, 10000] loss: 0.476
[1, 12000] loss: 0.451
[1, 14000] loss: 0.447


[2,  2000] loss: 0.426
[2,  4000] loss: 0.426
[2,  6000] loss: 0.407
[2,  8000] loss: 0.411
[2, 10000] loss: 0.409
[2, 12000] loss: 0.413
[2, 14000] loss: 0.415
[3,  2000] loss: 0.399
[3,  4000] loss: 0.390
[3,  6000] loss: 0.399
[3,  8000] loss: 0.401
[3, 10000] loss: 0.364
[3, 12000] loss: 0.396
[3, 14000] loss: 0.407
[4,  2000] loss: 0.375
[4,  4000] loss: 0.375
[4,  6000] loss: 0.385
[4,  8000] loss: 0.378
[4, 10000] loss: 0.375
[4, 12000] loss: 0.406
[4, 14000] loss: 0.358
[5,  2000] loss: 0.380
[5,  4000] loss: 0.381
[5,  6000] loss: 0.354
[5,  8000] loss: 0.362
[5, 10000] loss: 0.373
[5, 12000] loss: 0.399
[5, 14000] loss: 0.371
Finished Training
Accuracy of the network on the test images: 89 %


# More Architectures
## Parameters for Base CNN Model
- Input Channels: 1 (grayscale images)
- Hidden Layers: Two convolutional layers with 64 filters each and kernel size of 3x3
- Epochs: 5

## Configurations to Try
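We define eight configurations, each varying the activation, pooling, optimizer, learning rate, dropout probability, and data augmentation.

**Configuration 1:**
   - Activation: Sigmoid
   - Pooling: Max
   - Optimizer: Adam
   - Learning Rate: 0.001
   - Dropout Probability: 0.5
   - Data Augmentation: Enabled
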
**Configuration 2:**
   - Activation: Tanh
   - Pooling: Max
   - Optimizer: Adam
   - Learning Rate: 0.001
   - Dropout Probability: 0.5
   - Data Augmentation: Disabled

**Configuration 3:**
   - Activation: ReLU
   - Pooling: Average
   - Optimizer: Adam
   - Learning Rate: 0.01
   - Dropout Probability: 0
   - Data Augmentation: Enabled

**Configuration 4:**
   - Activation: ReLU
   - Pooling: Average
   - Optimizer: Adam
   - Learning Rate: 0.01
   - Dropout Probability: 0.5
   - Data Augmentation: Disabled

**Configuration 5:**
   - Activation: ReLU
   - Pooling: Max
   - Optimizer: SGD
   - Learning Rate: 0.01
   - Dropout Probability: 0
   - Data Augmentation: Enabled

**Configuration 6:**
   - Activation: ReLU
   - Pooling: Max
   - Optimizer: SGD
   - Learning Rate: 0.01
   - Dropout Probability: 0.5
   - Data Augmentation: Enabled

**Configuration 7:**
   - Activation: Sigmoid
   - Pooling: Max
   - Optimizer: SGD
   - Learning Rate: 0.05
   - Dropout Probability: 0.5
   - Data Augmentation: Enabled

**Configuration 8:**
   - Activation: Sigmoid
   - Pooling: Max
   - Optimizer: SGD
   - Learning Rate: 0.05
   - Dropout Probability: 0
   - Data Augmentation: Enabled




In [None]:
# Define parameters for CNN model
in_channels = 1
hidden_layers = [(64, 3), (64, 3)]  # Example hidden layers
epochs = 5

In [None]:


# Define different configurations to try
configurations = [
    {'activation': 'Sigmoid', 'pooling': 'max', 'optimizer': 'adam', 'lr': 0.001, 'dropout_prob': 0.5, 'augmentation': True},
    {'activation': 'Tanh', 'pooling': 'max', 'optimizer': 'adam', 'lr': 0.001, 'dropout_prob': 0.5, 'augmentation': False},
    {'activation': 'ReLU', 'pooling': 'avg', 'optimizer': 'adam', 'lr': 0.01, 'dropout_prob': 0, 'augmentation': True},
    {'activation': 'ReLU', 'pooling': 'avg', 'optimizer': 'adam', 'lr': 0.01, 'dropout_prob': 0.5, 'augmentation': False},
    {'activation': 'ReLU', 'pooling': 'max', 'optimizer': 'sgd', 'lr': 0.01, 'dropout_prob': 0, 'augmentation': True},
    {'activation': 'ReLU', 'pooling': 'max', 'optimizer': 'sgd', 'lr': 0.01, 'dropout_prob': 0.5, 'augmentation': True},
]

# Iterate over configurations and train models
for idx, config in enumerate(configurations, start=1):
    print(f"Training Model {idx}")

    activation = config['activation']
    pooling = config['pooling']
    optimizer = config['optimizer']
    lr = config['lr']
    dropout_prob = config['dropout_prob']
    augmentation = config['augmentation']

    # Build the CNN model
    model, optimizer = build_cnn(in_channels, hidden_layers, activation, pooling, optimizer, lr, dropout_prob)

    # Data augmentation
    if augmentation:
        trainset_augmented = torchvision.datasets.FashionMNIST(root='./data', train=True, download=True,
                                                               transform=transforms.Compose([
                                                                   transforms.RandomHorizontalFlip(),
                                                                   transforms.RandomRotation(10),
                                                                   transforms.ToTensor(),
                                                                   transforms.Normalize((0.5,), (0.5,))
                                                               ]))
        trainloader_augmented = torch.utils.data.DataLoader(trainset_augmented, batch_size=4, shuffle=True, num_workers=2)
        trainloader_used = trainloader_augmented
    else:
        trainloader_used = trainloader

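    # Train and evaluate the model
    # NOTE: the non-augmented trainloader is passed as test_loader, so the
    # reported accuracy is measured on training images, not held-out data.
    test_accuracy = train_and_evaluate(model, optimizer, trainloader_used, trainloader, epochs)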


Training Model 1


[1,  2000] loss: 1.092
[1,  4000] loss: 0.727
[1,  6000] loss: 0.632
[1,  8000] loss: 0.575
[1, 10000] loss: 0.562
[1, 12000] loss: 0.545
[1, 14000] loss: 0.507


[2,  2000] loss: 0.493
[2,  4000] loss: 0.483
[2,  6000] loss: 0.465
[2,  8000] loss: 0.469
[2, 10000] loss: 0.484
[2, 12000] loss: 0.454
[2, 14000] loss: 0.453
[3,  2000] loss: 0.441
[3,  4000] loss: 0.440
[3,  6000] loss: 0.432
[3,  8000] loss: 0.443
[3, 10000] loss: 0.431
[3, 12000] loss: 0.437
[3, 14000] loss: 0.427
[4,  2000] loss: 0.413
[4,  4000] loss: 0.409
[4,  6000] loss: 0.425
[4,  8000] loss: 0.409
[4, 10000] loss: 0.397
[4, 12000] loss: 0.411
[4, 14000] loss: 0.422
[5,  2000] loss: 0.392
[5,  4000] loss: 0.405
[5,  6000] loss: 0.391
[5,  8000] loss: 0.416
[5, 10000] loss: 0.383
[5, 12000] loss: 0.397
[5, 14000] loss: 0.403
Finished Training
Accuracy of the network on the test images: 88 %
Training Model 2
[1,  2000] loss: 0.674
[1,  4000] loss: 0.544
[1,  6000] loss: 0.531
[1,  8000] loss: 0.519
[1, 10000] loss: 0.519
[1, 12000] loss: 0.506
[1, 14000] loss: 0.492
[2,  2000] loss: 0.482
[2,  4000] loss: 0.502
[2,  6000] loss: 0.498
[2,  8000] loss: 0.493
[2, 10000] loss: 0.

In [None]:
# Define different configurations to try
configurations = [
    {'activation': 'ReLU', 'pooling': 'max', 'optimizer': 'sgd', 'lr': 0.01, 'dropout_prob': 0.5, 'augmentation': True},
    {'activation': 'Sigmoid', 'pooling': 'max', 'optimizer': 'sgd', 'lr': 0.05, 'dropout_prob': 0.5, 'augmentation': True}
]

# Iterate over configurations and train models
for idx, config in enumerate(configurations, start=1):
    print(f"Training Model {idx}")

    activation = config['activation']
    pooling = config['pooling']
    optimizer = config['optimizer']
    lr = config['lr']
    dropout_prob = config['dropout_prob']
    augmentation = config['augmentation']

    # Build the CNN model
    model, optimizer = build_cnn(in_channels, hidden_layers, activation, pooling, optimizer, lr, dropout_prob)

    # Data augmentation
    if augmentation:
        trainset_augmented = torchvision.datasets.FashionMNIST(root='./data', train=True, download=True,
                                                               transform=transforms.Compose([
                                                                   transforms.RandomHorizontalFlip(),
                                                                   transforms.RandomRotation(10),
                                                                   transforms.ToTensor(),
                                                                   transforms.Normalize((0.5,), (0.5,))
                                                               ]))
        trainloader_augmented = torch.utils.data.DataLoader(trainset_augmented, batch_size=4, shuffle=True, num_workers=2)
        trainloader_used = trainloader_augmented
    else:
        trainloader_used = trainloader

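    # Train and evaluate the model
    # NOTE: the non-augmented trainloader is passed as test_loader, so the
    # reported accuracy is measured on training images, not held-out data.
    test_accuracy = train_and_evaluate(model, optimizer, trainloader_used, trainloader, epochs)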



Training Model 1


[1,  2000] loss: 1.039
[1,  4000] loss: 0.679
[1,  6000] loss: 0.600
[1,  8000] loss: 0.541
[1, 10000] loss: 0.530
[1, 12000] loss: 0.501
[1, 14000] loss: 0.482


[2,  2000] loss: 0.461
[2,  4000] loss: 0.470
[2,  6000] loss: 0.445
[2,  8000] loss: 0.427
[2, 10000] loss: 0.438
[2, 12000] loss: 0.435
[2, 14000] loss: 0.405
[3,  2000] loss: 0.410
[3,  4000] loss: 0.406
[3,  6000] loss: 0.406
[3,  8000] loss: 0.415
[3, 10000] loss: 0.397
[3, 12000] loss: 0.392
[3, 14000] loss: 0.382
[4,  2000] loss: 0.385
[4,  4000] loss: 0.384
[4,  6000] loss: 0.386
[4,  8000] loss: 0.381
[4, 10000] loss: 0.379
[4, 12000] loss: 0.371
[4, 14000] loss: 0.366
[5,  2000] loss: 0.360
[5,  4000] loss: 0.358
[5,  6000] loss: 0.363
[5,  8000] loss: 0.379
[5, 10000] loss: 0.370
[5, 12000] loss: 0.365
[5, 14000] loss: 0.355
Finished Training
Accuracy of the network on the test images: 89 %
Training Model 2
[1,  2000] loss: 2.309
[1,  4000] loss: 2.306
[1,  6000] loss: 2.306
[1,  8000] loss: 2.306
[1, 10000] loss: 2.305
[1, 12000] loss: 2.185
[1, 14000] loss: 1.025
[2,  2000] loss: 0.699
[2,  4000] loss: 0.650
[2,  6000] loss: 0.632
[2,  8000] loss: 0.613
[2, 10000] loss: 0.

In [None]:
# Define different configurations to try
configurations = [

    {'activation': 'Sigmoid', 'pooling': 'max', 'optimizer': 'sgd', 'lr': 0.05, 'dropout_prob': 0, 'augmentation': True}
]

# Iterate over configurations and train models
for idx, config in enumerate(configurations, start=1):
    print(f"Training Model {idx}")

    activation = config['activation']
    pooling = config['pooling']
    optimizer = config['optimizer']
    lr = config['lr']
    dropout_prob = config['dropout_prob']
    augmentation = config['augmentation']

    # Build the CNN model
    model, optimizer = build_cnn(in_channels, hidden_layers, activation, pooling, optimizer, lr, dropout_prob)

    # Data augmentation
    if augmentation:
        trainset_augmented = torchvision.datasets.FashionMNIST(root='./data', train=True, download=True,
                                                               transform=transforms.Compose([
                                                                   transforms.RandomHorizontalFlip(),
                                                                   transforms.RandomRotation(10),
                                                                   transforms.ToTensor(),
                                                                   transforms.Normalize((0.5,), (0.5,))
                                                               ]))
        trainloader_augmented = torch.utils.data.DataLoader(trainset_augmented, batch_size=4, shuffle=True, num_workers=2)
        trainloader_used = trainloader_augmented
    else:
        trainloader_used = trainloader

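    # Train and evaluate the model (7 epochs here instead of the 5 used above)
    # NOTE: the non-augmented trainloader is passed as test_loader, so the
    # reported accuracy is measured on training images, not held-out data.
    test_accuracy = train_and_evaluate(model, optimizer, trainloader_used, trainloader, 7)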



Training Model 1


[1,  2000] loss: 2.309
[1,  4000] loss: 2.305
[1,  6000] loss: 2.306
[1,  8000] loss: 2.305
[1, 10000] loss: 2.305
[1, 12000] loss: 1.678
[1, 14000] loss: 0.822


[2,  2000] loss: 0.640
[2,  4000] loss: 0.601
[2,  6000] loss: 0.563
[2,  8000] loss: 0.524
[2, 10000] loss: 0.527
[2, 12000] loss: 0.506
[2, 14000] loss: 0.472
[3,  2000] loss: 0.473
[3,  4000] loss: 0.438
[3,  6000] loss: 0.428
[3,  8000] loss: 0.435
[3, 10000] loss: 0.406
[3, 12000] loss: 0.403
[3, 14000] loss: 0.387
[4,  2000] loss: 0.380
[4,  4000] loss: 0.369
[4,  6000] loss: 0.351
[4,  8000] loss: 0.382
[4, 10000] loss: 0.365
[4, 12000] loss: 0.369
[4, 14000] loss: 0.344
[5,  2000] loss: 0.344
[5,  4000] loss: 0.333
[5,  6000] loss: 0.341
[5,  8000] loss: 0.333
[5, 10000] loss: 0.328
[5, 12000] loss: 0.332
[5, 14000] loss: 0.321
[6,  2000] loss: 0.309
[6,  4000] loss: 0.309
[6,  6000] loss: 0.312
[6,  8000] loss: 0.311
[6, 10000] loss: 0.304
[6, 12000] loss: 0.298
[6, 14000] loss: 0.298
[7,  2000] loss: 0.296
[7,  4000] loss: 0.288
[7,  6000] loss: 0.287
[7,  8000] loss: 0.282
[7, 10000] loss: 0.298
[7, 12000] loss: 0.281
[7, 14000] loss: 0.275
Finished Training
Accuracy of the 

# Result Summary

| Activation | Optimizer | Pooling | LR       | Dropout | Augmentation | Accuracy |
|------------|-----------|---------|----------|---------|--------------|----------|
| Sigmoid    | Adam      | Max     | 0.001    | 0.5     | ✅           | 88%      |
| Tanh       | Adam      | Max     | 0.001    | 0.5     | ❎           | 85%      |
| ReLU       | Adam      | Avg     | 0.01     | 0       | ✅           | 10%      |
| ReLU       | Adam      | Avg     | 0.01     | 0.5     | ❎           | 10%      |
| **ReLU**   | **SGD**   | **Max** | **0.01** | **0**   | ✅           | **92%**  |
| ReLU       | SGD       | Max     | 0.01     | 0.5     | ✅           | 89%      |
| Sigmoid    | SGD       | Max     | 0.05     | 0.5     | ✅           | 87%      |
| Sigmoid    | SGD       | Max     | 0.05     | 0       | ✅           | 90%      |
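
To compare the runs programmatically, the table rows can be held in a list and ranked by accuracy. The sketch below copies the values from the table above; the tuple layout and variable names are illustrative and not part of the original notebook.

```python
# Rows copied from the result summary table:
# (activation, optimizer, pooling, lr, dropout, augmentation, accuracy in %)
rows = [
    ("Sigmoid", "Adam", "Max", 0.001, 0.5, True, 88),
    ("Tanh", "Adam", "Max", 0.001, 0.5, False, 85),
    ("ReLU", "Adam", "Avg", 0.01, 0.0, True, 10),
    ("ReLU", "Adam", "Avg", 0.01, 0.5, False, 10),
    ("ReLU", "SGD", "Max", 0.01, 0.0, True, 92),
    ("ReLU", "SGD", "Max", 0.01, 0.5, True, 89),
    ("Sigmoid", "SGD", "Max", 0.05, 0.5, True, 87),
    ("Sigmoid", "SGD", "Max", 0.05, 0.0, True, 90),
]

# Rank configurations from highest to lowest reported accuracy.
ranked = sorted(rows, key=lambda row: row[-1], reverse=True)
best = ranked[0]

for activation, optimizer, pooling, lr, dropout, augmented, accuracy in ranked:
    print(f"{accuracy:3d}%  {activation:<8}{optimizer:<6}{pooling:<5}"
          f"lr={lr:<6} dropout={dropout} augmentation={augmented}")
```

The top entry is the ReLU, SGD, max-pooling run with a learning rate of 0.01 and no dropout, matching the highlighted row.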
