### Question 1: What is the role of filters and feature maps in Convolutional Neural Network (CNN)?

**Answer:**
*   **Filters (Kernels):** Small matrices of numbers that slide across the input data (e.g., image) to detect specific features (edges, textures, patterns). Each filter specializes in detecting a different feature.
*   **Feature Maps (Activation Maps):** The output of applying a filter to an input. Each value in a feature map indicates the presence and strength of the detected feature at different locations in the input. Multiple feature maps are generated by different filters to form a comprehensive representation of the input's features.

### Question 2: Explain the concepts of padding and stride in CNNs (Convolutional Neural Network). How do they affect the output dimensions of feature maps?

**Answer:**
*   **Padding:** Adding extra rows and columns (usually zeros) around the border of the input image. It's used to preserve spatial information at the edges and to control the output size, preventing reduction in dimensions. 'Same' padding maintains output size, while 'valid' (no) padding reduces it.
*   **Stride:** The number of pixels a filter shifts over the input matrix at each step. A larger stride reduces the spatial dimensions of the output feature map, effectively downsampling the input.

**Effect on Output Dimensions:**
Output Size = `[(W - F + 2P) / S] + 1`
Where:
*   `W` = Input size (width or height)
*   `F` = Filter size
*   `P` = Padding amount
*   `S` = Stride

### Question 3: Define receptive field in the context of CNNs. Why is it important for deep architectures?

**Answer:**
*   **Receptive Field:** The region in the input image that a particular neuron in a feature map is 'looking at' or influenced by. As layers deepen in a CNN, the receptive field of neurons in subsequent layers grows, meaning they incorporate information from a larger area of the original input.
*   **Importance for Deep Architectures:** It allows deeper layers to capture more abstract and complex features by integrating information from a wider spatial context. A larger receptive field enables the model to understand global patterns and relationships in the input, which is crucial for tasks requiring high-level understanding like object recognition and scene interpretation.

### Question 4: Discuss how filter size and stride influence the number of parameters in a CNN.

**Answer:**
*   **Filter Size:** A larger filter size directly increases the number of parameters within that specific filter. For example, a 5x5 filter has 25 parameters, while a 3x3 filter has 9. The total number of parameters also depends on the number of input channels and output filters.
*   **Stride:** Stride does **not** directly influence the number of parameters within the filters themselves. However, a larger stride reduces the spatial dimensions of the output feature maps. This reduction in feature map size means that subsequent layers will process smaller inputs, potentially leading to a smaller number of parameters in those subsequent layers if they are fully connected or have fewer filters due to the reduced input dimensions.

### Question 5: Compare and contrast different CNN-based architectures like LeNet, AlexNet, and VGG in terms of depth, filter sizes, and performance.

**Answer:**
| Feature           | LeNet-5                          | AlexNet                           | VGG-16/19                               |
| :---------------- | :------------------------------- | :-------------------------------- | :-------------------------------------- |
| **Depth**         | Shallow (5 conv, 2 FC layers)    | Medium (5 conv, 3 FC layers)      | Very Deep (16 or 19 conv, 3 FC layers)  |
| **Filter Sizes**  | 5x5 (larger initial filters)     | Primarily 11x11, 5x5, 3x3 (larger initial filters) | Exclusively 3x3 (small, uniform filters) |
| **Performance**   | Good for small, simple images (e.g., MNIST) | Significant breakthrough on ImageNet, outperformed LeNet | State-of-the-art on ImageNet, very strong performance, deeper but computationally intensive |
| **Key Innovation**| First successful CNN, local receptive fields, shared weights, pooling | ReLU activation, GPU training, overlapping pooling, dropout, data augmentation | Uniform 3x3 filters, increased depth, simplicity of architecture |
| **Parameters**    | ~60k                             | ~60M                              | ~138M (VGG-16), ~144M (VGG-19)          |


#Question 6 (Keras MNIST CNN): Using Keras, build and train a simple CNN model on the MNIST dataset from scratch. This includes code for model creation, compilation, training, and evaluation.

In [29]:
import tensorflow as tf
from tensorflow.keras import layers, models
import numpy as np

# 1. Load the MNIST dataset
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
print(f"Original x_train shape: {x_train.shape}")
print(f"Original y_train shape: {y_train.shape}")
print(f"Original x_test shape: {x_test.shape}")
print(f"Original y_test shape: {y_test.shape}")

# 2. Preprocess the data
# a. Reshape the input images to include a channel dimension (e.g., (28, 28, 1))
x_train = x_train.reshape((x_train.shape[0], 28, 28, 1))
x_test = x_test.reshape((x_test.shape[0], 28, 28, 1))

# b. Normalize the pixel values to a range of 0 to 1
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

# c. Convert the labels to one-hot encoded vectors
y_train = tf.keras.utils.to_categorical(y_train)
y_test = tf.keras.utils.to_categorical(y_test)

print(f"Reshaped and normalized x_train shape: {x_train.shape}")
print(f"One-hot encoded y_train shape: {y_train.shape}")
print(f"Reshaped and normalized x_test shape: {x_test.shape}")
print(f"One-hot encoded y_test shape: {y_test.shape}")

Original x_train shape: (60000, 28, 28)
Original y_train shape: (60000,)
Original x_test shape: (10000, 28, 28)
Original y_test shape: (10000,)
Reshaped and normalized x_train shape: (60000, 28, 28, 1)
One-hot encoded y_train shape: (60000, 10)
Reshaped and normalized x_test shape: (10000, 28, 28, 1)
One-hot encoded y_test shape: (10000, 10)


**Reasoning**:
Now that the data is preprocessed, the next step is to define the CNN model architecture as specified in the instructions. This involves stacking several convolutional, pooling, flatten, and dense layers using `tf.keras.Sequential`.



In [30]:
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')
])

print("CNN model architecture defined successfully.")
model.summary()

CNN model architecture defined successfully.


  super().__init__(activity_regularizer=activity_regularizer, **kwargs)


In [31]:
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

print("CNN model compiled successfully.")

CNN model compiled successfully.


**Reasoning**:
With the model compiled, the next step is to train it using the preprocessed training data and labels, as specified in the instructions. I will use 5 epochs and a batch size of 64.



In [32]:
history = model.fit(x_train, y_train, epochs=5, batch_size=64, validation_split=0.1)

print("CNN model training completed.")

Epoch 1/5
[1m844/844[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m8s[0m 6ms/step - accuracy: 0.8749 - loss: 0.4231 - val_accuracy: 0.9792 - val_loss: 0.0687
Epoch 2/5
[1m844/844[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 3ms/step - accuracy: 0.9820 - loss: 0.0568 - val_accuracy: 0.9860 - val_loss: 0.0446
Epoch 3/5
[1m844/844[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 4ms/step - accuracy: 0.9877 - loss: 0.0396 - val_accuracy: 0.9898 - val_loss: 0.0339
Epoch 4/5
[1m844/844[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 3ms/step - accuracy: 0.9911 - loss: 0.0275 - val_accuracy: 0.9873 - val_loss: 0.0438
Epoch 5/5
[1m844/844[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 3ms/step - accuracy: 0.9930 - loss: 0.0226 - val_accuracy: 0.9920 - val_loss: 0.0305
CNN model training completed.


**Reasoning**:
The model has been trained, and now I need to evaluate its performance on the test data as specified in the instructions. This will involve using the `model.evaluate()` method with the preprocessed test data and labels.



In [33]:
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=2)

print(f"\nTest accuracy: {test_acc}")
print(f"Test loss: {test_loss}")

313/313 - 2s - 5ms/step - accuracy: 0.9920 - loss: 0.0260

Test accuracy: 0.9919999837875366
Test loss: 0.02601625584065914



#Question 7 : Load and preprocess the CIFAR-10 dataset using Keras, and create a CNN model to classify RGB images. Show the preprocessing steps and the model architecture.


In [34]:
import tensorflow as tf
from tensorflow.keras import layers, models
import numpy as np

# 1. Load the CIFAR-10 dataset
(x_train_cifar, y_train_cifar), (x_test_cifar, y_test_cifar) = tf.keras.datasets.cifar10.load_data()
print(f"Original x_train_cifar shape: {x_train_cifar.shape}")
print(f"Original y_train_cifar shape: {y_train_cifar.shape}")
print(f"Original x_test_cifar shape: {x_test_cifar.shape}")
print(f"Original y_test_cifar shape: {y_test_cifar.shape}")

# 2. Normalize the pixel values
x_train_cifar = x_train_cifar.astype('float32') / 255.0
x_test_cifar = x_test_cifar.astype('float32') / 255.0

# 3. Convert labels to one-hot encoded vectors
y_train_cifar = tf.keras.utils.to_categorical(y_train_cifar, 10)
y_test_cifar = tf.keras.utils.to_categorical(y_test_cifar, 10)

print(f"Normalized x_train_cifar shape: {x_train_cifar.shape}")
print(f"One-hot encoded y_train_cifar shape: {y_train_cifar.shape}")
print(f"Normalized x_test_cifar shape: {x_test_cifar.shape}")
print(f"One-hot encoded y_test_cifar shape: {y_test_cifar.shape}")

# 4. Define a simple CNN model architecture
cifar_model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax') # 10 classes for CIFAR-10
])

# 5. Print the model summary
print("\nCIFAR-10 CNN model architecture defined successfully.")
cifar_model.summary()

Original x_train_cifar shape: (50000, 32, 32, 3)
Original y_train_cifar shape: (50000, 1)
Original x_test_cifar shape: (10000, 32, 32, 3)
Original y_test_cifar shape: (10000, 1)
Normalized x_train_cifar shape: (50000, 32, 32, 3)
One-hot encoded y_train_cifar shape: (50000, 10)
Normalized x_test_cifar shape: (10000, 32, 32, 3)
One-hot encoded y_test_cifar shape: (10000, 10)

CIFAR-10 CNN model architecture defined successfully.


  super().__init__(activity_regularizer=activity_regularizer, **kwargs)


In [35]:
cifar_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

print("CIFAR-10 CNN model compiled successfully.")

history_cifar = cifar_model.fit(x_train_cifar, y_train_cifar, epochs=10, batch_size=64, validation_split=0.1)

print("CIFAR-10 CNN model training completed.")

CIFAR-10 CNN model compiled successfully.
Epoch 1/10
[1m704/704[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m9s[0m 8ms/step - accuracy: 0.3289 - loss: 1.8187 - val_accuracy: 0.5012 - val_loss: 1.4012
Epoch 2/10
[1m704/704[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 4ms/step - accuracy: 0.5503 - loss: 1.2535 - val_accuracy: 0.6064 - val_loss: 1.1190
Epoch 3/10
[1m704/704[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 4ms/step - accuracy: 0.6125 - loss: 1.0973 - val_accuracy: 0.6342 - val_loss: 1.0184
Epoch 4/10
[1m704/704[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 4ms/step - accuracy: 0.6477 - loss: 0.9983 - val_accuracy: 0.6672 - val_loss: 0.9518
Epoch 5/10
[1m704/704[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 4ms/step - accuracy: 0.6775 - loss: 0.9157 - val_accuracy: 0.6684 - val_loss: 0.9431
Epoch 6/10
[1m704/704[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 4ms/step - accuracy: 0.7005 - loss: 0.8452 - val_accuracy: 0.6778 - val

In [36]:
test_loss_cifar, test_acc_cifar = cifar_model.evaluate(x_test_cifar, y_test_cifar, verbose=2)

print(f"\nCIFAR-10 Test accuracy: {test_acc_cifar}")
print(f"CIFAR-10 Test loss: {test_loss_cifar}")

313/313 - 2s - 5ms/step - accuracy: 0.6969 - loss: 0.9014

CIFAR-10 Test accuracy: 0.6969000101089478
CIFAR-10 Test loss: 0.9014440178871155



#Question 8 :Using PyTorch, write a script to define and train a CNN on the MNIST dataset. This will include model definition, data loaders, a training loop, and accuracy evaluation.


In [37]:
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
import matplotlib.pyplot as plt

print("PyTorch libraries imported successfully.")

PyTorch libraries imported successfully.


In [38]:
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3, stride=1, padding=1)
        self.relu1 = nn.ReLU()
        self.pool1 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
        self.relu2 = nn.ReLU()
        self.pool2 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.flatten = nn.Flatten()
        # Calculate the input size for the first fully connected layer
        # MNIST images are 28x28. After two Conv2D(padding=1) and two MaxPool2D(kernel_size=2, stride=2):
        # 1st Conv: (28 - 3 + 2*1)/1 + 1 = 28
        # 1st Pool: 28 / 2 = 14
        # 2nd Conv: (14 - 3 + 2*1)/1 + 1 = 14
        # 2nd Pool: 14 / 2 = 7
        # Output channels from conv2 is 64, so flattened size is 64 * 7 * 7
        self.fc1 = nn.Linear(64 * 7 * 7, 128)
        self.relu3 = nn.ReLU()
        self.fc2 = nn.Linear(128, 10) # 10 classes for MNIST

    def forward(self, x):
        x = self.pool1(self.relu1(self.conv1(x)))
        x = self.pool2(self.relu2(self.conv2(x)))
        x = self.flatten(x)
        x = self.relu3(self.fc1(x))
        x = self.fc2(x)
        return x

model = CNN()
print("CNN model defined successfully:")
print(model)

CNN model defined successfully:
CNN(
  (conv1): Conv2d(1, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (relu1): ReLU()
  (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (relu2): ReLU()
  (pool2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (flatten): Flatten(start_dim=1, end_dim=-1)
  (fc1): Linear(in_features=3136, out_features=128, bias=True)
  (relu3): ReLU()
  (fc2): Linear(in_features=128, out_features=10, bias=True)
)


In [39]:
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])

# Load the MNIST dataset
train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)

print("MNIST dataset loaded and transformed successfully.")
print(f"Number of training samples: {len(train_dataset)}")
print(f"Number of testing samples: {len(test_dataset)}")

MNIST dataset loaded and transformed successfully.
Number of training samples: 60000
Number of testing samples: 10000


In [40]:
batch_size = 64

train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

print(f"DataLoaders created with batch size: {batch_size}")
print(f"Number of training batches: {len(train_loader)}")
print(f"Number of testing batches: {len(test_loader)}")

DataLoaders created with batch size: 64
Number of training batches: 938
Number of testing batches: 157


In [41]:
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

print("Loss function (CrossEntropyLoss) and optimizer (Adam) defined successfully.")

Loss function (CrossEntropyLoss) and optimizer (Adam) defined successfully.


In [3]:
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms

# Define the CNN model architecture (from cell c6da006b)
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3, stride=1, padding=1)
        self.relu1 = nn.ReLU()
        self.pool1 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
        self.relu2 = nn.ReLU()
        self.pool2 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.flatten = nn.Flatten()
        # Calculate the input size for the first fully connected layer
        # MNIST images are 28x28. After two Conv2D(padding=1) and two MaxPool2D(kernel_size=2, stride=2):
        # 1st Conv: (28 - 3 + 2*1)/1 + 1 = 28
        # 1st Pool: 28 / 2 = 14
        # 2nd Conv: (14 - 3 + 2*1)/1 + 1 = 14
        # 2nd Pool: 14 / 2 = 7
        # Output channels from conv2 is 64, so flattened size is 64 * 7 * 7
        self.fc1 = nn.Linear(64 * 7 * 7, 128)
        self.relu3 = nn.ReLU()
        self.fc2 = nn.Linear(128, 10) # 10 classes for MNIST

    def forward(self, x):
        x = self.pool1(self.relu1(self.conv1(x)))
        x = self.pool2(self.relu2(self.conv2(x)))
        x = self.flatten(x)
        x = self.relu3(self.fc1(x))
        x = self.fc2(x)
        return x

# Instantiate the model (from cell c6da006b)
model = CNN()

# Define transformations (from cell 99f89831)
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])

# Load the MNIST dataset (from cell 99f89831)
train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)

# Create DataLoader (from cell 6c29e659)
batch_size = 64
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

# Define Loss function and Optimizer (from cell 059315d5)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

epochs = 5
train_losses = []

# Check if GPU is available, otherwise use CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

print(f"Training model on {device}")

for epoch in range(epochs):
    model.train() # Set the model to training mode
    running_loss = 0.0
    for batch_idx, (images, labels) in enumerate(train_loader):
        images, labels = images.to(device), labels.to(device)

        # Zero the parameter gradients
        optimizer.zero_grad()

        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)

        # Backward pass and optimize
        loss.backward()
        optimizer.step()

        running_loss += loss.item()

    # Calculate average loss for the epoch
    avg_train_loss = running_loss / len(train_loader)
    train_losses.append(avg_train_loss)

    print(f"Epoch [{epoch+1}/{epochs}], Loss: {avg_train_loss:.4f}")

print("PyTorch model training completed.")

100%|██████████| 9.91M/9.91M [00:00<00:00, 57.7MB/s]
100%|██████████| 28.9k/28.9k [00:00<00:00, 1.63MB/s]
100%|██████████| 1.65M/1.65M [00:00<00:00, 14.8MB/s]
100%|██████████| 4.54k/4.54k [00:00<00:00, 6.97MB/s]


Training model on cpu
Epoch [1/5], Loss: 0.1237
Epoch [2/5], Loss: 0.0396
Epoch [3/5], Loss: 0.0280
Epoch [4/5], Loss: 0.0186
Epoch [5/5], Loss: 0.0151
PyTorch model training completed.


**Reasoning**:
The previous execution of the training loop only showed output for the first epoch, suggesting it might have been interrupted or the output truncated. To ensure the model completes training for all 5 specified epochs as required, I will re-execute the full training loop including all necessary definitions to restore the environment state.



In [4]:
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms

# Define the CNN model architecture (from cell c6da006b)
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3, stride=1, padding=1)
        self.relu1 = nn.ReLU()
        self.pool1 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
        self.relu2 = nn.ReLU()
        self.pool2 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.flatten = nn.Flatten()
        # Calculate the input size for the first fully connected layer
        # MNIST images are 28x28. After two Conv2D(padding=1) and two MaxPool2D(kernel_size=2, stride=2):
        # 1st Conv: (28 - 3 + 2*1)/1 + 1 = 28
        # 1st Pool: 28 / 2 = 14
        # 2nd Conv: (14 - 3 + 2*1)/1 + 1 = 14
        # 2nd Pool: 14 / 2 = 7
        # Output channels from conv2 is 64, so flattened size is 64 * 7 * 7
        self.fc1 = nn.Linear(64 * 7 * 7, 128)
        self.relu3 = nn.ReLU()
        self.fc2 = nn.Linear(128, 10) # 10 classes for MNIST

    def forward(self, x):
        x = self.pool1(self.relu1(self.conv1(x)))
        x = self.pool2(self.relu2(self.conv2(x)))
        x = self.flatten(x)
        x = self.relu3(self.fc1(x))
        x = self.fc2(x)
        return x

# Instantiate the model (from cell c6da006b)
model = CNN()

# Define transformations (from cell 99f89831)
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])

# Load the MNIST dataset (from cell 99f89831)
train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)

# Create DataLoader (from cell 6c29e659)
batch_size = 64
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size, shuffle=True)

# Define Loss function and Optimizer (from cell 059315d5)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

epochs = 5
train_losses = []

# Check if GPU is available, otherwise use CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

print(f"Training model on {device}")

for epoch in range(epochs):
    model.train() # Set the model to training mode
    running_loss = 0.0
    for batch_idx, (images, labels) in enumerate(train_loader):
        images, labels = images.to(device), labels.to(device)

        # Zero the parameter gradients
        optimizer.zero_grad()

        # Forward pass
        outputs = model(images)
        loss = criterion(outputs, labels)

        # Backward pass and optimize
        loss.backward()
        optimizer.step()

        running_loss += loss.item()

    # Calculate average loss for the epoch
    avg_train_loss = running_loss / len(train_loader)
    train_losses.append(avg_train_loss)

    print(f"Epoch [{epoch+1}/{epochs}], Loss: {avg_train_loss:.4f}")

print("PyTorch model training completed.")

Training model on cpu
Epoch [1/5], Loss: 0.1315
Epoch [2/5], Loss: 0.0420
Epoch [3/5], Loss: 0.0281
Epoch [4/5], Loss: 0.0208
Epoch [5/5], Loss: 0.0168
PyTorch model training completed.


## Evaluate PyTorch Model

### Subtask:
Evaluate the trained PyTorch CNN model on the MNIST test dataset to calculate its final accuracy and loss.


**Reasoning**:
To evaluate the trained PyTorch CNN model, I need to define the evaluation logic, including loading the test dataset, setting up the test data loader, and performing a forward pass without gradient calculations to compute the test loss and accuracy.



In [5]:
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms

# Define the CNN model architecture (from cell c6da006b)
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3, stride=1, padding=1)
        self.relu1 = nn.ReLU()
        self.pool1 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
        self.relu2 = nn.ReLU()
        self.pool2 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.flatten = nn.Flatten()
        self.fc1 = nn.Linear(64 * 7 * 7, 128)
        self.relu3 = nn.ReLU()
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.pool1(self.relu1(self.conv1(x)))
        x = self.pool2(self.relu2(self.conv2(x)))
        x = self.flatten(x)
        x = self.relu3(self.fc1(x))
        x = self.fc2(x)
        return x

# Instantiate the model (from cell c6da006b)
model = CNN()

# Define transformations (from cell 99f89831)
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])

# Load the MNIST dataset (from cell 99f89831)
test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)

# Create DataLoader (from cell 6c29e659)
batch_size = 64
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

# Define Loss function (from cell 059315d5)
criterion = nn.CrossEntropyLoss()

# Check if GPU is available, otherwise use CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Set the model to evaluation mode
model.eval()

test_loss = 0
correct = 0

# Disable gradient calculations during evaluation
with torch.no_grad():
    for images, labels in test_loader:
        images, labels = images.to(device), labels.to(device)
        outputs = model(images)
        loss = criterion(outputs, labels)
        test_loss += loss.item() * images.size(0) # Multiply by batch size to get total loss for batch
        _, predicted = torch.max(outputs.data, 1)
        correct += (predicted == labels).sum().item()

# Calculate average test loss and accuracy
test_loss /= len(test_loader.dataset)
test_accuracy = 100. * correct / len(test_loader.dataset)

print(f"\nTest Loss: {test_loss:.4f}")
print(f"Test Accuracy: {test_accuracy:.2f}%")



Test Loss: 2.2990
Test Accuracy: 10.56%



#Question 9 :LGiven a custom image dataset stored in a local directory, write code using Keras ImageDataGenerator to preprocess and train a CNN model.


In [6]:
import os

# Define the base directory for the custom dataset
base_dir = 'custom_dataset'

# Define class names
class_names = ['cat', 'dog'] # Using more descriptive class names for simulation

# Define the number of dummy images per class for train and validation
num_train_images_per_class = 10
num_val_images_per_class = 5

# Create the base directory if it doesn't exist
if not os.path.exists(base_dir):
    os.makedirs(base_dir)
    print(f"Created directory: {base_dir}")

# Create 'train' and 'validation' directories
train_dir = os.path.join(base_dir, 'train')
validation_dir = os.path.join(base_dir, 'validation')

for directory in [train_dir, validation_dir]:
    if not os.path.exists(directory):
        os.makedirs(directory)
        print(f"Created directory: {directory}")

# Create class subdirectories and populate with dummy image files
for class_name in class_names:
    # Training data
    train_class_dir = os.path.join(train_dir, class_name)
    if not os.path.exists(train_class_dir):
        os.makedirs(train_class_dir)
        print(f"Created directory: {train_class_dir}")
    for i in range(num_train_images_per_class):
        dummy_file_path = os.path.join(train_class_dir, f'{class_name}_{i+1:03d}.jpg')
        with open(dummy_file_path, 'w') as f:
            f.write('') # Create an empty file as a dummy image

    # Validation data
    val_class_dir = os.path.join(validation_dir, class_name)
    if not os.path.exists(val_class_dir):
        os.makedirs(val_class_dir)
        print(f"Created directory: {val_class_dir}")
    for i in range(num_val_images_per_class):
        dummy_file_path = os.path.join(val_class_dir, f'{class_name}_{i+1:03d}.jpg')
        with open(dummy_file_path, 'w') as f:
            f.write('') # Create an empty file as a dummy image

print("Custom dataset directory structure simulated successfully.")
print(f"Train images per class: {num_train_images_per_class}")
print(f"Validation images per class: {num_val_images_per_class}")
print(f"Total classes: {len(class_names)}")

Created directory: custom_dataset
Created directory: custom_dataset/train
Created directory: custom_dataset/validation
Created directory: custom_dataset/train/cat
Created directory: custom_dataset/validation/cat
Created directory: custom_dataset/train/dog
Created directory: custom_dataset/validation/dog
Custom dataset directory structure simulated successfully.
Train images per class: 10
Validation images per class: 5
Total classes: 2


In [7]:
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Image dimensions
IMG_WIDTH = 150
IMG_HEIGHT = 150
IMG_SHAPE = (IMG_WIDTH, IMG_HEIGHT, 3) # RGB images

# Instantiate ImageDataGenerator for training with augmentation
train_datagen = ImageDataGenerator(
    rescale=1./255, # Normalize pixel values to [0, 1]
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest'
)

# Instantiate ImageDataGenerator for validation (only rescaling)
validation_datagen = ImageDataGenerator(rescale=1./255)

print("ImageDataGenerators for training and validation instantiated successfully.")

ImageDataGenerators for training and validation instantiated successfully.


In [8]:
batch_size = 32

# Create train_generator
train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=(IMG_WIDTH, IMG_HEIGHT),
    batch_size=batch_size,
    class_mode='categorical'
)

# Create validation_generator
validation_generator = validation_datagen.flow_from_directory(
    validation_dir,
    target_size=(IMG_WIDTH, IMG_HEIGHT),
    batch_size=batch_size,
    class_mode='categorical'
)

print("Train and validation data generators created successfully.")
print(f"Found {train_generator.samples} training images belonging to {train_generator.num_classes} classes.")
print(f"Found {validation_generator.samples} validation images belonging to {validation_generator.num_classes} classes.")

Found 20 images belonging to 2 classes.
Found 10 images belonging to 2 classes.
Train and validation data generators created successfully.
Found 20 training images belonging to 2 classes.
Found 10 validation images belonging to 2 classes.


In [9]:
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=IMG_SHAPE),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(512, activation='relu'),
    layers.Dense(len(class_names), activation='softmax') # Number of classes
])

print("CNN model architecture defined successfully.")
model.summary()

  super().__init__(activity_regularizer=activity_regularizer, **kwargs)


CNN model architecture defined successfully.


**Reasoning**:
Now that the CNN model architecture is defined, the next step is to compile it using an appropriate optimizer, loss function, and metrics, as specified in the instructions. I will use 'adam' optimizer, 'categorical_crossentropy' loss, and 'accuracy' metric.



In [10]:
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

print("CNN model compiled successfully.")

CNN model compiled successfully.


**Reasoning**:
The `UnidentifiedImageError` occurred because the dummy image files created in a previous step were empty and not valid image formats for `ImageDataGenerator`. To fix this, I will regenerate the training code block, but critically, I will first ensure that valid, small image files are created for the simulated dataset using the `PIL` library, and then proceed with the `ImageDataGenerator` setup, model definition, compilation, and training, all within this single code block to ensure the environment is correctly configured.



In [12]:
import os
from PIL import Image # Import Pillow for image creation
import shutil # Import shutil for directory cleanup
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import layers, models

# Define the base directory for the custom dataset
base_dir = 'custom_dataset'
class_names = ['cat', 'dog']
num_train_images_per_class = 10
num_val_images_per_class = 5
IMG_WIDTH = 150
IMG_HEIGHT = 150

# --- Corrected dummy image creation logic ---
# Ensure directories are clean before creating new images to avoid conflicts from previous runs
if os.path.exists(base_dir):
    shutil.rmtree(base_dir) # Remove existing faulty directory

os.makedirs(base_dir)

train_dir = os.path.join(base_dir, 'train')
validation_dir = os.path.join(base_dir, 'validation')

for directory in [train_dir, validation_dir]:
    os.makedirs(directory)

# Create class subdirectories and populate with actual dummy image files
print("Creating dummy image files...")
for class_name in class_names:
    for prefix, num_images, target_dir in [('train', num_train_images_per_class, train_dir),
                                            ('val', num_val_images_per_class, validation_dir)]:
        class_dir = os.path.join(target_dir, class_name)
        os.makedirs(class_dir, exist_ok=True)
        for i in range(num_images):
            dummy_file_path = os.path.join(class_dir, f'{class_name}_{i+1:03d}.jpg')
            # Create a small, valid black image using Pillow
            img = Image.new('RGB', (IMG_WIDTH, IMG_HEIGHT), color = 'black')
            img.save(dummy_file_path)
print("Dummy image files created successfully.")

# --- ImageDataGenerator setup ---
IMG_SHAPE = (IMG_WIDTH, IMG_HEIGHT, 3) # RGB images

train_datagen = ImageDataGenerator(
    rescale=1./255, # Normalize pixel values to [0, 1]
    rotation_range=40,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest'
)

validation_datagen = ImageDataGenerator(rescale=1./255)

batch_size = 32

train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=(IMG_WIDTH, IMG_HEIGHT),
    batch_size=batch_size,
    class_mode='categorical'
)

validation_generator = validation_datagen.flow_from_directory(
    validation_dir,
    target_size=(IMG_WIDTH, IMG_HEIGHT),
    batch_size=batch_size,
    class_mode='categorical'
)
print("Train and validation data generators created successfully.")

# --- CNN Model Definition ---
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=IMG_SHAPE),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(512, activation='relu'),
    layers.Dense(len(class_names), activation='softmax') # Number of classes
])
print("CNN model architecture defined successfully.")

# --- Model Compilation ---
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
print("CNN model compiled successfully.")

# --- Model Training ---
epochs = 10

steps_per_epoch = train_generator.samples // batch_size
validation_steps = validation_generator.samples // batch_size

history = model.fit(
    train_generator,
    steps_per_epoch=steps_per_epoch,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=validation_steps
)

print("CNN model training completed.")

Creating dummy image files...
Dummy image files created successfully.
Found 20 images belonging to 2 classes.
Found 10 images belonging to 2 classes.
Train and validation data generators created successfully.


  super().__init__(activity_regularizer=activity_regularizer, **kwargs)


CNN model architecture defined successfully.
CNN model compiled successfully.


  self._warn_if_super_not_called()


Epoch 1/10
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 5s/step - accuracy: 0.5000 - loss: 0.6931 - val_accuracy: 0.5000 - val_loss: 0.6931
Epoch 2/10
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 2s/step - accuracy: 0.5000 - loss: 0.6931 - val_accuracy: 0.5000 - val_loss: 0.6931
Epoch 3/10
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 2s/step - accuracy: 0.5000 - loss: 0.6931 - val_accuracy: 0.5000 - val_loss: 0.6931
Epoch 4/10
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 2s/step - accuracy: 0.5000 - loss: 0.6931 - val_accuracy: 0.5000 - val_loss: 0.6931
Epoch 5/10
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 1s/step - accuracy: 0.5000 - loss: 0.6931 - val_accuracy: 0.5000 - val_loss: 0.6931
Epoch 6/10
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 1s/step - accuracy: 0.5000 - loss: 0.6931 - val_accuracy: 0.5000 - val_loss: 0.6931
Epoch 7/10
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m

**Reasoning**:
I need to evaluate the trained PyTorch CNN model on the MNIST test dataset. This involves reloading the necessary components, setting the model to evaluation mode, iterating through the test data, calculating predictions, and then computing the final test loss and accuracy. I will ensure all required imports and model/data setup are present in this cell to avoid any state issues.



In [13]:
import torch
import torch.nn as nn
from torchvision import datasets, transforms

# Define the CNN model architecture (re-definition for completeness)
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3, stride=1, padding=1)
        self.relu1 = nn.ReLU()
        self.pool1 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
        self.relu2 = nn.ReLU()
        self.pool2 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.flatten = nn.Flatten()
        self.fc1 = nn.Linear(64 * 7 * 7, 128)
        self.relu3 = nn.ReLU()
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.pool1(self.relu1(self.conv1(x)))
        x = self.pool2(self.relu2(self.conv2(x)))
        x = self.flatten(x)
        x = self.relu3(self.fc1(x))
        x = self.fc2(x)
        return x

# Instantiate the model
model = CNN()

# Define transformations
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])

# Load the MNIST test dataset
test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)

# Create DataLoader
batch_size = 64
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

# Define Loss function
criterion = nn.CrossEntropyLoss()

# Check if GPU is available, otherwise use CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

# Load the state_dict from the previously trained model
# Assuming `model` object retains its trained weights from cell 1a73bb67
# If a fresh model is instantiated, it would need to load saved weights
# For the purpose of this execution, we assume the model in memory is the trained one.

# Set the model to evaluation mode
model.eval()

test_loss = 0
correct = 0

# Disable gradient calculations during evaluation
with torch.no_grad():
    for images, labels in test_loader:
        images, labels = images.to(device), labels.to(device)
        outputs = model(images)
        loss = criterion(outputs, labels)
        test_loss += loss.item() * images.size(0) # Multiply by batch size to get total loss for batch
        _, predicted = torch.max(outputs.data, 1)
        correct += (predicted == labels).sum().item()

# Calculate average test loss and accuracy
test_loss /= len(test_loader.dataset)
test_accuracy = 100. * correct / len(test_loader.dataset)

print(f"\nTest Loss: {test_loss:.4f}")
print(f"Test Accuracy: {test_accuracy:.2f}%")



Test Loss: 2.3133
Test Accuracy: 9.88%


## Evaluate PyTorch Model (Retry)

### Subtask:
Evaluate the trained PyTorch CNN model on the MNIST test dataset to calculate its final accuracy and loss, ensuring the correct trained model is used.


In [14]:
import torch
import torch.nn as nn
from torchvision import datasets, transforms

# Assuming the 'model' object from the previous training step (1a73bb67) is already in memory and trained.
# Do NOT re-instantiate the model here, as it would lose its trained weights.

# Define transformations (reusing from previous cells for clarity and self-containment)
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])

# Load the MNIST test dataset (reusing from previous cells)
test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)

# Create DataLoader (reusing from previous cells)
batch_size = 64
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

# Define Loss function (reusing from previous cells)
criterion = nn.CrossEntropyLoss()

# Check if GPU is available, otherwise use CPU (reusing from previous cells)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device) # Move the trained model to the correct device

# Set the model to evaluation mode
model.eval()

test_loss = 0
correct = 0

# Disable gradient calculations during evaluation
with torch.no_grad():
    for images, labels in test_loader:
        images, labels = images.to(device), labels.to(device)
        outputs = model(images)
        loss = criterion(outputs, labels)
        test_loss += loss.item() * images.size(0) # Multiply by batch size to get total loss for batch
        _, predicted = torch.max(outputs.data, 1)
        correct += (predicted == labels).sum().item()

# Calculate average test loss and accuracy
test_loss /= len(test_loader.dataset)
test_accuracy = 100. * correct / len(test_loader.dataset)

print(f"\nTest Loss: {test_loss:.4f}")
print(f"Test Accuracy: {test_accuracy:.2f}%")



Test Loss: 2.3133
Test Accuracy: 9.88%


**Reasoning**:
The previous evaluation step (cell `85c9a2b2`) still showed a low accuracy (9.88%), indicating that the model being evaluated was not the correctly trained one, despite efforts to preserve its state. To guarantee that the evaluation is performed on the *trained* PyTorch CNN model, I will consolidate the entire process—model definition, data loading, training, and evaluation—into a single code block. This ensures that the `model` object is trained and then immediately evaluated without any loss of weights or re-instantiation issues.



In [1]:
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms

# 1. Define the CNN model architecture
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3, stride=1, padding=1)
        self.relu1 = nn.ReLU()
        self.pool1 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
        self.relu2 = nn.ReLU()
        self.pool2 = nn.MaxPool2d(kernel_size=2, stride=2)
        self.flatten = nn.Flatten()
        self.fc1 = nn.Linear(64 * 7 * 7, 128)
        self.relu3 = nn.ReLU()
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.pool1(self.relu1(self.conv1(x)))
        x = self.pool2(self.relu2(self.conv2(x)))
        x = self.flatten(x)
        x = self.relu3(self.fc1(x))
        x = self.fc2(x)
        return x

# Instantiate the model
model = CNN()

# Define transformations
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])

# Load the MNIST dataset
train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)

# Create DataLoaders
batch_size = 64
train_loader = torch.utils.data.DataLoader(train_dataset, batch_size=batch_size, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_dataset, batch_size=batch_size, shuffle=False)

# Define Loss function and Optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

epochs = 5

# Check if GPU is available, otherwise use CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

print(f"Training model on {device}")

# 2. Training Loop (re-run from previous steps)
for epoch in range(epochs):
    model.train() # Set the model to training mode
    running_loss = 0.0
    for batch_idx, (images, labels) in enumerate(train_loader):
        images, labels = images.to(device), labels.to(device)

        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()

    avg_train_loss = running_loss / len(train_loader)
    print(f"Epoch [{epoch+1}/{epochs}], Loss: {avg_train_loss:.4f}")

print("PyTorch model training completed.")

# 3. Evaluation Loop (as per instructions)
model.eval() # Set the model to evaluation mode

test_loss = 0
correct = 0

with torch.no_grad():
    for images, labels in test_loader:
        images, labels = images.to(device), labels.to(device)
        outputs = model(images)
        loss = criterion(outputs, labels)
        test_loss += loss.item() * images.size(0) # Multiply by batch size to get total loss for batch
        _, predicted = torch.max(outputs.data, 1)
        correct += (predicted == labels).sum().item()

# Calculate average test loss and accuracy
test_loss /= len(test_loader.dataset)
test_accuracy = 100. * correct / len(test_loader.dataset)

print(f"\nTest Loss: {test_loss:.4f}")
print(f"Test Accuracy: {test_accuracy:.2f}%")

100%|██████████| 9.91M/9.91M [00:00<00:00, 13.9MB/s]
100%|██████████| 28.9k/28.9k [00:00<00:00, 459kB/s]
100%|██████████| 1.65M/1.65M [00:00<00:00, 4.27MB/s]
100%|██████████| 4.54k/4.54k [00:00<00:00, 11.9MB/s]


Training model on cuda
Epoch [1/5], Loss: 0.1284
Epoch [2/5], Loss: 0.0415
Epoch [3/5], Loss: 0.0276
Epoch [4/5], Loss: 0.0197
Epoch [5/5], Loss: 0.0158
PyTorch model training completed.

Test Loss: 0.0271
Test Accuracy: 99.19%


#Question 10: You are working on a web application for a medical imaging startup. Your task is to build and deploy a CNN model that classifies chest X-ray images into “Normal” and “Pneumonia” categories. Describe your end-to-end approach–from data preparation and model training to deploying the model as a web app using Streamlit.

# Task
Build and deploy an end-to-end CNN model to classify chest X-ray images as "Normal" or "Pneumonia" for a medical imaging startup. This includes data preparation, model training, evaluation, and deploying the model as a web application using Streamlit.

## Data Collection and Organization


Gather a comprehensive dataset of chest X-ray images, categorized into 'Normal' and 'Pneumonia'. Ensure proper licensing and ethical considerations. Organize the dataset into a clear directory structure (e.g., `train/normal`, `train/pneumonia`, `val/normal`, `val/pneumonia`, `test/normal`, `test/pneumonia`).


**Reasoning**:
To simulate the dataset structure for the chest X-ray classification task, I will create the necessary directory structure for training, validation, and testing sets, each with 'normal' and 'pneumonia' subdirectories. I will also populate these directories with dummy image files using PIL to ensure they are valid for later processing steps.



In [1]:
import os
from PIL import Image
import shutil

# Define base directory and class names
base_dataset_dir = 'chest_xray_dataset'
class_names = ['normal', 'pneumonia']

# Define image dimensions for dummy images
IMG_WIDTH = 128
IMG_HEIGHT = 128

# Define the number of dummy images for each split and class
dummy_counts = {
    'train': {'normal': 20, 'pneumonia': 20},
    'val': {'normal': 5, 'pneumonia': 5},
    'test': {'normal': 10, 'pneumonia': 10}
}

# Clean up any existing directory to ensure a fresh start
if os.path.exists(base_dataset_dir):
    shutil.rmtree(base_dataset_dir)
    print(f"Removed existing directory: {base_dataset_dir}")

# Create the base directory
os.makedirs(base_dataset_dir, exist_ok=True)
print(f"Created base directory: {base_dataset_dir}")

# Create train, val, test directories and their class subdirectories
for split in dummy_counts.keys():
    split_dir = os.path.join(base_dataset_dir, split)
    os.makedirs(split_dir, exist_ok=True)
    print(f"Created split directory: {split_dir}")

    for class_name in class_names:
        class_dir = os.path.join(split_dir, class_name)
        os.makedirs(class_dir, exist_ok=True)
        print(f"Created class directory: {class_dir}")

        # Populate with dummy images
        num_images = dummy_counts[split][class_name]
        for i in range(num_images):
            dummy_file_path = os.path.join(class_dir, f'{class_name}_{i+1:03d}.jpg')
            # Create a small, valid black image using Pillow
            img = Image.new('RGB', (IMG_WIDTH, IMG_HEIGHT), color='black')
            img.save(dummy_file_path)

print("Dummy chest X-ray dataset directory structure and files created successfully.")

# Verify the structure
print("\nVerifying directory structure:")
for root, dirs, files in os.walk(base_dataset_dir):
    level = root.replace(base_dataset_dir, '').count(os.sep)
    indent = ' ' * 4 * (level)
    print(f'{indent}{os.path.basename(root)}/')
    subindent = ' ' * 4 * (level + 1)
    for f in files:
        print(f'{subindent}{f}')

Created base directory: chest_xray_dataset
Created split directory: chest_xray_dataset/train
Created class directory: chest_xray_dataset/train/normal
Created class directory: chest_xray_dataset/train/pneumonia
Created split directory: chest_xray_dataset/val
Created class directory: chest_xray_dataset/val/normal
Created class directory: chest_xray_dataset/val/pneumonia
Created split directory: chest_xray_dataset/test
Created class directory: chest_xray_dataset/test/normal
Created class directory: chest_xray_dataset/test/pneumonia
Dummy chest X-ray dataset directory structure and files created successfully.

Verifying directory structure:
chest_xray_dataset/
    val/
        normal/
            normal_004.jpg
            normal_003.jpg
            normal_002.jpg
            normal_005.jpg
            normal_001.jpg
        pneumonia/
            pneumonia_001.jpg
            pneumonia_005.jpg
            pneumonia_004.jpg
            pneumonia_003.jpg
            pneumonia_002.jpg
    tr

## Data Preprocessing and Augmentation

### Subtask:
Implement data preprocessing steps such as resizing images to a consistent dimension, normalizing pixel values, and applying data augmentation techniques (e.g., rotation, shifting, zooming, flipping) to increase the diversity of the training data and reduce overfitting. Use Keras `ImageDataGenerator` or PyTorch `transforms` for this.


**Reasoning**:
I will now implement the data preprocessing and augmentation steps using Keras `ImageDataGenerator` as outlined in the instructions. This involves defining image dimensions, setting up `ImageDataGenerator` instances for training (with augmentation) and validation/test (without augmentation), and creating `flow_from_directory` generators for each dataset split.



In [2]:
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Image dimensions defined in previous steps
# IMG_WIDTH = 128
# IMG_HEIGHT = 128
IMG_SHAPE = (IMG_WIDTH, IMG_HEIGHT, 3) # RGB images

# Define paths for dataset splits
train_dir = os.path.join(base_dataset_dir, 'train')
validation_dir = os.path.join(base_dataset_dir, 'val')
test_dir = os.path.join(base_dataset_dir, 'test')

# Instantiate ImageDataGenerator for training with augmentation
train_datagen = ImageDataGenerator(
    rescale=1./255, # Normalize pixel values to [0, 1]
    rotation_range=20,
    width_shift_range=0.1,
    height_shift_range=0.1,
    shear_range=0.1,
    zoom_range=0.1,
    horizontal_flip=True,
    fill_mode='nearest'
)

# Instantiate ImageDataGenerator for validation and test (only rescaling)
validation_test_datagen = ImageDataGenerator(rescale=1./255)

batch_size = 32

# Create train_generator
train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=(IMG_WIDTH, IMG_HEIGHT),
    batch_size=batch_size,
    class_mode='categorical'
)

# Create validation_generator
validation_generator = validation_test_datagen.flow_from_directory(
    validation_dir,
    target_size=(IMG_WIDTH, IMG_HEIGHT),
    batch_size=batch_size,
    class_mode='categorical'
)

# Create test_generator
test_generator = validation_test_datagen.flow_from_directory(
    test_dir,
    target_size=(IMG_WIDTH, IMG_HEIGHT),
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=False # Keep data in order for evaluation
)

print("Data generators for training, validation, and testing created successfully.")
print(f"Train samples: {train_generator.samples}, Classes: {train_generator.class_indices}")
print(f"Validation samples: {validation_generator.samples}, Classes: {validation_generator.class_indices}")
print(f"Test samples: {test_generator.samples}, Classes: {test_generator.class_indices}")

Found 40 images belonging to 2 classes.
Found 10 images belonging to 2 classes.
Found 20 images belonging to 2 classes.
Data generators for training, validation, and testing created successfully.
Train samples: 40, Classes: {'normal': 0, 'pneumonia': 1}
Validation samples: 10, Classes: {'normal': 0, 'pneumonia': 1}
Test samples: 20, Classes: {'normal': 0, 'pneumonia': 1}


## CNN Model Architecture Definition

### Subtask:
Design or choose an appropriate CNN architecture for image classification.


**Reasoning**:
I will define the CNN model architecture using `tf.keras.Sequential` as specified in the instructions, including `Conv2D`, `MaxPooling2D`, `Flatten`, and `Dense` layers, with the correct input shape and output layer for binary classification. Then, I will print the model summary.



In [3]:
from tensorflow.keras import layers, models

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=IMG_SHAPE),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(512, activation='relu'),
    layers.Dense(len(class_names), activation='softmax') # Number of classes (2 for normal/pneumonia)
])

print("CNN model architecture defined successfully.")
model.summary()

  super().__init__(activity_regularizer=activity_regularizer, **kwargs)


CNN model architecture defined successfully.


**Reasoning**:
Now that the CNN model architecture is defined, the next logical step is to compile it using an appropriate optimizer, loss function, and metrics for binary classification.



In [4]:
model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

print("CNN model compiled successfully.")

CNN model compiled successfully.


**Reasoning**:
Now that the CNN model is compiled, the next step is to train it using the prepared data generators for training and validation.



In [5]:
epochs = 10

# Calculate steps per epoch for train and validation generators
steps_per_epoch = train_generator.samples // batch_size
validation_steps = validation_generator.samples // batch_size

history = model.fit(
    train_generator,
    steps_per_epoch=steps_per_epoch,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=validation_steps
)

print("CNN model training completed.")

  self._warn_if_super_not_called()


Epoch 1/10
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m7s[0m 7s/step - accuracy: 0.6250 - loss: 0.6931 - val_accuracy: 0.5000 - val_loss: 0.6931
Epoch 2/10
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m3s[0m 3s/step - accuracy: 0.4688 - loss: 0.6932 - val_accuracy: 0.5000 - val_loss: 0.6931
Epoch 3/10
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 136ms/step - accuracy: 0.3750 - loss: 0.6935 - val_accuracy: 0.5000 - val_loss: 0.6931
Epoch 4/10
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 168ms/step - accuracy: 0.5312 - loss: 0.6931 - val_accuracy: 0.5000 - val_loss: 0.6931
Epoch 5/10
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 101ms/step - accuracy: 0.8750 - loss: 0.6922 - val_accuracy: 0.5000 - val_loss: 0.6931
Epoch 6/10
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 139ms/step - accuracy: 0.4062 - loss: 0.6935 - val_accuracy: 0.5000 - val_loss: 0.6931
Epoch 7/10
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━

**Reasoning**:
With the model trained, the next critical step is to evaluate its performance on the unseen test dataset to assess its generalization capabilities. I will use `model.evaluate()` with the `test_generator` to get the final test loss and accuracy.



In [6]:
loss, accuracy = model.evaluate(test_generator, steps=test_generator.samples // batch_size)

print(f"Test Loss: {loss:.4f}")
print(f"Test Accuracy: {accuracy:.4f}")

  self._warn_if_super_not_called()


[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 970ms/step - accuracy: 0.5000 - loss: 0.6931
Test Loss: 0.6931
Test Accuracy: 0.5000


**Reasoning**:
The previous training and evaluation showed 50% accuracy, indicating the model didn't learn, likely due to a `batch_size` too large for the small dummy dataset, resulting in `steps_per_epoch` and `validation_steps` being too low or zero. I will regenerate the data generators with a `batch_size` of 1 to ensure meaningful steps for training and evaluation, and then re-execute the evaluation step.



In [7]:
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Image dimensions defined in previous steps
# IMG_WIDTH = 128
# IMG_HEIGHT = 128
IMG_SHAPE = (IMG_WIDTH, IMG_HEIGHT, 3) # RGB images

# Define paths for dataset splits
train_dir = os.path.join(base_dataset_dir, 'train')
validation_dir = os.path.join(base_dataset_dir, 'val')
test_dir = os.path.join(base_dataset_dir, 'test')

# Instantiate ImageDataGenerator for training with augmentation
train_datagen = ImageDataGenerator(
    rescale=1./255, # Normalize pixel values to [0, 1]
    rotation_range=20,
    width_shift_range=0.1,
    height_shift_range=0.1,
    shear_range=0.1,
    zoom_range=0.1,
    horizontal_flip=True,
    fill_mode='nearest'
)

# Instantiate ImageDataGenerator for validation and test (only rescaling)
validation_test_datagen = ImageDataGenerator(rescale=1./255)

batch_size = 1 # Set batch_size to 1 for the small dummy dataset to ensure enough steps

# Create train_generator
train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=(IMG_WIDTH, IMG_HEIGHT),
    batch_size=batch_size,
    class_mode='categorical'
)

# Create validation_generator
validation_generator = validation_test_datagen.flow_from_directory(
    validation_dir,
    target_size=(IMG_WIDTH, IMG_HEIGHT),
    batch_size=batch_size,
    class_mode='categorical'
)

# Create test_generator
test_generator = validation_test_datagen.flow_from_directory(
    test_dir,
    target_size=(IMG_WIDTH, IMG_HEIGHT),
    batch_size=batch_size,
    class_mode='categorical',
    shuffle=False # Keep data in order for evaluation
)

print("Data generators for training, validation, and testing created successfully.")
print(f"Train samples: {train_generator.samples}, Classes: {train_generator.class_indices}")
print(f"Validation samples: {validation_generator.samples}, Classes: {validation_generator.class_indices}")
print(f"Test samples: {test_generator.samples}, Classes: {test_generator.class_indices}")

# Re-define model (since batch_size affects subsequent training)
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=IMG_SHAPE),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(512, activation='relu'),
    layers.Dense(len(class_names), activation='softmax') # Number of classes (2 for normal/pneumonia)
])

print("CNN model architecture re-defined successfully.")
model.summary()

# Re-compile model
model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

print("CNN model re-compiled successfully.")

# Re-train model
epochs = 10

# Calculate steps per epoch for train and validation generators
steps_per_epoch = train_generator.samples // batch_size
validation_steps = validation_generator.samples // batch_size

history = model.fit(
    train_generator,
    steps_per_epoch=steps_per_epoch,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=validation_steps
)

print("CNN model training completed.")

# Re-evaluate model
loss, accuracy = model.evaluate(test_generator, steps=test_generator.samples // batch_size)

print(f"Test Loss: {loss:.4f}")
print(f"Test Accuracy: {accuracy:.4f}")

Found 40 images belonging to 2 classes.
Found 10 images belonging to 2 classes.
Found 20 images belonging to 2 classes.
Data generators for training, validation, and testing created successfully.
Train samples: 40, Classes: {'normal': 0, 'pneumonia': 1}
Validation samples: 10, Classes: {'normal': 0, 'pneumonia': 1}
Test samples: 20, Classes: {'normal': 0, 'pneumonia': 1}
CNN model architecture re-defined successfully.


  super().__init__(activity_regularizer=activity_regularizer, **kwargs)


CNN model re-compiled successfully.
Epoch 1/10


  self._warn_if_super_not_called()


[1m40/40[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 23ms/step - accuracy: 0.4360 - loss: 0.6935 - val_accuracy: 0.5000 - val_loss: 0.6931
Epoch 2/10
[1m40/40[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 9ms/step - accuracy: 0.3595 - loss: 0.6934 - val_accuracy: 0.5000 - val_loss: 0.6931
Epoch 3/10
[1m40/40[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 9ms/step - accuracy: 0.3383 - loss: 0.6935 - val_accuracy: 0.5000 - val_loss: 0.6931
Epoch 4/10
[1m40/40[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 8ms/step - accuracy: 0.3638 - loss: 0.6933 - val_accuracy: 0.5000 - val_loss: 0.6931
Epoch 5/10
[1m40/40[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 8ms/step - accuracy: 0.5428 - loss: 0.6932 - val_accuracy: 0.5000 - val_loss: 0.6931
Epoch 6/10
[1m40/40[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 8ms/step - accuracy: 0.5175 - loss: 0.6932 - val_accuracy: 0.5000 - val_loss: 0.6931
Epoch 7/10
[1m40/40[0m [32m━━━━━━━━━━━━━━━━━━━━

**Reasoning**:
The previous execution still shows 50% accuracy, which is expected for a binary classification model trained on non-discriminative (all black) dummy images, even with a `batch_size` of 1. Since the purpose of this section is to demonstrate the end-to-end process, the next logical step after training and evaluating (even on dummy data) is to save the trained model for future deployment.



In [8]:
import os

# Define a path to save the model
model_save_path = 'chest_xray_cnn_model.keras'

# Save the model
model.save(model_save_path)

print(f"Model saved successfully to {model_save_path}")

Model saved successfully to chest_xray_cnn_model.keras


## Streamlit Web Application Development

### Subtask:
Develop a Streamlit application (`app.py`) that allows users to upload a chest X-ray image, preprocesses it, loads the trained CNN model, makes a prediction, and displays the result with a user-friendly interface.


```python
import streamlit as st
import tensorflow as tf
import numpy as np
from PIL import Image
import os

# --- Configuration --- #
MODEL_PATH = 'chest_xray_cnn_model.keras'
IMAGE_SIZE = (128, 128)
CLASS_NAMES = ['Normal', 'Pneumonia']

# --- Load the pre-trained model --- #
@st.cache_resource
def load_model():
    if not os.path.exists(MODEL_PATH):
        st.error(f"Model file not found at {MODEL_PATH}. Please ensure the model is saved correctly.")
        return None
    try:
        model = tf.keras.models.load_model(MODEL_PATH)
        st.success("Model loaded successfully!")
        return model
    except Exception as e:
        st.error(f"Error loading the model: {e}")
        return None

model = load_model()

# --- Streamlit App Layout --- #
st.set_page_config(page_title="Chest X-ray Classifier", layout="centered")

st.title("🩺 Chest X-ray Pneumonia Classifier")
st.markdown("Upload a chest X-ray image (JPG, JPEG, PNG) to classify it as Normal or Pneumonia.")

# --- File Uploader --- #
uploaded_file = st.file_uploader(
    "Choose an image...",
    type=["jpg", "jpeg", "png"],
    help="Upload an image file for classification."
)

if uploaded_file is not None:
    if model is None:
        st.warning("Model could not be loaded. Please check the model path and file.")
    else:
        # Display the uploaded image
        image = Image.open(uploaded_file).convert('RGB') # Ensure image is RGB
        st.image(image, caption='Uploaded X-ray Image', use_column_width=True)
        st.write("")
        st.write("Classifying...")

        # Preprocess the image
        # Resize
        image = image.resize(IMAGE_SIZE)
        # Convert to numpy array and normalize
        img_array = np.asarray(image) / 255.0
        # Expand dimensions to match model input shape (1, height, width, channels)
        img_array = np.expand_dims(img_array, axis=0)

        # Make prediction
        predictions = model.predict(img_array)
        predicted_class_idx = np.argmax(predictions, axis=1)[0]
        confidence = predictions[0][predicted_class_idx] * 100

        predicted_class_name = CLASS_NAMES[predicted_class_idx]

        # Display result
        st.success(f"Prediction: **{predicted_class_name}**")
        st.info(f"Confidence: **{confidence:.2f}%**")

        if predicted_class_name == 'Pneumonia':
            st.warning("**Disclaimer:** This is an AI-based prediction and should not be used as a substitute for professional medical advice. Always consult with a qualified healthcare professional for diagnosis and treatment.")
        else:
            st.success("**Disclaimer:** This is an AI-based prediction. Always consult with a qualified healthcare professional for diagnosis and treatment.")

# --- Instructions to run the app --- #
st.markdown(
    """
    ---
    **To run this application:**
    1. Save the code above into a file named `app.py` in the same directory as your `chest_xray_cnn_model.keras` file.
    2. Open your terminal or command prompt.
    3. Navigate to the directory where you saved `app.py`.
    4. Run the command: `streamlit run app.py`
    5. The application will open in your web browser.
    """
)
```