## **CNN Architecture**

**1. What is the role of filters and feature maps in Convolutional Neural Network (CNN)?**

**Answer:**

**Filters (also known as Kernels):**

*   **What they are:** Filters are small matrices of numbers that are convolved (slid) across the input image. Each filter is designed to detect a specific feature or pattern in the image, such as edges, corners, textures, or even more complex shapes.
*   **Their role:** During the training process, the values within these filters are learned. Different filters learn to respond strongly to different visual features. When a filter is convolved over an image, it produces a high output value when the feature it is designed to detect is present in that region of the image.
*   **Think of it like:** Imagine a magnifying glass that we move over an image. Different magnifying glasses (filters) highlight different aspects (features) of the image.

**Feature Maps:**

*   **What they are:** A feature map is the output of applying a filter to an entire input image. It's essentially a representation of the input image where the values indicate the presence and strength of the feature that the filter was designed to detect at each location.
*   **Their role:** Each filter applied to the input image generates a different feature map. These feature maps capture different aspects of the image, providing a rich representation of the input data. Subsequent layers in the CNN then take these feature maps as input and learn to detect more complex patterns by combining the features from the previous layers.
*   **Think of it like:** If we have a filter that detects horizontal edges, the resulting feature map will show us where all the horizontal edges are located in the original image. If we have another filter that detects vertical edges, its feature map will show the locations of vertical edges.

**In summary:**

Filters are the tools that scan the image for specific features, and feature maps are the results of that scan, highlighting where those features are present in the image. By using multiple filters in each layer, a CNN can learn to extract a wide range of features from the input data, which is essential for tasks like image classification, object detection, and image generation.

**2. Explain the concepts of padding and stride in CNNs (Convolutional Neural Network). How do they affect the output dimensions of feature maps?**

**Answer:**

**Padding:**

*   **What it is:** Padding involves adding extra pixels (typically zeros) around the border of the input image or feature map before convolution.
*   **Why it's used:**
    *   **Preserving spatial dimensions:** Without padding, the spatial dimensions (height and width) of the output feature map decrease with each convolutional layer. Padding helps to maintain the spatial size, preventing the loss of information, especially at the edges of the input.
    *   **Using information at the edges:** Pixels at the edges of an image are only involved in the convolution process a few times, compared to pixels in the center. Padding ensures that edge pixels are also considered sufficiently, allowing the network to learn features from the entire image.
*   **Types of padding:** Common types include "valid" (no padding) and "same" (padding is added so that the output dimensions are the same as the input dimensions, given a stride of 1).

**Stride:**

*   **What it is:** Stride is the number of pixels the filter shifts over the input image or feature map at each step of the convolution.
*   **Why it's used:**
    *   **Reducing spatial dimensions:** A stride greater than 1 reduces the spatial dimensions of the output feature map. This can help to reduce the computational cost of the network and downsample the feature maps.
    *   **Controlling the receptive field:** Stride affects how much of the input the filter "sees" at each step. A larger stride means the filter covers a larger area with fewer steps.

**How they affect the output dimensions of feature maps:**

The output dimensions of a convolutional layer are determined by the input dimensions, the filter size, the padding, and the stride. The formula for calculating the output dimension (for one dimension, say height or width) is:

`Output Dimension = [(Input Dimension - Filter Size + 2 * Padding) / Stride] + 1`

*   **Padding:** Increases the input dimension effectively, leading to a larger output dimension. With "same" padding and a stride of 1, the output dimension is the same as the input dimension.
*   **Stride:** A larger stride reduces the number of steps the filter takes, resulting in a smaller output dimension.

In essence, padding helps to preserve spatial information and ensures edge features are considered, while stride helps to reduce spatial dimensions and computational complexity. Both are important hyperparameters that need to be carefully chosen when designing a CNN architecture.

**3. Define receptive field in the context of CNNs. Why is it important for deep architectures?**

**Answer:**

**Receptive Field:**

*   **What it is:** The receptive field of a neuron in a CNN is the region in the input image (or the previous layer's feature map) that influences that neuron's activation. In other words, it's the area of the input that a particular filter is "seeing" when it computes its output at a specific location in the feature map.
*   **How it grows:** As we go deeper into a CNN (from earlier layers to later layers), the receptive field of the neurons in those layers increases. This is because each neuron in a deeper layer is influenced by a larger area of the previous layer's feature map, which in turn corresponds to an even larger area of the original input image. The receptive field grows due to the combined effect of convolution, pooling, and stride operations.

**Why it's important for deep architectures:**

*   **Capturing hierarchical features:** The increasing receptive field size in deeper layers allows the network to capture increasingly complex and abstract features. Early layers with small receptive fields detect simple features like edges and corners. Deeper layers with larger receptive fields can combine these simple features to detect more complex patterns like shapes, textures, and eventually, objects.
*   **Understanding global context:** As the receptive field expands, neurons in deeper layers are able to incorporate information from a wider region of the input. This is crucial for understanding the global context of an image and making decisions based on the relationships between different parts of the image. For tasks like image classification, a large receptive field is necessary to consider the entire image when determining its class.
*   **Efficiency:** Deep architectures with increasing receptive fields can process information efficiently. Instead of having a single large filter that covers the entire image in the first layer (which would be computationally expensive), the network uses multiple layers with smaller filters and increasing receptive fields. This allows the network to learn hierarchical representations and capture complex patterns with fewer parameters.

In summary, the receptive field is a key concept in CNNs that explains how neurons in deeper layers can see and process information from larger areas of the input. Its growth in deep architectures is essential for capturing hierarchical features, understanding global context, and achieving computational efficiency.

**4. Discuss how filter size and stride influence the number of parameters in a CNN.**

**Answer:**

The number of parameters in a convolutional layer is primarily determined by the filter size and the number of filters. Stride does **not** directly affect the number of parameters within a single convolutional layer. Here's the breakdown:

**Filter Size:**

*   **Direct Influence:** The filter size has a direct and significant impact on the number of parameters. Each filter is a small matrix of weights that the network learns. The number of parameters in a single filter is simply its dimensions (height \* width) multiplied by the number of input channels.
*   **Example:** If we have an input image with 3 channels (like RGB) and we use a 3x3 filter, that filter has 3 \* 3 \* 3 = 27 parameters. If we increase the filter size to 5x5, that filter now has 5 \* 5 \* 3 = 75 parameters.
*   **Total Parameters:** The total number of parameters in a convolutional layer is the number of parameters per filter multiplied by the number of filters in that layer. So, increasing the filter size or the number of filters will increase the total number of parameters.

**Stride:**

*   **Indirect Influence:** Stride does **not** change the number of weights within each filter. The filter itself remains the same size regardless of the stride.
*   **Effect on Output Size:** Stride affects the size of the output feature map. A larger stride results in a smaller output feature map. While this doesn't change the number of parameters in the convolutional layer itself, it can affect the number of parameters in subsequent layers (like fully connected layers) if their input size depends on the output size of the convolutional layer. However, for the convolutional layer specifically, stride does not influence the number of parameters.

**In summary:**

*   **Filter size** directly impacts the number of parameters in a convolutional layer: larger filters mean more parameters per filter, and more filters mean more total parameters.
*   **Stride** does **not** directly affect the number of parameters in a convolutional layer. Its main influence is on the spatial dimensions of the output feature map.

**5. Compare and contrast different CNN-based architectures like LeNet, AlexNet, and VGG in terms of depth, filter sizes, and performance.**

**Answer:**

**LeNet:**

*   **Depth:** One of the earliest CNNs, relatively shallow with a few convolutional and pooling layers followed by fully connected layers.
*   **Filter Sizes:** Used smaller filter sizes (e.g., 5x5).
*   **Performance:** Primarily used for digit recognition tasks. Achieved good performance for its time on datasets like MNIST. Limited in performance on more complex image recognition tasks due to its shallow depth and limited capacity.

**AlexNet:**

*   **Depth:** Significantly deeper than LeNet, with more convolutional and pooling layers. It was one of the first deep CNNs to achieve breakthrough performance on a large-scale image dataset.
*   **Filter Sizes:** Used a mix of filter sizes, including larger ones in the initial layers (e.g., 11x11) and smaller ones in later layers (e.g., 3x3).
*   **Performance:** Achieved a significant improvement in performance on the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2012, demonstrating the power of deep CNNs for complex image recognition tasks. Introduced the use of ReLU activation functions and dropout for regularization.

**VGG:**

*   **Depth:** Even deeper than AlexNet, exploring the idea that increasing depth by using many small filters is beneficial. VGG networks come in different versions with varying depths (e.g., VGG16, VGG19).
*   **Filter Sizes:** Primarily used small 3x3 convolutional filters throughout the network. This allowed for increasing the receptive field with depth while keeping the number of parameters manageable.
*   **Performance:** Showed that increasing depth with small filters leads to improved performance on large-scale image recognition tasks. VGG networks are known for their simplicity and effectiveness, although they can be computationally expensive due to their depth.

**Comparison and Contrast:**

*   **Depth:** LeNet is the shallowest, followed by AlexNet, and then VGG which is the deepest among the three. Increasing depth generally led to improved performance on more complex datasets.
*   **Filter Sizes:** LeNet used primarily 5x5 filters. AlexNet used a mix of larger and smaller filters. VGG consistently used small 3x3 filters. The use of smaller filters in deeper networks allowed for a larger receptive field with fewer parameters compared to using larger filters.
*   **Performance:** Each architecture represented a step forward in CNN performance, with AlexNet significantly outperforming LeNet on large-scale datasets, and VGG further improving performance by exploring increased depth with small filters.

In summary, these architectures demonstrate the evolution of CNN design, highlighting the importance of depth and the strategic use of filter sizes for achieving high performance on image recognition tasks.

**6. Using keras, build and train a simple CNN model on the MNIST dataset from scratch. Include code for module creation, compilation, training, and evaluation.**

In [2]:
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.utils import to_categorical

# 1. Load and preprocess the MNIST dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Preprocess the images
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1).astype('float32') / 255
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1).astype('float32') / 255

# Preprocess the labels
y_train = to_categorical(y_train, num_classes=10)
y_test = to_categorical(y_test, num_classes=10)

# 2. Build the CNN model
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(10, activation='softmax'))

# 3. Compile the model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Print the model summary
model.summary()

# 4. Train the model
print("\nTraining the model...")
history = model.fit(x_train, y_train, epochs=5, batch_size=200, verbose=2, validation_data=(x_test, y_test))

# 5. Evaluate the model
print("\nEvaluating the model...")
loss, accuracy = model.evaluate(x_test, y_test, verbose=0)

# 6. Finish task
print(f'\nModel Evaluation Results:')
print(f'Test Loss: {loss:.4f}')
print(f'Test Accuracy: {accuracy:.4f}')

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
[1m11490434/11490434[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 0us/step


  super().__init__(activity_regularizer=activity_regularizer, **kwargs)



Training the model...
Epoch 1/5
300/300 - 6s - 18ms/step - accuracy: 0.9271 - loss: 0.2495 - val_accuracy: 0.9765 - val_loss: 0.0753
Epoch 2/5
300/300 - 2s - 6ms/step - accuracy: 0.9801 - loss: 0.0649 - val_accuracy: 0.9805 - val_loss: 0.0576
Epoch 3/5
300/300 - 2s - 5ms/step - accuracy: 0.9856 - loss: 0.0466 - val_accuracy: 0.9843 - val_loss: 0.0484
Epoch 4/5
300/300 - 2s - 5ms/step - accuracy: 0.9890 - loss: 0.0357 - val_accuracy: 0.9896 - val_loss: 0.0339
Epoch 5/5
300/300 - 2s - 5ms/step - accuracy: 0.9911 - loss: 0.0281 - val_accuracy: 0.9865 - val_loss: 0.0402

Evaluating the model...

Model Evaluation Results:
Test Loss: 0.0402
Test Accuracy: 0.9865


**7. Load and preprocess the CIFAR-10 dataset using Keras, and create a CNN model to classify RGB images. Show your preprocessing and architecture.**

In [1]:
import tensorflow as tf
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.utils import to_categorical

# 1. Load and preprocess the CIFAR-10 dataset
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

# Normalize pixel values to be between 0 and 1
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

# One-hot encode the labels
y_train = to_categorical(y_train, num_classes=10)
y_test = to_categorical(y_test, num_classes=10)

# 2. Build the CNN model
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dense(10, activation='softmax'))

# 3. Compile the model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Print the model summary
model.summary()

# 4. Train the model
print("\nTraining the model...")
history = model.fit(x_train, y_train, epochs=10, batch_size=32, verbose=2, validation_data=(x_test, y_test))

# 5. Evaluate the model
print("\nEvaluating the model...")
loss, accuracy = model.evaluate(x_test, y_test, verbose=0)

# 6. Finish task
print(f'\nModel Evaluation Results:')
print(f'Test Loss: {loss:.4f}')
print(f'Test Accuracy: {accuracy:.4f}')

print("\nPreprocessing steps:")
print("- Normalized pixel values to be between 0 and 1")
print("- One-hot encoded the labels")

print("\nCNN Architecture:")
model.summary()

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
[1m170498071/170498071[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 0us/step


  super().__init__(activity_regularizer=activity_regularizer, **kwargs)



Training the model...
Epoch 1/10
1563/1563 - 13s - 8ms/step - accuracy: 0.4498 - loss: 1.5080 - val_accuracy: 0.5285 - val_loss: 1.3116
Epoch 2/10
1563/1563 - 6s - 4ms/step - accuracy: 0.5931 - loss: 1.1484 - val_accuracy: 0.6090 - val_loss: 1.0954
Epoch 3/10
1563/1563 - 5s - 3ms/step - accuracy: 0.6506 - loss: 0.9950 - val_accuracy: 0.6569 - val_loss: 0.9827
Epoch 4/10
1563/1563 - 6s - 4ms/step - accuracy: 0.6865 - loss: 0.8930 - val_accuracy: 0.6749 - val_loss: 0.9351
Epoch 5/10
1563/1563 - 5s - 3ms/step - accuracy: 0.7157 - loss: 0.8190 - val_accuracy: 0.6822 - val_loss: 0.9244
Epoch 6/10
1563/1563 - 6s - 4ms/step - accuracy: 0.7345 - loss: 0.7589 - val_accuracy: 0.7100 - val_loss: 0.8460
Epoch 7/10
1563/1563 - 5s - 3ms/step - accuracy: 0.7500 - loss: 0.7123 - val_accuracy: 0.6913 - val_loss: 0.9335
Epoch 8/10
1563/1563 - 6s - 4ms/step - accuracy: 0.7655 - loss: 0.6702 - val_accuracy: 0.7012 - val_loss: 0.8794
Epoch 9/10
1563/1563 - 5s - 3ms/step - accuracy: 0.7800 - loss: 0.6265 -

**8. Using PyTorch, write a script to define and train a CNN on the MNIST dataset. Include model definition, data loaders, training loop, and accuracy evaluation.**

In [3]:
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# 1. Import necessary libraries (already done above)

# 2. Define the CNN model
class SimpleCNN(nn.Module):
    def __init__(self):
        super(SimpleCNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3, stride=1, padding=1)
        self.relu = nn.ReLU()
        self.pool = nn.MaxPool2d(kernel_size=2, stride=2)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1)
        self.fc1 = nn.Linear(64 * 7 * 7, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = self.pool(self.relu(self.conv1(x)))
        x = self.pool(self.relu(self.conv2(x)))
        x = x.view(-1, 64 * 7 * 7) # Flatten the output for the dense layer
        x = self.relu(self.fc1(x))
        x = self.fc2(x)
        return x

# 3. Load and prepare the MNIST dataset
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])

train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)

train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=1000, shuffle=False)

# 4. Define loss function and optimizer
model = SimpleCNN()
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# 5. Implement the training loop
def train(model, device, train_loader, optimizer, epoch):
    model.train()
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()
        if batch_idx % 100 == 0:
            print(f'Train Epoch: {epoch} [{batch_idx * len(data)}/{len(train_loader.dataset)} ({100. * batch_idx / len(train_loader):.0f}%)]\tLoss: {loss.item():.6f}')

# 6. Evaluate the model
def test(model, device, test_loader):
    model.eval()
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            test_loss += criterion(output, target).item()  # sum up batch loss
            pred = output.argmax(dim=1, keepdim=True)  # get the index of the max log-probability
            correct += pred.eq(target.view_as(pred)).sum().item()

    test_loss /= len(test_loader.dataset)

    print(f'\nTest set: Average loss: {test_loss:.4f}, Accuracy: {correct}/{len(test_loader.dataset)} ({100. * correct / len(test_loader.dataset):.0f}%)\n')

# 7. Train and evaluate the model
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

epochs = 5
for epoch in range(1, epochs + 1):
    train(model, device, train_loader, optimizer, epoch)
    test(model, device, test_loader)

# 8. Finish task (results are printed in the test function)
print("\nPyTorch CNN training and evaluation on MNIST complete.")

100%|██████████| 9.91M/9.91M [00:00<00:00, 18.0MB/s]
100%|██████████| 28.9k/28.9k [00:00<00:00, 484kB/s]
100%|██████████| 1.65M/1.65M [00:00<00:00, 3.85MB/s]
100%|██████████| 4.54k/4.54k [00:00<00:00, 11.1MB/s]



Test set: Average loss: 0.0001, Accuracy: 9832/10000 (98%)


Test set: Average loss: 0.0000, Accuracy: 9894/10000 (99%)


Test set: Average loss: 0.0000, Accuracy: 9910/10000 (99%)


Test set: Average loss: 0.0000, Accuracy: 9906/10000 (99%)


Test set: Average loss: 0.0000, Accuracy: 9895/10000 (99%)


PyTorch CNN training and evaluation on MNIST complete.


**9. Given a custom image dataset stored in a local directory, write code using Keras ImageDataGenerator to preprocess and train a CNN model.**

In [None]:
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

'''
Define the path to the dataset

Note: Before running this code, make sure the dataset is organized in the following structure:

your_dataset_directory/
  train/
    class1/
      img1.jpg
      img2.jpg
      ...
    class2/
      imgA.jpg
      imgB.jpg
      ...
    ...
  validation/
    class1/
      img3.jpg
      img4.jpg
      ...
    class2/
      imgC.jpg
      imgD.jpg
      ...
    ...

'''
dataset_dir = "path/to/your/dataset"

# Define image dimensions and batch size
img_height = 128
img_width = 128
batch_size = 32

# 3. Instantiate ImageDataGenerator
# We can add data augmentation here (e.g., rotation_range, width_shift_range, etc.)
train_datagen = ImageDataGenerator(rescale=1./255, validation_split=0.2) # Using 20% for validation
test_datagen = ImageDataGenerator(rescale=1./255)

# 4. Prepare data generators
train_generator = train_datagen.flow_from_directory(
    dataset_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    subset='training') # Specify training subset

validation_generator = train_datagen.flow_from_directory(
    dataset_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    subset='validation') # Specify validation subset

# Get the number of classes
num_classes = train_generator.num_classes

# 5. Build the CNN model
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(img_height, img_width, 3)))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))

# 6. Compile the model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Print the model summary
model.summary()

# 7. Train the model
print("\nTraining the model...")
# Using steps_per_epoch and validation_steps with generators is recommended
steps_per_epoch = train_generator.samples // batch_size
validation_steps = validation_generator.samples // batch_size

history = model.fit(
    train_generator,
    steps_per_epoch=steps_per_epoch,
    epochs=10,  # We can adjust the number of epochs
    validation_data=validation_generator,
    validation_steps=validation_steps)

# 8. Evaluate the model (optional, as validation during training gives performance)
# loss, accuracy = model.evaluate(validation_generator, steps=validation_steps)
# print(f'\nModel Evaluation Results:')
# print(f'Validation Loss: {loss:.4f}')
# print(f'Validation Accuracy: {accuracy:.4f}')

# 9. Finish task - The training history contains the performance metrics
print("\nCNN training on custom dataset with ImageDataGenerator complete.")

**10. You are working on a web application for a medical imaging startup. Your task is to build and deploy a CNN model that classifies chest X-ray images into “Normal” and “Pneumonia” categories. Describe your end-to-end approach–from data preparation and model training to deploying the model as a web app using Streamlit.**

### Subtask:
Create dummy directories and files to simulate the chest X-ray dataset structure.


**Reasoning**:
The subtask requires creating a dummy dataset structure for a chest X-ray classification problem.



In [14]:
from PIL import Image

# 2. Define the base directory for the simulated dataset.
dataset_dir = "chest_xray_dataset"

# Define the number of dummy images to create per class per set
num_dummy_images = 50

# Define the classes
classes = ["Normal", "Pneumonia"]

# Define the sets
sets = ["train", "validation"]

# Define dummy image properties
img_height_dummy = 128
img_width_dummy = 128

# Remove the existing dummy directories and files to start fresh
import shutil
if os.path.exists(dataset_dir):
    shutil.rmtree(dataset_dir)
    print(f"Removed existing directory: {dataset_dir}")

# 3. Create the 'train' and 'validation' directories within the base directory.
# 4. Create 'Normal' and 'Pneumonia' subdirectories within both 'train' and 'validation' directories.
for data_set in sets:
    for class_name in classes:
        # Create the full path for the subdirectory
        subdir_path = os.path.join(dataset_dir, data_set, class_name)

        # Create the subdirectory if it doesn't exist
        os.makedirs(subdir_path, exist_ok=True)
        print(f"Created directory: {subdir_path}")

        # 5. For each class ('Normal' and 'Pneumonia') and each set ('train' and 'validation'), create a loop to generate a specified number of dummy image files
        for i in range(num_dummy_images):
            # 6. Inside the loop, create a dummy file path using the class name and a sequential index
            dummy_file_name = f"{class_name}_{i:03d}.jpg"
            dummy_file_path = os.path.join(subdir_path, dummy_file_name)

            # 7. Create a dummy image and save it
            try:
                # Create a blank white image
                img = Image.new('RGB', (img_width_dummy, img_height_dummy), color = 'white')
                img.save(dummy_file_path)
                # 8. Print messages indicating the creation of directories and dummy files for clarity.
                # print(f"Created dummy image: {dummy_file_path}") # Optional: uncomment to see each file created
            except IOError as e:
                print(f"Error creating dummy image {dummy_file_path}: {e}")

print("\nDummy image dataset structure created successfully.")

Removed existing directory: chest_xray_dataset
Created directory: chest_xray_dataset/train/Normal
Created directory: chest_xray_dataset/train/Pneumonia
Created directory: chest_xray_dataset/validation/Normal
Created directory: chest_xray_dataset/validation/Pneumonia

Dummy image dataset structure created successfully.


**Reasoning**:
The previous step successfully created the dummy dataset structure. The next step is to load and preprocess this dummy data using Keras ImageDataGenerator. This prepares the data for model training.



In [15]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

# Define image dimensions and batch size
img_height = 128
img_width = 128
batch_size = 32

# Instantiate ImageDataGenerator
# We can add data augmentation here (e.g., rotation_range, width_shift_range, etc.)
train_datagen = ImageDataGenerator(rescale=1./255, validation_split=0.2) # Using 20% for validation

# Prepare data generators
train_generator = train_datagen.flow_from_directory(
    dataset_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    subset='training') # Specify training subset

validation_generator = train_datagen.flow_from_directory(
    dataset_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='categorical',
    subset='validation') # Specify validation subset

# Get the number of classes
num_classes = train_generator.num_classes

print("\nData generators created using ImageDataGenerator.")

Found 160 images belonging to 2 classes.
Found 40 images belonging to 2 classes.

Data generators created using ImageDataGenerator.


**Reasoning**:
The data generators have been successfully created. The next logical step is to define and compile the CNN model architecture for the chest X-ray classification task using Keras, as this is necessary before training.



In [16]:
# Build the CNN model
model = Sequential()
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=(img_height, img_width, 3)))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dense(num_classes, activation='softmax'))

# Compile the model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Print the model summary
model.summary()

print("\nCNN model defined and compiled.")

  super().__init__(activity_regularizer=activity_regularizer, **kwargs)



CNN model defined and compiled.


**Reasoning**:
The model is defined and compiled, and the data generators are ready. The next step is to train the CNN model using the prepared data generators.



In [17]:
# Train the model
print("\nTraining the model...")
# Using steps_per_epoch and validation_steps with generators is recommended
steps_per_epoch = train_generator.samples // batch_size
validation_steps = validation_generator.samples // batch_size

history = model.fit(
    train_generator,
    steps_per_epoch=steps_per_epoch,
    epochs=5,  # Reduced epochs for demonstration with dummy data
    validation_data=validation_generator,
    validation_steps=validation_steps)

print("\nCNN model training complete.")


Training the model...
Epoch 1/5


  self._warn_if_super_not_called()


[1m5/5[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m4s[0m 170ms/step - accuracy: 0.4468 - loss: 1.2342 - val_accuracy: 0.5312 - val_loss: 0.6951
Epoch 2/5
[1m5/5[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 33ms/step - accuracy: 0.5108 - loss: 0.7050 - val_accuracy: 0.4062 - val_loss: 0.7067
Epoch 3/5
[1m5/5[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 33ms/step - accuracy: 0.5460 - loss: 0.7078 - val_accuracy: 0.5312 - val_loss: 0.6937
Epoch 4/5
[1m5/5[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 31ms/step - accuracy: 0.4965 - loss: 0.7089 - val_accuracy: 0.5000 - val_loss: 0.6967
Epoch 5/5
[1m5/5[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 31ms/step - accuracy: 0.5417 - loss: 0.6885 - val_accuracy: 0.5000 - val_loss: 0.6932

CNN model training complete.


**Reasoning**:
The model has been trained on the dummy data. The next step is to save the trained model so it can be used for deployment.



In [18]:
# Save the trained model
model_save_path = "chest_xray_cnn_model.keras"
model.save(model_save_path)

print(f"\nCNN model saved to: {model_save_path}")

# 9. Finish task - The model is trained and saved.


CNN model saved to: chest_xray_cnn_model.keras
