# CNN Architecture

Question 1: What is the role of filters and feature maps in Convolutional Neural 
Network (CNN)? 

Answer:  Filters in Convolutional Neural Networks (CNNs) are learnable matrices that slide over input data, such as images, to detect essential features like edges, textures, and patterns. As the filter moves across the input, it performs a mathematical operation called convolution, producing output values that form a new matrix known as a feature map.

### Role of Filters

- Filters act as pattern detectors, identifying localized features by focusing on specific spatial regions of the input data.
- Different filters in a layer learn to recognize different low-level features (like edges or corners), and deeper layers learn higher-level, more abstract patterns (such as shapes or specific objects).
- These filters are updated and refined during training so that they become specialized in detecting features important for the specific task, such as classification or detection.

### Role of Feature Maps

- A feature map is the output produced when a filter convolves with the input data.
- Each feature map represents the spatial location and strength of a particular detected feature across the input image.
- Multiple feature maps from multiple filters collectively capture a diverse set of features, which are then used by subsequent layers for more complex recognition tasks.

### Summary Table

| Concept      | Description                                                                   |
|--------------|-------------------------------------------------------------------------------|
| Filter       | Learnable matrix that detects local features via convolution                  |
| Feature Map  | Output matrix showing where and to what extent a feature is detected          |

Filters and feature maps together enable CNNs to transform raw input data into hierarchical, abstract feature representations essential for tasks like image classification and object detection.
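
To make the filter-to-feature-map relationship concrete, here is a minimal pure-Python sketch of one filter sliding over a tiny grayscale image (no padding, stride 1). The `convolve2d` helper, the 5x5 toy image, and the hand-written vertical-edge filter are illustrative assumptions; in a real CNN the filter values are learned during training rather than chosen by hand.

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the feature map.

    Like deep learning libraries, this computes a cross-correlation:
    the kernel is not flipped before the element-wise multiply-and-sum.
    """
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            total = 0
            for di in range(k_h):
                for dj in range(k_w):
                    total += image[i + di][j + dj] * kernel[di][dj]
            row.append(total)
        feature_map.append(row)
    return feature_map


# Toy image: dark on the left (0), bright on the right (1).
image = [
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
]

# Hand-written vertical-edge detector (responds to dark-to-bright transitions).
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d(image, vertical_edge)
for row in feature_map:
    print(row)
# [0, 3, 3]
# [0, 3, 3]
# [0, 3, 3]
```

The zeros in the first column mark windows that contain no edge, while the larger values mark where (and how strongly) the edge pattern was found.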



Question 2: Explain the concepts of padding and stride in CNNs (Convolutional Neural 
Networks). How do they affect the output dimensions of feature maps?
 
Answer:

***

## Padding
**Padding** is when you add extra pixels around the border of your input before applying convolution. Most often, the padding pixels are zeros (called *zero-padding*).

**Why use padding?**
- Without padding, the feature map shrinks after every convolution, which can quickly lose border information.
- Padding helps preserve the spatial size of the input and lets the filter “see” the edges of the image.

Types of padding:
- **Valid padding:** No padding, so the output is smaller than the input.
- **Same padding:** Adds just enough padding so the output size matches the input size (very common in  CNNs).[1][2]
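
As a small illustration of zero-padding, the sketch below surrounds a 3x3 grid with a one-pixel border of zeros. The `zero_pad` helper and the sample values are hypothetical and exist only for this example.

```python
def zero_pad(image, pad):
    """Return a copy of a 2D list `image` with `pad` rows/columns of zeros on every side."""
    width = len(image[0]) + 2 * pad
    padded = [[0] * width for _ in range(pad)]          # top border
    for row in image:
        padded.append([0] * pad + list(row) + [0] * pad)  # left/right border
    padded.extend([0] * width for _ in range(pad))      # bottom border
    return padded


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

padded = zero_pad(image, pad=1)
for row in padded:
    print(row)
# [0, 0, 0, 0, 0]
# [0, 1, 2, 3, 0]
# [0, 4, 5, 6, 0]
# [0, 7, 8, 9, 0]
# [0, 0, 0, 0, 0]
```

A 3x3 filter applied with stride 1 to this padded 5x5 grid fits in 3 positions per dimension, so the output is again 3x3, which is what "same" padding achieves.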

***

## Stride
**Stride** is how many pixels the filter jumps as it slides across the input.

**Why use stride?**
- Larger stride means filters skip more pixels, producing a **smaller** output feature map.
- Smaller stride (usually 1) means the filter checks every position, producing a larger output.[2][3]
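
One way to picture stride is to list the positions where the filter is placed along a single dimension. The sketch below does that for an input of width 7 and a filter of width 3; the helper name `filter_positions` and these sizes are assumptions chosen for illustration.

```python
def filter_positions(input_size, filter_size, stride):
    """Left-edge indices where the filter is applied along one dimension (no padding)."""
    return list(range(0, input_size - filter_size + 1, stride))


for stride in (1, 2, 3):
    positions = filter_positions(input_size=7, filter_size=3, stride=stride)
    print(f"stride={stride}: positions={positions} -> output size {len(positions)}")
# stride=1: positions=[0, 1, 2, 3, 4] -> output size 5
# stride=2: positions=[0, 2, 4] -> output size 3
# stride=3: positions=[0, 3] -> output size 2
```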

***

## How They Affect Feature Map Size

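Here’s a simple formula for one dimension (say, width):

$$
\text{Output Size} = \left\lfloor \frac{\text{Input Size} - \text{Filter Size} + 2 \times \text{Padding}}{\text{Stride}} \right\rfloor + 1
$$

The floor (rounding down) only matters when the stride does not divide the numerator evenly.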

This shows:
- More padding **increases** output size.
- Bigger stride **decreases** output size.
- Larger filter **decreases** output size.[3][4]
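
The output-size formula can be checked with a few lines of Python. This is a sketch under one stated assumption: the division is rounded down, which is the usual convention when the stride does not divide evenly. The function name and the 28-pixel input are illustrative.

```python
def conv_output_size(input_size, filter_size, padding=0, stride=1):
    """Output size along one dimension, with the division rounded down."""
    return (input_size - filter_size + 2 * padding) // stride + 1


print(conv_output_size(28, 3))                       # 26  (valid padding)
print(conv_output_size(28, 3, padding=1))            # 28  ("same" padding for a 3x3 filter)
print(conv_output_size(28, 3, padding=1, stride=2))  # 14  (stride 2 roughly halves the size)
print(conv_output_size(28, 5))                       # 24  (larger filter, smaller output)
```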

***

### Let's Review:  
- **Padding** keeps the output big (helps avoid losing information at edges).
- **Stride** controls how much the output shrinks (a larger stride gives a smaller feature map).

Question 3: Define receptive field in the context of CNNs. Why is it important for deep 
architectures? 

Answer: 

***

## What is a Receptive Field?

The **receptive field** of a neuron (or unit) in a convolutional neural network is the size of the region in the original input image that affects that neuron's output. In other words, it’s the area of the input data that a particular feature in a certain layer is “looking at.”[2][4]

- In initial layers, a neuron's receptive field is small—just a few pixels.
- In deeper layers, thanks to stacked convolutions and pooling, each neuron’s receptive field grows to cover a larger chunk of the input image.

## Why is Receptive Field Important for Deep Architectures?

The **receptive field determines**:
- **How much context a neuron sees:** A small field captures details (edges, small patterns); a large field captures global structure (shapes, whole objects).[2]
- **Ability to detect large objects:** If the receptive field is too small, deeper units can't “see” enough of the image to recognize bigger features.
- **Balancing detail and context:** Effective deep networks carefully expand the receptive field so higher layers can combine both fine details and broad context, which is crucial for tasks like semantic segmentation, object detection, and scene understanding.[4][2]

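For very deep CNNs, the receptive field keeps growing with depth: linearly when every layer uses stride 1, and much faster (multiplicatively) when strided convolutions, pooling, or dilation are stacked. This lets the deepest layers capture relationships spanning almost the whole image.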
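
The growth of the receptive field can be computed layer by layer. The sketch below uses the standard recurrence (each layer adds `(kernel_size - 1)` times the current spacing between units, and the spacing is multiplied by the stride). The function name and the example stacks are hypothetical, and dilation is not modelled.

```python
def receptive_field(layers):
    """Receptive field (in input pixels, one dimension) after a stack of layers.

    `layers` is a list of (kernel_size, stride) pairs, ordered from input to output.
    """
    rf = 1    # a single input pixel sees only itself
    jump = 1  # distance in input pixels between neighbouring units of the current layer
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf


# Hypothetical stack: 3x3 conv -> 2x2 max-pool (stride 2) -> 3x3 conv
print(receptive_field([(3, 1), (2, 2), (3, 1)]))  # 8

# Five 3x3 convolutions: stride 1 everywhere vs. stride 2 everywhere
print(receptive_field([(3, 1)] * 5))  # 11
print(receptive_field([(3, 2)] * 5))  # 63
```

With stride 1 each extra 3x3 layer adds only 2 pixels, whereas striding makes every later layer count for more input pixels.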

***

### Check Your Understanding

If a CNN has 3 layers, each using a 3x3 filter and stride 1, how big is the receptive field of a neuron in the third layer?  
(Think: Each layer lets the neuron “see” a little more of the input than the layer before it.)

Question 4: Discuss how filter size and stride influence the number of parameters in a 
CNN.
 
Answer:

***

## Filter Size and Stride: Influence on Number of Parameters

**Filter size** directly influences the number of parameters in a CNN layer.  
**Stride** changes how the filter moves across the input, but does NOT influence the number of parameters.

### How Filter Size Affects Parameters

Each **filter** (also known as a kernel) is a small matrix of weights. The number of parameters in a convolutional layer is calculated as:

$$
\text{Parameters} = (k_w \times k_h \times C_{\text{in}} + 1) \times C_{\text{out}}
$$

Where:
- $$ k_w $$: filter width
- $$ k_h $$: filter height
- $$ C_{\text{in}} $$: number of input channels
- $$ C_{\text{out}} $$: number of filters (output channels)
- $$ +1 $$: bias term for each filter

**If you increase the filter size (e.g., from 3x3 to 5x5), you directly increase the number of weights in each filter, and so increase the total parameters.**[1]

### How Stride Affects Parameters

**Stride** is how far the filter jumps at each move. Increasing stride changes the size of the output feature map, but **does NOT change the number of weights (parameters) in the filter itself**.[3][7][1]
- The stride affects only how many times the filter is applied over the image—not the size or number of weights.
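
The parameter formula translates directly into code. In the sketch below the function name and the 64-channel layer are hypothetical; `stride` is included as an argument purely to show that it never enters the calculation.

```python
def conv_layer_params(k_w, k_h, c_in, c_out, stride=1):
    """Parameter count (weights + biases) of a convolutional layer.

    `stride` is accepted only to make the point explicit: it is never used.
    """
    return (k_w * k_h * c_in + 1) * c_out


# Hypothetical layer: 64 input channels, 64 filters
print(conv_layer_params(3, 3, 64, 64))            # 36928
print(conv_layer_params(5, 5, 64, 64))            # 102464
print(conv_layer_params(3, 3, 64, 64, stride=2))  # 36928 (same as stride 1)
```

Going from 3x3 to 5x5 filters nearly triples the parameter count here, while changing the stride leaves it untouched.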

***

## Mnemonic:
- *Filter size* controls the **number of weights** (parameters).
- *Stride* controls **how many times you use those weights** (output size), not how many weights you have.

***

**Check Your Understanding:**  
If you have a convolutional layer with 32 filters, input channels = 3, and filter size = 3x3,  
how many parameters does the layer have?

Question 5: Compare and contrast different CNN-based architectures like LeNet, 
AlexNet, and VGG in terms of depth, filter sizes, and performance.  

Answer: The following compares three classic CNN architectures (**LeNet**, **AlexNet**, and **VGG**), focusing on **depth**, **filter sizes**, and **performance**.

***

## LeNet

- **Depth:** Very shallow; LeNet-5 has **7 layers** (including convolutional, pooling, and fully connected layers).[3][4]
- **Filter sizes:** Uses relatively large 5x5 filters in its convolutional layers, with subsampling (average pooling) layers between them.[3]
- **Performance:** Designed for digit recognition (MNIST dataset). LeNet is **fast, lightweight**, and ideal for limited-resource environments; achieves respectable accuracy for simple tasks but not state-of-the-art for complex images.[5]
- **Activation:** 'tanh' functions.

## AlexNet

- **Depth:** More advanced; **8 weight layers** (5 convolutional + 3 fully connected), substantially deeper than LeNet.[4][3]
- **Filter sizes:** First layer uses **large (11x11)** filters, then 5x5 and 3x3 in later layers.[3]
- **Performance:** Revolutionized large-scale image classification (ImageNet). Achieves **much higher accuracy** than LeNet. Faster training due to ReLU activations. Requires greater memory and compute, but balances complexity and feature extraction capability.[4][5]
- **Activation:** 'ReLU' for fast training.
- **Feature:** Introduced innovations such as dropout, data augmentation, and GPU-training.

## VGG

- **Depth:** Very deep; most popular versions are **VGG-16** and **VGG-19** (16 or 19 weight layers: 13 or 16 convolutional layers plus 3 fully connected).[4][3]
- **Filter sizes:** Uses only **very small (3x3)** filters, stacked in depth.[3][4]
- **Performance:** Outstanding accuracy (top-5 on ImageNet: ~92.7% for VGG16), but **much heavier** computational requirements. Training and inference times are significantly longer, so it is best suited to powerful hardware.[5][3]
- **Activation:** 'ReLU' throughout.

***

### Summary Table

| Architecture | Depth          | Filter Sizes      | Performance             | Resource Use      |
|--------------|---------------|-------------------|-------------------------|-------------------|
| LeNet        | 7 layers[3][4]       | Larger (mainly 5x5)[3]   | For simple images (fast)[5]  | Low               |
| AlexNet      | 8 layers[3][4]       | Starts large (11x11), then smaller[3] | Great for large, varied images[5] | Medium            |
| VGG          | 16–19 layers[3][4]   | Small (3x3) throughout[3][4] | Top accuracy, complex images[3][5] | Very high         |
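
To see what the different filter sizes mean in terms of weights, the sketch below applies the convolutional parameter formula, (k × k × C_in + 1) × C_out, to the first convolutional layer of each network. The channel counts used here (LeNet-5: 1 input channel and 6 filters; AlexNet: 3 and 96; VGG-16: 3 and 64) are taken from the usual descriptions of these architectures and are not stated in this answer, so treat them as assumptions.

```python
def conv_layer_params(k, c_in, c_out):
    """Weights + biases of a conv layer with square k x k filters."""
    return (k * k * c_in + 1) * c_out


# (name, filter size, input channels, number of filters) for the first conv layer.
first_layers = [
    ("LeNet-5", 5, 1, 6),
    ("AlexNet", 11, 3, 96),
    ("VGG-16", 3, 3, 64),
]

for name, k, c_in, c_out in first_layers:
    print(f"{name}: {k}x{k} filters -> {conv_layer_params(k, c_in, c_out)} parameters")
# LeNet-5: 5x5 filters -> 156 parameters
# AlexNet: 11x11 filters -> 34944 parameters
# VGG-16: 3x3 filters -> 1792 parameters
```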

***

**Quick Review:**  
- **LeNet:** Shallow, simple, quick, small filters for small tasks.  
- **AlexNet:** Moderate depth, mixes filter sizes, great leap in image classification.  
- **VGG:** Very deep, all small filters, best accuracy of the three at the highest compute cost. [7](https://ieeexplore.ieee.org/document/10537732/)

Question 6: Using keras, build and train a simple CNN model on the MNIST dataset 
from scratch. Include code for module creation, compilation, training, and evaluation. 
(Include your Python code and output in the code box below.)

 
Answer:

In [5]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# 1. Load and preprocess MNIST data
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
# Shape: (60000, 28, 28), (10000, 28, 28)

# Add channel dimension and normalize to [0,1]
x_train = x_train.astype("float32") / 255.0
x_test  = x_test.astype("float32") / 255.0
x_train = x_train[..., None]   # (60000, 28, 28, 1)
x_test  = x_test[..., None]    # (10000, 28, 28, 1)

num_classes = 10

# 2. Build CNN model
model = keras.Sequential(
    [
        layers.Conv2D(32, (3, 3), activation="relu", input_shape=(28, 28, 1)),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(num_classes, activation="softmax"),
    ]
)

model.summary()  # prints the layer table; it returns None, so no print() wrapper is needed

# 3. Compile model
model.compile(
    optimizer="adam",
    loss="sparse_categorical_crossentropy",
    metrics=["accuracy"],
)

# 4. Train model
history = model.fit(
    x_train,
    y_train,
    epochs=5,
    batch_size=128,
    validation_split=0.1,
    verbose=2,
)

# 5. Evaluate on test set
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
print(f"Test loss: {test_loss:.4f}")
print(f"Test accuracy: {test_acc:.4f}")


None
Epoch 1/5
422/422 - 5s - 13ms/step - accuracy: 0.9255 - loss: 0.2496 - val_accuracy: 0.9763 - val_loss: 0.0779
Epoch 2/5
422/422 - 4s - 9ms/step - accuracy: 0.9809 - loss: 0.0637 - val_accuracy: 0.9858 - val_loss: 0.0507
Epoch 3/5
422/422 - 4s - 9ms/step - accuracy: 0.9854 - loss: 0.0462 - val_accuracy: 0.9877 - val_loss: 0.0414
Epoch 4/5
422/422 - 4s - 9ms/step - accuracy: 0.9889 - loss: 0.0351 - val_accuracy: 0.9880 - val_loss: 0.0380
Epoch 5/5
422/422 - 4s - 9ms/step - accuracy: 0.9909 - loss: 0.0287 - val_accuracy: 0.9893 - val_loss: 0.0368
Test loss: 0.0288
Test accuracy: 0.9902
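
The recorded output shows `None` where the model summary table would normally appear, so the sketch below recomputes the layer shapes and parameter counts by hand. It assumes the Keras defaults used in the code above ("valid" padding and stride 1 for `Conv2D`, pooling stride equal to the pool size); the helper `conv_out` is introduced only for this example.

```python
def conv_out(size, kernel, padding=0, stride=1):
    """Output size along one dimension (division rounded down)."""
    return (size - kernel + 2 * padding) // stride + 1


size, channels = 28, 1  # MNIST input: 28x28, one channel
total_params = 0

# (number of filters, kernel size) for the two Conv2D layers; each is followed by 2x2 max-pooling.
for filters, kernel in [(32, 3), (64, 3)]:
    params = (kernel * kernel * channels + 1) * filters
    size = conv_out(size, kernel)           # Conv2D with "valid" padding
    print(f"Conv2D  -> {size}x{size}x{filters}, params={params}")
    size = conv_out(size, 2, stride=2)      # MaxPooling2D((2, 2)) has no parameters
    print(f"MaxPool -> {size}x{size}x{filters}")
    channels = filters
    total_params += params

features = size * size * channels           # Flatten
for units in (64, 10):                      # the two Dense layers
    params = (features + 1) * units
    print(f"Dense({units}), params={params}")
    total_params += params
    features = units

print("Total parameters:", total_params)
```

Most of the weights sit in the first `Dense` layer, which is typical for small CNNs that flatten directly into a fully connected layer.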


Question 7: Load and preprocess the CIFAR-10 dataset using Keras, and create a 
CNN model to classify RGB images. Show your preprocessing and architecture. 
(Include your Python code and output in the code box below.) 

Answer:  

In [6]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

# 1. Load CIFAR-10 dataset
from tensorflow.keras.datasets import cifar10
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

print("Train images:", x_train.shape)   # (50000, 32, 32, 3)
print("Test images:", x_test.shape)     # (10000, 32, 32, 3)

# 2. Preprocess: scale pixels to [0, 1] and one-hot encode labels
x_train = x_train.astype("float32") / 255.0
x_test  = x_test.astype("float32") / 255.0

num_classes = 10
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test  = keras.utils.to_categorical(y_test, num_classes)

# 3. Create CNN model for RGB images
model = keras.Sequential(
    [
        layers.Conv2D(32, (3, 3), activation="relu", padding="same",
                      input_shape=(32, 32, 3)),
        layers.MaxPooling2D((2, 2)),

        layers.Conv2D(64, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),

        layers.Conv2D(128, (3, 3), activation="relu", padding="same"),
        layers.MaxPooling2D((2, 2)),

        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ]
)

model.summary()  # prints the layer table; it returns None, so no print() wrapper is needed

# 4. Compile the model
model.compile(
    optimizer="adam",
    loss="categorical_crossentropy",
    metrics=["accuracy"],
)

# 5. Train the model
history = model.fit(
    x_train,
    y_train,
    epochs=10,
    batch_size=128,
    validation_split=0.1,
    verbose=2,
)

# 6. Evaluate on test data
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
print(f"Test loss: {test_loss:.4f}")
print(f"Test accuracy: {test_acc:.4f}")


Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170498071/170498071 ━━━━━━━━━━━━━━━━━━━━ 79s 0us/step
Train images: (50000, 32, 32, 3)
Test images: (10000, 32, 32, 3)


None
Epoch 1/10
352/352 - 10s - 28ms/step - accuracy: 0.3452 - loss: 1.7642 - val_accuracy: 0.4996 - val_loss: 1.3772
Epoch 2/10
352/352 - 7s - 21ms/step - accuracy: 0.4948 - loss: 1.3982 - val_accuracy: 0.5752 - val_loss: 1.1645
Epoch 3/10
352/352 - 7s - 21ms/step - accuracy: 0.5631 - loss: 1.2332 - val_accuracy: 0.6396 - val_loss: 1.0273
Epoch 4/10
352/352 - 7s - 21ms/step - accuracy: 0.6082 - loss: 1.1069 - val_accuracy: 0.6642 - val_loss: 0.9606
Epoch 5/10
352/352 - 7s - 21ms/step - accuracy: 0.6428 - loss: 1.0210 - val_accuracy: 0.6954 - val_loss: 0.8790
Epoch 6/10
352/352 - 7s - 21ms/step - accuracy: 0.6687 - loss: 0.9526 - val_accuracy: 0.7040 - val_loss: 0.8546
Epoch 7/10
352/352 - 7s - 21ms/step - accuracy: 0.6894 - loss: 0.8963 - val_accuracy: 0.7264 - val_loss: 0.7963
Epoch 8/10
352/352 - 7s - 21ms/step - accuracy: 0.7067 - loss: 0.8435 - val_accuracy: 0.7448 - val_loss: 0.7620
Epoch 9/10
352/352 - 7s - 21ms/step - accuracy: 0.7216 - loss: 0.8013 - val_accuracy: 0.7454 - val
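
The `352/352` in the training log is the number of batches per epoch. The sketch below reproduces that arithmetic from the dataset size, `validation_split`, and `batch_size`; it illustrates the calculation only and is not Keras's internal code, and the function name is hypothetical.

```python
import math


def steps_per_epoch(num_samples, validation_split, batch_size):
    """Batches per epoch once the validation fraction has been held out."""
    num_val = int(num_samples * validation_split)
    num_train = num_samples - num_val
    return math.ceil(num_train / batch_size)


print(steps_per_epoch(50000, 0.1, 128))  # 352 (CIFAR-10: 45,000 training images)
print(steps_per_epoch(60000, 0.1, 128))  # 422 (MNIST: 54,000 training images)
```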

Question 8: Using PyTorch, write a script to define and train a CNN on the MNIST 
dataset. Include model definition, data loaders, training loop, and accuracy evaluation. 
(Include your Python code and output in the code box below.) 

Answer:  

In [12]:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader
from torchvision import datasets, transforms

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("Device:", device)

# 1. Data transforms and loaders
transform = transforms.Compose([
    transforms.ToTensor(),                       # [0,1]
    transforms.Normalize((0.1307,), (0.3081,))   # standard MNIST normalization
])

train_dataset = datasets.MNIST(
    root="./data",
    train=True,
    transform=transform,
    download=True,
)

test_dataset = datasets.MNIST(
    root="./data",
    train=False,
    transform=transform,
    download=True,
)

train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader  = DataLoader(test_dataset,  batch_size=1000, shuffle=False)

# 2. CNN model definition
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3, padding=1)   # 1x28x28 -> 32x28x28
        self.pool1 = nn.MaxPool2d(2, 2)                          # 32x14x14
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1) # 64x14x14
        self.pool2 = nn.MaxPool2d(2, 2)                          # 64x7x7
        self.fc1 = nn.Linear(64 * 7 * 7, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = torch.relu(self.conv1(x))
        x = self.pool1(x)
        x = torch.relu(self.conv2(x))
        x = self.pool2(x)
        x = x.view(x.size(0), -1)  # flatten
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)
        return x

model = CNN().to(device)
print(model)

# 3. Loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=1e-3)

# 4. Training loop
num_epochs = 5
for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0
    correct = 0
    total = 0

    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)

        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item() * images.size(0)

        _, preds = torch.max(outputs, 1)
        correct += (preds == labels).sum().item()
        total += labels.size(0)

    train_loss = running_loss / total
    train_acc = correct / total

    print(f"Epoch [{epoch+1}/{num_epochs}] "
          f"Train Loss: {train_loss:.4f}  Train Acc: {train_acc:.4f}")

# 5. Evaluation on test set
model.eval()
correct = 0
total = 0

with torch.no_grad():
    for images, labels in test_loader:
        images, labels = images.to(device), labels.to(device)
        outputs = model(images)
        _, preds = torch.max(outputs, 1)
        correct += (preds == labels).sum().item()
        total += labels.size(0)

test_acc = correct / total
print(f"Test Accuracy: {test_acc:.4f}")


Device: cpu


100%|█████████████████████████████████████████████████████████████████████████████| 9.91M/9.91M [00:05<00:00, 1.76MB/s]
100%|█████████████████████████████████████████████████████████████████████████████| 28.9k/28.9k [00:00<00:00, 37.9kB/s]
100%|██████████████████████████████████████████████████████████████████████████████| 1.65M/1.65M [00:02<00:00, 796kB/s]
100%|█████████████████████████████████████████████████████████████████████████████| 4.54k/4.54k [00:00<00:00, 1.14MB/s]


CNN(
  (conv1): Conv2d(1, 32, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (pool1): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (conv2): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (pool2): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (fc1): Linear(in_features=3136, out_features=128, bias=True)
  (fc2): Linear(in_features=128, out_features=10, bias=True)
)
Epoch [1/5] Train Loss: 0.1385  Train Acc: 0.9587
Epoch [2/5] Train Loss: 0.0430  Train Acc: 0.9869
Epoch [3/5] Train Loss: 0.0287  Train Acc: 0.9911
Epoch [4/5] Train Loss: 0.0200  Train Acc: 0.9933
Epoch [5/5] Train Loss: 0.0161  Train Acc: 0.9945
Test Accuracy: 0.9854
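
The training loop multiplies each batch loss by the batch size before dividing by the total sample count. `nn.CrossEntropyLoss` returns the mean over a batch by default, and 60,000 training images with a batch size of 64 leave a final batch of only 32, so a plain average of batch means would over-weight that last batch. The sketch below shows the difference with made-up loss values; the function name is hypothetical.

```python
def epoch_loss(batch_mean_losses, batch_sizes):
    """Per-sample average loss, weighting each batch mean by its batch size."""
    weighted_sum = sum(loss * size for loss, size in zip(batch_mean_losses, batch_sizes))
    return weighted_sum / sum(batch_sizes)


# Made-up numbers: two full batches of 64 and a final partial batch of 32.
batch_sizes = [64, 64, 32]
batch_mean_losses = [0.5, 0.3, 0.9]

weighted = epoch_loss(batch_mean_losses, batch_sizes)
naive = sum(batch_mean_losses) / len(batch_mean_losses)
print(f"size-weighted: {weighted:.4f}")  # size-weighted: 0.5000
print(f"naive mean:    {naive:.4f}")     # naive mean:    0.5667
```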


Question 9: Given a custom image dataset stored in a local directory, write code using 
Keras ImageDataGenerator to preprocess and train a CNN model. 
(Include your Python code and output in the code box below.) 

Answer: 

In [16]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import layers, models

# 1. Paths and basic parameters
train_dir = r"E:\Data Science\Deep_Learning\Module-3\animals\Train"
val_dir   = r"E:\Data Science\Deep_Learning\Module-3\animals\Validation"

img_height, img_width = 150, 150
batch_size = 32

# 2. ImageDataGenerators (preprocessing + augmentation)
train_datagen = ImageDataGenerator(
    rescale=1.0/255.0,
    rotation_range=20,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.2,
    horizontal_flip=True
)

val_datagen = ImageDataGenerator(
    rescale=1.0/255.0
)

# 3. Create generators from local directories
train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode="categorical",   # use "binary" for 2 classes
    shuffle=True
)

val_generator = val_datagen.flow_from_directory(
    val_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode="categorical",
    shuffle=False
)

num_classes = train_generator.num_classes
print("Classes:", train_generator.class_indices)

# 4. Define a simple CNN model
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu",
                  input_shape=(img_height, img_width, 3)),
    layers.MaxPooling2D(2, 2),

    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D(2, 2),

    layers.Conv2D(128, (3, 3), activation="relu"),
    layers.MaxPooling2D(2, 2),

    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(num_classes, activation="softmax"),
])

model.summary()  # prints the layer table; it returns None, so no print() wrapper is needed

# 5. Compile the model
model.compile(
    optimizer="adam",
    loss="categorical_crossentropy",
    metrics=["accuracy"]
)

# 6. Train the model
epochs = 10
history = model.fit(
    train_generator,
    epochs=epochs,
    validation_data=val_generator
)

# 7. Evaluate on validation set
val_loss, val_acc = model.evaluate(val_generator, verbose=0)
print(f"Validation loss: {val_loss:.4f}")
print(f"Validation accuracy: {val_acc:.4f}")


Found 1000 images belonging to 2 classes.
Found 1000 images belonging to 2 classes.
Classes: {'cat': 0, 'dog': 1}


None
Epoch 1/10
32/32 ━━━━━━━━━━━━━━━━━━━━ 64s 2s/step - accuracy: 0.5690 - loss: 0.7619 - val_accuracy: 0.7320 - val_loss: 0.5386
Epoch 2/10
32/32 ━━━━━━━━━━━━━━━━━━━━ 28s 874ms/step - accuracy: 0.7620 - loss: 0.5057 - val_accuracy: 0.8280 - val_loss: 0.3774
Epoch 3/10
32/32 ━━━━━━━━━━━━━━━━━━━━ 28s 871ms/step - accuracy: 0.8010 - loss: 0.4526 - val_accuracy: 0.8760 - val_loss: 0.3068
Epoch 4/10
32/32 ━━━━━━━━━━━━━━━━━━━━ 28s 885ms/step - accuracy: 0.8370 - loss: 0.3970 - val_accuracy: 0.8890 - val_loss: 0.2948
Epoch 5/10
32/32 ━━━━━━━━━━━━━━━━━━━━ 30s 934ms/step - accuracy: 0.8550 - loss: 0.3387 - val_accuracy: 0.9460 - val_loss: 0.2253
Epoch 6/10
32/32 ━━━━━━━━━━━━━━━━━━━━ 29s 899ms/step - accuracy: 0.8990 - loss: 0.2452 - val_accuracy: 0.9560 - val_loss: 0.1123
Epoch 7/10
32/32
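
`flow_from_directory` expects one subfolder per class and, by default, assigns integer labels in alphanumeric order of the folder names, which is why the output lists `{'cat': 0, 'dog': 1}`. The sketch below mimics that convention with the standard library only; it is an illustration of the directory layout, not the Keras implementation, and `infer_class_indices` is a hypothetical helper.

```python
import os
import tempfile


def infer_class_indices(directory):
    """Map each subdirectory name to an integer label, in sorted order."""
    class_names = sorted(
        name for name in os.listdir(directory)
        if os.path.isdir(os.path.join(directory, name))
    )
    return {name: index for index, name in enumerate(class_names)}


# Build a throwaway directory tree with the assumed layout: <root>/<class_name>/<images>
with tempfile.TemporaryDirectory() as root:
    for class_name in ("dog", "cat"):
        os.makedirs(os.path.join(root, class_name))
    class_indices = infer_class_indices(root)

print(class_indices)  # {'cat': 0, 'dog': 1}
```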

Question 10: You are working on a web application for a medical imaging startup. Your 
task is to build and deploy a CNN model that classifies chest X-ray images into “Normal” 
and “Pneumonia” categories. Describe your end-to-end approach–from data preparation 
and model training to deploying the model as a web app using Streamlit. 
(Include your Python code and output in the code box below.) 


Answer: 

In [None]:
# train_chest_xray_cnn.py
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import os

base_dir   = r"D:\datasets\chest_xray"   # change to your path
train_dir  = os.path.join(base_dir, "train")
val_dir    = os.path.join(base_dir, "val")
test_dir   = os.path.join(base_dir, "test")

img_height, img_width = 224, 224
batch_size = 32

# 1. Data generators with augmentation
train_datagen = ImageDataGenerator(
    rescale=1.0/255.0,
    rotation_range=15,
    width_shift_range=0.1,
    height_shift_range=0.1,
    zoom_range=0.1,
    horizontal_flip=True
)

val_datagen = ImageDataGenerator(rescale=1.0/255.0)
test_datagen = ImageDataGenerator(rescale=1.0/255.0)

train_gen = train_datagen.flow_from_directory(
    train_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode="binary"
)

val_gen = val_datagen.flow_from_directory(
    val_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode="binary",
    shuffle=False
)

test_gen = test_datagen.flow_from_directory(
    test_dir,
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode="binary",
    shuffle=False
)

print("Class indices:", train_gen.class_indices)  # {'NORMAL': 0, 'PNEUMONIA': 1}

# 2. CNN model definition
model = keras.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu",
                  input_shape=(img_height, img_width, 3)),
    layers.MaxPooling2D(2, 2),

    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D(2, 2),

    layers.Conv2D(128, (3, 3), activation="relu"),
    layers.MaxPooling2D(2, 2),

    layers.Flatten(),
    layers.Dense(128, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(1, activation="sigmoid")  # binary output
])

model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=1e-4),
    loss="binary_crossentropy",
    metrics=["accuracy"]
)

model.summary()  # prints the layer table; it returns None, so no print() wrapper is needed

# 3. Train the model
callbacks = [
    keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=5, restore_best_weights=True
    )
]

history = model.fit(
    train_gen,
    epochs=20,
    validation_data=val_gen,
    callbacks=callbacks
)

# 4. Evaluate on test set
test_loss, test_acc = model.evaluate(test_gen, verbose=0)
print(f"Test loss: {test_loss:.4f}")
print(f"Test accuracy: {test_acc:.4f}")

# 5. Save model and class mapping for deployment
model.save("chest_xray_cnn.h5")

# Save labels order
import json
with open("class_indices.json", "w") as f:
    json.dump(train_gen.class_indices, f)
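
At deployment time (for example inside the Streamlit app), the saved class mapping is what turns the model's single sigmoid output back into a readable label. The sketch below covers only that decoding step, without loading the model or building the UI. The helper `decode_prediction`, the 0.5 threshold, and the example probabilities are assumptions; the mapping mirrors the class indices noted in the training script's comment.

```python
import json


def decode_prediction(probability, class_indices, threshold=0.5):
    """Turn a sigmoid output into (label, confidence).

    With class_mode="binary", the sigmoid output is the probability of the class
    whose index is 1; the class with index 0 gets the complement.
    """
    index_to_label = {index: label for label, index in class_indices.items()}
    if probability >= threshold:
        return index_to_label[1], probability
    return index_to_label[0], 1.0 - probability


# Same structure as the class_indices.json written by the training script
# (values assumed from the comment next to the print statement).
class_indices = json.loads('{"NORMAL": 0, "PNEUMONIA": 1}')

print(decode_prediction(0.91, class_indices))  # ('PNEUMONIA', 0.91)
label, confidence = decode_prediction(0.20, class_indices)
print(label, round(confidence, 2))             # NORMAL 0.8
```

Keeping the threshold as a parameter makes it easy to trade sensitivity against false alarms once the model has been validated.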
