Question 1: What is the role of filters and feature maps in Convolutional Neural
Network (CNN)?

Question 2: Explain the concepts of padding and stride in CNNs (Convolutional Neural
Networks). How do they affect the output dimensions of feature maps?

Question 3: Define receptive field in the context of CNNs. Why is it important for deep
architectures?

Question 4: Discuss how filter size and stride influence the number of parameters in a
CNN.

Question 5: Compare and contrast different CNN-based architectures like LeNet,
AlexNet, and VGG in terms of depth, filter sizes, and performance.

Question 6: Using keras, build and train a simple CNN model on the MNIST dataset
from scratch. Include code for module creation, compilation, training, and evaluation.
(Include your Python code and output in the code box below.)

Question 7: Load and preprocess the CIFAR-10 dataset using Keras, and create a
CNN model to classify RGB images. Show your preprocessing and architecture.
(Include your Python code and output in the code box below.)

Question 8: Using PyTorch, write a script to define and train a CNN on the MNIST
dataset. Include model definition, data loaders, training loop, and accuracy evaluation.
(Include your Python code and output in the code box below.)

Question 9: Given a custom image dataset stored in a local directory, write code using
Keras ImageDataGenerator to preprocess and train a CNN model.
(Include your Python code and output in the code box below.)

Question 10: You are working on a web application for a medical imaging startup. Your
task is to build and deploy a CNN model that classifies chest X-ray images into “Normal”
and “Pneumonia” categories. Describe your end-to-end approach–from data preparation
and model training to deploying the model as a web app using Streamlit.
(Include your Python code and output in the code box below.)

# CNN Architecture Assignment

# Question 1: What is the role of filters and feature maps in Convolutional Neural Network (CNN)?
# --------------------------------------------------
Filter (kernel): a small learnable weight tensor that slides across the input and computes a dot product with each local patch to detect local patterns. Early layers typically respond to edges and textures; deeper layers respond to shapes and object parts.  
Feature map: the 2-D output produced by one filter; each value shows how strongly the corresponding spatial location matches the filter’s pattern. Stacking the maps of all filters in a layer gives a 3-D output volume.  
Many filters ⇒ many feature maps ⇒ a rich hierarchical representation that the network learns automatically.
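
To make the sliding dot product concrete, here is a minimal pure-Python sketch. The helper name `cross_correlate2d`, the toy image, and the hand-written vertical-edge filter are illustrative assumptions; in a real CNN the filter weights are learned rather than chosen by hand.

```python
def cross_correlate2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Dot product between the kernel and the image patch under it.
            total = sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(k_h)
                for dj in range(k_w)
            )
            row.append(total)
        feature_map.append(row)
    return feature_map


# 4x4 toy image: dark left half (0), bright right half (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
# Hand-written filter that responds to a dark-to-bright vertical edge.
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = cross_correlate2d(image, vertical_edge)
for row in feature_map:
    print(row)  # each row is [0, 2, 0]: the response peaks at the edge column
```

The feature map is zero in the flat regions and positive only where the patch straddles the edge, which is the sense in which each value measures how well a location matches the filter.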

# --------------------------------------------------
# Question 2: Explain the concepts of padding and stride in CNNs (Convolutional Neural Networks). How do they affect the output dimensions of feature maps?
# --------------------------------------------------
Padding: extra pixels (usually 0) added around the input before convolution.  
  – “same” padding: enough zeros are added that the output spatial size equals the input size at stride 1 (⌈I/S⌉ in general).  
  – “valid” padding: no padding; output shrinks.  

Stride: step size with which the filter slides.  
  – stride = 1 ⇒ slide one pixel at a time (largest output).  
  – stride > 1 ⇒ down-samples the output (smaller feature map).  

Output spatial size formula (square case):  
O = ⌊(I − K + 2P)/S⌋ + 1  
I=input size, K=kernel size, P=padding, S=stride.
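
The formula can be checked with a few lines of Python. This is a small sketch; the function name `conv_output_size` is an assumption made for illustration, and the padding is assumed to be symmetric (P pixels on each side).

```python
def conv_output_size(input_size, kernel_size, padding=0, stride=1):
    """O = floor((I - K + 2P) / S) + 1 for one spatial dimension."""
    return (input_size - kernel_size + 2 * padding) // stride + 1


valid_28 = conv_output_size(28, 3)                         # no padding
same_28 = conv_output_size(28, 3, padding=1)               # P=1 keeps size for K=3, S=1
strided_32 = conv_output_size(32, 3, padding=1, stride=2)  # stride 2 halves the size

print(valid_28)    # 26
print(same_28)     # 28
print(strided_32)  # 16
```

The first value matches the 28×28 → 26×26 shrink produced by an unpadded 3×3 convolution on an MNIST-sized image.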

# --------------------------------------------------
# Question 3: Define receptive field in the context of CNNs. Why is it important for deep architectures?
# --------------------------------------------------
Receptive field (RF): region of the original input image that can influence the activation of a single neuron in a given layer.  
Early layers have small RF (few pixels); deeper layers have larger RF (hundreds of pixels) thanks to stacking convolutions and pooling.  
Large RF is crucial for deep nets because it allows neurons to integrate global context and recognize large-scale objects.
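
How the receptive field grows with depth can be sketched with the standard recurrence: each layer adds (K − 1) × jump pixels, where jump is the product of all earlier strides. The function name `receptive_field` and the example layer lists are assumptions for illustration; dilation is not modelled.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) after a stack of conv/pool layers.

    `layers` is a list of (kernel_size, stride) pairs, applied in order.
    """
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # growth is scaled by the accumulated stride
        jump *= stride
    return rf


# conv 3x3 -> pool 2x2 (stride 2) -> conv 3x3 -> pool 2x2 (stride 2)
small_cnn = [(3, 1), (2, 2), (3, 1), (2, 2)]
rf_small_cnn = receptive_field(small_cnn)
rf_three_3x3 = receptive_field([(3, 1)] * 3)

print(rf_small_cnn)  # 10
print(rf_three_3x3)  # 7
```

Three stacked 3×3 convolutions already see a 7×7 window, and pooling layers make every later layer grow the field faster.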

# --------------------------------------------------
# Question 4: Discuss how filter size and stride influence the number of parameters in a CNN.
# --------------------------------------------------
Parameter count in a conv layer = (F × F × C_in) × C_out + C_out(bias)  
F = filter spatial size, C_in = input channels, C_out = output channels.  
Stride does NOT change the number of parameters; it only changes output resolution.  
Larger filters increase parameters quadratically in F; common remedy: stack multiple 3×3 filters to get large RF with fewer params.
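
A short sketch of the parameter formula, with an assumed helper name `conv2d_params`. It also compares one 7×7 layer against three stacked 3×3 layers at 64 channels (a channel count chosen only for illustration); both cover a 7×7 receptive field.

```python
def conv2d_params(kernel_size, in_channels, out_channels, bias=True):
    """(F * F * C_in) * C_out weights, plus C_out biases if enabled."""
    weights = kernel_size * kernel_size * in_channels * out_channels
    return weights + (out_channels if bias else 0)


first_layer = conv2d_params(3, 1, 32)     # 3x3 conv on a 1-channel input, 32 filters
second_layer = conv2d_params(3, 32, 64)   # 3x3 conv, 32 -> 64 channels
one_7x7 = conv2d_params(7, 64, 64)
three_3x3 = 3 * conv2d_params(3, 64, 64)

print(first_layer)   # 320
print(second_layer)  # 18496
print(one_7x7)       # 200768
print(three_3x3)     # 110784
```

Stride does not appear anywhere in the function, which is the point made above: it changes the output resolution, not the number of weights.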

# --------------------------------------------------
# Question 5: Compare and contrast different CNN-based architectures like LeNet, AlexNet, and VGG in terms of depth, filter sizes, and performance.
# --------------------------------------------------

LeNet is one of the earliest CNNs with a shallow architecture containing around 5–7 layers. It uses relatively large filters (5×5) and was mainly designed for handwritten digit recognition. It has low computational requirements and performs well on simple datasets.

AlexNet marked a major advancement in 2012. It has 8 learned layers (5 convolutional, 3 fully connected), with large filters in the initial layers (11×11, then 5×5) and 3×3 filters afterwards. It popularized ReLU activations, dropout, and data augmentation for large-scale image classification, and it significantly improved performance on the ImageNet dataset.

VGG is much deeper, with 16 or 19 layers, and uses a uniform architecture of small 3×3 filters throughout the network. Although computationally expensive due to its large number of parameters, VGG achieves very high accuracy and is widely used as a feature extractor in many applications.

In [3]:
# Question 6: Using keras, build and train a simple CNN model on the MNIST dataset from scratch. Include code for module creation, compilation, training, and evaluation.

import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.utils import to_categorical

# 1. Load dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# 2. Reshape (CNN requires 4D input)
x_train = x_train.reshape(-1, 28, 28, 1)
x_test = x_test.reshape(-1, 28, 28, 1)

# 3. Normalize
x_train = x_train / 255.0
x_test = x_test / 255.0

# 4. One-hot encode labels
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

# 5. Build CNN model
model = Sequential([
    Conv2D(32, (3,3), activation='relu', input_shape=(28,28,1)),
    MaxPooling2D((2,2)),

    Conv2D(64, (3,3), activation='relu'),
    MaxPooling2D((2,2)),

    Flatten(),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')
])

# 6. Compile model
model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

# 7. Train model
model.fit(x_train, y_train, epochs=5, batch_size=64, validation_split=0.1)

# 8. Evaluate model
test_loss, test_acc = model.evaluate(x_test, y_test)
print("Test Accuracy:", test_acc)


Epoch 1/5
844/844 ━━━━━━━━━━━━━━━━━━━━ 49s 55ms/step - accuracy: 0.8837 - loss: 0.3750 - val_accuracy: 0.9838 - val_loss: 0.0536
Epoch 2/5
844/844 ━━━━━━━━━━━━━━━━━━━━ 42s 50ms/step - accuracy: 0.9826 - loss: 0.0563 - val_accuracy: 0.9858 - val_loss: 0.0491
Epoch 3/5
844/844 ━━━━━━━━━━━━━━━━━━━━ 43s 51ms/step - accuracy: 0.9890 - loss: 0.0370 - val_accuracy: 0.9880 - val_loss: 0.0421
Epoch 4/5
844/844 ━━━━━━━━━━━━━━━━━━━━ 81s 50ms/step - accuracy: 0.9924 - loss: 0.0246 - val_accuracy: 0.9905 - val_loss: 0.0415
Epoch 5/5
844/844 ━━━━━━━━━━━━━━━━━━━━ 82s 51ms/step - accuracy: 0.9931 - loss: 0.0204 - val_accuracy: 0.9890 - val_loss: 0.0404
313/313 ━━━━━━━━━━━━━━━━━━━━ 3s 9ms/step - accuracy: 0.9869 - loss: 0.0468
Test Accuracy: 0.9897000193595886


In [4]:
# Question 7: Load and preprocess the CIFAR-10 dataset using Keras, and create a
# CNN model to classify RGB images. Show your preprocessing and architecture.

import tensorflow as tf
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.utils import to_categorical

# 1. Load CIFAR-10 dataset
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

# 2. Normalize pixel values (0–255 → 0–1)
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0

# 3. One-hot encode labels
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

# 4. Build CNN model for RGB images
model = Sequential([
    Conv2D(32, (3,3), activation='relu', padding='same', input_shape=(32,32,3)),
    MaxPooling2D((2,2)),

    Conv2D(64, (3,3), activation='relu', padding='same'),
    MaxPooling2D((2,2)),

    Conv2D(128, (3,3), activation='relu', padding='same'),
    MaxPooling2D((2,2)),

    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.3),
    Dense(10, activation='softmax')
])

# 5. Compile model
model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

# 6. Train model
model.fit(x_train, y_train, epochs=10, batch_size=64, validation_split=0.1)

# 7. Evaluate model
test_loss, test_acc = model.evaluate(x_test, y_test)
print("Test Accuracy:", test_acc)


Epoch 1/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 107s 148ms/step - accuracy: 0.3265 - loss: 1.8110 - val_accuracy: 0.5524 - val_loss: 1.2706
Epoch 2/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 134s 136ms/step - accuracy: 0.5547 - loss: 1.2533 - val_accuracy: 0.6478 - val_loss: 1.0249
Epoch 3/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 98s 139ms/step - accuracy: 0.6286 - loss: 1.0542 - val_accuracy: 0.6922 - val_loss: 0.8915
Epoch 4/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 95s 135ms/step - accuracy: 0.6759 - loss: 0.9276 - val_accuracy: 0.7112 - val_loss: 0.8451
Epoch 5/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 95s 136ms/step - accuracy: 0.7053 - loss: 0.8399 - val_accuracy: 0.7268 - val_loss: 0.8061
Epoch 6/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 98s 139ms/step - accuracy: 0.7411 - loss: 0.7398 - val_accuracy: 0.7462 - val_loss: 0.7584
Epoch 7/

In [5]:
# Question 8: Using PyTorch, write a script to define and train a CNN on the MNIST dataset. Include model definition, data loaders, training loop, and accuracy evaluation.

import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# 1. Device configuration (GPU if available)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# 2. Transformations (normalization + tensor conversion)
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

# 3. Load MNIST dataset
train_dataset = datasets.MNIST(root='./data', train=True, transform=transform, download=True)
test_dataset = datasets.MNIST(root='./data', train=False, transform=transform, download=True)

# 4. Create DataLoaders
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=64, shuffle=False)

# 5. Define CNN model
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.layer1 = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2)
        )
        self.layer2 = nn.Sequential(
            nn.Conv2d(32, 64, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2)
        )
        self.fc = nn.Sequential(
            nn.Flatten(),
            nn.Linear(64*7*7, 128),
            nn.ReLU(),
            nn.Linear(128, 10)
        )

    def forward(self, x):
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.fc(x)
        return x

model = CNN().to(device)

# 6. Loss and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# 7. Training loop
num_epochs = 5
for epoch in range(num_epochs):
    model.train()
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)

        outputs = model(images)
        loss = criterion(outputs, labels)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

    print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {loss.item():.4f}")

# 8. Accuracy evaluation
model.eval()
correct = 0
total = 0

with torch.no_grad():
    for images, labels in test_loader:
        images, labels = images.to(device), labels.to(device)

        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)

        total += labels.size(0)
        correct += (predicted == labels).sum().item()

accuracy = 100 * correct / total
print(f"Test Accuracy: {accuracy:.2f}%")



Epoch [1/5], Loss: 0.0368
Epoch [2/5], Loss: 0.0064
Epoch [3/5], Loss: 0.0296
Epoch [4/5], Loss: 0.0069
Epoch [5/5], Loss: 0.1171
Test Accuracy: 99.08%


In [9]:
# Question 9: Given a custom image dataset stored in a local directory, write code using Keras ImageDataGenerator to preprocess and train a CNN model.

import os
import numpy as np
from PIL import Image

# ----------------------------------------------------------
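# STEP 1: Create a placeholder dataset (random-noise images in Normal / Pneumonia
# folders) so the script runs end to end; point train_dir / val_dir at a real
# dataset in practice. Random noise carries no signal, so accuracy stays near 0.5.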
# ----------------------------------------------------------

base_path = "/content/chest_xray_demo"
train_dir = os.path.join(base_path, "train")
val_dir = os.path.join(base_path, "val")

# Create folders
for path in [train_dir, val_dir]:
    os.makedirs(os.path.join(path, "Normal"), exist_ok=True)
    os.makedirs(os.path.join(path, "Pneumonia"), exist_ok=True)

# Generate random images for Normal & Pneumonia
def create_random_images(folder, count):
    for i in range(count):
        arr = np.random.randint(0, 256, (128,128,3), dtype=np.uint8)  # upper bound is exclusive, so 256 covers 0-255
        img = Image.fromarray(arr)
        img.save(os.path.join(folder, f"img{i}.jpg"))

# create random images for training
create_random_images(os.path.join(train_dir, "Normal"), 20)
create_random_images(os.path.join(train_dir, "Pneumonia"), 20)

# create random images for validation
create_random_images(os.path.join(val_dir, "Normal"), 5)
create_random_images(os.path.join(val_dir, "Pneumonia"), 5)

print("Demo dataset created successfully!")

# ----------------------------------------------------------
# STEP 2: Preprocessing and Generators
# ----------------------------------------------------------

from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

train_datagen = ImageDataGenerator(rescale=1./255, horizontal_flip=True)
val_datagen = ImageDataGenerator(rescale=1./255)

train_gen = train_datagen.flow_from_directory(
    train_dir,
    target_size=(128, 128),
    batch_size=8,
    class_mode='categorical'
)

val_gen = val_datagen.flow_from_directory(
    val_dir,
    target_size=(128, 128),
    batch_size=8,
    class_mode='categorical'
)

# ----------------------------------------------------------
# STEP 3: Build CNN Model
# ----------------------------------------------------------

model = Sequential([
    Conv2D(32, (3,3), activation='relu', input_shape=(128,128,3)),
    MaxPooling2D(2,2),

    Conv2D(64, (3,3), activation='relu'),
    MaxPooling2D(2,2),

    Flatten(),
    Dense(128, activation='relu'),
    Dense(train_gen.num_classes, activation='softmax')
])

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# ----------------------------------------------------------
# STEP 4: Train the model
# ----------------------------------------------------------

history = model.fit(train_gen, validation_data=val_gen, epochs=3)

# ----------------------------------------------------------
# STEP 5: Evaluation
# ----------------------------------------------------------

loss, acc = model.evaluate(val_gen)
print("Validation Accuracy:", acc)


Demo dataset created successfully!
Found 40 images belonging to 2 classes.
Found 10 images belonging to 2 classes.

Epoch 1/3
5/5 ━━━━━━━━━━━━━━━━━━━━ 4s 372ms/step - accuracy: 0.4156 - loss: 4.5693 - val_accuracy: 0.5000 - val_loss: 0.7232
Epoch 2/3
5/5 ━━━━━━━━━━━━━━━━━━━━ 1s 248ms/step - accuracy: 0.4406 - loss: 0.7571 - val_accuracy: 0.5000 - val_loss: 0.6919
Epoch 3/3
5/5 ━━━━━━━━━━━━━━━━━━━━ 1s 250ms/step - accuracy: 0.6764 - loss: 0.6793 - val_accuracy: 0.5000 - val_loss: 0.6919
2/2 ━━━━━━━━━━━━━━━━━━━━ 0s 35ms/step - accuracy: 0.5000 - loss: 0.6918
Validation Accuracy: 0.5


# Question 10: End-to-end CNN medical X-ray classifier → Streamlit web app
# ------------------------------------------------------------------

1. Business / regulatory framing
   - FDA-class II software → treat every step (data, model, code) as auditable.
   - PHI compliance: de-identify DICOM, store on encrypted bucket, HIPAA-BAA cloud.

2. Data acquisition & inventory
   - Public sets: NIH ChestX-ray14, RSNA Pneumonia Challenge, CheXpert.
   - Private hospital PACS export → DICOM → PNG (about 1k × 1k px, 8-bit), single view (PA).
   - Target balance: ~10 k pneumonia, ~10 k normal after augmentation.

3. Label consolidation
   - Radiologist consensus reads (2+ board-cert.) → binary label.
   - Uncertain / borderline cases excluded to keep 0/1 gold standard.

4. Pre-processing pipeline (re-usable module)
   a. DICOM → PNG converter (pydicom) + automatic windowing (lung window).
   b. Auto-crop lung fields (pre-trained U-Net seg) → center square 512×512.
   c. Normalize intensity to [0,1] by clipping at the 1st and 99th percentiles, then rescaling.
   d. Data augmentation (albumentations): shift/rotate 10°, brightness ±15%, horizontal flip, cutout 32×32.
   e. 5-fold patient-wise stratified split (no patient in both train & test).
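
The patient-wise split in step e can be sketched with the standard library alone. The function name `patient_wise_folds` and the `(image_id, patient_id)` record layout are hypothetical, and stratification by label is left out to keep the example short; a real pipeline would also balance classes across folds.

```python
import random
from collections import defaultdict


def patient_wise_folds(records, n_folds=5, seed=0):
    """Assign every image of a patient to the same fold.

    `records` is a list of (image_id, patient_id) pairs (hypothetical layout).
    """
    patients = sorted({patient_id for _, patient_id in records})
    random.Random(seed).shuffle(patients)  # reproducible shuffle of patients
    fold_of_patient = {p: i % n_folds for i, p in enumerate(patients)}

    folds = defaultdict(list)
    for image_id, patient_id in records:
        folds[fold_of_patient[patient_id]].append(image_id)
    return [folds[i] for i in range(n_folds)]


# Toy data: 10 patients with 3 images each.
records = [(f"img{i}", f"patient{i // 3}") for i in range(30)]
folds = patient_wise_folds(records)

for index, fold in enumerate(folds):
    print(index, len(fold))  # every fold holds 6 images (2 patients)
```

Because folds are assigned per patient rather than per image, no patient can appear on both sides of a train/test boundary.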

5. Model design
   - Backbone: ImageNet-pre-trained DenseNet-121 (good small-lesion sensitivity).
   - Replace final classifier → GlobalAveragePooling → Dropout 0.3 → Dense 1 (sigmoid).
   - Freeze first 120 layers for 3 epochs, then unfreeze entire net (lower LR).

6. Training strategy
   - Loss: binary-cross-entropy + 0.01 L2 on last layer.
   - Optimizer: AdamW lr=1e-4, cosine decay to 1e-6, batch 16 (GPU memory).
   - Callbacks: EarlyStopping(patience=5, monitor=val_AUC, mode=max), ModelCheckpoint(save_best); ReduceLROnPlateau only as an alternative to the cosine schedule, not combined with it.
   - Class imbalance: use pos_weight = (N_neg/N_pos) in BCE.
   - 50 epochs ≈ 45 min on single V100.
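
The effect of `pos_weight` on binary cross-entropy can be shown in plain Python. This is a sketch of the idea rather than a framework implementation: the function name `weighted_bce` and the toy labels are assumptions, and the weight is applied to the positive-class term only.

```python
import math


def weighted_bce(y_true, y_prob, pos_weight=1.0, eps=1e-7):
    """Mean binary cross-entropy with the positive term scaled by `pos_weight`."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += -(pos_weight * y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


# Toy batch: 1 pneumonia case, 3 normal cases, model undecided on all of them.
labels = [1, 0, 0, 0]
probs = [0.5, 0.5, 0.5, 0.5]

pos_weight = labels.count(0) / labels.count(1)  # N_neg / N_pos = 3.0
plain_loss = weighted_bce(labels, probs)
weighted_loss = weighted_bce(labels, probs, pos_weight)

print(round(plain_loss, 4))     # 0.6931
print(round(weighted_loss, 4))  # 1.0397
```

With the weight applied, the single positive example contributes as much to the loss as the three negatives together, which counteracts the imbalance.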

7. Evaluation & clinical metrics
   - ROC-AUC, PR-AUC, sensitivity @ 95 % specificity, F1.
   - Confidence interval via 1 000 bootstrap on patient level.
   - Grad-CAM++ heat-maps reviewed by radiologist → qualitative safety check.
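
Sensitivity at a fixed specificity can be computed by scanning candidate thresholds. The sketch below is a simple quadratic-time version with an assumed function name and made-up toy scores; a production evaluation would typically use a library ROC routine and patient-level bootstrapping as listed above.

```python
def sensitivity_at_specificity(y_true, y_score, min_specificity=0.95):
    """Best sensitivity among thresholds whose specificity >= min_specificity.

    A case is predicted positive when its score is >= the threshold.
    """
    positives = sum(y_true)
    negatives = len(y_true) - positives
    best = 0.0
    for threshold in sorted(set(y_score)):
        tp = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s >= threshold)
        tn = sum(1 for y, s in zip(y_true, y_score) if y == 0 and s < threshold)
        if tn / negatives >= min_specificity:
            best = max(best, tp / positives)
    return best


# Toy scores: four normal (0) and four pneumonia (1) studies.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_score = [0.1, 0.2, 0.3, 0.7, 0.4, 0.6, 0.8, 0.9]

sens_at_75 = sensitivity_at_specificity(y_true, y_score, min_specificity=0.75)
sens_at_95 = sensitivity_at_specificity(y_true, y_score, min_specificity=0.95)

print(sens_at_75)  # 1.0
print(sens_at_95)  # 0.5
```

Tightening the specificity requirement forces a higher threshold, which is why sensitivity drops from 1.0 to 0.5 on this toy set.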

8. Explainability artefact
   - Save Grad-CAM overlay PNG alongside prediction for physician review.

9. Model packaging
   - Export best checkpoint to ONNX (fp16) → 40 MB.
   - Convert to TensorFlow Lite or keep ONNX for onnx-runtime (fast CPU fallback).

10. Web-app skeleton (Streamlit)
    File tree:
    app/
     ├── requirements.txt
     ├── main.py
     ├── model/
     │    └── chest_densenet121.onnx
     └── utils/
          ├── preprocess.py
          ├── gradcam.py
          └── infer.py

    main.py excerpt:
    ```python
    import streamlit as st
    from utils import preprocess, gradcam, infer  # project-local helpers from the file tree above

    st.set_page_config(page_title="Pneumonia Screening")
    st.title("Chest X-ray Pneumonia Classifier")
    uploaded = st.file_uploader("Upload chest X-ray (PNG/JPG/DICOM)", type=["png","jpg","dcm"])
    if uploaded:
        img = preprocess.load_and_clean(uploaded)          # step 4 a-c
        prob = infer.predict(img)                          # ONNX, 60 ms CPU
        st.metric("Pneumonia probability", f"{prob:.1%}")
        if prob > 0.5:
            st.warning("Pneumonia suspected – refer for radiologist review.")
        else:
            st.success("Normal study.")
        with st.expander("View heat-map"):
            heat = gradcam.compute(img)
            st.image(heat, caption="Grad-CAM++", use_column_width=True)
    ```

11. CI/CD & deployment
    - GitHub Actions: lint → pytest → Docker build → push to AWS ECR.
    - Infrastructure: ECS Fargate (2 vCPU, 4 GB) behind Application Load Balancer, auto-scale 2-10 tasks.
    - S3 static front-end optional; Streamlit runs inside container on 8501.
    - CloudWatch logs + alarm on p95 latency > 1 s.

12. Post-deployment surveillance
    - Log every inference request (no image, only hash & score) → S3 + Athena.
    - Weekly recompute drift metrics (PSI, KL) on new uploads; trigger retrain if PSI > 0.2.
    - Maintain model registry (MLflow) with version tags tied to container image SHA.
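
The PSI drift check can be sketched as follows, assuming the monitored quantity is the model's output probability in [0, 1] split into equal-width bins. The function name `psi`, the bin count, and the toy score lists are assumptions; only the 0.2 retrain threshold comes from the plan above.

```python
import math


def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population Stability Index between two samples of scores in [0, 1]."""

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            index = min(int(v * n_bins), n_bins - 1)  # score 1.0 goes in the last bin
            counts[index] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


reference_scores = [i / 100 for i in range(100)]      # spread over [0, 1)
shifted_scores = [0.5 + i / 200 for i in range(100)]  # concentrated in [0.5, 1)

psi_same = psi(reference_scores, reference_scores)
psi_shifted = psi(reference_scores, shifted_scores)

print(psi_same)           # 0.0
print(psi_shifted > 0.2)  # True, so this shift would trigger a retrain
```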

13. User training & disclaimers
   - Landing page shows “Not for primary diagnosis – physician must verify”.
   - Provide calibration curve and failure-case examples.
   - Include “Report adverse result” button → creates Jira ticket for safety team.

Outcome: clinicians drag-and-drop an X-ray in browser → prediction + heat-map in < 1 s, zero installation, full audit trail.