---

# Practical Question

---


# Question 1: What is the role of filters and feature maps in a Convolutional Neural Network (CNN)?

- Answer:
In a Convolutional Neural Network (CNN), filters and feature maps play a crucial role in automatically learning and extracting important features from input data such as images.

Filters (Kernels):
Filters are small matrices of learnable weights that slide over the input image to perform convolution operations. Each filter detects specific patterns like edges, textures, or shapes. Different filters learn to identify different features during training.

Feature Maps:
The output generated after applying a filter to an input is called a feature map (or activation map). It represents the spatial presence and intensity of specific features detected by the filter across the image.

Together, filters and feature maps enable CNNs to transform raw pixel data into meaningful hierarchical representations — from simple edges in early layers to complex objects in deeper layers — facilitating effective image recognition and classification.
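
As a rough illustration of how a filter produces a feature map, the sketch below slides a 3×3 filter over a toy 5×5 image in plain Python. The image, the hand-written vertical-edge filter, and the helper name `conv2d_valid` are assumptions made for this example; in a real CNN the filter weights are learned during training.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            # Element-wise product of the filter and the image patch, summed up
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

# Toy 5x5 image: dark on the left (0), bright on the right (1)
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Hand-written vertical-edge filter (illustrative, not learned)
vertical_edge = [[-1, 0, 1],
                 [-1, 0, 1],
                 [-1, 0, 1]]

feature_map = conv2d_valid(image, vertical_edge)
for row in feature_map:
    print(row)  # each row prints [3, 3, 0]
```

The feature map is large where the filter overlaps the dark-to-bright edge and zero over the uniform bright region, which is the sense in which a feature map records where a pattern occurs.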

---

# Question 2: Explain the concepts of padding and stride in CNNs (Convolutional Neural Networks). How do they affect the output dimensions of feature maps?

- Answer:
In a Convolutional Neural Network (CNN), padding and stride are two essential parameters that control how filters move across the input and how the spatial dimensions of the resulting feature maps are determined.

Padding:
Padding involves adding extra rows and columns (usually of zeros) around the border of the input image. Its main purpose is to preserve spatial dimensions after convolution and prevent the loss of important edge information.

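Valid Padding (No Padding): The filter stays entirely inside the image boundaries, so the output shrinks. With stride 1, an N×N input and an F×F filter give an (N − F + 1)×(N − F + 1) output.

Same Padding: Zeros are added around the border so that, at stride 1, the output has the same spatial dimensions as the input. For an odd filter size F, this means padding (F − 1)/2 pixels on each side.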

Stride:
Stride refers to the number of pixels by which the filter moves (or “slides”) over the input image.

A stride of 1 means the filter moves one pixel at a time, producing larger feature maps.

A stride greater than 1 reduces the size of the feature maps by skipping positions.
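
The effect of both parameters on the output size can be summarized by the standard relation output = floor((N + 2P − F) / S) + 1, where N is the input size, F the filter size, P the padding on each side, and S the stride. The helper below is a minimal sketch of that formula; the function name and the 28-pixel input are chosen only for illustration.

```python
def conv_output_size(n, f, stride=1, padding=0):
    """Output size along one spatial dimension of a convolution."""
    return (n + 2 * padding - f) // stride + 1

print(conv_output_size(28, 3))                       # 26: valid padding shrinks the map
print(conv_output_size(28, 3, padding=1))            # 28: same padding at stride 1
print(conv_output_size(28, 3, stride=2))             # 13: stride 2 roughly halves the size
print(conv_output_size(28, 3, stride=2, padding=1))  # 14
```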


---


# Question 3: Define receptive field in the context of CNNs. Why is it important for deep architectures?

- Answer:
In the context of Convolutional Neural Networks (CNNs), the receptive field refers to the specific region of the input image that influences the activation of a particular neuron in a feature map. In simpler terms, it is the area of the input that the neuron “sees” or responds to.

In Early Layers:
The receptive field is small and corresponds to local features such as edges or textures.

In Deeper Layers:
As the network progresses through multiple convolution and pooling layers, the receptive field expands, allowing neurons to capture more complex and global patterns (e.g., shapes, objects).

Importance in Deep Architectures:
The receptive field is vital because it determines how much contextual information a neuron can capture:

Larger receptive fields enable neurons to understand broader spatial relationships within the image.

Smaller receptive fields focus on fine-grained local features.

In deep CNN architectures, the progressive expansion of the receptive field allows the model to integrate both local and global information — essential for accurate object recognition and semantic understanding.
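
The growth of the receptive field can be computed layer by layer: each layer adds (kernel size − 1) times the cumulative stride of the layers before it. The sketch below applies this commonly used recurrence; the function name and the `(kernel_size, stride)` layer encoding are assumptions made for the example, and padding is ignored because it does not change the receptive field size.

```python
def receptive_field(layers):
    """Receptive field of one output unit.

    layers: list of (kernel_size, stride) tuples, ordered from input to output.
    """
    rf, jump = 1, 1  # jump = distance in input pixels between adjacent units
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump
        jump *= stride
    return rf

print(receptive_field([(3, 1)]))                          # 3
print(receptive_field([(3, 1), (3, 1)]))                  # 5
print(receptive_field([(3, 1), (3, 1), (3, 1)]))          # 7
# conv 3x3 -> pool 2x2 -> conv 3x3 -> pool 2x2 (pooling uses stride 2)
print(receptive_field([(3, 1), (2, 2), (3, 1), (2, 2)]))  # 10
```

Pooling and strided layers increase the jump, so every later layer widens the receptive field faster than it would in a stack of stride-1 convolutions alone.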

---

# Question 4: Discuss how filter size and stride influence the number of parameters in a CNN.

- Answer:
In a Convolutional Neural Network (CNN), both filter size and stride play critical roles in determining the model’s complexity, particularly the number of learnable parameters and the spatial resolution of feature maps.

1. Filter Size:
The filter (kernel) size directly impacts the number of parameters in a layer.
Each filter has parameters equal to:

$$(\text{Filter Height} \times \text{Filter Width} \times \text{Input Channels}) + 1$$

(The extra “+1” accounts for the bias term.) A layer with K filters has K times this many parameters.
Larger filters (e.g., 5×5) therefore have more parameters than smaller ones (e.g., 3×3), leading to higher computational cost and a greater risk of overfitting. Modern CNNs often stack several small filters instead of using one large filter, which reduces parameters while maintaining representational power.

2. Stride:
The stride determines how far the filter moves after each convolution. While stride itself does not change the number of filter parameters, it affects the output feature map size.

A larger stride reduces the feature map dimensions, thereby decreasing the total number of activations and computational operations in subsequent layers.

A smaller stride retains more spatial details but increases computational load.

In summary:

Filter size affects the number of parameters directly.

Stride affects the computational cost and feature map size, but not the parameter count of the convolution layer itself.
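
The summary above can be checked with a few lines of arithmetic. In this sketch the function names are hypothetical and the layer sizes (a 28×28 single-channel input, 32 filters, no padding) are chosen only for illustration. Note that stride appears in the activation count but not in the parameter count.

```python
def conv2d_params(filter_h, filter_w, in_channels, num_filters, bias=True):
    """Learnable parameters in one convolutional layer."""
    per_filter = filter_h * filter_w * in_channels + (1 if bias else 0)
    return per_filter * num_filters

def output_activations(input_size, filter_size, stride, num_filters):
    """Number of output values for a square input with no padding."""
    out = (input_size - filter_size) // stride + 1
    return out * out * num_filters

# Filter size changes the parameter count
params_3x3 = conv2d_params(3, 3, in_channels=1, num_filters=32)
params_5x5 = conv2d_params(5, 5, in_channels=1, num_filters=32)
print(params_3x3, params_5x5)      # 320 832

# Stride changes the feature map size, not the parameters
acts_stride1 = output_activations(28, 3, stride=1, num_filters=32)
acts_stride2 = output_activations(28, 3, stride=2, num_filters=32)
print(acts_stride1, acts_stride2)  # 21632 5408
```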

---

# Question 5: Compare and contrast different CNN-based architectures like LeNet, AlexNet, and VGG in terms of depth, filter sizes, and performance.
- Answer:
Over the years, various CNN architectures such as LeNet, AlexNet, and VGG have marked key milestones in the evolution of deep learning. Each of these architectures differs in depth, design choices, and performance characteristics.

LeNet-5 (1998):
Developed by Yann LeCun and colleagues, LeNet-5 is one of the earliest CNN architectures, primarily designed for handwritten digit recognition on the MNIST dataset. It consists of seven layers: two convolutional layers, two subsampling (pooling) layers, and three fully connected layers. (The original paper describes the first of these, C5, as a convolution, but its 5×5 filters cover its entire 5×5 input, so it acts as a fully connected layer.) The network uses 5×5 filters and sigmoid/tanh activations. Although shallow by modern standards, LeNet laid the foundation for convolutional neural networks by demonstrating their ability to automatically extract spatial features from images.

AlexNet (2012):
AlexNet, introduced by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, was a breakthrough in deep learning that won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2012. It increased the network depth to eight weight layers (five convolutional and three fully connected) and popularized several techniques, including the ReLU activation function, dropout regularization, and GPU-based training. AlexNet used larger filters in the initial layers (11×11, then 5×5) and smaller ones (3×3) in later layers. Its top-5 test error of 15.3%, against 26.2% for the runner-up, demonstrated the effectiveness of deep learning at scale.

VGGNet (2014):
Proposed by the Visual Geometry Group (VGG) at the University of Oxford, VGGNet emphasized simplicity and uniformity in architecture design. Its two most popular variants, VGG-16 and VGG-19, are named for their number of weight layers. Unlike AlexNet, VGG used small 3×3 convolutional filters throughout the network, stacked in deeper sequences to capture complex patterns. This made the design consistent and easy to scale while significantly increasing depth. VGGNet outperformed AlexNet on ImageNet, placing second in classification and first in localization at ILSVRC 2014, but it required much more computation and memory because of its high parameter count (around 138 million for VGG-16).
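
One reason VGG favors stacked 3×3 filters can be shown with a short calculation: two 3×3 layers cover the same 5×5 region as a single 5×5 layer, and three cover a 7×7 region, while using fewer weights. The sketch below counts weights only (biases ignored) and assumes the same number of channels C going in and out of every layer; the function name and C = 64 are illustrative choices.

```python
def conv_weights(kernel_size, channels, num_layers=1):
    """Weights in a stack of identical conv layers with `channels` in and out."""
    return num_layers * kernel_size * kernel_size * channels * channels

C = 64
two_3x3 = conv_weights(3, C, num_layers=2)    # covers a 5x5 region
one_5x5 = conv_weights(5, C)
three_3x3 = conv_weights(3, C, num_layers=3)  # covers a 7x7 region
one_7x7 = conv_weights(7, C)

print(two_3x3, one_5x5)    # 73728 102400
print(three_3x3, one_7x7)  # 110592 200704
```

The stacked version also applies a non-linearity after every layer, so it gains extra non-linear steps in addition to saving weights.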



---


# Question 6: Using Keras, build and train a simple CNN model on the MNIST dataset from scratch. Include code for model creation, compilation, training, and evaluation.


In [1]:
# Import necessary libraries
import tensorflow as tf
from tensorflow.keras import datasets, layers, models
import matplotlib.pyplot as plt

# 1. Load and preprocess the MNIST dataset
(x_train, y_train), (x_test, y_test) = datasets.mnist.load_data()

# Reshape to include channel dimension (28x28x1) and normalize pixel values
x_train = x_train.reshape((x_train.shape[0], 28, 28, 1)).astype('float32') / 255
x_test = x_test.reshape((x_test.shape[0], 28, 28, 1)).astype('float32') / 255

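# 2. Build the CNN model
# The input shape is declared with an explicit Input layer, which current Keras
# versions recommend over passing input_shape to the first Conv2D layer
model = models.Sequential([
    layers.Input(shape=(28,28,1)),
    layers.Conv2D(32, (3,3), activation='relu'),
    layers.MaxPooling2D((2,2)),
    layers.Conv2D(64, (3,3), activation='relu'),
    layers.MaxPooling2D((2,2)),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(10, activation='softmax')  # one probability per digit class
])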

# Display the model summary
model.summary()

# 3. Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# 4. Train the model
history = model.fit(x_train, y_train, epochs=5, batch_size=64,
                    validation_data=(x_test, y_test))

# 5. Evaluate the model
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=2)
print(f"\nTest Accuracy: {test_acc:.4f}")


Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11490434/11490434 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step




Epoch 1/5
938/938 ━━━━━━━━━━━━━━━━━━━━ 55s 57ms/step - accuracy: 0.8893 - loss: 0.3648 - val_accuracy: 0.9815 - val_loss: 0.0553
Epoch 2/5
938/938 ━━━━━━━━━━━━━━━━━━━━ 51s 55ms/step - accuracy: 0.9836 - loss: 0.0520 - val_accuracy: 0.9884 - val_loss: 0.0353
Epoch 3/5
938/938 ━━━━━━━━━━━━━━━━━━━━ 50s 53ms/step - accuracy: 0.9905 - loss: 0.0324 - val_accuracy: 0.9895 - val_loss: 0.0343
Epoch 4/5
938/938 ━━━━━━━━━━━━━━━━━━━━ 50s 53ms/step - accuracy: 0.9926 - loss: 0.0233 - val_accuracy: 0.9912 - val_loss: 0.0277
Epoch 5/5
938/938 ━━━━━━━━━━━━━━━━━━━━ 50s 53ms/step - accuracy: 0.9936 - loss: 0.0184 - val_accuracy: 0.9904 - val_loss: 0.0309
313/313 - 3s - 8ms/step - accuracy: 0.9904 - loss: 0.0309

Test Accuracy: 0.9904


---

# Question 7: Load and preprocess the CIFAR-10 dataset using Keras, and create a CNN model to classify RGB images. Show your preprocessing and architecture.



In [2]:
# Import required libraries
import tensorflow as tf
from tensorflow.keras import datasets, layers, models
import matplotlib.pyplot as plt

# 1. Load the CIFAR-10 dataset
(x_train, y_train), (x_test, y_test) = datasets.cifar10.load_data()

# 2. Normalize pixel values to the range [0,1]
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

# Print dataset shapes
print("Training data shape:", x_train.shape)
print("Test data shape:", x_test.shape)

# 3. Build the CNN model
# CIFAR-10 images are 32x32 RGB, so the input has 3 channels
model = models.Sequential([
    layers.Input(shape=(32,32,3)),
    layers.Conv2D(32, (3,3), activation='relu'),
    layers.MaxPooling2D((2,2)),
    layers.Conv2D(64, (3,3), activation='relu'),
    layers.MaxPooling2D((2,2)),
    layers.Conv2D(128, (3,3), activation='relu'),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(10, activation='softmax')  # one probability per CIFAR-10 class
])

# Display model summary
model.summary()

# 4. Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# 5. Train the model
history = model.fit(x_train, y_train, epochs=10, batch_size=64,
                    validation_data=(x_test, y_test))

# 6. Evaluate model performance
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=2)
print(f"\nTest Accuracy: {test_acc:.4f}")


Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170498071/170498071 ━━━━━━━━━━━━━━━━━━━━ 3s 0us/step
Training data shape: (50000, 32, 32, 3)
Test data shape: (10000, 32, 32, 3)


Epoch 1/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 74s 92ms/step - accuracy: 0.3472 - loss: 1.7606 - val_accuracy: 0.5665 - val_loss: 1.2163
Epoch 2/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 91s 104ms/step - accuracy: 0.5804 - loss: 1.1883 - val_accuracy: 0.6328 - val_loss: 1.0504
Epoch 3/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 76s 97ms/step - accuracy: 0.6463 - loss: 1.0009 - val_accuracy: 0.6452 - val_loss: 1.0063
Epoch 4/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 73s 93ms/step - accuracy: 0.6903 - loss: 0.8825 - val_accuracy: 0.6760 - val_loss: 0.9391
Epoch 5/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 74s 94ms/step - accuracy: 0.7180 - loss: 0.7934 - val_accuracy: 0.6990 - val_loss: 0.8850
Epoch 6/10
782/782 ━━━━━━━━━━━━━━━━━━━━ 71s 90ms/step - accuracy: 0.7493 - loss: 0.7146 - val_accuracy: 0.7046 - val_loss: 0.8757
Epoch 7/10

---

# Question 8: Using PyTorch, write a script to define and train a CNN on the MNIST dataset. Include model definition, data loaders, training loop, and accuracy evaluation.


In [3]:
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# Load MNIST dataset
train_data = datasets.MNIST(root='./data', train=True, transform=transforms.ToTensor(), download=True)
test_data = datasets.MNIST(root='./data', train=False, transform=transforms.ToTensor(), download=True)
train_loader = DataLoader(train_data, batch_size=64, shuffle=True)
test_loader = DataLoader(test_data, batch_size=1000)

# Define a simple CNN
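class SimpleCNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 32, 3)    # 1x28x28 -> 32x26x26
        self.conv2 = nn.Conv2d(32, 64, 3)   # 32x26x26 -> 64x24x24
        self.fc1 = nn.Linear(9216, 128)     # 64 * 12 * 12 = 9216 features after pooling
        self.fc2 = nn.Linear(128, 10)       # one logit per digit class

    def forward(self, x):
        x = torch.relu(self.conv1(x))
        x = torch.relu(self.conv2(x))
        x = nn.functional.max_pool2d(x, 2)  # 64x24x24 -> 64x12x12
        x = torch.flatten(x, 1)             # flatten everything except the batch dimension
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)                     # raw logits; CrossEntropyLoss applies log-softmax internally
        return x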

model = SimpleCNN()
optimizer = optim.Adam(model.parameters(), lr=0.001)
criterion = nn.CrossEntropyLoss()

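# Training
model.train()  # training mode (matters once dropout or batch norm layers are added)
for epoch in range(3):  # short for student example
    for data, target in train_loader:
        optimizer.zero_grad()             # clear gradients from the previous step
        output = model(data)
        loss = criterion(output, target)
        loss.backward()                   # backpropagate
        optimizer.step()                  # update the weights
    print(f"Epoch {epoch+1} done")

# Evaluation
model.eval()  # inference mode
correct = 0
with torch.no_grad():  # gradients are not needed for evaluation
    for data, target in test_loader:
        output = model(data)
        pred = output.argmax(dim=1)       # predicted class = index of the largest logit
        correct += (pred == target).sum().item()
print(f"Test Accuracy: {correct/len(test_data):.4f}")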


100%|██████████| 9.91M/9.91M [00:00<00:00, 36.4MB/s]
100%|██████████| 28.9k/28.9k [00:00<00:00, 1.09MB/s]
100%|██████████| 1.65M/1.65M [00:00<00:00, 9.70MB/s]
100%|██████████| 4.54k/4.54k [00:00<00:00, 9.68MB/s]


Epoch 1 done
Epoch 2 done
Epoch 3 done
Test Accuracy: 0.9882
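
The input size of 9216 passed to `fc1` follows from the layer shapes. The sketch below traces them in plain Python, assuming unpadded stride-1 convolutions and non-overlapping pooling as in the model above; the `trace_shapes` helper and its tuple encoding of layers are hypothetical and exist only for this illustration.

```python
def trace_shapes(size, channels, layers):
    """Print the feature map shape after each layer and return the flattened size.

    layers: list of ("conv", out_channels, kernel) or ("pool", window) tuples.
    """
    for layer in layers:
        if layer[0] == "conv":
            _, channels, kernel = layer
            size = size - kernel + 1  # no padding, stride 1
        else:
            _, window = layer
            size = size // window     # non-overlapping pooling
        print(f"{layer[0]:>4}: {channels} x {size} x {size}")
    return channels * size * size

# conv1 -> conv2 -> max_pool2d, starting from a 1 x 28 x 28 MNIST image
flat = trace_shapes(28, 1, [("conv", 32, 3), ("conv", 64, 3), ("pool", 2)])
print("flattened features:", flat)  # 9216
```

If a layer is added or a kernel size changes, rerunning a trace like this gives the new value for the first `nn.Linear` layer.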


---

# Question 9: Given a custom image dataset stored in a local directory, write code using Keras ImageDataGenerator to preprocess and train a CNN model.



In [None]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras import layers, models
from tensorflow.keras.optimizers import Adam

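# 1. Preprocessing using ImageDataGenerator
# flow_from_directory expects one subfolder per class inside 'dataset/'
# rescale maps pixel values to [0, 1]; validation_split reserves 20% of the images for validation
train_datagen = ImageDataGenerator(rescale=1./255, validation_split=0.2)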
train_generator = train_datagen.flow_from_directory(
    'dataset/',  # path to dataset folder
    target_size=(64, 64),
    batch_size=32,
    class_mode='categorical',
    subset='training'
)
val_generator = train_datagen.flow_from_directory(
    'dataset/',
    target_size=(64, 64),
    batch_size=32,
    class_mode='categorical',
    subset='validation'
)

# 2. Build a simple CNN model
model = models.Sequential([
    layers.Conv2D(32, (3,3), activation='relu', input_shape=(64,64,3)),
    layers.MaxPooling2D((2,2)),
    layers.Conv2D(64, (3,3), activation='relu'),
    layers.MaxPooling2D((2,2)),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(train_generator.num_classes, activation='softmax')
])

# 3. Compile the model
model.compile(optimizer=Adam(), loss='categorical_crossentropy', metrics=['accuracy'])

# 4. Train the model
history = model.fit(train_generator, epochs=5, validation_data=val_generator)

# 5. Evaluate (accuracy on validation)
val_loss, val_acc = model.evaluate(val_generator)
print(f"Validation Accuracy: {val_acc:.4f}")


---

# Question 10: You are working on a web application for a medical imaging startup. Your task is to build and deploy a CNN model that classifies chest X-ray images into “Normal” and “Pneumonia” categories. Describe your end-to-end approach, from data preparation and model training to deploying the model as a web app using Streamlit.




In [None]:
# 1. Data Preparation using ImageDataGenerator
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_gen = ImageDataGenerator(rescale=1./255, validation_split=0.2)
train_data = train_gen.flow_from_directory(
    'chest_xray/train',
    target_size=(150,150),
    batch_size=32,
    class_mode='binary',
    subset='training'
)
val_data = train_gen.flow_from_directory(
    'chest_xray/train',
    target_size=(150,150),
    batch_size=32,
    class_mode='binary',
    subset='validation'
)

# 2. Build CNN Model
from tensorflow.keras import models, layers

model = models.Sequential([
    layers.Conv2D(32, (3,3), activation='relu', input_shape=(150,150,3)),
    layers.MaxPooling2D((2,2)),
    layers.Conv2D(64, (3,3), activation='relu'),
    layers.MaxPooling2D((2,2)),
    layers.Conv2D(128, (3,3), activation='relu'),
    layers.MaxPooling2D((2,2)),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(1, activation='sigmoid')
])

# 3. Compile and Train
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
history = model.fit(train_data, validation_data=val_data, epochs=5)

# 4. Save the model
model.save("chest_xray_cnn.h5")

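# 5. Deploy with Streamlit
# Save the code inside the string below as app.py, then run: streamlit run app.py
"""
import streamlit as st
from tensorflow.keras.models import load_model
from tensorflow.keras.preprocessing import image
import numpy as np

st.title("Chest X-ray Pneumonia Classifier")

# Cache the model so it is loaded once instead of on every rerun of the script
@st.cache_resource
def get_model():
    return load_model("chest_xray_cnn.h5")

model = get_model()

uploaded_file = st.file_uploader("Upload an X-ray image", type=["jpg","jpeg","png"])
if uploaded_file:
    # Apply the same preprocessing as in training: resize to 150x150, rescale to [0, 1]
    img = image.load_img(uploaded_file, target_size=(150,150))
    img_array = image.img_to_array(img)/255.0
    img_array = np.expand_dims(img_array, axis=0)  # add the batch dimension
    # The sigmoid output is the probability of class index 1. flow_from_directory assigns
    # indices alphabetically, so this assumes the class folders sort as Normal (0), Pneumonia (1)
    prediction = model.predict(img_array)[0][0]
    label = "Pneumonia" if prediction > 0.5 else "Normal"
    st.write(f"Prediction: {label}")
"""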
