Question 1: What is the role of filters and feature maps in a Convolutional Neural
Network (CNN)?

Answer: In a Convolutional Neural Network (CNN), filters and feature maps are the mechanism for extracting useful information from input data such as images. A filter (also called a kernel) is a small matrix of learnable weights that slides over the input and performs a convolution: at each position it computes a weighted sum of the values it covers. Filters are not hand-designed; during training each one learns to detect a specific feature such as an edge, corner, texture, or pattern, and it responds strongly wherever that feature is present. The result of applying one filter across the input is a feature map, which shows where and how strongly that feature appears, so a layer with many filters produces one feature map per filter. In deeper layers, filters operate on the feature maps of earlier layers, so they respond to increasingly complex features, and the network builds up its understanding of the image step by step before the final recognition or classification.
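
To make the filter-to-feature-map step concrete, here is a minimal pure-Python sketch. The helper name `conv2d_valid`, the tiny 4×4 image, and the hand-written edge kernel are illustrative assumptions (a real CNN learns its kernel values), and, like most deep learning libraries, the sketch applies the kernel without flipping it.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the patch under the kernel element-wise and sum the result.
            total = sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(k_h)
                for dj in range(k_w)
            )
            row.append(total)
        feature_map.append(row)
    return feature_map


# 4x4 image: dark left half (0), bright right half (1).
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-written vertical-edge kernel (illustrative only).
vertical_edge = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d_valid(image, vertical_edge)
for row in feature_map:
    print(row)
# [0, 2, 0]
# [0, 2, 0]
# [0, 2, 0]
```

The feature map is non-zero only in the middle column, which is where the dark-to-bright edge sits in the input.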


Question 2: Explain the concepts of padding and stride in CNNs (Convolutional Neural
Networks). How do they affect the output dimensions of feature maps?

Answer: Padding and stride control how the convolution operation is applied to the input. Padding adds extra pixels, usually zeros, around the border of the input before the filter is applied. Without padding, a k×k filter shrinks each spatial dimension by k − 1, and border pixels contribute to fewer outputs than central ones; padding preserves the spatial size and keeps information at the edges. Stride is the number of pixels the filter moves at each step. A stride of 1 moves the filter one pixel at a time and produces the largest feature map, while a stride of s skips pixels and shrinks each dimension by roughly a factor of s. Together they determine the output dimensions of the feature map: more padding maintains or increases its size, and a larger stride reduces it.
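
The effect on output size can be checked numerically. The sketch below assumes the standard relation for a square input of width n, filter size k, padding p on each side, and stride s: output width = floor((n + 2p − k) / s) + 1. The function name `conv_output_size` is a hypothetical helper, not a library call.

```python
def conv_output_size(n, k, p=0, s=1):
    """Output width of a convolution: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1


# 28x28 input, 3x3 filter, no padding, stride 1: shrinks by k - 1 = 2.
print(conv_output_size(28, 3))            # 26

# Padding of 1 preserves the size for a 3x3 filter at stride 1.
print(conv_output_size(28, 3, p=1))       # 28

# Same padding, stride 2: the size is roughly halved.
print(conv_output_size(28, 3, p=1, s=2))  # 14

# 32x32 input, 5x5 filter, no padding, stride 2.
print(conv_output_size(32, 5, s=2))       # 14
```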


Question 3: Define receptive field in the context of CNNs. Why is it important for deep
architectures?

Answer: In a Convolutional Neural Network, the receptive field of a neuron is the region of the input image that can influence its value, that is, the region the neuron is able to “see”. In early layers the receptive field is small, about the size of a single filter, so these neurons respond to simple features like edges or corners. It grows with depth because each layer combines neighboring outputs of the previous one, and pooling or strided layers make it grow faster. This matters for deep architectures because a larger receptive field lets the network capture more global and complex patterns, such as shapes or whole objects, rather than just small details. By gradually increasing the receptive field, deep CNNs can represent both local features and overall structure in the input, which improves performance in tasks like image classification and object detection.
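
How quickly the receptive field grows can be worked out layer by layer. The sketch below assumes the usual recurrence for a chain of convolution or pooling layers: each layer adds (kernel size − 1) times the current step between neighboring outputs (measured in input pixels), and that step is then multiplied by the layer's stride. The function name `receptive_field` and the layer lists are illustrative assumptions.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) after a chain of (kernel_size, stride) layers."""
    field = 1  # a single input pixel sees only itself
    jump = 1   # distance in input pixels between neighboring outputs
    for kernel_size, stride in layers:
        field += (kernel_size - 1) * jump
        jump *= stride
    return field


# Stacking 3x3 convolutions at stride 1 grows the field by 2 per layer.
print(receptive_field([(3, 1)]))                  # 3
print(receptive_field([(3, 1), (3, 1)]))          # 5
print(receptive_field([(3, 1), (3, 1), (3, 1)]))  # 7

# A conv3 -> pool2 -> conv3 -> pool2 stack: pooling (stride 2) speeds up the growth.
print(receptive_field([(3, 1), (2, 2), (3, 1), (2, 2)]))  # 10
```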


Question 4: Discuss how filter size and stride influence the number of parameters in a
CNN.

Answer: In a Convolutional Neural Network, filter size affects the number of parameters directly, while stride affects it only indirectly. Larger filters contain more weights to learn: a 5×5 filter has 25 weights per input channel, compared with 9 for a 3×3 filter, and a layer's total also scales with the number of input channels and the number of filters. Larger filters therefore increase model complexity and memory usage. Stride does not change the number of parameters in the filters themselves, because the same weights are reused at every position; it only changes how many positions the filter is applied at. A larger stride reduces the size of the resulting feature map, which lowers the number of computations and reduces the parameters of any later layer whose size depends on the flattened feature map, such as a fully connected layer. In short, filter size changes the parameter count directly, while stride influences it indirectly by controlling feature map dimensions.
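
A short calculation illustrates both effects. The helper names below are hypothetical; the sketch assumes square filters, one bias per filter, no padding, and a 28×28 single-channel input feeding a 32-filter convolution followed directly by a 64-unit dense layer.

```python
def conv2d_params(k, c_in, c_out):
    """Weights plus one bias per filter: (k * k * c_in + 1) * c_out."""
    return (k * k * c_in + 1) * c_out


def dense_params(n_in, n_out):
    """Weights plus one bias per output unit."""
    return (n_in + 1) * n_out


def conv_output_size(n, k, s=1):
    """Output width with no padding: floor((n - k) / s) + 1."""
    return (n - k) // s + 1


# Filter size changes the convolution's own parameter count.
print(conv2d_params(3, 1, 32))  # 320
print(conv2d_params(5, 1, 32))  # 832

# Stride leaves the convolution's parameters unchanged (still 320),
# but shrinks the feature map that a following dense layer must read.
for stride in (1, 2):
    width = conv_output_size(28, 3, s=stride)
    flattened = width * width * 32
    print(stride, width, dense_params(flattened, 64))
# 1 26 1384512
# 2 13 346176
```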


Question 5: Compare and contrast different CNN-based architectures like LeNet,
AlexNet, and VGG in terms of depth, filter sizes, and performance.

Answer: LeNet, AlexNet, and VGG are popular CNN-based architectures that differ in depth, filter sizes, and performance. LeNet (LeNet-5) is one of the earliest CNN models and is shallow: two convolutional layers with 5×5 filters, each followed by subsampling, and then a few fully connected layers. It was designed for simple tasks like handwritten digit recognition and performs well on small datasets. AlexNet is deeper, with five convolutional and three fully connected layers, and uses large filters in its first layers (11×11 with stride 4, then 5×5) before switching to 3×3. It popularized ReLU activation and dropout, which significantly improved performance on large-scale image classification, and it won the ImageNet (ILSVRC) 2012 competition. VGG is deeper still, with 16 or 19 weight layers in its common variants, and is known for its very uniform design built almost entirely from small 3×3 filters. This increased depth allows VGG to learn more complex features and reach better accuracy, but it also requires far more computation and memory. Overall, as depth increases from LeNet to VGG, performance improves, but so does computational cost.
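
One commonly cited reason for VGG's small filters is that a stack of 3×3 convolutions covers the same input region as one larger filter while using fewer weights: two stacked 3×3 layers see a 5×5 region, and three see a 7×7 region. The sketch below compares weight counts under illustrative assumptions: 64 input and 64 output channels in every layer (an arbitrary choice) and biases ignored.

```python
def conv_weights(k, channels):
    """Weights in one k x k convolution with `channels` inputs and outputs (no bias)."""
    return k * k * channels * channels


C = 64  # illustrative channel count, not taken from any specific model

two_3x3 = 2 * conv_weights(3, C)    # same 5x5 receptive field as one 5x5 layer
one_5x5 = conv_weights(5, C)
three_3x3 = 3 * conv_weights(3, C)  # same 7x7 receptive field as one 7x7 layer
one_7x7 = conv_weights(7, C)

print(two_3x3, one_5x5)    # 73728 102400
print(three_3x3, one_7x7)  # 110592 200704
```

The stacked version also applies a non-linearity after each 3×3 layer, which a single large filter does not.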


Question 6: Using Keras, build and train a simple CNN model on the MNIST dataset
from scratch. Include code for model creation, compilation, training, and evaluation.

(Include your Python code and output in the code box below.)

Answer:

In [1]:
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.datasets import mnist

# Load MNIST: 60,000 training and 10,000 test images, each 28x28 grayscale.
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Add a channel dimension and scale pixel values from [0, 255] to [0, 1].
x_train = x_train.reshape((-1, 28, 28, 1)) / 255.0
x_test = x_test.reshape((-1, 28, 28, 1)) / 255.0

# Two convolution + max-pooling blocks, then a small dense classifier.
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))  # one output per digit class

# Labels are integers 0-9, so use the sparse variant of cross-entropy.
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train for 5 epochs, then evaluate on the held-out test set.
model.fit(x_train, y_train, epochs=5, batch_size=64)

test_loss, test_accuracy = model.evaluate(x_test, y_test)

print("Test accuracy:", test_accuracy)


Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11490434/11490434 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step
Epoch 1/5
938/938 ━━━━━━━━━━━━━━━━━━━━ 50s 52ms/step - accuracy: 0.8671 - loss: 0.4269
Epoch 2/5
938/938 ━━━━━━━━━━━━━━━━━━━━ 83s 53ms/step - accuracy: 0.9827 - loss: 0.0550
Epoch 3/5
938/938 ━━━━━━━━━━━━━━━━━━━━ 81s 52ms/step - accuracy: 0.9889 - loss: 0.0357
Epoch 4/5
938/938 ━━━━━━━━━━━━━━━━━━━━ 49s 52ms/step - accuracy: 0.9919 - loss: 0.0274
Epoch 5/5
938/938 ━━━━━━━━━━━━━━━━━━━━ 47s 50ms/step - accuracy: 0.9935 - loss: 0.0203
313/313 ━━━━━━━━━━━━━━━━━━━━ 4s 13ms/step - accuracy: 0.9869 - loss: 0.0385
Test accuracy: 0.9890000224113464


Question 7: Load and preprocess the CIFAR-10 dataset using Keras, and create a
CNN model to classify RGB images. Show your preprocessing and architecture.

(Include your Python code and output in the code box below.)

Answer:

In [2]:
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.datasets import cifar10

# Load CIFAR-10: 50,000 training and 10,000 test images, each 32x32 RGB.
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

# Preprocessing: scale pixel values from [0, 255] to [0, 1].
x_train = x_train / 255.0
x_test = x_test / 255.0

# Three convolutional layers (the input has 3 color channels), then a dense classifier.
model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))  # one output per CIFAR-10 class

# Labels are integer class ids, so use the sparse variant of cross-entropy.
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=5, batch_size=64)

test_loss, test_accuracy = model.evaluate(x_test, y_test)

print("Test accuracy:", test_accuracy)


Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170498071/170498071 ━━━━━━━━━━━━━━━━━━━━ 3s 0us/step
Epoch 1/5
782/782 ━━━━━━━━━━━━━━━━━━━━ 64s 79ms/step - accuracy: 0.3500 - loss: 1.7742
Epoch 2/5
782/782 ━━━━━━━━━━━━━━━━━━━━ 62s 80ms/step - accuracy: 0.5539 - loss: 1.2518
Epoch 3/5
782/782 ━━━━━━━━━━━━━━━━━━━━ 65s 83ms/step - accuracy: 0.6280 - loss: 1.0619
Epoch 4/5
782/782 ━━━━━━━━━━━━━━━━━━━━ 78s 78ms/step - accuracy: 0.6695 - loss: 0.9475
Epoch 5/5
782/782 ━━━━━━━━━━━━━━━━━━━━ 62s 80ms/step - accuracy: 0.6903 - loss: 0.8795
313/313 ━━━━━━━━━━━━━━━━━━━━ 5s 14ms/step - accuracy: 0.6860 - loss: 0.8953
Test accuracy: 0.682699978351593


Question 8: Using PyTorch, write a script to define and train a CNN on the MNIST
dataset. Include model definition, data loaders, training loop, and accuracy evaluation.

(Include your Python code and output in the code box below.)

Answer:

In [6]:
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])

train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)

train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=1000, shuffle=False)

class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 32, 3)   # 1x28x28 -> 32x26x26
        self.conv2 = nn.Conv2d(32, 64, 3)  # 32x26x26 -> 64x24x24
        self.pool = nn.MaxPool2d(2, 2)     # 64x24x24 -> 64x12x12
        self.fc1 = nn.Linear(64 * 12 * 12, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = torch.relu(self.conv1(x))
        x = torch.relu(self.conv2(x))
        x = self.pool(x)
        x = x.view(x.size(0), -1)  # flatten to (batch, 64 * 12 * 12)
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)  # raw logits; CrossEntropyLoss applies log-softmax itself
        return x

model = CNN()
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

for epoch in range(5):
    model.train()
    for data, target in train_loader:
        optimizer.zero_grad()
        output = model(data)
        loss = criterion(output, target)
        loss.backward()
        optimizer.step()
    print(f"Epoch {epoch+1} completed")

model.eval()
correct = 0
total = 0

with torch.no_grad():
    for data, target in test_loader:
        output = model(data)
        _, predicted = torch.max(output, 1)
        total += target.size(0)
        correct += (predicted == target).sum().item()

accuracy = correct / total
print("Test accuracy:", accuracy)


Epoch 1 completed
Epoch 2 completed
Epoch 3 completed
Epoch 4 completed
Epoch 5 completed
Test accuracy: 0.9861


Question 9: Given a custom image dataset stored in a local directory, write code using
Keras ImageDataGenerator to preprocess and train a CNN model.

(Include your Python code and output in the code box below.)

Answer:

In [13]:
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import os
import numpy as np
from PIL import Image

train_dir = "dataset/train"
validation_dir = "dataset/validation"

# Create one random placeholder image per class so the script runs end to end.
# Replace these with real images; metrics on random noise are not meaningful.
for directory in [train_dir, validation_dir]:
    for class_name in ['class_a', 'class_b']:
        class_dir = os.path.join(directory, class_name)
        os.makedirs(class_dir, exist_ok=True)
        dummy_image_path = os.path.join(class_dir, 'dummy_image.png')
        if not os.path.exists(dummy_image_path):
            # randint's upper bound is exclusive, so 256 covers the full 0-255 range.
            pixels = np.random.randint(0, 256, (150, 150, 3), dtype=np.uint8)
            Image.fromarray(pixels).save(dummy_image_path)

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True
)

validation_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=(150, 150),
    batch_size=32,
    class_mode='categorical'
)

validation_generator = validation_datagen.flow_from_directory(
    validation_dir,
    target_size=(150, 150),
    batch_size=32,
    class_mode='categorical'
)

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(128, activation='relu'))
model.add(layers.Dense(train_generator.num_classes, activation='softmax'))

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

model.fit(
    train_generator,
    epochs=5,
    validation_data=validation_generator
)


Found 2 images belonging to 2 classes.
Found 2 images belonging to 2 classes.
Epoch 1/5
1/1 ━━━━━━━━━━━━━━━━━━━━ 3s 3s/step - accuracy: 0.5000 - loss: 0.6924 - val_accuracy: 0.5000 - val_loss: 0.7908
Epoch 2/5
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 393ms/step - accuracy: 0.5000 - loss: 0.7837 - val_accuracy: 0.5000 - val_loss: 3.2905
Epoch 3/5
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 668ms/step - accuracy: 0.5000 - loss: 2.6556 - val_accuracy: 0.5000 - val_loss: 0.8740
Epoch 4/5
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 249ms/step - accuracy: 0.5000 - loss: 0.8278 - val_accuracy: 0.5000 - val_loss: 0.8594
Epoch 5/5
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 276ms/step - accuracy: 0.5000 - loss: 0.8055 - val_accuracy: 0.5000 - val_loss: 0.7785



Question 10: You are working on a web application for a medical imaging startup. Your
task is to build and deploy a CNN model that classifies chest X-ray images into “Normal”
and “Pneumonia” categories. Describe your end-to-end approach–from data preparation
and model training to deploying the model as a web app using Streamlit.

(Include your Python code and output in the code box below.)

Answer: To build and deploy a CNN model that classifies chest X-ray images as Normal or Pneumonia, the first step is data preparation. The dataset is collected and organized into one folder per class, and the images are resized and normalized so they are suitable for training. Data augmentation is applied to the training set to improve generalization. Next, a CNN is trained: convolutional and pooling layers learn features from the X-ray images, and dense layers perform the classification. After training, the model is saved to disk. For deployment, Streamlit is used to create a simple web application where users can upload an X-ray image. The uploaded image is preprocessed in exactly the same way as the training data and passed to the saved model, and the app displays whether the prediction is Normal or Pneumonia. This end-to-end approach makes the model accessible through a web interface.

In [16]:
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import streamlit as st
from PIL import Image
import numpy as np
import os

train_dir = "chest_xray/train"
val_dir = "chest_xray/val"

# Create one random placeholder image per class so the script runs end to end.
# Replace these with real chest X-rays; metrics on random noise are not meaningful.
for directory in [train_dir, val_dir]:
    for class_name in ['NORMAL', 'PNEUMONIA']:
        class_dir = os.path.join(directory, class_name)
        os.makedirs(class_dir, exist_ok=True)
        dummy_image_path = os.path.join(class_dir, 'dummy_image.png')
        if not os.path.exists(dummy_image_path):
            # randint's upper bound is exclusive, so 256 covers the full 0-255 range.
            pixels = np.random.randint(0, 256, (224, 224, 3), dtype=np.uint8)
            Image.fromarray(pixels).save(dummy_image_path)

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=15,
    zoom_range=0.2,
    horizontal_flip=True
)

val_datagen = ImageDataGenerator(rescale=1./255)

train_gen = train_datagen.flow_from_directory(
    train_dir,
    target_size=(224, 224),
    batch_size=32,
    class_mode='binary'
)

val_gen = val_datagen.flow_from_directory(
    val_dir,
    target_size=(224, 224),
    batch_size=32,
    class_mode='binary'
)

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(224, 224, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(128, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(128, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))

model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])

model.fit(train_gen, epochs=5, validation_data=val_gen)

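# Save the trained model so the web app can load it without retraining.
model.save("pneumonia_model.h5")

# Streamlit front end. In a real deployment this part would live in its own
# script (launched with `streamlit run`), separate from the training code above.
st.title("Chest X-ray Classification")
uploaded_file = st.file_uploader("Upload Chest X-ray Image", type=["jpg", "png", "jpeg"])

if uploaded_file is not None:
    # X-rays are often single-channel; convert to RGB to match the model's input shape.
    image = Image.open(uploaded_file).convert("RGB").resize((224, 224))
    img_array = np.array(image) / 255.0
    img_array = np.expand_dims(img_array, axis=0)  # add the batch dimension

    loaded_model = tf.keras.models.load_model("pneumonia_model.h5")
    prediction = loaded_model.predict(img_array)

    # flow_from_directory assigns class indices alphabetically: NORMAL=0, PNEUMONIA=1.
    if prediction[0][0] > 0.5:
        st.write("Prediction: Pneumonia")
    else:
        st.write("Prediction: Normal")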


Found 2 images belonging to 2 classes.
Found 2 images belonging to 2 classes.
Epoch 1/5
1/1 ━━━━━━━━━━━━━━━━━━━━ 3s 3s/step - accuracy: 0.5000 - loss: 0.6915 - val_accuracy: 0.5000 - val_loss: 3.7004
Epoch 2/5
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 555ms/step - accuracy: 0.5000 - loss: 2.8584 - val_accuracy: 0.5000 - val_loss: 1.0202
Epoch 3/5
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 567ms/step - accuracy: 0.5000 - loss: 0.8993 - val_accuracy: 0.5000 - val_loss: 0.7425
Epoch 4/5
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 629ms/step - accuracy: 0.5000 - loss: 0.7243 - val_accuracy: 0.5000 - val_loss: 0.6935
Epoch 5/5
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 1s/step - accuracy: 0.5000 - loss: 0.6930 - val_accuracy: 0.5000 - val_loss: 0.7132


