Question 1: What is the role of filters and feature maps in a Convolutional Neural
Network (CNN)?

    -> In a Convolutional Neural Network (CNN), filters (also called kernels) are small matrices of learned weights that slide over the input image, or over the previous layer’s feature maps, to detect specific patterns such as edges, textures, or shapes. Each filter responds to one kind of pattern. Applying a filter produces a feature map, which records where in the input that pattern appears and how strongly. As the network goes deeper, feature maps capture increasingly complex patterns, helping the CNN learn hierarchical representations of the data.
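To make the relationship between a filter and its feature map concrete, here is a minimal sketch in plain Python. The `convolve2d` helper, the 5×5 toy image, and the hand-written vertical-edge kernel are illustrative assumptions; a real CNN learns its filter values during training, and, like most deep learning libraries, this sketch computes a cross-correlation (the kernel is not flipped).

```python
def convolve2d(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Weighted sum of the k x k patch currently under the filter
            total = sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(k)
                for dj in range(k)
            )
            row.append(total)
        feature_map.append(row)
    return feature_map


# Toy image: dark (0) on the left, bright (1) on the right
image = [[0, 0, 1, 1, 1] for _ in range(5)]

# Hand-written vertical-edge filter (a trained CNN would learn such weights)
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d(image, vertical_edge)
for row in feature_map:
    print(row)  # each row is [3, 3, 0]
```

The large values sit where the patch straddles the dark-to-bright boundary, and the response drops to zero over the uniform bright region, which is the "presence and location" information a feature map carries.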


Question 2: Explain the concepts of padding and stride in CNNs (Convolutional Neural Networks). How do they affect the output dimensions of feature maps?

    -> Padding in CNNs adds extra pixels (usually zeros) around the input so that the spatial dimensions of the output feature map can be controlled. It helps preserve information at the edges and, with enough padding (“same” padding at stride 1), prevents the output from shrinking after convolution. Stride is the number of pixels the filter moves at each step; a larger stride reduces the output size, while a stride of one keeps it close to the input dimensions. For an input of size N, filter size K, padding P, and stride S, the output size along each dimension is ⌊(N + 2P − K) / S⌋ + 1, so padding and stride together determine how much spatial resolution the feature map retains.
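The effect of padding and stride on output size can be checked with a few lines of arithmetic. The sketch below uses the standard formula floor((N + 2P − K) / S) + 1 for one spatial dimension; the function name `conv_output_size` is made up for this example.

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Output length along one spatial dimension: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * padding - kernel) // stride + 1


print(conv_output_size(28, 3))                       # 26: no padding shrinks the map
print(conv_output_size(28, 3, padding=1))            # 28: one pixel of padding keeps the size
print(conv_output_size(28, 3, stride=2, padding=1))  # 14: stride 2 halves it
```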


Question 3: Define receptive field in the context of CNNs. Why is it important for deep architectures?    

    -> The receptive field in a CNN refers to the specific region of the input image that a particular neuron in a feature map responds to. It represents how much of the input contributes to that neuron’s activation. In deeper architectures, the receptive field increases with each layer, allowing the network to capture more complex and global patterns rather than just local details. This expansion is crucial for understanding higher-level features like shapes, objects, and overall image context.
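How the receptive field grows with depth can be sketched with a short calculation. The helper `receptive_field` below is hypothetical and assumes each layer is described only by its kernel size and stride (dilation is ignored). The example layer list mirrors the Conv2D(3×3) → MaxPooling(2×2) → Conv2D(3×3) → MaxPooling(2×2) stack used for MNIST in Question 6.

```python
def receptive_field(layers):
    """Receptive field of one output unit.

    `layers` is a list of (kernel_size, stride) pairs ordered from input to output.
    """
    rf = 1    # a single input pixel sees only itself
    jump = 1  # distance, in input pixels, between neighbouring units of the current layer
    for kernel, stride in layers:
        rf += (kernel - 1) * jump
        jump *= stride
    return rf


mnist_like = [(3, 1), (2, 2), (3, 1), (2, 2)]  # conv, pool, conv, pool
print(receptive_field(mnist_like[:1]))  # 3: after the first 3x3 convolution
print(receptive_field(mnist_like))      # 10: after the full stack
```

Each unit after the second pooling layer therefore depends on a 10×10 patch of the 28×28 input, even though no single filter is larger than 3×3.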


Question 4: Discuss how filter size and stride influence the number of parameters in a CNN.

    -> Filter size directly affects the number of parameters in a CNN: each filter holds (kernel height × kernel width × input channels) weights plus one bias, and a layer has one such filter per output channel. Larger filters therefore mean more parameters and higher computational cost. Stride, on the other hand, controls how far the filter moves between positions. It doesn’t change the convolutional layer’s parameter count, but it determines how many positions (and hence activations) are computed. A larger stride reduces the number of output activations and the overall computation. It can also indirectly shrink a later fully connected layer, since fewer activations feed into it after flattening.
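The parameter count of a convolutional layer can be worked out by hand. This is a small sketch with a made-up helper name, `conv2d_params`; note that stride does not appear among its arguments because it plays no part in the count.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters, bias=True):
    """Trainable parameters in a standard 2D convolution layer."""
    per_filter = kernel_h * kernel_w * in_channels + (1 if bias else 0)
    return per_filter * filters


print(conv2d_params(3, 3, 1, 32))   # 320: 3x3 filters on a 1-channel input
print(conv2d_params(3, 3, 32, 64))  # 18496: 3x3 filters on 32 input channels
print(conv2d_params(5, 5, 32, 64))  # 51264: 5x5 filters cost almost 3x as much
```

The first two results correspond to the two Conv2D layers of the MNIST model in Question 6.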


Question 5: Compare and contrast different CNN-based architectures like LeNet,
AlexNet, and VGG in terms of depth, filter sizes, and performance.

    -> LeNet is one of the earliest CNN architectures: relatively shallow, with about 5–7 layers and 5×5 filters, and designed for simple tasks like digit recognition. AlexNet is deeper, with 8 learned layers (5 convolutional and 3 fully connected) and larger filters in the initial layers (11×11, then 5×5). It popularized ReLU activation and dropout and achieved breakthrough performance on large-scale datasets like ImageNet. VGG goes deeper still, with 16–19 layers, and standardizes on small 3×3 filters stacked in sequence, which improves feature extraction and accuracy at the cost of higher computational requirements.
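One way to see why stacking small 3×3 filters is attractive is to compare weights and receptive field against a single larger filter. The sketch below is only arithmetic: the helper names are invented, biases are ignored, and the channel width of 64 is an assumed value for illustration.

```python
def conv_weights(kernel, channels):
    """Weights in one conv layer with `channels` inputs and outputs (bias ignored)."""
    return kernel * kernel * channels * channels


def stacked_receptive_field(kernel, depth):
    """Receptive field of `depth` stride-1 convolutions that share one kernel size."""
    return 1 + depth * (kernel - 1)


channels = 64  # assumed channel width, for illustration only
stack_3x3 = 3 * conv_weights(3, channels)  # three 3x3 layers in sequence
single_7x7 = conv_weights(7, channels)     # one 7x7 layer

print(stacked_receptive_field(3, 3), stack_3x3)   # 7 110592
print(stacked_receptive_field(7, 1), single_7x7)  # 7 200704
```

Both options see a 7×7 region of the input, but the 3×3 stack uses roughly half the weights and applies three nonlinearities instead of one.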

    

In [1]:
# Question 6: Using Keras, build and train a simple CNN model on the MNIST dataset from scratch. Include code for model creation, compilation, training, and evaluation.

import tensorflow as tf
from tensorflow.keras import datasets, layers, models

(x_train, y_train), (x_test, y_test) = datasets.mnist.load_data()
x_train = x_train.reshape(-1, 28, 28, 1).astype('float32') / 255.0
x_test = x_test.reshape(-1, 28, 28, 1).astype('float32') / 255.0

# CNN model
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(10, activation='softmax')
])


model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=5, batch_size=64, validation_split=0.1)

# Evaluation
test_loss, test_acc = model.evaluate(x_test, y_test)
print(f"Test Accuracy: {test_acc:.4f}")

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
11490434/11490434 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step




Epoch 1/5
844/844 ━━━━━━━━━━━━━━━━━━━━ 28s 32ms/step - accuracy: 0.8789 - loss: 0.4039 - val_accuracy: 0.9800 - val_loss: 0.0681
Epoch 2/5
844/844 ━━━━━━━━━━━━━━━━━━━━ 42s 33ms/step - accuracy: 0.9834 - loss: 0.0535 - val_accuracy: 0.9898 - val_loss: 0.0374
Epoch 3/5
844/844 ━━━━━━━━━━━━━━━━━━━━ 26s 31ms/step - accuracy: 0.9887 - loss: 0.0346 - val_accuracy: 0.9912 - val_loss: 0.0330
Epoch 4/5
844/844 ━━━━━━━━━━━━━━━━━━━━ 27s 31ms/step - accuracy: 0.9917 - loss: 0.0264 - val_accuracy: 0.9910 - val_loss: 0.0315
Epoch 5/5
844/844 ━━━━━━━━━━━━━━━━━━━━ 40s 30ms/step - accuracy: 0.9943 - loss: 0.0179 - val_accuracy: 0.9888 - val_loss: 0.0358
313/313 ━━━━━━━━━━━━━━━━━━━━ 2s 7ms/step - accuracy: 0.9860 - loss: 0.0408
Test Accuracy: 0.9895


In [4]:
# Question 7: Load and preprocess the CIFAR-10 dataset using Keras, and create a CNN model to classify RGB images. Show your preprocessing and architecture.

from tensorflow.keras import datasets, layers, models
from tensorflow.keras.utils import to_categorical

(x_train, y_train), (x_test, y_test) = datasets.cifar10.load_data()

# Normalize pixel values
x_train, x_test = x_train.astype('float32') / 255.0, x_test.astype('float32') / 255.0

# One-hot encoding
y_train, y_test = to_categorical(y_train, 10), to_categorical(y_test, 10)

model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(128, (3, 3), activation='relu'),
    layers.Flatten(),
    layers.Dense(128, activation='relu'),
    layers.Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

model.fit(x_train, y_train, epochs=10, batch_size=64, validation_split=0.1)

test_loss, test_acc = model.evaluate(x_test, y_test)
print(f"Test Accuracy: {test_acc:.4f}")

Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170498071/170498071 ━━━━━━━━━━━━━━━━━━━━ 2s 0us/step




Epoch 1/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 45s 63ms/step - accuracy: 0.3459 - loss: 1.7778 - val_accuracy: 0.5346 - val_loss: 1.2997
Epoch 2/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 49s 70ms/step - accuracy: 0.5580 - loss: 1.2439 - val_accuracy: 0.6232 - val_loss: 1.0908
Epoch 3/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 75s 60ms/step - accuracy: 0.6279 - loss: 1.0520 - val_accuracy: 0.6510 - val_loss: 0.9979
Epoch 4/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 48s 68ms/step - accuracy: 0.6758 - loss: 0.9225 - val_accuracy: 0.6366 - val_loss: 1.0425
Epoch 5/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 50s 71ms/step - accuracy: 0.7083 - loss: 0.8419 - val_accuracy: 0.7026 - val_loss: 0.8663
Epoch 6/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 53s 74ms/step - accuracy: 0.7386 - loss: 0.7519 - val_accuracy: 0.7156 - val_loss: 0.8427
Epoch 7/10

In [5]:
# Question 8: Using PyTorch, write a script to define and train a CNN on the MNIST dataset. Include model definition, data loaders, training loop, and accuracy evaluation.

import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

# Data preprocessing and loaders
transform = transforms.Compose([transforms.ToTensor(), transforms.Normalize((0.5,), (0.5,))])
train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)
train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=64, shuffle=False)

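class CNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(1, 32, 3)    # 1x28x28 -> 32x26x26
        self.pool = nn.MaxPool2d(2, 2)      # halves height and width
        self.conv2 = nn.Conv2d(32, 64, 3)   # 32x13x13 -> 64x11x11
        self.fc1 = nn.Linear(64*5*5, 128)   # input is 64x5x5 after the second pool
        self.fc2 = nn.Linear(128, 10)       # one logit per digit class

    def forward(self, x):
        x = self.pool(torch.relu(self.conv1(x)))
        x = self.pool(torch.relu(self.conv2(x)))
        x = x.view(-1, 64*5*5)              # flatten to (batch, 1600)
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)                     # raw logits; CrossEntropyLoss applies log-softmax
        return x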

model = CNN().to(device)

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)


for epoch in range(5):
    model.train()
    running_loss = 0.0
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
    print(f"Epoch {epoch+1}, Loss: {running_loss/len(train_loader):.4f}")

model.eval()
correct = 0
total = 0
with torch.no_grad():
    for images, labels in test_loader:
        images, labels = images.to(device), labels.to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print(f"Test Accuracy: {correct/total:.4f}")


100%|██████████| 9.91M/9.91M [00:00<00:00, 33.3MB/s]
100%|██████████| 28.9k/28.9k [00:00<00:00, 1.16MB/s]
100%|██████████| 1.65M/1.65M [00:00<00:00, 9.39MB/s]
100%|██████████| 4.54k/4.54k [00:00<00:00, 4.09MB/s]


Epoch 1, Loss: 0.1585
Epoch 2, Loss: 0.0456
Epoch 3, Loss: 0.0319
Epoch 4, Loss: 0.0233
Epoch 5, Loss: 0.0185
Test Accuracy: 0.9891


In [9]:
# Question 9: Given a custom image dataset stored in a local directory, write code using Keras ImageDataGenerator to preprocess and train a CNN model.

# Import required libraries
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.utils import to_categorical

# Load CIFAR-10 dataset
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

# Normalize pixel values to [0,1]
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0

# One-hot encode labels
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

# Build CNN model
model = Sequential([
    Conv2D(32, (3,3), activation='relu', input_shape=(32,32,3)),
    MaxPooling2D((2,2)),
    Conv2D(64, (3,3), activation='relu'),
    MaxPooling2D((2,2)),
    Conv2D(128, (3,3), activation='relu'),
    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.5),
    Dense(10, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
model.fit(x_train, y_train, epochs=10, batch_size=64, validation_split=0.1)

# Evaluate the model
test_loss, test_acc = model.evaluate(x_test, y_test)
print(f"Test Accuracy: {test_acc:.4f}")


Epoch 1/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 47s 64ms/step - accuracy: 0.2857 - loss: 1.9077 - val_accuracy: 0.4848 - val_loss: 1.4362
Epoch 2/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 42s 60ms/step - accuracy: 0.5021 - loss: 1.3908 - val_accuracy: 0.5746 - val_loss: 1.1865
Epoch 3/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 44s 62ms/step - accuracy: 0.5725 - loss: 1.2079 - val_accuracy: 0.6198 - val_loss: 1.0444
Epoch 4/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 82s 62ms/step - accuracy: 0.6180 - loss: 1.0913 - val_accuracy: 0.6722 - val_loss: 0.9446
Epoch 5/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 78s 56ms/step - accuracy: 0.6544 - loss: 0.9969 - val_accuracy: 0.6938 - val_loss: 0.8917
Epoch 6/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 39s 56ms/step - accuracy: 0.6720 - loss: 0.9274 - val_accuracy: 0.6958 - val_loss: 0.8619
Epoch 7/10
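
The cell above trains on CIFAR-10 loaded into memory. For images stored in a local directory, as the question describes, a sketch along the following lines may be closer to what is asked. The path `data/train`, the 64×64 target size, the 90/10 split, and the helper names are all assumptions for illustration. `ImageDataGenerator.flow_from_directory` expects one subfolder per class, and the class is a legacy API in recent Keras releases. TensorFlow is imported inside the functions so that the folder check can run on its own.

```python
import os


def list_classes(data_dir):
    """Return the class names, i.e. the subfolders that flow_from_directory will label."""
    return sorted(
        name for name in os.listdir(data_dir)
        if os.path.isdir(os.path.join(data_dir, name))
    )


def make_generators(data_dir, img_size=(64, 64), batch_size=32):
    """Build training and validation generators from one directory (assumed 90/10 split)."""
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    datagen = ImageDataGenerator(
        rescale=1.0 / 255,      # normalize pixels to [0, 1]
        rotation_range=15,      # light augmentation
        horizontal_flip=True,
        validation_split=0.1,
    )
    common = dict(target_size=img_size, batch_size=batch_size, class_mode="categorical")
    train_gen = datagen.flow_from_directory(data_dir, subset="training", **common)
    val_gen = datagen.flow_from_directory(data_dir, subset="validation", **common)
    return train_gen, val_gen


def build_model(input_shape, num_classes):
    """Small CNN along the lines of the CIFAR-10 model above."""
    from tensorflow.keras import layers, models

    model = models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation="relu"),
        layers.MaxPooling2D((2, 2)),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(num_classes, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="categorical_crossentropy", metrics=["accuracy"])
    return model


DATA_DIR = "data/train"  # hypothetical path; replace with your own dataset folder
if os.path.isdir(DATA_DIR):
    train_gen, val_gen = make_generators(DATA_DIR)
    model = build_model((64, 64, 3), num_classes=len(list_classes(DATA_DIR)))
    model.fit(train_gen, validation_data=val_gen, epochs=10)
else:
    print(f"{DATA_DIR} not found; nothing to train on.")
```

One caveat with this shortcut: because both subsets come from the same generator, the validation images are augmented too. A separate generator with only `rescale` would give a cleaner validation signal.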

Question 10: You are working on a web application for a medical imaging startup. Your task is to build and deploy a CNN model that classifies chest X-ray images into “Normal” and “Pneumonia” categories. Describe your end-to-end approach from data preparation and model training to deploying the model as a web app using Streamlit.

    -> To build and deploy a CNN-based chest X-ray classifier, I would begin with data preparation: collecting the images, organizing them into “Normal” and “Pneumonia” folders, and splitting them into training, validation, and test sets. Using Keras’ ImageDataGenerator, I would normalize pixel values and apply augmentations like rotation, zoom, and flip to reduce overfitting. Next, I would design a CNN model (e.g., Conv2D → MaxPooling → Dense → Sigmoid) or fine-tune a pretrained model like VGG16 for better accuracy. After compiling with the Adam optimizer and binary cross-entropy loss, I would train the model and validate it on held-out data.

    Once trained, I’d save the model as an .h5 file and create a Streamlit web app where users can upload X-ray images. The app would load the saved model, preprocess the uploaded image, and display the prediction (“Normal” or “Pneumonia”) along with confidence scores. Finally, I’d deploy the app using Streamlit Cloud, Render, or AWS, ensuring it includes a clean UI, proper error handling, and an intuitive workflow for medical professionals or end users.
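
The Streamlit part of this workflow could look roughly like the sketch below. Several things are assumptions rather than facts from the text: the file name `xray_cnn.h5`, the 150×150 RGB input size, and a model that ends in a single sigmoid unit whose output is the probability of “Pneumonia” (consistent with binary cross-entropy). The helper `interpret` is a made-up name. Third-party imports sit inside `main` so the label logic can be reused on its own.

```python
def interpret(prob_pneumonia, threshold=0.5):
    """Map a sigmoid output to a label and the confidence in that label."""
    if prob_pneumonia >= threshold:
        return "Pneumonia", prob_pneumonia
    return "Normal", 1.0 - prob_pneumonia


def main():
    import streamlit as st
    import numpy as np
    from PIL import Image
    from tensorflow.keras.models import load_model

    MODEL_PATH = "xray_cnn.h5"  # hypothetical file name
    IMG_SIZE = (150, 150)       # assumed; must match the size used in training

    @st.cache_resource          # load the model once, not on every rerun
    def get_model():
        return load_model(MODEL_PATH)

    st.title("Chest X-ray classifier")
    uploaded = st.file_uploader("Upload a chest X-ray", type=["png", "jpg", "jpeg"])
    if uploaded is None:
        return  # nothing to classify yet

    image = Image.open(uploaded).convert("RGB").resize(IMG_SIZE)
    st.image(image, caption="Uploaded image")

    # Same preprocessing as in training: scale to [0, 1] and add a batch dimension
    batch = np.expand_dims(np.asarray(image, dtype="float32") / 255.0, axis=0)
    prob = float(get_model().predict(batch)[0][0])

    label, confidence = interpret(prob)
    st.write(f"Prediction: {label} ({confidence:.1%} confidence)")


if __name__ == "__main__":
    try:
        main()
    except ImportError as exc:
        # Streamlit, Pillow, NumPy, or TensorFlow is not installed
        print(f"Missing dependency: {exc}")
```

Saved as `app.py`, this would be started with `streamlit run app.py`. The 0.5 threshold is only a default; in a medical setting it would normally be tuned on validation data to balance missed cases against false alarms.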