# CNN Architecture Assignment


---



## Question 1
**What is the role of filters and feature maps in a Convolutional Neural Network (CNN)?**

**Answer:**
Filters (kernels) are small matrices that slide over the input image to detect specific patterns such as edges, textures, or shapes. Each filter performs a convolution operation to extract a particular feature.

Feature maps are the outputs produced after applying filters to the input. Each feature map highlights the presence and location of a specific feature detected by its corresponding filter. Together, filters and feature maps enable CNNs to learn hierarchical visual features.
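As a minimal sketch of this idea, the snippet below slides a hand-picked vertical-edge filter over a toy image and prints the resulting feature map. The image, the filter values, and the helper name `conv2d_valid` are illustrative assumptions; in a real CNN the filter weights are learned, and the operation (as in most deep learning libraries) is technically a cross-correlation.

```python
def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1) and return the feature map."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the kernel with the patch under it, element by element, and sum.
            total = sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            row.append(total)
        feature_map.append(row)
    return feature_map


# Toy 3x6 image (hypothetical data): dark left half (0), bright right half (1).
image = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]

# Hand-picked filter that responds to dark-to-bright transitions from left to right.
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = conv2d_valid(image, vertical_edge)
print(feature_map)  # [[0, 3, 3, 0]]
```

The feature map is zero over the flat regions and peaks where the filter window straddles the edge, which is how a feature map encodes both the presence and the location of a pattern.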

## Question 2
**Explain the concepts of padding and stride in CNNs. How do they affect output dimensions?**

**Answer:**
Padding involves adding extra pixels (usually zeros) around the input image to control the spatial size of the output. Stride defines the step size by which the filter moves across the input.

Padding helps preserve spatial dimensions, while larger strides reduce output size. Output size depends on input size, filter size, padding, and stride.
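The dependence on these four quantities can be sketched with the standard formula `output = floor((N + 2P - F) / S) + 1`, where `N` is the input size, `F` the filter size, `P` the padding, and `S` the stride. The helper name `conv_output_size` is an assumption made for illustration.

```python
def conv_output_size(input_size, filter_size, padding=0, stride=1):
    """Spatial output size of a convolution or pooling layer along one dimension."""
    return (input_size + 2 * padding - filter_size) // stride + 1


# 28x28 input (for example, an MNIST digit) with a 3x3 filter:
print(conv_output_size(28, 3))                       # 26 (no padding shrinks the output)
print(conv_output_size(28, 3, padding=1))            # 28 (padding of 1 preserves the size)
print(conv_output_size(28, 3, padding=1, stride=2))  # 14 (stride 2 roughly halves it)
```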

## Question 3
**Define receptive field in CNNs. Why is it important?**

**Answer:**
The receptive field is the region of the input image that affects a particular neuron in a CNN. As depth increases, the receptive field grows, allowing neurons to capture larger contextual information.

It is important because larger receptive fields help deep CNNs understand complex patterns and global structures.
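One way to see how the receptive field grows with depth is the usual recurrence: each layer adds `(kernel - 1) * jump` input pixels, where `jump` is the product of the strides of all earlier layers. The function name `receptive_field` and the layer lists below are assumptions chosen for illustration.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) after a stack of (kernel_size, stride) layers."""
    rf, jump = 1, 1
    for kernel, stride in layers:
        rf += (kernel - 1) * jump  # each layer widens the view by (kernel - 1) steps
        jump *= stride             # strides make later steps span more input pixels
    return rf


print(receptive_field([(3, 1)]))                  # 3
print(receptive_field([(3, 1), (3, 1)]))          # 5
print(receptive_field([(3, 1), (2, 2), (3, 1)]))  # 8 (3x3 conv, 2x2 pool, 3x3 conv)
```

Stacking layers, and especially inserting strided or pooling layers, enlarges the region of the input that each deep neuron can see.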

## Question 4
**Discuss how filter size and stride influence the number of parameters in a CNN.**

**Answer:**
Filter size directly affects the number of parameters: larger filters have more weights. Stride does not change the number of parameters but affects the output feature map size. Smaller filters with deeper architectures are preferred for efficiency.
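A short sketch of the arithmetic: a convolutional layer has `(F * F * C_in + 1) * C_out` parameters (weights plus one bias per filter), and stride does not appear in the formula. The helper name `conv_params` and the channel count of 64 are assumptions for illustration.

```python
def conv_params(filter_size, in_channels, out_channels):
    """Trainable parameters of a square-filter conv layer, including biases."""
    return (filter_size * filter_size * in_channels + 1) * out_channels


# A 3x3 conv with 32 filters on a single-channel input:
print(conv_params(3, 1, 32))  # 320

# One 5x5 layer versus two stacked 3x3 layers (same 5x5 receptive field), 64 channels:
one_5x5 = conv_params(5, 64, 64)
two_3x3 = 2 * conv_params(3, 64, 64)
print(one_5x5, two_3x3)  # 102464 73856
```

The stacked 3×3 layers cover the same receptive field with fewer parameters and an extra non-linearity, which is the efficiency argument behind small filters in deep networks.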

## Question 5
**Compare and contrast LeNet, AlexNet, and VGG.**

**Answer:**
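- **LeNet:** Shallow network with 5×5 filters and subsampling (pooling) layers, designed for handwritten digit recognition.
- **AlexNet:** Deeper network (five convolutional layers) with larger filters in its early layers (11×11 and 5×5); popularized ReLU activations and dropout on ImageNet-scale data.
- **VGG:** Very deep network (16 or 19 weight layers) built from stacked small (3×3) filters; high accuracy but computationally expensive.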

## Question 6
**Using Keras, build and train a simple CNN model on the MNIST dataset from scratch. Include code for model creation, compilation, training, and evaluation.**

In [None]:

import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train[..., None] / 255.0
x_test = x_test[..., None] / 255.0

model = Sequential([
    Conv2D(32, (3,3), activation='relu', input_shape=(28,28,1)),
    MaxPooling2D((2,2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=3)
model.evaluate(x_test, y_test)

Epoch 1/3
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 7s 3ms/step - accuracy: 0.9096 - loss: 0.3033
Epoch 2/3
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 5s 2ms/step - accuracy: 0.9835 - loss: 0.0550
Epoch 3/3
1875/1875 ━━━━━━━━━━━━━━━━━━━━ 5s 3ms/step - accuracy: 0.9901 - loss: 0.0325
313/313 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.9826 - loss: 0.0569

[0.046846337616443634, 0.9848999977111816]

## Question 7
**Load and preprocess the CIFAR-10 dataset using Keras, and create a CNN model to classify RGB images. Show your preprocessing and architecture.**

In [None]:

from tensorflow.keras.datasets import cifar10

(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train, x_test = x_train/255.0, x_test/255.0

model = Sequential([
    Conv2D(32, (3,3), activation='relu', input_shape=(32,32,3)),
    MaxPooling2D((2,2)),
    Conv2D(64, (3,3), activation='relu'),
    MaxPooling2D((2,2)),
    Flatten(),
    Dense(64, activation='relu'),
    Dense(10, activation='softmax')
])

model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(x_train, y_train, epochs=5)
model.evaluate(x_test, y_test)

Epoch 1/5
1563/1563 ━━━━━━━━━━━━━━━━━━━━ 9s 4ms/step - accuracy: 0.3915 - loss: 1.6593
Epoch 2/5
1563/1563 ━━━━━━━━━━━━━━━━━━━━ 5s 3ms/step - accuracy: 0.6160 - loss: 1.0958
Epoch 3/5
1563/1563 ━━━━━━━━━━━━━━━━━━━━ 5s 3ms/step - accuracy: 0.6697 - loss: 0.9448
Epoch 4/5
1563/1563 ━━━━━━━━━━━━━━━━━━━━ 5s 3ms/step - accuracy: 0.7052 - loss: 0.8439
Epoch 5/5
1563/1563 ━━━━━━━━━━━━━━━━━━━━ 5s 3ms/step - accuracy: 0.7295 - loss: 0.7715
313/313 ━━━━━━━━━━━━━━━━━━━━ 2s 5ms/step - accuracy: 0.6896 - loss: 0.9079

[0.9062128067016602, 0.6906999945640564]

## Question 8
**Using PyTorch, write a script to define and train a CNN on the MNIST dataset. Include model definition, data loaders, training loop, and accuracy evaluation.**

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# 1. Device configuration (CPU / GPU)
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# 2. Data preprocessing
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])

# 3. Load MNIST dataset
train_dataset = datasets.MNIST(
    root='./data',
    train=True,
    transform=transform,
    download=True
)

test_dataset = datasets.MNIST(
    root='./data',
    train=False,
    transform=transform,
    download=True
)

# 4. Data loaders
train_loader = DataLoader(
    dataset=train_dataset,
    batch_size=64,
    shuffle=True
)

test_loader = DataLoader(
    dataset=test_dataset,
    batch_size=64,
    shuffle=False
)

# 5. CNN model definition
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()

        self.conv1 = nn.Conv2d(1, 32, kernel_size=3)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3)
        self.pool = nn.MaxPool2d(2, 2)

        # Two unpadded 3x3 convs shrink 28 -> 26 -> 24, then 2x2 pooling gives 12,
        # so the flattened feature size is 64 * 12 * 12 = 9216
        self.fc1 = nn.Linear(64 * 12 * 12, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = torch.relu(self.conv1(x))
        x = torch.relu(self.conv2(x))
        x = self.pool(x)
        x = x.view(x.size(0), -1)   # Flatten
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)
        return x

# 6. Initialize model
model = CNN().to(device)

# 7. Loss function and optimizer
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# 8. Training loop
epochs = 5

for epoch in range(epochs):
    model.train()
    running_loss = 0.0

    for images, labels in train_loader:
        images = images.to(device)
        labels = labels.to(device)

        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()

    print(f"Epoch [{epoch+1}/{epochs}] - Loss: {running_loss:.4f}")

# 9. Model evaluation (Accuracy)
model.eval()
correct = 0
total = 0

with torch.no_grad():
    for images, labels in test_loader:
        images = images.to(device)
        labels = labels.to(device)

        outputs = model(images)
        _, predicted = torch.max(outputs, 1)

        total += labels.size(0)
        correct += (predicted == labels).sum().item()

accuracy = 100 * correct / total
print(f"Test Accuracy: {accuracy:.2f}%")


Epoch [1/5] - Loss: 112.4094
Epoch [2/5] - Loss: 35.6354
Epoch [3/5] - Loss: 21.6731
Epoch [4/5] - Loss: 14.7206
Epoch [5/5] - Loss: 10.6947
Test Accuracy: 98.98%


## Question 9
**Given a custom image dataset stored in a local directory, write code using Keras ImageDataGenerator to preprocess and train a CNN model.**

In [11]:
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

# 1. SETUP DATA AUGMENTATION
# This prevents overfitting by creating variations of your training images
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest'
)

# Validation data should only be rescaled (no augmentation)
val_datagen = ImageDataGenerator(rescale=1./255)

# 2. FLOW FROM DIRECTORIES
# Update 'class_mode' to 'categorical' if you have more than 2 classes
train_generator = train_datagen.flow_from_directory(
    '/content/Train',
    target_size=(150, 150),
    batch_size=32,
    class_mode='binary'
)

val_generator = val_datagen.flow_from_directory(
    '/content/Val',
    target_size=(150, 150),
    batch_size=32,
    class_mode='binary'
)

# 3. DEFINE THE CNN MODEL
model = Sequential([
    # First Convolutional Block
    Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)),
    MaxPooling2D(2, 2),

    # Second Convolutional Block
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D(2, 2),

    # Third Convolutional Block
    Conv2D(128, (3, 3), activation='relu'),
    MaxPooling2D(2, 2),

    # Flatten and Dense Layers
    Flatten(),
    Dense(512, activation='relu'),
    Dropout(0.5), # Helps prevent overfitting
    Dense(1, activation='sigmoid') # Use 'softmax' and units=N for multi-class
])

# 4. COMPILE
model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy']
)

# 5. TRAIN
history = model.fit(
    train_generator,
    epochs=20,
    validation_data=val_generator,
    verbose=1
)

# 6. SAVE THE MODEL
model.save('my_cnn_model.h5')  # no stray newline in the filename
print("\nModel saved as my_cnn_model.h5")

Found 10 images belonging to 2 classes.
Found 10 images belonging to 2 classes.
Epoch 1/20
1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step - accuracy: 0.6000 - loss: 0.6976 - val_accuracy: 0.5000 - val_loss: 0.9761
Epoch 2/20
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 169ms/step - accuracy: 0.5000 - loss: 1.1078 - val_accuracy: 0.5000 - val_loss: 2.9861
Epoch 3/20
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 138ms/step - accuracy: 0.5000 - loss: 2.3480 - val_accuracy: 0.5000 - val_loss: 0.7130
Epoch 4/20
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 138ms/step - accuracy: 0.7000 - loss: 0.6081 - val_accuracy: 0.5000 - val_loss: 1.3404
Epoch 5/20
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 136ms/step - accuracy: 0.5000 - loss: 1.1835 - val_accuracy: 0.5000 - val_loss: 0.9614
Epoch 6/20
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 131ms/step - accuracy: 0.5000 - loss: 0.9656 - val

Model saved as my_cnn_model.h5


## Question 10
**You are working on a web application for a medical imaging startup. Your task is to build and deploy a CNN model that classifies chest X-ray images into “Normal” and “Pneumonia” categories. Describe your end-to-end approach–from data preparation and model training to deploying the model as a web app using Streamlit.**

**Answer:**
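1. **Data Preparation:** Collect and clean chest X-ray images, split them into training, validation, and test sets, then resize, normalize, and augment the training images.
2. **Model Training:** Fine-tune a pretrained CNN (transfer learning, e.g., ResNet) with a sigmoid output, binary cross-entropy loss, and the Adam optimizer.
3. **Evaluation:** Report accuracy, recall, and ROC-AUC on the held-out test set; recall matters most here because a missed pneumonia case is costlier than a false alarm.
4. **Deployment:** Save the trained model and load it in a Streamlit app that accepts an uploaded X-ray, applies the same preprocessing, and displays the predicted class in real time.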
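To illustrate the evaluation step, here is a minimal sketch that computes accuracy and recall from binary predictions (1 = Pneumonia, 0 = Normal). The labels and predictions are made-up toy data, and `binary_metrics` is a hypothetical helper; ROC-AUC is omitted because it needs predicted probabilities rather than hard labels.

```python
def binary_metrics(y_true, y_pred):
    """Return (accuracy, recall) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    accuracy = (tp + tn) / len(y_true)
    # Recall: the fraction of actual positive cases that the model caught.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, recall


# Toy data (hypothetical): 4 pneumonia cases and 6 normal cases.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

accuracy, recall = binary_metrics(y_true, y_pred)
print(accuracy, recall)  # 0.8 0.5
```

Accuracy looks acceptable at 0.8, yet half of the pneumonia cases are missed, which is why accuracy alone is not a sufficient metric for this task.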