# Assignment Code: DS-AG-021
# CNN Architecture | Assignment

Q1. What is the role of filters and feature maps in Convolutional Neural Network (CNN)?

- Ans: Filters (Kernels): Filters are small, learnable weight matrices that slide over the input and perform convolution operations. Their role is to extract specific local features (e.g., edges, textures) by responding strongly to particular patterns.

- Feature Maps: Feature maps are the outputs produced after applying filters to the input. They represent the spatial locations and strengths of the features detected by the corresponding filters.

Q2. Explain the concepts of padding and stride in CNNs(Convolutional Neural Network). How do they affect output dimensions of feature maps?

- Ans: Padding refers to adding extra pixels (typically zeros) around the border of an input feature map before applying convolution.
   - Purpose: It preserves spatial dimensions and allows the filter to process edge pixels effectively.
   - Effect on Output Dimensions: For an input of size (N*N), a filter of size (F*F), stride S, and padding P, the output dimension O is:
    
                  O = [(N - F + 2P) / S] + 1

      - Increasing padding increases the output size or helps maintain it.

- Stride defines the number of pixels by which the convolutional filter moves across the input feature map.
  - Effect on Output Dimensions: A larger stride reduces the output dimension, effectively performing downsampling, as per the same formula:

    
                 O = [(N - F + 2P) / S] + 1

Q3. Define receptive field in the context of CNNs. Why is it important for deep architectures?

- Ans: The receptive field of a neuron in a CNN is the region of the input image that influences the neuron's activation. In other words, it is the spatial extent of the input "seen" by that neuron.

- **Importance in Deep Architectures:**

  - Hierarchical Feature Learning: Larger receptive fields in deeper layers allow neurons to capture more global and abstract features.

  - Contextual Understanding: Essential for recognizing patterns that depend on wider context, such as objects in an image.

  - Design Consideration: Helps in determining network depth and kernel sizes to ensure the network can capture features at the required scale.

Q4. Discuss how filter size and stride influence the number of parameters in a CNN.

- Ans: ** Influence of Filter Size and Stride on CNN Parameters:**

   - Filter Size (Kernel Size):
       - The number of learnable parameters in a convolutional layer depends directly on the filter size.
       - For a layer with (K) filters of size (F*F) and C input channels, the total number of parameters is:

          Parameters = K* (F * F * C) + K
     
      (the +K accounts for bias terms)

       - Effect: Larger filters increase the number of parameters and model complexity.

    - Stride:
        -  It determines the step size of the filter over the input.
        - Effect on Parameters: Stride does not change the number of learnable parameters, as it only affects the spatial size of the output feature map, not the filter itself.

Q5. Compare and contrast different CNN-based architectures like LeNet, AlexNet, and VGG in terms of depth, filter sizes, and performance.

- Ans: **LeNet (1998)**

    - Depth: Shallow - 5 layers (2 conv + 3 FC)
    - Filter Sizes: Large in early layers (5 * 5), small number of filters
    - Stride & Padding: Stride 1, minimal padding
    - Feature Maps/ Channels: Few filters (6-16)
    - Performance: Good for small datasets (MNIST)
    - Key Innovations: First practical CNN for digit recognition

- **AlexNet (2012)**

    - Depth: Deeper - 8 layers (5 conv + 3 FC)
    - Filter Sizes: (11 * 11) in first layer, then (5 * 5) and (3 * 3); more filters per layer
    - Stride & Padding: Stride 4 in first layer, padding varies
    - Feature Maps/ Channels: Many filters (up to 384-256)
    - Performance: Breakthrough on ImageNet, better generalization
    - Key Innovations: ReLU activation, dropout, overlapping pooling, GPU training

- **VGG (2014)**

    - Depth: very deep - 16-19 layers (all conv + FC at the end)
    - Filter Sizes: Uniform small filters (3 * 3) throughout; depth compensates for receptive field
    - Stride & Padding: Stride 1, padding used to preserve spatial dimensions
    - Feature Maps/ Channels: Gradually increasing filters (64 → 512)
    - Performance: High accuracy on ImageNet, very effective for deep feature extraction
    - Key Innovations: Very deep, uniform architecture; simplicity in design; small filters improve non-linearity

Q10. You are working on a web application for a medical imaging startup. Your task is to build and deploy a CNN model that classifies chest X-ray images into "Normal" and "Pneumonia" categories. Describe your end-to-end approach from-data preparation and model training to deploying the model as a web app using Streamlit.

- Ans:
1. **Data Preparaton:**
    - Collect labeled chest X-ray images for "Normal" and "Pneumonia."
    - Perform preprocessing: resizing, normalization, and augmentation (rotation, flipping) to enhance generalization.
    - Split data into training, validation, and test sets.

2. **Model Development:**
   - Choose a CNN architecture (e.g., ResNet, EfficientNet, or a custom CNN).
   - Compile the model with an appropriate loss function (binary cross-entropy)
  and optimizer (Adam).
   - Train the model on the training set with early stopping and monitor performance on the validation set.
   - Evaluate final accuracy and other metrics on the test set.

3. **Model Saving:**
  - Save the trained model using formats like `.h5` or TensorFlow SavedModel for deployment.

4. **Web App Deployment:**
  - Build a Streamlit app that:
     - Accepts X-ray image uploads from users.
     - Loads the trained CNN model.
     - Preprocesses the uploaded image and predicts the class.
     - Displays the result with confidence scores.
  - Deploy the Streamlit app on platforms like Streamlit Cloud, Heroku, or AWS.

5. **Monitoring & Maintenance:**
    - Implement logging and feedback mechanisms for model predictions.
    - Periodically retrain the model with new data to maintain accuracy.

In [22]:
# Q6. Using keras, build and train a simple CNN model on the MNIST dataset from scratch. Include code for module creation, compilation, training, and evaluation.

# Import required modules
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.utils import to_categorical

# Load MNIST and Preprocess data
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = x_train.reshape(-1, 28, 28, 1).astype('float32') / 255.0
x_test = x_test.reshape(-1, 28, 28, 1).astype('float32') / 255.0
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

# Build CNN and Compile model
model = Sequential([
    Conv2D(32, kernel_size=(3,3), activation='relu', input_shape=(28,28,1)),
    MaxPooling2D(pool_size=(2,2)),
    Conv2D(64, kernel_size=(3,3), activation='relu'),
    MaxPooling2D(pool_size=(2,2)),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Train & Evaluate model
history = model.fit(x_train, y_train, batch_size=128, epochs=5, validation_split=0.1)
test_loss, test_acc = model.evaluate(x_test, y_test)
print(f"Test Accuracy: {test_acc:.4f}")

  super().__init__(activity_regularizer=activity_regularizer, **kwargs)


Epoch 1/5
[1m422/422[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m7s[0m 10ms/step - accuracy: 0.8396 - loss: 0.5316 - val_accuracy: 0.9825 - val_loss: 0.0629
Epoch 2/5
[1m422/422[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 4ms/step - accuracy: 0.9793 - loss: 0.0667 - val_accuracy: 0.9830 - val_loss: 0.0575
Epoch 3/5
[1m422/422[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 5ms/step - accuracy: 0.9865 - loss: 0.0452 - val_accuracy: 0.9900 - val_loss: 0.0367
Epoch 4/5
[1m422/422[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 5ms/step - accuracy: 0.9899 - loss: 0.0326 - val_accuracy: 0.9907 - val_loss: 0.0359
Epoch 5/5
[1m422/422[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m2s[0m 5ms/step - accuracy: 0.9922 - loss: 0.0255 - val_accuracy: 0.9893 - val_loss: 0.0395
[1m313/313[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m1s[0m 4ms/step - accuracy: 0.9852 - loss: 0.0446
Test Accuracy: 0.9880


In [23]:
# Q7. Load and preprocess the CIFAR-10 dataset using Keras, and create a CNN model to classify RGB images. Show your preprocessing and architecture.

from tensorflow.keras.datasets import cifar10

# Load CIFAR-10 & Preprocess data
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

# Build CNN & Compile model CIFAR-10
model_cifar = Sequential([
    Conv2D(32, (3,3), activation='relu', padding='same', input_shape=(32,32,3)),
    Conv2D(32, (3,3), activation='relu', padding='same'),
    MaxPooling2D((2,2)),
    Conv2D(64, (3,3), activation='relu', padding='same'),
    Conv2D(64, (3,3), activation='relu', padding='same'),
    MaxPooling2D((2,2)),
    Flatten(),
    Dense(512, activation='relu'),
    Dense(10, activation='softmax')
])
model_cifar.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Train & Evaluate model
history_cifar = model_cifar.fit(x_train, y_train, batch_size=64, epochs=10, validation_split=0.1)
test_loss_cifar, test_acc_cifar = model_cifar.evaluate(x_test, y_test)
print(f"CIFAR-10 Test Accuracy: {test_acc_cifar:.4f}")

Epoch 1/10
[1m704/704[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m11s[0m 12ms/step - accuracy: 0.3871 - loss: 1.6674 - val_accuracy: 0.6254 - val_loss: 1.0531
Epoch 2/10
[1m704/704[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 7ms/step - accuracy: 0.6652 - loss: 0.9498 - val_accuracy: 0.7136 - val_loss: 0.8300
Epoch 3/10
[1m704/704[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 7ms/step - accuracy: 0.7536 - loss: 0.7033 - val_accuracy: 0.7548 - val_loss: 0.7148
Epoch 4/10
[1m704/704[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 7ms/step - accuracy: 0.8199 - loss: 0.5241 - val_accuracy: 0.7590 - val_loss: 0.7047
Epoch 5/10
[1m704/704[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 7ms/step - accuracy: 0.8684 - loss: 0.3740 - val_accuracy: 0.7702 - val_loss: 0.7542
Epoch 6/10
[1m704/704[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 7ms/step - accuracy: 0.9212 - loss: 0.2275 - val_accuracy: 0.7676 - val_loss: 0.7953
Epoch 7/10
[1m704/704[0m

In [24]:
#Q8. Using PyTorch, write a script to define to define and train a CNN on the MNIST dataset. Include model definition, data loaders, training loop, and accuracy evaluation.

import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# Device configuration & Transformations
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))  # MNIST mean and std
])

# Load MNIST dataset & Define CNN model
train_dataset = datasets.MNIST(root='./data', train=True, download=True, transform=transform)
test_dataset = datasets.MNIST(root='./data', train=False, download=True, transform=transform)
train_loader = DataLoader(train_dataset, batch_size=128, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=128, shuffle=False)

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3)
        self.pool = nn.MaxPool2d(2, 2)
        # Corrected input size for the first fully connected layer
        self.fc1 = nn.Linear(64*12*12, 128)
        self.fc2 = nn.Linear(128, 10)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.relu(self.conv1(x))
        x = self.pool(self.relu(self.conv2(x)))
        # Corrected flattening operation
        x = x.view(-1, 64*12*12)
        x = self.relu(self.fc1(x))
        x = self.fc2(x)
        return x

model = CNN().to(device)

# Loss and Optimizer & Training loop
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

num_epochs = 5
for epoch in range(num_epochs):
    model.train()
    running_loss = 0.0
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        running_loss += loss.item()
        print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {running_loss/len(train_loader):.4f}")

# Evaluation
model.eval()
correct = 0
total = 0
with torch.no_grad():
    for images, labels in test_loader:
        images, labels = images.to(device), labels.to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print(f"Test Accuracy: {correct/total:.4f}")

Epoch [1/5], Loss: 0.0049
Epoch [1/5], Loss: 0.0097
Epoch [1/5], Loss: 0.0138
Epoch [1/5], Loss: 0.0175
Epoch [1/5], Loss: 0.0207
Epoch [1/5], Loss: 0.0232
Epoch [1/5], Loss: 0.0254
Epoch [1/5], Loss: 0.0271
Epoch [1/5], Loss: 0.0284
Epoch [1/5], Loss: 0.0297
Epoch [1/5], Loss: 0.0311
Epoch [1/5], Loss: 0.0323
Epoch [1/5], Loss: 0.0336
Epoch [1/5], Loss: 0.0343
Epoch [1/5], Loss: 0.0353
Epoch [1/5], Loss: 0.0362
Epoch [1/5], Loss: 0.0371
Epoch [1/5], Loss: 0.0380
Epoch [1/5], Loss: 0.0388
Epoch [1/5], Loss: 0.0396
Epoch [1/5], Loss: 0.0402
Epoch [1/5], Loss: 0.0411
Epoch [1/5], Loss: 0.0420
Epoch [1/5], Loss: 0.0427
Epoch [1/5], Loss: 0.0436
Epoch [1/5], Loss: 0.0447
Epoch [1/5], Loss: 0.0457
Epoch [1/5], Loss: 0.0462
Epoch [1/5], Loss: 0.0470
Epoch [1/5], Loss: 0.0480
Epoch [1/5], Loss: 0.0489
Epoch [1/5], Loss: 0.0496
Epoch [1/5], Loss: 0.0500
Epoch [1/5], Loss: 0.0505
Epoch [1/5], Loss: 0.0514
Epoch [1/5], Loss: 0.0521
Epoch [1/5], Loss: 0.0528
Epoch [1/5], Loss: 0.0534
Epoch [1/5],

In [25]:
# Q9. Given a custom image dataset stored in a local directory, write code using Keras ImageDataGenerator to preprocess and train a CNN model.

#  Import Libraries
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.preprocessing import image
import numpy as np
from PIL import Image

#  Data Preprocessing using ImageDataGenerator
train_dir = './data/train'  # Corrected path to your training dataset folder
val_dir = './data/val'      # Corrected path to your validation dataset folder

# Image augmentation for training
train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=20,
    width_shift_range=0.2,
    height_shift_range=0.2,
    shear_range=0.2,
    zoom_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest'
)

# Only rescale for validation
val_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
    train_dir,
    target_size=(128,128),
    batch_size=32,
    class_mode='categorical'
)

val_generator = val_datagen.flow_from_directory(
    val_dir,
    target_size=(128,128),
    batch_size=32,
    class_mode='categorical'
)

num_classes = len(train_generator.class_indices)
print("Class indices:", train_generator.class_indices)

#  CNN Model Definition
model = Sequential([
    Conv2D(32, (3,3), activation='relu', input_shape=(128,128,3)),
    MaxPooling2D(2,2),

    Conv2D(64, (3,3), activation='relu'),
    MaxPooling2D(2,2),

    Conv2D(128, (3,3), activation='relu'),
    MaxPooling2D(2,2),

    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.5),
    Dense(num_classes, activation='softmax')
])

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()

#  Train the Model
history = model.fit(
    train_generator,
    epochs=10,
    validation_data=val_generator
)

#  Save the Model
model.save("bird_classifier_model.h5")

#  Predict on Uploaded Image
def predict_image(img_path, model, target_size=(128,128)):
    img = image.load_img(img_path, target_size=target_size)
    img_array = image.img_to_array(img)/255.0
    img_array = np.expand_dims(img_array, axis=0)

    preds = model.predict(img_array)
    class_index = np.argmax(preds, axis=1)[0]

    # Map index to class name
    class_labels = {v:k for k,v in train_generator.class_indices.items()}
    return class_labels[class_index]

Found 1 images belonging to 2 classes.
Found 1 images belonging to 2 classes.
Class indices: {'class_a': 0, 'class_b': 1}


Epoch 1/10


  self._warn_if_super_not_called()


[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m5s[0m 5s/step - accuracy: 0.0000e+00 - loss: 0.7472 - val_accuracy: 1.0000 - val_loss: 0.2663
Epoch 2/10
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 81ms/step - accuracy: 1.0000 - loss: 0.2866 - val_accuracy: 1.0000 - val_loss: 0.0238
Epoch 3/10
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 81ms/step - accuracy: 1.0000 - loss: 0.0386 - val_accuracy: 1.0000 - val_loss: 2.8582e-04
Epoch 4/10
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 78ms/step - accuracy: 1.0000 - loss: 0.0397 - val_accuracy: 1.0000 - val_loss: 3.5763e-07
Epoch 5/10
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 83ms/step - accuracy: 1.0000 - loss: 1.7808e-04 - val_accuracy: 1.0000 - val_loss: 0.0000e+00
Epoch 6/10
[1m1/1[0m [32m━━━━━━━━━━━━━━━━━━━━[0m[37m[0m [1m0s[0m 81ms/step - accuracy: 1.0000 - loss: 0.0000e+00 - val_accuracy: 1.0000 - val_loss: 0.0000e+00
Epoch 7/10
[1m1/1[0m [32m━━━

