# Questions

**Question 1: What is the role of filters and feature maps in a Convolutional Neural Network (CNN)?**
  - Convolutional Neural Networks (CNNs) are a class of deep learning models mainly used for image and pattern recognition.
  - Two key components of CNNs are filters and feature maps.
  - These components help the network automatically learn important visual patterns from input data.
  - Filters in CNN:-
    - A filter is a small matrix of learnable weights.
    - It slides over the input image during the convolution operation.
    - It detects specific features such as edges, corners, textures, and shapes.
    - The same filter is applied across the entire image (weight sharing), which reduces the number of parameters and the computational cost.
  - Feature Maps in CNN:-
    - A feature map is the output produced after applying a filter to the input image.
    - It represents the presence and location of detected features in the input.
    - It highlights the regions where specific features occur.
    - Multiple feature maps are generated using multiple filters.
    - These feature maps become more abstract in deeper layers.
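
As a small illustration of how a filter produces a feature map, here is a minimal pure-Python sketch. The function name `convolve2d`, the toy image, and the hand-picked vertical-edge kernel are all assumptions made for this example; in a real CNN the kernel weights are learned during training.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (stride 1, no padding).

    This is the cross-correlation that CNN layers compute.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Toy 4x5 image: dark (0) on the left, bright (1) on the right.
image = [
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1],
]

# Hand-picked filter that responds to dark-to-bright vertical edges.
kernel = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d(image, kernel)
for row in feature_map:
    print(row)
```

The feature map is large where the window covers the edge and zero where the window covers a uniform region, which is the "presence and location" idea described above.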

**Question 2: Explain the concepts of padding and stride in CNNs (Convolutional Neural Networks). How do they affect the output dimensions of feature maps?**
  - In Convolutional Neural Networks, padding and stride are important parameters of the convolution operation.
  - They control how filters move over the input image and how the output feature maps are generated.
  - Padding in CNN:-
    - Padding refers to adding extra pixels around the border of the input image before applying convolution.
    - It helps preserve the spatial dimensions of the input; with "same" padding and a stride of 1, the output has the same size as the input.
    - It also prevents loss of information at the edges of the image.
  - Stride in CNN:-
    - Stride is the number of pixels by which the filter moves across the input image.
    - It determines how much the filter shifts at each step during convolution.
    - It controls the amount of overlap between filter applications.
    - A stride greater than 1 reduces the size of the feature maps.
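
To make the effect on output dimensions concrete: for an input of width n, filter size k, padding p on each side, and stride s, the output width is floor((n + 2p - k) / s) + 1, and the same holds for the height. The sketch below just evaluates that formula; the function name `conv_output_size` is illustrative and not part of any library.

```python
def conv_output_size(n, k, p=0, s=1):
    """Output width (or height) of a convolution or pooling layer.

    n: input size, k: filter size, p: padding per side, s: stride.
    """
    return (n + 2 * p - k) // s + 1


# 28x28 input, 3x3 filter, no padding: the map shrinks by 2.
print(conv_output_size(28, 3))             # 26

# Padding of 1 keeps a 3x3 convolution at the input size.
print(conv_output_size(28, 3, p=1))        # 28

# Stride 2 roughly halves the output.
print(conv_output_size(28, 3, p=1, s=2))   # 14

# 2x2 max pooling with stride 2 follows the same formula.
print(conv_output_size(26, 2, s=2))        # 13
```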

**Question 3: Define receptive field in the context of CNNs. Why is it important for deep architectures?**
  - The receptive field of a neuron in a CNN refers to the region of the input image that influences the activation of that neuron.
  - Each neuron in a convolutional layer is connected only to a local region of the input, not the entire image.
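  - Stacking convolution and pooling layers enlarges the receptive field, so neurons in deeper layers see progressively larger regions of the input.
  - A larger receptive field allows neurons to capture global context, which is essential for understanding relationships between distant parts of an image.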
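
The growth of the receptive field with depth can be computed layer by layer: each layer adds (k - 1) times the current spacing between neighbouring neurons (measured in input pixels), and that spacing is multiplied by the layer's stride. The helper below is a minimal sketch under those assumptions; `receptive_field` is a hypothetical name, and padding and dilation are ignored.

```python
def receptive_field(layers):
    """Receptive field (in input pixels) of one neuron after the given layers.

    layers: list of (kernel_size, stride) pairs, ordered from input to output.
    """
    r = 1      # a single input pixel sees itself
    jump = 1   # distance in input pixels between neighbouring neurons
    for k, s in layers:
        r += (k - 1) * jump
        jump *= s
    return r


# conv 3x3 -> pool 2x2 (stride 2) -> conv 3x3 -> pool 2x2 (stride 2)
small_cnn = [(3, 1), (2, 2), (3, 1), (2, 2)]
print(receptive_field(small_cnn))        # 10

# Three stacked 3x3 convolutions see the same area as one 7x7 filter.
print(receptive_field([(3, 1)] * 3))     # 7
```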

**Question 4: Discuss how filter size and stride influence the number of parameters in a CNN.**
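  - Filter size and stride are key hyperparameters that influence how many parameters a CNN has, either directly or indirectly.
  - Filter size refers to the spatial dimensions of the convolutional kernel.
  - Each filter has one learnable weight per kernel position and input channel, plus usually one bias term.
  - Filter size therefore has a direct impact on the number of parameters: the weights per filter grow with the kernel's height × width.
  - Stride is the number of pixels the filter moves at each step during convolution.
  - Stride does not change the number of weights in the convolutional layer itself; it influences the parameter count indirectly by shrinking the feature maps, which reduces the input size (and hence the weights) of any later fully connected layer.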
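
A short worked example may help here. The sketch below counts parameters by hand; the helper names and the layer sizes (a 28×28×1 input, 64 filters, a 128-unit dense layer) are assumptions chosen only for illustration.

```python
def conv_params(k, c_in, c_out):
    """Weights plus one bias per filter for a k x k convolution."""
    return (k * k * c_in + 1) * c_out


def dense_params(n_in, n_out):
    """Weights plus one bias per output unit for a dense layer."""
    return (n_in + 1) * n_out


def out_size(n, k, s=1):
    """Output width of an unpadded convolution."""
    return (n - k) // s + 1


# Direct effect of filter size (32 input channels, 64 filters):
print(conv_params(3, 32, 64))   # 18496
print(conv_params(5, 32, 64))   # 51264

# Stride leaves the convolution's own parameter count unchanged ...
print(conv_params(3, 1, 64))    # 640, whatever the stride

# ... but changes the size of the flattened feature maps that feed
# a following dense layer with 128 units.
for stride in (1, 2):
    side = out_size(28, 3, stride)
    flat = side * side * 64
    print(stride, side, dense_params(flat, 128))
```

With stride 1 the dense layer receives 26×26×64 values and needs 5,537,920 parameters; with stride 2 it receives 13×13×64 values and needs 1,384,576.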

**Question 5: Compare and contrast different CNN-based architectures like LeNet, AlexNet, and VGG in terms of depth, filter sizes, and performance.**
  - LeNet:-
    - It was designed primarily for handwritten digit recognition.
    - It is a shallow network with 5 to 7 layers, depending on whether the pooling layers are counted.
    - It uses relatively large 5×5 filters.
    - It performs well on simple low-resolution images.
    - But it has limited capability for complex image recognition tasks.
  - AlexNet:-
    - It is deeper than LeNet, with 8 learned layers (5 convolutional and 3 fully connected).
    - It uses larger filters in early layers (11×11, then 5×5) and smaller 3×3 filters in later layers.
    - It handles complex, high-resolution images.
    - It introduced or popularized innovations such as ReLU activation, dropout, and GPU-based training.
  - VGGNet:-
    - It is known for its simplicity and uniform architecture.
    - It uses very deep networks with 16 or 19 weight layers (VGG16 and VGG19).
    - It uses small 3×3 filters throughout the network.
    - It achieves high accuracy on image classification tasks.
    - It extracts better features due to its increased depth.
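
One reason the small 3×3 filters of VGG work well is that a stack of three 3×3 convolutions covers the same 7×7 input area as a single 7×7 filter while using fewer weights. The sketch below compares the two weight counts, assuming for simplicity the same number of channels (64, an arbitrary choice) in and out of every layer and ignoring biases; `stacked_weights` is a hypothetical helper name.

```python
def stacked_weights(n_layers, k, channels):
    """Weights in n_layers stacked k x k convolutions (biases ignored).

    Assumes every layer has `channels` input and output channels.
    """
    return n_layers * k * k * channels * channels


three_3x3 = stacked_weights(3, 3, 64)
one_7x7 = stacked_weights(1, 7, 64)

print("three 3x3 layers:", three_3x3)   # 110592
print("one 7x7 layer:   ", one_7x7)     # 200704
```

The stacked version also applies a non-linearity after each layer, which is part of why depth improves feature extraction.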

In [1]:
# Question 6: Using keras, build and train a simple CNN model on the MNIST dataset from scratch. Include code for module creation, compilation, training, and evaluation.
import tensorflow as tf
from tensorflow.keras import datasets, layers, models
from tensorflow.keras.utils import to_categorical

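# Load MNIST: 60,000 training and 10,000 test images of size 28x28
(x_train, y_train), (x_test, y_test) = datasets.mnist.load_data()

# Scale pixel values from [0, 255] to [0, 1]
x_train = x_train / 255.0
x_test = x_test / 255.0

# Add a channel axis: (N, 28, 28) -> (N, 28, 28, 1)
x_train = x_train.reshape(-1, 28, 28, 1)
x_test = x_test.reshape(-1, 28, 28, 1)

# One-hot encode the labels for categorical cross-entropy
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)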

model = models.Sequential()
model.add(layers.Conv2D(32, (3, 3), activation='relu',
                        input_shape=(28, 28, 1)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Flatten())
model.add(layers.Dense(128, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))

model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

history = model.fit(
    x_train,
    y_train,
    epochs=10,
    batch_size=64,
    validation_split=0.1
)

test_loss, test_accuracy = model.evaluate(x_test, y_test)
print("Test Accuracy:", test_accuracy)


Epoch 1/10
844/844 ━━━━━━━━━━━━━━━━━━━━ 42s 47ms/step - accuracy: 0.8875 - loss: 0.3790 - val_accuracy: 0.9855 - val_loss: 0.0491
Epoch 2/10
844/844 ━━━━━━━━━━━━━━━━━━━━ 39s 47ms/step - accuracy: 0.9846 - loss: 0.0509 - val_accuracy: 0.9888 - val_loss: 0.0388
Epoch 3/10
844/844 ━━━━━━━━━━━━━━━━━━━━ 41s 46ms/step - accuracy: 0.9908 - loss: 0.0306 - val_accuracy: 0.9887 - val_loss: 0.0377
Epoch 4/10
844/844 ━━━━━━━━━━━━━━━━━━━━ 39s 47ms/step - accuracy: 0.9933 - loss: 0.0220 - val_accuracy: 0.9907 - val_loss: 0.0365
Epoch 5/10
844/844 ━━━━━━━━━━━━━━━━━━━━ 39s 46ms/step - accuracy: 0.9949 - loss: 0.0167 - val_accuracy: 0.9877 - val_loss: 0.0446
Epoch 6/10
844/844 ━━━━━━━━━━━━━━━━━━━━ 40s 48ms/step - accuracy: 0.9960 - loss: 0.0127 - val_accuracy: 0.9908 - val_loss: 0.0363
Epoch 7/10

In [3]:
# Question 7: Load and preprocess the CIFAR-10 dataset using Keras, and create a CNN model to classify RGB images. Show your preprocessing and architecture.

(x_train, y_train), (x_test, y_test) = datasets.cifar10.load_data()

print("Training data shape:", x_train.shape)
print("Test data shape:", x_test.shape)

# Scale RGB pixel values from [0, 255] to [0, 1]
x_train = x_train / 255.0
x_test = x_test / 255.0

# One-hot encode the 10 class labels
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

model = models.Sequential()

model.add(layers.Conv2D(32, (3, 3), activation='relu',
                        input_shape=(32, 32, 3)))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.MaxPooling2D((2, 2)))
model.add(layers.Conv2D(64, (3, 3), activation='relu'))
model.add(layers.Flatten())
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))

model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)


history = model.fit(
    x_train,
    y_train,
    epochs=10,
    batch_size=64,
    validation_split=0.1
)

test_loss, test_accuracy = model.evaluate(x_test, y_test)
print("Test Accuracy:", test_accuracy)


Training data shape: (50000, 32, 32, 3)
Test data shape: (10000, 32, 32, 3)
Epoch 1/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 55s 76ms/step - accuracy: 0.3085 - loss: 1.8730 - val_accuracy: 0.4782 - val_loss: 1.4239
Epoch 2/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 50s 71ms/step - accuracy: 0.5141 - loss: 1.3590 - val_accuracy: 0.5730 - val_loss: 1.2177
Epoch 3/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 53s 76ms/step - accuracy: 0.5816 - loss: 1.1831 - val_accuracy: 0.5966 - val_loss: 1.1330
Epoch 4/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 51s 72ms/step - accuracy: 0.6240 - loss: 1.0677 - val_accuracy: 0.6236 - val_loss: 1.0700
Epoch 5/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 81s 71ms/step - accuracy: 0.6525 - loss: 0.9846 - val_accuracy: 0.6616 - val_loss: 0.9797
Epoch 6/10
704/704 ━━━━━━━━━━━━━━━━━━━━ 50s 71ms/step - accuracy: 0.67

In [4]:
# Question 8: Using PyTorch, write a script to define and train a CNN on the MNIST dataset. Include model definition, data loaders, training loop, and accuracy evaluation.
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("Using device:", device)

transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.1307,), (0.3081,))
])

train_dataset = datasets.MNIST(
    root='./data',
    train=True,
    download=True,
    transform=transform
)

test_dataset = datasets.MNIST(
    root='./data',
    train=False,
    download=True,
    transform=transform
)

train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=1000, shuffle=False)

class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()

        self.conv1 = nn.Conv2d(1, 32, kernel_size=3)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3)

        self.pool = nn.MaxPool2d(2)
        # Spatial size per side: 28 -> 26 (conv) -> 13 (pool) -> 11 (conv) -> 5 (pool)
        self.fc1 = nn.Linear(64 * 5 * 5, 128)
        self.fc2 = nn.Linear(128, 10)

    def forward(self, x):
        x = torch.relu(self.conv1(x))
        x = self.pool(x)
        x = torch.relu(self.conv2(x))
        x = self.pool(x)

        x = x.view(x.size(0), -1)
        x = torch.relu(self.fc1(x))
        x = self.fc2(x)

        return x

model = CNN().to(device)

criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

def train(model, device, train_loader, optimizer, criterion, epochs):
    model.train()
    for epoch in range(epochs):
        running_loss = 0.0
        for data, target in train_loader:
            data, target = data.to(device), target.to(device)

            optimizer.zero_grad()
            outputs = model(data)
            loss = criterion(outputs, target)
            loss.backward()
            optimizer.step()

            running_loss += loss.item()

        print(f"Epoch [{epoch+1}/{epochs}], Loss: {running_loss/len(train_loader):.4f}")

train(model, device, train_loader, optimizer, criterion, epochs=5)

def test(model, device, test_loader):
    """Report classification accuracy on the test set."""
    model.eval()
    correct = 0
    total = 0

    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            outputs = model(data)
            predicted = outputs.argmax(dim=1)  # index of the highest logit
            total += target.size(0)
            correct += (predicted == target).sum().item()

    accuracy = 100 * correct / total
    print(f"Test Accuracy: {accuracy:.2f}%")

# The function has to be called for the accuracy to be reported.
test(model, device, test_loader)


Using device: cpu




Epoch [1/5], Loss: 0.1342
Epoch [2/5], Loss: 0.0427
Epoch [3/5], Loss: 0.0293
Epoch [4/5], Loss: 0.0206
Epoch [5/5], Loss: 0.0169


In [5]:
# Question 9: Given a custom image dataset stored in a local directory, write code using Keras ImageDataGenerator to preprocess and train a CNN model.
import tensorflow as tf
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

(x_train, y_train), (x_test, y_test) = cifar10.load_data()

datagen = ImageDataGenerator(
    rescale=1.0 / 255,
    rotation_range=20,
    zoom_range=0.2,
    horizontal_flip=True
)

datagen.fit(x_train)
model = Sequential([
    Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    MaxPooling2D(2, 2),
    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D(2, 2),
    Flatten(),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')
])

model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

model.fit(datagen.flow(x_train, y_train, batch_size=32),
          epochs=5,
          validation_data=(x_test / 255.0, y_test))

loss, accuracy = model.evaluate(x_test / 255.0, y_test)
print("Accuracy:", accuracy)

Epoch 1/5
1563/1563 ━━━━━━━━━━━━━━━━━━━━ 89s 56ms/step - accuracy: 0.3776 - loss: 1.7140 - val_accuracy: 0.5496 - val_loss: 1.2475
Epoch 2/5
1563/1563 ━━━━━━━━━━━━━━━━━━━━ 89s 57ms/step - accuracy: 0.5520 - loss: 1.2688 - val_accuracy: 0.6291 - val_loss: 1.0809
Epoch 3/5
1563/1563 ━━━━━━━━━━━━━━━━━━━━ 85s 54ms/step - accuracy: 0.5989 - loss: 1.1321 - val_accuracy: 0.6475 - val_loss: 1.0082
Epoch 4/5
1563/1563 ━━━━━━━━━━━━━━━━━━━━ 93s 59ms/step - accuracy: 0.6340 - loss: 1.0507 - val_accuracy: 0.6803 - val_loss: 0.9331
Epoch 5/5
1563/1563 ━━━━━━━━━━━━━━━━━━━━ 89s 57ms/step - accuracy: 0.6563 - loss: 0.9914 - val_accuracy: 0.6896 - val_loss: 0.9070
313/313 ━━━━━━━━━━━━━━━━━━━━ 4s 14ms/step - accuracy: 0.6894 - loss: 0.8954
Accuracy: 0.6895999908447266
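
The question asks for a dataset stored in a local directory, whereas the cell above feeds CIFAR-10 arrays through `datagen.flow`. A minimal sketch of the directory-based variant follows. It assumes a hypothetical layout with one subfolder per class (`data_dir/<class_name>/image files`); the helper names, image size, and 20% validation split are illustrative choices, not part of the original answer.

```python
import os


def class_names(data_dir):
    """Class subfolders in sorted order.

    flow_from_directory infers the class labels from subfolder names
    in the same way.
    """
    return sorted(
        d for d in os.listdir(data_dir)
        if os.path.isdir(os.path.join(data_dir, d))
    )


def make_generators(data_dir, img_size=(64, 64), batch_size=32):
    """Build training and validation generators from a local folder."""
    # Imported here so class_names() can be used without TensorFlow.
    from tensorflow.keras.preprocessing.image import ImageDataGenerator

    datagen = ImageDataGenerator(
        rescale=1.0 / 255,
        rotation_range=20,
        zoom_range=0.2,
        horizontal_flip=True,
        validation_split=0.2,   # assumed split; adjust as needed
    )
    train_gen = datagen.flow_from_directory(
        data_dir,
        target_size=img_size,
        batch_size=batch_size,
        class_mode="categorical",   # yields one-hot labels
        subset="training",
    )
    val_gen = datagen.flow_from_directory(
        data_dir,
        target_size=img_size,
        batch_size=batch_size,
        class_mode="categorical",
        subset="validation",
    )
    return train_gen, val_gen
```

The generators would then be passed to training as `model.fit(train_gen, validation_data=val_gen, epochs=5)`, with the model's `input_shape` matching `img_size`, its last layer sized to `len(class_names(data_dir))`, and `categorical_crossentropy` as the loss because the labels are one-hot. Note that sharing one generator applies the augmentation to the validation subset as well; a separate generator with only `rescale` avoids that.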


**Question 10: You are working on a web application for a medical imaging startup. Your task is to build and deploy a CNN model that classifies chest X-ray images into “Normal” and “Pneumonia” categories. Describe your end-to-end approach–from data preparation and model training to deploying the model as a web app using Streamlit.**
  - The objective is to classify chest X-ray images into two categories: Normal and Pneumonia.
  - This is a binary image classification problem using CNN.
  - Data Collection:-
    - A publicly available Chest X-ray dataset can be used.
  - Data Preprocessing and Augmentation:-
    - Images are resized to a fixed dimension.
    - Pixel values are normalized using rescaling.
    - Data augmentation techniques such as rotation, zooming, and horizontal flipping are applied to improve model generalization.
    - Keras ImageDataGenerator is used for preprocessing.
  - Model Architecture:-
    - The CNN ends in a single sigmoid output unit and is compiled with the Adam optimizer, binary cross-entropy loss, and accuracy as the evaluation metric.
    - The model is trained on the training dataset and validated on the test dataset.
    - Performance is evaluated using accuracy and loss values.
  - Model Saving:-
    - After training, the model is saved in `.h5` or `.keras` format.
    - This saved model is later loaded during deployment for inference.
  - Web Application Development:-
    - Streamlit is used to create a simple web interface.
    - The web app allows users to upload a chest X-ray image, view the uploaded image, and receive the classification result (Normal or Pneumonia).
  - Deployment:-
    - The Streamlit app can be deployed on cloud platforms like AWS, Streamlit Cloud, or Heroku.
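
The inference side of the web app could look roughly like the sketch below. Everything specific in it is an assumption for illustration: the file name `pneumonia_cnn.keras`, the 150×150 grayscale input, the 0.5 threshold, and the convention that the sigmoid output 1 means Pneumonia all have to match how the model was actually trained.

```python
import numpy as np

MODEL_PATH = "pneumonia_cnn.keras"   # hypothetical file name
IMG_SIZE = (150, 150)                # hypothetical; must match training


def to_model_input(pixels):
    """Scale 0-255 grayscale pixels to [0, 1] and add batch/channel axes."""
    x = np.asarray(pixels, dtype="float32") / 255.0
    return x[np.newaxis, ..., np.newaxis]   # (H, W) -> (1, H, W, 1)


def label_from_probability(p, threshold=0.5):
    """Map the sigmoid output to a class name (assumes 1 = Pneumonia)."""
    return "Pneumonia" if p >= threshold else "Normal"


def main():
    # Imported lazily so the helpers above work without these packages.
    import streamlit as st
    from PIL import Image
    from tensorflow.keras.models import load_model

    st.title("Chest X-ray classifier")
    uploaded = st.file_uploader("Upload a chest X-ray",
                                type=["png", "jpg", "jpeg"])
    if uploaded is not None:
        # Convert to grayscale and resize to the training resolution.
        image = Image.open(uploaded).convert("L").resize(IMG_SIZE)
        st.image(image, caption="Uploaded image")

        model = load_model(MODEL_PATH)
        prob = float(model.predict(to_model_input(image))[0][0])
        st.write(f"Prediction: {label_from_probability(prob)} (p = {prob:.2f})")
```

In an actual `app.py`, `main()` would be called at the end of the file and the app started with `streamlit run app.py`.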