# CNN Architecture

## 1. What is the role of filters and feature maps in Convolutional Neural Network (CNN)?

### Role of Filters and Feature Maps in CNN

In a Convolutional Neural Network (CNN), **filters** and **feature maps** are how the network extracts useful information from images: filters detect local patterns, and feature maps record where those patterns occur.

#### Filters (Kernels)
- Filters are small weight matrices (for example, 3×3 or 5×5) that span all input channels.
- They slide over the input image during convolution.
- Each filter detects a specific feature such as:
  - edges
  - corners
  - textures
  - shapes
- The values inside the filter are learned automatically during training.

#### Feature Maps
- A feature map is the output produced by applying one filter across the input (the image, or the previous layer's feature maps).
- Each filter creates **one feature map**, so a layer with 32 filters outputs 32 feature maps.
- Feature maps highlight where a specific feature is present in the image.
- Deeper layers produce feature maps with more complex features.
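
The relationship between a filter and its feature map can be sketched in plain Python. This is a minimal illustration, not how a framework implements it: the `convolve2d_valid` helper, the toy 4×6 image, and the hand-picked vertical-edge kernel are all assumptions made for the example, whereas a real CNN learns its kernel values during training.

```python
def convolve2d_valid(image, kernel):
    """Slide `kernel` over `image` (stride 1, no padding) and return the feature map."""
    k_h, k_w = len(kernel), len(kernel[0])
    out_h = len(image) - k_h + 1
    out_w = len(image[0]) - k_w + 1
    feature_map = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            # Multiply the kernel element-wise with the patch under it, then sum.
            total = sum(
                image[i + di][j + dj] * kernel[di][dj]
                for di in range(k_h)
                for dj in range(k_w)
            )
            row.append(total)
        feature_map.append(row)
    return feature_map


# Toy 4x6 image: dark left half (0), bright right half (1).
image = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
]

# Hand-picked vertical-edge kernel (a trained filter would learn its own values).
vertical_edge = [
    [-1, 0, 1],
    [-1, 0, 1],
    [-1, 0, 1],
]

feature_map = convolve2d_valid(image, vertical_edge)
print(feature_map)  # [[0, 3, 3, 0], [0, 3, 3, 0]]
```

The feature map is non-zero only where the filter window straddles the dark-to-bright boundary, which is what "highlighting where a feature is present" means in practice.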


## 2. Explain the concepts of padding and stride in CNNs. How do they affect the output dimensions of feature maps?

### Padding and Stride in Convolutional Neural Networks (CNN)

#### Padding
- Padding means adding extra pixels (usually zeros) around the border of the input image.
- It is used to control the size of the output feature map.
- Padding helps in:
  - preserving edge information
  - preventing the image from shrinking too quickly

**Types of Padding**
- **Valid padding**: No padding is added → output size reduces.
- **Same padding**: Padding is added so that the output size equals the input size (when the stride is 1).
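
As a small sketch of what zero padding does to the input, the hypothetical `zero_pad` helper below adds a border of zeros around a toy 3×3 image. For an odd filter size k at stride 1, a border of (k − 1) / 2 pixels is what keeps the output the same size as the input, so a 3×3 filter needs one pixel of padding.

```python
def zero_pad(image, pad):
    """Return a copy of `image` with `pad` rows and columns of zeros on every side."""
    width = len(image[0]) + 2 * pad
    padded = [[0] * width for _ in range(pad)]          # top border
    for row in image:
        padded.append([0] * pad + list(row) + [0] * pad)  # left and right borders
    padded.extend([0] * width for _ in range(pad))      # bottom border
    return padded


image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

padded = zero_pad(image, 1)
for row in padded:
    print(row)
# [0, 0, 0, 0, 0]
# [0, 1, 2, 3, 0]
# [0, 4, 5, 6, 0]
# [0, 7, 8, 9, 0]
# [0, 0, 0, 0, 0]
```

Sliding a 3×3 filter over this 5×5 padded image at stride 1 yields a 3×3 output, the same size as the original image.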

#### Stride
- Stride defines how many pixels the filter moves at a time.
- A stride of 1 means the filter moves one pixel at a time.
- A larger stride skips pixels and reduces the output size.

#### Effect on Output Size
- Increasing **padding** → increases or preserves output size.
- Increasing **stride** → decreases output size.
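
Both effects follow from the standard output-size relation, Output = floor((Input + 2 × Padding − Filter) / Stride) + 1, applied to each spatial dimension. The sketch below wraps it in a hypothetical helper, `conv_output_size`, and traces a 28×28 input through two 3×3 convolutions, each followed by 2×2 max pooling, which is the layer pattern used by the MNIST models later in this notebook.

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Output length along one spatial dimension of a convolution or pooling layer."""
    return (n + 2 * padding - kernel) // stride + 1


# Trace a 28x28 input through conv(3) -> pool(2, stride 2) -> conv(3) -> pool(2, stride 2).
size = 28
sizes = [size]
for kernel, stride in [(3, 1), (2, 2), (3, 1), (2, 2)]:
    size = conv_output_size(size, kernel, stride=stride)
    sizes.append(size)

print(sizes)                                          # [28, 26, 13, 11, 5]
print(conv_output_size(28, 3, padding=1))             # 28: "same" padding at stride 1
print(conv_output_size(28, 3, stride=2, padding=1))   # 14: stride 2 halves the size
```

The final 5×5 size is where the `64 * 5 * 5` input dimension of the first fully connected layer in the PyTorch model comes from.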



## 3. Define receptive field in the context of CNNs. Why is it important for deep architectures?

### Receptive Field in Convolutional Neural Networks (CNN)

#### What is Receptive Field?
- The receptive field is the **region of the input image** that a neuron (or feature) in a CNN layer can "see".
- It tells how much of the original image influences a single output value.
- In simple words, it is the **area of the input image used to make a decision**.

#### How Receptive Field Grows
- In early layers, the receptive field is small and captures simple features like edges.
- Each additional convolution or pooling layer enlarges the receptive field, so deeper layers capture larger patterns such as shapes and objects.

#### Importance in Deep Architectures
- A larger receptive field helps the model understand:
  - global context
  - spatial relationships
- It allows deep CNNs to recognize complex objects instead of just small features.
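- A unit cannot respond to a pattern larger than its receptive field, so without a growing receptive field the network cannot learn high-level features that span large parts of the image.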
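
The growth can be computed layer by layer with the usual recurrence: each layer adds (kernel − 1) × jump to the receptive field, where jump is the product of the strides of all earlier layers. The `receptive_field` helper below is a hypothetical sketch of that recurrence; it ignores padding and dilation, which do not change the size of the field for ordinary convolutions.

```python
def receptive_field(layers):
    """Given (kernel_size, stride) per layer, return the receptive field after each layer."""
    rf, jump = 1, 1  # one input pixel sees itself; adjacent units are 1 pixel apart
    sizes = []
    for kernel_size, stride in layers:
        rf += (kernel_size - 1) * jump  # widen by the kernel's extent in input pixels
        jump *= stride                  # later layers step further across the input
        sizes.append(rf)
    return sizes


# conv 3x3 -> max pool 2x2 (stride 2) -> conv 3x3 -> max pool 2x2 (stride 2)
small_cnn = [(3, 1), (2, 2), (3, 1), (2, 2)]
print(receptive_field(small_cnn))  # [3, 4, 8, 10]

# Three stacked 3x3 convolutions at stride 1
print(receptive_field([(3, 1), (3, 1), (3, 1)]))  # [3, 5, 7]
```

In the first case, each value in the final feature map depends on a 10×10 region of the input, even though no single filter is larger than 3×3.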


## 4. Discuss how filter size and stride influence the number of parameters in a CNN.

### Effect of Filter Size and Stride on Number of Parameters in CNN

#### Filter Size
- Filter size decides how many values (weights) are inside a filter.
- Larger filters contain more parameters.
- Number of parameters in one filter is calculated as:

Parameters = (Filter Height × Filter Width × Input Channels) + Bias

#### Example
- 3×3 filter with 1 input channel:
  - Parameters = (3×3×1) + 1 = 10
- 5×5 filter with 1 input channel:
  - Parameters = (5×5×1) + 1 = 26
- Larger filters increase:
  - model size
  - computation cost
  - risk of overfitting

#### Stride
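- Stride does **not change** the number of parameters in the convolution layer itself. It can indirectly reduce the parameters of a later dense layer, because a smaller feature map means fewer inputs after flattening.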
- It affects how many times the filter is applied on the input.
- Larger stride:
  - reduces output feature map size
  - reduces computation
- Smaller stride:
  - keeps more spatial information
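
The formula above counts one filter; a layer with several filters has that many copies. The sketch below uses two hypothetical helpers, `conv_params` and `dense_params`, to reproduce the examples and to count the parameters of a small network shaped like the Keras MNIST model in this notebook (3×3 convolutions with 32 and 64 filters, then dense layers of 128 and 10 units). Note that stride is not an argument of `conv_params`; it only matters through the flattened size fed to the dense layer, and the 2×2 map in the last line is an assumed value for illustration.

```python
def conv_params(kernel_h, kernel_w, in_channels, num_filters, bias=True):
    """Trainable parameters in a convolution layer."""
    per_filter = kernel_h * kernel_w * in_channels + (1 if bias else 0)
    return per_filter * num_filters


def dense_params(in_features, out_features):
    """Trainable parameters in a fully connected layer (weights plus biases)."""
    return in_features * out_features + out_features


# The single-filter examples from the text.
print(conv_params(3, 3, 1, 1))  # 10
print(conv_params(5, 5, 1, 1))  # 26

# Layer-by-layer count for a 28x28x1 input whose final feature map is 5x5x64.
layer_counts = [
    conv_params(3, 3, 1, 32),       # 320
    conv_params(3, 3, 32, 64),      # 18496
    dense_params(5 * 5 * 64, 128),  # 204928
    dense_params(128, 10),          # 1290
]
print(layer_counts, sum(layer_counts))  # total: 225034

# Indirect effect of a larger stride: a smaller (assumed 2x2) final map shrinks the dense layer.
print(dense_params(2 * 2 * 64, 128))  # 32896
```

Most of the parameters sit in the first dense layer, not in the convolutions, which is why reducing the feature map size before flattening has such a large effect on model size.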


## 5. Compare and contrast CNN architectures: LeNet, AlexNet, and VGG.

### Comparison of LeNet, AlexNet, and VGG CNN Architectures

#### LeNet
- One of the earliest CNN architectures.
- Designed mainly for handwritten digit recognition (MNIST).
- Shallow network with fewer layers.
- Uses small grayscale input images (32×32 in LeNet-5; 28×28 MNIST digits are padded to fit).
- Performance is good for simple tasks but not suitable for complex images.


#### AlexNet
- First deep CNN that showed huge success in image classification.
- Won the ImageNet competition in 2012.
- Popularized ReLU activation, dropout, and GPU-based training for deep CNNs.
- Handles large and complex images.


#### VGG
- Very deep CNN architecture.
- Uses only small filters (3×3) but many layers.
- Focuses on depth to improve performance.
- High accuracy but very high computational cost.
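
One reason small filters work well in deep stacks is that two 3×3 convolutions at stride 1 cover the same 5×5 receptive field as a single 5×5 convolution, and three cover 7×7, while using fewer weights. The sketch below checks the arithmetic with a hypothetical helper, `stacked_conv_weights`; the channel count of 64 is an assumed value for illustration, and biases are ignored.

```python
def stacked_conv_weights(kernel, channels, num_layers):
    """Weights (biases ignored) in `num_layers` conv layers, each mapping channels -> channels."""
    return num_layers * kernel * kernel * channels * channels


channels = 64  # assumed channel count, for illustration only

one_5x5 = stacked_conv_weights(5, channels, 1)
two_3x3 = stacked_conv_weights(3, channels, 2)
one_7x7 = stacked_conv_weights(7, channels, 1)
three_3x3 = stacked_conv_weights(3, channels, 3)

print(one_5x5, two_3x3)    # 102400 73728
print(one_7x7, three_3x3)  # 200704 110592
```

The stacked version also applies a non-linearity after every layer, so it gains depth while cutting the weight count. The high overall cost of VGG comes from the number of layers and channels and from its large fully connected layers, not from the filter size.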


#### Comparison Table

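| Feature     | LeNet                       | AlexNet                  | VGG                                            |
|-------------|-----------------------------|--------------------------|------------------------------------------------|
| Depth       | Shallow                     | Medium (8 learned layers) | Very deep (16 to 19 learned layers)           |
| Filter Size | 5×5                         | 11×11 → 5×5 → 3×3        | 3×3 throughout                                 |
| Complexity  | Low                         | Medium                   | High                                           |
| Performance | Good on simple digit images | Strong on ImageNet       | Higher ImageNet accuracy, at higher compute cost |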


## 6. Build and train a simple CNN model on the MNIST dataset using Keras.

In [1]:
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

# Load MNIST dataset
(X_train, y_train), (X_test, y_test) = mnist.load_data()

# Reshape data to include channel dimension
X_train = X_train.reshape(-1, 28, 28, 1) / 255.0
X_test = X_test.reshape(-1, 28, 28, 1) / 255.0

# One-hot encode target labels
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

# Build CNN model
model = Sequential([
    Conv2D(32, (3,3), activation='relu', input_shape=(28,28,1)),
    MaxPooling2D(pool_size=(2,2)),
    
    Conv2D(64, (3,3), activation='relu'),
    MaxPooling2D(pool_size=(2,2)),
    
    Flatten(),
    Dense(128, activation='relu'),
    Dense(10, activation='softmax')
])

# Compile the model
model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

# Train the model
model.fit(X_train, y_train, epochs=5, batch_size=64, validation_split=0.1)

# Evaluate the model
test_loss, test_accuracy = model.evaluate(X_test, y_test)
print("Test Accuracy:", test_accuracy)




Epoch 1/5
844/844 ━━━━━━━━━━━━━━━━━━━━ 20s 20ms/step - accuracy: 0.9474 - loss: 0.1735 - val_accuracy: 0.9837 - val_loss: 0.0607
Epoch 2/5
844/844 ━━━━━━━━━━━━━━━━━━━━ 19s 22ms/step - accuracy: 0.9844 - loss: 0.0503 - val_accuracy: 0.9830 - val_loss: 0.0551
Epoch 3/5
844/844 ━━━━━━━━━━━━━━━━━━━━ 19s 23ms/step - accuracy: 0.9888 - loss: 0.0356 - val_accuracy: 0.9902 - val_loss: 0.0416
Epoch 4/5
844/844 ━━━━━━━━━━━━━━━━━━━━ 17s 20ms/step - accuracy: 0.9916 - loss: 0.0259 - val_accuracy: 0.9910 - val_loss: 0.0384
Epoch 5/5
844/844 ━━━━━━━━━━━━━━━━━━━━ 16s 19ms/step - accuracy: 0.9936 - loss: 0.0198 - val_accuracy: 0.9887 - val_loss: 0.0349
313/313 ━━━━━━━━━━━━━━━━━━━━ 2s 6ms/step - accuracy: 0.9887 - loss: 0.0331
Test Accuracy: 0.9886999726295471


## 7. Load and preprocess the CIFAR-10 dataset using Keras, and create a CNN model to classify RGB images.

In [2]:
# Import required libraries
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.datasets import cifar10
from tensorflow.keras.utils import to_categorical

# Load CIFAR-10 dataset
(X_train, y_train), (X_test, y_test) = cifar10.load_data()

# Normalize pixel values
X_train = X_train / 255.0
X_test = X_test / 255.0

# One-hot encode labels
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

# Build CNN model
model = Sequential([
    Conv2D(32, (3,3), activation='relu', input_shape=(32,32,3)),
    MaxPooling2D((2,2)),

    Conv2D(64, (3,3), activation='relu'),
    MaxPooling2D((2,2)),

    Flatten(),
    Dense(128, activation='relu'),
    Dropout(0.5),
    Dense(10, activation='softmax')
])

# Compile the model
model.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=['accuracy']
)

# Train the model
model.fit(X_train, y_train, epochs=5, batch_size=64, validation_split=0.1)

# Evaluate the model
test_loss, test_accuracy = model.evaluate(X_test, y_test)
print("Test Accuracy:", test_accuracy)


Downloading data from https://www.cs.toronto.edu/~kriz/cifar-10-python.tar.gz
170498071/170498071 ━━━━━━━━━━━━━━━━━━━━ 68s 0us/step
Epoch 1/5
704/704 ━━━━━━━━━━━━━━━━━━━━ 23s 26ms/step - accuracy: 0.3816 - loss: 1.6859 - val_accuracy: 0.5228 - val_loss: 1.3448
Epoch 2/5
704/704 ━━━━━━━━━━━━━━━━━━━━ 19s 27ms/step - accuracy: 0.5183 - loss: 1.3510 - val_accuracy: 0.5886 - val_loss: 1.1662
Epoch 3/5
704/704 ━━━━━━━━━━━━━━━━━━━━ 19s 27ms/step - accuracy: 0.5716 - loss: 1.2103 - val_accuracy: 0.6204 - val_loss: 1.0847
Epoch 4/5
704/704 ━━━━━━━━━━━━━━━━━━━━ 23s 32ms/step - accuracy: 0.6050 - loss: 1.1247 - val_accuracy: 0.6558 - val_loss: 1.0031
Epoch 5/5
704/704 ━━━━━━━━━━━━━━━━━━━━ 22s 32ms/step - accuracy: 0.6262 - loss: 1.0633 - val_accuracy: 0.6666 - val_loss: 0.9426
313/313

## 8. Define and train a CNN on the MNIST dataset using PyTorch.

In [None]:
%pip install torch torchvision torchaudio


In [5]:
# Import required libraries
import torch
import torch.nn as nn
import torch.optim as optim
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

# Device configuration
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Data preprocessing
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5,), (0.5,))
])

# Load MNIST dataset
train_dataset = datasets.MNIST(root='./data', train=True, transform=transform, download=True)
test_dataset = datasets.MNIST(root='./data', train=False, transform=transform, download=True)

train_loader = DataLoader(train_dataset, batch_size=64, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=64, shuffle=False)

# Define CNN model
class CNN(nn.Module):
    def __init__(self):
        super(CNN, self).__init__()
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3)
        self.pool = nn.MaxPool2d(2, 2)
        self.fc1 = nn.Linear(64 * 5 * 5, 128)
        self.fc2 = nn.Linear(128, 10)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.pool(self.relu(self.conv1(x)))
        x = self.pool(self.relu(self.conv2(x)))
        x = x.view(x.size(0), -1)
        x = self.relu(self.fc1(x))
        x = self.fc2(x)
        return x

# Initialize model, loss, optimizer
model = CNN().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)

# Training loop
for epoch in range(5):
    model.train()
    running_loss = 0.0
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)

        optimizer.zero_grad()
        outputs = model(images)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        running_loss += loss.item()

    print(f"Epoch {epoch+1}, Loss: {running_loss/len(train_loader)}")

# Evaluation
model.eval()
correct = 0
total = 0

with torch.no_grad():
    for images, labels in test_loader:
        images, labels = images.to(device), labels.to(device)
        outputs = model(images)
        _, predicted = torch.max(outputs, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print("Test Accuracy:", correct / total)


100%|██████████| 9.91M/9.91M [00:03<00:00, 3.04MB/s]
100%|██████████| 28.9k/28.9k [00:00<00:00, 122kB/s]
100%|██████████| 1.65M/1.65M [00:01<00:00, 1.03MB/s]
100%|██████████| 4.54k/4.54k [00:00<00:00, 2.27MB/s]


Epoch 1, Loss: 0.1580375699147181
Epoch 2, Loss: 0.04634912982411774
Epoch 3, Loss: 0.03244273949310612
Epoch 4, Loss: 0.023735606959258426
Epoch 5, Loss: 0.01794759144512331
Test Accuracy: 0.9886


## 9. Train a CNN using Keras ImageDataGenerator on a custom image dataset.

In [9]:
# Import required libraries
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Image parameters
img_height = 224
img_width = 224
batch_size = 32

# ImageDataGenerator (NO validation split)
datagen = ImageDataGenerator(
    rescale=1./255
)

# Load image data from directory
train_data = datagen.flow_from_directory(
    'dataset/',                # Path to dataset folder
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary'
)

# Build CNN model
model = Sequential([
    Conv2D(32, (3,3), activation='relu', input_shape=(img_height, img_width, 3)),
    MaxPooling2D(2,2),

    Conv2D(64, (3,3), activation='relu'),
    MaxPooling2D(2,2),

    Flatten(),
    Dense(128, activation='relu'),
    Dense(1, activation='sigmoid')
])

# Compile the model
model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy']
)

# Train the model
model.fit(
    train_data,
    epochs=5
)


Found 2 images belonging to 2 classes.


Epoch 1/5
1/1 ━━━━━━━━━━━━━━━━━━━━ 3s 3s/step - accuracy: 0.5000 - loss: 0.6588
Epoch 2/5
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 573ms/step - accuracy: 0.5000 - loss: 4.6685
Epoch 3/5
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 495ms/step - accuracy: 0.5000 - loss: 0.6883
Epoch 4/5
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 611ms/step - accuracy: 1.0000 - loss: 1.0663e-05
Epoch 5/5
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 490ms/step - accuracy: 1.0000 - loss: 2.1019e-05



## 10. End-to-end approach to build and deploy a CNN model for classifying Chest X-ray images (Normal vs Pneumonia) using Streamlit

### Step 1: Data Preparation
- Collect chest X-ray images and organize them into folders:
  - dataset/
    - train/
      - Normal/
      - Pneumonia/
    - test/
      - Normal/
      - Pneumonia/
- Resize images to a fixed size (e.g., 224×224).
- Normalize pixel values to range [0,1].
- Use data augmentation on the training images to reduce overfitting.


In [14]:
# Data preprocessing using ImageDataGenerator
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(
    rescale=1./255,
    rotation_range=15,
    zoom_range=0.2,
    horizontal_flip=True
)

test_datagen = ImageDataGenerator(rescale=1./255)

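# Point each generator at its own split so that Normal/ and Pneumonia/
# (not train/ and test/) are picked up as the two classes
train_data = train_datagen.flow_from_directory(
    'dataset/train/',
    target_size=(224,224),
    batch_size=32,
    class_mode='binary'
)

test_data = test_datagen.flow_from_directory(
    'dataset/test/',
    target_size=(224,224),
    batch_size=32,
    class_mode='binary',
    shuffle=False              # keep a fixed order for evaluation
)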


Found 2 images belonging to 2 classes.


Found 2 images belonging to 2 classes.


### Step 2: Model Building and Training
- Use a CNN with convolution, pooling, and dense layers.
- Binary classification using sigmoid activation.
- Binary cross-entropy loss is used.


In [15]:
# CNN model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense

model = Sequential([
    Conv2D(32, (3,3), activation='relu', input_shape=(224,224,3)),
    MaxPooling2D(2,2),

    Conv2D(64, (3,3), activation='relu'),
    MaxPooling2D(2,2),

    Flatten(),
    Dense(128, activation='relu'),
    Dense(1, activation='sigmoid')
])

model.compile(
    optimizer='adam',
    loss='binary_crossentropy',
    metrics=['accuracy']
)

model.fit(train_data, epochs=5, validation_data=test_data)

# Save model
model.save("xray_model.h5")


Epoch 1/5
1/1 ━━━━━━━━━━━━━━━━━━━━ 5s 5s/step - accuracy: 0.5000 - loss: 0.6909 - val_accuracy: 1.0000 - val_loss: 0.3441
Epoch 2/5
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 1s/step - accuracy: 0.5000 - loss: 0.3901 - val_accuracy: 1.0000 - val_loss: 2.1040e-05
Epoch 3/5
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 1s/step - accuracy: 1.0000 - loss: 3.6588e-05 - val_accuracy: 0.5000 - val_loss: 0.4800
Epoch 4/5
1/1 ━━━━━━━━━━━━━━━━━━━━ 2s 2s/step - accuracy: 1.0000 - loss: 0.0129 - val_accuracy: 1.0000 - val_loss: 0.1248
Epoch 5/5
1/1 ━━━━━━━━━━━━━━━━━━━━ 1s 681ms/step - accuracy: 1.0000 - loss: 0.0038 - val_accuracy: 1.0000 - val_loss: 9.5989e-05




### Step 3: Model Deployment using Streamlit
- Load the trained model.
- Accept image upload from user.
- Preprocess image and predict result.
- Display prediction on web interface.


In [16]:
# streamlit_app.py
import streamlit as st
import tensorflow as tf
import numpy as np
from PIL import Image

model = tf.keras.models.load_model("xray_model.h5")

st.title("Chest X-ray Classification")

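uploaded_file = st.file_uploader("Upload X-ray Image", type=["jpg", "jpeg", "png"])

if uploaded_file is not None:
    # X-rays are often grayscale (or RGBA PNGs); convert to RGB so the
    # array matches the model's 3-channel input before reshaping
    image = Image.open(uploaded_file).convert("RGB").resize((224,224))
    img_array = np.array(image) / 255.0
    img_array = img_array.reshape(1,224,224,3)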

    prediction = model.predict(img_array)

    if prediction[0][0] > 0.5:
        st.write("Prediction: Pneumonia")
    else:
        st.write("Prediction: Normal")
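
The 0.5 threshold above relies on how the training labels were assigned: Keras `flow_from_directory` numbers class folders in alphanumeric order, so with folders named `Normal` and `Pneumonia` the sigmoid output is the probability of `Pneumonia`. The sketch below isolates that mapping in a hypothetical helper, `label_from_probability`, which also returns a confidence value; the helper name, the default class order, and the threshold are assumptions for illustration and should be checked against `train_data.class_indices`.

```python
def label_from_probability(prob_positive, class_names=("Normal", "Pneumonia"), threshold=0.5):
    """Map a sigmoid output to (label, confidence in that label)."""
    if prob_positive > threshold:
        return class_names[1], prob_positive
    return class_names[0], 1.0 - prob_positive


label, confidence = label_from_probability(0.9)
print(label, confidence)  # Pneumonia 0.9

label_low, confidence_low = label_from_probability(0.2)
print(label_low, round(confidence_low, 2))  # Normal 0.8
```

In the Streamlit app, the value passed in would be `float(prediction[0][0])`, and the returned confidence could be shown next to the predicted label.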




