Question 1: Explain the basic components of a digital image and how it is represented in a computer. State the differences between grayscale and color images.

In [3]:
#Answer

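A digital image is a grid of pixels, the smallest addressable units of the image. Each pixel stores one or more numeric intensity values that encode its brightness or color. In a computer, an image is therefore stored as a matrix (array) of pixel values whose dimensions are the image height and width, and the bit depth sets the range of each value.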

Grayscale Image:
Each pixel has a single intensity value (0 = black to 255 = white for an 8-bit image).
It is represented as a 2D NumPy array of shape (height, width).
Color Image:
Uses three channels (Red, Green, and Blue - RGB).
Each pixel has three values, one per channel.
It is represented as a 3D NumPy array of shape (height, width, 3).
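
The difference in array shape can be seen in a minimal sketch. The 4x4 size and the pixel values below are made up for illustration; they do not come from a real image file.

```python
import numpy as np

# Hypothetical 4x4 8-bit grayscale image: one intensity value per pixel
gray = np.zeros((4, 4), dtype=np.uint8)
gray[1, 2] = 255  # a single white pixel at row 1, column 2

# Hypothetical 4x4 8-bit color image: three values (R, G, B) per pixel
color = np.zeros((4, 4, 3), dtype=np.uint8)
color[1, 2] = [255, 0, 0]  # a pure red pixel at the same position

print(gray.shape, gray.ndim)    # (4, 4) 2
print(color.shape, color.ndim)  # (4, 4, 3) 3
print(color[1, 2].tolist())     # [255, 0, 0]
```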

Question 2: Define Convolutional Neural Networks (CNNs) and discuss their role in image processing. Describe the key advantages of using CNNs over traditional neural networks for image-related tasks.

In [5]:
#Answer

A Convolutional Neural Network (CNN) is a deep learning architecture designed for grid-structured data such as images. In image processing it is used to recognize patterns for tasks such as classification. It consists of convolutional layers, activation functions, pooling layers, and fully connected layers.

Advantages of CNNs over traditional neural networks:

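Automatic Feature Extraction – CNNs learn a spatial hierarchy of features directly from the pixels, whereas a fully connected network flattens the image and ignores its 2D structure.
Parameter Sharing – Each convolutional filter is reused at every image position, so a layer needs far fewer parameters than a fully connected layer of comparable size.
Translation Invariance – Because the same filter is applied everywhere, a pattern is detected wherever it appears in the image, and pooling makes the response tolerant to small shifts. A fully connected network has to learn the pattern separately for each position.
Better Generalization – Shared weights and pooling reduce the number of free parameters, which lowers the risk of overfitting.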
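
As a rough illustration of parameter sharing, the arithmetic below compares a 3x3 convolution with 32 filters on a 28x28x1 input against a hypothetical fully connected layer that produces the same number of outputs (26 x 26 x 32). The dense layer is only an assumption for comparison, not something the model in this notebook uses.

```python
# Parameter count for a first layer on a 28x28x1 input (illustrative arithmetic only)
kernel_h, kernel_w, in_channels, filters = 3, 3, 1, 32

# Convolution: each filter has kernel_h * kernel_w * in_channels weights plus one bias,
# and the same filter is reused at every spatial position.
conv_params = (kernel_h * kernel_w * in_channels + 1) * filters

# Hypothetical fully connected layer with the same number of outputs (26 x 26 x 32):
# every output unit has its own weight for every input pixel, plus one bias.
inputs = 28 * 28 * 1
outputs = 26 * 26 * 32
dense_params = inputs * outputs + outputs

print(conv_params)   # 320
print(dense_params)  # 16981120
```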

In [1]:
import tensorflow as tf
from tensorflow.keras import layers, models

# Define a simple CNN model
model = models.Sequential([
    layers.Input(shape=(28, 28, 1)),  # 28x28 grayscale input, declared as its own layer
    layers.Conv2D(32, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.Flatten(),
    layers.Dense(64, activation='relu'),
    layers.Dense(10, activation='softmax')  # Output layer for classification
])

# Print model summary
model.summary()


Question 3: Define convolutional layers and their purpose in a CNN. Discuss the concept of filters and how they are applied during the convolution operation. Explain the use of padding and strides in convolutional layers and their impact on the output size.

In [7]:
#Answer

Convolutional Layers: These layers apply a set of learnable filters (also called kernels) to the input to extract features such as edges, textures, and shapes. Each filter produces one feature map.
Filters: Small weight matrices that slide over the input. At each position the filter is multiplied element-wise with the patch beneath it, and the products are summed to give one value of the feature map.
Padding:
Same Padding: Adds extra pixels (usually zeros) around the image so that, at stride 1, the output has the same height and width as the input.
Valid Padding: No padding, so a k x k filter shrinks each spatial dimension by k - 1 at stride 1.
Strides: The number of pixels by which the filter moves at each step. A stride of s reduces each output dimension by roughly a factor of s.
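
The effect of padding and stride on the output size can be sketched in plain Python. For input size n, filter size k, padding p, and stride s, each spatial dimension of the output is floor((n + 2p - k) / s) + 1. The helper names `conv_output_size` and `conv2d` and the vertical-edge kernel are hypothetical choices for illustration, and the sketch assumes a square kernel and zero padding.

```python
def conv_output_size(n, k, p=0, s=1):
    """Output length along one spatial dimension: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1

def conv2d(image, kernel, stride=1, padding=0):
    """Slide a square kernel over a 2D list, multiplying and summing at each position.

    Like deep learning libraries, this does not flip the kernel (cross-correlation).
    """
    k = len(kernel)
    width = len(image[0]) + 2 * padding
    # Zero-pad the image on all four sides
    padded = [[0] * width for _ in range(padding)]
    padded += [[0] * padding + list(row) + [0] * padding for row in image]
    padded += [[0] * width for _ in range(padding)]
    out_h = conv_output_size(len(image), k, padding, stride)
    out_w = conv_output_size(len(image[0]), k, padding, stride)
    return [[sum(padded[i * stride + a][j * stride + b] * kernel[a][b]
                 for a in range(k) for b in range(k))
             for j in range(out_w)]
            for i in range(out_h)]

image = [[1, 2, 3, 0, 1],
         [5, 6, 7, 1, 0],
         [9, 10, 11, 2, 3],
         [0, 2, 4, 8, 6],
         [5, 6, 7, 1, 0]]
kernel = [[1, 0, -1]] * 3  # hypothetical vertical-edge filter

valid = conv2d(image, kernel)                         # no padding
same = conv2d(image, kernel, padding=1)               # padding 1 keeps 5x5 at stride 1
strided = conv2d(image, kernel, stride=2, padding=1)  # stride 2 roughly halves the size

print(len(valid), len(valid[0]))      # 3 3
print(len(same), len(same[0]))        # 5 5
print(len(strided), len(strided[0]))  # 3 3
```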

In [2]:
import torch
import torch.nn as nn

# Define a convolutional layer (1 input channel, 1 output channel, 3x3 kernel, stride 1, padding 1)
conv_layer = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3, stride=1, padding=1)

# Create a sample 5x5 grayscale image as a 2D tensor
input_image = torch.tensor([[1, 2, 3, 0, 1], 
                            [5, 6, 7, 1, 0], 
                            [9, 10, 11, 2, 3], 
                            [0, 2, 4, 8, 6], 
                            [5, 6, 7, 1, 0]], dtype=torch.float32)

# Reshape input to match PyTorch's expected shape: (batch_size=1, channels=1, height=5, width=5)
input_image = input_image.unsqueeze(0).unsqueeze(0)  # Shape: (1, 1, 5, 5)

# Apply convolution
output_image = conv_layer(input_image)

# Print shapes
print(f"Input shape: {input_image.shape}")   # Should be (1, 1, 5, 5)
print(f"Output shape: {output_image.shape}") # Should be (1, 1, 5, 5) with padding=1


Input shape: torch.Size([1, 1, 5, 5])
Output shape: torch.Size([1, 1, 5, 5])


Question 4: Describe the purpose of pooling layers in CNNs. Compare max pooling and average pooling operations.

In [9]:
#Answer

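Pooling layers reduce the spatial dimensions of feature maps while preserving the most important information. They make the model:

More computationally efficient, because later layers process smaller feature maps.
Less sensitive to small translations of the input.
Better at retaining dominant features.

Max Pooling: Keeps the largest value in each window, preserving the strongest activations.
Average Pooling: Takes the mean of each window, giving a smoother summary in which strong activations are diluted by their neighbors.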

In [4]:
import torch
import torch.nn as nn

# Define Max Pooling and Average Pooling layers
max_pool = nn.MaxPool2d(kernel_size=2, stride=2)
avg_pool = nn.AvgPool2d(kernel_size=2, stride=2)

# Create a sample 4x4 feature map as a 2D tensor
feature_map = torch.tensor([[1, 3, 2, 1], 
                            [4, 6, 5, 2], 
                            [8, 10, 7, 4], 
                            [3, 5, 6, 1]], dtype=torch.float32)

# Reshape input to match PyTorch's 4D requirement: (batch_size=1, channels=1, height=4, width=4)
feature_map = feature_map.unsqueeze(0).unsqueeze(0)  # Shape: (1, 1, 4, 4)

# Apply pooling
max_pooled = max_pool(feature_map)
avg_pooled = avg_pool(feature_map)

# Print results
print("Max Pooled Output:\n", max_pooled.squeeze())  # Remove batch & channel dimensions for readability
print("Average Pooled Output:\n", avg_pooled.squeeze())  # Remove batch & channel dimensions for readability


Max Pooled Output:
 tensor([[ 6.,  5.],
        [10.,  7.]])
Average Pooled Output:
 tensor([[3.5000, 2.5000],
        [6.5000, 4.5000]])
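
The printed values can be checked by hand. The sketch below reproduces them without PyTorch by reducing each non-overlapping 2x2 window of the same feature map. The helper names `pool2d` and `mean` are hypothetical, and the sketch assumes the stride equals the window size.

```python
def pool2d(feature_map, size, reduce_fn):
    """Apply reduce_fn to each non-overlapping size x size window (stride = size)."""
    rows, cols = len(feature_map), len(feature_map[0])
    return [[reduce_fn([feature_map[i + a][j + b]
                        for a in range(size) for b in range(size)])
             for j in range(0, cols - size + 1, size)]
            for i in range(0, rows - size + 1, size)]

def mean(values):
    return sum(values) / len(values)

feature_map = [[1, 3, 2, 1],
               [4, 6, 5, 2],
               [8, 10, 7, 4],
               [3, 5, 6, 1]]

print(pool2d(feature_map, 2, max))   # [[6, 5], [10, 7]]
print(pool2d(feature_map, 2, mean))  # [[3.5, 2.5], [6.5, 4.5]]
```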
