# TOPIC: Understanding Pooling and Padding in CNN

### 1. Describe the purpose and benefits of pooling in CNN.
-     Ans: Pooling in a CNN (Convolutional Neural Network) downsamples each feature map by summarizing every local window (for example, 2x2) with a single value such as its maximum or average. 
         This shrinks the spatial dimensions, which reduces the computational load and speeds up training while retaining the strongest responses. It also makes the network less sensitive to small shifts in the input (approximate translation invariance) 
         and suppresses some noise while maintaining crucial image characteristics.
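
As a rough illustration of the translation-invariance point, the NumPy sketch below applies non-overlapping 2x2 max pooling to two tiny inputs that differ only by a one-pixel shift. The helper `max_pool2d` and the toy arrays are assumptions made for this example, and the invariance only holds while the shift stays inside one pooling window.

```python
import numpy as np

def max_pool2d(x, size=2):
    """Non-overlapping max pooling over size x size windows (stride = size)."""
    h, w = x.shape
    # Crop so both dimensions divide evenly, then group pixels into windows.
    x = x[: h - h % size, : w - w % size]
    windows = x.reshape(x.shape[0] // size, size, x.shape[1] // size, size)
    return windows.max(axis=(1, 3))

# A single bright pixel, and the same pixel shifted diagonally by one position
# (still inside the same 2x2 pooling window).
a = np.zeros((4, 4))
a[0, 0] = 1.0
b = np.zeros((4, 4))
b[1, 1] = 1.0

pooled_a = max_pool2d(a)
pooled_b = max_pool2d(b)

print(pooled_a.shape)                      # (2, 2)
print(np.array_equal(pooled_a, pooled_b))  # True
```

The 4x4 input becomes 2x2, and both inputs produce the same pooled map.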

### 2. Explain the difference between min pooling and max pooling.
-     Ans: Min pooling and max pooling both slide a window over the feature map and keep one value per window. Min pooling keeps the smallest value, which emphasizes low-intensity (darker) regions. 
         Max pooling keeps the largest value, which emphasizes the strongest (brightest) responses. Both reduce dimensionality, 
         but max pooling is far more common because high activations usually mark the presence of a feature, and after ReLU the minimum of a window is often just zero.
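
The difference is easy to see on a small example. The sketch below uses a hypothetical helper `pool2d` and a made-up 4x4 feature map, and applies both reductions to the same 2x2 windows.

```python
import numpy as np

def pool2d(x, size, reduce_fn):
    """Apply reduce_fn (e.g. np.max or np.min) to non-overlapping windows."""
    h, w = x.shape
    cropped = x[: h - h % size, : w - w % size]
    windows = cropped.reshape(h // size, size, w // size, size)
    return reduce_fn(windows, axis=(1, 3))

# Toy feature map, values chosen only for illustration.
feature_map = np.array([
    [1, 3, 2, 0],
    [5, 6, 1, 2],
    [7, 2, 9, 4],
    [3, 1, 0, 8],
])

max_pooled = pool2d(feature_map, 2, np.max)
min_pooled = pool2d(feature_map, 2, np.min)

print(max_pooled.tolist())  # [[6, 2], [7, 9]]
print(min_pooled.tolist())  # [[1, 0], [1, 0]]
```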

### 3. Discuss the concept of padding in CNN and its significance.
-     Ans: Padding in CNN involves adding extra border pixels to input images before convolution. It prevents reduction in feature map size after convolution,
         preserving information at image edges. Padding maintains spatial dimensions, aids feature extraction, and enables deeper networks by retaining 
         essential data during convolutions.
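
A minimal sketch of zero-padding with NumPy's `np.pad`, using a made-up 3x3 image: one ring of zeros turns it into a 5x5 array, so a 3x3 kernel at stride 1 fits in as many positions as the original image has pixels.

```python
import numpy as np

# Toy 3x3 "image" with values 1..9, for illustration only.
image = np.arange(1, 10).reshape(3, 3)

# Add one ring of zeros around the border.
padded = np.pad(image, pad_width=1, mode="constant", constant_values=0)

print(padded.shape)  # (5, 5)
print(padded)

# Number of positions a 3x3 kernel can take along one axis at stride 1.
kernel = 3
positions_without_padding = image.shape[0] - kernel + 1
positions_with_padding = padded.shape[0] - kernel + 1
print(positions_without_padding, positions_with_padding)  # 1 3
```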

### 4. Compare and contrast zero-padding and valid-padding in terms of their effects on the output feature map size.
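-     Ans: Zero-padding adds a border of zeros around the input before convolution. Valid-padding adds nothing, so the kernel is applied only where it fits entirely inside the input. 
         For an input of width n, kernel width k, padding p per side, and stride s, the output width is floor((n + 2p - k) / s) + 1. 
         With valid-padding (p = 0) and stride 1 the output shrinks to n - k + 1, while zero-padding with p = (k - 1) / 2 (often called "same" padding, for odd k) keeps the output at n.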
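
The effect on output size follows the standard formula floor((n + 2p - k) / s) + 1. The function names below are assumptions for this sketch, which compares the two cases for a 28x28 input and a 5x5 kernel.

```python
def conv_output_size(n, kernel, stride=1, padding=0):
    """Output width for input width n, with `padding` zeros added on each side."""
    return (n + 2 * padding - kernel) // stride + 1

def same_padding(kernel):
    """Per-side padding that preserves the size at stride 1 (odd kernels only)."""
    return (kernel - 1) // 2

valid_size = conv_output_size(28, kernel=5)                          # no padding
same_size = conv_output_size(28, kernel=5, padding=same_padding(5))  # 2 zeros per side

print(valid_size)  # 24
print(same_size)   # 28
```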

# TOPIC: Exploring LeNet

### 1. Provide a brief overview of LeNet-5 architecture.
-   Ans: LeNet-5 is an early and influential Convolutional Neural Network (CNN) architecture. It consists of input, convolutional, pooling, and fully connected layers.
         LeNet-5 was designed for handwritten digit recognition. It features progressively reduced spatial dimensions through convolutions and pooling, 
         followed by flattening and fully connected layers for classification. LeNet-5's architecture serves as a foundation for modern CNNs, 
         contributing to their development and success in image-related tasks.

### 2. Describe the key components of LeNet-5 and their respective purposes.
-   Ans: LeNet-5 has three main components: convolutional layers, pooling layers, and fully connected layers. 
         Convolutional layers extract features from input images. 
         Pooling layers reduce dimensionality and capture key features. 
         Fully connected layers combine extracted features for classification. 
         These components work together to recognize and classify handwritten digits or other images effectively.
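
To make the interplay of these components concrete, the sketch below traces the spatial size and the convolutional parameter counts for a LeNet-5 variant on 28x28 grayscale inputs, with 5x5 valid convolutions and 2x2 pooling at stride 2, as in the implementation later in this document. The helper names are assumptions for this example.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Output width of a convolution or pooling layer."""
    return (n + 2 * padding - kernel) // stride + 1

def conv_params(kernel, in_channels, out_channels):
    """Weights plus one bias per output channel."""
    return (kernel * kernel * in_channels + 1) * out_channels

size = 28
size = out_size(size, kernel=5)            # conv 5x5, valid -> 24
size = out_size(size, kernel=2, stride=2)  # pool 2x2       -> 12
size = out_size(size, kernel=5)            # conv 5x5, valid -> 8
size = out_size(size, kernel=2, stride=2)  # pool 2x2       -> 4

flattened = size * size * 16               # 16 feature maps of 4x4

print(size, flattened)        # 4 256
print(conv_params(5, 1, 6))   # 156
print(conv_params(5, 6, 16))  # 2416
```

The fully connected layers then operate on the 256 flattened features.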

### 3. Discuss the advantages and limitations of LeNet-5 in the context of image classification tasks.
-   Ans: LeNet-5's strengths lie in its pioneering use of CNNs for image classification. It excels at simple tasks like 
         handwritten digit recognition due to its convolution and pooling layers that capture basic features. 
         However, it struggles with complex and diverse images due to its limited depth and simplicity. 
         Modern architectures have surpassed LeNet-5 by incorporating deeper layers, advanced activation functions, 
         and larger datasets, enabling them to handle intricate image classification tasks more effectively.



### 4. Implement LeNet-5 using a deep learning framework of your choice (e.g., TensorFlow, PyTorch) and train it on a publicly available dataset (e.g., MNIST). Evaluate its performance and provide insights.

In [2]:
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, AveragePooling2D, Flatten, Dense



In [3]:
(x_train, y_train), (x_test, y_test) = mnist.load_data()
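# Scale pixels to [0, 1] and make the channel axis explicit: (N, 28, 28) -> (N, 28, 28, 1)
x_train, x_test = x_train[..., None] / 255.0, x_test[..., None] / 255.0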

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


In [4]:
y_train = tf.keras.utils.to_categorical(y_train, 10)
y_test = tf.keras.utils.to_categorical(y_test, 10)

In [5]:
# Building the Model Architecture

model = Sequential()

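# First convolution: 6 feature maps, 5x5 kernels, 28x28 -> 24x24
model.add(Conv2D(6, kernel_size=(5, 5), padding='valid', activation='tanh', input_shape=(28, 28, 1)))
# Average pooling halves the spatial size: 24x24 -> 12x12
model.add(AveragePooling2D(pool_size=(2, 2), strides=2, padding='valid'))

# Second convolution: 16 feature maps, 12x12 -> 8x8, then pooling: 8x8 -> 4x4
model.add(Conv2D(16, kernel_size=(5, 5), padding='valid', activation='tanh'))
model.add(AveragePooling2D(pool_size=(2, 2), strides=2, padding='valid'))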

model.add(Flatten())

model.add(Dense(120, activation='tanh'))
model.add(Dense(84, activation='tanh'))
model.add(Dense(10, activation='softmax'))

model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d (Conv2D)             (None, 24, 24, 6)         156       
                                                                 
 average_pooling2d (Average  (None, 12, 12, 6)         0         
 Pooling2D)                                                      
                                                                 
 conv2d_1 (Conv2D)           (None, 8, 8, 16)          2416      
                                                                 
 average_pooling2d_1 (Avera  (None, 4, 4, 16)          0         
 gePooling2D)                                                    
                                                                 
 flatten (Flatten)           (None, 256)               0         
                                                                 
 dense (Dense)               (None, 120)               3

In [6]:
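# Categorical cross-entropy matches the one-hot labels created above
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
# Hold out 10% of the training data for validation
model.fit(x_train, y_train, batch_size=128, epochs=10, validation_split=0.1)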

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.src.callbacks.History at 0x7fbd78b72830>

In [7]:
test_loss, test_acc = model.evaluate(x_test, y_test)
print(f"Test accuracy: {test_acc}")

Test accuracy: 0.98580002784729
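
A single accuracy figure hides which digits get confused with which. One way to get more insight is a confusion matrix, sketched below with NumPy on made-up toy labels. The arrays here are hypothetical and are not results from the model above. With the trained model, the predictions could be obtained as `model.predict(x_test).argmax(axis=1)` and the true labels as `y_test.argmax(axis=1)`.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, num_classes):
    """Rows are true classes, columns are predicted classes."""
    cm = np.zeros((num_classes, num_classes), dtype=int)
    np.add.at(cm, (y_true, y_pred), 1)  # count each (true, predicted) pair
    return cm

# Hypothetical labels for a 3-class toy problem (not real model output).
y_true = np.array([0, 1, 2, 2, 1, 0, 2])
y_pred = np.array([0, 1, 2, 1, 1, 0, 2])

cm = confusion_matrix(y_true, y_pred, num_classes=3)
per_class_acc = cm.diagonal() / cm.sum(axis=1)

print(cm.tolist())                          # [[2, 0, 0], [0, 2, 0], [0, 1, 2]]
print(np.round(per_class_acc, 2).tolist())  # [1.0, 1.0, 0.67]
```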


# TOPIC: Analyzing AlexNet

### 1. Present an overview of the AlexNet architecture.
-    Ans: AlexNet is a Convolutional Neural Network (CNN) architecture that won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2012 and reshaped image classification. 
         It has five convolutional layers, three of which are followed by max pooling, and then three fully connected layers. ReLU activations provide the non-linearity, 
         and dropout in the fully connected layers reduces overfitting. 
         AlexNet popularized GPU training, ReLU activation, and data augmentation for large-scale image classification. 
         Its design laid the foundation for modern deep CNNs.
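
The sketch below traces the spatial size through the feature-extraction layers, assuming the commonly cited 227x227 input and the kernel, stride, and padding values usually quoted for AlexNet. The layer table and helper name are written out for this example only.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Output width of a convolution or pooling layer."""
    return (n + 2 * padding - kernel) // stride + 1

# (name, kernel, stride, padding), values as commonly quoted for AlexNet.
layers = [
    ("conv1 11x11, stride 4", 11, 4, 0),
    ("pool1 3x3, stride 2", 3, 2, 0),
    ("conv2 5x5, pad 2", 5, 1, 2),
    ("pool2 3x3, stride 2", 3, 2, 0),
    ("conv3 3x3, pad 1", 3, 1, 1),
    ("conv4 3x3, pad 1", 3, 1, 1),
    ("conv5 3x3, pad 1", 3, 1, 1),
    ("pool5 3x3, stride 2", 3, 2, 0),
]

size = 227  # assumed input width
sizes = []
for name, kernel, stride, padding in layers:
    size = out_size(size, kernel, stride, padding)
    sizes.append(size)
    print(f"{name:24s} -> {size}x{size}")

flattened = size * size * 256  # conv5 has 256 output channels
print(sizes)      # [55, 27, 27, 13, 13, 13, 13, 6]
print(flattened)  # 9216
```

The pooling windows (3x3) are larger than the stride (2), so neighbouring windows overlap.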

### 2. Explain the architectural innovations introduced in AlexNet that contributed to its breakthrough performance.
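-    Ans: AlexNet was trained across two GPUs, which made a network of its size feasible. It used ReLU activations instead of saturating functions such as tanh, so gradients do not vanish for positive inputs and training converges faster, and it applied data augmentation (random crops, horizontal flips, and changes to RGB intensities) to enlarge the effective training set. It also used dropout in the fully connected layers to reduce overfitting, together with overlapping max pooling and local response normalization. Together these choices improved computational efficiency, enabled a deeper network, and enhanced the model's ability to capture complex features, leading to its success in image classification.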
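
The benefit of ReLU can be seen by comparing derivatives. The sketch below evaluates the gradient of tanh and of ReLU at a few hand-picked sample inputs. For large positive inputs the tanh gradient is close to zero, while the ReLU gradient stays at 1. For negative inputs the ReLU gradient is zero.

```python
import numpy as np

# Sample pre-activation values, chosen only for illustration.
x = np.array([-5.0, -1.0, 0.5, 1.0, 5.0])

tanh_grad = 1.0 - np.tanh(x) ** 2   # d/dx tanh(x)
relu_grad = (x > 0).astype(float)   # d/dx max(0, x), taken as 0 for x <= 0

for xi, tg, rg in zip(x, tanh_grad, relu_grad):
    print(f"x = {xi:5.1f}   tanh' = {tg:.4f}   relu' = {rg:.1f}")
```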

### 3. Discuss the role of convolutional layers, pooling layers, and fully connected layers in AlexNet.
-    Ans: In AlexNet, convolutional layers extract features like edges, textures, and patterns from images. 
          Pooling layers reduce dimensionality while keeping crucial information intact. 
          Fully connected layers merge these features for classification, determining what object the image contains.
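
The sketch below shows how parameters are split between the two kinds of layer. It counts weights and biases for the scaled-down AlexNet-style model implemented in question 4 of this section (3x3 kernels on 28x28x1 inputs, flattening to 3 x 3 x 256 features). Batch normalization parameters are left out, and the helper names are assumptions for this example.

```python
def conv_params(kernel, in_channels, out_channels):
    """Weights plus one bias per output channel."""
    return (kernel * kernel * in_channels + 1) * out_channels

def dense_params(in_units, out_units):
    """Weights plus one bias per output unit."""
    return (in_units + 1) * out_units

conv_total = (
    conv_params(3, 1, 64)
    + conv_params(3, 64, 128)
    + conv_params(3, 128, 256)
    + 2 * conv_params(3, 256, 256)
)

flattened = 3 * 3 * 256  # spatial size 28 -> 26 -> 13 -> 6 -> 3, with 256 channels
dense_total = (
    dense_params(flattened, 4096)
    + dense_params(4096, 4096)
    + dense_params(4096, 10)
)

print(conv_total)   # 1549824
print(dense_total)  # 26263562
```

In this configuration the fully connected layers hold most of the parameters, even though the convolutional layers do the feature extraction.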


### 4. Implement AlexNet using a deep learning framework of your choice and evaluate its performance on a dataset of your choice.

In [3]:
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout, BatchNormalization



In [4]:
# Load and preprocess the MNIST dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()
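# Scale pixels to [0, 1] and make the channel axis explicit: (N, 28, 28) -> (N, 28, 28, 1)
x_train, x_test = x_train[..., None] / 255.0, x_test[..., None] / 255.0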

In [5]:
y_train = tf.keras.utils.to_categorical(y_train, 10)
y_test = tf.keras.utils.to_categorical(y_test, 10)

In [6]:
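# Create an AlexNet-style model, scaled down for 28x28 grayscale MNIST inputs
# (3x3 kernels and batch normalization instead of the original's larger kernels and local response normalization)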

model = Sequential([
    Conv2D(64, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    MaxPooling2D((2, 2)),
    BatchNormalization(),

    Conv2D(128, (3, 3), activation='relu', padding='same'),
    MaxPooling2D((2, 2)),
    BatchNormalization(),

    Conv2D(256, (3, 3), activation='relu', padding='same'),
    BatchNormalization(),

    Conv2D(256, (3, 3), activation='relu', padding='same'),
    BatchNormalization(),

    Conv2D(256, (3, 3), activation='relu', padding='same'),
    MaxPooling2D((2, 2)),
    BatchNormalization(),

    Flatten(),
    Dense(4096, activation='relu'),
    Dropout(0.4),
    BatchNormalization(),

    Dense(4096, activation='relu'),
    Dropout(0.4),
    BatchNormalization(),

    Dense(10, activation='softmax')
])

model.summary()


Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d (Conv2D)             (None, 26, 26, 64)        640       
                                                                 
 max_pooling2d (MaxPooling2  (None, 13, 13, 64)        0         
 D)                                                              
                                                                 
 batch_normalization (Batch  (None, 13, 13, 64)        256       
 Normalization)                                                  
                                                                 
 conv2d_1 (Conv2D)           (None, 13, 13, 128)       73856     
                                                                 
 max_pooling2d_1 (MaxPoolin  (None, 6, 6, 128)         0         
 g2D)                                                            
                                                        

In [7]:
# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

In [8]:
# Train the model
model.fit(x_train, y_train, epochs=3, batch_size=128, validation_split=0.1)

Epoch 1/3
Epoch 2/3
Epoch 3/3


<keras.src.callbacks.History at 0x7f7f368f8d30>

In [9]:
# Evaluate the model
test_loss, test_acc = model.evaluate(x_test, y_test)
print(f"Test accuracy: {test_acc}")

Test accuracy: 0.9861000180244446
