In [None]:
# Understanding Pooling and Padding in CNN

In [None]:
# question 1

In [None]:
# In a convolutional neural network, pooling layers are typically applied after convolutional layers. Their main purpose is to reduce the spatial size of the feature maps. Pooling layers have no trainable parameters of their own, but smaller feature maps make later layers cheaper to compute and reduce the number of parameters in any fully connected layers that follow.
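
# A minimal sketch of how much pooling shrinks a feature map, assuming no padding. The helper name `pooled_size` is hypothetical, not a library function.

```python
def pooled_size(n, pool, stride):
    """Output length along one axis for pooling without padding."""
    return (n - pool) // stride + 1

# A 28x28 feature map pooled with a 2x2 window and stride 2.
h = w = 28
out_h = pooled_size(h, pool=2, stride=2)
out_w = pooled_size(w, pool=2, stride=2)
print(out_h, out_w)                # 14 14
print(h * w, "->", out_h * out_w)  # 784 -> 196
```

# Each 2x2, stride-2 pooling step keeps a quarter of the activations, so every later layer has four times less input to process.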

In [None]:
# question 2

In [None]:
# Difference between max pooling and min pooling

# Both operations slide a window over the feature map to reduce its spatial dimensions; they differ in which value they keep from each window.
# Max pooling keeps the maximum value in each window. It is the pooling operation most often used in CNNs, and it tends to preserve the strongest activations while discarding weaker ones.
# Min pooling keeps the minimum value in each window. It is much less common than max pooling and is used only in specific cases where the least intense features matter, such as dark details on a light background.
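
# The difference can be seen on a small example. This is a plain-Python sketch, not how Keras implements pooling; `pool2d` and `feature_map` are illustrative names.

```python
def pool2d(fmap, size, stride, reduce_fn):
    """Slide a size x size window over a 2D list and reduce each window."""
    rows = range(0, len(fmap) - size + 1, stride)
    cols = range(0, len(fmap[0]) - size + 1, stride)
    return [
        [reduce_fn(fmap[r + i][c + j] for i in range(size) for j in range(size))
         for c in cols]
        for r in rows
    ]

feature_map = [
    [1, 3, 2, 0],
    [5, 6, 1, 2],
    [7, 2, 9, 4],
    [0, 1, 3, 8],
]

max_pooled = pool2d(feature_map, size=2, stride=2, reduce_fn=max)
min_pooled = pool2d(feature_map, size=2, stride=2, reduce_fn=min)
print(max_pooled)  # [[6, 2], [7, 9]]
print(min_pooled)  # [[1, 0], [0, 3]]
```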

In [None]:
# question 3

In [None]:
# In the context of Convolutional Neural Networks (CNNs), padding means adding extra pixels (usually zeros) around the border of an input image or feature map before the convolution is applied. It matters for several reasons:
# Preservation of spatial dimensions
# Prevention of information loss
# Control over the spatial size of the output
# Mitigation of boundary effects
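
# A minimal sketch of zero padding on a tiny image, written in plain Python for illustration. `zero_pad` is a hypothetical helper, not a Keras function.

```python
def zero_pad(image, pad):
    """Return a copy of a 2D list with `pad` rows/columns of zeros on every side."""
    width = len(image[0]) + 2 * pad
    top_bottom = [[0] * width for _ in range(pad)]
    middle = [[0] * pad + list(row) + [0] * pad for row in image]
    return top_bottom + middle + [row[:] for row in top_bottom]

image = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
padded = zero_pad(image, pad=1)
for row in padded:
    print(row)
# [0, 0, 0, 0, 0]
# [0, 1, 2, 3, 0]
# [0, 4, 5, 6, 0]
# [0, 7, 8, 9, 0]
# [0, 0, 0, 0, 0]
```

# A 3x3 filter slid over the padded 5x5 grid with stride 1 produces a 3x3 output, the same size as the original image.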

In [None]:
# question 4

In [None]:
# Difference between same (zero) padding and valid padding

# Same padding adds just enough zero pixels around the input that, with a stride of 1, the output feature map has the same spatial dimensions as the input. Because the added pixels are zeros, it is often loosely called zero padding. It is useful when the spatial size of the feature maps must be preserved through a stack of convolutional layers.
# Valid padding adds no padding at all. The filter is applied only at positions where it fits entirely inside the input, so the output is smaller than the input: an n x n input and a k x k filter at stride 1 give an (n - k + 1) x (n - k + 1) output. It is commonly used when reducing the spatial dimensions of the feature maps is acceptable or desired.
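
# The output sizes for the two modes can be computed directly. The sketch below follows the Keras convention for `padding='same'` and `padding='valid'`; `conv_output_size` is an illustrative helper name.

```python
import math

def conv_output_size(n, kernel, stride=1, padding="valid"):
    """Output length along one axis for 'same' or 'valid' padding."""
    if padding == "same":
        return math.ceil(n / stride)
    if padding == "valid":
        return (n - kernel) // stride + 1
    raise ValueError("padding must be 'same' or 'valid'")

print(conv_output_size(28, kernel=5, padding="same"))   # 28
print(conv_output_size(28, kernel=5, padding="valid"))  # 24
```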

In [None]:
# Exploring LeNet

In [None]:
# question 1

In [None]:
# LeNet-5 is a pioneering convolutional neural network designed by Yann LeCun, Leon Bottou, Yoshua Bengio, and Patrick Haffner for handwritten and machine-printed character recognition. It was one of the earliest convolutional neural networks and played a crucial role in the development of modern deep learning models, particularly in the field of computer vision. LeNet-5 was introduced in 1998 and has a relatively simple architecture compared to modern CNNs.

In [None]:
# question 2

In [None]:
# The key components of LeNet-5 are as follows:

# Input Layer:

# The input layer of LeNet-5 accepts grayscale images with a fixed size of 32x32 pixels.

# Convolutional Layers:

# LeNet-5 has two convolutional stages, each followed by a subsampling layer. The convolutional layers extract features from the input images using convolution operations with learnable filters.

# Subsampling Layers (Average Pooling Layers):

# Subsampling layers in LeNet-5 perform average pooling, reducing the spatial dimensions of the feature maps while retaining important features. These layers help decrease the sensitivity of the network to small translations in the input image.

# Fully Connected Layers:

# LeNet-5 has three fully connected layers. The first two serve as intermediate feature extractors, while the final one acts as the classifier. Together they let the network learn complex patterns and correlations between the features extracted by the earlier layers.

# Activation Functions:

# LeNet-5 employs nonlinear activation functions, such as the sigmoid or tanh function, after each layer to introduce nonlinearity into the network.

# Output Layer:

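# In modern implementations of LeNet-5, such as the Keras model in question 4, the output layer uses the softmax activation function for multi-class classification, providing a probability for each class label. The original 1998 network used Euclidean radial basis function (RBF) units in its output layer instead.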
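
# The components above can be traced as feature-map shapes. This sketch assumes 5x5 filters with no padding and 2x2 subsampling with stride 2, matching the Keras model in question 4; the helper names are illustrative.

```python
def valid_out(n, window, stride=1):
    """Output length along one axis with no padding."""
    return (n - window) // stride + 1

def lenet5_shapes(input_size):
    """Trace (height, width, channels) through the two conv/subsampling stages."""
    shapes = []
    size = valid_out(input_size, 5)        # C1: 6 filters of 5x5
    shapes.append((size, size, 6))
    size = valid_out(size, 2, stride=2)    # S2: 2x2 average pooling
    shapes.append((size, size, 6))
    size = valid_out(size, 5)              # C3: 16 filters of 5x5
    shapes.append((size, size, 16))
    size = valid_out(size, 2, stride=2)    # S4: 2x2 average pooling
    shapes.append((size, size, 16))
    return shapes

print(lenet5_shapes(32))
# [(28, 28, 6), (14, 14, 6), (10, 10, 16), (5, 5, 16)]
print(lenet5_shapes(28))
# [(24, 24, 6), (12, 12, 6), (8, 8, 16), (4, 4, 16)]
```

# The first call uses the 32x32 input of the original network; the second uses raw 28x28 MNIST images, which is why the flattened vector has 4 * 4 * 16 = 256 values in that case.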

In [None]:
# question 3

In [None]:
# Advantages

# Effective feature extraction, translation invariance, simplicity, early demonstration of CNN potential

# Limitations

# Limited complexity, limited scalability, lack of flexibility, weak performance on large-scale data

In [None]:
# question 4

In [None]:
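# Import everything from tensorflow.keras so that a single Keras installation is used throughout
from tensorflow import keras
from tensorflow.keras.datasets import mnist
from tensorflow.keras.layers import AveragePooling2D, Conv2D, Dense, Flatten
from tensorflow.keras.models import Sequential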

In [None]:
(x_train,y_train),(x_test,y_test)=mnist.load_data()

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


In [None]:
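# Scale pixel values to [0, 1] and add the channel axis that Conv2D expects: (28, 28) -> (28, 28, 1)
x_train = (x_train / 255.0)[..., None]
x_test = (x_test / 255.0)[..., None]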

In [None]:
# One-hot encode the digit labels (10 classes)
y_train = keras.utils.to_categorical(y_train, 10)
y_test = keras.utils.to_categorical(y_test, 10)

In [None]:
model = Sequential()

# C1 + S2: 6 filters of 5x5, then 2x2 average pooling (MNIST images are 28x28, not the 32x32 of the original LeNet-5)
model.add(Conv2D(6, kernel_size=(5, 5), padding='valid', activation='tanh', input_shape=(28, 28, 1)))
model.add(AveragePooling2D(pool_size=(2, 2), strides=2, padding='valid'))

# C3 + S4: 16 filters of 5x5, then 2x2 average pooling
model.add(Conv2D(16, kernel_size=(5, 5), padding='valid', activation='tanh'))
model.add(AveragePooling2D(pool_size=(2, 2), strides=2, padding='valid'))

model.add(Flatten())

# Classifier: two hidden dense layers and a 10-way softmax output
model.add(Dense(120, activation='tanh'))
model.add(Dense(84, activation='tanh'))
model.add(Dense(10, activation='softmax'))

model.summary()




Model: "sequential_5"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_10 (Conv2D)          (None, 24, 24, 6)         156       
                                                                 
 average_pooling2d_10 (Aver  (None, 12, 12, 6)         0         
 agePooling2D)                                                   
                                                                 
 conv2d_11 (Conv2D)          (None, 8, 8, 16)          2416      
                                                                 
 average_pooling2d_11 (Aver  (None, 4, 4, 16)          0         
 agePooling2D)                                                   
                                                                 
 flatten_5 (Flatten)         (None, 256)               0         
                                                                 
 dense_15 (Dense)            (None, 120)              
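
# The parameter counts in the summary can be checked by hand. The sketch below computes them from the layer definitions (weights plus biases); the helper names are illustrative, and the dense-layer figure is computed here, not copied from the truncated summary.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter."""
    return kernel_h * kernel_w * in_channels * filters + filters

def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return inputs * units + units

print(conv2d_params(5, 5, 1, 6))      # 156, first Conv2D layer
print(conv2d_params(5, 5, 6, 16))     # 2416, second Conv2D layer
print(dense_params(4 * 4 * 16, 120))  # 30840, first Dense layer on the 256 flattened values
```

# The pooling and flatten layers report 0 parameters because they have nothing to learn.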

In [None]:
# The labels were one-hot encoded with to_categorical, so use categorical (not sparse) cross-entropy
model.compile(loss='categorical_crossentropy', optimizer='sgd', metrics=['accuracy'])
model.fit(x_train, y_train, batch_size=128, epochs=5, verbose=1, validation_data=(x_test, y_test))
score = model.evaluate(x_test, y_test)

print('Test Loss:', score[0])
print('Test accuracy:', score[1])

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Test Loss: 0.2705730199813843
Test accuracy: 0.9225999712944031


In [None]:
# Analyzing AlexNet

In [None]:
# question 1

In [None]:
# AlexNet is a seminal deep learning model that made significant strides in the field of computer vision, particularly in image classification tasks. Developed by Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, it won the ImageNet Large Scale Visual Recognition Challenge in 2012, and its architecture laid the groundwork for the subsequent development of deep learning models.

In [None]:
# question 2

In [None]:
# AlexNet combined several innovations, in both its architecture and its training procedure, that drove its breakthrough result in the 2012 ImageNet challenge and strongly influenced later deep neural networks. The key ones include:
# Deep architecture, rectified linear unit (ReLU) activations, local response normalization, data augmentation, dropout regularization
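
# One reason ReLU helps deep networks train is that its gradient does not shrink for large positive inputs, whereas the gradient of tanh does. The sketch below illustrates this in plain Python; the function names are illustrative.

```python
import math

def relu(x):
    """Rectified linear unit: max(0, x)."""
    return max(0.0, x)

def relu_grad(x):
    """Derivative of ReLU (taken as 0 at x == 0)."""
    return 1.0 if x > 0 else 0.0

def tanh_grad(x):
    """Derivative of tanh: 1 - tanh(x)^2."""
    return 1.0 - math.tanh(x) ** 2

# Compare the two gradients as the input grows.
for x in (0.5, 2.0, 5.0):
    print(x, relu_grad(x), round(tanh_grad(x), 5))
```

# The ReLU gradient stays at 1 for every positive input, while the tanh gradient falls toward 0 as the input grows.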

In [None]:
# question 3

In [None]:
# Convolutional Layers:

# AlexNet contains five convolutional layers, each responsible for learning and extracting increasingly complex features from the input images. These layers apply convolution operations to the input data, using learnable filters to detect patterns and features at different levels of abstraction. The depth and breadth of these convolutional layers enabled the network to learn intricate representations of the input images, facilitating improved classification accuracy.

# Pooling Layers:

# AlexNet places max pooling layers after the first, second, and fifth convolutional layers to downsample the feature maps, reducing the spatial dimensions and the computational cost of the network. Pooling also makes the extracted features more robust, so the network is less sensitive to where a feature appears within the input image.

# Fully Connected Layers:

# After the convolutional and pooling layers, AlexNet includes three fully connected layers. The first two fully connected layers have 4096 nodes each, while the last fully connected layer has 1000 nodes. These layers act as high-level feature extractors and classifiers, enabling the network to make predictions based on the complex representations learned from the earlier layers. The last fully connected layer uses the softmax activation function to generate a probability distribution over the 1000 ImageNet classes, facilitating the classification of input images into different categories.
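
# The layer stack above can be traced as feature-map shapes. The filter sizes, strides, padding, channel counts, and the 227x227 input below are assumptions taken from the commonly cited single-stream description of AlexNet; none of them appear in the text above.

```python
def out_size(n, window, stride=1, pad=0):
    """Output length along one axis for a convolution or pooling window."""
    return (n + 2 * pad - window) // stride + 1

# (name, window, stride, pad, output channels); pooling keeps the channel count.
# Assumed values: the commonly cited single-stream AlexNet configuration.
LAYERS = [
    ("conv1", 11, 4, 0, 96),
    ("pool1", 3, 2, 0, None),
    ("conv2", 5, 1, 2, 256),
    ("pool2", 3, 2, 0, None),
    ("conv3", 3, 1, 1, 384),
    ("conv4", 3, 1, 1, 384),
    ("conv5", 3, 1, 1, 256),
    ("pool5", 3, 2, 0, None),
]

size, channels = 227, 3  # assumed input resolution and RGB channels
for name, window, stride, pad, out_channels in LAYERS:
    size = out_size(size, window, stride, pad)
    channels = out_channels or channels
    print(f"{name}: {size}x{size}x{channels}")

flattened = size * size * channels
print("flattened:", flattened)  # 9216, feeding the 4096 -> 4096 -> 1000 stack
```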