# CNN Architecture

![Image](https://github.com/user-attachments/assets/94b7727e-2d99-495a-bba0-863442ad9fad)

![Image](https://github.com/user-attachments/assets/8c714f36-da69-4722-ab5e-5b39ba313838)

# Different Types of CNN Architectures
> **ImageNet** is a large visual database designed for visual object recognition research. It contains millions of labeled images and is widely used to train and evaluate machine learning models.

Convolutional neural networks (CNNs) come in many architectures, each designed for different tasks and constraints. Popular ones include:


1. **LeNet-5**: One of the earliest CNN architectures, designed for handwritten digit recognition.
2. **AlexNet**: Won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2012, known for its deep architecture and use of ReLU activation.
3. **VGGNet**: Known for its simplicity and use of very small (3x3) convolution filters.
4. **GoogLeNet (Inception)**: Introduced the Inception module, which uses far fewer parameters than earlier networks such as AlexNet.
5. **ResNet**: Introduced residual learning, allowing for very deep networks by using skip connections.
6. **DenseNet**: Uses dense connections between layers, improving gradient flow and parameter efficiency.
7. **MobileNet**: Designed for mobile and embedded vision applications, known for its efficiency and lightweight architecture.

Each architecture has its own strengths and is suited for different types of tasks and computational constraints.

![Image](https://github.com/user-attachments/assets/7caa9325-4834-492a-bdfa-3e033d6c28e9)


## LeNet-5 Architecture

LeNet-5 is one of the earliest convolutional neural network (CNN) architectures, designed by Yann LeCun and his colleagues in 1998 for handwritten digit recognition (MNIST dataset). The architecture consists of the following layers:

1. **Input Layer**: 32x32 grayscale image.
2. **C1 - Convolutional Layer**: 6 filters of size 5x5, stride 1, followed by a tanh activation (the original paper uses a scaled hyperbolic tangent as its sigmoid-shaped squashing function). Output size: 28x28x6.
3. **S2 - Subsampling Layer**: Average pooling with a 2x2 filter and stride 2. Output size: 14x14x6.
4. **C3 - Convolutional Layer**: 16 filters of size 5x5, stride 1, followed by a tanh activation. Output size: 10x10x16.
5. **S4 - Subsampling Layer**: Average pooling with a 2x2 filter and stride 2. Output size: 5x5x16.
6. **C5 - Convolutional Layer**: 120 filters of size 5x5, followed by a tanh activation. Because the input is also 5x5, each filter covers the whole feature map, so this layer is equivalent to a fully connected layer. Output size: 1x1x120.
7. **F6 - Fully Connected Layer**: 84 units, followed by a tanh activation.
8. **Output Layer**: 10 units (one for each digit 0-9). Modern implementations use a softmax activation; the original paper used Euclidean radial basis function (RBF) units.

LeNet-5 was groundbreaking at the time and laid the foundation for many modern CNN architectures.
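As a rough cross-check of the layer list above, the sketch below counts trainable parameters per layer in plain Python. It rests on two assumptions that match common modern re-implementations rather than the original paper: every C3 filter connects to all 6 input maps (the paper used a sparser connection scheme), and the pooling layers have no trainable parameters. The helper names `conv_params` and `dense_params` are illustrative only.

```python
def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a square-kernel convolution."""
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(in_units, out_units):
    """Weights plus one bias per output unit for a fully connected layer."""
    return (in_units + 1) * out_units


lenet5_params = {
    "C1": conv_params(5, 1, 6),
    "C3": conv_params(5, 6, 16),    # assumes full connectivity to all 6 maps
    "C5": conv_params(5, 16, 120),  # same count as a dense layer on 400 inputs
    "F6": dense_params(120, 84),
    "Output": dense_params(84, 10),
}

for name, count in lenet5_params.items():
    print(f"{name}: {count}")
print("Total:", sum(lenet5_params.values()))
```

Under these assumptions the total comes to 61,706 parameters, most of them in C5.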

![Image](https://github.com/user-attachments/assets/33bd18e5-66d1-4928-a9fe-a6100c6999df)

## Transition from $(32, 32, 1) \rightarrow (28, 28, 6)$

The transition happens due to the application of **6 convolutional filters** of size **$5 \times 5$**, without padding. Let's break it down step by step:

### 1. Input Image  
- **Size**: $(32, 32, 1)$, a grayscale image (1 channel).

### 2. Convolutional Layer  
- **Filter Size**: $5 \times 5$  
- **Number of Filters**: 6  
- **Stride**: 1  
- **Padding**: None (Valid Convolution)  

The output size of a convolution without padding is:

$$
\text{Output Size} = \left\lfloor \frac{\text{Input Size} - \text{Filter Size}}{\text{Stride}} \right\rfloor + 1
$$

Applying this to the spatial dimensions:

$$
\frac{32 - 5}{1} + 1 = 28
$$

So, after applying 6 filters, the new shape becomes **$(28, 28, 6)$**.

- The **height** and **width** are reduced to **$28 \times 28$**.
- The **depth** (number of channels) increases to **6**, as each filter extracts different features.
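The same no-padding formula applies to every convolution and pooling step in LeNet-5. The minimal sketch below chains it through C1, S2, C3, S4, and C5 using the filter sizes and strides listed earlier. The function name `output_size` is an illustrative choice, and integer division stands in for rounding down when the stride does not divide evenly.

```python
def output_size(input_size, filter_size, stride=1):
    """Spatial output size of a convolution or pooling window without padding."""
    return (input_size - filter_size) // stride + 1


size = 32
sizes = [size]
# (filter size, stride) for C1, S2, C3, S4, C5 in LeNet-5
for filter_size, stride in [(5, 1), (2, 2), (5, 1), (2, 2), (5, 1)]:
    size = output_size(size, filter_size, stride)
    sizes.append(size)

print(sizes)  # [32, 28, 14, 10, 5, 1]
```

Only the height and width follow this formula; the depth after each convolution is simply the number of filters.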


# Practical: LeNet-5 in Keras

```python
# Keras building blocks used by the LeNet-5 model below
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import AveragePooling2D, Conv2D, Dense, Flatten
```

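```python
model = Sequential()

# C1 + S2: 6 filters of 5x5, then 2x2 average pooling -> (14, 14, 6)
model.add(Conv2D(6, kernel_size=(5, 5), padding='valid', activation='tanh', input_shape=(32, 32, 1)))
model.add(AveragePooling2D(pool_size=(2, 2), strides=2, padding='valid'))

# C3 + S4: 16 filters of 5x5, then 2x2 average pooling -> (5, 5, 16)
model.add(Conv2D(16, kernel_size=(5, 5), padding='valid', activation='tanh'))
model.add(AveragePooling2D(pool_size=(2, 2), strides=2, padding='valid'))

# Flatten the 5x5x16 feature maps into a vector of 400 values
model.add(Flatten())

# C5 (written as a dense layer), F6, and the 10-way softmax output
model.add(Dense(120, activation='tanh'))
model.add(Dense(84, activation='tanh'))
model.add(Dense(10, activation='softmax'))
```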

```python
# Print each layer's output shape and parameter count
model.summary()
```
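MNIST digits are 28x28, while the model above expects 32x32 input. One common way to bridge the gap is to zero-pad each image by 2 pixels on every side. The sketch below shows that step with NumPy on a random stand-in batch instead of the real dataset; `fake_batch` and `lenet_input` are hypothetical names used only for illustration.

```python
import numpy as np

# Stand-in for a batch of 8 MNIST-sized grayscale images (values in [0, 1))
rng = np.random.default_rng(0)
fake_batch = rng.random((8, 28, 28)).astype("float32")

# Zero-pad height and width by 2 pixels on each side: 28 + 2 + 2 = 32
padded = np.pad(fake_batch, ((0, 0), (2, 2), (2, 2)), mode="constant")

# Add the trailing channel axis expected by Conv2D
lenet_input = padded[..., np.newaxis]

print(lenet_input.shape)  # (8, 32, 32, 1)
```

An array shaped like `lenet_input` could then be passed to `model.predict` or, together with labels, to `model.fit` after compiling the model.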