# Building a CNN Architecture

This code snippet demonstrates how to build a Convolutional Neural Network (CNN) with the Keras API (`tensorflow.keras`) in Python. The model stacks three convolutional layers, each followed by a max-pooling layer, then a flattening layer and two fully connected layers.

## Code Explanation

### 1. Input Layer
First, we create an input layer specifying the shape of the input data.

```python
inputs = Input(shape=input_shape)
```
### 2. Define the Model Architecture
We use a Sequential model to define the architecture of the CNN.
```python
model = Sequential()
```

### 3. Convolutional and Max-Pooling Layers

#### Convolutional Layer 1
- **Conv2D**: 32 filters, kernel size of 5x5, activation function `relu`.
- **MaxPooling2D**: Pool size of 2x2.
```python
model.add(Conv2D(32, (5, 5), activation='relu'))
model.add(MaxPooling2D((2, 2)))
```
#### Convolutional Layer 2
- **Conv2D**: 64 filters, kernel size of 5x5, activation function `relu`.
- **MaxPooling2D**: Pool size of 2x2.
```python
model.add(Conv2D(64, (5, 5), activation='relu'))
model.add(MaxPooling2D((2, 2)))
```

#### Convolutional Layer 3
- **Conv2D**: 128 filters, kernel size of 5x5, activation function `relu`.
- **MaxPooling2D**: Pool size of 2x2.
```python
model.add(Conv2D(128, (5, 5), activation='relu'))
model.add(MaxPooling2D((2, 2)))
```

### 4. Flattening and Fully Connected Layers
#### Flattening Layer 
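The flattening layer converts the 3D feature maps (height, width, channels) produced by the last pooling layer into a 1D vector that the fully connected layers can consume.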

```python
model.add(Flatten())
```
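To see what `Flatten` receives, the spatial size can be traced through the three convolution and pooling stages. The sketch below assumes a 64x64 input (the shape used later in this notebook), the Keras defaults of `'valid'` padding and stride 1 for `Conv2D`, and a pooling stride equal to the pool size. `conv_out` and `pool_out` are hypothetical helper names, not Keras functions.

```python
def conv_out(size, kernel, stride=1):
    """Output size of a 'valid' (unpadded) convolution along one dimension."""
    return (size - kernel) // stride + 1


def pool_out(size, pool):
    """Output size of max pooling whose stride equals the pool size."""
    return size // pool


size = 64  # assumed input height and width
trace = [size]
for _ in range(3):  # three Conv2D(5x5) + MaxPooling2D(2x2) stages
    size = conv_out(size, kernel=5)
    trace.append(size)
    size = pool_out(size, pool=2)
    trace.append(size)

flattened = size * size * 128  # 128 channels after the third convolution
print(trace)      # [64, 60, 30, 26, 13, 9, 4]
print(flattened)  # 2048
```

Under these assumptions the first dense layer sees a vector of 2048 values. A smaller input shrinks this quickly, and inputs below roughly 36x36 would leave no spatial extent for the third 5x5 convolution.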
#### Fully Connected Layer
- **Dense**:  512 units, activation function `relu`.
- **Output layer**: 10 units (assuming 10 classes for classification), activation function `softmax`.
```python
model.add(Dense(512, activation='relu'))
model.add(Dense(10, activation='softmax'))
```

### 5. Build the Model

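Finally, we build the model by calling the Sequential stack on the input tensor and wrapping the result in a functional `Model`. This fixes the input shape, so the layer weights can be created.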
```python
model = Model(inputs=inputs, outputs=model(inputs))
```

In [None]:
import os
from tensorflow.keras import layers, models # type: ignore
from tensorflow.keras.models import Sequential, Model # type: ignore
from tensorflow.keras.utils import plot_model # type: ignore
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Flatten, Dense, Dropout # type: ignore

# Define the CNN model
def create_cnn_model(input_shape):
    """Build a three-stage CNN classifier for inputs of shape (height, width, channels)."""

    # Create an Input layer
    inputs = Input(shape=input_shape)
    
    # Define the rest of the model architecture
    model = Sequential()
    
    # Convolutional Layer 1
    model.add(Conv2D(32, (5, 5), activation='relu'))
    model.add(MaxPooling2D((2, 2)))
    
    # Convolutional Layer 2
    model.add(Conv2D(64, (5, 5), activation='relu'))
    model.add(MaxPooling2D((2, 2)))
    
    # Convolutional Layer 3
    model.add(Conv2D(128, (5, 5), activation='relu'))
    model.add(MaxPooling2D((2, 2)))
    
    # Flattening Layer
    model.add(Flatten())
    
    # Fully Connected Layer 1
    model.add(Dense(512, activation='relu'))
    
    # Output Layer
    model.add(Dense(10, activation='softmax'))  # Assuming 10 classes for classification
    
    # Call the Sequential stack on the input tensor and wrap the result in a functional Model
    model = Model(inputs=inputs, outputs=model(inputs))
    
    return model

# Create the model
input_shape = (64, 64, 3)  # Specify the input shape
cnn_model = create_cnn_model(input_shape)
cnn_model.summary()
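
As a sanity check on the summary, the trainable parameter count can be worked out by hand. This is a sketch under stated assumptions: a `(64, 64, 3)` input, the default `'valid'` padding (so the last feature map is 4x4x128, giving 2048 flattened values), and one bias per filter or unit. `conv2d_params` and `dense_params` are hypothetical helper names.

```python
def conv2d_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a square-kernel Conv2D layer."""
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(in_units, units):
    """Weights plus one bias per unit for a Dense layer."""
    return (in_units + 1) * units


counts = {
    "conv1": conv2d_params(5, 3, 32),
    "conv2": conv2d_params(5, 32, 64),
    "conv3": conv2d_params(5, 64, 128),
    "dense": dense_params(4 * 4 * 128, 512),
    "output": dense_params(512, 10),
}
total = sum(counts.values())

for name, n in counts.items():
    print(f"{name:>6}: {n:>9,}")
print(f" total: {total:>9,}")  # 1,312,842
```

Most of the parameters sit in the first dense layer, which is typical for this style of architecture. Because the model wraps a Sequential stack, `summary()` may list that stack as a single nested layer rather than layer by layer.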

ResNet, short for Residual Network, is a deep neural network architecture introduced by Kaiming He et al. in 2015. Deep neural networks often suffer from the vanishing gradient problem: gradients become progressively smaller as they are backpropagated from the output layer toward the input layer. As a result, the earliest layers (closest to the input) receive extremely small updates and learn very slowly or barely at all, which makes very deep networks hard to train.

Deep networks can also face the opposite problem, exploding gradients, where gradients grow exponentially during backpropagation and make training unstable. ResNet eases these optimization difficulties, most directly the vanishing gradient problem, by introducing shortcut connections (also called residual or skip connections) that allow the gradient to bypass one or more layers, as shown in the figure.
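
A scalar toy model can illustrate why an identity shortcut keeps gradients from vanishing. The sketch below is an assumption-laden simplification, not ResNet itself: each "layer" is a single multiplication by a small weight `w`, and `chain_gradient` is a hypothetical helper that applies the chain rule across a stack of such layers.

```python
def chain_gradient(local_derivatives):
    """Multiply per-layer derivatives, as the chain rule does in backpropagation."""
    grad = 1.0
    for d in local_derivatives:
        grad *= d
    return grad


depth = 20
w = 0.1  # assumed small per-layer derivative

# Plain layer: y = w * x, so dy/dx = w
plain_grad = chain_gradient([w] * depth)

# Residual layer: y = x + w * x, so dy/dx = 1 + w
residual_grad = chain_gradient([1.0 + w] * depth)

print(f"plain:    {plain_grad:.3e}")
print(f"residual: {residual_grad:.3f}")
```

The identity term contributes a derivative of 1 at every layer, so the product no longer collapses toward zero. The same toy shows that the product can grow instead, which is one reason practical residual networks pair shortcuts with batch normalization.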

The core idea of ResNet is the use of residual blocks, which consist of two or three layers with a direct (shortcut) connection that skips one or more layers. These shortcut connections help preserve the flow of gradients, making it feasible to train very deep networks.

1. Basic Residual Block: Used in ResNet-18 and ResNet-34, each block has 2 layers, typically 3x3 convolutional layers.
2. Bottleneck Residual Block: Used in ResNet-50 and ResNet-101, each block has 3 layers: a 1x1 convolutional layer that reduces the number of channels, a 3x3 convolutional layer that operates on the reduced representation, and another 1x1 convolutional layer that restores the original number of channels, as shown in the figure (right).
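
The saving from the bottleneck design can be estimated by counting convolution weights. This sketch assumes a block with 256 input and output channels and a bottleneck width of 64 (a reduction by 4, as in the original ResNet design); biases and batch-normalization parameters are ignored, and `conv_weights` is a hypothetical helper.

```python
def conv_weights(kernel, in_channels, out_channels):
    """Number of weights in a square-kernel convolution, ignoring biases."""
    return kernel * kernel * in_channels * out_channels


channels = 256               # assumed block input/output width
bottleneck = channels // 4   # assumed reduced width inside the block

# Basic block: two 3x3 convolutions at full width
basic = 2 * conv_weights(3, channels, channels)

# Bottleneck block: 1x1 reduce, 3x3 at reduced width, 1x1 restore
bottle = (
    conv_weights(1, channels, bottleneck)
    + conv_weights(3, bottleneck, bottleneck)
    + conv_weights(1, bottleneck, channels)
)

print(basic)   # 1179648
print(bottle)  # 69632
```

At equal width the bottleneck block uses roughly 17 times fewer weights, which is what makes 50-layer and deeper variants affordable. The comparison is illustrative only, since the basic-block networks run at narrower widths in practice.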

Shallower networks like ResNet-18 and ResNet-34 are faster and less computationally intensive, making them suitable for tasks where computational resources are limited. Deeper networks like ResNet-50 and ResNet-101 can capture more complex patterns, making them better suited for more challenging tasks but at the cost of higher computational requirements.
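
The names of these variants follow directly from their stage layouts. The block counts below come from the original ResNet paper rather than from this notebook, and `resnet_depth` is a hypothetical helper that counts weighted layers.

```python
# (block type, number of blocks in each of the four stages)
RESNET_STAGES = {
    "ResNet-18": ("basic", [2, 2, 2, 2]),
    "ResNet-34": ("basic", [3, 4, 6, 3]),
    "ResNet-50": ("bottleneck", [3, 4, 6, 3]),
    "ResNet-101": ("bottleneck", [3, 4, 23, 3]),
}
LAYERS_PER_BLOCK = {"basic": 2, "bottleneck": 3}


def resnet_depth(block_type, blocks_per_stage):
    """Count weighted layers: block convolutions + stem conv + final dense layer."""
    return sum(blocks_per_stage) * LAYERS_PER_BLOCK[block_type] + 2


depths = {name: resnet_depth(*cfg) for name, cfg in RESNET_STAGES.items()}
for name, depth in depths.items():
    print(name, depth)
```

ResNet-34 and ResNet-50 share the same 3-4-6-3 layout and differ only in the block type, so the extra depth of ResNet-50 comes entirely from the third layer in each bottleneck block.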

In [None]:
# ResNet from scratch (basic residual blocks in a 3-4-6-3 stage layout, as in ResNet-34)
import tensorflow as tf
from tensorflow.keras import layers # type: ignore

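def resnet_block(inputs, filters, kernel_size, strides):
    """Basic residual block: two conv + batch-norm layers plus a shortcut connection."""
    x = layers.Conv2D(filters, kernel_size, strides=strides, padding='same')(inputs)
    x = layers.BatchNormalization()(x)
    x = layers.Activation('relu')(x)
    x = layers.Conv2D(filters, kernel_size, padding='same')(x)
    x = layers.BatchNormalization()(x)

    # Project the shortcut with a 1x1 convolution whenever the spatial size or the
    # channel count changes, so both branches can be added (assumes channels_last).
    if strides > 1 or inputs.shape[-1] != filters:
        inputs = layers.Conv2D(filters, 1, strides=strides, padding='same')(inputs)

    # Add the shortcut to the residual branch, then apply the final activation
    x = layers.Add()([x, inputs])
    x = layers.Activation('relu')(x)
    return x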

def build_resnet(input_shape, num_classes):
    inputs = tf.keras.Input(shape=input_shape)
    
    x = layers.Conv2D(64, 7, strides=2, padding='same')(inputs)
    x = layers.BatchNormalization()(x)
    x = layers.Activation('relu')(x)
    x = layers.MaxPooling2D(pool_size=3, strides=2, padding='same')(x)
    
    x = resnet_block(x, 64, 3, strides=1)
    x = resnet_block(x, 64, 3, strides=1)
    x = resnet_block(x, 64, 3, strides=1)
    
    x = resnet_block(x, 128, 3, strides=2)
    x = resnet_block(x, 128, 3, strides=1)
    x = resnet_block(x, 128, 3, strides=1)
    x = resnet_block(x, 128, 3, strides=1)
    
    x = resnet_block(x, 256, 3, strides=2)
    x = resnet_block(x, 256, 3, strides=1)
    x = resnet_block(x, 256, 3, strides=1)
    x = resnet_block(x, 256, 3, strides=1)
    x = resnet_block(x, 256, 3, strides=1)
    x = resnet_block(x, 256, 3, strides=1)
    
    x = resnet_block(x, 512, 3, strides=2)
    x = resnet_block(x, 512, 3, strides=1)
    x = resnet_block(x, 512, 3, strides=1)
    
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dense(num_classes, activation='softmax')(x)
    
    model = tf.keras.Model(inputs=inputs, outputs=x)
    return model

# Example usage
input_shape = (224, 224, 3)
num_classes = 1000
resnet_model = build_resnet(input_shape, num_classes)

In [None]:
import tensorflow as tf
from tensorflow.keras import layers # type: ignore

def efficientnet_block(inputs, filters, kernel_size, strides, expand_ratio):
    """Simplified MBConv block: optional 1x1 expansion, depthwise conv, 1x1 projection, optional skip."""
    channel_axis = 1 if tf.keras.backend.image_data_format() == 'channels_first' else -1
    # Read the channel count from the tensor's static shape
    input_filters = inputs.shape[channel_axis]
    expanded_filters = input_filters * expand_ratio
    
    x = inputs
    if expand_ratio != 1:
        x = layers.Conv2D(expanded_filters, 1, padding='same', use_bias=False)(x)
        x = layers.BatchNormalization()(x)
        x = layers.Activation('swish')(x)
    
    x = layers.DepthwiseConv2D(kernel_size, strides=strides, padding='same', use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation('swish')(x)
    
    x = layers.Conv2D(filters, 1, padding='same', use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    
    if strides == 1 and input_filters == filters:
        x = layers.Add()([x, inputs])
    
    return x

def build_efficientnet(input_shape, num_classes, width_coefficient, depth_coefficient, dropout_rate):
    """Build a simplified EfficientNet-style classifier scaled from the B0 stage layout."""
    
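    # Number of filters per stage (stem first, then the seven block stages).
    # Only the 'b0' row is used below; the other rows are not consumed by this function.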
    block_filters = {
        'b0': [32, 16, 24, 40, 80, 112, 192, 320],
        'b1': [32, 16, 24, 40, 80, 112, 192, 320],
        'b2': [32, 16, 24, 48, 88, 120, 208, 352],
        'b3': [40, 24, 32, 64, 112, 160, 272, 464],
        'b4': [48, 24, 32, 64, 112, 160, 272, 464],
        'b5': [48, 24, 40, 80, 160, 224, 384, 640],
        'b6': [56, 32, 40, 80, 160, 224, 384, 640],
        'b7': [64, 32, 48, 96, 192, 256, 448, 768]
    }
    
    # Number of block repeats per stage. Only the 'b0' row is used below.
    block_layers = {
        'b0': [1, 2, 2, 3, 3, 4, 1],
        'b1': [1, 2, 2, 3, 3, 4, 1],
        'b2': [1, 2, 2, 3, 3, 4, 1],
        'b3': [1, 2, 2, 3, 3, 4, 1],
        'b4': [1, 2, 2, 3, 3, 4, 1],
        'b5': [1, 2, 3, 4, 4, 6, 1],
        'b6': [1, 2, 3, 4, 4, 6, 1],
        'b7': [1, 2, 3, 4, 4, 6, 1]
    }
    
    # Scale the B0 filters and repeats by the width and depth coefficients.
    # max(1, ...) keeps every stage non-empty when a coefficient is below 1.
    num_filters = [max(1, int(x * width_coefficient)) for x in block_filters['b0']]
    num_layers = [max(1, int(x * depth_coefficient)) for x in block_layers['b0']]
    
    inputs = tf.keras.Input(shape=input_shape)
    x = inputs
    
    # Stem convolutional layer
    x = layers.Conv2D(num_filters[0], 3, strides=2, padding='same', use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation('swish')(x)
    
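    # Building blocks. Simplifications relative to the reference EfficientNet-B0:
    # every stage after the first downsamples (the reference uses strides 1, 2, 2, 2, 1, 2, 1),
    # all kernels are 3x3, and expand_ratio is fixed at 1 (the reference expands by 6 after the first stage).
    for i in range(7):
        for j in range(num_layers[i]):
            # Downsample only in the first block of each stage, except in the first stage
            strides = 2 if j == 0 and i != 0 else 1
            x = efficientnet_block(x, num_filters[i+1], 3, strides, expand_ratio=1)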
    
    # Head convolutional layer (reuses the last stage width; the reference B0 head uses 1280 filters)
    x = layers.Conv2D(num_filters[-1], 1, padding='same', use_bias=False)(x)
    x = layers.BatchNormalization()(x)
    x = layers.Activation('swish')(x)
    
    # Global average pooling and dropout
    x = layers.GlobalAveragePooling2D()(x)
    if dropout_rate > 0:
        x = layers.Dropout(dropout_rate)(x)
    
    # Output layer
    x = layers.Dense(num_classes, activation='softmax')(x)
    
    model = tf.keras.Model(inputs=inputs, outputs=x)
    return model

# Example usage
input_shape = (224, 224, 3)
num_classes = 1000
width_coefficient = 1.0
depth_coefficient = 1.0
dropout_rate = 0.2

efficientnet_model = build_efficientnet(input_shape, num_classes, width_coefficient, depth_coefficient, dropout_rate)
efficientnet_model.summary()
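
The builder above scales stage widths and repeats by truncating with `int(...)`. Reference EfficientNet implementations instead round widths to a multiple of 8 and round repeats up. The sketch below reproduces that rounding as I understand it from the Keras Applications code, so treat the exact formula as an assumption to verify against the library source; `round_filters` and `round_repeats` are standalone helpers here, not functions called elsewhere in this notebook.

```python
import math

B0_FILTERS = [32, 16, 24, 40, 80, 112, 192, 320]
B0_REPEATS = [1, 2, 2, 3, 3, 4, 1]


def round_filters(filters, width_coefficient, divisor=8):
    """Scale a filter count and round it to the nearest multiple of `divisor`."""
    scaled = filters * width_coefficient
    rounded = max(divisor, int(scaled + divisor / 2) // divisor * divisor)
    # Avoid rounding down by more than 10 percent
    if rounded < 0.9 * scaled:
        rounded += divisor
    return int(rounded)


def round_repeats(repeats, depth_coefficient):
    """Scale a block repeat count and round it up."""
    return int(math.ceil(depth_coefficient * repeats))


# Width coefficient 1.1 is the value commonly quoted for EfficientNet-B2
b2_filters = [round_filters(f, 1.1) for f in B0_FILTERS]
truncated = [int(f * 1.1) for f in B0_FILTERS]

# Depth coefficient 1.1 is the value commonly quoted for EfficientNet-B1
b1_repeats = [round_repeats(r, 1.1) for r in B0_REPEATS]

print(b2_filters)  # [32, 16, 24, 48, 88, 120, 208, 352]
print(truncated)
print(b1_repeats)  # [2, 3, 3, 4, 4, 5, 2]
```

With a width coefficient of 1.1 the rounded widths match the `'b2'` row of `block_filters`, while plain truncation gives widths such as 35 or 123 that are not multiples of 8. Rounding repeats up also guarantees at least one block per stage.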