**<h2>Introduction to Convolutional Neural Networks (CNNs)</h2>**

Convolutional Neural Networks (CNNs) are designed to process data that has a known grid-like topology, such as images, which can be interpreted as a 2D grid of pixels. CNNs are particularly effective for image and video recognition tasks due to their ability to capture spatial hierarchies and local patterns.

**<h2>Key Components of CNNs</h2>**
CNNs consist of several types of layers, each serving a specific purpose in the network. Below, we describe these layers in detail, along with code examples for building, fitting, and predicting using these layers.



**<h2>1. Convolutional Layers (Conv1D, Conv2D, Conv3D)</h2>**

Purpose: To extract features from the input data by applying convolution operations with learnable filters.
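To make the convolution operation concrete, here is a minimal pure-Python sketch of a 1D convolution with stride 1 and no padding. The helper name `conv1d_valid` and the kernel values are illustrative assumptions; a real Keras layer learns its kernel weights and also adds a bias and an activation.

```python
def conv1d_valid(signal, kernel):
    """Slide `kernel` over `signal` with stride 1 and no padding."""
    f = len(kernel)
    return [
        sum(signal[i + j] * kernel[j] for j in range(f))
        for i in range(len(signal) - f + 1)
    ]


signal = [1, 2, 4, 7, 11]
edge_kernel = [-1, 0, 1]  # hand-picked weights; in a CNN these are learned
features = conv1d_valid(signal, edge_kernel)
print(features)  # [3, 5, 7]
```

The output has `len(signal) - len(kernel) + 1` entries, which is why a 64-step input and a size-3 kernel give 62 output steps.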



**Conv1D Example**

Used for sequential data (e.g., time series, text).

In [1]:
import tensorflow as tf
from tensorflow.keras import layers, models

# Generate dummy data
import numpy as np
X_train = np.random.random((100, 64, 1))  # 100 samples, 64 timesteps, 1 feature
y_train = np.random.randint(2, size=(100, 1))

# Build a simple Conv1D model
model = models.Sequential([
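    # Output length = floor((n - f + p) / s) + 1, with p = f - 1 for padding="same" and p = 0 for "valid".
    # Here padding="valid" (the default): floor((64 - 3 + 0) / 1) + 1 = 62 timesteps.
    layers.Conv1D(32, 3, activation='relu', input_shape=(64, 1)),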
    layers.MaxPooling1D(2),
    layers.Flatten(),
    layers.Dense(1, activation='sigmoid')
])

# Compile the model
model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

model.summary()

# Fit the model
model.fit(X_train, y_train, epochs=10, batch_size=32)

# Predict
predictions = model.predict(X_train)
print(predictions)

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv1d (Conv1D)             (None, 62, 32)            128       
                                                                 
 max_pooling1d (MaxPooling1  (None, 31, 32)            0         
 D)                                                              
                                                                 
 flatten (Flatten)           (None, 992)               0         
                                                                 
 dense (Dense)               (None, 1)                 993       
                                                                 
Total params: 1121 (4.38 KB)
Trainable params: 1121 (4.38 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
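The numbers in this summary can be reproduced by hand. The following sketch redoes the arithmetic in plain Python; the variable names are illustrative and not part of Keras.

```python
n, f, s, p = 64, 3, 1, 0        # timesteps, kernel size, stride, 'valid' padding
in_channels, filters = 1, 32

conv_len = (n - f + p) // s + 1                 # Conv1D output length
conv_params = (f * in_channels + 1) * filters   # kernel weights + one bias per filter
pool_len = conv_len // 2                        # MaxPooling1D(2)
flat = pool_len * filters                       # Flatten
dense_params = (flat + 1) * 1                   # Dense(1): weights + bias
total = conv_params + dense_params

print(conv_len, conv_params, pool_len, flat, dense_params, total)
# 62 128 31 992 993 1121
```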


**Conv2D Example**

Used for image data.

In [2]:
# Generate dummy data
X_train = np.random.random((100, 64, 64, 3))  # 100 samples, 64x64 pixels, 3 color channels
y_train = np.random.randint(10, size=(100, 1))

# Build a simple Conv2D model
model = models.Sequential([
    # padding="valid" (the default), stride 1: floor((64 - 3 + 0) / 1) + 1 = 62
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(10, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.summary()

# Fit the model
model.fit(X_train, y_train, epochs=10, batch_size=32)

# Predict
predictions = model.predict(X_train)
print(predictions)



Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d (Conv2D)             (None, 62, 62, 32)        896       
                                                                 
 max_pooling2d (MaxPooling2  (None, 31, 31, 32)        0         
 D)                                                              
                                                                 
 flatten_1 (Flatten)         (None, 30752)             0         
                                                                 
 dense_1 (Dense)             (None, 10)                307530    
                                                                 
Total params: 308426 (1.18 MB)
Trainable params: 308426 (1.18 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________


In [3]:
# Generate dummy data
X_train = np.random.random((100, 64, 64, 3))  # 100 samples, 64x64 pixels, 3 color channels
y_train = np.random.randint(10, size=(100, 1))

# Build a simple Conv2D model
model = models.Sequential([
    # padding="same", stride 1: p = f - 1 = 2, so floor((64 - 3 + 2) / 1) + 1 = 64
    layers.Conv2D(32, (3, 3), padding="same", activation='relu', input_shape=(64, 64, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(10, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.summary()



Model: "sequential_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_1 (Conv2D)           (None, 64, 64, 32)        896       
                                                                 
 max_pooling2d_1 (MaxPoolin  (None, 32, 32, 32)        0         
 g2D)                                                            
                                                                 
 flatten_2 (Flatten)         (None, 32768)             0         
                                                                 
 dense_2 (Dense)             (None, 10)                327690    
                                                                 
Total params: 328586 (1.25 MB)
Trainable params: 328586 (1.25 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________


In [5]:
# Generate dummy data
X_train = np.random.random((100, 64, 64, 3))  # 100 samples, 64x64 pixels, 3 color channels
y_train = np.random.randint(10, size=(100, 1))

# Build a simple Conv2D model
model = models.Sequential([
    # padding="same", stride 2: p = f - 1 = 2, so floor((64 - 3 + 2) / 2) + 1 = 32
    layers.Conv2D(32, (3, 3), padding="same", strides=(2, 2), activation='relu', input_shape=(64, 64, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(10, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.summary()



Model: "sequential_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_2 (Conv2D)           (None, 32, 32, 32)        896       
                                                                 
 max_pooling2d_2 (MaxPoolin  (None, 16, 16, 32)        0         
 g2D)                                                            
                                                                 
 flatten_3 (Flatten)         (None, 8192)              0         
                                                                 
 dense_3 (Dense)             (None, 10)                81930     
                                                                 
Total params: 82826 (323.54 KB)
Trainable params: 82826 (323.54 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
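The three Conv2D variants above differ only in padding and stride. As a rough check, the sketch below recomputes the shapes and Dense parameter counts reported in the three summaries. `conv_output_size` is a hypothetical helper written for this comparison, not a Keras function.

```python
def conv_output_size(n, f, s=1, padding="valid"):
    """floor((n - f + p) / s) + 1, with p = f - 1 for 'same' and 0 for 'valid'."""
    p = f - 1 if padding == "same" else 0
    return (n - f + p) // s + 1


variants = {
    "valid, stride 1": dict(s=1, padding="valid"),
    "same, stride 1": dict(s=1, padding="same"),
    "same, stride 2": dict(s=2, padding="same"),
}

results = {}
for name, kwargs in variants.items():
    conv = conv_output_size(64, 3, **kwargs)   # Conv2D(32, (3, 3), ...)
    pooled = conv // 2                         # MaxPooling2D((2, 2))
    flat = pooled * pooled * 32                # Flatten over 32 feature maps
    dense = (flat + 1) * 10                    # Dense(10): weights + biases
    results[name] = (conv, pooled, flat, dense)
    print(f"{name}: conv={conv}, pooled={pooled}, flat={flat}, dense params={dense}")
```

The Conv2D layer itself has 896 parameters in all three cases, because padding and stride change the output size but not the kernel.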


**<h2>2. Pooling Layers</h2>**

Purpose: To reduce the spatial dimensions of the feature maps, making the computation more efficient and reducing the risk of overfitting.
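As an illustration of what a 2x2 max pooling step does to a single feature map, here is a small pure-Python sketch. The helper `max_pool_2x2` is hypothetical; it mimics non-overlapping 2x2 windows with stride 2 and drops a trailing odd row or column.

```python
def max_pool_2x2(grid):
    """Keep the largest value in each non-overlapping 2x2 window."""
    rows, cols = len(grid) // 2, len(grid[0]) // 2
    return [
        [
            max(
                grid[2 * r][2 * c], grid[2 * r][2 * c + 1],
                grid[2 * r + 1][2 * c], grid[2 * r + 1][2 * c + 1],
            )
            for c in range(cols)
        ]
        for r in range(rows)
    ]


feature_map = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 9, 8],
    [6, 2, 3, 7],
]
pooled = max_pool_2x2(feature_map)
print(pooled)  # [[4, 5], [6, 9]]
```

Each spatial dimension is halved, so the 4x4 map becomes 2x2 while the strongest activation in each window is kept.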

**MaxPooling2D Example**

In [6]:
# Generate dummy data
X_train = np.random.random((100, 64, 64, 3))
y_train = np.random.randint(10, size=(100, 1))

# Build a model with MaxPooling2D
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 3)),
    layers.MaxPooling2D((2, 2)),  # output size = floor(n / 2); the stride defaults to the pool size
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(10, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.summary()

# Fit the model
model.fit(X_train, y_train, epochs=10, batch_size=32)

# Predict
predictions = model.predict(X_train)
print(predictions)


Model: "sequential_4"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_3 (Conv2D)           (None, 62, 62, 32)        896       
                                                                 
 max_pooling2d_3 (MaxPoolin  (None, 31, 31, 32)        0         
 g2D)                                                            
                                                                 
 conv2d_4 (Conv2D)           (None, 29, 29, 64)        18496     
                                                                 
 max_pooling2d_4 (MaxPoolin  (None, 14, 14, 64)        0         
 g2D)                                                            
                                                                 
 flatten_4 (Flatten)         (None, 12544)             0         
                                                                 
 dense_4 (Dense)             (None, 10)               
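The shapes in this stacked model can be traced layer by layer. The sketch below is a plain-Python recomputation under the assumption of 3x3 kernels, 'valid' padding, and 2x2 pooling, as in the cell above. The Dense parameter count it prints is derived from the formula, not copied from the notebook output.

```python
size, channels = 64, 3
trace = []
for filters in (32, 64):
    params = (3 * 3 * channels + 1) * filters  # 3x3 kernel over all input channels, plus bias
    size = size - 3 + 1                        # Conv2D, 'valid' padding, stride 1
    trace.append(("conv", size, filters, params))
    size = size // 2                           # MaxPooling2D((2, 2))
    trace.append(("pool", size, filters, 0))
    channels = filters

flat = size * size * channels                  # Flatten
dense_params = (flat + 1) * 10                 # Dense(10)

for kind, out_size, out_channels, params in trace:
    print(kind, (out_size, out_size, out_channels), params)
print("flatten", flat, "dense params", dense_params)
```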

**<h2>3. Upsampling Layers</h2>**

Purpose: To increase the spatial dimensions of the feature maps, useful for tasks like image segmentation.
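A minimal sketch of what upsampling by a factor of 2 does to one feature map, assuming nearest-neighbor interpolation (the `UpSampling2D` default). The helper `upsample_nearest` is hypothetical and written only for illustration.

```python
def upsample_nearest(grid, factor=2):
    """Repeat every value `factor` times along both rows and columns."""
    out = []
    for row in grid:
        wide = [value for value in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


small = [[1, 2],
         [3, 4]]
big = upsample_nearest(small)
for row in big:
    print(row)
# [1, 1, 2, 2]
# [1, 1, 2, 2]
# [3, 3, 4, 4]
# [3, 3, 4, 4]
```

No values are learned here: the layer only repeats existing entries, which is why it contributes zero parameters.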

**UpSampling2D Example**

In [8]:
# Generate dummy data
X_train = np.random.random((100, 32, 32, 3))
y_train = np.random.randint(10, size=(100, 1))

# Build a model with UpSampling2D
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(32, 32, 3)),
    layers.UpSampling2D((2, 2)),  # output size = 2 * n (rows and columns are repeated)
    layers.Conv2D(64, (3, 3), activation='relu'),
    layers.Flatten(),
    layers.Dense(10, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.summary()


# Fit the model
model.fit(X_train, y_train, epochs=10, batch_size=32)

# Predict
predictions = model.predict(X_train)
print(predictions)

Model: "sequential_5"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_5 (Conv2D)           (None, 30, 30, 32)        896       
                                                                 
 up_sampling2d (UpSampling2  (None, 60, 60, 32)        0         
 D)                                                              
                                                                 
 conv2d_6 (Conv2D)           (None, 58, 58, 64)        18496     
                                                                 
 flatten_5 (Flatten)         (None, 215296)            0         
                                                                 
 dense_5 (Dense)             (None, 10)                2152970   
                                                                 
Total params: 2172362 (8.29 MB)
Trainable params: 2172362 (8.29 MB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________

**<h2>4. Global Average Pooling Layers</h2>**

Purpose: To reduce each feature map to a single value by taking the average of all its values.
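To show the reduction concretely, here is a pure-Python sketch of global average pooling over a tiny channels-last tensor of shape (height, width, channels). The function name `global_average_pool` and the sample values are illustrative assumptions.

```python
def global_average_pool(feature_maps):
    """feature_maps[h][w][c] -> one average per channel."""
    height, width = len(feature_maps), len(feature_maps[0])
    channels = len(feature_maps[0][0])
    totals = [0.0] * channels
    for row in feature_maps:
        for pixel in row:
            for c, value in enumerate(pixel):
                totals[c] += value
    return [total / (height * width) for total in totals]


# A 2x2 spatial grid with 2 channels
maps = [
    [[1.0, 10.0], [2.0, 20.0]],
    [[3.0, 30.0], [6.0, 40.0]],
]
pooled = global_average_pool(maps)
print(pooled)  # [3.0, 25.0]
```

The output length equals the number of channels regardless of the spatial size, which is why the layer after it needs far fewer weights than one placed after `Flatten`.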

**GlobalAveragePooling2D Example**

In [9]:
# Generate dummy data
X_train = np.random.random((100, 64, 64, 3))
y_train = np.random.randint(10, size=(100, 1))

# Build a model with GlobalAveragePooling2D
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(64, 64, 3)),
    layers.GlobalAveragePooling2D(),  # output shape: (batch_size, channels)
    layers.Dense(10, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.summary()

# Fit the model
model.fit(X_train, y_train, epochs=10, batch_size=32)

# Predict
predictions = model.predict(X_train)
print(predictions)


Model: "sequential_6"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_7 (Conv2D)           (None, 62, 62, 32)        896       
                                                                 
 global_average_pooling2d (  (None, 32)                0         
 GlobalAveragePooling2D)                                         
                                                                 
 dense_6 (Dense)             (None, 10)                330       
                                                                 
Total params: 1226 (4.79 KB)
Trainable params: 1226 (4.79 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________


**<h2>5. Transfer Learning with Pre-trained Models</h2>**

Purpose: To leverage pre-trained models for new tasks, reducing the need for large amounts of data and computational resources.

**Example using VGG16**

In [None]:
from tensorflow.keras.applications import VGG16

# Load pre-trained VGG16 model without the top layer
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

# Freeze the layers of the base model
for layer in base_model.layers:
    layer.trainable = False

# Generate dummy data
X_train = np.random.random((100, 224, 224, 3))
y_train = np.random.randint(10, size=(100, 1))

# Add custom layers on top of the base model
model = models.Sequential([
    base_model,
    layers.GlobalAveragePooling2D(),
    layers.Dense(256, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(10, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])

# Fit the model
model.fit(X_train, y_train, epochs=10, batch_size=32)

# Predict
predictions = model.predict(X_train)
print(predictions)
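Because the base model is frozen, only the custom head is trained. A rough count of those trainable parameters is sketched below, assuming the VGG16 convolutional base ends with 512 feature maps. That channel count is an assumption stated here, not something shown in the cell above, and `dense_params` is a hypothetical helper.

```python
def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return (inputs + 1) * units


base_channels = 512  # assumption: depth of the last VGG16 convolutional block

# GlobalAveragePooling2D and Dropout have no parameters.
head_dense_1 = dense_params(base_channels, 256)   # Dense(256)
head_dense_2 = dense_params(256, 10)              # Dense(10)
trainable = head_dense_1 + head_dense_2

print(head_dense_1, head_dense_2, trainable)  # 131328 2570 133898
```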


**<h2>Number of Parameters and Output Shape</h2>**

**Output Size Calculation**

The output size of a convolutional layer is calculated using the formula:

$$ \text{Output Size} = \left\lfloor \frac{n - f + p}{s} \right\rfloor + 1 $$

Where:
- $ n $ is the input size
- $ f $ is the filter size
- $ p $ is the total padding added along the dimension, counting both sides ($ p = f - 1 $ if padding is 'same' and $ p = 0 $ if padding is 'valid')
- $ s $ is the stride size
- $ \left\lfloor \cdot \right\rfloor $ denotes taking the floor of the value
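One way to sanity-check the formula: with $ p = f - 1 $ it reduces to $ \lfloor (n - 1)/s \rfloor + 1 $, which equals $ \lceil n/s \rceil $, so 'same' padding depends only on the input size and the stride. The sketch below verifies this over a range of values; `conv_output_size` is a hypothetical helper, not a Keras API.

```python
import math


def conv_output_size(n, f, s=1, padding="valid"):
    """floor((n - f + p) / s) + 1, with p = f - 1 for 'same' and 0 for 'valid'."""
    p = f - 1 if padding == "same" else 0
    return (n - f + p) // s + 1


# For 'same' padding the result never depends on the filter size.
mismatches = [
    (n, f, s)
    for n in range(1, 65)
    for f in (1, 3, 5, 7)
    for s in (1, 2, 3, 4)
    if conv_output_size(n, f, s, "same") != math.ceil(n / s)
]
print(len(mismatches))                     # 0
print(conv_output_size(32, 3, 1, "same"))  # 32
print(conv_output_size(64, 3, 1, "valid")) # 62
```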

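For layers like `Conv2DTranspose`, the operation runs in the opposite direction, so the spatial size grows. Solving the formula above for the input size (ignoring the floor) gives:

$$ s \times (n - 1) + f - p $$

In Keras, with no `output_padding`, `Conv2DTranspose` produces $ s \times (n - 1) + f $ for 'valid' padding (when $ f \geq s $) and $ s \times n $ for 'same' padding, which is the largest size that the forward formula maps back to $ n $.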
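The default Keras output length of a transposed convolution can be sketched as a small helper. This assumes no `output_padding` and no dilation; `transposed_conv_output_size` and `conv_output_size` are hypothetical names used for illustration only.

```python
def transposed_conv_output_size(n, f, s=1, padding="valid"):
    """Output length of a transposed convolution under Keras defaults."""
    if padding == "same":
        return n * s                    # 'same': exactly s times larger
    return n * s + max(f - s, 0)        # 'valid': the kernel overhang is kept


def conv_output_size(n, f, s=1, padding="valid"):
    """Forward convolution size, used here only to check the round trip."""
    p = f - 1 if padding == "same" else 0
    return (n - f + p) // s + 1


up_same = transposed_conv_output_size(16, 3, s=2, padding="same")
up_valid = transposed_conv_output_size(16, 3, s=2, padding="valid")
print(up_same, up_valid)  # 32 33

# Applying the forward convolution to the upsampled size recovers 16.
print(conv_output_size(up_same, 3, 2, "same"), conv_output_size(up_valid, 3, 2, "valid"))  # 16 16
```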

**Parameter Count**

The number of parameters in a convolutional layer depends on the number of channels in the previous layer, because each filter spans every input channel and has a separate set of weights for each one. For example, suppose the current layer has 2 filters, the previous layer has 3 channels, and the filter size is $ (2, 2) $:

- For each filter, there are $ 3 \times (2 \times 2) $ parameters to learn, plus 1 bias term.
- The total parameters for one filter are $ 3 \times (2 \times 2) + 1 $.
- The same applies to the second filter.

The general formula for the total number of parameters is:

$$ (\text{filter height} \times \text{filter width} \times \text{number of channels in previous layer} + 1) \times \text{number of filters} $$
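The formula can be checked against the parameter counts printed in the model summaries earlier in this notebook. The helper below is a hypothetical sketch that generalizes the formula to any kernel rank so it also covers `Conv1D`.

```python
def conv_params(kernel_shape, in_channels, filters, use_bias=True):
    """(product of kernel dims * input channels + 1) * filters."""
    weights = in_channels * filters
    for k in kernel_shape:
        weights *= k
    return weights + (filters if use_bias else 0)


print(conv_params((3,), 1, 32))      # 128   -> Conv1D(32, 3) on 1 feature
print(conv_params((3, 3), 3, 32))    # 896   -> Conv2D(32, (3, 3)) on RGB input
print(conv_params((3, 3), 32, 64))   # 18496 -> Conv2D(64, (3, 3)) on 32 channels
print(conv_params((2, 2), 3, 2))     # 26    -> the 2-filter example above
```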

**Note**

Pooling layers do not have parameters to learn.

**Example Calculation**
Let's consider an example to illustrate the calculations:

1. **Output Size Calculation**:
    - Input size ($ n $): 32
    - Filter size ($ f $): 3
    - Padding ($ p $): 2 (for 'same' padding)
    - Stride ($ s $): 1

$$ \text{Output Size} = \left\lfloor \frac{32 - 3 + 2}{1} \right\rfloor + 1 = \left\lfloor 31 \right\rfloor + 1 = 32 $$

2. **Parameter Count**:
    - Filter size: $ (3, 3) $
    - Previous layer channels: 3
    - Current layer filters: 2

For one filter:

$$ 3 \times (3 \times 3) + 1 = 3 \times 9 + 1 = 27 + 1 = 28 $$

For two filters:

$$ 28 \times 2 = 56 $$

**Summary**

- **Output Size Calculation**: $ \left\lfloor \frac{n - f + p}{s} \right\rfloor + 1 $
- **Inverse Operation**: $ s \times (n - 1) + f - p $ for `Conv2DTranspose` (Keras produces $ s \times n $ for 'same' padding)
- **Parameter Count**: $ (\text{filter height} \times \text{filter width} \times \text{number of channels in previous layer} + 1) \times \text{number of filters} $
- **Pooling Layers**: No parameters to learn