# Deep Learning Project

**Group:** Songbird  
**Members:** Charlotte de Vries, Jiazhen Tang, Paulo Zirlis

In [None]:
# Setup block (packages)

## 1. Project Overview



## 2. Dataset Description

In [None]:
# Load data

***

## 2. Neural Network Models
### 2.1 Custom CNN

The custom Convolutional Neural Network (CNN) designed for this project consists of four convolutional blocks followed by a classifier head with global average pooling, a fully connected layer, dropout, and a softmax output layer. Each convolutional block contains a convolution layer, batch normalization, a ReLU activation, and max pooling. Batch normalization was used to improve training speed and stability. Dropout was included in the classifier head to reduce overfitting, and early stopping was added during training to halt it once the validation loss stops improving, which further limits overfitting.

The architecture is as follows:

**Input and Data Augmentation**
- Input layer: shape (256, 256, 1)
- Data Augmentation: random horizontal flips and random rotations of up to ±10% of a full turn (±36°)

**First Convolutional Block**
- Conv2D layer: 32 filters, 3x3 kernel, stride of 1, same padding
- Batch Normalization
- ReLU Activation
- MaxPooling2D layer: 2x2 pool size, stride of 2

**Other Convolutional Blocks**
- Same as the first block, but with an increasing number of filters (64, 128, 256)

**Classifier Head**
- Global Average Pooling layer
- Dense layer: 64 units, ReLU activation
- Dropout layer: 0.3 dropout rate
- Dense layer: 5 units (number of classes), Softmax activation
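
As a sanity check on the architecture listed above, the following sketch traces the feature-map size through the four blocks and tallies the parameter count with plain arithmetic. The helper names (`conv_params`, `batchnorm_params`, `dense_params`) are illustrative and not part of the project code. The formulas assume a convolution with bias, batch normalization with four parameters per channel, and a dense layer with bias.

```python
# Shape and parameter bookkeeping for the architecture described above.
# Pure-Python arithmetic; no TensorFlow required. Helper names are illustrative.

def conv_params(in_channels, filters, kernel=3):
    # kernel*kernel*in_channels weights per filter, plus one bias per filter
    return (kernel * kernel * in_channels + 1) * filters

def batchnorm_params(channels):
    # gamma and beta (trainable) + moving mean and variance (non-trainable)
    return 4 * channels

def dense_params(in_units, out_units):
    # one weight per input/output pair, plus one bias per output unit
    return in_units * out_units + out_units

size, channels = 256, 1   # input: 256 x 256 x 1
total = 0
non_trainable = 0
for filters in (32, 64, 128, 256):
    total += conv_params(channels, filters) + batchnorm_params(filters)
    non_trainable += 2 * filters  # batch-norm moving statistics
    size //= 2                    # 'same' conv keeps the size; 2x2 pooling with stride 2 halves it
    channels = filters
    print(f"after block with {filters:>3} filters: {size} x {size} x {channels}")

# Global average pooling reduces each feature map to one value,
# so the first dense layer receives `channels` inputs.
total += dense_params(channels, 64) + dense_params(64, 5)
print("total parameters:    ", total)
print("trainable parameters:", total - non_trainable)
```

Under these assumptions the final feature map is 16 x 16 x 256 and the model has 406,533 parameters, 960 of which (the batch-normalization moving statistics) are non-trainable. `CNN.summary()` should report the same totals if the layers are built as listed.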

In [6]:
### CNN Architecture

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Seed Python, NumPy, and TensorFlow random generators for reproducibility
keras.utils.set_random_seed(42)

# Custom CNN
CNN = keras.Sequential([
    
    # Input
    layers.InputLayer(shape=[256, 256, 1]),
    
    # Data Augmentation
    layers.RandomFlip("horizontal"), # flip images horizontally at random
    layers.RandomRotation(0.1),      # rotate by a random angle within ±10% of a full turn (±36°)


    # First Convolutional Block
    layers.Conv2D(filters=32, kernel_size=3, strides=1, padding='same'),
    layers.BatchNormalization(),
    layers.Activation('relu'),
    layers.MaxPool2D(pool_size=2, strides=2),

    # Second Convolutional Block
    layers.Conv2D(filters=64, kernel_size=3, strides=1, padding='same'),
    layers.BatchNormalization(),
    layers.Activation('relu'),
    layers.MaxPool2D(pool_size=2, strides=2),

    # Third Convolutional Block
    layers.Conv2D(filters=128, kernel_size=3, strides=1, padding='same'),
    layers.BatchNormalization(),
    layers.Activation('relu'),
    layers.MaxPool2D(pool_size=2, strides=2),

    # Fourth Convolutional Block
    layers.Conv2D(filters=256, kernel_size=3, strides=1, padding='same'),
    layers.BatchNormalization(),
    layers.Activation('relu'),
    layers.MaxPool2D(pool_size=2, strides=2),

    # Classifier Head
    layers.GlobalAveragePooling2D(),
    layers.Dense(units=64, activation='relu'),
    layers.Dropout(0.3),
    layers.Dense(units=5, activation='softmax')  # 5 classes
])

CNN.summary()

Model: "sequential_1"
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━┓
┃ Layer (type)                         ┃ Output Shape                ┃         Param # ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━┩
│ random_flip_3 (RandomFlip)           │ (None, 256, 256, 1)         │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ random_rotation_3 (RandomRotation)   │ (None, 256, 256, 1)         │               0 │
├──────────────────────────────────────┼─────────────────────────────┼─────────────────┤
│ conv2d_4 (Conv2D)                    │ (None, 256, 25
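
The 5-unit softmax head outputs one probability per class, and the training cell pairs it with `sparse_categorical_crossentropy`, which expects integer class labels rather than one-hot vectors. The sketch below works through that computation in plain Python for a single example. The logits are made-up values, and the functions are simplified illustrations (Keras additionally clips probabilities for numerical safety), not the library's implementation.

```python
import math

def softmax(logits):
    # Subtract the maximum before exponentiating for numerical stability
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def sparse_categorical_crossentropy(label, probs):
    # `label` is an integer class index, not a one-hot vector
    return -math.log(probs[label])

logits = [2.0, 0.5, 0.1, -1.0, 0.0]   # hypothetical scores for the 5 classes
probs = softmax(logits)

loss_if_class_0 = sparse_categorical_crossentropy(0, probs)  # label matches the top score
loss_if_class_3 = sparse_categorical_crossentropy(3, probs)  # label has the lowest score

print("probabilities:", [round(p, 3) for p in probs])
print("loss when true class is 0:", round(loss_if_class_0, 3))
print("loss when true class is 3:", round(loss_if_class_3, 3))

# A model that guesses uniformly over 5 classes has loss ln(5)
uniform_loss = sparse_categorical_crossentropy(0, [0.2] * 5)
print("loss of a uniform guess:", round(uniform_loss, 3))
```

The loss is small when the probability assigned to the true class is high and grows as that probability shrinks. The uniform-guess value, ln(5), is a useful baseline when reading the first epochs of the training log.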

In [None]:
### Train and Evaluate CNN
from tensorflow.keras.callbacks import EarlyStopping

# Compile the model
CNN.compile(
    optimizer = 'adam',
    loss = 'sparse_categorical_crossentropy',
    metrics = ['accuracy']
)

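# Define early stopping on the validation loss
# Note: with epochs = 25, patience = 20 can only end training early if
# val_loss stops improving within the first few epochs.
early_stopping = EarlyStopping(
    monitor = 'val_loss',   # Keras default, stated explicitly
    min_delta = 0.001,
    patience = 20,
    restore_best_weights = True
)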

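# Fit the model
# Note: if `train` and `valid` are already-batched datasets or generators,
# the Keras documentation advises leaving `batch_size` unset, since the
# data pipeline then defines the batches.
history = CNN.fit(
    train,
    validation_data = valid,
    batch_size = 32,
    epochs = 25,
    callbacks = [early_stopping],
    verbose = 1
)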
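
To make the interaction between `patience`, `min_delta`, and `epochs` concrete, here is a minimal pure-Python sketch of the patience rule. It is a simplified re-implementation for illustration, not Keras source code, and the validation losses are invented: they improve for 8 epochs and then plateau.

```python
def early_stopping_epoch(val_losses, patience, min_delta=0.0):
    """Return the 1-based epoch at which training would stop,
    or None if training runs through every epoch.

    Simplified patience rule (an illustration, not the Keras implementation).
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:   # an improvement must exceed min_delta
            best = loss
            wait = 0
        else:
            wait += 1                 # another epoch without improvement
            if wait >= patience:
                return epoch
    return None

# Hypothetical validation losses over 25 epochs: 8 improving epochs, then a plateau
losses = [1.0 - 0.1 * i for i in range(8)] + [0.35] * 17

stop_with_patience_20 = early_stopping_epoch(losses, patience=20, min_delta=0.001)
stop_with_patience_5 = early_stopping_epoch(losses, patience=5, min_delta=0.001)

print("patience = 20:", stop_with_patience_20)
print("patience = 5: ", stop_with_patience_5)
```

With these invented losses, `patience = 20` never triggers within 25 epochs (the function returns `None`), while `patience = 5` stops at epoch 13, five epochs after the best value at epoch 8. A patience close to the epoch budget therefore leaves early stopping with little room to act.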

***

### 2.2 Pre-trained Residual Network
Charlotte

In [None]:
# Code for ResNet

### 2.3 Pre-trained Vision Transformer
Jiazhen

In [None]:
# Code for ViT