# Deep Learning Project

**Group:** Songbird  
**Members:** Charlotte de Vries, Jiazhen Tang, Paulo Zirlis

In [None]:
# Setup block (packages)

## 1. Project Overview



## 2. Dataset Description

In [None]:
# Load data

***

## 2. Neural Network Models
### 2.1 Custom CNN
Paulo

The custom Convolutional Neural Network (CNN) designed for this project consists mostly of convolutional blocks, each followed by a pooling layer, and ends with fully connected layers. Dropout and early stopping were added to limit overfitting. Batch normalization was used to improve training speed and stability.
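As a rough illustration of the early stopping rule mentioned above, the sketch below stops training once the validation loss has failed to improve for a set number of epochs. The function name `early_stopping_epoch`, the `patience` and `min_delta` defaults, and the loss values are all assumptions made for illustration; they are not the settings used in this project.

```python
def early_stopping_epoch(val_losses, patience=5, min_delta=0.0):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training stops at the first epoch where the loss has not improved by
    more than `min_delta` for `patience` consecutive epochs. Epochs are
    0-indexed. `patience` and `min_delta` are illustrative defaults.
    """
    best_loss = float("inf")
    best_epoch = -1
    waited = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            # New best validation loss: remember it and reset the counter.
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    # Never triggered: training ran for all available epochs.
    return len(val_losses) - 1, best_epoch


# Hypothetical validation losses, one per epoch.
losses = [1.0, 0.8, 0.7, 0.72, 0.71, 0.73, 0.69]
stop_epoch, best_epoch = early_stopping_epoch(losses, patience=3)
```

In Keras, the `keras.callbacks.EarlyStopping` callback provides this behaviour through its `monitor`, `patience`, and `restore_best_weights` arguments.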

The architecture is as follows:

**Input and Data Augmentation**
- Input layer: shape (256, 256, 1)
- Data Augmentation: Random rotations and horizontal flips

**First Convolutional Block**
- Conv2D layer: 32 filters, 3x3 kernel, stride of 1, same padding
- Batch Normalization
- ReLU Activation
- MaxPooling2D layer: 2x2 pool size, stride of 2

**Other Convolutional Blocks**
- Three further blocks, identical to the first but with 64, 128, and 256 filters respectively

**Head**
- Flatten layer
- Dense layer: 512 units, ReLU activation
- Dropout layer: 0.3 dropout rate
- Dense layer: 5 units (number of classes), Softmax activation
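To sanity-check the architecture listed above, the following sketch traces the feature-map shape after each convolutional block and counts trainable parameters in plain Python. The helper `trace_shapes` is hypothetical and not part of the project code. It assumes the pooling layers use Keras' default `valid` padding and takes the 512-unit dense layer from the list.

```python
def trace_shapes(input_size=256, channels=1, filters=(32, 64, 128, 256),
                 pool_size=2, pool_stride=2, dense_units=512, num_classes=5):
    """Trace output shapes and trainable parameters block by block.

    Returns (block_shapes, flatten_size, total_trainable_params).
    """
    shapes = []
    size, in_ch = input_size, channels
    total = 0
    for out_ch in filters:
        # 3x3 convolution with 'same' padding and stride 1 keeps the spatial size.
        conv_params = 3 * 3 * in_ch * out_ch + out_ch
        # Batch normalization learns one scale and one shift per channel.
        bn_params = 2 * out_ch
        # Max pooling with 'valid' padding shrinks the feature map.
        size = (size - pool_size) // pool_stride + 1
        shapes.append((size, size, out_ch))
        total += conv_params + bn_params
        in_ch = out_ch
    flatten_size = size * size * in_ch
    total += flatten_size * dense_units + dense_units   # hidden dense layer
    total += dense_units * num_classes + num_classes    # softmax output layer
    return shapes, flatten_size, total


shapes, flatten_size, total_params = trace_shapes()
for block, shape in enumerate(shapes, start=1):
    print(f"Block {block}: {shape}")
print("Flatten size:", flatten_size)
print("Trainable parameters:", total_params)
```

With a pooling stride of 2, each block halves the spatial resolution (256 to 128, 64, 32, and 16), so the flattened vector has 16 x 16 x 256 = 65,536 features. Calling `trace_shapes(pool_stride=1)` shows why the stride matters: the feature maps barely shrink and the flattened vector grows to more than 16 million features.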

In [None]:
### CNN Architecture

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Seed Python, NumPy and TensorFlow for reproducibility (requires TF >= 2.7)
keras.utils.set_random_seed(42)

# Custom CNN
CNN = keras.Sequential([

    # Input: 256x256 single-channel images
    keras.Input(shape=(256, 256, 1)),

    # Data Augmentation (only active during training)
    layers.RandomFlip("horizontal"),  # flip images horizontally
    layers.RandomRotation(0.1),       # rotate by up to ±10% of a full turn (±36°)


    # First Convolutional Block
    layers.Conv2D(filters=32, kernel_size=3, strides=1, padding='same'),
    layers.BatchNormalization(),
    layers.Activation('relu'),
    layers.MaxPool2D(pool_size=2, strides=2),  # halves height and width

    # Second Convolutional Block
    layers.Conv2D(filters=64, kernel_size=3, strides=1, padding='same'),
    layers.BatchNormalization(),
    layers.Activation('relu'),
    layers.MaxPool2D(pool_size=2, strides=2),

    # Third Convolutional Block
    layers.Conv2D(filters=128, kernel_size=3, strides=1, padding='same'),
    layers.BatchNormalization(),
    layers.Activation('relu'),
    layers.MaxPool2D(pool_size=2, strides=2),

    # Fourth Convolutional Block
    layers.Conv2D(filters=256, kernel_size=3, strides=1, padding='same'),
    layers.BatchNormalization(),
    layers.Activation('relu'),
    layers.MaxPool2D(pool_size=2, strides=2),

    # Classifier Head
    layers.Flatten(),
    layers.Dense(units=512, activation='relu'),  # 512 units, as in the architecture description
    layers.Dropout(0.3),
    layers.Dense(units=5, activation='softmax')  # 5 classes
])

In [None]:
### Train and Evaluate CNN



***

### 2.2 Pre-trained Residual Network
Charlotte

In [None]:
# Code for ResNet

### 2.3 Pre-trained Vision Transformer
Jiazhen

In [None]:
# Code for ViT