# CNN Regressor for Head Pose Estimation

## Overview

This Jupyter notebook builds a Convolutional Neural Network (CNN) regressor that estimates head pose from images. Head pose is described by three angles: yaw, pitch, and roll. The model is trained on a subset derived from the 300W-LP dataset that keeps only these head pose attributes.

## Dataset Preparation

### Dataset Source
The original dataset is the [300W-LP dataset](http://www.cbsr.ia.ac.cn/users/xiangyuzhu/projects/3DDFA/main.htm). It contains various features, but for this project only the head pose angles (yaw, pitch, roll) were extracted.

### Dataset Generation
A subset of the 300W-LP dataset was created using the notebook provided in this [gist](https://gist.github.com/mani3/1ec02066cb11df85cfc694cab9230bc3#file-generate-dataset-from-the300w-lp-public-ipynb). The gist notebook extracts the head pose attributes and prepares the data for training.

Only a subset of the newly generated dataset is used: 20,000 RGB images of 128×128 pixels, each labeled with three angles. The regressor is trained on the yaw angle alone.
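
As a rough illustration of training on yaw alone, the sketch below picks one column out of a three-column label array. The column index (`YAW_COLUMN = 0`) and the random stand-in data are assumptions; check the column order actually produced by the dataset-generation notebook before relying on it.

```python
import numpy as np

# Stand-in for the real label array: one row per image, three pose angles per row.
rng = np.random.default_rng(0)
labels = rng.uniform(-90.0, 90.0, size=(8, 3))

# Assumption: columns are ordered (yaw, pitch, roll), so yaw is column 0.
YAW_COLUMN = 0

# Select the yaw column and keep it two-dimensional, one target per sample.
yaw_targets = labels[:, YAW_COLUMN].reshape(-1, 1)

print(yaw_targets.shape)
```

Keeping the targets as a column vector makes them line up with a regressor that emits one value per image.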

### Dataset Splitting
The dataset was split into three subsets:
- **Training Set**: 60%
- **Testing Set**: 20%
- **Validation Set**: 20%
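
One way to produce a 60/20/20 split is to shuffle the sample indices once and slice them. This is a minimal sketch and not necessarily how the notebook does it; the helper `split_60_20_20`, the seed, and the toy arrays are hypothetical.

```python
import numpy as np


def split_60_20_20(x, y, seed=0):
    """Shuffle once, then slice into 60% train, 20% test, 20% validation."""
    n = len(x)
    order = np.random.default_rng(seed).permutation(n)

    # Integer arithmetic avoids floating-point rounding surprises.
    n_train = (n * 60) // 100
    n_test = (n * 20) // 100

    train_idx = order[:n_train]
    test_idx = order[n_train:n_train + n_test]
    val_idx = order[n_train + n_test:]  # whatever remains

    return (
        (x[train_idx], y[train_idx]),
        (x[test_idx], y[test_idx]),
        (x[val_idx], y[val_idx]),
    )


# Toy data standing in for the image and yaw arrays.
images = np.zeros((100, 8, 8, 3), dtype=np.float32)
yaw = np.arange(100, dtype=np.float32)

(train_x, train_y), (test_x, test_y), (val_x, val_y) = split_60_20_20(images, yaw)
print(len(train_x), len(test_x), len(val_x))
```

Fixing the seed keeps the three subsets identical across runs, which makes validation and test metrics comparable between experiments.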

## Model Architecture

The CNN regressor is built using the following architecture:

```python
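# Assumed from the dataset shape (20000, 128, 128, 3)
input_shape = (128, 128, 3)

model = Sequential([
    # Five convolution + max-pooling stages, doubling the filter count each time
    Conv2D(32, (3, 3), activation='relu', input_shape=input_shape),
    MaxPooling2D(pool_size=(2, 2)),

    Conv2D(64, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),

    Conv2D(128, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),

    Conv2D(256, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),

    Conv2D(512, (3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),

    # Fully connected head with dropout for regularization
    Flatten(),
    Dense(256, activation='relu'),
    Dropout(0.5),
    Dense(128, activation='relu'),
    Dropout(0.5),

    # Single linear output unit: the predicted yaw angle
    Dense(1)
])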
    
model.compile(optimizer=Adam(learning_rate=0.001),
              loss=MeanSquaredError(),
              metrics=['mae'])
```

## Import the necessary dependencies

```python
import numpy as np
import pandas as pd
import tensorflow as tf
import matplotlib.pyplot as plt

# To install TensorFlow with the Metal plugin on macOS, see
# https://developer.apple.com/metal/tensorflow-plugin/
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (
    BatchNormalization, Conv2D, Dense, Dropout, Flatten, MaxPool2D, MaxPooling2D,
)
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.losses import MeanSquaredError
from tensorflow.keras.preprocessing.image import (
    ImageDataGenerator, array_to_img, img_to_array, load_img,
)
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau
from tensorflow.keras.regularizers import l2

from sklearn.metrics import (
    classification_report, confusion_matrix, ConfusionMatrixDisplay,
    mean_absolute_error, mean_squared_error, r2_score,
)

```

```python
# Load the preprocessed images and labels
with open('../dataset/data_20000.npy', 'rb') as data, open('../dataset/label_20000.npy', 'rb') as label:
    dataset = np.load(data)
    labels = np.load(label)

print(dataset.shape)
print(labels.shape)
```

Output:

```text
(20000, 128, 128, 3)
(20000, 3)
```