# Facial Keypoints Detection

##### W207 Spring 2022 Section 10 
##### Team 4: Eric Sun, Jiayi Hu, Sridhar Chadalavada

### Introduction
We are using the Kaggle data set for [Facial Keypoints Detection](https://www.kaggle.com/c/facial-keypoints-detection/overview) to experiment with and build models that detect the locations of up to 15 keypoints (30 x and y coordinates) on images of faces. Facial keypoint detection has a variety of applications, such as image tagging, biometrics, and psychological or clinical diagnosis.

Most of the facial keypoints come in left/right pairs mirrored across the sagittal midline of the face, for example the centers of the left and right eyes; a smaller number lie on the midline itself, such as the tip of the nose. We will explore convolutional neural networks (CNNs) and the effect of tuning hyperparameters on score and overfitting, and compare them against other models that may be less performant.

#### Data
The Kaggle data includes a labeled Training set and an unlabeled Test set. Because we do not have access to the Test labels, the Test predictions of our final model will be scored within Kaggle, which ranks our submission against those of other participants by the root mean squared error (RMSE) between the predicted and actual values. The Training data contains 7,049 images, which we will divide into Training and Development sets; the Test data contains 1,783 images. Each image is a 96x96 grid of grayscale pixel values in the range [0, 255], and each keypoint is defined by an x and y position in that grid.
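
The scoring metric described above can be sketched in a few lines. This is a minimal NumPy illustration, not Kaggle's scoring code; the function name `rmse` and the toy coordinates are hypothetical.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean squared error over all keypoint coordinates."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Toy example: two (x, y) keypoints on a 96x96 grid,
# with one prediction off by 3 pixels in x and another off by 4 pixels in y
truth = [[66.0, 39.0], [30.0, 36.0]]
pred = [[69.0, 39.0], [30.0, 40.0]]
print(rmse(truth, pred))
```

Because the error is squared before averaging, a few badly misplaced keypoints cost more than many slightly misplaced ones.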

#### Internal Project Milestones
3/13: Baseline Submission  
3/20: Individual research and analysis into CNN modeling and transformation  
3/27: Identify chosen model parameters and experiments and merge for notebook report  
4/3: Complete final notebook  
4/10: Complete final presentation  
4/14: Final deliverable and in-class presentation  

### Initialization

In [None]:
import numpy as np
import pandas as pd
import tensorflow as tf
import matplotlib.pyplot as plt
from ipywidgets import interactive
import tensorflowjs as tfjs

%matplotlib inline
pd.options.display.width = 800

In [None]:
id_lookup_table = pd.read_csv('./IdLookupTable.csv')

train = pd.read_csv('./training.csv')
print('Initial Training', train.shape)

test = pd.read_csv('./test.csv', index_col=0)
print('Initial Test', test.shape)

### Exploratory Data Analysis (EDA)

In [None]:
# View data

train.head().T

In [None]:
# Missing values

train.isnull().sum().plot(kind='bar')
plt.ylim(0, len(train))
plt.show()

# Different options for handling missing keypoint values.
# Working on a copy (rather than inplace) lets us switch strategy later
# without reloading the original dataset.

# train_no_na = train.dropna(axis=0).copy().reset_index(drop=True)
# train_no_na = train.fillna(value=0).copy()
# fillna(method='ffill') is deprecated in recent pandas; ffill() is the equivalent call
train_no_na = train.ffill().copy()
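
The three strategies listed in the cell above behave quite differently, which a toy frame makes visible. This is a minimal sketch: the frame `toy` and its values are made up, with two column names chosen in the style of the keypoint columns.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame standing in for two keypoint columns
toy = pd.DataFrame({
    'left_eye_center_x': [66.0, np.nan, 65.0, np.nan],
    'nose_tip_y': [57.0, 61.0, np.nan, 55.0],
})

dropped = toy.dropna(axis=0).reset_index(drop=True)  # keeps only fully labeled rows
zero_filled = toy.fillna(value=0)                     # replaces missing values with 0
forward_filled = toy.ffill()                          # copies the previous row's value

print(len(dropped), len(zero_filled), len(forward_filled))
print(forward_filled)
```

Dropping rows discards images, zero filling pulls targets toward the image corner, and forward filling borrows a label from an unrelated face, so each choice biases training in its own way.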


In [None]:
# Parse the space-separated pixel strings into arrays (modifies `target` in place)

def format_string_into_list(target, from_col):
	"""Add an 'image_data' column of integer pixel arrays parsed from `from_col`."""
	target['image_data'] = target[from_col].map(lambda x: np.array([int(y) for y in x.split(' ')]))
	# Sanity checks: every image has the same length and pixels lie in [0, 255]
	assert len(target['image_data'].map(len).unique()) == 1, f'Missing or uneven lengths in image data: {target.image_data.map(len)}'
	assert min(target['image_data'].map(min)) >= 0, 'Negative values in image data'
	assert max(target['image_data'].map(max)) < 256, 'Unexpectedly large values in image data'

In [None]:
format_string_into_list(train_no_na, 'Image')

In [None]:
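# Show a training face with its labeled keypoints

def plot_face(index):
	plt.imshow(train_no_na.at[index, 'image_data'].reshape(96, 96), cmap='gray')
	# Drop the last two columns ('Image' and 'image_data'); the rest are x, y pairs
	keypoints = train_no_na.iloc[index, :-2].to_numpy().astype(float)
	plt.scatter(keypoints[0::2], keypoints[1::2])
	plt.xticks([])
	plt.yticks([])
	plt.show()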

interactive(plot_face, index=train_no_na.index)

### Model Building

In [None]:
model = tf.keras.Sequential([
	tf.keras.layers.Flatten(input_shape=(96, 96, 1)),
	tf.keras.layers.Dense(128, activation="relu"),
	tf.keras.layers.Dropout(0.1),
	tf.keras.layers.Dense(64, activation="relu"),
	tf.keras.layers.Dense(30)
])

In [None]:
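# This is a regression task, so track RMSE (the Kaggle metric) rather than accuracy
model.compile(optimizer='adam', loss='mse', metrics=[tf.keras.metrics.RootMeanSquaredError()])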

In [None]:
# Images: (n_images, 96, 96, 1), matching the model's declared input shape
x = np.concatenate(train_no_na.image_data.to_numpy()).reshape(-1, 96, 96, 1)
# Targets: every column except the trailing 'Image' and 'image_data'
y = train_no_na.iloc[:, :-2].to_numpy(dtype=float)
print(x.shape, y.shape)
model.fit(x, y, epochs=5)
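
The Data section mentions dividing the labeled images into Training and Development sets. One way to do that is sketched below with NumPy only; the helper `train_dev_split`, the 20% hold-out fraction, and the stand-in arrays are assumptions for illustration, not choices made in this notebook.

```python
import numpy as np

def train_dev_split(x, y, dev_fraction=0.2, seed=0):
    """Shuffle row indices once, then hold out `dev_fraction` of rows as a development set."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(x))
    n_dev = int(len(x) * dev_fraction)
    dev_idx, train_idx = order[:n_dev], order[n_dev:]
    return x[train_idx], y[train_idx], x[dev_idx], y[dev_idx]

# Stand-in arrays shaped like a batch of images and their 30 keypoint coordinates
x_demo = np.zeros((10, 96, 96, 1))
y_demo = np.zeros((10, 30))
x_tr, y_tr, x_dev, y_dev = train_dev_split(x_demo, y_demo)
print(x_tr.shape, x_dev.shape)
```

The resulting arrays could then be passed to Keras as `model.fit(x_tr, y_tr, validation_data=(x_dev, y_dev), epochs=5)`, so that the development loss is reported after every epoch and overfitting becomes visible.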


### Analysis of Predictions

In [None]:
format_string_into_list(test, 'Image')

In [None]:
# Test images: (n_images, 96, 96, 1), matching the model's declared input shape
x = np.concatenate(test.image_data.to_numpy()).reshape(-1, 96, 96, 1)
y = model.predict(x)

In [None]:
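# Show a test face with its predicted keypoints

def plot_face2(index):
	plt.imshow(test.at[index, 'image_data'].reshape(96, 96), cmap='gray')
	# `index` is an ImageId label, while `y` is positional, so convert before indexing
	predicted_kp = y[test.index.get_loc(index)]
	plt.scatter(predicted_kp[0::2], predicted_kp[1::2], color='red')
	plt.xticks([])
	plt.yticks([])
	plt.show()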

interactive(plot_face2, index=test.index)
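
The lookup table loaded during initialization is what ties predictions back to a Kaggle submission. The sketch below shows one way this mapping might look. It assumes the lookup table has `RowId`, `ImageId`, and `FeatureName` columns, that the submission expects `RowId` and `Location`, and that `ImageId` starts at 1; the helper `build_submission` and the clipping to the image bounds are hypothetical additions.

```python
import numpy as np
import pandas as pd

def build_submission(lookup, predictions, feature_names):
    """Pick the predicted coordinate for each (ImageId, FeatureName) row of the lookup table."""
    col_of = {name: i for i, name in enumerate(feature_names)}
    rows = lookup['ImageId'].to_numpy() - 1              # assumes ImageId starts at 1
    cols = lookup['FeatureName'].map(col_of).to_numpy()  # column position of each feature
    locations = np.clip(predictions[rows, cols], 0, 96)  # keep points on the 96x96 image
    return pd.DataFrame({'RowId': lookup['RowId'].to_numpy(), 'Location': locations})

# Toy lookup table and predictions for two images and two features
lookup_demo = pd.DataFrame({
    'RowId': [1, 2, 3],
    'ImageId': [1, 1, 2],
    'FeatureName': ['nose_tip_x', 'nose_tip_y', 'nose_tip_y'],
})
preds_demo = np.array([[48.0, 60.0], [50.0, 100.0]])
submission = build_submission(lookup_demo, preds_demo, ['nose_tip_x', 'nose_tip_y'])
print(submission)
```

With the real data, `feature_names` would be the keypoint column names of the training frame in the same order as the model's 30 outputs, and the result could be written out with `submission.to_csv(..., index=False)`.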

### Export model into TFJS

In [None]:
tfjs.converters.save_keras_model(model, 'tfjs_model', weight_shard_size_bytes=999999999)