## Experiment Overview

Before running the experiment, we preprocess the images as described in [the 2015 VGGFace paper](http://www.robots.ox.ac.uk/~vgg/publications/2015/Parkhi15/parkhi15.pdf). This includes a duplicates filter that uses a nearest neighbors search to find and remove repeated images and identities. The preprocessing follows these passages, quoted directly from the paper:


>The mean value of each channel is subtracted for each pixel.

...

> face descriptor is extracted from the layer adjacent to the classifier layer. This leads to a 2048 dimensional descriptor, which is then L2 normalised. 

...

>the template vector is obtained by averaging the face descriptors of the images. Cosine similarity is used to represent the similarity between two templates.
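
The quoted steps can be sketched in a few lines of NumPy. This is only an illustration under assumptions, not the code used in the experiment: the function names are hypothetical, the descriptors are taken to be already extracted by the network, and the per-channel mean must be supplied by the caller because its values are not given here.

```python
import numpy as np

def subtract_channel_mean(image, channel_mean):
    """Subtract a per-channel mean from every pixel of an H x W x C image."""
    return image.astype(np.float32) - np.asarray(channel_mean, dtype=np.float32)

def l2_normalise(descriptors, eps=1e-12):
    """Scale each descriptor (last axis) to unit L2 norm."""
    norms = np.linalg.norm(descriptors, axis=-1, keepdims=True)
    return descriptors / np.maximum(norms, eps)

def template(descriptors):
    """Average the L2-normalised descriptors of one identity's images."""
    return l2_normalise(descriptors).mean(axis=0)

def cosine_similarity(a, b):
    """Cosine similarity between two template vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Random vectors stand in for real 2048-dimensional face descriptors.
rng = np.random.default_rng(0)
person_a = template(rng.normal(size=(5, 2048)))
person_b = template(rng.normal(size=(3, 2048)))
print(cosine_similarity(person_a, person_b))
```

Averaging unit-length descriptors does not give a unit-length template, which is why the similarity divides by both norms.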

After preprocessing and removing repeated identities from the dataset with a nearest neighbors search, we save the 224x224 cropped face images into one directory, naming each file {age}\_{race}\_{sex}\_{unique id}.png.
<br><br>
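
One way such a duplicates filter could work is sketched below. The details are assumptions: the text does not say which features or threshold the filter uses, so this version compares descriptors by cosine similarity with a brute-force search and an arbitrary threshold of 0.95, and the function name is hypothetical.

```python
import numpy as np

def split_near_duplicates(descriptors, threshold=0.95):
    """Greedily keep rows whose nearest already-kept neighbour is below threshold.

    Returns (kept, dropped) as lists of row indices. The threshold is an
    arbitrary illustrative value, not one taken from the experiment.
    """
    norms = np.linalg.norm(descriptors, axis=1, keepdims=True)
    unit = descriptors / np.maximum(norms, 1e-12)
    sims = unit @ unit.T  # pairwise cosine similarities
    kept, dropped = [], []
    for i in range(len(unit)):
        if kept and sims[i, kept].max() >= threshold:
            dropped.append(i)  # nearest kept neighbour is too similar
        else:
            kept.append(i)
    return kept, dropped

# Toy 2-D "descriptors": row 2 is almost identical to row 0.
toy = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.001], [0.7, 0.7]])
kept, dropped = split_near_duplicates(toy)
print(kept, dropped)
```

The brute-force similarity matrix grows quadratically with the number of images, so a dataset of this size would likely need an indexed nearest neighbors search instead.
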
Each experiment samples from each race, each sex, and two age groups (18-40 and 41+), with equal representation from each, for a total of 2620 unique identities. We choose this number of unique IDs because it matches the quantity used in the original paper.<br><br>
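
The sampling step can be illustrated with the standard library alone. This is a sketch under assumptions, not the `sample_by_subgroups` function used in the experiment: `parse_name`, `age_group`, and `sample_equal` are hypothetical helpers, each record is assumed to represent one identity, and race and sex are treated as opaque strings because their encoding is not described here.

```python
import random
from collections import defaultdict
from pathlib import Path

def parse_name(filename):
    """Split '{age}_{race}_{sex}_{unique id}.png' into its four fields."""
    age, race, sex, uid = Path(filename).stem.split("_", 3)
    return {"age": int(age), "race": race, "sex": sex, "id": uid}

def age_group(age):
    """Map an age to one of the two sampled groups, or None if under 18."""
    if age < 18:
        return None
    return "18-40" if age <= 40 else "41+"

def sample_equal(records, per_group, seed=0):
    """Draw per_group records from every (race, sex, age group) subgroup."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for rec in records:
        group = age_group(rec["age"])
        if group is not None:
            groups[(rec["race"], rec["sex"], group)].append(rec)
    sampled = []
    for key in sorted(groups):  # sorted for reproducibility across runs
        members = groups[key]
        if len(members) < per_group:
            raise ValueError(f"subgroup {key} has only {len(members)} records")
        sampled.extend(rng.sample(members, per_group))
    return sampled
```

Seeding the generator with the experiment index, as the script does with `seed=i`, makes each resample reproducible.
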
We then reattach a classifier head to the VGGFace model (ResNet architecture) as a 2620-way classifier and train it with image augmentation. The augmentation excludes horizontal flipping (mirroring), because we reserve the mirror image of each face for validation and testing.
<img src="./imgs/augmentation.png">
<img src="./imgs/traintest.png">
<br>
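
The split between augmented training images and held-out mirror images can be sketched as follows. The specific augmentation is an assumption: the text does not list which transformations are applied, so a small random translation stands in for them here, and both function names are hypothetical.

```python
import numpy as np

def mirror(image):
    """Flip an H x W x C image left to right (held out for validation/testing)."""
    return image[:, ::-1, :]

def augment(image, rng, max_shift=4):
    """Hypothetical training augmentation: a random translation, never a flip."""
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    pad = ((max_shift, max_shift), (max_shift, max_shift), (0, 0))
    padded = np.pad(image, pad, mode="edge")
    h, w = image.shape[:2]
    top, left = max_shift + dy, max_shift + dx
    return padded[top:top + h, left:left + w]

rng = np.random.default_rng(0)
face = rng.integers(0, 256, size=(224, 224, 3), dtype=np.uint8)
train_image = augment(face, rng)   # seen during training
heldout_image = mirror(face)       # never seen during training
print(train_image.shape, heldout_image.shape)
```

Because no training transformation produces a flipped image, the mirrored face is a new view of a known identity at test time.
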
After the model converges, we record the errors at test time and repeat the experiment. For each experiment, we resample from the database while maintaining equal representation from each group.<br><br>
The following script repeats the experiment 100 times and records the results in a CSV file called 'results.csv' in the working directory. The experiment can be stopped at any time, and the results of all completed experiments will remain intact.

In [None]:
from people import get_people_from, Person
from bias_exp import build_df, perform_experiment, sample_by_subgroups, plot_history, BatchMaker
from collections import OrderedDict
import pickle    

#load people
people, _ = get_people_from('./clean_dataset/')

#build dataframe for sampling
df = build_df(people)
#Save df for later inspection of labels
df.to_csv('experiment_labels.csv', index=False)

#load batchmaker for on-the-fly image preprocessing
with open('batchmaker.pkl', 'rb') as f:
    batch_maker = pickle.load(f)
    
history = []

for i in range(100):
    #sample, experiment, record, repeat
    sampled = sample_by_subgroups(df, seed=i)
    history += perform_experiment(sampled, batch_maker, people, ticker=i)
    plot_history([history[-1]], ticker=i)

#dump history if all experiments complete
#this is just optimization history, nothing too important
with open('history.pkl', 'wb') as f:
    pickle.dump(history, f)

Using TensorFlow backend.


4060  under 18
15273  18-40
8698  above 40
2620