Understanding and Comparing Deep Neural Networks for Age and Gender Classification - Data and Models

This repository contains all evaluated models for which results are reported in the paper "Understanding and Comparing Deep Neural Networks for Age and Gender Classification", published in the proceedings of the IEEE Workshop on Analysis and Modeling of Faces and Gestures (AMFG) at the International Conference on Computer Vision (ICCV) 2017.

If you find the code or the models in this GitHub repository useful, please cite the corresponding publication in your work:

@inproceedings{lapuschkin2017understanding,
  author = {Lapuschkin, Sebastian and Binder, Alexander and M\"uller, Klaus-Robert and Samek, Wojciech},
  title = {Understanding and Comparing Deep Neural Networks for Age and Gender Classification},
  booktitle = {Proceedings of the IEEE International Conference on Computer Vision Workshops (ICCVW)},
  pages = {1629-1638},
  year = {2017},
  doi = {10.1109/ICCVW.2017.191},
  url = {https://doi.org/10.1109/ICCVW.2017.191}
}

This repo contains the deploy.prototxt and train_val.prototxt files for every combination of model architecture, pretraining and preprocessing choice for which performance measures are reported in the paper cited above. The mean.binaryproto files for Caffe, computed on the employed datasets, are supplied as well. This repository shares scripts and workflows with Gil Levi's age and gender deep learning project page.

Model performances, depending on architecture, initialization and data preprocessing, averaged over all folds of the data set. For additional results, see section Result Overview.

Due to GitHub's hard file size limit of 100 MB per file, all model weights (i.e. the *.caffemodel files) and lmdb data files are hosted externally, via a Nextcloud service of the Fraunhofer Heinrich Hertz Institute (see section Repository Content below).

All heatmap visualizations shown in the paper, such as the image at the top of the page, have been generated using the LRP implementation for Caffe provided in the LRP Toolbox. Scripts assisting in the computation of heatmap visualizations can be found in the folder heatmap_drawing.

Exemplary LRP heatmap visualizations for the predicted classes on a gender prediction task, identifying regions in the input image used by the model to decide for (hot colors) or argue against (cold colors) the predicted (=true, in these cases) class

Repository Content

  • Folder folds contains the dataset split description for the Adience benchmark data used for training and evaluation. This folder is an extension to the one found in Gil Levi's repo and contains additional preprocessing settings.
  • training_scripts contains shell scripts used for starting the training of the neural network models.
  • DataPrepartionCode contains scripts for generating mean.binaryproto and lmdb binary blobs from raw Adience image data. This folder is an extension to the one found in Gil Levi's repo and contains additional preprocessing settings.
  • The folder mean_images contains the mean.binaryproto files for all folds and preprocessing choices, as used for training, validation and testing
  • The folder model_definitions contains the *.prototxt files for Caffe, each describing one model architecture. Here, a naming pattern [target]_[init]_[arch][_preproc] applies, where
    • target is from {age, gender} and describes the prediction problem
    • init is from {fromscratch, finetuning, imdbwiki} and describes random initialization, a weight initialization from ImageNet pretraining, and a weight initialization from ImageNet pretraining followed by IMDB-WIKI pretraining, respectively.
    • arch is from {caffereference, googlenet, vgg16, net_definitions} and describes the architecture of the model. Here, net_definitions refers to the model architecture used in Gil Levi's repo. The net_definitions models do not have an init block within the folder name.
    • The _preproc suffix is optional and refers to _unaligned images (i.e. training images only under rotation alignment), aligned training images (landmark-based alignment, no suffix) or _mixed alignment (i.e. images under both landmark-based and rotation-based alignment are used for training).
    • The pretrained models used as starting points (init) for training can be downloaded here. The model weights behind this link have been downloaded from the Caffe repo (caffereference, googlenet), the IMDB-WIKI project page (vgg16 on imdbwiki) and the Caffe Model Zoo (vgg16 on imagenet).
  • The lmdb files used for model training, validation and testing can be downloaded here.
  • The model weights (i.e. the *.caffemodel files) for the neural network descriptions contained in this repository can be downloaded here. These files match the model definitions in the folder model_definitions.
  • heatmap_drawing contains scripts that generate configuration files for computing LRP heatmaps using the LRP Toolbox for Caffe.
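
The naming pattern [target]_[init]_[arch][_preproc] described above can be decoded mechanically. The following is a minimal sketch, not part of the repository: the function name parse_model_name and the example folder names are hypothetical, and it assumes that the blocks are joined by single underscores exactly as the pattern suggests. Models without a suffix are reported as "aligned", and net_definitions models, which carry no init block, get init set to None.

```python
# Sketch only: decodes folder names following [target]_[init]_[arch][_preproc].
# The value sets are taken from the README; everything else is an assumption.
TARGETS = ("age", "gender")
INITS = ("fromscratch", "finetuning", "imdbwiki")
ARCHS = ("caffereference", "googlenet", "vgg16", "net_definitions")
PREPROC_SUFFIXES = ("unaligned", "mixed")  # no suffix means landmark-aligned


def parse_model_name(name):
    """Split a model folder name into target, init, arch and preproc."""
    preproc = "aligned"
    for suffix in PREPROC_SUFFIXES:
        if name.endswith("_" + suffix):
            preproc = suffix
            name = name[: -(len(suffix) + 1)]  # strip "_<suffix>"
            break

    target, _, rest = name.partition("_")
    if target not in TARGETS:
        raise ValueError("unknown target in %r" % name)

    if rest in ARCHS:
        # net_definitions models have no init block in the folder name
        init, arch = None, rest
    else:
        init, _, arch = rest.partition("_")
        if init not in INITS or arch not in ARCHS:
            raise ValueError("cannot parse %r" % name)

    return {"target": target, "init": init, "arch": arch, "preproc": preproc}


if __name__ == "__main__":
    # Hypothetical folder names, used for illustration only
    for example in ("gender_imdbwiki_vgg16_mixed", "age_net_definitions_unaligned"):
        print(example, "->", parse_model_name(example))
```

Such a helper can be handy when iterating over all model definitions, for example to group evaluation results by architecture or preprocessing choice.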

Note that you will have to adapt the (absolute) paths denoted in scripts and model description files in order to use the code.
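
One way to adapt the absolute paths in bulk is a small search-and-replace over the text files. The sketch below is an illustration under assumptions, not a script shipped with this repository: the function names, the file suffixes (.prototxt, .sh) and the example prefixes are hypothetical, and you should check which prefix your copy of the files actually contains before running anything like it.

```python
from pathlib import Path


def rewrite_prefix(text, old_prefix, new_prefix):
    """Replace every occurrence of old_prefix in text with new_prefix."""
    return text.replace(old_prefix, new_prefix)


def rewrite_tree(root, old_prefix, new_prefix, suffixes=(".prototxt", ".sh")):
    """Rewrite path prefixes in all matching files below root.

    Returns the list of files that were actually modified.
    The default suffixes are an assumption about which files hold paths.
    """
    changed = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        original = path.read_text()
        updated = rewrite_prefix(original, old_prefix, new_prefix)
        if updated != original:
            path.write_text(updated)
            changed.append(path)
    return changed


if __name__ == "__main__":
    # Hypothetical prefixes; substitute the ones found in your checkout.
    line = 'mean_file: "/old/root/mean_images/mean.binaryproto"'
    print(rewrite_prefix(line, "/old/root", "/home/me/age_gender"))
```

Running it on a copy of the repository first, or under version control, makes it easy to review the changes with a diff.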

Result Overview

The table below summarizes the results reported in the paper this repository belongs to.

| age | AdienceNet | CaffeNet | GoogleNet | VGG16 | gender | AdienceNet | CaffeNet | GoogleNet | VGG16 |
|---|---|---|---|---|---|---|---|---|---|
| [i,⋅] | 51.4 (87.0) | 52.5 (87.9) | 54.4 (89.3) | | [i,⋅] | 88.1 | 87.7 | 88.2 | |
| [r,⋅] | 51.9 (87.4) | 52.6 (89.0) | 54.4 (90.0) | | [r,⋅] | 88.3 | 88.0 | 89.3 | |
| [m,⋅] | 53.6 (88.4) | 54.4 (89.7) | 56.5 (90.8) | | [m,⋅] | 89.0 | 88.9 | 89.7 | |
| [i,n] | | 51.7 (87.6) | 56.6 (91.0) | 53.8 (88.2) | [i,n] | | 90.0 | 91.2 | 92.0 |
| [r,n] | | 52.2 (87.1) | 57.5 (92.0) | | [r,n] | | 90.7 | 91.7 | |
| [m,n] | | 53.0 (88.4) | 58.8 (92.7) | 56.5 (90.0) | [m,n] | | 90.6 | 92.0 | 92.7 |
| [i,w] | | | | 60.2 (94.2) | [i,w] | | | | 90.6 |
| [r,w] | | | | | [r,w] | | | | |
| [m,w] | | | | 63.0 (96.0) | [m,w] | | | | 92.3 |

Face categorization results as accuracy in percent, using oversampling for prediction. Left: results for age classification. Numbers in parentheses next to the accuracy score show the 1-off accuracy, i.e. the accuracy of predicting the correct age group or an adjacent one. Right: results for gender prediction. Entries in the age and gender columns indicate the choices for data preprocessing and model initialization:

  • i: in-plane, landmark-based face alignment, r: rotation-based alignment, m: combining i and r for training and using r for testing
  • n: ImageNet pretraining, ⋅: random weight initialization and w: IMDB-WIKI pretraining following ImageNet pretraining

Bold values match or exceed the state-of-the-art results on the Adience benchmark dataset reported at the time of publication.
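
To make the two age metrics concrete, here is a minimal sketch of how exact accuracy and 1-off accuracy can be computed from predictions. It is an illustration, not the evaluation code used for the paper: it assumes that age groups are encoded as consecutive integer indices ordered by age, so that "adjacent" means an index difference of at most one, and the labels in the example are made up.

```python
def accuracy(y_true, y_pred):
    """Percentage of samples whose predicted class equals the true class."""
    hits = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * hits / len(y_true)


def one_off_accuracy(y_true, y_pred):
    """Percentage of samples predicted as the true age group or a neighbour.

    Assumes classes are integer indices sorted by age (an assumption of
    this sketch), so neighbouring groups differ by exactly one.
    """
    hits = sum(abs(t - p) <= 1 for t, p in zip(y_true, y_pred))
    return 100.0 * hits / len(y_true)


if __name__ == "__main__":
    # Made-up labels for illustration only
    true_groups = [0, 3, 5, 7]
    predicted = [0, 4, 2, 7]
    print(accuracy(true_groups, predicted))          # 50.0
    print(one_off_accuracy(true_groups, predicted))  # 75.0
```

By construction the 1-off accuracy is never lower than the exact accuracy, which matches the pattern of the age results in the table.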