Covariance Pooling for Facial Expression Recognition
This repository has two main parts:

  • Covariance Pooling of Convolution Features
  • Temporal Pooling of Features


For pooling convolution features, I do not have the exact hyperparameters needed to reproduce the exact numbers in the paper. I obtained at least 86% with model1. However, I have uploaded all the pretrained models from the paper.

Pooling Convolution Features

You can download the following models (2.5 GB total):

  • models and run (after uncommenting appropriate lines)
  • For the inception-resnet-v1 code, I used the same implementation of inception-resnet as in facenet.
  • For the baseline, the network is the same as the one included here, except that this code contains a few additional (covariance pooling) layers.
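As a rough illustration of what a covariance pooling layer computes, the following is a minimal NumPy sketch, not the code from this repository. It treats every spatial location of a convolutional feature map as one sample and returns the channel-by-channel covariance matrix. The function name `covariance_pool`, the `eps` ridge value, and the toy feature-map size are all assumptions made for the example.

```python
import numpy as np


def covariance_pool(feature_map, eps=1e-4):
    """Pool an (H, W, C) feature map into a (C, C) covariance matrix.

    Hypothetical helper for illustration; eps is an assumed value.
    """
    h, w, c = feature_map.shape
    # Treat each spatial location as one C-dimensional sample.
    samples = feature_map.reshape(h * w, c)
    # Center the samples around their mean.
    centered = samples - samples.mean(axis=0, keepdims=True)
    # Unbiased sample covariance across spatial locations.
    cov = centered.T.dot(centered) / (h * w - 1)
    # A small ridge keeps the matrix strictly positive definite.
    return cov + eps * np.eye(c)


# Toy input standing in for the output of a convolutional layer.
rng = np.random.RandomState(0)
fmap = rng.randn(6, 6, 8)
pooled = covariance_pool(fmap)
print(pooled.shape)
```

The pooled matrix is symmetric, and its size depends only on the number of channels, not on the spatial resolution of the feature map.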

Pooling Temporal Features:

Features extracted with the CNN (the model proposed in the paper) from the AFEW dataset are provided in a zip file. Extract the zip to afew_features in the same folder. To classify the results, simply run bash
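To make the idea of temporal pooling concrete, here is a hedged NumPy sketch of one common way to turn a variable number of per-frame feature vectors into a single fixed-length descriptor: compute the covariance across frames, take its matrix logarithm, and keep the upper triangle. This is an assumption about the general technique, not the script shipped with this repository; the function name, the `eps` value, and the random stand-in features are hypothetical.

```python
import numpy as np


def temporal_covariance_descriptor(frame_features, eps=1e-3):
    """Map (T, D) per-frame features to a vector of length D * (D + 1) / 2.

    Hypothetical helper for illustration; eps is an assumed value.
    """
    t, d = frame_features.shape
    # Center the features over the time axis.
    centered = frame_features - frame_features.mean(axis=0, keepdims=True)
    # Covariance across frames, regularized so all eigenvalues are positive.
    cov = centered.T.dot(centered) / max(t - 1, 1)
    cov += eps * np.eye(d)
    # Matrix logarithm through the symmetric eigen-decomposition.
    eigvals, eigvecs = np.linalg.eigh(cov)
    log_cov = (eigvecs * np.log(eigvals)).dot(eigvecs.T)
    # The matrix is symmetric, so the upper triangle carries all information.
    return log_cov[np.triu_indices(d)]


# Two clips of different lengths give descriptors of the same size.
rng = np.random.RandomState(0)
short_clip = rng.randn(3, 4)
long_clip = rng.randn(12, 4)
v_short = temporal_covariance_descriptor(short_clip)
v_long = temporal_covariance_descriptor(long_clip)
print(v_short.shape, v_long.shape)
```

Because the descriptor length is independent of the clip length, vectors like these could be passed to a standard classifier, for example one from sklearn.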


Requirements:

  • python 2.7
  • tensorflow
  • numpy
  • sklearn

Some Notes:

  • This code framework is mostly based on facenet
  • Apply the patch suggested in the tensorflow_patch.txt file. When computing the gradient of the eigen-decomposition, TensorFlow returns NaNs if eigenvalues are identical. This raises an error, and training cannot continue. The patch simply replaces NaNs with zeros so that training can proceed.
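The NaN problem described in the last note can be reproduced outside TensorFlow. The sketch below is a NumPy illustration, not the contents of tensorflow_patch.txt. It assumes the usual form of the eigen-decomposition gradient, which contains coefficients 1 / (λ_j - λ_i) for i ≠ j; when two eigenvalues coincide, that coefficient is infinite, and multiplying it by a zero entry yields NaN. The function names are hypothetical.

```python
import numpy as np


def eig_grad_coefficients(eigvals):
    """Return F with F[i, j] = 1 / (eigvals[j] - eigvals[i]) and zero diagonal."""
    diff = eigvals[np.newaxis, :] - eigvals[:, np.newaxis]
    with np.errstate(divide="ignore"):
        coeffs = 1.0 / diff  # infinite wherever two eigenvalues are equal
    np.fill_diagonal(coeffs, 0.0)
    return coeffs


def replace_nan_with_zero(x):
    """Mimic the idea of the patch: NaN entries become zero."""
    return np.where(np.isnan(x), np.zeros_like(x), x)


# Eigenvalues with a repeated entry, as can happen during training.
eigvals = np.array([1.0, 1.0, 2.0])
coeffs = eig_grad_coefficients(eigvals)

# Stand-in for the projected upstream gradient; its off-diagonal is zero.
projected = np.eye(3)
with np.errstate(invalid="ignore"):
    raw = coeffs * projected  # inf * 0 produces NaN
patched = replace_nan_with_zero(raw)

print(np.isnan(raw).any(), np.isnan(patched).any())
```

Zeroing these entries discards the undefined part of the gradient instead of letting it propagate through the rest of the network.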
