# Face Emotion Recognition

![faces](FER+vsFER.png)

## Preprocessing

### Contents

## Structure

### GoogLeNet [1]

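* Deep networks generally perform better as they get deeper (more layers) and wider (more units per layer).
* In practice, larger networks suffer from overfitting and vanishing gradients.
* GoogLeNet keeps the connections inside the network sparse while still computing them as dense matrix operations.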

![GLN](f02.png)

### Inception module

The 1$\times$1 convolution layers inserted in the middle of the module reduce the computational bottleneck, so the network can grow deeper and wider without a large increase in computation cost.

![inception](inception_1x1.png)
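
To make the saving concrete, here is a minimal sketch that counts the weights of a 5$\times$5 branch with and without a 1$\times$1 reduction. The filter counts (24 and 64) are taken from the 5$\times$5 tower of the Inception cell in this notebook; the input channel count of 512 is an assumption (the notebook does not state it), chosen to match the corresponding block in [1].

```python
def conv_weights(kernel, in_channels, out_channels):
    """Weight count of a kernel x kernel convolution (biases ignored)."""
    return kernel * kernel * in_channels * out_channels


IN_CHANNELS = 512  # assumption: input depth is not stated in this notebook
REDUCE = 24        # width of the 1x1 reduction in the 5x5 tower
OUT = 64           # width of the 5x5 convolution in the same tower

# 5x5 convolution applied directly to the module input
direct = conv_weights(5, IN_CHANNELS, OUT)

# 1x1 reduction first, then the 5x5 convolution on the reduced tensor
reduced = conv_weights(1, IN_CHANNELS, REDUCE) + conv_weights(5, REDUCE, OUT)

print(f"5x5 directly on the input: {direct:,} weights")
print(f"1x1 reduction, then 5x5:   {reduced:,} weights")
print(f"reduction factor:          {direct / reduced:.1f}x")
```

Under these assumptions the reduced branch needs roughly 16 times fewer weights, which is why the module can afford to run several towers in parallel.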

In [6]:
from IPython.display import IFrame
IFrame('http://iamaaditya.github.io/2016/03/one-by-one-convolution/', 1000, 450)

In [None]:
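# Inception module with dimension reductions (variant b)
# Assumes `x` (input tensor) and `reg` (L2 factors: reg[0] for the 3x3/5x5
# convolutions, reg[1] for the 1x1 convolutions) are defined in earlier cells.
import keras
from keras.layers import Conv2D, MaxPooling2D
from keras.regularizers import l2

# Tower 0: plain 1x1 convolution
tower_0 = Conv2D(160, (1, 1), padding='same', activation='relu', kernel_regularizer=l2(reg[1]))(x)

# Tower 1: 1x1 reduction followed by a 3x3 convolution
tower_1 = Conv2D(112, (1, 1), padding='same', activation='relu', kernel_regularizer=l2(reg[1]))(x)
tower_1 = Conv2D(224, (3, 3), padding='same', activation='relu', kernel_regularizer=l2(reg[0]))(tower_1)

# Tower 2: 1x1 reduction followed by a 5x5 convolution
tower_2 = Conv2D(24, (1, 1), padding='same', activation='relu', kernel_regularizer=l2(reg[1]))(x)
tower_2 = Conv2D(64, (5, 5), padding='same', activation='relu', kernel_regularizer=l2(reg[0]))(tower_2)

# Tower 3: 3x3 max pooling followed by a 1x1 projection
tower_3 = MaxPooling2D((3, 3), strides=(1, 1), padding='same')(x)
tower_3 = Conv2D(64, (1, 1), padding='same', activation='relu', kernel_regularizer=l2(reg[1]))(tower_3)

# Concatenate the four towers along the channel axis (channels-last layout)
x = keras.layers.concatenate([tower_0, tower_1, tower_2, tower_3], axis=3)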

GoogLeNet takes 224$\times$224 images as input. The fer2013 images are only 48$\times$48, so the number of layers has to be reduced accordingly [2].

![](glns.png)
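
A rough way to see why the full GoogLeNet depth does not fit a 48$\times$48 input is to track the feature-map size through the stride-2 stages. The sketch below assumes `'same'` padding (output size = ceil(input / 2)) and five stride-2 stages, as in the original GoogLeNet [1]; the helper name `spatial_sizes` is made up for illustration.

```python
import math


def spatial_sizes(input_size, num_halvings):
    """Feature-map side length after each stride-2 stage with 'same' padding."""
    sizes = [input_size]
    for _ in range(num_halvings):
        sizes.append(math.ceil(sizes[-1] / 2))
    return sizes


# Assumption: five stride-2 stages, as in the original GoogLeNet
print("GoogLeNet input:", spatial_sizes(224, 5))
print("fer2013 input:  ", spatial_sizes(48, 5))
```

With a 224$\times$224 input the last stage still sees a 7$\times$7 map, while a 48$\times$48 input is already down to 3$\times$3 after four stages and 2$\times$2 after five, so the later stages have little spatial information left to work with.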

### Other approaches


| Method | FD | LM | Registration | Illumination | Accuracy (Public) | Architecture | Depth | Parameters | AD | AF | + Train | + Test | Ensemble |
|--------|----|----|--------------|--------------|-------------------|--------------|-------|------------|:--:|:--:|:-------:|--------|----------|
| Method 1 | no | no | no | normalize | 71.2% | CPCPFF | 4 | 12.0 M | no | no | S, M | - | average |
| Method 2 | several | no | no | histeq, LPF | 72% | PCCPCCPCFFF | 8 | 6.2 M | no | no | A, M | A | weighted |
| Method 3 | no | ref* | rigid (LM) | several | 73.5% | CPCPCPFF | 5 | 2.4 M | no | yes | T, M, REG | ten-crop, REG | average |
| Method 4 | several | ref* | rigid (LM) | several | 72.5% | CPCPCPFF | 5 | 4.8 M | no | no | T, M | - | hierarchy |
| Method 5 | no | ref* | affine (LM) | no | 66.5% | CPCPIIPIPFFF | 11 | 7.3 M | yes | no | ten-crop | - | - |
| Method 6 | no | ref* | indirect | no | 75% | CPNCPNCPCFF | 6 | 21.3 M | yes | yes | - | - | - |

#### Abbreviations

|Preprocessing|Structure|Differences 1|Differences 2|
|:---|:---|:---|:---|
|FD: Face Detection|C: Convolutional|AD: Additional Training Data|S: Similarity Transformation|
|LM: Facial Landmark Extraction|P: Pooling|AF: Additional Features|A: Affine Transformation|
|HISTEQ: Histogram Equalization|N: Response Normalization|+: Data Augmentation|T: Translation|
|LPF: Linear Plane Fitting|I: Inception|REG: Face Registration|M: Horizontal Mirroring|
| |F: Fully Connected| | |
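
Two of the abbreviations above, M (horizontal mirroring) and HISTEQ (histogram equalization), are simple enough to sketch in pure Python. This only illustrates the standard operations on a grayscale image stored as a list of rows; it is not the pipeline of any of the cited methods, and the function names are made up for this example.

```python
def mirror_horizontal(image):
    """Flip a 2D image (list of rows) left to right."""
    return [row[::-1] for row in image]


def equalize_histogram(image, levels=256):
    """Histogram-equalize a 2D image of integer gray levels in [0, levels)."""
    pixels = [p for row in image for p in row]

    # Histogram of gray levels
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of the histogram
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:
        # Constant image: nothing to spread out
        return [row[:] for row in image]

    # Map each level so the darkest becomes 0 and the brightest levels - 1
    scale = (levels - 1) / (total - cdf_min)
    return [[round((cdf[p] - cdf_min) * scale) for p in row] for row in image]


tiny = [[0, 1], [1, 2]]
print(mirror_horizontal(tiny))             # [[1, 0], [2, 1]]
print(equalize_histogram(tiny, levels=4))  # [[0, 2], [2, 3]]
```

In practice a library routine would normally be used instead; the point here is only what the two operations do to the pixel values.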

## Conclusion

![emotion](Plutchik-Model-600.png)

* Human recognition accuracy: 65$\pm$5%

## Comments and Future works

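* Some samples in the dataset are not faces.
    * Study face registration to improve performance.
* Apply image rotation, shearing, and resizing.
* Try other datasets.
* Study facial landmark detection to improve performance (in progress).
    * IntraFace is no longer supported, so try other approaches (training directly on a dataset, or other packages).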
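
Rotation, shearing, and resizing can all be written as 2$\times$2 linear maps on pixel coordinates, so they compose into a single augmentation transform. The following is a minimal sketch under that view; the helper names and the choice of (23.5, 23.5) as the center of a 48$\times$48 image are assumptions made for illustration, not part of the notebook's code.

```python
import math


def rotation(degrees):
    """2x2 rotation matrix for a counter-clockwise angle in degrees."""
    t = math.radians(degrees)
    return [[math.cos(t), -math.sin(t)], [math.sin(t), math.cos(t)]]


def shear(kx):
    """2x2 horizontal shear matrix: x' = x + kx * y."""
    return [[1.0, kx], [0.0, 1.0]]


def scaling(factor):
    """2x2 uniform scaling (resizing) matrix."""
    return [[factor, 0.0], [0.0, factor]]


def matmul(a, b):
    """Product of two 2x2 matrices (apply b first, then a)."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def apply_about_center(m, point, center=(23.5, 23.5)):
    """Apply matrix m to a pixel coordinate, relative to the image center."""
    x, y = point[0] - center[0], point[1] - center[1]
    return (m[0][0] * x + m[0][1] * y + center[0],
            m[1][0] * x + m[1][1] * y + center[1])


# Compose: rotate by 10 degrees, then shear, then enlarge by 10%
augment = matmul(scaling(1.1), matmul(shear(0.2), rotation(10)))
print(apply_about_center(augment, (0.0, 0.0)))
```

An actual augmentation step would also resample the image at the transformed coordinates, which an image library can handle.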

In [3]:
from IPython.display import IFrame
IFrame('http://www.humansensing.cs.cmu.edu/intraface/download.html', 800, 450)

## References

[1] Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, Andrew Rabinovich, "Going Deeper with Convolutions", https://arxiv.org/abs/1409.4842

[2] Ali Mollahosseini, David Chan, Mohammad H. Mahoor, "Going Deeper in Facial Expression Recognition using Deep Neural Networks", https://arxiv.org/abs/1511.04110

Method 1: [3] Y. Tang, "Deep Learning using Linear Support Vector Machines", 2013

Method 2: [4] Z. Yu and C. Zhang, "Image based static facial expression recognition with multiple deep network learning", in ACM International Conference on Multimodal Interaction (ICMI), 2015

Method 3: [5] B.-K. Kim, S.-Y. Dong, J. Roh, G. Kim, and S.-Y. Lee, "Fusing Aligned and Non-Aligned Face Information for Automatic Affect Recognition in the Wild: A Deep Learning Approach", 2016

Method 4: [6] B.-K. Kim, J. Roh, S.-Y. Dong, and S.-Y. Lee, "Hierarchical committee of deep convolutional neural networks for robust facial expression recognition", 2016

Method 5: [7] A. Mollahosseini, D. Chan, and M. H. Mahoor, "Going Deeper in Facial Expression Recognition using Deep Neural Networks", 2015

Method 6: [8] Z. Zhang, P. Luo, C.-C. Loy, and X. Tang, "Learning Social Relation Traits from Face Images", 2015

ref*: [9] X. Xiong and F. Torre, "Supervised descent method and its applications to face alignment", 2013

[10] Christopher Pramerdorfer, Martin Kampel, "Facial Expression Recognition using Convolutional Neural Networks: State of the Art", https://pdfs.semanticscholar.org/4edc/7f27d4512b69be54abfc6b9876e5b00725ab.pdf