Joint Face Detection and Alignment using Multi-task Cascaded Convolutional Networks
Kaipeng Zhang, Zhanpeng Zhang, Zhifeng Li, and Yu Qiao
IEEE Signal Processing Letters, CCF C
Official MATLAB; PyTorch (inference only, weights converted from MATLAB)
To improve the performance of face detection and face alignment using deep learning.
The main problems are:
- occlusion
- illumination
- various poses
- using multi-task learning: here the tasks are face detection (binary classification), bounding box regression, and facial landmark detection
- using cascaded convolutional networks: 3 networks generate the final result from coarse to fine.
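
The coarse-to-fine cascade can be sketched roughly as follows. This is a toy illustration, not the paper's implementation: `run_cascade`, the `Stage` type, and the thresholds are hypothetical names, and each stage stands in for one of the three networks by returning a face score and a refined box per candidate.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
# Hypothetical stage signature: candidate boxes in, (face score, refined box) out.
Stage = Callable[[Sequence[Box]], List[Tuple[float, Box]]]


def run_cascade(candidates: Sequence[Box],
                stages: Sequence[Stage],
                thresholds: Sequence[float]) -> List[Box]:
    """Pass candidates through each stage in order.

    After every stage, only the refined boxes whose score reaches that
    stage's threshold survive, so later (more expensive) stages see
    fewer candidates.
    """
    boxes = list(candidates)
    for stage, threshold in zip(stages, thresholds):
        scored = stage(boxes)
        boxes = [box for score, box in scored if score >= threshold]
        if not boxes:
            break  # nothing left to refine
    return boxes
```

The point of the design is that the cheap first stage rejects most windows, so the later networks only run on a small set of candidates.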
- classification with softmax (cross-entropy loss)
- bounding box regression and landmark regression using Euclidean loss
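
A minimal sketch of the two loss types above and how they might be combined per sample. The function names are hypothetical, and the default task weights (1, 0.5, 0.5) are my recollection of the paper's setting for the first two networks, so treat them as an assumption. A task whose label is missing for a sample (for example, a negative sample has no landmarks) is passed as `None` and skipped.

```python
import math
from typing import Optional, Sequence, Tuple


def cross_entropy(p_face: float, label: int, eps: float = 1e-12) -> float:
    """Binary cross-entropy for face (1) vs. non-face (0).

    p_face is the softmax probability assigned to the 'face' class.
    """
    p = min(max(p_face, eps), 1.0 - eps)  # avoid log(0)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


def euclidean_loss(pred: Sequence[float], target: Sequence[float]) -> float:
    """Squared L2 distance, usable for box offsets or landmark coordinates."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    return sum((p - t) ** 2 for p, t in zip(pred, target))


def multitask_loss(det: Optional[float],
                   box: Optional[float],
                   landmark: Optional[float],
                   weights: Tuple[float, float, float] = (1.0, 0.5, 0.5)) -> float:
    """Weighted sum of the task losses that apply to one sample.

    The default weights are an assumption, not taken from these notes.
    """
    total = 0.0
    for loss, weight in zip((det, box, landmark), weights):
        if loss is not None:
            total += weight * loss
    return total
```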
- The effectiveness of online hard sample mining
- The effectiveness of joint detection and alignment
- Evaluation on face detection
- Evaluation on face alignment
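
The online hard sample mining evaluated above can be sketched as follows: within each mini-batch, sort the samples by loss and backpropagate only through the hardest ones. The function name is hypothetical, and the default keep ratio of 0.7 is what I recall from the paper, so treat it as an assumption.

```python
from typing import List, Sequence


def select_hard_samples(losses: Sequence[float], keep_ratio: float = 0.7) -> List[int]:
    """Return the indices of the highest-loss samples in a mini-batch.

    Only these samples would contribute gradients; the rest are treated
    as easy and ignored for this step.
    """
    if not 0.0 < keep_ratio <= 1.0:
        raise ValueError("keep_ratio must be in (0, 1]")
    if not losses:
        return []
    n_keep = max(1, int(len(losses) * keep_ratio))
    # Indices ordered from highest loss to lowest.
    order = sorted(range(len(losses)), key=lambda i: losses[i], reverse=True)
    return sorted(order[:n_keep])
```

For example, `select_hard_samples([0.1, 0.9, 0.5, 0.3], keep_ratio=0.5)` keeps indices 1 and 2, the two samples with the largest loss.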
- the first cascaded face detection classifier was proposed in 2004, not by this paper. This paper carries the cascade idea over from traditional methods to deep learning.
- the idea of coarse to fine has a long history.
- others have already combined face detection and face alignment through multi-task learning
- deformable part models require high computational cost
- are there better hard sample mining methods, such as focal loss?
- this paper uses an image pyramid; could a feature pyramid be used instead?
- for detection problems, there are some subproblems
- constructing the training sets, like roidb in Faster R-CNN
- using roidb for training
- why is the input size so small?
- do we really need facial landmarks when training Pnet and Rnet?
- are the 3 networks trained independently or jointly? Independently
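
On the image pyramid and input size questions above, here is a rough sketch of how pyramid scales are typically chosen in MTCNN-style detectors. The function name is hypothetical, and the defaults (12-pixel first-stage input, 20-pixel minimum face, factor 0.709) are assumptions based on common implementations, not values stated in these notes.

```python
from typing import List


def pyramid_scales(height: int, width: int,
                   min_face_size: float = 20.0,
                   net_input: int = 12,
                   factor: float = 0.709) -> List[float]:
    """Compute the resize factors for an image pyramid.

    The first scale maps a face of min_face_size pixels onto the fixed
    network input size; each following scale shrinks the image by
    `factor` until the shorter side no longer fits one input window.
    """
    if not 0.0 < factor < 1.0:
        raise ValueError("factor must be in (0, 1)")
    scale = net_input / min_face_size
    min_side = min(height, width) * scale
    scales: List[float] = []
    while min_side >= net_input:
        scales.append(scale)
        scale *= factor
        min_side *= factor
    return scales
```

Because the first network sees a small fixed window, faces of different sizes are handled by rescaling the image rather than the network, which is why a feature pyramid is a natural alternative to consider.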