HRNet keypoints head (Facial landmarks) #3391

Open
wants to merge 2 commits into
base: master

Conversation

SebastienLinker

@SebastienLinker SebastienLinker commented Jul 23, 2020

This PR is a reimplementation of HRNet keypoint detection (facial landmarks): https://github.com/HRNet/HRNet-Facial-Landmark-Detection
Some ideas come from a tentative Mask R-CNN implementation: https://github.com/powerlic/mmdetection/commit/e7773ca738c2ff35cf7d38af117464d445a6edbc?diff=unified

The MSE loss is computed on the heatmaps.
The output should be an N*3 array for N keypoints, holding 2 coordinates and a confidence level for each keypoint.
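
To make the output format concrete, here is a minimal NumPy sketch of how heatmaps could be turned into an N*3 array and how an MSE loss on heatmaps could be computed. It is not the code from this PR: `decode_heatmaps` and `mse_heatmap_loss` are hypothetical names, the decoding is a plain arg-max in heatmap pixel coordinates, and using the peak value as the confidence is an assumption.

```python
import numpy as np


def decode_heatmaps(heatmaps):
    """Convert (N, H, W) heatmaps into an (N, 3) array of (x, y, confidence).

    Illustrative only: takes the arg-max of each heatmap, in heatmap
    pixel coordinates, and uses the peak value as the confidence.
    """
    num_kpts, height, width = heatmaps.shape
    flat = heatmaps.reshape(num_kpts, -1)
    peak_idx = flat.argmax(axis=1)  # flat index of each heatmap's peak
    conf = flat.max(axis=1)         # peak value, used here as confidence
    ys, xs = np.unravel_index(peak_idx, (height, width))
    return np.stack([xs, ys, conf], axis=1).astype(np.float32)


def mse_heatmap_loss(pred, target):
    """Mean squared error between predicted and target heatmaps."""
    return float(np.mean((pred - target) ** 2))
```

Mapping the decoded coordinates back to the original image still requires scaling by the image-to-heatmap ratio, which is where the 1 or 2 pixel error mentioned in the TODO list can creep in.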

Current status

Trained and tested on WFLW dataset: https://wywu.github.io/projects/LAB/WFLW.html
Note that a small sigma (on the heatmap) makes it harder for the model to adapt.
The COCO dataset has a visibility flag for each keypoint. We added support for it, so the input data should be an N*3 array, similar to the MS COCO input.
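
As an illustration of the sigma and the visibility flag, the sketch below builds Gaussian target heatmaps from an N*3 input. It is not taken from this PR: `make_target_heatmaps` and the per-keypoint `weights` output are hypothetical, and skipping keypoints with `v == 0` is an assumption based on the MS COCO convention (0: not labeled, 1: labeled but not visible, 2: labeled and visible).

```python
import numpy as np


def make_target_heatmaps(keypoints, height, width, sigma=1.5):
    """Build (N, H, W) Gaussian heatmaps from an (N, 3) array of (x, y, v).

    Keypoints with v == 0 (not labeled) get an all-zero heatmap and a
    weight of 0 so that they can be masked out of the loss.
    """
    keypoints = np.asarray(keypoints, dtype=np.float32)
    ys, xs = np.mgrid[0:height, 0:width]  # pixel coordinate grids
    heatmaps = np.zeros((len(keypoints), height, width), dtype=np.float32)
    weights = np.zeros(len(keypoints), dtype=np.float32)
    for i, (x, y, v) in enumerate(keypoints):
        if v == 0:
            continue  # unlabeled keypoint: leave heatmap and weight at zero
        dist_sq = (xs - x) ** 2 + (ys - y) ** 2
        heatmaps[i] = np.exp(-dist_sq / (2 * sigma ** 2))
        weights[i] = 1.0
    return heatmaps, weights
```

With a smaller `sigma` the peak covers fewer pixels, so most of the target is close to zero.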

TODO

  • Keypoints head
  • Check results on non square images, non square heatmaps
  • Double check heatmap decoding; we may have an error of 1 or 2 pixels when going back to the original image
  • Data augmentation
  • Output confidence level
  • Visibility flag
  • Test on CPU, see: TypeError: object of type 'DataContainer' has no len() #3048
  • Unit tests (does not cover data augmenters)
  • Asyncio interface
  • Normalize keypoint errors by the interocular distance, similar to the HRNet paper
  • Heatmaps: We can add more flexibility and specify a decoding function in the config file
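
For the error normalization item, a possible sketch of a normalized mean error is given below. This is not part of the PR: `normalized_mean_error` is a hypothetical helper, and the two eye indices are parameters because the landmark indices used for the interocular distance depend on the annotation scheme and are not stated here.

```python
import math


def normalized_mean_error(pred, gt, left_eye_idx, right_eye_idx):
    """Mean point-to-point error divided by the interocular distance.

    pred and gt are sequences of (x, y) pairs in the same order. The eye
    indices are dataset-specific and must be supplied by the caller.
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same number of keypoints")
    lx, ly = gt[left_eye_idx]
    rx, ry = gt[right_eye_idx]
    interocular = math.hypot(lx - rx, ly - ry)
    if interocular == 0:
        raise ValueError("interocular distance is zero")
    errors = [
        math.hypot(px - gx, py - gy)
        for (px, py), (gx, gy) in zip(pred, gt)
    ]
    return sum(errors) / (len(errors) * interocular)
```

Dividing by the interocular distance makes errors comparable across faces of different sizes in the image.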

Ideas

  • The BBox RoI extractor is not used in the original HRNet facial landmarks repo, but we might need it in another project. It can be added here, following the RoI extractors already implemented
  • Use MS COCO dataset (could be used to test RoI extractor and allow comparison with more methods)
  • Dataset WFLW: Similar to the original HRNet paper, we focus on a small area of the image. This is done by calling the CenterOnFace pipeline. Instead, we could convert the dataset and load a small image already centered on the face, with the proper annotation.
  • RetinaFace, which combines 5 keypoints and a face bbox: https://github.com/deepinsight/insightface/tree/master/RetinaFace
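
The dataset conversion idea for WFLW could look roughly like the sketch below, which computes a crop window around the face and shifts the annotations into it. It does not reproduce the CenterOnFace pipeline from this PR: `crop_around_face` and its `margin` parameter are hypothetical, and the crop box is derived from the keypoints themselves as an assumption.

```python
def crop_around_face(keypoints, img_width, img_height, margin=0.25):
    """Compute a crop window around the keypoints and shift them into it.

    keypoints is a list of (x, y, v) triples. Returns the crop box
    (x0, y0, x1, y1), clipped to the image, and the shifted keypoints.
    `margin` is the fraction of the face size added on each side.
    """
    xs = [x for x, _, _ in keypoints]
    ys = [y for _, y, _ in keypoints]
    pad_x = (max(xs) - min(xs)) * margin
    pad_y = (max(ys) - min(ys)) * margin
    # Clip the padded box to the image borders.
    x0 = max(0.0, min(xs) - pad_x)
    y0 = max(0.0, min(ys) - pad_y)
    x1 = min(float(img_width), max(xs) + pad_x)
    y1 = min(float(img_height), max(ys) + pad_y)
    # Express the keypoints relative to the top-left corner of the crop.
    shifted = [(x - x0, y - y0, v) for x, y, v in keypoints]
    return (x0, y0, x1, y1), shifted
```

Running this once offline and saving the cropped images with the shifted annotations would avoid repeating the crop at load time.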

Notes

  • The network used in HRNet keypoint detection does not have any neck, while the one used in this repo for bboxes and masks does. (I included 2 examples under the config folder, with and without a neck. Accuracy is similar, and removing the neck makes it run faster.) The original one is available here: https://github.com/HRNet/HRNet-Object-Detection/blob/master/mmdet/models/backbones/hrnet.py#L442
  • This model does not output any bbox, so a few changes had to be made to the global structure:
    • Some assertions have been removed to allow a model without any bbox
    • The output can be either a dictionary with the following keys (bbox, mask, keypoints, heatmaps), or the original format: a numpy array (bboxes only) or a tuple (np.array, list of dicts) for masks
    • The test function in mmdet/apis/test.py had to be modified accordingly to handle and display keypoints (TODO: Check whether mmcv can handle keypoints directly)
  • The WFLW dataset has 98 keypoints per image. This number is a bit high and it may be slow to converge especially if you train from scratch (although it eventually converges). You can reduce the number of keypoints to get faster convergence. This should be done both in the config file (model/roi_head/keypoint_head/num_keypoints) and the WflwDataset code under load_annotations
  • At train time, heatmaps are generated in the dataloader to speed things up. However, you need to use keep_ratio=False if the batch size is > 1, or you might get errors from concatenating heatmaps of different sizes (this depends on your original image size)
  • You can output heatmaps, but unlike masks, they are not compressed into RLE format. Be aware of memory usage in your own scripts
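
Since the output can now come in several shapes, downstream scripts may want to normalize it first. The helper below is a hypothetical sketch, not part of this PR: `unpack_result` is an invented name, and it assumes the tuple form always has exactly two elements (bboxes, masks).

```python
def unpack_result(result):
    """Normalize a detector result into a dict with fixed keys.

    Handles the three shapes described in the notes above: a dict, a
    tuple of (bboxes, masks), or a bare bboxes array. Entries that are
    absent from the result are set to None.
    """
    keys = ("bbox", "mask", "keypoints", "heatmaps")
    if isinstance(result, dict):
        return {k: result.get(k) for k in keys}
    out = dict.fromkeys(keys)
    if isinstance(result, tuple):
        out["bbox"], out["mask"] = result
    else:
        out["bbox"] = result
    return out
```

A caller can then check `unpack_result(result)["heatmaps"] is not None` before keeping the uncompressed heatmaps in memory.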

Btw, I noticed you guys recently started a new mmpose project, but we started working on that some time ago. Feel free to move it to mmpose if you think it should go there.

@CLAassistant

CLAassistant commented Jul 23, 2020

CLA assistant check
All committers have signed the CLA.

@codecov

codecov bot commented Jul 24, 2020

Codecov Report

Merging #3391 into master will decrease coverage by 0.67%.
The diff coverage is 38.93%.

@@            Coverage Diff             @@
##           master    #3391      +/-   ##
==========================================
- Coverage   57.97%   57.30%   -0.68%     
==========================================
  Files         202      209       +7     
  Lines       13491    14005     +514     
  Branches     2282     2375      +93     
==========================================
+ Hits         7822     8025     +203     
- Misses       5310     5601     +291     
- Partials      359      379      +20     
| Flag | Coverage Δ |
|---|---|
| #unittests | 57.30% <38.93%> (-0.68%) ⬇️ |

| Impacted Files | Coverage Δ |
|---|---|
| mmdet/datasets/custom.py | 35.46% <0.00%> (-0.26%) ⬇️ |
| mmdet/models/detectors/two_stage.py | 74.39% <ø> (+0.58%) ⬆️ |
| mmdet/models/roi_heads/cascade_roi_head.py | 73.42% <0.00%> (-0.67%) ⬇️ |
| mmdet/models/roi_heads/standard_roi_head.py | 66.42% <0.00%> (-0.25%) ⬇️ |
| mmdet/datasets/pipelines/transforms.py | 66.15% <5.40%> (-7.02%) ⬇️ |
| mmdet/datasets/pipelines/loading.py | 46.70% <8.33%> (-1.42%) ⬇️ |
| mmdet/apis/test.py | 12.71% <12.50%> (+0.09%) ⬆️ |
| mmdet/core/keypoint/keypoint_target.py | 14.28% <14.28%> (ø) |
| mmdet/models/roi_heads/keypoint_roi_head.py | 19.11% <19.11%> (ø) |
| mmdet/datasets/wflw.py | 19.79% <19.79%> (ø) |
| ... and 18 more | |

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b05c17e...34dd229. Read the comment docs.

@joihn

joihn commented Sep 29, 2021

Hello, any news? A merge would be helpful for using keypoint regression in combination with the latest models from the zoo :)

@SebastienLinker
Author

Well, it is more than one year old now, and I am not sure how much work is needed to make it compatible with the latest mmdet.
Sorry, I don't really have time to keep it up to date. Feel free to try and update it if necessary :)

@23119841

Thanks for your contribution. Is it possible to use this keypoints head with the Swin Transformer, which is based on mmdetection?

@SebastienLinker
Author

I guess you can do some refactoring and get a keypoints head compatible with the swin transformer

@23119841

Thanks for your advice!

4 participants