
SCRFD issues #1518

Open
SthPhoenix opened this issue May 15, 2021 · 28 comments
@SthPhoenix
Contributor

SthPhoenix commented May 15, 2021

Hi! I'm testing your new SCRFD face detector and have noticed some issues with the ONNX inference code and network outputs:

  1. In scrfd.py line 275 you are filtering bboxes, but later at line 278 you return det, so the max_num parameter has no effect and may cause exceptions.

  2. Later, at line 335, you are calling the detector without providing an input shape, which won't work with a model that has a dynamic input shape. However, it isn't an issue when called from face_analysis.py.

  3. I have noticed that the detector returns very low scores, or even fails, on faces occupying more than 40% of the image. This is especially visible for square images, where no additional padding can be added during the resize step. I have also noticed that in such cases accuracy increases when lowering the detection size (e.g. 480x480) and decreases when increasing it (e.g. 1024x1024).
    Here is an example of detection at 640x640 scale:
    [example detection image]
    The original image size is 1200x1200.
    As you can see, when detection is run with a resize to 640x640 the score is 0.38;
    for 480x480 the score is 0.86, and for 736x736 the score is 0.07.
    The same behavior occurs for both the scrfd_10g_bnkps and scrfd_2.5g_bnkps models.
    In some cases this can be fixed by adding fixed padding around the image, but that may decrease accuracy for other image types, so it can't be applied by default.
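The fixed-padding workaround mentioned in point 3 can be sketched roughly like this (a minimal NumPy sketch; the function name and the 0.2 pad ratio are my own choices, not from scrfd.py):

```python
import numpy as np

def pad_image(img: np.ndarray, pad_ratio: float = 0.2):
    """Add a constant black border around the image so large faces no
    longer touch the frame edges. Returns the padded image and the (x, y)
    offset needed to map detections back to original coordinates."""
    h, w = img.shape[:2]
    pad_h, pad_w = int(h * pad_ratio), int(w * pad_ratio)
    padded = np.zeros((h + 2 * pad_h, w + 2 * pad_w, img.shape[2]), dtype=img.dtype)
    padded[pad_h:pad_h + h, pad_w:pad_w + w] = img
    return padded, (pad_w, pad_h)
```

After detection, subtract the returned offset from the box coordinates to map them back to the original image.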

BTW: Thanks for your great work!

@nttstar
Collaborator

nttstar commented May 16, 2021

Hi @SthPhoenix, thanks for your attention.

  1. Just fixed.
  2. For models that support dynamic input, you should pass the input_size param to the detect() method.
  3. It actually depends on the anchor design, but it generally works well at 640 input size. You can try more pictures.
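For context, the max_num filtering presumably keeps only the most prominent detections, along these lines (my own NumPy sketch of one plausible implementation, not the actual scrfd.py fix):

```python
import numpy as np

def keep_top(det: np.ndarray, max_num: int) -> np.ndarray:
    """det is an (N, 5) array of [x1, y1, x2, y2, score] rows.
    Keep at most max_num boxes, preferring larger ones."""
    if max_num <= 0 or det.shape[0] <= max_num:
        return det
    areas = (det[:, 2] - det[:, 0]) * (det[:, 3] - det[:, 1])
    order = np.argsort(areas)[::-1]  # indices of largest boxes first
    return det[order[:max_num]]
```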

@nttstar
Collaborator

nttstar commented May 16, 2021

BTW: if you're in a single-face situation, an input size of 384/256 (or even 128) without padding is recommended.

@SthPhoenix
Contributor Author

  1. That's great! Thanks!
  2. Yes, I just meant that the example code will throw an exception out of the box, though it's easily fixable.
  3. The new detector works just great for other image types, besides the ones with large faces.

BTW: if you're in a single-face situation, input size of 384/256(even 128) without padding is recommended.

I'm developing a face recognition REST API based on InsightFace models and a TensorRT inference backend.
The problem comes with unconstrained heterogeneous input, like random photo galleries or images provided by users, where optimal settings can't be set in advance. The original RetinaFace detector works great in such a scenario, but it has somewhat lower accuracy and much lower speed than your new SCRFD detector.

It looks like the new detector was mostly optimized for small faces and is a bit undertrained for large faces. Can this be fixed during training, or is it a design flaw?

@nttstar
Collaborator

nttstar commented May 17, 2021

My suggestions (if you want to use the pre-trained models):
Option 1: Use a 512 input size.
Option 2: Use a combination of 640 and 256 inputs with some engineering tricks, which is only 16% more FLOPs than the single 640 input.
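Option 2 might be wired up like this: run the detector at 640 for small faces and again at 256 for large faces, then merge the two box sets (my own sketch; merge_detections and the 0.5 IoU threshold are assumptions, not InsightFace code):

```python
import numpy as np

def iou(a: np.ndarray, b: np.ndarray) -> float:
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def merge_detections(det_640: np.ndarray, det_256: np.ndarray,
                     thresh: float = 0.5) -> np.ndarray:
    """Merge [x1, y1, x2, y2, score] rows from a 640 pass (good for small
    faces) and a 256 pass (good for large faces), dropping 256-pass boxes
    that duplicate a 640-pass box."""
    keep = [d for d in det_256
            if all(iou(d[:4], e[:4]) < thresh for e in det_640)]
    return np.vstack([det_640] + keep) if keep else det_640
```

Both box sets must first be mapped back to original image coordinates before merging.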

@SthPhoenix
Contributor Author

Thanks! I was investigating these options yesterday; option 2 is more promising but logically more complicated.
In the long term, retraining seems like a better solution. Could you give any hints on which parameters could be tuned?

@SthPhoenix
Contributor Author

A little update: this bug seems to affect only the *_bnkps models. Models without keypoint detection work as expected.

@SthPhoenix
Contributor Author

SthPhoenix commented May 26, 2021

@nttstar, I have retrained scrfd_2.5g_bnkps with batch norm replaced by group norm (the new model should be called scrfd_2.5g_gnkps, I think), just like for the scrfd_2.5g model. I achieved the following WiderFace AP:

| Model | Easy | Medium | Hard |
| --- | --- | --- | --- |
| scrfd_2.5g_bnkps | 93.80 | 92.02 | 77.13 |
| scrfd_2.5g_gnkps | 93.57 | 91.70 | 76.08 |

As you can see, this config gives a small accuracy decrease while completely solving the problem with large faces. For the above example I'm getting a score of around 0.7.

@nttstar
Collaborator

nttstar commented May 26, 2021

@SthPhoenix Thanks! Did you make the feature maps shared? BTW, you can open a new repo to place this new model so that I can give a link to it, if you want.

@SthPhoenix
Contributor Author

SthPhoenix commented May 27, 2021

@SthPhoenix Thanks! Did you make the feature maps shared?

No, shared feature maps seem to reduce accuracy more noticeably.

BTW, you can open a new repo to place this new model so that I can give a link to it, if you want.

I'm training scrfd_10g_gnkps right now; both models will be included in the InsightFace-REST repo, though it would be great if you mentioned it :)
I can also make a pull request with updated configs for the 0.5G, 2.5G, 10G, and 34G models, and a Dockerfile for training SCRFD in Docker.

@xxxpsyduck

@SthPhoenix So you're still using WiderFace and only changed the config?

@SthPhoenix
Contributor Author

@SthPhoenix So you're still using WiderFace and only changed the config?

Yes, just that.

@nttstar
Collaborator

nttstar commented May 27, 2021

@SthPhoenix Thanks! Did you make the feature maps shared?

No, shared feature maps seem to reduce accuracy more noticeably.

BTW, you can open a new repo to place this new model so that I can give a link to it, if you want.

I'm training scrfd_10g_gnkps right now; both models will be included in the InsightFace-REST repo, though it would be great if you mentioned it :)
I can also make a pull request with updated configs for the 0.5G, 2.5G, 10G, and 34G models, and a Dockerfile for training SCRFD in Docker.

No, shared feature maps seem to reduce accuracy more noticeably.
Have you tested it? How about the mAP?

@SthPhoenix
Contributor Author

I have tested it by modifying scrfd_500m.py (as it can be trained faster) as follows:

norm_cfg=dict(type='GN', num_groups=16, requires_grad=True),
cls_reg_share=True,
strides_share=True,

After epoch 640 I got the following mAP:
0.881, 0.851, 0.619
which is much lower than the results you reported for the same model without KPS, and even lower than the newer scrfd_500m_bnkps you published yesterday.

So I have trained the scrfd_2.5g_gnkps model, and am now training the scrfd_10g_gnkps model, with the following config:

norm_cfg=dict(type='GN', num_groups=16, requires_grad=True),
cls_reg_share=True,
strides_share=False,

BTW, the scrfd_500m_gnkps model showed no improvement on large faces, though I'm not sure if that's connected to strides_share=True. I'll try retraining this model and checking again.

@nttstar
Collaborator

nttstar commented May 27, 2021

Shared feature maps should be better when using GN, based on my experiments with a ResNet backbone.

@SthPhoenix
Contributor Author

Shared feature maps should be better when using GN, based on my experiments with a ResNet backbone.

Hmmm, I'll check it on other models, thanks!

@SthPhoenix
Contributor Author

I have released the retrained models in my repo.
Model accuracy on the WiderFace benchmark:

| Model | Easy | Medium | Hard |
| --- | --- | --- | --- |
| scrfd_10g_gnkps | 95.51 | 94.12 | 82.14 |
| scrfd_2.5g_gnkps | 93.57 | 91.70 | 76.08 |
| scrfd_500m_gnkps | 88.70 | 86.11 | 63.57 |

All models were trained with the following settings:

norm_cfg=dict(type='GN', num_groups=16, requires_grad=True),
cls_reg_share=True,
strides_share=False,

The scrfd_10g_gnkps model was trained up to epoch 720; for some reason it gives the best results at this checkpoint, while all the other models begin to degrade after epoch 640.

@czzbb

czzbb commented Jul 27, 2021

@nttstar, I have retrained scrfd_2.5g_bnkps with batch norm replaced by group norm (the new model should be called scrfd_2.5g_gnkps, I think), just like for the scrfd_2.5g model. I achieved the following WiderFace AP:

| Model | Easy | Medium | Hard |
| --- | --- | --- | --- |
| scrfd_2.5g_bnkps | 93.80 | 92.02 | 77.13 |
| scrfd_2.5g_gnkps | 93.57 | 91.70 | 76.08 |

As you can see, this config gives a small accuracy decrease while completely solving the problem with large faces. For the above example I'm getting a score of around 0.7.

Hi, I just ran CUDA_VISIBLE_DEVICES="0,1,2,3" PORT=29701 bash ./tools/dist_train.sh ./configs/scrfd/scrfd_2.5g.py 4 and only achieved 62.4 AP. Did I miss anything?

@SthPhoenix
Contributor Author

Hi @czzbb! My config was based on scrfd_2.5g_bnkps.py, modified according to my previous post.

@czzbb

czzbb commented Jul 28, 2021

Hi @czzbb! My config was based on scrfd_2.5g_bnkps.py, modified according to my previous post.

Hi, I can't reproduce the official results (I just got 62.4, while it should be 77 AP), so I wonder if there is anything I missed. I downloaded the dataset and annotations, and then directly ran CUDA_VISIBLE_DEVICES="0,1,2,3" PORT=29701 bash ./tools/dist_train.sh ./configs/scrfd/scrfd_2.5g.py 4

@SthPhoenix
Contributor Author

Hi, I can't reproduce the official results (I just got 62.4, while it should be 77 AP), so I wonder if there is anything I missed. I downloaded the dataset and annotations, and then directly ran CUDA_VISIBLE_DEVICES="0,1,2,3" PORT=29701 bash ./tools/dist_train.sh ./configs/scrfd/scrfd_2.5g.py 4

If you are using the original configs you should get mAP close to the published values without issues.
Have you tested mAP using the evaluation script, or is this the mAP logged during training? You should refer to the mAP output by the evaluation script.

@czzbb

czzbb commented Jul 28, 2021

Hi, I can't reproduce the official results (I just got 62.4, while it should be 77 AP), so I wonder if there is anything I missed. I downloaded the dataset and annotations, and then directly ran CUDA_VISIBLE_DEVICES="0,1,2,3" PORT=29701 bash ./tools/dist_train.sh ./configs/scrfd/scrfd_2.5g.py 4

If you are using the original configs you should get mAP close to the published values without issues.
Have you tested mAP using the evaluation script, or is this the mAP logged during training? You should refer to the mAP output by the evaluation script.

Thanks a lot! I got the 62.4 AP during training, but I get 77 AP using the evaluation script. I can't believe the difference could be so huge.

@tyxsspa

tyxsspa commented Sep 24, 2021

@nttstar @SthPhoenix
Hi, can you explain why GN works for big faces? I retrained scrfd_10g_bnkps with specific big-face augmentation (faces occupying >80% of the image), adding ~20% big faces to each batch, and the model can handle big-face detection. Then I retrained scrfd_34g_bnkps and it works badly; no big faces are detected. Then I changed to scrfd_34g_gnkps and it works, but GN is not supported by TensorRT. (@SthPhoenix, can you tell me how to convert GN ONNX models to TensorRT models?)

@SthPhoenix
Contributor Author

Then I changed to scrfd_34g_gnkps and it works, but GN is not supported by TensorRT. (@SthPhoenix, can you tell me how to convert GN ONNX models to TensorRT models?)

In TensorRT you need to load the optional plugin layers. With the Python TensorRT API you just need to add one line right after the import:

import tensorrt as trt
# Register TensorRT's built-in plugin layers before building or
# deserializing an engine that depends on them.
trt.init_libnvinfer_plugins(None, "")

@markdchoung

Hi, @tuoyuxiang @SthPhoenix

I followed the guide above, but I cannot reproduce the model's performance. The model performs well on the WiderFace validation dataset, but it performs poorly on images with large faces.

Did you train with the following configuration?

norm_cfg=dict(type='GN', num_groups=16, requires_grad=True),
cls_reg_share=True,
strides_share=False,

Did you train the model without any other additional methods, and does the trained model then perform well on large faces?
How many GPUs did you use? The total batch size varies with the number of GPUs; could this also affect performance? I used 8 GPUs for training.

@tyxsspa

tyxsspa commented Feb 16, 2023

Hi, @tuoyuxiang @SthPhoenix

I followed the guide above, but I cannot reproduce the model's performance. The model performs well on the WiderFace validation dataset, but it performs poorly on images with large faces.

Did you train with the following configuration?

norm_cfg=dict(type='GN', num_groups=16, requires_grad=True),
cls_reg_share=True,
strides_share=False,

Did you train the model without any other additional methods, and does the trained model then perform well on large faces? How many GPUs did you use? The total batch size varies with the number of GPUs; could this also affect performance? I used 8 GPUs for training.

I used this to improve large faces; you can try it:
https://modelscope.cn/models/damo/cv_resnet_facedetection_scrfd10gkps/summary

@JackLin-Authme

JackLin-Authme commented Apr 20, 2023

Hi, there.

I found that using the default scale augmentation (range = [0.3, 0.45, ..., 2.0]) during training can turn tons of tiny faces into negative samples under ATSS, but in the WiderFace evaluation these tiny faces are important for the hard protocol.

[screenshot: 2023-04-20 11:35:19]

Based on the ATSS algorithm, one step is to select anchors whose centers fall inside the gt boxes. These tiny faces may not be covered by the selected anchors, or may have only a few anchors predicting them.
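A simplified sketch of that center-inside-gt selection step (my own NumPy illustration of the behavior described above, not the mmdetection implementation):

```python
import numpy as np

def centers_inside_gt(anchor_centers: np.ndarray, gt_box: np.ndarray) -> np.ndarray:
    """Boolean mask of anchor centers lying inside a ground-truth box
    [x1, y1, x2, y2]. For a tiny face (a few pixels after a 0.3x scale)
    with coarse anchor spacing, this mask can easily be all False,
    leaving the face with no positive anchors."""
    x, y = anchor_centers[:, 0], anchor_centers[:, 1]
    return (x > gt_box[0]) & (x < gt_box[2]) & (y > gt_box[1]) & (y < gt_box[3])
```

For example, a 3x3 px face can fall entirely between stride-8 anchor centers and end up as a negative sample.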

[screenshot: 2023-04-20 11:23:02]

Has anyone else run into this issue, and how do you explain or solve it?

By the way, I solved the large-face problem by replacing the SGD optimizer with AdamW.
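For reference, the optimizer swap could look like this in an mmdetection-style SCRFD config (a sketch; the lr and weight_decay values are my placeholders, not the poster's actual settings):

```python
# mmdetection-style config fragment: swap the default SGD optimizer for AdamW.
# The lr and weight_decay values below are illustrative placeholders.
optimizer = dict(type='AdamW', lr=0.001, weight_decay=0.05)
optimizer_config = dict(grad_clip=None)
```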

@NahidEbrahimian

dockerfile for training scrfd in docker.

@SthPhoenix Can you share your Dockerfile for training SCRFD in Docker?

I made a Dockerfile, but during training in Docker, at specific epochs, training stopped with an error.

@Jack-Lin-NTU

Hi, there.

I found that using the default scale augmentation (range = [0.3, 0.45, ..., 2.0]) during training can turn tons of tiny faces into negative samples under ATSS, but in the WiderFace evaluation these tiny faces are important for the hard protocol.

[screenshot: 2023-04-20 11:35:19]

Based on the ATSS algorithm, one step is to select anchors whose centers fall inside the gt boxes. These tiny faces may not be covered by the selected anchors, or may have only a few anchors predicting them.

[screenshot: 2023-04-20 11:23:02]

Has anyone else run into this issue, and how do you explain or solve it?

By the way, I solved the large-face problem by replacing the SGD optimizer with AdamW.

I read the SCRFD paper; the authors say faces smaller than 4x4 pixels should be dropped, but the source code doesn't add this constraint. Is that right?
