
about the result #3

Closed
sunny0315 opened this issue Jun 15, 2020 · 10 comments

@sunny0315

I tried to run the file 'serfiq_example.py' and got results for the two test images you provided in './data/': 'test_img.jpeg' scores 0.89 and 'test_img2.jpeg' scores 0.87. Are these results correct? I have also tested other images, such as side faces and frontal faces, but their results are indistinguishable. Is there something wrong with my usage? Thank you for your reply!

@codelilei

codelilei commented Jun 15, 2020

I have the same problem as above. I have also tried saving 'pre_fc1_bias.npy' and 'pre_fc1_weights.npy' manually, since the ArcFace pretrained model is not specified by the author, but the problem still exists.

@jankolf
Collaborator

jankolf commented Jun 15, 2020

Hi,
the results or scores seem to be correct.

@sunny0315: If you look at the following paper, which is also linked in the repository, you will see that the score distribution of SER-FIQ on ArcFace lies mostly in the range of 0.86-0.90:
https://arxiv.org/abs/2004.01019.
The paper includes a figure of the score distributions for different ethnicities on the ColorFeret dataset.
Depending on how well ArcFace can handle an input image, the score will change; details can be found in the two papers in the repository. If ArcFace/Insightface is able to produce a good, stable embedding from the image, the score will be higher than for an image where ArcFace is "less stable".
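To make the "stability" idea concrete, here is a minimal NumPy sketch. It is only an illustration, not the repository's implementation: the stochastic embeddings are simulated with random vectors (in SER-FIQ they come from repeated forward passes with different dropout patterns), and the score is 2 * sigmoid(-mean pairwise Euclidean distance), following the scoring formula in the paper.

```python
import numpy as np

def serfiq_score(embeddings):
    """Quality from the variation between stochastic embeddings:
    small pairwise distances -> stable model response -> high quality."""
    m = len(embeddings)
    # mean Euclidean distance over all pairs of stochastic embeddings
    dists = [np.linalg.norm(embeddings[i] - embeddings[j])
             for i in range(m) for j in range(i + 1, m)]
    mean_dist = np.mean(dists)
    return 2.0 / (1.0 + np.exp(mean_dist))  # = 2 * sigmoid(-mean_dist)

# Simulated stochastic embeddings of one image; in SER-FIQ these would be
# produced by repeated stochastic forward passes of the network.
rng = np.random.default_rng(0)
base = rng.normal(size=512)
stable   = [base + 0.01 * rng.normal(size=512) for _ in range(20)]
unstable = [base + 0.20 * rng.normal(size=512) for _ in range(20)]

print(serfiq_score(stable) > serfiq_score(unstable))  # True
```

The image whose embeddings vary less across the stochastic passes receives the higher score.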

@codelilei : The weights stored in the data folder are the weights that were used for the experiments. We have used a pretrained model from Insightface's model zoo. The model used in our experiments is the "LResNet100E-IR,ArcFace@ms1m-refine-v2" pre-trained model, downloaded in April 2019.

If you have further questions, please do not hesitate to ask them.

Best regards,
Jan

@pterhoer
Owner

Hi everyone,
first of all, thanks for your interest in our work!

As Jan already mentioned, SER-FIQ on ArcFace unfortunately produces a very narrow range of quality estimates. Although this narrow range is inconvenient, the scores are still meaningful (if you take more than two decimal places into account).

To get a more "natural" quality range, you can simply apply a scaling method such as min-max normalization. Or, if you are interested, we can add a scaling parameter to the model so it outputs quality scores in a convenient range of [0, 1].
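A min-max rescaling like the one mentioned above takes only a few lines. Note the caveat (not stated in the thread, but implied by the method): the mapping depends on the batch, so rescaled values are only comparable within the same batch. The example scores below are made up for illustration.

```python
import numpy as np

def minmax_rescale(scores):
    """Spread a batch of narrow SER-FIQ scores over the full [0, 1] range."""
    scores = np.asarray(scores, dtype=float)
    lo, hi = scores.min(), scores.max()
    if hi == lo:                     # all scores identical: no spread to use
        return np.zeros_like(scores)
    return (scores - lo) / (hi - lo)

raw = [0.862, 0.871, 0.889, 0.901]   # hypothetical raw SER-FIQ outputs
rescaled = minmax_rescale(raw)       # values now span 0.0 to 1.0
```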

Best,
Philipp

@codelilei

The model used in our experiments is the "LResNet100E-IR,ArcFace@ms1m-refine-v2" pre-trained model

@jankolf yeah, I got exactly the same weights saved from "LResNet100E-IR,ArcFace@ms1m-refine-v2" downloaded yesterday.

I was mainly confused by the fact that a profile face image could also get a score higher than 0.8, which seemed inconsistent with the first impression given by the distribution figure in your other paper, https://arxiv.org/abs/2004.01019.

the score distribution of SER-FIQ on Arcface is mostly in the range of 0.86-0.90

To get a more "natural" quality range, you can simply use scaling methods, such as MinMax normalization.

Now it makes sense; I hadn't noticed the starting point of the x-axis. I will try more images later and focus on the relative ordering of the scores.
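Working with the relative order rather than an absolute threshold could look like the following sketch (the image names and scores are hypothetical):

```python
# Hypothetical SER-FIQ scores per image; only their relative order matters.
scores = {"frontal_a.jpg": 0.893, "frontal_b.jpg": 0.889,
          "profile_a.jpg": 0.871, "blurred_a.jpg": 0.864}

# Rank images from highest to lowest quality.
ranked = sorted(scores, key=scores.get, reverse=True)

# e.g. keep the best half of the batch instead of using a fixed cutoff
keep = ranked[:len(ranked) // 2]
print(keep)  # ['frontal_a.jpg', 'frontal_b.jpg']
```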

Nice work! Thanks for your quick reply!

@sunny0315
Author

sunny0315 commented Jun 16, 2020

@jankolf @pterhoer Thanks for your quick replies! At first I thought that, based on this score, we could filter out problems such as pose, occlusions, and expressions, as described in the paper https://arxiv.org/pdf/2003.09373.pdf. So I collected some representative pictures, but their scores did not seem to correlate strongly with these variations, and the images cannot be filtered simply by setting a score threshold (which would likely lead to misjudgments). Maybe this method is better used to judge whether an image is suitable for a given recognition system.
[attached image: example face images with their SER-FIQ scores]

@pterhoer
Owner

@codelilei Thanks for your feedback! If you find any other problems, just contact us.

@sunny0315 The face quality score of SER-FIQ reflects how well the deployed face recognition model can deal with the input image. If your network deals well with various poses, occlusions, and expressions, SER-FIQ will not produce low quality values for such images. In that case, I would recommend using a network that is not robust to these variations; SER-FIQ will then produce low quality values for images with these variations.
(By the way, the same goes for biased networks: if the deployed network is biased, the obtained quality values will contain the same bias as well.)

@sunny0315
Author

@pterhoer
I see. Thank you very much!

@pterhoer
Owner

@pterhoer
I see. Thank you very much!

You are welcome :)

@RyanCV

RyanCV commented Jul 27, 2020

Hi @pterhoer

If your network can deal well with various poses, occlusions, and expressions, SER-FIQ will not produce low quality values. In this cases, I would recommend to use a network that is not robust to such variations.
Then what is the purpose of using SER-FIQ? It seems SER-FIQ strongly depends on, and is positively correlated with, the network, right? So how reliable is the face quality value it predicts?

@pterhoer
Owner

Hi again RyanCV,

if your network deals well with variations such as pose, occlusion, and expression, it will produce relatively stable representations. Stable representations lead to less variation in the stochastic embeddings and thus to high robustness and high quality estimates.

Best,
Philipp
