
Question about 300W-LP label acquisition #61

Closed
FunkyKoki opened this issue Jan 6, 2022 · 6 comments
Labels: question (Further information is requested)

Comments

@FunkyKoki commented Jan 6, 2022

Thanks for your work and help. I now have a lightweight model that uses MobileNetV3-small as its backbone. It achieves the same pose evaluation performance on AFLW2000 as your model (both without fine-tuning on 300W-LP).

Now I am focused on fine-tuning. In your paper, you said:

Training pose rotation labels are obtained by converting the 300W-LP ground-truth Euler angles to rotation vectors, and pose translation labels are created using the ground-truth landmarks, using standard means.

I opened this issue to confirm several things, and I would be very grateful for your help.

Here are my questions:

  1. How did you define the face bounding box for each image in 300W-LP, given that some images contain two or more detectable faces? Did you use only the one bounding box that has landmark annotations? How did you obtain that bounding box? Did you use a face detector, such as InsightFace?
  2. Since only one face in each 300W-LP image is annotated with 68 points, can I use the labeled landmarks directly to build the JSON files, and then use self.threed_68_points in the code here to generate the LMDB file?

That's all. Thank you so much.

@eugeneYz commented Jan 8, 2022

Have you ever found the predicted x, y, z to be wrong (especially z, when I use my webcam for detection)?

@FunkyKoki (Author) commented Jan 10, 2022

> Have you ever found the predicted x, y, z to be wrong (especially z, when I use my webcam for detection)?

@eugeneYz

Which model are you using? Are you using the fine-tuned one?

By the way, your question is not related to this topic at all. Please ask @vitoralbiero in a new issue.

@FunkyKoki (Author)

Hi @vitoralbiero, why haven't you released the 300W-LP annotations? This seems strange to me. Is it because of copyright?

I used InsightFace to detect the faces and their corresponding landmarks (3D, 68 points) for each image in 300W-LP; only the face closest to the center of the image was saved as ground truth (in .json format). Then I used convert_json_list_to_lmdb.py to convert the JSON files into an LMDB file for training.
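For reference, here is a minimal sketch of that center-closest selection step, assuming InsightFace's FaceAnalysis API (the bbox and landmark_3d_68 attributes); the closest_to_center helper is illustrative, not code from this repository:

```python
import cv2
import numpy as np
from insightface.app import FaceAnalysis

# Detection + 3D 68-point landmark models from InsightFace.
app = FaceAnalysis(allowed_modules=["detection", "landmark_3d_68"])
app.prepare(ctx_id=0, det_size=(640, 640))

def closest_to_center(image_path):
    """Return the detected face whose bbox center is nearest the image center."""
    img = cv2.imread(image_path)
    h, w = img.shape[:2]
    image_center = np.array([w / 2.0, h / 2.0])
    faces = app.get(img)
    if not faces:
        return None

    def distance(face):
        x1, y1, x2, y2 = face.bbox  # detector box as [x1, y1, x2, y2]
        bbox_center = np.array([(x1 + x2) / 2.0, (y1 + y2) / 2.0])
        return np.linalg.norm(bbox_center - image_center)

    # face.landmark_3d_68 holds the (68, 3) landmarks that get saved to JSON.
    return min(faces, key=distance)
```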

However, the evaluation performance on AFLW2000 did not improve at all.

@vitoralbiero (Owner)

Hello @FunkyKoki,

I don't have time right now to release the annotations. But we may do so in the future.

300W-LP comes with Euler angles and landmarks, so we converted the provided Euler angles to get rotation vectors, and used the landmarks to get translation vectors.
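For anyone reproducing this, here is a hedged sketch of both conversions (not the repository's exact code): SciPy handles Euler-to-rotation-vector, and OpenCV's solvePnP can recover a translation vector from the 68 2D landmarks plus a 3D 68-point reference shape. The "xyz" axis order, radians, and the simple pinhole camera below are assumptions that must be matched to 300W-LP's pose convention:

```python
import cv2
import numpy as np
from scipy.spatial.transform import Rotation

def euler_to_rotvec(pitch, yaw, roll):
    # Axis order and signs are assumptions; align them with how
    # 300W-LP stores its Euler angles (radians assumed here).
    return Rotation.from_euler("xyz", [pitch, yaw, roll]).as_rotvec()

def translation_from_landmarks(landmarks_2d, ref_points_3d, img_w, img_h):
    """Estimate a translation vector from (68, 2) landmarks and a (68, 3)
    generic 3D reference shape (e.g., the repo's threed_68_points).

    A pinhole camera with focal length = image width is assumed.
    """
    camera_matrix = np.array(
        [[img_w, 0, img_w / 2.0],
         [0, img_w, img_h / 2.0],
         [0, 0, 1]], dtype=np.float64)
    ok, rvec, tvec = cv2.solvePnP(
        ref_points_3d.astype(np.float64),
        landmarks_2d.astype(np.float64),
        camera_matrix,
        distCoeffs=None,
        flags=cv2.SOLVEPNP_EPNP)
    return tvec.ravel() if ok else None
```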

To fine-tune on 300W-LP, please follow section 4.1 of our paper, which describes the training and fine-tuning steps.

Hope this helps.

@vitoralbiero added the "question" label on Jan 12, 2022
@FunkyKoki (Author)

Hi there, thanks for your answer, but I still cannot figure out how you defined the ground-truth face bounding box for each image.

@vitoralbiero (Owner)

We used the provided landmarks to get a bounding box. You can use this function to get a bbox using landmarks.
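For illustration, the idea reduces to a min/max over the landmark coordinates; a minimal sketch follows (the expand margin is a hypothetical parameter, not necessarily the linked function's signature):

```python
import numpy as np

def bbox_from_landmarks(landmarks, expand=0.0):
    """Tight (x1, y1, x2, y2) box around (N, 2+) landmarks, optionally
    expanded by a fraction of the box width/height on each side."""
    landmarks = np.asarray(landmarks)
    x1, y1 = landmarks[:, :2].min(axis=0)
    x2, y2 = landmarks[:, :2].max(axis=0)
    w, h = x2 - x1, y2 - y1
    return (x1 - expand * w, y1 - expand * h,
            x2 + expand * w, y2 + expand * h)
```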
