Two boxes, same points? #18

Closed

realleyriley opened this issue Jun 28, 2019 · 4 comments

realleyriley commented Jun 28, 2019

In the dlib_faces_5points.tar there are two boxes for each picture in the xml files with the same 5 landmark points. Why are there two boxes? Did you run two different face detectors? Maybe dlib's and OpenCV's face detectors?

An example of the two boxes from the third image in the train set:
[screenshot: two <box> elements for the same image, each listing identical <part> landmark coordinates]

realleyriley commented Jun 29, 2019

After thinking about it, I suspect the two boxes are there to make the landmark predictor more robust. I ran the script below to look for boxes whose landmark coordinates differ, and found none. The landmark predictor takes the bounding box as input, so training the same landmarks against differently placed boxes effectively expands the training set and should make the predictor less sensitive to exactly how the box is drawn.

import xml.etree.ElementTree as ET

root = ET.parse('train_cleaned.xml').getroot()

for image in root.find('images').findall('image'):
    boxes = image.findall('box')        # all bounding boxes for this image
    if len(boxes) < 1:
        print('no boxes? ', image.attrib)
    elif len(boxes) > 1:                # some images only have one box
        # compare the landmark parts of the first two boxes
        parts0 = [part.attrib for part in boxes[0].findall('part')]
        parts1 = [part.attrib for part in boxes[1].findall('part')]
        if parts0 != parts1:
            print('unequal points ', image.attrib)
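
The inverse check is also easy: print the box geometry for a few images that carry two boxes, to confirm the boxes themselves do differ even though the landmarks match. A quick sketch, assuming the usual top/left/width/height attributes of dlib's imglab XML format:

# Sketch: show the geometry of the first few images that carry two boxes.
# Assumes the same train_cleaned.xml layout as the script above.
import xml.etree.ElementTree as ET

root = ET.parse('train_cleaned.xml').getroot()

shown = 0
for image in root.find('images').findall('image'):
    boxes = image.findall('box')
    if len(boxes) > 1:
        print(image.attrib.get('file'))
        for box in boxes:
            # each <box> carries its position and size as attributes
            print('   ', {k: box.attrib[k] for k in ('top', 'left', 'width', 'height')})
        shown += 1
        if shown >= 5:
            break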

@davisking (Owner)

Yes, this is the reason. The two sets of boxes come from dlib's two face detectors. Training on both of them makes the resulting model work well with either detector since, as you noted, the landmarking model's predictions are seeded from the position of the bounding box. Boxes placed systematically differently from how they appeared in the training data would therefore give worse results, and since the two face detectors use slightly different box annotation schemes, this would be a problem if the landmarking model weren't trained to be aware of it.
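
To make the seeding concrete, here is a minimal sketch (not from this thread) that runs both of dlib's face detectors and feeds each resulting box to the 5-point predictor. It assumes the published model files mmod_human_face_detector.dat and shape_predictor_5_face_landmarks.dat are on disk and uses a hypothetical input image face.jpg:

# Minimal sketch: seed the 5-point predictor from boxes produced by both detectors.
import dlib

img = dlib.load_rgb_image("face.jpg")    # hypothetical input image

hog_detector = dlib.get_frontal_face_detector()                                   # HOG + SVM detector
cnn_detector = dlib.cnn_face_detection_model_v1("mmod_human_face_detector.dat")   # MMOD CNN detector
predictor = dlib.shape_predictor("shape_predictor_5_face_landmarks.dat")

# The HOG detector returns rectangles directly; the CNN detector returns
# mmod_rectangle objects whose .rect holds the box.
hog_boxes = list(hog_detector(img))
cnn_boxes = [d.rect for d in cnn_detector(img)]

for label, boxes in (("HOG", hog_boxes), ("CNN", cnn_boxes)):
    for box in boxes:
        # The predictor is seeded from whatever box it is handed, so it must have
        # been trained on boxes drawn the way this detector draws them.
        shape = predictor(img, box)
        points = [(shape.part(i).x, shape.part(i).y) for i in range(shape.num_parts)]
        print(label, box, points)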

@realleyriley (Author)

Thank you for responding; that makes sense.

Another question: why is there a black border on many of the pictures, and why are their dimensions square? I flipped through the first 50 and noticed that many of them are 711x711 or 411x411, plus a few other sizes. How will the black pixels affect training and the resulting model?

The training is done in shape_predictor_trainer.h, but I don't understand it well enough to answer my own question.
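
For what it's worth, the size check can be scripted rather than done by eye. A small sketch, assuming the file attribute of each <image> entry resolves relative to the extracted tarball and that Pillow is installed:

# Sketch: report the size of the first 50 training images listed in the XML.
import xml.etree.ElementTree as ET
from PIL import Image

root = ET.parse('train_cleaned.xml').getroot()
images = root.find('images').findall('image')

for image in images[:50]:
    path = image.attrib['file']
    with Image.open(path) as im:
        w, h = im.size
        print(path, f'{w}x{h}', '(square)' if w == h else '')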


davisking commented Jun 30, 2019 via email
