Alignment improvements #50

Closed
bamos opened this issue Nov 5, 2015 · 7 comments
bamos commented Nov 5, 2015

From @hbredin: df40fa2#commitcomment-14206211

do we really need to detect the same face twice (once in the original image, once in the warped image)?
wouldn't it be less error-prone to directly warp the original image into a 96x96 normalized thumbnail?
I do not understand why we need to do that in two steps -- I must have missed something...

bamos commented Nov 5, 2015

Hi @hbredin - you're right, we don't. It was just easiest for prototyping to add the redundant detections. I'm opening this issue to improve this portion of the alignment; it should also reduce the execution time.

bamos changed the title from "Alignment: Warp into normalized thumbnail without multiple detections" to "Alignment improvements" Nov 12, 2015
bamos added this to the v0.2.0 milestone Nov 12, 2015
bamos commented Nov 12, 2015

Related:

I spent a little bit of time looking at OpenFace and noticed two potential issues with the alignment code. I'll explain briefly and am happy to send you some test images if you feel this makes any sense.

I used your comparison demos to get distances for face crops of a single identity. In one experiment I used only 20 crops and noticed that 10 pairs had a distance > 1.0.

When I looked more closely, I noticed the 3 landmarks you use to align the faces were NOT properly aligned in the -warped-annotated output files for some of my images. Resizing the image before calling your alignImg method fixes the alignment in all my test cases (but I have not fully investigated why this is the case).

I also noticed that to compute the final crop box you run dlib's face detector again. This seems to cause the landmarks above to not be placed consistently inside the box, so another quick change I made was to compute the crop box from the landmarks... What is the rationale behind preferring the box from running detection again?

After re-running my original experiment with the above two very simple changes, the number of pairs at a distance > 1.0 dropped from 10 to 4.
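
A minimal sketch of the two changes described above, assuming detected_landmarks is an (N, 2) NumPy array of (x, y) points; the function names, the 256-pixel threshold, and the margin value are illustrative, not OpenFace's actual API:

import cv2
import numpy as np

def upscale_if_small(image, min_side=256):
    # Change 1: enlarge small images before landmark detection, which the
    # feedback above reports fixes the misplaced landmarks.
    h, w = image.shape[:2]
    if min(h, w) >= min_side:
        return image
    scale = min_side / float(min(h, w))
    return cv2.resize(image, (int(round(w * scale)), int(round(h * scale))))

def crop_box_from_landmarks(detected_landmarks, margin=0.3):
    # Change 2: derive the crop box from the landmarks themselves instead of
    # running dlib's face detector a second time.
    x_min, y_min = detected_landmarks.min(axis=0)
    x_max, y_max = detected_landmarks.max(axis=0)
    pad_x = margin * (x_max - x_min)
    pad_y = margin * (y_max - y_min)
    return (x_min - pad_x, y_min - pad_y, x_max + pad_x, y_max + pad_y)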

bamos commented Nov 12, 2015

  • Another improvement: use the nose instead of the bottom lip as the third alignment landmark (see the index sketch below).

These improvements are likely to improve our LFW performance.
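
For reference, a sketch of the two landmark triplets in dlib's 68-point annotation scheme. The indices below are an assumption about the eventual choice, not code from the repository:

import numpy as np

# Indices into dlib's 68-point landmark set (0-indexed); assumed values.
INNER_EYES_AND_BOTTOM_LIP = np.array([39, 42, 57])  # current: inner eye corners + bottom lip
OUTER_EYES_AND_NOSE = np.array([36, 45, 33])        # proposed: eye corners + nose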

hbredin commented Nov 13, 2015

I see this issue is marked with the "help wanted" tag. How can we help?

I'd suggest something very simple inspired by this:

import cv2
import numpy as np

size = 96
EYES_AND_NOSE = np.array([39, 42, 33])
# getAffineTransform needs float32 point arrays; the result H is a 2x3 affine
# matrix, so warpAffine (not warpPerspective) is the matching warp call.
H = cv2.getAffineTransform(detected_landmarks[EYES_AND_NOSE].astype(np.float32),
                           size * mean_landmarks[EYES_AND_NOSE].astype(np.float32))
thumbnail = cv2.warpAffine(image, H, (size, size))

where mean_landmarks are already normalized between 0 (min) and 1 (max) in the mean.csv file.
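
A possible loading step for that file, assuming one "x,y" row per landmark with values already scaled to [0, 1] (the exact layout of mean.csv is an assumption here):

import numpy as np

# Assumed layout: one "x,y" row per landmark, values already in [0, 1].
mean_landmarks = np.loadtxt('mean.csv', delimiter=',')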

However, I guess if we change the way faces are normalized, the whole neural net needs to be retrained, right?

bamos mentioned this issue Nov 17, 2015
bamos commented Nov 17, 2015

Hi @hbredin - thanks for the code snippet. It's currently marked as help wanted to show that I'm not actively working on this and am happy for somebody else to help. Otherwise, I'll probably get to it in a few weeks.

If we're lucky, this change will be compatible with the existing nn4.v1 network and will yield similar accuracy on LFW. In that case, I think we should push it on top of the existing alignment code and train a new network just to see if the performance increases.

If it decreases the nn4.v1 performance, I think we should make the alignment code backwards compatible and train a new nn4.v2 based on it.

bamos pushed a commit that referenced this issue Dec 8, 2015
Thanks @hbredin for the great feedback!

1. Make it clear dlib expects RGB images.
2. Keep the mean landmarks in the alignment source code
   rather than loading them from a txt file.
3. Normalize the mean landmarks between 0 and 1 for
   the transformation.
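
A sketch of the normalization in item 3, assuming the mean landmarks arrive as a (68, 2) array in template pixel coordinates; the function name is illustrative:

import numpy as np

def minmax_normalize(template):
    # Min-max normalize each coordinate axis of the mean landmarks to [0, 1],
    # so they can be scaled by the thumbnail size in the affine transform.
    tpl_min = template.min(axis=0)
    tpl_max = template.max(axis=0)
    return (template - tpl_min) / (tpl_max - tpl_min)
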
bamos pushed a commit that referenced this issue Dec 8, 2015
bamos pushed a commit that referenced this issue Dec 8, 2015
bamos pushed a commit that referenced this issue Dec 11, 2015
bamos commented Dec 11, 2015

Thanks again @hbredin! Merged everything into the master branch.

bamos closed this as completed Dec 11, 2015
hbredin commented Dec 12, 2015

Great. Thanks. Looking forward to v2 model :)
