the align function may Shear the face image #49
Comments
Yes, I think this is a known problem which has been discussed, for example, here. To my knowledge it has not really been solved, though. Another approach is to skip the transformation entirely, use only a bounding box, and let the CNN handle any rotations within that box. Rotations can then be seen as a kind of data augmentation. I have tried this approach, using MTCNN for face alignment, and when training an Inception-Resnet-v1 network on this data I get a model with an accuracy of ~0.975 on LFW. I am not sure how much of the performance improvement can be attributed to avoiding the shearing effect, though.
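A minimal sketch of the bounding-box-only idea, assuming the detector returns a pixel box as (left, top, right, bottom). The function name `crop_box` and the `margin` parameter are hypothetical and only illustrate the approach; this is not the code from the repo.

```python
def crop_box(box, image_size, margin):
    """Expand a detected face box by `margin` pixels in total per axis.

    box        -- (left, top, right, bottom) in pixels (assumed detector output)
    image_size -- (width, height) of the full image
    margin     -- total number of extra pixels per axis, split evenly on both sides

    No rotation, scaling, or shear is applied: the result is a plain
    axis-aligned crop region, clamped to the image bounds.
    """
    left, top, right, bottom = box
    width, height = image_size
    half = margin // 2
    return (
        max(left - half, 0),
        max(top - half, 0),
        min(right + half, width),
        min(bottom + half, height),
    )


# Example: a 100x100 detection near the top-left corner of a 250x250 image.
region = crop_box((10, 20, 110, 120), (250, 250), margin=32)
```

The crop keeps the face geometry untouched, so any in-plane rotation stays in the data for the network to learn from.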
@davidsandberg Could you release the code for Inception-Resnet-v1 and the model with ~0.975 accuracy on LFW? Thank you very much!
All the code for training is already in the repo. I ran the command with the learning rate schedule (../data/learning_rate_schedule_classifier_long.txt)
The Inception-Resnet-v1 will improve the performance significantly compared to the nn4 model, but to get to 0.975 a better alignment (MTCNN) is also needed. The code that I used to get the above result can be found here, but it requires Caffe to be installed and the MTCNN repo to be cloned, which I don't plan to describe here. Instead I'm working on an implementation of this using python/tensorflow, but this is not ready yet.
I was studying Dlib recently and found that Dlib already provides the ability to detect faces, extract landmarks, and, with one additional step, save the "face_chips" (aligned using the landmarks) as image files. I tried this approach on the LFW dataset and it seems to work pretty well. Since "align_dlib.py" already uses Dlib, the easiest way to do this task seems to be to use native Dlib all the way. There is an example in the Dlib source tree: http://dlib.net/face_landmark_detection_ex.cpp.html The following is an example of how to save the aligned face_chips to files.
Thanks, --Scott
@davidsandberg
Hint: I replaced your "imResample" with "scipy.misc.imresize" and got >50% faster evaluation.
@melgor
For the resize thing, I guess it looks quite crazy :-), but the reason for using the home-brewed implementation was that while comparing the tensorflow implementation to the matlab one I needed a resample that worked identically in both. So I ended up having that same code in matlab as well. It should be exchanged for a scipy or opencv implementation once the two implementations match.
I tried with the following command:
python facenet_train_classifier.py --logs_base_dir logs/facenet/ --models_base_dir models/facenet/ --data_dir ./align/casia --image_size 182 --model_def models.inception_resnet_v1 --lfw_dir ./align/datasets/lfw_160 --weight_decay 2e-4 --optimizer RMSPROP --learning_rate -1 --max_nrof_epochs 80 --keep_probability 0.8 --random_crop --random_flip --learning_rate_schedule_file ../data/learning_rate_schedule_classifier_long.txt
It has the following error:
Traceback (most recent call last):
Do you know what went wrong with my command? Thanks
I'm pretty sure I ran into the same problem but now I'm not sure how it was solved. It could have been fixed when upgrading to a newer version of slim, but I'm not sure.
I got the following result:
Why not use least squares?
Y = MX
M = (Y X^T)(X X^T)^{-1}
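The closed form above can be sketched in plain Python as follows. This assumes X holds the detected landmarks in homogeneous coordinates (rows x, y, 1) and Y holds the target template positions (rows x, y), so M is a 2x3 matrix; the function names are hypothetical and only serve the illustration.

```python
def transpose(a):
    return [list(row) for row in zip(*a)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def inverse_3x3(m):
    """Invert a 3x3 matrix via its adjugate; raises if it is singular."""
    (a, b, c), (d, e, f), (g, h, i) = m
    det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)
    if abs(det) < 1e-12:
        raise ValueError("X X^T is singular (landmarks are collinear or too few)")
    adj = [[e * i - f * h, c * h - b * i, b * f - c * e],
           [f * g - d * i, a * i - c * g, c * d - a * f],
           [d * h - e * g, b * g - a * h, a * e - b * d]]
    return [[v / det for v in row] for row in adj]


def fit_affine(src_points, dst_points):
    """Least-squares M (2x3) such that dst ~= M @ [x, y, 1]^T for each src point."""
    # X is 3xN (homogeneous source landmarks), Y is 2xN (target landmarks).
    x = [[p[0] for p in src_points],
         [p[1] for p in src_points],
         [1.0 for _ in src_points]]
    y = [[p[0] for p in dst_points],
         [p[1] for p in dst_points]]
    xt = transpose(x)
    # M = (Y X^T)(X X^T)^{-1}
    return matmul(matmul(y, xt), inverse_3x3(matmul(x, xt)))


# Example with made-up landmark coordinates (not real detector output).
detected = [(30.0, 40.0), (70.0, 42.0), (50.0, 80.0), (48.0, 60.0)]
template = [(35.0, 45.0), (75.0, 45.0), (55.0, 85.0), (55.0, 65.0)]
M = fit_affine(detected, template)
```

Note that M fitted this way is an unconstrained affine map, so it can still contain a shear component if the landmarks call for one.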