How can I separate the face detection from the landmark extraction? #4

Closed
ivsanro1 opened this issue Jul 19, 2017 · 3 comments

Comments

@ivsanro1

Hello,

I acknowledge that the current implementation of fast face is pretty good, but I think face detection and landmark extraction are being performed atomically. So when you skip frames (for speedup purposes) you also skip the landmark relocalization, which makes the whole app a little clunky: you can see the landmarks stay in the same position for N=3 frames.

My idea is basically to separate face detection from landmark extraction, so you can run face detection once every 5 or 6 frames but keep extracting landmarks every 1 or 2 frames. The problem is that I know little to nothing about JNI, and I don't see how the two processes (face detection and landmark extraction) can be separated, because they seem to be in the same native method. How could I accomplish this?
Note that this would also be useful because you could drop the HOG face detector from dlib and try Viola-Jones from OpenCV, or even other approaches...

And other unrelated questions:

Can you provide the source code of dlib with the changes? Or at least a brief explanation of the changes made to dlib?

Why are the source files of the JNI libraries not in the project? Did you make any changes to them? (the ones here: https://github.com/tzutalin/dlib-android/tree/master/jni)

Thanks in advance

@gicheonkang
Owner

Hi @ivsanro1
Hmm... That's a fresh idea.
But you might face some issues, because it requires modifying dlib's overall architecture.

In app/src/main/.../OnGetImageListener.java, there is code like the following:
results = mFaceDet.detect(mResizedBitmap);
This method performs both detection and feature extraction. Currently, the two steps are not separated.

Here is the source code of the current engine: dlib
I only modified some parts that could be processed in parallel.
Please check the source code to see whether separation is possible.
Thank you for your suggestion :)

@ivsanro1
Author

ivsanro1 commented Jul 19, 2017

I know, I know, but I don't think it is really about dlib. Let me explain:

As far as I know, dlib is attached to the project via a shared library that also includes the compiled JNI libraries, which let the C++ code (dlib) communicate with the Java code (the app). In this project, the first call from the Java code related to face detection is, as you said, in app/src/main/.../OnGetImageListener.java:

results = mFaceDet.detect(mResizedBitmap);

where mFaceDet is an instance of the class FaceDet and, as you said, detect(·) is a method of that class which performs the face detection and the landmark calculation atomically.

I have seen the JNI source files in the dlib-android project, and I guess they are unchanged in this project. They can be found here:

https://github.com/tzutalin/dlib-android/tree/master/jni/jni_detections

Specifically, jni_face_det.cpp has the source code of the native methods that call directly into dlib. This is the part I don't understand very well, but you get the main idea: the detection and the landmarks are calculated atomically (you can't calculate them separately).

However, if we take a look at the dlib example face_landmark_detection_ex.cpp, which can be found at:

http://dlib.net/face_landmark_detection_ex.cpp.html

you can see that the face detector:

frontal_face_detector detector = get_frontal_face_detector();

and the landmark calculator, known as shape predictor:

shape_predictor sp; deserialize(argv[1]) >> sp;

are two separate things, and, in fact, the face detection task:

std::vector<rectangle> dets = detector(img);

and the landmark extraction task:

full_object_detection shape = sp(img, dets[j]);

are done separately. Therefore, although I don't really know how to separate this code in the JNI libraries, I think it should be possible (and, in fact, it would be best practice) to write one JNI native method for face detection only and another one for landmark extraction. But I have a lot of trouble understanding how I could achieve this, because I have no experience with JNI and I find JNI code very awful.
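
For illustration, the Java side of such a split could look roughly like the sketch below. Every name in it (Rect, FaceDetector, LandmarkExtractor, SplitPipeline) is hypothetical and does not exist in this project or in dlib-android. In a real port, the two interfaces would presumably be backed by two separate JNI native methods, one wrapping detector(img) and one wrapping sp(img, rect). The image is a placeholder int[] so the sketch runs without Android.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical face box; stands in for dlib's rectangle. */
final class Rect {
    final int left, top, right, bottom;

    Rect(int left, int top, int right, int bottom) {
        this.left = left;
        this.top = top;
        this.right = right;
        this.bottom = bottom;
    }
}

/** Hypothetical wrapper around a detection-only native method. */
interface FaceDetector {
    List<Rect> detect(int[] image);
}

/** Hypothetical wrapper around a landmark-only native method. */
interface LandmarkExtractor {
    /** Returns landmark coordinates for one face (layout is an assumption). */
    int[] landmarks(int[] image, Rect face);
}

/** Runs detection and landmark extraction at independent frame intervals. */
final class SplitPipeline {
    private final FaceDetector detector;
    private final LandmarkExtractor extractor;
    private final int detectEvery;
    private final int landmarkEvery;
    private List<Rect> lastFaces = new ArrayList<>();
    private long frameIndex = 0;

    SplitPipeline(FaceDetector detector, LandmarkExtractor extractor,
                  int detectEvery, int landmarkEvery) {
        if (detectEvery <= 0 || landmarkEvery <= 0) {
            throw new IllegalArgumentException("intervals must be positive");
        }
        this.detector = detector;
        this.extractor = extractor;
        this.detectEvery = detectEvery;
        this.landmarkEvery = landmarkEvery;
    }

    /** Returns one landmark array per cached face, or null if this frame is skipped. */
    List<int[]> onFrame(int[] image) {
        long i = frameIndex++;
        if (i % detectEvery == 0) {
            // Expensive step: refresh the cached face rectangles.
            lastFaces = detector.detect(image);
        }
        if (i % landmarkEvery != 0) {
            return null; // skipped frame: the caller keeps drawing its previous result
        }
        // Cheaper step: re-fit landmarks inside the cached rectangles.
        List<int[]> result = new ArrayList<>();
        for (Rect face : lastFaces) {
            result.add(extractor.landmarks(image, face));
        }
        return result;
    }
}
```

Because the detector sits behind an interface, a different detector (for example one based on OpenCV) could be plugged in without touching the landmark step. One caveat of this design is that the cached rectangles go stale if the face moves a lot between detections.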

Thanks for your time

@gicheonkang
Owner

@ivsanro1 Yes, originally dlib processes the steps as you mentioned: frontal_face_detect --> feature extraction.
But I don't fully understand your suggestion. Is this the main idea?
Reduce the frequency of face detection (every 6 frames) and keep extracting features (every 2 frames).

ex) one cycle
frame 1 Face detection, feature extraction
frame 2 -
frame 3 Feature extraction
frame 4 -
frame 5 Feature extraction
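
The cycle above can be reproduced with two modulo checks. The sketch below is only an illustration: FrameSchedule and actionFor are hypothetical names, frames are numbered from 1 as in the listing, and it assumes the detection interval is a multiple of the landmark interval so that every detection frame also extracts features.

```java
/** Hypothetical helper that decides what to run on a given frame. */
final class FrameSchedule {
    private FrameSchedule() {}

    /**
     * @param frame         1-based frame number within the stream
     * @param detectEvery   run face detection every this many frames
     * @param landmarkEvery run feature extraction every this many frames
     */
    static String actionFor(int frame, int detectEvery, int landmarkEvery) {
        int offset = frame - 1; // frame 1 is the start of a cycle
        boolean detect = offset % detectEvery == 0;
        boolean extract = offset % landmarkEvery == 0;
        if (detect && extract) {
            return "Face detection, feature extraction";
        }
        if (detect) {
            return "Face detection";
        }
        if (extract) {
            return "Feature extraction";
        }
        return "-";
    }
}
```

Calling FrameSchedule.actionFor(frame, 6, 2) for frames 1 to 5 gives the same five rows as the cycle listed above; frame 6 is skipped and frame 7 starts the next cycle with a new detection.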

I completely understand that this idea can boost the speed temporarily.
But the fundamental problem of this app is that it detects only every N frames, so the other N-1 frames can show false positives.
I think changing the internal algorithm has its limits. So I'm working hard to boost the speed using external resources (GPU, FPU core, deep-learning-based modeling).

Anyway, thank you so much for your interest.
When I upgrade the app, I'll let you know first.
