face_locations() with model='cnn' using GPU still very slow (~600ms for 500x400 image) on Jetson Nano #1471

Open
marcjasner opened this issue Feb 13, 2023 · 2 comments

marcjasner commented Feb 13, 2023

  • face_recognition version: 1.3
  • Python version: 3.6.9
  • Operating System: Ubuntu 18.04.6 LTS (Jetson Nano 4GB w/ JetPack 4.6)

I'm trying to get face detection and recognition working for an app that needs to detect and recognize faces as quickly as possible. My understanding is that, with a GPU, this library can reach up to 15 fps. I am using a Jetson Nano 4GB with JetPack 4.6, dlib 19.24 (CUDA enabled), and a clean installation of face_recognition 1.3 via pip3.

I installed face_recognition via pip3, which produced the following output:

Looking in indexes: https://pypi.org/simple, https://pypi.ngc.nvidia.com
Collecting face_recognition
  Downloading face_recognition-1.3.0-py2.py3-none-any.whl (15 kB)
Requirement already satisfied: Click>=6.0 in /usr/lib/python3/dist-packages (from face_recognition) (6.7)
Requirement already satisfied: numpy in /usr/local/lib/python3.6/dist-packages (from face_recognition) (1.18.5)
Requirement already satisfied: Pillow in /home/marc/.local/lib/python3.6/site-packages (from face_recognition) (8.4.0)
Collecting face-recognition-models>=0.3.0
  Downloading face_recognition_models-0.3.0.tar.gz (100.1 MB)
     |████████████████████████████████| 100.1 MB 1.4 MB/s
  Preparing metadata (setup.py) ... done
Requirement already satisfied: dlib>=19.7 in /usr/local/lib/python3.6/dist-packages/dlib-19.24.0-py3.6-linux-aarch64.egg (from face_recognition) (19.24.0)
Building wheels for collected packages: face-recognition-models
  Building wheel for face-recognition-models (setup.py) ... done
  Created wheel for face-recognition-models: filename=face_recognition_models-0.3.0-py2.py3-none-any.whl size=100566186 sha256=7c5980c2063efaed4fa069d0aa2bcd5a7ea66ede6e93ae0be6f5582323aa12a2
  Stored in directory: /tmp/pip-ephem-wheel-cache-rr2na938/wheels/6a/e1/1a/8969952b51c25409d5b96ecb09603de12b8534bd6d68e6e7d1
Successfully built face-recognition-models
Installing collected packages: face-recognition-models, face-recognition
Successfully installed face-recognition-1.3.0 face-recognition-models-0.3.0

I then wrote a very simple Python script that looks like this:

#!/usr/bin/python3

import face_recognition
import time

def current_milli_time():
    # Wall-clock time in whole milliseconds
    return round(time.time() * 1000)

image = face_recognition.load_image_file("your_file.jpg")

# First call: expected to be slow (about 15s here)
start = current_milli_time()
face_locations = face_recognition.face_locations(image, model="cnn")
print(face_locations)
end = current_milli_time()
print("Get locations took {}ms".format(end - start))

# Second call: same image, should show the steady-state time
start = current_milli_time()
face_locations = face_recognition.face_locations(image, model="cnn")
print(face_locations)
end = current_milli_time()
print("Get locations took {}ms".format(end - start))

I know the first call takes a while (about 15s in this case), and subsequent calls should be faster, so that's why I did the call twice. The image "your_file.jpg" is a 500x400 jpg of 2 people standing side by side.
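To separate the slow first call from the steady-state calls more systematically, the timing could be wrapped in a small harness like the one below. This is only a sketch: `benchmark` and `time_face_locations` are hypothetical helper names (not part of face_recognition), and `time.perf_counter()` is swapped in for `time.time()` because it is a monotonic, higher-resolution clock.

```python
import statistics
import time


def benchmark(fn, warmup=1, runs=10):
    """Call fn() `warmup` times untimed, then `runs` times timed.

    Returns a list with the duration of each timed call in milliseconds.
    """
    for _ in range(warmup):
        fn()  # warm-up calls are not measured

    timings_ms = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    return timings_ms


def time_face_locations(image_path, model="cnn", warmup=1, runs=10):
    """Hypothetical helper: time face_locations() on a single image.

    Returns (median_ms, min_ms, max_ms) over the timed runs.
    """
    # Imported lazily so that benchmark() stays usable without the library.
    import face_recognition

    image = face_recognition.load_image_file(image_path)
    timings_ms = benchmark(
        lambda: face_recognition.face_locations(image, model=model),
        warmup=warmup,
        runs=runs,
    )
    return statistics.median(timings_ms), min(timings_ms), max(timings_ms)
```

Calling `time_face_locations("your_file.jpg")` would then report only the steady-state numbers, with the first (slow) call absorbed by the warm-up and no `print` inside the timed region.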

Unfortunately, the fastest I can get the face_locations function to run is around 600ms. The output looks like:
$ ./frtest.py

[(40, 226, 122, 144), (279, 323, 336, 267)]
Get locations took 14663ms
[(40, 226, 122, 144), (279, 323, 336, 267)]
Get locations took 646ms

If I run the detection in a loop to get more samples it stays pretty consistent at ~646ms:

[(40, 226, 122, 144), (279, 323, 336, 267)]
Get locations took 13735ms
[(40, 226, 122, 144), (279, 323, 336, 267)]
Get locations took 639ms
[(40, 226, 122, 144), (279, 323, 336, 267)]
Get locations took 645ms
[(40, 226, 122, 144), (279, 323, 336, 267)]
Get locations took 646ms
[(40, 226, 122, 144), (279, 323, 336, 267)]
Get locations took 648ms
[(40, 226, 122, 144), (279, 323, 336, 267)]
Get locations took 648ms
[(40, 226, 122, 144), (279, 323, 336, 267)]
Get locations took 648ms
[(40, 226, 122, 144), (279, 323, 336, 267)]
Get locations took 645ms
[(40, 226, 122, 144), (279, 323, 336, 267)]
Get locations took 646ms
[(40, 226, 122, 144), (279, 323, 336, 267)]
Get locations took 648ms

Am I doing something wrong? Am I misunderstanding something? I would expect face detection to be faster than ~1.5 fps on a GPU-accelerated platform, no?
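For reference, the ~1.5 fps figure follows directly from the loop timings listed above. The short calculation below copies the nine steady-state values from that output and leaves out the first 13735ms call as warm-up.

```python
import statistics

# Steady-state timings (ms) copied from the loop output in this issue;
# the first call (13735 ms) is excluded as warm-up.
timings_ms = [639, 645, 646, 648, 648, 648, 645, 646, 648]

median_ms = statistics.median(timings_ms)
mean_ms = statistics.mean(timings_ms)
fps = 1000.0 / median_ms  # frames per second at the median latency

print("median: {} ms, mean: {:.1f} ms, ~{:.2f} fps".format(median_ms, mean_ms, fps))
# prints: median: 646 ms, mean: 645.9 ms, ~1.55 fps
```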

I have confirmed with tegrastats that the GPU is being used: I can see the GPU frequency go up to 99%, so I know that isn't the issue.
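As a complement to tegrastats, the dlib build itself can be asked whether it was compiled with CUDA. The sketch below assumes the `dlib.DLIB_USE_CUDA` flag and `dlib.cuda.get_num_devices()` exposed by dlib's Python bindings; `dlib_cuda_status` is a hypothetical helper name, and the function returns `None` when dlib cannot be imported.

```python
def dlib_cuda_status():
    """Return (use_cuda, num_devices), or None if dlib is not importable."""
    try:
        import dlib
    except ImportError:
        return None

    # True only if this dlib build was compiled with CUDA support.
    use_cuda = bool(dlib.DLIB_USE_CUDA)
    # Only query the device count when CUDA support is compiled in.
    num_devices = dlib.cuda.get_num_devices() if use_cuda else 0
    return use_cuda, num_devices


status = dlib_cuda_status()
if status is None:
    print("dlib is not installed")
else:
    print("DLIB_USE_CUDA={}, CUDA devices={}".format(*status))
```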

Any thoughts? Any help would be greatly appreciated.

@marcjasner
Author

I'm still experiencing this issue. I have recompiled dlib and reinstalled face_recognition, but I am still seeing similar performance. Is there any information I can provide that might help resolve this?

@lianrenbao

I have the same issue


2 participants