
"gputouse" option for analyze_videos #272

Closed
mschart opened this issue Apr 29, 2019 · 13 comments


mschart commented Apr 29, 2019

Hi there,

I have two GPUs and I could train in parallel using the "gputouse" option in train_network. I'd like to also analyze videos in parallel (as I have many), so would it be possible to include the same "gputouse" option in the analyze_videos function? If I start two analyses in parallel now, I get a TensorFlow error:

InternalError: Failed to create session.

Cheers,

Michael


mschart commented Apr 29, 2019

PS: I guess it's just a matter of including this line:

os.environ['CUDA_VISIBLE_DEVICES'] = str(gputouse)
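For anyone adapting this pattern, here is a minimal sketch (the helper name is illustrative, not part of the DeepLabCut API) of masking a single CUDA device; the key constraint is that the variable must be exported before TensorFlow initializes a session:

```python
import os

def select_gpu(gputouse):
    # Hypothetical helper: restrict CUDA to one device. This export is only
    # honored if it happens before TensorFlow creates a session, because TF
    # enumerates the visible devices at session-creation time.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gputouse)
    return os.environ["CUDA_VISIBLE_DEVICES"]

# e.g. select_gpu(1) makes only GPU 1 visible to a subsequently created session
```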


MMathisLab commented Apr 29, 2019

Good idea (edited: which Alex evidently had already implemented - hehe); typically I just run separate Docker containers, each pinned to a specific GPU, but for non-Docker users this is indeed useful.


AlexEMG commented Apr 29, 2019

Yes, and you can set it e.g. by:

```python
deeplabcut.train_network(config, shuffle=1, trainingsetindex=0, gputouse=3)
```

then just run it for your other GPU in another terminal...
See:
https://github.com/AlexEMG/DeepLabCut/blob/efa95129061b1ba1535f7361fe76e9267568a156/deeplabcut/pose_estimation_tensorflow/training.py#L12

AlexEMG closed this as completed Apr 29, 2019

mschart commented Apr 29, 2019

Yes, for training. The same option would be good to have for the "analyze_videos" function.


AlexEMG commented Apr 29, 2019

Wait, doesn't that exist as well: https://github.com/AlexEMG/DeepLabCut/blob/master/deeplabcut/pose_estimation_tensorflow/predict_videos.py#L34

AlexEMG reopened this Apr 29, 2019
AlexEMG self-assigned this Apr 29, 2019

mschart commented Apr 29, 2019

Mhm, indeed. Then all is good and I did something silly :)

mschart closed this as completed Apr 29, 2019

kenziew commented May 13, 2019

I'm having a similar issue trying to train a network and analyze videos on two separate GPUs. The analyze_videos function seems to call a function that utilizes both GPUs. Any ideas? Thanks!


kenziew commented May 13, 2019

This is with just running the analyze_videos function:

```
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 418.43       Driver Version: 418.43       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce RTX 208...  Off  | 00000000:08:00.0 Off |                  N/A |
| 48%   65C    P2   258W / 260W | 10856MiB / 10989MiB  |    100%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce RTX 208...  Off  | 00000000:42:00.0  On |                  N/A |
| 41%   41C    P2    64W / 260W |   438MiB / 10981MiB  |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      4108      C   /home/kenzie/anaconda3/envs/DLC/bin/python 10845MiB |
|    1      1709      G   /usr/lib/xorg/Xorg                            18MiB |
|    1      1816      G   /usr/bin/gnome-shell                          58MiB |
|    1      2077      G   /usr/lib/xorg/Xorg                           108MiB |
|    1      2206      G   /usr/bin/gnome-shell                          85MiB |
|    1      4108      C   /home/kenzie/anaconda3/envs/DLC/bin/python   155MiB |
+-----------------------------------------------------------------------------+
```


AlexEMG commented May 13, 2019

Just pass which GPU you want to use, e.g.

```python
deeplabcut.train_network(config, gputouse=1)
```

or

```python
deeplabcut.analyze_videos(config, videos, videotype='avi', shuffle=1, trainingsetindex=0, gputouse=0)
```


kenziew commented May 13, 2019

We tried that too. It's just with analyze_videos: I can train two networks simultaneously, but when I have one network training and try to run analyze_videos on the second GPU, I get a "device CUDA:0 not supported by XLA service" error.

When only one GPU is installed, analyze_videos uses that single GPU fine. But when the second GPU is installed, analyze_videos tries to utilize both even when I specify which GPU using gputouse, and then I get the error.


AlexEMG commented May 13, 2019

Thanks for all the details. I looked into the code again and noticed that the environment variable was set after the TF session was initialized (in the prediction code). I just swapped the order and updated the GitHub repo (not PyPI so far). Could you please install 2.0.6.3 and check if it works for you now?
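The ordering bug can be sketched as follows (function name and structure are illustrative, not DeepLabCut's actual code): the device mask must be exported before TensorFlow is imported and the session is built, because afterwards TF has already enumerated every GPU and the variable has no effect.

```python
import os

def setup_inference(gputouse=None):
    # Fix: export the mask *first*...
    if gputouse is not None:
        os.environ["CUDA_VISIBLE_DEVICES"] = str(gputouse)
    # ...and only *then* import TF and create the session. Doing this in the
    # reverse order (session first, mask second) reproduces the bug: the
    # session has already claimed all visible GPUs.
    # import tensorflow as tf
    # sess = tf.Session()
    return os.environ.get("CUDA_VISIBLE_DEVICES")
```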

AlexEMG reopened this May 13, 2019

kenziew commented May 13, 2019

That worked! Thank you!

AlexEMG closed this as completed May 13, 2019

AlexEMG commented May 13, 2019

Version released: https://pypi.org/project/deeplabcut/2.0.6.3/
