More details on graphics + compute mode in README #12

Closed
duongnt opened this issue Mar 12, 2019 · 6 comments


duongnt commented Mar 12, 2019

Hi,

Could you give more details on this point in the README file:

You should ensure that compute + graphics mode is enabled (through nvidia-smi) for your GPU

How can I ensure that? I looked at the nvidia-smi manpage but it wasn't obvious to me, and Google shows references to a tool called gpumodeswitch that I don't know how to get.

--Thi

Author

duongnt commented Mar 12, 2019

Is it this option?

--gom=MODE
Set GPU Operation Mode: 0/ALL_ON, 1/COMPUTE, 2/LOW_DP. Supported on
GK110 M-class and X-class Tesla products from the Kepler family. Not
supported on Quadro and Tesla C-class products. LOW_DP and ALL_ON are
the only modes supported on GeForce Titan devices. Requires
administrator privileges. See GPU Operation Mode for more information
about GOM. GOM changes take effect after reboot. The reboot requirement
might be removed in the future. Compute only GOMs don’t support WDDM
(Windows Display Driver Model).

Looks like I can't set it:

root@agent-05265972:/models# nvidia-smi --query-gpu=gom.current --format=csv,noheader
[Not Supported]
[Not Supported]
[Not Supported]
[Not Supported]
root@agent-05265972:/models# nvidia-smi --gom=0 --id=1
GOM mode cannot be changed on GPU 00000000:03:00.0.
Treating as warning and moving on.
All done.

My GPU is GeForce GTX 1080 Ti.
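
For anyone who wants to script the same check, here is a minimal Python sketch that reads the `gom.current` query shown above and reports which GPUs expose a GPU Operation Mode at all. The function names `parse_gom_output` and `query_gom` are hypothetical, and treating every bracketed value as "unsupported" is an assumption based only on the `[Not Supported]` output in this thread.

```python
import subprocess


def parse_gom_output(text):
    """Return one entry per GPU: the GOM string, or None if GOM is unavailable."""
    modes = []
    for line in text.splitlines():
        value = line.strip()
        if not value:
            continue
        # Assumption: bracketed placeholders such as "[Not Supported]" mean
        # the card does not expose a GPU Operation Mode.
        modes.append(None if value.startswith("[") else value)
    return modes


def query_gom():
    """Run the same query as in the comment above and parse its output."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=gom.current", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_gom_output(result.stdout)
```

Calling `query_gom()` on the cluster node above would be expected to return a list of four `None` values, which matches the owner's point that GOM is not something to configure on GeForce cards.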

Owner

pmh47 commented Mar 12, 2019

It is almost certain that your GPU is already in the correct mode, so there should be no problem. That advice in the README is relevant only in a tiny number of cases, and AFAIK never for GeForce cards. Do you get errors when using DIRT?

pmh47 closed this as completed Mar 12, 2019
pmh47 reopened this Mar 12, 2019

Owner

pmh47 commented Mar 12, 2019

I shall update the README to be more precise about this, or remove the note entirely.

Author

duongnt commented Mar 12, 2019

Thanks @pmh47! I think we should mention that Compute + Graphics mode is enabled by default for some cards, and maybe give some brief pointers on how to check and set those modes.

I'm not using DIRT directly, but I'm working on a similar project that also uses EGL + interop. Right now I'm running into this issue:

./bin/python tests/test_render.py
libdc1394 error: Failed to initialize libdc1394
Traceback (most recent call last):
  File "tests/test_render.py", line 13, in <module>
    from . import custom_ops as tfc
  File "custom_ops/__init__.py", line 13, in <module>
    _TFC = tf.load_op_library(libplugin_path)
  File "/python_env/tensorflow/python/framework/load_library.py", line 60, in load_op_library
    lib_handle = py_tf.TF_LoadLibrary(library_filename)
tensorflow.python.framework.errors_impl.NotFoundError: lib/libsvg_path_renderer.so: undefined symbol: eglDestroyContext

This test runs fine on my local machine, but it doesn't run in a Docker container on our cluster. I thought it was because the graphics mode was not enabled, but perhaps that's not the issue.

I noticed some differences in the nvidia-smi -q output locally and on the cluster, though.
Locally it says:

Timestamp : Tue Mar 12 11:10:12 2019
Driver Version : 410.73
CUDA Version : 10.0

Attached GPUs : 2
GPU 00000000:02:00.0
Product Name : TITAN X (Pascal)
Product Brand : GeForce
Display Mode : Enabled
Display Active : Enabled
Persistence Mode : Disabled
Accounting Mode : Disabled

On the cluster:

Attached GPUs : 4
GPU 00000000:02:00.0
Product Name : GeForce GTX 1080 Ti
Product Brand : GeForce
Display Mode : Disabled
Display Active : Disabled
Persistence Mode : Disabled
Accounting Mode : Disabled

Looks like Display Mode is disabled on the cluster while it's enabled locally.
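
A comparison like this can also be done mechanically. The sketch below is only an illustration: it parses the `key : value` lines of an `nvidia-smi -q` excerpt into a dictionary and lists the fields that differ between two machines. The helper names `parse_smi_fields` and `diff_fields` are hypothetical, and the parser assumes a single GPU section per excerpt (with several GPUs, later sections overwrite earlier ones).

```python
def parse_smi_fields(text):
    """Parse 'key : value' lines from an nvidia-smi -q excerpt into a dict."""
    fields = {}
    for line in text.splitlines():
        # Split on the first " : " only, so values such as timestamps that
        # contain colons without surrounding spaces stay intact.
        key, sep, value = line.partition(" : ")
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def diff_fields(first, second):
    """Return {key: (first_value, second_value)} for every field that differs."""
    keys = sorted(set(first) | set(second))
    return {
        key: (first.get(key), second.get(key))
        for key in keys
        if first.get(key) != second.get(key)
    }
```

Feeding it the two excerpts above would flag `Display Mode` and `Display Active` along with the product name and GPU count, while leaving identical fields such as `Persistence Mode` out of the result.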

Owner

pmh47 commented Mar 12, 2019

'Display Mode' should not matter for EGL offscreen rendering. Rather, I think you have a problem with finding / using the right version of the EGL library. Running ldd libsvg_path_renderer.so on the cluster should give useful info.
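
As a rough aid for reading the `ldd` output, here is a small Python sketch that parses it and pulls out the EGL-related entries and any unresolved dependencies. The function names `parse_ldd` and `egl_report` are hypothetical, and the parser assumes the usual `name => path (address)` and `name => not found` line shapes that `ldd` prints on Linux.

```python
def parse_ldd(text):
    """Map each dependency name to its resolved path, or None if not found."""
    deps = {}
    for line in text.splitlines():
        name, sep, target = line.strip().partition(" => ")
        if not sep:
            # Lines without "=>" (e.g. linux-vdso or the dynamic loader) are skipped.
            continue
        if target.startswith("not found"):
            deps[name] = None
        else:
            # Drop the trailing load address, e.g. " (0x00007f...)".
            deps[name] = target.split(" (")[0]
    return deps


def egl_report(deps):
    """Return (EGL entries, names of unresolved dependencies)."""
    egl = {name: path for name, path in deps.items() if name.startswith("libEGL")}
    missing = [name for name, path in deps.items() if path is None]
    return egl, missing
```

If no `libEGL` entry shows up at all, one plausible reading is that the plugin was not linked against EGL on that machine, which would be consistent with the `undefined symbol: eglDestroyContext` error. If an entry is present, its path shows which EGL library is actually being picked up inside the container.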

Author

duongnt commented May 29, 2019

Thanks, I resolved this problem some time ago. It was not because of graphics + compute mode.
