EGL initialization error #123

Closed
ajabri opened this issue Dec 29, 2019 · 16 comments

Comments

@ajabri

ajabri commented Dec 29, 2019

Hi,

I recently ran into the following error when trying to use EGL for headless rendering on a machine with NVIDIA driver version 440.33.01 and CUDA 10.2. The exact same code ran properly on a machine with CUDA 10.1 and driver version 390.

Any idea how to fix the issue? I've tried reinstalling dm_control with pip.

  File "/home/aj/miniconda3/envs/rlpyt/lib/python3.7/site-packages/dm_control/suite/__init__.py", line 28, in <module>
    from dm_control.suite import acrobot
  File "/home/aj/miniconda3/envs/rlpyt/lib/python3.7/site-packages/dm_control/suite/acrobot.py", line 24, in <module>
    from dm_control import mujoco
  File "/home/aj/miniconda3/envs/rlpyt/lib/python3.7/site-packages/dm_control/mujoco/__init__.py", line 18, in <module>
    from dm_control.mujoco.engine import action_spec
  File "/home/aj/miniconda3/envs/rlpyt/lib/python3.7/site-packages/dm_control/mujoco/engine.py", line 44, in <module>
    from dm_control import _render
  File "/home/aj/miniconda3/envs/rlpyt/lib/python3.7/site-packages/dm_control/_render/__init__.py", line 75, in <module>
    Renderer = import_func()
  File "/home/aj/miniconda3/envs/rlpyt/lib/python3.7/site-packages/dm_control/_render/__init__.py", line 36, in _import_egl
    from dm_control._render.pyopengl.egl_renderer import EGLContext
  File "/home/aj/miniconda3/envs/rlpyt/lib/python3.7/site-packages/dm_control/_render/pyopengl/egl_renderer.py", line 66, in <module>
    EGL_DISPLAY = create_initialized_headless_egl_display()
  File "/home/aj/miniconda3/envs/rlpyt/lib/python3.7/site-packages/dm_control/_render/pyopengl/egl_renderer.py", line 49, in create_initialized_headless_egl_display
    for device in EGL.eglQueryDevicesEXT():
  File "/home/aj/miniconda3/envs/rlpyt/lib/python3.7/site-packages/dm_control/_render/pyopengl/egl_ext.py", line 65, in eglQueryDevicesEXT
    success = _eglQueryDevicesEXT(max_devices, devices, num_devices)
ctypes.ArgumentError: argument 2: <class 'TypeError'>: expected LP_c_void_p instance instead of EGLDeviceEXT_pointer_Array_10

Thanks,
A

@xinleipan

Try running

$ export DISPLAY=:0

in your terminal and then try again. It works for me.

@ajabri
Author

ajabri commented Jan 3, 2020

@xinleipan In this case, I'm trying to use EGL for headless rendering. Setting "DISPLAY=:0" results in the error below. Are you rendering with glfw or osmesa?

Traceback (most recent call last):
  File "main_maw.py", line 6, in <module>
    import dmc_wrapper as dmc2gym
  File "/home/aj/rlpyt/selfish/dmc_wrapper.py", line 2, in <module>
    from dm_control import suite
  File "/home/aj/miniconda3/envs/rlpyt/lib/python3.7/site-packages/dm_control/suite/__init__.py", line 28, in <module>
    from dm_control.suite import acrobot
  File "/home/aj/miniconda3/envs/rlpyt/lib/python3.7/site-packages/dm_control/suite/acrobot.py", line 24, in <module>
    from dm_control import mujoco
  File "/home/aj/miniconda3/envs/rlpyt/lib/python3.7/site-packages/dm_control/mujoco/__init__.py", line 18, in <module>
    from dm_control.mujoco.engine import action_spec
  File "/home/aj/miniconda3/envs/rlpyt/lib/python3.7/site-packages/dm_control/mujoco/engine.py", line 44, in <module>
    from dm_control import _render
  File "/home/aj/miniconda3/envs/rlpyt/lib/python3.7/site-packages/dm_control/_render/__init__.py", line 67, in <module>
    Renderer = import_func()  # pylint: disable=invalid-name
  File "/home/aj/miniconda3/envs/rlpyt/lib/python3.7/site-packages/dm_control/_render/__init__.py", line 36, in _import_egl
    from dm_control._render.pyopengl.egl_renderer import EGLContext
  File "/home/aj/miniconda3/envs/rlpyt/lib/python3.7/site-packages/dm_control/_render/pyopengl/egl_renderer.py", line 68, in <module>
    raise ImportError('Cannot initialize a headless EGL display.')
ImportError: Cannot initialize a headless EGL display.

@xinleipan

I'm not sure, sorry. Maybe it's GLFW. I just installed MuJoCo normally on an Ubuntu 16.04 machine with CUDA 8.0 and an NVIDIA GPU.

@geyang

geyang commented Jan 4, 2020

Same issue here on an Ubuntu box.

@geyang

geyang commented Jan 4, 2020

Problem solved, use this: @ajabri

export MUJOCO_GL="glfw"

One way I have seen work for avoiding EGL is to use the GLFW backend. To do so, set the MUJOCO_GL environment variable to glfw. You then also need a display that is actually available.

@ajabri
Author

ajabri commented Jan 4, 2020 via email

@geyang

geyang commented Jan 4, 2020

@ajabri if an X display is not available, you can use the "osmesa" backend.

(For the interest of others: installing Xvfb and GLFW won't work, because they require an X display.)

export MUJOCO_GL="osmesa"

Here is the full setup for a headless Linux box with no built-in X display:

      export MUJOCO_GL=osmesa
      export MJLIB_PATH=$HOME/.mujoco/mujoco200/bin/libmujoco200.so
      export MJKEY_PATH=$HOME/.mujoco/mujoco200/mjkey.txt
      export LD_LIBRARY_PATH=$HOME/.mujoco/mujoco200/bin:$LD_LIBRARY_PATH
      export MUJOCO_PY_MJPRO_PATH=$HOME/.mujoco/mujoco200/
      export MUJOCO_PY_MJKEY_PATH=$HOME/.mujoco/mujoco200/mjkey.txt

@saran-t maybe we can add this to the documentation? :)
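
The setup above can be sanity-checked from Python before importing anything MuJoCo-related. This is only a minimal sketch: the helper name missing_mujoco_paths is hypothetical, and it assumes nothing beyond the two file-valued variables listed above (MJLIB_PATH and MJKEY_PATH) needing to point at existing files.

```python
import os

# The two variables from the setup above that must point at real files.
REQUIRED_FILES = ("MJLIB_PATH", "MJKEY_PATH")


def missing_mujoco_paths(env=os.environ):
    """Return a list of human-readable problems; empty if everything looks fine.

    Hypothetical helper for illustration, not part of dm_control.
    """
    problems = []
    for name in REQUIRED_FILES:
        path = env.get(name)
        if not path:
            problems.append(f"{name} is not set")
        elif not os.path.isfile(path):
            problems.append(f"{name} points to a missing file: {path}")
    return problems


if __name__ == "__main__":
    for problem in missing_mujoco_paths():
        print(problem)
```

Passing a plain dict instead of os.environ makes it easy to try the check without touching the real environment.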

@ajabri
Author

ajabri commented Jan 4, 2020 via email

@geyang

geyang commented Jan 4, 2020

Maybe we can ping @saran-t about that

@lich14

lich14 commented Jan 5, 2020

@ajabri if an X display is not available, you can use the "osmesa" backend.

(For the interest of others: installing Xvfb and GLFW won't work, because they require an X display.)

export MUJOCO_GL="osmesa"

Here is the full setup for a headless Linux box with no built-in X display:

      export MUJOCO_GL=osmesa
      export MJLIB_PATH=$HOME/.mujoco/mujoco200/bin/libmujoco200.so
      export MJKEY_PATH=$HOME/.mujoco/mujoco200/mjkey.txt
      export LD_LIBRARY_PATH=$HOME/.mujoco/mujoco200/bin:$LD_LIBRARY_PATH
      export MUJOCO_PY_MJPRO_PATH=$HOME/.mujoco/mujoco200/
      export MUJOCO_PY_MJKEY_PATH=$HOME/.mujoco/mujoco200/mjkey.txt

@saran-t maybe we can add this to the documentation? :)

This is exactly what saved me from those terrible errors.

@saran-t
Member

saran-t commented Jan 5, 2020

I can't see how a CUDA upgrade could have changed anything given that we are specifying the argument types ourselves, but one thing that you could try is to change this line

https://github.com/deepmind/dm_control/blob/master/dm_control/_render/pyopengl/egl_ext.py#L65

to

success = _eglQueryDevicesEXT(max_devices, ctypes.POINTER(ctypes.c_void_p)(ctypes.addressof(devices)), num_devices)

Let me know if this works.

@ajabri
Author

ajabri commented Jan 6, 2020

The above change gives me the following error:

Traceback (most recent call last):
  File "main_maw.py", line 6, in <module>
    import dmc_wrapper as dmc2gym
  File "/home/aj/rlpyt/selfish/dmc_wrapper.py", line 2, in <module>
    from dm_control import suite
  File "/home/aj/miniconda3/envs/rlpyt/lib/python3.7/site-packages/dm_control/suite/__init__.py", line 28, in <module>
    from dm_control.suite import acrobot
  File "/home/aj/miniconda3/envs/rlpyt/lib/python3.7/site-packages/dm_control/suite/acrobot.py", line 24, in <module>
    from dm_control import mujoco
  File "/home/aj/miniconda3/envs/rlpyt/lib/python3.7/site-packages/dm_control/mujoco/__init__.py", line 18, in <module>
    from dm_control.mujoco.engine import action_spec
  File "/home/aj/miniconda3/envs/rlpyt/lib/python3.7/site-packages/dm_control/mujoco/engine.py", line 44, in <module>
    from dm_control import _render
  File "/home/aj/miniconda3/envs/rlpyt/lib/python3.7/site-packages/dm_control/_render/__init__.py", line 67, in <module>
    Renderer = import_func()  # pylint: disable=invalid-name
  File "/home/aj/miniconda3/envs/rlpyt/lib/python3.7/site-packages/dm_control/_render/__init__.py", line 36, in _import_egl
    from dm_control._render.pyopengl.egl_renderer import EGLContext
  File "/home/aj/miniconda3/envs/rlpyt/lib/python3.7/site-packages/dm_control/_render/pyopengl/egl_renderer.py", line 66, in <module>
    EGL_DISPLAY = create_initialized_headless_egl_display()
  File "/home/aj/miniconda3/envs/rlpyt/lib/python3.7/site-packages/dm_control/_render/pyopengl/egl_renderer.py", line 49, in create_initialized_headless_egl_display
    for device in EGL.eglQueryDevicesEXT():
  File "/home/aj/miniconda3/envs/rlpyt/lib/python3.7/site-packages/dm_control/_render/pyopengl/egl_ext.py", line 66, in eglQueryDevicesEXT
    success = _eglQueryDevicesEXT(max_devices, ctypes.POINTER(ctypes.c_void_p)(ctypes.addressof(devices)), num_devices)
TypeError: expected c_void_p instead of int
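
For anyone puzzled by this TypeError: calling a ctypes pointer type such as ctypes.POINTER(ctypes.c_void_p)(...) expects an instance of the pointee type, not an integer address, so passing the result of ctypes.addressof fails. The sketch below reproduces that with a plain c_void_p array (an assumption; the real array in egl_ext.py uses the EGL device type) and shows ctypes.cast as one way to reinterpret an array as a pointer. It is not necessarily the fix that was applied upstream.

```python
import ctypes

# Stand-in for the device array from the traceback above.
devices = (ctypes.c_void_p * 10)()

# Calling the pointer type with an int address raises TypeError,
# because the constructor wants a c_void_p instance to point at.
try:
    ctypes.POINTER(ctypes.c_void_p)(ctypes.addressof(devices))
    constructor_error = None
except TypeError as error:
    constructor_error = str(error)

# ctypes.cast reinterprets the array as a pointer to its first element.
as_pointer = ctypes.cast(devices, ctypes.POINTER(ctypes.c_void_p))

# Both refer to the same memory.
same_address = ctypes.addressof(as_pointer.contents) == ctypes.addressof(devices)
```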

@alimuldal
Collaborator

alimuldal commented Jan 6, 2020

I think this was due to an upstream change in PyOpenGL. In versions newer than 3.1.4, OpenGL.EGL has its own EGLDeviceEXT member (mcfletch/pyopengl@38f4cc5), which overrides our assignment of EGLDeviceEXT = ctypes.c_void_p when we do the wildcard import on the last line. Consequently we end up with mismatching pointer types in _eglQueryDevicesEXT and in the body of eglQueryDevicesEXT. I'll put together a fix.
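
The mechanism described here can be reproduced without PyOpenGL. The following is a sketch with stand-ins only: fake_upstream_egl, _OpaqueDevice and PROTOTYPE_ARG are hypothetical names, and the "module" is a string run with exec rather than the real egl_ext.py. It shows a trailing wildcard import rebinding a locally assigned name, after which an array built from the rebound type no longer matches a prototype built from the original one.

```python
import ctypes
import sys
import types


# Stand-in for an upstream module that exports its own EGLDeviceEXT type.
class _OpaqueDevice(ctypes.Structure):
    pass


fake_upstream = types.ModuleType("fake_upstream_egl")
fake_upstream.EGLDeviceEXT = ctypes.POINTER(_OpaqueDevice)
sys.modules["fake_upstream_egl"] = fake_upstream

# Stand-in for the extension module: local assignment first,
# prototype built from it, wildcard import on the last line.
module_source = """
import ctypes
EGLDeviceEXT = ctypes.c_void_p
PROTOTYPE_ARG = ctypes.POINTER(EGLDeviceEXT)  # built while the name is c_void_p
from fake_upstream_egl import *               # silently rebinds EGLDeviceEXT
"""
namespace = {}
exec(module_source, namespace)

# The name now refers to the upstream type, not to c_void_p.
name_was_rebound = namespace["EGLDeviceEXT"] is fake_upstream.EGLDeviceEXT

# An array of the rebound type is rejected by the older prototype.
devices = (namespace["EGLDeviceEXT"] * 10)()
try:
    namespace["PROTOTYPE_ARG"].from_param(devices)
    mismatch_rejected = False
except TypeError as error:
    mismatch_rejected = True
    print(error)
```

Moving the wildcard import above the assignment, or importing names explicitly, would avoid the rebinding in this toy version.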

@ajabri
Author

ajabri commented Jan 8, 2020

Thank you @alimuldal!

@LazerLikeFocus

Problem solved, use this: @ajabri

export MUJOCO_GL="glfw"

One way I have seen work for avoiding EGL is to use the GLFW backend. To do so, set the MUJOCO_GL environment variable to glfw. You then also need a display that is actually available.

I am new to this.
Could you please tell me how I can use this 'export' command in Colab?
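
In a notebook, the usual Python counterpart of a shell export is to set os.environ before the library is imported. The sketch below assumes that dm_control reads MUJOCO_GL when it is first imported, as the tracebacks earlier in this thread suggest. The helper name select_mujoco_backend is hypothetical, and the accepted values are just the three backends mentioned in this thread.

```python
import os

# Backend names mentioned in this thread.
KNOWN_BACKENDS = ("glfw", "egl", "osmesa")


def select_mujoco_backend(backend):
    """Set MUJOCO_GL for the current process (hypothetical helper)."""
    if backend not in KNOWN_BACKENDS:
        raise ValueError(f"unknown backend: {backend!r}")
    os.environ["MUJOCO_GL"] = backend
    return os.environ["MUJOCO_GL"]


# Run this in the first cell, before any dm_control import.
select_mujoco_backend("osmesa")

# Only afterwards, in the same process:
# from dm_control import suite
```

If dm_control was already imported in the session, the runtime would likely need a restart for the new value to be picked up.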

@bdtacchi

I just spent hours on this, and I'm not an expert, so I might get something wrong. In my case, I was trying to run it over SSH, which I suspect is why the headless display did not work. I tried switching the backend, but export MUJOCO_GL="glfw" (or osmesa, or anything else) did not stop it from choosing egl. I had to open the __init__.py file in dm_control/_render and set the BACKEND variable right at the beginning to 'osmesa', which finally made it use osmesa as the backend.
