This repository has been archived by the owner on Jan 22, 2024. It is now read-only.

Can't set up a Xorg server in secondary graphics card #136

Closed
JoseTomasTocino opened this issue Jul 12, 2016 · 14 comments

Comments

@JoseTomasTocino

I'm working with a server that has two Quadro K2200 graphics cards. The host is RHEL 7.2; I've installed Docker 0.11 and nvidia-docker. The output of nvidia-smi is:

# nvidia-docker run --rm nvidia/cuda nvidia-smi
Tue Jul 12 14:42:22 2016       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 367.27                 Driver Version: 367.27                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro K2200        Off  | 0000:03:00.0      On |                  N/A |
| 42%   38C    P8     1W /  39W |    164MiB /  4041MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  Quadro K2200        Off  | 0000:04:00.0     Off |                  N/A |
| 42%   36C    P8     1W /  39W |      1MiB /  4041MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID  Type  Process name                               Usage      |
|=============================================================================|
+-----------------------------------------------------------------------------+

Now I'm trying to build a container (I don't care about the distro for now; I'm trying both CentOS and Debian) that uses the second card to run an X server, but so far I haven't had any luck. This is the Dockerfile I'm working with:

FROM debian

RUN apt-get update && apt-get install -y \
    xserver-xorg xinit xdm pciutils vim module-init-tools

After building the image I start a container with

nvidia-docker run --rm -t -i --privileged debian_x bash

There I run Xorg -configure to create a scaffolding for the conf file, which I tweak to use the nvidia driver and only the second card (by removing all references to the first card). If I try to start the X server with xinit, I get the following at the end of the log:

[ 11153.937] (II) LoadModule: "nvidia"
[ 11153.937] (WW) Warning, couldn't open module nvidia
[ 11153.937] (II) UnloadModule: "nvidia"
[ 11153.937] (II) Unloading nvidia
[ 11153.937] (EE) Failed to load module "nvidia" (module does not exist, 0)
[ 11153.937] (EE) No drivers available.
[ 11153.937] (EE) 
Fatal server error:
[ 11153.937] (EE) no screens found(EE) 
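For reference, pinning Xorg to a single GPU is typically done with a BusID entry in the Device section of xorg.conf. A minimal sketch based on the nvidia-smi output above (the second card is at PCI 0000:04:00.0; the identifiers are illustrative, not from the actual config):

```
Section "Device"
    Identifier "nvidia1"
    Driver     "nvidia"
    BusID      "PCI:4:0:0"
EndSection

Section "Screen"
    Identifier "screen1"
    Device     "nvidia1"
EndSection
```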

I've tried installing the NVIDIA driver from within the container, but the installer fails because it detects nvidia kernel modules already loaded, as the output of lsmod shows:

# lsmod | grep nvidia
nvidia_drm             43350  2 
nvidia_modeset        764270  4 nvidia_drm
nvidia              11070459  121 nvidia_modeset
drm_kms_helper        125008  1 nvidia_drm
drm                   349210  5 drm_kms_helper,nvidia_drm
i2c_core               40582  6 drm,igb,ipmi_ssif,drm_kms_helper,i2c_algo_bit,nvidia

Any clue? Thanks!

@3XX0 (Member) commented Jul 12, 2016

We don't support this use case. I'm curious, though: why don't you leverage the X server from your host instead?

@ruffsl (Contributor) commented Jul 12, 2016

The ROS community does something similar by leveraging the X server from the host.
FYI here are some wiki tutorials about the topic:
http://wiki.ros.org/docker/Tutorials/GUI#The_simple_way
http://wiki.ros.org/docker/Tutorials/Hardware%20Acceleration#Using_nvidia-docker

@JoseTomasTocino (Author)

Thanks for the answers.

@3XX0 I was actually trying to compare the (supposed) performance gain of having the X server run directly in the container, talking to the graphics card, instead of on the host. I understand you don't support this use case, but AFAIK it is doable, right? I eventually managed to start the X server in the container by manually copying the necessary files (namely the missing nvidia_drv.so). However, it broke the host's X server on the first card; I think it has to do with tty management or something (I'm actually clueless about this).

@ruffsl thanks for the links. As I mentioned in the previous paragraph, my intention was to compare the method you reference (following this tutorial) with the X-server-directly-in-the-container approach. However, the method you mention doesn't let me run more graphics-hungry apps, like glxgears:

# nvidia-docker run -ti --rm -e DISPLAY -v /tmp/.X11-unix:/tmp/.X11-unix coge_gnome_glmark2 bash
# glxgears
libGL error: No matching fbConfigs or visuals found
libGL error: failed to load driver: swrast
X Error of failed request:  BadValue (integer parameter out of range for operation)
  Major opcode of failed request:  153 (GLX)
  Minor opcode of failed request:  3 (X_GLXCreateContext)
  Value in failed request:  0x0
  Serial number of failed request:  35
  Current serial number in output stream:  37

Apps like gedit work properly.

@ruffsl (Contributor) commented Jul 14, 2016

@JoseTomasTocino

This Dockerfile and launch process works for me:

FROM ubuntu:16.04

# install GLX-Gears
RUN apt-get update && apt-get install -y \
    mesa-utils && \
    rm -rf /var/lib/apt/lists/*

# nvidia-docker hooks
LABEL com.nvidia.volumes.needed="nvidia_driver"
ENV PATH /usr/local/nvidia/bin:${PATH}
ENV LD_LIBRARY_PATH /usr/local/nvidia/lib:/usr/local/nvidia/lib64:${LD_LIBRARY_PATH}
Then build and launch:

docker build -t foo .
xhost +local:root
nvidia-docker run -it \
    --env="DISPLAY" \
    --volume="/tmp/.X11-unix:/tmp/.X11-unix:rw" \
    foo glxgears
xhost -local:root
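The build-and-launch sequence above can be wrapped in a small script; a sketch using the names from the example (the DRY_RUN switch is my addition, defaulting to on so the commands can be previewed without Docker or an X server present):

```shell
#!/bin/sh
# Wraps the xhost / nvidia-docker / xhost sequence from the example above.
# DRY_RUN defaults to 1 (print commands only); set DRY_RUN=0 to execute them.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

run xhost +local:root
run nvidia-docker run -it \
    --env="DISPLAY" \
    --volume="/tmp/.X11-unix:/tmp/.X11-unix:rw" \
    foo glxgears
run xhost -local:root
```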

[screenshot: glxgears window rendered from the container]

We use this same method to run RViz and Gazebo from containers: rendering dense point clouds, raytracing for visual sensors, and displaying 3D robot models. Here's an old example from a little more than a year ago: https://www.youtube.com/watch?v=djLKmDMsdxM . Note that I was getting the same performance as running the stack locally on the host.

I'm not sure how much you'd gain from not needing the unix socket, but I guess your research might be useful when we all migrate away from the X server to Mir or the like but still want to run legacy X apps.

@flx42 (Member) commented Jul 14, 2016

@ruffsl that's a bit similar to what we have on our experimental opengl branch: 0158f50

@3XX0 (Member) commented Jul 14, 2016

Your problem is probably due to conflicting libGL libraries. As @flx42 mentioned, try the opengl branch; this should just work.

Also it's using direct rendering so you shouldn't see any performance impact.

@JoseTomasTocino (Author)

Thanks a lot, guys. Using both @ruffsl's Dockerfile and the code in the opengl branch, I've managed to launch glxgears and glmark2 from the container, and both work very well. It looks like I was missing the modification of the PATH and LD_LIBRARY_PATH environment variables.

And exactly as @3XX0 mentioned, glmark2 scores essentially the same whether run from the host or from the container. Given that, there's no need to keep digging into running the X server from the container. However, as I briefly commented above, it's definitely possible once you sort out the coexistence of the host's X server and the container's. I think that would mean assigning a different tty to the container's X server, at least as a first step. There doesn't seem to be much info about this.

@ruffsl (Contributor) commented Jun 20, 2018

Just as an update for folks, here is a minimal working example of the GLX-Gears GUI using nvidia-docker2.

FROM ubuntu:18.04

# install GLX-Gears and the GL Vendor-Neutral Dispatch library
RUN apt-get update && apt-get install -y \
    libglvnd0 \
    mesa-utils && \
    rm -rf /var/lib/apt/lists/*

# nvidia-container-runtime
ENV NVIDIA_VISIBLE_DEVICES \
    ${NVIDIA_VISIBLE_DEVICES:-all}
ENV NVIDIA_DRIVER_CAPABILITIES \
    ${NVIDIA_DRIVER_CAPABILITIES:+$NVIDIA_DRIVER_CAPABILITIES,}graphics
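As an aside, the ${VAR:+...} syntax in those ENV lines is standard shell parameter expansion: it appends graphics to any capabilities already set, rather than clobbering them. A quick demonstration in plain shell, outside Docker:

```shell
# ${VAR:+$VAR,} expands to "$VAR," only when VAR is set and non-empty,
# so "graphics" is appended to an existing list or used alone.
unset NVIDIA_DRIVER_CAPABILITIES
echo "${NVIDIA_DRIVER_CAPABILITIES:+$NVIDIA_DRIVER_CAPABILITIES,}graphics"
# -> graphics

NVIDIA_DRIVER_CAPABILITIES=compute,utility
echo "${NVIDIA_DRIVER_CAPABILITIES:+$NVIDIA_DRIVER_CAPABILITIES,}graphics"
# -> compute,utility,graphics
```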
Then build and run:

docker build -t foo .
xhost +local:root
nvidia-docker run -it \
    --env="DISPLAY" \
    --volume="/tmp/.X11-unix:/tmp/.X11-unix:rw" \
    foo glxgears
xhost -local:root

[screenshot: glxgears window rendered from the container]

@3XX0 (Member) commented Jun 20, 2018

FYI, we have samples here

docker build git@gitlab.com:nvidia/samples.git#:opengl/ubuntu16.04/glxgears

@mash-graz

https://gitlab.com/mash-graz/resolve is another example of how to run a quite demanding real-world application (DaVinci Resolve) that uses OpenGL under nvidia-docker2.

@nathantsoi

Thanks for the example, @ruffsl.

I'm trying to get this working on the TX2, and I've built libglvnd from source since there seem to be no 16.04 arm64 packages:

# OpenGL
# https://github.com/NVIDIA/nvidia-docker/issues/136#issuecomment-398593070
RUN apt-get update && apt-get install -y mesa-utils libxext-dev libx11-dev x11proto-gl-dev autogen autoconf libtool

RUN cd deps && \
  git clone https://github.com/NVIDIA/libglvnd.git && \
  cd libglvnd && \
  git reset --hard 9d909106f232209cf055428cae18387c18918704 && \
  bash autogen.sh && bash configure && make -j6 && \
  make install

ENV NVIDIA_VISIBLE_DEVICES \
    ${NVIDIA_VISIBLE_DEVICES:-all}
ENV NVIDIA_DRIVER_CAPABILITIES \
    ${NVIDIA_DRIVER_CAPABILITIES:+$NVIDIA_DRIVER_CAPABILITIES,}graphics

However, I get: BadValue (integer parameter out of range for operation)

# glxgears
X Error of failed request:  BadValue (integer parameter out of range for operation)
  Major opcode of failed request:  154 (GLX)
  Minor opcode of failed request:  3 (X_GLXCreateContext)
  Value in failed request:  0x0
  Serial number of failed request:  34
  Current serial number in output stream:  35

Any suggestions?

@lromor commented Sep 11, 2018

@nathantsoi
I think the issue is related to the video group mapping.
I guess you are trying to mount the Xorg unix sockets and avoid using xauth by relying on the same gid/uid mapping for a user inside the container.
If that's the case, simply run usermod -a -G video myuser.

If the problem still occurs, make sure the gid of the video group is the same on the host and in the container.
Regards,

-l
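The gid check described above can be scripted; a sketch (the container name foo, the docker exec line, and the usermod user name are illustrative, not from this thread):

```shell
#!/bin/sh
# Compare the numeric gid of the 'video' group on the host with the one
# inside a container (container name "foo" is illustrative).
video_gid() {
    getent group video | cut -d: -f3
}

host_gid=$(video_gid)
# Inside the container, the same lookup would be:
#   docker exec foo getent group video | cut -d: -f3
container_gid="$host_gid"   # placeholder: substitute the container's value

if [ "$host_gid" = "$container_gid" ]; then
    echo "video gid matches: $host_gid"
else
    echo "video gid mismatch: host=$host_gid container=$container_gid"
fi
```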

@rubenvandeven

@nathantsoi Did you get it working in the end? I have the same error, even when following the nvidia/opengl glxgears sample.

@nathantsoi The gid of video is 44 on both host and container. As per the example, I run xhost +si:localuser:root before running the container. Could it still be a permissions issue?

I have an Optimus laptop. However, I do sudo tee /proc/acpi/bbswitch <<< ON to make sure the card is on, proven by the fact that nvidia-smi works without any problems in the container.

Furthermore, the example by ruffsl in issue #136 works, but runs glxgears on my Intel card rather than the Nvidia one.

Driver version 390.87, CUDA 9.0 on both host and container.

Thanks for any suggestion!

@nathantsoi

I haven't had time to debug again. I ended up copying all the dependencies to a folder and running outside of Docker (after building within Docker) by setting LD_LIBRARY_PATH to that folder. Hope that helps!
