WSL2 + Docker + OpenGL + NVIDIA not working (uses llvmpipe) #288

Open
riv-robot opened this issue Oct 7, 2021 · 33 comments

Comments

@riv-robot

Summary

I am running ROS GUI applications like RViz and Gazebo through a docker container on WSL2. The OpenGL renderer is not selecting my NVIDIA GTX 1050 card and uses llvmpipe (CPU) instead.

My system:

  • Windows 11 Beta Preview
  • Latest WSL2 kernel with Ubuntu 20.04
  • Latest Docker Desktop for Windows
  • Latest NVIDIA GPU driver for WSL2 CUDA support

Note: "latest" refers to updates as of 7 October 2021; I don't have version numbers to hand.

Steps taken to fix so far

The OpenGL renderer does find my NVIDIA card outside of a Docker container on WSL2 (on the host). I have reproduced the same issue after multiple reinstalls and when using docker-ce instead of Docker Desktop. On a native Ubuntu 20.04 boot, the container's OpenGL renderer is correctly set to my NVIDIA card.

Expected Behaviour

RViz, Gazebo, GLXGears, glmark2 should all render with 3D hardware acceleration on the NVIDIA GPU.

@elezar
Member

elezar commented Oct 7, 2021

@robertjbush which image is being used? Note that for OpenGL capabilities, the NVIDIA_DRIVER_CAPABILITIES environment variable should include graphics or be set to all. See https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/user-guide.html#driver-capabilities
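
For reference, a minimal sketch of how that setting could be checked from a shell inside the container. The helper name caps_enable_graphics is hypothetical (it is not part of the NVIDIA Container Toolkit), and the check assumes the variable was passed into the container environment, for example with -e or a Dockerfile ENV line.

```shell
# Return 0 if a NVIDIA_DRIVER_CAPABILITIES value enables OpenGL, i.e. the
# comma-separated list contains "graphics" or "all".
# caps_enable_graphics is a hypothetical helper, not a toolkit command.
caps_enable_graphics() {
    case ",$1," in
        *,all,*|*,graphics,*) return 0 ;;
        *)                    return 1 ;;
    esac
}

# Report on the value visible in the current environment (may be unset).
if caps_enable_graphics "${NVIDIA_DRIVER_CAPABILITIES:-}"; then
    echo "graphics capability requested"
else
    echo "graphics capability NOT requested"
fi
```

A value such as compute,utility would be reported as not requesting graphics, which is one common reason for falling back to a software renderer.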

@riv-robot
Author

@elezar I believe I have tried those steps because:

  1. The image is a custom one, based on the ros-noetic image.
  2. The NVIDIA card is used for OpenGL rendering in the docker container on a native Ubuntu 20.04 install
  3. However it doesn't work with the same image on WSL2
  4. I have tried various NVIDIA images and running glxgears, glxinfo and glmark2
  5. NVIDIA_DRIVER_CAPABILITIES is set as you suggested

Are you, or anyone else within NVIDIA corporation, successfully running NVIDIA OpenGL rendering within a docker container on a WSL2 host?

@elezar
Member

elezar commented Oct 8, 2021

Hi @robertjbush thanks for the additional information.

It may be that @rboissel will be able to provide some additional insight here.

@elezar
Member

elezar commented Oct 8, 2021

One thing to note is that the graphics libraries are mounted from the host system, meaning that these need to be installed. Do glxgears, glxinfo, or glmark2 work in "native" WSL2 using the NVIDIA card?

Could you enable the debug option in the nvidia-container-cli section of the /etc/nvidia-container-runtime/config.toml file by uncommenting it?

The generated /var/log/nvidia-container-toolkit.log will contain information as to which libraries are not being located in this case.
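
A rough way to confirm that the option really is uncommented in the right section is sketched below. The function name cli_debug_setting is hypothetical, and the script only reads the file; it assumes the simple "key = value" layout with section headers on their own lines.

```shell
# Print the uncommented "debug" line from the [nvidia-container-cli]
# section of a config.toml. Prints nothing if the option is commented
# out or absent. cli_debug_setting is a hypothetical helper name.
cli_debug_setting() {
    awk '
        # Any section header switches the flag on or off.
        /^[ \t]*\[/ { in_cli = ($0 ~ /^[ \t]*\[nvidia-container-cli\]/) }
        # Lines starting with "#" do not match, so comments are skipped.
        in_cli && /^[ \t]*debug[ \t]*=/ { print; exit }
    ' "$1"
}

config=/etc/nvidia-container-runtime/config.toml
if [ -r "$config" ]; then
    setting=$(cli_debug_setting "$config")
    echo "${setting:-debug is not enabled for nvidia-container-cli}"
fi
```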

@riv-robot
Author

One thing to note is that the graphics libraries are mounted from the host system, meaning that these need to be installed. Do glxgears, glxinfo, or glmark2 work in "native" WSL2 using the NVIDIA card?

Yes they do.

I'll work on the second part of your post now.

@riv-robot
Author

riv-robot commented Oct 8, 2021

@elezar I'm not getting those logs. This is my config.toml:

disable-require = false
#swarm-resource = "DOCKER_RESOURCE_GPU"
#accept-nvidia-visible-devices-envvar-when-unprivileged = true
#accept-nvidia-visible-devices-as-volume-mounts = false
[nvidia-container-cli]
#root = "/run/nvidia/driver"
#path = "/usr/bin/nvidia-container-cli"
environment = []
debug = "/var/log/nvidia-container-toolkit.log"
#ldcache = "/etc/ld.so.cache"
load-kmods = true
#no-cgroups = false
#user = "root:video"
ldconfig = "@/sbin/ldconfig.real"
[nvidia-container-runtime]
debug = "/var/log/nvidia-container-runtime.log"

This is at the end of my Docker Desktop JSON file

{
    "runtimes": {
        "nvidia": {
            "path": "/usr/bin/nvidia-container-runtime",
            "runtimeArgs": []
        }
    }
}
EOF

Some version info:

  • Microsoft Windows [Version 10.0.22000.194]
  • WSL2 Kernel 5.10.60.1
  • Docker Desktop (Windows): 4.1.0 (69386)

Can you provide any insight on this:

Are you, or anyone else within NVIDIA corporation, successfully running NVIDIA OpenGL rendering within a docker container on a WSL2 host?

@elezar
Member

elezar commented Oct 8, 2021

@robertjbush what command line are you using to launch the container? Since nvidia is not set as the default runtime in your docker config, you would need to specify the runtime:

docker run --rm -ti --runtime=nvidia <image> nvidia-smi

Alternatively, specifying the --gpus flag should also ensure that the nvidia-container-toolkit is used to make the required modifications to the container when it is created.

While looking for documentation w.r.t. WSL support, I also found: https://docs.nvidia.com/cuda/wsl-user-guide/index.html#features-not-yet-supported which lists OpenGL-interop as unsupported.

@riv-robot
Author

@elezar I've used the runtime and --gpus flag with no success.

While looking for documentation w.r.t. WSL support, I also found: https://docs.nvidia.com/cuda/wsl-user-guide/index.html#features-not-yet-supported which lists OpenGL-interop as unsupported.

But OpenGL is used by so many applications. Why would this not be supported when it is available on the host?

@riv-robot
Author

@elezar Is there a forum to request new features?

@elezar
Member

elezar commented Oct 13, 2021

@robertjbush let me ping someone to find out where that limitation comes from, as it may be related to WSL2 (although I recall reading that this now has better support for Linux graphics applications). If this is only due to the NVIDIA Container Toolkit, I will create a ticket to track getting this added.

@riv-robot
Author

@elezar WSL2 does indeed have better support for GPU graphics rendering. I can run OpenGL applications and use NVIDIA hardware to render them. But it isn't possible from a docker container when the host is WSL2 (the same container does use the NVIDIA GPU for rendering on a pure Ubuntu 20.04 install).

@elezar
Member

elezar commented Oct 13, 2021

I have pinged @rboissel to have a look at the ticket. He has a better grasp on the WSL2 specifics and where the noted limitations come from.

@riv-robot
Author

@elezar @rboissel Good news in part:

  1. I've been testing accelerated OpenGL through containers in WSL2. I used the Dockerfile from Microsoft's recent commit ac6221b.
  2. I also managed to get RViz and ROS (Robot Operating System) to use accelerated OpenGL.
  3. However, the meshes (STLs) do not display when using the NVIDIA drivers.

Any ideas why this may happen?

@bejota

bejota commented Oct 27, 2021

@robertjbush
I'm having the same issue. GPU is working for compute in a docker container but not for OpenGL. I've tried environment variables such as LIBGL_ALWAYS_INDIRECT and NVIDIA_DRIVER_CAPABILITIES without success. I've also tried the dockerfiles from ac6221b. Were any other changes required to enable the GPU for graphics?

System Specs:

  • Windows 10 Pro, Version 21H2, Build 19044.1320, Windows Feature Experience Pack 120.2212.3920.0
  • WSL2: Linux FARWELL 5.10.16.3-microsoft-standard-WSL2 SMP Fri Apr 2 22:23:49 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
  • Docker Desktop 4.1.1 (69879)
  • Windows Driver: 510.06_quadro_win11_win10-dch_64bit_international.exe

Thanks.

@onomatopellan

onomatopellan commented Oct 27, 2021

@bejota OpenGL acceleration in WSL2 only works in Windows 11.

@bejota

bejota commented Oct 27, 2021

Dang. I knew someone was going to say that.

@riv-robot
Author

Anyone tested RViz and meshes using accelerated OpenGL in WSL?

@tgaspar

tgaspar commented Nov 2, 2021

Anyone tested RViz and meshes using accelerated OpenGL in WSL?

I am facing the exact same problem, except that I am not running ROS or RViz from a container (I hope this is still relevant).

Like many people before me, I had the issue that 3D rendering was not done by the GPU. That meant RViz got very slow once the models got a bit bigger. However, at that time the meshes were displayed.

So I upgraded to Windows 11 and did everything necessary to force 3D rendering onto the GPU (NVIDIA GTX 1050 Ti). The GPU now does the rendering, but the meshes are not displayed. The frames from TF, on the other hand, are displayed.
image

@riv-robot
Author

@tgaspar @elezar I have this exact problem.

@riv-robot
Author

Friendly ping to anyone who's had this problem and solved it?

@onomatopellan

Issue is being tracked in microsoft/wslg#554

@moracabanas

moracabanas commented Nov 27, 2021

@bejota OpenGL acceleration in WSL2 only works in Windows 11.

I'm on Windows 11, but I am trying to run fully hardware-accelerated apps from Docker.

The GPU (RTX 2060 Max-Q) works in Docker containers for compute, but I'm fairly sure GUI apps are not hardware accelerated.

I am facing the same issue, where things like WebGL do not work because the GL renderer is set to llvmpipe.

glxgears reports 600+ FPS.

WSL2 Ubuntu 20.04 glxinfo | grep OpenGL

OpenGL vendor string: Microsoft Corporation
OpenGL renderer string: D3D12 (NVIDIA GeForce RTX 2060 with Max-Q Design)
OpenGL core profile version string: 3.3 (Core Profile) Mesa 21.0.3
OpenGL core profile shading language version string: 3.30
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 3.1 Mesa 21.0.3
OpenGL shading language version string: 1.40
OpenGL context flags: (none)
OpenGL extensions:
OpenGL ES profile version string: OpenGL ES 3.0 Mesa 21.0.3
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.00
OpenGL ES profile extensions:

Docker container glxinfo | grep OpenGL

OpenGL vendor string: VMware, Inc.
OpenGL renderer string: llvmpipe (LLVM 7.0, 128 bits)
OpenGL core profile version string: 3.3 (Core Profile) Mesa 18.3.6
OpenGL core profile shading language version string: 3.30
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 3.1 Mesa 18.3.6
OpenGL shading language version string: 1.40
OpenGL context flags: (none)
OpenGL extensions:
OpenGL ES profile version string: OpenGL ES 3.0 Mesa 18.3.6
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.00
OpenGL ES profile extensions:
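
The two listings differ mainly in the renderer string: D3D12 on the WSL2 host versus llvmpipe in the container. A small sketch for classifying that line is given here; classify_renderer and its three labels are hypothetical names chosen for illustration, and the patterns only cover the renderers that appear in this thread plus softpipe, Mesa's other software rasterizer.

```shell
# Classify an "OpenGL renderer string" line as printed by glxinfo.
# classify_renderer and its output labels are hypothetical.
classify_renderer() {
    case "$1" in
        *llvmpipe*|*softpipe*) echo "software" ;;  # CPU rasterizer
        *D3D12*)               echo "d3d12" ;;     # WSL2 GPU path via Mesa
        *)                     echo "other" ;;     # e.g. a native driver
    esac
}

# Only query the live system if glxinfo is installed.
if command -v glxinfo >/dev/null 2>&1; then
    renderer=$(glxinfo 2>/dev/null | grep "OpenGL renderer string" || true)
    echo "${renderer:-no renderer reported}: $(classify_renderer "$renderer")"
fi
```

Running the same script on the host and inside the container makes the mismatch easy to spot in a single line of output each.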

This is my script to run GPU-accelerated containers:

docker run -it --rm --gpus 'all,"capabilities=compute,graphics,utility,video,display"' --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 \
-e DISPLAY \
-e WAYLAND_DISPLAY \
-e XDG_RUNTIME_DIR \
-e PULSE_SERVER \
-v /tmp/.X11-unix:/tmp/.X11-unix \
-v /mnt/wslg:/mnt/wslg \
-v $(pwd)/app:/app \
registry/image \
command

If I play a 60 FPS YouTube video in Chromium, it plays well but is sometimes choppy, and the NVIDIA GPU load only goes up when the window is displayed on an external monitor. I am fairly sure it is CPU rendering, given the high CPU load during playback.

Trying any WebGL content reports the following error:
image

@onomatopellan

@moracabanas I think you are missing these:

-e LD_LIBRARY_PATH=/usr/lib/wsl/lib
-v /usr/lib/wsl:/usr/lib/wsl

Take a look at the samples.

@moracabanas

moracabanas commented Nov 28, 2021

@moracabanas I think you are missing these:

-e LD_LIBRARY_PATH=/usr/lib/wsl/lib
-v /usr/lib/wsl:/usr/lib/wsl

Take a look at the samples.

Thank you for your suggestion. I tried the new configuration based on the WSLg docker run ... examples you mentioned.

But I am still not getting OpenGL as glxinfo | grep OpenGL shows:

glxinfo | grep OpenGL
OpenGL vendor string: VMware, Inc.
OpenGL renderer string: llvmpipe (LLVM 7.0, 128 bits)
OpenGL core profile version string: 3.3 (Core Profile) Mesa 18.3.6
OpenGL core profile shading language version string: 3.30
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 3.1 Mesa 18.3.6
OpenGL shading language version string: 1.40
OpenGL context flags: (none)
OpenGL extensions:
OpenGL ES profile version string: OpenGL ES 3.0 Mesa 18.3.6
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.00
OpenGL ES profile extensions:

Chrome still reports WebGL as unsupported and blacklisted.

I tried Blender and it runs, but you can feel that there is no GPU acceleration at all.

This is my image launcher script for testing now:

docker run -it --rm --gpus all --ipc=host --ulimit memlock=-1 --ulimit stack=67108864 \
-v /tmp/.X11-unix:/tmp/.X11-unix \
-v /mnt/wslg:/mnt/wslg \
-v /usr/lib/wsl:/usr/lib/wsl \
--device=/dev/dxg \
-e LD_LIBRARY_PATH=/usr/lib/wsl/lib \
-e DISPLAY=$DISPLAY \
-e WAYLAND_DISPLAY=$WAYLAND_DISPLAY \
-e XDG_RUNTIME_DIR=$XDG_RUNTIME_DIR \
-e PULSE_SERVER=$PULSE_SERVER \
-v $(pwd)/app:/app \
<repo/image:tag> \
bash

@onomatopellan

@moracabanas

Mesa 18.3.6

You also need to install Mesa 21.x inside the container.
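
To check this inside a container, the Mesa major version can be pulled out of the glxinfo version string. The sketch below takes the 21.x figure from the comment above at face value; mesa_major and mesa_is_recent_enough are hypothetical helper names.

```shell
# Extract the Mesa major version from a glxinfo version line such as
# "OpenGL version string: 3.1 Mesa 18.3.6". Prints nothing if no
# "Mesa <major>.<minor>" pattern is found. Hypothetical helper.
mesa_major() {
    printf '%s\n' "$1" | sed -n 's/.*Mesa \([0-9][0-9]*\)\..*/\1/p'
}

# Return 0 if the line reports Mesa 21 or newer (threshold assumed
# from the advice in this thread).
mesa_is_recent_enough() {
    major=$(mesa_major "$1")
    [ -n "$major" ] && [ "$major" -ge 21 ]
}

# Only query the live system if glxinfo is installed.
if command -v glxinfo >/dev/null 2>&1; then
    version_line=$(glxinfo 2>/dev/null | grep "OpenGL version string" || true)
    if mesa_is_recent_enough "$version_line"; then
        echo "Mesa looks recent enough"
    else
        echo "Mesa is older than 21.x or could not be detected"
    fi
fi
```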

@moracabanas

moracabanas commented Nov 30, 2021

@moracabanas

Mesa 18.3.6

You also need to install Mesa 21.x inside the container.

I've been trying for hours to install Mesa on a distro other than Ubuntu, and I give up.

Do you have any advice on updating or installing Mesa, e.g. any Docker image? I would rather not compile it, because in my experience building software from source takes a day and usually ends in errors. I also don't really know what I am doing in the process beyond copy-pasting scripts.

Things I've tried already:

sudo add-apt-repository ppa:kisak/kisak-mesa
sudo apt update
sudo apt upgrade

This does not work because this PPA only supports Ubuntu and has no candidate for my Debian-based (buster/bullseye) Docker image.

@onomatopellan

@moracabanas On Debian bullseye you need to add the deb http://http.us.debian.org/debian/ testing non-free contrib main line to your /etc/apt/sources.list and run sudo apt update && sudo apt upgrade -y after that.
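
As a sketch of that step, the line can be appended only when it is not already present, so rebuilding an image or re-running a setup script does not duplicate it. The helper name add_source_line is hypothetical, and the real invocation is left commented out because it modifies the system package sources.

```shell
# Append a line to a file unless an identical line already exists.
# add_source_line is a hypothetical helper name.
add_source_line() {
    file="$1"
    line="$2"
    # -x: match the whole line, -F: treat the pattern as a fixed string.
    grep -qxF "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

TESTING_LINE="deb http://http.us.debian.org/debian/ testing non-free contrib main"

# Intended use inside the Debian bullseye container (run as root):
#   add_source_line /etc/apt/sources.list "$TESTING_LINE"
#   apt update && apt upgrade -y
```

Note that the upgrade pulls packages from testing across the whole image, not only Mesa, so it is worth trying on a throwaway image first.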

@moracabanas

I updated my image with that and now I get:
glxinfo | grep OpenGL

OpenGL vendor string: Microsoft Corporation
OpenGL renderer string: D3D12 (NVIDIA GeForce RTX 2060 with Max-Q Design)
OpenGL core profile version string: 3.3 (Core Profile) Mesa 21.2.5
OpenGL core profile shading language version string: 3.30
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 3.1 Mesa 21.2.5
OpenGL shading language version string: 1.40
OpenGL context flags: (none)
OpenGL extensions:
OpenGL ES profile version string: OpenGL ES 3.0 Mesa 21.2.5
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.00
OpenGL ES profile extensions:

Thank you so much, I am testing this now!

@moracabanas

moracabanas commented Nov 30, 2021

Everything is working as expected now.
WebGL is working solidly on my Docker image!

image

The odd thing now is that I get ~700 FPS in glxgears with llvmpipe but only ~70 FPS with Mesa 21.x.

@onomatopellan

@moracabanas glxgears is somewhat outdated. Try es2gears from the mesa-utils-extra package instead.

@rosiakpiotr

Any updates on this one?

@kryptoniancode

How can I solve this, i.e. get the GPU OpenGL renderer in a Docker container?

In WSL2

$ glxinfo | grep "OpenGL"
OpenGL vendor string: Microsoft Corporation
OpenGL renderer string: D3D12 (NVIDIA GeForce GTX 1050 Ti)
OpenGL core profile version string: 4.2 (Core Profile) Mesa 23.0.2 - kisak-mesa PPA
OpenGL core profile shading language version string: 4.20
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 4.2 (Compatibility Profile) Mesa 23.0.2 - kisak-mesa PPA
OpenGL shading language version string: 4.20
OpenGL context flags: (none)
OpenGL profile mask: compatibility profile
OpenGL extensions:
OpenGL ES profile version string: OpenGL ES 3.1 Mesa 23.0.2 - kisak-mesa PPA
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.10
OpenGL ES profile extensions:

In docker container

$ glxinfo | grep "OpenGL"
OpenGL vendor string: Mesa
OpenGL renderer string: llvmpipe (LLVM 15.0.7, 256 bits)
OpenGL core profile version string: 4.5 (Core Profile) Mesa 23.0.2 - kisak-mesa PPA
OpenGL core profile shading language version string: 4.50
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
OpenGL core profile extensions:
OpenGL version string: 4.5 (Compatibility Profile) Mesa 23.0.2 - kisak-mesa PPA
OpenGL shading language version string: 4.50
OpenGL context flags: (none)
OpenGL profile mask: compatibility profile
OpenGL extensions:
OpenGL ES profile version string: OpenGL ES 3.2 Mesa 23.0.2 - kisak-mesa PPA
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.20
OpenGL ES profile extensions:

elezar transferred this issue from NVIDIA/nvidia-docker on Jan 22, 2024

@darkopetrovic

I successfully updated Mesa from 20.3.5 to 22.0.5 in the Docker container, and it is now able to detect the GPU.

The DISPLAY variable is set to :0 in both WSL and the container.

WSL

$ glxinfo -B
name of display: :0
display: :0  screen: 0
direct rendering: Yes
Extended renderer info (GLX_MESA_query_renderer):
    Vendor: Microsoft Corporation (0xffffffff)
    Device: D3D12 (NVIDIA GeForce GTX 1660) (0xffffffff)
    Version: 23.2.1
    Accelerated: yes
    Video memory: 22321MB
    Unified memory: no
    Preferred profile: core (0x1)
    Max core profile version: 4.2
    Max compat profile version: 4.2
    Max GLES1 profile version: 1.1
    Max GLES[23] profile version: 3.1
OpenGL vendor string: Microsoft Corporation
OpenGL renderer string: D3D12 (NVIDIA GeForce GTX 1660)
OpenGL core profile version string: 4.2 (Core Profile) Mesa 23.2.1-1ubuntu3.1~22.04.2
OpenGL core profile shading language version string: 4.20
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile

OpenGL version string: 4.2 (Compatibility Profile) Mesa 23.2.1-1ubuntu3.1~22.04.2
OpenGL shading language version string: 4.20
OpenGL context flags: (none)
OpenGL profile mask: compatibility profile

OpenGL ES profile version string: OpenGL ES 3.1 Mesa 23.2.1-1ubuntu3.1~22.04.2
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.10

Docker (before update)

$ glxinfo -B
name of display: :0
display: :0  screen: 0
direct rendering: Yes
Extended renderer info (GLX_MESA_query_renderer):
    Vendor: Mesa/X.org (0xffffffff)
    Device: llvmpipe (LLVM 11.0.1, 256 bits) (0xffffffff)
    Version: 20.3.5
    Accelerated: no
    Video memory: 20006MB
    Unified memory: no
    Preferred profile: core (0x1)
    Max core profile version: 4.5
    Max compat profile version: 3.1
    Max GLES1 profile version: 1.1
    Max GLES[23] profile version: 3.2
OpenGL vendor string: Mesa/X.org
OpenGL renderer string: llvmpipe (LLVM 11.0.1, 256 bits)
OpenGL core profile version string: 4.5 (Core Profile) Mesa 20.3.5
OpenGL core profile shading language version string: 4.50
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile

OpenGL version string: 3.1 Mesa 20.3.5
OpenGL shading language version string: 1.40
OpenGL context flags: (none)

OpenGL ES profile version string: OpenGL ES 3.2 Mesa 20.3.5
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.20

I followed the guide here to update Mesa. Mesa was updated from 20.3.5 to 22.0.5 and can now detect my NVIDIA card.

Docker (after update)

$ glxinfo -B
name of display: :0
display: :0  screen: 0
direct rendering: Yes
Extended renderer info (GLX_MESA_query_renderer):
    Vendor: Microsoft Corporation (0xffffffff)
    Device: D3D12 (NVIDIA GeForce GTX 1660) (0xffffffff)
    Version: 22.0.5
    Accelerated: yes
    Video memory: 22321MB
    Unified memory: no
    Preferred profile: core (0x1)
    Max core profile version: 3.3
    Max compat profile version: 3.3
    Max GLES1 profile version: 1.1
    Max GLES[23] profile version: 3.1
OpenGL vendor string: Microsoft Corporation
OpenGL renderer string: D3D12 (NVIDIA GeForce GTX 1660)
OpenGL core profile version string: 3.3 (Core Profile) Mesa 22.0.5
OpenGL core profile shading language version string: 3.30
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile

OpenGL version string: 3.3 (Compatibility Profile) Mesa 22.0.5
OpenGL shading language version string: 3.30
OpenGL context flags: (none)
OpenGL profile mask: compatibility profile

OpenGL ES profile version string: OpenGL ES 3.1 Mesa 22.0.5
OpenGL ES profile shading language version string: OpenGL ES GLSL ES 3.10

I don't know whether this needs to be taken into account, but I also followed the guide here to enable WSLg in the container.

In addition, I have the following variables in my Dockerfile:

ENV NVIDIA_VISIBLE_DEVICES=all
ENV NVIDIA_DRIVER_CAPABILITIES=all
ENV LD_LIBRARY_PATH=/usr/lib/wsl/lib
ENV LIBVA_DRIVER_NAME=d3d12
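
Pulling together the pieces mentioned across this thread (the /dev/dxg device, the /usr/lib/wsl mount, and LD_LIBRARY_PATH), a pre-flight check along the following lines could be run inside the container before starting a GUI application. This is only a sketch: the helper names path_list_contains and report are hypothetical, and passing all three checks does not by itself guarantee GPU rendering.

```shell
# Return 0 if the colon-separated list $1 contains the entry $2 exactly.
# path_list_contains is a hypothetical helper name.
path_list_contains() {
    case ":$1:" in
        *":$2:"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Print one aligned status line: $1 = what was checked, $2 = result.
report() {
    printf '%-50s %s\n' "$1" "$2"
}

# Device node passed with --device=/dev/dxg
if [ -e /dev/dxg ]; then
    report "/dev/dxg" "ok"
else
    report "/dev/dxg" "missing"
fi

# Host libraries mounted with -v /usr/lib/wsl:/usr/lib/wsl
if [ -d /usr/lib/wsl/lib ]; then
    report "/usr/lib/wsl/lib" "ok"
else
    report "/usr/lib/wsl/lib" "missing"
fi

# Library path set with -e LD_LIBRARY_PATH=... or a Dockerfile ENV line
if path_list_contains "${LD_LIBRARY_PATH:-}" "/usr/lib/wsl/lib"; then
    report "LD_LIBRARY_PATH contains /usr/lib/wsl/lib" "ok"
else
    report "LD_LIBRARY_PATH contains /usr/lib/wsl/lib" "missing"
fi
```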


9 participants