libGL error: No matching fbConfigs or visuals found, failed to load driver: swrast #30

Closed
marcusvinicius178 opened this issue Jan 24, 2022 · 5 comments


marcusvinicius178 commented Jan 24, 2022

Hi, I am following this tutorial: https://github.com/AuroAi/carla_apollo_bridge

Everything works until the usage step that runs the following script:

# run in the carla-apollo container, in another terminal:
cd ~/carla_apollo_bridge
python examples/manual_control.py

After running it, I get this error:

[image: libGL_error]

My NVIDIA/CUDA setup is shown below. I am using a Tesla GPU.

[image]

I followed the NVIDIA Docker installation: https://docs.docker.com/install/linux/docker-ce/ubuntu/

and installed nvidia-docker2.

However, the discussions linked below suggest that nvidia-docker2 may be deprecated for Docker 19.03 and later, and that nvidia-container-toolkit should be installed instead.
My Docker version is:
Docker version 20.10.12, build e91ed57

1 -NVIDIA/nvidia-docker#1268

2 - https://www.pugetsystems.com/labs/hpc/Workstation-Setup-for-Docker-with-the-New-NVIDIA-Container-Toolkit-nvidia-docker2-is-deprecated-1568/
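
Whether the native `--gpus` flag is available depends on the Docker version: it was introduced in Docker 19.03. A minimal sketch of that check follows. The function name `docker_supports_gpus` is made up for illustration, and the version string is the one reported above.

```shell
# Hypothetical helper: succeed if a Docker version string is >= 19.03,
# the release that introduced the native `--gpus` flag.
docker_supports_gpus() {
    version=$1                      # e.g. "20.10.12"
    major=${version%%.*}            # text before the first dot
    rest=${version#*.}              # text after the first dot
    minor=${rest%%.*}               # text before the next dot
    minor=${minor#0}                # "03" -> "3", "0" -> "" (defaulted below)
    [ "$major" -gt 19 ] || { [ "$major" -eq 19 ] && [ "${minor:-0}" -ge 3 ]; }
}

# Usage with the version from this report:
if docker_supports_gpus "20.10.12"; then
    echo "--gpus is available"
fi
```

On a live host the version string could come from `docker version --format '{{.Server.Version}}'`.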

I have also read the Nvidia with Docker usage recommendation: https://github.com/nvidia/nvidia-docker

as well as some solutions that worked for other people:

1 - #11
2 - SoonminHwang/dockers#1 (comment)
3 - https://askubuntu.com/questions/541343/problems-with-libgl-fbconfigs-swrast-through-each-update/566522#566522

**Unfortunately**, none of this worked for me.

I ran the carla-server container with the following additional flags (instead of the tutorial's original command):

docker run --rm -e NVIDIA_VISIBLE_DEVICES=all -e NVIDIA_DRIVER_CAPABILITIES=all --gpus=all --name=carla-server --net=host -d carlasim/carla:0.9.6
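
Before involving OpenGL, it may help to confirm that a container can reach the GPU at all. The sketch below assumes that the NVIDIA container runtime makes `nvidia-smi` available inside the container when the driver capabilities include `utility` (which `all` does). The helper name `gpu_smoke_test`, the `DOCKER` override, and the default image (the `nvidia/cudagl` tag mentioned elsewhere in this thread) are illustrative assumptions.

```shell
# Hypothetical helper: run nvidia-smi in a throwaway container.
# DOCKER can be overridden (e.g. DOCKER=echo) to print the arguments
# instead of starting a container.
gpu_smoke_test() {
    image=${1:-nvidia/cudagl:11.4.2-base-ubuntu20.04}
    "${DOCKER:-docker}" run --rm \
        --gpus all \
        -e NVIDIA_DRIVER_CAPABILITIES=all \
        "$image" nvidia-smi
}

# Dry run: show what would be executed without starting a container.
(DOCKER=echo; gpu_smoke_test)
```

If the real run prints the usual GPU table, GPU passthrough works and the remaining problem is more likely in the GL/X11 path.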

I have also modified the Dockerfile in the carla_apollo_bridge/docker folder, adding the environment variables (ENV) below:

[image]

I have also tried **-e DISPLAY=$DISPLAY** instead of ENV SDL_VIDEODRIVER='offscreen', with no result.

I have no idea why this error persists or how to fix it.

Is it because nvidia-container-toolkit should be installed instead of nvidia-docker2?
Or is some docker run flag wrong or missing?

Is some driver package missing (mesa-utils, libvulkan1, libgl1-mesa-dri)? I tried many of them and none worked.

Can someone help?

@daohu527

Does CARLA support running without a graphical interface? Starting a GUI inside Docker is really not very friendly.

In fact, if you only need the Cyber framework, you can do without Docker entirely.

@marcusvinicius178
Author

marcusvinicius178 commented Jan 25, 2022

I cannot. I tried to run CARLA 0.9.6 alongside Apollo 5.0 (the modified branch with the cyber bridge):

git clone https://github.com/auroai/apollo --single-branch -b carla

And to answer your question: yes, they support a graphical interface with Apollo 5.0:
https://youtu.be/QR__C8voIQg

But when running from source, there is a problem with the run_bridge.py file. This script uses many modules that are built only inside the Docker image, so it is not possible to start the bridge outside Docker. It fails with module-not-found and import errors:

[image]

The tutorial was written to work with Docker: https://github.com/AuroAi/carla_apollo_bridge
even though they say "it is possible to run from source", as far as I understood.

My latest attempt was to modify my Docker image:

[image]

and to start the container with these docker run flags:

docker run -it --rm --privileged -v /tmp/.X11-unix:/tmp/.X11-unix:rw -v /usr/lib/nvidia:/usr/lib/nvidia --device /dev/dri -e SDL_VIDEO_GL_DRIVER=libGL.so.1.7.0 -e DISPLAY=$DISPLAY -e NVIDIA_VISIBLE_DEVICES=all -e NVIDIA_DRIVER_CAPABILITIES=all --gpus=all --name=carla-server --net=host -d carlasim/carla:0.9.6
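
The `-v /tmp/.X11-unix` mount only matters when `DISPLAY` points at a local Unix socket such as `:0`. On a remote machine reached over SSH with X11 forwarding, `DISPLAY` usually looks like `localhost:10.0` and goes over TCP instead. How the remote host was accessed is not stated in this thread, so the following is only a sketch for telling the two cases apart; `display_kind` is a hypothetical helper.

```shell
# Hypothetical helper: classify a DISPLAY value.
#   ":0", ":1.0", "unix:0" -> unix  (served via /tmp/.X11-unix/X<n>)
#   "localhost:10.0"       -> tcp   (e.g. SSH X11 forwarding)
#   ""                     -> unset
display_kind() {
    case $1 in
        "")                  echo unset ;;
        unix:*|unix/*|:*)    echo unix ;;
        *:*)                 echo tcp ;;
        *)                   echo unknown ;;
    esac
}

kind=$(display_kind "${DISPLAY:-}")
echo "DISPLAY is '${DISPLAY:-}' ($kind)"
if [ "$kind" = unix ] && [ ! -d /tmp/.X11-unix ]; then
    echo "warning: /tmp/.X11-unix not found on this host" >&2
fi
```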

In addition, I ran the following commands (all aimed at fixing this issue).

Inside the container:

export PATH="/usr/lib/nvidia/bin":${PATH}
export LD_LIBRARY_PATH="/usr/lib/:/usr/lib32/nvidia":${LD_LIBRARY_PATH}

I tried pulling the cudagl image:

docker pull nvidia/cudagl:11.4.2-base-ubuntu20.04

The command below worked and showed that everything is fine with the GPU:
LIBGL_DEBUG=verbose glxgears
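
A running `glxgears` does not by itself show which driver rendered it, since Mesa's software rasterizer can draw it too. The renderer string reported by `glxinfo -B` (from `mesa-utils`) is more telling. A small sketch, with `classify_renderer` as a hypothetical helper and the keyword list as an assumption about common software renderer names:

```shell
# Hypothetical helper: decide whether an "OpenGL renderer string" value
# names a software rasterizer (llvmpipe, softpipe, swrast) or something else.
classify_renderer() {
    case $1 in
        *llvmpipe*|*softpipe*|*swrast*|*"Software Rasterizer"*) echo software ;;
        "") echo none ;;
        *)  echo hardware ;;
    esac
}

# Extract the renderer line if glxinfo is installed; otherwise leave it empty.
renderer=""
if command -v glxinfo >/dev/null 2>&1; then
    renderer=$(glxinfo -B 2>/dev/null | sed -n 's/^OpenGL renderer string: //p')
fi
echo "renderer: ${renderer:-<not available>} -> $(classify_renderer "$renderer")"
```

Running this both on the host and inside the container makes it easier to see where the NVIDIA driver stops being used.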

I tried to restore the symbolic links (as recommended):

sudo ln -s /usr/lib/x86_64-linux-gnu/libGL.so.1.0 /usr/local/lib/libGL.so.1.7.0
sudo ln -s /usr/lib/x86_64-linux-gnu/libGL.so /usr/local/lib/libGL.so.1
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/lib/nvidia"

I installed this package:

sudo apt-get install libgl1-mesa-glx

Nothing worked.

The "good" news is that I had the same problem on my laptop, and the steps I have just shown (reinstalling the NVIDIA drivers, etc.) worked there: the libGL error disappeared.

However, I need to set this up on our company's remote machine, which has a better GPU, more RAM, etc. than my laptop. For some reason the same steps do not work on this AWS remote machine.
I am comparing it with my personal laptop to see what is different, but I cannot see any difference.

Any ideas, @daohu527?

Thanks for trying to help me! I really appreciate it.

@daohu527

This seems to be an NVIDIA driver and container issue. I'm also unsure about using a GUI in a container; it's better to avoid it, or you'll face a lot of compatibility issues.

@marcusvinicius178
Author

Hi @daohu527, you are right, it was not easy. After reading a lot about Docker and trying numerous things, I got it working (2 weeks ago; I forgot to post the solution here).

For my cloud machine (AWS), what worked was:

1: Run the carla-server container with the following flags:

docker run -it --name=carla-server --privileged -v /tmp/.X11-unix:/tmp/.X11-unix:rw -v /usr/lib/nvidia:/usr/lib/nvidia --device /dev/dri --rm -e __NV_PRIME_RENDER_OFFLOAD=1 -e __GLX_VENDOR_LIBRARY_NAME=nvidia -e DISPLAY=$DISPLAY -e NVIDIA_VISIBLE_DEVICES=all -e NVIDIA_DRIVER_CAPABILITIES=all --gpus=all --net=host -d carlasim/carla:0.9.6

instead of the tutorial command (do NOT use the command below):

docker run --gpus=all --name=carla-server --net=host -d carlasim/carla:0.9.6
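
For review, the working command from step 1 can be laid out with one flag per line. This is a sketch of the same flags, wrapped in a hypothetical `start_carla_server` function; the `DOCKER` override is an assumption added so that the arguments can be printed without starting anything.

```shell
# Hypothetical wrapper around the working command from step 1.
# Set DOCKER=echo to print the arguments instead of starting the container.
start_carla_server() {
    "${DOCKER:-docker}" run -it -d --rm \
        --name=carla-server \
        --net=host \
        --privileged \
        --gpus=all \
        --device /dev/dri \
        -v /tmp/.X11-unix:/tmp/.X11-unix:rw \
        -v /usr/lib/nvidia:/usr/lib/nvidia \
        -e DISPLAY="${DISPLAY:-}" \
        -e NVIDIA_VISIBLE_DEVICES=all \
        -e NVIDIA_DRIVER_CAPABILITIES=all \
        -e __NV_PRIME_RENDER_OFFLOAD=1 \
        -e __GLX_VENDOR_LIBRARY_NAME=nvidia \
        carlasim/carla:0.9.6
}

# Dry run: print the full argument list.
(DOCKER=echo; start_carla_server)
```

Compared with the tutorial command, the additions are the X11 socket mount, the `/dev/dri` device, the `DISPLAY` passthrough, and the NVIDIA/GLX environment variables.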

In addition, a second step is needed:

2: Modify the Dockerfile in the carla_apollo_bridge/docker folder and add these environment variables at the bottom of the file:

ENV NVIDIA_VISIBLE_DEVICES=all
ENV NVIDIA_DRIVER_CAPABILITIES=all
# Alternative value: libGL.so.1
ENV SDL_VIDEO_GL_DRIVER=libGL.so.1.7.0
ENV DISPLAY=$DISPLAY
ENV LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/lib/nvidia"
ENV __NV_PRIME_RENDER_OFFLOAD=1
ENV __GLX_VENDOR_LIBRARY_NAME=nvidia
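
To confirm that the rebuilt image actually carries these variables, they can be checked from a shell inside the container. A minimal sketch follows; `missing_gl_env` is a hypothetical helper, and its list mirrors a subset of the variables above.

```shell
# Hypothetical helper: print the names of expected variables that are
# unset or empty in the current shell; return non-zero if any are missing.
missing_gl_env() {
    missing=""
    for name in NVIDIA_VISIBLE_DEVICES NVIDIA_DRIVER_CAPABILITIES \
                __NV_PRIME_RENDER_OFFLOAD __GLX_VENDOR_LIBRARY_NAME DISPLAY; do
        eval "value=\${$name:-}"    # indirect lookup of the variable named $name
        [ -n "$value" ] || missing="$missing $name"
    done
    [ -z "$missing" ] && return 0
    echo "missing:$missing"
    return 1
}

missing_gl_env || echo "some GL-related variables are not set" >&2
```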

I guess the first two ENV lines alone are enough; however, adding the others seemed to make opening the pygame simulator more stable. A third step could also be added, but it can crash a virtual machine (it happened to me): removing the xorg.conf file in the /etc/X11 folder.

The problem is about 90% fixed: sometimes the simulator does not open on the first attempt and reports the same GLX communication issue. However, after repeating the command two, three, or more times, the simulator opens and works.
Only a Docker expert could resolve this 100%, but for me, having it work is enough.
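
Since the simulator reportedly opens after a few repeated attempts, the retrying can be automated. This is a workaround sketch, not a fix for the underlying GLX issue. `retry` is a hypothetical helper, the attempt count and delay are arbitrary, and it assumes the script exits with a non-zero status when the window fails to open.

```shell
# Hypothetical helper: run a command up to <attempts> times, pausing
# RETRY_DELAY seconds (default 2) after each failure.
retry() {
    attempts=$1
    shift
    n=1
    while :; do
        "$@" && return 0
        if [ "$n" -ge "$attempts" ]; then
            echo "giving up after $n attempts: $*" >&2
            return 1
        fi
        echo "attempt $n failed, retrying..." >&2
        n=$((n + 1))
        sleep "${RETRY_DELAY:-2}"
    done
}

# Usage inside the carla-apollo container:
#   cd ~/carla_apollo_bridge && retry 5 python examples/manual_control.py
```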

@daohu527

Great job, thanks for sharing : )
