[solved] "Found no NVIDIA driver on your system" and "cannot connect to X server" #1

Open
MartinaRuocco opened this issue Jan 18, 2021 · 0 comments

MartinaRuocco commented Jan 18, 2021

Hi @nlakshmanan!

I tried to follow your README (the "run from Docker" route), but the execution failed and the processes were killed.
Here are the errors I ran into and how I solved them:

  1. NVIDIA driver
UserWarning: CUDA initialization: Found no NVIDIA driver on your system. Please check that you have an NVIDIA GPU and installed a driver from http://www.nvidia.com/Download/index.aspx (Triggered internally at  /pytorch/c10/cuda/CUDAFunctions.cpp:100.)

This was strange, as I have both the NVIDIA drivers and nvidia-docker installed.
I solved it by modifying the "docker run" command (solution taken from here):
docker run --runtime=nvidia -e NVIDIA_DRIVER_CAPABILITIES=compute,utility -e NVIDIA_VISIBLE_DEVICES=all -it nnlnachu/surreal_pybullet bash
An alternative would be to build your image on top of the nvidia/cuda image.
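
To confirm the modified "docker run" command took effect, a small check can be run inside the container. This is only a sketch: the function name gpu_visibility_report is made up for illustration, and it assumes that nvidia-smi is made available by the NVIDIA runtime when the "utility" capability is requested and that PyTorch may or may not be importable.

```python
import os
import shutil


def gpu_visibility_report(environ=None):
    """Collect hints about whether this container can see an NVIDIA GPU.

    Hypothetical helper, not part of the project.
    """
    env = os.environ if environ is None else environ
    report = {
        # Both variables are set by the -e flags of the docker run command.
        "NVIDIA_VISIBLE_DEVICES": env.get("NVIDIA_VISIBLE_DEVICES"),
        "NVIDIA_DRIVER_CAPABILITIES": env.get("NVIDIA_DRIVER_CAPABILITIES"),
        # Assumption: the NVIDIA runtime exposes nvidia-smi on PATH.
        "nvidia_smi_path": shutil.which("nvidia-smi"),
        # Stays None when PyTorch is not installed.
        "torch_cuda_available": None,
    }
    try:
        import torch

        report["torch_cuda_available"] = torch.cuda.is_available()
    except ImportError:
        pass
    return report


if __name__ == "__main__":
    for key, value in gpu_visibility_report().items():
        print(f"{key}: {value}")
```

If the two environment variables come back empty or nvidia-smi is not found, the container was most likely started without the NVIDIA runtime.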

  2. X server
X11 functions dynamically loaded using dlopen/dlsym OK!
    cannot connect to X server                                   

I solved this by installing xvfb and running:
xvfb-run python3 surreal_subproc.py -al ppo --env gym:HalfCheetahPyBulletEnv-v0 exp1
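
The same idea can be sketched in Python for a launcher script that adds xvfb-run only when no display is configured. The helper name with_virtual_display is hypothetical, and checking the DISPLAY variable is a simplification: a set DISPLAY does not guarantee that the X server is reachable.

```python
import os
import shutil


def with_virtual_display(cmd, environ=None, which=shutil.which):
    """Prefix cmd with xvfb-run when no DISPLAY is configured.

    Hypothetical helper. environ and which are parameters so the
    function can be exercised without touching the real system.
    """
    env = os.environ if environ is None else environ
    if env.get("DISPLAY"):
        # A display is configured; leave the command unchanged.
        return list(cmd)
    if which("xvfb-run") is None:
        raise RuntimeError(
            "No DISPLAY set and xvfb-run not found; install the xvfb package"
        )
    return ["xvfb-run"] + list(cmd)


if __name__ == "__main__":
    training_cmd = [
        "python3", "surreal_subproc.py", "-al", "ppo",
        "--env", "gym:HalfCheetahPyBulletEnv-v0", "exp1",
    ]
    try:
        print(" ".join(with_virtual_display(training_cmd)))
    except RuntimeError as exc:
        print(exc)
```

The resulting list can be passed to subprocess.run as-is; the sketch only prints it so that nothing is launched by accident.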

  3. SDF files
    pybullet.error: Cannot load SDF file.
    I solved it by installing pybullet with pip install -e . instead of python setup.py install as explained here.
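
One possible explanation, which I have not verified, is that the editable install keeps the package pointing at the source checkout where the model files live, while the regular install does not carry them along. The sketch below counts model files under an installed package so the two install methods can be compared. The helper name count_package_files is made up, and the import name pybulletgym is an assumption based on the pybullet-gym directory used in Step 3.

```python
import importlib.util
from pathlib import Path


def count_package_files(package_name, suffixes=(".sdf", ".urdf", ".xml")):
    """Return (package_dir, n_files) for files with the given suffixes.

    Hypothetical helper. Returns (None, 0) when the package cannot be found.
    """
    spec = importlib.util.find_spec(package_name)
    if spec is None or not spec.submodule_search_locations:
        return None, 0
    package_dir = Path(next(iter(spec.submodule_search_locations)))
    count = sum(
        1
        for path in package_dir.rglob("*")
        if path.is_file() and path.suffix.lower() in suffixes
    )
    return package_dir, count


if __name__ == "__main__":
    # Assumption: the package is importable as "pybulletgym".
    location, n_files = count_package_files("pybulletgym")
    print(f"package directory: {location}")
    print(f"model files found: {n_files}")
```

A count of zero after python setup.py install and a non-zero count after pip install -e . would support this explanation.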

To summarize, I think your Step 3 should be the following:

cd ./surreal
python3 -m pip install -r requirements.txt
python3 -m pip install https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow_cpu-2.1.0-cp37-cp37m-manylinux2010_x86_64.whl
python3 setup.py install
cd ../pybullet-gym
pip install -e .
apt-get update -y
apt-get install -y xvfb

After that, to avoid repeating these commands every time we want to start a training run, we can save this container state as a new image using the command:
docker commit container-name new-image-name
(solution taken from here)

I hope this helps!

@MartinaRuocco MartinaRuocco changed the title Found no NVIDIA driver on your system. cannot connect to X server Jan 21, 2021
@MartinaRuocco MartinaRuocco changed the title cannot connect to X server "Found no NVIDIA driver on your system" and "cannot connect to X server" Jan 21, 2021
@MartinaRuocco MartinaRuocco changed the title "Found no NVIDIA driver on your system" and "cannot connect to X server" [solved] "Found no NVIDIA driver on your system" and "cannot connect to X server" Jan 21, 2021