xdpyinfo: unable to open display ":0.1". #7
Hi @nnsriram97, Can you give some more information about your machine specs (e.g. # GPUs, operating system)? The way the current code is set up to run an experiment is as follows:
From your error message it looks like it can't find the x-display for your 1st GPU. We have a script in AllenAct that will automatically start x-displays on each of your GPUs, see our installation instructions and the script itself (you might have to close any display that's already open on If you want to temporarily use a smaller number of training processes (e.g. 1 for debugging and checking that things work) you can simply change the line here to be |
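One way to see which x-displays are actually reachable is to probe each one with `xdpyinfo`, the tool named in the error above. The following is a minimal Python sketch, assuming one screen per GPU named `:0.<gpu index>` (matching the `:0.1` in the error message). The helper names `display_names` and `display_is_open` are made up for illustration and are not part of AllenAct.

```python
import shutil
import subprocess


def display_names(num_gpus, server=0):
    """Return display strings such as ':0.0', ':0.1' (assumes one screen per GPU)."""
    return [f":{server}.{gpu}" for gpu in range(num_gpus)]


def display_is_open(display, timeout=5.0):
    """Return True if `xdpyinfo -display <display>` exits successfully."""
    if shutil.which("xdpyinfo") is None:
        return False  # xdpyinfo is not installed, so the display cannot be probed
    try:
        result = subprocess.run(
            ["xdpyinfo", "-display", display],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


if __name__ == "__main__":
    # Hypothetical two-GPU machine, as in this thread.
    for name in display_names(2):
        print(name, "open" if display_is_open(name) else "unavailable")
```

If a display is reported as unavailable, no x-server screen is answering on it for the current user.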
Hi @Lucaweihs, I have 2 GTX 1080Ti's (Cuda 8.0, Nvidia-driver-460) on Ubuntu 16.04 with a display attached to it. I can successfully run the training code with Using
Using Does not being able to launch x_display on 0.1 mean only one GPU is being used? Because I see Train-1 running on GPU 1. If that's the case can you suggest ways to run the code utilizing both GPUs with maximum compute? Also is it possible to run jobs on a headless server without sudo access? |
It's strange that the xdpyinfo problem persists if an X-display is set up on
$ DISPLAY=:0.1 glxgears
Running synchronized to the vertical refresh. The framerate should be
approximately the same as the monitor refresh rate.
111939 frames in 5.0 seconds = 22387.727 FPS
114639 frames in 5.0 seconds = 22927.768 FPS and
Thankfully no, you're still using both GPUs for inference/backprop but all of the THOR instances will be using a single GPU which can be a bit slower / use up valuable GPU memory.
This is a problem the AI2-THOR team has been trying to resolve for a while. The short answer: not yet. You'll need your system administrator to set up the x-displays if you don't have sudo access yourself. |
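To illustrate the point about all THOR instances landing on a single GPU, here is a minimal sketch of spreading simulator processes across several x-displays in round-robin order. This is only an illustration under the assumption that each display maps to one GPU. `assign_displays` is a hypothetical helper, not the logic AllenAct or AI2-THOR actually uses.

```python
def assign_displays(num_processes, displays):
    """Assign each process index an x-display, cycling through the list.

    `displays` is a list of display strings such as [":0.0", ":0.1"].
    """
    if not displays:
        raise ValueError("at least one x-display is required")
    return [displays[i % len(displays)] for i in range(num_processes)]


if __name__ == "__main__":
    # Four hypothetical simulator processes over two displays.
    for index, display in enumerate(assign_displays(4, [":0.0", ":0.1"])):
        print(f"process {index} -> DISPLAY={display}")
```

With only one reachable display in the list, every process falls back to that display, which corresponds to the single-GPU situation described above.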
Running |
Given that glxgears doesn't run this suggests to me that you likely don't have an x-server running on
|
Thanks for pointing me to the script. I've sudo access but had a monitor attached to my pc and had Xorg running for display. Stopping the monitor service through
But I see all the thor instances running on GPU 0, while train-1 and valid-0 are running on GPU 1. Is that normal? Also, can you suggest how to debug/view the simulator output for some particular instance of training? |
Interesting, after doing the above you should see ai2thor processes on both GPUs. Can you confirm that:
Also:
When running things on headless servers I often like to double check that the ai2thor processes are simulating by using a VNC. To do this yourself you can do the following (you'll need to export DISPLAY=:0.0
nohup x11vnc -noxdamage -display :0.0 -nopw -once -xrandr -noxrecord -forever -grabalways --httpport 5900&
wm2&
which will start the VNC server on the server. You can then connect to this server locally by installing a VNC viewer (e.g. https://www.realvnc.com/en/connect/download/viewer/) and then setting up a new connection (cmd+N on Mac) with properties that look something like this (note that 5900 is the http port specified in the above block): |
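Before opening the viewer, it can help to confirm from the local machine that the port is reachable at all. A minimal Python sketch, assuming port 5900 as in the command above; `port_is_open` is a hypothetical helper, and the host name in the usage lines is a placeholder to replace with your own server:

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if __name__ == "__main__":
    host = "my-server.example.com"  # placeholder host name
    print("reachable" if port_is_open(host, 5900) else "not reachable")
```

A `not reachable` result only says that a TCP connection could not be made; the cause could be a firewall, a server that is not running, or a wrong host name.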
I can successfully run the above commands and I see some GPU memory being used by glxgears
Yes, I had changed it back to default in rearrange_base.py
I had ai2thor - 2.7.2 installed but then upgraded to 2.7.4. Thanks! Please refer to issue #3 for status related to running the code. |
Closing this as I believe things are training at reasonable FPS for you now, let me know if not! |
After updating to the latest version of the rearrangement repo and allenact, running baseline models throws an error. My system details: Ubuntu 16.04, display attached, 2 GPUs, and glxgears runs successfully on DISPLAY=:0.0. Update: Issue solved. Doing |
@nnsriram97 glad to hear you found a solution! We tried to make this "easier" for people by automatically discovering the x-display but it looks like this didn't like the |
Hi,
I am facing an issue while trying to run the baseline models
allenact -o rearrange_out -b . baseline_configs/one_phase/one_phase_rgb_resnet_dagger.py
Any suggestions to solve this? I can run
python example.py
successfully though.