Multiple Display Errors when launching the TDW docker image in a remote server #4

Closed
YanjunChen329 opened this issue Jul 24, 2020 · 6 comments


YanjunChen329 commented Jul 24, 2020

Hi,

I am trying to use the TDW Docker image on a remote Ubuntu server. Because of display limitations on the server, I have to launch a specialized Docker container there and run an X server inside it so that I can connect its display to my local desktop through VNC. This workflow works fine for other applications, so I wanted to launch the TDW Docker image from within that container.

My display container runs Ubuntu 18.04.3. When starting it, I passed the options

-v /var/run/docker.sock:/var/run/docker.sock
-v /usr/bin/docker:/bin/docker

to the docker run command.

However, when I build the TDW Docker image, the build hangs at the last step: RUN ./TDW/TDW.x86_64.

[image]

I suspected that ./TDW/TDW.x86_64 might be a GUI application, so I skipped the last step and launched the TDW image from within my container. This is the command I used:

sudo docker run -it \
    --rm \
    --gpus all \
    -v /tmp/.X11-unix:/tmp/.X11-unix \
    -e DISPLAY=$DISPLAY \
    --network host \
    tdw:v1.6.0

However, when I run ./TDW/TDW.x86_64, the prompt still hangs with no response, and nothing appears on my display. I tried to run xclock, but it says Error: Can't open display: :0. After some troubleshooting, I found that my X11 socket file /tmp/.X11-unix/X0 is passed into the TDW container, but there it shows up as a directory instead of a socket.

In the TDW container:
[image]
In my display container:
[image]
I am a newbie in this and I couldn't figure out what happened.
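
A quick way to confirm the symptom described above is to test the path type directly in each container. The sketch below is illustrative only: the function name `x11_socket_status` is made up for this example, and the path is the one quoted in the report. One possible explanation, which is an assumption and not confirmed anywhere in this thread, is that when the Docker CLI talks to the host daemon through a mounted /var/run/docker.sock, the source side of `-v` is resolved on the host filesystem and not inside the display container, so the TDW container may not be seeing the display container's socket at all.

```shell
# Report what kind of filesystem object sits at a given X11 socket path.
# Prints one of: socket, directory, missing, other.
# (Hypothetical helper name; not part of TDW or Docker.)
x11_socket_status() {
    if [ -S "$1" ]; then
        echo "socket"        # a real UNIX domain socket, as X clients expect
    elif [ -d "$1" ]; then
        echo "directory"     # the symptom described in this report
    elif [ ! -e "$1" ]; then
        echo "missing"
    else
        echo "other"
    fi
}
```

Running `x11_socket_status /tmp/.X11-unix/X0` in both the display container and the TDW container should show whether the two see the same object.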


After I failed to launch the TDW container with the Docker-in-Docker approach, I tried installing an X server and a VNC display directly in the TDW container. I can see the display on my local desktop, and running xclock in the TDW container pops up a window there. However, when I tried to run ./TDW/TDW.x86_64, the following error showed up:

X Error of failed request:  BadValue (integer parameter out of range for operation)
  Major opcode of failed request:  150 (GLX)
  Minor opcode of failed request:  3 (X_GLXCreateContext)
  Value in failed request:  0x0
  Serial number of failed request:  87
  Current serial number in output stream:  88

It seems to be an OpenGL problem and I have no idea how to solve it. Any feedback or suggestion on this is much appreciated! Thank you for the time!
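
A BadValue on X_GLXCreateContext often means the X server the application is talking to could not create an OpenGL context for it. A rough way to see what that X server offers is to inspect `glxinfo` output (from the mesa-utils package on Ubuntu; whether it is installed in the TDW image is an assumption). The sketch below only classifies the renderer line; the function name `classify_renderer` is hypothetical, and an "unknown" result would be expected if `glxinfo` itself fails with the same GLX error.

```shell
# Classify the OpenGL renderer from glxinfo-style text read on stdin.
# Prints one of: software, hardware, unknown.
# (Hypothetical helper name; the renderer names matched are Mesa's
# software rasterizers.)
classify_renderer() {
    line=$(grep -m1 'OpenGL renderer string:' || true)
    if [ -z "$line" ]; then
        echo "unknown"       # no renderer line, e.g. glxinfo failed
        return
    fi
    case "$line" in
        *llvmpipe*|*softpipe*|*"Software Rasterizer"*) echo "software" ;;
        *) echo "hardware" ;;
    esac
}
```

Usage inside the container would look like `glxinfo 2>/dev/null | classify_renderer`. A "software" or "unknown" result would suggest the VNC X server is not providing GPU-backed GLX, which is the situation VirtualGL (`vglrun`) is meant to work around.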

Collaborator

jemsMIT commented Jul 25, 2020

Hi - have you tried using xpra to view a remote render? Here is a link to our documentation on this approach:

https://github.com/threedworld-mit/tdw/blob/master/Documentation/misc_frontend/xpra.md

Member

alters-mit commented Jul 27, 2020

@YanjunChen329 What is the output of nvidia-smi? Your drivers might be out of date.
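
For anyone checking the same thing, the driver version can be read without the full table via `nvidia-smi --query-gpu=driver_version --format=csv,noheader`. The sketch below compares that value against a minimum; the helper name `driver_at_least` is hypothetical, and the threshold is a placeholder because this thread does not state which driver version TDW requires. It relies on `sort -V` (GNU coreutils).

```shell
# Succeed when driver version $1 is at least the required version $2.
# (Hypothetical helper; the required version must come from the TDW docs,
# it is not given in this thread.)
driver_at_least() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n1)
    [ "$lowest" = "$2" ]
}
```

Example: `driver_at_least "$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n1)" "<required_version>" && echo ok`.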

@YanjunChen329
Author

Hi,

Thank you for the prompt replies! I tried to follow the instructions in the link that @jemsMIT provided. However, I still encountered the same "Can't open display: 80" error. Here are the details of my steps:

First, I can't run step 2, DISPLAY=:80 vglrun -d :0 ./tdw.x86_64, because my host container does not have ./TDW.x86_64. I assumed that this is the command I should run inside the TDW container, so I followed the shell script tdw/Docker/start_container_xpra.sh for the server-side setup. Here are the commands I ran:

DISPLAY=:0
xhost +local:root
xpra start :80
DISPLAY=:80
xhost +local:root

sudo docker run -it \
  --rm \
  --gpus all \
  -v /tmp/.X11-unix:/tmp/.X11-unix \
  -e DISPLAY=$DISPLAY \
  --network host \
  vglrun -d :0 \
  tdw:v1.6.0 \
  ./TDW/TDW.x86_64

However, for the last command, Docker complains that no image named vglrun can be found. I think this is a small typo in the shell script, so I ran the following command instead:

sudo docker run -it \
  --rm \
  --gpus all \
  -v /tmp/.X11-unix:/tmp/.X11-unix \
  -e DISPLAY=$DISPLAY \
  --network host \
  tdw:v1.6.0 \
  vglrun -d :0 ./TDW/TDW.x86_64
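
The reordering above matches the `docker run [OPTIONS] IMAGE [COMMAND] [ARG...]` synopsis: the first non-option word is read as the image name, so `vglrun` has to come after `tdw:v1.6.0`. As a minimal sketch, the helper below only prints the command line in that order so it can be inspected before running; the function name `tdw_run_cmd` and its two parameters are illustrative, while the image tag, mounts, and binary path are the ones quoted in this thread.

```shell
# Print a docker run command line with IMAGE placed before COMMAND.
#   $1: display the xpra server listens on (e.g. :80)
#   $2: display VirtualGL should render on (e.g. :0)
# (Hypothetical helper; it prints the command and does not execute it.)
tdw_run_cmd() {
    printf '%s ' docker run -it --rm --gpus all \
        -v /tmp/.X11-unix:/tmp/.X11-unix \
        -e "DISPLAY=$1" \
        --network host \
        tdw:v1.6.0 \
        vglrun -d "$2" ./TDW/TDW.x86_64
    printf '\n'
}
```

Calling `tdw_run_cmd :80 :0` prints the same command as above with the displays filled in.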

After it started running, I went to my Mac and tried the command xpra attach --ssh=ssh ssh/<my_server_hostname>/80. However, the following error shows up:

Warning: vendor 'Intel Inc.' is greylisted,
you may want to turn off OpenGL if you encounter bugs
Xpra GTK3 client version 4.0.1-r26380 64-bit
running on Mac OS X 10.15.4
GStreamer version 1.16.2 for Python 3.8.2 64-bit
/Applications/Xpra.app/Contents/Resources/lib/python/xpra/platform/darwin/osx_tray.py:90: Warning: invalid cast from 'GtkMenuBar' to 'GtkWindow'
self.macapp.set_menu_bar(self.menu)

(Xpra:1722): Gtk-CRITICAL **: 16:26:53.014: gtk_window_add_accel_group: assertion 'GTK_IS_WINDOW (window)' failed
failed to connect to XXXX:80, retrying for 20 seconds

I also tried to connect to my xpra server through a web browser, following an online tutorial. I managed to see the Xpra favicon, but immediately afterward an error message popped up saying: ERROR CODE: GPU_DEAD_ON_ARRIVAL


What baffles me most is that I got the OpenGL error after installing an X server and VNC in the TDW Docker image. Is there a dependency I may be missing, or another approach I could try? Thank you again for your help!

@YanjunChen329
Author

@alters-mit My remote server has driver version 430.5. Here is the output from running nvidia-smi:

[image]

@alters-mit
Member

@YanjunChen329 I found numerous problems with TDW's Docker container, which I fixed. Try upgrading with pip3 install tdw -U and see our Docker documentation.

Please let me know if that helps solve your problem.

@alters-mit
Member

Closing this ticket due to inactivity and because the bug has likely been fixed.
