Run alfred on headless servers without root account #48

Closed
davidnvq opened this issue Sep 24, 2020 · 10 comments

@davidnvq

Hello there,

I'm trying to deploy the code on headless servers where I don't have root access. Jobs are submitted to the servers through a job scheduler, so I can't even ssh into them.

I followed your guide in #29, but I got an error when running startx.py. It seems that starting the X server requires root privileges.
Could you give me a hint on how to work around this problem?

Thank you a lot!

Below is the full output:

python startx.py
Starting X on DISPLAY=:0

Section "Device"
    Identifier     "Device0"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BusID          "PCI:61:0:0"
EndSection


Section "Screen"
    Identifier     "Screen0"
    Device         "Device0"
    DefaultDepth    24
    Option         "AllowEmptyInitialConfiguration" "True"
    SubSection     "Display"
        Depth       24
        Virtual 1024 768
    EndSubSection
EndSection


Section "Device"
    Identifier     "Device1"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BusID          "PCI:62:0:0"
EndSection


Section "Screen"
    Identifier     "Screen1"
    Device         "Device1"
    DefaultDepth    24
    Option         "AllowEmptyInitialConfiguration" "True"
    SubSection     "Display"
        Depth       24
        Virtual 1024 768
    EndSubSection
EndSection


Section "Device"
    Identifier     "Device2"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BusID          "PCI:177:0:0"
EndSection


Section "Screen"
    Identifier     "Screen2"
    Device         "Device2"
    DefaultDepth    24
    Option         "AllowEmptyInitialConfiguration" "True"
    SubSection     "Display"
        Depth       24
        Virtual 1024 768
    EndSubSection
EndSection


Section "Device"
    Identifier     "Device3"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BusID          "PCI:178:0:0"
EndSection


Section "Screen"
    Identifier     "Screen3"
    Device         "Device3"
    DefaultDepth    24
    Option         "AllowEmptyInitialConfiguration" "True"
    SubSection     "Display"
        Depth       24
        Virtual 1024 768
    EndSubSection
EndSection


Section "ServerLayout"
    Identifier     "Layout0"
    Screen 0 "Screen0" 0 0
    Screen 1 "Screen1" 0 0
    Screen 2 "Screen2" 0 0
    Screen 3 "Screen3" 0 0
EndSection

(EE) 
Fatal server error:
(EE) PAM authentication failed, cannot start X server.
	Perhaps you do not have console ownership?
(EE) 
(EE) 
Please consult the The X.Org Foundation support 
	 at http://wiki.x.org
 for help. 
(EE) 
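
The configuration printed above follows a regular pattern: one "Device" and one "Screen" section per GPU bus ID, followed by a single "ServerLayout" that lists every screen. As a rough illustration of that pattern only (this is not the actual startx.py code, and the function name `build_xorg_conf` is hypothetical), such a file could be assembled like this:

```python
# Sketch only: mirrors the layout of the config printed in this issue.
# build_xorg_conf is a hypothetical helper, not part of startx.py.

DEVICE_TEMPLATE = '''Section "Device"
    Identifier     "Device{idx}"
    Driver         "nvidia"
    VendorName     "NVIDIA Corporation"
    BusID          "{bus_id}"
EndSection
'''

SCREEN_TEMPLATE = '''Section "Screen"
    Identifier     "Screen{idx}"
    Device         "Device{idx}"
    DefaultDepth    24
    Option         "AllowEmptyInitialConfiguration" "True"
    SubSection     "Display"
        Depth       24
        Virtual {width} {height}
    EndSubSection
EndSection
'''


def build_xorg_conf(bus_ids, width=1024, height=768):
    """Return xorg.conf text with one Device/Screen pair per bus ID."""
    sections = []
    layout_lines = []
    for idx, bus_id in enumerate(bus_ids):
        sections.append(DEVICE_TEMPLATE.format(idx=idx, bus_id=bus_id))
        sections.append(SCREEN_TEMPLATE.format(idx=idx, width=width, height=height))
        layout_lines.append('    Screen {idx} "Screen{idx}" 0 0'.format(idx=idx))
    # A single ServerLayout ties all screens together.
    layout = (
        'Section "ServerLayout"\n'
        '    Identifier     "Layout0"\n'
        + "\n".join(layout_lines)
        + "\nEndSection\n"
    )
    sections.append(layout)
    return "\n".join(sections)


if __name__ == "__main__":
    # Bus IDs taken from the output above.
    print(build_xorg_conf(["PCI:61:0:0", "PCI:62:0:0", "PCI:177:0:0", "PCI:178:0:0"]))
```

Generating the file is not the step that fails here; the PAM error is raised when the X server itself is started with that file.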
@jzhanson

You could try using the ai2thor-docker repo which is basically a few scripts to make and run a Docker container that runs startx in it.

@davidnvq
Author

@jzhanson Thanks. In fact, I also tried this one: allenai/ai2thor-docker#3. The docker container doesn't work properly when it is launched on a headless machine. Specifically, my machine is a headless server, so it has no X server to forward to the container. Do you have any idea about this problem?

@MohitShridhar
Collaborator

MohitShridhar commented Sep 24, 2020

I was going to suggest docker as well: https://github.com/askforalfred/alfred#docker-setup

I am not sure how to use THOR without X. Perhaps you can check here: https://github.com/allenai/ai2thor ?

@davidnvq
Author

@MohitShridhar Thank you for the suggestion. At least, I find your README for the docker setup better documented than the ai2thor repo 👍 Let me give it a try.

@davidnvq
Author

davidnvq commented Sep 24, 2020

Anyway, the docker containers built from your Dockerfile or from ai2thor-docker work perfectly as long as the host machine has a display forwarded to the container. I'm trying to use xvfb to create a virtual display, but I have not succeeded yet. I will close this issue when I find a way to handle this problem.
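
For readers unfamiliar with xvfb, the usual way to get a virtual display is to wrap the target command with `xvfb-run`. The sketch below only builds such a command line; the helper name `xvfb_run_command` is hypothetical, and nothing in this thread confirms that THOR renders correctly under Xvfb (the attempt described above did not succeed).

```python
# Sketch only: builds an xvfb-run invocation, it does not execute it.
# xvfb_run_command is a hypothetical helper name.


def xvfb_run_command(command, width=1024, height=768, depth=24):
    """Wrap `command` (a list of arguments) in an xvfb-run call.

    -a picks a free display number automatically;
    -s passes the quoted arguments through to the Xvfb server.
    """
    server_args = "-screen 0 {}x{}x{}".format(width, height, depth)
    return ["xvfb-run", "-a", "-s", server_args] + list(command)


if __name__ == "__main__":
    # The resolution matches the "Virtual 1024 768" setting used by startx.py.
    print(xvfb_run_command(["python3", "example_agent.py"]))
```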

@ekolve

ekolve commented Sep 24, 2020

If you unset the DISPLAY environment variable and launch the docker container, does it work? The docker container will launch its own X11 server to render thor. If not, can you paste the console output you receive when running?
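
If the container is launched from a Python script rather than an interactive shell, the equivalent of unsetting DISPLAY is to hand the child process a copy of the environment with that variable removed. A minimal sketch, where the helper name `env_without_display` is hypothetical:

```python
# Sketch only: env_without_display is a hypothetical helper name.
import os


def env_without_display(base_env=None):
    """Return a copy of the environment with DISPLAY removed.

    The original mapping (os.environ by default) is left untouched.
    """
    env = dict(os.environ if base_env is None else base_env)
    env.pop("DISPLAY", None)
    return env


if __name__ == "__main__":
    env = env_without_display()
    # The result can be passed as env=... to subprocess.run when
    # launching the container command.
    print("DISPLAY" in env)
```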

@davidnvq
Author

davidnvq commented Sep 24, 2020

Using the docker image built with ai2thor-docker, I ran the container without passing DISPLAY, as you suggested:

docker run -it ai2thor-docker

Then I ran the test script, which produced the following error output:

root@31ea08b788d9:/app# python3 example_agent.py 

X.Org X Server 1.19.6
Release Date: 2017-12-20
X Protocol Version 11, Revision 0
Build Operating System: Linux 4.15.0-115-generic x86_64 Ubuntu
Current Operating System: Linux 31ea08b788d9 5.4.0-48-generic #52~18.04.1-Ubuntu SMP Thu Sep 10 12:50:22 UTC 2020 x86_64
Kernel command line: BOOT_IMAGE=/boot/vmlinuz-5.4.0-48-generic root=UUID=5d29a7bb-9107-41dc-98df-088a9a97c6fe ro quiet splash vt.handoff=1
Build Date: 04 September 2020  03:34:39PM
xorg-server 2:1.19.6-1ubuntu4.6 (For technical support please see http://www.ubuntu.com/support) 
Current version of pixman: 0.34.0
	Before reporting problems, check http://wiki.x.org
	to make sure that you have the latest version.
Markers: (--) probed, (**) from config file, (==) default setting,
	(++) from command line, (!!) notice, (II) informational,
	(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/var/log/Xorg.0.log", Time: Thu Sep 24 19:20:52 2020
(++) Using config file: "/tmp/tmp7vfy91h0"
(==) Using system config directory "/usr/share/X11/xorg.conf.d"
(EE) 
Fatal server error:
(EE) parse_vt_settings: Cannot open /dev/tty0 (No such file or directory)
(EE) 
(EE) 
Please consult the The X.Org Foundation support 
	 at http://wiki.x.org
 for help. 
(EE) Please also check the log file at "/var/log/Xorg.0.log" for additional information.
(EE) 
(EE) Server terminated with error (1). Closing log file.
xdpyinfo:  unable to open display ":0.0".
Traceback (most recent call last):
  File "example_agent.py", line 8, in <module>
    controller = ai2thor.controller.Controller(scene='FloorPlan28')
  File "/usr/local/lib/python3.6/dist-packages/ai2thor/controller.py", line 426, in __init__
    host=host
  File "/usr/local/lib/python3.6/dist-packages/ai2thor/controller.py", line 918, in start
    self.check_x_display(env['DISPLAY'])
  File "/usr/local/lib/python3.6/dist-packages/ai2thor/controller.py", line 749, in check_x_display
    ("Invalid DISPLAY %s - cannot find X server with xdpyinfo" % x_display)
AssertionError: Invalid DISPLAY :0.0 - cannot find X server with xdpyinfo

I checked the DISPLAY environment variable inside the container, and it is empty:

echo $DISPLAY
# return nothing

I'm very new to this kind of setup, so please correct me if I'm wrong. I checked several issues in the ai2thor repo but still found no concrete guide for setting it up on headless servers. Many thanks in advance.
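
The traceback above shows that ai2thor validates the display with `xdpyinfo` before starting. The same probe can be run by hand before launching the controller, which makes it easier to tell an X server problem from a THOR problem. This is a sketch of such a check and not the library's own `check_x_display` implementation; the function name `x_display_available` is hypothetical.

```python
# Sketch only: a standalone probe, not ai2thor's check_x_display.
import os
import shutil
import subprocess


def x_display_available(display, probe="xdpyinfo", timeout=5.0):
    """Return True if an X server answers on `display`.

    Returns False when the display string is empty, when the probe
    binary is not installed, or when the probe fails or times out.
    """
    if not display or shutil.which(probe) is None:
        return False
    try:
        result = subprocess.run(
            [probe, "-display", display],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


if __name__ == "__main__":
    display = os.environ.get("DISPLAY", "")
    print("DISPLAY=%r reachable: %s" % (display, x_display_available(display)))
```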

@ekolve

ekolve commented Sep 24, 2020

Take a look at scripts/run.sh in the ai2thor-docker repo. You must run the docker container with the --privileged flag passed in.
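
As an illustration of the difference from the earlier `docker run -it ai2thor-docker` attempt, the sketch below assembles the same command with `--privileged` added. The contents of scripts/run.sh are not reproduced in this thread, so the real script may pass further options; the helper name `docker_run_command` is hypothetical.

```python
# Sketch only: builds the command line, it does not start a container.
# docker_run_command is a hypothetical helper; scripts/run.sh may differ.
import shlex


def docker_run_command(image, privileged=True, interactive=True, command=None):
    """Return the argument list for a `docker run` invocation."""
    cmd = ["docker", "run"]
    if privileged:
        # Gives the container access to host devices, which the X server
        # inside the container needs (e.g. /dev/tty0 in the error above).
        cmd.append("--privileged")
    if interactive:
        cmd.append("-it")
    cmd.append(image)
    if command:
        cmd.extend(command)
    return cmd


if __name__ == "__main__":
    args = docker_run_command("ai2thor-docker")
    print(" ".join(shlex.quote(a) for a in args))
```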

@davidnvq
Author

Thanks a lot @ekolve, it works seamlessly with docker. I'm now scaling it up to an HPC server that only supports singularity containers, so I have to convert the docker image into a singularity image. If you have any experience with singularity, any pointers would be really helpful. Below are the commands to reproduce my errors if needed:

quang$  singularity build --sandbox ai2thor.sif docker://quanguet/ai2thor:nano
quang$  singularity shell --writable --fakeroot --nv ai2thor.sif
# get into the container
singularity> python3 /app/example_agent.py    
# or simply
singularity> startx

The error output:

_XSERVTransmkdir: Owner of /tmp/.X11-unix should be set to root

X.Org X Server 1.19.6
Release Date: 2017-12-20
X Protocol Version 11, Revision 0
Build Operating System: Linux 4.15.0-115-generic x86_64 Ubuntu
Current Operating System: Linux g0003.abci.local 3.10.0-862.el7.x86_64 #1 SMP Fri Apr 20 16:44:24 UTC 2018 x86_64
Kernel command line: BOOT_IMAGE=/vmlinuz-3.10.0-862.el7.x86_64 root=UUID=41b57a73-124f-47dd-ada6-860528ae35d8 ro selinux=0 quiet console=tty0 console=ttyS0,115200 ipv6.disable=1 crashkernel=256M thash_entries=131072 consoleblank=0 scsi_mod.eh_deadline=1
Build Date: 04 September 2020  03:34:39PM
xorg-server 2:1.19.6-1ubuntu4.6 (For technical support please see http://www.ubuntu.com/support) 
Current version of pixman: 0.34.0
	Before reporting problems, check http://wiki.x.org
	to make sure that you have the latest version.
Markers: (--) probed, (**) from config file, (==) default setting,
	(++) from command line, (!!) notice, (II) informational,
	(WW) warning, (EE) error, (NI) not implemented, (??) unknown.
(==) Log file: "/var/log/Xorg.0.log", Time: Sat Sep 26 15:12:07 2020
(==) Using system config directory "/usr/share/X11/xorg.conf.d"
(EE) 
Fatal server error:
(EE) parse_vt_settings: Cannot open /dev/tty0 (Permission denied)
(EE) 
(EE) 
Please consult the The X.Org Foundation support 
	 at http://wiki.x.org
 for help. 
(EE) Please also check the log file at "/var/log/Xorg.0.log" for additional information.
(EE) 
(EE) Server terminated with error (1). Closing log file.
xinit: giving up
xinit: unable to connect to X server: Connection refused
xinit: server error

Many thanks.

@davidnvq
Author

davidnvq commented Oct 1, 2020

Since this is mostly related to Singularity and it works with Docker, I'm going to close this issue. Thanks a lot.

@davidnvq davidnvq closed this as completed Oct 1, 2020