node-chrome-debug:2.47.1 xvfb-run: error: Xvfb failed to start #91

Closed
MarceloEmmerich opened this issue Aug 21, 2015 · 60 comments

Comments

@MarceloEmmerich

I have the following setup in a docker-compose.yml:

hub:
  image: selenium/hub
  ports:
    - "4444:4444"
chrome:
  image: selenium/node-chrome-debug:2.47.1
  ports:
    - 5900
  links:
    - hub
    - nginx

When I launch my test, I often receive the following error in the logs of the chrome container:

xvfb-run: error: Xvfb failed to start

and my test runner (nightwatchjs) says:

Error retrieving a new session from the selenium server { status: 13, value: { class: 'org.openqa.grid.common.exception.GridException', message: 'Error forwarding the new session cannot find : Capabilities [{platform=ANY, acceptSslCerts=true, javascriptEnabled=true, browserName=chrome, name=Evaluator Login, chromeOptions={args=[--no-sandbox, --window-size=1400,1100, --lang=de, --disable-web-security]}}]' } }

When I connect to the container with VNC before running the tests, it works 100% of the time. However, I want this to be part of the CI process, so it also has to run unattended. Any ideas? Thanks, Marcelo

@ghost

ghost commented Aug 28, 2015

I see you run a debug node attached to the hub. Maybe there are timing issues in spinning up a hub and then adding a node...

A better way may be to start a standalone-chrome-debug image if you are running a remote driver on demand. That way you bypass the entire grid. I assume you only need one browser session.

The hub plus nodes is a good setup when you need to spin up 20 nodes, for example, because you need to run tests in parallel.

@sarahkevinking

I started running into this issue as well using the node-chrome docker image.

@pwaller

pwaller commented Sep 4, 2015

I've just hit this. I happened to notice that it started when I upgraded from docker 1.7.1 to docker 1.8.1.

@pwaller

pwaller commented Sep 4, 2015

Okay, after a bit of fun trying to reproduce and bisect, I've discovered why it's happening. It's actually my docker-compose upgrade that did it, not docker's upgrade. I can reproduce it on both 1.7 and 1.8.

The difference is that with docker-compose 1.4, if you don't use --force-recreate, it will sometimes get stuck printing Waiting xvfb....

Now that I know this, I'm able to reproduce it without docker-compose.

In one window, run this:

$ docker run -ti --rm --name hub selenium/hub

Then in another:

$ docker run -ti --name chrome --link hub selenium/node-chrome-debug
[...snipped usual output...]
<press CTRL-C>
$ docker start -ai chrome
Waiting xvfb...
xvfb-run: error: Xvfb failed to start
Waiting xvfb...
Waiting xvfb...
Waiting xvfb...
^C/opt/bin/entry_point.sh: line 15: kill: (5) - No such process
Waiting xvfb...
Waiting xvfb...
Waiting xvfb...
^C/opt/bin/entry_point.sh: line 15: kill: (5) - No such process
Waiting xvfb...
Waiting xvfb...
Waiting xvfb...

@alexander-lazarov

Does anyone have a solution to this? I'm having the same problem with the standalone-firefox image:

selenium:
    hostname: selenium
    image: selenium/standalone-firefox:2.48.2

alexander@rincewind:~$ docker-compose -v
docker-compose version: 1.5.1
alexander@rincewind:~$ docker -v
Docker version 1.9.0, build 76d6bc9

@bkuhl

bkuhl commented Dec 9, 2015

I had the same issue as @alexander-lazarov. I was able to get past this error using selenium/standalone-firefox-debug

@oholubyev

Intermittent "xvfb-run: error: Xvfb failed to start" on selenium/standalone-firefox-debug

@referup-tarantegui

I have the same issue with selenium/standalone-firefox. I've tried different tags and it can be reproduced with all of them. Here are the results of a docker diff on the container; I hope it helps.

C /home
C /home/seluser
A /home/seluser/.cache
A /home/seluser/.cache/mozilla
A /home/seluser/.cache/mozilla/firefox
A /home/seluser/.mozilla
A /home/seluser/.mozilla/extensions
A /home/seluser/.mozilla/firefox
A /home/seluser/.mozilla/firefox/Crash Reports
A /home/seluser/.mozilla/firefox/Crash Reports/InstallTime20151030083932
A /home/seluser/.mozilla/firefox/Crash Reports/events
A /home/seluser/Desktop
C /tmp
A /tmp/.X11-unix
A /tmp/.X11-unix/X99
A /tmp/.X99-lock
A /tmp/hsperfdata_seluser
C /var
C /var/tmp

@oholubyev

The solution for me was to use the --force-recreate flag when launching through Docker Compose: "docker-compose up --force-recreate". It forces recreation of containers even if their configuration and image haven't changed.

@alexander-lazarov

What I noticed is that if I change the value of the DISPLAY env variable from :99.0 to something different, like :98.0, it manages to start Xvfb. I'm not sure whether I'm forcing a recreation of the image this way (like @oholubyev proposed), or whether the previous display is left open/taken for some reason and you need to use a new one to start xvfb. After all, this is a container, not a whole virtual machine, and I guess it is possible that something is not being cleaned up completely on the host after a shutdown/restart.
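
The "display still taken" hypothesis can be checked through the lock files: the docker diff output earlier in this thread shows /tmp/.X99-lock being created for display 99. A minimal sketch, assuming lock files are named .X<N>-lock, that picks the first display number without a leftover lock. The function name `first_free_display` and its parameters are hypothetical and not part of the selenium images.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first display number N (counting up from
# a starting value) for which no lock file <dir>/.X<N>-lock exists.
first_free_display() {
  local lock_dir="${1:-/tmp}"   # directory holding the X lock files
  local n="${2:-99}"            # first display number to try
  while [ -e "${lock_dir}/.X${n}-lock" ]; do
    n=$((n + 1))
  done
  echo "${n}"
}

# Usage sketch: choose a display that is not blocked by a leftover lock.
# DISPLAY=":$(first_free_display /tmp 99).0"
```

Whether switching displays is preferable to cleaning up the leftover lock is a judgment call; this only illustrates why a different display number can appear to fix the problem.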

@kierr

kierr commented Jan 23, 2016

--force-recreate worked for me too, thanks @oholubyev

@kierr

kierr commented Feb 2, 2016

OK, it's no longer working for me. I start with 1 firefox container, but when I try to scale I get the same error, and --force-recreate changes nothing.

@anton-kasperovich

Same issue

Docker: 1.10.0
Image: selenium/node-firefox:2.46.0

9 February 2016 22:36:41 GMT+2 xvfb-run: error: Xvfb failed to start

@kierr

kierr commented Feb 18, 2016

I couldn't scale above one firefox node, on 5 different servers, because of this issue. For me, it was because of port 4444 conflicting, with all the nodes on the same IP...

@SublimeVincentHerl

Same issue with Docker version 1.10.2

@caioquirino

I've tested with version 2.52 and the error persists.

@garagepoort
Contributor

We had the same problem with the standalone-chrome-debug image. The image executes the file /opt/bin/entry_point.sh, which tries to start an X display on :99.0. After we restarted the container, the lock file /tmp/.X99-lock was still present, and because of this file xvfb could not be started.
We wrapped the selenium image in our own docker image and added a script that removes this lock file on startup of the container, which fixed our problem.
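
The approach described above can be sketched as follows. This is a minimal illustration, not the actual script from that wrapper image: it assumes the lock files match /tmp/.X*-lock and that the image's original entry point is /opt/bin/entry_point.sh (the path that appears in the logs earlier in this thread). The function name `remove_stale_x_locks` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: delete leftover X lock files (.X<N>-lock) from a
# directory so that Xvfb can claim its display number again.
remove_stale_x_locks() {
  local lock_dir="${1:-/tmp}"
  # rm -f exits successfully even when no lock file is present.
  rm -f "${lock_dir}"/.X*-lock
}

# Usage sketch for a wrapper entry point:
# remove_stale_x_locks /tmp
# exec /opt/bin/entry_point.sh
```

The cleanup has to run at container start rather than at image build time, because the lock file is left behind by a previous run of the same container.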

@caioquirino

@garagepoort Can you make the fix in entry_point.sh and open a pull request, please?

@alexander-lazarov

@caioquirino: based on what @garagepoort shared, using this Dockerfile should solve the issue. Would be good if anyone can test this, because ironically, I can't reproduce the crash at the moment.

https://gist.github.com/alexander-lazarov/3851614fdd9ab33a1182

@caioquirino

Hi @alexander-lazarov,
Thank you for collaborating! :D
Your gist does not work because the Dockerfile instructions run at image build time.
The fix needs to run every time the docker container starts (in the entrypoint).
I will make the PR tomorrow.

@garagepoort
Contributor

Hi @caioquirino,
I created pull requests with the fix:
For version 2.52.0: #179
For master: #180

@caioquirino

I have tested @garagepoort's fix and it worked well.
Waiting for merge...

@garagepoort
Contributor

My pull request has been merged and included in release 2.53.0. Can this issue be resolved?

@joemewes

joemewes commented Apr 19, 2016

I have this on 2.53.0 firefox-debug (2.53.0 selenium-grid)

docker v1.10.3

Waiting xvfb...
-bash: b: command not found
Waiting xvfb...
Waiting xvfb...
Waiting xvfb...
Waiting xvfb...

related? same? new issue? Chrome loads fine.

I have also tried --force-recreate flag a few times...

ignore - my issue looks to be this #208

@AlesKrajnik

@garagepoort I just had the same issue with node-chrome-debug 2.53.0. For some reason I don't know, the node-chrome-debug and node-firefox-debug containers exited, and when I tried to restart them, the Waiting xvfb... messages appeared and the containers exited after a few seconds. I had to delete the .X99-lock file and now everything is fine.
Did your merge only go into standalone-chrome-debug? Could you please also add it to the node-*-debug images?

@garagepoort
Contributor

@aleshaczech Ok I'll create a pull request for them all.

@kchudy

kchudy commented May 26, 2016

@garagepoort I can reproduce the error even on selenium/standalone-chrome-debug:2.53.0. Here's the output of the ./opt/bin/entry_point.sh call:

root@1adb1d203eb4:/# ./opt/bin/entry_point.sh 
Looking for lock file: /tmp/.X??-lock
Waiting xvfb...
-bash: 169.254/16: No such file or directory
Waiting xvfb...
Waiting xvfb...
Waiting xvfb...
Waiting xvfb...
Waiting xvfb...
Waiting xvfb...
Waiting xvfb...
Waiting xvfb...
Waiting xvfb...
root@1adb1d203eb4:/# Error: Couldn't connect to XServer:99.0
26/05/2016 03:04:47 passing arg to libvncserver: -rfbport
26/05/2016 03:04:47 passing arg to libvncserver: 5900
26/05/2016 03:04:47 -usepw: found /root/.vnc/passwd
26/05/2016 03:04:47 x11vnc version: 0.9.13 lastmod: 2011-08-10  pid: 169
26/05/2016 03:04:47 XOpenDisplay(":99.0") failed.
26/05/2016 03:04:47 Trying again with XAUTHLOCALHOSTNAME=localhost ...

26/05/2016 03:04:47 ***************************************
26/05/2016 03:04:47 *** XOpenDisplay failed (:99.0)

*** x11vnc was unable to open the X DISPLAY: ":99.0", it cannot continue.
*** There may be "Xlib:" error messages above with details about the failure.

Some tips and guidelines:

** An X server (the one you wish to view) must be running before x11vnc is
   started: x11vnc does not start the X server.  (however, see the -create
   option if that is what you really want).

** You must use -display <disp>, -OR- set and export your $DISPLAY
   environment variable to refer to the display of the desired X server.
 - Usually the display is simply ":0" (in fact x11vnc uses this if you forget
   to specify it), but in some multi-user situations it could be ":1", ":2",
   or even ":137".  Ask your administrator or a guru if you are having
   difficulty determining what your X DISPLAY is.

** Next, you need to have sufficient permissions (Xauthority) 
   to connect to the X DISPLAY.   Here are some Tips:

 - Often, you just need to run x11vnc as the user logged into the X session.
   So make sure to be that user when you type x11vnc.
 - Being root is usually not enough because the incorrect MIT-MAGIC-COOKIE
   file may be accessed.  The cookie file contains the secret key that
   allows x11vnc to connect to the desired X DISPLAY.
 - You can explicitly indicate which MIT-MAGIC-COOKIE file should be used
   by the -auth option, e.g.:
       x11vnc -auth /home/someuser/.Xauthority -display :0
       x11vnc -auth /tmp/.gdmzndVlR -display :0
   you must have read permission for the auth file.
   See also '-auth guess' and '-findauth' discussed below.

** If NO ONE is logged into an X session yet, but there is a greeter login
   program like "gdm", "kdm", "xdm", or "dtlogin" running, you will need
   to find and use the raw display manager MIT-MAGIC-COOKIE file.
   Some examples for various display managers:

     gdm:     -auth /var/gdm/:0.Xauth
              -auth /var/lib/gdm/:0.Xauth
     kdm:     -auth /var/lib/kdm/A:0-crWk72
              -auth /var/run/xauth/A:0-crWk72
     xdm:     -auth /var/lib/xdm/authdir/authfiles/A:0-XQvaJk
     dtlogin: -auth /var/dt/A:0-UgaaXa

   Sometimes the command "ps wwwwaux | grep auth" can reveal the file location.

   Starting with x11vnc 0.9.9 you can have it try to guess by using:

              -auth guess

   (see also the x11vnc -findauth option.)

   Only root will have read permission for the file, and so x11vnc must be run
   as root (or copy it).  The random characters in the filenames will of course
   change and the directory the cookie file resides in is system dependent.

See also: http://www.karlrunge.com/x11vnc/faq.html

My docker version is 1.11.1, build 5604cbe. Could you please give me some hints on how to make it work?

@xlc

xlc commented May 26, 2016

I have the same issue as @kchudy, and I think it is caused by this line:
$(for E in $(grep -vxFf asseluser asroot); do echo $E=$(eval echo \$$E); done) \
https://github.com/SeleniumHQ/docker-selenium/blob/master/StandaloneChromeDebug/entry_point.sh#L21

root@584a2f7bf35e:/# echo $(for E in $(grep -vxFf asseluser asroot); do echo $E=$(eval echo \$$E); done)
CHROME_DRIVER_VERSION=2.21 DBUS_SESSION_BUS_ADDRESS=/dev/null DEBCONF_NONINTERACTIVE_SEEN=true DEBIAN_FRONTEND=noninteractive GEOMETRY=1360x1020x24 no_proxy=*.local, 169.254/16 SCREEN_DEPTH=24 SCREEN_HEIGHT=1020 SCREEN_WIDTH=1360

In this case, no_proxy=*.local, 169.254/16 breaks it and xvfb is never started.
I am not a shell expert, so I don't know the proper way to handle it, but somebody must know a better way to pass environment variables that contain spaces.

This is my workaround in docker-compose.yml

hub:
  image: selenium/standalone-chrome-debug
  command: bash -c "sed -e '19a\echo no_proxy >> asseluser' /opt/bin/entry_point.sh | bash"
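
The failure described above looks like ordinary shell word splitting: the output of the unquoted `$(...)` is split on whitespace, so `no_proxy=*.local, 169.254/16` turns into two words, and the stray `169.254/16` is no longer part of an assignment. That would match the `-bash: 169.254/16: No such file or directory` message in the logs. A minimal sketch of one way to avoid it, assuming bash is available: store each `NAME=value` pair as a single array element and expand the array quoted. `build_env_pairs` and `ENV_PAIRS` are hypothetical names and are not part of entry_point.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper: store NAME=value for each named variable as ONE
# array element, so values containing spaces are never split.
build_env_pairs() {
  ENV_PAIRS=()
  local name
  for name in "$@"; do
    # ${!name} is bash indirect expansion: the value of the variable
    # whose name is stored in $name.
    ENV_PAIRS+=("${name}=${!name}")
  done
}

# Usage sketch: the value stays intact because the array is expanded quoted.
# no_proxy='*.local, 169.254/16'
# build_env_pairs no_proxy
# env "${ENV_PAIRS[@]}" some_command
```

The key design choice is the quoted `"${ENV_PAIRS[@]}"` expansion, which yields exactly one argument per pair regardless of spaces or glob characters in the values.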

@pwaller

pwaller commented May 27, 2016

@xlc nice observation. To the developers: what is that code trying to do? As I read it, it passes the root environment through, but only for variables which are set in the seluser environment. Why? And why, given that -E is being passed? The intent is lost, and the code is very hard to read, which makes it hard to determine what will happen. Presumably it's possible to achieve the intended effect with less code and in a less error-prone way. I would have sent a patch if I could determine what the purpose is.

scottturley added a commit to scottturley/docker-selenium that referenced this issue Jun 1, 2016
…start xvfb.

The issue is documented here:

SeleniumHQ#91

Looks to be a problem with the lock files preventing a restart intermittently. Adding a line here to remove the lock files on startup.

@garagepoort
Contributor

@adamlievrouw This is inside the container. Why would the host lock the X-server?

@ivycas

ivycas commented Jul 18, 2016

Was experiencing this issue on selenium/standalone-chrome-debug and selenium/standalone-firefox-debug latest.

Current workaround based on xlc's solution seems to cover both issues. Just added this to my docker-compose.

command: bash -c "rm /tmp/.X99-lock || echo 'Lock not found, continuing normal startup' && export no_proxy=*.local && /opt/bin/entry_point.sh"

@go2guy

go2guy commented Aug 3, 2016

So I deleted the container and image for selenium/standalone-chrome-debug, then re-ran docker run -d -P selenium/standalone-chrome-debug.

The container still exited with code 127. See the docker ps -a results below. I thought this was fixed in the latest selenium/standalone-chrome-debug?

CONTAINER ID   IMAGE                              COMMAND                  CREATED         STATUS                       PORTS   NAMES
5af24880c451   selenium/standalone-chrome-debug   "/opt/bin/entry_point"   2 minutes ago   Exited (127) 2 minutes ago           cocky_wright

docker logs -f 5af24880c451
Looking for lock file: /tmp/.X??-lock
Waiting xvfb...
-bash: 169.254/16: No such file or directory
Waiting xvfb...
Waiting xvfb...
Waiting xvfb...
Waiting xvfb...
Waiting xvfb...
Waiting xvfb...
Waiting xvfb...
Waiting xvfb...
Waiting xvfb...

@wywincl

wywincl commented Aug 4, 2016

I have the same problem as @nuggit32. I hope someone can give some advice.
My platform is
OSX 10.11.6
Docker 1.12.0

@davidino

davidino commented Aug 5, 2016

Same error with this configuration:

  selenium-hub:
    container_name: selenium-hub
    image: selenium/hub:2.53.0
    ports:
      - "4444:4444"

  selenium-chrome:
    image: selenium/node-chrome-debug:2.53.0
    environment:
      - HUB_PORT_4444_TCP_ADDR=selenium-hub
      - HUB_PORT_4444_TCP_PORT=4444
    links:
      - selenium-hub
    ports:
      - "5900:5900"

After switching to image: selenium/node-firefox:2.53.0, it works fine!

@gurukiran007

Pull request #256 should fix it. The VNC password has to be set by seluser; the bug was that root was setting it, so seluser could not access it and startup failed.

@wywincl

wywincl commented Aug 6, 2016

@gurukiran007 I used your code and it works well. Thank you very much for fixing it.

@vikramvi

The issue still happens for me on a Mac mini (10.10.5) with docker 1.12.1 (build: 121333) and selenium debug version 2.53.0.

@vikramvi

Got this working with the command below:

docker run -d -P -e no_proxy=localhost -e HUB_ENV_no_proxy=localhost --link selenium-hub:hub selenium/node-chrome-debug

ddavison pushed a commit that referenced this issue Oct 2, 2016
…start xvfb.

The issue is documented here:

#91

Looks to be a problem with the lock files preventing a restart intermittently. Adding a line here to remove the lock files on startup.

Signed-off-by: Daniel Davison <daniel.jj.davison@gmail.com>
@rcambrj

rcambrj commented Oct 6, 2016

This seems to happen because the /tmp/.X99-lock file exists and gets cached which prevents xvfb from starting the next time around.

In docker-compose.yml, try:

  selenium-chrome:
    tmpfs:
      - /tmp

That should stop it from getting cached, right?

@Loki-Afro

Funny thing:

docker run -d --name selenium_chrome selenium/standalone-chrome
works just fine.
docker stop selenium_chrome
and it will stop.
docker start selenium_chrome
and docker logs selenium_chrome will print out Xvfb failed to start
With docker restart selenium_chrome, everything works as expected ...

@alievrouw

alievrouw commented Oct 28, 2016

@Loki-Afro you can avoid this error like this:

To restart the container:
sudo docker exec -it CONTAINER rm /tmp/.X99-lock && sudo docker restart CONTAINER

To stop the container:
sudo docker exec -it CONTAINER rm /tmp/.X99-lock && sudo docker stop CONTAINER

@saifsysim

Where is the cache stored in the chrome docker containers? I want to clear the cache and cookies after every test, and rerun the test only after cleaning them up.

Currently using the /selenium-chrome-node:54.0.0 container.

I am looking for the cache in /opt/google/chrome but don't see anything in that directory.

@alievrouw

alievrouw commented Dec 29, 2016

Is this what you're looking for?

[test@test ~]$ sudo docker exec -it perf-node-gce-01 sh -c 'find / -iname cache | fgrep -i google'
/tmp/.com.google.Chrome.TqYuQs/Default/Cache
/tmp/.com.google.Chrome.9eBUc3/Default/Cache
/tmp/.com.google.Chrome.z3xTJ1/Default/Cache
/tmp/.com.google.Chrome.J64esA/Default/Cache
/tmp/.com.google.Chrome.yUOJ11/Default/Cache
/tmp/.com.google.Chrome.A3zIeZ/Default/Cache
/tmp/.com.google.Chrome.x9iF6C/Default/Cache
/tmp/.com.google.Chrome.3PzCtu/Default/Cache
/tmp/.com.google.Chrome.QdwRFC/Default/Cache
/tmp/.com.google.Chrome.AKZhjn/Default/Cache
/tmp/.com.google.Chrome.fZ4PpA/Default/Cache
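
Given the paths in that listing, clearing the caches could be sketched like this. It is a rough illustration only: it assumes the temporary profiles live under /tmp/.com.google.Chrome.*/Default/Cache as shown above, that it runs inside the container, and that no browser session is using those profiles at the time. The function name `clear_chrome_tmp_caches` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: delete the Cache directories of Chrome's temporary
# profiles (<tmp>/.com.google.Chrome.*/Default/Cache).
clear_chrome_tmp_caches() {
  local tmp_root="${1:-/tmp}"
  local profile
  for profile in "${tmp_root}"/.com.google.Chrome.*; do
    # Skip the unexpanded pattern when no profile directory exists.
    [ -d "${profile}" ] || continue
    rm -rf "${profile}/Default/Cache"
  done
}

# Usage sketch (inside the container): clear_chrome_tmp_caches /tmp
```

This only removes cache directories; cookies are a separate matter.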

@saifsysim

What I am trying to do is run a test on a chrome node (accessing my internal web app), which caches some content in the chrome node.
I would like to delete the cache and cookies and use the same container to run the same test at a later time. I do not want the cache to be present in the next run.

However, when I run the above command:

[root@ip-10-205-73-250 ec2-user]# sudo docker exec -i -t 7ad01eb90a24 sh -c 'find / -iname cache | fgrep -i google'
find: '/etc/polkit-1/localauthority': Permission denied
find: '/etc/ssl/private': Permission denied
find: '/proc/tty/driver': Permission denied
find: '/proc/55/task/55/fd': Permission denied
find: '/proc/55/task/55/fdinfo': Permission denied
find: '/proc/55/task/55/ns': Permission denied
find: '/proc/55/fd': Permission denied
find: '/proc/55/map_files': Permission denied
find: '/proc/55/fdinfo': Permission denied
find: '/proc/55/ns': Permission denied
find: '/proc/56/task/56/fd': Permission denied
find: '/proc/56/task/56/fdinfo': Permission denied
find: '/proc/56/task/56/ns': Permission denied
find: '/proc/56/fd': Permission denied
find: '/proc/56/map_files': Permission denied
find: '/proc/56/fdinfo': Permission denied
find: '/proc/56/ns': Permission denied
find: '/proc/57/task/57/fd': Permission denied
find: '/proc/57/task/57/fdinfo': Permission denied
find: '/proc/57/task/57/ns': Permission denied
find: '/proc/57/fd': Permission denied
find: '/proc/57/map_files': Permission denied
find: '/proc/57/fdinfo': Permission denied
find: '/proc/57/ns': Permission denied
find: '/root': Permission denied
find: '/var/cache/ldconfig': Permission denied
find: '/var/lib/apt/lists/partial': Permission denied
find: '/var/lib/polkit-1': Permission denied

@alievrouw

I'm not sure if it's useful, but I use this in many of my selenium scripts to achieve what you're talking about.

driver.manage().deleteAllCookies();

@ecirtap

ecirtap commented Jan 9, 2017

The following bug, reported above, occurs only on Docker4Mac (and not on Docker4Windows), even when a proxy is not specified at the Docker Engine level:

Waiting xvfb...
 -bash: 169.254/16: No such file or directory

This is due to the default value of the Exclude text field in the Advanced pane of the Docker4Mac settings, which is greyed out initially but nevertheless taken into account by the Docker Engine. One can see it using this simple command:

$ docker run -it --rm alpine env
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=68427e1a6e8c
TERM=xterm
no_proxy=*.local, 169.254/16
HOME=/root

If this value is changed to a single space character, the no_proxy variable disappears:

$ docker run -it --rm alpine env
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=99a77bf3380e
TERM=xterm
HOME=/root

=> it is a docker4mac bug
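
A quick check for whether a given environment is affected, as a minimal sketch: list the variables whose values contain a space, since those are the ones that get split by the unquoted expansion discussed earlier in this thread. The function name `vars_with_spaces` is hypothetical, and it assumes one NAME=value pair per line (values with embedded newlines are not handled).

```shell
#!/usr/bin/env bash
# Hypothetical helper: read NAME=value lines on stdin and print the names
# of variables whose value contains at least one space.
vars_with_spaces() {
  local line name value
  while IFS= read -r line; do
    case "${line}" in
      *=*) ;;            # looks like NAME=value, keep going
      *) continue ;;     # anything else is skipped
    esac
    name="${line%%=*}"   # text before the first '='
    value="${line#*=}"   # text after the first '='
    case "${value}" in
      *" "*) echo "${name}" ;;
    esac
  done
}

# Usage sketch: docker run --rm alpine env | vars_with_spaces
```

With the default Docker4Mac setting shown above, this would be expected to report no_proxy; after changing the setting, it should print nothing for that container.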

@ghost

ghost commented Mar 4, 2017

Thanks @ecirtap, you saved me a lot of time.

@punitmishra

This is Mac-specific. If you run into the same error, change the proxy setting of the Docker service on the Mac to "No Proxy"; it should fix the xvfb issue.

[Screenshot: screen shot 2017-03-08 at 2 13 46 pm]

@djalexd

djalexd commented Apr 21, 2017

It seems that HUB_ENV_no_proxy also needs to be empty. I was able to start the debug node with the following command:
docker run -d -e HUB_ENV_no_proxy= -p5900:5900 --link selenium-hub:hub selenium/node-chrome-debug:3.3.1-cesium

@laichimirum

@punitmishra funny thing that I don't have that radio button on mac :D

@BirdTho

BirdTho commented Jul 13, 2018

@ivycas Getting the selenium images to work in docker-compose, good. I'm stuck trying to get them working in Codeship, a bit more restrictive. Also fancy seeing one of your comments here!

@ejoebstl

ejoebstl commented Jul 13, 2018

For those who might still face this issue: If you try to run multiple nodes with the docker networking mode set to host, you will also get the xvfb-run: error: Xvfb failed to start error.

@BirdTho

BirdTho commented Jul 13, 2018

@ejoebstl Actually it's codeship which uses a bridge network always.

@lock lock bot locked and limited conversation to collaborators Aug 14, 2019