Gateway Timeout issue in selenium chrome docker image #392

Closed
thyagab opened this Issue Feb 16, 2017 · 15 comments

thyagab commented Feb 16, 2017

Meta -

Image(s):
selenium/node-chrome:3.0.1-germanium
selenium/hub:3.0.1-germanium
Docker Version:
1.12.6
OS:
Amazon Linux AMI 2016.09
Environment:
Protractor+webdriver

Expected Behavior -

The Chrome node should work normally, with no gateway timeout errors.

Actual Behavior -

We are getting gateway timeout errors very frequently. Here is the log:

[chrome #1-3] [23:41:32] I/hosted - Using the selenium server at https://selenium.corp.xyz.com/wd/hub
[chrome #1-3]
[chrome #1-3] /var/jenkins_home/workspace/FEproject/node_modules/selenium-webdriver/lib/error.js:27
[chrome #1-3] super(opt_error);
[chrome #1-3] ^
[chrome #1-3] WebDriverError:

504 Gateway Time-out


[chrome #1-3] The server didn't respond in time.
[chrome #1-3]
[chrome #1-3]
[chrome #1-3]
[chrome #1-3] at WebDriverError (/var/jenkins_home/workspace/FEproject/node_modules/selenium-webdriver/lib/error.js:27:5)
[chrome #1-3] at parseHttpResponse (/var/jenkins_home/workspace/FEproject/node_modules/selenium-webdriver/http/index.js:554:11)
[chrome #1-3] at client_.send.then.response (/var/jenkins_home/workspace/FEproject/node_modules/selenium-webdriver/http/index.js:472:11)
[chrome #1-3] at ManagedPromise.invokeCallback_ (/var/jenkins_home/workspace/FEproject/node_modules/selenium-webdriver/lib/promise.js:1379:14)
[chrome #1-3] at TaskQueue.execute_ (/var/jenkins_home/workspace/FEproject/node_modules/selenium-webdriver/lib/promise.js:2913:14)
[chrome #1-3] at TaskQueue.executeNext_ (/var/jenkins_home/workspace/FEproject/node_modules/selenium-webdriver/lib/promise.js:2896:21)
[chrome #1-3] at asyncRun (/var/jenkins_home/workspace/FEproject/node_modules/selenium-webdriver/lib/promise.js:2820:25)
[chrome #1-3] at /var/jenkins_home/workspace/FEproject/node_modules/selenium-webdriver/lib/promise.js:639:7
[chrome #1-3] at process._tickCallback (internal/process/next_tick.js:103:7)
[chrome #1-3] From: Task: WebDriver.createSession()
[chrome #1-3] at Function.createSession (/var/jenkins_home/workspace/FEproject/node_modules/selenium-webdriver/lib/webdriver.js:329:24)
[chrome #1-3] at Builder.build (/var/jenkins_home/workspace/FEproject/node_modules/selenium-webdriver/builder.js:458:24)
[chrome #1-3] at Hosted.DriverProvider.getNewDriver (/var/jenkins_home/workspace/FEproject/node_modules/protractor/built/driverProviders/driverProvider.js:37:33)
[chrome #1-3] at Runner.createBrowser (/var/jenkins_home/workspace/FEproject/node_modules/protractor/built/runner.js:190:43)
[chrome #1-3] at /var/jenkins_home/workspace/FEproject/node_modules/protractor/built/runner.js:264:30
[chrome #1-3] at _fulfilled (/var/jenkins_home/workspace/FEproject/node_modules/q/q.js:834:54)
[chrome #1-3] at self.promiseDispatch.done (/var/jenkins_home/workspace/FEproject/node_modules/q/q.js:863:30)
[chrome #1-3] at Promise.promise.promiseDispatch (/var/jenkins_home/workspace/FEproject/node_modules/q/q.js:796:13)
[chrome #1-3] at /var/jenkins_home/workspace/FEproject/node_modules/q/q.js:556:49
[chrome #1-3] at runSingle (/var/jenkins_home/workspace/FEproject/node_modules/q/q.js:137:13)

tate-e commented Mar 27, 2017

This happens around 10-20% of the time for me when run on the server, and it is bogging down my regression runs. I will probably have to look for a different tool if this doesn't get fixed soon.

tate-e commented Mar 28, 2017

I have fixed the problem on my end thanks to this thread: #87
I changed my docker command from:
docker run -d -p 4444:4444 selenium/standalone-chrome:3.2.0-actinium
to:
docker run -d -p 4444:4444 -e DBUS_SESSION_BUS_ADDRESS='/dev/null' selenium/standalone-chrome:3.2.0-actinium

They say this should have been fixed by 3.1, but it does not appear to be, even though we are now on 3.2.
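For anyone launching the container from a test harness or CI script rather than by hand, the workaround can be expressed as a small argument builder. This is only a sketch: `docker_run_args` is a hypothetical helper, not part of any Docker or Selenium library, and it assembles nothing beyond the `-d`, `-p`, `-e` and `-v` flags of `docker run`.

```python
from typing import Dict, List, Optional


def docker_run_args(
    image: str,
    env: Optional[Dict[str, str]] = None,
    volumes: Optional[List[str]] = None,
    port: str = "4444:4444",
) -> List[str]:
    """Build the argv for a detached `docker run` (hypothetical helper)."""
    args = ["docker", "run", "-d", "-p", port]
    # One -e flag per environment variable, in insertion order.
    for key, value in (env or {}).items():
        args += ["-e", f"{key}={value}"]
    # One -v flag per bind mount, e.g. "/dev/shm:/dev/shm".
    for volume in volumes or []:
        args += ["-v", volume]
    args.append(image)
    return args
```

Passing the result to `subprocess.run(args, check=True)` would start the container. Calling it with `env={"DBUS_SESSION_BUS_ADDRESS": "/dev/null"}` and the `selenium/standalone-chrome:3.2.0-actinium` image reproduces the second command above.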

Contributor

tparikka commented Mar 30, 2017

@tate-e I was encountering the same exact behavior. Docker Version 17.03.1-ce-win5 (10743), selenium/hub:3.2.0, selenium/node-chrome:3.2.0, Win10, C#/.NET bindings. Browser instances were hanging completely in my node containers and would NEVER release, eventually choking out my entire Selenium grid. I am using docker compose, so my equivalent yml looks like this now:

selenium-hub:
  image: selenium/hub:3.2.0
  environment:
    - "GRID_MAX_SESSION=15"
    - "GRID_TIMEOUT=180000"
    - "GRID_BROWSER_TIMEOUT=180000"
  ports:
    - "4444:4444"

chrome:
  image: selenium/node-chrome:3.2.0
  links:
    - selenium-hub:hub
  environment:
    - DBUS_SESSION_BUS_ADDRESS=/dev/null
  shm_size: 512MB
  volumes:
    - /dev/shm:/dev/shm

In my experience, the most important stabilizing factors are the environment, shm_size, and volumes settings on the chrome service. The environment value is the Compose equivalent of what you listed in your plain docker run command, the shm_size is a recommendation I found mentioned at Chromium, and the volumes value comes from the SeleniumHQ docs.

Member

diemol commented Apr 19, 2017

The DBUS_SESSION_BUS_ADDRESS=/dev/null option was added in the 3.3.1-cesium release; check Chrome and Firefox.

It should work now, @thyagab @tate-e. Perhaps you can try again with this latest release?

thyagab commented Apr 19, 2017

@diemol Thank you. Will try it out.

thyagab commented Apr 20, 2017

@diemol we tried with 3.3.1 and we still see this issue when we run the tests in parallel.

Member

diemol commented Apr 20, 2017

@thyagab

Oh really? Wow.

Is it possible for you to provide:

  • How you start the docker containers? E.g. docker-compose or docker commands.

I edited my comment. It should not be a Selenium issue but rather a Docker one; I am curious to check it, though.

thyagab commented Apr 20, 2017

@diemol We have our Rancher setup with Docker Compose. Here is our docker-compose file:

version: '2'
services:
  chrome-dev:
    image: selenium/node-chrome:3.3.1-cesium
    environment:
      HUB_PORT_4444_TCP_ADDR: selenium-dev
      HUB_PORT_4444_TCP_PORT: '4444'
      JAVA_OPTS: -Xmx4G
      LOGSPOUT: ignore
      shm_size: 512MB
      DBUS_SESSION_BUS_ADDRESS: /dev/null
    volumes:
    - /dev/shm:/dev/shm
    labels:
      io.rancher.container.start_once: 'true'
  chrome-qa:
    privileged: true
    image: selenium/node-chrome-debug:3.3.1-cesium
    environment:
      HUB_PORT_4444_TCP_ADDR: selenium-qa
      HUB_PORT_4444_TCP_PORT: '4444'
    stdin_open: true
    volumes:
    - /dev/shm:/dev/shm
    tty: true
    links:
    - selenium-qa:selenium-qa
    - selenium-qa:selenium-qa
    ports:
    - XXXX:5900/tcp
    labels:
      io.rancher.container.pull_image: always
      io.rancher.container.start_once: 'true'
  selenium-qa:
    image: selenium/hub:3.3.1-cesium
    environment:
      GRID_BROWSER_TIMEOUT: '180000'
      GRID_MAX_SESSION: '6'
      GRID_TIMEOUT: '180000'
    volumes:
    - /dev/shm:/dev/shm
    labels:
      io.rancher.container.pull_image: always
      io.rancher.container.start_once: 'true'
  selenium-dev:
    image: selenium/hub:3.3.1-cesium
    environment:
      GRID_BROWSER_TIMEOUT: '180000'
      GRID_MAX_SESSION: '10'
      LOGSPOUT: ignore
      GRID_TIMEOUT: '180000'
    labels:
      io.rancher.container.pull_image: always
      io.rancher.container.start_once: 'true'

Member

diemol commented Apr 20, 2017

@thyagab I tried to indent your docker-compose file and this is what I got:

version: '2'
services:
  chrome-dev:
    image: selenium/node-chrome:3.3.1-cesium
    environment:
      HUB_PORT_4444_TCP_ADDR: selenium-dev
      HUB_PORT_4444_TCP_PORT: '4444'
      JAVA_OPTS: -Xmx4G
      LOGSPOUT: ignore
      shm_size: 512MB
      DBUS_SESSION_BUS_ADDRESS: /dev/null
    volumes:
      - /dev/shm:/dev/shm
    labels:
      io.rancher.container.start_once: 'true'
  chrome-qa:
    privileged: true
    image: selenium/node-chrome-debug:3.3.1-cesium
    environment:
      HUB_PORT_4444_TCP_ADDR: selenium-qa
      HUB_PORT_4444_TCP_PORT: '4444'
      stdin_open: true
    volumes:
      - /dev/shm:/dev/shm
    tty: true
    links:
      - selenium-qa:selenium-qa
      - selenium-qa:selenium-qa
    ports:
      - XXXX:5900/tcp
    labels:
      io.rancher.container.pull_image: always
      io.rancher.container.start_once: 'true'
  selenium-qa:
    image: selenium/hub:3.3.1-cesium
    environment:
      GRID_BROWSER_TIMEOUT: '180000'
      GRID_MAX_SESSION: '6'
      GRID_TIMEOUT: '180000'
    volumes:
      - /dev/shm:/dev/shm
    labels:
      io.rancher.container.pull_image: always
      io.rancher.container.start_once: 'true'      
  selenium-dev:
    image: selenium/hub:3.3.1-cesium
    environment:
      GRID_BROWSER_TIMEOUT: '180000'
      GRID_MAX_SESSION: '10'
      LOGSPOUT: ignore
      GRID_TIMEOUT: '180000'
    labels:
      io.rancher.container.pull_image: always
      io.rancher.container.start_once: 'true'

I replaced - XXXX:5900/tcp with - 5900:5900/tcp locally to start it and I got:

ERROR: The Compose file './docker-compose.yml' is invalid because:
services.chrome-qa.links value ['selenium-qa:selenium-qa', 'selenium-qa:selenium-qa'] has non-unique elements
services.chrome-qa.environment.stdin_open contains true, which is an invalid type, it should be a string, number, or a null

I commented out those lines and then I could start it, but it is still confusing: you start two grids, yet neither hub's port 4444 gets mapped to anything on localhost, so how do you run the tests? Maybe you can try first with just one grid and see how it goes?

Anyway, without checking further, I would increase shm_size: 512MB to shm_size: 1024MB. We start the Chrome containers like that (sometimes with 2GB), and then we don't get the error anymore. I think it is related to how "heavy" the web app is, since Chrome needs more resources to render it properly.
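The two validation errors quoted above can be checked mechanically before starting the grid, along with one thing the error output does not flag. The sketch below is a minimal, hypothetical linter (`lint_compose` and `SERVICE_OPTIONS` are illustrative names, not part of Docker Compose) that works on an already-parsed Compose file passed in as a plain dict. It assumes the version '2' layout with a top-level services key. The misplaced-option check reflects an observation about the posted file: shm_size sits under environment there, where Compose would most likely treat it as an ordinary environment variable rather than as the service-level shared-memory option.

```python
from collections import Counter
from typing import Any, Dict, List

# Assumption: service-level options that are easy to nest under
# `environment` by mistake. The set is illustrative, not exhaustive.
SERVICE_OPTIONS = {"shm_size", "stdin_open", "tty", "privileged"}


def lint_compose(compose: Dict[str, Any]) -> List[str]:
    """Return readable problems found in a parsed Compose file (hypothetical helper)."""
    problems: List[str] = []
    for name, service in compose.get("services", {}).items():
        # Compose rejects a links list that contains repeated entries.
        for link, count in Counter(service.get("links", [])).items():
            if count > 1:
                problems.append(f"services.{name}.links repeats {link!r}")

        env = service.get("environment", {})
        if not isinstance(env, dict):
            continue  # list-style environment ("KEY=value") is not checked here

        for key, value in env.items():
            # Valid values are strings, numbers, or null. YAML true/false parse
            # as bool, and bool is a subclass of int, so test for bool first.
            if isinstance(value, bool) or not isinstance(
                value, (str, int, float, type(None))
            ):
                problems.append(
                    f"services.{name}.environment.{key} has invalid type "
                    f"{type(value).__name__}"
                )
            # A key named like a service option is probably misplaced.
            if key in SERVICE_OPTIONS:
                problems.append(
                    f"services.{name}.environment.{key} looks like a service "
                    f"option nested under environment"
                )
    return problems
```

With PyYAML installed, the file could be loaded with `yaml.safe_load` and the resulting dict passed straight to `lint_compose`; an empty list means none of these particular problems were found.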

thyagab commented Apr 20, 2017

@diemol I will increase the shm_size and try it out. You can ignore the qa grid. The Chrome instances are mapped to the hub with:
environment:
  HUB_PORT_4444_TCP_ADDR: selenium-dev
  HUB_PORT_4444_TCP_PORT: '4444'

Member

diemol commented Apr 23, 2017

This issue might have the same root cause as #465, DBUS_SESSION_BUS_ADDRESS related.

Member

ddavison commented Apr 24, 2017

(I've edited their YAML with the indentation they intended.)

Member

diemol commented Apr 29, 2017

@thyagab can you please try with the latest release? 3.4.0-chromium
The issue with DBUS_SESSION_BUS_ADDRESS was fixed.

Contributor

jonaseicher commented Jun 9, 2017

Version 3.4.0-chromium has the DBUS_SESSION issue fixed, but I had to mount the /dev/shm volume to fix the page crashes.
Also works on OpenShift: https://docs.openshift.org/latest/dev_guide/shared_memory.html

Member

diemol commented Jun 25, 2017

Yeah, mounting /dev/shm is needed for both Chrome and Firefox containers to prevent those crashes.
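A quick way to confirm that the shared-memory setting actually took effect is to measure /dev/shm from inside the node container. Docker gives a container 64MB of /dev/shm by default unless a size is set or a volume is mounted. The sketch below assumes Python is available wherever it runs. Both function names are hypothetical, and the 1024 MiB default threshold is only an assumption borrowed from the shm_size suggestion earlier in this thread.

```python
import shutil

MIB = 1024 * 1024


def shm_total_mib(path: str = "/dev/shm") -> float:
    """Total size, in MiB, of the filesystem that holds `path`."""
    return shutil.disk_usage(path).total / MIB


def shm_is_sufficient(total_mib: float, minimum_mib: float = 1024) -> bool:
    """True when the measured size meets the chosen minimum (assumed threshold)."""
    return total_mib >= minimum_mib
```

Inside a container, `shm_is_sufficient(shm_total_mib())` returning False would suggest that the shm_size option or the /dev/shm mount was not applied to that service.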

I'll close the issue since in several others it was reported that the issue was solved. Nevertheless, please re-open if it is still not working for you (with a way to reproduce the issue as well).

@diemol diemol closed this Jun 25, 2017
