Chrome not reachable with Selenium Python #133

Closed
robroc opened this issue Apr 16, 2018 · 46 comments

Comments

@robroc robroc commented Apr 16, 2018

I'm using serverless-chrome with Python in Lambda, following this repo's instructions.

It works great locally in a Docker container, but not when deployed to Lambda. This error comes up. It seems the webdriver starts up fine but the binary is unreachable when asked to parse the DOM with Selenium.

(Session info: headless chrome=65.0.3325.146)
(Driver info: chromedriver=2.37.544315 (730aa6a5fdba159ac9f4c1e8cbc59bf1b5ce12b7),platform=Linux 4.9.85-38.58.amzn1.x86_64 x86_64)
: WebDriverException
Traceback (most recent call last):
  File "/var/task/src/lambda_function.py", line 284, in lambda_handler
    driver.find_element_by_xpath('//*[@id="searchForm"]/div[4]/ul/li[1]/input').click()
  File "/var/task/lib/selenium/webdriver/remote/webdriver.py", line 385, in find_element_by_xpath
    return self.find_element(by=By.XPATH, value=xpath)
  File "/var/task/lib/selenium/webdriver/remote/webdriver.py", line 955, in find_element
    'value': value})['value']
  File "/var/task/lib/selenium/webdriver/remote/webdriver.py", line 312, in execute
    self.error_handler.check_response(response)
  File "/var/task/lib/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: chrome not reachable
  (Session info: headless chrome=65.0.3325.146)
  (Driver info: chromedriver=2.37.544315 (730aa6a5fdba159ac9f4c1e8cbc59bf1b5ce12b7),platform=Linux 4.9.85-38.58.amzn1.x86_64 x86_64)

This is the code leading up to the line that fails:

from selenium import webdriver

chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')
chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-gpu')
chrome_options.add_argument('--window-size=1280x1696')
chrome_options.add_argument('--user-data-dir=/tmp/user-data')
chrome_options.add_argument('--hide-scrollbars')
chrome_options.add_argument('--enable-logging')
chrome_options.add_argument('--log-level=0')
chrome_options.add_argument('--v=99')
chrome_options.add_argument('--single-process')
chrome_options.add_argument('--data-path=/tmp/data-path')
chrome_options.add_argument('--ignore-certificate-errors')
chrome_options.add_argument('--homedir=/tmp')
chrome_options.add_argument('--disk-cache-dir=/tmp/cache-dir')
chrome_options.add_argument('user-agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36')
chrome_options.binary_location = '/var/task/bin/headless-chromium'

driver = webdriver.Chrome(chrome_options=chrome_options)
driver.implicitly_wait(3)
driver.get(URL)

Any advice would be fantastic.
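
For anyone comparing notes on this error, the Chrome and chromedriver versions can be pulled out of the error text instead of being read by eye. This is only a minimal sketch, assuming the message keeps the "(Session info: ...)" and "(Driver info: ...)" layout shown in the traceback above; `extract_versions` is a hypothetical helper name, not part of Selenium.

```python
import re

# These patterns target the "(Session info: ...)" and "(Driver info: ...)"
# lines that appear in the error message quoted in this thread.
_CHROME_RE = re.compile(r"chrome=(\d+(?:\.\d+)*)")
_DRIVER_RE = re.compile(r"chromedriver=(\d+(?:\.\d+)*)")


def extract_versions(message):
    """Return (chrome_version, chromedriver_version); None where not found."""
    chrome = _CHROME_RE.search(message)
    driver = _DRIVER_RE.search(message)
    return (
        chrome.group(1) if chrome else None,
        driver.group(1) if driver else None,
    )


# Sample text copied from the traceback in this issue.
message = (
    "Message: chrome not reachable\n"
    "(Session info: headless chrome=65.0.3325.146)\n"
    "(Driver info: chromedriver=2.37.544315 "
    "(730aa6a5fdba159ac9f4c1e8cbc59bf1b5ce12b7),"
    "platform=Linux 4.9.85-38.58.amzn1.x86_64 x86_64)"
)
print(extract_versions(message))
```

In a handler this could be called on `str(exc)` inside an `except` block so that both versions end up in the logs next to the failure.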

@zekth zekth commented Apr 19, 2018

I'm experiencing the same issue, but with Node 6.10. With the Chrome binary shipped in this project, https://github.com/blackboard/lambda-selenium, it works perfectly.
I'm using the same parameters as @robroc.

@robroc robroc commented Apr 19, 2018

After a lot of tooling around, I got it to work on Lambda. The problem was with incompatible versions of serverless-chrome, chromedriver, and Selenium. These are the versions that play well together in Lambda. Why is beyond me:

chromedriver v2.37
serverless-chrome v1.0.0-37
selenium 2.53.6 (for Python)

@zekth zekth commented Apr 19, 2018

Thanks for the quick feedback, @robroc. I'm going to update chromedriver to 2.37 and check whether it works.

@zekth zekth commented Apr 19, 2018

@robroc it works 💃

Is there any document that maps chromedriver versions to the Chrome versions they support?

Edit: Everything is written on the chromedriver page: https://sites.google.com/a/chromium.org/chromedriver/downloads

Maybe a warning in the serverless-chrome documentation would be a good troubleshooting pointer?

@rodel-talampas rodel-talampas commented Jul 6, 2018

@robroc
You saved my day mate!!!

adieuadieu added a commit that referenced this issue Jul 8, 2018
Suggested in #133. Adds a note about incompatible versions of selenium, chromedriver

@adieuadieu adieuadieu (Owner) commented Jul 8, 2018

Added a note to the docs.

@adieuadieu adieuadieu closed this Jul 8, 2018

@rneu31 rneu31 commented Jul 23, 2018

Did anyone investigate why we can't use newer versions of Chromium? Pinning such old versions works only until you want a new feature!

cc: @robroc @adieuadieu

@zekth zekth commented Jul 23, 2018

This is related to the chromedriver itself

@rneu31 rneu31 commented Jul 23, 2018

Just for an additional data point, leaving chromedriver at 2.37 (which claims to support Chrome v64-66), and switching to v1.0.0-41 (Chromium 65.0.3282) of this project triggers the error. The version of selenium itself seems to have no impact (as suggested earlier).

@rneu31 rneu31 commented Jul 23, 2018

Found some time to do a little more digging. The only error in the chromedriver log that jumps out at me is

[1532377838.274][SEVERE]: CreatePlatformSocket() returned an error, errno=1: Operation not permitted (1)
[1532377838.274][INFO]: listen on IPv6 failed with error ERR_ACCESS_DENIED

Anyone smarter than I? Thanks!
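
In case it helps others dig through their own chromedriver logs, here is a small sketch that filters a log down to one severity level. It assumes the "[timestamp][LEVEL]: message" line layout of the two lines quoted above; `lines_at_level` is a hypothetical helper name. As far as I know, chromedriver writes such a log when started with `--verbose --log-path=<file>`, but treat the exact way of passing those flags through Selenium as something to check for your version.

```python
import re

# Matches lines such as:
# [1532377838.274][SEVERE]: CreatePlatformSocket() returned an error, ...
_LOG_LINE = re.compile(
    r"^\[(?P<ts>\d+\.\d+)\]\[(?P<level>[A-Z]+)\]: (?P<msg>.*)$"
)


def lines_at_level(log_text, level="SEVERE"):
    """Return (timestamp, message) pairs for log lines at the given level."""
    hits = []
    for line in log_text.splitlines():
        match = _LOG_LINE.match(line.strip())
        if match and match.group("level") == level:
            hits.append((float(match.group("ts")), match.group("msg")))
    return hits


# The two log lines quoted in this comment.
sample_log = (
    "[1532377838.274][SEVERE]: CreatePlatformSocket() returned an error, "
    "errno=1: Operation not permitted (1)\n"
    "[1532377838.274][INFO]: listen on IPv6 failed with error ERR_ACCESS_DENIED\n"
)

for timestamp, text in lines_at_level(sample_log):
    print(timestamp, text)
```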

@NikolaiT NikolaiT commented Sep 10, 2018

Can we now use newer versions of headless-chromium? Or still use the old version combination?

@zekth zekth commented Sep 10, 2018

@NikolaiT chromium and chromedriver versions are tied together in terms of compatibility. Look here: http://chromedriver.chromium.org/downloads

And check the SUPPORT section of each chromedriver version to see which one fits your setup.

@marioavs marioavs commented May 22, 2019

Thank you @zekth for the reference. This combination of versions worked out:

chromedriver 2.43
serverless-chrome 1.0.0-55
selenium 3.14 (Python package)

@robroc robroc commented May 22, 2019

@marioavs It looks like chromedriver versions below 2.46 are no longer offered. Do you know if those versions of serverless-chrome and Selenium work with the offered versions?

@marioavs marioavs commented May 22, 2019

chromedriver versions below 2.46 are still available, for example version 2.43 that I mentioned.

Version 2.44 did not work for me, which is why I wanted to share the specific versions that did work in my tests. serverless-chrome 1.0.0-55 is built with chromium 69.0.3497.81 (stable channel) for amazonlinux:2017.03, which means it "should" work with chromedriver versions 2.41, 2.42, 2.43 and 2.44.
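
To make the pairing check less error-prone, the pairings mentioned so far in this thread can be put into a small lookup. This is a sketch only: the table holds just what was reported here (2.37 claiming Chrome 64-66, and Chromium 69 with 2.41 to 2.44), not the full official support list, and the function names are made up for illustration. A "True" result means the pairing is claimed as supported, not that it will run on Lambda (2.44 did not work for me).

```python
# Only pairings reported in this thread; NOT the complete official table.
REPORTED_SUPPORT = {
    "2.37": range(64, 67),  # "claims to support Chrome v64-66"
    "2.41": range(69, 70),  # Chromium 69 "should" work with 2.41 to 2.44
    "2.42": range(69, 70),
    "2.43": range(69, 70),
    "2.44": range(69, 70),
}


def driver_series(version):
    """'2.43.600233' -> '2.43'"""
    return ".".join(version.split(".")[:2])


def chrome_major(version):
    """'69.0.3497.81' -> 69"""
    return int(version.split(".")[0])


def reported_compatible(driver_version, chrome_version):
    """True/False for drivers in the table, None when the driver is unknown."""
    supported = REPORTED_SUPPORT.get(driver_series(driver_version))
    if supported is None:
        return None
    return chrome_major(chrome_version) in supported


print(reported_compatible("2.43.600233", "69.0.3497.81"))
print(reported_compatible("2.37.544315", "69.0.3497.81"))
```

Returning `None` for an unknown driver keeps "not listed here" separate from "listed and unsupported".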

@ramisvik ramisvik commented Jul 3, 2019

> After a lot of tooling around, I got it to work on Lambda. The problem was with incompatible versions of serverless-chrome, chromedriver, and Selenium. These are the versions that play well together in Lambda. Why is beyond me:
>
> chromedriver v2.37
> serverless-chrome v1.0.0-37
> selenium 2.53.6 (for Python)

I'm still facing this error with that combination. Did anyone else stumble upon this? I receive:

"errorMessage": "Message: unknown error: unable to discover open window in chrome\n (Session info: headless chrome=64.0.3282.167)\n (Driver info: chromedriver=2.37.544315 (730aa6a5fdba159ac9f4c1e8cbc59bf1b5ce12b7),platform=Linux 4.14.123-95.109.amzn2.x86_64 x86_64)\n",

@kevenpinto kevenpinto commented Nov 16, 2019

This combination works for me:
chromedriver v2.37
serverless-chrome v1.0.0-37
selenium 3.14 (for Python)

@AboveTheHeavens AboveTheHeavens commented Jan 14, 2020

Which combination is working at the moment? I've tried driver v2.43 + serverless v1.0.0-55 and driver v2.37 + serverless v1.0.0-37.

I keep getting an error: 'Chromedriver unexpectedly exited. status code was: 127'

@fzamperin fzamperin commented Feb 7, 2020

@ramisvik I'm still getting this error with headless chrome=69.0.3497.81, chromedriver=2.43.600233, selenium=3.14.0. Did you manage to make it work?

An error occurred during JSON serialization of response: WebDriverException('unknown error: unable to discover open window in chrome\n (Session info: headless chrome=69.0.3497.81)\n (Driver info: chromedriver=2.43.600233 (523efee95e3d68b8719b3a1c83051aa63aa6b10d),platform=Linux 4.14.138-99.102.amzn2.x86_64 x86_64)', None, None) is not JSON serializable
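
The "is not JSON serializable" part of that message suggests the exception object itself ended up in the handler's return value, which hides the real failure behind a second one. A minimal sketch of one way around it, assuming a handler with the usual `(event, context)` signature; `safe_handler` and `error_response` are hypothetical names, and the dict keys are my own choice rather than anything Lambda requires.

```python
import json


def error_response(exc):
    """Turn any exception into a plain dict that json.dumps can handle."""
    return {"errorType": type(exc).__name__, "errorMessage": str(exc)}


def safe_handler(work):
    """Wrap work(event, context) so failures are returned as plain dicts."""
    def handler(event, context):
        try:
            return work(event, context)
        except Exception as exc:  # broad on purpose: report, do not crash
            return error_response(exc)
    return handler


def scrape(event, context):
    # Stand-in for the Selenium code; raises like the failing call would.
    raise RuntimeError("unknown error: unable to discover open window in chrome")


lambda_handler = safe_handler(scrape)
print(json.dumps(lambda_handler({}, None)))
```

Note that returning the error this way makes the invocation count as successful, so re-raising after logging may be the better choice if retries or alarms depend on failures.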

@ramisvik ramisvik commented Feb 7, 2020

Really difficult to get the combinations working. I switched to Scrapy + Splash which is much better.

@syunkevichdemandbase syunkevichdemandbase commented Feb 29, 2020

@AboveTheHeavens Have you found a solution to this issue? I've lost all hope of making it work.

UPD: It is necessary to use Python 3.6.

@syunkevichdemandbase syunkevichdemandbase commented Mar 3, 2020

@marioavs could you please clarify exactly which build of "serverless-chrome 1.0.0-55" worked for you? Did you use chromium 69.0.3497.81 (stable channel)?

I am using:

selenium==3.14.0
chromedriver==2.43 (2.43.600233 to be precise)
serverless-chrome==everything from v1.0.0-55 and v1.0.0-54

Maybe the reason is the chromedriver? Is it possible that your 2.43 is not 2.43.600233? If so, could you please share your chromedriver somewhere in case it's different?
I desperately need this and cannot make it work :(

@syunkevichdemandbase syunkevichdemandbase commented Mar 3, 2020

I have finally set it up! The combination is:

selenium==3.14.0
chromedriver==2.43 (2.43.600233)
serverless-chrome==1.0.0-55 (69.0.3497.81 stable channel)

I had been getting the "Chrome unreachable" error because of

chrome_options.add_argument('--single-process')

Removing it solved my problem.

UPD: It works locally, but when I run it in AWS Lambda it raises the following error:

unable to discover open window in chrome
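
Since `--single-process` seems to behave differently locally and on Lambda, it may help to make it a switch rather than editing the option list by hand each time. A rough sketch under my own assumptions: the base flags are a subset of the ones posted at the top of this issue, `build_chrome_args` and `apply_args` are hypothetical helpers, and `options` is anything with an `add_argument` method, such as Selenium's `ChromeOptions`.

```python
# Subset of the flags posted earlier in this thread.
BASE_ARGS = [
    "--headless",
    "--no-sandbox",
    "--disable-gpu",
    "--hide-scrollbars",
    "--ignore-certificate-errors",
    "--user-data-dir=/tmp/user-data",
    "--data-path=/tmp/data-path",
    "--homedir=/tmp",
    "--disk-cache-dir=/tmp/cache-dir",
]


def build_chrome_args(single_process=False, extra=()):
    """Return the flag list, adding --single-process only when requested."""
    args = list(BASE_ARGS)
    args.extend(extra)
    if single_process:
        args.append("--single-process")
    return args


def apply_args(options, args):
    """Call options.add_argument for each flag and return the options."""
    for arg in args:
        options.add_argument(arg)
    return options


print(build_chrome_args(single_process=False))
```

With Selenium installed this would be used as `apply_args(webdriver.ChromeOptions(), build_chrome_args(single_process=True))`, flipping the switch per environment to see which variant starts.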

@HernanG234 HernanG234 commented Mar 9, 2020

Did anyone get this to work with recent versions? D:

@ramisvik ramisvik commented Mar 9, 2020

Hi guys, just dropping a quick update from my side. I tried multiple versions but didn't find this sustainable.

I went ahead and wrote the spiders in scrapy and use splash to emulate selenium actions.
https://blog.scrapinghub.com/2015/03/02/handling-javascript-in-scrapy-with-splash

Splash handles most of my use cases perfectly well.

@dmitrykvochkin dmitrykvochkin commented Apr 9, 2020

Hey guys, I spent last night fiddling with different versions and finally made it work. This version setup works for me:

selenium==3.14.0
chromedriver==2.43
serverless-chrome==1.0.0-55
AND
python=3.6

Initially I was using Python 3.8 (both locally and on AWS) and it did not work.
Following @syunkevichdemandbase's recommendation, switching to Python 3.6 did the job!

@Denijar Denijar commented Apr 17, 2020

Hi @dmitrykvochkin could you perhaps copy-paste the contents of your requirements.txt? I'm still having issues getting the versioning right/figuring out what I need to include in the requirements file. This repository seems to specify chromedriver-binary==2.43 while this tutorial doesn't specify it at all. Tad confused what I need. Cheers.

@dmitrykvochkin dmitrykvochkin commented Apr 17, 2020

Hey @Denijar
I am building my project using serverless, so it is slightly different. I don't have selenium or chromedriver-installer in my requirements.txt.
Instead, they are packaged into Lambda Layers. This tutorial is very close to my setup.

Make sure that you build with python=3.6 on your machine and that the same Python version is used on AWS Lambda (this was giving me issues before).

I hope this helps!
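
A quick way to catch the version mismatch early is to check the interpreter at the top of the handler module and in the build script. This is just a sketch: 3.6 is the version that worked for me, not a general requirement, and `runtime_matches` / `require_runtime` are hypothetical names.

```python
import sys

# Assumption: the runtime reported to work in this thread.
EXPECTED_RUNTIME = (3, 6)


def runtime_matches(expected=EXPECTED_RUNTIME, actual=None):
    """Compare (major, minor) of the running interpreter with the expected one."""
    if actual is None:
        actual = sys.version_info[:2]
    return tuple(actual) == tuple(expected)


def require_runtime(expected=EXPECTED_RUNTIME):
    """Raise early with a readable message when the versions differ."""
    if not runtime_matches(expected):
        found = "%d.%d" % sys.version_info[:2]
        wanted = "%d.%d" % tuple(expected)
        raise RuntimeError(
            "Built for Python %s but running on Python %s" % (wanted, found)
        )


print("running on", "%d.%d" % sys.version_info[:2],
      "| matches expected:", runtime_matches())
```

Calling `require_runtime()` both where the layer is built and inside the Lambda module makes a mismatch fail loudly instead of surfacing later as a browser error.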

@umihico umihico commented May 15, 2020

My script worked well with

These binaries could be one lambda layer with this script. The test code is here.
Please remove unnecessary arguments such as region in the script.

@mcharbonnier mcharbonnier commented Jun 10, 2020

> My script worked well with
>
> These binaries could be one lambda layer with this script. The test code is here.
> Please remove unnecessary arguments such as region in the script.

Well, I tried it, but I get a timeout. It's blocked at "get function".
Did you have this problem too?

@umihico umihico commented Jun 11, 2020

@mcharbonnier
Did you extend the Lambda timeout? The default value is 3 seconds, which is not enough.

If you still get the same error, I need more detail about this:

> It's blocked at "get function"
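
On the timeout point, one way to see how much time the function really has is to ask the Lambda context object, which provides `get_remaining_time_in_millis()`. The sketch below derives a page-load budget from it; the 2-second reserve and 1-second floor are arbitrary assumptions, and `page_load_budget_seconds` is a hypothetical helper.

```python
def page_load_budget_seconds(context, reserve_ms=2000, floor_s=1.0):
    """Seconds available for page loads, keeping reserve_ms for cleanup.

    `context` is the second argument Lambda passes to the handler; only its
    get_remaining_time_in_millis() method is used here.
    """
    remaining_ms = context.get_remaining_time_in_millis()
    return max(floor_s, (remaining_ms - reserve_ms) / 1000.0)


class FakeContext:
    """Local stand-in for the Lambda context, for trying this out offline."""

    def __init__(self, remaining_ms):
        self._remaining_ms = remaining_ms

    def get_remaining_time_in_millis(self):
        return self._remaining_ms


# With the default 3-second timeout almost nothing is left for the browser.
print(page_load_budget_seconds(FakeContext(3000)))
# With the timeout raised to 60 seconds there is room to load a page.
print(page_load_budget_seconds(FakeContext(60000)))
```

The result could be passed to Selenium's `driver.set_page_load_timeout(...)` so that a slow page raises inside the handler rather than the whole invocation being cut off.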

@mcharbonnier mcharbonnier commented Jun 11, 2020

@xoxwgys56 xoxwgys56 commented Feb 26, 2021

> My script worked well with
>
> These binaries could be one lambda layer with this script. The test code is here.
> Please remove unnecessary arguments such as region in the script.

@umihico thanks for sharing your result. It works fine.
But I am trying to make it work with a container image on ECR.

I placed your files in /var/task/bin, but it did not work.
What is the difference between a container and a layer?
I really don't get it.

I used the public.ecr.aws/lambda/python:3.7 image.

@umihico umihico commented Mar 2, 2021

@xoxwgys56
Thanks for visiting my repository and trying it out.

ECR is also possible. I am also trying to deploy with ECR in my project and have at least got it working.
I haven't pushed the changes yet (I'll do it soon), but I'll paste my Dockerfile and modification here. I hope it helps you figure it out.

FROM public.ecr.aws/lambda/python:3.7

RUN mkdir -p /opt/bin/ && \
    mkdir -p /opt/fonts/ && \
    mkdir -p /tmp/downloads/fonts && \
    curl -SL https://chromedriver.storage.googleapis.com/2.37/chromedriver_linux64.zip > /tmp/downloads/chromedriver.zip && \
    curl -SL https://github.com/adieuadieu/serverless-chrome/releases/download/v1.0.0-37/stable-headless-chromium-amazonlinux-2017-03.zip > /tmp/downloads/headless-chromium.zip && \
    curl -SL https://fonts.google.com/download?family=Noto%20Sans%20JP > /tmp/downloads/Noto_Sans_JP.zip && \
    unzip /tmp/downloads/chromedriver.zip -d /opt/bin/ && \
    unzip /tmp/downloads/headless-chromium.zip -d /opt/bin/ && \
    unzip /tmp/downloads/Noto*.zip -d /tmp/downloads/fonts/ && \
    mv /tmp/downloads/fonts/NotoSansJP-Regular.otf /opt/fonts/ && \
    rm -rf /tmp/downloads

COPY requirements.txt ./
RUN pip install -r requirements.txt

COPY server.py ./
COPY chromeless/picklelib.py ./
COPY fonts.conf /opt/fonts/
RUN mkdir -p ./versions
COPY versions/*.py ./versions/
CMD ["server.handler"]

Because of the Dockerfile above, I need to point to the binaries like this:

options.binary_location = "/opt/bin/headless-chromium"
return webdriver.Chrome("/opt/bin/chromedriver", options=options)
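
When moving between layers and container images it is easy to end up with a path that does not exist in the new layout, so a check before starting the driver can save a deploy cycle. A minimal sketch, assuming the two `/opt/bin` paths from the snippet above; `check_binaries` is a hypothetical helper.

```python
import os

# Paths taken from the Dockerfile layout above; adjust for a layer setup.
EXPECTED_BINARIES = [
    "/opt/bin/headless-chromium",
    "/opt/bin/chromedriver",
]


def check_binaries(paths):
    """Map each path to 'ok', 'missing' or 'not executable'."""
    report = {}
    for path in paths:
        if not os.path.isfile(path):
            report[path] = "missing"
        elif not os.access(path, os.X_OK):
            report[path] = "not executable"
        else:
            report[path] = "ok"
    return report


for path, status in check_binaries(EXPECTED_BINARIES).items():
    print(path, "->", status)
```

This only confirms that the files are present and executable; a binary can still fail to start if shared libraries are missing in the image.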

@xoxwgys56 xoxwgys56 commented Mar 2, 2021

@umihico thanks so much for your reply; it will be really helpful for me.
And I love what you are trying to do with your repo.

This is another question, but did you try a higher version of Chrome (like 70 or 80)?
I tried for almost 2 weeks but did not succeed.
I really wonder why Chrome 86 and other higher versions cannot be launched.

I guess this is an issue of al1 versus al2, but why?

@umihico umihico commented Mar 2, 2021

@xoxwgys56
I have no idea yet. Instead of troubleshooting directly, I'm going to try the latest Chrome with a virtual display inside Docker, without headless-chrome.

@xoxwgys56 xoxwgys56 commented Mar 4, 2021

@umihico
Yeah, I agree that a virtual display without headless-chrome could work.

@umihico umihico commented Mar 12, 2021

@xoxwgys56
I created a demo repository which works with a container image. It uses these versions.

  • Python 3.7
  • serverless-chrome v1.0.0-37
  • chromedriver 2.37
  • selenium 3.141.0 (latest)

Please check out https://github.com/umihico/docker-selenium-lambda/

I am trying to upgrade these versions but haven't managed to yet.

@milanzivic milanzivic commented Apr 14, 2021

Did anyone manage to get this running on Python 3.8?

@umihico umihico commented Apr 15, 2021

@milanzivic
I did. I updated the above repository. https://github.com/umihico/docker-selenium-lambda/

@milanzivic milanzivic commented Apr 15, 2021

Thanks! Will check it out

@duolabmeng6 duolabmeng6 commented Jun 3, 2021

https://github.com/adieuadieu/serverless-chrome/releases/tag/v1.0.0-55

chromium 69.0.3497.81 (stable channel) for amazonlinux:2017.03
https://github.com/adieuadieu/serverless-chrome/releases/download/v1.0.0-55/stable-headless-chromium-amazonlinux-2017-03.zip
https://chromedriver.storage.googleapis.com/index.html?path=2.43/

Thanks.

    chrome_options = webdriver.ChromeOptions()
    chrome_options.add_argument('--headless')
    chrome_options.add_argument('--no-sandbox')
    chrome_options.add_argument('--disable-gpu')
    chrome_options.add_argument('--window-size=1366x768')
    chrome_options.add_argument('--user-data-dir=/tmp/user-data')
    chrome_options.add_argument('--hide-scrollbars')
    chrome_options.add_argument('--enable-logging')
    chrome_options.add_argument('--log-level=0')
    chrome_options.add_argument('--single-process')
    chrome_options.add_argument('--data-path=/tmp/data-path')
    chrome_options.add_argument('--ignore-certificate-errors')
    chrome_options.add_argument('--homedir=/tmp')
    chrome_options.add_argument('--disk-cache-dir=/tmp/cache-dir')
    chrome_options.add_argument(
        'user-agent=Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/61.0.3163.100 Safari/537.36')


    chrome_options.binary_location = './headless-chromium'

    chrome = webdriver.Chrome("./chromedriver",
                              options=chrome_options)
    print(chrome)

    data = chrome.get("https://www.baidu.com")

    print(chrome.find_element_by_xpath("//html").text)

v1.0.0-55
Status: 200 OK
Max memory used: 175.12 MB
Duration: 2946 ms

v1.0.0-57
Status: 200 OK
Max memory used: 229.32 MB
Duration: 3019 ms

https://github.com/adieuadieu/serverless-chrome/releases/download/v1.0.0-57/stable-headless-chromium-amazonlinux-2.zip
https://chromedriver.storage.googleapis.com/index.html?path=86.0.4240.22/
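
To put the two results side by side, the relative difference can be worked out from the numbers above. This is only a sketch over a single run per version, so it says nothing about variance; `percent_increase` is a hypothetical helper.

```python
# Figures copied from the two runs reported above (one run each).
RUNS = {
    "v1.0.0-55": {"max_memory_mb": 175.12, "duration_ms": 2946},
    "v1.0.0-57": {"max_memory_mb": 229.32, "duration_ms": 3019},
}


def percent_increase(old, new):
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


old_run = RUNS["v1.0.0-55"]
new_run = RUNS["v1.0.0-57"]

memory_change = percent_increase(old_run["max_memory_mb"], new_run["max_memory_mb"])
duration_change = percent_increase(old_run["duration_ms"], new_run["duration_ms"])

print("memory:   %+.1f%%" % memory_change)
print("duration: %+.1f%%" % duration_change)
```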

@xoxwgys56 xoxwgys56 commented Jun 9, 2021

The versions @duolabmeng6 posted work fine, but as you said, the newer one takes more time.

In this case, I wonder what the differences between the versions are.
I cannot tell which version takes more memory and which does not.
Is a higher version of Chrome always the best choice, or is a lower version better? I cannot tell.

I notice the built image sizes are different, but I'm not sure whether that makes Chrome slower or heavier.

@duolabmeng6 duolabmeng6 commented Jun 9, 2021

> The versions @duolabmeng6 posted work fine, but as you said, the newer one takes more time.
>
> In this case, I wonder what the differences between the versions are.
> I cannot tell which version takes more memory and which does not.
> Is a higher version of Chrome always the best choice, or is a lower version better? I cannot tell.
>
> I notice the built image sizes are different, but I'm not sure whether that makes Chrome slower or heavier.

I chose Chrome 69 (v1.0.0-55) because Aliyun Function Compute does not allow more than 100 MB, and because it uses less memory and time.

That said, I would prefer to use the latest version, if only it used less memory and ran faster.

For the time being, v1.0.0-55 is appropriate as long as it has no bugs.

It would be good to build the latest version if a Chrome build can be released. The current Chrome version is 91.

I'd like to try the latest version. I'm testing on Aliyun.

@xoxwgys56 xoxwgys56 commented Jun 9, 2021

@duolabmeng6 Thanks for replying. Yes, I agree with that.
If you succeed with a more recent version of Chrome, I hope you will share it.

P.S. I don't know what "aliyun test" is. I'm just guessing it means something like an alpha test or pre-test.

@duolabmeng6 duolabmeng6 commented Jun 9, 2021

> @duolabmeng6 Thanks for replying. Yes, I agree with that.
> If you succeed with a more recent version of Chrome, I hope you will share it.
>
> P.S. I don't know what "aliyun test" is. I'm just guessing it means something like an alpha test or pre-test.

https://www.alibabacloud.com/product/function-compute

The results were tested on Alibaba Cloud.

I don't know how to build the latest version, Chrome 91. I still need to learn. For now, the latest version I see is 88 (v1.0.0-57).

Because Alibaba Cloud needs special processing for anything over 100 MB, I chose version 69 (v1.0.0-55). This makes it easier to deploy functions.

In Function Compute, memory requirements matter. It is easier to choose smaller files, which deploy faster.

A small memory setting works without major problems.

If you deploy with Docker, you can use the latest Chrome 91, but it is bigger.

https://github.com/duolabmeng6/heroku_selenium

The latest version of Chrome, built on Heroku, can also be tested there.

My English is machine-translated. I'm sorry for any inaccurate wording.
