Not working when deployed (Google Cloud). TimeoutError: waiting for selector .cf-browser-verification to be hidden failed: timeout 30000ms exceeded #40

Open
mlarcher opened this issue Feb 14, 2022 · 4 comments

Comments


mlarcher commented Feb 14, 2022

This is what I get when running on GCP using offersByScrolling: TimeoutError: waiting for selector `.cf-browser-verification` to be hidden failed: timeout 30000ms exceeded
It sometimes works and sometimes fails with this error.
Any idea what's happening there?

Owner

dcts commented Feb 15, 2022

Waiting for `.cf-browser-verification` to be hidden means that you are on the Cloudflare page (cf = Cloudflare) and are not redirected to the actual OpenSea page within 30 seconds. I think most likely OpenSea detects that you are running the scraper from a Google Cloud IP, and the Cloudflare loop kicks in: the page refreshes endlessly, asking you to wait for a check that never resolves.

I have no way around that currently. Deploying scrapers on cloud infrastructure is difficult.

If you (or someone else) have ideas, please share; it's a very common problem.

One solution that might work, but is costly, is using a service like Bright Data (a proxy with an unblocker API).
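
For anyone debugging this, here is a minimal sketch of the wait described above, written so that a stuck challenge is reported instead of thrown. It assumes `page` is a Puppeteer `Page` (or any object with a compatible `waitForSelector`); the function name and return shape are hypothetical and not part of this library.

```javascript
// Sketch only: `waitForChallengeToClear` and its return shape are hypothetical.
// `page` is assumed to behave like a Puppeteer Page.
async function waitForChallengeToClear(page, timeoutMs = 30000) {
  try {
    // With `hidden: true`, this resolves once the element is hidden or absent.
    await page.waitForSelector(".cf-browser-verification", {
      hidden: true,
      timeout: timeoutMs,
    });
    return { cleared: true };
  } catch (err) {
    // Puppeteer reports an expired wait as an error named "TimeoutError".
    if (err && err.name === "TimeoutError") {
      return {
        cleared: false,
        reason: "still on the Cloudflare page after " + timeoutMs + "ms",
      };
    }
    throw err; // anything else is a genuine failure
  }
}
```

This does not get past Cloudflare; it only lets the caller tell "blocked by the challenge" apart from other errors, for example to log it or retry later.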

dcts changed the title from "TimeoutError: waiting for selector .cf-browser-verification to be hidden failed: timeout 30000ms exceeded" to "Not working when deployed (Google Cloud). TimeoutError: waiting for selector .cf-browser-verification to be hidden failed: timeout 30000ms exceeded" on Feb 15, 2022
@mlarcher
Author

UPDATE: When running on GCP we now see the TimeoutError: waiting for selector `.cf-browser-verification` to be hidden failed: timeout 30000ms exceeded error less frequently, but when the error doesn't occur we end up with an empty offers list and empty stats, i.e.:

offers: []
stats: {}

I hope this will be fixed by v7's new approach 🤞
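
Until then, a caller can at least treat this empty result as a failure rather than as valid data. The sketch below assumes the result object has the `offers` and `stats` fields shown in the output above; `isEmptyScrape` is a hypothetical helper, not something the library provides.

```javascript
// Hypothetical helper: true when a scrape returned no offers and no stats,
// which in this thread is the symptom of a silently blocked page.
function isEmptyScrape(result) {
  const offers = result && Array.isArray(result.offers) ? result.offers : [];
  const stats =
    result && result.stats && typeof result.stats === "object"
      ? result.stats
      : {};
  return offers.length === 0 && Object.keys(stats).length === 0;
}
```

A missing or malformed result counts as empty too, so the check can guard a retry or an alert without extra null handling.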

Owner

dcts commented Apr 12, 2022

REPORT FROM @mlarcher :

I dug a bit into the code and set up a test case...
It seems that on GCP I'm stuck on a page that says

Checking your browser before accessing opensea.io.
This process is automatic. Your browser will redirect to your requested content shortly.

Please allow up to 5 seconds…
DDoS protection by [Cloudflare](https://www.cloudflare.com/5xx-error-landing/)

:(
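
A test case like this can check for the interstitial directly instead of waiting for a timeout. The sketch below is a heuristic that looks for the class name and phrases quoted in this thread; Cloudflare may change that wording at any time, and `looksLikeCloudflareChallenge` is a hypothetical name.

```javascript
// Markers taken from the page text and selector quoted in this thread.
// They are an assumption about the challenge page and may go stale.
const CHALLENGE_MARKERS = [
  "cf-browser-verification",
  "Checking your browser before accessing",
  "DDoS protection by",
];

// Returns true when the HTML looks like the Cloudflare "checking your browser" page.
function looksLikeCloudflareChallenge(html) {
  if (typeof html !== "string") return false;
  return CHALLENGE_MARKERS.some((marker) => html.includes(marker));
}
```

In a Puppeteer session the input could come from `await page.content()`, which returns the current page HTML as a string.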

From what I gathered:

All in all this doesn't seem too good, but it is not directly related to the current library. Let me know if you have expertise on the matter and know some other way to tackle the problem, though :)

Owner

dcts commented Apr 12, 2022

Bypassing Cloudflare is definitely not my expertise. I have tried to solve this problem for some time now, and it is definitely possible, but as you mentioned it's an arms race. I tried these packages:

  • cloudflare-scraper (JS): did not work for me. It seems to be no longer maintained.
  • cloudscraper (Python package): I managed to set up a Google Cloud Run environment with Python and successfully got past Cloudflare. That was approximately 3 months ago. To make it work with OpenseaScraper you could either fetch only the HTML through Python and then extract the top 32 offers with the code provided in this repo, or rewrite everything in pyppeteer. The latter is just an idea; I am not even sure it would work.
