Captcha taking longer than expected #42

Open
albertlaudia opened this issue Mar 25, 2021 · 12 comments

Comments

@albertlaudia

I am experiencing "This is taking longer than expected; please reload the page."

Not sure what's wrong. It happened on the third captcha.

@rocketinventor
Contributor

Hi @albertlaudia

At what part of the script is this happening? At the sign-in, or somewhere else? What happens if you manually change the URL after seeing this page?

The way that the script is set up right now, it is meant to block the captcha from loading and just skip to the next page.
If you want to load the captcha, you can open up uBlock and switch hcaptcha.com over from block to allow.

However, I don't think that will help you much, because if you are getting the captcha page already, then you'll probably just get it again (even if you solve it).

@albertlaudia
Author

albertlaudia commented Mar 25, 2021

I am actually not 100% sure how the captcha works. It just seems that the script gets stuck on the captcha and then throws an error saying there is no internet.
image

@FirstClassCitizenFCC
Contributor

Assuming you get the captcha after the login check, try to change the URL manually to https://www.blinkist.com/{language} when the captcha occurs.
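
A minimal sketch of that manual redirect, assuming the browser is controlled through a Selenium-style `driver` object that exposes a `get(url)` method. The helper names `language_home_url` and `leave_captcha_page` are hypothetical and are not taken from the script:

```python
BLINKIST_BASE = "https://www.blinkist.com"


def language_home_url(language: str = "en") -> str:
    """Build the language landing page URL, e.g. https://www.blinkist.com/en."""
    return f"{BLINKIST_BASE}/{language.strip().lower()}"


def leave_captcha_page(driver, language: str = "en") -> None:
    """Point the browser at the language landing page instead of the captcha.

    `driver` is assumed to be Selenium-like: any object with a get(url) method.
    """
    driver.get(language_home_url(language))
```

This only automates the URL change itself; as the replies below show, the site may redirect straight back to the captcha.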

@obsessivelearner

Not OP but I have a similar issue and it started a day before this issue was opened. The script ran perfectly for about 2 weeks prior to this. I don't have a premium account, I just scrape the daily book at midnight every day.

That out of the way, I tried what @FirstClassCitizenFCC suggested. Changing the URL manually first redirects me to https://www.blinkist.com/en/nc/library, followed by the same captcha page immediately after. Interestingly, even though the terminal initially says "logged into blinkist", the final error was "Failed to log in to Blinkist", so I am not sure whether the captcha comes before or after the login check. I'm assuming after, because it does load my account's library for a split second before it gets stuck on the captcha.

The first image is my terminal output on a regular run, the second image is the output I get when I manually change URL after I'm stuck on the captcha.

image

image

@johndoe-dev00
Contributor

johndoe-dev00 commented Mar 27, 2021

I had the same problem with the captcha not loading correctly.
Disabling uBlock did the trick for me.
Once you have successfully logged in (the cookie file has been created), you can activate it again.

I also did a few other workarounds for the login process. You can check my fork.
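
As a rough illustration of the "disable uBlock until the cookie file exists" workaround, the decision could be sketched as below. This is an assumption-laden sketch: the function name `should_enable_ublock` and the cookie file location are hypothetical, and the thread does not say where the script stores its cookies or in what format.

```python
from pathlib import Path


def should_enable_ublock(cookie_file: Path) -> bool:
    """Return True only if a non-empty cookie file from an earlier login exists.

    The path is supplied by the caller; the real script's cookie location
    is not stated in this thread, so nothing is hard-coded here.
    """
    return cookie_file.is_file() and cookie_file.stat().st_size > 0
```

On a first run the file is missing, so uBlock stays off and the captcha can load; on later runs the saved cookies exist and uBlock can be turned back on.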

@obsessivelearner

I had no clue how to disable uBlock in the script because I'm very new to coding, but disabling uBlock in my Chrome instance after scraping started let me solve a captcha, and then it scraped the books as normal once I accepted cookies.

@johndoe-dev00 I see you have a docker build for this project! That is something I had been searching for like a madman. I'll definitely check out your fork and docker. I hope to run this project on my Synology NAS via docker :)

I realize the issue isn't solved but having found the inelegant solution that we have, I realize the issue may be closed and I just wanted to thank everybody who's worked on the project and I hope to pay it forward in the near future.

@Riviss

Riviss commented Mar 29, 2021

What ended up being successful for me was disabling uBlock, then clicking on the captcha area quickly when the page first loads. The captcha would then actually pop up to be completed, and everything would work. (This may work without first disabling uBlock; I had already disabled it when I tried this.)

If I just left the page to load without clicking quickly, it would go to the page with the screenshot @albertlaudia posted.

@obsessivelearner

obsessivelearner commented Mar 31, 2021

Disabling uBlock manually doesn't work anymore. Redirects to the following work of art:

image

The title of the daily book is "The Internet of Us: Knowing More and Understanding Less in the Age of Big Data" and I'm not even mad.

Terminal Output looks like this:

image

@rocketinventor
Contributor

rocketinventor commented Mar 31, 2021

@obsessivelearner The issue that you are having has nothing to do with the script. The site is just broken right now...

Try navigating to https://www.blinkist.com/en/nc/daily/reader/the-internet-of-us-en manually in your web browser; you should see the same issue.

@leoncvlt
Owner

leoncvlt commented Apr 5, 2021

Also, I think that link appears broken simply because "The Internet of Us" is no longer available as the free daily book; it probably worked on the day it was. https://www.blinkist.com/en/nc/daily should dynamically resolve to the free daily book, but reading the book from that link doesn't send you to the book's generic reader page. Instead, it sends you to a special https://www.blinkist.com/en/nc/daily/reader/{book-slug} URL, which obviously works for one day only.
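
To make the distinction concrete, here is a small sketch that recognises the one-day-only daily reader URL and pulls out its book slug. The function name `daily_reader_slug` is made up for illustration, and the path layout is inferred only from the URLs quoted in this thread:

```python
from urllib.parse import urlparse


def daily_reader_slug(url: str):
    """Return the book slug of a /{language}/nc/daily/reader/{book-slug} URL.

    Returns None for any other URL, including the /nc/daily landing page.
    """
    parts = [p for p in urlparse(url).path.split("/") if p]
    # Expected shape: [language, "nc", "daily", "reader", slug]
    if len(parts) == 5 and parts[1:4] == ["nc", "daily", "reader"]:
        return parts[4]
    return None
```

A scraper could use a check like this to avoid saving or revisiting a daily reader link after the free book has rotated.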

@rocketinventor
Contributor

@leoncvlt Well, that was the book five days ago, but the Blinkist site was actually broken.

@jonaschn
Contributor

Using --no-ublock worked for me.
Also, manually using the privacy-pass extension makes scraping audio possible again.

8 participants