WiP: Fix looping Cloudflare challenge, Resolves #1036 #1163
FWIW, after each solve a Chrome subtask remains that spins up to 15% CPU, and I have to kill them off manually. |
Another thing I've noticed concerns the headless user-agent replacement:
self.execute_cdp_cmd(
    "Network.setUserAgentOverride",
    {
        "userAgent": self.execute_script(
            "return navigator.userAgent"
        ).replace("Headless", "")
    },
)
I don't know why, but if I hardcode the user-agent to the exact one my computer has, like this:
user_agent = "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/123.0.0.0 Safari/537.36"
options.add_argument(f"--user-agent={user_agent}")
it bypasses Cloudflare, but if I put this to make it automatically like you have it on line 533 from
So an alternative could be to set up a driver only to get the user agent:
def get_user_agent(driver):
    return driver.execute_script("return navigator.userAgent;").replace("Headless", "")
And then pass the user-agent to the definitive driver.
PS: I can only tell you what I've discovered, to see if we can work toward a solution, because I'm having trouble getting the project installed/set up 😅 |
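The two-stage idea from the comment above could be sketched roughly as follows. This is only a sketch: `get_user_agent` is the helper quoted in the comment, while `user_agent_argument` and `StubDriver` are hypothetical names added so the snippet runs without a browser. With Selenium, a real driver object exposing `execute_script` would take the stub's place.

```python
def get_user_agent(driver):
    """Ask a (throwaway) driver for its user agent and drop the 'Headless' marker."""
    return driver.execute_script("return navigator.userAgent;").replace("Headless", "")


def user_agent_argument(user_agent):
    """Build the Chrome command-line flag that pins the user agent (hypothetical helper)."""
    return f"--user-agent={user_agent}"


class StubDriver:
    """Stand-in for a real WebDriver so the sketch runs without Chrome (assumption)."""

    def execute_script(self, script):
        return ("Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 "
                "(KHTML, like Gecko) HeadlessChrome/123.0.0.0 Safari/537.36")


# Step 1: a short-lived driver is used only to read the user agent.
probe = StubDriver()
flag = user_agent_argument(get_user_agent(probe))

# Step 2: this flag would be added to the options of the definitive driver.
print(flag)
```

Whether a flag set this way behaves differently from the `Network.setUserAgentOverride` call is exactly the open question in the comment; the sketch only shows how the value could be carried from one driver to the next.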
I didn't actually use this branch, it worked fine after I switched to it. Thanks |
@garfield69 yeah, this seems to be an issue with Chrome v124. You can revert to v123 in the meantime if it's easier - #1161. Alternatively, build your own binaries, which will use Chromium v123:
|
@m33ts4k0z were you doing this on Windows? |
Yes on a Windows 11 VM on Unraid but it did work in the end. I updated my first post here with the cause. |
Oh cool, did not know I could build on windows. |
@juanfrilla sorry for the delay in replying, been busy and only got to a few quick ones on my phone. I'll have a look at the UA idea when I next get a chance, thanks. Assuming you're following the run from source instructions, what issue are you having? https://github.com/FlareSolverr/FlareSolverr#from-source-code |
@ilike2burnthing my main problem is that I cannot install Xvfb on macOS |
Tried XQuartz? |
yessir now the project is set up, let's see what I can fix |
What exactly is left to do on this to get it merged? I tried to piece it together from the comments here and some other issues, but I can't work out the current status. It seems to have been stale for quite some time, so what's needed? |
|
Well, I made my own implementation of this "new tab" idea and I was able to make it work with every website I could (ext.to, www3.yggtorrent.cool, dodi-repacks.site, hd-torrents.me/login.php, nhentai.net) on my Linux system using a VPN / socks5 proxy and also with my container image on my own remote Linux server, which was blocked by cloudflare too. Public image with my edits: 21hsmw/flaresolverr:fixlooping |
That's working 95% of the time on Windows for me, even with a proxy, but failing 95% of the time on Docker. Usual error:
Seems it's related to |
When you say it fails on Docker, is it still on Windows or Linux? I got this error on Linux while doing my implementation, but have not been able to replicate it since. For the looping challenges, it seems to be a timing issue. Playing with the timer values can make it work in some cases, but it's not easy to know what works for everyone since it seems to take network latency into account. For example, if I use a proxy close to my location, it works 100% of the time with the sites I listed earlier, but if I use a proxy very far from me, it works 50% of the time. |
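Since the right timer values seem to depend on latency, one option is to poll for a condition up to a configurable deadline instead of sleeping for a fixed time. A minimal sketch under that assumption; `wait_until`, its parameters, and the fake condition are hypothetical and not part of FlareSolverr:

```python
import time


def wait_until(condition, timeout=10.0, interval=0.1):
    """Poll `condition` until it returns True or `timeout` seconds elapse.

    Returns True if the condition was met in time, False otherwise.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Fake "challenge solved" check that succeeds on the third poll (illustration only).
polls = {"count": 0}


def challenge_solved():
    polls["count"] += 1
    return polls["count"] >= 3


solved = wait_until(challenge_solved, timeout=2.0, interval=0.01)
timed_out = wait_until(lambda: False, timeout=0.05, interval=0.01)
```

With this shape, a distant proxy only needs a larger `timeout`, while a nearby one still returns as soon as the condition holds.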
Linux. I'll play around with timings again (I did a bunch yesterday), see if I can get something that works both on my Docker and Windows. |
Strange then. I'm able to solve the challenges of all sites I try on my Debian and Fedora systems with different VPNs/Proxies with and without Docker involved. Here's an example with dodi-repacks.site using the docker image I shared previously: |
Thanks for your workaround @21hsmw Working with @aevrard the solution you provide will kill the killswitch if you're using something like gluetun... |
Thanks @21hsmw ! |
Worked for me on whatbox.ca
services:
  flaresolverr:
    image: 21hsmw/flaresolverr:fixlooping
    environment:
      - LOG_LEVEL=${LOG_LEVEL:-info}
      - LOG_HTML=${LOG_HTML:-false}
      - CAPTCHA_SOLVER=${CAPTCHA_SOLVER:-none}
      - TZ=UTC
      - PORT=25000
      - HOST=127.0.0.1
    network_mode: host
    pull_policy: always
    restart: unless-stopped
|
replacing the image of the dockerfile for this: I tested as well on a CentOS server with the previous image ( |
I have set up Flaresolverr, Jackett and Prowlarr in containers and I have tested the last 2 sites you talked about: I set up Flaresolverr in both Prowlarr and Jackett, added the indexers, created an account on seatracker to test it out, and was able to get both to work with Flaresolverr using my image on docker hub. I tried a few searches; both Prowlarr and Jackett work, and I'm able to download torrent files or get the magnets. I also tried https://ilcorsaroblu.org with Jackett and Prowlarr and both worked. It was slow, but eventually it worked. Do you have another system running Linux (a live system can also be tried) with a different CPU that you can try the stack on? |
@21hsmw can you try those with an HTTP proxy enabled in Jackett? |
I found a working HTTP proxy online, set it up in Jackett and verified in the flaresolverr logs that it was being used in the incoming command. I was then able to go through the cf challenge and search for torrents for the 3 websites. |
I'll try on Windows |
Last week, 21hsmw/flaresolverr:nodriver worked 90% of the time on my Debian ARM (without VPN or proxy), but it hasn't worked since the update from 4-5 days ago.
And when I try again I get this:
Hope this helps your investigations! |
Since I'm accessing PDF URLs, a problem I'm facing is that sometimes the "No space left on device" message appears and it stops working until free space is available. How can I automatically remove temporary files, or how can I avoid downloading anything, since I don't need it? |
That's probably because I changed to reusing the nodes. Instead of being re-created like before, they are taken at a certain point in time, which could lead to missing elements. I'll see what I can do about that.
Nodriver deletes all user data directories when it exits, which in our case is when flaresolverr is completely stopped, so that might explain why you are getting this. |
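If leftover profile data is what fills the disk while the service keeps running, a periodic cleanup could serve as a stopgap. A hedged sketch: the `uc_` prefix, the age threshold, and the function name are assumptions for illustration, so check which directories your setup actually creates before deleting anything.

```python
import os
import shutil
import tempfile
import time


def remove_stale_dirs(root, prefix, max_age_seconds):
    """Delete directories under `root` whose name starts with `prefix`
    and whose modification time is older than `max_age_seconds`."""
    removed = []
    now = time.time()
    # Materialise the listing first so deletions do not disturb the iteration.
    for entry in list(os.scandir(root)):
        if not entry.is_dir(follow_symlinks=False) or not entry.name.startswith(prefix):
            continue
        if now - entry.stat(follow_symlinks=False).st_mtime > max_age_seconds:
            shutil.rmtree(entry.path, ignore_errors=True)
            removed.append(entry.name)
    return removed


# Demonstration in a scratch directory, so nothing real is touched.
with tempfile.TemporaryDirectory() as root:
    old = os.path.join(root, "uc_old")
    new = os.path.join(root, "uc_new")
    other = os.path.join(root, "keep_me")
    for path in (old, new, other):
        os.mkdir(path)
    two_hours_ago = time.time() - 7200
    for path in (old, other):
        os.utime(path, (two_hours_ago, two_hours_ago))  # pretend these are stale
    removed = remove_stale_dirs(root, prefix="uc_", max_age_seconds=3600)
    remaining = sorted(os.listdir(root))
```

The age check is there so that a profile still in use by a running browser is less likely to be removed; it does not guarantee that.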
I also noticed this issue with the one I am currently using .cookies.load(flie) |
Getting errors and a few zombie Chrome processes on start (though it still works):
Working fine for every link I throw at it, with or without a proxy. While I can start and use a session, I'm getting an error when trying to destroy it (it continues to run after):
I'll go back and try Docker on my NAS again, but failing that I'll get a live system going. |
Well I eventually got a couple of successive successful runs with ilcorsaroblu.org, but with high memory usage from the Python process persisting after. The other two unfortunately continue to return invalid cookies (but work fine on the current release). |
i found a fix for the memory leak. ultrafunkamsterdam/undetected-chromedriver#1851 (comment) |
@ilike2burnthing Thanks, I'll try to see what I can do about the errors you're getting. The first one was fixed on my end the last time I worked on it, so it needs a more global way to fix it. I'll also check the sessions and chromium processes that are still running.
I tried it, but it still creates memory leaks when a website has a lot of elements while using query_selector. |
please send me an example code. maybe there's something more i could do. my approach should fix memory leaks which happened when commands were being sent from nodriver to devtools. i don't think that there is any other memory leak. what you're experiencing is probably just default memory load. every function which returns an element loads page content to memory each time you're calling it. wait_for does it every 0.1s. there always will be some memory load. previously due to a bug the content wasn't able to unload from memory which caused a heavy leak. |
Using Flaresolverr as an example, if you iterate selectors based on the selectors in "CHALLENGE_SELECTORS" with nodriver query_selector, you will see that the memory fills up and then goes down a bit after the instance is closed, but not to the original state. So if you continue, it goes up and up until the system kills the process. The only way I have currently found to stop this memory leak issue and to speed up the queries is to reuse the node ( If you remove all |
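The difference between re-querying and reusing a node can be illustrated without a browser. In this sketch everything is hypothetical: `FakePage`, `snapshot`, and the selector strings only mimic the shape of the problem and are not nodriver's or FlareSolverr's actual API.

```python
class FakePage:
    """Stand-in for a browser tab; counts how often the document is fetched."""

    def __init__(self, present_selectors):
        self._present = frozenset(present_selectors)
        self.fetches = 0

    def snapshot(self):
        # Each call models loading the page content into memory again.
        self.fetches += 1
        return self._present


def match_requerying(page, selectors):
    """Fetch a fresh document for every selector (the costly pattern)."""
    return [s for s in selectors if s in page.snapshot()]


def match_reusing(page, selectors):
    """Fetch the document once and test every selector against that node.

    Trade-off: elements that appear after the snapshot is taken are missed.
    """
    document = page.snapshot()
    return [s for s in selectors if s in document]


selectors = ["#challenge-a", "#challenge-b", "#challenge-c"]  # placeholder selectors

page_a = FakePage({"#challenge-b"})
found_a = match_requerying(page_a, selectors)

page_b = FakePage({"#challenge-b"})
found_b = match_reusing(page_b, selectors)
```

Both functions find the same selector here, but the first fetches the document once per selector and the second only once in total.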
@ilike2burnthing I pushed some changes in my repo and on docker hub, let me know if you still get the errors you were getting. |
Thanks to @juanfrilla for #1036 (comment).
Unfortunately, currently this only works on Windows, and the looping challenges return if using proxies or VPNs.