This repository has been archived by the owner on Jan 31, 2024. It is now read-only.

Downloader is failing, due to recent Rate Limiting update by Fansly #148

Open
cinfulsinamon opened this issue Aug 24, 2023 · 26 comments
Labels
bug (Something isn't working) · contributors appreciated (help adding this functionality/feature) · help wanted (Extra attention is needed) · investigating (currently looking into this issue)

Comments

@cinfulsinamon

cinfulsinamon commented Aug 24, 2023

Bug Description

For some creators I try to download from, the program fails to recognize posts after the first set it finds. It can even fail to find the first set of posts if a previous successful run happened recently. It doesn't seem to happen with all creators, or with downloads using the download_mode = Single param, however.

Expected behavior

All Timeline posts from a creator should download.

Actual behavior

Only the first set of posts found was downloaded.

Environment Information

  • Operating System: Windows, Linux, MacOS
  • Fansly Downloader Version: 0.4.1 and above
  • Fansly Downloader Type: Executable & python version
  • Specific creator's name: All creators

Additional context

I had issues with the Windows executable, so I tried the latest Python script to see if the problem was solved there, and it was not. Adding some debug lines to print the request output shows that the first request is successful and the timeline_cursor is correctly updated to the last entry, but the second request still returns with all fields present but empty. Adding an extra delay between each request seems to fix the issue.
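
For illustration, the pagination I was poking at behaves roughly like this (a minimal sketch; the endpoint URL, response shape, and cursor handling here are assumptions for illustration, not the downloader's exact code):

```python
# Minimal sketch of the paginated timeline fetch with an added delay.
# NOTE: the endpoint URL, response shape and cursor handling are assumed
# for illustration only.
import time

import requests

def fetch_timeline(session: requests.Session, creator_id: str, delay: float = 5.0):
    cursor = "0"  # hypothetical: start from the newest posts
    while True:
        resp = session.get(
            f"https://apiv3.fansly.com/api/v1/timeline/{creator_id}",  # assumed endpoint
            params={"before": cursor},
        )
        resp.raise_for_status()
        posts = resp.json().get("response", {}).get("posts", [])
        if not posts:  # rate-limited responses come back with fields present but empty
            break
        yield from posts
        cursor = posts[-1]["id"]  # update timeline_cursor to the last entry
        time.sleep(delay)  # the extra delay that avoids the empty second response
```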

@cinfulsinamon cinfulsinamon added the needs triage (issue needs to be validated) label Aug 24, 2023
@Avnsx Avnsx added the bug (Something isn't working) and investigating (currently looking into this issue) labels and removed the needs triage (issue needs to be validated) label Aug 24, 2023
@Avnsx

This comment was marked as outdated.

Repository owner deleted a comment from cinfulsinamon Aug 24, 2023
@Avnsx Avnsx pinned this issue Aug 26, 2023
@Avnsx Avnsx changed the title Failing to grab posts if requests are too close together Downloader v0.4.1 is failing, due to recent Rate Limiting update by Fansly Aug 26, 2023
@Avnsx

This comment was marked as outdated.

@Sebastian1175

Sebastian1175 commented Sep 3, 2023

I tried the executable and the most recent 0.4.2 Python version of Fansly Downloader, and it is still having the same rate-limiting problem as before.

@Avnsx
Owner

Avnsx commented Sep 4, 2023

So it appears that just after I bypassed the first introduction of rate limiting by switching back to the old Fansly API endpoint for timeline downloads, they noticed it and adjusted their website code to apply the rate limiting to that endpoint too. This change happened just a few hours after I released Fansly Downloader's 0.4.1-post1 version, which makes me think that they're now actively watching this downloader's commit history and counter-patching my changes 🤣

Anyways, can you guys try out this branch of version 0.4.2 and see if it solves the rate-limiting issue again? Within that branch, fansly-downloader is just artificially slowed down to avoid hitting the rate limit. I'm on vacation for a few weeks, chilling on the beach, so I don't have access to a Python environment (or a PC), and I won't go to great lengths to change that.

Additionally, I noticed they're introducing more variables/tokens for each request to the API endpoints to further validate the requests, which their backend has to handle. If they've already added logging to see which requests are not sending these new tokens, then they can already tell which requests came from 3rd-party code like fansly-downloader (as of version 0.4.2 these tokens are still not replicated). It's also very possible that the rate limiting is only applied when these tokens are not sent, because last time I checked, scrolling around on their website still instantly loads all media content, which means no rate limiting is applied there. That would require further testing, which I don't currently have time for.

@Avnsx Avnsx changed the title Downloader v0.4.1 is failing, due to recent Rate Limiting update by Fansly Downloader is failing, due to recent Rate Limiting update by Fansly Sep 4, 2023
Repository owner deleted a comment from Prolapsexd Sep 4, 2023
@Avnsx Avnsx added the help wanted (Extra attention is needed) label Sep 5, 2023
@lordoffools

lordoffools commented Sep 6, 2023

Strangely, I don't always hit this rate-limit issue.

Sometimes it goes all the way, and sometimes I get this:

WARNING | 12:29 || Low amount of Pictures scraped. Creators total Pictures: 1683 | Downloaded: 300
WARNING | 12:29 || Low amount of Videos scraped. Creators total Videos: 873 | Downloaded: 113

Sometimes it downloads only 10 items, and sometimes it downloads thousands.

Is it possible to slow down even further on our side (by allowing param-level rate limiting)?

@lordoffools

If it helps, I am using the forked (0.4.2) version.

@lordoffools

lordoffools commented Sep 6, 2023

Another data point: after the failure I noted above, I tried a different creator, and it's been scraping successfully for a while now. We'll see what the final count is when it's done; I'll update once it's complete.

Update: The new run (for a different creator) ended successfully:

Finished Normal type, download of 2911 pictures & 461 videos! Declined duplicates: 30

So, I'm puzzled as to why some creator scrapes are throttled, and others are not (especially when those that aren't sometimes have way more content).

@Avnsx
Owner

Avnsx commented Sep 7, 2023

I am using the forked (0.4.2) version.

Can you try out this branch and let me know if it reliably passes the rate limit every time?

@lordoffools

lordoffools commented Sep 7, 2023

Can you try out this branch

Done. Tested multiple times.

It does not successfully pass the rate-limit all the time. At least, there are some creators where it fails all the time.

There are some where it passes 100% of the time.

I'm not entirely sure why.

@Avnsx
Owner

Avnsx commented Sep 7, 2023

It does not successfully pass the rate-limit all the time. At least, there are some creators where it fails all the time.

Looks like the most efficient way to handle this would be a function that, before starting timeline downloads, measures whether a rate limit even exists for a specific creator and dynamically adjusts the wait time depending on the result. It would be cool if someone contributed that; otherwise I'll write it myself when I return from my vacation in a few weeks.

But for now, you might as well just raise this sleep timer from 5, 6 to whatever reliably passes the rate limit every time, e.g. 7, 8.
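
Roughly, I'm picturing something like this for the probe (just a sketch; get_timeline_page is a hypothetical stand-in for the downloader's actual request code):

```python
# Sketch: probe one extra timeline page and double the wait until it passes.
# get_timeline_page(creator_id, cursor) -> list is a hypothetical helper.
import time

def measure_wait_time(get_timeline_page, creator_id, base_wait=5.0, max_wait=120.0):
    first = get_timeline_page(creator_id, cursor="0")
    if not first:
        return base_wait  # nothing to paginate; keep the default timer
    cursor = first[-1]["id"]
    wait = 0.0
    while wait <= max_wait:
        if get_timeline_page(creator_id, cursor=cursor):
            return wait  # no rate limit hit at this wait time
        wait = max(base_wait, wait * 2)  # 5, 10, 20, 40, ...
        time.sleep(wait)
    return max_wait  # cap: this creator never passed within the budget
```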

@Avnsx Avnsx added the contributors appreciated (help adding this functionality/feature) label Sep 7, 2023
@lordoffools

lordoffools commented Sep 7, 2023

Thanks for the tip! I'll play around with the sleep timer and report back on my findings.

@Sebastian1175

Sebastian1175 commented Sep 7, 2023

Were you able to solve the problem? I can't tell from reading your comments.

@lordoffools

Were you able to solve the problem? I can't tell from reading your comments.

The author says they will work on it when they return, and they're also putting out a call for contributors to help solve it.

@plywood234

plywood234 commented Sep 8, 2023

I set my sleep timer to 105,108 and it started working on an account that previously did not scrape much. It probably doesn't need to be that crazy, but it's definitely an issue with the sleep timer.

Edit: 72,75 worked but 52,55 did not work.

@lordoffools

I did something similar. I have it set to 120, 240 right now, and it's working on all the ones I shared above that failed previously (and consistently).

Obviously it's taking forever, and not every one required 120... so I'm not sure why some do and some don't.

@Sebastian1175

Sebastian1175 commented Sep 8, 2023

I set my sleep timer to 105,108 and it started working on an account that previously did not scrape much. It probably doesn't need to be that crazy, but it's definitely an issue with the sleep timer.

Edit: 72,75 worked but 52,55 did not work.

It is slow as hell, but yes, it works ok-ish with 72,75. Thank you.

@lordoffools

I finally encountered a creator that I cannot scrape with 120, 122. Doubling the numbers now to see if that helps (and yes, it'll take ages and ages).

@lordoffools

Confirmed: I have an example of a creator where no matter how high I set the delay, it still fails.

@LastInvoker

Confirmed: I have an example of a creator where no matter how high I set the delay, it still fails.

Same here, I can only scrape back to Jan 2023; everything older fails.

@lordoffools

Confirmed: I have an example of a creator where no matter how high I set the delay, it still fails.

Same here, I can only scrape back to Jan 2023; everything older fails.

It has nothing to do with the age of the posts, it seems. I've had some that don't pull before August 2023, and some that don't pull before yesterday... and then some that pull 100% successfully. This is repeatable, so it seems to be creator-specific.

Very confusing to me.

@LastInvoker

LastInvoker commented Sep 10, 2023

I always receive the error that there is no media at the current cursor; I don't know what to change anymore XD

@Bearded-Baguette

I think I created a workaround for the rate limiting. I used the sleep function created above and added retry attempts after each sleep. If the program fails to pull posts from a timeline, it will wait X seconds and then try to pull the same timeline again. It usually takes 5-8 attempts, but can sometimes take more. After it successfully pulls posts from the timeline, the number of retry attempts resets.

I created a pull request with these changes, but I'm not sure what the process is for reviewing them. It's definitely not a perfect fix, but it seems to push through the rate limiting most of the time.
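
The idea looks roughly like this (a sketch, not the exact code from my PR; get_timeline_page is a hypothetical stand-in for the downloader's request code):

```python
# Sketch of the retry workaround: after an empty response, sleep a random
# interval and retry the same cursor, up to max_attempts times.
# get_timeline_page(creator_id, cursor) -> list is a hypothetical helper.
import random
import time

def get_posts_with_retries(get_timeline_page, creator_id, cursor,
                           max_attempts=20, min_wait=5, max_wait=20):
    for attempt in range(1, max_attempts + 1):
        posts = get_timeline_page(creator_id, cursor=cursor)
        if posts:
            return posts  # success; the caller starts fresh for the next page
        wait = random.uniform(min_wait, max_wait)
        print(f"Empty timeline response (attempt {attempt}/{max_attempts}), "
              f"retrying in {wait:.0f}s")
        time.sleep(wait)
    return []  # give up on this cursor after max_attempts
```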

@LastInvoker

LastInvoker commented Sep 11, 2023

I think I created a workaround for the rate limiting.

Can you please upload the part where you made these changes?

@Bearded-Baguette

Bearded-Baguette commented Sep 11, 2023

Can you please upload the part where you made these changes?

Sure thing, I think you can check it out on my branch here. This is my first time forking a branch on GitHub, so please let me know if you can't get to it. There's also a pull request with the changes I made.

As a side note, I played around with increasing the number of attempts and the timer. 20 attempts at 5-20 second intervals is slow, but it was able to go through a page with content going back to late 2021 in a few hours.

@Avnsx
Owner

Avnsx commented Sep 11, 2023

Looks like the most efficient way to handle this would be a function that, before starting timeline downloads, measures whether a rate limit even exists for a specific creator and dynamically adjusts the wait time depending on the result.

After reading what you guys have said, I need to correct myself. Considering some of you need 70+ second wait timers, it would be more beneficial to just replicate whatever the Fansly website is doing, as it obviously allows instantly scrolling around and loading media. As I pointed out before, I've seen some newly introduced identifier/auth tokens that the timeline requests now carry. If I had to take a wild guess, they probably introduced a requirement for a JavaScript component which, inside a real browser, creates those tokens for each timeline request before it is sent, and that way the rate limiting is just not applied at all. Replicating that with Python and specific 3rd-party libraries would most likely get rid of the need to wait so long between requests.

Fansly devs, if you read this: I would be down to keep static 5-second timers between each request, but anything above that forces me into a proper replication, which will in turn load up your servers with requests again. Down for a gentleman's agreement that works for both sides? Keep in mind that even if I ceased service of this tool, someone else would re-create it (in fact, there are already multiple people actively maintaining scrapers for Fansly), so even for you it would be profitable to just stick with this. It's an average case of don't blame the player, blame the game 🫣

@melithine

melithine commented Nov 12, 2023

I think I created a workaround for the rate limiting. I used the sleep function created above and added retry attempts after each sleep. If the program fails to pull posts from a timeline, it will wait X seconds and then try to pull the same timeline again. It usually takes 5-8 attempts, but can sometimes take more. After it successfully pulls posts from the timeline, the number of retry attempts resets.

What about doing an incremental backoff timer based on the retry attempts? I.e., assuming the initial value is 1s for attempt 1, use 2s for attempt 2, 4s for attempt 3, 8s for attempt 4, etc. If someone set it to 5s, it would go 5s/10s/20s/40s/etc.
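
Sketched out, reusing the same hypothetical get_timeline_page helper as above:

```python
# Sketch of the suggested doubling backoff between retries.
import time

def backoff_retry(get_timeline_page, creator_id, cursor,
                  base_delay=5.0, max_attempts=8):
    delay = base_delay
    for _ in range(max_attempts):
        posts = get_timeline_page(creator_id, cursor=cursor)
        if posts:
            return posts
        time.sleep(delay)
        delay *= 2  # 5s, 10s, 20s, 40s, ...
    return []  # give up after max_attempts
```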
