
Better http throttling post 429s #51

Closed
zw5 opened this issue Mar 3, 2020 · 2 comments
Assignees
Labels
bug Something isn't working

Comments


zw5 commented Mar 3, 2020

When I do something like:

tasks = []
for song in music_files:
    tasks.append(asyncio.create_task(client.search(
        cleanup_name(song.name),
        types=["track"])))
done = await asyncio.gather(*tasks)

And I get rate limited, the client, instead of waiting, just tries requesting and exits with error code 429 (Too Many Requests), when it should wait for the rate limit to pass.

@mental32 mental32 added the bug Something isn't working label Mar 7, 2020
@mental32 mental32 self-assigned this Mar 7, 2020
Owner

mental32 commented Mar 7, 2020

Thanks for reporting this!

I'll look into the issue now and keep this thread updated.

Owner

mental32 commented Mar 7, 2020

Right, so I've investigated the issue and have a possible fix ready.
But first I'd like to address a couple of points.

instead of waiting, just tries requesting and exits with error code 429 (Too Many Requests), when it should wait for the rate limit to pass.

The underlying HTTP client does actually stall the request, waiting for the rate limit to pass. The issue here was that the stall is applied only to that one request and not to all requests client-wide, resulting in a form of head-of-line blocking.

Additionally, due to the large number of requests made in a short period of time, which the library did not anticipate in the original design of the stalling mechanism, requests could essentially start "choking" each other out: one request finishes its rate-limit stall only to be rate limited again immediately because a sibling request has taken its slot. The exception you received was the result of ten consecutive request attempts all being rate limited.
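
For illustration, here is a minimal sketch (not the library's actual code) of a per-request stall loop like the one described above; the session object, the MAX_RETRIES constant, and the Retry-After handling are all assumptions made for the example:

import asyncio

MAX_RETRIES = 10  # assumption: mirrors the ten consecutive failures mentioned above

async def request_with_stall(session, method, url, **kwargs):
    # Each coroutine waits out its own 429 independently; sibling requests
    # keep firing and can re-trigger the rate limit the moment this stall
    # ends, which is the "choking" behaviour described above.
    for attempt in range(MAX_RETRIES):
        resp = await session.request(method, url, **kwargs)
        if resp.status != 429:
            return resp
        retry_after = int(resp.headers.get("Retry-After", 1))
        await asyncio.sleep(retry_after)
    raise RuntimeError("rate limited on %d consecutive attempts" % MAX_RETRIES)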

When I do something like:
SNIP
And I get rate limited

Unfortunately the API response does not tell us how many requests we have remaining, which is what we would need to throttle client-side and avoid the rate limit altogether. The response only tells us whether or not we have been rate limited, so the library can only wait the rate limit out, not avoid it.

If you do not want to hit the rate limit in the first place, you must throttle your own requests manually (this can be as simple as sleeping for a short time between requests) or use multiple client applications, since rate limits are applied on a per-client-app basis (although I do not know whether Spotify endorses this).
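
As a rough example of that manual throttling (the names client, music_files, and cleanup_name are taken from your snippet; the concurrency cap and the sleep interval are arbitrary values for illustration, not numbers the library recommends):

import asyncio

semaphore = asyncio.Semaphore(4)  # at most 4 searches in flight at once

async def throttled_search(song):
    async with semaphore:
        result = await client.search(cleanup_name(song.name), types=["track"])
        await asyncio.sleep(0.2)  # small gap between requests
        return result

done = await asyncio.gather(*(throttled_search(song) for song in music_files))

The semaphore bounds how many searches run concurrently and the sleep spaces them out; tuning either value trades wall-clock time against the chance of hitting the limit.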

Right, so I've investigated the issue and have a possible fix ready.

I was a little reluctant to use the patch as it stands since it feels inefficient (I performed 4096 search queries in four minutes with over 30 rate limits, and I suspect it is possible to reduce either the time taken or the number of rate limits triggered, but probably not both), but I think it is acceptable to release it in 0.8.6 temporarily and then supersede it with 0.8.7 as quickly as possible.

The fix that I've hacked together uses a barrier to stall all requests uniformly, controlled by the first request that gets rate limited. Unfortunately, once the barrier is lifted it's essentially a mad-max, free-for-all zerg rush for all the stalled requests, which leads to more frequent rate-limiting responses and feeds a stall, release, stall cycle until the number of requests being made drops enough for the API to recover from the client's madness.
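
To illustrate the idea (a rough sketch of the approach, not the actual patch), the barrier could look something like an asyncio.Event that the first rate-limited request clears and later sets again:

import asyncio

class ThrottleBarrier:
    def __init__(self):
        self._gate = asyncio.Event()
        self._gate.set()  # open by default

    async def wait(self):
        # Every request awaits this before touching the network.
        await self._gate.wait()

    async def hold_for(self, seconds):
        # Called by the first request that receives a 429.
        if self._gate.is_set():
            self._gate.clear()
            await asyncio.sleep(seconds)
            self._gate.set()  # release: every stalled request rushes out at once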

Additionally, that doesn't remove the possibility of the entire pipeline being choked out and the same exception being raised; it merely delays it as long as possible. To combat this I've added a fallback stall around the entire process, so when a request does get choked out the stalling falls back to an exponential backoff algorithm supplied by the new backoff dependency.
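
A minimal sketch of that fallback layer, assuming a hypothetical RateLimitedError for the choked-out case (the real exception name in the library may differ):

import asyncio
import backoff

class RateLimitedError(Exception):
    """Raised when a request is choked out after repeated 429s (hypothetical name)."""

# backoff.on_exception works with coroutines, so the whole batch simply
# retries with exponentially growing delays whenever the pipeline chokes.
@backoff.on_exception(backoff.expo, RateLimitedError, max_tries=8)
async def search_all(client, names):
    return await asyncio.gather(
        *(client.search(name, types=["track"]) for name in names)
    )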

Ideally the two systems would be merged into one to avoid unnecessary overhead and to allow some sort of slow bleed of requests back into the HTTP session, but I'm not sure how to tackle that yet and am experimenting in the dark for the time being.

@mental32 mental32 changed the title Problem with asyncio.gather() Better http throttling post 429s Mar 8, 2020
@mental32 mental32 pinned this issue Mar 8, 2020
@mental32 mental32 unpinned this issue Dec 4, 2020