
Implemented bucket algorithm for handling bursts / simultaneous requests #373

Merged 15 commits into ccxt:master on Oct 23, 2017

Conversation

@samholt (Contributor) commented Oct 23, 2017

Implemented the token bucket algorithm (https://en.wikipedia.org/wiki/Token_bucket) for handling bursts of simultaneous requests. Implemented in Python 3, specifically for asyncio. I have written tests in test_asyncio.py: when an exchange has no bulk load-tickers method, the test loads all of the tickers individually, exercising this mechanism.

Works for all exchanges in Python 3. I noticed a bug that causes Python 2 to break in tests: running python2 test/test.py ends up calling functions inside test/test_asyncio.py, which needs further analysis to properly separate these function namespaces. However, the token bucket algorithm implementation itself is sound.

Do let me know what we need to change to successfully merge this into master. Best, Sam.

@kroitor kroitor self-assigned this Oct 23, 2017
@kroitor (Member) commented Oct 23, 2017

This is awesome! Respect, man! I'm going to spend some time merging this, will let you know on the progress!

@kroitor (Member) commented Oct 23, 2017

Reference: #297

```python
# print('Waiting for tokens: Exchange: {0}'.format(self.id))
self.add_new_tokens()
seconds_delays = [0.01, 0.1, 0.7, 1, 1.5, 2]
delay = random.choice(seconds_delays)
```
@kroitor (Member) commented on the diff:

@samholt, these two lines are not very clear to me, namely, the choice for 0.01, 0.1, 0.7, 1, 1.5 and 2. I can see that you are randomizing the delays here, but I'm not sure why. Can you please elaborate on that?

@samholt (Contributor, Author) replied Oct 23, 2017:

Certainly: this comes from CS/EE theory on how routers and Ethernet scale shared bandwidth toward the maximal optimal rate for the resources. The Ethernet protocol, or similarly the IEEE 802.3 CSMA/CD standard, implements an exponential backoff algorithm (https://en.wikipedia.org/wiki/Exponential_backoff) to avoid multiple simultaneous accesses to the same resource (a collision); here it prevents hundreds of asyncio tasks from checking for newly available tokens at the same time. In theory, the randomized delays should follow a process like https://en.wikipedia.org/wiki/Exponential_backoff#Example_exponential_backoff_algorithm; the set above is a crude approximation that collapses steps 1-3 into a single random choice. On second thought, the delays in the set should be exponentially distributed: assuming a minimum delay of 1 ms, a maximum delay of 500 ms, and a set of 5 values, that gives [0.001, 0.005, 0.022, 0.106, 0.5].

```python
import numpy as np

# five log-spaced (exponentially distributed) delays between 1 ms and 500 ms
# equivalently: np.geomspace(0.001, 0.5, 5)
x = np.linspace(np.log(0.001), np.log(0.5), 5)
y = np.exp(x)

for i in y:
    print(round(i, 3))  # 0.001 0.005 0.022 0.106 0.5
```

```python
# print('Adding new tokens: Exchange: {0}'.format(self.id))
now = time.monotonic()
time_since_update = now - self.rateLimitUpdateTime
new_tokens = math.floor((0.8 * 1000.0 * time_since_update) / self.rateLimit)
```
@kroitor (Member) commented on the diff:

... and this one, the 0.8 magic constant looks mysterious ) I may be missing something here %)

@samholt (Contributor, Author) replied Oct 23, 2017:

Initially, the new-token rate was meant to be calculated dynamically, based on how many tokens are handed out up front, so that over a fixed time span (e.g. 60 seconds) the bucket algorithm sends the same number of requests as a single delayed poller issuing requests periodically (with a period of rateLimit). That way it stays under the rate limit while still supporting bursts of requests over time, for optimal request rates.

The theory is as follows:

  1. A traditional linear rate-limit poller with rateLimit = 1.5 seconds will, over a period of, say, T = 60 seconds, send 60/1.5 = 40 requests; here each request is represented by a token.
  2. A dynamic rate limiter must still send 40 requests or fewer over the same period T = 60 seconds. So if the first burst takes 16 requests, that leaves 40 - 16 = 24 requests, which should be added to the token bucket linearly over the period T, i.e. a new token/request every 60/24 = 2.5 seconds. This ensures the original rate limit is met while allowing bursts of requests over short periods of time.
    Thus new_tokens = (1/2.5) * time_since_update.

These simple equations can be represented as:

[screenshot of the handwritten equations, 2017-10-23; the derivation is restated below]

However, upon calculating this for a typical exchange with the rateLimit defined above, we can express the new-token rate as a fraction of the old rate limit, i.e. 1.5/2.5, which can be factored out to yield:

new_tokens = (1.5/2.5) * (1/1.5) * time_since_update
new_tokens = 0.6 * (1/rateLimit) * time_since_update
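
A quick numeric sanity check of the derivation, in plain Python, using the example figures above:

```python
rateLimit = 1.5                        # seconds per request for a linear poller
T = 60.0                               # accounting window in seconds
budget = T / rateLimit                 # 40 requests allowed per window
burst = 16                             # tokens spent in the initial burst
refill_period = T / (budget - burst)   # 60 / 24 = 2.5 s per new token
factor = rateLimit / refill_period     # 1.5 / 2.5 = 0.6
print(refill_period, factor)           # 2.5 0.6
```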

So with the constants defined above we should take approximately 0.6, or 60%, of the linear rate limit. Although this works, I found after repeated tests that exchanges can be throttled faster than this without returning DDoS errors. The 0.8, i.e. 80% of (1/rateLimit), was one of the highest rates at which exchanges could still be throttled without reporting DDoS protection or a rate-limit error. Keen to hear everyone's views on this, especially on whether we should stick to the lower limits, what time period T exchanges usually calculate their rate limits over, and whether we should always try to maximise the rate at which we send requests.
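
For readers following along, here is a minimal self-contained sketch of the mechanism under discussion. The attribute and method names (add_new_tokens, rateLimitTokens, rateLimitMaxTokens, rateLimitUpdateTime) follow the PR discussion; the class wrapper and exact bodies are illustrative, not the merged code:

```python
import asyncio
import math
import random
import time

class Throttler:
    def __init__(self, rateLimit, maxTokens=16):
        self.rateLimit = rateLimit          # ms between requests for a plain linear poller
        self.rateLimitTokens = maxTokens    # the bucket starts full: this is the initial burst
        self.rateLimitMaxTokens = maxTokens
        self.rateLimitUpdateTime = time.monotonic()

    def add_new_tokens(self):
        now = time.monotonic()
        time_since_update = now - self.rateLimitUpdateTime
        # refill at 80% of the linear rate, per the 0.8 constant discussed above
        new_tokens = math.floor((0.8 * 1000.0 * time_since_update) / self.rateLimit)
        if new_tokens > 0:
            self.rateLimitTokens = min(self.rateLimitTokens + new_tokens,
                                       self.rateLimitMaxTokens)
            self.rateLimitUpdateTime = now

    async def throttle(self):
        # randomized, exponentially spaced waits while the bucket is empty,
        # so many tasks don't all poll for tokens at the same instant
        while self.rateLimitTokens < 1:
            self.add_new_tokens()
            await asyncio.sleep(random.choice([0.001, 0.005, 0.022, 0.106, 0.5]))
        self.rateLimitTokens -= 1
```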

```python
# For some exchanges, tickers cannot be fetched for all of the symbols
symbols_to_load = [symbol for symbol in exchange.symbols if '.d' not in symbol]
if exchange.id == 'bitmex':
    symbols_to_load = ['BTC/USD', 'B_BLOCKSZ17', 'DASHZ17', 'ETC7D', 'ETHZ17', 'LTCZ17', 'XBJZ17', 'XBTZ17', 'XMRZ17', 'XRPZ17', 'XTZZ17', 'ZECZ17']
```
@kroitor (Member) commented Oct 23, 2017:

Unfortunately, this line for Bitmex will break the test upon monthly contract expiration. Basically, their tickers change continuously, so we have to make another workaround for it. But, this isn't a big problem, will fix it anyway.

@samholt (Contributor, Author) replied:

Interesting, yes. It was just a quick fix, as errors were occurring when attempting to fetch individual tickers for each symbol contained in bitmex.symbols; it seems some of them are not supported, yet still listed? Keen to see your fix :)

```python
elif exchange.id == 'virwox':
    symbols_to_load = [symbol for symbol in symbols_to_load if symbol != 'CHF/SLL']

input_coroutines = [exchange.fetchTicker(symbol) for symbol in symbols_to_load]
tickers = await asyncio.gather(*input_coroutines)
```
@kroitor (Member) commented Oct 23, 2017:

Even though sending 16 requests at once to fetch tickers can work, it will most likely result in a ban from Bittrex or Kraken, if you try doing the same with private requests and with L2 orderbooks continuously...

@samholt (Contributor, Author) replied:

Interesting. It sends 16 requests, then allows a new request at slightly less than the rate limit, to make sure the specified rate limit for the exchange is met over a fixed time period. Looking at the Bittrex API documentation, I can't readily see any information about their API rate limit (https://bittrex.com/home/api)? Multiple tests against their API using this approach seem to work, however long polling of continuous order books is yet to be tested (I plan to add a test that loads all order books for all exchanges). Yes, I agree about Kraken in its current implementation; we can make it work for Kraken by specifically giving it a lower starting 'rateLimitMaxTokens': 8 and 'rateLimitTokens': 8.
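
For example, a per-exchange override could look like this (a sketch: the constructor-override style is assumed here, the two token settings are the ones named above):

```python
import ccxt.async as ccxt  # module path at the time of this PR

# assumed override style: pass the token-bucket settings into the constructor
kraken = ccxt.kraken({
    'rateLimitMaxTokens': 8,  # smaller bucket, so a smaller initial burst
    'rateLimitTokens': 8,     # start with 8 tokens instead of the default 16
})
```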

@kroitor (Member) replied Oct 23, 2017:

> It sends 16 requests, then allows a new request at slightly less than the rate limit, to make sure the specified rate limit for the exchange is met over a fixed time period.

Yep, and if those 16 requests are private requests (fetching balance, placing and canceling orders, fetching orders/trades history), those will fail most of the time, because practically none of the exchanges tolerate that rate.

> Looking at the Bittrex API documentation, I can't readily see any information about their API rate limit?

Yes, they don't have it documented, but it has been reverse-engineered to be approximately 1 request per second. They don't want to be transparent, I don't really know why. Liqui will also complain when queried at rates higher than one or two requests per second. And it takes Kraken approx 10 (!) seconds to process an order, and it has been so for the past few weeks, so I guess sending 16 of them at once is not an option ) I am still adding some minor edits here and there to complete the merge, but most of it is already there ;)

@samholt (Contributor, Author) replied:

Very good, agreed. Perhaps we should limit the throttle in asyncio to public calls only, and keep the old rate-limit delay structure for private API calls? How do we solve this?

@kroitor (Member) replied Oct 23, 2017:

I would say, we are fine. It's just that I made that functionality optional, so users may or may not enable it at their discretion depending on their use case. So if they design a poller application for alerts or something, they can go ahead and turn it full on (no warranties though). And if they want to do trading, then they can leave it off until we come up with a generic solution.

@samholt (Contributor, Author) replied:

Nice, very good.

@kroitor (Member) commented Oct 23, 2017

So, after having a deeper look into it, apart from the above, my thoughts are:

  1. It does a simple-case job, but it doesn't actually rate-limit the requests: it sends the first burst right away (16 requests by default). If you try sending 16 private requests with it, those will fail. We also need to control the cost per request, while the current implementation treats it as a constant (-1 token per call). This will fail on Kraken private requests, because their call costs depend on the type of the call: a ledger call costs -3, whereas a simple balance call costs -1.
  2. This implementation has a known bug (also present in the current implementation of the poller in JS): private requests are not signed right before being sent, but before throttling, so most of them have outdated nonces by the time they are sent. That leads to failures.
  3. As this functionality is still experimental, I'm adding an "if enabled" conditional to it; otherwise it would be on by default for everyone, whereas some people may not want it just yet (see the sketch below). We will switch it to default=on when we have a solution to 1 and 2.
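
In usage terms, the opt-in from point 3 looks something like this (a sketch, assuming the switch is exposed as a constructor option; the flag name follows what ccxt later documented as enableRateLimit):

```python
import ccxt.async as ccxt  # module path at the time of this PR

# the experimental token-bucket throttle stays off unless explicitly enabled
exchange = ccxt.bittrex({
    'enableRateLimit': True,
})

# with it off, callers keep pacing requests themselves, e.g.:
# await asyncio.sleep(exchange.rateLimit / 1000)
```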

@kroitor (Member) commented Oct 23, 2017

Now merging it into the master, thx again!

@samholt (Contributor, Author) commented Oct 23, 2017

Great, thank you. I'm working with CCXT full time, so please let me know of any other features I can implement. I'm very interested in calculating arbitrage from the order books, so I will write a test that fetches the L2 order books simultaneously for all exchanges.

Re: Above,

  1. We can add another flag for particular exchanges that have a cost associated with each request, hasRequestCostsCalculation, and then use the counter to avoid hitting the rate limit for those exchanges? (Sketched below.)

  2. Can we refactor to sign after throttling the request, to keep the nonces up to date?

Let me know your thoughts :)
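
To make point 1 concrete, here is a hedged sketch of cost-aware token accounting layered on the Throttler sketch from earlier (the cost parameter and the per-call values are hypothetical, following the Kraken ledger-vs-balance example above):

```python
# hypothetical extension of the Throttler sketch above
async def throttle(self, cost=1):
    # block until the bucket holds enough tokens to pay for this call
    while self.rateLimitTokens < cost:
        self.add_new_tokens()
        await asyncio.sleep(random.choice([0.001, 0.005, 0.022, 0.106, 0.5]))
    self.rateLimitTokens -= cost

# e.g. a Kraken ledger call would pass cost=3, a simple balance call cost=1
```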

@kroitor (Member) commented Oct 23, 2017

@samholt

> 1. We can add another flag for particular exchanges that have a cost associated with each request, hasRequestCostsCalculation, and then use the counter to avoid hitting the rate limit for those exchanges.

It's exactly what we should do :)

> 2. Can we refactor to sign after throttling the request, to keep the nonces up to date?

Yep. We have been doing some preparation work to make that possible (and it's done): we had to separate the sign() method from the request() method. So, if you look at line 655 in test/test_async.py, from where sign() is called now, we should move that call to after the await for the token, but before proxies, etc... The fetch2 call was added temporarily as an intermediary for that, but it should be deleted at the end of this migration. And we also have to pass all of the necessary args for sign() along the way. This may require adding a queue of requests or something...
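
In sketch form, the target ordering being described would be something like this (simplified and hypothetical; the real fetch2 carries more arguments, plus the proxy handling mentioned above):

```python
# sign after throttling, so the nonce is generated as late as possible
async def fetch2(self, path, api='public', method='GET', params={}, headers=None, body=None):
    await self.throttle()  # wait for a token first...
    # ...and only then sign, so the nonce is still fresh when the request goes out
    request = self.sign(path, api, method, params, headers, body)
    return await self.fetch(request['url'], request['method'], request['headers'], request['body'])
```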

... In fact, we thought maybe we should move it out of ccxt, because that functionality can be detached very well into a separate lib with a single configurable throttle() method. We want to write synthetic tests for the poller without hitting the internet first, and that would be much more convenient to do in a standalone repo. We have actually started working on a separate JS/Py lib for throttling, to implement the poller in JS and Py in a "non-leaky-bucket" fashion, and I was going to do the initial commit very soon, maybe in a day or two at most. I'll let you know as soon as I make the repo available, and you are welcome to contribute to it as well. I am still going to merge this PR, almost done with that. But then we are thinking of splitting some parts of ccxt into smaller separate libs; this is in our plans for the ccxt v2 refactoring anyway. We also have a multitude of tasks yet to solve, namely the other open issues and the list of pending tasks in the CONTRIBUTING doc... So much to do! )

@kroitor kroitor merged commit 34e5789 into ccxt:master Oct 23, 2017
@kroitor (Member) commented Oct 23, 2017

I merged it, but, for some reason GitHub is not showing my commits to this branch here.
So, if you're interested how it was merged: https://github.com/ccxt-dev/ccxt/commits/master ← all commits starting from 1.9.231 are related to this PR.

@samholt (Contributor, Author) commented Oct 23, 2017

Excellent, I can see the merge, thank you :) I've answered all your questions in detail above. Keen to change the random delays to [0.001, 0.005, 0.022, 0.106, 0.5] (see the reasoning above; in short, it's better because these delays are exponentially spaced). Next I will write a test that pulls in the L2 order books for all the exchanges; do DM me with any suggestions you might have, or with how you would like it done :)

kroitor added a commit that referenced this pull request Oct 23, 2017
samholt added a commit to samholt/ccxt that referenced this pull request Oct 23, 2017
…exponential backoff random selection times (in seconds) and lowering Krakens starting tokens, to avoid being rate limited on Kraken
@kroitor kroitor changed the title Implemented bucket algorithm for handling burts / simultaneous requests Implemented bucket algorithm for handling bursts / simultaneous requests Oct 28, 2017
@kroitor kroitor mentioned this pull request Jan 21, 2018
academe-01 pushed a commit to academe-01/ccxt that referenced this pull request Oct 2, 2022