Datanode API rate limiting #7445

jeremyletang · 2023-01-27T13:51:13Z

As of today, the datanode API only rate limits the number of subscriptions open per IP. Unary queries (GraphQL, REST or gRPC) are not rate limited.

We want to implement rate limiting on these 3 APIs per IP address. Because the GraphQL API uses the gRPC API, and because it is slightly more difficult to control the number of requests made by the GraphQL API to the gRPC API, rate limiting needs to be accounted separately for the GraphQL and gRPC endpoints (to be set in the configuration, e.g. 10 GraphQL requests per second, 20 gRPC requests per second).

So that a request is not rate limited twice, all requests to the gRPC API coming from the GraphQL layer will need to embed metadata saying so; these requests are then excluded from the gRPC rate limiting.
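
One way to picture the tagging, as a rough sketch only: the GraphQL layer adds a marker to the metadata of its outgoing gRPC calls, and the gRPC limiter skips any request carrying it. The `Metadata` type stands in for gRPC request metadata, and the key `x-via-graphql` and both function names are hypothetical, not taken from the datanode code.

```go
package main

import "fmt"

// Metadata stands in for gRPC request metadata (string keys, multiple values).
type Metadata map[string][]string

// viaGraphQLKey is a hypothetical key; the issue does not name one.
const viaGraphQLKey = "x-via-graphql"

// MarkFromGraphQL returns a copy of md tagged as originating from the GraphQL layer.
func MarkFromGraphQL(md Metadata) Metadata {
	out := Metadata{}
	for k, v := range md {
		out[k] = append([]string(nil), v...)
	}
	out[viaGraphQLKey] = []string{"true"}
	return out
}

// CountsTowardGRPCLimit reports whether a request should be counted by the
// gRPC limiter. Requests already counted by the GraphQL limiter are skipped.
func CountsTowardGRPCLimit(md Metadata) bool {
	vals := md[viaGraphQLKey]
	return !(len(vals) > 0 && vals[0] == "true")
}

func main() {
	external := Metadata{"user-agent": {"grpc-go"}}
	internal := MarkFromGraphQL(external)
	fmt.Println(CountsTowardGRPCLimit(external))
	fmt.Println(CountsTowardGRPCLimit(internal))
}
```

Since a plain metadata marker could also be set by an outside client, a real implementation would presumably need to strip or authenticate it for requests arriving from the network.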

A user who exceeds the maximum number of requests per second will be rate limited. A breach occurs as soon as, within one second:
(number of graphql requests for IP / max graphql requests per second + number of grpc requests for IP / max grpc requests per second) > 1
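
The breach condition above could be sketched as follows, using the example limits of 10 GraphQL and 20 gRPC requests per second. The `Limits` type and its field names are hypothetical, and the sketch assumes both limits are positive.

```go
package main

import "fmt"

// Limits holds the per-second maxima an operator would configure.
// The field names are hypothetical, not the datanode's config keys.
type Limits struct {
	MaxGraphQLPerSecond float64
	MaxGRPCPerSecond    float64
}

// Breached reports whether an IP's combined usage within one second exceeds
// the shared budget: graphql/maxGraphQL + grpc/maxGRPC > 1.
// Both limits are assumed to be positive.
func (l Limits) Breached(graphqlCount, grpcCount int) bool {
	usage := float64(graphqlCount)/l.MaxGraphQLPerSecond +
		float64(grpcCount)/l.MaxGRPCPerSecond
	return usage > 1
}

func main() {
	limits := Limits{MaxGraphQLPerSecond: 10, MaxGRPCPerSecond: 20}
	fmt.Println(limits.Breached(5, 10)) // 0.5 + 0.5 = 1.0: not a breach
	fmt.Println(limits.Breached(5, 11)) // 0.5 + 0.55 = 1.05: breach
	fmt.Println(limits.Breached(11, 0)) // 1.1 + 0 = 1.1: breach
}
```

With this formula a client that uses half of its GraphQL allowance has only half of its gRPC allowance left for that second.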

A user breaching the rate limit is banned immediately, with an exponentially increasing ban duration. All following requests are rejected with an appropriate error message. Once the ban is lifted, for a period equal to the previous ban duration, any breach of the rate limit will trigger a ban of increased duration.
Example:

Rate limit breached at 1PM: initial ban period of 10 minutes
User can start using the API again at 1:10PM.
Until 1:20PM, any new breach will incur an extended ban of 1h instead of 10 minutes, etc.
After 1:20PM, any breach will incur a ban of 10 minutes again.

The APIs should also return in some way (headers, or metadata for GraphQL) the number of requests allowed for the client in the current rate-limiting scope.

Rate limiting should be optional on the datanode and configured by the operator, who may disable it entirely.

ettec · 2023-01-27T13:56:54Z

Only skimmed this (https://www.imperva.com/learn/application-security/rate-limiting/#:~:text=Rate%2Dlimiting%20solutions%20work%20by,made%20in%20a%20set%20timeframe.), but it is worth whoever picks this up having a read around best practice. For example, we may want to limit not just on the number of calls but on how long the calls take to execute in total (total time used in a given period). This would make sense for datanode, as some calls are much more expensive than others: the time taken to execute is a good approximation of the cost incurred by datanode (or at least an indicator that it is under load, which may in itself be a useful way to limit calls). Perhaps too fancy for the first cut, but may be worth some thought.

MM0819 · 2023-01-27T13:59:20Z

It would also be helpful to include in the response headers how many requests I have left within the fixed time period, so that I can avoid hitting the rate limit without having to track that myself as a client. For example, something similar to this:

https://www.bitmex.com/app/restAPI#Request-Rate-Limits

ettec · 2023-01-27T14:17:35Z

Worth a placeholder ticket to differentiate the malicious case and capture proto thoughts on that?

davidsiska-vega · 2023-01-27T14:57:04Z

It would be also good to make this configurable switch-off-able or e.g. not rate-limit queries coming from localhost to data node (if that makes sense) - anyway I want to not break the null-chain / vega-market-sim use case of the data node.
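
A localhost exemption of the kind suggested here could be sketched as below. The function name is hypothetical, and the input is assumed to be a remote address in the `host:port` form that Go's `http.Request.RemoteAddr` uses.

```go
package main

import (
	"fmt"
	"net"
)

// ExemptFromRateLimit reports whether remoteAddr is a loopback address and
// could therefore skip rate limiting. Hypothetical helper for illustration.
func ExemptFromRateLimit(remoteAddr string) bool {
	host, _, err := net.SplitHostPort(remoteAddr)
	if err != nil {
		host = remoteAddr // no port present, treat the whole string as the host
	}
	ip := net.ParseIP(host)
	return ip != nil && ip.IsLoopback()
}

func main() {
	fmt.Println(ExemptFromRateLimit("127.0.0.1:3008"))   // loopback IPv4
	fmt.Println(ExemptFromRateLimit("[::1]:3008"))       // loopback IPv6
	fmt.Println(ExemptFromRateLimit("203.0.113.7:3008")) // remote client
}
```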

jeremyletang · 2023-01-27T14:57:51Z

It would be also good to make this configurable switch-off-able or e.g. not rate-limit queries coming from localhost to data node (if that makes sense) - anyway I want to not break the null-chain / vega-market-sim use case of the data node.

Yes I forgot to mention it, but this should be optional indeed. I'll update the ticket.

davidsiska-vega · 2023-01-27T15:51:54Z

but this should be optional indeed

Indeed, another use case is me running my own data-node for my own trading bots; I don't want to rate limit those and I'll simply close all the connections to the outside world.

jeremyletang added validator performance datanode labels Jan 27, 2023

jeremyletang added this to the 🤠 🤸 OT Stretch milestone Jan 27, 2023

gordsport assigned pscott31 Feb 1, 2023

gordsport mentioned this issue Feb 7, 2023

Add ACs around the data node rate limiting vegaprotocol/specs#1578

Closed

pscott31 mentioned this issue Feb 8, 2023

feat: add API rate limiting #7545

Merged

gordsport added the sim-2 label Feb 9, 2023

pscott31 closed this as completed in #7545 Feb 10, 2023

gordsport added the breaking-change label Feb 16, 2023
