Navigation Menu

Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Datanode API rate limiting #7445

Closed
jeremyletang opened this issue Jan 27, 2023 · 6 comments · Fixed by #7545
Closed

Datanode API rate limiting #7445

jeremyletang opened this issue Jan 27, 2023 · 6 comments · Fixed by #7545

Comments

@jeremyletang
Copy link
Member

jeremyletang commented Jan 27, 2023

As of today, the datanode API do rate limite only the number of subscribtion open per IPs. Unary ueries (graphql, rest or GRPC) are not rate limited.

We want to implement rate limiting on these 3 API per IP address. Because the graphl API uses the grpc API, and because it is sligltly more difficult to control the amount of request done by the graphql API to the grpc API, the rate limiting needs to be accounted separately for graphql, and grpc endpoint (to be set in the configuration. e.g: 10 graphql request per second , 20 grpc request per second).

In order for a request to not be rate limited twice, all request to the grpc API coming from the graphql layer will need to embed a metadata saying so, all these request are then excluded from the grpc rate limiting.

A user, which break the maximum number of request per seconds will be rate limited. A breach is valid as soon as in a second:
(number graphql request for ip / max graphql request per second + number of grpc request fro ip / max grpc request per second) > 1

A user breaching the rate limit, is banned exponentially applicable immediately. All following request are rejected with an appropriate error message. Once the ban period is lifted, for the duration of the previous ban period any breach will of the rate limite will provoke a ban period of an increase duration.
Example:

  • Rate limit breached at 1PM: initial ban period of 10 minutes
  • User can start using the API again at 1:10PM.
  • until 1:20PM, any new breach will incure a ban of an extended time of 1h instead of 10 minutes, etc.
  • after 1:20PM, any breach wil incure a ban of 10 minutes again.

The APIs should also return in some ways (headers, or metada for graphql) the amount of request allowed by the client in the current rate limited scope.

The rate limiting should be optional on the datanode, and configured by the operator to eventually be totally disabled.

@ettec
Copy link
Contributor

ettec commented Jan 27, 2023

Only scanned this (https://www.imperva.com/learn/application-security/rate-limiting/#:~:text=Rate%2Dlimiting%20solutions%20work%20by,made%20in%20a%20set%20timeframe.) , but worth whoever picks this up having a read around best practice, for example we may wnat to limit not just on number of calls, but how long in total the calls take to execute, total time used in a given period, this would make sense for datanode for example possibly as some calls much more expenisvie than others - the time taken to execute is a good approx of the cost incurred by datanode (or at least an indicator its under load, which may in itself be a useful way to limit calls) - anyway - perhaps too fancy for the first cut, but may be worth some thought

@MM0819
Copy link
Contributor

MM0819 commented Jan 27, 2023

It woudl also be helpful to include in the response headers how many requests I have left within the fixed time period, so that I can prevent myself from hitting the rate limit but also don't need to track that myself as a client. For example, something similar to this:

https://www.bitmex.com/app/restAPI#Request-Rate-Limits

@ettec
Copy link
Contributor

ettec commented Jan 27, 2023

worth a place holder ticket to dfferentiate the malicious case and capture proto thoughts on that?

@davidsiska-vega
Copy link
Contributor

It would be also good to make this configurable switch-off-able or e.g. not rate-limit queries coming from localhost to data node (if that makes sense) - anyway I want to not break the null-chain / vega-market-sim use case of the data node.

@jeremyletang
Copy link
Member Author

It would be also good to make this configurable switch-off-able or e.g. not rate-limit queries coming from localhost to data node (if that makes sense) - anyway I want to not break the null-chain / vega-market-sim use case of the data node.

Yes I forgot to mention it, but this should be optional indeed. I'll update the ticket.

@davidsiska-vega
Copy link
Contributor

but this should be optional indeed

Indeed, another use case is me running my own data-node for my own trading bots; I don't want to rate limit those and I'll simply close all the connections to the outside world.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
Development

Successfully merging a pull request may close this issue.

6 participants