
Feature request: API rate limiting #6795

Closed
pires opened this issue May 29, 2019 · 9 comments

Comments

@pires

pires commented May 29, 2019

Problem

Without API rate limiting, Vault can be DoS'ed by poorly behaved, or ill-intentioned, clients. As far as we can tell, the limits are undetermined and invariably tied to the resources allocated to the machine where Vault is running and/or the configured ulimits.
We are aware there is a limit on request size, though.

In one of our scenarios, a poorly behaved client is running millions of short-lived jobs in parallel
(not that it's relevant, but this takes place atop Nomad), and said jobs request many secrets from Vault, which means millions of parallel requests to the service. Unfortunately, we, the server side of the team, can't do proper capacity planning for this kind of use case and can only try to:

  • Increase the resources allocated to Vault and tune any OS and Vault performance-related configurations;
  • Educate the client to reduce the number of jobs that run in parallel, or to do their own rate limiting - this has been a source of heated debates;
  • Deploy some man-in-the-middle, stop-gap solution that somehow protects Vault;
  • Contribute API rate-limiting functionality to Vault.

We want to explore the latter, and this issue is the second step in that direction. The first was this Twitter thread.

This issue relates to #336.

Possible solution(s)

API rate-limiter at the HTTP listener

This seems to be the cheapest thing to do. However, if Vault rate-limits the HTTP listener as a whole, we may shoot ourselves in the foot by rate-limiting the /sys endpoints too. Splitting into two or more listeners could help here.
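To make the listener-level idea concrete, here is a minimal sketch (not Vault code; all names are hypothetical) of a token-bucket limiter wrapping an http.Handler, so the whole listener shares one bucket:

```go
package main

import (
	"net/http"
	"sync"
	"time"
)

// tokenBucket is a minimal token-bucket limiter: holds up to `capacity`
// tokens and refills at `rate` tokens per second.
type tokenBucket struct {
	mu       sync.Mutex
	tokens   float64
	capacity float64
	rate     float64
	last     time.Time
}

func newTokenBucket(rate, capacity float64) *tokenBucket {
	return &tokenBucket{tokens: capacity, capacity: capacity, rate: rate, last: time.Now()}
}

// allow refills the bucket based on elapsed time, then tries to take one token.
func (b *tokenBucket) allow() bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	now := time.Now()
	b.tokens += now.Sub(b.last).Seconds() * b.rate
	if b.tokens > b.capacity {
		b.tokens = b.capacity
	}
	b.last = now
	if b.tokens >= 1 {
		b.tokens--
		return true
	}
	return false
}

// limit wraps a handler and rejects requests with 429 once the bucket is empty.
func limit(b *tokenBucket, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if !b.allow() {
			http.Error(w, "rate limit exceeded", http.StatusTooManyRequests)
			return
		}
		next.ServeHTTP(w, r)
	})
}
```

The drawback noted above shows up directly: since the bucket is shared by everything behind the listener, /sys traffic is throttled along with everything else.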

API rate-limiter at the request handler

This could be abstracted and called only on pre-defined handlers, e.g. all but /sys. However, this doesn't act at the listener level, so ulimits can still be exceeded and DoS can still occur, e.g. too many open files.

Considered alternative(s)

MITM stop-gap solution

One idea that crossed our minds was to put an HTTP proxy in front of the Vault instances and implement throttling there. Here's one example of how to do it with HAProxy. However, this could easily become redirect hell, unless the proxy understands which backend is the Vault leader and uses it all the time.
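For reference, the HAProxy approach typically uses a stick-table to track per-source request rates; a rough sketch of such a frontend (ports, names, and thresholds are illustrative only) could be:

```
frontend vault_fe
    bind *:8200
    stick-table type ip size 100k expire 30s store http_req_rate(10s)
    http-request track-sc0 src
    http-request deny deny_status 429 if { sc_http_req_rate(0) gt 100 }
    default_backend vault_be
```

This caps each client IP at roughly 100 requests per 10-second window, but it does nothing about the leader-discovery problem mentioned above.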

@jefferai
Member

Hi there,

We've had many customers requesting rate limiting, and we are actively working on its design. The challenge so far has been finding a design that doesn't impact customers with extremely latency-sensitive workloads.

It's likely to take place in two parts. The first is to ensure that pathological cases are addressed without affecting performance for all clients, by allowing user-defined hooks very deep in the request path. The second is more proactive request measuring, to do higher-level throttling at the listener/HTTP level only as necessary to preserve performance; turning this on would impact latency for all clients but provide more dynamic scaling.

I can't give a roadmap estimate yet, all I can say is that nothing will appear before 1.3.

@pires
Author

pires commented May 29, 2019

I started this issue to discuss design, as per @armon's response to my tweet (linked in the issue description above). I wouldn't have done it if I had known the outcome would be closing the issue just because you are working on it behind closed doors. You could have saved me the time and simply mentioned you are not interested in discussion. What a waste of my time :(

@jefferai
Member

Hi @pires,

Unfortunately the issue is that I don't yet know whether or not the eventual feature will be a part of Vault OSS or Enterprise. Since I don't know, I and the rest of the Vault team can't discuss design without an NDA in place. If you are a customer you can reach out through your support/TAM and we would be more than happy to have such a discussion!

@jefferai
Member

Hi @pires,

First: apologies for the whiplash! I didn't click on the Twitter thread initially since I figured all context was in the (very detailed, thanks for that) original post and thus didn't realize Armon had specifically asked you to file an issue about this until you mentioned it in your later comment.

I want to give some context on closing the issue -- we get a lot of feedback from mailing lists, customer requests, past issues, talking with users, and more, so it happens quite often that we get a request for a feature that is already in progress. Usually we let the user know the good news and close the ticket, since once there's any (OSS) code for the feature, it is being worked on in branches and PRs in the open. Generally users have been receptive to this, and in the cases where they've asked us to re-open an issue for tracking (e.g. to link to from a PR later), we've been happy to do so. Clearly it's not a good approach in all cases, and I especially didn't realize the extent to which you wanted to help with the design discussion (something that is quite rare among issue filers/feature requesters) -- I'm sorry about that. I'm happy to reopen this issue now for tracking and discussion.

Regarding OSS/Enterprise and NDAs: rate limiting is a fairly complicated problem and we have a lot of very specific requests from customers that we have to trade off with stringent performance requirements (often from the same customers). We actually believe that what Vault needs is a multi-faceted solution that includes leaky-bucket approaches like what you described along with deeper, more targeted capabilities specific to Vault's internal workings. This is likely to end up with aspects in both OSS and Enterprise. For instance, we're exploring whether, in order to handle these very specific requirements for our Enterprise customers, we can take advantage of Sentinel's flexibility to allow complex rule specifications. This would obviously require that particular aspect of a rate limiting solution to be part of the Enterprise offering, and discussion of the design of that aspect would take place under NDA, as they do for most unreleased features in commercial products.

Additionally, the HashiCorp Research team is also looking at the more general problem of maintaining QoS and avoiding system overload. We believe there is novel work we can do there and contribute to the literature. While the outcome of that work will hopefully span all of our open source products, we want to preserve our ability to publish the work. Hence in this very particular case there is some extra sensitivity around NDAs even for some features we're thinking of that we hope to be OSS. If you are interested in collaborating with us on the research project, we are happy to have that conversation offline.

Since, as an overall aspect of Vault, rate limiting is still in the design phase, I was trying to exercise an abundance of caution and gave the wrong impression that NDAs are required for all discussion. That is absolutely not the case, as is now hopefully clear, and I'm sorry for causing confusion there.

I'm more than happy to discuss the approaches you put in the original post -- that part of the solution I know for sure would be a part of our OSS and isn't dependent on Enterprise features or overlapping with our research team.

@jefferai jefferai reopened this May 30, 2019
@Vye

Vye commented Aug 21, 2019

I was just going through the Vault intro documentation and thought of the same scenario the OP is describing. Vault would likely be a core part of infrastructure, with many services fully dependent on it. I'm concerned about how easily a developer could neglect resource planning and create a Lambda that DoSes Vault.

Has anyone forced users to go through their API gateway to use Vault, so you have rate-limiting knobs to turn? That could give the /sys separation the OP mentioned.

@Constantin07

We are using Vault with the DynamoDB backend. Some clients/services occasionally create a flood of API requests to Vault (misbehaving), which consumes all the read/write units allocated to DynamoDB.
It would be nice to be able to specify an API request limit per IP, so that no single client can completely abuse the system by consuming all quotas.

Impact/Risk: at the moment a single client can "bring down" the Vault cluster.

@aphorise
Contributor

This may be related to #9651, which is similar to the X-Forwarded-For bug and feature request.

@shreyasmoolya09

Will Vault Agent reduce the rate-limiting dependency for now?

@vishalnayak
Member

Rate limiting is implemented under the Resource Quotas feature and was released with Vault 1.5.0.
https://www.vaultproject.io/docs/concepts/resource-quotas#rate-limit-quotas
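Per the resource-quotas docs linked above, a global rate-limit quota can be created through the sys/quotas API; the quota name and rate below are just examples:

```shell
# Create a global rate-limit quota of 500 requests per second
vault write sys/quotas/rate-limit/global-rate rate=500
```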
