Feature request: API rate limiting #6795
Comments
Hi there, we've had many customers requesting rate limiting and we are actively working on the design, but the challenge to this point has been ensuring we can find a design that doesn't impact those customers with extremely latency-sensitive workloads. It's likely to take place in two parts. One, and likely first, is to ensure that pathological cases are addressed without affecting performance for all clients, by allowing user-defined hooks very deep in the request path. Two is more proactive request measuring, to do higher-level throttling at the listener/HTTP level but only as necessary to preserve performance; turning this on would impact latency for all clients but provide more dynamic scaling. I can't give a roadmap estimate yet; all I can say is that nothing will appear before 1.3.
I started this issue to discuss design as per @armon's response to my tweet (linked in the issue description above). I wouldn't have done it if I knew the outcome would be closing the issue just because you are working on it behind closed doors. You could have saved me the time and simply mentioned you are not interested in discussion. What a waste of my time :(
Hi @pires, Unfortunately the issue is that I don't yet know whether or not the eventual feature will be a part of Vault OSS or Enterprise. Since I don't know, I and the rest of the Vault team can't discuss design without an NDA in place. If you are a customer you can reach out through your support/TAM and we would be more than happy to have such a discussion!
Hi @pires, First: apologies for the whiplash! I didn't click on the Twitter thread initially since I figured all context was in the (very detailed, thanks for that) original post, and thus didn't realize Armon had specifically asked you to file an issue about this until you mentioned it in your later comment.

I want to give some context on closing the issue -- we get a lot of feedback from mailing lists, customer requests, past issues, talking with users, and more, so it happens quite often that we get a request for a feature that is already in process. Usually we let the user know the good news and close the ticket, since at that point, once there's any (OSS) code for the feature, it is being worked on in branches and PRs in the open. Generally users have been receptive to this, and when in some cases they've asked us to re-open it for tracking (e.g. to link to from a PR later), we've been happy to do so. Clearly it's not a good approach in all cases, and I especially didn't realize the extent to which you wanted to help out with design discussion (something that is quite rare among issue filers/feature requestors) -- I'm sorry about that. I'm happy to reopen this issue now for tracking and discussion.

Regarding OSS/Enterprise and NDAs: rate limiting is a fairly complicated problem and we have a lot of very specific requests from customers that we have to trade off with stringent performance requirements (often from the same customers). We actually believe that what Vault needs is a multi-faceted solution that includes leaky-bucket approaches like what you described along with deeper, more targeted capabilities specific to Vault's internal workings. This is likely to end up with aspects in both OSS and Enterprise. For instance, we're exploring whether, in order to handle these very specific requirements for our Enterprise customers, we can take advantage of Sentinel's flexibility to allow complex rule specifications. This would obviously require that particular aspect of a rate limiting solution to be part of the Enterprise offering, and discussion of the design of that aspect would take place under NDA, as it does for most unreleased features in commercial products.

Additionally, the HashiCorp Research team is also looking at the more general problem of maintaining QoS and avoiding system overload. We believe there is novel work we can do there and contribute to the literature. While the outcome of that work will hopefully span all of our open source products, we want to preserve our ability to publish the work. Hence, in this very particular case, there is some extra sensitivity around NDAs even for some features we're thinking of that we hope to be OSS. If you are interested in collaborating with us on the research project, we are happy to have that conversation offline.

Since, as an overall aspect of Vault, rate limiting is still in the design phase, I was trying to exercise an abundance of caution and gave the wrong impression that NDAs are required for all discussion. That is absolutely not the case, as is now hopefully clear, and I'm sorry for causing confusion there. I'm more than happy to discuss the approaches you put in the original post -- that part of the solution I know for sure would be a part of our OSS and isn't dependent on Enterprise features or overlapping with our research team.
I was just going through the Vault intro documentation and thought of the same scenario the OP is describing. Vault would likely be a core part of infrastructure, with many services fully dependent on it. I'm concerned about how easily a developer could neglect resource planning and create a lambda that DoSes Vault. Has anyone forced users to go through their API gateway to use Vault so you have rate-limiting knobs to turn?
We are using Vault with the DynamoDB backend. Some of our clients/services occasionally (misbehaving) create a flood of API requests to Vault, which consumes all read/write units allocated to DynamoDB. Impact/risk: at the moment, a single client can "bring down" the Vault cluster.
May be linked and bear relation to #9651.
Will Vault Agent reduce the rate limiting dependency for now?
Rate limiting is implemented under the Resource Quotas feature and was released with Vault 1.5.0. |
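For anyone landing here via search: a rate limit quota is configured through the `sys/quotas/rate-limit` endpoint. A minimal sketch (the quota names and numbers below are illustrative choices, not defaults):

```shell
# Global quota: at most 1000 requests per second across the whole API
# (quota names like "global-rl" are arbitrary labels).
vault write sys/quotas/rate-limit/global-rl rate=1000

# Stricter quota scoped to a single mount via the optional `path` field.
vault write sys/quotas/rate-limit/kv-rl path="secret/" rate=100

# Inspect an existing quota.
vault read sys/quotas/rate-limit/global-rl
```

Requests rejected by a rate limit quota receive an HTTP 429 response.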
Problem
Without API rate limiting, Vault can be DoS'ed by less educated, or ill-motivated, clients. As far as we can understand, the limits are undetermined and are invariably tied to the resources allocated to the machine where Vault is running and/or the definition of `ulimits`. We are aware there is a limit on request size, though.
In one of our scenarios, a less educated client is running millions of short-lived jobs in parallel (not that it's relevant, but this takes place atop Nomad), and said jobs request many secrets from Vault, which means millions of parallel requests to said service. Unfortunately, we, the server side of the team, can't do proper capacity planning for this kind of use case and can only try and:
We want to explore the latter and this issue is the second step in that direction. The first was this Twitter thread.
This issue relates to #336.
Possible solution(s)
API rate-limiter at the HTTP listener
This seems to be the cheapest thing to do. However, if Vault rate-limits at the HTTP listener as a whole, we may shoot ourselves in the foot by rate-limiting the `/sys` endpoints too. Splitting into two or more listeners could help here.

API rate-limiter at the request handler
This could be abstracted and called only on pre-defined handlers, e.g. all but `/sys`. However, this doesn't act at the listener level, so `ulimits` can still be crossed and a DoS can still occur, e.g. too many open files.

Considered alternative(s)
MITM stop-gap solution
One idea that ran through our minds was to put an HTTP proxy in front of the Vault instances and implement throttling there. Here's one example of how to do it with HAProxy. However, this could easily become a redirect hell, unless the proxy understands which backend is the Vault leader and uses it all the time.