Limit request size #16011
Just a thought: I wonder if the default limit should be something like the default |
Thanks for your input @dadoonet! I am not sure about your suggestion on the check of |
ha! That is correct. I thought |
Ok, thanks for the clarification.
**Design Parameters**

After looking at the source, I have identified a few options which I'd like to discuss before implementing this feature. Below are some design parameters which are worth considering from my point of view.

**Type of endpoint**

A (bulk) request can hit the cluster via the following endpoints:
**Applicability of the limit**

This means whether we want to apply the limit for all types of requests or just for a limited number of explicitly defined request types (like bulk requests).

**When to check the limit**
**Request size calculation**

To know whether a request has (b)reached the limit, we need to calculate its size. Considering
**Proposal**

Based on the options above, I want to sketch a simple solution proposal: Considering that it is likely we want limit checks not only for bulk requests but also for similar ones, like multi-gets, we should not tie this too specifically to bulk requests. Hence, for each request type that should be size-limited, the corresponding

```java
public interface RequestSizeEstimator {
    int getRequestSizeInBytes();
}
```

We define one configurable request size limit (default e.g. 50MB) for all request types and implement limit breach detection in a high-level fashion as an

In summary, the pros and cons of this solution are:

Pros:
Cons:
Feedback is very much appreciated.
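To make the proposal above concrete, here is a hypothetical sketch of how the proposed `RequestSizeEstimator` interface could be used for a high-level limit check. Everything except the interface itself (the stub request type, the `exceedsLimit` helper, the constant name) is illustrative and not actual Elasticsearch code; 50MB is the default suggested in the proposal.

```java
public class RequestSizeLimitSketch {

    // The interface from the proposal: each size-limited request type reports its own size.
    public interface RequestSizeEstimator {
        int getRequestSizeInBytes();
    }

    // Illustrative stand-in for a size-limited request type such as a bulk request.
    public static class BulkRequestStub implements RequestSizeEstimator {
        private final byte[] payload;

        public BulkRequestStub(byte[] payload) {
            this.payload = payload;
        }

        @Override
        public int getRequestSizeInBytes() {
            return payload.length;
        }
    }

    // One configurable limit for all request types (default 50MB, per the proposal).
    public static final int DEFAULT_LIMIT_IN_BYTES = 50 * 1024 * 1024;

    // High-level breach detection: the check only needs the interface,
    // not the concrete request type.
    public static boolean exceedsLimit(RequestSizeEstimator request, int limitInBytes) {
        return request.getRequestSizeInBytes() > limitInBytes;
    }
}
```

The point of the interface is that the limit check stays independent of any particular request type; adding multi-get support would only mean implementing `getRequestSizeInBytes()` there too.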
Based on feedback I got by @bleskes, here's a revised approach: We limit the size of requests on protocol level during deserialization. We consider two cases:
Limiting on protocol level has a couple of advantages:
In the first step, we will implement an action-independent limit which applies to all actions, not just bulk actions. Transports that need to be considered are the Netty transport and the local transport (for testing purposes). HTTP is unaffected because we just stream data from the HTTP layer to the transport layer (no up-front allocation). We need to check whether circuit breakers are a feasible option to detect limit breaches.
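The idea of limiting on protocol level can be sketched as follows: the declared message length is inspected during deserialization, before the payload buffer is allocated, so an oversized request never occupies heap. The framing (a plain int length prefix) and all names here are assumptions for illustration only, not the actual Elasticsearch transport protocol.

```java
import java.io.DataInputStream;
import java.io.IOException;

public class ProtocolLimitSketch {

    private final long limitInBytes;

    public ProtocolLimitSketch(long limitInBytes) {
        this.limitInBytes = limitInBytes;
    }

    public byte[] readPayload(DataInputStream in) throws IOException {
        int declaredLength = in.readInt(); // length prefix precedes the payload
        if (declaredLength > limitInBytes) {
            // Reject before allocating: the oversized payload is never buffered.
            throw new IOException("request size [" + declaredLength
                    + "] bytes exceeds limit of [" + limitInBytes + "] bytes");
        }
        byte[] payload = new byte[declaredLength]; // allocate only after the check
        in.readFully(payload);
        return payload;
    }
}
```

This illustrates the advantage mentioned above: the check happens before deserialization, so the limit protects memory rather than merely reporting a breach after the fact.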
@dakrone: Could you share your thoughts on whether using |
In both cases the memory usage will be determined by the size of the deserialized request payload. (More details above; that's just the summary for you.)

**Scenario 1**

I have seen that there is already a "request" circuit breaker but its

**Scenario 2**

We also want to limit memory usage on a single-request basis (not across all requests that are in flight) and tbh I think that circuit breakers are not a good fit because we would need a new circuit breaker instance for each request, as far as I understand it. However, it would make the implementation more uniform. It would be great to hear your input on this.
This is exactly what the circuit breaker is for, you can easily define a new
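For scenario 1, the shared-breaker idea can be sketched as a counter of in-flight request bytes that trips when a new request would push the total over the limit. This mirrors the concept only; the class and method names below are illustrative and do not reproduce Elasticsearch's actual `CircuitBreaker` implementation.

```java
import java.util.concurrent.atomic.AtomicLong;

public class InFlightRequestBreaker {

    private final long limitInBytes;
    private final AtomicLong inFlightBytes = new AtomicLong();

    public InFlightRequestBreaker(long limitInBytes) {
        this.limitInBytes = limitInBytes;
    }

    /** Reserve bytes for an incoming request, or trip if the limit would be breached. */
    public void addEstimateAndMaybeBreak(long bytes) {
        long newTotal = inFlightBytes.addAndGet(bytes);
        if (newTotal > limitInBytes) {
            inFlightBytes.addAndGet(-bytes); // roll back the reservation
            throw new IllegalStateException("in-flight requests would use ["
                    + newTotal + "] bytes, limit is [" + limitInBytes + "] bytes");
        }
    }

    /** Release the reservation once the request has been fully handled. */
    public void release(long bytes) {
        inFlightBytes.addAndGet(-bytes);
    }

    public long used() {
        return inFlightBytes.get();
    }
}
```

Because the counter is shared across requests, the limit here is on the aggregate of all in-flight requests, which is what distinguishes scenario 1 from the per-request limit of scenario 2.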
To me, this doesn't sound like a good fit for the circuit breaker (since it is |
Thanks for your feedback! I'll implement it based on your pointers. Would be great if you could take a look at it then.
I have started implementing request size limiting based on your pointers (see danielmitterdorfer/elasticsearch@802f6ed). I have added a new circuit breaker to Other things I have noted:
This came up when I added the request breaker on

So in your case, I think we may want to limit which requests fall into the

What do you think?
No, they definitely don't have to add up to 100%, that's why the parent is

```
fielddata: 40%
parent: 60%
```

Which allows us to limit individual parts to certain amounts (like absolute

I definitely think this new breaker should be absolute instead of relative.
What makes you think it is too implementation specific? You could always |
I think it makes sense to never circuit break a response (both when sending and receiving).
@dakrone: First of all, thanks for your thoughts.
With the current implementation this is not as easy as I've added a

Another approach I am thinking of is to decide, at stream creation time and based on the action, whether it needs to break or not. If yes, we wrap the original stream in a limiting stream; otherwise we just leave the stream as is. For me, the most appropriate place to add this support seems to be

For now I have stabilized the test in question by increasing the limit so it still breaks as intended but does not hit the request size limit. I think this is ok given that we need some minimum amount to handle the request at all.
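The "wrap the original stream in a limiting stream" approach described above can be sketched like this: a wrapper that counts bytes as they are read and fails once the limit is exceeded. This is illustrative only; the actual implementation would wrap Elasticsearch's `StreamInput` rather than `java.io.InputStream`, and the class name is an assumption.

```java
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;

public class SizeLimitingInputStream extends FilterInputStream {

    private final long limitInBytes;
    private long bytesRead;

    public SizeLimitingInputStream(InputStream in, long limitInBytes) {
        super(in);
        this.limitInBytes = limitInBytes;
    }

    private void checkLimit(long newBytes) throws IOException {
        bytesRead += newBytes;
        if (bytesRead > limitInBytes) {
            throw new IOException("request exceeded size limit of ["
                    + limitInBytes + "] bytes");
        }
    }

    @Override
    public int read() throws IOException {
        int b = super.read();
        if (b != -1) {
            checkLimit(1);
        }
        return b;
    }

    @Override
    public int read(byte[] b, int off, int len) throws IOException {
        int n = super.read(b, off, len);
        if (n > 0) {
            checkLimit(n);
        }
        return n;
    }
}
```

The decision of whether to wrap at all would be taken at stream creation time based on the action, so actions that must never break simply keep the unwrapped stream.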
That's great, then I'll leave that part as is. Thanks for the clarification. :)
My thought was that we should throw
@bleskes: Thanks for the hint. I have reduced the scope of size limiting to requests only (i.e. |
I have pushed another commit on the feature branch. We exhibit the following behaviour when a (bulk) request hits the limit:
I am not too happy that we behave differently depending on the protocol, but as this is implemented at (transport) protocol level, it is not much of a surprise. Wdyt @bleskes?
@danielmitterdorfer I think http.max_content_length limits the content length of any single HTTP request, not the total size of in-flight HTTP requests. I made a patch Dieken@f2d487e against v2.2.0 to limit the total size of in-flight HTTP bulk requests as a temporary solution, because it seems your patch isn't finished yet and it may not be merged to 2.x.
@Dieken You're right:

Regarding my changes: Support on transport level is finished, but we need to iron out some details (property names, default values). Your patch looks fine for the bulk use case, but I just want to raise your awareness of one (minor) thing: You use
for people following - @danielmitterdorfer and I went through the code together and Daniel is working on another iteration.
With this commit we limit the size of all in-flight requests on transport level. The size is guarded by a circuit breaker and is based on the content size of each request. By default we use 100% of available heap meaning that the parent circuit breaker will limit the maximum available size. This value can be changed by adjusting the setting network.breaker.inflight_requests.limit Relates elastic#16011
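For reference, the setting named in this commit message could be adjusted in `elasticsearch.yml` along these lines; the `512mb` value is purely illustrative, not a recommended default:

```yaml
# Cap in-flight request size at an absolute value instead of the
# default of 100% of available heap (value shown is illustrative).
network.breaker.inflight_requests.limit: 512mb
```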
With this commit we limit the size of all in-flight requests on HTTP level. The size is guarded by the same circuit breaker that is also used on transport level. Similarly, the size that is used is HTTP content length. Relates elastic#16011
Overly large bulk requests can threaten the stability of Elasticsearch. Hence we want to limit the size of a bulk request.
Although the need originally arose for bulk requests, the solution will apply to requests in general and not just bulk requests.
There will be two (configurable) limits: