-
Notifications
You must be signed in to change notification settings - Fork 751
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for processing requests with StreamContent to AddStandardHedgingHandler() #5112
Conversation
Add support for HttpRequestMessage objects containing StreamContent to the AddStandardHedgingHandler() resilience API. This change does not update any public API contracts. It updates internal and private API contracts only. Link to issue: dotnet#5105
Add support for HttpRequestMessage objects containing StreamContent to the AddStandardHedgingHandler() resilience API. This change does not update any public API contracts. It updates internal and private API contracts only. This is a small commit that is part of a larger PR. See the PR and its corresponding initial commit for the full set of changes.
Add support for HttpRequestMessage objects containing StreamContent to the AddStandardHedgingHandler() resilience API. This change does not update any public API contracts. It updates internal and private API contracts only. This is a small commit to update the ConfigureAwait arg on an async
Apparently, the code coverage for M.E.Http.Resilience has dropped from 97% to 96.75% and that's why CI failed. |
...icrosoft.Extensions.Http.Resilience/Hedging/ResilienceHttpClientBuilderExtensions.Hedging.cs
Outdated
Show resolved
Hide resolved
src/Libraries/Microsoft.Extensions.Http.Resilience/Internal/RequestMessageSnapshot.cs
Outdated
Show resolved
Hide resolved
cc: @iliar-turdushev can you take a peek? |
Add support for HttpRequestMessage objects containing StreamContent to the AddStandardHedgingHandler() resilience API. This change does not update any public API contracts. It updates internal and private API contracts only. This is a small commit to resolve comments made on the PR
Add support for HttpRequestMessage objects containing StreamContent to the AddStandardHedgingHandler() resilience API. This change does not update any public API contracts. It updates internal and private API contracts only. This is a small commit to resolve comments made on the PR.
@amcasey is code coverage calculated for the entire package or just for the lines I touched? I have added full test coverage for this change with the exception of the catch-block in two of the files where I catch |
This comment was marked as resolved.
This comment was marked as resolved.
{ | ||
if (request.Content is StreamContent) | ||
HttpContent? clonedContent = null; | ||
if (content is not null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This introduces performance regression for non-streamed content. Create an extra branch for StreamContent
where this logic belongs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@martintmk can you expand on why you believe this will cause a performance regression?
The cloning of the HttpContent
(i.e. this method) only executes if the request.Content
is StreamContent
. See the other places in this class where this CloneContentAsync
method is called for more context. You will see that each call is wrapped in a if (request.Content is StreamContent)
check. Therefore, I am performing the branching that you are suggesting (pending that I'm understanding your usage of the term "branch" to mean logic branch is correct). Note: I did this specifically to prevent any performance regression for existing non-Streamed content users of the AddStandardHedgingHandler
API.
Also, if you are referring to the if (request.Content is StreamContent)
check as being the culprit of a new performance regression, please observe that this check already existed before my change. The only difference is that, prior to this change, if request.Content is StreamContent
is true, then the RequestMessageSnapshot
methods throw an InvalidOperationException
. Therefore, perf should remain the same for existing, non-Streamed content users.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, thanks for detailed explanation. Can you still try to avoid the extra branches by condensing the code into a single PrepareContentAsync
method
It would also reduce the number of mutants you have to kill.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh I see what you mean. I understand where you are coming from; however, I don't think encapsulating the content operations behind a single method will achieve the additional clarity or eliminate the required mutation tests as you are suggesting. Below are my thoughts:
- The mutation tests still need to be accounted for regardless.
- I don't believe it would add any additional clarity. In fact, I believe it might do the opposite and here's why: the method would need to operate in both directions (i.e. cloning a passed-in request object into the given
RequestMessageSnapshot
instance fields and cloning the givenRequestMessageSnapshot
instance into a new request object). I don't see any clear way to represent that behind a single method without it creating more confusion. It even seems like it would be a code smell.
HttpContent originalContent = content; | ||
Stream originalRequestBody = await content.ReadAsStreamAsync().ConfigureAwait(false); | ||
MemoryStream clonedRequestBody = new MemoryStream(); | ||
await originalRequestBody.CopyToAsync(clonedRequestBody).ConfigureAwait(false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The streamed content can point to a large file which can have > GBs in size. Since we are copying to memory we should constraint it. Let's say < 10MB
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if it's a good idea to impose a size limit. For example, I don't think the framework imposes any such limit when cloning request content for redirected requests. IMO, we should allow this to be limitless, and require users to enforce their own request content size limits, if they have them, via server configurations and/or their own custom handlers, filters, middleware, etc. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change can buffer potentially endless stream into memory and crash the process. I think it's something we should guard against.
This thing should be used for relatively small streamed payloads.
Btw, you can try to call LoadIntoBufferAsync
on the content to see how it behaves.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do understand the concern; however, the max size that a given stream can grow before it crashes the process is bound only by the memory resources available on the given server that it's executing on. Therefore, by choosing a static max limit, we are creating a cap that might unnecessarily hinder users of the API that have both requirements and the necessary server resources available to process requests with exceptionally large stream content. Further, RequestMessageSnapshot
is only cloning content from existing request objects that contain existing StreamContent
. Therefore, a HttpRequestMessage
object has already been created with StreamContent
of the given size before any of the RequestMessageSnapshot
operations are executed. If the size of the StreamContent
was in fact too large, it would have already crashed the process.
I feel strongly that we should not impose a size limit when cloning the StreamContent.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the size of the StreamContent was in fact too large, it would have already crashed the process.
This is not true, the StreamContent
can point to file stream and it consumes minimal memory even as you read the whole file. What you are doing here is materializing the whole stream into the memory. This is what concerns me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only way what you are saying is true is if the content is a FileStream
and is read into a buffer, correct? This is somewhat of a remarkable edge case since we are talking about streams in the context of HTTP request content. It would help me understand this better if you could provide an example scenario where a developer would need to support this. Also, it is universally known that a given request must be deep cloned in order to support hedging. I believe the general assumption is that deep cloning will be handled via cloning to a MemoryStream
as opposed to any alternative stream, which would require the entire stream to be copied to memory. Therefore, I return to my previous stance that users of the AddStandardHedgingHandler
extension method should add their own handler, middleware, etc. to impose a size limit on request content if they have one. My biggest concern is that we don't have enough information to know what the optimal max limit value should be that would support the most use cases effectively.
src/Libraries/Microsoft.Extensions.Http.Resilience/Internal/RequestMessageSnapshot.cs
Outdated
Show resolved
Hide resolved
src/Libraries/Microsoft.Extensions.Http.Resilience/Internal/RequestMessageSnapshot.cs
Show resolved
Hide resolved
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
Add support for HttpRequestMessage objects containing StreamContent to the AddStandardHedgingHandler() resilience API. This change does not update any public API contracts. It updates internal and private API contracts only. This is a small commit to resolve comments made on the PR.
@RussKie I was able to add code coverage for the catch-block. I had to create a helper that extended |
This comment was marked as resolved.
This comment was marked as resolved.
AddContentHeaders(_requestMessage!.Content); | ||
using RequestMessageSnapshot snapshot = await RequestMessageSnapshot.CreateAsync(_requestMessage).ConfigureAwait(false); | ||
((IResettable)snapshot).TryReset(); | ||
_ = await Assert.ThrowsAsync<InvalidOperationException>(async () => await snapshot.CreateRequestMessageAsync().ConfigureAwait(false)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_ = await Assert.ThrowsAsync<InvalidOperationException>(async () => await snapshot.CreateRequestMessageAsync().ConfigureAwait(false)); | |
_ = await Assert.ThrowsAsync<InvalidOperationException>(() => snapshot.CreateRequestMessageAsync()); |
nit: This and other similar assert statements can be simplified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PR looks good to me.
I have alternative proposal that should enable your scenarios and won't cause potential issues. In your code-base you add the following handler: public class BufferingHandler : DelegatingHandler
{
protected override async Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
{
await request.Content.LoadIntoBufferAsync();
return await base.SendAsync(request, cancellationToken);
}
} Now hedging will support if (request.Content is StreamContent streamContent && await streamContent.ReadAsStreamAsync(cancellationToken) is Stream { CanSeek: true } stream)
{
byte[] buffer;
if (stream is MemoryStream memoryStream)
{
// but buffer asside
buffer = memoryStream.GetBuffer();
}
else
{
var memory = new MemoryStream();
await stream.CopyToAsync(memory);
buffer = memory.GetBuffer();
}
request.Content = new ByteArrayContent(buffer);
} This delegates the responsibility of buffering to the caller. In other words it's an explicit actions that tells us: "I have buffered the streamed content and I know what I am doing" The only extra code on your part is that minimal WDYT? |
@martintmk I like your proposal because it makes the decisions about storing the stream in memory explicit. |
Just to reiterate why I am pushing back on on this. The standard hedging is/will be used broadly across Microsoft most important services. My intuition tells me that allowing unlimited or default buffering of large streams into memory by default with eventually cause problems or outages. So we need to do it opt-in with the option to limit the size that will be buffered. |
I do like the idea of utilizing a buffer if one is available; however, this unfortunately is not possible (or rather, is not safe) to do in the context of hedging. It violates the "snapshot" pattern, breaking the implicit contract of the To expand, recall that the entire purpose of To illustrate, let's say that a user added a custom For hedging to work properly, the |
I agree here, let's find a way to make this work opt-in. How about adding a Then the snapshot can read this value a use it for buffering, throwing exception of buffer is exceeded. Another alternative is to add |
requestMessage.SetResilienceContext(args.ActionContext); | ||
|
||
// replace the request message | ||
args.ActionContext.Properties.Set(ResilienceKeys.RequestMessage, requestMessage); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
args.ActionContext.Properties.Set(ResilienceKeys.RequestMessage, requestMessage); | |
args.ActionContext.Properties.Set(ResilienceKeys.RequestMessage, requestMessage); |
There is a breaking change here :/
The OnHedging
callback does not have access to request message anymore. This is because OnHedging is called after the action is created and before it is invoked.
Hi @adamhammond I'm closing this PR as it looks like this work has stalled. If/when you decide to resume, feel free to re-open. |
Fixes #5105
Add support for processing HttpRequestMessage objects containing StreamContent to the AddStandardHedgingHandler() resilience API.
This change does not update any public API contracts. It updates internal and private API contracts only.
Microsoft Reviewers: Open in CodeFlow