Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Execution time reported for HTTP request by logger is lower than in reality #1121

Closed
ilya-spv opened this issue Feb 15, 2019 · 12 comments · Fixed by #2718
Closed

Execution time reported for HTTP request by logger is lower than in reality #1121

ilya-spv opened this issue Feb 15, 2019 · 12 comments · Fixed by #2718
Assignees

Comments

@ilya-spv
Copy link

ilya-spv commented Feb 15, 2019

Describe the bug

I make an HTTP requests to REST API that returns quite a lot of data. I have now used HttpClientFactory from .Net Core to get the data and it works fine. But what has puzzled me is a discrepancy between my timing and internal timing that comes from the library itself.

Here is an example message from the library:

End processing HTTP request after 130311.0094ms - OK

The thing is: it really takes 20-30 more seconds to really download all the data even though the library has written it's all finished (To execute client.GetAsync() method).

I assume what happens here is that library is very eager to report right after the header is received without waiting for the body to get downloaded.

This really sound like a bug, but I am not sure if it is. Maybe someone has a better explanation on why this is happening?

To Reproduce

Steps to reproduce the behavior:

  1. Using version '2.2.0.' of package 'Microsoft.Extensions.Http'
  2. Run this code: Register HttpClient with HttpClientFactory to get logging, make a call that takes a lot of time and returns lots of data ('client.GetAsync()') and check how much time does it take
  3. The time it takes is significantly higher than what the library itself reports in the log statement

Expected behavior

Library should log the end request when it is really finished, that is all the data is received.

Additional context

I am not 100% sure this is a bug, but would like at least to have som explanation on why this is happening then.

P.S. Here is an example of my code:

Stopwatch watch = new Stopwatch();
watch.Start();
var response = await Client.GetAsync(url);
watch.Stop();
Console.WriteLine($"Elapsed time {watch.ElapsedMilliseconds} ms");

Elapsed time in my log is way higher than in the system message

P.P.S I have tried running with HttpCompletionOption.ResponseContentRead but it really did not change the outcome, it's still a big difference.

@ilya-spv ilya-spv changed the title Execution time reported for HTTP request is lower than in reality Execution time reported for HTTP request by logger is lower than in reality Feb 15, 2019
@rynowak
Copy link
Member

rynowak commented Feb 15, 2019

That's interesting, do you have ability to profile this and see where the time is spent?

@ilya-git
Copy link

I can try, but I assume the reporting happens right after the header is received, and it takes some time to receive everything. But I don't really know, and probably profiling won't help either since it's not CPU usage

@rynowak
Copy link
Member

rynowak commented Feb 18, 2019

Most profilers have an option to profile 'thread time' instead of 'cpu time'. In particular I know that the timeline mode of JetBrains dotTrace supports this sort of thing, and they have a very generous trial period.

If you're to provide some profiler data, or a simple repro I'd be happy to investigate this.

One other question - are you running in an environment with a sync context? Does it make a difference if you do await Client.GetAsync(url).ConfigureAwait(false)?

The log messages that you're reporting are from our outer logging handler, so I would expect them to be as accurate as possible. Message handlers also act like middleware, and this one is the one that surrounds all of the pipeline. https://github.com/aspnet/Extensions/blob/master/src/HttpClientFactory/Http/src/Logging/LoggingScopeHttpMessageHandler.cs#L66

I would expect HttpCompletionOption.ResponseContentRead to have an effect on the overall timing of your requests, but I still don't have a guess as to what the cause of this might be so it's hard to really speculate.

@rynowak
Copy link
Member

rynowak commented Feb 18, 2019

FYI @karelz @stephentoub

@rynowak
Copy link
Member

rynowak commented Feb 18, 2019

OK I can definitely reproduce this with a "slow" server and HttpCompletionOption.ResponseContentRead

The reason why this is the case is that the support for HttpCompletionOption.ResponseContentRead is implemented inside HttpClient, not inside the message handler chain. You can see the code here

So when our logging code runs, it's always going to be just after the headers of the response have been read. I'm not sure that there's anything we can or should do about this - but there's definitely a reasonable explanation for it.

If for some reason you really want the body to be totally read during the message handler chain, you can write your own handler to force this earlier.

Example:

        public class MyBufferingHandler : DelegatingHandler
        {
            protected override async Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken cancellationToken)
            {
                var response = await base.SendAsync(request, cancellationToken).ConfigureAwait(false);
                await response.Content.LoadIntoBufferAsync();
                return response;
            }
        }
info: System.Net.Http.HttpClient.foo.LogicalHandler[100]
      Start processing HTTP request GET https://localhost:44316/Download?bytes=3145728&delay=1
info: System.Net.Http.HttpClient.foo.ClientHandler[100]
      Sending HTTP request GET https://localhost:44316/Download?bytes=3145728&delay=1
info: System.Net.Http.HttpClient.foo.ClientHandler[101]
      Received HTTP response after 732.418ms - OK
info: System.Net.Http.HttpClient.foo.LogicalHandler[101]
      End processing HTTP request after 12748.0145ms - OK
Elapsed time 00:00:12.7516826

@ilya-spv
Copy link
Author

ilya-spv commented Feb 18, 2019

Thanks for the explanation, that is pretty much what I expected to happen that logging code runs right after the headers are being read. It's not that critical for me to write a special handler, since I can rely on my logging that is made on top instead.

A little problem here is that it is a bit misleading to say "End processing HTTP request", since it is actually not quite "ended", depending of course on what we mean by that word. So maybe some different wording can be applied, though I can't quite come up with a good sentence here.

So it does mean additionally if one wants to know how much time it takes end to end, it's better to have a logging on top yourself, rather than rely on internal logging that will always be off (with the delta depending on the size of the content, bandwidth, etc.)

@rynowak
Copy link
Member

rynowak commented Feb 19, 2019

I think you're right about that - do you have a suggestion about what this log message should say?

@rynowak rynowak removed the question label Feb 19, 2019
@ilya-git
Copy link

ilya-git commented Feb 19, 2019

End receiving HTTP header or HTTP header received? Sounds a bit strange, but looks correct :)

I am a bit confused in your example about this log message that comes before end receive:

Received HTTP response that happens 10 seconds before the last message. If the last one is after headers are received, then when does the first message occur?

Ideally though it's best to have both messages

  1. Http header received
  2. Http response body received

But it seems it's not that easy from the current architecture, and I guess changing architecture because of logging is not going to happen :) Perhaps HttpClient can fire some additional event that can somehow be intercepted by logging as well

@rynowak
Copy link
Member

rynowak commented Feb 19, 2019

If the last one is after headers are received, then when does the first message occur?

In my example, I wrote a handler to force downloading of the request body. Since a call to HttpClient can trigger multiple actual requests, we have logging at both ends of the pipeline. In between is where the download of the message body occurs. It's academic though because we're not going to put this handler in the product code. I was just demonstrating that it's possible.

@antoinebj
Copy link

I hit this problem too.

When a call is made to GetAsync with content buffering option (which is the default if the completion option parameter is not specified), it is possible that the developer expects the operation to be very fast. If it turns out that it takes a significant time to complete on average, he might want to act on that by doing something after the headers have been received, before proceeding to load the content. In order to diagnose this need, he requires proper instrumentation.

There does seem to be a lack of opportunity to intercept the content buffering, specifically to log its duration separately from the duration of getting just the response headers.
The distinction between the 2 can give important clues. If you know that you have low network latency (thanks to ping), getting response headers late incriminates the server rather than the network. Getting the full content late might however incriminate the network bandwidth, unless network bandwidth tests say otherwise, in which case the server is streaming the response body from a slow source (obviously, as is too often the case, a 5.25" floppy disk).

Since the completion option is not passed to the message handlers, there's no way to know if the time of buffering should be taken into consideration.
Even if we had a way of getting the completion option within the pipeline, injecting a content buffering handler into the pipeline would unfortunately not allow the logging handler to discriminate the header response time from the content buffering time as it would only be able to measure the single SendAsync call to the inner handler.
Therefore, in that scenario, the logging handler would also have to be the one doing the content buffering. That sounds like a missed opportunity to separate responsibilities, which is a shame when using the chain of responsibilities pattern.

Perhaps the ideal solution is to allow interception of the content buffering in HttpClient separately from the chain of message handlers.
That can be done a number of ways:

  • Make SendAsync or some other method virtual to allow interception by overriding
  • Create an IHttpClient interface allowing for full customization, but that would impact the HttpClientFactory signature
  • Use a chain of responsibility pattern similarly to what is done for message handling, but this time for the content buffering (we'd then have a LoggingContentBufferingHandler calling the inner default ContentBufferingHandler), or in a more abstract way that resembles the parameter for the HTTP completion (e.g. HttpCompletionHandler)

@antoinebj
Copy link

antoinebj commented Mar 9, 2019

There's a possible workaround that I'm going to investigate: adding a message handler that replaces the response content object (HttpContent) with a decorator over the original content object.
That decorator will override CreateContentReadStreamAsync by returning a stream that also decorates the stream of the original content object. The stream decorator will allow logging when the content reading starts, and when the end of the stream is reached, and of course get the duration between the 2.
It might not be meaningful if the stream is not read in one shot, but for my use cases, it always is. I suspect that for 99% of use cases of HttpClient, it also is. A potential advantage is that it should work even if the content is not buffered and is read separately from the SendAsync call's completion.

rynowak added a commit that referenced this issue Nov 24, 2019
Fixes: #1121

Just a small tweak to logging to make it clear that this is the elapsed
time to get the response headers.
@rynowak
Copy link
Member

rynowak commented Nov 24, 2019

I'm sending a small pr to adjust this logging message to be more clear. All of the real solutions to this problem seem like things that are either:

  • Out of scope for the client factory
  • OR have a big impact on how request processing works

@rynowak rynowak self-assigned this Nov 24, 2019
@rynowak rynowak added this to the 5.0.0-preview1 milestone Nov 24, 2019
rynowak added a commit that referenced this issue Nov 26, 2019
Fixes: #1121

Just a small tweak to logging to make it clear that this is the elapsed
time to get the response headers.
@dotnet dotnet locked as resolved and limited conversation to collaborators Dec 26, 2019
maryamariyan pushed a commit to maryamariyan/runtime that referenced this issue Mar 11, 2020
Fixes: dotnet/extensions#1121

Just a small tweak to logging to make it clear that this is the elapsed
time to get the response headers.


Commit migrated from dotnet/extensions@846de6f
maryamariyan pushed a commit to maryamariyan/runtime that referenced this issue Mar 11, 2020
Fixes: dotnet/extensions#1121

Just a small tweak to logging to make it clear that this is the elapsed
time to get the response headers.


Commit migrated from dotnet/extensions@846de6f
maryamariyan pushed a commit to maryamariyan/runtime that referenced this issue Mar 27, 2020
Fixes: dotnet/extensions#1121

Just a small tweak to logging to make it clear that this is the elapsed
time to get the response headers.


Commit migrated from dotnet/extensions@846de6f
JunTaoLuo pushed a commit to dotnet/aspnetcore that referenced this issue Nov 17, 2020
Fixes: dotnet/extensions#1121

Just a small tweak to logging to make it clear that this is the elapsed
time to get the response headers.
\n\nCommit migrated from dotnet/extensions@846de6f
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants