Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Excessive allocations in HttpContent.ReadAsByteArrayAsync #81628

Open
dotMorten opened this issue Feb 4, 2023 · 20 comments
Open

Excessive allocations in HttpContent.ReadAsByteArrayAsync #81628

dotMorten opened this issue Feb 4, 2023 · 20 comments
Labels
Milestone

Comments

@dotMorten
Copy link

Description

When using HttpClient.SendAsync and reading the HttpResponse.Content into a byte-array, it results in allocations in my testing 4 times that of the actual http response. If using HttpClient.GetByteArrayAsync allocations much closer to the actual http response is made. However this API call only allows for basic get requests, and no way to look at response headers etc.

I'm trying to find ways to heavily reduce memory allocations while still having full access to the power SendAsync provides over the simplified Get* methods.

Configuration

.NET 7.0.101, Windows 11 Pro, build 22621, x64

Regression?

Same thing observed with .NET 6, allocations almost identical.

Data

Benchmark:

   [MemoryDiagnoser]
    public class HttpTests
    {
        private readonly HttpClient client;
        private const string url = "http://sampleserver6.arcgisonline.com/arcgis/rest/services/Census/MapServer/3/query?where=1%3D1&text=&objectIds=&time=&timeRelation=esriTimeRelationOverlaps&geometry=&geometryType=esriGeometryEnvelope&inSR=&spatialRel=esriSpatialRelIntersects&distance=&units=esriSRUnit_Foot&relationParam=&outFields=*&returnGeometry=true&returnTrueCurves=false&maxAllowableOffset=&geometryPrecision=&outSR=&havingClause=&returnIdsOnly=false&returnCountOnly=false&orderByFields=&groupByFieldsForStatistics=&outStatistics=&returnZ=false&returnM=false&gdbVersion=&historicMoment=&returnDistinctValues=false&resultOffset=&resultRecordCount=&returnExtentOnly=false&sqlFormat=none&datumTransformation=&parameterValues=&rangeValues=&quantizationParameters=&featureEncoding=esriDefault&f=pjson";
        public HttpTests()
        {
            client = new HttpClient();
        }

        [Benchmark] 
        public async Task<byte[]> Client_GetByteArrayAsync()
        {
            // Allocates ~1mb, about size of response
            var response = await client.GetByteArrayAsync(url).ConfigureAwait(false);
            return response;
        }
        [Benchmark]
        public async Task<byte[]> Client_Send_GetByteArrayAsync()
        {
            // Allocates ~4mb, about 4 times the response size
            var request = new HttpRequestMessage(HttpMethod.Get, url);
            var response = await client.SendAsync(request);
            var bytes = await response.Content.ReadAsByteArrayAsync().ConfigureAwait(false);
            return bytes;
        }
   }
Method Gen0 Gen1 Gen2 Allocated
Client_GetByteArrayAsync - - - 1.02 MB
Client_Send_GetByteArrayAsync 1000.0000 1000.0000 1000.0000 4.87 MB

Now I was able to reduce memory by about 1mb, using this approach, but still quite short of the simple GetByteArrayAsync:

        public async Task<int> ReadAsSpanAsync()
        {
            var request = new HttpRequestMessage(HttpMethod.Get, url);
            var response = await client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead).ConfigureAwait(false);
            using MemoryStream stream = new MemoryStream();
            await response.Content.CopyToAsync(stream);
            ReadOnlyMemory<byte> bytes;
            if (!stream.TryGetBuffer(out ArraySegment<byte> dataSegment))
                throw new Exception("Can't get buffer");
            bytes = dataSegment;
            if(bytes.Span.Length == 0)
                throw new Exception("Buffer empty");
            return bytes.Span.Length;
        }
Method Gen0 Gen1 Gen2 Allocated
ReadAsSpanAsync 750.0000 750.0000 750.0000 3.87 MB

Analysis

Looking at the HttpContent, lots of byte array copying is performed to ensure no access to the internal memorystream that was buffered to. It might be helpful to create some access to this data with ReadOnlyMemory<byte> or similar.

@dotMorten dotMorten added the tenet-performance Performance related issue label Feb 4, 2023
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Feb 4, 2023
@ghost
Copy link

ghost commented Feb 4, 2023

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

Issue Details

Description

When using HttpClient.SendAsync and reading the HttpResponse.Content into a byte-array, it results in allocations in my testing 4 times that of the actual http response. If using HttpClient.GetByteArrayAsync allocations much closer to the actual http response is made. However this API call only allows for basic get requests, and no way to look at response headers etc.

I'm trying to find ways to heavily reduce memory allocations while still having full access to the power SendAsync provides over the simplified Get* methods.

Configuration

.NET 7.0.101, Windows 11 Pro, build 22621, x64

Regression?

Same thing observed with .NET 6, allocations almost identical.

Data

Benchmark:

   [MemoryDiagnoser]
    public class HttpTests
    {
        private readonly HttpClient client;
        private const string url = "http://sampleserver6.arcgisonline.com/arcgis/rest/services/Census/MapServer/3/query?where=1%3D1&text=&objectIds=&time=&timeRelation=esriTimeRelationOverlaps&geometry=&geometryType=esriGeometryEnvelope&inSR=&spatialRel=esriSpatialRelIntersects&distance=&units=esriSRUnit_Foot&relationParam=&outFields=*&returnGeometry=true&returnTrueCurves=false&maxAllowableOffset=&geometryPrecision=&outSR=&havingClause=&returnIdsOnly=false&returnCountOnly=false&orderByFields=&groupByFieldsForStatistics=&outStatistics=&returnZ=false&returnM=false&gdbVersion=&historicMoment=&returnDistinctValues=false&resultOffset=&resultRecordCount=&returnExtentOnly=false&sqlFormat=none&datumTransformation=&parameterValues=&rangeValues=&quantizationParameters=&featureEncoding=esriDefault&f=pjson";
        public HttpTests()
        {
            client = new HttpClient();
        }

        [Benchmark] 
        public async Task<byte[]> Client_GetByteArrayAsync()
        {
            // Allocates ~1mb, about size of response
            var response = await client.GetByteArrayAsync(url).ConfigureAwait(false);
            return response;
        }
        [Benchmark]
        public async Task<byte[]> Client_Send_GetByteArrayAsync()
        {
            // Allocates ~4mb, about 4 times the response size
            var request = new HttpRequestMessage(HttpMethod.Get, url);
            var response = await client.SendAsync(request);
            var bytes = await response.Content.ReadAsByteArrayAsync().ConfigureAwait(false);
            return bytes;
        }
   }
Method Gen0 Gen1 Gen2 Allocated
Client_GetByteArrayAsync - - - 1.02 MB
Client_Send_GetByteArrayAsync 1000.0000 1000.0000 1000.0000 4.87 MB

Now I was able to reduce memory by about 1mb, using this approach, but still quite short of the simple GetByteArrayAsync:

        public async Task<int> ReadAsSpanAsync()
        {
            var request = new HttpRequestMessage(HttpMethod.Get, url);
            var response = await client.SendAsync(request, HttpCompletionOption.ResponseHeadersRead).ConfigureAwait(false);
            using MemoryStream stream = new MemoryStream();
            await response.Content.CopyToAsync(stream);
            ReadOnlyMemory<byte> bytes;
            if (!stream.TryGetBuffer(out ArraySegment<byte> dataSegment))
                throw new Exception("Can't get buffer");
            bytes = dataSegment;
            if(bytes.Span.Length == 0)
                throw new Exception("Buffer empty");
            return bytes.Span.Length;
        }
Method Gen0 Gen1 Gen2 Allocated
ReadAsSpanAsync 750.0000 750.0000 750.0000 3.87 MB

Analysis

Looking at the HttpContent, lots of byte array copying is performed to ensure no access to the internal memorystream that was buffered to. It might be helpful to create some access to this data with ReadOnlyMemory<byte> or similar.

Author: dotMorten
Assignees: -
Labels:

area-System.Net.Http, tenet-performance, untriaged

Milestone: -

@stephentoub
Copy link
Member

stephentoub commented Feb 4, 2023

I'm unclear as to the issue being reported. The title says that HttpContent.GetByteArrayAsync had excessive allocation, but there isn't such a method, and the text of the issue says HttpClient.GetByteArrayAsync had the desired allocation. Is this issue asking how to write code that reads the response stream and achieves the same allocations as HttpClient.GetByteArrayAsync? Or did you intend to ask about HttpContent.ReadAsByteArrayAsync?

@LanceMcCarthy
Copy link

@stephentoub this screen explains the issue in a glance https://twitter.com/dotMorten/status/1621672844760592385?t=mpRkFiS_w25HzqUxGsL1Tg&s=19

(Sorry for doing a Kramer entrance, but I'd like to improve my networking stack too 😉)

@dotMorten dotMorten changed the title Excessive allocations in HttpContent.GetByteArrayAsync Excessive allocations in HttpContent.ReadAsByteArrayAsync Feb 4, 2023
@dotMorten
Copy link
Author

dotMorten commented Feb 4, 2023

@stephentoub sorry for the typo. Fixed the title.
Yes I want to read the data from HttpContent just as efficient as I can with the shortcut method on HttpClient.

@davidfowl
Copy link
Member

If you want this code to be efficient, don't allocate the byte[] and instead use the Stream (I know it's unrelated to the issue but at the end of the day the above code can't really be efficient).

@dotMorten
Copy link
Author

dotMorten commented Feb 4, 2023

I do want to allocate the byte array for the contents (it’ll be sent to native code in my case as is) but I don’t want to allocate it 4 times over.
I was trying to optimize a little and move to more Span/Memory use and was shocked to see a 1mb download becomes a 4mb allocation.

As shown above I was able to use stream and reduce a little bit but it is still 3 times as inefficient

@stephentoub
Copy link
Member

The difference comes from a few places:

  • HttpContent for whatever reason was designed to buffer its contents and be readable multiple times, e.g. you can call ReadAsByteArrayAsync multiple times on the same content and it'll succeed. That means it doesn't hand out the same array each time, because the consumer could change the contents and thereby affect the behavior of future calls, which means HttpContent buffers everything once and returns a copy. If you only call it once, though, you end up with twice the allocation needed. That explains 1x of the overhead.
  • The response is chunked rather than having a content length, so the client has no idea how big the response will be. As such, via MemoryStream, it ends up using a standard doubling algorithm on the buffer size as data arrives. The way the data ends up arriving in chunks, it ends up doubling ~10 times. The way the chunks are sent by the server relative to the total amount sent, the server ends up sending about 1MB but in a way that causes the buffer to have doubled to ~2MB (enough data was sent in just the right (wrong) way that it forced another double but didn't use most of it). That difference is another 1x of the overhead. But we allocated doubling each time to get there, so along the way it also allocated another ~2MB in temporary arrays... that explains another ~2x of the overhead.

In contrast, GetByteArrayAsync doesn't have to buffer in a way that's reusable, so it buffers into ArrayPool memory, creates a byte[] from that, and returns the temporary arrays to the pool. ReadAsByteArrayAsync could also use temporary ArrayPool buffers, but it needs to store the resulting temporary for any future calls (because of the "allow multiple future reads" design), which either it would need to hold onto the last ArrayPool array it got indefinitely and hope that the developer disposes of the response message appropriately (note your repro doesn't), or it would need to copy the final ArrayPool array into yet another temporary; this would enable pooling the temporaries along the way, but would still require copying the final one.

We could explore doing using ArrayPool in that manner. We could also explore using a stream that uses a linked list of chunks rather than keeping the data contiguous. But frankly if you care about allocation, I'm with David here: consume the Stream yourself. At that point you can avoid allocating any temporary buffers for the response data.

@dotMorten
Copy link
Author

@stephentoub thanks for the detailed answer. I’m more than happy to use stream but I still don’t see how I can use SendAsync and consume the HttpContent’s stream and still get the same low allocation that GetByteArrayAsync does. As shown in my 3rd benchmark I can reduce it by 30% with stream still far from the 75% reduction that might be possible (and an array pool added that might become 50%). From what I can tell the buffered stream will always be there, even when returning as soon as headers have been read.

(Good calll on the dispose - my real code does that or when using in .net framework it ends up exhausting all connections if hitting the same server a lot in parallel)

@stephentoub
Copy link
Member

stephentoub commented Feb 4, 2023

From what I can tell the buffered stream will always be there, even when returning as soon as headers have been read.

It won't. If you use ResponseHeadersRead and get the stream with ReadAsStream, you can read from that and there won't be any additional buffering by the HttpContent. GetBytesArrayAsync isn't doing anything magical; it's just copying from that same response stream into a stream backed by array pool arrays.

@dotMorten
Copy link
Author

Thanks again! I lifted out LimitArrayPoolWriteStream and was able to get the performance I wanted. In fact I actually see a way now with the rented pool to make it never create extra copies and make it almost allocation free, since I only need the array for a brief moment before returning it to the pool, so just need to expose the rented array directly.

I think we can close this, but I gotta admit that it might be nice with some new overload/method to get the contents out with low allocation without all those hoops.

@ManickaP
Copy link
Member

ManickaP commented Feb 9, 2023

I think this has been answered and resolved, so I'm closing this.

@ManickaP ManickaP closed this as completed Feb 9, 2023
@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Feb 9, 2023
@davidfowl
Copy link
Member

It’s resolved but there’s a reasonable point about the fact that there’s no one liner that does the equivalent (not suggesting we should break ReadAsByteArrayAsync)

@dotMorten
Copy link
Author

Thanks David. I found if there’s a content length header the allocations are “only” 2x the download size. Without the length header it’s 4x.

@stephentoub
Copy link
Member

stephentoub commented Feb 9, 2023

I found if there’s a content length header the allocations are “only” 2x the download size. Without the length header it’s 4x.

That's because of the growth doubling-algorithm I mentioned in #81628 (comment). The extra full 2x is also the extreme version of this where you double the last time but then use little-to-none of the increased size. The HttpClient.GetByteArrayAsync implementation still experiences that exact same 2x, but it does so using pooled arrays, so it's less visible in an allocation profile.

not suggesting we should break ReadAsByteArrayAsync

If we were to do anything here, I'd actually suggest this be the path forward. It'd be interesting to understand who actually uses ReadAsByteArrayAsync multiple times on the same content and also mutates the returned byte[]. If the answer is "almost no one", as I suspect, it'd be simpler and cheaper to just take a breaking change to the implementation to not make a defensive copy.

@davidfowl
Copy link
Member

If we were to do anything here, I'd actually suggest this be the path forward. It'd be interesting to understand who actually uses ReadAsByteArrayAsync multiple times on the same content and also mutates the returned byte[]

I'm not sure how we get this information, but I agree.

If the answer is "almost no one", as I suspect, it'd be simpler and cheaper to just take a breaking change to the implementation to not make a defensive copy.

With an opt out switch?

@stephentoub
Copy link
Member

With an opt out switch?

Ideally not. The opt-out switch is: if you're doing this, make a copy of the byte[] on your own instead of mutating the original.

I'm not sure how we get this information, but I agree.

Search for all calls on GitHub and see if/how folks are using it. Then try to make the change, document it as breaking, and see who complains :)

@davidfowl
Copy link
Member

Then try to make the change, document it as breaking, and see who complains :)

Would calling HttpContext.LoadIntoBufferAsync() be the workaround?

Then try to make the change, document it as breaking, and see who complains :)

I would just do this 😄.

@stephentoub
Copy link
Member

Would calling HttpContext.LoadIntoBufferAsync() be the workaround?

My proposal would be to just hand out the loaded buffer, in which case, no.

@davidfowl
Copy link
Member

I think as long as something like this works:

HttpClient client = new HttpClient();
var response = await client.GetAsync(url);
var b0 = await response.Content.ReadAsByteArrayAsync();
response.Content = new ByteArrayContent(b0);
var b1 = await response.Content.ReadAsByteArrayAsync();

Then we have a solid workaround for people that need to replay.

@stephentoub
Copy link
Member

stephentoub commented Feb 9, 2023

That would suffer from the same issue: if b0 were handed out and b0 were mutated, it'd be mutating the same array stored by the new ByteArrayContent and thus would affect subsequent reads.

My suggestion was just:

HttpClient client = new HttpClient();
var response = await client.GetAsync(url);
byte[] b0 = (byte[])(await response.Content.ReadAsByteArrayAsync()).Clone();
... // mutate b0 to your hearts content, as it's a copy of what's in the HttpContent

In other words, allow whether a copy is made to be left up to the caller rather than forcing it upon everyone.

(Regardless of whether we make such a change, we could change the implementation of ReadAsByteArrayAsync to use the same pooling stream that GetByteArrayAsync uses. The reason we don't today is because, whereas GetByteArrayAsync just copies the final stream's data to a byte[] and returns that, the code in ReadAsByteArrayAsync stores the filled stream itself, and we don't want that storing buffers from the array pool. It could instead be made to copy out to a byte[] and store that, but then you're actually increasing the amount of allocation for shorter responses that didn't require growing beyond the initial allocation, in particular if you need to hand out copies. It's also complicated by other methods like ReadAsStreamAsync and ReadAsStringAsync, which don't require the byte[] underlying the stream be exactly sized, so they could no longer share all of the same buffering code.)

@stephentoub stephentoub reopened this Feb 9, 2023
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Feb 9, 2023
@stephentoub stephentoub removed the untriaged New issue has not been triaged by the area owner label Feb 9, 2023
@stephentoub stephentoub added this to the Future milestone Feb 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

5 participants