Allocations in FormReader #553

rynowak · 2016-02-04T17:17:42Z

There's a lot of overhead right now in FormReader

Data is from 3000 requests to https://github.com/aspnet/Performance/tree/dev/testapp/BigModelBinding (x64). The client is doing a form post of about 100 form fields - we kinda consider this the upper bound for the amount of data (and shape of data) that you'd want to put through MVC model binding.

Based on this profile, my napkin math shows about 77mb of allocations from our underlying representation (string + System.Collections.Generic.Dictionary<String, StringValues> + etc) and about 122mb coming from various overheads.

From the 122mb I'm excluding the stuff that's in theory covered by @benaadams' current work (byte/char buffers, etc).

Breaking this down further

`Task<string>` - 47mb

Cost of being async

`Task<Nullable<KeyValuePair<string, string>>>` - 28mb

Cost of being async

`Dictionary<string, List<String>>`, `List<string>`, `string[]` - 25mb, 12mb, and 9mb respectively

Scratch data used in KeyValueAccumulator to build the final Dictionary<string, StringValues>

Note that 5mb of string[] allocations are coming from a hashset in MVC, so I excluded it from this total

Some thoughts

We should consider a design for FormReader where we pass the accumulator around or store values in fields/properties instead of returning them via Task<T>. Using async plus Task<T> at such a chatty level is what causes all of this overhead. Using the non-generic Task, on the other hand, can avoid this issue, because a call that completes synchronously doesn't need a fresh Task object to carry a result.
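As a rough illustration of the "store values in fields/properties" idea, here is a minimal sketch. `WordReader`, `Current`, and `ReadWordAsync(char)` are hypothetical names for this example and are not the actual FormReader API. The method returns a non-generic `Task` and leaves its result in a property, so a call that finishes without really awaiting has no per-call `Task<string>` to allocate.

```csharp
using System.IO;
using System.Text;
using System.Threading.Tasks;

// Hypothetical reader, for illustration only; not the real FormReader.
public sealed class WordReader
{
    private readonly TextReader _reader;
    private readonly char[] _buffer = new char[1024];
    private readonly StringBuilder _builder = new StringBuilder();
    private int _pos;
    private int _count;

    public WordReader(TextReader reader) => _reader = reader;

    // Result of the last ReadWordAsync call; null once the input is exhausted.
    public string Current { get; private set; }

    // Returns a non-generic Task; the word itself is handed back via Current.
    public async Task ReadWordAsync(char separator)
    {
        _builder.Clear();
        bool sawAny = false;
        while (true)
        {
            if (_pos == _count)
            {
                // Refill the buffer; only this step can actually go async.
                _count = await _reader.ReadAsync(_buffer, 0, _buffer.Length);
                _pos = 0;
                if (_count == 0)
                {
                    Current = sawAny ? _builder.ToString() : null;
                    return;
                }
            }

            sawAny = true;
            char c = _buffer[_pos++];
            if (c == separator)
            {
                Current = _builder.ToString();
                return;
            }
            _builder.Append(c);
        }
    }
}
```

A caller would `await reader.ReadWordAsync('&')` and then read `reader.Current`. The trade-off is that the result is only valid until the next call, which is usually acceptable for a parser that consumes each word immediately.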

If we're in a state where we know the entire body is buffered in memory, we might want to just optimize that path to do a synchronous read. That's probably a simpler fix, and it will result in running much more efficient code without any async overhead.

We should also consider changing KeyValueAccumulator to just operate on Dictionary<string, StringValues> directly. Having multiple values for the same field is the less-common case. If we wanted to be smart about it, we could basically build List<T>'s resizing behavior into a StringValues.Add method (which returns a StringValues, since they are immutable), and then all cases would be pretty optimal.

Right now we're making the worst case the common case by allocating a List<string> and string[] for the common case of a single value per key.
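To make the single-value fast path concrete, here is a minimal sketch. `Values` is a hypothetical stand-in for `StringValues` (it is not the real type, and the real `StringValues` has no such `Add` method in this thread's timeframe), and `Accumulator` is a simplified stand-in for `KeyValueAccumulator`. A key with one value stores just the string; an array with `List<T>`-style doubling appears only when a second value arrives.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for StringValues: one string, or an array with spare capacity.
public readonly struct Values
{
    private readonly string _single;   // used when Count == 1
    private readonly string[] _many;   // used when Count > 1

    private Values(string single, string[] many, int count)
    {
        _single = single;
        _many = many;
        Count = count;
    }

    public int Count { get; }

    public string this[int index]
    {
        get
        {
            if ((uint)index >= (uint)Count) throw new ArgumentOutOfRangeException(nameof(index));
            return _many == null ? _single : _many[index];
        }
    }

    // Returns a new Values with one more item. The backing array is reused when it
    // has room, so this assumes the old Values is discarded (as an accumulator does).
    public Values Add(string value)
    {
        if (Count == 0) return new Values(value, null, 1);   // common case: no array at all
        if (Count == 1)
        {
            var first = new string[4];
            first[0] = _single;
            first[1] = value;
            return new Values(null, first, 2);
        }

        var items = _many;
        if (Count == items.Length) Array.Resize(ref items, items.Length * 2);
        items[Count] = value;
        return new Values(null, items, Count + 1);
    }
}

// Simplified accumulator that builds the final dictionary directly.
public sealed class Accumulator
{
    private readonly Dictionary<string, Values> _map = new Dictionary<string, Values>();

    public void Append(string key, string value)
    {
        _map.TryGetValue(key, out var existing);   // default(Values) when the key is new
        _map[key] = existing.Add(value);
    }

    public Dictionary<string, Values> GetResults() => _map;
}
```

Note that `Append` still performs a lookup followed by an insert, so this sketch only removes the scratch `List<string>` and `string[]` allocations, not the cost of hashing the key twice.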

benaadams · 2016-02-04T21:40:59Z

ValueTask would also be an option here; it's also the exact scenario it was designed for.
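For readers unfamiliar with the pattern, a minimal sketch of how `ValueTask<T>` helps in this kind of scenario follows. `BufferedCharSource` and its members are hypothetical names for this example, not code from the PRs discussed here. When the data is already buffered, the result is wrapped in a struct and no `Task<int>` is allocated; only the refill path pays for a real task.

```csharp
using System.IO;
using System.Threading.Tasks;

// Hypothetical buffered char source, for illustration only.
public sealed class BufferedCharSource
{
    private readonly TextReader _reader;
    private readonly char[] _buffer = new char[1024];
    private int _pos;
    private int _count;

    public BufferedCharSource(TextReader reader) => _reader = reader;

    // Returns the next char as an int, or -1 at end of input.
    public ValueTask<int> ReadCharAsync()
    {
        if (_pos < _count)
        {
            // Fast path: the value is already buffered, so no Task<int> is allocated.
            return new ValueTask<int>((int)_buffer[_pos++]);
        }

        // Slow path: the buffer is empty, fall back to a real Task<int>.
        return new ValueTask<int>(FillAndReadAsync());
    }

    private async Task<int> FillAndReadAsync()
    {
        _count = await _reader.ReadAsync(_buffer, 0, _buffer.Length);
        _pos = 0;
        return _count == 0 ? -1 : _buffer[_pos++];
    }
}
```

The split into a non-async fast path and an async helper is deliberate: it works without compiler support for `async` methods that return `ValueTask<T>`.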

Addressing Task allocations in aspnet#553 Unfortunately this has a return type split between in public api net451 and dotnet5.4

benaadams · 2016-02-07T23:00:04Z

#554 + #556 should hopefully resolve all of the above

rynowak · 2016-02-08T16:32:31Z

Would it be better overall to just make FormReader a "push" parser? Using ValueTask might kill the allocs, but it's still a lot of code to run. We could also avoid a second pass on the "output" stream.

Pseudocode:

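Stream input = ...;
Stream output = ...; // our buffered stream
byte[] bBuffer = ...;
char[] cBuffer = ...;
Decoder decoder = ...;
FormReader formReader = ...;

int bytesRead;
while ((bytesRead = await input.ReadAsync(bBuffer, 0, bBuffer.Length)) > 0)
{
    // Decode the bytes just read (bytesRead of them) from bBuffer into cBuffer.
    var chars = decoder.GetChars(...);
    // Push the decoded chars into the parser; no await at this level.
    formReader.Write(cBuffer, 0, chars);

    // Copy the same bytes to the buffered stream in the same pass.
    await output.WriteAsync(bBuffer, 0, bytesRead);
}

output.Seek(0, SeekOrigin.Begin);
formReader.Complete();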
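To show what the parser side of such a push design might look like, here is a minimal sketch. `PushFormParser`, `Write`, `Complete`, and `Pairs` are hypothetical names chosen to mirror the pseudocode above; this is not an existing API. The parser keeps its state between calls, so the caller can feed it chunks of any size and only the outer read loop needs to be async.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical push-style parser for application/x-www-form-urlencoded data.
public sealed class PushFormParser
{
    private readonly StringBuilder _current = new StringBuilder();
    private string _key;   // non-null once '=' has been seen for the current pair

    public List<KeyValuePair<string, string>> Pairs { get; } =
        new List<KeyValuePair<string, string>>();

    // Called by whoever owns the read loop; purely synchronous.
    public void Write(char[] buffer, int offset, int count)
    {
        for (int i = offset; i < offset + count; i++)
        {
            char c = buffer[i];
            if (c == '=' && _key == null)
            {
                _key = Decode(_current);   // end of key
            }
            else if (c == '&')
            {
                FinishPair();              // end of value
            }
            else
            {
                _current.Append(c);
            }
        }
    }

    // Flushes the last pair, which has no trailing '&'.
    public void Complete() => FinishPair();

    private void FinishPair()
    {
        if (_key == null && _current.Length == 0) return;   // nothing pending

        string key, value;
        if (_key != null)
        {
            key = _key;
            value = Decode(_current);
        }
        else
        {
            key = Decode(_current);        // "a&b" style: a key with no '='
            value = string.Empty;
        }

        _key = null;
        Pairs.Add(new KeyValuePair<string, string>(key, value));
    }

    // '+' means space in form encoding; %XX escapes are then unescaped.
    private static string Decode(StringBuilder builder)
    {
        string text = Uri.UnescapeDataString(builder.ToString().Replace('+', ' '));
        builder.Clear();
        return text;
    }
}
```

This sketch decodes per pair and ignores limits on key count or value length, which a real implementation would need to enforce because the input is untrusted.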

rynowak · 2016-02-08T16:34:06Z

I'll do some profiling this AM. I want to see where reading form data fits in to the big picture in a sampling profile

rynowak · 2016-02-08T17:46:16Z

Some sampling data (same benchmark):

MVC at a high level

The 100% in this case is time spent inside MVC+Routing (only negligible work is done in other parts of the pipeline).

The 13.39% AddValueProviderAsync is the time spent reading the form data and building the form value provider. I'll provide a further drill-down here.
The 2.65% AddValueProviderAsync is the jQuery form value provider, which parses and duplicates the form data keys for some reason.
The 80.05% InvokeExceptionFilterAsync is model binding + the action code.

Some extras:
2.47% InvokeResultFilterAsync is JSON.Net outputting the same data as JSON.
About 1% of time is spent on routing and other misc overhead to get to this point.

Breaking down form reading

There's a very small amount of overhead here between MVC and the form reader; I was surprised how small it was. So, reading and parsing the input is 13.29% of the request time. Let's drill in more.

More drill in

The bottom-up breakdown provided by dotTrace is pretty instructive:

We're looking at a pretty significant amount of time spent in async/task overhead: 11.63% spent in AsyncMethodBuilder and 17.95% in <ReadWordAsync>...'s exclusive time. The call tree above clearly shows that none of the methods called here https://github.com/aspnet/HttpAbstractions/blob/dev/src/Microsoft.AspNetCore.WebUtilities/FormReader.cs#L112 were inlined, so they are not included in the 17.95%. The fact that StringBuilder.Append(char) wasn't inlined seems like a red flag.

Additionally, another big cost is hashing. In the common case, we hash each key twice (once for a TryGetValue and again for an Insert). If we can find a way to avoid rehashing, that would be a significant gain (hashing was 18.75%).
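The double hash can be made visible with a small sketch. `CountingComparer` and `DoubleHashDemo` are hypothetical helpers written for this example; they wrap the usual lookup-then-insert pattern and count how often the key is hashed when it is not yet in the dictionary.

```csharp
using System.Collections.Generic;

// Comparer that counts GetHashCode calls so the double hash becomes visible.
public sealed class CountingComparer : IEqualityComparer<string>
{
    public int HashCalls { get; private set; }

    public bool Equals(string x, string y) => string.Equals(x, y);

    public int GetHashCode(string s)
    {
        HashCalls++;
        return s.GetHashCode();
    }
}

public static class DoubleHashDemo
{
    // Runs the lookup-then-insert pattern for a new key and reports
    // how many times the key was hashed.
    public static int CountHashesForNewKey(string key, string value)
    {
        var comparer = new CountingComparer();
        var dict = new Dictionary<string, List<string>>(comparer);

        if (!dict.TryGetValue(key, out var list))   // first hash: the lookup
        {
            list = new List<string>();
            dict[key] = list;                        // second hash: the insert
        }
        list.Add(value);

        return comparer.HashCalls;
    }
}
```

For a key that is not yet present, `CountHashesForNewKey` reports two hash computations, one per dictionary operation. With roughly 100 mostly unique form fields per request, nearly every key takes this path.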

benaadams · 2016-02-08T19:42:08Z

Is this pre or post change? ValueTask should also kill a lot of the AsyncMethodBuilder work

benaadams · 2016-02-08T20:09:29Z

Re: 18.75% of the total time being spent on hashing and multi-hashing. The cost is in KeyValueAccumulator for the Dictionary?

@stephentoub proposed a TryAdd here https://github.com/dotnet/corefx/issues/1942 though it would probably need an AddOrUpdate with Func<TKey, TVal, TVal, TVal> on the exists branch that has access to current & new value to avoid a closure, as that's where it does interesting things.

Another approach would be to not use Dictionary and use a different data structure. However, that's more complicated as it's also the return type (to allow a struct enumerable rather than IEnumerable).

Also, the Dictionary has had security mitigations applied for this scenario, so a different data structure would be a complicated choice :(

rynowak · 2016-02-08T21:13:26Z

@benaadams - well we're using dictionary as a multi-map, so really the best choice for us would be a dedicated multi-map type, or something like python's dict which lets you specify the default value for a key.

A GetOrAdd would certainly help in the common case. You'd want it to be something like:

List<string> list;
dict.GetOrAdd(key, () => new List<string>(), out list);
list.Add(value);

or

var list = dict.GetOrAdd(key, () => new List<string>());
list.Add(value);

It has to be built in to Dictionary<> or else you'll hash twice.
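For comparison, later versions of .NET (6 and up) expose `CollectionsMarshal.GetValueRefOrAddDefault`, which gives a single-lookup "get or add" on `Dictionary<,>`. That API did not exist when this thread was written and is not what was proposed here; the sketch below only illustrates the single-hash shape being asked for. `SingleLookup.Append` is a hypothetical helper name.

```csharp
using System.Collections.Generic;
using System.Runtime.InteropServices;

public static class SingleLookup
{
    // Appends value to the list stored under key, hashing the key only once.
    public static void Append(Dictionary<string, List<string>> dict, string key, string value)
    {
        // Returns a ref to the value slot; if the key was missing, a slot holding
        // the default (null) is created and 'exists' is false.
        ref var slot = ref CollectionsMarshal.GetValueRefOrAddDefault(dict, key, out bool exists);
        if (!exists)
        {
            slot = new List<string>();
        }
        slot.Add(value);
    }
}
```

The returned ref is only safe to use until the dictionary is next modified, so it should not be held across other inserts or removals.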

Really for our usage, a sorted list<kvp<string, *>> is probably more efficient overall than hashing. It depends how many values we really want to accept into a form. For MVC that number is around 100, if you're doing more than that, you've got an extreme scenario and your design is crazy.

Example here: https://github.com/aspnet/Mvc/blob/dev/src/Microsoft.AspNetCore.Mvc.ViewFeatures/ViewFeatures/AttributeDictionary.cs
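A minimal sketch of the sorted-list idea follows. `SortedPairs` is a hypothetical type written for this example and is not a copy of the linked AttributeDictionary; the case-insensitive ordinal comparison is an assumption. Lookups use a binary search over the keys, so no hashing happens at all, at the price of an O(n) shift on insert.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical small multi-map kept sorted by key; lookups compare strings, never hash them.
public sealed class SortedPairs
{
    private readonly List<KeyValuePair<string, List<string>>> _items =
        new List<KeyValuePair<string, List<string>>>();

    public int Count => _items.Count;

    public void Append(string key, string value)
    {
        int index = Find(key);
        if (index < 0)
        {
            // Not found: the complement is the position that keeps the list sorted.
            index = ~index;
            _items.Insert(index, new KeyValuePair<string, List<string>>(key, new List<string>()));
        }
        _items[index].Value.Add(value);
    }

    public bool TryGetValues(string key, out List<string> values)
    {
        int index = Find(key);
        values = index >= 0 ? _items[index].Value : null;
        return index >= 0;
    }

    // Binary search. Returns the index of the key, or the bitwise complement
    // of the insertion point when the key is absent.
    private int Find(string key)
    {
        int lo = 0;
        int hi = _items.Count - 1;
        while (lo <= hi)
        {
            int mid = lo + ((hi - lo) / 2);
            int cmp = StringComparer.OrdinalIgnoreCase.Compare(_items[mid].Key, key);
            if (cmp == 0) return mid;
            if (cmp < 0) lo = mid + 1;
            else hi = mid - 1;
        }
        return ~lo;
    }
}
```

For around 100 keys a lookup needs about seven string comparisons. Whether that beats two hash computations per key depends on key length and would need measuring.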

Form-data isn't a good general serialization format, and it's not a replacement for JSON (or others).

rynowak · 2016-02-08T22:37:23Z

> Is this pre or post change? ValueTask should also kill a lot of the AsyncMethodBuilder work

This is without any of your changes, I think.

benaadams · 2016-02-09T00:09:42Z

> if you're doing more than that, you've got an extreme scenario and your design is crazy.

It's user input though, so we'd want to avoid a data structure that might behave like this did: http://www.troyhunt.com/2012/08/fixing-hash-dos-good-and-proper-and.html

benaadams · 2016-02-09T04:45:49Z

Actually TryAdd would probably cut a lot of these...

stephentoub · 2016-02-10T13:11:43Z

@benaadams, @rynowak, if the concern is that we need to frequently double-hash, have you considered storing a:

struct StringWithHash
{
    public int HashCode;
    public string Value;
    ...
}

as the key rather than a raw string? You can either compute the hash upfront and store it in the struct, and override GetHashCode to return it, or you can compute the hash lazily in GetHashCode if it wasn't already computed.
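Filling in the elided parts, a minimal sketch of the upfront-hash variant might look like this. The use of ordinal hashing and comparison is an assumption made for the example. The struct hashes the string once in its constructor, and every later dictionary operation on that key reuses the stored integer.

```csharp
using System;
using System.Collections.Generic;

// Key wrapper that hashes its string once, at construction time.
public readonly struct StringWithHash : IEquatable<StringWithHash>
{
    public StringWithHash(string value)
    {
        Value = value ?? throw new ArgumentNullException(nameof(value));
        // Assumption for this sketch: ordinal hashing; the string is hashed only here.
        HashCode = StringComparer.Ordinal.GetHashCode(value);
    }

    public string Value { get; }

    public int HashCode { get; }

    // Cheap integer check first, then the real string comparison.
    public bool Equals(StringWithHash other) =>
        HashCode == other.HashCode && string.Equals(Value, other.Value, StringComparison.Ordinal);

    public override bool Equals(object obj) => obj is StringWithHash other && Equals(other);

    // Dictionary calls this on every lookup and insert; it just returns the stored value.
    public override int GetHashCode() => HashCode;
}

public static class StringWithHashDemo
{
    // Lookup-then-insert still touches the dictionary twice, but the string is hashed once.
    public static void Append(Dictionary<StringWithHash, List<string>> dict, string key, string value)
    {
        var wrapped = new StringWithHash(key);
        if (!dict.TryGetValue(wrapped, out var list))
        {
            list = new List<string>();
            dict[wrapped] = list;
        }
        list.Add(value);
    }
}
```

The upfront variant is used here because a lazily computed hash stored in a struct would be written to a copy of the key and lost, unless the struct is passed around by reference.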

rynowak · 2016-02-10T16:19:20Z

@stephentoub - that introduces yet another tradeoff: on the common-case path we're directly building the dictionary that's going to be exposed to the user as IDictionary<string, StringValues>.

A custom dictionary implementation would get us out of this mess, but I'd rather try literally everything else first.

Do you have any thoughts on the AsyncMethodBuilder overhead? Will using ValueTask<> change anything there or is our best bet to rewrite this as a push parser?

stephentoub · 2016-02-10T16:32:20Z

> Do you have any thoughts on the AsyncMethodBuilder overhead?

Where in Async*MethodBuilder? And is this trace on full framework or CoreCLR?

> Will using ValueTask<> change anything

Depends where the costs are.

rynowak · 2016-02-10T17:04:55Z

Snapshot from some similar data - 10% of execution time reading the form is in AsyncMethodBuilder:

This is x64 CoreCLR

Call Tree

https://github.com/aspnet/HttpAbstractions/blob/dev/src/Microsoft.AspNetCore.WebUtilities/FormReader.cs#L188

There's a parsing loop around async reads of a stream

rynowak · 2016-02-10T17:06:32Z

BTW @benaadams - this benchmark is here https://github.com/aspnet/Performance/tree/dev/testapp/BigModelBinding

npm install -g loadtest to get the driver
dnx web
modify loadtest.ps1 to your liking

stephentoub · 2016-02-10T17:52:36Z

In that case, yes, I'd expect @benaadams' PR at #556 to help.

benaadams · 2016-02-10T21:42:23Z

@stephentoub some thoughts, not conclusions...

Not sure storing the hash code is possible; doesn't Dictionary have randomised string hashing for strings that was specifically introduced for aspnet and user-input, would hashing on an int cause issues?

However, string does look like it uses InternalMarvin32HashString to generate the hashcode so it might be ok; but there is a lot of behaviour determined by #if in this area so am not entirely sure what the code paths are.

I did briefly look previously as to whether string could cache its hash before realising the special relationship between string and m_firstChar would mean it would need changes in the C++ (also didn't know whether it was a viable overhead either adding it to every string - or how it would behave with previously interned strings, or whether dictionary would actually use it).

Not changing the public signature (e.g. returning Dictionary<string, StringValues>) but using Dictionary<StringWithHash, StringValues> internally would mean a double hash in the dictionary conversion and an extra dictionary create for the return.

This could be hidden by implementing the interface IDictionary<string, StringValues>, but it would need an additional Dictionary<StringWithHash, StringValues> return for those areas that want access to the struct enumerator (which is why it's exposed as a Dictionary on the return currently). Then that type would be on the public API :-/

benaadams · 2016-02-11T02:31:26Z

@rynowak it wasn't happy with the wacky date format ;)

So set it to 1st Jan, but have the tests working now

benaadams · 2016-02-11T08:22:33Z

@rynowak I'm getting degraded perf with #556, but there are a lot of changes in it, so I'm investigating.

benaadams · 2016-02-11T09:07:06Z

Hitting disk; due to different Length behaviour with fixed sized MemoryStream, correcting.

benaadams · 2016-02-11T11:08:00Z

Current

With #556 (Buffers + ValueTask)

With both #554 + #556 (+ Accumulator)

So they all give a boost

benaadams · 2016-02-11T11:26:41Z

BufferAsync breakdown

benaadams · 2016-02-11T11:56:20Z

Revisiting the original allocation issue from 30,000 requests (x10 on original)

ValueTask dramatically reduces the Task allocations to around 1% of the current (7MB vs 750MB)

benaadams · 2016-02-11T22:22:55Z

Re: Dictionary hashing on non string types https://github.com/dotnet/coreclr/issues/2279

Tratcher · 2016-04-06T19:03:59Z

Summary:

Switching from Tasks to ValueTasks would help allocations, but requires writing your own state machine at 10x the code and complexity. Maybe Roslyn can do this for us? See the proposal "arbitrary task-like types returned from async methods", dotnet/roslyn#7169.
There are still string hashing issues in the KeyValueAccumulator.

benaadams mentioned this issue Feb 4, 2016

Lower alloc KeyValueAccumulator #554

Closed

benaadams added a commit to benaadams/HttpAbstractions that referenced this issue Feb 7, 2016

Use ValueTask in FormReading

8ff6275

Addressing Task allocations in aspnet#553 Unfortunately this has a return type split between in public api net451 and dotnet5.4

benaadams mentioned this issue Feb 7, 2016

Use System.Buffers + ValueTask in FormReading #556

Closed

benaadams added further commits to benaadams/HttpAbstractions that referenced this issue, each titled "Use ValueTask in FormReading" and carrying the same commit message as above:

28b6cd9 (Feb 7, 2016)
25c7ac3 (Feb 7, 2016)
5c39ce8 (Feb 11, 2016)
0ee149c (Feb 11, 2016)

Tratcher added the Perf label Mar 7, 2016

Tratcher added a commit that referenced this issue Mar 28, 2016

#553 Use System.Buffers for temporary arrays

bd60507

muratg assigned Tratcher Apr 5, 2016

Tratcher added this to the 1.0.0 milestone Apr 6, 2016

Tratcher removed their assignment Apr 6, 2016

ljw1004 mentioned this issue Apr 8, 2016

Proposal: arbitrary task-like types returned from async methods dotnet/roslyn#7169

Closed

Tratcher self-assigned this Apr 28, 2016

Tratcher added the 0 - Backlog label May 17, 2016

ljw1004 mentioned this issue May 19, 2016

Discussion thread for arbitrary async returns dotnet/roslyn#10902

Closed

muratg assigned JunTaoLuo and unassigned Tratcher May 19, 2016

muratg added 1 - Ready and removed 0 - Backlog labels May 19, 2016

JunTaoLuo added 2 - Working 1 - Ready and removed 1 - Ready 2 - Working labels May 23, 2016

muratg assigned pakrym and unassigned JunTaoLuo May 25, 2016

muratg modified the milestones: 1.0.1, 1.0.0 May 26, 2016

pakrym added 2 - Working and removed 1 - Ready labels May 26, 2016

pakrym mentioned this issue May 26, 2016

Another try on form reader #640

Merged

pakrym closed this as completed in #640 May 27, 2016
