Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Micro optimizations for serialization #39323

Merged
merged 2 commits into from
Jul 15, 2020
Merged

Conversation

steveharter
Copy link
Member

Primarily this helps with Blazor performance on the Mono interpreter during serialization.

The writer tests are ~1.4x faster and serialization (which uses the writer) are ~1.25x faster. This is an existing benchmark we have been using that represents a large object graph. These timings taken from Edge browser:

Before:
Serialization took 1383 ms	Deserialization took 4074 ms	Writer took 1146 ms

After:
Serialization took 1117 ms	Deserialization took 4055 ms	Writer took 804 ms

The main perf benefit is due to avoiding accessing the static ReadOnlySpan<byte> table when checking to see if we need to escaper each characters. Instead, a local stack variable is made once. This performance may be able to be fixed in the interpreter @BrzVlad; not sure why this is much slower on Mono interpreter, although it could be taking a lock to ensure the static table is properly initialized.

The non-Mono (core) numbers are a bit faster as well (about 2-3%) due to micro optimizations in the writer and serializer code and was not affected by the escaping change mentioned above. Here's a generic writer test:

Before:

|          Method | Formatted | SkipValidation | DataSize |     Mean |   Error |  StdDev |   Median |      Min |      Max |  Gen 0 | Gen 1 | Gen 2 | Allocated |
|---------------- |---------- |--------------- |--------- |---------:|--------:|--------:|---------:|---------:|---------:|-------:|------:|------:|----------:|
|  WriteBasicUtf8 |     False |          False |       10 | 642.9 ns | 4.94 ns | 4.62 ns | 641.7 ns | 636.7 ns | 650.4 ns | 0.0179 |     - |     - |     120 B |
| WriteBasicUtf16 |     False |          False |       10 | 705.0 ns | 5.50 ns | 5.15 ns | 705.6 ns | 697.5 ns | 715.5 ns | 0.0167 |     - |     - |     120 B |

After:
|          Method | Formatted | SkipValidation | DataSize |     Mean |   Error |  StdDev |   Median |      Min |      Max |  Gen 0 | Gen 1 | Gen 2 | Allocated |
|---------------- |---------- |--------------- |--------- |---------:|--------:|--------:|---------:|---------:|---------:|-------:|------:|------:|----------:|
|  WriteBasicUtf8 |     False |          False |       10 | 630.2 ns | 5.96 ns | 5.58 ns | 631.0 ns | 616.8 ns | 636.9 ns | 0.0174 |     - |     - |     120 B |
| WriteBasicUtf16 |     False |          False |       10 | 692.4 ns | 7.38 ns | 6.90 ns | 693.7 ns | 682.4 ns | 705.0 ns | 0.0168 |     - |     - |     120 B |

@steveharter steveharter added this to the 5.0.0 milestone Jul 14, 2020
@steveharter steveharter requested a review from jozkee as a code owner July 14, 2020 21:31
@steveharter steveharter self-assigned this Jul 14, 2020
@@ -12,25 +12,24 @@ public sealed partial class Utf8JsonWriter
{
private void ValidateWritingValue()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not make this an aggressive-inlining method whose implementation is if (!skip) { ValidateWritingValueSlowNonInlined(); }? Then you don't have to change a dozen call sites.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@GrabYourPitchforks Mono interpreter currently can't inline methods with branches, so it would no longer provide any benefit there.

Copy link
Member

@jkotas jkotas Jul 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not feel right to be doing manual inlining and making the code less maintainable just to make it run faster under interpreter.

The interpreter is always going to be a low throughput environment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Avoiding the overhead of calling a no-op method is a normal idiom.

Also the existing writer code already had the same up-front if check before calling ValidateStart() and ValidateEnd() (instead of within the called method) so one way to think about this is extending that existing pattern to ValidateWritingValue. Granted there were only 4 cases of that before and now there's 21.

@BrzVlad
Copy link
Member

BrzVlad commented Jul 15, 2020

While this looks good, I would prefer to wait for #39195, which should get in today. That change should fix most of the overhead with accessing the AllowedList property. I would expect the inlining of NeedsEscaping as that part of this PR to still have some additional benefit, but it's hard to be sure without testing it.

@steveharter
Copy link
Member Author

steveharter commented Jul 15, 2020

I would prefer to wait for #39195, which should get in today

I was hoping to get this in before the Preview 8 snap today. However it makes sense to wait and see if there is still a gain by caching the reference to the static ReadOnlySpan instead of retroactively pulling it out.

@BrzVlad
Copy link
Member

BrzVlad commented Jul 15, 2020

@steveharter The interpreter PR was just merged. I think we could consider the more controversial inlining of NeedsEscape separately in another PR and have the rest of this merged in time for the snap.

@steveharter
Copy link
Member Author

steveharter commented Jul 15, 2020

Test failure in dotnet-runtime-perf unrelated: Crossgen System.Private.Xml.dll in job 0fe31b46-00e8-442a-a19b-8a85a1ac4ea9 has failed.

@steveharter
Copy link
Member Author

Some runtime CI legs seem to be stuck (4.5 hours).

@steveharter steveharter merged commit 246b729 into dotnet:master Jul 15, 2020
@steveharter steveharter deleted the MicroPerf branch July 15, 2020 22:27
@ghost ghost locked as resolved and limited conversation to collaborators Dec 8, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants