More Vector performance #1000

Open
wants to merge 4 commits into
from

Projects

None yet

4 participants

@benaadams
Contributor

Now Unsafe is in nuget, re-opening of (minus CopyBlk, as not sure about perf on that)

#811
#804
#812

@benaadams benaadams changed the title from Vector stuffs to More Vector performance Jul 20, 2016
@halter73
Member

Cool. Maybe we can get the System.Runtime.CompilerServices.Unsafe NuGet package a little above 125 downloads!

Based on the micro benchmarks I did a few months back, this should be about 39% faster than what we have today with some large (~60KB) headers mixed in, and 32% faster for more normal sized (~2.5KB) headers. I'll have to to rerun with this PR to make sure the numbers haven' changed before merging. We'll also run the plaintext benchmark just to make sure nothing is worse.

@halter73 halter73 and 1 other commented on an outdated diff Jul 21, 2016
...Kestrel/Internal/Infrastructure/MemoryPoolIterator.cs
@@ -714,6 +705,33 @@ public MemoryPoolIterator CopyTo(byte[] array, int offset, int count, out int ac
}
}
+ public MemoryPoolIterator Consume(int count, out int actual)
+ {
+ var block = _block;
+ var index = _index;
+ var remaining = count;
+ while (true)
+ {
+ var following = block.End - index;
@halter73
halter73 Jul 21, 2016 Member

While you might not need to use the wasLastBlock pattern since we want to skip it anyway, I don't feel to good about using a different pattern for traversing blocks in this one place.

It certainly risks returning an invalid actual out param. While this might never technically be used today by callers to CopyTo with a null array, I don't think it's OK for it to ever be wrong in case it is used.

@benaadams
benaadams Jul 21, 2016 Contributor

Changed to same pattern

@halter73 halter73 and 1 other commented on an outdated diff Jul 21, 2016
...Kestrel/Internal/Infrastructure/MemoryPoolIterator.cs
@@ -714,6 +705,33 @@ public MemoryPoolIterator CopyTo(byte[] array, int offset, int count, out int ac
}
}
+ public MemoryPoolIterator Consume(int count, out int actual)
@halter73
halter73 Jul 21, 2016 Member

Why public?

@benaadams
benaadams Jul 21, 2016 Contributor

Made private

@benaadams
Contributor
benaadams commented Jul 21, 2016 edited

Rebased the feedback into commit 2 so it stays with the correct commit tag, my git fu is improving

@benaadams
Contributor

There are some crazy cookie sizes in the wild 👍

@halter73
Member

I really want the overall benchmarks for this PR to be better since the microbenchmarks for the header validation showed a nice improvement. At least with the plaintext benchmarks, there doesn't seem to be an improvement. Maybe we'll see an improvement with larger request and response headers.

BTW. Did you microbenchmark Unsafe.Read vs the normal Vector ctor. Obviously Unsafe.Read can have an array to begin with (like with the header strings), but I haven't measured the difference between using that vs the normal ctor with our blocks where we have both the pointer and the array.

benchmarks\src\BenchmarksDriver> dotnet run -- -n plaintext -s http://adx-3u-perfsvr:5001 -c http://adx-3u-svr02:5002 -o  "KestrelHttpServer@benaadams/vector-stuffs-merged"
Project Repository (.NETStandard,Version=v1.0) was previously compiled. Skipping compilation.
Project Benchmarks.ClientJob (.NETStandard,Version=v1.0) was previously compiled. Skipping compilation.
Project Benchmarks.ServerJob (.NETStandard,Version=v1.0) was previously compiled. Skipping compilation.
Project BenchmarksDriver (.NETCoreApp,Version=v1.0) was previously compiled. Skipping compilation.
[01:14:27.419] Starting scenario Plaintext on benchmark server...
[01:14:48.257] Warmup
[01:14:48.266] Starting scenario Plaintext on benchmark client...
[01:14:59.420] Scenario Plaintext completed on benchmark client
[01:14:59.420] RPS: 1005705.16
[01:14:59.421] Stopping scenario Plaintext on benchmark client...
[01:14:59.423] Benchmark
[01:14:59.423] Starting scenario Plaintext on benchmark client...
[01:15:10.467] Scenario Plaintext completed on benchmark client
[01:15:10.468] RPS: 1093077.62
[01:15:10.468] Stopping scenario Plaintext on benchmark client...
[01:15:10.471] Stopping scenario Plaintext on benchmark server...
benchmarks\src\BenchmarksDriver> dotnet run -- -n plaintext -s http://adx-3u-perfsvr:5001 -c http://adx-3u-svr02:5002 -o  "KestrelHttpServer@dev"
Project Repository (.NETStandard,Version=v1.0) was previously compiled. Skipping compilation.
Project Benchmarks.ClientJob (.NETStandard,Version=v1.0) was previously compiled. Skipping compilation.
Project Benchmarks.ServerJob (.NETStandard,Version=v1.0) was previously compiled. Skipping compilation.
Project BenchmarksDriver (.NETCoreApp,Version=v1.0) was previously compiled. Skipping compilation.
[01:15:24.546] Starting scenario Plaintext on benchmark server...
[01:15:45.182] Warmup
[01:15:45.195] Starting scenario Plaintext on benchmark client...
[01:15:56.320] Scenario Plaintext completed on benchmark client
[01:15:56.321] RPS: 1057693.46
[01:15:56.321] Stopping scenario Plaintext on benchmark client...
[01:15:56.323] Benchmark
[01:15:56.323] Starting scenario Plaintext on benchmark client...
[01:16:07.380] Scenario Plaintext completed on benchmark client
[01:16:07.380] RPS: 1129921.75
[01:16:07.381] Stopping scenario Plaintext on benchmark client...
[01:16:07.383] Stopping scenario Plaintext on benchmark server...
@benaadams
Contributor

I'll do some standalone Unsafe vs new

@muratg
Member
muratg commented Aug 17, 2016

@benaadams did you end up doing more benchmarking?

benaadams added some commits Jul 20, 2016
@benaadams benaadams Unsafe.Read<> for faster Vector ctors 315d81a
@benaadams benaadams Extract Consume from CopyTo 0029f4e
@benaadams benaadams Vector header validate
Some headers are pretty big e.g. Cookie and Content-Security-Policy
4847cc2
@benaadams benaadams Rebase fixes
bc4a92d
@benaadams
Contributor

@halter73 @muratg some outliers which might be windows doing something or a quirk :( But mostly good

Pre

Method Mean StdDev Scaled RPS
ParsePlaintext 927.9056 ns 13.4866 ns 1.00 1,077,695.89
ParsePipelinedPlaintext 753.9081 ns 9.5989 ns 0.81 1,326,421.56
ParseLiveAspNet 5,015.1654 ns 71.4417 ns 5.41 199,395.22
ParsePipelinedLiveAspNet 4,684.1782 ns 30.5706 ns 5.05 213,484.62
ParseUnicode 8,448.2189 ns 145.9482 ns 9.11 118,368.15
ParseUnicodePipelined 8,247.4369 ns 95.3297 ns 8.89 121,249.79

Post

Method Mean StdDev Scaled RPS Change
ParsePlaintext 896.9537 ns 16.9184 ns 1.00 1,114,884.80 + 3.45%
ParsePipelinedPlaintext 755.1664 ns 12.9676 ns 0.84 1,324,211.48 - 0.16%
ParseLiveAspNet 4,868.1637 ns 32.2829 ns 5.43 205,416.26 + 3.02%
ParsePipelinedLiveAspNet 4,541.4938 ns 46.2298 ns 5.06 220,191.87 + 3.14%
ParseUnicode 8,435.4590 ns 102.8463 ns 9.41 118,547.19 + 0.15%
ParseUnicodePipelined 8,032.8591 ns 12.0597 ns 8.96 124,488.68 + 2.67%
@benaadams
Contributor

Though it probably needs some more TLC; my understanding of how Vectors and the jit interact has changed considerably since I first did this; and it will likely clash horribly with the MemoryPoolIterator changes #1138

@benaadams
Contributor

Not sure what MemoryPoolIterator will be like post Pipelines so moved the Header Validate to #1301

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment