Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Encoding.UTF8.GetString / GetChars performance for small inputs #27268

Merged
merged 3 commits into from Nov 9, 2019

Conversation

@GrabYourPitchforks
Copy link
Member

GrabYourPitchforks commented Oct 17, 2019

This improves the performance of Encoding.UTF8.GetString(byte[]) : string and Encoding.UTF8.GetBytes(string) : byte[] by building on the existing JIT devirtualization logic and taking advantage of the case that most inputs to these functions are likely to be small (32 elements or fewer). For small inputs such as these, we already know that the maximum input size fits nicely into a stackalloc, so we can avoid the counting step and move straight to transcoding + the final memcpy.

Method Toolchain Text Mean Error StdDev Median Ratio RatioSD
GetString_FromByteArray master 1.098 ns 0.0619 ns 0.0549 ns 1.075 ns 1.00 0.00
GetString_FromByteArray proto 8.682 ns 0.2368 ns 0.1978 ns 8.664 ns 7.91 0.48
GetByteArray_FromString master 21.109 ns 0.5399 ns 1.4032 ns 20.549 ns 1.00 0.00
GetByteArray_FromString proto 8.081 ns 0.1581 ns 0.1401 ns 8.025 ns 0.36 0.03
GetString_FromByteArray master Hello 23.182 ns 0.3626 ns 0.3028 ns 23.191 ns 1.00 0.00
GetString_FromByteArray proto Hello 18.949 ns 0.4584 ns 1.1997 ns 18.436 ns 0.88 0.06
GetByteArray_FromString master Hello 23.336 ns 0.5520 ns 1.0232 ns 22.987 ns 1.00 0.00
GetByteArray_FromString proto Hello 16.103 ns 0.1553 ns 0.1377 ns 16.098 ns 0.67 0.03
GetString_FromByteArray master Hello world! 27.745 ns 0.6358 ns 0.6803 ns 27.784 ns 1.00 0.00
GetString_FromByteArray proto Hello world! 20.178 ns 0.2909 ns 0.2429 ns 20.168 ns 0.73 0.02
GetByteArray_FromString master Hello world! 26.552 ns 0.6135 ns 1.4935 ns 25.959 ns 1.00 0.00
GetByteArray_FromString proto Hello world! 16.829 ns 0.4128 ns 0.6303 ns 16.582 ns 0.63 0.04
GetString_FromByteArray master Lorem(...)elit. [56] 38.538 ns 0.3517 ns 0.3290 ns 38.418 ns 1.00 0.00
GetString_FromByteArray proto Lorem(...)elit. [56] 38.339 ns 0.4026 ns 0.3766 ns 38.403 ns 0.99 0.01
GetByteArray_FromString master Lorem(...)elit. [56] 34.777 ns 0.8004 ns 1.2695 ns 34.059 ns 1.00 0.00
GetByteArray_FromString proto Lorem(...)elit. [56] 36.436 ns 0.8080 ns 1.3937 ns 35.688 ns 1.05 0.06
GetString_FromByteArray master Nǐhǎo 你好 44.171 ns 0.8742 ns 0.7749 ns 44.235 ns 1.00 0.00
GetString_FromByteArray proto Nǐhǎo 你好 30.719 ns 0.3818 ns 0.3188 ns 30.668 ns 0.70 0.01
GetByteArray_FromString master Nǐhǎo 你好 42.092 ns 0.6221 ns 0.4857 ns 42.092 ns 1.00 0.00
GetByteArray_FromString proto Nǐhǎo 你好 27.424 ns 0.2978 ns 0.2487 ns 27.441 ns 0.65 0.01
GetString_FromByteArray master Γεια σου κόσμε 47.637 ns 0.5525 ns 0.4614 ns 47.535 ns 1.00 0.00
GetString_FromByteArray proto Γεια σου κόσμε 31.119 ns 0.4889 ns 0.4334 ns 31.074 ns 0.65 0.01
GetByteArray_FromString master Γεια σου κόσμε 45.591 ns 0.4591 ns 0.4295 ns 45.518 ns 1.00 0.00
GetByteArray_FromString proto Γεια σου κόσμε 30.522 ns 0.6794 ns 0.8592 ns 30.236 ns 0.67 0.02

Marked WIP because it's not fully tested and I'm trying to figure out if it would make sense to provide additional overloads beyond these two. In my own experience, the two call sites under consideration here are far and away the most commonly used methods.

These methods are overridden on the internal sealed type rather than the UTF8Encoding base type because that type is configurable in unexpected ways. For example, somebody may have configured the UTF8Encoding instance with a custom fallback mechanism, or they may have overridden a virtual method in a manner we're not anticipating. Putting this logic in the internal sealed type instead of the base type works around such potential problems.

It's also possible that we don't want to do this because it represents duplication of logic we'd rather not have. That's understandable - I'm primarily putting this out there to gauge the temperature of the response.

@GrabYourPitchforks

This comment has been minimized.

Copy link
Member Author

GrabYourPitchforks commented Oct 17, 2019

One consideration is that the performance of zero-length byte[] to string is regressed since the early-exit check was removed. Personally I'm fine with this, since zero-length byte[] inputs should be rare, and removing these checks makes the code cleaner. The string ctor is already checking for zero-length buffers and returning the string.Empty singleton, so this won't result in any extra allocations in this case.

Also, an important caveat for the above perf numbers: they rely on mono/linker#792 being fixed. Since UTF8EncodingSealed is an internal nested type, it's affected by that bug. These perf numbers were generated from making UTF8EncodingSealed a non-nested type and running the benchmarks. (This PR leaves the type nested.)

@stephentoub

This comment has been minimized.

Copy link
Member

stephentoub commented Oct 18, 2019

Seems reasonable overall.

@GrabYourPitchforks GrabYourPitchforks marked this pull request as ready for review Oct 18, 2019
@GrabYourPitchforks GrabYourPitchforks changed the title [WIP] Improve Encoding.UTF8.GetString / GetChars performance for small inputs Improve Encoding.UTF8.GetString / GetChars performance for small inputs Oct 18, 2019
@jkotas

This comment has been minimized.

Copy link
Member

jkotas commented Oct 20, 2019

Marking blocked - depends on mono/linker#792 being fixed.

@jkotas
jkotas approved these changes Oct 20, 2019
@GrabYourPitchforks

This comment has been minimized.

Copy link
Member Author

GrabYourPitchforks commented Nov 6, 2019

The mono linker issue is now fixed, currently just waiting for things to flow back to our build system.

@maryamariyan

This comment has been minimized.

Copy link
Member

maryamariyan commented Nov 6, 2019

Thank you for your contribution. As announced in #27549 this repository will be moving to dotnet/runtime on November 13. If you would like to continue working on this PR after this date, the easiest way to move the change to dotnet/runtime is:

  1. In your coreclr repository clone, create patch by running git format-patch origin
  2. In your runtime repository clone, apply the patch by running git apply --directory src/coreclr <path to the patch created in step 1>
@GrabYourPitchforks

This comment has been minimized.

Copy link
Member Author

GrabYourPitchforks commented Nov 7, 2019

Spoke offline with Sven, and we expect Maestro bot to inject the new mono linker into coreclr tonight or early tomorrow morning. I can merge this PR immediately after that goes through.

@sbomer

This comment has been minimized.

Copy link
Member

sbomer commented Nov 8, 2019

The subscription still isn't flowing the linker update even with #27736. @dotnet/dnceng would someone be able to look at why dependency flow from mono/linker to coreclr isn't working? darc update-dependencies works locally. darc trigger-subscription succeeds but still hasn't created a PR.

@riarenas

This comment has been minimized.

Copy link
Member

riarenas commented Nov 8, 2019

I was investigating the missing dependency update PR and I manually reset the subscription's last Applied build and retriggered it to get a fresh set of logs. The retrigger opened #27773.

I'll see if I can find why maestro thought it had already applied the build without opening a PR first. I opened dotnet/arcade#4314 for investigation.

@sbomer Let me know what you want to do with #27773 and #27771

@sbomer

This comment has been minimized.

Copy link
Member

sbomer commented Nov 8, 2019

Thanks @riarenas! I closed #27771 in favor of the maestro update.

@GrabYourPitchforks

This comment has been minimized.

Copy link
Member Author

GrabYourPitchforks commented Nov 9, 2019

Confirmed that the correct codegen is now being produced by these methods. Thank you for your help!

@GrabYourPitchforks GrabYourPitchforks merged commit 7e78595 into dotnet:master Nov 9, 2019
53 checks passed
53 checks passed
WIP Ready for review
Details
coreclr-ci Build #20191018.36 had test failures
Details
coreclr-ci (Build Linux arm checked) Build Linux arm checked succeeded
Details
coreclr-ci (Build Linux arm64 checked) Build Linux arm64 checked succeeded
Details
coreclr-ci (Build Linux arm64 release) Build Linux arm64 release succeeded
Details
coreclr-ci (Build Linux x64 checked) Build Linux x64 checked succeeded
Details
coreclr-ci (Build Linux_musl x64 checked) Build Linux_musl x64 checked succeeded
Details
coreclr-ci (Build Linux_musl x64 release) Build Linux_musl x64 release succeeded
Details
coreclr-ci (Build Linux_rhel6 x64 release) Build Linux_rhel6 x64 release succeeded
Details
coreclr-ci (Build OSX x64 checked) Build OSX x64 checked succeeded
Details
coreclr-ci (Build Test Pri0 CoreFX Linux x64 checked) Build Test Pri0 CoreFX Linux x64 checked succeeded
Details
coreclr-ci (Build Test Pri0 CoreFX Windows_NT x64 checked) Build Test Pri0 CoreFX Windows_NT x64 checked succeeded
Details
coreclr-ci (Build Test Pri0 Linux arm checked) Build Test Pri0 Linux arm checked succeeded
Details
coreclr-ci (Build Test Pri0 Linux arm64 checked) Build Test Pri0 Linux arm64 checked succeeded
Details
coreclr-ci (Build Test Pri0 Linux_musl x64 release) Build Test Pri0 Linux_musl x64 release succeeded
Details
coreclr-ci (Build Test Pri0 OSX x64 checked) Build Test Pri0 OSX x64 checked succeeded
Details
coreclr-ci (Build Test Pri0 R2R OSX x64 checked) Build Test Pri0 R2R OSX x64 checked succeeded
Details
coreclr-ci (Build Test Pri0 R2R Windows_NT x64 checked) Build Test Pri0 R2R Windows_NT x64 checked succeeded
Details
coreclr-ci (Build Test Pri0 R2R Windows_NT x86 checked) Build Test Pri0 R2R Windows_NT x86 checked succeeded
Details
coreclr-ci (Build Test Pri0 Windows_NT arm checked) Build Test Pri0 Windows_NT arm checked succeeded
Details
coreclr-ci (Build Test Pri0 Windows_NT arm64 checked) Build Test Pri0 Windows_NT arm64 checked succeeded
Details
coreclr-ci (Build Test Pri0 Windows_NT x64 checked) Build Test Pri0 Windows_NT x64 checked succeeded
Details
coreclr-ci (Build Test Pri0 Windows_NT x86 checked) Build Test Pri0 Windows_NT x86 checked succeeded
Details
coreclr-ci (Build Windows_NT arm checked) Build Windows_NT arm checked succeeded
Details
coreclr-ci (Build Windows_NT arm release) Build Windows_NT arm release succeeded
Details
coreclr-ci (Build Windows_NT arm64 checked) Build Windows_NT arm64 checked succeeded
Details
coreclr-ci (Build Windows_NT arm64 release) Build Windows_NT arm64 release succeeded
Details
coreclr-ci (Build Windows_NT x64 checked) Build Windows_NT x64 checked succeeded
Details
coreclr-ci (Build Windows_NT x64 debug) Build Windows_NT x64 debug succeeded
Details
coreclr-ci (Build Windows_NT x64 release) Build Windows_NT x64 release succeeded
Details
coreclr-ci (Build Windows_NT x86 checked) Build Windows_NT x86 checked succeeded
Details
coreclr-ci (Build Windows_NT x86 debug) Build Windows_NT x86 debug succeeded
Details
coreclr-ci (Checkout (Unix)) Checkout (Unix) succeeded
Details
coreclr-ci (Checkout (Windows)) Checkout (Windows) succeeded
Details
coreclr-ci (Formatting Linux x64) Formatting Linux x64 succeeded
Details
coreclr-ci (Run Test Pri0 CoreFX Linux x64 checked) Run Test Pri0 CoreFX Linux x64 checked succeeded
Details
coreclr-ci (Run Test Pri0 CoreFX Windows_NT x64 checked) Run Test Pri0 CoreFX Windows_NT x64 checked succeeded
Details
coreclr-ci (Run Test Pri0 Linux arm checked) Run Test Pri0 Linux arm checked succeeded
Details
coreclr-ci (Run Test Pri0 Linux arm64 checked) Run Test Pri0 Linux arm64 checked succeeded
Details
coreclr-ci (Run Test Pri0 Linux x64 checked) Run Test Pri0 Linux x64 checked succeeded
Details
coreclr-ci (Run Test Pri0 Linux_musl x64 checked) Run Test Pri0 Linux_musl x64 checked succeeded
Details
coreclr-ci (Run Test Pri0 Linux_musl x64 release) Run Test Pri0 Linux_musl x64 release succeeded
Details
coreclr-ci (Run Test Pri0 OSX x64 checked) Run Test Pri0 OSX x64 checked succeeded
Details
coreclr-ci (Run Test Pri0 R2R Linux x64 checked) Run Test Pri0 R2R Linux x64 checked succeeded
Details
coreclr-ci (Run Test Pri0 R2R OSX x64 checked) Run Test Pri0 R2R OSX x64 checked succeeded
Details
coreclr-ci (Run Test Pri0 R2R Windows_NT x64 checked) Run Test Pri0 R2R Windows_NT x64 checked succeeded
Details
coreclr-ci (Run Test Pri0 R2R Windows_NT x86 checked) Run Test Pri0 R2R Windows_NT x86 checked succeeded
Details
coreclr-ci (Run Test Pri0 Windows_NT arm checked) Run Test Pri0 Windows_NT arm checked succeeded
Details
coreclr-ci (Run Test Pri0 Windows_NT arm64 checked) Run Test Pri0 Windows_NT arm64 checked succeeded
Details
coreclr-ci (Run Test Pri0 Windows_NT x64 checked) Run Test Pri0 Windows_NT x64 checked succeeded
Details
coreclr-ci (Run Test Pri0 Windows_NT x86 checked) Run Test Pri0 Windows_NT x86 checked succeeded
Details
coreclr-ci (Test crossgen-comparison Linux arm checked) Test crossgen-comparison Linux arm checked succeeded
Details
license/cla All CLA requirements met.
Details
@GrabYourPitchforks GrabYourPitchforks deleted the GrabYourPitchforks:utf8_perf branch Nov 9, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
9 participants
You can’t perform that action at this time.