Enable SIMD for JSON API #28937

Open
terrajobst opened this issue Mar 11, 2019 · 10 comments
Labels: area-System.Text.Json, enhancement (product code improvement that does NOT require public API changes/additions), tenet-performance (performance-related issue)

@terrajobst
Member

From @hez2010 on March 9, 2019 9:19

Is your feature request related to a problem? Please describe.

Feature request. Enable SIMD for the new JSON APIs in ASP.NET Core 3.

Describe the solution you'd like

See SimdJsonSharp
With SIMD support, SimdJsonSharp is faster than any existing JSON library on .NET.
I think the new JSON APIs in ASP.NET Core 3 should add SIMD support to improve their performance.

Copied from original issue: dotnet/aspnetcore#8366

@terrajobst
Member Author

Hey @hez2010, that's a great suggestion. We're in the middle of providing a new perf-oriented set of JSON APIs for .NET Core 3.0.

@ahsonkhan, what's your take on this?

@Tornhoof
Contributor

My personal take on this:
If you look at @EgorBo's SimdJsonSharp and the available benchmarks, the structural symbol tape concept shines more the larger the data gets, especially if it is pretty-printed (i.e. there is a lot of whitespace between the relevant symbols). If that's not the case, e.g. with no pretty-printing, most of the time is spent parsing the more complex data types. That's visible in the benchmark: a simpler, "not-as-valid" double parser is easily ten times faster than the corefx version.
Still, I expect SimdJsonSharp (especially if it's further optimized) to be twice as fast (or more) for data sizes of 1 KB and larger, compared to the "normal" approach of a byte-by-byte parser.
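
To make the structural-index idea concrete, here is a minimal scalar sketch in C of a first pass that records only the offsets of structural characters. It is an illustration under stated assumptions, not SimdJsonSharp's or simdjson's actual code: the name `build_structural_index` is hypothetical, the scan is byte-by-byte rather than SIMD, and it does not mark where numbers or literals begin, which a real implementation would also need.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: records the offset of every structural character
 * ({ } [ ] : ,) that lies outside a string, plus each opening quote.
 * Whitespace and string contents never reach the index.
 * Returns the number of entries found; only the first `cap` are stored. */
size_t build_structural_index(const char *json, size_t len,
                              size_t *out, size_t cap)
{
    assert(json != NULL || len == 0);
    size_t n = 0;
    int in_string = 0;

    for (size_t i = 0; i < len; i++) {
        char c = json[i];

        if (in_string) {
            if (c == '\\') {
                i++;                 /* skip the escaped byte */
            } else if (c == '"') {
                in_string = 0;       /* closing quote ends the string */
            }
            continue;
        }

        int structural = (c == '{' || c == '}' || c == '[' ||
                          c == ']' || c == ':' || c == ',' || c == '"');
        if (c == '"') {
            in_string = 1;           /* opening quote starts a string */
        }
        if (structural) {
            if (n < cap) {
                out[n] = i;
            }
            n++;
        }
    }
    return n;
}
```

With this sketch, the compact document `{"a":[1,2]}` and a pretty-printed version of it both produce seven index entries, so a second pass that walks the index does the same amount of work regardless of how much whitespace the input contains. That is one way to read the observation that the approach pays off most on large, pretty-printed input.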

@terrajobst
Member Author

I think we'll never have the option to be forgiving in parsing in the name of perf, due to security concerns. However, I don't see a reason why we shouldn't be speeding up the implementation using SIMD.
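
As an illustration of what "not forgiving" means for numbers, the sketch below checks a token against the number grammar in RFC 8259 before any conversion would happen. It is a hedged example only: `is_strict_json_number` is a hypothetical name, not a System.Text.Json or SimdJsonSharp API, and it validates syntax without producing a value.

```c
#include <assert.h>
#include <stddef.h>

static int is_digit(char c) { return c >= '0' && c <= '9'; }

/* Hypothetical helper: returns 1 if s[0..len) is exactly one number under
 * the RFC 8259 grammar  -? (0 | [1-9][0-9]*) (. [0-9]+)? ([eE] [+-]? [0-9]+)?
 * and 0 otherwise. */
int is_strict_json_number(const char *s, size_t len)
{
    assert(s != NULL || len == 0);
    size_t i = 0;

    if (i < len && s[i] == '-') i++;          /* optional minus sign */
    if (i >= len) return 0;

    if (s[i] == '0') {
        i++;                                  /* a leading zero must stand alone */
    } else if (s[i] >= '1' && s[i] <= '9') {
        while (i < len && is_digit(s[i])) i++;
    } else {
        return 0;
    }

    if (i < len && s[i] == '.') {             /* fraction needs at least one digit */
        i++;
        if (i >= len || !is_digit(s[i])) return 0;
        while (i < len && is_digit(s[i])) i++;
    }

    if (i < len && (s[i] == 'e' || s[i] == 'E')) {   /* exponent needs a digit */
        i++;
        if (i < len && (s[i] == '+' || s[i] == '-')) i++;
        if (i >= len || !is_digit(s[i])) return 0;
        while (i < len && is_digit(s[i])) i++;
    }

    return i == len;                          /* no trailing bytes allowed */
}
```

Inputs such as `01`, `1.`, `.5`, and `+1` are rejected here, while a lenient parser might accept them. Any faster number path would have to keep rejecting them.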

@filipnavara
Member

filipnavara commented Mar 12, 2019

SimdJsonSharp does full validation, including checking for valid UTF-8. However, it is designed around parsing large chunks and may need to be adjusted to provide something more "streamable" and memory-efficient in addition to being fast. I recommend reading the papers about the original simdjson to understand what it does and how it accomplishes it.

@NinoFloris
Contributor

Any news on this?

@tannergooding
Member

I'd defer to @ahsonkhan for his stance; but I would expect we aren't going to look into this for the 3.0 timeframe. Instead, 3.0 will likely aim towards getting the new API shipped and stable and perf improvements (like this) will be investigated for a future release.

@ahsonkhan
Member

but I would expect we aren't going to look into this for the 3.0 timeframe. Instead, 3.0 will likely aim towards getting the new API shipped and stable and perf improvements (like this) will be investigated for a future release.

Precisely.

The SIMD work looks promising, and we should investigate how to incorporate it into the implementation of the System.Text.Json APIs. I'd prefer to defer this work until after we have locked the API design and features for 3.0. If anyone wants to tackle parts of this or work on prototypes that incorporate some of the SIMD effort into the existing Utf8JsonReader, that would definitely be appreciated!

ahsonkhan removed their assignment May 9, 2019
msftgits transferred this issue from dotnet/corefx Feb 1, 2020
msftgits added this to the Future milestone Feb 1, 2020
ahsonkhan removed the help wanted [up-for-grabs] (good issue for external contributors) label Feb 21, 2020

@RobertHenry6bev
Contributor

See the paper https://arxiv.org/pdf/1902.08318.pdf for ideas on using SIMD

See the git repo https://github.com/simdjson/simdjson

@eiriktsarpalis
Member

Next step would be to identify concrete vectorization opportunities in the codebase. STJ already uses vectorization indirectly via the encoding routines in System.Text.Encodings.Web and the Span helper methods.
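
One candidate of that kind is scanning string contents for the next byte that needs attention. The sketch below shows the general shape in portable C, using SWAR (eight bytes per 64-bit word) as a stand-in for real vector instructions. It is an assumption about what such a search could look like, not the code behind the Span helpers; `find_quote_or_backslash` and its helpers are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ONES  UINT64_C(0x0101010101010101)
#define HIGHS UINT64_C(0x8080808080808080)

/* Nonzero if and only if at least one byte of v is zero. */
static uint64_t has_zero_byte(uint64_t v)
{
    return (v - ONES) & ~v & HIGHS;
}

/* Nonzero if and only if at least one byte of v equals b. */
static uint64_t has_byte(uint64_t v, unsigned char b)
{
    return has_zero_byte(v ^ (ONES * b));
}

/* Hypothetical helper: returns the offset of the first '"' or '\\' in
 * s[0..len), or len if neither occurs. Blocks of eight bytes with no
 * match are skipped with a handful of word operations; the exact position
 * is then located by a short scalar scan, so the result does not depend
 * on byte order. */
size_t find_quote_or_backslash(const unsigned char *s, size_t len)
{
    assert(s != NULL || len == 0);
    size_t i = 0;

    while (i + 8 <= len) {
        uint64_t v;
        memcpy(&v, s + i, 8);        /* unaligned-safe load */
        if (has_byte(v, '"') | has_byte(v, '\\')) {
            break;                   /* a match lies in this block */
        }
        i += 8;
    }

    for (; i < len; i++) {
        if (s[i] == '"' || s[i] == '\\') {
            return i;
        }
    }
    return len;
}
```

A hardware SIMD version would follow the same pattern with wider blocks (16 or 32 bytes) and a compare-and-mask step in place of the bit tricks, so measuring where the reader spends time in loops like this is one way to find vectorization opportunities.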

@layomia
Contributor

layomia commented Oct 13, 2022

Triage: we should also assess prior art from other serializers and how they utilize SIMD.
