Utf8JsonWriter API Proposal #27938
Comments
Why is this in the |
Also, why do we need |
I like:
I'd like to see:
I don't really like:
|
Given the reader is already there, we would put the writer there too. We put
Fair point. We already have Utf8JsonReader, so the writer follows suit. We could consider removing
We are trying to merge and do as much work as possible in single API calls for performance, and such an API helps there.
Unlike the reader, WriteComments is an API, so it is difficult to provide an option to disallow or hide it. What should it do if the user sets the default option and still calls it? Throw? If the caller explicitly opted to call the API, they are knowingly violating the RFC (which we can document). I am not sure that hiding it (maybe with EditorBrowsableNever) really helps, and having a setting here only gets in the way without much benefit. |
I don't know your plans for a higher-level serialization API, one which will possibly need some kind of codegen to be performant. That higher-level serializer should then take care of, e.g., precalculating the named array start. I understand that your optimized method could write three bytes directly
Yes I would model it symmetrically in the Reader and Writer and throw if disabled. But I do admit it's more relevant for the Reader side. |
I didn't know the reader was already approved, I see now that this is following some of the namings from that. Yes, I would recommend I don't think the existing APIs in the namespace would cause much confusion. |
API shape based on feedback from the previous review: dotnet/apireviews#82
Questions:
namespace System.Text.Json {
// Using InvalidOperationException instead, to be consistent with XmlWriter.
/*public sealed class JsonWriterException : Exception {
public JsonWriterException(string message);
}*/
public struct JsonWriterOptions {
public bool Indented { get; set; }
public bool SkipValidation { get; set; }
}
public struct JsonWriterState {
public JsonWriterState(JsonWriterOptions options = default(JsonWriterOptions));
public long BytesCommitted { get; }
public long BytesWritten { get; }
public JsonWriterOptions Options { get; }
}
public ref struct Utf8JsonWriter {
public Utf8JsonWriter(IBufferWriter<byte> bufferWriter, JsonWriterState state = default(JsonWriterState));
//public Utf8JsonWriter(Span<byte> outputSpan, JsonWriterState state = default(JsonWriterState));
public long BytesCommitted { get; }
public long BytesWritten { get; }
public int CurrentDepth { get; }
//public JsonWriterState CurrentState { get; } // Changing to a method instead of a property
public JsonWriterState GetCurrentState();
public void Flush(bool isFinalBlock = true);
public void WriteStringArray(ReadOnlySpan<byte> propertyName, ReadOnlySpan<DateTime> values, bool suppressEscaping = false);
public void WriteStringArray(ReadOnlySpan<byte> propertyName, ReadOnlySpan<DateTimeOffset> values, bool suppressEscaping = false);
public void WriteStringArray(ReadOnlySpan<byte> propertyName, ReadOnlySpan<Guid> values, bool suppressEscaping = false);
public void WriteNumberArray(ReadOnlySpan<byte> propertyName, ReadOnlySpan<decimal> values, bool suppressEscaping = false);
public void WriteNumberArray(ReadOnlySpan<byte> propertyName, ReadOnlySpan<double> values, bool suppressEscaping = false);
public void WriteNumberArray(ReadOnlySpan<byte> propertyName, ReadOnlySpan<int> values, bool suppressEscaping = false);
public void WriteNumberArray(ReadOnlySpan<byte> propertyName, ReadOnlySpan<long> values, bool suppressEscaping = false);
public void WriteNumberArray(ReadOnlySpan<byte> propertyName, ReadOnlySpan<float> values, bool suppressEscaping = false);
public void WriteNumberArray(ReadOnlySpan<byte> propertyName, ReadOnlySpan<uint> values, bool suppressEscaping = false);
public void WriteNumberArray(ReadOnlySpan<byte> propertyName, ReadOnlySpan<ulong> values, bool suppressEscaping = false);
public void WriteBooleanArray(ReadOnlySpan<byte> propertyName, ReadOnlySpan<bool> values, bool suppressEscaping = false);
public void WriteStringArrayValue(ReadOnlySpan<DateTime> values);
public void WriteStringArrayValue(ReadOnlySpan<DateTimeOffset> values);
public void WriteStringArrayValue(ReadOnlySpan<Guid> values);
public void WriteNumberArrayValue(ReadOnlySpan<decimal> values);
public void WriteNumberArrayValue(ReadOnlySpan<double> values);
public void WriteNumberArrayValue(ReadOnlySpan<int> values);
public void WriteNumberArrayValue(ReadOnlySpan<long> values);
public void WriteNumberArrayValue(ReadOnlySpan<float> values);
public void WriteNumberArrayValue(ReadOnlySpan<uint> values);
public void WriteNumberArrayValue(ReadOnlySpan<ulong> values);
public void WriteBooleanArrayValue(ReadOnlySpan<bool> values);
public void WriteBoolean(ReadOnlySpan<byte> propertyName, bool value, bool suppressEscaping = false);
public void WriteBoolean(ReadOnlySpan<char> propertyName, bool value, bool suppressEscaping = false);
public void WriteBoolean(string propertyName, bool value, bool suppressEscaping = false);
public void WriteBooleanValue(bool value);
public void WriteCommentValue(ReadOnlySpan<byte> value, bool suppressEscaping = false);
public void WriteCommentValue(ReadOnlySpan<char> value, bool suppressEscaping = false);
public void WriteCommentValue(string value, bool suppressEscaping = false);
public void WriteEndArray();
public void WriteEndObject();
public void WriteNull(ReadOnlySpan<byte> propertyName, bool suppressEscaping = false);
public void WriteNull(ReadOnlySpan<char> propertyName, bool suppressEscaping = false);
public void WriteNull(string propertyName, bool suppressEscaping = false);
public void WriteNullValue();
public void WriteNumber(ReadOnlySpan<byte> propertyName, decimal value, bool suppressEscaping = false);
public void WriteNumber(ReadOnlySpan<byte> propertyName, double value, bool suppressEscaping = false);
public void WriteNumber(ReadOnlySpan<byte> propertyName, int value, bool suppressEscaping = false);
public void WriteNumber(ReadOnlySpan<byte> propertyName, long value, bool suppressEscaping = false);
public void WriteNumber(ReadOnlySpan<byte> propertyName, float value, bool suppressEscaping = false);
public void WriteNumber(ReadOnlySpan<byte> propertyName, uint value, bool suppressEscaping = false);
public void WriteNumber(ReadOnlySpan<byte> propertyName, ulong value, bool suppressEscaping = false);
public void WriteNumber(ReadOnlySpan<char> propertyName, decimal value, bool suppressEscaping = false);
public void WriteNumber(ReadOnlySpan<char> propertyName, double value, bool suppressEscaping = false);
public void WriteNumber(ReadOnlySpan<char> propertyName, int value, bool suppressEscaping = false);
public void WriteNumber(ReadOnlySpan<char> propertyName, long value, bool suppressEscaping = false);
public void WriteNumber(ReadOnlySpan<char> propertyName, float value, bool suppressEscaping = false);
public void WriteNumber(ReadOnlySpan<char> propertyName, uint value, bool suppressEscaping = false);
public void WriteNumber(ReadOnlySpan<char> propertyName, ulong value, bool suppressEscaping = false);
public void WriteNumber(string propertyName, decimal value, bool suppressEscaping = false);
public void WriteNumber(string propertyName, double value, bool suppressEscaping = false);
public void WriteNumber(string propertyName, int value, bool suppressEscaping = false);
public void WriteNumber(string propertyName, long value, bool suppressEscaping = false);
public void WriteNumber(string propertyName, float value, bool suppressEscaping = false);
public void WriteNumber(string propertyName, uint value, bool suppressEscaping = false);
public void WriteNumber(string propertyName, ulong value, bool suppressEscaping = false);
public void WriteNumberValue(decimal value);
public void WriteNumberValue(double value);
public void WriteNumberValue(int value);
public void WriteNumberValue(long value);
public void WriteNumberValue(float value);
public void WriteNumberValue(uint value);
public void WriteNumberValue(ulong value);
public void WriteStartArray();
public void WriteStartArray(ReadOnlySpan<byte> propertyName, bool suppressEscaping = false);
public void WriteStartArray(ReadOnlySpan<char> propertyName, bool suppressEscaping = false);
public void WriteStartArray(string propertyName, bool suppressEscaping = false);
public void WriteStartObject();
public void WriteStartObject(ReadOnlySpan<byte> propertyName, bool suppressEscaping = false);
public void WriteStartObject(ReadOnlySpan<char> propertyName, bool suppressEscaping = false);
public void WriteStartObject(string propertyName, bool suppressEscaping = false);
public void WriteString(ReadOnlySpan<byte> propertyName, DateTime value, bool suppressEscaping = false);
public void WriteString(ReadOnlySpan<byte> propertyName, DateTimeOffset value, bool suppressEscaping = false);
public void WriteString(ReadOnlySpan<byte> propertyName, Guid value, bool suppressEscaping = false);
public void WriteString(ReadOnlySpan<byte> propertyName, ReadOnlySpan<byte> value, bool suppressEscaping = false);
public void WriteString(ReadOnlySpan<byte> propertyName, ReadOnlySpan<char> value, bool suppressEscaping = false);
public void WriteString(ReadOnlySpan<byte> propertyName, string value, bool suppressEscaping = false);
public void WriteString(ReadOnlySpan<char> propertyName, DateTime value, bool suppressEscaping = false);
public void WriteString(ReadOnlySpan<char> propertyName, DateTimeOffset value, bool suppressEscaping = false);
public void WriteString(ReadOnlySpan<char> propertyName, Guid value, bool suppressEscaping = false);
public void WriteString(ReadOnlySpan<char> propertyName, ReadOnlySpan<byte> value, bool suppressEscaping = false);
public void WriteString(ReadOnlySpan<char> propertyName, ReadOnlySpan<char> value, bool suppressEscaping = false);
public void WriteString(ReadOnlySpan<char> propertyName, string value, bool suppressEscaping = false);
public void WriteString(string propertyName, DateTime value, bool suppressEscaping = false);
public void WriteString(string propertyName, DateTimeOffset value, bool suppressEscaping = false);
public void WriteString(string propertyName, Guid value, bool suppressEscaping = false);
public void WriteString(string propertyName, ReadOnlySpan<byte> value, bool suppressEscaping = false);
public void WriteString(string propertyName, ReadOnlySpan<char> value, bool suppressEscaping = false);
public void WriteString(string propertyName, string value, bool suppressEscaping = false);
public void WriteStringValue(DateTime value);
public void WriteStringValue(DateTimeOffset value);
public void WriteStringValue(Guid value);
public void WriteStringValue(ReadOnlySpan<byte> value, bool suppressEscaping = false);
public void WriteStringValue(ReadOnlySpan<char> value, bool suppressEscaping = false);
public void WriteStringValue(string value, bool suppressEscaping = false);
}
} |
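For readers who want to see the call pattern the listing above implies, here is a minimal sketch. It is written against the `Utf8JsonWriter` that eventually shipped in `System.Text.Json`, which is an assumption on my part: that type is a disposable class taking `JsonWriterOptions`, not the `ref struct` with `JsonWriterState` proposed here, and it has no `suppressEscaping` parameters. `WriterUsageSketch` and `BuildPayload` are hypothetical names used only for illustration.

```csharp
using System;
using System.Buffers;
using System.Text;
using System.Text.Json;

public static class WriterUsageSketch
{
    // Writes a small JSON object into an IBufferWriter<byte> and returns it as a string.
    public static string BuildPayload()
    {
        // ArrayBufferWriter<byte> is a simple heap-backed IBufferWriter<byte>.
        var buffer = new ArrayBufferWriter<byte>();

        // Assumption: shipped constructor shape (IBufferWriter<byte>, JsonWriterOptions).
        using (var writer = new Utf8JsonWriter(buffer, new JsonWriterOptions { Indented = false }))
        {
            writer.WriteStartObject();
            writer.WriteString("message", "Hello, World!"); // property name + string value
            writer.WriteNumber("count", 3);                 // property name + number value
            writer.WriteStartArray("values");               // named array start
            writer.WriteNumberValue(1);
            writer.WriteNumberValue(2);
            writer.WriteEndArray();
            writer.WriteEndObject();
            writer.Flush();                                 // commit bytes to the buffer
        }

        return Encoding.UTF8.GetString(buffer.WrittenSpan);
    }
}
```

The property-name overloads (`WriteString`, `WriteNumber`, `WriteStartArray`) write the name and the value or start token in a single call, which is the "do as much work as possible in single API calls" point made earlier in the thread.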
Have you considered making the API more interface-based, providing a reasonable and safe implementation as the default and a performant option as an alternative? Either, to me, should possibly take overloads on either ctors or methods for property validators or value validators, with the defaults being either complex or no-op. Wouldn’t inlining handle the performance case and suck appropriately in the correct case? (I genuinely do not know.)
To me formatting is completely a nice-to-have, though valuable, and I think it adds more value as a base type than as an interface. To me, if intention checking is cheap (well, base syntax in general), do that always. Again, correctness > performance, but you should ideally be able to opt into the "trust me" model regardless of how wrong you are.
I still maintain that I would rather get my bank balance in 20ms rather than 10ms than have it sometimes be off by -100M. If Jeff or Bill or Steve or Satya disagree, please transfer said amount to my bank (it won’t make a difference to you, but somehow it does to my creditors and my kids’ private school; they are just too worried about stupid decimals). I told them so many times that I transferred the exact 99.999 thing they were asking about in the appropriate currency I chose ;) (.s and ,s are TOTALLY the same thing) (hint: I am German :)). I transferred the rounded-up amount in Euros (€100), as I had a hard time worrying about the dumb decimals, and they still complained. Wait, they didn’t really mean $100K in dollars, did they? That would be insane.
I am obviously being ridiculous, but my point is that correctness always beats speed. Opt into speed, choose correctness. Leave it to nitwits to point to “but it is fast”. I really struggle with watching design vids, as I keep thinking: please choose correctness and simplicity first, over performance and mega complexity. I think way too many people publicize edge case scenarios in which C++ and Redis and ... tech fail.
We need a push on if you know nothing about this you will most likely be OK and, if you care, here are your alternatives. I mean this kindly. Correctness > performance, but ideal is opting into either or both or ... You guys genuinely have an impossible job! |
Not sure about this:
Listening to the Design Review to understand the reasoning. |
Where did this come from?
Or TODO in dotnet/corefxlab#2612
Didn't catch it in the design review. It does mean a struct The alternative .ctor taking So it would be control flow via |
Marking as approved based on dotnet/apireviews#82.
@jkotas provided feedback to change JsonWriter to no longer be generic. The usability concern with a generic JsonWriter was not worth the trade off of preventing boxing of a struct
Since we are no longer relying on
That is a good point and was an oversight on my part. Deferring writing directly to a span for now since we need a mechanism to grow the destination span if there is not enough space (removed that ctor). |
That's just a case of wrapping it in a struct to switch from the slowest calling convention to the fastest calling convention (see: TechEmpower/aspnetcore/PlatformBenchmarks), which is faster than the interface calls:
[MethodImpl(MethodImplOptions.AggressiveInlining)]
private static BufferWriter<WriterAdapter> GetWriter(PipeWriter pipeWriter)
=> new BufferWriter<WriterAdapter>(new WriterAdapter(pipeWriter));
private struct WriterAdapter : IBufferWriter<byte>
{
public PipeWriter Writer;
public WriterAdapter(PipeWriter writer) => Writer = writer;
public void Advance(int count) => Writer.Advance(count);
public Memory<byte> GetMemory(int sizeHint = 0) => Writer.GetMemory(sizeHint);
public Span<byte> GetSpan(int sizeHint = 0) => Writer.GetSpan(sizeHint);
}
Which is where a
Making it interface-based closes off this speedup.
Shared generics can inline and then become direct calls rather than interface calls; if it's interface-based then it's always an interface call regardless. If a non-inlined shared generic is significantly slower than an interface dispatch, is that more something for the JIT to resolve or work on, rather than altering the public API for it?
People that need the extra performance will deal with the usability of the generic being worse; people that don't won't be using
In practice, for TechEmpower/.../BenchmarkApplication.Json.cs the change will look like this?
private static void Json(PipeWriter pipeWriter)
{
var writer = GetWriter(pipeWriter);
// HTTP 1.1 OK
writer.Write(_http11OK);
// Server headers
writer.Write(_headerServer);
// Date header
writer.Write(DateHeader.HeaderBytes);
// Content-Type header
writer.Write(_headerContentTypeJson);
// Content-Length header
writer.Write(_headerContentLength);
- var jsonPayload = JsonSerializer.SerializeUnsafe(new JsonMessage { message = "Hello, World!" });
- writer.WriteNumeric((uint)jsonPayload.Count);
- // End of headers
- writer.Write(_eoh);
- // Body
- writer.Write(jsonPayload);
- writer.Commit();
+ // Save location for writing length
+ var lengthWriter = writer;
+ writer.Write(_contentLengthGap);
+ // End of headers
+ writer.Write(_eoh);
+ // Flush to the writer
+ writer.Commit();
+ var jsonWriter = new Utf8JsonWriter(pipeWriter);
+ jsonWriter.WriteStartObject();
+ jsonWriter.WriteString("message", "Hello, World!");
+ jsonWriter.WriteEndObject();
+ jsonWriter.Flush();
+ // Go back and write the length
+ lengthWriter.WriteNumeric((uint)(jsonWriter.BytesCommitted));
} |
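The "reserve a gap, write the body, then go back and fill in the length" idea in the diff above can be sketched in isolation. This is only an illustration under several assumptions: it uses a `MemoryStream` rather than a `PipeWriter`, it uses the shipped `Utf8JsonWriter(Stream)` constructor rather than the proposed one, and the ten-space gap padded with trailing spaces is a hypothetical stand-in for `_contentLengthGap`, whose real contents are not shown in the thread. `ContentLengthSketch` and its members are made-up names.

```csharp
using System;
using System.Globalization;
using System.IO;
using System.Text;
using System.Text.Json;

public static class ContentLengthSketch
{
    // Hypothetical gap width: enough digits for any int-sized body length.
    private const int GapWidth = 10;

    public static string BuildResponse()
    {
        using var stream = new MemoryStream();

        Write(stream, "HTTP/1.1 200 OK\r\nContent-Length: ");
        long gapStart = stream.Position;           // remember where the length goes
        Write(stream, new string(' ', GapWidth));  // placeholder bytes
        Write(stream, "\r\n\r\n");                 // end of headers

        long bodyLength;
        using (var json = new Utf8JsonWriter(stream))
        {
            json.WriteStartObject();
            json.WriteString("message", "Hello, World!");
            json.WriteEndObject();
            json.Flush();
            bodyLength = json.BytesCommitted;      // bytes handed to the stream so far
        }

        // Go back and overwrite the start of the gap with the decimal length;
        // the remaining placeholder spaces stay as trailing whitespace.
        stream.Position = gapStart;
        Write(stream, bodyLength.ToString(CultureInfo.InvariantCulture));

        return Encoding.ASCII.GetString(stream.ToArray());
    }

    private static void Write(Stream stream, string ascii)
    {
        byte[] bytes = Encoding.ASCII.GetBytes(ascii);
        stream.Write(bytes, 0, bytes.Length);
    }
}
```

The point of the pattern is that the body never has to be serialized into a separate temporary buffer just to learn its length; `BytesCommitted` supplies it after the fact.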
Note: If it was generic would share the |
So now we’re forced to box and allocate 😫 |
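To make the boxing concern concrete, here is a small sketch contrasting a generic, constrained call with an interface-typed call on a struct that implements `IBufferWriter<byte>`. `CountingWriter` and `DispatchSketch` are hypothetical types invented for this illustration; they are not part of the proposal or of the benchmark code quoted above.

```csharp
using System;
using System.Buffers;

// Hypothetical struct writer over a fixed array, tracking how many bytes were advanced.
public struct CountingWriter : IBufferWriter<byte>
{
    private readonly byte[] _buffer;
    public int Written;

    public CountingWriter(int capacity)
    {
        _buffer = new byte[capacity];
        Written = 0;
    }

    public void Advance(int count) => Written += count;
    public Memory<byte> GetMemory(int sizeHint = 0) => _buffer.AsMemory(Written);
    public Span<byte> GetSpan(int sizeHint = 0) => _buffer.AsSpan(Written);
}

public static class DispatchSketch
{
    // Generic, constrained call: T is the struct itself, passed by ref, so no boxing
    // occurs and the caller's struct is mutated in place.
    public static void WriteGeneric<T>(ref T writer, byte value) where T : IBufferWriter<byte>
    {
        writer.GetSpan(1)[0] = value;
        writer.Advance(1);
    }

    // Interface-typed call: passing a struct here boxes a copy on the heap,
    // and the calls go through interface dispatch on that copy.
    public static void WriteViaInterface(IBufferWriter<byte> writer, byte value)
    {
        writer.GetSpan(1)[0] = value;
        writer.Advance(1);
    }

    public static (int AfterGeneric, int AfterInterface) Run()
    {
        var writer = new CountingWriter(16);

        WriteGeneric(ref writer, (byte)'{');
        int afterGeneric = writer.Written;      // the original struct advanced

        WriteViaInterface(writer, (byte)'}');   // a boxed copy advanced, not `writer`
        int afterInterface = writer.Written;

        return (afterGeneric, afterInterface);
    }
}
```

The unchanged counter after the interface call is a visible side effect of the box: the allocation and the copy are what the generic wrapper approach avoids, at the cost of the per-type code generation discussed in the replies that follow.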
The perf difference is staggering. I think @ahsonkhan should dig into it and if we cannot remove the overhead, consider going back to a generic writer. |
Is there anything the Jit can do on the constrained shared generics? i.e. why is a regular interface call significantly faster? (If it is?) |
You are measuring CPU cycles on a small micro-benchmark. It is not the only performance metric to worry about. Code size and startup time are important as well. They will show the same staggering difference, in the opposite direction, once you start using the generic code stamping trick. I have had to answer many times (last time yesterday) why libraries using the generic code stamping micro-optimization end up with a large code footprint. FWIW, 100% allocation-free JSON formatting is not a mainline-enough scenario to me to warrant a public .NET API. |
Yes but wouldn't the common usage be via a shared generic? The generic just leaves the door open to other optimizations; whereas the interface shuts them down. |
But then the common usage pays extra overhead of shared generic. |
Is there anything that can be improved on how that works? i.e. here and a lot of other classes are simple generic use where for the non-inlined shared generic the constrained type could be switched for the interface (as suggested doing in the C# above) |
I think the improvement here is to make allocations cheaper or local, so that generic code stamping to avoid allocations becomes irrelevant. Escape analysis, manual annotations to aid escape analysis, local per-request or per-thread heaps, ... |
The code stamping with a wrapper struct also turns the calls into faster direct calls rather than calls via the interface. Utf8JsonWriter/Reader are also the lowest-level APIs, where performance is more sensitive; a more common usage would be an object Utf8Serializer type, which would then use these APIs underneath. If you are using these APIs directly then you probably care about performance more? |
@ahsonkhan, can this issue be closed, or is it still tracking adding additional APIs? |
Nope, the issue can be closed. Any other future APIs/discussions can be tracked separately. |
A JsonWriter API that supports writing UTF-8 encoded data natively with emphasis on high performance and low allocations.
Previous iteration of the APIs
Sample Usage:
From SignalR: https://github.com/ahsonkhan/SignalR/blob/9d4a51d6c1eb7cb2a68b154e107e4265fc804b7d/src/Microsoft.AspNetCore.Http.Connections.Common/NegotiateProtocol.cs#L31
For Reference:
IBufferWriter<T>, which PipeWriter implements.
Notes:
ref struct BufferWriter<T> where T : IBufferWriter<byte>). From @benaadams: See dotnet/corefxlab#2358 (comment) for more details.
IBufferWriter<byte>). If we can get language support to enable that, we could add a span-based factory method.
Questions:
bool escape.
name or propertyName?
System.String, UTF-8 string as ReadOnlySpan<byte>, and raw bytes as ReadOnlySpan<byte> that need to be Base64-encoded? Should the argument name remain value or be explicitly different? The current proposal is to add AsBase64 to the API name to help with overload resolution.
cc @KrzysztofCwalina, @terrajobst, @davidfowl, @steveharter, @joshfree, @benaadams