Update UTF-8 JsonWriter APIs based on previous API feedback #2612

ahsonkhan · 2018-12-07T11:11:45Z

Addressing most of dotnet/apireviews#80 + other changes.

API proposal updated: https://github.com/dotnet/corefx/issues/33552

TODO:

Finish implementation of API skeletons.
Add more tests (especially exceptions and edge cases) - outside this PR.
Add BytesWritten support.
Figure out how to deal with escaping (either implement from scratch or leverage existing APIs) - cc @GrabYourPitchforks - outside this PR.
Fix up and remove the TODOs left in code - outside this PR.
Add state-passing and write sample/test for async support.
Re-evaluate size of JsonWriter and maybe reduce it by packing more info in available bits/custom struct layout. - outside this PR.
XML comments/documentation - outside this PR.
Consider making the type non-generic with a ctor overload that accepts span (or memory) - outside this PR.

cc @steveharter, @stephentoub, @bartonjs, @KrzysztofCwalina

ahsonkhan · 2018-12-07T11:14:14Z

src/System.Text.JsonLab/System/Text/Json/BitStack.cs

+
+namespace System.Text.JsonLab
+{
+    public struct BitStack


Ignore this. It is a copy of the internal API from https://github.com/dotnet/corefx/blob/master/src/System.Text.Json/src/System/Text/Json/BitStack.cs, included here for convenience. It will go away once the code moves to corefx.

ahsonkhan · 2018-12-07T11:16:08Z

src/System.Text.JsonLab/System/Text/Json/JsonConstants.cs

+        public const int MaximumUInt64Length = 20;  // i.e. 18446744073709551615
+        public const int MaximumDoubleLength = 32;  // default (i.e. 'G') TODO: Should it be 22?
+        public const int MaximumSingleLength = 32;  // default (i.e. 'G') TODO: Should it be 13?
+        public const int MaximumDecimalLength = 32; // default (i.e. 'G') TODO: Should it be 31?


What should this be? @tannergooding

I don't understand why some of these are so large: https://github.com/dotnet/coreclr/blob/80c0a0ea2446b665d13e1632422802f4bf208ae5/src/System.Private.CoreLib/shared/System/Number.NumberBuffer.cs#L14

tannergooding · 2018-12-10

The NumberBufferLength values are large enough to represent the exact string representation of any given input, plus one for a rounding digit and one for the null-terminating character.

For example, the input with the longest exact string for a double is the largest subnormal value (just below 2.2250738585072014E-308), whose exact representation has 767 significant digits (there are an additional 307 leading zeros, giving 1074 digits total, but we don't need to track leading or trailing zeros). If a user gives us an input string that contains all 1074 digits of this value, we need to be able to parse that.

If they provide additional digits, we also have to consider them to determine which direction we round (down to this value, or up to the next representable value above it). If the 768th digit is a 5, then we may be in "midpoint-rounding", and we have to continue processing the rest of the string until we find a non-zero digit to determine whether this is actually the midpoint (the rest of the digits in the input were 0, in which case we round up or down to the value with the first bit set) or whether we are slightly above it (one or more of the digits after the 768th were non-zero), in which case we round up.
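
The digit counts above can be sanity-checked with BigInteger. This is only a sketch, not code from the PR: the class and method names are made up for illustration. It relies on the identity m * 2^-1074 = (m * 5^1074) / 10^1074, so the significant digits of a subnormal double are the digits of m * 5^1074 with trailing zeros dropped.

```csharp
using System;
using System.Numerics;

// Hypothetical helper, not part of System.Text.JsonLab.
public static class ExactDigitCounts
{
    // Number of significant decimal digits in the exact expansion of
    // mantissa * 2^-1074, where mantissa is a positive subnormal mantissa.
    public static int SignificantDigits(long mantissa)
    {
        if (mantissa <= 0)
            throw new ArgumentOutOfRangeException(nameof(mantissa));

        // Scaling by 10^1074 turns the value into the integer mantissa * 5^1074.
        BigInteger scaled = mantissa * BigInteger.Pow(5, 1074);

        // Trailing zeros are not significant (they appear when mantissa is even).
        return scaled.ToString().TrimEnd('0').Length;
    }
}
```

Under this scaling, SignificantDigits((1L << 52) - 1), the largest subnormal mantissa, gives 767, while SignificantDigits(1), which corresponds to double.Epsilon, gives 751.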

ahsonkhan · 2018-12-07T11:16:59Z

src/System.Text.JsonLab/System/Text/Json/Utf8JsonWriter.StaticHelpers.cs

+
+        public static OperationStatus EscapeString(ReadOnlySpan<char> value, Span<byte> destination, out int consumed, out int bytesWritten)
+        {
+            //    // TODO:


Intentionally commented out for now.

ahsonkhan · 2018-12-07T11:19:01Z

src/System.Text.JsonLab/System/Text/Json/Utf8JsonWriter.WriteProperties.cs

+                {
+                    ValidateWritingPropertyWithEncoding(propertyName);
+                }
+                WriteStringFormattedWithEncoding(propertyName, value)


Any suggestions on reducing code duplication here (or in general)? The code for Guid, DateTime, and DateTimeOffset is essentially identical (all calling into the Utf8Formatter).
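
One possible direction, sketched here rather than proposed as the final shape: route all three types through a single generic helper that takes the formatter as a delegate. The names FormattedValueHelper, Utf8TryFormat, and WriteQuoted are hypothetical and do not exist in this PR; only the Utf8Formatter.TryFormat overloads are real. The delegate call adds an indirection, so this would need measuring before it replaces the hand-written fast paths.

```csharp
using System;
using System.Buffers.Text;

// Hypothetical helper illustrating one way to share the quoting logic.
public static class FormattedValueHelper
{
    // Assumed delegate shape matching the Utf8Formatter overloads used below
    // (their optional StandardFormat parameter is left at its default).
    public delegate bool Utf8TryFormat<T>(T value, Span<byte> destination, out int bytesWritten);

    // Writes the formatted value surrounded by double quotes and
    // returns the total number of bytes written.
    public static int WriteQuoted<T>(T value, Span<byte> destination, Utf8TryFormat<T> format)
    {
        if (destination.Length < 2)
            throw new ArgumentException("Destination too small.", nameof(destination));

        destination[0] = (byte)'"';

        // Keep the last byte of the destination free for the closing quote.
        Span<byte> body = destination.Slice(1, destination.Length - 2);
        if (!format(value, body, out int written))
            throw new ArgumentException("Destination too small.", nameof(destination));

        destination[1 + written] = (byte)'"';
        return written + 2;
    }

    public static int WriteGuid(Guid value, Span<byte> destination) =>
        WriteQuoted(value, destination,
            (Guid v, Span<byte> d, out int w) => Utf8Formatter.TryFormat(v, d, out w));

    public static int WriteDateTime(DateTime value, Span<byte> destination) =>
        WriteQuoted(value, destination,
            (DateTime v, Span<byte> d, out int w) => Utf8Formatter.TryFormat(v, d, out w));

    public static int WriteDateTimeOffset(DateTimeOffset value, Span<byte> destination) =>
        WriteQuoted(value, destination,
            (DateTimeOffset v, Span<byte> d, out int w) => Utf8Formatter.TryFormat(v, d, out w));
}
```

For example, WriteGuid(Guid.Empty, new byte[64]) writes the 36-byte default ('D') Guid text plus two quotes, 38 bytes in total. The sketch uses the default format for each type; the writer in this PR may pass a specific StandardFormat instead.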

ahsonkhan · 2018-12-07T11:41:32Z

src/System.Text.JsonLab/System/Text/Json/Utf8JsonWriter.WriteArrayValues.cs

+            if (values.Length > 0 && values.Length < (int.MaxValue - 2) / (JsonConstants.MaximumInt64Length + 1))
+            {
+                // Calculated based on the following: '[number0,number1,...,numberN]'
+                int bytesNeeded = 2 + values.Length * (1 + JsonConstants.MaximumInt64Length);


Is there a concern with pessimistically asking IBufferWriter<byte> for more bytes than needed (i.e. get the bytes needed for the worst case)?

ahsonkhan · 2018-12-10

cc @davidfowl, @pakrym - any thoughts on the approach of over-asking IBufferWriter<byte> for the worst-case number of bytes? For instance, if we are writing four single-digit integers as a JSON array, I ask for 4*20 + 3 + 2 = 85 bytes rather than 4 + 3 + 2 = 9 (for example: [1,2,3,4]). This way I don't have to count the digits of each element (at most 20 for an Int64).

Is there a scenario where IBufferWriter has only enough space to fit the actual payload exactly, so that asking for more is incorrect even if we only use what was actually needed?

pakrym · 2018-12-10

Over-asking a little bit is fine, especially considering that IBufferWriter is free to give you less than requested if it doesn't have enough space.
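
To make the trade-off concrete, here is a minimal sketch of the pessimistic approach. It is not the code from this PR: PessimisticArrayWriter is a made-up name, and the sketch assumes GetSpan returns at least the requested size, as ArrayBufferWriter<byte> does. The writer asks for the worst case up front and then calls Advance with only the bytes actually used.

```csharp
using System;
using System.Buffers;
using System.Buffers.Text;

// Hypothetical illustration of over-asking for the worst case.
public static class PessimisticArrayWriter
{
    // Assumed worst case for an Int64: "-9223372036854775808" is 20 bytes.
    private const int MaximumInt64Length = 20;

    public static void WriteArray(IBufferWriter<byte> output, ReadOnlySpan<long> values)
    {
        // '[' and ']' plus, per element, the digits and one separator.
        int bytesNeeded = checked(2 + values.Length * (1 + MaximumInt64Length));

        // Assumption: the returned span is at least bytesNeeded long.
        Span<byte> span = output.GetSpan(bytesNeeded);

        int idx = 0;
        span[idx++] = (byte)'[';
        for (int i = 0; i < values.Length; i++)
        {
            if (i > 0)
                span[idx++] = (byte)',';

            if (!Utf8Formatter.TryFormat(values[i], span.Slice(idx), out int written))
                throw new InvalidOperationException("Destination too small.");
            idx += written;
        }
        span[idx++] = (byte)']';

        // Commit only what was actually written, not what was requested.
        output.Advance(idx);
    }
}
```

With an ArrayBufferWriter<byte> and the values 1, 2, 3, 4, the request is for 86 bytes but only the 9 bytes of [1,2,3,4] are committed.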


ahsonkhan added 10 commits December 4, 2018 15:43

Remove static factory methods.

ea5533d

Add JsonWriterOptions and start a copy with updated APIs.

a0729cc

Add WriteString overloads.

da52411

Add WriteBoolean and WriteNull key-value APIs.

00facae

Add write number (int) APIs.

6860fc9

Split up types into separate files.

a58dafa

Code refactoring and adding more tests, properties, etc.

5dbd898

Add all other WriteNumber overloads and add a test.

bb2540b

Add Guid, Date, and DateTime APIs and tests.

3a286f5

Add skeleton for other APIs and more tests.

cbf3599

ahsonkhan added the area-System.Text.Json label Dec 7, 2018

ahsonkhan self-assigned this Dec 7, 2018

ahsonkhan added 11 commits December 7, 2018 11:46

Remove unnecessary test that was leftover from debugging.

f9dbe0d

Add stream and memory formatters, and add async pipe tests.

acb3268

Remove use of BufferWriter_T

9ea2e6b

Fix typo in if condition.

ff33f3f

Remove GetSpan and use Ensure

71e227f

Undo change from Ensure to GetSpan. Use GetSpan to get local span.

cc13a60

Remove unused previous token type and rename MaxDepth to MaxPossibleDepth

790df8b

Add single value valid and invalid json tests.

598a611

Pass spans by ref instead, especially property names for the fast path.

bf6f45e

Remove WriteRawBytes, finish WriteArray, and add tests.

e1cede3

Fix build and tests.

e58703a

ahsonkhan changed the title ~~[WIP] Update UTF-8 JsonWriter APIs based on previous API feedback~~ Update UTF-8 JsonWriter APIs based on previous API feedback Dec 10, 2018

ahsonkhan merged commit 5de823f into dotnet:master Dec 10, 2018

ahsonkhan deleted the UpdateJsonWriterAPIs branch December 10, 2018 11:07

ahsonkhan mentioned this pull request Dec 21, 2018

Revert S.T.JsonLab TFMs back to netstandard1.1 #2626

Merged


Comment timeline: ahsonkhan (Dec 7, 2018; Dec 10, 2018), tannergooding (Dec 10, 2018), pakrym (Dec 10, 2018)
