Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[API Proposal]: JsonSerializerOptions.UseZeroByteReads #77935

Open
MihaZupan opened this issue Nov 5, 2022 · 9 comments
Open

[API Proposal]: JsonSerializerOptions.UseZeroByteReads #77935

MihaZupan opened this issue Nov 5, 2022 · 9 comments
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Text.Json tenet-performance Performance related issue
Milestone

Comments

@MihaZupan
Copy link
Member

MihaZupan commented Nov 5, 2022

Background and motivation

Zero-byte reads are a technique used in our networking stacks (System.Net, ASP.NET, YARP) to defer renting/allocating memory until there's data available on the socket.
This can greatly reduce memory usage in applications with a lot of concurrent operations.
In the case of YARP, we've seen a 10x reduction in process memory consumption when used with WebSockets.

With Json, this would help in cases where the whole object hasn't been read into memory yet.
An API that would benefit greatly is JsonSerializer.DeserializeAsyncEnumerable which can be long-lived and spend a lot of time just waiting on IO.

Starting with .NET 7.0, the streams returned by HttpClient are able to take advantage of this technique.
System.Net.Http.Json extensions are therefore a prime candidate where we should strongly consider enabling this option by default.

Prior art:

@davidfowl I assume this wouldn't do much in ASP.NET since the HttpContext.Body already does zero-byte reads internally, right? That is unless the json implementation would be able to defer allocating its own buffers on top of using ZBRs on the underlying transport.

API Proposal

namespace System.Text.Json;

public sealed class JsonSerializerOptions
{
    // New property on existing type
    public bool UseZeroByteReads { get; set; }
}

API Usage

private static readonly JsonSerializerOptions s_jsonOptions = new() { UseZeroByteReads = true };

public static IAsyncEnumerable<Foo> DeserializeAsync(Stream stream) =>
    JsonSerializer.DeserializeAsyncEnumerable<Foo>(stream, s_jsonOptions);

Alternative Designs

If the json implementation isn't capable of deferring allocations of its own buffers while waiting for more data, the user could achieve similar performance gains by implementing their own Stream wrapper that performed zero-byte reads and passing that to JsonSerializer instead.

@MihaZupan MihaZupan added api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Text.Json labels Nov 5, 2022
@ghost ghost added the untriaged New issue has not been triaged by the area owner label Nov 5, 2022
@ghost
Copy link

ghost commented Nov 5, 2022

Tagging subscribers to this area: @dotnet/area-system-text-json, @gregsdennis
See info in area-owners.md if you want to be subscribed.

Issue Details

Background and motivation

Zero-byte reads are a technique used in our networking stacks (System.Net, ASP.NET, YARP) to defer renting/allocating memory until there's data available on the socket.
This can greatly reduce memory usage in applications with a lot of concurrent operations.
In the case of YARP, we've seen a 10x reduction in process memory consumption when used with WebSockets.

With Json, this would help in cases where the whole object hasn't been read into memory yet.
An API that would benefit greatly is JsonSerializer.DeserializeAsyncEnumerable which can be long-lived and spend a lot of time just waiting on IO.

Starting with .NET 7.0, the streams returned by HttpClient are able to take advantage of this technique.
System.Net.Http.Json extensions are therefore a prime candidate where we should strongly consider enabling this option by default.

Prior art:

@davidfowl I assume this wouldn't do much in ASP.NET since the HttpContext.Body already does zero-byte reads internally, right? That is unless the json implementation would be able to defer allocating its own buffers on top of using ZBRs on the underlying transport.

API Proposal

namespace System.Text.Json;

public sealed class JsonSerializerOptions
{
    // New property on existing type
    public bool UseZeroByteReads { get; set; }
}

API Usage

private static readonly JsonSerializerOptions s_jsonOptions = new() { UseZeroByteReads = true };

public static IAsyncEnumerable<Foo> DeserializeAsync(Stream stream) =>
    JsonSerializer.DeserializeAsyncEnumerable<Foo>(stream, s_jsonOptions);

Alternative Designs

If the json implementation isn't capable of deferring allocations of its own buffers while waiting for more data, the user could achieve similar performance gains by implementing their own Stream wrapper that performed zero-byte reads and passing that to JsonSerializer instead.

Risks

YARP always uses zero-byte reads on HTTP bodies. We've seen in the past that some implementations of Stream are not aware of the concept of zero-byte reads, causing them to report premature EOF exceptions when reading 0 bytes.

If the user runs into such a scenario, they should fix that Stream implementation, or turn off UseZeroByteReads.

Author: MihaZupan
Assignees: -
Labels:

api-suggestion, area-System.Text.Json

Milestone: -

@krwq
Copy link
Member

krwq commented Nov 8, 2022

@MihaZupan Stream.Read/ReadAsync are documented that zero bytes means EOF (last sentence specifically):
https://learn.microsoft.com/en-us/dotnet/api/system.io.stream.read?view=net-7.0

The total number of bytes read into the buffer. This can be less than the number of bytes allocated in the buffer if that many bytes are not currently available, or zero (0) if the end of the stream has been reached.

If that is no longer true should we change the docs?

IMO we're working correctly here, I think it makes more sense for you to create stream wrapper if you prefer different to default behavior.

Possibly it makes more sense to add that flag to Stream itself?

@stephentoub
Copy link
Member

are documented that zero bytes means

The zero here refers to the input, i.e. asking for zero bytes.

@krwq krwq added tenet-performance Performance related issue api-ready-for-review API is ready for review, it is NOT ready for implementation and removed api-suggestion Early API idea and discussion, it is NOT ready for implementation labels Nov 23, 2022
@krwq krwq added this to the Future milestone Nov 23, 2022
@ghost ghost removed the untriaged New issue has not been triaged by the area owner label Nov 23, 2022
@krwq
Copy link
Member

krwq commented Nov 23, 2022

@MihaZupan I'm marking this as Future for now. If this is impacting your scenario significantly or if you're planning to contribute please let me know and we can bump priority. We will unlikely have time to address this in 8.0 otherwise.

@MihaZupan
Copy link
Member Author

I suppose another alternative design here would be for System.Text.Json to always use zero-byte reads and there is no API to toggle that behavior.

@MihaZupan
Copy link
Member Author

Any thoughts on this @krwq? We might not need an API review.

@MihaZupan MihaZupan added api-suggestion Early API idea and discussion, it is NOT ready for implementation and removed api-ready-for-review API is ready for review, it is NOT ready for implementation labels Jul 27, 2023
@ilmax
Copy link

ilmax commented Jul 27, 2023

I feel like this doesn't belong to the JsonSerializerOptions. It's something too low level compared to what's in the option IMO. As a user I am expecting to find details about serialization convention, case handling and what not there, not low level details about how the serializer allocates memory while reading from a stream.

My suggestion is to always do zero bytes read and eventually, if you really want to have a switch, implement it via AppContext or something alike.

Just my 2 cent :)

@MihaZupan MihaZupan added the untriaged New issue has not been triaged by the area owner label Sep 20, 2023
@jeffschwMSFT jeffschwMSFT removed the untriaged New issue has not been triaged by the area owner label Sep 28, 2023
@MihaZupan
Copy link
Member Author

Any thoughts on this @dotnet/area-system-text-json: #77935 (comment)?

@En3Tho
Copy link
Contributor

En3Tho commented Oct 13, 2023

I agree with @ilmax here. I think if profitable, STJ should just implement this. Zero byte reads are now used in other places in BCL by default.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Text.Json tenet-performance Performance related issue
Projects
None yet
Development

No branches or pull requests

6 participants