Developers using JsonSerializer can asynchronously (de)serialize IAsyncEnumerable<T> #1570
You might be able to tell me whether my issue (with a quick repro) is HttpClient- or MVC-related? dotnet/aspnetcore#12883 (comment) |
From @tangkhaiphuong in https://github.com/dotnet/corefx/issues/41378
|
Hi, can you try on Linux as well as Windows? I've got an issue open where I see this on Windows, but it streams when Linux is the host. Cheers, Ross |
For |
@steveharter #1569 has just deserialization APIs - might be good to consolidate these two issues. |
I would love to see this. It's a major perf enhancement for both memory and time-to-first-data for clients (and clients could handle the stream themselves if they wanted, parsing the JSON immediately instead of waiting for the entire response, which is a perf benefit too). The key is to make sure that the entire MVC pipeline can handle the IAsyncEnumerable, so that if you return ActionResult it will stream it to the client. And it should also work on a wrapper object. So if I have: public class BaseResponse { It should serialize this over the wire in a streaming manner to the clients, without ever loading the entire thing into memory, when I return ActionResult and return a base response with an IAsyncEnumerable in it. In other words, MVC needs to stream it instead of just dumping the string produced by the serializer, and the serializer needs to stream the object as it goes, handle the nested IAsyncEnumerable properly, and stream the array out properly. To me that's the acceptance test on this one: stream it out as JSON incrementally in both those cases, and the client can either wait for the entire string and parse it, or grab the stream and deserialize it incrementally with a JSON reader. |
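The wrapper class in the comment above is cut off in this capture, so here is a minimal, hypothetical reconstruction of the idea (`BaseResponse<T>`, `Numbers`, and all property names are illustrative, not from the original comment), written against .NET 6+ System.Text.Json, where async serialization of IAsyncEnumerable properties eventually shipped:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

static async IAsyncEnumerable<int> Numbers()
{
    for (int i = 0; i < 3; i++)
    {
        await Task.Yield(); // simulate items arriving asynchronously
        yield return i;
    }
}

using var stream = new MemoryStream();
await JsonSerializer.SerializeAsync(stream, new BaseResponse<int> { TotalCount = 3, Items = Numbers() });
Console.WriteLine(Encoding.UTF8.GetString(stream.ToArray()));
// With .NET 6+ this prints {"TotalCount":3,"Items":[0,1,2]}

// Hypothetical wrapper DTO: the nested IAsyncEnumerable<T> property should be
// written out as a JSON array element by element, never fully buffered.
public class BaseResponse<T>
{
    public int TotalCount { get; set; }
    public IAsyncEnumerable<T> Items { get; set; }
}
```

The acceptance test described in the comment is exactly this case: the serializer writes the scalar members, then drains the nested sequence incrementally into the output array.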
From @steveharter in #1569:
|
Not saying this isn't a desirable feature, but doesn't This tiny test app shows that
So, yes, support for
|
The purpose of this is the following: a database query is executed, mapped to DTOs as part of the select, and then sent to the client. Right now, the entire recordset is loaded into memory and returned as an IEnumerable, which is then sent on to the client in whatever serialization format you request, and the client may or may not stream that. In this scenario, the database query is returned as an IAsyncEnumerable instead, which streams the data record by record into the serializer, which is in turn streamed out of the HTTP response. With the latter, the response starts sooner, and only the memory for a single record is needed at any given time instead of the entire recordset. Notice how much more efficient and responsive the latter is versus the former? That's what this is about: streaming data from a source to the client without having to load the entire source into memory, which isn't possible right now with ASP.NET Core except for files. Using IAsyncEnumerable, and ensuring that the entire serialization and response pipeline supports it without caching in memory, solves this limitation and in real-world scenarios means a HUGE perf improvement on your most expensive operations. (And since almost all REST calls are reads in most real-world scenarios, this has a huge impact on scalability and responsiveness for most APIs.) |
@JohnGalt1717 not to mention it throws after 8192 records (IIRC) |
That's all true but missing my point. It's not buffering and blowing up because it's This issue's description is:
The premise of this description is that There is a problem, though, in that This code illustrates this:
|
I don't believe that's actually true. I've tried it, and if I return an IAsyncEnumerable from EF Core on the MVC endpoint, it loads the entire thing into memory and then streams it to the client via the serializer. If I have this: public async Task<IAsyncEnumerable> SomeMethod() { It should stream record by record through the pipeline as a streamed, serialized result. It absolutely doesn't do anything of the sort right now; the serializer gets the ENTIRE resultset in memory and then serializes. That's what this ticket is partially about. The other part is making the entire pipeline non-blocking, so that it runs in its own thread the entire way from the EF Core query to the client receiving the data in a stream. And it should do so without me having to jump through hoops in the action result method, and do so based on the Accept header. |
Oh? I tried passing Making something An
It is true that no blocking at all would be even better, and But it is categorically not the case that using

I don't know the internals at all, so this could be entirely wrong, but my speculation is that making the ASP.NET Core controller infrastructure stream its results out instead of buffering them would be a major refactoring and not a simple change at all, requiring that control be inverted across different layers of the flow. Without that change, it is largely irrelevant whether

But what wouldn't be a major refactoring would be to have

As I understand it, the example I showed with |
@logiclrd is right that there are two kinds of "async" happening here:
ASP.NET is currently buffering to avoid having to implement custom array serialization; that support belongs in the JSON serializer (hence this issue). I also agree that for EF it's not as big a deal: if results are produced "quickly" enough, there's blocking only while each item is being produced, not while it's written to the response. That's not great for the general case of streaming, though, as it'll block a thread while waiting for the next result to be produced. |
I don't think ASP.NET is buffering only because of System.Text.Json not supporting |
Okay, I tested it and I was wrong. In my earlier testing, I just didn't wait long enough for it to fill the buffer. The test had a delay between objects returned, and that delay meant the data rate was low enough that it could run for quite some time without producing any output. There is in fact a serious bug, though: If you return an So, maybe the solution in https://github.com/aspnet/AspNetCore/pull/11118/files is wrong. If it used the adapter I showed in my first comment on this issue, then it could pull records from an |
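The adapter referenced here doesn't survive in this capture of the thread. A minimal sketch of the idea it describes, exposing an IAsyncEnumerable<T> as a synchronous IEnumerable<T> by blocking on each element (names are illustrative; .NET 7 later shipped a similar built-in `ToBlockingEnumerable` extension):

```csharp
using System;
using System.Collections.Generic;

static class AsyncEnumerableAdapter
{
    // Hypothetical adapter: pulls items from an IAsyncEnumerable<T> one at a
    // time, blocking the calling thread on each MoveNextAsync. This trades a
    // blocked thread for the ability to feed a synchronous pipeline without
    // buffering the whole sequence in memory.
    public static IEnumerable<T> ToBlocking<T>(this IAsyncEnumerable<T> source)
    {
        IAsyncEnumerator<T> e = source.GetAsyncEnumerator();
        try
        {
            while (e.MoveNextAsync().AsTask().GetAwaiter().GetResult())
                yield return e.Current;
        }
        finally
        {
            e.DisposeAsync().AsTask().GetAwaiter().GetResult();
        }
    }
}
```

With such an adapter, a synchronous serializer could enumerate `source.ToBlocking()` and emit each record as it arrives, at the cost of a blocked thread per request.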
I initially arrived at this issue with ASP.NET Core-coloured glasses, and feel a bit foolish now, as it seems clear that this issue was in fact created for the upstream problem in I see that @steveharter self-assigned this issue back in November, but there haven't been further updates w.r.t. an implementation. I have done some experimentation locally and would like to consider submitting a PR to make System.Text.Json support serializing The preceding issue #1569 talks about deserializing
In order to stream the deserialization of this, the underlying data stream would need to be in two places at once, because there's no way to know if the caller wants I can't think of anything better than to make a |
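The Stream wrapper being proposed isn't spelled out above; one hypothetical shape for it records every consumed byte to a side buffer, so the prefix the serializer has already read can be replayed once it decides how the payload should be interpreted (class and member names are illustrative):

```csharp
using System;
using System.IO;

// Hypothetical sketch of the "stream in two places at once" idea: a read-only
// wrapper that copies each byte it hands out into a side buffer, so the
// consumed prefix can be replayed from the start if a different code path
// later needs to re-read the beginning of the payload.
public sealed class RecordingStream : Stream
{
    private readonly Stream _inner;
    private readonly MemoryStream _recorded = new MemoryStream();

    public RecordingStream(Stream inner) => _inner = inner;

    public override int Read(byte[] buffer, int offset, int count)
    {
        int n = _inner.Read(buffer, offset, count);
        _recorded.Write(buffer, offset, n); // remember everything handed out
        return n;
    }

    // Everything consumed so far, available for replay.
    public byte[] Recorded => _recorded.ToArray();

    public override bool CanRead => true;
    public override bool CanSeek => false;
    public override bool CanWrite => false;
    public override long Length => throw new NotSupportedException();
    public override long Position
    {
        get => throw new NotSupportedException();
        set => throw new NotSupportedException();
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) => throw new NotSupportedException();
    public override void SetLength(long value) => throw new NotSupportedException();
    public override void Write(byte[] buffer, int offset, int count) => throw new NotSupportedException();
}
```

This is the double-buffering cost the comment is pointing at: until the caller's intent is known, every byte read has to be retained as well as consumed.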
@steveharter / @layomia would you mind pinging me once this feature is in? We'll need to teach MVC to take advantage of this feature once it lands. |
Are you assuming only 1 nested level deep ( Supporting all levels is possible, but I think the majority of users would be surprised that the serializer is "faulting in all delayed-loaded IAsyncEnumerable properties", which could write a huge amount of unnecessary and unexpected data to the Stream. In addition, it wouldn't round-trip as expected with the DeserializeAsync* methods. If only 1 nested level is supported, that may be difficult to document and/or discover from a usability perspective. If supporting |
Following feedback from the latest API review, we have decided to pivot on the design: I've been prototyping a JsonConverter implementation for The design comes with a few caveats that are worth noting:
|
Could you explain what the results would be for this type: https://github.com/dotnet/efcore/blob/809fe7a219c99fbe4f3576ef42bd084d4a6ce056/src/EFCore/Query/Internal/EntityQueryable%60.cs#L24-L26 if the call to Serialize looked like so |
Off the top of my head:
|
@pranavkm My analysis is that that type would require special handling, because from @eiriktsarpalis's description, it would simply not be supported at all, and even with my proposal, the presence of members such as |
If the |
I tend to agree. One potential mitigation might be to have users opt into any IAsyncEnumerable serialization using a flag in |
Alternatively, require the user to register the JsonConverter. Either option seems like a better way to mitigate the risk of accidentally streaming too much content while keeping the API consistent. |
Following feedback, I have updated the prototype so that the IAsyncEnumerable converter becomes a public class users can opt into. This can be done either by registering the converter in a JsonSerializerOptions instance:

```csharp
var options = new JsonSerializerOptions { Converters = { new JsonAsyncEnumerableConverter() } };
```

or by annotating relevant properties:

```csharp
public class AsyncEnumerableDto<TElement>
{
    [JsonConverter(typeof(JsonAsyncEnumerableConverter))]
    public IAsyncEnumerable<TElement> Data { get; set; }
}
```

Proposed API:

```csharp
namespace System.Text.Json.Serialization
{
    public class JsonAsyncEnumerableConverter : JsonConverterFactory
    {
        public JsonAsyncEnumerableConverter() { }
    }
}
```

Optional:

```csharp
namespace System.Text.Json
{
    public static class JsonSerializer
    {
        public static IAsyncEnumerable<TValue?> DeserializeAsyncEnumerable<TValue>(Stream utf8Json, JsonSerializerOptions? options = null) { throw null; }
    }
}
```

|
This looks great. MVC will enable this converter by default and we'll document how you could undo it if you absolutely need to, but this should cover all of our 6.0 scenarios.
Sounds practical. |
Enabling it by default sounds risky to me. Users might have DTOs that, unbeknownst to them, implement IAsyncEnumerable in strange ways (e.g. because some base class does it). Wouldn't it make more sense to use a dedicated |
It can be convenient, but there are gotchas involved. Really it would just be a |
We have this problem today, so enabling it by default would be an improvement 😄 |
Doing it across a major version change somewhat makes it more "okay" for this breakage to occur, and it then illuminates this weirdness in the consuming codebases so that it can be sorted out properly. :-) |
I've updated the prototype to include optional support for deserialization.

Usage Examples

Users can opt into IAsyncEnumerable serialization by registering a new converter factory:

```csharp
var options = new JsonSerializerOptions { Converters = { new JsonAsyncEnumerableConverter() } };
```

or by annotating relevant properties:

```csharp
public class AsyncEnumerableDto<TElement>
{
    [JsonConverter(typeof(JsonAsyncEnumerableConverter))]
    public IAsyncEnumerable<TElement> Data { get; set; }
}
```

Serialization Example

```csharp
async IAsyncEnumerable<int> PrintNumbers(int n)
{
    for (int i = 0; i < n; i++) yield return i;
}

Stream stream = Console.OpenStandardOutput();
await JsonSerializer.SerializeAsync(stream, new AsyncEnumerableDto<int> { Data = PrintNumbers(5) }, options); // prints { "Data" : [0,1,2,3,4] }
```

Deserialization Example

Streaming IAsyncEnumerable deserialization can only happen using the dedicated DeserializeAsyncEnumerable method:

```csharp
Stream utf8Json = new MemoryStream(Encoding.UTF8.GetBytes(@"[ { ""value"" : 0 }, { ""value"" : 1 } ]"));
int count = 0;
await foreach (MyPoco item in JsonSerializer.DeserializeAsyncEnumerable<MyPoco>(utf8Json, options))
{
    count += item.Value;
}
```

Additional Information

Proposed API:

```csharp
namespace System.Text.Json.Serialization
{
    public class JsonAsyncEnumerableConverter : JsonConverterFactory
    {
        public JsonAsyncEnumerableConverter() { }
        public JsonAsyncEnumerableConverter(bool supportDeserialization) { }
        public bool SupportsDeserialization { get; }
    }
}
```

Optional:

```csharp
namespace System.Text.Json
{
    public static class JsonSerializer
    {
        public static IAsyncEnumerable<TValue?> DeserializeAsyncEnumerable<TValue>(Stream utf8Json, JsonSerializerOptions? options = null) { throw null; }
    }
}
```

|
```csharp
namespace System.Text.Json
{
    public partial class JsonSerializer
    {
        public static IAsyncEnumerable<TValue?> DeserializeAsyncEnumerable<TValue>(Stream utf8Json, JsonSerializerOptions? options = null);
    }
}
```

|
Would serialization require opting in? |
Both serialization and buffered deserialization would be turned on by default. Note that the value would need to be explicitly of type |
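To illustrate the "explicitly of type IAsyncEnumerable" point, a small sketch against the behavior as it eventually shipped in .NET 6 (the `Numbers` helper is illustrative):

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

static async IAsyncEnumerable<int> Numbers()
{
    for (int i = 0; i < 3; i++)
    {
        await Task.Yield();
        yield return i;
    }
}

// Declaring the value as IAsyncEnumerable<int> (rather than letting the
// compiler-generated iterator type be inferred) is what opts into the
// streaming array serialization.
IAsyncEnumerable<int> values = Numbers();

using var stream = new MemoryStream();
await JsonSerializer.SerializeAsync(stream, values);
Console.WriteLine(Encoding.UTF8.GetString(stream.ToArray())); // prints [0,1,2]
```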
It would serialize to a JSON array, but in a streaming manner. This would be very useful for things like MVC that want to support returning Entity Framework queries (which implement IAsyncEnumerable) to the response stream without buffering the entire enumeration first (which is what is currently being implemented: https://github.com/aspnet/AspNetCore/pull/11118/files).
EDIT @eiriktsarpalis see #1570 (comment) for the full API proposal.