Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ValueTask to corefx #15809

Closed
benaadams opened this issue Nov 28, 2015 · 22 comments · Fixed by dotnet/corefx#4857
Closed

Add ValueTask to corefx #15809

benaadams opened this issue Nov 28, 2015 · 22 comments · Fixed by dotnet/corefx#4857
Assignees
Labels
api-approved API was approved in API review, it can be implemented area-System.Threading
Milestone

Comments

@benaadams
Copy link
Member

ValueTask being developed for System.Threading.Tasks.Channels in corefxlab is very valuable on its own and should be included independently of Channels (and earlier).

ReadAsync is defined to return a ValueTask<T>, a new struct type included in this library. ValueTask<T> is a discriminated union of a T and a Task<T>, making it allocation-free for ReadAsync<T> to synchronously return a T value it has available (in contrast to using Task.FromResult<T>, which needs to allocate a Task<T> instance). ValueTask<T> is awaitable, so most consumption of instances will be indistinguishable from with a Task<T>. If a Task<T> is needed, one can be retrieved using ValueTask<T>'s AsTask() method, which will either return the Task<T> it stores internally or will return an instance created from the T it stores.

@stephentoub
Copy link
Member

This is something we plan to do, it's just a matter of when and how. It's in the Channels library right now as a placeholder. You can see there's also a copy of it used internally in corefx, at least in one place.

@stephentoub stephentoub self-assigned this Nov 28, 2015
@davidfowl
Copy link
Member

We're going to do some measurements and get some concrete numbers. This could be huge for us (asp.net) in certain scenarios.

@mgravell
Copy link
Member

Oh hell yes. Would use that. Would be great (although I imagine that ship has sailed) if libs like ADO.NET could use that for scenarios where data is probably available, like:

while(await reader.ReadAsync(cancel)) {
    ...
}

And of course sockets etc (it reminds me of the cheap return option of the non-task async IO API), and streams.

I can't see of any nice way of retrofitting this onto all the existing code without making the APIs ugly, but: yes. So much yes. I spend my life hunting allocations.

@benaadams
Copy link
Member Author

Might loose some benefits with the Stream Async methods needing Task<int> returns; amended my Stream Api proposal https://github.com/dotnet/coreclr/issues/2179 to use ValueTask<int> rather than Task<int>; other non-value Tasks are ok as they can use cached Tasks

@JonHanna
Copy link
Contributor

Glancing through this code is like when you open a present and only then realise you'd been wanting this thing you hadn't known existed, for some time.

@stephentoub
Copy link
Member

Glad you like it :)

@omariom
Copy link
Contributor

omariom commented Nov 29, 2015

👍

@JonHanna
Copy link
Contributor

Are there plans to let async be used with this? Having an async use the non-task-building constructor when it goes through a path that doesn't hit await would be awesome. Having it do so when it does go through await but only ever got a non-task-containing ValueTask even more so.

@svick
Copy link
Contributor

svick commented Nov 30, 2015

@JonHanna I think you can generalize that to just "completed synchronously". That includes not hitting any awaits, awaiting non-task ValueTasks, awaiting already completed Tasks and awaiting any other awaitable whose IsCompleted returned true.

@stephentoub
Copy link
Member

Are there plans to let async be used with this?

No concrete plans, but it's being discussed (cc: @jaredpar, @MadsTorgersen). There is value in doing so. However, using a ValueTask<TResult> instead of a Task<TResult> isn't a pure win. There is overhead associated with it (for example, it's a field bigger, which means you're not only passing back an extra field from a method, if you're awaiting one of these it's likely going to increase the size of an async caller's state machine object by the size of a field), and that's pure overhead without benefit in the case where the operation does complete asynchronously. That means developers really need to think carefully about switching to such a ValueTask<TResult> when exposing it in an API, and if they're optimizing it, they likely also care about other involved overheads. As such, it's likely that a careful developer who cares about such costs would choose to use a non-async method that wraps an async one, with that wrapper doing argument validation, checking for a fast path, etc., e.g.

public ValueTask<Something> BarAsync(string arg)
{
    if (arg == null) throw new ArgumentNullException("arg");
    return CanUseFastPath(arg) ?
        new ValueTask<Something>(ComputeSync(arg)) :
        new ValueTask<Something>(BarAsyncCore(arg));
}

private async Task<Something> BarAsyncCore(string arg) { ... }

That's not to say there isn't value in being able to write:

public ValueTask<Something> BarAsync(string arg)
{
    if (arg == null) throw new ArgumentNullException("arg");
    return BarAsyncCore(arg);
}

private async ValueTask<Something> BarAsynccore(string arg) { ... }

There obviously is. I'm simply highlighting it to point out that ValueTask<TResult> should only be used after careful consideration, and with such careful consideration, you're likely also considering other optimizations that might lead you away from this. And of course there's cost involved in developing such a feature to the language, in developers needing to learn about it, etc.

@JonHanna
Copy link
Contributor

I think you can generalize that to just "completed synchronously"

In terms of desire, certainly. In terms of implementation I could see one being simpler than the other.

That means developers really need to think carefully about switching to such a ValueTask when exposing it in an API, and if they're optimizing it, they likely also care about other involved overheads.

If it's ever attempted and abandoned I'd still love to know how close it came to having the first example you give perform like the second. (If it's ever attempted and adopted I'll likely be doing my own experiments on the results to decide when I should and shouldn't use it anyway).

@stephentoub
Copy link
Member

I would like to add this to corefx as part of a new library: System.Threading.Tasks.Extensions.dll.

Here’s a commit containing the library; I'll submit it as a PR once approved:
stephentoub/corefx@217376f

Here’s the API surface area being added:
https://github.com/stephentoub/corefx/blob/217376f756071bfd124b5408538457f411c7865c/src/System.Threading.Tasks.Extensions/ref/System.Threading.Tasks.Extensions.cs

The motivation for adding ValueTask is performance. We still encourage folks to use Task and Task<TResult> by default as the return type from async methods. However, for cases where the async method returns a Task<TResult>, very frequently returns synchronously, and is invoked on such a hot path that the cost of allocating the Task<TResult> for the synchronous result is prohibitive, ValueTask<TResult> may be used instead. ValueTask<TResult> is a struct-based discrimated union of a Task<TResult> and a TResult. If the method completes successfully synchronously, then it doesn’t need to allocate a Task<TResult>, and can instead just return a ValueTask<TResult> created from the raw TResult. Otherwise, it can return a ValueTask<TResult> created from the Task<TResult>.

Existing examples:

@benaadams
Copy link
Member Author

it's a field bigger, which means you're not only passing back an extra field from a method

If the ValueTask return is always the last statement in each method in the call chain; will this type play well with jit tail call optimizations? Best I can seem to find is from 2009; which isn't RyuJit.

http://blogs.msdn.com/b/clrcodegeneration/archive/2009/05/11/tail-call-improvements-in-net-framework-4.aspx

  • The caller or callee returns a value type.

We changed the JIT to recognize this case as well. Specifically the X64 calling convention dictates that for value types that don’t fit in a register (1,2,4, or 8 byte sized value types), that the caller should pass a ‘hidden’ argument called the return buffer. The CLR 2 JIT would pass a new return buffer to the callee, and then copy the data from the callee’s return buffer to the caller’s return buffer. The CLR 4 runtime now recognizes when it is safe to do so, and then just passes the caller’s return buffer into the callee as its return buffer. This means even when we don’t do a tail call, the code has at least one less copy, and it enables one more case to be ‘easy’!

Which would suggest sizeof(type+ptr) > 8 byte; it wouldn't be copied at each step of the stack unwind at least?

@mgravell
Copy link
Member

I'd trail a tail-call for an unnecessary allocation in a heartbeat...

The bigger problem (for me, at least) is that a lot of the obvious places
to use this are in existing Task-based APIs: sockets, file-streams, ADO.NET
data-readers - the places that have probably read more than one "thing" and
can hand back synchronously most of the time. Without getting this down
into those layers, much of the useful scenarios may be limited.
On 30 Nov 2015 6:00 pm, "Ben Adams" notifications@github.com wrote:

If the ValueTask return is always the last statement in each method in the
call chain; will this type play well with jit tail call optimizations?


Reply to this email directly or view it on GitHub
https://github.com/dotnet/corefx/issues/4708#issuecomment-160706827.

@benaadams benaadams changed the title Add ValueTask to BCL Add ValueTask to corefx Nov 30, 2015
@stephentoub
Copy link
Member

lot of the obvious places to use this are in existing Task-based APIs

Yes. Ths is a niche thing, very valuable in a few cases and unnecessary or not applicable in the majority.

For non-generic Tasks, Task.CompletedTask is perfectly good. For generic Tasks, it's often the case that a cached task of some shape or form is sufficient to cover the majority of cases, e.g. MemoryStream/FileStream/etc.'s ReadAsync frequently returning a Task<int> containing the same result due to being asked for the same amount of data each time and having it available, and thus caching the last successful task it returned to be able to return it again. It's really only for the generic cases where caching isn't reasonable with a hot-path, frequently-synchronously-returning method that this provides value.

(I also really do want folks to think carefully about using this, as it not only has the potential to add overhead, it also complicates the programming model when it's used, at least a bit. But for the more niche cases where it adds value, it often adds a lot of value.)

@benaadams
Copy link
Member Author

returning a Task<int> containing the same result due to being asked for the same amount of data each time and having it available

A weird side-effect of this would be in certain circumstances reading with a smaller buffer; then copying to larger may be more performant as it would allocate less Tasks; rather than reading with the larger buffer which would return different values; or better reading with small windows on the larger buffer.

Though that would probably end up with far stranger code than using a ValueTask with multiple paths.

@stephentoub
Copy link
Member

A weird side-effect of this would be in certain circumstances reading with a smaller buffer; then copying to larger may be more performant as it would allocate less Tasks; rather than reading with the larger buffer which would return different values; or better reading with small windows on the larger buffer.

I didn't quite follow that, but it's often the case that an optimization applies to a particular common pattern and stepping away from that pattern causes the optimization to not apply. That would be the case here as well. It's extremely common to always read from a given stream the same number of bytes each time, and often for you to get back the number you requested (for a stream like MemoryStream, that only doesn't happen if you're done reading, and even with a stream like FileStream it tries to give you the data from a buffer and then tries to read the remaining you asked for from the file), in which case the optimization would apply. If you frequently change the number of bytes you request, then yeah, this particular optimization wouldn't be relevant. There are certainly patterns of access that do that, they're just not the most common in my experience.

@benaadams
Copy link
Member Author

For example Stream.CopyToAsync without a specified buffer size will use _DefaultCopyBufferSize = 81920 which is a pretty sizeable buffer for anything to have completed sync so unless the stream being read from always has the same amount buffered each loop will need to return a different Task with a different value.

So you may experience perf gains by specifying a much smaller buffer size; as it could then return a cached Task<int> on each read.

Assuming the following was true for the read stream (else you'd just be increasing call count for no gain)

  • The underlying stream is continuing to buffer between calls to ReadAsync
  • ReadAsync doesn't wait till the buffer is completely full before returning - e.g. a network type stream I'd hope wouldn't wait for a 82kB buffer to fill before returning

@terrajobst
Copy link
Member

terrajobst commented Dec 4, 2015

Video

No major concerns. We should agree, though, how we name types that provide a struct-based alternative to a class. Should this be a prefix or a suffix?

  • Prefix. Matches how must people read C# code, i.e. struct Foo, ValueFoo.
  • Suffix. Sorts better in IntelliSense, documentation, or any other case where APIs are sorted.

The best location seems to be System.Threading.Tasks. We need to check whether we can make this a partial façade for .NET Framework.

@i3arnon
Copy link
Contributor

i3arnon commented Dec 5, 2015

Another place where this can be very useful is TPL Dataflow as calls to await block.SendAsync(item) on most blocks complete synchronously until the BoundedCapacity is reached and only then does asynchronously blocking becomes relevant.

Also, since TDF is used for concurrency to begin with, the probability of performance being an issue is higher than normal

@stephentoub
Copy link
Member

Another place

But that's also a case where it can and does return a cached task.

@i3arnon
Copy link
Contributor

i3arnon commented Dec 5, 2015

it can and does return a cached task.

Well then... another nice surprise from TPL Dataflow :)

I should have looked at the source first.

@msftgits msftgits transferred this issue from dotnet/corefx Jan 31, 2020
@msftgits msftgits added this to the 1.0.0-rc2 milestone Jan 31, 2020
@dotnet dotnet locked as resolved and limited conversation to collaborators Jan 4, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
api-approved API was approved in API review, it can be implemented area-System.Threading
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants