Feature: CLR Thread Scheduler and Scheduler API (a.k.a. Green Threads) #50796

Closed
3 tasks
RobertBouillon opened this issue Apr 6, 2021 · 63 comments
Labels
area-System.Threading, untriaged

Comments

@RobertBouillon

RobertBouillon commented Apr 6, 2021

Overview

Create an abstraction layer that allows the CLR to run its own thread scheduler.

What problem(s) does this solve?

  1. Deprecate complexity and instability created by the async/await pattern
  2. Improve performance of context-switching

Compatibility

No breaking changes to the API. Backwards-compatible with async/await.

Tasks (Proposed)

  • Design abstract API based on existing OS-threading API
  • Proof-of-Concept for CLR-based thread scheduler
  • Design suite of tests to compare async/await against a CLR threading model

Technical Details

Async/await is an attempt to mitigate the cost of context-switching, but does so by adding significant cost and complexity to software. This proposal is a successor to async/await: an attempt to achieve the same performance gain without the cost (without the language and compiler overhead of the async/await pattern). This proposal is similar to Java's "Green Threads" and Go's "Goroutines."

There is a specific technical challenge that needs to be fleshed out in a PoC first: how to persist part of the stack. During a context switch, the OS swaps out the thread's context, including its stack. This is expensive. If we're performing a user-mode context switch, we only need to persist the stack above our custom thread scheduler. This smaller stack copy is where we gain our performance. Exactly how to do this has been explored by other languages, but hasn't necessarily been solved.

Problems with the current design

Async/await is driven by a form of thread-scheduler that competes directly with the scheduler provided by the OS. While in an "async context", blocking and threading primitives are prohibited. Async/await is prohibited outside of an "async context." Thus, these execution models / contexts are mutually exclusive. Despite being mutually exclusive, it is possible to try to "block" in an async context, or to forcefully await a task in a synchronous context, with potentially catastrophic results.

In addition to the stability and pattern caveats of async/await, methods also have to be duplicated for both execution models (e.g. void Foo and Task FooAsync). Code duplication immediately suggests that some form of abstraction would be beneficial, however it is not possible to write methods that would work in both sync and async contexts (The await keyword provides necessary metadata for state machine generation). If we can achieve the same benefits of async/await without the caveats, we should, but this depends on our ability to create a high-level thread scheduler.

API Suggestions

An abstract API should allow operations to be performed against the currently executing context (similar to SynchronizationContext, but specialized for thread scheduling).

//Threading API
static void Foo() => Thread.Sleep(0);

//Async API
static async Task FooAsync() => await Task.Yield();

//Proposed Abstract API
static void foo() => Scheduler.Yield();

This would allow code to be written agnostic of the threading model: the code doesn't change whether we're using a "green" thread, an OS thread, or an async/await state-machine (presuming the latter is even still possible).

Ideally, most of the classes in the System.Threading namespace would be updated, allowing for advanced thread synchronization - something that's currently not possible with async/await.
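
One way the proposed abstraction could be shaped is sketched below. This is only an illustration under stated assumptions: `Scheduler`, `OsThreadScheduler`, `CountingScheduler`, `Current` and `YieldCore` are hypothetical names that do not exist in .NET, and the sketch covers only the "OS thread" case from the proposal, not a real green-thread scheduler.

```csharp
using System;
using System.Threading;

// Hypothetical abstraction: none of these types exist in .NET.
public abstract class Scheduler
{
    [ThreadStatic] private static Scheduler current;

    // The scheduler that owns the currently executing context.
    public static Scheduler Current
    {
        get => current ?? (current = new OsThreadScheduler());
        set => current = value;
    }

    // Code calls this without knowing which threading model is active.
    public static void Yield() => Current.YieldCore();

    protected abstract void YieldCore();
}

// Assumed default: delegate to the OS scheduler, like Thread.Sleep(0) above.
public sealed class OsThreadScheduler : Scheduler
{
    protected override void YieldCore() => Thread.Sleep(0);
}

// Stand-in for a custom scheduler; it only records that a yield happened.
public sealed class CountingScheduler : Scheduler
{
    public int Yields { get; private set; }
    protected override void YieldCore() => Yields++;
}

public static class Program
{
    // Written once against the abstraction, as in the proposal's example.
    public static void Foo() => Scheduler.Yield();

    public static void Main()
    {
        Foo();                                   // default: yields the OS thread
        var counting = new CountingScheduler();
        Scheduler.Current = counting;            // swap the scheduler for this thread
        Foo();
        Console.WriteLine($"Yields observed: {counting.Yields}");
    }
}
```

The call site in `Foo` stays the same when the scheduler is swapped, which is the property the proposal is after. How a real implementation would capture and restore part of the stack is left open here, as it is in the proposal.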

@HaloFour

HaloFour commented Apr 6, 2021

This would allow code to be written agnostic of the threading model: the code doesn't change whether we're using a "green" thread, an OS thread, or an async/await state-machine (presuming the latter is even still possible).

As mentioned in the previous thread, I don't think this would change much with how async/await works. At best the state machine would be converted to continuation primitives instead. That would be nice, and I think those primitives would be useful in other cases, especially if they can be used to build other forms of coroutines. But beyond that I don't imagine that the runtime would be onboard with trying to make the entire .NET ecosystem safe for green threads or that the C# team would be willing to move away from async/await.

I'd also argue that green threads are not about mitigating the costs of context switching, they're about mitigating the cost of blocking: https://inside.java/2020/08/07/loom-performance/

@CyrusNajmabadi
Member

Async/await is an attempt to mitigate the cost of context-switching,

I don't think this is the case. Indeed, i can't recall any discussions around the design of the features/apis here that brought that into consideration. Can you clarify this idea more to help ground the motivation of the discussion?

@CyrusNajmabadi
Member

Async/await is driven by a form of thread-scheduler

async/await is orthogonal to threads or other concurrency concerns. It's a means of expressing asynchrony and abstracting a programming pattern over it, with internal regard to making asynchronous points both evident in code, and non-blocking by default. It was a design goal of async/await to make such points visually apparent and to not attempt to hide them behind code that looked the same as synchronous threaded code.

the code doesn't change whether we're using a "green" thread, an OS thread, or an async/await state-machine

This is a drawback to me. It's an anti-benefit and it would make many forms of development more challenging IMO.

@RobertBouillon
Author

Can you clarify this idea more to help ground the motivation of the discussion?

I see async/await as a means to avoid blocking. Blocking presents two primary problems - performance (context switches) and deadlocks. I don't see how async/await avoids the latter, so I presume it to be the former.

...with internal regard to making asynchronous points both evident in code, and non-blocking by default.

Can you elaborate on this? What specifically makes a method "asynchronous" and why should it be non-blocking? As a developer, how do I benefit from a method being decorated as asynchronous? Why do I care if it blocks if I depend on its successful execution, anyway?

To be more specific, consider the following code:

int Download(string url) => Int32.Parse(WebClient.Download(url));

async Task<int> DownloadAsync(string url) => Int32.Parse(await WebClient.DownloadAsync(url));

Both methods do the exact same thing. How does the extra syntax required for the async version benefit me?

I presumed this to be a performance optimization. If it's not, I see less of a reason for it to exist. I worry I'm missing something, but every time I come back to it I see traditional threading as simpler / superior to async/await.

@benaadams
Member

Both methods do the exact same thing. How does the extra syntax required for the async version benefit me?

The async one releases the thread of the scheduler (whether UI thread, ThreadPool thread or other) to do other work while it's waiting for the download result.

The first one will lock the thread until it gets the result; freezing the UI or taking the ThreadPool thread out of circulation for no useful work.

On the default scheduler (ThreadPool) for a non-UI app; async/await is already essentially acting as green threads?
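
The difference between the two methods can be sketched roughly as follows. This is a minimal illustration, not the original `WebClient` code: `Thread.Sleep` and `Task.Delay` are assumed stand-ins for the synchronous and asynchronous download, and `ReleaseDemo` is a made-up name.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class ReleaseDemo
{
    // Blocking version: the calling thread sits idle inside Thread.Sleep.
    public static int DownloadBlocking()
    {
        Thread.Sleep(200);               // stand-in for a synchronous download
        return int.Parse("42");
    }

    // Async version: the method returns to its caller at the await, so the
    // thread is free to run other work until the delay completes.
    public static async Task<int> DownloadAsync()
    {
        await Task.Delay(200);           // stand-in for an asynchronous download
        return int.Parse("42");
    }

    public static async Task Main()
    {
        Task<int> pending = DownloadAsync();
        // The caller gets control back before the "download" has finished.
        Console.WriteLine($"Caller keeps running; completed yet: {pending.IsCompleted}");
        Console.WriteLine($"Async result: {await pending}");
        Console.WriteLine($"Blocking result: {DownloadBlocking()}");
    }
}
```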

@CyrusNajmabadi
Member

Both methods do the exact same thing. How does the extra syntax required for the async version benefit me?

One releases the thread to do other things, the other does not. It definitely is a decision I care about when writing a program. I don't want threads going off and doing other work unless I tell it to do that.

@CyrusNajmabadi
Member

Can you elaborate on this? What specifically makes a method "asynchronous"

It returns something 'task like'.

This means it represents a computation that can complete sometime in the future, not when the method returns.

@HaloFour

HaloFour commented Apr 7, 2021

every time I come back to it I see traditional threading as simpler / superior to async/await.

It's not a performance optimization, it's a resource optimization. Threads are expensive and must be paid for upfront in preallocated memory for their stack. Blocking them on I/O creates a bottleneck in the number of concurrent operations you can have in flight.

@RobertBouillon
Author

There are a couple of ways to look at my example:

From an OOP/API perspective, the function Download encapsulates two operations - a download and a parse. As a developer, I've successfully expressed that the parse depends on the download. "How" the thread is suspended while waiting for the download to complete is outside of the scope of the function - it needs not be defined in the body or the signature, nor should it be. We can call this "Separation of Concerns," but I think it's more fundamental OOP - the method signature is independent of the threading model. The method body should define a sequence of operations. Async/await does not add value here - only complexity.

Perhaps more important is the execution: in both cases a sequence of operations depends on external data. In both cases, the state of the currently executing thread is stored and the hardware thread is redirected to another sequence of operations until the data is available. When the data is available, the original sequence is resumed with the newly available data. Async/await is just another way of keeping the hardware thread busy, with the same objective as the OS thread scheduler, only executed in different ways.

So if we were always able to suspend and resume a thread via traditional threading, what's the benefit of async/await?

The async one releases the thread of the scheduler (whether UI thread, ThreadPool thread or other) to do other work while it's waiting for the download result.

This needs to be broken down to better understand my argument. Simplified for argument's sake, a hardware thread is the capacity to execute instructions (i.e. A CPU core). A software thread is a hardware thread and a stack allocation. When you say that a thread is "released to do other work", what you're saying is that the stack is popped down to a common location so it can be reused. This is done to prevent the allocation of another software thread (or the restoration of a suspended one). So this is an optimization.

Whenever I trace the "why" for async/await down to bedrock, I always find optimization to be the underlying reason, specifically with respect to thread context. This being the case, the best way to optimize thread context would be to expressly manage contexts in the CLR. If you need to hit a nail, you find a hammer. Async/await places a significant burden on the developer and API for something that seems to be the domain of the CLR.

On the default scheduler (ThreadPool) for a non-UI app; async/await is already essentially acting as green threads?

Not really because you can't use thread synchronization primitives with async/await (they block). Green threads would restore the ability to use thread synchronization for managing critical sections. Async/await also requires the use of async/await decorations, where green threads could operate without the need for specific method signatures or syntax decorations. A synchronous method would be identical to an asynchronous one. Execution context would always be explicit.
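
For context on the critical-section point: C# does not allow `await` inside a `lock` block, and the commonly used substitute in async code is `SemaphoreSlim.WaitAsync`. A minimal sketch of that pattern follows; `CriticalSection`, `Gate` and `IncrementAsync` are names chosen only for this illustration.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class CriticalSection
{
    // A semaphore with one permit acts as an awaitable mutex.
    private static readonly SemaphoreSlim Gate = new SemaphoreSlim(1, 1);
    private static int counter;

    public static async Task IncrementAsync()
    {
        await Gate.WaitAsync();          // waits without blocking a thread
        try
        {
            int snapshot = counter;
            await Task.Delay(1);         // awaiting inside the protected region
            counter = snapshot + 1;
        }
        finally
        {
            Gate.Release();
        }
    }

    // Runs several increments concurrently and returns the final count.
    public static async Task<int> RunAsync(int workers)
    {
        counter = 0;
        await Task.WhenAll(Enumerable.Range(0, workers).Select(_ => IncrementAsync()));
        return counter;
    }

    public static async Task Main()
    {
        Console.WriteLine($"Final count: {await RunAsync(20)}");
    }
}
```

This covers simple mutual exclusion only. Richer primitives such as `Monitor.Wait`/`Pulse` or reader-writer locks have no direct awaitable counterpart in the base class library, which is closer to the gap described above.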

I don't want threads going off and doing other work unless I tell it to do that.

I agree. For example, I don't want my method's execution to be arbitrarily assigned to a thread in the thread pool without my control, which is what async/await does. When I call a method, I expect the method to be executed sequentially in the current context (the current thread). In the rare cases where I want parallelism, I want to explicitly manage the context of that parallelism. I don't care that a thread could run asynchronously (any method could) - I'm calling it because I want it to run synchronously. If I wanted it to run asynchronously, I'd set up a new context (New thread or pooled thread) and call the method from there. Returning a Task seems unnecessary because most of the time I'm just going to await it, and when I don't, I expect to have to do some work to manage the parallelism. Returning a Task doesn't make managing the parallelism any easier.

@HaloFour

HaloFour commented Apr 7, 2021

Green threads and async/await coroutines are very similar constructs. Both are continuations that capture the execution state within the method. Both allow that continuation to be resumed to continue that execution. Both require that the entire ecosystem around them be redesigned to support them.

The goal of green threads is to make "blocking" cheap, but they can't do this automatically. They can only accomplish this if no method you call actually blocks. Every single one of those potentially-blocking methods has to be written to recognize that it is running within the context of a green thread, to capture the execution state of that green thread into a continuation, to use some kind of notification mechanism to trigger resuming that continuation and to return the OS thread back to the green thread scheduler. If you block a green thread you actually block the underlying OS thread, which brings you right back to square one, except worse as those potentially millions of green threads can only be scheduled by a much smaller number of OS threads. It's quite literally impossible to avoid this. Go was written from the ground up to support this. Java is making the bet that they control the vast majority of native code integrations, so they can handle all of the rewriting themselves, but JNI interfaces are basically SOL. .NET is an infinitely more diverse ecosystem and this would be quite literally impossible to do.

Green threads don't achieve anything that you're suggesting, it's just a different strategy for hiding the fact that you're moving continuations between different OS threads. It's an exceptionally heavy lift on the part of the ecosystem because quite literally every part of it needs to play along. And none of that changes how it works in the language. If anything it makes async/await that much more important to make it clear that your intent is to suspend the green thread and not accidentally block the underlying OS thread.

@CyrusNajmabadi
Member

the method signature is independent of the threading model.

I don't view this as a good thing. If downloading and parsing may happen asynchronously, I want to know so I can plan accordingly from the code that calls it.

Perhaps the caller will decide that that's ok, and that it should just block until that work is done. That's fine, it can make that choice. Perhaps it will also want to yield the thread of execution. If so, that's fine. Perhaps it wants to change the state of the system while that async work is happening, and change it again once complete. That's fine, it can do that. Perhaps it wants to poll for the result for some amount of spins, and then change its strategy after a while. That's fine, it can do that. Perhaps it will decide that it would like to do something else while that is happening, perhaps explicitly concurrently. It can do that too. Etc, etc.

By handing back the entity that allows one to represent the "work that will complete at some point in the future", this code can easily do all of this, as well as easily compose over N other pieces of work like this. You bring up OO, and I would say this is the OO way of representing 'a computation in flight'. I want that to be explicit and easy to work with.
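
A few of the caller-side choices listed above can be sketched like this. It is an illustration only: `ComputeAsync` is a hypothetical operation, and `Task.Delay` is an assumed stand-in for real I/O.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class CallerChoices
{
    // Hypothetical long-running operation; Task.Delay stands in for real I/O.
    public static async Task<int> ComputeAsync(int value, int delayMs)
    {
        await Task.Delay(delayMs);
        return value * 2;
    }

    // Choice 1: compose over N operations and wait for all of them.
    public static async Task<int> SumAllAsync(params int[] values)
    {
        int[] results = await Task.WhenAll(values.Select(v => ComputeAsync(v, 50)));
        return results.Sum();
    }

    // Choice 2: stop waiting after a timeout and change strategy.
    public static async Task<int?> TryWithTimeoutAsync(Task<int> work, int timeoutMs)
    {
        Task finished = await Task.WhenAny(work, Task.Delay(timeoutMs));
        return finished == work ? await work : (int?)null;
    }

    public static async Task Main()
    {
        Console.WriteLine(await SumAllAsync(1, 2, 3));
        Console.WriteLine(await TryWithTimeoutAsync(ComputeAsync(5, 10), 1000) ?? -1);
        // Choice 3: the caller may still decide to block until the work is done.
        Console.WriteLine(ComputeAsync(7, 10).GetAwaiter().GetResult());
    }
}
```

All three choices are made at the call site, on the same returned `Task<int>`, without the callee changing.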

@CyrusNajmabadi
Member

So if we were always able to suspend and resume a thread via traditional threading, what's the benefit of async/await?

To make that suspension apparent. I don't want to code and discover that my code is suspending at certain points without my being aware of it. For example, in an app, I will often want to do something special there (like update ui state). It is extremely relevant to me what work will likely be quick and which I can just call in line, versus work that may depend on io, may take indeterminate time, and which I need to behave differently around.

This is a discussion about whether this stuff could happen implicitly versus through explicit user action. The answer is: yes, it could happen implicitly. It's even something we considered. However, we chose not to make it implicit as we felt that was a worse programming model, and made it harder to solve the sorts of problems that arise in modern apps that want to be responsive while doing potentially expensive and long running work.

@HaloFour

HaloFour commented Apr 7, 2021

I don't want my method's execution to be arbitrarily assigned to a thread in the thread pool without my control

This is what happens with green threads. You don't suspend and resume a thread. You suspend and resume a continuation. That continuation is the "green thread", the captured stack space of the thread at the point it was suspended. When you resume the continuation it will be on whatever OS thread is available, which will almost certainly be a different OS thread than you originally suspended. This will certainly matter in cases of UI apps where Win32 knows nothing about the green threads and will fail if the underlying OS thread is not the same as the UI thread.
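
The same "resume on whatever thread is available" behaviour can already be observed with async/await continuations. The sketch below is an analogy, not a green-thread implementation, and assumes a console app with no SynchronizationContext; `ResumeDemo` and `ObserveAsync` are made-up names. Whether the two IDs actually differ depends on the thread pool at run time.

```csharp
using System;
using System.Threading.Tasks;

public static class ResumeDemo
{
    // Returns the managed thread IDs observed before and after an await.
    public static async Task<(int Before, int After)> ObserveAsync()
    {
        int before = Environment.CurrentManagedThreadId;
        await Task.Delay(50);   // the method is suspended here; no thread is held
        int after = Environment.CurrentManagedThreadId;
        return (before, after);
    }

    public static async Task Main()
    {
        var (before, after) = await ObserveAsync();
        Console.WriteLine($"before await: thread {before}, after await: thread {after}");
    }
}
```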

@RobertBouillon
Author

By handing back the entity that allows one to represent the "work that will complete at some point in the future"

Can't we do this already with Task.Run? Why does this belong in the method signature? This seems like the responsibility of the caller, not the callee. Do you have an example where returning Task provides functionality that is otherwise not possible? Where the callee must manage its call context?

I will often want to do something special there (like update ui state).

In the case of a UI, you don't want the UI thread to do anything except rendering the UI, anyway. Any non-trivial method would belong on a background thread. We can't say that returning a task indicates that the method is non-trivial because it's entirely possible to code a non-trivial method that doesn't block and doesn't return a Task. If the concern is whether or not a method is long-running, returning Task doesn't seem to be the appropriate solution.

@HaloFour
Take away async/await. Forget backwards-compatibility concerns. For the sake of argument, async/await never existed. What have we lost, exactly? What is no longer possible? What is objectively worse or more difficult?

@CyrusNajmabadi
Member

Can't we do this already with Task.Run

No. Task.Run will place the work on the threadpool. Furthermore, it forces me to have to somehow know to do this, instead of the method actually letting me know that this is what is being done. As the consumer of any API, i very much want to know if this will happen so i can plan accordingly at the callsite.

Why does this belong in the method signature?

So that i know this is what is going on at all consumption sites.

This seems like the responsibility of the caller, not the callee.

I cannot take responsibility here unless my callee lets me know this is going on. Furthermore, i don't want them to tell me that this won't happen in one release, and then under the covers they change their impl such that this happens in the future. If this happens in the future, i want their signature to change as i must adapt and react to that new approach they are taking.

Any non-trivial method would belong on a background thread.

How do i know it's non-trivial? If the method signature tells me nothing, literally anything could be non-trivial. This is the reason we went this way, and it's why things like the windows coding guidelines explicitly call out that non-trivial methods have these signatures. It's precisely so that this information is communicated through the type system, and cannot be ignored.

We can't say that returning a task indicates that the method is non-trivial because it's entirely possible to code a non-trivial method that doesn't block and doesn't return a Task

This would violate the coding guidelines for the platform. Conversely though, if you do take a lot of time, you should use Task as part of those guidelines, and now your callers need to deal with that. In a green thread world, if you take a lot of time... then what? How do i, as the caller, know that? If you initially don't take a lot of time, and later on change to take a lot of time, how do i know as a caller that that's the case? How do i appropriately break so that i must react to this?

What is objectively worse or more difficult?

It is more difficult to write responsive apps. It is more difficult to understand what individual lines and blocks of code are doing and if they will have a problematic impact on the app. It is more difficult to ensure that changes to underlying libraries do not negatively impact your app.

@HaloFour

HaloFour commented Apr 7, 2021

@RobertBouillon

Take away async/await. Forget backwards-compatibility concerns. For the sake of argument, async/await never existed. What have we lost, exactly? What is no longer possible? What is objectively worse or more difficult?

  1. You can't forget those concerns. We're not replacing .NET, or Win32, or Mono, or Xamarin, or the millions of lines of third party native code that runs on the .NET framework, or the billions of lines of native code that can be called by the .NET framework. Whatever solution here must work in the existing ecosystem.
  2. Without async/await you end up where you were with TPL and where most async frameworks in most languages are without coroutines, composition via callback methods. Having been there I can state unequivocally that it is substantially worse. This is why async/await is becoming so popular across so many languages.

@RobertBouillon
Author

RobertBouillon commented Apr 7, 2021

As the consumer of any API, i very much want to know if this will happen so i can plan accordingly at the callsite

I'm not following some of what you're saying when you say "this" will happen. What is "this"? I want to be more specific. What is happening that you seek to capture when Task is returned? That the operation / thread has been suspended?

This would violate the coding guidelines for the platform.

Could you link these guidelines? (I looked and couldn't find - I apologize).
So I have an MD5 operation in a tight loop. Stream in, Stream out. You're saying this needs to be rewritten to return a Task<Stream> to conform to guidelines?

@jnm2
Contributor

jnm2 commented Apr 7, 2021

From an OOP/API perspective, the function Download encapsulates two operations - a download and a parse. As a developer, I've successfully expressed that the parse depends on the download. "How" the thread is suspended while waiting for the download to complete is outside of the scope of the function - it needs not be defined in the body or the signature, nor should it be.

From a conceptual perspective, returning Task<int> immediately gives the caller a much richer set of possibilities than blocking until the work is finished and returning int. It's just as much of a conceptual difference as int versus Func<int> would be, or int versus Lazy<int>.

Can't we do this already with Task.Run? Why does this belong in the method signature? This seems like the responsibility of the caller, not the callee.

But the actual I/O operation is naturally described by a callback, not naturally described by blocking. Task.Run would make it async-over-sync-over-async. By returning a Task instance, the callee is exposing the callback-based nature of what it does.
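
The layering described above can be sketched as follows. This is an assumption-laden illustration: `ReadAsync` is a hypothetical stand-in for a callback-based I/O operation, and `Layering` and its method names are invented for the example.

```csharp
using System;
using System.Threading.Tasks;

public static class Layering
{
    // Naturally asynchronous operation; Task.Delay stands in for real I/O.
    public static async Task<string> ReadAsync()
    {
        await Task.Delay(50);
        return "7";
    }

    // Sync-over-async: a thread is blocked until the I/O completes.
    public static string Read() => ReadAsync().GetAwaiter().GetResult();

    // Async-over-sync-over-async: Task.Run returns a Task to the caller, but a
    // thread pool thread is still blocked inside Read() for the whole wait.
    public static Task<int> ParseViaTaskRun() => Task.Run(() => int.Parse(Read()));

    // Directly async: no thread is held while the I/O is pending.
    public static async Task<int> ParseAsync() => int.Parse(await ReadAsync());

    public static async Task Main()
    {
        Console.WriteLine(await ParseViaTaskRun());
        Console.WriteLine(await ParseAsync());
    }
}
```

Both versions hand the caller a `Task<int>` with the same result; they differ in whether a thread is tied up while the operation is in flight.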

@CyrusNajmabadi
Member

CyrusNajmabadi commented Apr 7, 2021

I'm not following some of what you're saying when you say "this" will happen. What is "this"?

Long running work that yields the current thread.

So I have an MD5 operation in a tight loop. Stream in, Stream out. You're saying this needs to be rewritten to return a Task to conform to guidelines?

If it would be expected to take a long time. That would likely be reasonable to do.

https://web.archive.org/web/20120323020957/http://blogs.msdn.com/b/windowsappdev/archive/2012/03/20/keeping-apps-fast-and-fluid-with-asynchrony-in-the-windows-runtime.aspx

These are the most likely candidates to visibly degrade performance if written synchronously (e.g. could likely take longer than 50 milliseconds to execute).

@RobertBouillon
Author

Is this a good time to ask for clear, well-maintained guidelines?

If you're saying that I need to return a Task for any method that could take longer than 50ms, that's absurd. It's absurd because execution time could be dependent on the environment or the input. It means as I develop my software, my API would need to change based on its behavior characteristics (and dependent code refactored). It means I have to wrap synchronous code into a task just in case the caller wants to make it asynchronous, because it might take longer than 50ms. It's absurd because "a long time" is completely subjective. It means I can't create completely abstract interfaces because they depend on implementation. It violates mature and well-established OO principles. I hope this notion remains dead and buried in an old blog because it's 72 flavors of wrong.

I created this presuming that the reason for async/await was performance. What I'm hearing is that async/await facilitates the explicit desire to label methods as "long-running" and manage parallelism around them. Is this correct? If not, could you rephrase it succinctly for me so I can better understand it? And could you clarify if this reflects the .NET team's design objectives? I'm not clear on who's who in this discussion.

@HaloFour

HaloFour commented Apr 7, 2021

I think that the official docs do a fairly good job of describing when and why you'd use async/await:

https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/concepts/async/

Even ignoring coroutines like async/await, you'd return a Task<T> or similar promise-like value in any case where you want to return to the caller before the operation represented by that task has completed. This enables the caller to do something else. This is true in every language that has the concept of futures, and that would be most of them.

@CyrusNajmabadi
Member

CyrusNajmabadi commented Apr 7, 2021

Is this a good time to ask for clear, well-maintained guidelines?

There could be, and likely are, entire books written on the topic :) It's extremely large, and there is a ton of information provided here already. With all docs, it likely could always be beefed up even more. I would have no problem with that :)

@CyrusNajmabadi
Member

CyrusNajmabadi commented Apr 7, 2021

If you're saying that I need to return a Task for any method that could take longer than 50ms, that's absurd.

I don't think it's absurd.

or the input.

Yes. if an input could really cause things to take much more time, that really should be encoded into your sig. Because your caller needs to adapt to that. I don't want to click a button, and have my entire UI freeze because i pass a filename to you that you end up taking 50 seconds on. Yes, you may take 10ms on some files. But if you are going to realistically take that long, then i need to know so i can design my own code accordingly.

It means I can't create completely abstract interfaces because they depend on implementation

Then use Task in those interfaces. It accurately represents that impls may take a long time.

Note: this is not theoretical. I'm currently deprecating an existing Roslyn api that operates synchronously because we need it to be async. This does impact all consumers of this api and they do need to adjust because they should no longer assume that this code is fast.

It's absurd because "a long time" is completely subjective

This sort of 'subjectiveness' has been part of C# from the beginning. For example, we recommend methods vs properties if they would be 'expensive'. What truly defines that? It's subjective. C'est la vie. We generally show by example and trust that through enough experience developers can get a good sense of when: "yes, this should be a property" or "no... a no-arg method makes more sense here". Api design is inherently subjective and involves taking in guidance and knowledge of the problem domain to make sensible decisions.

And, in some cases, this does mean having multiple apis for similar concepts. It's why we have IAsyncEnumerable vs IEnumerable for example.

@CyrusNajmabadi
Member

CyrusNajmabadi commented Apr 7, 2021

And could you clarify if this reflects the .NET team's design objectives? I'm not clear on who's who in this discussion.

I'm a member of the C# language design team. I was heavily involved in the design around async/await and the general principles we had in mind when designing it. What i'm describing is one, of many, considerations that were considered important when we were designing this. :)

It was a design goal to make asynchrony apparent, and to make it easy to compose operations over them.

These statements reflect my own views as well as those of others i know in this space. It may not be universal. At the time we did this, the idea of 'green threads' was on the table and was under consideration. However, due to numerous issues both at the implementation and design levels, we considered it less ideal than the async/await model.

What I'm hearing is that async/await facilitates the explicit desire to label methods as "long-running" and manage parallelism around them

I wouldn't use the word parallelism here necessarily :). But 'composing asynchrony' was certainly a goal.

@RobertBouillon
Author

I appreciate the time. So help me out here -

Code, by its nature, is synchronous; code is a predefined sequence of operations. Functions are also inherently synchronous, with a single entry and exit-point - functions are fundamentally a named sequence of operations with a well-defined input and output. What breaks the simplicity is the concept of a "blocking operation." A method may cause the sequence of operations to become idle, locking up resources associated with the thread.

My understanding of async/await is that, instead of blocking, we now return a Task. This gives the caller the option to suspend execution until the method completes or continue the execution in some way. I find asynchrony to be a misnomer here because what we're really talking about is parallelism. We either await the task, preserving the code's synchronous execution, or we perform other operations, inducing parallelism. Just to be clear, when I say "preserving the code's synchronous execution", I understand that it's technically yielding instead of blocking. For all intents and purposes, though, await exists to provide the illusion of synchronous execution, the same as blocking does.

So when we talk about async/await, we're still talking about parallelism. I've seen this suggested a few times, and even in here, that async/await is orthogonal to threading, and maybe that's true conceptually, but not in practice or as-implemented. Async/await has a hard dependency on a thread scheduler that's defaulted to the thread pool, and it prevents the use of threading primitives. It's more appropriate to say async/await is mutually exclusive to threading, since you no longer have direct control over thread context. You're either pulling your own threads in a sync context or running in an async context, but mixing the two is inadvisable (and against most best practices).

As far as I can tell, async/await is just another way to fork a process. With async/await, any time a Task is returned, you can optionally execute parallel code. With threading, you must explicitly invoke a parallel process. I would argue that threading is superior for many reasons: I have access to advanced and mature synchronization primitives, I have finely-tuned control over thread execution and thread context, and my code executes as-written with no state machines or performance hits from hidden boxing / unboxing. I also don't have to deal with the two dozen gotchas because of async/await voodoo. Oh, and my abstract interfaces don't need two different versions of every method (one sync and one async) to be complete. Threading gives me options - async/await gives me restrictions.

So I find myself pursuing our conversation in search of answers again - that maybe you've already answered? What's the benefit of async/await?

If the goal was to make "asynchrony apparent" or facilitate "asynchrony composition," how was it not possible already with traditional threading? If a WebClient.Download blocks while waiting for a result, there's no asynchrony to be aware of - it's a synchronous operation that's blocking for a result. If I need to do something else while it downloads, I'd put it on another thread. In the case of IAsyncResult WebClient.DownloadAsync, asynchrony was still apparent. There's no obfuscation for async/await to demystify.

Async/await makes sense, conceptually, as an evolution of IAsyncResult, since it replaces callbacks with a hidden state machine. That's cool, and that's where it should end. The fact that async is then needed "all the way down" increases the cost way beyond the benefit.

So maybe I can see await/async for performance-critical I/O operations that return a low-level handle, but "any operation that could be long-running"? That may as well be everything.

@HaloFour

HaloFour commented Apr 7, 2021

If I need to do something else while it downloads, I'd put it on another thread.

Which is a giant waste of resources. Threads are expensive, which is why we end up with pools. Blocking threadpool threads leads to exhaustion. Using threads for the sake of preventing an I/O operation from blocking is totally unnecessary when most I/O operations have OS-level facilities to notify on completion. And creating threads is the easy part; synchronizing the result is complicated. Coroutines like async/await make this kind of development orders of magnitude simpler. Nothing can be simpler than single-threaded synchronous blocking code, but little can be as wasteful. Server-side it tanks throughput. Client-side it renders the UI totally unresponsive. At this point I don't know what the argument is here.

@davidfowl
Member

I think doing a green threads implementation would be a worthwhile and interesting experiment to do in runtimelab. It's difficult, far-reaching and requires work at multiple layers in the stack. I'd file the issue there and discuss what it would actually take in practice (it's much harder than most think when they file issues like this and point to other languages). .NET has constraints other runtimes may not have because of the features we expose.

Anyways, that said, it's not something trivial that can be done with the list of bullets here.

@benaadams
Member

benaadams commented Apr 8, 2021

Using traditional threading caused deadlocks and there was no clear / easy way to secure the critical section.

Use SemaphoreSlim?

await semaphoreSlim.WaitAsync();
try
{
    // Do work
}
finally
{
    semaphoreSlim.Release();
}

You can still call out to a regular method that has a lock if it's fast and isn't itself making any async, I/O or blocking calls.

@CyrusNajmabadi
Member

We even have this extension to make that nicer in Roslyn: :)

https://github.com/dotnet/roslyn/blob/98530991561773ac1fbc2c511f40eab211a11790/src/Compilers/Core/Portable/InternalUtilities/SemaphoreSlimExtensions.cs#L19-L24

This way you can just do:

using var _ = await semaphoreSlim.WaitAsync();
// do work

No need for the try/finally/Release bits :)
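
For readers without the Roslyn source at hand, here is a minimal sketch of how such a helper could look. The built-in `SemaphoreSlim.WaitAsync()` returns a plain `Task`, so the disposable form needs an extension; the names `LockAsync` and `Releaser` below are hypothetical, chosen for illustration, and are not the Roslyn or BCL names.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class SemaphoreSlimLockExtensions
{
    // Hypothetical helper (not a BCL API): waits asynchronously, then hands
    // back a token whose Dispose() releases the semaphore exactly once per use.
    public static async ValueTask<Releaser> LockAsync(
        this SemaphoreSlim semaphore,
        CancellationToken cancellationToken = default)
    {
        await semaphore.WaitAsync(cancellationToken).ConfigureAwait(false);
        return new Releaser(semaphore);
    }

    public readonly struct Releaser : IDisposable
    {
        private readonly SemaphoreSlim _semaphore;

        internal Releaser(SemaphoreSlim semaphore) => _semaphore = semaphore;

        // A default(Releaser) holds no semaphore, so guard against null.
        public void Dispose() => _semaphore?.Release();
    }
}
```

With this in scope, `using var _ = await gate.LockAsync();` holds the semaphore until the end of the enclosing block. The token should be disposed only once; disposing a copy a second time would release the semaphore twice.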

@HaloFour

HaloFour commented Apr 8, 2021

@benaadams

C++ got async coroutines (co_await) in C++20, so one more on that list.

AFAICT Java Loom won't run under virtual threads by default, so there would likely need to be some modest user code changes at least during bootstrap. I imagine frameworks like Spring will likely handle that under the hood possibly enabled by config.

@davidfowl

I think it'd be awesome if the CLR offered an API for capturing/resuming a continuation. As with Loom I think that would be the first step towards exploring green threads and also offers primitives that would make it substantially easier for languages (and perhaps libraries) to add coroutines without having to build their own state machines.

@jnm2
Contributor

jnm2 commented Apr 8, 2021

For example, I had an ASP.NET site that required a JWT security token to access network resources. It expired every few minutes and needed to be refreshed. I created a DI service to periodically fetch a new token. Using traditional threading caused deadlocks and there was no clear / easy way to secure the critical section. I ended up writing a custom Spin Lock using Task.Yield. Outside of async/await, this was trivial with a ReaderWriterLockSlim. This underpins my request in this issue for an abstract API for threading primitives, and is just one of many examples where async/await has made complex a task that would have otherwise been simple.

I've been in a very similar scenario and made it async all the way down. It's not the most obvious to implement, but it is possible and it is the most true to the reality of what is happening. SemaphoreSlim.WaitAsync is one way to do an async mutex.

@RobertBouillon
Author

RobertBouillon commented Apr 8, 2021

Use SemaphoreSlim?

Any thoughts on making thread synchronization more consistent between sync and async code?

More of an observation than a question I immediately care about - why would I use SemaphoreSlim instead of lock? My access pattern warranted a Reader/Writer Lock, which isn't available in asynchronous code because the current implementation is thread-affine. Another question I have that I don't immediately care about is - if I should avoid blocking in an async method, why is it okay to use lock, as @CyrusNajmabadi has pointed out? Does this mean I can also use wait handles? Is the concern really just rooted in thread pool exhaustion?

C# has always been easy for many of the reasons a good UI is easy - because the design is intuitive. If I needed a Reader Writer Lock, there was a class by that name in the Threading namespace. I find async has made C# unintuitive - many of the resources that were natural fits or otherwise standard practice don't apply when coding asynchronously. Is there a way that can be addressed?

My first thought was abstraction - which is why I mentioned it in this issue. Perhaps provide a common API that provides threading primitives that work in both contexts? Maybe a stop-gap would be an MSDN article on thread synchronization in async? Some analysis and guidance on threading primitives in async could provide a blueprint for a more complete & intuitive solution. Stephen Cleary has something of use here - could / should this be integrated into the BCL?

@CyrusNajmabadi
Member

More of an observation than a question I immediately care about - why would I use SemaphoreSlim instead of lock?

They have different semantics. SemaphoreSlim provides semaphore semantics. lock provides Monitor semantics.

Examples of where this matters are things like a monitor allowing reentrance for the same thread, while a semaphore does not care about threads and will let anything enter as long as the count is there (or will block if it is not).
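
The reentrancy difference can be sketched with a small test. This is an illustrative example rather than anything from the thread; the class and method names are made up, and only `Monitor.TryEnter`, `Monitor.Exit`, `SemaphoreSlim.Wait(int)` and `SemaphoreSlim.Release` are real APIs.

```csharp
using System;
using System.Threading;

public static class ReentrancyDemo
{
    // A monitor tracks its owning thread, so the owner may enter again.
    public static bool MonitorIsReentrant()
    {
        var gate = new object();
        lock (gate)
        {
            // Same thread, second acquisition: succeeds immediately.
            bool taken = Monitor.TryEnter(gate);
            if (taken) Monitor.Exit(gate);
            return taken;
        }
    }

    // A semaphore only tracks a count; it has no notion of an owner.
    public static bool SemaphoreIsReentrant()
    {
        using var semaphore = new SemaphoreSlim(1, 1);
        semaphore.Wait();
        try
        {
            // The count is now 0, so a zero-timeout wait fails,
            // even though the caller is the thread that took the count.
            bool taken = semaphore.Wait(0);
            if (taken) semaphore.Release();
            return taken;
        }
        finally
        {
            semaphore.Release();
        }
    }
}
```

`MonitorIsReentrant()` returns `true` and `SemaphoreIsReentrant()` returns `false`. The practical consequence is that code relying on recursive `lock` acquisition would wait forever if it were switched to a single-count semaphore without restructuring.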

@CyrusNajmabadi
Member

if I should avoid blocking in an async method, why is it okay to use lock

The question implies a disparity where none exists. :-) There is no rule or restriction about locking in async code. Use of synchronization primitives to protect shared state is completely fine.

What you'd likely want to avoid is unbounded synchronous blocking. Same as if I had a sync system. Here, the lock acts to just allow the invariant that both of these are updated at the same time, and also provides the right barriers so that the data is consistent between the writing thread and the reading thread.

@CyrusNajmabadi
Member

If I needed a Reader Writer Lock, there was a class by that name in the Threading namespace.

That type still works, even in async code. Though I find in practice it's not really necessary as it really is a solution for when you view your work as being served by many threads, any of which are free and are running concurrently.

In a world where the primitive is Task, the questions more naturally become: "when is this running?" and "how do I share data?". For both of these I find there are a lot more powerful abstractions.

For example, ConcurrentExclusiveSchedulerPair provides the basis for a system where you may want many tasks of one sort in flight, while restricting to only running one task at a time of another sort.
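
As a rough sketch of that idea, applied to the "many readers, occasional writer" token scenario discussed earlier: reads are queued to the concurrent scheduler and writes to the exclusive one. The `TokenCache` class and its members are hypothetical names for illustration; `ConcurrentExclusiveSchedulerPair` and `TaskFactory` are the real BCL types.

```csharp
using System;
using System.Threading.Tasks;

public sealed class TokenCache
{
    private readonly ConcurrentExclusiveSchedulerPair _pair =
        new ConcurrentExclusiveSchedulerPair();
    private readonly TaskFactory _readers;
    private readonly TaskFactory _writers;
    private string _token = "initial";

    public TokenCache()
    {
        // Tasks on the concurrent scheduler may run in parallel with each other;
        // a task on the exclusive scheduler runs with no other task from the pair.
        _readers = new TaskFactory(_pair.ConcurrentScheduler);
        _writers = new TaskFactory(_pair.ExclusiveScheduler);
    }

    public Task<string> ReadAsync() => _readers.StartNew(() => _token);

    public Task WriteAsync(string token) =>
        _writers.StartNew(() => { _token = token; });
}
```

Callers simply `await cache.ReadAsync()` or `await cache.WriteAsync(newToken)`. One caveat worth stating: the exclusivity covers the synchronous body of each queued delegate, so this sketch assumes the protected work does not itself await in the middle of a critical section.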

Similarly, Channels add a great way to pass data along, while decoupling the production and consumption side, but also allowing them to influence each other through things like back pressure.
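
A minimal illustration of the back-pressure point, assuming `System.Threading.Channels` (in-box on .NET Core 3.0 and later): with a bounded channel, the producer's `WriteAsync` completes only once there is room, so a slow consumer throttles the producer without blocking a thread. `ChannelDemo` and `RunAsync` are made-up names for this sketch.

```csharp
using System.Collections.Generic;
using System.Threading.Channels;
using System.Threading.Tasks;

public static class ChannelDemo
{
    public static async Task<List<int>> RunAsync(int itemCount, int capacity)
    {
        // Bounded channel: by default a full buffer makes writers wait.
        Channel<int> channel = Channel.CreateBounded<int>(capacity);

        Task producer = Task.Run(async () =>
        {
            for (int i = 0; i < itemCount; i++)
            {
                // Completes asynchronously when the buffer has free space.
                await channel.Writer.WriteAsync(i);
            }
            // Signals the reader that no more items will arrive.
            channel.Writer.Complete();
        });

        var received = new List<int>();
        await foreach (int item in channel.Reader.ReadAllAsync())
        {
            received.Add(item);
        }

        await producer;
        return received;
    }
}
```

With a single producer and a single consumer, the items arrive in the order they were written, and neither side needs an explicit lock around the hand-off.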

All the old threading stuff is still there and still works. Indeed, if threads are the right metaphor for you, you can use them. Tasks translate nicely to this as you can just represent the async, background work your threads are doing as Tasks.

@CyrusNajmabadi
Member

many of the resources that were natural fits or otherwise standard practice don't apply when coding asynchronously.

This is a very broad statement, without specifics. It's hard to come up with solutions there, because the problems haven't really been explicitly enumerated. For example, several of the issues listed so far (like not being allowed to lock), aren't actually problems, but perhaps stem from a misinterpretation of information somewhere down the line. If you can provide more concrete examples, that would be helpful. Thanks! :-)

@CyrusNajmabadi
Member

Maybe a stop-gap would be an MSDN article on thread synchronization in async?

While I think a lot has been written on this topic (especially in the early days of async), I think more information (esp. around solving real issues) can always be helpful.

I will say this: I'm constantly amazed and enthusiastic about how good the abstractions are here for so many cases. Numerous times over the last 5 years I've come to the runtime to query if there is some API that would make a current async job easier, to find out that they had a rich, well-tested family of APIs in that space ready to go for me :-)

The main thing remaining now is just an async version of Lazy (nudge nudge @davidfowl ).

@RobertBouillon
Author

RobertBouillon commented Apr 9, 2021

There is no rule or restriction about locking in async code

There isn't anything about locking, but there is about blocking, and the Monitor calls emitted by lock block. From MSDN: "Don't block, Await instead". So if a method blocks, replace it with a method that returns Task instead, and await it. Monitor.Enter blocks. WaitHandle.WaitOne blocks. Neither has an Async counterpart I can use instead. Many thread synchronization classes lack Async methods, so when I run into classes that block and have no Async methods, the logical conclusion I draw is that they are not supported and I should not use them in an async context. Even bounded blocking can cause thread pool exhaustion with enough contention, so despite your advice to the contrary, I don't see blocking as an option here, even bounded blocking.

@benaadams's example with semaphoreSlim.WaitAsync seems to be the appropriate way to effectively lock in an async context, so I'll use that moving forward, but there are still many synchronization classes that lack async members.

[Reader Writer Lock] still works, even in async code

No, it really doesn't.

Though I find in practice it's not really necessary as it really is a solution for when you view your work as being served by many threads, any of which are free and are running concurrently.

In my use-case, where a REST API servicing thousands of requests per minute tries to access a shared variable that only changes once every 5 minutes, a Reader/Writer Lock was appropriate. And since I couldn't use it, I was forced to use a mutex, instead, which is far less efficient. And worth restating that outside of an async context, using a Reader / Writer Lock is very simple and intuitive.

because the problems haven't really been explicitly enumerated

Does this help?

In the interest of "explicitly enumerating my problems" henceforth:

  1. Should lock emit semaphoreSlim.WaitAsync instead of Monitor.Enter while async?
  2. Many thread synchronization classes lack "Async" methods. Is this a gap, and should an issue be created? Or must we rely on third party libraries? (This zombie issue is a similar ask)
  3. ReaderWriterLockSlim does not work in async. Is this a bug that should be logged?
  4. What is the best way to close this issue? @davidfowl seems to support some effort in runtimelab. Shall I create a new issue? This one is full of contentious digressions...

@davidfowl
Member

davidfowl commented Apr 9, 2021

What is the best way to close this issue? @davidfowl seems to support some effort in runtimelab. Shall I create a new issue? This one is full of contentious digressions...

I've been discussing this with a few of the architects on .NET and it's non-trivial to do but very interesting nonetheless. There's nothing actionable that can be solved in an issue like this AFAIK. The conversation here isn't talking about what would actually need to be solved to move this forward.

What is the goal here? To convince the team to investigate this work, or to flesh out the concrete investigations that would need to take place in order for this to be a reality?

@CyrusNajmabadi
Member

CyrusNajmabadi commented Apr 9, 2021

No, it really doesn't.

It really does. You are violating rules of it, which say that it is defined in terms of threads. It would be similar to locking a monitor on one thread, then going to another thread to release it.

And worth restating that outside of an async context, using a Reader / Writer Lock is very simple and intuitive.

This is not about async contexts. This is about if you use thread affined APIs across multiple threads.

Is this a bug that should be logged?

I think you're confusing async vs threads. What you are doing would not be ok regardless of async/await. If you had a ReaderWriterLockSlim and you just did things across threads you'd have this issue as well. The docs even make this clear:

Exceptions
SynchronizationLockException

The current thread has not entered the lock in read mode.

It's necessary to abide by the rules of the underlying primitives. If you do need to come back on another thread (surprising, but possible) I would use a rwlock that allows for that. Note that writing such a rwlock is something that can be done. Or, if you want to use something the framework provides, I'd recommend ConcurrentExclusiveSchedulerPair.

Again, this would be like saying "threads are broken and can't be used with readerwriterlockslim" because of this very same issue. Indeed, let's try that out:

            var rwLock = new ReaderWriterLockSlim();
            new Thread(() =>
            {
                rwLock.EnterReadLock();
                new Thread(() => rwLock.ExitReadLock()).Start();
            }).Start();

This will throw as well, despite there being a single exit-read following a single enter-read. Does this mean threading is broken with this type? No. It just means you have to use the type appropriately with your concurrency/asynchrony model.

Note that this is fairly standard and normal in the task-based world. For example, in Roslyn, there is often a requirement to use the UI thread for certain operations. This is an external requirement imposed on components, and it is a requirement they must abide by. Despite that, it's not hard to support in an async/task world. It means we cannot blindly await things without considering that, just as we couldn't blindly access those types from an incorrect thread. But, at the same time, it means we can easily bounce between threads and access things in a compliant fashion using this model to cleanly accomplish these tasks.

@CyrusNajmabadi
Member

Fun anecdote, my MS interview was to write a rwlock. The one i wrote was fine to use across multiple threads :) It didn't detect things like improper recursion. And it was undefined what it means if you did something like release a read lock that wasn't taken, and the same with a writelock. This was the style i learned in OS class. Good for very few requirements. Bad for detecting problems.

@CyrusNajmabadi
Member

Should lock emit semaphoreSlim.WaitAsync instead of Monitor.Enter while async?

This would likely be a very big breaking change, and in an area that would have the chance to cause the most problems (threading primitives). I can attest to this having very little chance of ever happening.

What I would recommend instead would be just using the using construct. I linked above how Roslyn uses this so we can just write:

using var _ = await semaphoreSlim.WaitAsync();
// do work

This works nicely and even allows use of the new using form to prevent the need to even indent the code here if you don't want. This sort of approach of using 'using' with a sync primitive works well, and decouples the coding pattern from a very specific CLR type (like what 'lock' does).

@RobertBouillon
Author

@davidfowl The goal is to replace async/await. Async/await provides seamless continuation at the expense of pervasive syntax changes, pattern changes, and language restrictions. It's my goal to find a way to achieve the same benefits as async/await without the cost. I think fibers could achieve the same benefit as async/await without the restrictions it adds to the language and the runtime. This could represent the best possible change for any software application - one where the feature set remains the same, but the complexity is reduced; a net loss in lines of code without any feature loss.

The reason I believe fibers could be the next evolution of asynchronous programming for C#/.NET is because the fundamental problem is a runtime problem, not a syntax problem, and this would move the solution closer to the problem. When we talk about continuation, we're talking about how to suspend a thread until it's ready to be worked by the processor again. The OS does this by managing a snapshot of the thread context (stack, registers, etc). Async/await does this by means of a state machine. Fibers would give the CLR more control over execution context, creating the opportunity to leverage different algorithms for thread suspension and continuation.

It's risky and it's research-heavy. Nothing about this suggests it's going to be a slam dunk. But other languages are playing with it - .NET should, too. Unlike some other languages, such as Rust and Go, .NET has a Virtual Machine. For this reason, .NET may succeed where others fail (or fail where others succeed). I think an investment into researching this could yield many opportunities, even if none of those end up providing a superior alternative to async/await.

P.S. Please excuse my ignorance as I learn as I go. It seems I've been using different terms interchangeably - Green Threads, User-Mode threads, Managed Thread Scheduler, etc. I do believe the correct term I'm looking for here is Fibers. This pitch reflects that change to the best of my ability. Please feel free to correct any misnomers. I'll later update the title / description of this issue. Thanks @benaadams

@davidfowl
Member

davidfowl commented Apr 9, 2021

I understand that but I'm trying to understand the purpose of this thread in relation. Is it to learn or is it just for showing interest in the idea? I think there's interest but the points being touched on in this thread are scratching the surface of what it would actually take. It's a multi-year effort to pull this off in a mature runtime that's been around for 20+ years. Not something we can patch or add a couple of APIs to make work. There are some real tough tradeoffs and considerations when trying to take .NET there. I want this experiment to happen (that's why I've been discussing it with the team) but I also understand it's huge and would require a herculean effort from multiple layers of the stack.

The need is heard and we can continue the discussion but it's a big enough task and I was just trying to understand the purpose of this thread. A bulleted list of tasks and API suggestions is premature at best; there needs to be deep investigation in several areas (threading, stack management, JIT, preemption, GC, threadpool, pinning, APIs etc etc). Go had it much easier as they don't have any legacy and still they made some big tradeoffs to enable the approach (like the cost of interop). Rust doesn't have green threads and mirrors the model that .NET has. Java's Loom, Go, Erlang and Elixir (BEAM VM) are the only mainstream languages with stackful user-mode threads with preemption that I'm aware of.

@HaloFour

HaloFour commented Apr 9, 2021

@davidfowl

I want this experiment to happen (that's why I've been discussing it with the team) but I also understand its huge and would require a herculean effort from multiple layers of the stack.

I'd love to understand the thinking here. I do imagine that it would be a herculean^2 effort to try to make the .NET runtime compatible with green threads, but much more than that is the rest of the ecosystem which would need to come along for the ride too. With the diversity of interop the potential to block seems extremely high. How could .NET possibly solve for that, or would the gamble be that enough of the runtime itself could support green threads to make such a solution feasible?

It's a multi-year effort to pull this off in a mature runtime that's been around for 20+ years.

Loom has been going since mid-2018 and I have my doubts that it'll even hit a preview release this year and can only assume that it'll spend at least another year in preview before going GA.

@RobertBouillon
Author

@davidfowl

The purpose of the thread is to determine if there's an appetite on the .NET team to pursue this endeavor, and if so, establish next steps. The scope of this project is too large for me to take on as a "pet project," and it's probably too tightly-integrated to be a bolt-on to .NET. Something like this really requires a sponsor from within the .NET team.

I see the spectrum of possible outcomes something like this:

  • MS commits resources to R&D this effort
  • The team sees potential value and are willing to review supporting data or efforts, but are unwilling to commit resources or define milestones
  • Any investment of effort would likely be wasted as it is unlikely to be reviewed or considered

This is certainly not a definitive list, but just to give a better idea of what I'm hoping to achieve with this thread.

Ultimately, I'd like to aim for some Proof-of-Concept which would materialize the value and the risks beyond that which analysis alone could achieve.

MS has a lot of threading expertise inside and outside of the .NET team (OS, C++, etc), and pulling in the right people to discuss the feasibility of Fibers as a .NET feature would be more productive than an opt-in thread on GitHub, so I think we probably got a little ahead of ourselves here.

@davidfowl
Member

The team sees value but this is a "big rock" .NET 7/8 planning exercise at this point. I don't think we'll establish next steps on an issue like this.

@RobertBouillon
Author

RobertBouillon commented Apr 9, 2021

@davidfowl If you're telling me that this is on the radar, then that's all I can really ask for and the objective of this thread has been achieved.

Is there anything I can do to get this pinned into one of those .NET 7/8 planning exercises? Do I just leave this issue open as a placeholder that gets tagged for future review?

If not, I'm good to close the issue.

@davidfowl
Member

@davidfowl If you're telling me that this is on the radar, then that's all I can really ask for and the objective of this thread has been achieved.

The current state is that we recognize there's a problem and have been having discussions about what it could look like. There's no plan beyond that. If anything this counts as an upvote in this direction!

@HaloFour

HaloFour commented Apr 9, 2021

@RobertBouillon

I do believe the correct term I'm looking for here is Fibers.

I would stick with "green threads" or "user threads". Everyone would know what you were talking about. Windows already has the concept of "Fibers" as application-scheduled threads but they are a different beast since they also require a stack to be preallocated for them which eliminates many of the benefits.

I'd be extremely interested in whatever findings the runtime teams might have in exploring adding green threads. Adding the support to create, suspend and resume green threads, that is by far the easy part. I'd be happy if the runtime only added that as delimited continuations would massively simplify writing coroutines. The hard part is rewriting all of the bits of the runtime that currently block so that they don't block, because that will still block the underlying OS thread. It also involves rewriting everything that is thread-affine so that it also understands green threads. When a green thread is resumed it will more likely than not be on a different threadpool thread, thus existing synchronization primitives still won't work without being redesigned. Same with anything thread local. There would have to be a whole conversation as to what any of this means with WinForms and other UI libraries.

I find green threads to be very compelling, and I love the experience I've had with them in Go and Java Loom EA. I don't argue against them because I don't believe they wouldn't be beneficial in the .NET world, I just can't imagine how it could be a remotely practical project. It just requires redesigning and rewriting way too much code, much of which is outside of MS' control. It's taken Java over 3 years to get as far as they have and they're likely still another year or two off, and their ecosystem is significantly more constrained and under their control.

@RobertBouillon
Author

I would stick with "green threads" or "user threads". Everyone would know what you were talking about.

I'm inclined to agree - I think the general idea was successfully conveyed and I don't want to get hung up on pedantic semantics :)

I agree that there are some interesting challenges to solve, as well. I'm encouraged to fork and play with it myself - given that this is at least within the realm of possibilities, I won't feel like I'm wasting my time. I'd love to see what comes from a detailed design session among the vast expertise within MS. Despite my criticisms of async/await, C#'s big features have been nothing short of awe-inspiring. LINQ was the software equivalent of SpaceX landing a rocket on a boat: not only did they succeed where others failed or claimed it was too hard, they made it damned pretty. I'm reminded of LINQ's brilliance anytime I'm forced to endure Java streams. I'm excited to see where this goes, even if that means waiting a decade (hopefully not).

@vladd

vladd commented Apr 10, 2021

I wonder how Java solved the following issue.

In C#, if your code is running in a single-threaded synchronization context, you can be sure that the fields are not changed between awaits. So basically you don't need any locks (as everything is anyway running on the same thread), and you need to reread the fields only after await.

Now, if we would replace async/await with green threads, basically every function call would be potentially blocking and thus would be potentially yielding. This means that we would need to defensively reread the fields after every function call. (Traditional locks wouldn't help, since everything happens on the same thread in this setup.) Moreover, we would need to bring the object into a consistent state before every function call. This seems to be way too restrictive.

@davidfowl
Member

In C#, if your code is running in a single-threaded synchronization context, you can be sure that the fields are not changed between awaits.

This is a very weak guarantee (nothing in the type system prevents concurrent access) that isn't hard to solve if you have a scheduler concept. Java LOOM has support for capturing continuations and it also has an Executor concept. Continuations can be scheduled on an executor so you can have a single threaded executor (Fibers allow pluggable schedulers).

So basically you don't need any locks (as everything is anyway running on the same thread), and you need to reread the fields only after await.

Not sure how you think async await works today but I don't see how this is different to how it works.

Now, if we would replace async/await with green threads, basically every function call would be potentially blocking and thus would be potentially yielding. This means that we would need to defensively reread the fields after every function call. (Traditional locks wouldn't help, since everything happens on the same thread in this setup.) Moreover, we would need to bring the object into a consistent state before every function call. This seems to be way too restrictive.

I think Java bridges this gap by requiring the caller to create a virtual thread (formerly fiber) to run code. The JVM is just aware of when you're in one context or the other.

Java also doesn't have pervasive interop, which makes this doable (though I'm sure they have issues with libraries that do JNI). It also doesn't have pinning (nor does Go), which makes stack management a bit easier. There are hard problems to solve if you want magic yielding in .NET.

@RobertBouillon
Author

Closing to keep the backlog free of issues that are not actionable.

Thanks to everyone who provided input. Hopefully this will make it on the roadmap, and if not, maybe we'll get to see an interesting technical write-up on why it wasn't possible.

@ghost ghost locked as resolved and limited conversation to collaborators May 12, 2021