Feature: CLR Thread Scheduler and Scheduler API (a.k.a. Green Threads) #50796
As mentioned in the previous thread, I don't think this would change much. I'd also argue that green threads are not about mitigating the costs of context switching; they're about mitigating the cost of blocking: https://inside.java/2020/08/07/loom-performance/
I don't think this is the case. Indeed, I can't recall any discussions around the design of the features/APIs here that brought that into consideration. Can you clarify this idea more to help ground the motivation of the discussion?
This is a drawback to me. It's an anti-benefit, and it would make many forms of development more challenging IMO.
I see async/await as a means to avoid blocking. Blocking presents two primary problems - performance (context switches) and deadlocks. I don't see how async/await avoids the latter, so I presume it to be the former.
Can you elaborate on this? What specifically makes a method "asynchronous" and why should it be non-blocking? As a developer, how do I benefit from a method being decorated as asynchronous? Why do I care if it blocks if I depend on its successful execution, anyway? To be more specific, consider the following code:

```csharp
int Download(string url) => Int32.Parse(WebClient.Download(url));

async Task<int> DownloadAsync(string url) => Int32.Parse(await WebClient.DownloadAsync(url));
```

Both methods do the exact same thing. How does the extra syntax required for the async version benefit me? I presumed this to be a performance optimization. If it's not, I see less of a reason for it to exist. I worry I'm missing something, but every time I come back to it I see traditional threading as simpler / superior to async/await.
The first one will block the thread until it gets the result, freezing the UI or, on the default scheduler (the ThreadPool) for a non-UI app, taking a ThreadPool thread out of circulation for no useful work.
One releases the thread to do other things; the other does not. It definitely is a decision I care about when writing a program. I don't want threads going off and doing other work unless I tell them to do so.
It returns something 'task-like'. This means it represents a computation that can complete sometime in the future, not when the method returns.
It's not a performance optimization, it's a resource optimization. Threads are expensive and must be paid for upfront in preallocated memory for their stack. Blocking them on I/O creates a bottleneck in the number of concurrent operations you can have in flight. |
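The resource argument above can be sketched with a minimal console example (the counts are illustrative, not from this thread): ten thousand concurrent async waits are multiplexed over a handful of pool threads, whereas ten thousand *blocking* waits would each pin an OS thread and its preallocated stack.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    static async Task Main()
    {
        // 10,000 concurrent "operations" that await rather than block.
        // No 10,000 threads (each with ~1 MB of stack) are created; the
        // waits are timer callbacks multiplexed over the ThreadPool.
        var tasks = Enumerable.Range(0, 10_000)
            .Select(_ => Task.Delay(1000));
        await Task.WhenAll(tasks);

        ThreadPool.GetMaxThreads(out int workerThreads, out _);
        Console.WriteLine($"Completed 10,000 waits; pool max is {workerThreads} workers");
    }
}
```

The blocking equivalent (`Thread.Sleep` on 10,000 dedicated threads) would allocate gigabytes of stack just to sit idle.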
There are a couple of ways to look at my example: from an OOP/API perspective, both functions expose the same operation. Perhaps more important is the execution: in both cases a sequence of operations depends on external data. In both cases, the state of the currently executing thread is stored and the hardware thread is redirected to another sequence of operations until the data is available. When the data is available, the original sequence is resumed with the newly available data. Async/await is just another way of keeping the hardware thread busy, with the same objective as the OS thread scheduler, only executed in different ways. So if we were always able to suspend and resume a thread via traditional threading, what's the benefit of async/await?
This needs to be broken down to better understand my argument. Simplified for argument's sake, a hardware thread is the capacity to execute instructions (i.e., a CPU core). A software thread is a hardware thread plus a stack allocation. When you say that a thread is "released to do other work", what you're saying is that the stack is popped down to a common location so it can be reused. This is done to prevent the allocation of another software thread (or the restoration of a suspended one). So this is an optimization. Whenever I trace the "why" for async/await down to bedrock, I always find optimization to be the underlying reason, specifically with respect to thread context. This being the case, the best way to optimize thread context would be to expressly manage contexts in the CLR. If you need to hit a nail, you find a hammer. Async/await places a significant burden on the developer and API for something that seems to be the domain of the CLR.
Not really, because you can't use thread synchronization primitives with async/await (they block). Green threads would restore the ability to use thread synchronization for managing critical sections. Async/await also requires the use of async/await decorations, where green threads could operate without the need for specific method signatures or syntax decorations. A synchronous method would be identical to an asynchronous one. Execution context would always be explicit.
I agree. For example, I don't want my method's execution to be arbitrarily assigned to a thread in the thread pool without my control, which is what async/await does. When I call a method, I expect the method to be executed sequentially in the current context (the current thread). In the rare cases where I want parallelism, I want to explicitly manage the context of that parallelism. I don't care that a thread could run asynchronously (any method could) - I'm calling it because I want it to run synchronously. If I wanted it to run asynchronously, I'd set up a new context (new thread or pooled thread) and call the method from there.
The goal of green threads is to make "blocking" cheap, but they can't do this automatically. They can only accomplish this if no method you call actually blocks. Every single one of those potentially-blocking methods has to be written to recognize that it is running within the context of a green thread, to capture the execution state of that green thread into a continuation, to use some kind of notification mechanism to trigger resuming that continuation, and to return the OS thread back to the green thread scheduler. If you block a green thread you actually block the underlying OS thread, which brings you right back to square one, except worse, as those potentially millions of green threads can only be scheduled by a much smaller number of OS threads. It's quite literally impossible to avoid this. Go was written from the ground up to support this. Java is making the bet that, because they control the vast majority of native code integrations, they can handle all of the rewriting themselves, but JNI interfaces are basically SOL. .NET is an infinitely more diverse ecosystem and this would be quite literally impossible to do. Green threads don't achieve anything that you're suggesting; it's just a different strategy for hiding the fact that you're moving continuations between different OS threads. It's an exceptionally heavy lift on the part of the ecosystem because quite literally every part of it needs to play along. And none of that changes how it works in the language. If anything, it makes things harder.
I don't view this as a good thing. If downloading and parsing may happen asynchronously, I want to know so I can plan accordingly from the code that calls it. Perhaps the caller will decide that that's OK, and it will just block until that work is done. That's fine, it can make that choice. Perhaps it will also want to yield the thread of execution. If so, that's fine. Perhaps it wants to change the state of the system while that async work is happening, and change it again once complete. That's fine, it can do that. Perhaps it wants to poll for the result for some amount of spins, and then change its strategy after a while. That's fine, it can do that. Perhaps it will decide that it would like to do something else while that is happening, perhaps explicitly concurrently. It can do that too. Etc, etc. By handing back the entity that allows one to represent the "work that will complete at some point in the future", this code can easily do all of this, as well as easily compose over N other pieces of work like this. You bring up OO, and I would say this is the OO way of representing 'a computation in flight'. I want that to be explicit and easy to work with.
To make that suspension apparent. I don't want to write code and discover that it is suspending at certain points without my being aware of it. For example, in an app, I will often want to do something special there (like update UI state). It is extremely relevant to me which work will likely be quick and which I can just call inline, versus work that may depend on IO, may take indeterminate time, and which I need to behave differently around. This is a discussion about whether this stuff could happen implicitly versus via explicit user action. The answer is: yes, it could happen implicitly. It's even something we considered. However, we chose not to make it implicit, as we felt that made for a worse programming model and made it harder to solve the sorts of problems that arise in modern apps that want to be responsive while doing potentially expensive and long-running work.
This is what happens with green threads. You don't suspend and resume a thread. You suspend and resume a continuation. That continuation is the "green thread", the captured stack space of the thread at the point it was suspended. When you resume the continuation it will be on whatever OS thread is available, which will almost certainly be a different OS thread than you originally suspended. This will certainly matter in cases of UI apps where Win32 knows nothing about the green threads and will fail if the underlying OS thread is not the same as the UI thread. |
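The "resumes on whatever OS thread is available" behavior is easy to observe in async/await itself. A minimal sketch (assuming a console app, which has no SynchronizationContext, so continuations are queued to the ThreadPool):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

class Program
{
    static async Task Main()
    {
        // The continuation after the await may run on a different
        // ThreadPool thread than the one that started the method.
        Console.WriteLine($"Before await: thread {Thread.CurrentThread.ManagedThreadId}");
        await Task.Delay(100); // suspends; continuation is queued to the pool
        Console.WriteLine($"After await:  thread {Thread.CurrentThread.ManagedThreadId}");
    }
}
```

The two IDs frequently differ, which is exactly why thread-affine APIs (UI frameworks, ReaderWriterLockSlim, etc.) need care around suspension points, whether those are awaits or green-thread yields.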
Can't we do this already with Task.Run?
In the case of a UI, you don't want the UI thread to do anything except render the UI, anyway. Any non-trivial method would belong on a background thread. We can't say that returning a task indicates that the method is non-trivial, because it's entirely possible to code a non-trivial method that doesn't block and doesn't return a Task. @HaloFour
No. Task.Run will place the work on the threadpool. Furthermore, it forces me to have to somehow know to do this, instead of the method actually letting me know that this is what is being done. As the consumer of any API, I very much want to know if this will happen so I can plan accordingly at the callsite.
So that I know this is what is going on at all consumption sites.
I cannot take responsibility here unless my callee lets me know this is going on. Furthermore, I don't want them to tell me that this won't happen in one release, and then under the covers change their impl such that it happens in the future. If this happens in the future, I want their signature to change, as I must adapt and react to that new approach they are taking.
How do I know it's non-trivial? If the method signature tells me nothing, literally anything could be non-trivial. This is the reason we went this way, and it's why things like the Windows coding guidelines explicitly call out that non-trivial methods have these signatures. It's precisely so that this information is communicated through the type system, and cannot be ignored.
This would violate the coding guidelines for the platform. Conversely though, if you do take a lot of time, you should use Task as part of those guidelines, and now your callers need to deal with that. In a green thread world, if you take a lot of time... then what? How do I, as the caller, know that? If you initially don't take a lot of time, and later change to take a lot of time, how do I know as a caller that that's the case? How does the API appropriately break so that I must react to this?
It is more difficult to write responsive apps. It is more difficult to understand what individual lines and blocks of code are doing and if they will have a problematic impact on the app. It is more difficult to ensure that changes to underlying libraries do not negatively impact your app. |
I'm not following some of what you're saying when you say "this" will happen. What is "this"? I want to be more specific. What is happening that you seek to capture when a method returns a Task?
Could you link these guidelines? (I looked and couldn't find - I apologize). |
From a conceptual perspective, returning
But the actual I/O operation is naturally described by a callback, not naturally described by blocking. Task.Run would make it async-over-sync-over-async. By returning a Task instance, the callee is exposing the callback-based nature of what it does. |
Long running work that yields the current thread.
If it would be expected to take a long time. That would likely be reasonable to do.
Is this a good time to ask for clear, well-maintained guidelines? If you're saying that I need to return a Task for long-running methods, I'd like to see where that's documented. I created this issue presuming that the reason for async/await was performance. What I'm hearing is that async/await facilitates the explicit desire to label methods as "long-running" and manage parallelism around them. Is this correct? If not, could you rephrase it succinctly for me so I can better understand it? And could you clarify if this reflects the .NET team's design objectives? I'm not clear on who's who in this discussion.
I think that the official docs do a fairly good job of describing when and why you'd use async/await: https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/concepts/async/ Even ignoring coroutines like
There could be, and likely are, entire books written on the topic :) It's extremely large, and there is a ton of information provided here already. With all docs, it likely could always be beefed up even more. I would have no problem with that :) |
I don't think it's absurd.
Yes. If an input could really cause things to take much more time, that really should be encoded into your signature, because your caller needs to adapt to that. I don't want to click a button and have my entire UI freeze because I pass a filename to you that you end up taking 50 seconds on. Yes, you may take 10ms on some files. But if you are realistically going to take that long, then I need to know so I can design my own code accordingly.
Then use Task in those interfaces. It accurately represents that impls may take a long time. Note: this is not theoretical. I'm currently deprecating an existing Roslyn API that operates synchronously because we need it to be async. This does impact all consumers of this API, and they do need to adjust because they should no longer assume that this code is fast.
This sort of 'subjectiveness' has been part of C# from the beginning. For example, we recommend methods vs properties if they would be 'expensive'. What truly defines that? It's subjective. C'est la vie. We generally show by example and trust that through enough experience developers can get a good sense of when: "yes, this should be a property" or "no... a no-arg method makes more sense here". Api design is inherently subjective and involves taking in guidance and knowledge of the problem domain to make sensible decisions. And, in some cases, this does mean having multiple apis for similar concepts. It's why we have IAsyncEnumerable vs IEnumerable for example. |
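The IAsyncEnumerable vs IEnumerable split mentioned above is a good illustration of asynchrony being surfaced in the type system. A minimal sketch (the `NumbersAsync` method and its `Task.Delay` stand-in for I/O are illustrative, not from this thread):

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

class Demo
{
    // IEnumerable: each item is available synchronously.
    static IEnumerable<int> Numbers()
    {
        for (int i = 0; i < 3; i++) yield return i;
    }

    // IAsyncEnumerable: producing each item may involve awaiting I/O,
    // and the signature tells the caller so.
    static async IAsyncEnumerable<int> NumbersAsync()
    {
        for (int i = 0; i < 3; i++)
        {
            await Task.Delay(10); // stand-in for an I/O wait
            yield return i;
        }
    }

    static async Task Main()
    {
        foreach (var n in Numbers()) Console.WriteLine(n);
        await foreach (var n in NumbersAsync()) Console.WriteLine(n);
    }
}
```

The caller of `NumbersAsync` must `await foreach`, so the possibility of suspension at each element cannot be ignored.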
I'm a member of the C# language design team. I was heavily involved in the design around async/await and the general principles we had in mind when designing it. What I'm describing is one of many considerations that we considered important when we were designing this. :) It was a design goal to make asynchrony apparent, and to make it easy to compose operations over it. These statements reflect my own views as well as those of others I know in this space. It may not be universal. At the time we did this, the ideas around 'green threads' were on the table and under consideration. However, due to numerous issues at both the implementation and design levels, we considered it less ideal than the async/await model.
I wouldn't use the word parallelism here necessarily :). But 'composing asynchrony' was certainly a goal. |
I appreciate the time. So help me out here - Code, by its nature, is synchronous; code is a predefined sequence of operations. Functions are also inherently synchronous, with a single entry and exit point - functions are fundamentally a named sequence of operations with a well-defined input and output. What breaks the simplicity is the concept of a "blocking operation." A method may cause the sequence of operations to become idle, locking up resources associated with the thread. My understanding of async/await is that, instead of blocking, we now return a Task. So when we talk about async/await, we're still talking about parallelism. I've seen this suggested a few times, and even in here, that async/await is orthogonal to threading, and maybe that's true conceptually, but not in practice or as-implemented. Async/await has a hard dependency on a thread scheduler that's defaulted to the thread pool, and it prevents the use of threading primitives. It's more appropriate to say async/await is mutually exclusive to threading, since you no longer have direct control over thread context. You're either pulling your own threads in a sync context or running in an async context, but mixing the two is inadvisable (and against most best practices). As far as I can tell, async/await is just another way to fork a process. So I find myself pursuing our conversation in search of answers again - that maybe you've already answered? What's the benefit of async/await? If the goal was to make "asynchrony apparent" or facilitate "asynchrony composition," how was it not possible already with traditional threading? Async/await makes sense, conceptually, as an evolution of callback-based continuations. So maybe I can see await/async for performance-critical I/O operations that return a low-level handle, but "any operation that could be long-running"? That may as well be everything.
Which is a giant waste of resources. Threads are expensive, which is why we end up with pools. Blocking threadpool threads leads to exhaustion. Using threads for the sake of preventing an I/O operation from blocking is totally unnecessary when most I/O operations have OS-level facilities to notify on completion. And creating threads is the easy part; synchronization of the result is complicated. Coroutines like async/await make that composition much simpler.
I think doing a green threads implementation would be a worthwhile, interesting experiment to do in runtimelab. It's difficult, far-reaching, and requires work at multiple layers in the stack. I'd file the issue there and discuss what it would actually take in practice (it's much harder than most think when they file issues like this and point to other languages). .NET has constraints other runtimes may not have because of the features we expose. Anyways, that said, it's not something trivial that can be done with the list of bullets here.
Use:

```csharp
await semaphoreSlim.WaitAsync();
try
{
    // Do work
}
finally
{
    semaphoreSlim.Release();
}
```

You can still call out to a regular method that takes a lock, if it's fast and isn't itself making any async, I/O, or blocking calls.
We even have this extension to make that nicer in Roslyn :) This way you can just do:

```csharp
using var _ = await semaphoreSlim.WaitAsync();
// do work
```

No need for the try/finally/Release bits :)
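A minimal sketch of how such an extension can be built (the names `DisposableWaitAsync` and `Releaser` are hypothetical; Roslyn's actual helper may differ):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical extension: an async wait that hands back a releaser
// suitable for a 'using' declaration.
static class SemaphoreSlimExtensions
{
    public static async Task<Releaser> DisposableWaitAsync(this SemaphoreSlim semaphore)
    {
        await semaphore.WaitAsync().ConfigureAwait(false);
        return new Releaser(semaphore);
    }

    public readonly struct Releaser : IDisposable
    {
        private readonly SemaphoreSlim _semaphore;
        public Releaser(SemaphoreSlim semaphore) => _semaphore = semaphore;
        // Disposing releases the semaphore exactly once for this acquisition.
        public void Dispose() => _semaphore.Release();
    }
}
```

Usage: `using var _ = await semaphore.DisposableWaitAsync();` — the release happens at scope exit, even on exception, without an explicit try/finally.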
C++ got async coroutines (co_await). AFAICT Java Loom won't run existing code under virtual threads by default, so there would likely need to be some modest user code changes, at least during bootstrap. I imagine frameworks like Spring will likely handle that under the hood, possibly enabled by config. I think it'd be awesome if the CLR offered an API for capturing/resuming a continuation. As with Loom, I think that would be the first step towards exploring green threads, and it also offers primitives that would make it substantially easier for languages (and perhaps libraries) to add coroutines without having to build their own state machines.
I've been in a very similar scenario and made it async all the way down. It's not the most obvious to implement, but it is possible and it is the most true to the reality of what is happening. SemaphoreSlim.WaitAsync is one way to do an async mutex. |
Any thoughts on making thread synchronization more consistent between sync and async code? More of an observation than a question I immediately care about - why would I use SemaphoreSlim instead of a lock? C# has always been easy for many of the reasons a good UI is easy - because the design is intuitive. If I needed a Reader Writer Lock, there was a class by that name in the Threading namespace. I find async has made C# unintuitive - many of the resources that were natural fits or otherwise standard practice don't apply when coding asynchronously. Is there a way that can be addressed? My first thought was abstraction - which is why I mentioned it in this issue. Perhaps provide a common API that provides threading primitives that work in both contexts? Maybe a stop-gap would be an MSDN article on thread synchronization in async? Some analysis and guidance on threading primitives in async could provide a blueprint for a more complete & intuitive solution. Stephen Cleary has something of use here - could / should this be integrated into the BCL?
They have different semantics. SemaphoreSlim provides semaphore semantics. Examples of where this matters are things like monitor allowing reentrance for the same thread, while a semaphore does not care about threads and will let anything enter as long as the count is there (or will block if it is not).
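The reentrancy difference can be demonstrated in a few lines (a minimal sketch; the names here are illustrative):

```csharp
using System;
using System.Threading;

class ReentrancyDemo
{
    static readonly object Gate = new object();
    static readonly SemaphoreSlim Sem = new SemaphoreSlim(1, 1);

    static void Main()
    {
        // Monitor (the 'lock' statement) is reentrant: the same thread
        // may re-acquire a lock it already holds.
        lock (Gate)
        {
            lock (Gate) { Console.WriteLine("re-entered lock"); }
        }

        // SemaphoreSlim tracks only a count. After one Wait the count is 0,
        // so a second acquisition on the same thread does not succeed;
        // Wait(0) tries without blocking and reports false.
        Sem.Wait();
        bool reentered = Sem.Wait(0);
        Console.WriteLine($"re-entered semaphore: {reentered}"); // false
        Sem.Release();
    }
}
```

This thread-agnosticism is exactly why `SemaphoreSlim.WaitAsync` works with await (the release can come from a different thread), while `Monitor`-based locks cannot span an await.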
The question implies a disparity where none exists. :-) There is no rule or restriction about locking in async code. Use of synchronization primitives to protect shared state is completely fine. What you'd likely want to avoid is unbounded synchronous blocking, same as if I had a sync system. Here, the lock acts just to maintain the invariant that both of these are updated at the same time, and also provides the right barriers so that the data is consistent between the writing thread and the reading thread.
That type still works, even in async code. Though I find in practice it's not really necessary, as it really is a solution for when you view your work as being served by many threads, many of which are free and running concurrently. In a world where the primitive is Task, the questions more naturally become: "when is this running?" and "how do I share data?". For both of these I find there are a lot more powerful abstractions. For example, ConcurrentExclusiveSchedulerPair provides the basis for a system where you may want many tasks of one sort in flight, while restricting to only running one task at a time of another sort. Similarly, Channels add a great way to pass data along, while decoupling the production and consumption sides, but also allowing them to influence each other through things like back pressure. All the old threading stuff is still there and still works. Indeed, if threads are the right metaphor for you, you can use them. Tasks translate nicely to this, as you can just represent the async background work your threads are doing as Tasks.
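The Channels point can be sketched in a few lines. A bounded channel gives back pressure for free: when the buffer is full, the producer's `WriteAsync` waits (asynchronously, without blocking a thread) until the consumer catches up. This is a minimal illustration, not a pattern prescribed in this thread:

```csharp
using System;
using System.Threading.Tasks;
using System.Threading.Channels;

class ChannelDemo
{
    static async Task Main()
    {
        // Capacity 2: the writer is suspended whenever 2 items are buffered.
        var channel = Channel.CreateBounded<int>(2);

        var producer = Task.Run(async () =>
        {
            for (int i = 0; i < 5; i++)
                await channel.Writer.WriteAsync(i); // waits when buffer is full
            channel.Writer.Complete();
        });

        // Consume until the writer signals completion.
        await foreach (var item in channel.Reader.ReadAllAsync())
            Console.WriteLine(item);

        await producer;
    }
}
```

Production and consumption are fully decoupled, yet neither side can run unboundedly ahead of the other.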
This is a very broad statement, without specifics. It's hard to come up with solutions there, because the problems haven't really been explicitly enumerated. For example, several of the issues listed so far (like not being allowed to lock), aren't actually problems, but perhaps stem from a misinterpretation of information somewhere down the line. If you can provide more concrete examples, that would be helpful. Thanks! :-) |
While I think a lot has been written on this topic (especially in the early days of async), I think more information (esp. around solving real issues) can always be helpful. I will say this: I'm constantly amazed and enthusiastic about how good the abstractions are here for so many cases. Numerous times over the last 5 years I've come to the runtime to ask if there is some API that would make a current async job easier, only to find out that they had a rich, well-tested family of APIs in that space ready to go for me :-) The main thing remaining now is just an async version of Lazy (nudge nudge @davidfowl ).
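The "async version of Lazy" asked for above is commonly built by wrapping `Lazy<Task<T>>` — a well-known community pattern, not a BCL type (the name `AsyncLazy` here is illustrative):

```csharp
using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

// Minimal AsyncLazy sketch: the factory runs at most once, and all
// awaiters share the single resulting Task<T>.
public sealed class AsyncLazy<T>
{
    private readonly Lazy<Task<T>> _lazy;

    public AsyncLazy(Func<Task<T>> factory) =>
        _lazy = new Lazy<Task<T>>(() => Task.Run(factory));

    public Task<T> Value => _lazy.Value;

    // Lets callers write 'await myAsyncLazy' directly.
    public TaskAwaiter<T> GetAwaiter() => _lazy.Value.GetAwaiter();
}
```

Usage: `var config = new AsyncLazy<string>(LoadConfigAsync); ... var value = await config;` — concurrent first awaits trigger only one invocation of the factory.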
There isn't anything about locking, but there is about blocking, and the @benaadams example with SemaphoreSlim.WaitAsync shows the async-friendly alternative.
No, it really doesn't.
In my use-case, where a REST API servicing thousands of requests per minute tries to access a shared variable that only changes once every 5 minutes, a Reader/Writer Lock was appropriate. And since I couldn't use it, I was forced to use a mutex, instead, which is far less efficient. And worth restating that outside of an async context, using a Reader / Writer Lock is very simple and intuitive.
Does this help? In the interest of "explicitly enumerating my problems" henceforth:
I've been discussing this with a few of the architects on .NET, and it's non-trivial to do but very interesting nonetheless. There's nothing actionable that can be solved in an issue like this AFAIK. The conversation here isn't talking about what would actually need to be solved to move this forward. What is the goal here? To convince the team to investigate this work, or to flesh out the concrete investigations that would need to take place in order for this to be a reality?
It really does. You are violating rules of it, which say that it is defined in terms of threads. It would be similar to locking a monitor on one thread, then going to another thread to release it.
This is not about async contexts. This is about whether you use thread-affined APIs across multiple threads.
I think you're confusing async vs threads. What you are doing would not be OK regardless of async/await. If you had a ReaderWriterLockSlim and you just did things across threads, you'd have this issue as well. The docs even make this clear:
It's necessary to abide by the rules of the underlying primitives. If you do need to come back on another thread (surprising, but possible), I would use a rwlock that allows for that. Note that writing such a rwlock is something that can be done. Or, if you want to use something the framework provides, I'd recommend ConcurrentExclusiveSchedulerPair. Again, this would be like saying "threads are broken and can't be used with ReaderWriterLockSlim" because of this very same issue. Indeed, let's try that out:

```csharp
var rwLock = new ReaderWriterLockSlim();
new Thread(() =>
{
    rwLock.EnterReadLock();
    new Thread(() => rwLock.ExitReadLock()).Start();
}).Start();
```

This will throw as well, despite there being a single exit-read following a single enter-read. Does this mean threading is broken with this type? No. It just means you have to use the type appropriately with your concurrency/asynchrony model. Note that this is fairly standard and normal in the task-based world. For example, in Roslyn, there is often a requirement to use the UI thread for certain operations. This is an external requirement imposed on components, and it is a requirement they must abide by. Despite that, it's not hard to support in an async/task world. It means we cannot blindly await things without considering that, just as we couldn't blindly access those types from an incorrect thread. But, at the same time, it means we can easily bounce between threads and access things in a compliant fashion, using this model to cleanly accomplish these tasks.
Fun anecdote: my MS interview was to write a rwlock. The one I wrote was fine to use across multiple threads :) It didn't detect things like improper recursion. And it was undefined what it meant if you did something like release a read lock that wasn't taken, and the same with a write lock. This was the style I learned in OS class. Good for very few requirements. Bad for detecting problems.
This would likely be a very big breaking change, and in an area that would have the chance to cause the most problems (threading primitives). I can attest to this having very little chance of ever happening. What I would recommend instead would be just using:

```csharp
using var _ = await semaphoreSlim.WaitAsync();
// do work
```

This works nicely and even allows use of the new using declarations.
@davidfowl The goal is to replace async/await. Async/await provides seamless continuation at the expense of pervasive syntax changes, pattern changes, and language restrictions. It's my goal to find a way to achieve the same benefits as async/await without the cost. I think fibers could achieve the same benefit as async/await without the restrictions it adds to the language and the runtime. This could represent the best possible change for any software application - one where the feature set remains the same, but the complexity is reduced; a net loss in lines of code without any feature loss. The reason I believe fibers could be the next evolution of asynchronous programming for C#/.NET is because the fundamental problem is a runtime problem, not a syntax problem, and this would move the solution closer to the problem. When we talk about continuation, we're talking about how to suspend a thread until it's ready to be worked by the processor again. The OS does this by managing a snapshot of the thread context (stack, registers, etc). Async/await does this by means of a state machine. Fibers would give the CLR more control over execution context, creating the opportunity to leverage different algorithms for thread suspension and continuation. It's risky and it's research-heavy. Nothing about this suggests it's going to be a slam dunk. But other languages are playing with it - .NET should, too. Unlike some other languages, such as Rust and Go, .NET has a Virtual Machine. For this reason, .NET may succeed where others fail (or fail where others succeed). I think an investment into researching this could yield many opportunities, even if none of those end up providing a superior alternative to async/await. P.S. Please excuse my ignorance as I learn as I go. It seems I've been using different terms interchangeably - Green Threads, User-Mode threads, Managed Thread Scheduler, etc. I do believe the correct term I'm looking for here is Fibers. 
This pitch reflects that change to the best of my ability. Please feel free to correct any misnomers. I'll later update the title / description of this issue. Thanks @benaadams |
I understand that, but I'm trying to understand the purpose of this thread in relation. Is it to learn, or is it just for showing interest in the idea? I think there's interest, but the points being touched on in this thread are scratching the surface of what it would actually take. It's a multi-year effort to pull this off in a mature runtime that's been around for 20+ years. Not something we can patch or add a couple of APIs to make work. There are some real tough tradeoffs and considerations when trying to take .NET there. I want this experiment to happen (that's why I've been discussing it with the team) but I also understand it's huge and would require a herculean effort from multiple layers of the stack. The need is heard and we can continue the discussion, but it's a big enough task and I was just trying to understand the purpose of this thread. A bulleted list of tasks and API suggestions is premature at best; there needs to be deep investigation in several areas (threading, stack management, JIT, preemption, GC, threadpool, pinning, APIs, etc.). Go had it much easier as they don't have any legacy, and still they made some big tradeoffs to enable the approach (like the cost of interop). Rust doesn't have green threads and mirrors the model that .NET has. Java's Loom, Go, Erlang, and Elixir (BEAM VM) are the only mainstream platforms with stackful user-mode threads with preemption that I'm aware of.
I'd love to understand the thinking here. I do imagine that it would be a herculean^2 effort to try to make the .NET runtime compatible with green threads, but much more than that is the rest of the ecosystem which would need to come along for the ride too. With the diversity of interop the potential to block seems extremely high. How could .NET possibly solve for that, or would the gamble be that enough of the runtime itself could support green threads to make such a solution feasible?
Loom has been going since mid-2018 and I have my doubts that it'll even hit a preview release this year and can only assume that it'll spend at least another year in preview before going GA. |
The purpose of the thread is to determine whether there's an appetite on the .NET team to pursue this endeavor and, if so, establish next steps. The scope of this project is too large for me to take on as a "pet project," and it's probably too tightly integrated to be a bolt-on to .NET. Something like this really requires a sponsor from within the .NET team. I see the spectrum of possible outcomes as something like this:
This is certainly not a definitive list, but just to give a better idea of what I'm hoping to achieve with this thread. Ultimately, I'd like to aim for some Proof-of-Concept which would materialize the value and the risks beyond that which analysis alone could achieve. MS has a lot of threading expertise inside and outside of the .NET team (OS, C++, etc), and pulling in the right people to discuss the feasibility of Fibers as a .NET feature would be more productive than an opt-in thread on GitHub, so I think we probably got a little ahead of ourselves here. |
The team sees value but this is a "big rock" .NET 7/8 planning exercise at this point. I don't think we'll establish next steps on an issue like this. |
@davidfowl If you're telling me that this is on the radar, then that's all I can really ask for and the objective of this thread has been achieved. Is there anything I can do to get this pinned into one of those .NET 7/8 planning exercises? Do I just leave this issue open as a placeholder that gets tagged for future review? If not, I'm good to close the issue. |
The current state is that we recognize there's a problem and have been having discussions about what it could look like. There's no plan beyond that. If anything this counts as an upvote in this direction! |
I would stick with "green threads" or "user threads"; everyone would know what you were talking about. Windows already has the concept of "Fibers" as application-scheduled threads, but they are a different beast, since they also require a stack to be preallocated for them, which eliminates many of the benefits.

I'd be extremely interested in whatever findings the runtime teams might have in exploring adding green threads. Adding the support to create, suspend, and resume green threads is by far the easy part. I'd be happy if the runtime only added that, as delimited continuations would massively simplify writing coroutines. The hard part is rewriting all of the bits of the runtime that currently block so that they don't, because blocking will still block the underlying OS thread. It also involves rewriting everything that is thread-affine so that it also understands green threads. When a green thread is resumed, it will more likely than not be on a different threadpool thread, so existing synchronization primitives still won't work without being redesigned. Same with anything thread-local. There would have to be a whole conversation as to what any of this means for WinForms and other UI libraries.

I find green threads very compelling, and I love the experience I've had with them in Go and the Java Loom EA. I don't argue against them because I doubt they'd be beneficial in the .NET world; I just can't imagine how this could be a remotely practical project. It requires redesigning and rewriting way too much code, much of which is outside of MS' control. It's taken Java over 3 years to get as far as they have, they're likely still another year or two off, and their ecosystem is significantly more constrained and under their control. |
I'm inclined to agree - I think the general idea was successfully conveyed and I don't want to get hung up on pedantic semantics :) I agree that there are some interesting challenges to solve, as well. I'm encouraged to fork and play with it myself - given that this is at least within the realm of possibility, I won't feel like I'm wasting my time. I'd love to see what comes from a detailed design session among the vast expertise within MS. Despite my criticisms of async/await, C#'s big features have been nothing short of awe-inspiring. LINQ was the software equivalent of SpaceX landing a rocket on a boat: not only did they succeed where others failed or claimed it was too hard, they made it damned pretty. I'm reminded of LINQ's brilliance anytime I'm forced to endure Java streams. I'm excited to see where this goes, even if that means waiting a decade (hopefully not). |
I wonder how Java solved the following issue. In C#, if your code is running in a single-threaded synchronization context, you can be sure that fields are not changed between awaits. So you basically don't need any locks (everything is running on the same thread anyway), and you only need to reread fields after an await. Now, if we replaced async/await with green threads, every function call would be potentially blocking and thus potentially yielding. This means we would need to defensively reread fields after every function call. (Traditional locks wouldn't help, since everything happens on the same thread in this setup.) Moreover, we would need to bring the object into a consistent state before every function call. This seems way too restrictive. |
This is a very weak guarantee (nothing in the type system prevents concurrent access), and it isn't hard to preserve if you have a scheduler concept. Java Loom has support for capturing continuations, and it also has an Executor concept. Continuations can be scheduled on an executor, so you can have a single-threaded executor (fibers allow pluggable schedulers).
Not sure how you think async/await works today, but I don't see how this is different from how it works. |
I think Java bridges this gap by requiring the caller to create a virtual thread (formerly fiber) to run code. The JVM is simply aware of when you're in one context or the other. Java also doesn't have pervasive interop, which makes this doable (though I'm sure they have issues with libraries that use JNI). It also doesn't have pinning (nor does Go), which makes stack management a bit easier. There are hard problems to solve if you want magic yielding in .NET. |
Closing to keep the backlog free of issues that are not actionable. Thanks to everyone who provided input. Hopefully this will make it on the roadmap, and if not, maybe we'll get to see an interesting technical write-up on why it wasn't possible. |
Overview
Create an abstraction layer that allows the CLR to run its own thread scheduler.
What problem(s) does this solve?
Compatibility
No breaking changes to the API. Backwards-compatible with async/await.
Tasks (Proposed)
Technical Details
Async/await is an attempt to mitigate the cost of context-switching, but does so by adding significant cost and complexity to software. This proposal is a successor to async/await: an attempt to achieve the same performance gain without the cost (without the language and compiler overhead of the async/await pattern). This proposal is similar to Java's "Green Threads" and Go's "Goroutines."
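To make the comparison concrete, here is what the green-thread model looks like in Go, where a single blocking function serves both the synchronous and concurrent cases with no `Async` duplicate (an illustration of the Goroutine model mentioned above; `fetch` is a made-up stand-in for any blocking operation):

```go
package main

import "fmt"

// fetch is an ordinary blocking function; there is no separate fetchAsync.
// Under a green-thread model the scheduler handles blocking transparently,
// so the same function is used synchronously or concurrently.
func fetch(n int) int { return n * 2 }

func main() {
	// Direct, synchronous call:
	fmt.Println(fetch(21)) // prints: 42

	// Concurrent call: same function, no duplicated *Async variant.
	ch := make(chan int)
	go func() { ch <- fetch(21) }()
	fmt.Println(<-ch) // prints: 42
}
```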
There is a specific technical challenge that needs to be fleshed out in a PoC first: how to persist part of the stack. During a context switch, the OS swaps out the thread's context, including its stack, which is expensive. If we perform a user-mode context switch, we only need to persist the stack above our custom thread scheduler. This smaller stack copy is where we gain our performance. Exactly how to do this has been explored by other languages, but hasn't necessarily been solved.
Problems with the current design
Async/await is driven by a form of thread-scheduler that competes directly with the scheduler provided by the OS. While in an "async context", blocking and threading primitives are prohibited. Async/await is prohibited outside of an "async context." Thus, these execution models / contexts are mutually exclusive. Despite being mutually exclusive, it is possible to try to "block" in an async context, or to forcefully await a task in a synchronous context, with potentially catastrophic results.
In addition to the stability and pattern caveats of async/await, methods also have to be duplicated for both execution models (e.g. `void Foo` and `Task FooAsync`). Code duplication immediately suggests that some form of abstraction would be beneficial; however, it is not possible to write methods that work in both sync and async contexts (the `await`
keyword provides the metadata necessary for state-machine generation). If we can achieve the same benefits of async/await without these caveats, we should, but this depends on our ability to create a high-level thread scheduler.

API Suggestions
An abstract API should allow operations to be performed against the currently executing context (similar to SynchronizationContext, but specialized for thread scheduling).
This would allow code to be written agnostic of the threading model: the code doesn't change whether we're using a "green" thread, an OS thread, or an async/await state-machine (presuming the latter is even still possible).
Ideally, most of the classes in the `System.Threading` namespace would be updated, allowing for advanced thread synchronization - something that's currently not possible with async/await.