
This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →


Coroutines #561

Closed
nnsgmsone opened this issue Jun 25, 2019 · 30 comments
Labels
Feature Request This issue is made to request a feature.

@nnsgmsone

I found that coroutines in V directly call pthread_create. I think it is possible to add user-level coroutines, so that more energy can be injected. Perhaps V can learn from golang's approach and add a runtime...

@nnsgmsone nnsgmsone added the Feature Request This issue is made to request a feature. label Jun 25, 2019
@spy16

spy16 commented Jun 25, 2019

Implementing coroutines (similar to Go approach) has always been the plan and is in the roadmap. See https://vlang.io/docs#concurrency

@nnsgmsone
Author

ok

@spytheman
Member

@nnsgmsone what does 'more energy can be injected' mean in this context?

@nnsgmsone
Author

@spytheman make the routine more useful

@joe-conigliaro
Member

As far as I know this was the intention all along; they were implemented that way to begin with just to have something working.

@gslicer

gslicer commented Sep 9, 2019

I think the usefulness of coroutines is very limited in contrast to what threads can offer (as long as threads are implemented e.g. with the "actor" paradigm, without any semaphores/locks)... so threads should still be supported.

See this statement:

Why create threads when there are coroutines?

Coroutine methods can be executed piece by piece over time, but all processes are still done by a single main Thread. If a Coroutine attempts to execute a time-consuming operation, the whole application freezes for the time being.

Threads are different. The execution of separate Threads is managed by the operating system. If you have more than one logical CPU, many threads are executed on different CPUs. Thanks to that, any expensive operation will not freeze your application.

@dumblob
Contributor

dumblob commented Sep 9, 2019

@gslicer don't worry, there will always be the unsafe package with all primitives to create your own threads with your own condition variables, mutexes, locks etc. I think the topic here is not about this though. I think it's about having a builtin primitive for concurrency and that's fully covered in #1868 .

Thus I think this topic can be closed as it got fully superseded by #1868 .

@gslicer

gslicer commented Sep 9, 2019

@gslicer don't worry, there will always be the unsafe package with all primitives to create your own threads with your own condition variables, mutexes, locks etc.

As long it's "unsafe" I'm clearly worrying :)

@nnsgmsone
Author

@gslicer I think it's easy to achieve the effect of an actor with a routine and a channel. For example, a library I wrote myself works exactly that way.

@medvednikov medvednikov added this to To do in 0.3 Nov 28, 2020
@medvednikov medvednikov changed the title User level coroutine? Coroutines Nov 28, 2020
@crthpl
Member

crthpl commented Dec 8, 2020

Will there be a compiler flag to make go start a new thread like it does now?

@atomkirk
Contributor

When this is implemented in V, would it be possible to implement it as pre-emptive (like Erlang) instead of cooperative (like Go)?

@dumblob
Contributor

dumblob commented Dec 30, 2020

@atomkirk so far V has built-in "go routines" which are fully preemptive (and I think the consensus is that it should stay so). This GitHub issue seems to be about a different thing - namely about the standard library offering simple pure coroutines (which are by definition non-preemptive).

@atomkirk
Contributor

atomkirk commented Dec 30, 2020

@dumblob they can be. Erlang processes are user-level AND preemptive. They are very robust.

It's preemptive now because it uses kernel threads, which are bulky and expensive.

@dumblob
Contributor

dumblob commented Dec 30, 2020

Erlang processes are user-level AND preemptive.

That depends on how the Erlang VM is being executed. If it runs on bare hardware and no non-Erlang SW is being called, then you're right. In any other case Erlang processes are only partially preemptive (i.e. one Erlang process can starve indefinitely, stalling the whole Erlang VM). But I digress.

My point was different. The V community seems inclined to have built-in support (in the form of V's go routines) for fully preemptive execution while offering a non-preemptive alternative (referred to as coroutines) in the standard library (i.e. not built into the language).

@ntrel
Contributor

ntrel commented Jan 22, 2021

If a Coroutine attempts to execute a time-consuming operation, the whole application freezes for the time being.

This is not true since Go 1.14:
https://medium.com/a-journey-with-go/go-asynchronous-preemption-b5194227371c

@maddanio
Contributor

As I just opened a related feature request, and to clarify: having co-routines be pre-emptive is kind of a contradiction in itself. Co-routines are routines which can be entered and exited at any point, basically making it possible to have many of them "in-flight" without having to resort to threads. This becomes important when co-routines start inter-depending, which would otherwise lead to deadlock and/or very inefficient use of threads, and with large-scale networking, where wasting a thread for each connection has been known to be a very bad design choice at least since Apache was implemented :)

@maddanio
Contributor

So is there any plan to implement real coroutines, i.e. yieldable functions, that don't require threading? Co-routines are there to enable concurrency, which is not the same as parallelism (threads), though the two interact very well.

@dumblob
Contributor

dumblob commented Aug 25, 2021

The V community seems inclined to have built-in support (in the form of V's go routines) for fully preemptive execution while offering a non-preemptive alternative (referred to as coroutines) in the standard library (i.e. not built into the language).

So is there any plan to implement real coroutines, i.e. yieldable functions, that don't require threading?

I'd say the answer has two parts:

  1. Yes, there are plans for real coroutines - but they'll probably not be tightly connected to the language; rather a standard library construct, mainly to allow easy porting of existing libraries which for some weird reason depend on coroutine behavior and have problems with full preemptiveness.

  2. In light of Proper support for distributed computing, parallelism and concurrency #1868, there is no actual need for true coroutines (except for point (1)), as they'll perform worse in some scenarios while not outperforming #1868 in any scenario I can imagine (and I don't buy examples relying on deliberately constructed techniques that make it perform a few percent better on carefully chosen platforms - sure, due to caching and false sharing etc. you can construct something 5%-10% slower, but that'll definitely be a "weird programming antipattern" and thus case (1)).

@atomkirk
Contributor

Imagine I’m building a chat app. If I use Erlang, I can get about 500k-1M connections per node. If I use a language with threads, I can handle far fewer per node. This means it costs more to scale, and the pubsub system is far more stressed, because it has to handle a lot more traffic between nodes to serve the same load as Erlang.

Yes, I could write code to imitate coroutine performance using threads, but I could also write code to manage my own memory. That's not what V is about.

If we rely on a library, I can accidentally starve other coroutines, won't get the same performance as Go/Erlang, and will end up with the same confusing and disappointing concurrency story OCaml is in right now.

@dumblob
Contributor

dumblob commented Aug 25, 2021

@atomkirk please read #1868 properly. It says V shall use only as many threads as there are processing units (CPU cores, FPGAs, ...) and that all go routines will be fully preemptively multiplexed among these very few threads. Moreover, the plan is to make the number of threads follow the sleeping and waking up of processing units at runtime (so as not to force a sleeping notebook with 32 CPU cores, currently running only one core to save energy, to context-switch between 32 threads of a V app). It's a similar design to what Go uses under the hood, but IMHO strictly better (due to the guaranteed full preemptiveness & support for power saving).

So, please go ahead and read the thread (incl. all links and links in those links...). I hope that will immediately dispel all the concerns.

@maddanio
Contributor

How would this preemption work? At which points will you yield a coroutine? Usually this is done actively by the coroutine entering an async call and "waiting" (i.e. yielding) for the result; that's also the most efficient way. Also, how will you do this in a library? To do real context switching you need assembly, or, for fully integrated coroutines like you now have in C++, compiler support.
You could actually use the C coroutine support; afaik the C++ coroutine plumbing can also be used in C.

@maddanio
Contributor

Go most definitely does cooperative multitasking by having coroutines yield on channel actions and async I/O (usually just networking, unless they also implemented async file I/O). So if you say "like Go" you will have to do that, i.e. actively yield coroutines on these statements (channel actions, network I/O, mutexes; I think those are the important ones).

@maddanio
Contributor

The way to get efficient networking revolves around using epoll/select inside the coroutine scheduler and having it know which coroutine to wake up when a certain socket becomes readable; similarly with channels and mutexes.

@dumblob
Contributor

dumblob commented Aug 25, 2021

How would this preemption work? At which points will you yield a coroutine? Usually this is done actively by the coroutine entering an async call and "waiting" (i.e. yielding) for the result; that's also the most efficient way. Also, how will you do this in a library? To do real context switching you need assembly, or, for fully integrated coroutines like you now have in C++, compiler support.
You could actually use the C coroutine support; afaik the C++ coroutine plumbing can also be used in C.

Let me reiterate - please read the whole thread #1868 incl. all links recursively (depth 3 should be enough).

Go most definitely does cooperative multitasking by having coroutines yield on channel actions and async I/O (usually just networking, unless they also implemented async file I/O). So if you say "like Go" you will have to do that, i.e. actively yield coroutines on these statements (channel actions, network I/O, mutexes; I think those are the important ones).

Partially yes - IMHO yielding under the hood will be done less aggressively than Go does it, because there'll be full preemptiveness, so presumably the inserted yields will be put only in critical places chosen based on true performance profiling of representative apps (unlike in Go, where they have no choice and have to put them nearly everywhere to make the language work at all).

@maddanio
Contributor

What is "aggressive" about cooperative yielding? It's simply efficient, because it yields at exactly the points where the thread would end up sleeping anyway.

@maddanio
Contributor

maddanio commented Aug 25, 2021

Also, isn't the fact that just about every other language (even C) now has cooperative coroutines an indication that they are a good idea?

@dumblob
Contributor

dumblob commented Aug 25, 2021

The way to get efficient networking revolves around using epoll/select inside the coroutine scheduler and having it know which coroutine to wake up when a certain socket becomes readable; similarly with channels and mutexes.

Well, this supposes that programmers are dumb and will use 1 go routine per 1 request (be it a network request or any other sample from a high-rate stream), which is one of the dumbest things one can do. Nobody from the Go nor the Erlang world does this, because go routines in Go (and Erlang processes as well) are still quite expensive (you can have only a few million of them, which is by far not enough for a scalable app).

Therefore I'd say V shall actually not make the scheduler this smart. But let's see - maybe someone will provide some measurements and data, and in V 1.1 (which is years ahead IMHO) there'll be such a smart scheduler. But definitely not now for V 1.0, because it's nonsense from my point of view.

what is "agressive" about cooperative yielding? its simply efficient because it will yield at exactly the points the thread would end up sleeping anyway

At some point (if you have too many yields) it becomes less efficient than less frequent preempting (and it also has other downsides - it increases code size a bit, it prevents good CPU-bound performance optimization, etc.).

Please just finally find a few hours to read the thread #1868 recursively (and maybe wait one more day to let the brain calmly absorb it all before proposing other concepts, which are actually quite dated already). We'll be here, we won't run away 😉. This topic is not urgent, it is very old, and many people smarter than me have put their thoughts into it - most of it documented in the #1868 thread and its recursive links.

Also, isn't the fact that just about every other language (even C) now has cooperative coroutines an indication that they are a good idea?

Again - there is a plan for coroutines (as part of the standard library, maybe even with some intrinsics). But IMHO it's lower priority. Feel free to make a PR with a potential API (some people already worked on that, but I can't find the links now quickly - just search for them yourself and ask e.g. on Discord).

@maddanio
Contributor

maddanio commented Aug 26, 2021

At some point (if you have too many yields) it becomes less efficient than less frequent preempting (and it also has other downsides - it increases code size a bit, it prevents good CPU-bound performance optimization, etc.).

I am not convinced. How can it be more efficient to let a routine sleep (on a network select, or waiting on a mutex, or...) than to yield it? The thread will idle and eventually be swapped out by the OS kernel. Unless you are talking about spin locking - but even that you can model with proper co-routines. Also see this talk where Gor Nishanov applies the new co-routines to micro-optimizations that mask cache line latencies, so I believe co-routines, at least the new ones that have compiler support, have been driven "all the way down". Also bear in mind that with the new compiler support for co-routines in recent C++ compilers, the compiler can even inline across and through yield points.

@dumblob
Contributor

dumblob commented Aug 26, 2021

I am not convinced. How can it be more efficient to let a routine sleep (on a network select, or waiting on a mutex, or...) than to yield it? The thread will idle and eventually be swapped out by the OS kernel. Unless you are talking about spin locking - but even that you can model with proper co-routines.

Please really devote several hours to reading the whole thread #1868 incl. links. You'll learn, among other things, about Weave, which nicely shows what the performance differences are - but don't forget we're talking about maxing out the performance of multiple processing units, not just a single core.

And btw., as I said, V will insert yields internally in important places (preferably according to measurements and not human guesses) like selects, mutexes, etc. But I suppose it'll be in a (much) lower number of places than recent Go versions started to do (you can also read about this in one of the linked resources from the #1868 thread).

Also see this talk where Gor Nishanov applies the new co-routines to micro-optimizations that mask cache line latencies, so I believe co-routines, at least the new ones that have compiler support, have been driven "all the way down".

Thanks for the link. Gor explains a cool idea of how to implement very lightweight coroutines which highly efficiently leverage the modern CPU cache hierarchy. These nano-coroutines are unfortunately something V can't build on, simply because they don't support scheduling across multiple processing units. In other words, they support only a single processing unit (i.e. only one thread), and based on Gor's explanation this can't be changed without losing some of their benefits.

I'd even guess that e.g. Weave (which offers tasks, just a slightly different abstraction over the very same coroutine concept) is about as fast as Gor's nano-coroutines even if run only on a single core (despite Weave being designed to max out the performance of many processing units with different processing powers). Feel free to test it and post your results here to let us reproduce them on more machines.

Also bear in mind that with the new compiler support for co-routines in recent C++ compilers, the compiler can even inline across and through yield points.

The inlining micro-optimization sounds like a patch over the wrong abstraction. But yes, thanks to it I'd guess coroutines will catch up with "normal function calls" when it comes to overhead on one core.

@maddanio
Contributor

I had a quick look at Weave; I don't see how it applies to async operations like networking, where you have to keep juggling tasks because most of them simply cannot make progress at any given time due to waiting for I/O. I think you are fundamentally mixing up concurrency with parallelism.

@vlang vlang locked and limited conversation to collaborators Sep 22, 2021

