SIP-55 - Concurrency with Higher-Order Coroutines #63

Open
wants to merge 2 commits into
base: main

@diesalbla diesalbla commented Jun 15, 2023

A SIP for introducing coroutines to Scala using implicit parameters. We deal with usual coroutine problems such as colour-transparency and higher-order functions, as well as with concurrency problems.

This SIP follows the Pre-SIP discussions about Suspended functions and continuations, and addresses some of the issues discussed in that forum. It also succeeds recent presentations about Direct Style Scala, and the ongoing work with async.

Co-authored-by: Diego E. Alonso Blas <diego.e.a@47deg.com>
Co-authored-by: Jack C. Viers <jack.viers@47deg.com>
Co-authored-by: Raul Raja Martinez <raul.raja@47deg.com>
@He-Pin
He-Pin commented Jun 15, 2023

It feels like New Year's! Epic!

sealed trait Green extends Color
sealed trait Red extends Green

sealed trait Suspend[+ Col <: Color]

What about naming it Suspendable?

@julienrf
Contributor

Thank you for submitting the proposal. I’ve assigned a team of reviewers who will post their feedback within the next couple of weeks.

@bjornregnell
bjornregnell commented Jun 16, 2023

I'm not an assigned reviewer but I think I found a typo (perhaps due to a name change not fully propagated):
fun should be changed to fili in two places (if I'm not incorrect?).

@doofin
doofin commented Jun 16, 2023

I didn't find a systematic survey about why the current monadic syntax is not sufficient aside from being deemed too difficult for beginners. It would be great if there's something to justify this new syntax.

sealed trait Color
sealed trait Green extends Color
sealed trait Red extends Green

Green and Red are not easy to understand.

*/
class GcdFrame(
var state: Int,
var a: Int, var b: Int, var z: Boolean, var m: Int, var g: Int,

What if we have many local variables? Will that be a problem?

Author

@diesalbla diesalbla Jun 19, 2023

An object of the Frame class stores the method's stack frame on the heap. Just as with the stack frame, its size is proportional to the number of local variables in the method, which is known at compile time. We could do some optimisations, similar to the ones the compiler does with stack frames. For instance, two local variables with non-overlapping lifetimes (the span from assignment to last use) could share the same "position".

Note that such optimisations would be addressed further down the line, as the focus of this SIP is 1) the syntax for coroutines, and 2) a possible implementation known to work in other languages.
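
To make the frame idea concrete, here is a minimal hand-written sketch, not the SIP's generated code. It assumes a plain Euclidean gcd whose calls to isZero and mod are the suspension points, and it simulates suspension by returning to a driver loop. The names GcdFrameSketch, resume, runGcd and the shared tmp slot are hypothetical.

```scala
// Hypothetical heap frame: one object replaces the method's stack frame.
final class GcdFrameSketch(var a: Int, var b: Int):
  var state: Int = 0          // label of the block to run on the next resume
  var tmp: Int = 0            // one slot shared by z (stored as 0/1) and m
  var result: Int = 0
  var done: Boolean = false

// Runs the code between two suspension points, then returns to the caller.
def resume(f: GcdFrameSketch): Unit = f.state match
  case 0 =>                   // z = isZero(b), then suspend
    f.tmp = if f.b == 0 then 1 else 0
    f.state = 1
  case 1 =>                   // read z; finish, or compute m = mod(a, b) and suspend
    if f.tmp == 1 then
      f.result = f.a
      f.done = true
    else
      f.tmp = f.a % f.b       // z is dead from here on, so m reuses its slot
      f.state = 2
  case 2 =>                   // read m; continue as gcd(b, m)
    f.a = f.b
    f.b = f.tmp
    f.state = 0
  case other =>
    throw IllegalStateException(s"unknown state $other")

// Driver standing in for a scheduler: resumes the frame until it completes.
def runGcd(a: Int, b: Int): Int =
  val f = GcdFrameSketch(a, b)
  while !f.done do resume(f)
  f.result
```

In this sketch the frame has a fixed number of fields regardless of how many times the loop runs, and the flag z and the remainder m never live at the same time, which is why one slot is enough for both.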

Thanks to Bjorn Regnell @bjornregnell for pointing those out.
@sjrd
Member

sjrd commented Jul 18, 2023

First, thank you for this extensive, well-researched and well-presented SIP.

The SIP addresses an important problem; one for which I think many agree we want a good solution. The solution space is however daunting. As a result, the solution proposed in this SIP contains substantial complexity, and it is likely that it will need several iterations, or never pass at all.

Now onto my high-level comments on this SIP.

Some things I like

Just to highlight some good stuff that I particularly like. ;)

  • The launcher abstraction. Thanks to that, the language feature is not tied to specific implementation details that may have performance tradeoffs.
  • Structured concurrency.

Keyword versus contextual parameters

This SIP proposes to use contextual parameters to encode the color of a function. It argues that contextual params are better because they do "not require modifying the lexicon or grammar of Scala, nor adding any major features to its system". While that is true, it does require special restrictions on how those contextual parameters can be used. See the section "Front-End Changes".

However necessary, the restrictions are severe enough that they make Suspend contextual params significantly depart from regular contextual params. For example, it means that these types can never be abstracted over; they do not really belong in the type lattice. In fact, they cannot even be subtypes of AnyKind, which is an entirely new world for the spec.

IMO, this is significant enough that we cannot in fact reason about them in the same way as other contextual params. Given that, it is not clear to me that "only being contextual parameters" is a good thing. In practice, I expect that it would be easier to specify and implement if we took coloring as a separate feature.

This aspect also compounds with my next comment.

Codegen and Binary/TASTy compatibility

Color-transparent functions, and a fortiori polychromatic functions, require 2 "lanes". Each lane will require a different actual method in the compilation pipeline. How do we guarantee binary compatibility of these generated-but-public methods? Also, it is not clear what happens when such methods are overridden (or implement abstract methods). Do we have some guarantee that all the semantics of overriding can be preserved by the code transformation?

Regarding my earlier comment about contextual parameters, it also means that some contextual parameters, depending on what type they have, have a profound impact on the binary API of a method. Normally, contextual parameters are all handled in the same way and have no such consequences.
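
As a rough illustration of the two lanes, here is a hand-written sketch of what one colour-transparent higher-order function could turn into. It assumes a continuation-passing encoding for the suspending lane; the names applyTwiceGreen and applyTwiceRed are hypothetical and not taken from the SIP.

```scala
// Green lane: ordinary direct-style code; `f` cannot suspend.
def applyTwiceGreen(f: Int => Int, x: Int): Int = f(f(x))

// Red lane: continuation-passing style. `f` hands its result to a callback,
// so it is free to return to a scheduler and invoke the callback later.
def applyTwiceRed(f: (Int, Int => Unit) => Unit, x: Int, k: Int => Unit): Unit =
  f(x, y => f(y, k))
```

Both methods would be public in the compiled output even though the source declares only one, which is where the binary-compatibility and overriding questions come from.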

Integration with the standard library

Integration with the standard library is problematic. As is, no higher-order function from the stdlib, such as List.map, will work in async contexts. The proposal mentions that adapting the stdlib is left for future work. Unfortunately, I think this will leave us in an intermediate state that is not really acceptable. It will be confusing that standard methods cannot be used with such a deep-reaching language feature.

Ironically, truly accepting the language-feature aspect of this SIP, instead of encoding it as contextual parameters, may provide a better path forward. If existing methods, with existing signatures, can be marked as color-transparent, we may be able to enhance the existing standard library with async support in a backward compatible way. This is not really possible if the signature of the method has to change, which is the case with contextual parameters.

@diesalbla
Author

@sjrd Thanks for looking into the SIP. We have started to discuss the problems you have found, and will address them in an answer later this week.

@odersky
Contributor

odersky commented Jul 18, 2023

I echo @sjrd: This is a great write-up for a SIP. Thanks to the authors for the clarity of writing! Thanks in particular for the set of example programs to be used in case studies. I found these very well chosen.

The SIP proposes a solution to adding coroutines (or rather: one-shot continuations) to Scala on runtimes that don't support these features directly. Runtimes that would support coroutines directly are JVM 21 and later through Loom and possibly Scala Native. But on earlier JVM versions and JS there is no native support in sight. That said, the proposal stresses that there should be a single language standard that can be implemented on all kinds of runtimes.

The proposal covers three areas:

  1. Changes to the language standard and to the compiler frontend
  2. Transforms needed to implement continuations
  3. A runtime that supports higher level structured concurrency abstractions built on the suspension primitive.

The main purpose of the SIP is to discuss (1). (2) is relevant insofar as it demonstrates that the proposal can be implemented. (3) is relevant insofar as it demonstrates expressiveness - we can build nice abstractions for async computation on top of coroutines. Let me discuss them in reverse order:

Runtime

I liked the proposed abstractions for Launchers. This holds a lot of promise for cross-platform concurrency libraries.
One question: On JVM, we already have InterruptedException. Do you mean CancellationException is an alias of that or should it be separate?

Transforms

The transforms described map suspendable functions to state machines where the state is stored in heap objects. This is similar to the techniques used in Kotlin. It also seems to have a lot of resemblance with the techniques in Scala 2.13's async translation - it would be good to explore this further.

The tradeoff of any implementation technique is between how efficient suspensions are and how much slowdown is accepted for transformed code that does not in the end suspend, or that suspends only rarely. State machines are quite efficient for suspensions but impose a significant overhead for normal code execution. The technique of snapshotting via exceptions mentioned in the proposal is an alternative that does the tradeoff in a different direction, by implementing non-suspending code with almost no slowdown. Before deciding on one or the other it might be good to compare both candidate implementations with benchmarks. As the authors note, this is not a mere implementation detail, since exceptions are exposed to programmers and therefore would have to be specified.

The tradeoff is also influenced by how much code gets transformed in the end to make it potentially suspendable. For instance, if we decide that all code should be potentially suspendable (probably not realistic), then we definitely need an implementation that does not penalize straight code that does not suspend. If we have an extremely good prediction of what code might be suspendable we can be less stringent in our requirement. The proposal is to indicate suspendability through an argument of a polymorphic method and to specialize on that parameter, which gives probably a reasonable upper approximation of what can suspend.

Language and Compiler Frontend

The proposal is to use context parameters to indicate suspendability, but to be sound it needs to restrict Suspend context parameters to be second class values. Contrary to what is claimed in the proposal this is a huge change of the type system! A simple suspend modifier as in Kotlin would be a much smaller change to the language than this.

I understand the appeal of context parameters since they tend to get out of the way without needing wrapping or unwrapping, and since they can carry type parameters (which are the colors in this proposal). But still, one cannot simply change the rules like this without risk of feature interactions with everything else.

There are two other problems:

  • These changes would not be binary backwards compatible (@sjrd already expanded on this, so I won't need to).
  • These changes would make the source definitions a lot more complicated. Compare the definition of map with the current one in the standard library. And map is not alone. In an OO language like Scala every call on a parameter method is a virtual method call, so every call could potentially suspend or not. This means we probably need a large scale proliferation of Color type parameters in Suspend arguments.

I believe the proposed restriction of context parameters to second class values is not only a big language change and therefore probably unfeasible but also a missed opportunity. We should embrace the fact that Suspend parameters can be captured in closures since it gives us much cleaner types for higher-order functions like map and pipe! This is the essence of capture checking. To demonstrate, here are the three example functions expressed with Scala's capture checking language import:

def isZero(n: Int)(using Async): Boolean =
  Future(n == 0).await
def mod(num: Int, den: Int)(using Async): Int =
  Future(num % den).await

def gcd(a: Int, b: Int)(using Async): Int =
  if isZero(b) then a else gcd(b, mod(a, b)) + 1

extension [Y] (list: List[Y])
  def map[Z](fili: Y => Z): List[Z] = list match
    case Nil =>  Nil
    case y :: ys => fili(y) :: ys.map(fili)

def pipe[X, Y, Z](tick: X => Y, tock: Y => Z): X ->{tick, tock} Z =
  (x: X) => tock(tick(x))

This uses Async instead of Suspend, but the two are interchangeable in this context.

Notes:

  • We use a slightly higher level abstraction of awaitable futures for isZero and mod instead of shift/resume. (Btw I find the name shift a calamity, please let's use suspend instead!)
  • The definition of map is exactly what it would be without suspensions. More generally, we can keep the definitions of almost all library functions the same as before. The exception is only in lazy abstractions such as Iterator which will need a capture annotation. This changes the source code, but not the binaries.
  • The definition of pipe now states explicitly that the result function can have precisely the effects of tick and tock and nothing else. So you get not only suspension implementations, but on top of that effect checking for free!
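
The point about map can be tried in plain Scala 3, without the capture checking import (so nothing here tracks or checks the capture). This is a minimal sketch under stated assumptions: the Async trait below is a hypothetical stand-in with a single await method, ImmediateAsync is a trivial implementation that never really suspends, and countZeros is an illustrative helper.

```scala
// Hypothetical stand-in for the Async capability.
trait Async:
  def await[T](thunk: () => T): T

// Trivial implementation that runs the thunk immediately.
object ImmediateAsync extends Async:
  def await[T](thunk: () => T): T = thunk()

def isZero(n: Int)(using async: Async): Boolean = async.await(() => n == 0)

// The lambda passed to List.map captures the Async context parameter,
// so the standard, unmodified map can call a function that needs it.
def countZeros(xs: List[Int])(using Async): Int =
  xs.map(n => if isZero(n) then 1 else 0).sum
```

A caller supplies the capability once, for example countZeros(List(0, 3, 0, 5))(using ImmediateAsync), and the signature of map never mentions it.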

One advantage of the current SIP over capture checking is that it makes it very clear which functions can suspend and which cannot (the disadvantage is that we need new types for every function that can suspend). But I believe that question can be answered for capture checking as well. Namely, a function can suspend if it is in the scope of a Suspend (respectively, Async) context parameter or if it invokes a method on another object that might capture a Suspend capability. It's a bit more indirect, but I believe it can be worked out.

Summary

There are lots of things to like about the proposal: A worked out implementation, the Launcher abstractions, the idiomatic code examples used. But it's an extremely impactful and involved change. Before jumping in one direction, we should try out and compare with other type system and implementation techniques.

@gabro
gabro commented Jul 19, 2023

I second what @odersky and @sjrd said, great job on the proposal!

One minor addition: I too worry about the standard library integration and I don't think that should fall under "future work" since that seems to be a drastic change. Even just from a documentation perspective I fear we will return to a CanBuildFrom-like situation in which even explaining a simple method like map will become tricky since its signature will include the notion of a Color.

@diesalbla
Author

diesalbla commented Jul 24, 2023

@sjrd Thanks for the feedback received. I wonder if you could elaborate a bit further regarding the following:

the restrictions are severe enough that they make Suspend contextual params significantly depart from regular contextual params. For example, it means that these types can never be abstracted over; they do not really belong in the type lattice. In fact, they cannot even be subtypes of AnyKind, which is an entirely new world for the spec.

Could you elaborate a bit further on this? What form of abstractions, or what language constructs, would not be possible with the encoding that would be possible with a different one?

We are not sure if by "abstraction" you are referring to polymorphism. The examples in the SIP make use of parametric polymorphism, in which the colour is set by the caller. It would also be possible to have an example of a Generalised Algebraic Data Type.

trait Recorder[C <: Color]:
  def record(s: String)(using Suspend[C]): Unit

object MemoryWriter extends Recorder[Green]
object DiskWriter extends Recorder[Red]

This could be transformed to bytecode as follows: the record method of the Recorder trait would correspond, in binary, to a class that declares two methods, which correspond to the green lane or the red lane.
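
A hand-written sketch of what such a lowering might look like at the source level, assuming a continuation-passing encoding for the red lane. The names RecorderLowered, recordGreen, recordRed and MemoryWriterLowered are hypothetical and are not taken from the SIP.

```scala
import scala.collection.mutable.ListBuffer

// Hypothetical lowered form of Recorder: one abstract method per lane.
trait RecorderLowered:
  def recordGreen(s: String): Unit                 // direct lane, never suspends
  def recordRed(s: String, k: () => Unit): Unit    // suspending lane, resumes through k

// A recorder that never suspends can implement the red lane by delegation.
object MemoryWriterLowered extends RecorderLowered:
  private val buffer = ListBuffer.empty[String]

  def recordGreen(s: String): Unit =
    buffer += s

  def recordRed(s: String, k: () => Unit): Unit =
    recordGreen(s)
    k()                                            // resume the caller immediately

  def contents: List[String] = buffer.toList
```

Under this encoding every implementer and every override has to supply both methods consistently, which is one way to read the concern about overriding raised earlier in the thread.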

@diesalbla
Author

Integration with the standard library is problematic.
As is, no higher-order function from the stdlib, such as List.map, will work in async contexts.

@sjrd What is referred to with the words "work in async contexts"? I assume that it refers to the fact that, even from within a coroutine, it would not be possible to call a HOF passing it another coroutine as argument. Note that it would still be possible to call a HOF, applied to a green function, just like any other green function; except that the caller coroutine would not suspend in that HOF call.

@diesalbla
Author

Here is a first response to some of the concerns raised.

Suspend as Second-Class values

Thanks for the reference to the "Gentrification" paper. This paper does describe how to achieve some of the restrictions we could need for the Suspend parameters. It also anticipates, in Section 4.2, the essence of our solution for colour-transparent functions:

We informally introduce a degree of “second classiness”, which we achieve by parameterizing @local as @local[P], where P denotes a privilege level and is in contravariant position. Implicitly, a @local annotation denotes the most restricted privilege level, while its absence denotes no restrictions (first class). (Section 4.2)

[...] In Scala, we can represent privileges directly as types, and their relationships via subtyping.

[...] the key application relies on a much more general insight: in a system with subtyping, we can use the underlying type lattice as privilege lattice.

It is sometimes desirable to abstract over the level of privilege in order to prevent code duplication and keep an existing interface unmodified.

Going back to what @sjrd mentioned about the Suspend types not being part of the type lattice, the paper also refers to a privilege lattice.
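
For illustration, the colour declarations quoted earlier in this thread already induce a small lattice through ordinary subtyping. The sketch below only demonstrates that subtyping fact in plain Scala 3; it says nothing about the second-class restrictions. Suspend is reduced to an empty class, and needsGreen and hasRed are hypothetical names.

```scala
sealed trait Color
sealed trait Green extends Color
sealed trait Red extends Green

// Minimal stand-in; the variance follows the declaration quoted in the review.
final class Suspend[+C <: Color]

// Compile-time evidence that covariance lifts Red <: Green to Suspend.
val redIsGreen: Suspend[Red] <:< Suspend[Green] =
  summon[Suspend[Red] <:< Suspend[Green]]

def needsGreen(n: Int)(using Suspend[Green]): Int = n + 1

// A Suspend[Red] in scope satisfies a request for Suspend[Green].
def hasRed(n: Int)(using Suspend[Red]): Int = needsGreen(n)
```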

Before we submitted the SIP, we took a look at Erased Definitions.

We introduce erased terms to overcome this limitation: we are able to enforce the right constrains on terms at compile time. These terms have no run time semantics and they are completely erased.

[...] erased parameters will not be usable for computations, though they can be used as arguments to other erased parameters.

With erased definitions we can define a separate set of values that can be passed as parameters in compile-time to "prove" that a capability has been granted. At least at runtime, erased values cannot be captured in closures or escape method executions; but we are not clear as to whether erased values can still be passed as fields of non-erased classes. Perhaps these erased definitions could give the same kind of constraints that second-class values would give for the Suspend parameter.

Standard Library integration

Regarding the following points about the standard library:

The proposal mentions that adapting the stdlib is left for future work. Unfortunately, I think this will leave us in an intermediate state that is not really acceptable. It will be confusing that standard methods cannot be used with such a deep-reaching language feature.

One minor addition: I too worry about the standard library integration and I don't think that should fall under "future work" since that seems to be a drastic change. Even just from a documentation perspective I fear we will return to a CanBuildFrom-like situation in which even explaining a simple method like map will become tricky since its signature will include the notion of a Color.

We have left the changes to the standard library (StdLib) out of scope for this SIP, not as a matter of design but as a matter of process. Our point is that “coroutines in the language” and “coroutines in the library” should be separate features proposed in separate SIPs and delivered in separate releases, because adding things to a language is a different proposition from retrofitting the foundation of every Scala program. We frame this SIP as a scope of work that can be worked on and delivered in a minor release of the language. The resulting intermediate state is not what we desire and may be barely acceptable, but it would be manageable for the community of users of the language.

Major features such as coroutines need to be introduced slowly and carefully, taking into account feedback from users (that is, Scala developers). Regrettably, most users neither discuss nor try new features at the SIP or compiler pull-request stage, but only once they ship in a compiler release. That is why this SIP proposes adding coroutines as an opt-in feature, which willing early adopters may try out. Embedding them into the StdLib from the outset would affect existing codebases and disrupt the work of those who may prefer to wait. This incremental approach would also account for our own optimistic bias in favour of our proposal. The subject is deep enough to deserve discussion beyond the scope of a SIP.

Note that this is not the same concern as that of experimental features, which are often taken not to be fully reliable. Instead, we seek for coroutines alone to be eventually delivered as a full-fledged non-experimental feature, which programmers may use from there on. However, to avoid any impact to existing codebases, it should be kept out of StdLib.

As a point of contrast, project Loom has been going on for years, but its outcomes are split into several features, each one going through a four-stage pipeline, each stage being a separate JEP and release.

@doofin

doofin commented Jul 27, 2023

Frankly speaking, I only see this new syntax as a kind of shorthand for using monads which might be easier for newcomers. However, this approach may introduce many complexities as @sjrd said. I tried to find the rationale behind this proposal but failed to be convinced.

@anatoliykmetyuk
Contributor

anatoliykmetyuk commented Aug 23, 2023

@sjrd do you have any updates on this one? Do we need to discuss it during the next SIP meeting?

@sjrd
Member

sjrd commented Aug 23, 2023

Sorry for the radio silence here.

@diesalbla We discussed this SIP at the last SIP meeting. Broadly speaking, the rest of the committee agreed with the comments that @odersky, @gabro and myself had posted before. In particular, to be able to move forward with this SIP, there were two major points made.


First, the standard library integration. I understand your reluctance to touch the library because it would have immediate impact on everyone. However, if we introduce this into the language, and it turns out later that we cannot make it work with the standard library, we will have a much bigger problem on our hands. It is therefore critical to study the impact on the standard library, and demonstrate that it can evolve in a reasonable way together with this proposal.

Your counter-argument to this line of thought was that this proposal was all "opt in". However, you could say that about any proposal that is made. Since every proposal has to meet certain backward compatibility criteria, every proposal must be "opt in" in the sense that codebases can choose not to use the new features (and indeed, existing codebases, by definition, would not use them). Once a language feature enters the language and its specification, it is there forever. It is never "opt in" in that sense: every future change will have to take it into account. If we find future incompatibilities, for example with the standard library, we cannot "opt out" or "discard" that feature later; it still has to stay there.


The second point is a bit awkward. It is not necessarily up to you to figure out, but eventually we will need a comparison of this approach with the capture checking approach. It's awkward because you were the first to actually submit a SIP, but the capture checking is a large effort that we believe/hope will lead to an eventually better result, in terms of usability and notation overhead. Having both would be redundant, so eventually we'll have to choose. It is possible that we will want to wait for a SIP coming from the capture checking side to be able to weigh the two proposals against each other.

@Atry
Contributor

Atry commented Dec 14, 2023

With the help of virtual threads, all vanilla functions on the JVM are already color-transparent.

@gabro gabro removed their request for review January 22, 2024 15:33