Introduce boundary/break control abstraction. #16612

Merged: 34 commits, Jan 20, 2023

Contributor

@odersky odersky commented Jan 4, 2023

The abstractions are intended to replace the scala.util.control.Breaks and
scala.util.control.NonLocalReturns. They are simpler, safer, and more performant,
since there is a new MiniPhase DropBreaks that rewrites local breaks to labeled
returns, i.e. jumps.

The abstractions are not experimental. This break from usual procedure is because
we need to roll them out fast. Non local returns were just deprecated in 3.2, and
we proposed NonLocalReturns.{returning,throwReturn} as an alternative. But these
APIs were a mistake and should be deprecated themselves. So rolling out boundary/break now counts
as a bugfix.
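
For illustration, a minimal sketch of the intended usage, assuming the API as it was eventually released in `scala.util.boundary` (Scala 3.3); `firstNegative` is a made-up name, not part of this PR:

```scala
import scala.util.boundary, boundary.break

// Returns the index of the first negative element, or -1 if there is none.
def firstNegative(xs: Array[Int]): Int =
  boundary:
    var i = 0
    while i < xs.length do
      // Exits the enclosing boundary with the value i.
      if xs(i) < 0 then break(i)
      i += 1
    -1
```

Here the `break` sits directly inside the `boundary` body, not inside a closure, which is the shape the description says `DropBreaks` can rewrite to a jump.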

@odersky odersky force-pushed the add-errorhandling branch 2 times, most recently from 42cffa3 to 2848427 Compare January 4, 2023 07:51
@odersky
Contributor Author

odersky commented Jan 4, 2023

It would be good to discuss naming and semantics of this PR while it is in flight.

Comment on lines +36 to +38
catch case ex: Break[T] @unchecked =>
if ex.label eq local then ex.value
else throw ex
Member

This @unchecked is unsound. At the point of the case, you don't know yet that ex has the same T as the Label. So you're introducing unsound relationships here. This is the kind of thing that led us to make a distinction between user casts and compiler-inserted casts. Let's not burn more of that in the standard library.

Consider:

Suggested change
- catch case ex: Break[T] @unchecked =>
-   if ex.label eq local then ex.value
-   else throw ex
+ catch case ex: Break[?] =>
+   if ex.label eq local then ex.value.asInstanceOf[T]
+   else throw ex

Contributor Author

The asInstanceOf[T] is always a no-op since it erases to asInstanceOf[Object] which is then dropped. So it's only complicating the extractors. The problem we had previously was with a post-hoc isInstanceOf.

Contributor

Note that, given that this method is inlined, when the T is statically known then asInstanceOf[T] will be checked.

Contributor

This follows #16550 (comment)

Contributor Author

Yes, but the check will always succeed. And the cast will be inserted anyway. I am leaving out the cast since it makes the boundary recognition more fragile. If we accept (and drop) a cast, how do we know it's not a user-inserted cast that can fail?

Contributor

That would be a non-issue if we don't inline boundary.apply.

Comment on lines 8 to 9
case ex: ControlException => throw ex
case NonFatal(_) => fallback
Member

This is a perfect example of why ControlThrowables are not caught by NonFatal. By making this new ControlException by-passing that mechanism, we introduce more work for user-space code like this to do the right thing.

* { val local: Label[...] = ...; <LabelTry(local, body)> }
*/
def unapply(tree: Tree)(using Context): Option[(Symbol, Tree)] = stripTyped(tree) match
case Block((vd @ ValDef(nme.local, _, _)) :: Nil, LabelTry(caughtAndRhs))
Member

Can't the ValDef be renamed under inlining? It might not always be called nme.local.

Contributor Author

Currently I don't see how it would be renamed. Testing for the name means we narrow down the check much faster, so we can execute LabelTry only for very likely positives.

* Instances of `ControlException` should not normally have a cause.
* Legacy subclasses may set a cause using `initCause`.
*/
abstract class ControlException(message: String | Null) extends RuntimeException(
Contributor

I thought that we wanted to make Break extend Throwable to avoid catching it in a catch that expects an Exception. Why did this change? What is the use case?

Contributor Author

We had a long discussion on contributors about that.

@@ -0,0 +1,72 @@
import scala.util.*

object breakOpt:
Contributor

With run tests we only check the semantics of the boundary/break. We also need to test that the optimization is happening. I propose using bytecode tests similar to https://github.com/lampepfl/dotty/blob/main/compiler/test/dotty/tools/backend/jvm/DottyBytecodeTests.scala#L1519-L1558. There we will be able to check that the gotos and returns are generated as expected.

Contributor

I could add these tests.

Contributor Author

Yes, please go ahead!

Contributor

Can cherrypick 8715766

Contributor Author

These will have to be adapted to latest codegen. In fact, in the interest of robustness, it's probably best to just test that a test contains a labeled block or a try or both. That's all we need to know, and that leaves open later changes in code generation.

Contributor

Updated. Now it can be cherry-picked from 5c7a000.

@sjrd
Member

sjrd commented Jan 10, 2023

I am still skeptical about the new ControlException. I think the pros of having ControlThrowable not caught by NonFatal are valuable. One of the tests must even handle Breaks in addition to NonFatal, proving that point. This change is disregarding years of experience with them for something that is not a strict improvement.

@odersky
Contributor Author

odersky commented Jan 10, 2023

@sjrd That test was an adaptation of rescue.scala which specifically tested the (in my opinion questionable) behavior of ReturnThrowable.

I think ControlThrowable was a mistake as the discussion in contributors illustrates. Specifically, I think it is a very serious problem that Future { ... } might silently terminate its thread and never return anything if it uses a ControlThrowable. That's just the kind of behavior that drives one insane.

With ControlException all you have to keep in your head is exceptions. breaks throw exceptions, that are caught by boundary, period. We simply can't pretend otherwise, it would be a leaky abstraction that can get us into trouble. The optimization in DropBreaks will only kick in if it can show that nobody can observe that a break is implemented by throwing an exception.

@sjrd
Member

sjrd commented Jan 10, 2023

The discussion on Contributors illustrates both pros and cons to Try catching control-throwables, with several people saying that they definitely want control-throwables to bypass Try, hence NonFatal.

With ControlException all you have to keep in your head is exceptions. breaks throw exceptions, that are caught by boundary, period.

It is equally valid to say: breaks throw exceptions that are caught by boundary and not by NonFatal handlers, period. This is not any leakier. It only depends on how we specify what it does.

@odersky
Contributor Author

odersky commented Jan 10, 2023

I think the result of the discussion in contributors was that we will not deprecate ControlThrowable. But I saw no good argument why new code should not use ControlException, given all the problems with ControlThrowable.

The reason for ControlThrowable was to implement non-local returns which specifically should not look like exceptions since local returns aren't exceptions either. But that argument no longer applies.

@sjrd
Member

sjrd commented Jan 10, 2023

I think the result of the discussion in contributors was that we will not deprecate ControlThrowable. But I saw no good argument why new code should not use ControlException, given all the problems with ControlThrowable.

But what are "all the problems" with ControlThrowable? I see a problem with a keyword return silently becoming an exception sometimes. I see no problem with boundary/break always using a ControlThrowable. I see no problem with Try and other NonFatal-based user-space catch-all code letting ControlThrowables through. In fact this is what I want.

I can see an argument for why Future should perhaps catch ControlThrowables in addition to NonFatals (essentially only catching VirtualMachineErrors). But that does not in any capacity invalidate the perfectly good design of NonFatal+ControlThrowable.

Basically: I see no good argument why new code should suddenly use a different concept that doesn't work like the perfectly valid status quo.

@odersky
Contributor Author

odersky commented Jan 10, 2023

My problem with ControlThrowable is that it is already very hard to reason about control flow when there are exceptions. Now we have a third weird way to abort, which is easily overlooked. I.e. we think Try catches all exceptions so Future(...) is safe since it uses Try. But no, ControlThrowables escape. And this causes the problem in Futures and other code that tries to use Try. It's just too many choices to think of and document. We can make our life much easier if the only way to abort non-local things is an exception.

Sure, sometimes we need to make an exception for ControlException if indeed we want it to go through. But that is analogous to all other kinds of exceptions: You catch the ones you need to handle and if your handler is too general you first propagate the ones you want to pass through. I don't see why breaks should be treated with the opposite default.

To illustrate the current insanity, look at this code:

 catch case ex: Throwable => // catches non-local returns
 catch case NonFatal(ex) => // does not catch non-local returns

Conclusion? non-local returns are fatal!

Overall it's just too much hair splitting going on.
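
A small sketch of the asymmetry described above, using only the existing standard library (`Jump` and `matchesNonFatal` are hypothetical names introduced for the demo; `Jump` stands in for the throwable behind a non-local return):

```scala
import scala.util.control.{ControlThrowable, NonFatal}

// Hypothetical control-flow signal, standing in for a non-local return.
final class Jump extends ControlThrowable

// Reports whether a `case NonFatal(_)` handler would catch the given throwable.
def matchesNonFatal(t: Throwable): Boolean = t match
  case NonFatal(_) => true
  case _           => false
```

Any `Throwable` is of course matched by `case ex: Throwable`, but `matchesNonFatal(new Jump)` is false while `matchesNonFatal(new RuntimeException("boom"))` is true, because the `NonFatal` extractor deliberately excludes `ControlThrowable`.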

@sjrd
Member

sjrd commented Jan 10, 2023

Focusing on non-local-returns to make ControlThrowables look bad is not a good argument IMO. Non-local-returns themselves were bad, whether or not they would be caught by NonFatal or not. That's because they were sometimes throwing and sometimes not, resulting in a leaky abstraction. I repeat: the leaky abstraction was non-local-returns; not the ControlThrowables they were using under the hood.

The abstractions are intended to replace the `scala.util.control.Breaks` and
`scala.util.control.NonLocalReturns`. They are simpler, safer, and more performant,
since there is a new MiniPhase `DropBreaks` that rewrites local breaks to labeled
returns, i.e. jumps.

The abstractions are not experimental. This break from usual procedure is because
we need to roll them out fast. Non local returns were just deprecated in 3.2, and
we proposed `NonLocalReturns.{returning,throwReturn}` as an alternative. But these
APIs were a mistake and should be deprecated. So rolling out boundary/break now counts
as a bugfix.
Drop the `transparent` in order to circumvent scala#16609
Change the recommendation in the warning about non-local returns accordingly.

Still to do: A PR against scala/scala that deprecates scala.util.control.Breaks.
`DropBreaks` now detects the calls to `break` instead of their inline expansion.

Also: Make `Break`'s constructor private.
This is needed to avoid verify errors
# Conflicts:
#	tests/run/errorhandling/break.scala
Move the detected `break` methods into the `boundary` object and keep inline
methods in object `scala.util.break` as facades.
@odersky
Contributor Author

odersky commented Jan 18, 2023

The case for keeping break facades: I can use them with a single import

import scala.util.{boundary, break}

I think that's acceptable as a replacement for non-local returns. But requiring two imports

import scala.util.boundary
import scala.util.boundary.break

or requiring the user to write boundary.break instead of return is a bit heavy.

@nicolasstucki
Contributor

An alternative design that would satisfy all our concerns

  • Single import
  • API located in a single place
  • No unnecessary protected methods

would be the following

import scala.util.boundaries.*

boundary {
  while ... do
    ...
    if .. then break()
}
package scala.util

object boundaries:

  final class Break[T] private[boundaries](val label: Label[T], val value: T)
  extends RuntimeException(
    /*message*/ null, /*cause*/ null, /*enableSuppression=*/ false, /*writableStackTrace*/ false)

  final class Label[-T]

  def break[T](value: T)(using label: Label[T]): Nothing =
    throw Break(label, value)

  def break()(using label: Label[Unit]): Nothing =
    throw Break(label, ())

  inline def boundary[T](inline body: Label[T] ?=> T): T =
    val local = Label[T]()
    try body(using local)
    catch case ex: Break[T] @unchecked =>
      if ex.label eq local then ex.value
      else throw ex

end boundaries

@dwijnand
Member

The case for keeping break facades: I can use them with a single import

import scala.util.{boundary, break}

I think that's acceptable as a replacement for non-local returns. But requiring two imports

import scala.util.boundary
import scala.util.boundary.break

or requiring the user to write boundary.break instead of return is a bit heavy.

What does import scala.util.boundary, boundary.break count as? 1 or 2? 😄

@odersky
Contributor Author

odersky commented Jan 19, 2023

What does import scala.util.boundary, boundary.break count as? 1 or 2? 😄

I had to try this to be sure it works. Never saw that one in the wild 😜

We can't use `String.lines()` since that only exists on Java 11 and the CI still runs on Java 8.
Use `linesIterator` in `StringOps` instead.
Contributor

@nicolasstucki nicolasstucki left a comment

Otherwise LGTM

@@ -357,6 +357,7 @@ object StdNames {
val Flag : N = "Flag"
val Ident: N = "Ident"
val Import: N = "Import"
val Label_this: N = "Label_this"
Contributor

Suggested change
- val Label_this: N = "Label_this"

*
* where `target` is the `goto` return label associated with `local`.
* Adjust associated ref counts accordingly. The local refcount is increased
* and the non-local refcount is decreased, since `local` the `Label_this`
Contributor

We don't have the Label_this anymore

// we want by-name parameters to be converted to closures

/** The number of boundary nodes enclosing the currently analized tree. */
var enclosingBoundaries: Int = 0
Contributor

private?

@gfZeng

gfZeng commented Feb 2, 2023

Ha!
If Scala targets pure functional programming, why do we need these?
If not, why don't we just use primitive break and continue?
I'd like to use primitive break and continue. They're convenient.

@SethTisue
Member

SethTisue commented Feb 2, 2023

If Scala targets pure functional programming, why do we need these?

That's not the only style of code Scala targets.

If not, why don't we just use primitive break and continue?

Because the Java/C break and continue only work with while and for. Scala shares its while with those languages, but Scala's for is utterly different. It isn't a looping construct per se, but desugars to method calls. So some other design is needed.
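
To make the desugaring point concrete, here is a small sketch (assuming the released `scala.util.boundary` API; `firstSquareAbove` is a hypothetical name). The `for` below is sugar for `(1 to 1000).foreach(i => ...)`, so the early exit happens inside a lambda, where a Java-style `break` would have nothing to jump to:

```scala
import scala.util.boundary, boundary.break

// Returns the first square of 1..1000 that exceeds limit, or -1 if none does.
def firstSquareAbove(limit: Int): Int =
  boundary:
    // Desugars to (1 to 1000).foreach(i => ...): the break is inside a closure.
    for i <- 1 to 1000 do
      if i * i > limit then break(i * i)
    -1
```

In this shape the break cannot be a plain jump; it has to leave the closure by throwing, and the enclosing `boundary` catches it.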

@gfZeng

gfZeng commented Feb 2, 2023

Because the Java/C break and continue only work with while and for. Scala shares its while with those languages, but Scala's for is utterly different. It isn't a looping construct per se, but desugars to method calls. So some other design is needed.

Syntax design should follow literal meaning. The specific logic of the implementations should be handled by the compiler.

@Ichoran

Ichoran commented Feb 2, 2023

boundary and break sound very similar to a construct I've been using for a while, which I called hop:

def nextPowerOf2(i: Int) = Hop.int{
  Iterator.iterate(1)(_ * 2)
    .takeWhile(_ > 0)
    .foreach(j => if j > i then hop(j))
  i
}

So, being able to discard this part of the library and just being able to

def nextPowerOf2(i: Int): Int = boundary {
  Iterator.iterate(1)(_ * 2)
    .takeWhile(_ > 0)
    .foreach(j => if j > i then break(j))
  i
}

seems nice!

However, I'm concerned about the composability of boundary/break as the only performant alternative to nonlocal returns. For instance, if I write

def nextDangerousThing(i: Int) = Hop.int{
  Iterator.iterate(i)(_ * 2)
    .takeWhile(_ > 0)
    .foreach(j => Try{ hop(dangerously(j)) })
  i
}

then because Try lets through ControlThrowable, the code works: if dangerously succeeds, we hop out. If it fails, Try catches the exception. Likewise, this would work if I used a nonlocal return (here, no difference in functionality because the scope of the hop is the entire function). But

def nextDangerousThing(i: Int) = boundary {
  Iterator.iterate(i)(_ * 2)
    .takeWhile(_ > 0)
    .foreach( j => Try{ break(dangerously(j)) } )
  i
}

does not give the desired result because the Try silently intercepts the control flow.
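
A self-contained sketch of that interception, assuming `Break` extends `RuntimeException` as in the merged design (so the `NonFatal` handler inside `Try` matches it); `firstEven` is a made-up example, not code from this PR:

```scala
import scala.util.{boundary, Try}
import scala.util.boundary.break

// Intended: return the first even element. Under the stated assumption,
// Try turns every break into a Failure, so the loop runs to completion.
def firstEven(xs: List[Int]): Int = boundary {
  xs.foreach(x => Try { if x % 2 == 0 then break(x) })
  -1
}
```

Under that assumption `firstEven(List(1, 4, 6))` falls through to `-1` rather than returning `4`.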

My understanding is that this is supposed to be desirable, but I don't see how this results in composable code. I wouldn't care if there were other options for going at full speed where throwing stackless exceptions wasn't necessary, but if this is the only way to do it, I'm kind of worried.

Now, of course, I can always avoid using Try and anything else that isn't break-aware, as can others, and we can all write things like

object Attempt {
  def apply[A](a: => A) =
    try Correct(a)
    catch
      case b: boundary.Break[_] => throw b
      case NonFatal(e) => Wrong(e)
}

which is basically just Try relabeled, except it lets the break through so you have composable code once again.
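
A runnable variant of the same idea, as a sketch only: it uses `Either` in place of the `Correct`/`Wrong` types above, and the names `attempt` and `firstEven` are hypothetical. It assumes the released API, where the thrown class is `boundary.Break`:

```scala
import scala.util.boundary, boundary.break
import scala.util.control.NonFatal

// Like Try, but rethrows Break so that an enclosing boundary still sees it.
def attempt[A](a: => A): Either[Throwable, A] =
  try Right(a)
  catch
    case b: boundary.Break[?] => throw b
    case NonFatal(e)          => Left(e)

// With the break-aware wrapper the early exit works again.
def firstEven(xs: List[Int]): Int = boundary {
  xs.foreach(x => attempt { if x % 2 == 0 then break(x) })
  -1
}
```

The order of the cases matters: the `Break` case has to come before `NonFatal`, otherwise the break would be wrapped in a `Left`.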

But then people will write things like

boundary {
  Future(Attempt(if nice then foo() else break(bar)))
}

and the exception will leak through and clobber the thread again.

So we have all this churn and accomplish almost nothing, don't we? Aside from pushing off the exact same problem for a while--trading it for a temporary composability problem--until people recover composable code patterns?

I like the boundary/break semantics. I think it's a very clear way to express the problem--as you can tell because I have already been using the same thing! And if the compiler can rewrite the boundary/break into a local return / jump when appropriate, that's fantastic--the JIT compiler sometimes manages with locally-caught stackless exceptions, but it's not something one can really count on.

But I don't see why we don't just reuse the existing ControlThrowable machinery given that we're liable to end up with the same problem soon enough.

The real issue is that some things need to be unbreakable, isn't it?

object Future {
  def apply[A](f: NotGiven[boundary.Label[_]] ?=> A): Future[A] = ???
}

Except that isn't quite a strong enough guarantee: that would be true if you were careless with propagating the breakability context. You probably want something that takes active work to propagate, not just a NotGiven that will pop up anytime you didn't bother keeping track of whether the context is breakable.

object Future {
  def apply[A](f: Unbreakable ?=> A): Future[A] = ???
}

object boundary {
  def apply[A](f: Label[A] ?=> A)(using NotGiven[Unbreakable]): A = ???
}

Then you might finally have enough information in the type system to start allowing the compiler to peel apart the conflict between composability in breakable contexts with unbreakability.

Finally, because boundary/break works with inlining, unlike return, this is really amazing for, for instance, replicating Rust's `?`. It's almost trivial:

extension [L, R](e: Either[L, R])
  inline def ?(using boundary.Label[Either[L, R]]): R = e match
    case l: Left[L, R] => break(l)
    case Right(r) => r

Now whenever you're in breakable context you can shortcut-exit with the error-case with .?.
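
A self-contained version of this, with a usage example. This is a sketch assuming the released `scala.util.boundary` API; `parse` and `add` are hypothetical helpers, and the explicit type argument on `break` is only there to keep inference out of the picture:

```scala
import scala.util.boundary, boundary.break

extension [L, R](e: Either[L, R])
  // Unwraps a Right, or exits the enclosing boundary with the Left.
  inline def ?(using boundary.Label[Either[L, R]]): R = e match
    case l: Left[L, R] => break[Either[L, R]](l)
    case Right(r)      => r

// Hypothetical helper for the demo.
def parse(s: String): Either[String, Int] =
  s.toIntOption.toRight(s"not a number: $s")

// Adds two numeric strings, short-circuiting on the first parse error.
def add(a: String, b: String): Either[String, Int] = boundary {
  val x = parse(a).?
  val y = parse(b).?
  Right(x + y)
}
```

Note that with this signature the boundary's result type has to be the same `Either[L, R]` as the value being unwrapped, which is more restrictive than Rust's `?`.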

But this kind of empowering feature just raises the pressure yet more for having code that is composable.

So, in conclusion: I think this was a great addition, but I think Break should be a ControlThrowable, and the path to thread-safety (and other sorts of context-safety) should involve additional constructs, not abandoning composability.

(The precise kind of composability I mean is that wherever you have

foo => bar(foo)

and you replace it with

foo => bar(nice(foo))

where nice makes un-nice things nice, then in the case where everything works nicely, there should be no change in behavior.)

Final thought: I am personally not at all troubled by dropping Try and all other safety constructs and rolling my own, so for me personally, if boundary/break has at least as good performance as local/nonlocal return, I will be delighted without any additional changes whatsoever! The points I raise above are only for the benefit of those who might prefer not to retool all their safety-handling code.

@bishabosha
Member

bishabosha commented Feb 3, 2023

I figured out a way to get around the limitations of opaque types for the loop example:

import scala.util.boundary, boundary.break

object Loops:

  type ExitToken = Unit { type Exit }
  type ContinueToken = Unit { type Continue }

  type Exit = boundary.Label[ExitToken]
  type Continue = boundary.Label[ContinueToken]

  object loop:
    inline def apply(inline op: (Exit, Continue) ?=> Unit): Unit =
      boundary[ExitToken]:
        while true do
          boundary[ContinueToken]:
            op
    inline def exit()(using Exit) = break(().asInstanceOf[ExitToken])
    inline def continue()(using Continue) = break(().asInstanceOf[ContinueToken])

This manages to optimise to labelled jumps and does not box Unit!

e.g.

import Loops.*
@main def oddsUpToLimit(limit: Int) =
  var i = 0
  loop:
    i += 1
    if i == limit then loop.exit()
    if i % 2 == 0 then loop.continue()
    println(i)

with CFR output it decompiles to

public void oddsUpToLimit(int limit) {
    int i = 0;
    while (true && ++i != limit) {
        if (i % 2 == 0) continue;
        Predef$.MODULE$.println((Object)BoxesRunTime.boxToInteger((int)i));
    }
}

@Ichoran

Ichoran commented Feb 10, 2023

For reference: I have rewritten all my control abstractions that used to use nonlocal return (and macros, usually) to use boundary/break instead. It is fantastic! There are a few issues with type inference not being quite as smooth with Label as it was with CanHop (I think because of how inlining interplays with type inference), but overall it is a very very nice way to abstract early return functionality.

I have also thrown away every use of Try and NonFatal and replaced them with early-return-respecting equivalents.

The only thing that could make things better is a witness that closures, lazy values, threads, and the like, cannot escape the context where early returns are defined. Any of these are broken regardless of how far the exception propagates and should be caught at compile-time when possible. Still, I'm used to having to do this manually anyway, and the advantage of direct-style error handling, for instance, is far bigger than the downside of accidental leakage of control flow into non-consecutive execution context.

Anyway, my code is extra-pretty now! I just write things like

def load(path: Path): DataSet Or Err = Or.Ret:
  val data = nice{ Files.readAllBytes(findDataFile(path).?) }.?
  val reliable = easy(data)
  computeOn(data, reliable).?.dataSet

where Or.Ret is a boundary that specifically takes an unboxed sum type Or (and wraps the return value as a success if execution finishes normally), and nice is like Try except it respects all control flow, and will try to use a given to pack an exception into an error type that you like.

The vanilla equivalent would be something like

def load(path: Path): Either[Err, DataSet] =
  Try{ findDataFile(path) } match
    case f: Failure => Left(Err from f)
    case Success(Left(e)) => Left(e)
    case Success(Right(file)) =>
      Try{ Files.readAllBytes(file) } match
        case f: Failure => Left(Err from f)
        case Success(data) =>
          val reliable = easy(data)
          computeOn(data, reliable).map(_.dataSet)

Anyway, I continue to think that re-using ControlThrowable for Break is a good idea, but I'm not affected personally (at the moment, anyway).

@odersky
Contributor Author

odersky commented Feb 11, 2023

The only thing that could make things better is a witness that closures, lazy values, threads, and the like, cannot escape the context where early returns are defined. Any of these are broken regardless of how far the exception propagates and should be caught at compile-time when possible. Still, I'm used to having to do this manually anyway, and the advantage of direct-style error handling, for instance, is far bigger than the downside of accidental leakage of control flow into non-consecutive execution context.

That's exactly what our ongoing work on capture checking is about. Hopefully this will mature so that it becomes more widely usable. An important next step that is necessary for widespread usage is to capture-annotate the standard library.
