Decide on the role of : in indentation syntax #7136

Closed
odersky opened this issue Aug 30, 2019 · 94 comments

Comments

@odersky
Contributor

odersky commented Aug 30, 2019

The meaning of : for significant indentation is more contentious than other aspects. We should come up with a crisp and intuitive definition of where : is allowed.

One way I'd like to frame significant indentation in Scala 3, is that braces are optional, analogously to how semicolons are optional. But what does that mean? If we disregard : it means:

At all points where code in braces {...} is allowed, and some subordinate code is required, an indented code block is treated as if it was in braces.

The second condition is important, since otherwise any deviation from straight indentation would be significant, which would be dangerous and a strain on the eyes. But with that condition it's straightforward: If some subordinate code is required, either that code follows on the same line, or it is on a following line, in which case it should be indented. Braces are then optional, they are not needed to decide code structure. In the following I assume this part as given.
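A minimal sketch of this ground rule, assuming Scala 3 with its default indentation-aware syntax; `classifyBraces` and `classifyIndented` are hypothetical names. After `=`, `then` and `else` some subordinate code is required, so an indented block is read as if it were in braces:

```scala
// Braces version: the body and both branches are delimited explicitly.
def classifyBraces(n: Int): String = {
  val parity = if (n % 2 == 0) { "even" } else { "odd" }
  s"$n is $parity"
}

// Indented version: `=`, `then` and `else` each require subordinate code,
// so the indented lines that follow are treated as a braced block.
def classifyIndented(n: Int): String =
  val parity =
    if n % 2 == 0 then
      "even"
    else
      "odd"
  s"$n is $parity"
```

Both definitions should behave identically; only the delimiters differ.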

So that leaves the places where some code is not required but we still like to insert braces. There are two reasonable use cases for this:

  1. Between the extends clause of a class, object, or similar and its (optional) template definitions in braces.
  2. In a partial application such as
    xs.foreach:
      ...
    The motivation for this second case is to make uses of library-defined operations close to
    native syntax. If native syntax allows braces to be dropped, there should be a way for
    library-defined syntax to do the same.

Possible schemes

The current scheme relies on colons at end of lines that can be inserted in a large number of situations. There are several possible alternative approaches I can see:

  1. Don't do it. Insist on braces for (1) and (2).

  2. Split (1) and (2). Make indentation significant after class, object, etc headers without requiring a colon. This has the problem that it is not immediately clear whether we define an empty template or not. E.g. in

    object X extends Z
    
     // ...
     // ...
    
     object Y

    it is hard to see whether the second object is on the same level as the first or subordinate. But semantically it makes a big difference. So a system like that would be fragile. By contrast, a mandatory : would make it clear. Then the version above would be two objects on the same level and to get a subordinate member object Y you'd write instead:

    object X extends Z:
    
     // ...
     // ...
    
     object Y

    So I actually quite like the colon in this role.

  3. Split (1) and (2) but require another keyword to start template definitions after class, object, etc headers. @eed3si9n suggested where. It's a possibility, but again I do like : at this point. It reads better IMO (and where might be a useful keyword to have elsewhere, e.g. in a future Scala with predicate refinement types).

  4. Keep : (or whatever) to start templates but introduce a general "parens-killing" operator such as $ in Haskell with a mandatory RHS argument. If that occurred at the end of a line, braces would be optional according to our ground rules.

    I fear that a general operator like that would lead to code that was hard to read for non-experts. I personally find $ did a lot more harm than good to the readability of Haskell code.

  5. Restrict the role of : to the two use cases above. I.e. allow a colon if

    • a template body is expected, or
    • in a partial application, after an identifier, closing bracket ] or closing parenthesis ).

    This way, we would avoid the confusing line noise that can happen if we allow : more freely.

My personal tendency would be to go for that last option.
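A minimal sketch of the two use cases in that last option, assuming Scala 3.3 or later (where a colon after an identifier at the end of a line can start an indented argument); `Z`, `Outer`, `Inner` and `sumOfDoubles` are hypothetical names:

```scala
trait Z

// No colon: the header ends the definition. X has an empty template
// and Y, at the same indentation level, is a sibling.
object X extends Z
object Y

// Colon at the end of the header: the indented lines form the template,
// so Inner is a member of Outer.
object Outer extends Z:
  val answer = 42

  object Inner:
    val name = "inner"

// Colon after an identifier in a partial application: the indented
// lines are passed as the argument of foreach.
def sumOfDoubles(xs: List[Int]): Int =
  var total = 0
  xs.foreach:
    x => total += x + x
  total
```

With the colon, nesting is decided by the header line rather than by how far the next definition happens to be indented.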

@odersky odersky changed the title Decide on role of : in indentation syntax Decide on the role of : in indentation syntax Aug 30, 2019
@LPTK
Contributor

LPTK commented Aug 30, 2019

Regardless of whether we find a solution for (2), I would go for requiring braces for (1) — object, class, and trait definitions — which is what I proposed in the other thread.

But let me repeat and strengthen the rationale here:

  • It provides a minimum amount of guidance while reading big nested definitions.

    I already find it really difficult to read sources like this or this, where there are several levels of nested objects/classes with huge bodies. It's really challenging to see what begins and ends where — just imagine what it would look like without braces. Currently, using an IDE helps, as I can place the cursor on an opening or closing brace and see the whole corresponding scope highlighted/delineated. That won't be possible without braces.

    And as others have pointed out, this kind of code does come up in realistic situations (my examples are from the scala/scala repo after all). If we make braces around classes/objects optional, we'll encourage people not to use them, and we'll invariably end up with unreadable messes when files naturally grow in size over time.

  • It makes sense conceptually, if we consider that curly braces are used specifically to denote object scopes.

  • I would also argue that object scopes should be explicitly delineated because they are much more significant than expression scopes. Indeed, in an object scope, you get a bunch of implicitly imported definitions, you get a new meaning for this, etc.

@odersky
Contributor Author

odersky commented Aug 30, 2019

@LPTK I believe that end markers are actually a far superior way to delineate class scopes. And having both braces and end markers would look weird.

@odersky odersky added this to To do in Scala 3.x planning Aug 30, 2019
@JanBessai

Not sure if this is the correct thread, but:
How does indentation sensitivity mix with triple quoted multi-line strings? Does the end-quote have to be indented? If so: how do you write down a line break + spaces before the end quote? Indent further?

@sjrd
Member

sjrd commented Aug 30, 2019

At the risk of having to be my own devil's advocate in a week or so ... what about using with instead of : for the two use cases above?

For templates:

class Foo extends Bar with SomeTrait with
  def x: Int = 42

The with there does not shock me. It is even quite easy to interpret it as "braces are optional" if we also allow

class Foo extends Bar with SomeTrait with {
  def x: Int = 42
}

That syntax makes sense to me, as the contents of the block are added as members of Foo just as much as the members of SomeTrait. I don't know, it kind of makes sense to me.

For that, the lexer would introduce an indent if with is at the end of the line and the next line is indented and starts with anything but an identifier.

I would actually prefer to simply not have anything to open a template, though. However this might require the parser to feed into the lexer to implement, which is not ideal. Or maybe we can make it work by simply counting opening and closing brackets? That would be nice.

For method calls:

xs.foreach with
  x =>
    println(x)

val y = xs.foldLeft(0) with
  (prev, x) =>
    prev + x

val z = optInt.fold with
  println("nope")
with
  x => println(x)

This syntax had actually been proposed in the original "new implicits" proposals, for implicit parameter lists. So it must have appealed at some point. Except here it is used to pass normal arguments, when they are blocks (by-name params or lambdas).
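For reference, a small sketch of what the `fold` call above corresponds to in ordinary argument syntax. `describe` and `describeBraces` are hypothetical helpers that return strings instead of printing, so the result can be checked:

```scala
// Option.fold takes a by-name default (used for None) and a function (used for Some).
def describe(optInt: Option[Int]): String =
  optInt.fold("nope")(x => s"got $x")

// The same call with each block argument in braces, which is the
// form the proposed `with` syntax would replace.
def describeBraces(optInt: Option[Int]): String =
  optInt.fold {
    "nope"
  } { x =>
    s"got $x"
  }
```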

The advantage over : is that, well, it's not :. There have been several concerns about the multiple problems that : exposes, not the least being the overloading wrt. type ascriptions.

with also suffers from overloading in this case, but perhaps it is less annoying than : because it would only have two meanings that are very well visually separated: with in a class header is composition; with in an expression is a block argument.

I don't really like what I'm proposing, but I dislike it a lot less than :.

@jducoeur
Contributor

I don't really like what I'm proposing, but I dislike it a lot less than :.

Strong agreement. I think : is terrible -- not only is it badly overloaded, it's simply too visually subtle. I'm concerned that it will lead to bugs due to people just missing it. I dislike this whole approach (IMO it crosses the line into designing a different language, not enhancing Scala, and enormously increases the risk of splitting the community) but probably this aspect most of all...

@eed3si9n
Member

In #7083 I wrote:

Since colon means type annotation (ascription) x: A already, reusing this to mean "begin bill of material" or "begin block" seems odd to me too.

F# uses =, and Haskell uses where:

module Main where  
  import A  
  import B  
  main = A.f >> B.f

(2) In a partial application such as xs.foreach:

I think with is an improvement over :, but it does suffer from overloading. Could we borrow <| from F# here?

xs.foreach <| x =>
  println(x)

ys.foreach <| y => println(y)

val y = xs.foldLeft(0) <| (prev, x) =>
  prev + x

If you squint, it looks like begin {. Note here that the indentation doesn't start until \n so you can write x => on the same line.

pattern match

Pattern matching is an odd one because <| is the opposite direction.

val kind = ch match
  case ' '  => "space"
  case '\t' => "tab"
  case _    => s"'$ch'-character"

You basically want |> but that might be too confusing:

val kind = ch |>
  case ' '  => "space"
  case '\t' => "tab"
  case _    => s"'$ch'-character"

I think we should just allow match to be a special indentation introducer.

(1) template definitions

What's interesting about template is that it is both a list of members and it's also the body of the constructor. So in that sense it might make sense to keep it the same syntax as block introducer.

class Contact(name: String) extends Bar with SomeTrait <|
  def x: Int = 42

  DB.append(name)

object Contact <|
  def apply: Contact = new Contact("")

I understand that = is not exactly right, but visually it looks less intimidating I think

class Contact(name: String) extends Bar with SomeTrait =
  def x: Int = 42

  DB.append(name)

object Contact =
  def apply: Contact = new Contact("")

I don't really like what I'm proposing, but I dislike it a lot less than :.

Ditto.

@lihaoyi
Contributor

lihaoyi commented Aug 31, 2019

I propose re-using existing keywords wherever possible. The whole point of this exercise is to make the syntax lightweight: having long keywords like where defeats the purpose entirely. If necessary, we could commandeer do as a short-enough keyword to delimit blocks in the case of ambiguity. I think : is common enough that overloading it would be confusing, and it really isn't that much shorter than do anyway.

Re-using existing keywords can get us surprisingly far. Consider Scalite:

package scalite.tutorial                                    package scalite.tutorial

class Point(xc: Int, yc: Int)                               class Point(xc: Int, yc: Int) {
    var x: Int = xc                                           var x: Int = xc
    var y: Int = yc                                           var y: Int = yc
    def move(dx: Int, dy: Int) =                              def move(dx: Int, dy: Int) = {
        x = x + dx                                              x = x + dx
        y = y + dy                                              y = y + dy
                                                              }
    override def toString() =                                 override def toString() = {
        "(" + x + ", " + y + ")"                                "(" + x + ", " + y + ")"
                                                              }
                                                            }
object Run                                                  object Run {
    def apply() =                                             def apply() = {
        val pt = new Point(1, 2)                                val pt = new Point(1, 2)
        println(pt)                                             println(pt)
        pt.move(10, 10)                                         pt.move(10, 10)
        pt.x                                                    pt.x
                                                              }
                                                            }
var x = 0                                                   var x = 0
for(i <- 0 until 10)                                        for(i <- 0 until 10) {
    val j = i * 2                                             val j = i * 2
    val k = j + 1                                             val k = j + 1
    x += k                                                    x += k
                                                            }
val list =                                                  val list = {
    for(i <- 0 to x) yield                                    for(i <- 0 to x) yield {
        val j = i + 1                                           val j = i + 1
        i * j                                                   i * j
                                                              }
                                                            }
list.max                                                    list.max
// 10100                                                    // 10100
val all = for                                               val all = for {
    x <- 0 to 10                                              x <- 0 to 10
    y <- 0 to 10                                              y <- 0 to 10
    if x + y == 10                                            if x + y == 10
yield                                                       } yield {
    val z = x * y                                             val z = x * y
    z                                                         z
                                                            }
all.max                                                     all.max
// 25                                                       // 25

I think this looks superior to using any delimiter. Note that the above already works in Scala 2.11/12/13, and has worked for 5 years now.

We have to use the same whitespace rules for both def/var/vals and class/trait/object. Having mixed rules for what things are "whitespace compatible" and what things aren't is a mess, and would look terribly ugly, especially in Scala where (unlike Java) having nested inner class/objects at the same level as your def/var/vals is common, and having top-level def/var/vals at the same level as your top-level class/objects is going to become common as well. If we want classes/objects to be "special", that ship sailed long ago.

One subtlety that we have to take care of is providing for higher-order methods. Code like:

val foo = bar
  .map{x => 
    val y = x + 1
    y + 1
  }
  .foreach{ x =>
    val y = x + 1
    println(y)
  }

is extremely common with Scala's method-chaining conventions, and I'd want to be able to provide a whitespace-compatible syntax:

val foo = bar
  .map x => 
    val y = x + 1
    y + 1
  .foreach x =>
    val y = x + 1
    println(y)

In Scalite I commandeered the do keyword for this purpose:

val xs = 0 until 10                                         val xs = 0 until 10
val ys = xs.map do                                          val ys = xs.map{
    x => x + 1                                                x => x + 1
                                                            }
ys.sum                                                      ys.sum
// 55                                                       // 55

val zs = xs.map do                                          val zs = xs.map{
case 1 => 1                                                   case 1 => 1
case 2 => 2                                                   case 2 => 2
case x if x % 2 == 0 => x + 1                                 case x if x % 2 == 0 => x + 1
case x if x % 2 != 0 => x - 1                                 case x if x % 2 != 0 => x - 1
                                                            }
zs.sum                                                      zs.sum
// 45                                                       // 45
val ws = xs.map do x =>                                     val ws = xs.map { x =>
    val x1 = x + 1                                            val x1 = x + 1
    x1 * x1                                                   x1 * x1
                                                            }
ws.sum                                                      ws.sum
// 385                                                      // 385

Since we seem to be deprecating do-while loops as I suggested 5 years ago, the do keyword is freed up. As a short, 2-character, now entirely unconflicted keyword, we should make sure we use it as effectively as possible.

For a longer example using what I propose, please take a look at this self-contained JSON parser:

I think it would be a good starting point to compare different syntaxes, being meaty enough to really give you a feel of things where trivial 10-line examples do not. The short examples are also illustrative:

@odersky
Contributor Author

odersky commented Aug 31, 2019

I want to comment on sub-part (2), i.e. what to use for starting an indented argument. I believe that no single keyword would work well in that role.

with was borderline acceptable as an implicit argument, since it evokes pairing with a context. But for plain arguments it looks wrong at least as often as it looks OK. For instance:

math.logarithm with 
  val x = f(y)
  x * x

I believe with would only work well if the real function application is to something else and the part following with is in some way an accessory to that. E.g. in xs.map with f it works since the mapped argument is xs and f is an accessory. But the case of logarithm above shows that we cannot generalize that to all applications.

do evokes side effects. It is used now in that role specifically in while or for loops. E.g.:

for 
  x <- xs
  y <- ys
do
  println(x + y)

I believe for pure functions, do is out of place. E.g.:

transpose do
  val a: Matrix = ...
  val b: Matrix = ...
  a * b

In fact, the pattern of function application is so general and multi-faceted that no single keyword can do it justice. That's why Dijkstra uses infix point (well, that's taken already in Scala!) and Haskell uses $. But colon does work brilliantly in this role! Evidence #1 is Python, where it is perceived to be very natural. Yes, I know Python uses it everywhere instead of just for this purpose, but still... Evidence #2 is common language. It's very natural to write something like:

  To make a cake:
    preheat oven,
    mix flours, eggs and milk,
    bake for one hour.

The : specifically introduces a list of statements that's subordinate to a prefix clause. That's completely grammatical. So, I believe strongly that : is the best operator for this.

As to possible ambiguity with type ascription: I really don't think that's a problem. : at the end of line and : used infix are visually quite distinct. The only caveat is that we should not let one follow the other. I.e.:

  def f(): T:
     return foo

would be awkward. But ever since procedure syntax was dropped, Scala does not have syntax where this pattern could occur.

The only possible ambiguity is in a type ascription of an expression spanning multiple lines like this:

  someLongExpression:
    someLongType

But I have not seen code like this in the wild, and in fact our code base including all tools, all tests, and community build does not have a single instance where this pattern occurs. Why does it not occur?
I guess if you have long expressions and types to combine you realize that you are probably better off factoring stuff into a val:

  val someId: someLongType = 
    someLongExpression
  someId

And, if you really need to write a multi-line ascription, you can always do:

  someLongExpression
    : someLongType

which in fact reads much better. So, in summary, any ambiguity would be extremely rare, and is easily avoided. The other evidence why ambiguities are not a concern is again Python. Python does use : for both roles and the Python community is not known to be sloppy with syntax. Interestingly Python does not use : to indicate a function return type, since that would let them run into precisely the awkwardness I referred to earlier. It uses -> instead. But since Scala uses = instead of : to start a function body it does not run into that problem.

One downside of : is that it only works at the end of lines. So:

  xs.map: x =>
    val y = f(x)
    g(y)

does not work. You have to format it instead as:

  xs.map: 
    x =>
    val y = f(x)
    g(y)

I think this is not so bad. In real code the { x => part is often quite far to the right because the expression preceding it is long. This makes it hard to see the bound name. The vertical syntax makes it much clearer what is bound.

@lihaoyi
Contributor

lihaoyi commented Aug 31, 2019

I think this is not so bad. In real code the { x => part is often quite far to the right because the expression preceding it is long. This makes it hard to see the bound name. The vertical syntax makes it much clearer what is defined.

To me this formatting is the deal breaker, much more than the ambiguity around type ascriptions. I work with a lot of real code formatted exactly as you describe, and it reads excellently. If the LHS is long, the .map goes on a new line.

I have seen no code at all formatted similar to how you propose, and subjectively it looks awful. If it really looked better, people would already be formatting their lambdas like that right now, and they’re not.

@odersky
Contributor Author

odersky commented Aug 31, 2019

@lihaoyi

I have seen no code at all formatted similar to how you propose, and subjectively it looks awful. If it really looked better, people would already be formatting their lambdas like that right now, and they’re not.

Fair point.

Here's a crazy idea for this pattern: use case! Examples:

xs.map case x =>
  val y = f(x)
  g(y)

xs.collect case Some(n) => n

xs.foreach case i => println(s"next: $i")

Points in favor:

  • case is semantically a correct option here.
  • we can combine this with a pattern match.
  • indentation works out of the box, but now it's prompted by the => at the EOL.

To make this work we'd have to add one production to Expr1. It's the last line below:

Expr1 ::= ...
             |  Expr2 ‘match’ ‘{’ CaseClauses ‘}’
             |  Expr2 ‘case’ Pattern [Guard] ‘=>’ Expr

WDYT?

@odersky
Contributor Author

odersky commented Aug 31, 2019

[Aside: You may have noted that I "ate my own dog food" in the comments above: every single indented section was introduced with :. Readers should judge for themselves whether this is natural or not.]

@lihaoyi
Contributor

lihaoyi commented Aug 31, 2019

case looks ok-ish to me, but that seems to introduce an even worse ambiguity: that between partial functions and total functions! This is something we had already made efforts to disambiguate in Dotty (e.g. requiring case for partial functions in for-comprehensions), so that case always means partiality, and partiality always means case. Requiring case just to make indentation work properly is definitely a step backwards.

Honestly, I think we should just use do. The English meaning is almost irrelevant: with, class, object, for, etc. in Scala already mean vastly different things in English and Scala, and that's been mostly OK. Having do mean something specific in Scala isn't going to be the end of the world, especially since it's going to be entirely unambiguous and consistent: do will have no other meanings once do-while loops are out. I think it would definitely be preferable to overloading with or case or some other keyword which have very specific, existing Scala meanings.

I appreciate the desire to use :, but function-literals-with-arguments is a real sticking point. Python can get by with (1) not having multiline lambdas, (2) a different -> foo syntax for annotating return types, and (3) a dedicated syntax for context managers. Scala has none of these things, and passing multi-line n-arg function literals to higher order functions is the order of the day. We have to make sure it looks pretty and first class.

@odersky
Contributor Author

odersky commented Aug 31, 2019

I have strong objections against do. do universally means imperative side effect, in natural language as well as in all programming languages I know. (do comprehensions in Haskell model side effects via monads). So I believe we cannot simply change its meaning, in particular in a language like Scala which is predominantly functional but still allows side effects.

case is often used with partial functions but not always. A counter example is:

case class Point(x: Double, y: Double)

val points: List[Point] = ...
points.map {
  case Point(x, y) => ...
}

So, case is also used for destructuring in total functions. A simple variable pattern is a special case of that (and it's obvious at a glance that that's what it is).

@lihaoyi
Contributor

lihaoyi commented Aug 31, 2019

I'd argue that the vast majority of dos in any language are not in Haskell (or Java/C/C++/Javascript/etc., where do-while loops are uncommon) but in Ruby, which uses do exactly to delimit lambdas which can have a return value:

$ cat foo.rb
def my_map(array)
  new_array = []

  for element in array
    new_array.push yield element
  end

  new_array
end

result = my_map([1, 2, 3]) do |number|
  number * 2
end

puts result.to_s
$ ruby foo.rb
[2, 4, 6]

Here it is used exactly as I am proposing for Scala: to delimit a multiline lambda function taking parameters and returning a value, as a replacement for curlies.

result = my_map([1, 2, 3]){ |number|
  number * 2
}

The usage of do blocks in higher-order collection transformations is exactly as it would be in Scala:

$ cat foo.rb
result = [1, 2, 3, 4].reduce do |sum, i|
  x = i * i
  sum + x
end

puts result.to_s

$ ruby foo.rb
30

Honestly I find case for de-structuring a bit of a wart: we are already getting rid of the need for case-destructuring tuples in Dotty, which I think is a great step forward. Regardless, to me having case mean "partial functions and destructuring" is still a lot better than "partial functions and destructuring and multiline function literals". The last concept really has nothing to do with the first two, and the case keyword is completely out of place: it is taking a keyword with a well-known meaning, and making it required for something entirely unrelated.
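To make the three roles concrete, here is a small sketch assuming Scala 3 (for parameter untupling); `Point` follows the earlier example, while `points`, `sums`, `largeXs`, `pairs` and `products` are hypothetical names:

```scala
final case class Point(x: Double, y: Double)

val points = List(Point(1.0, 2.0), Point(3.0, 4.0))

// `case` used for destructuring in a total function: every Point matches.
val sums = points.map { case Point(x, y) => x + y }

// `case` used for genuine partiality: collect keeps only matching elements.
val largeXs = points.collect { case Point(x, _) if x > 2 => x }

// Scala 3 parameter untupling: a two-parameter lambda is accepted where a
// function over pairs is expected, so tuples need no `case`.
val pairs = List((1, 2), (3, 4))
val products = pairs.map((a, b) => a * b)
```

Only the `collect` call relies on partiality; the other two are total, which is the distinction under discussion.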

@Ichoran

Ichoran commented Sep 1, 2019

The niche a language occupies is a highly relevant consideration when choosing its syntax. Unless we want Scala to stop trying to occupy the "ultra-powerful type system" niche, where it keeps pace with Haskell, do is a bad idea, even though the keyword is available.

I also think precedent is very important when deciding whether a language syntax is a good idea or not. I fully agree that : is completely natural linguistically--it's frequently used for introducing a following bunch of stuff. However, we're all really well-trained to see : as type ascription (either an actual type or a typeclass), and that makes me hesitant--even aside from genuine parse ambiguities--to repurpose it as a begin-block symbol.

With regards to case--well, if we must, okay, but like Li Haoyi I find it a wart in general. Rust does fine without it, and I wish we could too; or at least, I wish we could restrict it to only partial functions. Total functions, even with multiple match blocks, would be nicer without case.

Finally, I don't think my objections to having a bunch of different block-introductions have been adequately addressed. If we do pick something, I think it should be universal, even if it admits weird stuff. I would rather allow

foo:
  3
: 
  4

and insist on

if p then:
  foo
else:
  bar

than have to try to intuit which things require : and which do not. In addition to the keyword confusions I mentioned before, it also makes a sharper distinction between builtin language features and added syntax. One of the most beautiful features of Scala is that you can define your own libraries that feel perfectly seamless, as if they're a built-in part of the language. Having special rules for where : is used and where not would break this.

@lihaoyi-databricks

lihaoyi-databricks commented Sep 1, 2019

I think it should be universal, even if it admits weird stuff. I would rather allow

foo:
  3
: 
  4

I think this would look pretty reasonable with a keyword like do:

foo do
  ???
do
  ???

Each indented do block is equivalent to a pair of braces. Although I wonder if it's possible to make the lexer/parser smart enough to omit the first do?

foo
  ???
do
  ???

This would look identical to a while-loop with a multiline condition:

while
  ???
do
  ???

Or if-else

if(???)
  ???
else
  ???

Perhaps even an if-else with a multiline condition:

if
  ???
do
  ???
else
  ???

if
  ???
then
  ???
else
  ???

That would essentially put user-land code syntactically on even footing with the builtin constructs, which seems like exactly what @Ichoran wants.

@lihaoyi-databricks

lihaoyi-databricks commented Sep 1, 2019

Unless we want Scala to stop trying to occupy the "ultra-powerful type system" niche, where it keeps pace with Haskell

This seems like a reasonable thing to me. Scala is its own language with its own styles and conventions. I'm pretty sure this whole whitespace experiment was to try and emulate the approachability and widespread appeal of Python. We aren't trying to attract Haskellites to Scala. That would make "keeping pace with Haskell" a non-goal altogether (though it still is unclear to me how the spelling of a keyword affects the power of the type system).

@odersky
Contributor Author

odersky commented Sep 1, 2019

@lihaoyi

After thinking a bit more about it I believe the case idea is indeed a bit crazy, since it risks overloading meanings, as you say. But do is not a workable choice either, for the reasons I stated. I believe Ruby's precedent is not a real counter-example: Ruby (and Smalltalk, from where the block syntax came) are at their heart imperative OO languages. So do is natural. Most closures passed in Ruby or Smalltalk would have side effects. This is no longer true in Scala.

We should also not invent many different constructs that mean the same thing.

So, one valid choice would still be: Do nothing. If you want an argument that is a multi-line lambda, use braces, or arrange the lambda vertically, as I had initially proposed. Sure, nobody does it like this now because it costs an extra line. But it might actually lead to clearer code.

If we must invent a parens killing operator, I think with is a reasonable choice, after all.

xs.map with x =>
  val y = f(x)
  g(y)

xs.collect with Some(n) => n

xs.foreach with i => println(s"next: $i")

It looks better than do for pure expressions like the map example above, but worse for side-effecting expressions like the foreach example. As a predominantly functional language, Scala should optimize for the pure scenario. My main concern about doing this is how to restrict it. All the examples above take lambdas, and that looks OK. But what about

xs.map with f

? You should be able to eta-reduce a lambda x => f(x) to just f without changing its context, so that would imply that this should be legal. But then, don't we also have to accept

sqrt with x + x

? People will write that sort of code to save a pair of parens! But that's where I think we have made matters worse, not better.

One possible choice would be to restrict the type of the right operand of with to some function type (or maybe accept call-by-name parameters as well?) That's a bit weird, since there is no syntactic reason for this restriction. But it might be a workable compromise.

In summary, I am still very much on the fence about all this, so the option of doing nothing for now (i.e. don't introduce a parens-killing operator) looks reasonable to me. I have also learned that the two issues of using indentation for arguments and parens-killing operators are not necessarily the same, since parens-killing operators will also be used on a single line.

@odersky
Contributor Author

odersky commented Sep 1, 2019

One idea which might be attractive is to restrict with's right operand to call-by-name arguments or lambdas. That means with is a visual cue that its argument is not strict. with introduces significant indentation by itself, so the following variation of @sjrd's example works:

val z = 
  optInt.fold with
    println("nope")
  with x => 
    val y = f(x)
    println(y)

The indentation is prompted in one case by with and in the other by =>.

People have often wished that {...} would indicate by-name arguments only. That did not work since we also want sometimes multi-statement by-value arguments, and {...} was the only way to get them. But with significant indentation we could introduce two operators that distinguish the two cases.
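To make the by-name versus by-value distinction concrete, here is a small sketch using today's brace syntax. The method names and the counter are invented for illustration; the point is only that `{...}` is accepted for both kinds of parameter while the evaluation behaviour differs.

```scala
// Counts how often an argument block is evaluated (illustrative only).
var evaluations = 0

// By-value parameter: the argument block runs once, before the call.
def twiceByValue(x: Int): Int = x + x

// By-name parameter: the argument block runs each time `x` is referenced.
def twiceByName(x: => Int): Int = x + x

val a = twiceByValue { evaluations += 1; 21 }
val afterByValue = evaluations // block ran once

val b = twiceByName { evaluations += 1; 21 }
val afterByName = evaluations // block ran two more times
```

Nothing at the call site tells the reader which of the two calls re-evaluates its block, which is the gap a dedicated by-name marker would be meant to close.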

If we do this, another question is whether : should then apply only to by-value arguments, or whether we want to keep it general.

[EDIT:] I think we want to keep it general, since : would be equivalent to {...}, which is used for both by-name and by-value arguments. So the proposal would be to have a separate with construct that highlights by-name arguments.

@lihaoyi
Contributor

lihaoyi commented Sep 1, 2019

with suffers the same problem as case: it currently has a very specific meaning (mixin traits) and it is the wrong meaning for what we have here. It is also a very long keyword for what in my experience is a very common operation.

Here are two more ideas:

  1. How hard would it be to make do without a keyword? Could we use some combination of lenient parsing + post-validation to allow syntax like this?
val foo = bar
  .map x => 
    val y = x + 1
    y + 1
  .foreach x =>
    val y = x + 1
    println(y)

It seems we would need up to 1 line of lookahead in the lexer/parser, which seems like something that can be afforded. I haven't thought through all possible ambiguities, but it seems to me like with some fiddling this could work. We already do a lenient-parse+post-validation step in parsing lambda argument lists anyway.

  2. Could we introduce a new keyword? F# uses the fn keyword, which fits perfectly here:
val foo = bar
  .map fn x => 
    val y = x + 1
    y + 1
  .foreach fn x =>
    val y = x + 1
    println(y)

Short, unambiguous, and precisely meaningful. Sure, it would be introducing a new keyword, but I think for such a common operation it is worth it vs. overloading an existing keyword that doesn't really fit.

@odersky
Contributor Author

odersky commented Sep 2, 2019

I did some exploration, looking at actual usages of { ... => in our codebase. That made me more skeptical about with. For instance, pairing with exists is really bad:

  sym.baseClasses.exists with ancestor =>
    ancestor.hasAnnotation(jsdefn.EnableReflectiveInstantiationAnnot)

This is a LOT less clear than the original

  sym.baseClasses.exists { ancestor =>
    ancestor.hasAnnotation(jsdefn.EnableReflectiveInstantiationAnnot)
  }

(and do would be just as bad in its place). This shows again that no single keyword can express function application. The only thing one could do is have a keyword that marks the start of a function and that leaves application silent. case was one example of this, fn would be another. But it's still awkward to a degree where I would prefer we leave it in braces.

Can we use just :? Only if we severely restrict its use since otherwise we would get ambiguities with type ascription. E.g.

  sym.baseClasses.exists: ancestor =>
    ancestor.hasAnnotation(jsdefn.EnableReflectiveInstantiationAnnot)

would theoretically work since the type in a type ascription cannot be a naked function type without parens around it. But the longer the parameter list gets the harder it would be to parse (for humans).

@lihaoyi
Contributor

lihaoyi commented Sep 2, 2019

If : is unambiguous (assuming some amount of leniency/lookahead to see the => at end-of-line) then sym.baseClasses.exists: ancestor => seems to me like the best option so far.

It wouldn't be the first time : and => are parsed differently depending on enclosing context and newlines/semicolon-insertion:

@ object foo{
    identity: Int => Int
    println("this is binding `identity` to a self-type of Int followed by a expression " + identity)
  }
defined object foo

@ foo
this is binding `identity` to a self-type of Int followed by a expression ammonite.$sess.cmd9$foo$@555856fa
res10: foo.type = ammonite.$sess.cmd9$foo$@555856fa
@ object foo{
    (identity: Int => Int)
    println("this is a type ascribing `Predef.identity` to a function type " + identity)
  }
cmd11.sc:3: missing argument list for method identity in object Predef
  println("but this one is " + identity)
@ object foo{
    val x = {identity: Int => Int}
    println("this is binding `identity` to an Int argument of a lambda which returns the Int companion " + x(1))
  }
defined object foo

@ foo
this is binding `identity` to an Int argument of a lambda which returns the Int companion object scala.Int

While not ideal, it seems to have caused little enough unhappiness in the past, so I wouldn't mind pushing the envelope a little bit and overloading : for opening indented blocks as well

@odersky odersky moved this from To do to In progress in Scala 3.x planning Sep 2, 2019
@Ichoran

Ichoran commented Sep 2, 2019

I appreciate the effort, but don't think the results from any proposal are particularly visually pleasing or sufficiently general. I'm somewhat puzzled about why people aren't more concerned about the generality/simplicity.

I think we should have a good story about how to select when and where to put a brace-killer. I don't think memorizing a dozen or more cases is a good story. This suggests to me that the solution is that brace-killing happens always.

Maybe you require no syntax:

2 + 5 *
  3 + 8

Or maybe you do:

2 + 5 * :
  3 + 8

2 + 5 * ...
  3 + 8

2 + 5 * do
  3 + 8

but I don't think it works to have to pick and choose the cases. I think even if/then and for/yield are too much to remember if we're going for simplicity.
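For illustration, here is what the two possible readings of the arithmetic example evaluate to in ordinary Scala. This assumes that "brace-killing" means an indented continuation is treated as if it were wrapped in braces, which is one interpretation of the suggestion rather than a settled rule.

```scala
// Reading 1: the second line is a plain continuation of the expression.
val flat = 2 + 5 * 3 + 8

// Reading 2: the indented part is treated as a braced block,
// i.e. as a single operand of `*`.
val grouped = 2 + 5 * { 3 + 8 }
```

Under the first reading the result is 25, under the second it is 57, so the rule chosen here changes the meaning of the code and not just its layout.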

Python doesn't make you remember a pile of cases. From a random example on the web:

    temperature = float(input('What is the temperature? '))
    if temperature > 70:
        print('Wear shorts.')
    else:
        print('Wear long pants.')
    print('Get some exercise outside.')

Do you see that else:? The : is totally unnecessary because of the keyword else. And yet : provides consistency in introducing a new block indent.

Yes, it's just (redundant) punctuation. But punctuation is important:

do you see that else: the : is totally unnecessary because of the keyword else and yet : is required for consistency yes its just punctuation but punctuation is important

I think the discussion about case vs do vs with is ultimately much less important than allowing generality and avoiding context-sensitivity.

@odersky
Contributor Author

odersky commented Sep 3, 2019

@Ichoran Putting : behind else might work well but is out-of-scope for Scala since it would create a different dialect. I believe all we might be reasonably able to do is make braces optional. But we cannot require : where none was required before.

@krakel

krakel commented Sep 3, 2019

little joke
I like this Pascal like syntax:
... : ... end
It looks like:
... begin ... end // we should add this as new keyword
Better than:
... { ... }
end joke

This : used to begin an indentation looks like an alias for a { without a corresponding }. You can show this with a simple \{ or {{ at the end of the line. The : should be separated by a space (xyz : not xyz:) for better readability.

object IndentWidth {{
    private val spaces = IArray.tabulate(MaxCached + 1) {{
        new Run(' ', _)
end IndentWidth

@Ichoran

Ichoran commented Sep 3, 2019

@odersky - I don't understand what you mean--surely having a brace-free style is making a different dialect? (Just like xs take 2 is a different dialect than xs.take(2)?)

Anyway, we would only require : to open a brace-free block, but we'd always require it to open a brace-free block.

In 2.12/13, else does not begin a block, so it would continue to work as it does now. You would have to use else: only to get a block. (One could argue that else: vs else is too subtle to notice, but the compiler could also insist on non-confusing indentation following else so that you couldn't make it look like a block with else and still compile.)

@lbkb

lbkb commented Sep 17, 2019

Then with : as parens-killer you end up with something pretty light:

logging("stdout")
  xs.map 
    case 0 => 1
    case x => x
  .filter : x =>
    x >= 0
    && x < limit
  .fold 
    zero
  : (x, y) =>
    val z = x + y
    z * z

Note that a space before : makes it look more operator-like.

@odersky
Contributor Author

odersky commented Sep 18, 2019

The merged PRs #7185 and #7235 implement the following scheme:

Braces are optional for correctly indented code

  • after =, =>, <-,
  • in parts of control structures, i.e. if-then-else, while-do, try-catch-finally, match, and for-yield/do
  • after headers of classes, etc, where a list of definitions is expected.

A : at the end of a line is significant only if the option -Yindent-colons is given.

This means that we defer until later the question whether to adopt :, or use something else instead, or just keep on using parentheses or braces.
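The brace-free positions listed above can be sketched as follows. This is a small Scala 3 example with made-up names; it covers bodies after `=`, control structures, and a lambda body after `=>`, and deliberately leaves out class headers, which are the contested case in this thread.

```scala
// After `=`: the indented block is the method body.
def describe(n: Int): String =
  val parity = if n % 2 == 0 then "even" else "odd"
  s"$n is $parity"

// In control structures: if-then-else without braces.
def sign(n: Int): Int =
  if n > 0 then
    1
  else if n < 0 then
    -1
  else
    0

// for-yield with an indented generator list and an indented yield block.
val squares =
  for
    i <- 1 to 3
  yield
    val sq = i * i
    sq

// After `=>`: an indented lambda body, still passed in parentheses.
val doubled = List(1, 2, 3).map(x =>
  val y = x * 2
  y
)
```

In each position some code has to follow, which is why the indented block can stand in for braces without an extra marker.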

@odersky odersky closed this as completed Sep 18, 2019
Scala 3.x planning automation moved this from In progress to Done Sep 18, 2019
@odersky odersky reopened this Sep 29, 2019
Scala 3.x planning automation moved this from Done to In progress Sep 29, 2019
@odersky
Contributor Author

odersky commented Sep 29, 2019

After having worked with the new rules (without :) for a while, I observe that omitting braces after class or object headers feels different from omitting braces everywhere else. Everywhere else (i.e. after =, =>, <-, then, else, do, yield, try, catch, match, and so on) omitting the braces feels completely natural and improves legibility. The reason is that in these places something has to follow. I.e. if I write

def foo =
  abc

The abc is mandatory, it cannot be left out. The reader parses visually the indented block as the continuation, and it does not matter whether that block consists of one or more statements. For someone parsing the outline of code, the two snippets

def foo =
  println("abc" ++
    "xyz")

and

def foo =
  println("abc")
  "xyz"

are equally clear and braces are needed in neither case. So, I find making braces optional here is a clear win because it improves writability and readability at the same time.

With class and object headers it's different. Here when I see

class Rational(x: Int, y: Int)

  /** The numerator */
  def numer = x

  /** The denominator */
  def denom = y

I am less sure how to read that. Are numer and denom definitions in class Rational or are they following it? I have to count spaces to decide. The immediate effect was that I felt that in this case having more than two spaces as the tab size would help legibility whereas I did not feel that way for the other cases where {...} was left out. It's also a case where adding a space accidentally would change meaning, which is never the case elsewhere, and which is something we would want to avoid.

By contrast, braces help here. When I see

class Rational(x: Int, y: Int) { 
...
}     

I know that the next definitions ... are members of Rational and I can scan for the closing brace } to see where it ends. end markers are a superior replacement for closing braces, but the problem remains: how can I be sure that anything at all starts after Rational(x: Int, y: Int)?

Possible solutions:

  1. The minimal viable change: Continue to require braces for class/object/... definitions. This would work in the short term, but might lead to an uncomfortable style in the longer term where programs are a mixture of braces and indentation.

  2. Allow to drop braces after class/object/... only if there is a marker word that indicates that something will follow. Various choices for the marker have been discussed. I still think that : is superior to all of them. So:

    class Rational(x: Int, y: Int):
    
      /** The numerator */
      def numer = x
      ...
    end Rational
  3. Once we have introduced a marker like : to replace a set of definitions in {...} we can also use it for function arguments. That would go back to the originally proposed version.

@aappddeevv

aappddeevv commented Sep 29, 2019

When fiddling with 0.19 I immediately made that mistake on a few class defs without the colon, especially for case classes where I tend to put parameters on their own line. It was a bit hard to read without the colon as I was not using an end (which I wanted to avoid).

case class Foo(
  x: Int,
  s: String,
  // more params
  // ...
): // <-- this colon was very helpful when reading the definition
  def someMethod: String = ???

@arturopala
Contributor

In the long term, do you see brace-less syntax as an alternative or a replacement?

@jxtps

jxtps commented Oct 3, 2019

It seems like the primary difference between the class/object case and the method case is less that something must follow in the method case, and more that even short classes/objects typically have multiple methods/fields and there's vertical white space between them.

If we were to consider a method with a longer body:

def foo(i:Int, y:Int):Int = 
  println("Welcome to foo!")
  
  val a = i * y
  println("A is: " + a)

  for (b <- 0 until a) print(".")

  println("Goodbye!")
  a+2

then it's no longer any more or less clear what's part of the body vs what's part of the enclosing scope compared to classes/objects, right?

It's also a case where adding a space accidentally would change meaning, which is never the case elsewhere, and which is something we would want to avoid.

Sure, the first line of a method may be protected by "something must follow", but the subsequent lines are not, right?

In my Python experience (with 4 spaces per indent which was the editor's default) there's never been any ambiguity whatsoever about these things. Copy-paste is a little futzy, but I've never found myself in a situation where I wonder what scope a given line of code is in. For the cases where it gets hairy - really long methods / classes - any braces or end Foo are typically far away anyway, so they don't seem to solve this problem (to the extent it is one).

Instead, I would consider going draconian: if you're going to use indentation-based syntax, the indentation shall be 4 spaces per level, no tabs, no exceptions. See PEP-8 - I'm guessing the Python folks have hashed this out a lot more over the years than we have ;)

@odersky
Contributor Author

odersky commented Oct 3, 2019

then it's no longer any more or less clear what's part of the body vs what's part of the enclosing scope compared to classes/objects, right?

It does not work that way for me. Visually, the first line after the def must be the body since it follows a =. Then subsequent lines starting at the same column are clearly part of the same body. We have been trained to read vertical alignment as being in the same list.

But with classes it's different. Here, I have to check whether something is indented relative to the previous line. That's much harder, and suffers from off-by-one-space errors.

Going to 4 spaces might help but it's not something we can require.

@odersky
Contributor Author

odersky commented Oct 3, 2019

One downside of using : for introducing template definitions is that it works poorly with given.

given c: C:
  def f() = ...

When I originally proposed : for introducing template definitions the given syntax still used as instead of :, so that was not a problem then. But I believe we are now settled on the given c: C part, so this is now an issue that needs to be resolved.

The alternative of using with does not have that problem, and reads quite well, in fact.

trait Monoid[T] extends SemiGroup[T] with
  def unit: T

given Monoid[String] with
  def (x: String) combine (y: String): String = x.concat(y)
  def unit: String = ""

I am less keen on using with for method arguments, for the reasons I have already stated.

@odersky odersky mentioned this issue Oct 5, 2019
5 tasks
@nafg

nafg commented Oct 6, 2019

I really wish all the time being invested in this would be saved for after some more urgent things, for instance getting TASTy to the point where it can be used in Scala 2.x.

@TheElectronWill
Member

I really wish all the time being invested in this would be saved for after some more urgent things, for instance getting TASTy to the point where it can be used in Scala 2.x.

Definitely agree that it would be good to move forward on TASTy! I don't think we can prevent discussions and experiments around syntax (whatever our opinion), but I'm positive that PRs are welcome :-)

@bishabosha
Member

@nafg @TheElectronWill This is being worked on at the Scala Center: TASTy Reader for Scala 2

@Ichoran

Ichoran commented Oct 7, 2019

@odersky - I really like the with syntax for opening a block for a trait or class, because it suggests mixing in an anonymous trait, which is conceptually equivalent anyway. That is, the following two should be equivalent in terms of what methods trait C gets; the only difference is whether you can easily name the subset that are not part of A:

trait A { def a: A }
trait B { def b: B}
trait C extends A with B

trait A { def a: A }
trait C extends A with
  def b: B
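The claimed equivalence can be checked with today's brace syntax. The sketch below is a variation on the snippet above with `Int` members (an assumption made so that it runs on its own) and with the two variants renamed `C1` and `C2` so they can coexist.

```scala
trait A { def a: Int }
trait B { def b: Int }

// Variant 1: the extra member comes from a named mixin.
trait C1 extends A with B

// Variant 2: the extra member is declared in an anonymous body.
trait C2 extends A { def b: Int }

// Both variants require exactly the same members from an implementation.
val c1: C1 = new C1 { def a = 1; def b = 2 }
val c2: C2 = new C2 { def a = 1; def b = 2 }
```

The only observable difference is that `C1` is also a `B`, so the added members have a name of their own.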

@jxtps

jxtps commented Oct 7, 2019

There is an argument to be made that the braces vs indentation thing should largely be handled by the editor.

To that end I have filed a feature request with IntelliJ: https://youtrack.jetbrains.net/issue/IDEA-224361 - you may want to consider filing similar requests with your favorite editor, or voting for / promoting such features where applicable / possible.

@nafg

nafg commented Oct 8, 2019 via email

@esarbe
Contributor

esarbe commented Oct 9, 2019

Why not regularize the definition syntax for objects/classes/traits to match the method definition?

class Rational(x: Int, y: Int) =

  /** The numerator */
  def numer = x
  ...


object Foo = 
  ....

trait Bar(x: Int, y: String) =
  ...
end trait

given c: C = 
  def f() = ...

case class Qux(
  x: Int,
  s: String,
  // more params
  // ...
) =
  def someMethod: String = ???

That would avoid the overloaded meaning of the colon.

@TheElectronWill
Member

TheElectronWill commented Oct 9, 2019

@esarbe This has already been offered during the discussions on indentation-based syntax. IIRC, the argument against doing this was the following:

  • For functions, the left- and right-hand sides are equivalent/interchangeable. Therefore = is ok.
  • But it's not the case for classes nor traits (e.g. two classes with the same methods aren't interchangeable).

@esarbe
Contributor

esarbe commented Oct 9, 2019

Ah, interesting. I never considered the = for vals or defs to mean equality but always as assignment, similar to Pascal's :=.
I guess that comes with Scala's use of = for both.
Edit; I guess I'm still sleeping.

@TheElectronWill
Member

TheElectronWill commented Oct 9, 2019

Well, == still means equal and x = 2 still means assign 2 to variable x, but there is nonetheless a semantic issue with class A = something

@nafg

nafg commented Oct 10, 2019 via email

@mjburgess

I'd like to +1 the strategy of regularising brace-formatted Scala code and then allowing braces to be dropped. I think this makes the sales pitch (/impact assessment) less fraught.

Braces seem somewhat arbitrarily optional at the moment (in the sense that, e.g., in Java, every class definition requires them). There's a strong case for saying "always optional", if that can be achieved with minimal cognitive overhead. It's quite common in C to drop braces wherever possible (whether advisable or not) -- I think the appetite for reduced line noise is pretty widespread.

It seems "with" does make sense for bodies/definitions (,...) as there does feel something similar in intention when saying "with Trait" and "with {body}" -- ie., with Trait (qua mixin) serves as a kind of syntactical abbreviation for the in-place definition.

Is there a case for the colon and with?

@odersky
Contributor Author

odersky commented Oct 28, 2019

A status report:

Current master allows dropping braces around templates after with. My initial experience is that it works well. We should wait for a Dotty release cycle where others can try it out before reaching a definite decision on it.

: is currently enabled under -Yindent-colons. It has two uses:

  1. as an alternative to with, to start a template
  2. to indicate an indented function argument block

Usecase (1) is an alternative to with in current master. In the end, we should pick one or the other (or none at all). With the new given syntax - which I assume is settled by now - : becomes more awkward than before, e.g.

   given intMonoid: Monoid[Int]: 
     ...

I find the double use of : on a line problematic. with reads better in this case:

  given intMonoid: Monoid[Int] with 
    ...

Usecase (2) is still open. Currently : only works at end of line, which rules out idioms like

  xs.exists: x =>
    x > 0

Enabling this syntax is almost possible. The only clash is with an explicit self type declaration at the start of a class or trait. E.g.

  trait T extends Seq[Int] {
    foreach: x =>
      println(x)

Here, foreach: x => would be parsed as a self type declaration for trait T. There might be ways around that, however: for instance, we could try a different syntax for self type declarations, see #7374.
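For readers less familiar with self types, here is a minimal sketch of the existing construct that causes the clash, written with braces. `Logger`, `Service` and `LoggedService` are hypothetical names used only for this example.

```scala
trait Logger {
  def log(msg: String): String = s"[log] $msg"
}

// `self: Logger =>` at the start of the body is a self type declaration:
// any concrete Service must also be a Logger. It has the same
// `name: Type =>` shape as the lambda-argument idiom discussed above.
trait Service { self: Logger =>
  def run(): String = log("running")
}

object LoggedService extends Service with Logger
```

Because the self type sits in the first position of the template body, a first statement of the form `foreach: x =>` is syntactically indistinguishable from it without further rules.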

My current and still provisional tendency is to do nothing about usecase (2) and to retire the -Yindent-colons option for now. That way, we can evaluate the other changes first before taking the next step. I believe a parens killing operator could still be an interesting option at some point, but I see less urgency to put it in Scala 3.0.

@odersky odersky closed this as completed Nov 18, 2019
Scala 3.x planning automation moved this from In progress to Done Nov 18, 2019
@odersky
Contributor Author

odersky commented Nov 18, 2019

Closing since it looks like there won't be anything standardized for 3.0. We can of course get back to it at some later point.
