New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Regularize syntax for lambdas #4712

Open
odersky opened this Issue Jun 23, 2018 · 26 comments

Comments

Projects
None yet
@odersky
Contributor

odersky commented Jun 23, 2018

In light of #4672 it's time to think of a systematic syntax solution for the various kinds of types for lambdas. So far we have:

(x: T) => R       impure dependent function type (term -> term)
[X] => R          type lambda (type -> type)

We'd like to reserve -> for pure functions. This means we are currently lacking a good syntax for polymorphic functions (type -> term). And if we ever would like to introduce dependent functions from terms to types, we are also blocked. In light of this I think it makes sense to revise the syntax of type lambdas to be [X]R. If we do that, we'd have accounted for:

    (x: T) => R      term -> term, impure
    (x: T) -> T      term -> term, pure
    [X] => R         type -> term, impure
    [X] -> R         type -> term, pure
    (x: T)R          term -> type, type-level
    [X]R             type -> type, type-level
@Glavo

This comment has been minimized.

Contributor

Glavo commented Jun 23, 2018

Is it really necessary to design a syntax for pure functions? I think we just need to provide a @pure annotation:

class pure extends scala.annotation.ClassfileAnnotation
type PureFunction[T] = T@pure

val double: PureFunction[Int => Int] = d => d * 2
@smarter

This comment has been minimized.

Member

smarter commented Jun 23, 2018

This is nice. The part I'm least convinced of is the [X]R syntax, for simple expressions it seems okay, but consider that R itself could be a lambda type: [X][Y] => Y => X is pretty confusing (if we add parenthesis we get [X]([Y] => (Y => X) which is a bit better).

Alternatively, we could reclaim ~>:

    (x: T) => R      term -> term, impure
    (x: T) -> T      term -> term, pure
    [X] => R         type -> term, impure
    [X] -> R         type -> term, pure
    (x: T) ~> R      term -> type, type-level
    [X] ~> R         type -> type, type-level

[X] ~> [Y] => Y => X looks nicer to me, and you can instantly see what kind it is by counting the number of ~> (whereas with the syntax you propose, determining the kind means counting the number of square bracket pairs which are not followed by a =>).

The main downside is that ~> is already commonly used to represent natural transformations (https://typelevel.org/cats/datatypes/functionk.html#syntactic-sugar), but it doesn't seem like a big loss to me if people are forced to use something else instead like ~~> or ==>.

@Glavo

This comment has been minimized.

Contributor

Glavo commented Jun 23, 2018

I think adding the type keyword to the type lambda syntax is also a good choice:

    [X] => R             type -> term
    type [X] => R        type -> type

Compared to using different arrow symbols to distinguish between type lambda and Polymorphic function type, using the type keyword to distinguish more.

@odersky

This comment has been minimized.

Contributor

odersky commented Jun 23, 2018

Yes, I think one will need the parentheses in

    [X]([Y] => Y => X)
@odersky

This comment has been minimized.

Contributor

odersky commented Jun 23, 2018

I think we just need to provide a @pure annotation:

I would expect or at least hope that once we have pure functions the majority of function types are pure. So we need the simplest and clearest syntax for them, and that's undoubtedly ->.

@odersky odersky closed this Jun 23, 2018

@odersky odersky reopened this Jun 23, 2018

@milessabin

This comment has been minimized.

Contributor

milessabin commented Jun 24, 2018

~> is already very widely used in Cats, shapeless, Scalaz and elsewhere.

@milessabin

This comment has been minimized.

Contributor

milessabin commented Jun 24, 2018

would expect or at least hope that once we have pure functions the majority of function types are pure. So we need the simplest and clearest syntax for them, and that's undoubtedly ->.

I actually think it might go the other way around. I would guess that most functions written as explicit lambdas are already pure, so reserving => for pure and -> for impure would cause the least disruption to existing code, documentation and other literature.

Some empirical support would be helpful for this decision.

@olafurpg

This comment has been minimized.

Contributor

olafurpg commented Jun 24, 2018

-> is already used in Predef.ArrowAssoc to construct tuples and I'm sure that removing it would be a big disruption for existing code and documentation.

A grep over 2.5 million lines of code shows >21.661 hits (of which only a few results appear to be false positives). That's close to 1% of the corpus. A subset of the results can be seen here https://gist.github.com/olafurpg/9349c6fb189c9ecb762504cc488d2b07 the full output is in this zip file arrow.zip

The a -> b syntax has several benefits over (a, b). For example, it's is the most convenient way to use Map builders

val m = Map.newBuilder[Int, Int]
m += (1 -> 2) // OK
m += ((2, 3)) // Not as nice

m += (1, 2) // ERROR
cmd4.sc:1: type mismatch;
 found   : Int(1)
 required: (Int, Int)
val res4 = m += (1, 2)
                 ^

I don't have a alternative suggestion, but please consider leaving -> alone.

@ritschwumm

This comment has been minimized.

ritschwumm commented Jun 24, 2018

to me, -> never stopped feeling wrong for tuple construction. it looks good in a Map literal, but not anywhere else.

@julienrf

This comment has been minimized.

julienrf commented Jun 24, 2018

(Note: the builder example does not apply to dotty, where m += (1, 2) is equivalent as m += 1 -> 2, and not m.+=(1, 2) like it is in 2.12.)

[X] => R         type -> term
[X]R             type -> type, type-level

Relatedly, what should F[_] be expanded to? Should it be [X] => F[X] or [X]F[X]? I find it more intuitive to use the arrow for type-level functions, by analogy with term level functions, and have F[_] be expanded to [X] => F[X].

@Blaisorblade

This comment has been minimized.

Contributor

Blaisorblade commented Jun 24, 2018

  1. Do we actually have distinct semantics for pure and impure polymorphic functions? #4672 has only impure polymorphic functions [X] => R; if f is a pure polymorphic function, then it seems very likely that it must be a polymorphic value (since types are erased), and f[T] should not produce a function call. Using function syntax for those would be weird — for those semantics, I'd rather specify [X] T as a polymorphic value, but restrict T to be either ExprType or a function type. This restriction is needed for soundness, to avoid the issue that polymorphic references would have in ML (as in Damas-Milner), but that'd be fixed by a reasonable concept of pure polymorphic function.

  2. If the redesign uses [X]R then we need a new syntax for MethodType, which do appear often in error messages. Also, this proposal seems not enough to allow picking an arrow when writing a method — after defining def f[X]: R, does f have type [X] => R or [X] -> R? What about def f[X](a: T1)(b: T2)(c: T3)?

  3. Part of the ambiguity happens because Scala, unlike any other functional language I can think of, uses no token to introduce functions, instead of having some syntax for lambda. Some MLs and Coq use fun, Haskell/Agda use \ or 𝜆, Scheme actually writes lambda, etc.
    Worse, Scala syntax is not Scalaish: while Haskell is designed to be extremely concise and can look like a symbol soup to the uninitiated, Scala is usually a bit more verbose — that's only useful if it's more clear, but some bits of syntax seem even more confusing.

    At least for type-to-type functions, changes are still easy since they're not in Scala 2, and since those are least common they don't need to be more compact. I'm fine with all those alternatives, but \[X] => R (also allowing for 𝜆[X] => R) avoids requiring a new keyword, while fun [X] => R or especially typefun [X] => R is much clearer. The same syntax could work (as an additional one) for term-level functions, but adding alternative syntax (and deprecating the current one) has an higher threshold (as seen in the rejection of SIP-12).
    Similarly, polymorphic types could well use a forall [X] -> R for extra clarity.

@odersky

This comment has been minimized.

Contributor

odersky commented Jun 25, 2018

@olafurpg Note that -> in ArrowAssoc is only used as an operator in terms. We are talking about -> as an operator on types. The two do not clash. I agree it's unfortunate that they have different meanings.

@olafurpg

This comment has been minimized.

Contributor

olafurpg commented Jun 25, 2018

@odersky Will there be a term syntax to construct pure functions? I interpreted

    (x: T) => R      term -> term, impure
    (x: T) -> T      term -> term, pure

as if List(1).map(i -> i + 1) would become valid code.

@odersky

This comment has been minimized.

Contributor

odersky commented Jun 25, 2018

@olafurpg I don't think we need (x: T) -> E as term syntax. (x: T) => E gives a function type that has exactly the effects of E (which might be no effects at all). We need to distinguish pure from impure only when lifting this into a type.

@olafurpg

This comment has been minimized.

Contributor

olafurpg commented Jun 25, 2018

I see. The breakage is indeed smaller then, here's what I found for -> in type position

URL details
shapeless/poly.scala#L120 class ->[T, R](f : T => R) extends Poly1 {
implicit def subT[U <: T] = atU
}
scalaz/NaturalTransformation.scala#L30 type ->[A, B] = λ[α => A] ~> λ[α => B]
shapeless/sybclass.scala#L65 ->((i: Int) => i*110/100)
shapeless/sybclass.scala#L112 ->((i: Int) => i+1)
shapeless/tuples.scala#L120 (Any -> String)
shapeless/tuples.scala#L121 (Fruit -> Fruit)

I am still concerned about giving a -> bdifferent meaning for term and type position.

@Blaisorblade

This comment has been minimized.

Contributor

Blaisorblade commented Jun 25, 2018

@odersky

I don't think we need (x: T) -> E as term syntax. (x: T) => E gives a function type that has exactly the effects of E (which might be no effects at all). We need to distinguish pure from impure only when lifting this into a type.

It looks like you're only considering the parsing, but what about the requirements of people reading code? To know whether E has effects you need to look either at all the called functions or at the types of imported symbols.

When I and @liufengyun tried to use variants of this syntax in a draft, we had no convenient way to write pure functions/methods which were visibly pure, and that's not currently addressed by this proposal. So this isn't worth the disruption IMHO.

// either pray:
def pureFunction[X](xs: List[X]) = xs.map(f).map(g) // let's hope that f and g are pure
// or give up methods and pray again:
def pureFunction[X] = (xs: List[X]) => xs.map(f).map(g) // let's hope that f and g are pure
// or give up methods *and* return type inference
def pureFunction[X]: (xs: List[X]) -> List[Z] = xs => xs.map(f).map(g) // finally, a compile error if f or g isn't pure
@odersky

This comment has been minimized.

Contributor

odersky commented Jun 25, 2018

Our approach to purity is really the dual of what you have in mind: Instead of tracking purity, track the capabilities you have in scope to be impure. If there are none, any code that typechecks is pure.

@Blaisorblade

This comment has been minimized.

Contributor

Blaisorblade commented Jun 25, 2018

True, if we want to be able to write something like val g : A => B = f and make it mean that g gets all of f's effects automatically (that is, it takes all its capabilities), we end up with similar problems. When we get to a complete design, this might turn out to be a non-problem, but I'm not yet convinced.

@LPTK

This comment has been minimized.

LPTK commented Jun 25, 2018

@olafurpg

I am still concerned about giving a -> bdifferent meaning for term and type position.

For what it's worth, I personally always define:

@showAsInfix type -> [+A,+B] = (A,B)
object -> { def unapply[A,B](ab: (A,B)) = Some(ab) }

... which I consider to be glaring omissions from the standard library!

This way, I can write code like:

def foo[K,V](kv: K -> V) = kv match {
  case k -> v => assert(kv == k -> v)
}
@Jasper-M

This comment has been minimized.

Contributor

Jasper-M commented Jun 25, 2018

@LPTK I have good news for you then: scala/scala#6304
The type alias didn't make it though.

@odersky

This comment has been minimized.

Contributor

odersky commented Jun 26, 2018

val g : A => B = f and make it mean that g gets all of f's effects automatically

That's not the plan. If you write that, it means g is maximally impure, can be applied only in contexts that have the capability to do anything.

@sirinath

This comment has been minimized.

sirinath commented Jun 30, 2018

Will the following solve the problem?

(x: T) -> [R]          term -> type, type-level
[X] -> [R]             type -> type, type-level
@odersky

This comment has been minimized.

Contributor

odersky commented Jul 2, 2018

The group discussion today led to some support for

[X] #> T
(x: T) #> T

for the type-level lambdas. One argument in favor is that # is already associated with another typelevel operation, namely projection. So it feels natural to employ it in type-level lambdas as well.

@jdegoes

This comment has been minimized.

jdegoes commented Jul 7, 2018

@odersky

For the following reasons, I am not a fan of the -> / => distinction for pure / impure functions:

There are infinitely many type of impure functions

Functional programmers are the target audience for pure functions, yet functional programmers care about the distinctions between types of impurity.

For example, functional programmers would like to know merely by looking at a type that a function uses the effect of Random or the effect of Clock, or maybe both effects. We already have good tools for encoding this via polymorphism and context bounds (F[_]: Clock: Random), which are verbose and could use full-featured, first-class type classes, but which nonetheless exactly satisfy the need to precisely describe the scope a function's impurity.

The use of => for a generic class of impure functions does not satisfy the needs of functional programmers, because the types A => B will not describe the capabilities required by functions.

I realize the "implicit effect system" sketched out in an older PR is intended to address this limitation, and while this is a separate topic, I feel the implicit effect system will not be used by either functional programmers or OO programmers.

The functional programmers already have better tools and the Java programmers don't care.

Impure functions are better modeled by pure functions

Functional programmers are the target audience for pure functions, yet functional programmers prefer to model all functions as pure.

Pure functions have a tremendous simplifying power on reasoning and testing. If we see a function a -> b, we know that the function is total, deterministic, and free of side-effects. We can reason it using type-based and equational reasoning. We know quite a lot about its implementation merely by inspecting the type, and we know the semantics of any piece of code through substitution.

If we see a function def foo[F[_]: Clock]: F[Long], not only do we know this function only uses the effect of the current time, but we know it always returns the same value and we can reason about it uniformly in the same way we reason about strings, collections, numbers, and methods on them. We don't have to switch our brains over to a mindset in which the ordinary rules of functional programming no longer apply.

Moreover, pure functions give us strictly more compositional power than a mix of pure and impure functions. For example, if we have l: List[A], then we can call l.map(putStrLn) to convert that into, say, List[IO[A]], then we can call sequence to convert that into IO[List[A]]. Since all functions are pure, even effectful ones, we can pass effectful functions into higher-order functions.

If we switch this over to the pure/impure proposal, then a function like List.map will ideally have the type: List[A] -> (A -> B) -> List[B]. So we can no longer call l.map(println). Programmers have strictly less expressive power because they have to abide by the constraints that no effectful code can be passed into higher-order functions that expect non-effectful code. Whereas in functional programming, there is no such restriction, because all functions are pure.

Escape hatches will be needed

Since many functions are, from the outside, pure functions, even though they use an impure implementation, an escape hatch will be necessary. Precisely, the escape hatch will allow an impure function (=>) to be converted to or used as a pure function (->).

The necessary existence of this escape hatch means that it's possible to arbitrarily pretend that any impure function is pure. In other words, the benefit of => versus -> is merely documentation to the programmer, because it does not provide any actual guarantees by the compiler.

This does not mean it is useless, only that it's utility is limited to documentation.

If there are impure/pure functions, there should be impure/pure methods

Because methods can be "converted" to functions, and because methods can call functions, in order for a consistent treatment, a bifurcation between pure/impure functions would require a similar bifurcation between pure/impure methods (otherwise there is no information available as to which type of function a method can be converted into). This will introduce additional complexity and confusion, and lead to the need for additional escape hatches.

Many other objections have merit

  • Requirements of people reading code. The fact that -> is required at the type level but disallowed at the term level means it's not possible to look at a code and know if it's pure / impure.
  • -> has different meanings in term and type position. It's tuple construction in one, and a pure function type in the other. Contrast this with =>, which has uniform meaning.

Other languages similar to Scala do not take this approach

See, for example, OCaml, 1ML, etc. No impure functional programming languages take this approach, and the pure ones all take different approaches to effects.

In summary, I'd argue the proposal has a large amount of cognitive and pedagogical overhead, introduces a lot of additional complexity into an already complex language, and will not pay for itself given all the limitations mentioned above. It feels like it could facilitate the implicit effect system, but I don't think that will be used by functional programmers or Java programmers.

I feel like a simpler proposal would be better: perhaps allow an @impure annotation on methods / functions and emit warnings (which can be turned into errors) for calling such functions from code not so marked. This way the standard library's effectful functions can be annotated with @impure and there's no new syntax or backward compatibility issues.

@LPTK

This comment has been minimized.

LPTK commented Jul 7, 2018

@jdegoes I did not understand what you were trying to say until I got to your very last paragraph, which made for a frustrating read. So to spare other people the frustration, what you want is (correct me if I'm wrong): to make 'pure' the default.

This would mean leaving everything as it is, except for forcing the effectful to declare themselves @impure.

You use a confrontational tone while pushing for this, but I am not sure this is even controversial. Leaving syntax considerations aside, I think the goal for everyone here is the same: to require purity by default, and to only allow impure code through explicit mechanisms – be it requiring special imports, adding a capability system, using type classes... you name it! Crucially, these are not mutually exclusive approaches.

Different people working in different domains with different skill sets have different needs. Martin and many others think the Haskell way is not always best, let alone always appropriate, so they are trying to implement a system to make the life of those who chose not to use it better. It seems clear that you are not in this category, so why do you even care that much? Again, both camps want purity by default and making impurity explicit, but it's not like they have to agree on a single "one true way" of dealing with impure code. AFAIK the approaches could even be mixed, for great benefits.

Functional programmers are the target audience for pure functions, yet functional programmers prefer to model all functions as pure.

What are you trying to say here? You want to be able to say that things are pure, and that's what we want to give you. Where is the problem?

The necessary existence of this escape hatch means that it's possible to arbitrarily pretend that any impure function is pure. In other words, the benefit of => versus -> is merely documentation to the programmer, because it does not provide any actual guarantees by the compiler. [...] it's utility is limited to documentation

Scala has always had asInstanceOf as an escape hatch. Does that mean that Scala types are merely documentation to the programmer? No, because we still get the compiler to help us check them. Having an escape hatch does not invalidate that in the least. The same is true for pure functions.

I'm not even sure what you think could be an alternative here. Remove asInstanceOf from Scala?
(Unrelated: I do think asInstanceOf should be put behind a feature flag, along with null.)
As you know, even languages like Haskell and Idris have these escape hatches.

See, for example, OCaml, 1ML, etc. No impure functional programming languages take this approach

Huh? OCaml's approach is exactly the status quo currently shared by Scala. I don't see how that comparison brings in any new insights. Also, 1ML is not even a real programming language – it's a research prototype and it's not clear whether what they do for effects makes sense in the context of Scala.

@kaishh

This comment has been minimized.

kaishh commented Jul 26, 2018

@jdegoes

See, for example, OCaml, 1ML, etc. No impure functional programming languages take this approach

What about Multicore Ocaml with its algebraic effects?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment