Summary of proposed changes

This page summarizes the changes to SML that are defined in the Definition of Successor ML. The original version of this summary was written by Andreas Rossberg and is available at http://www.mpi-sws.org/~rossberg/hamlet/README-succ.txt; this version omits those features that have not yet been included in the definition.

Lexical Syntax

• Line Comments.

The token `(*)` begins a comment that extends to the end of the line.

• Extended Literals.

Numeric literals may contain underscores to group digits, as in `1_000_000_000`, `3.141_592_653`, or `0wxf300_4588`. Furthermore, numeric literals in binary notation are supported, e.g., `0wb1101_0010`.
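As a small illustration of both lexical extensions, the following sketch combines line comments with grouped and binary literals. The binding names are made up for this example, and the code is accepted only by an implementation with Successor ML support enabled.

```sml
(*) A line comment: everything up to the end of this line is ignored.
val billion = 1_000_000_000   (*) same value as 1000000000
val pi      = 3.141_592_653   (*) same value as 3.141592653
val mask    = 0wb1101_0010    (*) binary word literal, equal to 0wxD2
```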

Records

• Record Punning (derived form).

In record expressions, a field of the form `id=id` can be abbreviated to `id` (Standard ML allows this only in patterns). For instance:

`  fn {a, b, c} => {a, b=b+1, c, d=0}`

• Record Extension.

A record `r` can be extended using the notation `{a=2, b=3, ...=r}`. For instance,

```  val r1 = {a=1, b=2}
  val r2 = {c=0, ...=r1}```

binds `r2` to `{a=1, b=2, c=0}`. The same syntax is available in patterns to match the "rest" of a record:

`  case r2 of {b, ...=r'} => r'`

evaluates to `{a=1, c=0}`.

As a derived form, ellipses can appear at any position in the record (but only once):

`  val r2 = {...=r1, c=0}`

Note that the context must still determine the fields denoted by the ellipses (i.e., there is no record polymorphism yet). Those fields may not overlap with the enumerated ones.

Record types can also be formed by extension:

```  type 'a t = {a:'a, b:bool}
  type 'a u = {c:char, d:'a list, ...:'a t}```

The type `'a u` is equivalent to `{a:'a, b:bool, c:char, d:'a list}`.

• Record Update (derived form).

Besides extension, a record `r` can be updated using the notation `{r where a=2, b=3}`. For example:

```  val r1 = {a=1, b=2, c=6}
  val r2 = {r1 where b="hi"}```

binds `r2` to `{a=1, b="hi", c=6}`.

Again, the context must determine the fields of the base record. The base record must already contain all updated fields (i.e., update is orthogonal to extension).

Note that record update may change the types of modified fields.
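One way to read record update is as a pattern match that discards the updated fields, followed by an extension. The sketch below spells this out for the example above; the name `rest` is introduced only for illustration, and the actual derived form in the Definition may differ in detail.

```sml
val r1 = {a=1, b=2, c=6}

(*) Roughly what {r1 where b="hi"} stands for:
val r2 = let val {b=_, ...=rest} = r1    (*) rest = {a=1, c=6}
         in {b="hi", ...=rest} end       (*) r2   = {a=1, b="hi", c=6}
```

Because the old `b` is dropped before the new one is added, nothing ties the type of the new field to that of the old one.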

Pattern Matching

• Optional Bars and Semicolons.

An optional bar is allowed before the first rule of a match, yielding more regular and editing-friendly notation:

```  case f n of
  | LESS => f(n-1)
  | EQUAL => n
  | GREATER => f(n+1)```

This is also supported for `fn` and `handle`, for function declarations, and for datatype declarations. For instance,

```  datatype 'a exp =
  | Const  of 'a
  | Var    of string
  | Lambda of string * 'a exp
  | App    of 'a exp * 'a exp```

In a similar vein, optional terminating semicolons are allowed for expression sequences. For example, in a let expression:

```  fun myfunc1(x, y) =
    let val z = x + y in
      f x;
      g y;
      h z;
    end```

The same applies to parenthesised expressions and sequences.

• Disjunctive Patterns.

So-called "or-patterns" of the form `pat1 | pat2` are supported. For example:

```  fun fac (0|1) = 1
    | fac   n   = n * fac(n-1)```

The syntax subsumes multiple or-patterns and multiple matches:

```  case exp of
  | A | B | C => 1
  | D | E     => 2```

• Conjunctive Patterns.

Layered patterns ("as" patterns) have been generalised to allow arbitrary subpatterns on both sides, i.e., `pat1 as pat2`. They are useful in combination with nested matches (see below), but also have the advantage that the order of binding may now be inverted (`pat as x`), which sometimes is more readable.
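For illustration, the sketch below writes the same function with the binding order both ways. The function names are hypothetical, and the second definition needs Successor ML.

```sml
(*) Standard ML: the variable must come first.
fun dup1 (xs as (x::_)) = x :: xs
  | dup1 nil            = nil

(*) Successor ML: the structure can come first, the name second.
fun dup2 ((x::_) as xs) = x :: xs
  | dup2 nil            = nil
```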

• Nested Matches.

Patterns may contain nested matching constructs of the form

`  pat1 with pat2 = exp`

Such a pattern is matched by first matching `pat1`, then evaluating `exp`, and matching its result against `pat2`. Variables bound in `pat1` may occur in `exp`. The pattern fails when either pattern does not match. The pattern binds the combined set of variables occurring in `pat1` and `pat2`. For instance, consider:

`  case xs of [x,y] with SOME z = f(x,y) => x+y+z | _ => 0`

If `xs` is a two-element list `[x,y]` such that `f(x,y)` returns `SOME z`, then the whole expression evaluates to `x+y+z`, otherwise it evaluates to `0`.

Nested matches are a very general construct that allows simple "views" and pattern guards to be defined uniformly as derived forms (see below). They can also be useful in combination with disjunctive patterns,

`  case args of x::_ | (nil with x = 0) => ...`

or with guards (see below):

```  fun escape #"\"" = "\\\""
    | escape #"\\" = "\\\\"
    | escape (c with n=ord c) if (n < 32) = "\\^" ^ str(chr(n+64))
    | escape c = str c```

• Pattern Guards (derived form).

Patterns may contain boolean guards of the form `if exp`, as in

```  fun nth(l, n) =
    case (l, n) of
    | (x::_,  0)        => x
    | (_::xs, n) if n>0 => nth(xs, n-1)
    |     _             => raise Subscript```

Guards are also allowed in function declarations:

```  fun nth(x::_,  0)          = x
    | nth(_::xs, n) if (n>0) = nth(xs, n-1)
    | nth(  _,   _)          = raise Subscript```

A pattern guard `pat if exp` is actually syntactic sugar for the nested pattern `pat with true = exp` and may hence appear inside other patterns.
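To make the desugaring concrete, here is the `case` version of `nth` with its guard written directly as a nested match. This is a sketch of the expansion described above, not output from any particular implementation.

```sml
fun nth(l, n) =
  case (l, n) of
  | (x::_,  0) => x
  | (_::xs, n) with true = (n > 0) => nth(xs, n-1)   (*) was: if n>0
  | _ => raise Subscript
```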

Value Definitions

• Simpler Recursive Bindings.

Recursive bindings can no longer override constructor status.

The syntax for recursive bindings has been made less baroque: the `rec` keyword must always follow directly after `val`, and may no longer be repeated. This just rules out pathological cases most programmers might not even be aware of.
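The following sketch shows one binding that remains legal and, in a comment, two forms that the Standard ML grammar permits but that the rule above excludes. The function names are illustrative only.

```sml
(*) Still accepted: rec directly after val, stated once.
val rec even = fn 0 => true  | n => odd (n-1)
and     odd  = fn 0 => false | n => even (n-1)

(* Accepted by the Standard ML grammar, ruled out by the new syntax:

   val rec rec f = fn x => x            -- rec repeated
   val x = 1 and rec g = fn y => g y    -- rec not directly after val
*)
```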

• Strengthened Value Restriction.

The value restriction has been extended to demand that patterns in polymorphic bindings are exhaustive. This fixes a conceptual bug in the language and enables type-passing implementations.
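A sketch of the difference, with made-up binding names: both right-hand sides below are non-expansive, but only the first pattern is exhaustive. How an implementation treats the second binding (rejecting it, or giving `f` a non-generalised type) is not stated in this summary, so the example leaves that open.

```sml
(*) Exhaustive pattern, non-expansive right-hand side:
(*) id and const are generalised as before.
val (id, const) = (fn x => x, fn x => fn _ => x)
val pair = (id 1, id "one")

(*) Non-exhaustive pattern: Standard ML would still make f polymorphic,
(*) whereas the strengthened restriction does not generalise it.
val [f] = [fn x => x]
```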

• Do Declarations.

The simple derived form

`  do exp`

expands to

`  val () = exp`

and allows expressions to be evaluated for their side effects within declarations.
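For example, a structure might print a message while it is being elaborated. The structure and value names below are hypothetical.

```sml
structure Hello =
struct
  val greeting = "hello\n"
  do print greeting        (*) same as: val () = print greeting
end
```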

Type Definitions

• Withtype Specifications (derived form).

The `withtype` syntax for defining mutually recursive datatypes and type synonyms is available in signatures (in Standard ML it is only supported in structures).
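A minimal sketch of such a specification, using a hypothetical signature `TREE`:

```sml
signature TREE =
sig
  datatype 'a tree = Node of 'a * 'a forest
  withtype 'a forest = 'a tree list

  val size : 'a tree -> int
end
```

In Standard ML the same interface has to be written without the synonym, for instance as `Node of 'a * 'a tree list`, with `'a forest` specified separately afterwards.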

• Proper Scoping for Transparent Type Specifications (derived form).

Transparent type specifications in signatures have scoping rules consistent with the rest of the language, i.e.,

```  type t = bool
  signature S =
  sig
    type t = int
    and  u = t
  end```

no longer equates `u` with `int`. The `t` on the right-hand side now refers to the outer `t`, so `u` is `bool`.

The more or less obsolete `abstype` declaration form has been removed from the bare language and redefined as a simple derived form. This change has little visible effect, but it simplifies implementations and can be seen as a first step towards removing `abstype` altogether.

• Abolished `and` in Type Realisations (derived form).

The syntax of type realisations using `where` no longer allows multiple equations connected with `and`. That arcane syntax produced an annoying singularity in the language and was never implemented correctly by most SML implementations.
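In practice this means chaining separate `where type` clauses. The sketch below uses hypothetical signatures `S` and `T`; the rejected form is shown only in a comment.

```sml
signature S = sig type t type u end

(* Standard ML syntax that is no longer allowed:

   signature T = S where type t = int and type u = bool
*)

(*) Equivalent formulation that remains valid:
signature T = S where type t = int where type u = bool
```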