Skip to content
Permalink
Branch: master
Find file Copy path
Find file Copy path
Fetching contributors…
Cannot retrieve contributors at this time
494 lines (354 sloc) 15.7 KB

Datatypes

Formality includes a powerful datatype system. A new datatype can be defined with the T syntax, which is similar to Haskell's data, and creates global definitions for its type and constructors. To pattern-match against a value of a datatype, you must use case/T.

Enumerations

Enumerations can be defined and used as follows:

import Base@0

T Suit
| clubs
| diamonds
| hearts
| spades

print_suit : {suit : Suit} -> Output
  case/Suit suit
  | clubs    => print("First rule: you do not talk about Fight Club.")
  | diamonds => print("Queen shines more than diamond.")
  | hearts   => print("You always had mine.")
  | spades   => print("The only card I need is the Ace of Spades! \m/")
  : Output

main : Output
  print_suit(spades)

The program above creates a datatype, Suit, with 4 possible values. In Formality, we call those values constructors. It then pattern-matches a suit and outputs a different sentence depending on it. Notice that the case expression requires you to annotate the returned type: that's called the motive, and is very useful when theorem proving. You can omit it by using a case argument instead:

print_suit : {case suit : Suit} -> Output
| clubs    => print("First rule: you do not talk about Fight Club.")
| diamonds => print("Queen shines more than diamond.")
| hearts   => print("You always had mine.")
| spades   => print("The only card I need is the Ace of Spades! \m/")

Differently from case expressions, case arguments are shorter and simpler, but are less flexible.

Constructor with fields

Datatype constructors can have fields, allowing them to store values:

import Base@0

T Person
| person {age : Num, name : String}

get_name : {p : Person} -> String
  case/Person p
  | person => p.name
  : String

main : Output
  let john = person(26, "John")

  print(get_name(john))

As you can see, fields can be accessed inside case expressions and case arguments. Notice that p.name is not a field accessor, but just a single variable: the . is part of its name. When Formality doesn't know the name of the matched value, you must must explicitly name it using the as keyword:

main : {p : Person} -> Num
case/Person person(26, "John") as john
| person => john.age
: Num

Moving resources to branches

Since Formality functions are affine, you can't use an argument more than once. So, for example, the function below isn't allowed:

import Base@0

main : {a : Bool, b : Bool} -> Bool
  case/Bool a
  | true  => b
  | false => not(b)
  : Bool

In this case in particular, you used b in two different branches, so you shouldn't need to copy it:

import Base@0

  main : {a : Bool, b : Bool} -> Bool
  case/Bool a
  + b : Bool
  | true  => b
  | false => not(b)
  : Bool

Under the hoods, this is desugared to an extra lambda on each branch:

import Base@0

main : {a : Bool, b : Bool} -> Bool
  (case/Bool a
  | true  => {b} b
  | false => {b} not(b)
  : Bool -> Bool)(b)

Using case arguments:

import Base@0

main : {case a : Bool, b : Bool} -> Bool
| true  => b
| false => not(b)

As you can see, the version using case arguments is the shortest.

Dependent motives

In Formality, the type returned by a case expression can depend on the matched value. For example, we can do this:

import Base@0

main : Num
  case/Bool true as x
  | true  => 42
  | false => "hello"
  : case/Bool x
    | true  => Num
    | false => String
    : Type

While strange-looking, this is perfectly logical and well-typed. The reason this works is that, when checking the type of a case expression, Formality first specializes the motive for every possible branch to determine "what it demands". In this case, it demands a Num on the true branch and a String on the false branch. If you satisfy every demand, then it determines the type of the whole case expression by specializing the motive using actual matched value. In this case, we matched on true, so it returns Num.

This has many interesting effects and applications.

Functions with different return types

We could write a function that returns different types based on its input:

import Base@0

foo : {case b : Bool} -> case/Bool b | true => Num | false => String : Type
| true  => 42
| false => "hello"

main : Num
  foo(true) .+. foo(true)

Note this is not the same as returning Either, which is done in Haskell when we want to return different types, needing an extra pattern-match every time you call the function. Here, that's not necessary, because foo really returns different types based on its input. Thanks to dependent types, Formality can statically determine if you're doing the right thing, so, replacing foo(true) by foo(false) would be a type error.

Proving absurds with Empty

Another interesting example comes from the Empty datatype:

// {- From Empty@0 -}

T Empty

This code isn't incomplete, the datatype has zero constructors. This means it is impossible to construct a term t with type Empty, but we can still accept it as a function argument. So, what happens if we pattern match against it?

import Base@0

wtf : {x : Empty} -> ???
  case/Empty
  : ???

The answer is: we can replace ??? by anything, and the program will check. That's because we technically proved all the demanded cases, so Formality just returns the motive directly as the type of this case expression. That allows us to write a function that returns something absurd:

import Base@0

prove_1_is_2 : {e : Empty} -> Equal(Num, %1, %2)
  case/Empty e
  : Equal(Num, %1, %2)

If we managed to call it, we'd have a proof that 1 .==. 2, which is absurd, making Formality inconsistent. But this is fine: prove_1_is_2 can't ever be called, because we can't construct a value of type Empty. This is pretty useful, so we have a base-lib function that, given an element of Empty, returns any absurd type:

// {- From Empty@0 -}

absurd : {e : Empty, ~P : Type} -> P
  case/Empty e
  : P

The opposite holds too: given an absurd equality, we can make an element of type Empty. Here is an example:

// {- From Bool@0 -}

true_isnt_false : {e : true == false} -> Empty
  unit :: rewrite x
    in (case/Bool x
       | true  => Unit
       | false => Empty
       : Type)
    with e

As an exercise, convince yourself why this works.

Unreachable branches with equality notes

Notice the program below:

import Base@0

main : Num
  case/Bool true
  | true  => 10
  | false => ?
  : Num

Here, we're matching against true, so we know the false case is unreachable, but we still need to fill it with something. In this case, we could write any number, but that's not always possible. In those cases, equality notes can be helpful:

import Base@0

main : Num
  case/Bool true as x
  + refl(~Bool, ~%true) as e : Equal(Bool, %x, %true)
  | true  => 10
  | false => absurd(false_isnt_true(e), ~Num)
  : Num

This moves a proof that Equal(Nat, x, true) that will be specialized to the value of x on each branch. On the false branch, we get an e : Equal(Nat, false, true), which is absurd, so we can fill it by calling absurd function on e, without providing an actual number.

To desugar an equality note, Formality simply adds an extra, erased equality argument:

import Base@0

main
  (case/Bool true as x
  | true  => {e} 10
  | false => {e} absurd(false_isnt_true(e), ~Num)
  : {e : Equal(Bool, %x, %true)} -> Num)(refl(~Bool, ~%true))

Notice that the left side of the equality note is allowed to access the matched value (x), while the right side is used to build the refl proof.

Recursive fields

Fields can refer to the datatype being defined:

// {- From Nat@0 -}
T Nat
| succ {pred : Nat}
| zero

Since Nat is so common, there is a syntax-sugar for it: 0n3, which expands to succ(succ(succ(zero))).

Mutual recursion is allowed:

T Foo
| foo {bar : Foo}

T Bar
| bar {foo : Bar}

And negative occurrences too (soundly):

T Loop
| loop {f : Foo -> Foo}

While recursion is very liberal on types, the same isn't true for programs. You can't write recursive functions directly, such as:

mul_by_2 : {case n : Nat} -> Nat
| succ => succ(succ(mul_by_2(n)))
| zero => zero

main : Nat
  mul_by_2(0n2)

This makes dealing with recursive datatypes more complex than usual, since you must use a boxed definition:

#mul_by_2*N : !{case n : Nat} -> Nat
| succ => succ(succ(mul_by_2(n)))
| zero => zero
halt: zero

#main : !Nat
  ($mul_by_2*)(0n2)

This is explained in more details on the Boxes and Recursion sections of the Tutorial.

Polymorphism

Polymorphic datatypes allow us to create multiple instances of the same datatype with different contained types.

import Base@0

T Pair {A : Type, B : Type}
| pair {x : A, y : B}

main : Num
  let a = pair(~Bool, ~Num, true, 7)

  case/Pair a
  | pair => a.y
  : Num

The {A : Type, B : Type} after the datatype declares two polymorphic variables, A and B, allowing us to create different types of pair, such as a pair of Bool, or a pair of Num, without needing to duplicate the definition of Pair. Each polymorphic variable adds an implicit, erased argument to each constructor, so, instead of pair(true, 7), you need to write pair(~Bool, ~Num, true, 7). In a future, this verbosity will be prevented with implicit arguments. For now, you can amend it with a let:

import Base@0

T Pair {A : Type, B : Type}
| pair {x : A, y : B}

main : [:Pair(Bool, Bool), Pair(Bool, Bool)]
  let pairbb = pair(~Bool, ~Bool)

  let a = pairbb(true, false)
  let b = pairbb(false, true)
  [a, b]

By combining polymorphism with recursive types, we can create the popular List type (already defined on the base libs above):

// {- T List {T : Type}
// | cons {head : T, tail : List(T)}
// | nil

// Returns all but the first element
// tail : {~T : Type, case list : List(T)} -> List(T)
// | cons => list.tail
// | nil  => nil(~T) -}

main : List(Num)
  tail(~Num, cons(~Num, 1, cons(~Num, 2, cons(~Num, 3, nil(~Num)))))

Since List is so common, there is a built-in syntax-sugar for it, the dollar sign:

main : List(Num)
  tail(~Num, Num$[1, 2, 3])

Indices

Indices are like polymorphic variables, except that, rather than constant types, they are computed values that can depend on each constructor's arguments. That gives us a lot of type-level power, and is one of the reasons Formality is a great proof language. For example:

T IsEven (x : Num)
| make_even {half : Num} (half .*. 2)

This datatype has one index, x, of type Num. Its constructor, is_even, has one field, half : Num. When you write make_even(3), the number 3 is multiplied by two and moved to the type-level, resulting in a value of type IsEven(6). So, for example:

even_0 : IsEven(0)
  make_even(0)

even_2 : IsEven(2)
  make_even(1)

even_4 : IsEven(4)
  make_even(2)

even_6 : IsEven(6)
  make_even(3)

Notice that is impossible to create a value with type IsEven(3), because that would require a half : Num such that half .*. 2 .==. 3, and there is no such number. Because of that, indexed datatypes have many applications. For example, suppose that you want to write a div2 function that divides a number by two, but you don't want it to be called with odd numbers. You could do this:

div2 : {x : Num, ~x_is_even : IsEven(x)} -> Num
  x ./. 2

main : Num
  div2(10, ~make_even(5))

Here, the second argument "proves" that the first is even, making it impossible to call div2 with an odd input. Since x_is_even is erased from the runtime, this gives us a static, zero-cost guarantee. Alternatively, we could have used ~x_is_even : [k : Num ~ x .==. k .*. 2], but indexed datatypes are more handy in general. For example, you could easily write one for 3 consecutive numbers:

T Consecutive (a : Num, b : Num, c : Num)
| consecutive {init : Num} (init, init .+. 1, init .+. 2)

So, for example, Consecutive(a, b, c) means a, b, c are 3 consecutive numbers such as 7, 8, 9. You could also write a type for sorted lists:

import Base@0

T SortedList {A : Type} (xs : List(Num))
| scons {
  add  : Num,
  head : Num,
  tail : List(Num),
  prof : SortedList(Num, cons(~Num, head, tail))
} (cons(~Num, head .+. add, cons(~Num, head, tail)))
| snil
   (nil(~Num))

Here, SortedList(xs) would mean that xs is a list of Nums with elements in descending order, such as [7, 5, 5, 3, 1]. In order words, a function that returned [xs : List(Num) ~ SortedList(xs)] could only be written if xs was, indeed, sorted. And so on. Indexed datatypes, in a way, give us a powerful language of specifications which we can use to statically reason about our datatypes and algorithms.

For a more complex example of indices, here is a Vector, which is like a List, except that its type stores its own length:

// From Vector@0
// A vector is a list with a statically known length
T Vector {A : Type} (len : Nat)
| vcons {~len : Nat, head : A, tail : Vector(A, len)} (succ(len))
| vnil                                                (zero)

Every time we call vcons, the len index on the type of the vector increases by one, allowing us to track its length statically:

main : Vector(String, 0n3)
   vcons(~String, ~0n2, "ichi",
   vcons(~String, ~0n1, "ni",
   vcons(~String, ~0n0, "san",
   vnil(~String))))

This has many applications. For example, we can create a type-safe vhead that returns the first element of a non-empty vector:

// {-From Vector@0 -}

// {-A type-safe "head" that returns the first element of a non-empty vector
// On the `vcons` case, return the vector's head
// On the `vnil` case, prove it is unreachable, since `xs.len > 0` -}
vhead: {~T : Type, ~n : -Nat, xs : Vector(T, %succ(+n))} -> T
  case/Vector xs
  + refl(~Nat, ~%succ(+n)) as e : Equal(Nat, xs.len, %succ(+n))
  | vcons => xs.head
  | vnil  => absurd(zero_isnt_succ(~n, e), ~T) 
  : T

Notice that constructor indices are accessible on the left side of an equality note. That gives us an e : zero is succ(n) (since zero is the length of vnil, and succ(n) is the length of xs). Since 0 != 1, this branch is unreachable, so we can fill it with an absurd.

Self-Encodings

Interestingly, none of the features above are part of Formality's type theory. Instead, they are lightweight syntax-sugars that elaborate to plain-old lambdas. To be specific, a datatype is encoded as is own inductive hypothesis, with "Self Types". For example, the Bool datatype desugars to:

Bool : Type
  ${self}
  { ~P    : {x : Bool} -> Type
  , true  : P(true)
  , false : P(false)
  } -> P(self)

true : Bool
  new(~Bool){~P, true, false} true

false : Bool
  new(~Bool){~P, true, false} false

case_of : {b : Bool, ~P : {x : Bool} -> Type, t : P(true), f : P(false)} -> P(b)
  (%b)(~P, t, f)

Here, ${self} ..., new(~T) val and %b are the type, introduction, and elimination of Self Types, respectively. You can see how any datatype is encoded under the hoods by asking fm to evaluate its type, as in, fm Data.Bool@0/Bool -W (inside the fm_modules directory). The -W flag asks Formality to not evaluate fully since Bool is recursive. While you probably won't need to deal with self-encodings yourself, knowing how they work is valuable, since it allows you to express types not covered by the built-in syntax.

TODO: write a brief explanation of how Self encodings work (although I think it should be self-explanatory from this example!).

You can’t perform that action at this time.