## Session 4: Error handling and modules

### Exceptions

OCaml's compiler detects a wide range of *static errors* before the program runs, such as syntax errors, scope errors, and type errors. Other errors are only discovered at runtime, for example division by zero:

In [None]:
let div x y = x / y
let _ = div 3 0

Unless it is caught, such an error is fatal to a program: the computation stops immediately, and the remaining computation is not executed. For example, if we map `div 3` over the list `[3; 6; 0; 9; 12]`, then `div 3 9` and `div 3 12` will not be evaluated.

In [None]:
let l1 = [3; 6; 0; 9; 12]
let _ = List.map (div 3) l1

Like other modern languages, OCaml provides a way to catch exceptions and recover from errors: the `try ... with` construct, whose syntax is similar to pattern matching. We can define a safe division that takes non-negative integers `x` and `y` and returns `-1` when division by zero happens. Since the quotient of two non-negative integers is never negative, `-1` unambiguously signals the error. Of course, we could also use `if ... then ... else` to test whether `y = 0`.

In [None]:
let safe_div x y = 
  try 
    div x y 
  with
  | Division_by_zero -> -1

In general, one could define a safe map function `safe_map : ('a -> 'b) -> 'b -> 'a list -> 'b list` that maps a function over a list as usual, but uses a given default value as the result whenever the function raises an exception. Notice that the wildcard pattern `_` catches all exceptions.

In [None]:
let rec safe_map f default = function
  | [] -> []
  | x :: xs -> 
    let hd = try f x with _ -> default in
      hd :: safe_map f default xs
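
As a small usage sketch, we can rerun the earlier failing map with `safe_map`. The definitions of `div` and `safe_map` are repeated so the cell runs on its own, and the default value `-1` is chosen here only for illustration.

```ocaml
(* Repeated from above so that this cell is self-contained. *)
let div x y = x / y

let rec safe_map f default = function
  | [] -> []
  | x :: xs ->
    let hd = try f x with _ -> default in
    hd :: safe_map f default xs

(* Only the element that triggers Division_by_zero is replaced by the
   default; the elements after it are still processed. *)
let results = safe_map (div 3) (-1) [3; 6; 0; 9; 12]
(* results = [1; 0; -1; 0; 0] *)
```

Unlike `List.map (div 3) l1`, the computation no longer stops at the `0` element.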

We may define our own exception types that carry parameters and raise exceptions with `raise`. For example, the following `merge` function tests if its two input lists have the same length, and if they don't, `merge` throws an exception with the lengths of the two lists.

In [None]:
exception Length_mismatch of int * int

let rec merge xs ys f = 
  let xlen = List.length xs in
  let ylen = List.length ys in
    if xlen <> ylen then 
      raise (Length_mismatch (xlen, ylen))
    else
      match xs, ys with
        | [], [] -> []
        | x :: xs, y :: ys -> (f x y) :: merge xs ys f
        | _ -> assert false

Here we used `assert`, which raises the exception `Assert_failure` if the condition that follows it evaluates to `false`. So `assert false` always raises an exception. Since we have already checked that the input lists have the same length, the third case of the pattern match (where exactly one of the lists is empty) is unreachable; we put `assert false` there so that the program fails loudly if that branch is ever reached. Without this case, the compiler would also warn that the pattern match is not exhaustive.

In [None]:
let l1 = [1; 2; 3; 4; 5]
let l2 = [2; 1; 4; 3; 5]
let f x y = if x > y then x else y
let _ = merge l1 l2 f
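
The call above succeeds because both lists have five elements. To see the exception's parameters in use, here is a minimal sketch that catches `Length_mismatch` and reads the two lengths in the handler. The helper `describe_merge` is hypothetical and introduced only for this illustration; `merge` is repeated so the cell runs on its own.

```ocaml
exception Length_mismatch of int * int

(* Repeated from above so that this cell is self-contained. *)
let rec merge xs ys f =
  let xlen = List.length xs in
  let ylen = List.length ys in
  if xlen <> ylen then
    raise (Length_mismatch (xlen, ylen))
  else
    match xs, ys with
    | [], [] -> []
    | x :: xs, y :: ys -> (f x y) :: merge xs ys f
    | _ -> assert false

(* Hypothetical helper: the handler pattern binds the two lengths
   carried by the exception. *)
let describe_merge xs ys =
  try
    let merged = merge xs ys max in
    Printf.sprintf "merged %d elements" (List.length merged)
  with
  | Length_mismatch (n, m) ->
    Printf.sprintf "length mismatch: %d vs %d" n m

let msg = describe_merge [1; 2; 3] [4; 5]
(* msg = "length mismatch: 3 vs 2" *)
```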

### Option types

Dealing with exceptions is annoying. Given a function `f : int -> int -> int`, how do you know what kinds of exceptions it might throw? There is no indication in the types. In OCaml, however, we can use option types to encode possibly erroneous values.

In [None]:
type 'a option = None | Some of 'a

Now, our functions can return `None` when there is an error and `Some x` when there is no error and the result of the function is `x`. Let's write `safe_div` again using this style.

In [None]:
let safe_div x y = 
  if y = 0 then 
    None
  else
    Some (x / y) 

Notice the change in the types: `safe_div` has type `int -> int -> int option`, showing that it is a function that might return an error. We can define further operations on the possibly erroneous value via pattern matching, for example, add `3` to the result of `safe_div`. If we see `None`, we should propagate the error and return `None`. If we see `Some a`, we extract the value `a`, add `3` to it, and put it back into `Some`.

In [None]:
let add3 = function
  | None -> None
  | Some a -> Some (a + 3)

In general, we can define a function `bind : 'a option -> ('a -> 'b option) -> 'b option`. `bind` takes two arguments: `x : 'a option`, some input value that could be an error, and `f : 'a -> 'b option`, a function that might produce an error. `bind x f` applies `f` to `x` in the most appropriate way:
- If `x` is `None`, `bind x f = None`, meaning that the error is propagated.
- If `x` is `Some a`, `bind x f = f a`, meaning that we apply `f` to the concrete value `a`.

It is common to write `bind` as an infix operator `>>=`.

In [None]:
let (>>=) x f =
  match x with
  | None -> None
  | Some a -> f a
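
As a first illustration, the earlier `add3` can be rewritten with `>>=` instead of an explicit `match`. This is only a sketch: it repeats `>>=` and `safe_div` so that it runs on its own, and it relies on OCaml's built-in `option` type, which has the same constructors as the definition above.

```ocaml
(* Repeated from above so that this cell is self-contained. *)
let ( >>= ) x f =
  match x with
  | None -> None
  | Some a -> f a

let safe_div x y =
  if y = 0 then None else Some (x / y)

(* add3 written with bind: the None case is handled by >>= itself. *)
let add3 x = x >>= fun a -> Some (a + 3)

let ok = add3 (safe_div 12 4)      (* Some 6 *)
let failed = add3 (safe_div 12 0)  (* None *)
```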

`bind` is helpful when we're defining sequences of operations that might return errors. For example, without `bind`, we would use nested pattern matching to compute `(x / y) / (z / w)` via `safe_div`:

In [None]:
let div4 x y z w = 
  match safe_div x y with
  | None -> None
  | Some k1 ->
    match safe_div z w with
      | None -> None
      | Some k2 -> safe_div k1 k2

Each additional step adds another level of nested `match`. Using `bind` (written as the infix operator `>>=`) instead gives us the following flat definition:

In [None]:
let div4 x y z w =
  safe_div x y >>= fun k1 ->
  safe_div z w >>= fun k2 ->
  safe_div k1 k2
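
The same chain can also be written with a binding operator. The sketch below assumes OCaml 4.08 or later, where user-defined `let*` operators and the standard library function `Option.bind` are available; it uses the built-in `option` type and repeats `safe_div` so that it runs on its own.

```ocaml
(* Repeated from above so that this cell is self-contained. *)
let safe_div x y =
  if y = 0 then None else Some (x / y)

(* Assumption: OCaml >= 4.08. Option.bind has the same type as the
   bind function described above. *)
let ( let* ) = Option.bind

(* Each let* stops the computation and returns None as soon as one
   of the divisions fails. *)
let div4 x y z w =
  let* k1 = safe_div x y in
  let* k2 = safe_div z w in
  safe_div k1 k2

let r1 = div4 100 5 8 2  (* (100 / 5) / (8 / 2) = Some 5 *)
let r2 = div4 1 0 2 3    (* first division fails: None *)
let r3 = div4 10 2 1 3   (* 1 / 3 = 0, so the last division fails: None *)
```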

### Modules

As our code base grows, we need a way of organizing data structures and their corresponding operations. The module system is the solution. We start by writing a stack library, implemented with lists.

In [None]:
type 'a stack = 'a list

exception Empty

let empty = []

let push x st = x :: st

let pop = function
  | [] -> raise Empty
  | _ :: st -> st

let top = function
  | [] -> raise Empty
  | x :: _ -> x

let null = function
  | [] -> true
  | _ :: _ -> false

let size = List.length

There are several problems:
- The inferred types of the operations still refer to `'a list`, not `'a stack`.
- Future definitions cannot reuse simple names like `empty` or `size`.
- A stack, ideally, should only be modified via `push` and `pop`, but the user can manipulate it arbitrarily, just like a list.

We can solve the first problem by adding type annotations, and the second problem by using `long_and_complicated_function_names`. For the last problem, we can only ask the user politely to not manipulate the stack. 

The key problem here is that we need a suitable method of *abstraction*, a module that packs the type definitions and functions together and hides the implementation details.

In [None]:
module Stack : sig
  type 'a stack
  exception Empty
  val empty : 'a stack
  val push : 'a -> 'a stack -> 'a stack
  val pop : 'a stack -> 'a stack
  val top : 'a stack -> 'a
  val null : 'a stack -> bool
  val size : 'a stack -> int
end = struct
  type 'a stack = 'a list

  exception Empty

  let empty = []

  let push x st = x :: st

  let pop = function
    | [] -> raise Empty
    | _ :: st -> st

  let top = function
    | [] -> raise Empty
    | x :: _ -> x

  let null = function
    | [] -> true
    | _ :: _ -> false

  let size = List.length
end

let st = Stack.empty
let st = Stack.push 3 st
let st = Stack.push 2 st
let v = Stack.top st
let st = Stack.pop st
let st = Stack.pop st
let st = Stack.pop st (* raises Stack.Empty: the stack is already empty *)

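Here, `sig ... end` defines the module signature, and `struct ... end` provides a concrete implementation. Because the signature declares `'a stack` without giving its definition, the type is abstract: code outside the module cannot see that a stack is a list. In a multi-file OCaml project, each file `code.ml` is itself a module `Code`, and you can specify its signature in an interface file `code.mli`. We access the contents of a module with dot notation, like `Stack.empty` or `Stack.top`.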
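
To connect modules with the earlier sections, here is a minimal sketch that turns the `Stack.Empty` exception into an option. It repeats a trimmed version of the `Stack` module so that the cell runs on its own; the helper `top_opt` is hypothetical and not part of the module.

```ocaml
(* Trimmed copy of the Stack module above. *)
module Stack : sig
  type 'a stack
  exception Empty
  val empty : 'a stack
  val push : 'a -> 'a stack -> 'a stack
  val pop : 'a stack -> 'a stack
  val top : 'a stack -> 'a
end = struct
  type 'a stack = 'a list
  exception Empty
  let empty = []
  let push x st = x :: st
  let pop = function
    | [] -> raise Empty
    | _ :: st -> st
  let top = function
    | [] -> raise Empty
    | x :: _ -> x
end

(* Hypothetical helper: report an empty stack as None instead of
   letting the exception escape. *)
let top_opt st =
  try Some (Stack.top st) with Stack.Empty -> None

let st = Stack.push 2 (Stack.push 3 Stack.empty)
let first = top_opt st                            (* Some 2 *)
let second = top_opt (Stack.pop st)               (* Some 3 *)
let nothing = top_opt (Stack.pop (Stack.pop st))  (* None *)

(* Outside the module the stack type is abstract, so treating st as a
   list is rejected by the type checker:
     let bad = 1 :: st *)
```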

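As an exercise, let's write the module version of queues using two lists: `inq` holds newly enqueued elements (newest first) and `outq` holds the elements to dequeue (oldest first). The helper `norm` reverses `inq` into `outq` whenever `outq` becomes empty, so `outq` is empty only when the whole queue is empty.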

In [None]:
module Queue : sig
  type 'a queue
  exception Empty
  val empty : 'a queue
  val null : 'a queue -> bool
  val enq : 'a -> 'a queue -> 'a queue
  val deq : 'a queue -> 'a queue
  val top : 'a queue -> 'a
  val size : 'a queue -> int
end = struct
  type 'a queue = 'a list * 'a list
  exception Empty

  let empty = [], []

  let null = function
    | [], [] -> true
    | _ -> false

  let norm = function
    | inq, [] -> [], List.rev inq
    | q -> q

  let enq a (inq, outq) = norm (a :: inq, outq)

  let deq = function
    | inq, _ :: outq -> norm (inq, outq)
    | _ -> raise Empty

  let top = function
    | _, hd :: _ -> hd
    | _ -> raise Empty

  let size (inq, outq) = List.length inq + List.length outq

end
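
As a quick check of the first-in, first-out behaviour, the sketch below enqueues three elements and then drains the queue using only the operations exposed by the signature. It repeats a trimmed copy of the `Queue` module so that the cell runs on its own; the helper `to_list` is hypothetical and introduced only for this illustration.

```ocaml
(* Trimmed copy of the Queue module above. *)
module Queue : sig
  type 'a queue
  exception Empty
  val empty : 'a queue
  val null : 'a queue -> bool
  val enq : 'a -> 'a queue -> 'a queue
  val deq : 'a queue -> 'a queue
  val top : 'a queue -> 'a
end = struct
  type 'a queue = 'a list * 'a list
  exception Empty
  let empty = [], []
  let null = function
    | [], [] -> true
    | _ -> false
  (* Keep outq non-empty whenever the queue is non-empty. *)
  let norm = function
    | inq, [] -> [], List.rev inq
    | q -> q
  let enq a (inq, outq) = norm (a :: inq, outq)
  let deq = function
    | inq, _ :: outq -> norm (inq, outq)
    | _ -> raise Empty
  let top = function
    | _, hd :: _ -> hd
    | _ -> raise Empty
end

(* Hypothetical helper: drain a queue into a list, front element first. *)
let rec to_list q =
  if Queue.null q then [] else Queue.top q :: to_list (Queue.deq q)

let q = Queue.enq 3 (Queue.enq 2 (Queue.enq 1 Queue.empty))
let order = to_list q
(* order = [1; 2; 3]: elements leave in the order they were enqueued *)
```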