# CW 2.4: Lexer for FUNC

Your overall task is to develop a compiler for the programming language given below, called ``FUNC``.

**CW 2.4** consists of writing a lexer for FUNC.

If you have any questions, use the lab slots or ask Kathrin & the Lab Helpers.

**IMPORTANT** 
Compiler errors: All code you submit must compile. Programs that do not compile will receive an automatic zero.
- If you are having trouble getting your assignment to compile, please visit consulting hours.
- If you run out of time, it is better to comment out the parts that do not compile than to hand in a more complete file that does not compile.

## Testing 

At the end of this file you'll find an example program you can test your programs with.
**You will want to write additional tests for intermediate steps.**

You can write tests that check your program behaves as expected, as follows:

In [1]:
assert ([2;3;5;5;2;1] (* Expected result *) 
= [2;3;5] @ [5;2;1] (* Calling your function *) ) ;; 

- : unit = ()
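
The same `assert` pattern extends to any function you write. Because a failing `assert` reports only a source location, a small named-check helper can make failures easier to trace. The sketch below is one possible shape; `check` is a hypothetical helper, not part of the coursework template.

```ocaml
(* Hypothetical helper (not part of the template): compares an expected
   value with an actual one and names the test if they differ. *)
let check (name : string) (expected : 'a) (actual : 'a) : unit =
  if expected <> actual then failwith ("Test failed: " ^ name)

let () =
  (* Expected result first, then the call under test. *)
  check "list append" [2; 3; 5; 5; 2; 1] ([2; 3; 5] @ [5; 2; 1]);
  check "string concat" "endwhile" ("end" ^ "while")
```

`check` relies on structural equality, so it works for lists of tokens as long as the token type contains no functions.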


**The plagiarism policy does not hold for this part of the coursework. 
Please feel free to share your tests with other students in the course.**

## Submission

Please submit a .zip file containing this notebook and the file ``CW/func.mll`` on Canvas by **Thu, 28th March**. 

**Late Submissions.** See Canvas for F29LP's late-submission policy. 

**Plagiarism.** All code (except tests) is subject to the course's plagiarism policy. 

Happy coding!

## The Source Language: FUNC

Recall the syntax of FUNC:

```
<program> ::= <methods> 
<methods> ::= <method>;[<methods>] 
<method> ::= method <id>([<args>]) [vars <args>] 
	begin <statements> [return <id>;] endmethod
<args> ::= <id>[,<args>] 
<statements> ::= <statement>;[<statements>] 
<statement> ::= <assign> | <if> | <while> | <rw>
<rw> ::= read <id> | write <exp>
<assign> ::= <id> := <exp>
<if> ::= if  <cond> then <statements> [else <statements>] endif 
<while> ::= while <cond> begin <statements> endwhile
<cond> ::= <bop> ( [<exps>] ) 
<bop> ::= less | lessEq | eq | nEq 
<exps> ::= <exp> [,<exps>] 
<exp> ::= <id>[( [<exps>] )] | <int> 
<int> is a natural number (no leading zeroes) 
<id> is any string starting with a character followed by characters or numbers (that is not already a keyword)
```
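
The two lexical rules at the end of the grammar can be sketched as plain OCaml predicates, which may be handy as a reference when writing tests. This is an illustration rather than the required ocamllex solution, and it rests on assumptions: "character" is read as an ASCII letter, `0` on its own counts as a valid `<int>` (the example programs use it), and the keyword list is collected from the terminals of the grammar above. All function names are hypothetical.

```ocaml
(* Assumption: "character" in the grammar means an ASCII letter. *)
let is_letter c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
let is_digit c = c >= '0' && c <= '9'

(* True if predicate p holds for every character of s. *)
let for_all_chars p s =
  let rec go i = i >= String.length s || (p s.[i] && go (i + 1)) in
  go 0

(* Keywords collected from the grammar's terminals. *)
let keywords =
  [ "method"; "vars"; "begin"; "return"; "endmethod"; "read"; "write";
    "if"; "then"; "else"; "endif"; "while"; "endwhile";
    "less"; "lessEq"; "eq"; "nEq" ]

(* <int>: a non-empty run of digits with no leading zero.
   Assumption: the single digit "0" is allowed. *)
let is_valid_int (s : string) : bool =
  String.length s > 0
  && for_all_chars is_digit s
  && (s = "0" || s.[0] <> '0')

(* <id>: a letter followed by letters or digits, and not a keyword. *)
let is_valid_id (s : string) : bool =
  String.length s > 0
  && is_letter s.[0]
  && for_all_chars (fun c -> is_letter c || is_digit c) s
  && not (List.mem s keywords)
```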

- Each program must have a function called ``main`` with no arguments and no return value. 
- All other functions may have an optional return value. If a function does not have a return value, it implicitly returns `0`.
- You should support the following built-in functions - assume they have been defined; they accept two integers and return an integer:
     - ``plus``, which adds its arguments;
     - ``times``, which multiplies its arguments;
     - ``minus``, which subtracts its arguments;
     - ``divide``, which divides its arguments.
- All the boolean operators (``less``, ``lessEq``, ``eq``, ``nEq``) are also binary, i.e. take two arguments.
- The ``read`` command assumes that the given variable is an ``int`` variable.
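
As a rough illustration of the built-ins and boolean operators listed above, the following sketch maps each name to an OCaml integer operation. The names `builtin` and `bop` are hypothetical, and the argument order (the first argument minus, or divided by, the second) and truncating integer division are assumptions, since the text does not pin them down. For the lexer, the built-in function names are ordinary identifiers: the sample output in this notebook lexes `plus` as `ID "plus"`.

```ocaml
(* Hypothetical lookup of the four built-in functions by name.
   Assumption: minus(a, b) = a - b and divide(a, b) = a / b. *)
let builtin (name : string) : (int -> int -> int) option =
  match name with
  | "plus" -> Some ( + )
  | "times" -> Some ( * )
  | "minus" -> Some ( - )
  | "divide" -> Some ( / )
  | _ -> None

(* Hypothetical lookup of the binary boolean operators by name. *)
let bop (name : string) : (int -> int -> bool) option =
  match name with
  | "less" -> Some ( < )
  | "lessEq" -> Some ( <= )
  | "eq" -> Some ( = )
  | "nEq" -> Some ( <> )
  | _ -> None
```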

##### Example 

The following example illustrates a valid FUNC program (more examples appear later in the document):

```
method pow(x, y) vars i, res
begin
    res := x; 
    i := 1; 
    while less(i,y)
    begin
        res := times(res,x);
        i := plus(i,1); 
    endwhile;
    write res;
    return res;
endmethod;

method main() vars a, b, x
begin
    a := 5; b := 2; 
    x := pow(b,a);
    if  eq(x,32) then write 1; else write 0; endif; 
endmethod;
```

## Lexing

Write a lexer in ``CW/func.mll``, together with a suitable representation of tokens.
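
One possible token representation is sketched below, together with a common lexer technique: match any word with a single rule, then look it up in a keyword table. The constructors `METHOD`, `ENDMETHOD`, `VARS`, `BEGIN`, `READ`, `WRITE`, `WHILE`, `ENDWHILE`, `LESS`, `LBRA`, `RBRA`, `COMMA`, `SEMI`, `ASSIGN`, `ID`, `INT` and `EOF` appear in this notebook's sample output; the remaining constructor names and the helper `token_of_word` are assumptions made for illustration.

```ocaml
(* Sketch of a token type. Names not seen in the sample output
   (RETURN, IF, THEN, ELSE, ENDIF, LESSEQ, EQ, NEQ) are assumptions. *)
type token =
  | METHOD | ENDMETHOD | VARS | BEGIN | RETURN
  | READ | WRITE
  | IF | THEN | ELSE | ENDIF
  | WHILE | ENDWHILE
  | LESS | LESSEQ | EQ | NEQ
  | LBRA | RBRA | COMMA | SEMI | ASSIGN
  | ID of string
  | INT of int
  | EOF

(* Keyword spellings taken from the FUNC grammar. *)
let keyword_table : (string * token) list =
  [ "method", METHOD; "endmethod", ENDMETHOD; "vars", VARS;
    "begin", BEGIN; "return", RETURN; "read", READ; "write", WRITE;
    "if", IF; "then", THEN; "else", ELSE; "endif", ENDIF;
    "while", WHILE; "endwhile", ENDWHILE;
    "less", LESS; "lessEq", LESSEQ; "eq", EQ; "nEq", NEQ ]

(* Hypothetical helper: a word is a keyword token if it is in the
   table, and an identifier otherwise. *)
let token_of_word (w : string) : token =
  match List.assoc_opt w keyword_table with
  | Some kw -> kw
  | None -> ID w
```

In an ocamllex file, a function like `token_of_word` would typically be called from the action of the identifier rule, so keywords do not each need their own rule. Writing one rule per keyword is an equally valid design.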

**IMPORTANT** Jupyter Notebook automatically saves some output information. 
Each time you change the ``func.mll`` file and want to re-run the following commands, 
first choose in the menu Kernel -> Restart & Clear Output to ensure your changed file is used.

In [2]:
#require "jupyter.notebook" ;;
open Jupyter_notebook ;;

/Users/lucca/.opam/default/lib/base64: added to search path
/Users/lucca/.opam/default/lib/base64/base64.cma: loaded
/Users/lucca/.opam/default/lib/ocaml/compiler-libs: added to search path
/Users/lucca/.opam/default/lib/ocaml/compiler-libs/ocamlcommon.cma: loaded
/Users/lucca/.opam/default/lib/seq: added to search path
/Users/lucca/.opam/default/lib/yojson: added to search path
/Users/lucca/.opam/default/lib/yojson/yojson.cma: loaded
/Users/lucca/.opam/default/lib/ppx_yojson_conv_lib: added to search path
/Users/lucca/.opam/default/lib/ppx_yojson_conv_lib/ppx_yojson_conv_lib.cma: loaded
/Users/lucca/.opam/default/lib/ocaml/unix.cma: loaded
/Users/lucca/.opam/default/lib/bytes: added to search path
/Users/lucca/.opam/default/lib/uuidm: added to search path
/Users/lucca/.opam/default/lib/uuidm/uuidm.cma: loaded
/Users/lucca/.opam/default/lib/jupyter: added to search path
/Users/lucca/.opam/default/lib/jupyter/jupyter.cma: loaded
/Users/lucca/.opam/default/lib/result: added to search pat

In [3]:
(* Run the lexer generator *)
Process.sh "ocamllex func.mll";;

(* Compile and load the file produced by the lexer *)
Process.sh "ocamlc -c func.ml";;
#load "func.cmo";;

(* Convert the buffer into a list for further processing. *)
let rec stream_to_list buffer = 
    match Func.token buffer with 
    | EOF -> []
    | x -> x :: stream_to_list buffer

83 states, 5367 transitions, table size 21966 bytes


- : Jupyter_notebook.Process.t =
{Jupyter_notebook.Process.exit_status = Unix.WEXITED 0; stdout = None;
 stderr = None}


- : Jupyter_notebook.Process.t =
{Jupyter_notebook.Process.exit_status = Unix.WEXITED 0; stdout = None;
 stderr = None}


val stream_to_list : Lexing.lexbuf -> Func.token list = <fun>


In [4]:
(*
You can test your lexer here. 
You will want to test your lexer with more code snippets!
*)

let p_basic = 
"
method main() vars inp, res
begin
read inp;
res:=0;
while less(0,inp)
begin
res := plus(res,inp);
inp := minus(inp,1);
endwhile;
write res;
endmethod;
";;

open Func

let res = stream_to_list (Lexing.from_string p_basic)

val p_basic : string =
  "\nmethod main() vars inp, res\nbegin\nread inp;\nres:=0;\nwhile less(0,inp)\nbegin\nres := plus(res,inp);\ninp := minus(inp,1);\nendwhile;\nwrite res;\nendmethod;\n"


val res : Func.token list =
  [METHOD; ID "main"; LBRA; RBRA; VARS; ID "inp"; COMMA; ID "res"; BEGIN;
   READ; ID "inp"; SEMI; ID "res"; ASSIGN; INT 0; SEMI; WHILE; LESS; LBRA;
   INT 0; COMMA; ID "inp"; RBRA; BEGIN; ID "res"; ASSIGN; ID "plus"; LBRA;
   ID "res"; COMMA; ID "inp"; RBRA; SEMI; ID "inp"; ASSIGN; ID "minus"; LBRA;
   ID "inp"; COMMA; INT 1; RBRA; SEMI; ENDWHILE; SEMI; WRITE; ID "res"; SEMI;
   ENDMETHOD; SEMI]
