# CW2.1:  Compiler Front End for FUNC

Your overall task is to develop a compiler for the programming language given below, called ``FUNC``.
This overall task is composed of two parts:

- **CW2 Part I** is concerned with the implementation of the compiler’s front end (this document). This is worth 10 marks.
- **CW2 Part II** is concerned with the implementation of the compiler’s back end. This is also worth 10 marks and will be released later.

**CW2 Part I** consists of two parts:
- Writing a lexer (4 points)
- Writing a parser (6 points)

If you have any questions, use the lab slots or ask Kathrin and the Lab Helpers.

**IMPORTANT** 
Compiler errors: All code you submit must compile. Programs that do not compile will receive an automatic zero.
- If you are having trouble getting your assignment to compile, please visit consulting hours.
- If you run out of time, it is better to comment out the parts that do not compile than to hand in a more complete file that does not compile.

## Testing 

At the end of this file you will find example programs that you can test your code with.
**You will want to write additional tests for intermediate steps.**

One simple way to check that your program behaves as expected is to write ``assert`` tests, as follows:

In [None]:
assert ([2;3;5;5;2;1] (* Expected result *) 
= [2;3;5] @ [5;2;1] (* Calling your function *) ) ;; 
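
The same pattern carries over to your own functions, including code that is expected to fail. The following is a minimal sketch only: ``sum_list`` and ``raises`` are hypothetical names used for illustration and are not part of the coursework skeleton.

```ocaml
(* Hypothetical function under test. *)
let rec sum_list (xs : int list) : int =
  match xs with
  | [] -> 0
  | x :: rest -> x + sum_list rest

(* Expected result on the left, call to your function on the right. *)
let () = assert (6 = sum_list [1; 2; 3])
let () = assert (0 = sum_list [])

(* Hypothetical helper: true if running f raises any exception. *)
let raises (f : unit -> 'a) : bool =
  try ignore (f ()); false with _ -> true

(* List.hd raises Failure on the empty list, but not on [1]. *)
let () = assert (raises (fun () -> List.hd []))
let () = assert (not (raises (fun () -> List.hd [1])))
```

A helper like ``raises`` may be handy later for checking that a parser rejects malformed input.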

**The plagiarism policy does not apply to this part of the coursework. 
Please feel free to share your tests with other students in the course.**

## Submission

Please submit a .zip file containing this notebook and the file ``CW/func.mll`` on Canvas by **Fri, 24th March**. 
Please ensure that you do not change the name or signature of the functions ``parse_exp``, ``parse_program``, etc. 

**Late Submissions.** See Canvas for F29LP's late-submission policy. 

**Plagiarism.** All code (except tests) is subject to the course's plagiarism policy. 

Happy coding!

## The Source Language: FUNC

The ``FUNC`` language has the following syntax: 

```
<program> ::= <methods> 
<methods> ::= <method>;[<methods>] 
<method> ::= method <id>([<args>]) [vars <args>] 
	begin <statements> [return <id>;] endmethod
<args> ::= <id>[,<args>] 
<statements> ::= <statement>;[<statements>] 
<statement> ::= <assign> | <if> | <while> | <rw>
<rw> ::= read <id> | write <exp>
<assign> ::= <id> := <exp>
<if> ::= if  <cond> then <statements> [else <statements>] endif 
<while> ::= while <cond> begin <statements> endwhile
<cond> ::= <bop> ( [<exps>] ) 
<bop> ::= less | lessEq | eq | nEq 
<exps> ::= <exp> [,<exps>] 
<exp> ::= <id>[( [<exps>] )] | <int> 
<int> is a natural number (no leading zeroes) 
<id> is any string starting with a character followed by characters or numbers (that is not already a keyword)
```

- Each program must have a function called ``main`` with no arguments and no return value. 
- All other functions may have an optional return value. If a function does not have a return value, it implicitly returns `0`.
- You should support the following built-in functions. Assume they have already been defined; each accepts two integers and returns an integer:
     - ``plus``, which adds its arguments;
     - ``times``, which multiplies its arguments;
     - ``minus``, which subtracts its second argument from its first;
     - ``divide``, which divides its arguments.
- All the boolean operators (``less``, ``lessEq``, ``eq``, ``nEq``) are also binary, i.e. take two arguments.
- The ``read`` command assumes that the given variable is an ``int`` variable.
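
The lexical rules for ``<id>`` and ``<int>`` in the grammar above can be sketched as plain OCaml predicates. This is an illustration under stated assumptions, not the required lexer: it assumes that "character" means an ASCII letter and that the comparison operators count as keywords. The names ``keywords``, ``is_valid_id`` and ``is_valid_int`` are hypothetical.

```ocaml
(* Keywords read off the grammar; treating less, lessEq, eq and nEq
   as keywords is an assumption. *)
let keywords =
  [ "method"; "vars"; "begin"; "return"; "endmethod"; "read"; "write";
    "if"; "then"; "else"; "endif"; "while"; "endwhile";
    "less"; "lessEq"; "eq"; "nEq" ]

(* Assumption: "character" in the grammar means an ASCII letter. *)
let is_letter c = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
let is_digit c = c >= '0' && c <= '9'

(* <id>: a letter followed by letters or digits, and not a keyword. *)
let is_valid_id (s : string) : bool =
  let n = String.length s in
  let rec rest_ok i =
    i >= n || ((is_letter s.[i] || is_digit s.[i]) && rest_ok (i + 1)) in
  n > 0 && is_letter s.[0] && rest_ok 1 && not (List.mem s keywords)

(* <int>: a non-empty digit sequence without leading zeroes ("0" is allowed). *)
let is_valid_int (s : string) : bool =
  let n = String.length s in
  let rec all_digits i = i >= n || (is_digit s.[i] && all_digits (i + 1)) in
  n > 0 && all_digits 0 && (n = 1 || s.[0] <> '0')

let () = assert (is_valid_id "res" && is_valid_id "x1")
let () = assert (not (is_valid_id "1x") && not (is_valid_id "while"))
let () = assert (is_valid_int "32" && is_valid_int "0" && not (is_valid_int "007"))
```

Note that built-in names such as ``plus`` and the function name ``main`` are ordinary identifiers under these rules, not keywords.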

##### Example 

The following example illustrates a valid FUNC program (more examples appear in the appendix).

```
method pow(x, y) vars i, res
begin
    res := x; 
    i := 1; 
    while less(i,y)
    begin
        res := times(res,x);
        i := plus(i,1); 
    endwhile;
    write res;
    return res;
endmethod;

method main() vars a, b, x
begin
    a := 5; b := 2; 
    x := pow(b,a);
    if  eq(x,32) then write 1; else write 0; endif; 
endmethod;
```

## Part 1: Lexing (4 Points)

Write a lexer in the file ``CW/func.mll``, together with a suitable representation of tokens.
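
One possible starting point for the token representation is a variant type, which would typically sit in the header section of the lexer file. The sketch below is an assumption rather than the required design: ``ID`` and ``EOF`` are the only constructor names used by the helper code in this notebook, every other constructor name is hypothetical, and the list is deliberately incomplete.

```ocaml
type token =
  | ID of string   (* identifiers such as pow or res *)
  | INT of int     (* hypothetical: integer literals *)
  | METHOD         (* hypothetical: the keyword method *)
  | ASSIGN         (* hypothetical: the symbol := *)
  | SEMI           (* hypothetical: the symbol ; *)
  | EOF            (* end of input *)
  (* ... one constructor per remaining keyword and symbol ... *)

(* Under this representation, the snippet "res := 1;" would
   correspond to the following token list. *)
let example : token list = [ID "res"; ASSIGN; INT 1; SEMI]
```

Tokens that carry data, such as ``ID "res"``, should be compared with structural equality (``=``) when writing tests.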

**IMPORTANT** Jupyter Notebook automatically saves some output information. 
Each time you change the ``func.mll`` file and want to re-run the following commands, 
first choose Kernel -> Restart & Clear Output in the menu to ensure your changed file is used.

In [None]:
#require "jupyter.notebook" ;;
open Jupyter_notebook ;;

In [None]:
(* Run the lexer generator *)
Process.sh "ocamllex func.mll";;

(* Compile and load the file produced by the lexer *)
Process.sh "ocamlc -c func.ml";;
#load "func.cmo";;

(* Convert the buffer into a list for further processing. *)
let rec stream_to_list buffer = 
    match Func.token buffer with 
    | Func.EOF -> []
    | x -> x :: stream_to_list buffer

In [None]:
(*
You can test your lexer here. 
The commented-out line below shows how to lex a program such as ex1 from the appendix.
You will want to test your lexer with more code snippets!
*)

open Func
(* let res = stream_to_list (Lexing.from_string (* Insert example program *)) *)

## Part 2: Parsing (6 Points)

Below is an abstract syntax, given as OCaml types, for the ``FUNC`` language described above.

In [None]:
type exp = Numb of int | Id of string | App of string * exp list

type bop = Less | LessEq | Eq | NEq 
type cond = C of bop * exp * exp

type statement =
  Assign of string * exp
| Read of string 
| Write of exp 
| If of cond * statement list
| Ite of cond * statement list * statement list
| While of cond * statement list

type mmethod = M of string (* name of function *)
                * string list (* arguments *)
                * string list (* declarations *) 
                * statement list (* function body *)
                * string option (* optional return value *)

type program = P of mmethod list
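
To make the correspondence between concrete syntax and these types more tangible, the sketch below builds by hand an AST for the ``pow``/``main`` example program shown earlier. It is one plausible reading of how that program maps onto the types, offered as an illustration rather than a reference output. The type definitions are repeated only so that the block is self-contained; in the notebook they are already defined by the cell above. The names ``pow_ast``, ``main_ast`` and ``example_ast`` are hypothetical.

```ocaml
(* Types repeated from the cell above so that the sketch is self-contained. *)
type exp = Numb of int | Id of string | App of string * exp list
type bop = Less | LessEq | Eq | NEq
type cond = C of bop * exp * exp
type statement =
  | Assign of string * exp
  | Read of string
  | Write of exp
  | If of cond * statement list
  | Ite of cond * statement list * statement list
  | While of cond * statement list
type mmethod =
  M of string * string list * string list * statement list * string option
type program = P of mmethod list

(* Hand-built AST for the pow method. *)
let pow_ast : mmethod =
  M ("pow", ["x"; "y"], ["i"; "res"],
     [ Assign ("res", Id "x");
       Assign ("i", Numb 1);
       While (C (Less, Id "i", Id "y"),
              [ Assign ("res", App ("times", [Id "res"; Id "x"]));
                Assign ("i", App ("plus", [Id "i"; Numb 1])) ]);
       Write (Id "res") ],
     Some "res")

(* Hand-built AST for main: no arguments and no return value. *)
let main_ast : mmethod =
  M ("main", [], ["a"; "b"; "x"],
     [ Assign ("a", Numb 5);
       Assign ("b", Numb 2);
       Assign ("x", App ("pow", [Id "b"; Id "a"]));
       Ite (C (Eq, Id "x", Numb 32), [Write (Numb 1)], [Write (Numb 0)]) ],
     None)

let example_ast : program = P [pow_ast; main_ast]
```

Values like these can serve as the expected side of an ``assert`` once the parser is in place.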

Write a recursive-descent parser for ``FUNC``.
Your parser should contain at least: 
- a function ``parse_exp : token list -> exp * token list``
- a function ``parse_cond : token list -> cond * token list``
- a function ``parse_statement : token list -> statement * token list``
- a function ``parse_program : token list -> program * token list``

You will require more functions. 
You can get partial points by providing e.g. only ``parse_exp``. 

**Hints:** 
- Your parser does **not** have to check that variables, functions, or the ``main`` function exist, or that functions are applied to the right number of arguments.
- You will want to test your program step-by-step, e.g. test that ``parse_exp`` runs as expected before writing ``parse_cond``. 
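
The ``token list -> result * token list`` shape of these functions means that each parser consumes a prefix of the tokens and hands the remainder to whichever parser runs next. A minimal sketch of this pattern on a toy grammar (deliberately not ``FUNC``) is shown below. The grammar is ``<group> ::= ( [<nums>] )`` with ``<nums> ::= <num> [, <nums>]``, and all names (``tok``, ``ToyError``, ``parse_num``, ``parse_nums``, ``parse_group``) are hypothetical.

```ocaml
exception ToyError of string

type tok = LPAR | RPAR | COMMA | NUM of int

(* <num>: consume exactly one number token. *)
let parse_num (ts : tok list) : int * tok list =
  match ts with
  | NUM n :: rest -> (n, rest)
  | _ -> raise (ToyError "number expected")

(* <nums> ::= <num> [, <nums>] *)
let rec parse_nums (ts : tok list) : int list * tok list =
  let (n, rest) = parse_num ts in
  match rest with
  | COMMA :: rest' ->
      (* A comma means another <nums> must follow. *)
      let (ns, rest'') = parse_nums rest' in
      (n :: ns, rest'')
  | _ -> ([n], rest)

(* <group> ::= ( [<nums>] ) *)
let parse_group (ts : tok list) : int list * tok list =
  match ts with
  | LPAR :: RPAR :: rest -> ([], rest)
  | LPAR :: rest ->
      let (ns, rest') = parse_nums rest in
      (match rest' with
       | RPAR :: rest'' -> (ns, rest'')
       | _ -> raise (ToyError "')' expected"))
  | _ -> raise (ToyError "'(' expected")

(* The trailing COMMA is not part of the group, so it is returned unconsumed. *)
let () =
  assert (parse_group [LPAR; NUM 1; COMMA; NUM 2; RPAR; COMMA]
          = ([1; 2], [COMMA]))
```

Each grammar rule becomes one function, and an optional part of a rule becomes a ``match`` on the next token; the same idea should carry over to rules such as ``<exps>`` and ``<args>``.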

In [None]:
exception SyntaxError of string
open List

(* Optional:
   You might want to write a function to print tokens. 
   Comment out if not needed.  *) 
      
let print_token (t : Func.token) : string = match t with 
 | _ -> "TO IMPLEMENT" 

let rec print_list (s :  Func.token list) = match s with 
  | [] -> ""
  | x :: xs -> String.cat (print_token x) (String.cat " " (print_list xs))

In [None]:
(* Optional helper functions *)
let parse_id xs : string * token list = match xs with 
 | ID x :: xs' -> (x, xs')
 | _ -> raise (SyntaxError "Not an identifier.")
  
let parse_token (x : token) (xs : token list) : token list = match xs with 
 (* Structural equality (=) is needed here: == compares physical identity
    and may be false for equal tokens that carry data, such as ID "x". *)
 | y :: ys when x = y -> ys 
 | _ -> raise (SyntaxError ("Token expected: " ^ print_token x ^ "; remaining input: " ^ print_list xs))

In [None]:
let rec parse_exp (ts : token list) : exp * token list = 
    raise (SyntaxError "Not implemented yet")

In [None]:
let parse_cond (ts : token list) : cond * token list =  
    raise (SyntaxError "Not implemented yet")

In [None]:
let rec parse_statement (ts : token list) : statement * token list = 
    raise (SyntaxError "Not implemented yet")

In [None]:
let parse_program (ts: token list) : program * token list =
    raise (SyntaxError "Not implemented yet")

## Appendix - Example Programs

In [None]:
let ex1 = "method pow(x, y) vars i, res
begin

	res := x;
	i := 1;
	while less(i,y)
	begin
		res := times(res,x);
		i := plus(i,1);
        endwhile;
	write res;
	return res;

endmethod;

method main() vars a, b, x
begin

	a := 5;
	b := 2;
	x := pow(b,a);
	if eq(x,32)
		 then write 1;
	else
		write 0;
	endif;

endmethod;
"    

let ex2 = "method pow(x,y) vars i, res,w
begin

	res := x(da,da(1,2,m(1,1)),1);
	i := 2;
	if eq(x,32) then 
		write 1;
		read a;
	else
		b := 11;
	endif;
	while less(i,y)
	begin
		res := times(res,x);
		i := plus(i,1);
        endwhile;
	write res;
	return res;

endmethod;

method main() vars a, b, x
begin

	a := 5; 
	b := 2;
	x := pow(b,a);
	if eq(x,32)
		 then write 1; 
	else 
		write 0;
	endif; 
endmethod;"

let ex3 = "method main() vars inp, res
begin
read inp;
res:=0;
while less(0,inp)
begin
res := plus(res,inp);
inp := minus(inp,1);
endwhile;
write res;
endmethod;
"

let ex4 = "method sum(inp) vars res
begin
res:=0;
while less(0,inp)
begin
res := plus(res,inp);
inp := minus(inp,1);
endwhile;
return res;
endmethod;

method main() vars inp,res
begin
read inp;
res := sum(inp);
write res;
endmethod;"

let ex5 = "method sum(inp) vars tmp
begin
if eq(inp,0) then
res := inp;
else
tmp := sum(minus(inp,1));
res := plus(tmp,inp);
endif;
endmethod;

method main() vars inp,res
begin
read inp;
res := sum(inp);
write res;
endmethod;"

let text_to_ast ex = parse_program (stream_to_list (Lexing.from_string ex))

(* Compare with what you expect *)
let parsed1 = text_to_ast ex1 
let parsed2 = text_to_ast ex2
let parsed3 = text_to_ast ex3
let parsed4 = text_to_ast ex4
let parsed5 = text_to_ast ex5
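
When comparing parsed results with what you expect, a small printer can make differences easier to spot than the raw constructor output. The following is a minimal sketch for expressions only; ``string_of_exp`` is a hypothetical helper, and the ``exp`` type is repeated from Part 2 so that the block is self-contained.

```ocaml
(* Type repeated from Part 2 so that the sketch is self-contained. *)
type exp = Numb of int | Id of string | App of string * exp list

(* Render an expression back into FUNC-like concrete syntax. *)
let rec string_of_exp (e : exp) : string =
  match e with
  | Numb n -> string_of_int n
  | Id x -> x
  | App (f, args) ->
      f ^ "(" ^ String.concat "," (List.map string_of_exp args) ^ ")"

let () =
  assert (string_of_exp (App ("times", [Id "res"; Id "x"])) = "times(res,x)")
```

The same approach could be extended to conditions, statements and methods if printing whole programs turns out to be useful.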