Parsing

From your fork of the repository, go to the lip/ directory and create a new project:

dune init project toyparser

Then, run the following commands from the lip/toyparser directory:

echo '(using menhir 2.1)' >> dune-project
echo -e '(menhir (modules parser))\n(ocamllex lexer)' >> lib/dune

These commands extend the dune configuration files so that the build runs the ocamllex and Menhir tools.

The goal of this project is to become familiar with the Menhir parser generator. The lexer reads a string of text, wrapped in a lexer buffer of type Lexing.lexbuf, and turns it into a stream of tokens. The parser consumes these tokens and outputs an abstract syntax tree (AST).

You will work on the following files in the lib directory:

  • ast.ml, defining the type of the abstract syntax tree
  • lexer.mll, the ocamllex specification from which the lexer is generated
  • parser.mly, the Menhir specification from which the parser is generated
  • main.ml, containing utility functions.

In this project, we start with simple expressions that are summations of natural numbers. For instance:

1 + 2 + 3 + (1 + 2)

Correspondingly, the type of the abstract syntax tree is the following:

type ast =
    Const of int
  | Add of ast * ast

The main utility function in the Toyparser library is:

eval : ast -> result

This function takes as input an abstract syntax tree, and outputs its integer value. For instance:

eval (Add(Add(Const(1),Const(2)),Const(3)))
> 6
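
As a minimal sketch of what such a function could look like, assuming the result is a plain int (the result type in the project skeleton may differ), a recursive evaluator over the two constructors might be written as:

```ocaml
(* AST type from the text above *)
type ast =
    Const of int
  | Add of ast * ast

(* Sketch only: assumes the result is a plain int *)
let rec eval (e : ast) : int =
  match e with
  | Const n -> n                      (* a constant evaluates to itself *)
  | Add (e1, e2) -> eval e1 + eval e2 (* evaluate both operands, then sum *)
```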

The main routine in bin/main.ml reads a line from stdin and processes it. It is run with the command:

dune exec toyparser

which evaluates the string fed from stdin, and prints the result.

If everything is fine, you can test the project as follows:

echo "1 + 2 + 3 + (1 + 2)" | dune exec toyparser
> 9

The project requires you to work on the following tasks:

Task 1

Extend the lexer, the parser and the evaluation function to also handle the subtraction operator.

The tricky part of this task is to get the priority and associativity of the operators right. As a guideline, expressions should be grouped the same way as in the Python REPL. For instance:

echo "5 - 3 - 1" | dune exec toyparser
> 1

The priority of operators is defined in parser.mly. In the project skeleton, look at the line:

%left PLUS

This line instructs the parser generator that the PLUS token associates to the left. In practice, it introduces a rule in the parser saying that the sum production cannot appear as the right child of another sum production.

Then, when parsing:

1 + 2 + 3

The parser will output the AST:

Add(Add(Const(1),Const(2)),Const(3))
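
For sums, the grouping does not change the value, but for subtraction it does. The sketch below illustrates this with a hypothetical Sub constructor (not part of the skeleton; one possible way to extend the AST for Task 1) by evaluating the two possible groupings of 5 - 3 - 1:

```ocaml
(* Hypothetical extension of the AST with subtraction *)
type ast =
    Const of int
  | Add of ast * ast
  | Sub of ast * ast

let rec eval = function
  | Const n -> n
  | Add (e1, e2) -> eval e1 + eval e2
  | Sub (e1, e2) -> eval e1 - eval e2

(* Left grouping: (5 - 3) - 1 *)
let left = Sub (Sub (Const 5, Const 3), Const 1)

(* Right grouping: 5 - (3 - 1) *)
let right = Sub (Const 5, Sub (Const 3, Const 1))

let () =
  Printf.printf "left = %d, right = %d\n" (eval left) (eval right)
  (* prints "left = 1, right = 3" *)
```

Only the left grouping yields the value 1 expected in Task 1.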

A line starting with %left defines an associativity group. Every token or production in the same associativity group has the same priority.

When there are multiple associativity declarations, for example:

%left tok_1
%left tok_2
...
%left tok_n

a token tok_i has priority over tok_j whenever i>j.

In practice, if tok_j binds looser (i.e. has lower priority) than tok_i, then a tok_j expression cannot appear as a direct child of tok_i in the AST unless it was written inside parentheses.
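
To make the effect of priority concrete, here is a small sketch that assumes a hypothetical Mul constructor (not in the skeleton) whose token is declared after PLUS and therefore binds tighter. It compares the tree expected for 1 + 2 * 3 with the tree that would only result from the parenthesized input (1 + 2) * 3:

```ocaml
(* Hypothetical AST with a multiplication constructor *)
type ast =
    Const of int
  | Add of ast * ast
  | Mul of ast * ast

let rec eval = function
  | Const n -> n
  | Add (e1, e2) -> eval e1 + eval e2
  | Mul (e1, e2) -> eval e1 * eval e2

(* Expected tree for "1 + 2 * 3": the product is a child of the sum *)
let expected = Add (Const 1, Mul (Const 2, Const 3))

(* Tree for "(1 + 2) * 3": the sum is a child of the product,
   which requires explicit parentheses in the input *)
let parenthesized = Mul (Add (Const 1, Const 2), Const 3)

let () =
  Printf.printf "%d %d\n" (eval expected) (eval parenthesized)
  (* prints "7 9" *)
```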

Task 2

Extend the lexer, the parser and the evaluation function to handle also multiplication and division.

Revise the evaluation function so that results are of type int option. The result of eval e must be None if the evaluation involves a division by zero.
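
One possible shape for such an evaluator is sketched below. The Div constructor is a hypothetical name, and only Add and Div are shown to keep the example short; the other operators would follow the same pattern.

```ocaml
(* Hypothetical AST fragment with a division constructor *)
type ast =
    Const of int
  | Add of ast * ast
  | Div of ast * ast

(* Sketch: None propagates upwards from any failing subexpression *)
let rec eval : ast -> int option = function
  | Const n -> Some n
  | Add (e1, e2) ->
    (match eval e1, eval e2 with
     | Some v1, Some v2 -> Some (v1 + v2)
     | _ -> None)
  | Div (e1, e2) ->
    (match eval e1, eval e2 with
     | Some _, Some 0 -> None             (* division by zero *)
     | Some v1, Some v2 -> Some (v1 / v2) (* integer division *)
     | _ -> None)
```

With these definitions, eval (Div (Const 6, Const 2)) gives Some 3, while eval (Add (Const 1, Div (Const 1, Const 0))) gives None.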

Task 3

Extend the lexer, the parser and the evaluation function to also handle the unary minus. For instance:

echo "-1 - 2 - -3" | dune exec toyparser
> 0
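
The result 0 is consistent with the unary minus binding tighter than the binary one. As a sketch, assuming hypothetical Sub and Neg constructors and a plain int result for brevity, the tree one would expect for this input evaluates as follows:

```ocaml
(* Hypothetical AST fragment with binary and unary minus *)
type ast =
    Const of int
  | Sub of ast * ast
  | Neg of ast

let rec eval = function
  | Const n -> n
  | Sub (e1, e2) -> eval e1 - eval e2
  | Neg e -> - (eval e)

(* Expected tree for "-1 - 2 - -3", i.e. ((-1) - 2) - (-3) *)
let tree = Sub (Sub (Neg (Const 1), Const 2), Neg (Const 3))

let () = Printf.printf "%d\n" (eval tree) (* prints "0" *)
```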

Task 4

Extend the lexer, the parser and the evaluation function to also handle hexadecimal numbers in C syntax. For instance:

echo "0x01 + 2" | dune exec toyparser
> 3
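
A possible shortcut, offered as a suggestion rather than the intended solution: the standard OCaml function int_of_string already accepts the 0x and 0X prefixes, so once the lexer recognizes a hexadecimal literal, the matched string could be converted directly:

```ocaml
(* int_of_string reads strings starting with 0x or 0X as hexadecimal *)
let one = int_of_string "0x01"
let big = int_of_string "0X1F"

let () =
  Printf.printf "%d %d %d\n" one big (one + 2)
  (* prints "1 31 3" *)
```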

Task 5

Implement unit tests in the test directory.