This repository contains the starter code for Lab 5 of the course Language and Compiler Design.
In this lab, you will extend the previous lab sessions' work on the compiler of CALCB language with short-circuit operations. Your job is to implement declarations of identifiers in the style of let expressions. You need to implement the interpreter, the type systems, and the compiler to LLVM. Use the provided environment module env.ml to manage the identifiers and their relations to values (interpreter), types (type checker), and LLVM results (compiler).
Assuming that you have completed the previous labs, the following steps outline the tasks you need to accomplish in this lab:
-
Lexer: Modify the lexer (
lexer.mll) to recognize identifiers (sequences of letters) and the keywordsletandin. You will also need to add reuse the token for the equality operator=. Take care to place the identifier rule at the end of the rules to avoid conflicts with keywords. -
Parser: Update the parser (
parser.mly) to handle let expressions of the formlet ID = expr in expr. Ensure that the parser correctly constructs the AST for these expressions. Let expressions have the lowest precedence and are right associative, which means that they should be placed at the top level of the precedence hierarchy. -
AST: Extend the AST definition (
ast.ml) to include a new constructor for let expressions. This constructor should take a list of pairs (identifier/expression) and an expression representing the scope of the identifiers. You may start with a single identifier declaration to pave the way for multiple declarations later. -
Interpreter: Implement the evaluation of let expressions in the interpreter module (
eval.ml). This involves evaluating the expressions assigned to identifiers, updating the environment with these new bindings, and then evaluating the body of the let expression in this updated environment. Ensure that the environment is correctly managed to handle nested let expressions and scope. Take notice of the explanation in the slides of the course. -
Type System: Implement the type checking for let expressions in the type system module (
typing.ml). This involves checking that expressions assigned to identifiers are well-typed and that the body of the let expression is also well-typed in the context of these declarations. The resulting type is the type of the expression body. You will need to manage a typing environment that maps identifiers to their types. Notice that the environment moduleenv.mlprovides a generic data structure that can be used for this purpose. -
LLVM Code Generation: Extend the LLVM code generation module (
llvm.ml) to handle let expressions. This involves generating LLVM code for the expressions assigned to identifiers, storing the results (registers or constants) in the environment, and then generating code for the body of the let expression using these results. An alternative is to use local variables to store the intermediate result.
Building the project follows the same steps as in previous labs. Make sure you have dune installed.
Run the following command inside the project root:
dune buildThis compiles the interpreter and related modules.
After building, you can run the interpreter with:
dune exec calccThis will start the program that evaluates expressions written in the defined expression language.
The program asks for an expression to compile and outputs some LLVM code. For instance, if the expression is 1+2+3, the output will be
@.str = private unnamed_addr constant [4 x i8] c"%d\0A\00", align 1
define i32 @main() #0 {
%1 = add nsw i32 1, 2
%2 = add nsw i32 %1, 3
%3 = call i32 (ptr, ...) @printf(ptr noundef @.str, i32 noundef %2)
ret i32 0
}
declare i32 @printf(ptr noundef, ...) #1This output should be placed in a file with the extension ll and compiled using clang
clang -o a a.llThe result is the executable a which can be run from the console
./ato obtain the result 6
To research LLVM instructions, you can read the documentation online and can compile sample C programs to LLVM. The command to do so is
clang -S -emit-llvm -c a.c -o a.llthe result will be a file called a.ll