Project 3b: SmallC Interpreter
CMSC 330, Spring 2018 Due March 13th at 11:59pm (Late: March 14th at 11:59pm)
Ground Rules and Extra Info
This is NOT a pair project. You must work on this project alone as with most other CS projects. See the Academic Integrity section for more information.
For this project, post questions using the
Project 3 folder on Piazza.
In your code, you may use any non-imperative standard library functions (with the exception of printing), but the ones that will be useful to you will be found in the
Pervasives module and the
List module. Note that the
List module has been disallowed in previous projects, but in the case of this project and projects going forward it will be allowed.
You still may not use any imperative structures of OCaml such as references or the
Up to this point your exposure to OCaml has been in an effort to familiarize you with the language, and you should now have a basic understanding of OCaml's deeply integrated typing and pattern systems. It's because of these elegant constructs that OCaml has gained popularity as programming language implementation language. Many compilers classes (including CMSC430 at UMD) use OCaml. Facebook uses OCaml heavily, too, e.g., for Hack, its PHP-based web language.
In this project, you will implement a small subset of an interpreter for SmallC, a small C-like language. The language supports variables,
bool types, equality and comparison operations, math and boolean operations, control flow, and printing, all while maintaining static type-safety and being Turing complete!
The language consists of expressions from the
expr type and statements from the
stmt type. These algebraic types can be used to represent the full space of properly formed SmallC programs. Their definitions are found in the
types.ml file. This file should be a constant reference to the data types involved in successfully working with SmallC.
This project is the second part of a project you will do later in the semester, in which you will write the parser for SmallC. In order to run SmallC code it first goes through a lexer, then a parser, and then this interpreter. The lexer and parser are both provided for this project, but later you will write a parser of your own!
To begin this project, you will need to commit any uncommitted changes to your local branch and pull updates from the git repository. Click here for directions on working with the Git repository. The following are the relevant files:
- OCaml Files
- eval.ml: All of your code will all be written in the file
eval.ml. None of the files other than this one should be changed.
- eval.mli: This is the interface for
eval.ml. It defines what types and functions are visible to modules outside of
smallc.ml, listed below).
- studentTests.ml: You aren't required to use this file, but you may use it to add your own tests to the project. We've included some small notes on writing OUnit tests and our test utilities for this project, which we discuss below if you're interested in this method of testing.
- lexer.cm[oi] and parser.cm[oi]: These precompiled object and interface files contain the lexer and parser used for turning plain files into OCaml datatypes. They are precompiled because you will implement your own parser in project 5.
- public.ml and public_inputs/: The public test driver file and the SmallC input files to go with it, respectively.
- smallc.ml: A frontend to your interpreter used to build the
smallcexecutable target. This is explained in more detail below.
- types.ml: This file contains all type definitions used in this project.
- utils.ml and testUtils.ml: These files contain functions that we have written for you and for us that aid in testing and debugging. The small part of utils.ml that concerns you in implementing this project is called out very explicitly when it is needed later in the document, and otherwise you should not need to look at either of these files.
- Submission Scripts and Other Files
- submit.rb: Execute this script to submit your project to the submit server.
- submit.jar and .submit: Don't worry about these files, but make sure you have them.
- Makefile: This is used to build the public tests by simply running the command
make, just as in 216.
Compilation, Tests, and Running
In order to compile your project, simply run the
make command and our
Makefile will handle the compilation process for you, just as in 216. After compiling your code, three executable files will be created:
The public tests can be run by simply running
./public in the terminal; think of this just like with a.out in C).
You can run the SmallC interpreter directly on a SmallC program by running
./smallc <filename>. This driver, provided by us, reads in a program from a file and evaluates the code, outputing the results of any print statements present in the source file. Think of this command a lot like the
ruby command, but instead of running the ruby interpreter, it runs the SmallC interpreter that you wrote yourself!
If you would like more detailed information about what your code produced, running
./smallc <filename> -R provides a report on the resultant variable bindings as reported by your evaluator. If you would like to see data structure that is being generated by the parser and being fed into your interpreter, run
./smallc <filename> -U and our
Utils module will translate the data structure into a string and print it out for you - this part does not require any of your code, so feel free to try it on the public tests before you even start! Use the
smallc executable to your advantage when testing; that's why we're providing it! Note that you don't need to touch
smallc.ml yourself, as it only functions as an entry point for the interpreter and is independent of your implementation.
studentTests.ml file compiles into the
student binary, which behaves just like the
public binary except serves as a dedicated space for you to write your own test cases. We offer a few small notes on how to do this, however full information on OUnit can be found online and our testing tools can be seen in full use in the
public.ml test cases, and then emulated. You are NOT required to use this file at all, and if you don't touch it, it won't affect your code at all. If you want more information about writing student tests within our framework, post on Piazza.
In order to compile against the provided parser and lexer, you must be on OCaml version 4.04.0.
The evaluator must be implemented in
eval.ml in accordance with the signatures for
eval_stmt found in
eval.ml is the only file you will write code in. The functions should be left in the order they are provided, as your implementation of
eval_stmt will rely on
The heart of SmallC is your evaluator. We have already implemented
Parser modules that deal with constructing tokens and creating the AST out of a program. Where your code picks up is with a representation of SmallC programs as OCaml datatypes, which you are then responsible for evaluating according the rules below. A program is made up of a series of statements and expressions:
- Statements represent the structure of a program - declarations, assignments, control flow, and prints in the case of SmallC. - Expressions represent operations on data - variable lookups, mathematical and boolean operations, and comparison. Expressions can't affect the environment, and as a result only return a
valuecontaining the value of the expression.
You are responsible for implementing two functions,
eval_stmt in that order. Each of these takes as an argument an
environment (given in
types.ml) that is an association list. An association list is a list of pairs (2-tuples) where the key is the first element of the pair and the value is the second element. Elements earlier in the list override elements later in the list. In our case the association list is of type
(string * value) list where the string is the id name and the value is the current value of that id. The List module has many useful functions for working with association lists. Statements may change the values of variables, so
eval_stmt returns a possibly updated environment.
A formal operational semantics of SmallC can be found in
semantics.pdf. We highly recommend you read this document as you work to clear up any possible ambiguity introduced by the English explanations below. The formal semantics do not, however, define error cases such as addition between a boolean and an integer and therefore represent a stuck reduction. The expected behavior for these irreducable error cases are defined only in this document and can be boiled down to the following rules:
- Any expression containing division by zero should raise a DivByZero error when evaluated.
- Any expression or statement that is applied to the wrong types should raise a
TypeErrorexception when evaluated, for example, the expression
1 + truewould result in
- An expression or statement that redefines an already defined variable, assigns to an undefined variable, or reads from an undefined variable should raise a
We do not enforce what messages you use when raising the
DeclareError exceptions; that's up to you. Evaluation of subexpressions should be done from left to right, as specified by the semantics, in order to ensure that lines with multiple possible errors match up with our expected errors.
Part 1: eval_expr
eval_expr takes an environment
env and an expression
e and produces the result of evaluating
e, which is something of type
Integer literals evaluate to a
Int_Val of the same value.
Boolean literals evaluate to a
Bool_Val of the same value.
An identifier evaluates to whatever value it is mapped to by the environment. Should raise a
DeclareError if the identifier has no binding.
Add, Sub, Mult, Div, and Pow
These rules are jointly classified as BinOp-Int in the formal semantics.
These mathematical operations operate only on integers and produce a
Int_Val containing the result of the operation. All operators must work for all possible integer values, positive or negative, except for division, which will raise a
DivByZeroError exception on an attempt to divide by zero. If either argument to one of these operators evaluates to a non-integer, a
TypeError should be raised.
Or and And
These rules are jointly classified as BinOp-Bool in the formal semantics.
These logical operations operate only on booleans and produce a
Bool_Val containing the result of the operation. If either argument to one of these operators evaluates to a non-boolean, a
TypeError should be raised.
The unary not operator operates only on booleans and produces a
Bool_Val containing the negated value of the contained expression. If the expression in the
Not is not a boolean (and does not evaluate to a boolean), a
TypeError should be raised.
Greater, Less, GreaterEqual, LessEqual
These rules are jointly classified as BinOp-Int in the formal semantics
These relational operators operate only on integers and produce a
Bool_Val containing the result of the operation. If either argument to one of these operators evaluates to a non-integer, a
TypeError should be raised.
Equal and NotEqual
These equality operators operate both on integers and booleans, but both subexpressions must be of the same type. The operators produce a
Bool_Val containing the result of the operation. If the two arguments to these operators do not evaluate to the same type (i.e. one boolean and one integer), a
TypeError should be raised.
Part 2: eval_stmt
eval_stmt takes an environment
env and a statement
s and produces an updated
environment (defined in Types) as a result. This environment is represented as
a in the formal semantics, but will be referred to as the environment in this document.
NoOp is short for "no operation" and should do just that - nothing at all. It is used to terminate a chain of sequence statements, and is much like the empty list in OCaml in that way. The environment should be returned unchanged when evaluating a
The sequencing statement is used to compose whole programs as a series of statements. When evaluating
Seq, evaluate the first substatement under the environment
env to create an updated environment
env'. Then, evaluate the second substatement under
env', returning the resulting environment.
The declaration statement is used to create new variables in the environment. If a variable of the same name has already been declared, a
DeclareError should be raised. Otherwise, if the type being declared is
Int_Type, a new binding to the value
Int_Val(0) should be made in the environment. If the type being declared is
Bool_Type, a new binding to the value
Bool_Val(false) should be made in the environment. The updated environment should be returned.
The assignment statement assigns new values to already-declared variables. If the variable hasn't been declared before assignment, a
DeclareError should be raised. If the variable has been declared to a different type than the one being assigned into it, a
TypeError should be raised. Otherwise, the environment should be updated to reflect the new value of the given variable, and an updated environment should be returned.
if statement consists of three components - a guard expression, an if-body statement and an else-body statement. The guard expression must evaluate to a boolean - if it does not, a
TypeError should be raised. If it evaluates to true, the if-body should be evaluated. Otherwise, the else-body should be evaluated instead. The environment resulting from evaluating the correct body should be returned.
The while statement consists of two components - a guard expression and a body statement. The guard expression must evaluate to a boolean - if it does not, a
TypeError should be raised. If it evaluates to
true, the body should be evaluated to produce a new environment and the entire loop should then be evaluated again under this new environment, returning the environment produced by the reevaluation. If the guard evaluates to
false, the current environment should simply be returned.
The formal semantics rule for while loops is particularly helpful!
The print statement is your project's access to standard output. First, the expression to
Int_Val(10) should print "10". Booleans should print in plaintext (i.e. printing
Bool_Val(true) should print "true" and likewise for "false"). Whatever is printed should always be followed by a newline.
VERY IMPORTANT NOTE ON PRINTING: If you attempt to print with
print_int, etc... you will fail any test that checks your output. This is because we do not directly intercept your output, but rather have implemented some wrapper functions in the
Utils for program printing (do not remove
open Utils from the top of
eval.ml!). As a result, your code may have any amount of
print_string statements and it WILL NOT affect the tests. Only printing performed through the following wrapper functions will be evaluated by the test system, and as a result should be used exclusively with
print_output_string : string -> unit
print_output_int : int -> unit
print_output_bool : bool -> unit
print_output_newline : unit -> unitAgain, you may use the normal prints defined in the standard library
Pervasivesmodule, but they will not affect your test grading in any way.
You should submit the file
eval.ml containing your solution. You may submit other files, but they will be ignored during grading. We will run your solution as individual OUnit tests just as in the provided public test file.
Be sure to follow the project description exactly! Your solution will be graded automatically, so any deviation from the specification will result in lost points.
You can submit your project in two ways:
- Submit your files directly to the submit server as a zip file by clicking on the submit link in the column "web submission".
Then, use the submit dialog to submit your zip file containing
Select your file using the "Browse" button, then press the "Submit project!" button. You will need to put it in a zip file since there are two component files.
- Submit directly by executing a the submission script on a computer with Java and network access. Included in this project are the submission scripts and related files listed under Project Files. These files should be in the directory containing your project. From there you can either execute submit.rb or run the command
java -jar submit.jardirectly (this is all submit.rb does).
No matter how you choose to submit your project, make sure that your submission is received by checking the submit server after submitting.
Please carefully read the academic honesty section of the course syllabus. Any evidence of impermissible cooperation on projects, use of disallowed materials or resources, or unauthorized use of computer accounts, will be submitted to the Student Honor Council, which could result in an XF for the course, or suspension or expulsion from the University. Be sure you understand what you are and what you are not permitted to do in regards to academic integrity when it comes to project assignments. These policies apply to all students, and the Student Honor Council does not consider lack of knowledge of the policies to be a defense for violating them. Full information is found in the course syllabus, which you should review before starting.