# Assignment 2 - Parsing with CFG

## About the assignment

In this assignment, you will primarily focus on the ANTLR grammar of a simple procedural language.

The language we wish to design supports the following:

1. Arithmetic expressions.
2. Numerical and string literals.
3. Function calls.
4. Variable assignments.
5. Comments.

## The Syntax

The best way to understand the syntax is to review the test cases. Make sure you understand the positive and negative test cases included in this Kotlin notebook. The grammar consists largely of the following components:

1. Expressions
2. Function calls
3. Variable assignments
4. Statements
5. Program

## Expressions and Function Calls


1. Expressions

    We support arithmetic expressions over numeric and string values, in any combination. For example:

    - `1 + 2 * (3 - 4)`
    - `"hello" * 2 + "world"`

    Expressions can also make use of variables:

    - `pi * radius * radius`
        
2. Function calls

    Function calls can appear anywhere within an expression. Examples of function calls are `cos(3.1415 / 4 + 0.1)` and `range(0, 100)`.
   

## Variable Assignments and Statements

    
3. Variable assignments

    We can declare variables and assign expressions to them. An example is:
    
    ```
    let area = 3.1415 * 100
    ```
    
4. Statements

    A statement **always** ends with `;`. Here are some examples of statements:

    - `1 + (2 * 3);`
    - `println("The answer is:", 1 + 2);`
    - `let pi = 3.1415;`

## The Program and other considerations

5. The `program` is the start symbol of the grammar.

    It consists of zero or more statements, **followed** by `EOF`.
    
    
    
6. Comments and whitespace are ignored.

    Comments are C-style multiline comments of the form `/* ... */`.
    Both comments and whitespace (outside of string literals) are ignored.
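
To make "ignored" concrete, here is a minimal Kotlin sketch that removes `/* ... */` comments from a source string, so you can see roughly what the parser is left with. The names `blockComment` and `stripComments` are hypothetical and not part of the assignment. This is only an illustration: a real lexer discards comments token by token, whereas this naive regex would also remove comment-like text inside a string literal.

```kotlin
// Hypothetical helper for illustration only; not how the ANTLR lexer works internally.
// The non-greedy `.*?` stops at the first `*/`, and DOT_MATCHES_ALL lets `.` span newlines.
val blockComment = Regex("""/\*.*?\*/""", RegexOption.DOT_MATCHES_ALL)

// Returns `source` with every C-style block comment removed.
fun stripComments(source: String): String = blockComment.replace(source, "")
```

For example, `stripComments("let a = 1; /* note */ let b = 2;")` leaves only the two `let` statements, separated by the whitespace that surrounded the comment.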

# Getting Started

Load the dependencies.

In [1]:
@file:DependsOn("/data/shared/antlr-4.9.1-complete.jar")
@file:DependsOn(".")

Import the classes we need from the runtime library as well as the generated lexer and parser.

In [2]:
import org.antlr.v4.runtime.*
import mygrammar.*

We will use a custom error listener to report lexer and parser errors in a standardized format.

In [3]:
fun makeErrorListener(name: String) = object: BaseErrorListener() {
  override fun syntaxError(recognizer: Recognizer<*,*>,
                          symbol: Any?,
                          line: Int, pos: Int, message: String, e:RecognitionException?) {
      var error = "[${name} error at line=${line}]"
      if(message.length > 0) {
          error += ": ${message}"
      }
      throw Exception(error)
  }  
}

The `recognize` function returns `true` if the program does not have any syntax errors, and `false` if a syntax error is found.

It will also print the error message associated with the exception thrown by the error listener.

In [4]:
fun recognize(source: String): Boolean {
    val input = CharStreams.fromString(source)
    val lexer = ExprLexer(input)
    val tokens = CommonTokenStream(lexer)
    val parser = ExprParser(tokens)
    
    // setup error listener
    lexer.removeErrorListeners()
    lexer.addErrorListener(makeErrorListener("LEXER"))
    
    parser.removeErrorListeners()
    parser.addErrorListener(makeErrorListener("PARSER"))
    
    try {
        parser.program()
        return true
    } catch(e: Exception) {
        println(e.message)
        return false
    }
}
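
Because `recognize` returns a `Boolean`, the test cases in the rest of this notebook can also be checked in bulk. The following is a minimal sketch under that assumption; `failingCases` is a hypothetical helper, not part of the assignment, and it takes the recognizer as a parameter so that it does not depend on the generated classes.

```kotlin
// Hypothetical helper: runs `recognizer` over (source, expected) pairs and
// returns the sources whose actual outcome differs from the expected one.
fun failingCases(
    cases: List<Pair<String, Boolean>>,
    recognizer: (String) -> Boolean
): List<String> =
    cases.filter { (source, expected) -> recognizer(source) != expected }
         .map { it.first }
```

With a working grammar, a call such as `failingCases(listOf("42;" to true, "1 + 2" to false), ::recognize)` would be expected to return an empty list.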

## Positive Test Cases

A simple expression can be a statement.

Note that it must end with `;`.

In [5]:
val program = """
42;
"""

recognize(program)

true

This is another expression as a statement.

In [6]:
val program = """
42 + 90;
"""

recognize(program)

true

This is an expression involving both numeric and string literals, as a statement.

In [7]:
val program = """
42 + ("hello");
"""

recognize(program)

true

This is an expression with two strings.

In [8]:
val program = """
"hello" + "world";
"""

recognize(program)

true

Function calls can be statements.

In [9]:
val program = """
sin(90);
"""

recognize(program)

true

Here, we have a function call with an expression as an argument.

In [10]:
val program = """
sin(90 + 10);
"""

recognize(program)

true

Here is a nested function call, where `range` takes two arguments, both of which are simple expressions.

In [11]:
val program = """
sum(range(0, 100));
"""

recognize(program)

true

Assignments can be statements.

In [12]:
val program = """
let pi = 3.1415;
"""

recognize(program)

true

Here is a program with two statements.

In [13]:
val program = """
let pi = 3.1415;
let radius = 10;
"""

recognize(program)

true

Here is a program with multiple statements.

In [14]:
val program = """
let pi = 3.1415;
let radius = 10;
let area = pi * power(radius, 2);
println("area is", area);
"""

recognize(program)

true

Here is a program with some multi-line comments.

In [15]:
val program = """
let pi = 3.1415;
let radius = 10;

/*
 * Area is π * r * r
 */

let area = pi * power(radius, 2);

/*-------------------------------
  print the area of semi-circle 
  ------------------------------- */

println("area is", area / 2);
"""

recognize(program)

true

Since all whitespace outside of strings is ignored, a statement can span multiple lines as long as it ends with `;`.

In [16]:
val program = """
let pi = 3.1415;
let radius = 10;
let area = 
  pi * 
  power(
      radius,
      2
  )
  ;

println(
  "area is",
  area
);
"""

recognize(program)

true

## Negative Test Cases

An expression must end with `;` when used as a statement.

In [17]:
val program = """
1 + 2
""".trimIndent()

recognize(program)

[PARSER error at line=1]: missing ';' at '<EOF>'


false

The second expression does not end with `;`.

In [18]:
val program = """
(1 + 2); (1 + 2)
""".trimIndent()

recognize(program)

[PARSER error at line=1]: missing ';' at '<EOF>'


false

The second statement is incomplete.

In [19]:
val program = """
1 + 2;
let x;
""".trimIndent()

recognize(program)

[PARSER error at line=2]: mismatched input ';' expecting '='


false

The second statement has a missing `)`.

In [20]:
val program = """
1 + 2;
let x = sin(90;
"""

recognize(program)

[PARSER error at line=3]: missing ')' at ';'


false
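
Note that the error above is reported at line 3, although the faulty statement is the second one. Unlike the previous cells, this cell does not call `trimIndent()`, so the source string keeps the blank first line that follows the opening `"""`. The sketch below illustrates the difference; `lineOf` is a hypothetical helper introduced only for this illustration.

```kotlin
// The same source as in the cell above, with and without trimIndent().
val raw = """
1 + 2;
let x = sin(90;
"""
val trimmed = raw.trimIndent() // drops the blank first and last lines

// Hypothetical helper: 1-based number of the first line containing `needle`.
fun lineOf(text: String, needle: String): Int =
    text.lines().indexOfFirst { it.contains(needle) } + 1
```

Here `lineOf(raw, "sin(90")` is 3, while `lineOf(trimmed, "sin(90")` is 2, which matches the line numbers seen in the error messages.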

Here, the function call `range(...)` is missing the argument separator `,`.

In [21]:
val program = """
1 + 2;
let x = range(1 10);
"""

recognize(program)

[PARSER error at line=3]: extraneous input '10' expecting ')'


false

Oops, the string `"hello` is missing its closing quote. This triggers a lexer error.

In [22]:
val program = """
let message = "hello * 2;
"""

recognize(program)

[LEXER error at line=2]: token recognition error at: '"hello * 2;\n'


false

Variable names must start with a letter, so `0message` is not a valid identifier.

In [23]:
val program = """
let 0message = 1+2;
"""

recognize(program)

[PARSER error at line=2]: extraneous input '0' expecting ID


false
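
The error above comes from the parser rather than the lexer. A plausible reading of the message is that the lexer splits `0message` into a number `0` followed by an identifier `message`, and the parser then rejects the unexpected number. The sketch below imitates that longest-match behaviour with a toy tokenizer. The token patterns and the names `tokenPatterns` and `tokenize` are assumptions made for illustration; the lexer rules you write for the assignment may differ.

```kotlin
// Assumed token shapes, for illustration only.
val tokenPatterns = listOf(
    "NUMBER" to Regex("""[0-9]+(\.[0-9]+)?"""),
    "ID" to Regex("""[A-Za-z][A-Za-z0-9_]*""")
)

// Splits `text` into (type, lexeme) pairs by repeatedly taking the
// longest pattern match that starts exactly at the current position.
fun tokenize(text: String): List<Pair<String, String>> {
    val tokens = mutableListOf<Pair<String, String>>()
    var pos = 0
    while (pos < text.length) {
        if (text[pos].isWhitespace()) { pos++; continue }
        val best = tokenPatterns
            .mapNotNull { (type, regex) ->
                // find() searches forward, so keep only matches anchored at `pos`
                regex.find(text, pos)?.takeIf { it.range.first == pos }?.let { type to it.value }
            }
            .maxByOrNull { it.second.length }
            ?: throw IllegalArgumentException("token recognition error at: '${text[pos]}'")
        tokens += best
        pos += best.second.length
    }
    return tokens
}
```

Under these assumed patterns, `tokenize("0message")` yields a `NUMBER` token `0` followed by an `ID` token `message`, whereas `tokenize("message0")` yields a single `ID` token.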

We don't support `&` as a valid token, so it triggers a lexer error.

In [24]:
val program = """
let message = 1 & 2;
"""

recognize(program)

[LEXER error at line=2]: token recognition error at: '&'


false