In [1]:
from IPython.core.display import HTML
with open('../style.css') as file:
    css = file.read()
HTML(css)

# Testing an <span style="font-variant:small-caps;">Antlr</span> Grammar via `grun`

In order for the examples using <span style="font-variant:small-caps;">Antlr</span> to work, 
we first have to install <span style="font-variant:small-caps;">Antlr</span>.  This can be done by executing 
the following commands in an *Anaconda environment* that has been activated:
```
conda install -y -c conda-forge antlr4-python3-runtime
conda install -y -c conda-forge antlr
```
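Alternatively, you can download https://www.antlr.org/download/antlr-4.13.1-complete.jar.  I will assume that this `.jar` file is 
stored in the directory `/usr/local/lib/`.  Note that the commands below refer to it as `antlr4-4.13.1-complete.jar`, so rename the file or adjust the commands accordingly.  Furthermore, I assume that both a *Java runtime*
and a *Java compiler* are available. 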

In [None]:
#!conda install -y -c conda-forge antlr4-python3-runtime

In [None]:
#!conda install -y -c conda-forge antlr

Our grammar is stored in the file `Expr.g4`.  In order to inspect it, we use the command line tool `cat`.  This will work on macOS and Linux.  On Windows,
either use PowerShell, which understands `cat`, or use the command `type` instead.  The option `-n` of `cat` provides numbered output.

In [2]:
!cat -n Expr.g4

     1	grammar Expr;
     2	
     3	expr    : expr '+' product 
     4	        | expr '-' product
     5	        | product        
     6	        ;
     7	
     8	product : product '*' factor 
     9	        | product '/' factor 
    10	        | factor
    11	        ;
    12	
    13	factor  : '(' expr ')'
    14	        | NUMBER
    15	        ;
    16	
    17	NUMBER  : '0'|[1-9][0-9]*;
    18	WS      : [ \t\n\r] -> skip;


In [None]:
!type Expr.g4  
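
If neither `cat` nor `type` is available, the numbered listing can also be produced in Python itself.  The following is a minimal sketch; the helper name `cat_n` is hypothetical and not part of the original notebook.  It imitates the layout of `cat -n` (a right-aligned line number followed by a tab).

```python
def cat_n(path):
    """Return the content of the file `path` with numbered lines, similar to `cat -n`."""
    numbered = []
    with open(path, encoding='utf-8') as file:
        for number, line in enumerate(file, start=1):
            # Width 6 and a tab mimic the layout produced by `cat -n`.
            numbered.append(f"{number:6d}\t{line.rstrip(chr(10))}")
    return '\n'.join(numbered)
```

Calling `print(cat_n('Expr.g4'))` should then work regardless of the operating system.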

Note that this grammar does not contain any *embedded actions*.  
Hence we cannot compute anything with it.  We will only be able to 
check whether a given string is generated by this grammar.  We can generate both the scanner and the parser using the following command:

In [3]:
!antlr4 -Dlanguage=Python3 Expr.g4

In [4]:
!ls -l

total 256
-rw-r--r--@ 1 karl  staff   4592 Nov  6 20:51 AST-2-Dot.ipynb
-rw-r--r--@ 1 karl  staff   1107 Nov  7 20:45 Calculator.g4
-rw-r--r--@ 1 karl  staff   6669 Nov  7 20:51 Calculator.ipynb
-rw-r--r--@ 1 karl  staff    906 Nov  9  2021 Differentiator.g4
-rw-r--r--@ 1 karl  staff   7491 Nov  7 21:20 Differentiator.ipynb
-rw-r--r--@ 1 karl  staff    302 Oct 31  2020 Expr.g4
-rw-r--r--@ 1 karl  staff   1366 Nov 11 14:26 Expr.interp
-rw-r--r--@ 1 karl  staff     92 Nov 11 14:26 Expr.tokens
-rw-r--r--@ 1 karl  staff   1565 Nov 11 14:26 ExprLexer.interp
-rw-r--r--@ 1 karl  staff   2228 Nov 11 14:26 ExprLexer.py
-rw-r--r--@ 1 karl  staff     92 Nov 11 14:26 ExprLexer.tokens
-rw-r--r--@ 1 karl  staff   1056 Nov 11 14:26 ExprListener.py
-rw-r--r--@ 1 karl  staff  12352 Nov 11 14:26 ExprParser.py
-rw-r--r--@ 1 karl  staff   9118 Nov  8 00:44 Interpreter-Matching.ipynb
-rw-r--r--@ 1 karl  staff    524 Nov  7 20:45 Program.g4
-rw-r--r--@ 1 karl  staff    752 Nov 12  2021 Pure.

In [5]:
!dir /B  

zsh:1: command not found: dir


The files `ExprLexer.py` and `ExprParser.py` contain the generated scanner and parser, respectively.
If we want to test the parser in this notebook, we have to import these files.

In [6]:
from ExprLexer  import ExprLexer
from ExprParser import ExprParser

Of course, we also have to import `antlr4`.

In [7]:
import antlr4

Now we are able to parse a string.  The function `parse_string` takes a string as its argument and checks
whether this string can be parsed as an arithmetic expression.  This is done in five steps:
- The string is converted into an `antlr4.InputStream`.
- The input stream is passed to the lexer.
- The lexer is wrapped in an `antlr4.CommonTokenStream`.
- The token stream is passed to the parser.
- The parser tries to parse the token stream, using `expr` as the start symbol.

In [8]:
def parse_string(string): 
    inputStream = antlr4.InputStream(string)
    lexer       = ExprLexer(inputStream)
    tokenStream = antlr4.CommonTokenStream(lexer)
    parser      = ExprParser(tokenStream)
    parser.expr()

In [9]:
parse_string('1 + 2 * 3 - 4')

As there is no syntax error, the string `'1 + 2 * 3 - 4'` adheres to the specification given by our grammar.
Let's try a string that is not generated by our grammar.

In [10]:
parse_string('1 + 2 * 3 ** 4')

line 1:11 extraneous input '*' expecting {'(', NUMBER}


As the operator `**` is not supported by our grammar, we get a *syntax error* at the 
last occurrence of the character `*` in the given string.  
Note that the column count starts at `0`.
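
Since Python string indices are zero-based as well, the reported column can be checked directly.  The short sketch below locates the last `*` in the input and prints a caret beneath it.

```python
s = '1 + 2 * 3 ** 4'

# Index of the last occurrence of '*'; this is the column reported by the parser.
column = s.rindex('*')

print(column)
print(s)
print(' ' * column + '^')  # caret pointing at the offending character
```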

In [11]:
parse_string('1 < 2')

line 1:2 token recognition error at: '<'


This time we get a *lexical error* as the character `<` is not a legal token.
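
The difference between the two kinds of errors can be illustrated with a small hand-written tokenizer.  This is only a sketch that approximates the lexer rules of `Expr.g4` with a regular expression; it is not the code generated by <span style="font-variant:small-caps;">Antlr</span>, and the names `TOKEN_RE` and `tokenize` are hypothetical.  Unlike the generated lexer, which reports the error and continues, this sketch raises an exception at the first illegal character.

```python
import re

# Approximation of the token definitions in Expr.g4:
#   NUMBER : '0'|[1-9][0-9]*;   WS : [ \t\n\r] -> skip;
# plus the literal tokens '+', '-', '*', '/', '(' and ')' used in the parser rules.
TOKEN_RE = re.compile(r"(?P<NUMBER>0|[1-9][0-9]*)|(?P<OP>[-+*/()])|(?P<WS>[ \t\n\r])")

def tokenize(text):
    """Return the list of tokens in `text`; raise ValueError at an illegal character."""
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"token recognition error at column {pos}: {text[pos]!r}")
        if match.lastgroup != 'WS':  # white space is skipped
            tokens.append(match.group())
        pos = match.end()
    return tokens
```

With this sketch, `tokenize('1 + 2 * 3 ** 4')` succeeds, because `**` is simply read as two `*` tokens; that string is only rejected later by the parser.  In contrast, `tokenize('1 < 2')` already fails during tokenization at column 2.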

We can also generate a *parse tree* with our grammar.  However, for this to work <span style="font-variant:small-caps;">Antlr</span>
first has to generate a Java parser.  Hence we have to call `antlr4` again, but this time with `Java` as the target language.

In [12]:
!java -jar /usr/local/lib/antlr4-4.13.1-complete.jar -Dlanguage=Java Expr.g4 

This command has generated some files for us that contain both a lexer and a parser.  
However, this time these are `.java` files.

In [13]:
!ls -l *.java

-rw-r--r--@ 1 karl  staff   1963 Nov 11 14:27 ExprBaseListener.java
-rw-r--r--@ 1 karl  staff   5415 Nov 11 14:27 ExprLexer.java
-rw-r--r--@ 1 karl  staff   1154 Nov 11 14:27 ExprListener.java
-rw-r--r--@ 1 karl  staff  12776 Nov 11 14:27 ExprParser.java


In [14]:
!dir /B *.java

zsh:1: command not found: dir


We have to compile the generated `.java` files.  Below, you might have to change the path to 
the file `antlr4-4.13.1-complete.jar` to make this work.  On Windows, the entries of the class path are separated by `;` instead of `:`.

In [15]:
!javac -cp .:/usr/local/lib/antlr4-4.13.1-complete.jar *.java

Next, we can start the so-called *TestRig* to generate and display the <em style="color:blue">parse tree</em> for a given string.

In [None]:
!echo "1+2*3-4" | java -cp .:/usr/local/lib/antlr4-4.13.1-complete.jar org.antlr.v4.gui.TestRig Expr expr -gui
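
The class path separator differs between operating systems (`:` on macOS and Linux, `;` on Windows).  The following sketch builds the TestRig command in a portable way via `os.pathsep`.  The helper name `testrig_command` is hypothetical, and the default path of the `.jar` file is the one assumed in this notebook.

```python
import os

def testrig_command(jar='/usr/local/lib/antlr4-4.13.1-complete.jar',
                    grammar='Expr', start_rule='expr', option='-gui'):
    """Return the argument list for starting the TestRig on the given grammar."""
    # os.pathsep is ':' on macOS and Linux and ';' on Windows.
    classpath = os.pathsep.join(['.', jar])
    return ['java', '-cp', classpath, 'org.antlr.v4.gui.TestRig',
            grammar, start_rule, option]
```

The resulting list can be passed to `subprocess.run(testrig_command(), input='1+2*3-4', text=True)`.  On a machine without a graphical display, the option `-tree` can be used instead of `-gui`; it prints the parse tree as text.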



Let us clean up the working directory.

In [None]:
!ls

In [None]:
!dir /B 

In [None]:
!rm *.py *.tokens *.interp *.java *.class
!rm -r __pycache__/

In [None]:
!del *.py *.tokens *.interp *.java *.class /Q
!rmdir __pycache__ /S /Q
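
As an alternative to the two platform-specific variants above, the cleanup can be sketched in Python.  The helper name `clean_generated` is hypothetical.  Like the shell commands, it removes *all* files matching the given patterns, so any hand-written `.py` file in the directory would be deleted as well.

```python
from pathlib import Path
import shutil

# The same patterns as in the shell commands above.
GENERATED_PATTERNS = ['*.py', '*.tokens', '*.interp', '*.java', '*.class']

def clean_generated(directory='.'):
    """Delete the generated files in `directory` and return their names."""
    root = Path(directory)
    removed = []
    for pattern in GENERATED_PATTERNS:
        for path in root.glob(pattern):
            if path.is_file():
                path.unlink()
                removed.append(path.name)
    # Remove the byte code cache; ignore the case that it does not exist.
    shutil.rmtree(root / '__pycache__', ignore_errors=True)
    return sorted(removed)
```

Calling `clean_generated()` in the notebook's directory should have the same effect as the commands above on either platform.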

In [None]:
!ls -l

In [None]:
!dir /B  