In [None]:
from IPython.display import HTML
with open("../style.css", "r") as file:
    css = file.read()
HTML(css)

# A Simple Symbolic Calculator

This file shows how a simple symbolic calculator can be implemented using `Ply`.  The grammar for the language implemented by this parser is as follows:
$$
\begin{array}{lcl}
  \texttt{stmnt}   & \rightarrow & \;\texttt{IDENTIFIER} \;\texttt{':='}\; \texttt{expr}\; \texttt{';'}\\
                   & \mid        & \;\texttt{expr}\; \texttt{';'}                      \\[0.2cm]
  \texttt{expr}    & \rightarrow & \;\texttt{expr}\; \texttt{'+'} \; \texttt{product}  \\
                   & \mid        & \;\texttt{expr}\; \texttt{'-'} \; \texttt{product}  \\
                   & \mid        & \;\texttt{product}                                  \\[0.2cm]
  \texttt{product} & \rightarrow & \;\texttt{product}\; \texttt{'*'} \;\texttt{factor} \\
                   & \mid        & \;\texttt{product}\; \texttt{'/'} \;\texttt{factor} \\
                   & \mid        & \;\texttt{factor}                                   \\[0.2cm]
  \texttt{factor}  & \rightarrow &   \texttt{'('} \; \texttt{expr} \;\texttt{')'}        \\
                   & \mid        & \;\texttt{NUMBER}                                   \\
                   & \mid        & \;\texttt{IDENTIFIER}                        
\end{array}
$$

## Specification of the Scanner

In [None]:
import ply.lex as lex

There are only three tokens that need to be defined via regular expressions.  The other tokens consist only of a single character and can therefore be defined as literals.

In [None]:
tokens = [ 'NUMBER', 'IDENTIFIER', 'ASSIGN_OP' ]

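The token `NUMBER` specifies a *floating point number*: an integer part, optionally followed by a fractional part and an exponent.  The value of the token is converted to a `float`.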

In [None]:
def t_NUMBER(t):
    r'(0|[1-9][0-9]*)(\.[0-9]+)?([eE][+-]?[0-9]+)?'
    t.value = float(t.value)
    return t
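
A number pattern of this kind can be checked independently of `Ply` with the standard `re` module.  The following is a minimal sketch: the names `NUMBER_RE` and `is_number` are only illustrative, and the integer part is grouped explicitly so that a fraction or an exponent may also follow a leading `0`.

```python
import re

# Illustrative names, not part of this notebook or of Ply.
# The integer part is grouped so that "0.5" is accepted as one number.
NUMBER_RE = re.compile(r'(0|[1-9][0-9]*)(\.[0-9]+)?([eE][+-]?[0-9]+)?')

def is_number(s):
    """Return True if the whole string s is matched by NUMBER_RE."""
    return NUMBER_RE.fullmatch(s) is not None

# Strings with leading zeros or a missing digit sequence are rejected.
for s in ['0', '42', '3.14', '0.5', '1e10', '2.5E-3', '007', '.5', '1.']:
    print(f'{s!r:10} {is_number(s)}')
```

Every string that is accepted here can also be converted with `float`, which is what the token function does with `t.value`.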

The token `IDENTIFIER` specifies the name of a *variable*.

In [None]:
def t_IDENTIFIER(t):
    r'[a-zA-Z][a-zA-Z0-9_]*'
    return t

The token `ASSIGN_OP` specifies the *assignment operator*.  As this operator consists of two characters, it can't be defined as a literal.

In [None]:
def t_ASSIGN_OP(t):
    r':='
    return t

`literals` is a list of operator symbols that consist of a single character.

In [None]:
literals = ['+', '-', '*', '/', '(', ')', ';']

Blanks and tabs are ignored.

In [None]:
t_ignore  = ' \t'

Newlines are counted in order to give precise error messages.  Otherwise they are ignored.

In [None]:
def t_newline(t):
    r'\n+'
    t.lexer.lineno += t.value.count('\n')

Unknown characters are reported as lexical errors.

In [None]:
def t_error(t):
    print(f"Illegal character '{t.value[0]}' at character number {t.lexer.lexpos} in line {t.lexer.lineno}.")
    t.lexer.skip(1)
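
The character number reported above is an offset into the whole input string.  If a column within the current line is preferred, it can be derived from that offset.  The sketch below shows one way to do this; `find_column` is a hypothetical helper that is not used elsewhere in this notebook.

```python
def find_column(text, lexpos):
    """Return the 1-based column of the offset lexpos within its line."""
    # rfind returns -1 if there is no newline before lexpos,
    # so line_start is 0 for the first line.
    line_start = text.rfind('\n', 0, lexpos) + 1
    return lexpos - line_start + 1

text = 'x := 1;\ny := 2 $ 3;'
offset = text.index('$')          # offset of the illegal character
print(find_column(text, offset))  # column of '$' in the second line
```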

The next line is necessary because we use `Ply` from a notebook: `Ply` expects the variable `__file__` to be defined, which is not the case inside a notebook.

In [None]:
__file__ = 'main'

We generate the lexer.

In [None]:
lexer = lex.lex()
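
To see what kind of token stream these definitions are meant to produce, the following stand-alone sketch imitates the scanner using only the standard `re` module.  It is not how `Ply` works internally; the names `TOKEN_RE` and `tokenize` are assumptions made for this illustration.  As in `Ply`, a literal is given its own character as token type.

```python
import re

# Illustrative stand-in for the scanner; TOKEN_RE and tokenize are
# hypothetical names and not part of Ply.
TOKEN_RE = re.compile(r'''
    (?P<NUMBER>(?:0|[1-9][0-9]*)(?:\.[0-9]+)?(?:[eE][+-]?[0-9]+)?)
  | (?P<IDENTIFIER>[a-zA-Z][a-zA-Z0-9_]*)
  | (?P<ASSIGN_OP>:=)
  | (?P<LITERAL>[-+*/();])
  | (?P<SKIP>[ \t\n]+)
''', re.VERBOSE)

def tokenize(text):
    """Return a list of (type, value) pairs for the given input string."""
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f'Illegal character {text[pos]!r} at position {pos}')
        kind, value = m.lastgroup, m.group()
        pos = m.end()
        if kind == 'SKIP':               # white space produces no token
            continue
        if kind == 'NUMBER':
            tokens.append((kind, float(value)))
        elif kind == 'LITERAL':          # the character itself is the type
            tokens.append((value, value))
        else:
            tokens.append((kind, value))
    return tokens

for token in tokenize('x := 0.5 * (y + 2);'):
    print(token)
```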

## Specification of the Parser

In [None]:
import ply.yacc as yacc

The *start variable* of our grammar is `stmnt`.

In [None]:
start = 'stmnt'

There are two grammar rules for `stmnt`s:
```
    stmnt : IDENTIFIER ':=' expr ';'
          | expr ';'
          ;
```
- If a *stmnt* is an assignment, the expression on the right-hand side of the assignment operator is 
  evaluated and the value is stored in the dictionary `Names2Values`.  The key used in this dictionary
  is the name of the variable on the left-hand side of the assignment operator.
- If a *stmnt* is an expression, the expression is evaluated and the result of this evaluation is printed.

It is <b>very important</b> that in the grammar rules below the `:` is surrounded by space characters, for otherwise `Ply` will throw mysterious error messages at us!

Below, `Names2Values` is a dictionary mapping variable names to their values.  It will be defined later.

In [None]:
def p_stmnt_assign(p):
    "stmnt : IDENTIFIER ASSIGN_OP expr ';'"
    Names2Values[p[1]] = p[3]

def p_stmnt_expr(p):
    "stmnt : expr ';'"
    print(p[1])

An *expr* is a sequence of *products* that are combined with the operators `+` and `-`.
The corresponding grammar rules are:
```
    expr : expr '+' product
         | expr '-' product
         | product
         ;
```

In [None]:
def p_expr_plus(p):
    "expr : expr '+' product"
    p[0] = p[1] + p[3]
    
def p_expr_minus(p):
    "expr : expr '-' product"
    p[0] = p[1] - p[3]
    
def p_expr_product(p):
    "expr : product"
    p[0] = p[1]
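
Since the rules for *expr* are left-recursive, the operators `+` and `-` are applied from left to right, so `8 - 3 - 2` is evaluated as `(8 - 3) - 2`.  The following sketch mimics this order of evaluation without using `Ply`; the function `eval_left_to_right` is a hypothetical helper written only for this illustration.

```python
def eval_left_to_right(first, rest):
    """Combine first with a list of (operator, operand) pairs, left to right."""
    result = first
    for op, operand in rest:
        if op == '+':
            result = result + operand
        elif op == '-':
            result = result - operand
        else:
            raise ValueError(f'unknown operator {op!r}')
    return result

# 8 - 3 - 2 is grouped as (8 - 3) - 2 and not as 8 - (3 - 2).
print(eval_left_to_right(8.0, [('-', 3.0), ('-', 2.0)]))
print((8.0 - 3.0) - 2.0, 8.0 - (3.0 - 2.0))
```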

A *product* is a sequence of *factors* that are combined with the operators `*` and `/`.
The corresponding grammar rules are:
```
    product : product '*' factor
            | product '/' factor
            | factor
            ;
```

In [None]:
def p_product_multiply(p):
    "product : product '*' factor"
    p[0] = p[1] * p[3]
    
def p_product_divide(p):
    "product : product '/' factor"
    p[0] = p[1] / p[3]
    
def p_product_factor(p):
    "product : factor"
    p[0] = p[1]

A *factor* is either an expression in parentheses, a number, or an identifier.
```
   factor : '(' expr ')'
          | NUMBER
          | IDENTIFIER
          ;
```
If an identifier is undefined, its value is `NaN`, which is short for *not a number*.

In [None]:
def p_factor_group(p):
    "factor : '(' expr ')'"
    p[0] = p[2]

def p_factor_number(p):
    "factor : NUMBER"
    p[0] = p[1]

def p_factor_identifier(p):
    "factor : IDENTIFIER"
    p[0] = Names2Values.get(p[1], float('NaN'))
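
The layering of the grammar into *expr*, *product*, and *factor* is what gives `*` and `/` a higher precedence than `+` and `-`.  To make this visible, here is a hand-written evaluator for the same three layers.  It is only a sketch under simplifying assumptions: it is not what `Ply` generates, it accepts numbers without exponents, it has almost no error handling, and the name `evaluate` is hypothetical.

```python
import re

def evaluate(text, names=None):
    """Evaluate an arithmetic expression following the layers expr/product/factor."""
    names = names or {}
    # Simplified tokenizer: numbers without exponent, identifiers, operators.
    tokens = re.findall(r'[0-9]+(?:\.[0-9]+)?|[a-zA-Z][a-zA-Z0-9_]*|[-+*/()]', text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def expr():                      # expr : product (('+' | '-') product)*
        value = product()
        while peek() in ('+', '-'):
            if advance() == '+':
                value += product()
            else:
                value -= product()
        return value

    def product():                   # product : factor (('*' | '/') factor)*
        value = factor()
        while peek() in ('*', '/'):
            if advance() == '*':
                value *= factor()
            else:
                value /= factor()
        return value

    def factor():                    # factor : '(' expr ')' | NUMBER | IDENTIFIER
        token = advance()
        if token == '(':
            value = expr()
            if advance() != ')':
                raise SyntaxError("expected ')'")
            return value
        if token[0].isdigit():
            return float(token)
        return names.get(token, float('NaN'))

    return expr()

print(evaluate('1 + 2 * 3'))         # the product is evaluated first
print(evaluate('(1 + 2) * 3'))       # parentheses override the precedence
print(evaluate('x * 2', {'x': 2.5}))
```

The loops in `expr` and `product` replace the left recursion of the grammar rules, since a hand-written recursive function cannot call itself first without consuming input.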

The expression `float('NaN')` stands for an undefined number.

In [None]:
import math

In [None]:
math.nan, float('Inf'), float('Inf') - float('Inf') 

In [None]:
1.0 / float('Inf')

In [None]:
float('Inf') - 1
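
A consequence of using `NaN` for undefined variables is that the undefined value propagates through every arithmetic operation.  The short sketch below illustrates this; the dictionary `names` is a hypothetical stand-in for the table of variables.

```python
import math

names = {'x': 2.0}                  # hypothetical table of variables
y = names.get('y', float('NaN'))    # 'y' is undefined, so the result is NaN

print(math.isnan(y))                # an undefined variable yields NaN
print(math.isnan(y + 1.0))          # NaN propagates through addition
print(math.isnan(names['x'] * y))   # ... and through multiplication
print(y == y)                       # NaN is not even equal to itself
```

Because `NaN` compares unequal to everything, `math.isnan` is the reliable way to test whether a result is undefined.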

The function `p_error` is called if a syntax error occurs.  The argument `p` is the token at which the error was detected.  If `p` is `None`, the syntax error occurred at the end of the input.

In [None]:
def p_error(p):
    if p:
        print(f"Syntax error at character number {p.lexer.lexpos} at token '{p.value}' in line {p.lexer.lineno}.")
    else:
        raise SyntaxError("Incomplete input.")

Setting the optional argument `write_tables` to `False` <B style="color:red">is required</B> to prevent an *obscure bug* where the parser generator tries to read an empty parse table.
We set `debug` to `True` so that the parse tables are dumped into the file `parser.out`.

In [None]:
parser = yacc.yacc(write_tables=False, debug=True)

Let's look at the action table that is generated.

In [None]:
!cat parser.out

`Names2Values` is the dictionary that maps variable names to their values.  Initially the dictionary is empty as no variables have yet been defined.

In [None]:
Names2Values = {}

The parser is invoked by calling the method `parser.parse(s)`, where `s` is the string that is to be parsed.  The loop below terminates when an empty line is entered.

In [None]:
def main():
    while True:
        s = input('calc> ')
        if s == '':
            break
        parser.parse(s)

In [None]:
main()

In [None]:
Names2Values