In [None]:
reload_ext run_and_test

# Background

## A context free grammar for arithmetic expressions

Consider the following context free grammar:

    EXPRESSION --> TERM TERM_OPERATOR EXPRESSION
    EXPRESSION --> TERM
    TERM --> FACTOR FACTOR_OPERATOR TERM
    TERM --> FACTOR
    FACTOR --> (EXPRESSION)
    FACTOR --> NUMBER
    NUMBER --> DIGIT NUMBER
    NUMBER --> DIGIT
    DIGIT --> 0
    ...
    DIGIT --> 9
    TERM_OPERATOR --> +
    TERM_OPERATOR --> -
    FACTOR_OPERATOR --> *
    FACTOR_OPERATOR --> /

The grammar is designed in such a way that:

* multiplication and division have higher priority than addition and subtraction;
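* addition and subtraction have the same priority, as do multiplication and division; operators of the same priority are meant to associate to the left. Since the productions are right recursive, this left associativity is not reflected in the shape of the derivations and has to be enforced when evaluating.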

Note that numbers can have an arbitrary number of leading 0s. For instance, while 02, 002, 0002... are incorrect Python integer literals, they are valid for the grammar and should all evaluate to 2.
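
The remark on leading 0s can be checked directly in Python. The following is a minimal sketch; `digits_to_int` and `is_python_literal` are hypothetical helper names introduced for illustration only, not part of the expected solution.

```python
def digits_to_int(digits):
    """Value of a nonempty string of decimal digits; leading 0s are allowed."""
    value = 0
    for digit in digits:
        value = value * 10 + int(digit)
    return value


def is_python_literal(text):
    """True if Python accepts text as an expression, False on a SyntaxError."""
    try:
        compile(text, '<literal>', 'eval')
    except SyntaxError:
        return False
    return True


print(digits_to_int('0002'))      # 2
print(int('0002'))                # 2: int() also tolerates leading 0s in a string
print(is_python_literal('0002'))  # False: rejected as a Python integer literal
```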

## Examples of derivations


Here are a few leftmost derivations of some finite sequences of terminals generated by the grammar:

* 2

      EXPRESSION
      -> TERM
       -> FACTOR
        -> NUMBER
         -> DIGIT
          -> 2
* (357)

      EXPRESSION
      -> TERM
       -> FACTOR
        -> (EXPRESSION)
         -> (TERM)
          -> (FACTOR)
           -> (NUMBER)
            -> (DIGIT NUMBER)
             -> (3 NUMBER)
              -> (3 DIGIT NUMBER)
               -> (35 NUMBER)
                -> (35 DIGIT)
                 -> (357)
* 1 + 2

      EXPRESSION
      -> TERM TERM_OPERATOR EXPRESSION
       -> FACTOR TERM_OPERATOR EXPRESSION
        -> NUMBER TERM_OPERATOR EXPRESSION
         -> DIGIT TERM_OPERATOR EXPRESSION
          -> 1 TERM_OPERATOR EXPRESSION
           -> 1 + EXPRESSION
            -> 1 + TERM
             -> 1 + FACTOR
              -> 1 + NUMBER
               -> 1 + DIGIT
                -> 1 + 2
* 3 / 4 - 5 * 6

      EXPRESSION
      -> TERM TERM_OPERATOR EXPRESSION
       -> FACTOR FACTOR_OPERATOR TERM TERM_OPERATOR EXPRESSION
        -> NUMBER FACTOR_OPERATOR TERM TERM_OPERATOR EXPRESSION
         -> DIGIT FACTOR_OPERATOR TERM TERM_OPERATOR EXPRESSION
          -> 3 FACTOR_OPERATOR TERM TERM_OPERATOR EXPRESSION
           -> 3 / TERM TERM_OPERATOR EXPRESSION
            -> 3 / FACTOR TERM_OPERATOR EXPRESSION
             -> 3 / NUMBER TERM_OPERATOR EXPRESSION
              -> 3 / DIGIT TERM_OPERATOR EXPRESSION
               -> 3 / 4 TERM_OPERATOR EXPRESSION
                -> 3 / 4 - EXPRESSION
                 -> 3 / 4 - TERM
                  -> 3 / 4 - FACTOR FACTOR_OPERATOR TERM
                   -> 3 / 4 - NUMBER FACTOR_OPERATOR TERM
                    -> 3 / 4 - DIGIT FACTOR_OPERATOR TERM
                     -> 3 / 4 - 5 FACTOR_OPERATOR TERM
                      -> 3 / 4 - 5 * TERM
                       -> 3 / 4 - 5 * FACTOR
                        -> 3 / 4 - 5 * NUMBER
                         -> 3 / 4 - 5 * DIGIT
                          -> 3 / 4 - 5 * 6
* (5 + 7) / 4 / 6

      EXPRESSION
      -> TERM
       -> FACTOR FACTOR_OPERATOR TERM
        -> (EXPRESSION) FACTOR_OPERATOR TERM
         -> (TERM TERM_OPERATOR EXPRESSION) FACTOR_OPERATOR TERM
          -> (FACTOR TERM_OPERATOR EXPRESSION) FACTOR_OPERATOR TERM
           -> (NUMBER TERM_OPERATOR EXPRESSION) FACTOR_OPERATOR TERM
            -> (DIGIT TERM_OPERATOR EXPRESSION) FACTOR_OPERATOR TERM
             -> (5 TERM_OPERATOR EXPRESSION) FACTOR_OPERATOR TERM
              -> (5 + EXPRESSION) FACTOR_OPERATOR TERM
               -> (5 + TERM) FACTOR_OPERATOR TERM
                -> (5 + FACTOR) FACTOR_OPERATOR TERM
                 -> (5 + NUMBER) FACTOR_OPERATOR TERM
                  -> (5 + DIGIT) FACTOR_OPERATOR TERM
                   -> (5 + 7) FACTOR_OPERATOR TERM
                    -> (5 + 7) / TERM
                     -> (5 + 7) / FACTOR FACTOR_OPERATOR TERM
                      -> (5 + 7) / NUMBER FACTOR_OPERATOR TERM
                       -> (5 + 7) / DIGIT FACTOR_OPERATOR TERM
                        -> (5 + 7) / 4 FACTOR_OPERATOR TERM
                         -> (5 + 7) / 4 / TERM
                          -> (5 + 7) / 4 / FACTOR
                           -> (5 + 7) / 4 / NUMBER
                            -> (5 + 7) / 4 / DIGIT
                             -> (5 + 7) / 4 / 6
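
Derivations such as the ones above can be checked mechanically. The sketch below encodes the grammar as a dictionary and verifies that each step rewrites the leftmost nonterminal using one of its productions. The names `GRAMMAR`, `is_leftmost_step` and `is_leftmost_derivation` are assumptions made for this illustration, and sentential forms are represented as lists of symbols, so parentheses and the digits of a number have to be separated by spaces when written as strings.

```python
GRAMMAR = {
    'EXPRESSION': [['TERM', 'TERM_OPERATOR', 'EXPRESSION'], ['TERM']],
    'TERM': [['FACTOR', 'FACTOR_OPERATOR', 'TERM'], ['FACTOR']],
    'FACTOR': [['(', 'EXPRESSION', ')'], ['NUMBER']],
    'NUMBER': [['DIGIT', 'NUMBER'], ['DIGIT']],
    'DIGIT': [[digit] for digit in '0123456789'],
    'TERM_OPERATOR': [['+'], ['-']],
    'FACTOR_OPERATOR': [['*'], ['/']],
}


def is_leftmost_step(before, after):
    """True if after is before with its leftmost nonterminal rewritten."""
    for i, symbol in enumerate(before):
        if symbol in GRAMMAR:
            return any(before[:i] + rhs + before[i + 1:] == after
                       for rhs in GRAMMAR[symbol])
    # before contains terminals only: nothing can be rewritten.
    return False


def is_leftmost_derivation(forms):
    """True if forms starts from EXPRESSION and every step is leftmost."""
    return forms[0] == ['EXPRESSION'] and all(
        is_leftmost_step(a, b) for a, b in zip(forms, forms[1:]))


derivation = [form.split() for form in (
    'EXPRESSION',
    'TERM TERM_OPERATOR EXPRESSION',
    'FACTOR TERM_OPERATOR EXPRESSION',
    'NUMBER TERM_OPERATOR EXPRESSION',
    'DIGIT TERM_OPERATOR EXPRESSION',
    '1 TERM_OPERATOR EXPRESSION',
    '1 + EXPRESSION',
    '1 + TERM',
    '1 + FACTOR',
    '1 + NUMBER',
    '1 + DIGIT',
    '1 + 2',
)]
print(is_leftmost_derivation(derivation))  # True
```

A step that skips a production, such as going from `1 + TERM` directly to `1 + NUMBER`, is rejected by `is_leftmost_step`.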

## Recursive descent parsers

Let $s$ be an arithmetic expression, defined as a sequence of spaces, opening and closing parentheses, addition, subtraction, multiplication and division signs, and digits. Saying that $s$ is syntactically valid (respectively, syntactically invalid) means that, once the spaces have been omitted, $s$ can (respectively, cannot) be generated by the grammar. We want to write code that meets the following requirements; to simplify the discussion, assume that $s$ contains no space.

* If $s$ is syntactically valid, an attempt is made to evaluate it: either the evaluation succeeds and its result (an integer or a floating point number) is returned, or division by 0 is attempted at some point, in which case `ZeroDivisionError` should be raised.
* If $s$ is syntactically invalid, `None` should be returned, whether or not division by 0 would be attempted at some point.

It is possible to proceed as follows.

* All symbols in $s$ can be stored in reverse order in a list; consuming all these symbols from left to right in $s$ can then be effectively achieved by popping the list's last element again and again until none is left.
* A function $\mathit{evaluate\_expression}$ can be in charge of processing the longest initial segment of $s$ that is either empty or syntactically valid. When the function has completed its work, either all symbols from $s$ have been consumed and it has tentatively evaluated $s$, with the expected behavior, or it has returned `None`.
* Given an arithmetic expression $\sigma$, the longest initial segment of $\sigma$ that is syntactically valid is of the form $t_1\pm t_2\pm\dots\pm t_n$ for some largest $n\geq 1$ and terms $t_1$, ..., $t_n$, where each $\pm$ stands for a plus or a minus sign. As $n$ is the largest possible, given $i\in\{1,\dots,n\}$, $t_i$ is of the form $f_{i,1}\divideontimes f_{i,2}\divideontimes\dots\divideontimes f_{i,m_i}$ for some largest $m_i\geq 1$ and factors $f_{i,1}$, ..., $f_{i,m_i}$, where each $\divideontimes$ stands for a multiplication or a division sign. A function $\mathit{evaluate\_term}$ can be in charge of processing each of $t_1$, ..., $t_n$: each of them is a longest sequence of the form $\phi_1\divideontimes\phi_2\divideontimes\dots\divideontimes\phi_k$ for some $k\geq 1$ and factors $\phi_1$, ..., $\phi_k$, because no such sequence followed by a plus or a minus sign can be an initial segment of another such sequence.
* A function $\mathit{evaluate\_factor}$ can be in charge of processing $f_{1,1}$, ..., $f_{1,m_1}$, ..., $f_{n,1}$, ..., $f_{n,m_n}$. Since for all $i\in\{1,\dots,n\}$, $m_i$ is the largest possible, for all $i\in\{1,\dots,n\}$ and $j\in\{1,\dots,m_i\}$, there are two alternatives.
    * $f_{i,j}$ is of the form $(\sigma)$ for some syntactically valid arithmetic expression $\sigma$. In this case,
        * the opening parenthesis can be consumed;
        * then $\mathit{evaluate\_expression}$ can be used to process the whole of $\sigma$ and no symbol beyond, if any, as no syntactically valid arithmetic expression starts with $\sigma)$;
        * then the closing parenthesis can be consumed.
    * $f_{i,j}$ is a nonempty sequence of digits that terminates $s$ or that is followed in $s$ by a nondigit symbol. All digits that make up this sequence can be consumed by a function $\mathit{evaluate\_number}$.
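
The steps above can be sketched as a recognizer, that is, code that only decides whether $s$ is syntactically valid. This is a minimal sketch under stated assumptions, not the expected solution: the names `is_valid` and `recognize_*` are hypothetical, and evaluation as well as the handling of division by 0 are deliberately left out.

```python
def is_valid(expression):
    # Nonspace symbols in reverse order: symbols[-1] is the next to consume.
    symbols = [c for c in reversed(expression) if c != ' ']
    # Valid only if an expression is recognized and nothing is left over.
    return recognize_expression(symbols) and not symbols


def recognize_expression(symbols):
    # t_1 ± t_2 ± ... ± t_n, with n as large as possible.
    if not recognize_term(symbols):
        return False
    while symbols and symbols[-1] in '+-':
        symbols.pop()
        if not recognize_term(symbols):
            return False
    return True


def recognize_term(symbols):
    # f_1 (* or /) f_2 ... f_m, with m as large as possible.
    if not recognize_factor(symbols):
        return False
    while symbols and symbols[-1] in '*/':
        symbols.pop()
        if not recognize_factor(symbols):
            return False
    return True


def recognize_factor(symbols):
    if not symbols:
        return False
    if symbols[-1] == '(':
        symbols.pop()
        if not recognize_expression(symbols):
            return False
        if not symbols or symbols[-1] != ')':
            return False
        symbols.pop()
        return True
    return recognize_number(symbols)


def recognize_number(symbols):
    # A nonempty run of digits, consumed until a nondigit or the end.
    if not symbols or symbols[-1] not in '0123456789':
        return False
    while symbols and symbols[-1] in '0123456789':
        symbols.pop()
    return True


print(is_valid('(5 + 7) / 4 / 6'))  # True
print(is_valid('100 + -3'))         # False
```

Each function decides what to do by inspecting `symbols[-1]` only, and a symbol that has been popped is never put back.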

It follows from the previous considerations that looking at the next symbol to consume, before actually consuming it, is always sufficient to decide how to proceed. This makes this context free grammar an _$\mathrm{LL}(1)$ grammar_. More generally, if for some $k\geq 1$, a context free grammar has the property that looking at some of the next $k$ symbols to consume is always sufficient before actually consuming the first one, then the grammar is $\mathrm{LL}(k)$. The two Ls in the notation stand for __L__*eft-to-right* scanning of the input and __L__*eftmost derivation*.

The procedure that has been described is a kind of *recursive descent parser* that is *predictive*, because it involves no backtracking.

# Task

Write a program `predictive_parser.py` that implements a function, `evaluate(expression)`, that given as argument a string $s$, returns `None` if $s$ is not a syntactically valid arithmetic expression (as previously defined), and in case it is, returns the result of $s$'s evaluation if no division by 0 is attempted, raising `ZeroDivisionError` otherwise. The program should be designed in such a way that it implements the kind of predictive recursive descent parser previously described, operating on the reversed list of nonspace characters in $s$.

`float('nan')` might prove useful, as well as, but to a lesser extent, the `isnan()` function from the `math` module.
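
As an illustration of why `float('nan')` can help, here is a small sketch of how a "not a number" value propagates through later arithmetic, so that parsing can carry on after a division by 0 and the decision to raise `ZeroDivisionError` can be postponed until the whole input is known to be syntactically valid. The helper `safe_divide` is a hypothetical name used for this example only.

```python
from math import isnan

nan = float('nan')


def safe_divide(x, y):
    """Return nan instead of raising when dividing by 0."""
    return nan if y == 0 else x / y


# nan contaminates every later operation it takes part in.
value = safe_divide(100, 0) + 10 / 16
print(isnan(value))  # True

# nan is not equal to itself, hence the need for isnan().
print(nan == nan)    # False
```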

# Tests

## Evaluating '002'

In [None]:
statements = 'from predictive_parser import *; print(evaluate("002"))'

In [None]:
%%run_and_test python3 -c "$statements"

'2\n'

## Evaluating '(357)'

In [None]:
statements = 'from predictive_parser import *; print(evaluate("(357)"))'

In [None]:
%%run_and_test python3 -c "$statements"

'357\n'

## Evaluating '10+0325'

In [None]:
statements = 'from predictive_parser import *; print(evaluate("10+0325"))'

In [None]:
%%run_and_test python3 -c "$statements"

'335\n'

## Evaluating '1 - 20 + 300'

In [None]:
statements = 'from predictive_parser import *; print(evaluate("1 - 20 + 300"))'

In [None]:
%%run_and_test python3 -c "$statements"

'281\n'

## Evaluating ' ((   ((( 10 ))-((20))+(  (000)  ))   )) '

In [None]:
statements = 'from predictive_parser import *; '\
             'print(evaluate(" ((   ((( 10 ))-((20))+(  (000)  ))   )) "))'

In [None]:
%%run_and_test python3 -c "$statements"

'-10\n'

## Evaluating '20*4/5'

In [None]:
statements = 'from predictive_parser import *; print(evaluate("20*4/5"))'

In [None]:
%%run_and_test python3 -c "$statements"

'16.0\n'

## Evaluating '(( (((020))  *  ((  41  ))/((58 ))))   ) '

In [None]:
statements = 'from predictive_parser import *; '\
             'print(evaluate("(( (((020))  *  ((  41  ))/((58 ))))   ) "))'

In [None]:
%%run_and_test python3 -c "$statements"

'14.137931034482758\n'

## Evaluating '30 / 412 - 5735 * 6008'

In [None]:
statements = 'from predictive_parser import *; '\
             'print(evaluate("30 / 412 - 5735 * 6008"))'

In [None]:
%%run_and_test python3 -c "$statements"

'-34455879.92718446\n'

## Evaluating '(5543  +  732) /49/6'

In [None]:
statements = 'from predictive_parser import *; '\
             'print(evaluate("(5543  +  732) /49/6"))'

In [None]:
%%run_and_test python3 -c "$statements"

'21.343537414965983\n'

## Evaluating '1 + 20 * 30 - 400 / 500'

In [None]:
statements = 'from predictive_parser import *; '\
             'print(evaluate("1 + 20 * 30 - 400 / 500"))'

In [None]:
%%run_and_test python3 -c "$statements"

'600.2\n'

## Evaluating '1 + (20*30-400) / 500'

In [None]:
statements = 'from predictive_parser import *; '\
             'print(evaluate("1 + (20*30-400) / 500"))'

In [None]:
%%run_and_test python3 -c "$statements"

'1.4\n'

## Evaluating '1+(20 / 30 * 400)- 500'

In [None]:
statements = 'from predictive_parser import *; '\
             'print(evaluate("1+(20 / 30 * 400)- 500"))'

In [None]:
%%run_and_test python3 -c "$statements"

'-232.33333333333337\n'

## Evaluating '1 + 2 * (3+4*5) / (6*7-8/9)'

In [None]:
statements = 'from predictive_parser import *; '\
             'print(evaluate("1 + 2 * (3+4*5) / (6*7-8/9)"))'

In [None]:
%%run_and_test python3 -c "$statements"

'2.1189189189189186\n'

## Checking that '1 & 2' is not valid

In [None]:
statements = 'from predictive_parser import *; print(evaluate("1 & 2"))'

In [None]:
%%run_and_test python3 -c "$statements"

'None\n'

## Checking that '100)' is not valid

In [None]:
statements = 'from predictive_parser import *; print(evaluate("100)"))'

In [None]:
%%run_and_test python3 -c "$statements"

'None\n'

## Checking that '100 + ' is not valid

In [None]:
statements = 'from predictive_parser import *; print(evaluate("100 + "))'

In [None]:
%%run_and_test python3 -c "$statements"

'None\n'

## Checking that '100 + -3' is not valid

In [None]:
statements = 'from predictive_parser import *; print(evaluate("100 + -3"))'

In [None]:
%%run_and_test python3 -c "$statements"

'None\n'

## Checking that '100 // 0' is not valid

In [None]:
statements = 'from predictive_parser import *; print(evaluate("100 // 0"))'

In [None]:
%%run_and_test python3 -c "$statements"

'None\n'

## Checking that '100 / 0' raises ZeroDivisionError

In [None]:
statements = 'from predictive_parser import *; evaluate("100 / 0")'

In [None]:
%%run_and_test -e python3 -c "$statements"

'ZeroDivisionError\n'

## Checking that '((3 * 5 / (8 - (2 * 4))) + 10 / 16' is not valid

In [None]:
statements = 'from predictive_parser import *; '\
             'print(evaluate("((3 * 5 / (8 - (2 * 4))) + 10 / 16"))'

In [None]:
%%run_and_test python3 -c "$statements"

'None\n'

## Checking that '((3 * 5 / (8 - (2 * 4))) + 10) / 16' raises ZeroDivisionError

In [None]:
statements = 'from predictive_parser import *; '\
             'evaluate("((3 * 5 / (8 - (2 * 4))) + 10) / 16")'

In [None]:
%%run_and_test -e python3 -c "$statements"

'ZeroDivisionError\n'

## Checking that '1/2/3/4/0/0/0/0/5/6//7' is invalid

In [None]:
statements = 'from predictive_parser import *; '\
             'print(evaluate("1/2/3/4/0/0/0/0/5/6//7"))'

In [None]:
%%run_and_test python3 -c "$statements"

'None\n'

## Checking that '1/2/3/4/0/0/0/0/5/6/7' raises ZeroDivisionError

In [None]:
statements = 'from predictive_parser import *; '\
             'evaluate("1/2/3/4/0/0/0/0/5/6/7")'

In [None]:
%%run_and_test -e python3 -c "$statements"

'ZeroDivisionError\n'