The parser implemented in the function `parse` parses a regular
expression according to the following EBNF grammar.
```
   regExp  -> product ("+" product)*
   product -> factor factor*
   factor  -> atom "*"?
   atom    -> "(" regExp ")" | CHAR | "" | 0
```

In [1]:
import re

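The function $\texttt{isWhiteSpace}(s)$ checks whether $s$ is a non-empty string that contains only blanks, tabs, and newlines.

In [2]:
def isWhiteSpace(s):
    # Newlines are included because tokenize also emits them as whitespace tokens.
    whitespace = re.compile(r'[ \t\n]+')
    return whitespace.fullmatch(s) is not None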
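
The check relies on `fullmatch` rather than `match`. The following minimal sketch illustrates the difference; the name `blank` and its blank-and-tab pattern are assumptions chosen for illustration only.

```python
import re

# Hypothetical pattern for illustration: one or more blanks or tabs.
blank = re.compile(r'[ \t]+')

# fullmatch succeeds only if the entire string consists of blanks and tabs.
print(blank.fullmatch(' \t ') is not None)  # True
print(blank.fullmatch(' a ') is not None)   # False

# match only anchors at the start, so a leading blank is enough.
print(blank.match(' a ') is not None)       # True

# The quantifier + requires at least one character, so '' is rejected.
print(blank.fullmatch('') is not None)      # False
```

With `match` instead of `fullmatch`, a string that merely starts with a blank would be classified as whitespace.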

The function `tokenize(s)` partitions the string `s` into a list of tokens.
It recognizes the operator symbols "+" and "*", the parentheses "(" and ")", single letters,
the token `0` (the empty regular expression), and the token `""` (epsilon, the empty string).
Whitespace is skipped.

In [3]:
def tokenize(s):
    regExp = r'''
              [+*()]   |  # operators and parentheses
              [ \t\n]  |  # white space
              [a-zA-Z] |  # characters
              0        |  # empty regular expression
              ""          # epsilon
              '''
    return [t for t in re.findall(regExp, s, flags=re.VERBOSE) if not isWhiteSpace(t)]
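
To see what the tokenizer produces, here is a self-contained sketch. The function `sketch_tokenize` is a hypothetical stand-in that uses the same token pattern but filters whitespace with `str.isspace`, so that it runs on its own.

```python
import re

def sketch_tokenize(s):
    """Hypothetical stand-in for tokenize, for illustration only."""
    pattern = r'''
              [+*()]   |  # operators and parentheses
              [ \t\n]  |  # white space
              [a-zA-Z] |  # characters
              0        |  # empty regular expression
              ""          # epsilon
              '''
    # re.findall returns every non-overlapping match, scanning left to right.
    return [t for t in re.findall(pattern, s, flags=re.VERBOSE) if not t.isspace()]

print(sketch_tokenize('(a + b)* c'))  # ['(', 'a', '+', 'b', ')', '*', 'c']
print(sketch_tokenize('a""0'))        # ['a', '""', '0']
print(sketch_tokenize('a1b'))         # ['a', 'b']
```

The last call shows a property of `re.findall` worth keeping in mind: characters that match none of the alternatives, such as `1`, are silently dropped rather than reported as errors.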

In [4]:
def parse(s):
    # Tokenize the input, parse it, and make sure that no tokens are left over.
    TokenList = tokenize(s)
    regExp, Rest = parseRegExp(TokenList)
    assert Rest == [], f'Parse Error: could not parse {TokenList}'
    return regExp

The function `parseRegExp` takes a token list `TokenList` and tries to interpret this list
as a regular expression.  It returns the regular expression in the form of a nested tuple and
a list of those tokens that could not be parsed.

In [5]:
def parseRegExp(TokenList):
    result, Rest = parseProduct(TokenList)
    while len(Rest) > 1 and Rest[0] == '+':
        arg, Rest = parseProduct(Rest[1:])
        result = ('or', result, arg)
    return result, Rest
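
The nested tuples use the tags `'or'`, `'cat'`, and `'star'`, the integer `0` for the empty regular expression, and `''` for epsilon. As a sketch of how to read this representation, the hypothetical helper `to_string` below (not part of the parser) renders such a tuple back into a fully parenthesized string.

```python
def to_string(r):
    """Render a nested-tuple regular expression as text (hypothetical helper)."""
    if r == 0:                      # the empty regular expression
        return '0'
    if r == '':                     # epsilon
        return '""'
    if isinstance(r, str):          # a single character
        return r
    if r[0] == 'star':
        inner = to_string(r[1])
        # Parenthesize a starred argument that is itself a star.
        if isinstance(r[1], tuple) and r[1][0] == 'star':
            inner = f'({inner})'
        return inner + '*'
    if r[0] == 'cat':
        return f'({to_string(r[1])}{to_string(r[2])})'
    if r[0] == 'or':
        return f'({to_string(r[1])}+{to_string(r[2])})'
    raise ValueError(f'unknown regular expression: {r!r}')

# The tuple one would expect for the input "(a+b)*c".
ast = ('cat', ('star', ('or', 'a', 'b')), 'c')
print(to_string(ast))  # ((a+b)*c)
```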

In [6]:
def parseProduct(TokenList):
    result, Rest = parseFactor(TokenList)
    while len(Rest) > 0 and Rest[0] not in ["+", "*", ")"]:
        arg, Rest = parseFactor(Rest)
        result = ('cat', result, arg)
    return result, Rest
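
Both `parseRegExp` and `parseProduct` build their result inside a `while` loop, which makes the resulting tuples nest to the left. The sketch below isolates this pattern; `left_nest` is a hypothetical helper that works on a list of already parsed operands instead of tokens.

```python
def left_nest(op, operands):
    """Combine operands left to right, as the parser loops do (hypothetical helper)."""
    result = operands[0]
    for arg in operands[1:]:
        # Each new operand wraps everything parsed so far.
        result = (op, result, arg)
    return result

print(left_nest('cat', ['a', 'b', 'c']))  # ('cat', ('cat', 'a', 'b'), 'c')
print(left_nest('or', ['a']))             # a
```

Under this scheme, an input such as `abc` is grouped as `(ab)c`, and a single operand is returned unchanged.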

In [7]:
def parseFactor(TokenList):
    atom, Rest = parseAtom(TokenList)
    if len(Rest) > 0 and Rest[0] == "*":
        return ('star', atom), Rest[1:]
    return atom, Rest

In [8]:
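def parseAtom(TokenList):
    # An atom needs at least one token; fail with a clear message otherwise.
    assert len(TokenList) > 0, 'Parse Error: unexpected end of input'
    if TokenList[0] == '0':       # the empty regular expression
        return 0, TokenList[1:]
    if TokenList[0] == '(':       # parenthesized regular expression
        regExp, Rest = parseRegExp(TokenList[1:])
        assert len(Rest) > 0 and Rest[0] == ")", 'Parse Error: missing ")"'
        return regExp, Rest[1:]
    if TokenList[0] == '""':      # epsilon, represented as the empty string
        return '', TokenList[1:]
    s = TokenList[0]
    # Operators and closing parentheses cannot start an atom.
    assert len(s) <= 1 and s not in ["+", "*", ")"], f'Parse Error: {TokenList}'
    return s, TokenList[1:]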