In [1]:
%%HTML
<style>
.container { width:100% } 
</style>

Normally, I would just write
```
%run Stack.ipynb
```
here.  As this does not work in Deepnote, I have included the implementation of the class `Stack` here.

In [2]:
class Stack:
    def __init__(self):
        self.mStackElements = []

    def push(self, e):
        self.mStackElements.append(e)

    def pop(self):
        assert len(self.mStackElements) > 0, "popping empty stack"
        self.mStackElements.pop()  # remove the last element in place instead of copying the list

    def top(self):
        assert len(self.mStackElements) > 0, "top of empty stack"
        return self.mStackElements[-1]

    def isEmpty(self):
        return self.mStackElements == []

    def copy(self):
        C = Stack()
        C.mStackElements = self.mStackElements[:]
        return C

    def __str__(self):
        C = self.copy()
        result = C._convert()
        return result

    def _convert(self):
        if self.isEmpty():
            return '|'
        t = self.top()
        self.pop()
        return self._convert() + ' ' + str(t) + ' |'

def createStack(L):
    S = Stack()
    n = len(L)
    for i in range(n):
        S.push(L[i])
    return S

# The Shunting Yard Algorithm (Operator Precedence Parsing)

In [3]:
import re

The function $\texttt{isWhiteSpace}(s)$ checks whether $s$ consists only of blanks and tab characters.

In [4]:
def isWhiteSpace(s):
    whitespace = re.compile(r'[ \t]+')
    return whitespace.fullmatch(s)

The function $\texttt{toFloat}(s)$ tries to convert the string $s$ to a floating point number.  If this works out, this number is returned.  Otherwise, the string $s$ is returned unchanged.

In [5]:
def toFloat(s):
    try:
        return float(s)   
    except ValueError:
        return s

In [6]:
toFloat('0.123')

0.123

In [7]:
toFloat('+')

'+'

The module `re` provides support for <a href='https://en.wikipedia.org/wiki/Regular_expression'>regular expressions</a>.  These are needed for
<em style="color:blue;">tokenizing</em> a string.

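The function $\texttt{tokenize}(s)$ takes a string and splits it into a list of tokens.  Whitespace is discarded and numbers are converted to floating point values.  The list is returned in reverse order, so that the first token of $s$ ends up on top of the stack built by `createStack`.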

In [8]:
def tokenize(s):
    regExp = r'''
              (?:0|[1-9][0-9]*)[.][0-9]+  | # floating point number (tried before integers)
              0|[1-9][0-9]*               | # integer
              \*\*                        | # power operator
              [-+*/()]                    | # arithmetic operators and parentheses
              [ \t]                       | # white space
              sqrt                        | # square root
              sin                         | # sine function
              cos                         | # cosine function
              tan                         | # tangent function
              asin                        | # arcus sine
              acos                        | # arcus cosine
              atan                        | # arcus tangent
              exp                         | # exponential function
              log                         | # natural logarithm
              x                           | # variable
              e                           | # Euler's number
              pi                            # π
              '''
    L = [toFloat(t) for t in re.findall(regExp, s, flags=re.VERBOSE) if not isWhiteSpace(t)]
    return list(reversed(L))

In [9]:
tokenize('x**2 - 2')

[2.0, '-', 2.0, '**', 'x']
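
Python's `re` module tries the alternatives of a pattern from left to right and takes the first one that matches, so the order of the alternatives in a tokenizer matters.  The following minimal sketch illustrates this for numbers.  The names `INT_FIRST`, `FLOAT_FIRST`, and `scan` are hypothetical and are not part of this notebook.

```python
import re

# Both patterns accept the same two kinds of numbers; only the order differs.
INT_FIRST   = r'0|[1-9][0-9]*|(?:0|[1-9][0-9]*)[.][0-9]+'
FLOAT_FIRST = r'(?:0|[1-9][0-9]*)[.][0-9]+|0|[1-9][0-9]*'

def scan(pattern, s):
    "Return all non-overlapping matches of pattern in s."
    return re.findall(pattern, s)

print(scan(INT_FIRST,   '3.14'))  # the integer alternative wins and splits the number
print(scan(FLOAT_FIRST, '3.14'))  # the floating point alternative sees the whole number
```

Placing the more specific alternative first is one way to avoid the split; whether that suits a given tokenizer depends on its grammar.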

The module `math` provides a number of mathematical functions like `exp`, `sin`, `log` etc.

In [10]:
import math

The function $\texttt{findZero}(f, a, b, n)$ takes a function $f$, two numbers $a$ and $b$, and a number of iterations $n$.  It requires that

  - $a \not= b$ and
  - $f(a)$ and $f(b)$ have opposite signs, i.e. $f(a) \cdot f(b) \leq 0$.
  
It performs $n$ bisection steps to find a number $x$ between $a$ and $b$ such that $f(x) \approx 0$.

In [11]:
def findZero(f, a, b, n):
    assert a != b    , f'a and b can\'t be the same value: {a}' 
    assert f(a) * f(b) <= 0, f'f({a}) * f({b}) > 0'
    if a > b:  # make sure that a < b holds from here on
        a, b = b, a
    if f(a) <= 0 <= f(b):
        for k in range(n):
            c = 0.5 * (a + b) 
            # print(f'f({c}) = {f(c)}, {b-a}')
            if f(c) < 0:
                a = c
            elif f(c) > 0:
                b = c
            else:
                return c
    else:
        for k in range(n):
            c = 0.5 * (a + b) 
            # print(f'f({c}) = {f(c)}, {b-a}')
            if f(c) > 0:
                a = c
            elif f(c) < 0:
                b = c
            else:
                return c
    return (a + b) / 2

In [12]:
def f(x):
    return 2 - x ** 2

In [13]:
r = findZero(f, 0, 2, 54)
r

1.414213562373095

In [14]:
r * r

1.9999999999999996
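
Each bisection step halves the interval that contains the zero, so after $n$ steps its width is $|b - a| / 2^n$.  The sketch below computes this bound for the call above.  The helper `interval_width` is hypothetical, and the idea that $n = 54$ was chosen to reach the resolution of double precision numbers is an assumption, not something the notebook states.

```python
import sys

def interval_width(a, b, n):
    "Width of the bracketing interval after n bisection steps."
    return abs(b - a) / 2 ** n

width = interval_width(0, 2, 54)
print(width)                            # 2 / 2**54, i.e. 2**-53
print(width < sys.float_info.epsilon)   # compare with the machine epsilon of float
```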

The function $\texttt{precedence}(o)$ calculates the precedence of the operator $o$.

In [16]:
def precedence(op):
    if op in ("e", "pi", "x"):
        return 5
    if op in ("sqrt", "sin", "cos", "tan", "asin", "acos", "atan", "exp", "log"):
        return 4
    if op == "**":
        return 3
    if op in ("*", "/", "%"):
        return 2
    if op in ("+", "-"):
        return 1
    assert False, f'unknown operator in precedence: {op}'
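
Two Python details are easy to get wrong in membership tests on operator tuples: parentheses alone do not create a tuple, and adjacent string literals are concatenated.  The following sketch only demonstrates these language rules; it is not part of the calculator.

```python
# ("**") is just the string "**"; a one-element tuple needs a trailing comma.
not_a_tuple = ("**")
one_tuple   = ("**",)
print(type(not_a_tuple).__name__, type(one_tuple).__name__)

# `in` on a string is a substring test, so "*" is found inside "**".
print("*" in not_a_tuple)   # substring test on a string
print("*" in one_tuple)     # element test on a tuple

# Adjacent string literals are joined: a missing comma merges two entries.
merged = ("*", "/" "%")
print(merged, "/" in merged)
```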

The function $\texttt{isUnaryOperator}(o)$ returns `True` if $o$ is a unary operator.

In [17]:
def isUnaryOperator(op):
    return op in ("sqrt", "sin", "cos", "tan", "asin", "acos", "atan", "exp", "log")

The function $\texttt{isConstOperator}(o)$ returns `True` if $o$ is a constant like `e` or `pi`. 
The variable `x` is also treated as a constant operator.

In [18]:
def isConstOperator(op):
    return op in ("e", "pi", "x")

The function $\texttt{isLeftAssociative}(o)$ returns `True` if $o$ is left associative.

In [33]:
def isLeftAssociative(op):
    if op in ("*", "/", "%", "+", "-"):
        return True
    if op in ("sqrt", "sin", "cos", "tan", "asin", "acos", "atan", "exp", "log", "**"):
        return False
    assert False, f'unknown operator in isLeftAssociative: {op}'

The function $\texttt{evalBefore}(o_1, o_2)$ receives two strings representing arithmetical operators.  It returns `True` if the operator $o_1$ should be evaluated before the operator $o_2$ in an arithmetical expression of the form $a \;\texttt{o}_1\; b \;\texttt{o}_2\; c$.  In order to determine whether $o_1$ should be evaluated before $o_2$ it uses the 
<em style="color:blue">precedence</em> and the <em style="color:blue">associativity</em> of the operators.  
Its behavior is specified by the following rules:
- $\texttt{precedence}(o_1) > \texttt{precedence}(o_2) \rightarrow \texttt{evalBefore}(\texttt{o}_1, \texttt{o}_2) = \texttt{True}$,
- $o_1 = o_2 \wedge \neg\texttt{isUnaryOperator}(o_1)\rightarrow \texttt{evalBefore}(\texttt{o}_1, \texttt{o}_2) = \texttt{isLeftAssociative}(o_1)$,
- $\texttt{isUnaryOperator}(o_1) \wedge \texttt{isUnaryOperator}(o_2) \rightarrow 
   \texttt{evalBefore}(\texttt{o}_1, \texttt{o}_2) = \texttt{False}$,
- $\texttt{precedence}(o_1) = \texttt{precedence}(o_2) \wedge o_1 \not= o_2 \wedge 
   \neg\texttt{isUnaryOperator}(o_1) \rightarrow 
   \texttt{evalBefore}(\texttt{o}_1, \texttt{o}_2) = \texttt{True}$,
- $\texttt{precedence}(o_1) < \texttt{precedence}(o_2) \rightarrow \texttt{evalBefore}(\texttt{o}_1, \texttt{o}_2) = \texttt{False}$.

In [20]:
def evalBefore(stackOp, nextOp):
    if precedence(stackOp) > precedence(nextOp):
        return True
    elif stackOp == nextOp and not isUnaryOperator(stackOp):
        return isLeftAssociative(stackOp)
    elif isUnaryOperator(stackOp) and isUnaryOperator(nextOp):
        return False
    elif precedence(stackOp) == precedence(nextOp) and stackOp != nextOp and not isUnaryOperator(stackOp):
        return True
    elif precedence(stackOp) < precedence(nextOp):
        return False
    assert False, f'incomplete case distinction in evalBefore({stackOp}, {nextOp})'
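
For binary operators alone, the rules above reduce to a comparison of precedences followed by an associativity check.  The following is a minimal sketch under that restriction: it covers only the five binary operators and deliberately ignores function symbols and constants.  The names `PRECEDENCE`, `RIGHT_ASSOCIATIVE`, and `eval_before` are hypothetical.

```python
# Simplified illustration: binary operators only.
PRECEDENCE = {'+': 1, '-': 1, '*': 2, '/': 2, '**': 3}
RIGHT_ASSOCIATIVE = {'**'}

def eval_before(stack_op, next_op):
    "Should stack_op be evaluated before next_op in  a stack_op b next_op c ?"
    if PRECEDENCE[stack_op] != PRECEDENCE[next_op]:
        return PRECEDENCE[stack_op] > PRECEDENCE[next_op]
    # Equal precedence: left associative operators are evaluated first.
    return stack_op not in RIGHT_ASSOCIATIVE

for pair in [('*', '+'), ('+', '*'), ('-', '-'), ('+', '-'), ('**', '**')]:
    print(pair, eval_before(*pair))
```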

The class `Calculator` supports four member variables:
  - the token stack `mTokens`,
  - the operator stack `mOperators`,
  - the argument stack `mArguments`,
  - the floating point number `mValue`, which is the current value of `x`.
  
The constructor takes a list of tokens `TL` and initializes the token stack with these 
tokens.

In [21]:
class Calculator:
    def __init__(self, TL, x=0):
        self.mTokens     = createStack(TL)
        self.mOperators  = Stack()
        self.mArguments  = Stack()
        self.mValue      = x

The method `__str__` is used to convert an object of class `Calculator` to a string.

In [22]:
def toString(self):
    return '\n'.join(['_'*50, 
                      'TokenStack: ' + str(self.mTokens), 
                      'Arguments:  ' + str(self.mArguments), 
                      'Operators:  ' + str(self.mOperators), 
                      '_'*50])

Calculator.__str__ = toString
del toString

The method $\texttt{evaluate}(\texttt{self})$ evaluates the expression that is given by the tokens on the token stack `mTokens`.  
There are two phases:
1. The first phase is the <em style="color:blue">reading phase</em>. In this phase
   the tokens are removed from the token stack `mTokens`.  
2. The second phase is the <em style="color:blue">evaluation phase</em>.  In this phase,
   the remaining operators on the operator stack `mOperators` are evaluated.  Note that some operators are already 
   evaluated in the *reading phase*.

We can describe what happens in the *reading phase* using 
<em style="color:blue">rewrite rules</em> that describe how the three stacks `mTokens`, `mArguments` and `mOperators`
are changed in each *step*.  Here, a *step* is one iteration of the first `while`-loop of the function `evaluate`.
The following *rewrite rules* are executed until the token stack `mTokens` is empty.
1. If the token on top of the token stack is a number, it is removed from the token stack and pushed onto the argument stack.
   The operator stack remains unchanged in this case.  
   $$\begin{array}{lc}
     \texttt{mTokens} = \texttt{mTokensRest} + [\texttt{token} ] & \wedge \\
     \texttt{isNumber}(\texttt{token}) & \Rightarrow \\[0.2cm]
     \texttt{mArguments}' = \texttt{mArguments} + [\texttt{token}] & \wedge \\
     \texttt{mTokens}' = \texttt{mTokensRest} & \wedge \\
     \texttt{mOperators}' = \texttt{mOperators}
     \end{array} 
   $$
   Here, the primed variable $\texttt{mArguments}'$ refers to the argument stack after  $\texttt{token}$
   has been pushed onto it.
   
   In the following rules we implicitly assume that the token on top of the token stack is not a number but 
   rather a parenthesis or a proper operator.  In order to be more concise, we suppress this precondition from the 
   following rewrite rules.
2. If the operator stack is empty, the next token is pushed onto the operator stack.
   $$\begin{array}{lc}
     \texttt{mTokens} = \texttt{mTokensRest} + [\texttt{op} ] & \wedge \\
     \texttt{mOperators} = [] & \Rightarrow \\[0.2cm]
     \texttt{mOperators}' = \texttt{mOperators} + [\texttt{op}] & \wedge \\
     \texttt{mTokens}' = \texttt{mTokensRest} & \wedge \\
     \texttt{mArguments}' = \texttt{mArguments} 
     \end{array} 
   $$
3. If the next token is an opening parenthesis, this parenthesis token is pushed onto the operator stack.
   $$\begin{array}{lc}
     \texttt{mTokens} = \texttt{mTokensRest} + [\texttt{'('} ] & \Rightarrow \\[0.2cm]
     \texttt{mOperators}' = \texttt{mOperators} + [\texttt{'('}] & \wedge \\
     \texttt{mTokens}' = \texttt{mTokensRest} & \wedge \\
     \texttt{mArguments}' = \texttt{mArguments} 
     \end{array} 
   $$
4. If the next token is a closing parenthesis and the operator on top of the operator stack is an opening parenthesis, then both 
   parentheses are removed.
   $$\begin{array}{lc}
     \texttt{mTokens} = \texttt{mTokensRest} + [\texttt{')'} ] & \wedge \\
     \texttt{mOperators} =\texttt{mOperatorsRest} + [\texttt{'('}]                  & \Rightarrow \\[0.2cm]
     \texttt{mOperators}' = \texttt{mOperatorsRest} & \wedge \\
     \texttt{mTokens}' = \texttt{mTokensRest} & \wedge \\
     \texttt{mArguments}' = \texttt{mArguments} 
     \end{array} 
   $$
5. If the next token is a closing parenthesis but the operator on top of the operator stack is not an opening parenthesis, 
   the operator on top of the operator stack is evaluated.  Note that the token stack is not changed in this case.
   $$\begin{array}{lc}
     \texttt{mTokens} = \texttt{mTokensRest} + [\texttt{')'} ] & \wedge \\
     \texttt{mOperators} = \texttt{mOperatorsRest} + [\texttt{op}] & \wedge \\
     \texttt{op} \not= \texttt{'('}                            & \wedge \\
     \texttt{mArguments} = \texttt{mArgumentsRest} + [\texttt{lhs}, \texttt{rhs}] & \Rightarrow \\[0.2cm]
        \texttt{mOperators}' = \texttt{mOperatorsRest} & \wedge \\
         \texttt{mTokens}' = \texttt{mTokens} & \wedge \\
         \texttt{mArguments}' = \texttt{mArgumentsRest} + [\texttt{lhs} \;\texttt{op}\; \texttt{rhs}]
     \end{array} 
   $$
   Here, the expression $\texttt{lhs} \;\texttt{op}\; \texttt{rhs}$ denotes evaluating the operator $\texttt{op}$ with the arguments
   $\texttt{lhs}$ and $\texttt{rhs}$.
6. If the token on top of the operator stack is an opening parenthesis, then the operator on top of the token stack
   is pushed onto the operator stack.
   $$\begin{array}{lc}
     \texttt{mTokens} = \texttt{mTokensRest} + [\texttt{op}] & \wedge \\
     \texttt{op} \not= \texttt{')'}                          & \wedge \\
     \texttt{mOperators} = \texttt{mOperatorsRest} + [\texttt{'('}] & \Rightarrow \\[0.2cm]
     \texttt{mOperators}' = \texttt{mOperators} + [\texttt{op}] & \wedge \\
     \texttt{mTokens}' = \texttt{mTokensRest} & \wedge \\
     \texttt{mArguments}' = \texttt{mArguments}
     \end{array} 
   $$
   
   In the remaining cases neither the token on top of the token stack nor the operator on top of the operator stack can be
   a parenthesis.  The following rules will implicitly assume that this is the case.
7. If the operator on top of the operator stack needs to be evaluated before the operator on top of the token stack,
   the operator on top of the operator stack is evaluated.
      $$\begin{array}{lc}
        \texttt{mTokens} = \texttt{mTokensRest} + [o_2]                                        & \wedge \\
        \texttt{mOperators} = \texttt{mOperatorsRest} + [o_1]                                  & \wedge \\
        \texttt{evalBefore}(o_1, o_2)                                                          & \wedge \\ 
        \texttt{mArguments} = \texttt{mArgumentsRest} + [\texttt{lhs}, \texttt{rhs}]           & \Rightarrow \\[0.2cm]
        \texttt{mOperators}' = \texttt{mOperatorsRest}                                         & \wedge \\
        \texttt{mTokens}' = \texttt{mTokens}                                                   & \wedge \\
        \texttt{mArguments}' = \texttt{mArgumentsRest} + [\texttt{lhs} \;o_1\; \texttt{rhs}]
        \end{array} 
      $$
8. Otherwise, the operator on top of the token stack is pushed onto the operator stack.
   $$\begin{array}{lc}
         \texttt{mTokens} = \texttt{mTokensRest} + [o_2]           & \wedge \\
         \texttt{mOperators} = \texttt{mOperatorsRest} + [o_1]     & \wedge \\
         \neg \texttt{evalBefore}(o_1, o_2)                        & \Rightarrow \\[0.2cm]
        \texttt{mOperators}' = \texttt{mOperators} + [o_2]         & \wedge \\
        \texttt{mTokens}' = \texttt{mTokensRest}                   & \wedge \\
        \texttt{mArguments}' = \texttt{mArguments}
      \end{array} 
    $$
   
In every step of the evaluation phase we 
- remove one operator from the operator stack, 
- remove its arguments from the argument stack, 
- evaluate the operator, and 
- push the result back on the argument stack.
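
The rewrite rules and the evaluation phase can be sketched compactly if we restrict ourselves to numbers, parentheses, and binary operators.  The following is a simplified illustration, not the implementation used in this notebook: it uses plain Python lists as stacks, and the names `OPS`, `eval_before`, `reduce_top`, and `evaluate_tokens` are hypothetical.

```python
import operator

# Binary operators only: precedence and the function that evaluates them.
OPS = {'+': (1, operator.add), '-': (1, operator.sub),
       '*': (2, operator.mul), '/': (2, operator.truediv),
       '**': (3, operator.pow)}

def eval_before(o1, o2):
    p1, p2 = OPS[o1][0], OPS[o2][0]
    return p1 > p2 or (p1 == p2 and o1 != '**')   # '**' is right associative

def reduce_top(operators, arguments):
    "Pop one operator and its two arguments, push the result."
    op  = operators.pop()
    rhs = arguments.pop()
    lhs = arguments.pop()
    arguments.append(OPS[op][1](lhs, rhs))

def evaluate_tokens(tokens):
    tokens = list(reversed(tokens))   # top of the token stack = end of the list
    operators, arguments = [], []
    while tokens:                     # reading phase
        tok = tokens.pop()
        if isinstance(tok, float):
            arguments.append(tok)                 # rule 1
        elif not operators or tok == '(':
            operators.append(tok)                 # rules 2 and 3
        elif tok == ')' and operators[-1] == '(':
            operators.pop()                       # rule 4
        elif tok == ')':
            reduce_top(operators, arguments)      # rule 5
            tokens.append(tok)                    # token stack stays unchanged
        elif operators[-1] == '(':
            operators.append(tok)                 # rule 6
        elif eval_before(operators[-1], tok):
            reduce_top(operators, arguments)      # rule 7
            tokens.append(tok)                    # token stack stays unchanged
        else:
            operators.append(tok)                 # rule 8
    while operators:                  # evaluation phase
        reduce_top(operators, arguments)
    return arguments[-1]

print(evaluate_tokens([1.0, '+', 2.0, '*', 3.0]))            # 1 + (2 * 3)
print(evaluate_tokens(['(', 1.0, '+', 2.0, ')', '*', 3.0]))  # (1 + 2) * 3
print(evaluate_tokens([2.0, '**', 3.0, '**', 2.0]))          # 2 ** (3 ** 2)
```

The sketch assumes well-formed input; unbalanced parentheses are not detected.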

In [23]:
def evaluate(self):
    while not self.mTokens.isEmpty():
        next_op = self.mTokens.top()
        self.mTokens.pop()
        
        if isinstance(next_op, float):
            self.mArguments.push(next_op)
        elif self.mOperators.isEmpty():
            self.mOperators.push(next_op)
        elif next_op == '(':
            self.mOperators.push(next_op)
        elif next_op == ')' and self.mOperators.top() == '(':
            self.mOperators.pop()
        elif next_op == ')' and self.mOperators.top() != '(':
            self.popAndEvaluate()
            self.mTokens.push(next_op)
        elif next_op != ')' and self.mOperators.top() == '(':
            self.mOperators.push(next_op)
        elif evalBefore(self.mOperators.top(), next_op):
            self.mTokens.push(next_op)
            self.popAndEvaluate()
        else:
            self.mOperators.push(next_op)
    while not self.mOperators.isEmpty():
        self.popAndEvaluate()
    return self.mArguments.top()
    
Calculator.evaluate = evaluate
del evaluate

The method $\texttt{popAndEvaluate}(\texttt{self})$ removes an operator from the operator stack and removes the corresponding arguments from the 
arguments stack.  It evaluates the operator and pushes the result on the argument stack.

In [24]:
def popAndEvaluate(self):
    result = None
    op = self.mOperators.top(); self.mOperators.pop()
    if isConstOperator(op):
        if op == 'pi':
            result = math.pi
        elif op == 'e':
            result = math.e
        elif op == 'x':
            result = self.mValue
    elif isUnaryOperator(op):
        arg = self.mArguments.top(); self.mArguments.pop()
        if op == "sqrt":
            result = math.sqrt(arg)
        elif op == "sin":
            result = math.sin(arg)
        elif op == "cos":
            result = math.cos(arg)
        elif op == "tan":
            result = math.tan(arg)
        elif op == "asin":
            result = math.asin(arg)
        elif op == "acos":
            result = math.acos(arg)
        elif op == "atan":
            result = math.atan(arg)
        elif op == "exp":
            result = math.exp(arg)
        elif op == "log":
            result = math.log(arg)
    else:
        rhs = self.mArguments.top(); self.mArguments.pop()
        lhs = self.mArguments.top(); self.mArguments.pop()
        if op == '+':
            result = lhs + rhs
        elif op == '-':
            result = lhs - rhs
        elif op == '*':
            result = lhs * rhs
        elif op == '/':
            result = lhs / rhs
        elif op == '**':
            result = lhs ** rhs

    assert result is not None, f'ERROR: *** Unknown Operator *** "{op}"'
    self.mArguments.push(result)
Calculator.popAndEvaluate = popAndEvaluate
del popAndEvaluate

The function `testEvaluateExpr` takes three arguments:
- `s` is a string that can be interpreted as an arithmetic expression.
  This string might contain the variable $x$. In this arithmetic expression,
  unary function symbols need not be followed by parentheses.
- `t` is a string that contains an arithmetic expression. The syntax
  of this expression has to follow the rules of the programming language
  Python.
- `x` is a floating point value.  This value is supposed to be the value of
  the variable $x$ that might occur in `s` and `t`. 

The function evaluates `s` using the class `Calculator`, while `t` is evaluated 
using the predefined function `eval`.  If the results differ, an `AssertionError` is raised. 

In [25]:
def testEvaluateExpr(s, t, x):
    TL = tokenize(s)
    C = Calculator(TL, x)
    r1 = C.evaluate()
    r2 = eval(t, { 'math': math }, { 'x': x })
    assert r1 == r2, f'{r1} != {r2}'
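
Comparing floating point results with `==` only succeeds if both computations round in exactly the same way.  As a possible alternative, the sketch below compares two results up to a tolerance using `math.isclose`.  The helper `results_agree` and the tolerances `1e-9` and `1e-12` are assumptions and not part of this notebook.

```python
import math

def results_agree(r1, r2, rel_tol=1e-9, abs_tol=1e-12):
    "True if r1 and r2 agree up to the given relative or absolute tolerance."
    return math.isclose(r1, r2, rel_tol=rel_tol, abs_tol=abs_tol)

print(results_agree(0.1 + 0.2, 0.3))   # differs from 0.3 only by rounding
print(results_agree(1.0, 1.1))         # a genuine difference
```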

In [26]:
testEvaluateExpr('sin cos x', 'math.sin(math.cos(x))', 0)

In [27]:
testEvaluateExpr('sin x**2', 'math.sin(x)**2', math.pi)

In [28]:
testEvaluateExpr('log e ** x + 1 * 2 - 3', 'math.log(math.e**x) + 1 * 2 - 3 ', 1)

The function `computeZero` takes three arguments:
* `s` is a string that can be interpreted as a function $f$ of the variable `x`.
  For example, `s` could be equal to `'x * x - 2.0'`.
* `left` and `right` are floating point numbers.

It is required that the function $f$ changes sign in the interval $[\texttt{left}, \texttt{right}]$.
Then `computeZero` returns a floating point value $x_0$ such that $f(x_0) \approx 0$.

In [29]:
def computeZero(s, left, right):
    def f(x):
        c = Calculator(tokenize(s), x)
        return c.evaluate()

    return findZero(f, left, right, 54)

The cell below should output the number `1.414213562373095`, which approximates $\sqrt{2}$.

In [30]:
computeZero('x ** 2 - 2', 0, 2)

1.414213562373095
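
The same bisection idea works for any continuous function that changes sign on the given interval.  As an illustration, the function $\cos(x) - x$ has a zero near $0.739085$ in the interval $[0, 1]$.  The sketch below is self-contained and does not use the class `Calculator`; the helper `bisect` and the default of 60 steps are assumptions.

```python
import math

def bisect(f, a, b, n=60):
    "Bisection; assumes that f(a) and f(b) have opposite signs."
    fa = f(a)
    for _ in range(n):
        c  = 0.5 * (a + b)
        fc = f(c)
        if fc == 0:
            return c
        if (fa < 0) == (fc < 0):   # same sign as f(a): the zero lies between c and b
            a, fa = c, fc
        else:                      # sign change between a and c
            b = c
    return 0.5 * (a + b)

root = bisect(lambda x: math.cos(x) - x, 0, 1)
print(root)   # close to 0.739085
```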

In [32]:
testEvaluateExpr('sin sin x', 'math.sin(math.sin(x))', 1)