# Day 18 - Parsing an expression

- https://adventofcode.com/2020/day/18

We are given an expression to parse and evaluate. With no precedence rules, we can effectively execute the operators as we parse them, by wrapping [`operator` module functions](https://docs.python.org/3/library/operator.html) in a [`functools.partial()`](https://docs.python.org/3/library/functools.html#functools.partial) object.
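As a minimal sketch of that idea (the names `add_one` and `double` are illustrative only, not part of the solution), binding the left operand with `partial()` leaves a callable that waits for the right operand:

```python
import operator
from functools import partial

# Bind the left operand now; the right operand is supplied later.
add_one = partial(operator.add, 1)
double = partial(operator.mul, 2)

# Calling the partial completes the operation.
total = add_one(6)    # same as operator.add(1, 6)
product = double(3)   # same as operator.mul(2, 3)
print(total, product)
```

This is what lets an operator be "half applied" the moment it is read, and finished when the next number arrives.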

I used the Python [`tokenize` module](https://docs.python.org/3/library/tokenize.html) to parse the expression into a stream of tokens, and a stack to hold intermediate results and to evaluate parenthesised expressions.
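To get a feel for what the tokenizer produces, the following sketch lists the exact token type name and string for a small expression. The filter on empty strings is my own shortcut for skipping the trailing `NEWLINE` and `ENDMARKER` tokens; it is not something the solution itself needs.

```python
from io import StringIO
from tokenize import generate_tokens, tok_name

expr = "1 + (2 * 3)"
tokens = [
    (tok_name[tok.exact_type], tok.string)
    for tok in generate_tokens(StringIO(expr).readline)
    if tok.string.strip()  # skip the trailing NEWLINE and ENDMARKER tokens
]
for name, string in tokens:
    print(name, repr(string))
```

Note that `tok.type` is the generic `OP` for every operator and parenthesis; `tok.exact_type` is what distinguishes `PLUS`, `STAR`, `LPAR` and `RPAR`.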

The top of the stack alternates between numbers and callables that take one number and produce a number. Every number token is handled by popping the current top-of-stack (TOS) callable, calling it with the number, and pushing the result; every operator token is handled by popping the TOS number and pushing the operator bound to that number. Because this requires a callable on the stack for the first number of an expression, I used [`operator.pos()`](https://docs.python.org/3/library/operator.html#operator.pos) (unary `+`) as the stack seed, as it returns numbers unchanged.

To decide what the next stack value should be for a given token, I use a dispatch map keyed on the token type. Each dispatch function is given the stack and the token string value, and its return value is pushed on the stack. E.g. `NUMBER` is a combination of converting the `n` token string to an `int()`, popping the `s` stack, calling the popped object with the integer as its only argument, and returning the result to be pushed onto the stack. In a single expression that's `s.pop()(int(n))`.

For the expression `1 + (2 * 3)`, the algorithm produces the following stack states from the input tokens:

| token         | operations             | stack                                |
| ------------- | ---------------------- | ------------------------------------ |
|               | initial stack          | `[pos]`                              |
| `NUMBER, '1'` | pop, `pos(1)`          | `[1]`                                |
| `PLUS, '+'`   | pop, `partial(add, 1)` | `[partial(add, 1)]`                  |
| `LPAR, '('`   | `pos`                  | `[partial(add, 1), pos]`             |
| `NUMBER, '2'` | pop, `pos(2)`          | `[partial(add, 1), 2]`               |
| `STAR, '*'`   | pop, `partial(mul, 2)` | `[partial(add, 1), partial(mul, 2)]` |
| `NUMBER, '3'` | pop, `mul(2, 3)`       | `[partial(add, 1), 6]`               |
| `RPAR, ')'`   | pop, pop, `add(1, 6)`  | `[7]`                                |

The final expression result is now on the top of the stack.
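The table can be replayed by hand, one statement per token. This is only an illustrative sketch using a plain list as the stack, with the steps written out instead of going through a dispatch map:

```python
import operator
from functools import partial

stack = [operator.pos]                            # initial stack: [pos]
stack.append(stack.pop()(1))                      # NUMBER '1': pos(1) -> [1]
stack.append(partial(operator.add, stack.pop()))  # PLUS: [partial(add, 1)]
stack.append(operator.pos)                        # LPAR: [partial(add, 1), pos]
stack.append(stack.pop()(2))                      # NUMBER '2': pos(2)
stack.append(partial(operator.mul, stack.pop()))  # STAR: partial(mul, 2)
stack.append(stack.pop()(3))                      # NUMBER '3': mul(2, 3)

# RPAR: pop the sub-expression result, then feed it to the callable below it.
inner = stack.pop()
stack.append(stack.pop()(inner))                  # add(1, 6)
print(stack)
```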

All in all, this lets us parse and execute these expressions in just 6 lines, plus the 5-key dispatch mapping.


In [1]:
import operator
from collections import deque
from collections.abc import Mapping
from functools import partial
from io import StringIO
from tokenize import LPAR, NUMBER, PLUS, RPAR, STAR, generate_tokens
from typing import Callable, Deque, Union, cast

PartialOp = Callable[[int], int]
StackEntry = Union[int, PartialOp]

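dispatch: Mapping[int, Callable[[Deque[StackEntry], str], StackEntry]] = {
    # "(": start a sub-expression with a fresh no-op seed.
    LPAR: lambda *_: operator.pos,
    # ")": pop the sub-expression result, then pass it to the callable below it.
    RPAR: lambda s, _: (tos := cast(int, s.pop()), cast(PartialOp, s.pop())(tos))[-1],
    # Number: call the TOS callable with the parsed integer.
    NUMBER: lambda s, n: cast(PartialOp, s.pop())(int(n)),
    # Operator: bind the TOS number as the operator's first argument.
    PLUS: lambda s, op: partial(operator.add, s.pop()),
    STAR: lambda s, op: partial(operator.mul, s.pop()),
}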


def evaluate(expr: str) -> int:
    # operator.pos() is a no-op on numbers, used as stack primer.
    stack: Deque[StackEntry] = deque([operator.pos])
    for t in generate_tokens(StringIO(expr).readline):
        if handler := dispatch.get(t.exact_type):
            stack.append(handler(stack, t.string))
    return cast(int, stack.pop())


tests = {
    "1 + 2 * 3 + 4 * 5 + 6": 71,
    "1 + (2 * 3) + (4 * (5 + 6))": 51,
    "2 * 3 + (4 * 5)": 26,
    "5 + (8 * 3 + 9 + 3 * 4 * 3)": 437,
    "5 * 9 * (7 * 3 * 3 + 9 * 3 + (8 + 6 * 4))": 12240,
    "((2 + 4 * 9) * (6 + 9 * 8 + 6) + 6) + 2 + 4 * 2": 13632,
}
for test, expected in tests.items():
    assert evaluate(test) == expected

In [2]:
import aocd

expressions = aocd.get_data(day=18, year=2020).splitlines()

In [3]:
print("Part 1:", sum(map(evaluate, expressions)))

Part 1: 45283905029161


## Part 2 - Parsing with operator precedence

Part two adds precedence rules, so now we have to reorder expression execution. We can do this by using a simple [Shunting Yard algorithm](https://en.wikipedia.org/wiki/Shunting-yard_algorithm); the shunting yard 'holds back' operators until they can be applied to the numbers kept on a separate execution stack.
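To show the "holding back" in isolation, here is a stripped-down sketch that works on plain string tokens and ignores parentheses. The function name `to_rpn` and the string-keyed precedence table are assumptions made for this illustration, not the implementation used for the puzzle.

```python
def to_rpn(tokens, precedence):
    held = []  # operators held back until their operands have been emitted
    for tok in tokens:
        if tok not in precedence:
            yield tok  # numbers go straight to the output
            continue
        # Release held operators that bind at least as tightly as this one.
        while held and precedence[held[-1]] >= precedence[tok]:
            yield held.pop()
        held.append(tok)
    yield from reversed(held)  # release the rest, most recent first


# Addition binds tighter than multiplication, as in part two.
swapped = list(to_rpn("1 + 2 * 3".split(), {"+": 1, "*": 0}))
# Conventional precedence, for comparison.
normal = list(to_rpn("1 + 2 * 3".split(), {"+": 0, "*": 1}))
print(" ".join(swapped))
print(" ".join(normal))
```

With the swapped table the `+` is released before the `*` is held, so the addition is applied first; with the conventional table the `*` is stacked on top of the `+` and comes out first.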

The result of the algorithm is a series of tokens in [Reverse Polish notation](https://en.wikipedia.org/wiki/Reverse_Polish_notation). To evaluate such a series, push each number onto a stack; for every operator, pop the top two numbers, apply the operator to them, and push the result back onto the stack.
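The evaluation step on its own can be sketched as follows, using a hand-written token list. `eval_rpn` and `OPERATORS` are hypothetical names for this example only.

```python
import operator

OPERATORS = {"+": operator.add, "*": operator.mul}


def eval_rpn(tokens):
    stack = []
    for tok in tokens:
        if tok in OPERATORS:
            # Pop the two most recent numbers and replace them with the result.
            right, left = stack.pop(), stack.pop()
            stack.append(OPERATORS[tok](left, right))
        else:
            stack.append(int(tok))
    return stack.pop()


# "1 2 + 3 *" is the Reverse Polish form of (1 + 2) * 3.
result = eval_rpn("1 2 + 3 *".split())
print(result)
```

Operand order does not matter for `+` and `*`, but popping `right` before `left` keeps the sketch correct if non-commutative operators were ever added.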


In [4]:
from tokenize import ENDMARKER, NEWLINE, OP, TokenInfo
from typing import Iterator, Tuple


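def shunting_yard(
    tokens: Iterator[TokenInfo], precedence: Mapping[int, int]
) -> Iterator[TokenInfo]:
    # An opening parenthesis gets the lowest precedence, so no operator ever
    # pops it off the stack; only a closing parenthesis removes it.
    precedence = {**precedence, LPAR: -1}
    stack: Deque[Tuple[int, TokenInfo]] = deque()
    for token in tokens:
        if token.type in (ENDMARKER, NEWLINE):
            continue
        if token.type != OP:
            # Numbers go straight to the output.
            yield token
            continue
        et = token.exact_type
        if et == RPAR:
            # Emit held-back operators up to the matching opening
            # parenthesis, which is popped and discarded.
            while stack and (tos := stack.pop()[1]).exact_type != LPAR:
                yield tos
            continue
        p = precedence[et]
        if et != LPAR:
            # Emit operators of equal or higher precedence first.
            while stack and stack[-1][0] >= p:
                yield stack.pop()[1]
        stack.append((p, token))
    # Emit whatever operators remain, most recent first.
    yield from (t for _, t in reversed(stack))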


PRECEDENCE = {PLUS: 1, STAR: 0}
OPS = {PLUS: operator.add, STAR: operator.mul}


def evaluate_swapped_precedence(expr: str) -> int:
    stack: Deque[int] = deque()
    shunted = shunting_yard(generate_tokens(StringIO(expr).readline), PRECEDENCE)
    for token in shunted:
        if oper := OPS.get(token.exact_type):
            stack.append(oper(stack.pop(), stack.pop()))
        else:
            stack.append(int(token.string))
    return stack.pop()


tests_swapped_precedence = {
    "1 + 2 * 3 + 4 * 5 + 6": 231,
    "1 + (2 * 3) + (4 * (5 + 6))": 51,
    "2 * 3 + (4 * 5)": 46,
    "5 + (8 * 3 + 9 + 3 * 4 * 3)": 1445,
    "5 * 9 * (7 * 3 * 3 + 9 * 3 + (8 + 6 * 4))": 669060,
    "((2 + 4 * 9) * (6 + 9 * 8 + 6) + 6) + 2 + 4 * 2": 23340,
}
for test, expected in tests_swapped_precedence.items():
    assert evaluate_swapped_precedence(test) == expected

In [5]:
print("Part 2:", sum(map(evaluate_swapped_precedence, expressions)))

Part 2: 216975281211165
