# Advent of Code 2020

This notebook contains my (somewhat documented) solutions to Advent of Code 2020. For
each day, I've tried to summarize the ideas behind the solutions and to write understandable
and explicit code with type hints, so it's easier to follow what's happening. My
other goal has been optimization: most of the solutions should run in under one second
on a decent, modern laptop.

If you want to tinker with the solutions, you can install the needed libraries
with either [poetry](https://python-poetry.org/) or just plain `pip install .`. Some
extra setup is also needed for the awesome [advent-of-code-data](
https://pypi.org/project/advent-of-code-data/) library, which I'm using to fetch the
puzzle input. See the project docs for more information.

TODO: d20, d22, d23, d24

## Setup and shared functions

This first block imports all libraries, defines all helper functions and
creates all types needed in the notebook. The helpers are mainly for parsing data into
different forms.

In [1]:
# Load black for automatic code formatting
%load_ext nb_black

# Load libraries
import re
import math
import itertools
import functools
import networkx as nx
import aocd
import numpy as np
import numba as nb

from collections import defaultdict, deque, Counter
from typing import (
    Iterable,
    Callable,
    Optional,
    Literal,
    Iterator,
    Tuple,
    Set,
    List,
    Dict,
    Union,
    Deque,
)
from dataclasses import dataclass
from sympy.ntheory.modular import crt
from sympy.ntheory import discrete_log

Vector = Tuple[int, int]
Mode = Literal["first", "second"]
Solution = Tuple[Union[int, str], Union[int, str]]


def data(day: int) -> str:
    """Get the day's input data as one string."""
    return aocd.get_data(day=day, year=2020)


def data_to_nums(day: int) -> List[int]:
    """Get the day's input data as a list of numbers."""
    return [int(x) for x in re.findall(r"(\d+)", aocd.get_data(day=day, year=2020))]


def data_to_lines(day: int) -> List[str]:
    """Get the day's input data as a list of lines."""
    return aocd.get_data(day=day, year=2020).splitlines()


def data_to_blocks(day: int) -> List[str]:
    """Get the day's input data as blocks that are separated by empty lines."""
    return aocd.get_data(day=day, year=2020).split("\n\n")


def run_day(day_number: int) -> Solution:
    """Return solutions for given day."""
    day_func = globals()[f"day_{day_number}"]
    return day_func()


def print_day(day_number: int) -> None:
    """Print solutions for given day."""
    first, second = run_day(day_number)
    print("Part 1:", first)
    print("Part 2:", second)


## Day 1

**Task**: find the entries that sum to 2020.

The input is quite small, just a hundred numbers, so brute-forcing through every possible
combination is fast enough.

In [2]:
def day_1() -> Solution:
    numbers = data_to_nums(day=1)

    def solve(goal: int, number_of_entries: int) -> int:
        """Find first tuple that has sum equivalent to goal."""
        return math.prod(
            next(
                combination
                for combination in itertools.combinations(numbers, number_of_entries)
                if sum(combination) == goal
            )
        )

    first = solve(goal=2020, number_of_entries=2)
    second = solve(goal=2020, number_of_entries=3)

    return first, second


In [3]:
print_day(1)

Part 1: 1010299
Part 2: 42140160



## Day 2

**Task**: count the passwords that match their corresponding rule.

Classic AoC-style input format! Luckily, 
[regex](https://en.wikipedia.org/wiki/Regular_expression) is here to help.

For the first part, the number of occurrences of the char in the string is needed. The
second part is handled with [exclusive or (XOR)](https://en.wikipedia.org/wiki/Exclusive_or),
since the char at exactly one of the two positions, but not both, must equal the char in
the rule.
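
As a quick illustration, here is a minimal sketch of both checks run on the three example lines from the puzzle description. The `check` helper and the `RULE` name are made up for this sketch and are not part of the solution cell.

```python
import re
from typing import Tuple

RULE = re.compile(r"(\d+)-(\d+) ([a-z]): ([a-z]+)")


def check(line: str) -> Tuple[bool, bool]:
    """Return (valid by count, valid by position) for one input line."""
    low, high, char, password = RULE.match(line).groups()
    low, high = int(low), int(high)
    by_count = low <= password.count(char) <= high
    # XOR: exactly one of the two 1-indexed positions may hold the char
    by_position = (password[low - 1] == char) ^ (password[high - 1] == char)
    return by_count, by_position


print(check("1-3 a: abcde"))  # (True, True)
print(check("1-3 b: cdefg"))  # (False, False)
print(check("2-9 c: ccccccccc"))  # (True, False)
```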

In [4]:
def day_2() -> Solution:
    lines = data_to_lines(day=2)

    regex = re.compile(r"(\d+)-(\d+) ([a-z]): ([a-z]+)")
    first, second = 0, 0

    for line in lines:
        low, high, char, string = regex.match(line).groups()
        low, high = int(low), int(high)
        if low <= string.count(char) <= high:
            first += 1
        if (string[low - 1] == char) ^ (string[high - 1] == char):
            second += 1

    return first, second


In [5]:
print_day(2)

Part 1: 524
Part 2: 485



## Day 3

**Task**: count the number of trees encountered while traversing the map.

Instead of really traversing the whole map, a generator can be used.
The `count_trees` function zips `range` (rows) and `itertools.count` (columns) into 
tuples to create all traversed points. 
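
To see what the zipped iterables produce, here is a small sketch that lists the visited points for a tiny map. The `traversed_points` helper is a hypothetical name used only for this illustration.

```python
import itertools
from typing import List, Tuple


def traversed_points(height: int, right: int, down: int) -> List[Tuple[int, int]]:
    """List the (y, x) points visited before falling off the bottom of the map."""
    rows = range(0, height, down)
    columns = itertools.count(start=0, step=right)
    # zip stops as soon as the finite `rows` iterable is exhausted
    return list(zip(rows, columns))


print(traversed_points(height=5, right=3, down=1))
# [(0, 0), (1, 3), (2, 6), (3, 9), (4, 12)]
```

The x coordinates grow without bound, which is why the solution wraps them with `x % width`.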

In [6]:
def day_3() -> Solution:
    grid = data_to_lines(day=3)
    height, width = len(grid), len(grid[0])

    def count_trees(slope: Vector) -> int:
        """Get sum of points where tree is encountered."""
        right, down = slope
        return sum(
            grid[y][x % width] == "#"
            for y, x in zip(
                range(0, height, down), itertools.count(start=0, step=right)
            )
        )

    slopes = [(3, 1), (1, 1), (5, 1), (7, 1), (1, 2)]
    counts = [count_trees(slope=slope) for slope in slopes]

    return counts[0], math.prod(counts)


In [7]:
print_day(3)

Part 1: 280
Part 2: 4355551200



## Day 4

**Task**: count the number of valid passports

In the first part, it's enough to check that all ids in validators are present in a
passport. For the second part, all validator lambdas are also executed against passport
values.

In [8]:
Validator = Callable[[str], bool]


def day_4() -> Solution:
    passports = data_to_blocks(day=4)

    validators: Dict[str, Validator] = {
        "byr": lambda v: 1920 <= int(v) <= 2002,
        "iyr": lambda v: 2010 <= int(v) <= 2020,
        "eyr": lambda v: 2020 <= int(v) <= 2030,
        "hgt": lambda v: (
            "cm" in v
            and 150 <= int(v[:-2]) <= 193
            or "in" in v
            and 59 <= int(v[:-2]) <= 76
        ),
        "hcl": lambda v: bool(re.fullmatch(r"^#[0-9a-f]{6}$", v)),
        "ecl": lambda v: v in ["amb", "blu", "brn", "gry", "grn", "hzl", "oth"],
        "pid": lambda v: bool(re.fullmatch(r"^[0-9]{9}$", v)),
    }
    regex = re.compile(r"(\w+):(\S+)")
    first, second = 0, 0

    for passport in passports:
        fields = dict(regex.findall(passport))
        if set(validators).issubset(set(fields)):
            first += 1
            second += all(
                validate(fields[field_id]) for field_id, validate in validators.items()
            )

    return first, second


In [9]:
print_day(4)

Part 1: 192
Part 2: 101



## Day 5

**Task**: Find out the highest seat ID and your own seat ID.

There's a nice catch in the puzzle: char pairs F/L and B/R aren't really different,
since each can be substituted with 0 or 1. The result is a [binary number
](https://en.wikipedia.org/wiki/Binary_number) string which can then be cast to int.

In the second part, a generator expression finds the first seat id that is missing
from the list while both of its neighbouring ids are present.
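
A minimal sketch of the translation trick, using boarding passes from the puzzle description. The `seat_id` helper and the `TABLE` name are introduced here only for illustration.

```python
TABLE = str.maketrans("FBLR", "0101")


def seat_id(boarding_pass: str) -> int:
    """Translate F/L to 0 and B/R to 1, then parse the result as binary."""
    return int(boarding_pass.translate(TABLE), base=2)


print(seat_id("FBFBBFFRLR"))  # 357
```

No separate row and column arithmetic is needed, because `row * 8 + column` is exactly the value of the ten-bit number.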

In [10]:
def day_5() -> Solution:
    lines = data_to_lines(day=5)

    table = str.maketrans("FBLR", "0101")
    seat_ids = {int(line.translate(table), base=2) for line in lines}

    first = max(seat_ids)
    second = next(
        seat
        for seat in range(1, max(seat_ids))
        if seat not in seat_ids and seat - 1 in seat_ids and seat + 1 in seat_ids
    )

    return first, second


In [11]:
print_day(5)

Part 1: 858
Part 2: 557



## Day 6

**Task**: Count the number of questions to which anyone/everyone answered "yes".

In the first part, the question is basically "how many unique letters are there
between empty lines?". In the second part, the question is "which letters are present on all
lines between empty lines?", which can be handled with [set intersections
](https://en.wikipedia.org/wiki/Intersection_(set_theory)).
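
The two questions map directly to set union and set intersection. Here is a small sketch for a single group; `count_answers` is a hypothetical helper, not the function used in the solution cell.

```python
from typing import Tuple


def count_answers(group: str) -> Tuple[int, int]:
    """Return (questions anyone answered, questions everyone answered)."""
    people = [set(person) for person in group.splitlines()]
    anyone = set.union(*people)
    everyone = set.intersection(*people)
    return len(anyone), len(everyone)


print(count_answers("ab\nac"))  # (3, 1)
```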

In [12]:
def day_6() -> Solution:
    groups = data_to_blocks(day=6)

    first, second = 0, 0

    for answers in groups:
        answer_ids = {answer for answer in answers if answer.isalpha()}
        first += len(answer_ids)
        for person in answers.split("\n"):
            answer_ids &= set(person)
        second += len(answer_ids)

    return first, second


In [13]:
print_day(6)

Part 1: 6532
Part 2: 3427



## Day 7

**Task**: Which bags can contain "shiny gold" bag / how many bags does "shiny gold" bag
contain?

A bit trickier one! One helpful observation is that the rules define a graph, a
[directed acyclic graph](https://en.wikipedia.org/wiki/Directed_acyclic_graph) (DAG) 
to be precise. So, the excellent network library [NetworkX](https://networkx.org/) can
be used.

Initially, the graph contains weighted edges from each container bag to its content bags. The first 
part is solved by turning the edges around to answer the question "which bags can contain
this bag". This way it's easy to walk through the graph from the "shiny gold" node.

In the second part, [depth-first search](https://en.wikipedia.org/wiki/Depth-first_search)
is used to sum bag amounts from the innermost bags until "shiny gold" is reached.
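
The recursive count can be sketched without NetworkX on a hand-written rule set. The `RULES` dictionary below is a hypothetical stand-in for the parsed graph (the numbers follow the puzzle's first example), and `bags_inside` is a slightly different formulation from the `dfs` function in the solution cell.

```python
from typing import Dict

# Hypothetical hand-written rules: container -> {content: amount}
RULES: Dict[str, Dict[str, int]] = {
    "shiny gold": {"dark olive": 1, "vibrant plum": 2},
    "dark olive": {"faded blue": 3, "dotted black": 4},
    "vibrant plum": {"faded blue": 5, "dotted black": 6},
    "faded blue": {},
    "dotted black": {},
}


def bags_inside(bag: str) -> int:
    """Count every bag nested inside `bag`, not counting `bag` itself."""
    return sum(
        amount * (1 + bags_inside(content))
        for content, amount in RULES[bag].items()
    )


print(bags_inside("shiny gold"))  # 32
```

Because the rules form a DAG, the recursion always terminates at the bags that contain nothing.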

In [14]:
def day_7() -> Solution:
    rules = data_to_lines(day=7)

    def dfs(bag: str) -> int:
        """Get amount of bags inside this bag."""
        return (
            sum(dfs(content) * graph[bag][content]["weight"] for content in graph[bag])
        ) + 1

    graph = nx.DiGraph()
    regex = re.compile(r"(\d+) ([\w\s]+(?= ))")

    for rule in rules:
        container, content = rule.split(" bags contain ")
        graph.add_weighted_edges_from(
            ((container, bag, int(amount)) for amount, bag in regex.findall(content))
        )

    first = len(nx.dag.descendants(graph.reverse(), "shiny gold"))
    second = dfs("shiny gold") - 1

    return first, second


In [15]:
print_day(7)

Part 1: 372
Part 2: 8015



## Day 8

**Task**: Where does the given assembly-like program start to loop? / Change one instruction
to make the program terminate successfully.

This reminded me of [Intcode computer](https://adventofcode.com/2019/day/2) tasks from
AoC 2019, so I decided to just simulate the given program.

In the first part, the loop breaks once some instruction is visited for the second time. In
the second part, a generator function yields candidate tapes in which a single
"jmp" or "nop" is swapped, until one that terminates is found.

In [16]:
Instruction = Tuple[str, int]
Tape = List[Instruction]


def day_8() -> Solution:
    instructions = re.findall(r"(\w+) ([+-]\d+)", data(day=8))

    @dataclass
    class Comp:
        tape: Tape
        accumulator: int = 0
        head: int = 0

        def run(self, mode: Mode = "first") -> Optional[int]:
            seen: Set[int] = set()
            while self.head not in seen:
                seen.add(self.head)
                try:
                    op, arg = self.tape[self.head]
                except IndexError:
                    return self.accumulator
                if op == "acc":
                    self.accumulator += arg
                    self.head += 1
                elif op == "jmp":
                    self.head += arg
                elif op == "nop":
                    self.head += 1
            if mode == "first":
                return self.accumulator

    def potential_tapes(broken_tape: Tape) -> Iterator[Tape]:
        """Yield potential tape, where one "jmp" or "nop" is replaced."""
        replace = {"jmp": "nop", "nop": "jmp"}
        for i, (op, arg) in enumerate(broken_tape):
            if op in replace:
                yield [*broken_tape[:i], (replace[op], arg), *broken_tape[i + 1 :]]

    tape = [(op, int(arg)) for op, arg in instructions]
    potential_comps = (
        Comp(potential_tape).run(mode="second")
        for potential_tape in potential_tapes(tape)
    )
    first = Comp(tape).run()
    second = next(result for result in potential_comps if result is not None)

    return first, second


In [17]:
print_day(8)

Part 1: 2034
Part 2: 672



## Day 9

**Task**: What is the first number that isn't the sum of any two of the 25
numbers before it? / Find a contiguous range of numbers that sums up to that first
number.

Time to brute-force again! In the first part, `sum_in_interval` checks if the number at
index $i$ is the sum of any pair of the previous 25 numbers. The number at the first index
that fails this check is the answer.

In the second part I decided to put in a little effort to produce an $O(n)$ solution 
(well, almost: calculating the sum of the window isn't $O(1)$).
The solution uses a [sliding window](
https://www.geeksforgeeks.org/window-sliding-technique/) so that each
number is added to and removed from the contiguous range at most once.
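
For comparison, here is a sketch of a sliding window that keeps a running total, so no sum is recomputed. It is not the code used in the solution cell, it assumes all numbers are positive, and `find_range` is a name chosen for this illustration. The sample data is the example from the puzzle description.

```python
from collections import deque
from typing import Deque, List, Optional


def find_range(numbers: List[int], goal: int) -> Optional[List[int]]:
    """Return the first contiguous run of at least two numbers summing to goal."""
    window: Deque[int] = deque()
    total = 0
    for num in numbers:
        window.append(num)
        total += num
        # Shrink from the left while the window overshoots the goal
        while total > goal and window:
            total -= window.popleft()
        if total == goal and len(window) >= 2:
            return list(window)
    return None


example = [35, 20, 15, 25, 47, 40, 62, 55, 65, 95, 102, 117, 150, 182, 127, 219]
found = find_range(example, goal=127)
print(found, min(found) + max(found))  # [15, 25, 47, 40] 62
```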

In [18]:
def day_9() -> Solution:
    numbers = data_to_nums(day=9)

    def sum_in_interval(start: int, goal: int) -> bool:
        """Find out if number at goal is sum of any two numbers in interval."""
        return any(
            sum(combination) == numbers[goal]
            for combination in itertools.combinations(numbers[start:goal], 2)
        )

    def find_sum(goal: int) -> Optional[int]:
        """Find a first contiguous set that sums up to goal."""
        partial = deque()
        for num in numbers:
            if sum(partial) < goal:
                partial.append(num)
            if sum(partial) == goal and len(partial) >= 2:
                return min(partial) + max(partial)
            while sum(partial) > goal:
                partial.popleft()

    n, offset = len(numbers), 25
    invalid_number_index = next(
        i for i in range(offset, n) if not sum_in_interval(start=i - offset, goal=i)
    )

    first = numbers[invalid_number_index]
    second = find_sum(goal=first)

    return first, second


In [19]:
print_day(9)

Part 1: 542529149
Part 2: 75678618



## Day 10

**Task**: Count differences between numbers / count the total number of ways to arrange
numbers up to the `n`th number.

It helps here to sort the given joltage array first. After that, both parts can be
solved in $O(n)$ by iterating through the sorted array. 

For the first part, differences between consecutive numbers are calculated. The 
second part can be solved with 
[dynamic programming](https://en.wikipedia.org/wiki/Dynamic_programming): since
an adapter with $k$ joltages can be followed by an adapter with $k+1$, $k+2$ or $k+3$
joltages, the number of ways to reach some $k$ is the sum of the ways to reach $k-1$,
$k-2$ and $k-3$. The answer is the number of ways to reach the adapter with the largest joltage.
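
The recurrence can be tried on the small adapter list from the puzzle description. This is a minimal sketch; `count_arrangements` is a hypothetical helper that isolates the second part.

```python
from collections import defaultdict
from typing import Dict, List


def count_arrangements(adapters: List[int]) -> int:
    """Count the ways to chain adapters from the outlet (0) to the largest one."""
    ways: Dict[int, int] = defaultdict(int)
    ways[0] = 1  # the charging outlet
    for joltage in sorted(adapters):
        # Missing joltages default to 0 ways, so they contribute nothing
        ways[joltage] = ways[joltage - 1] + ways[joltage - 2] + ways[joltage - 3]
    return ways[max(adapters)]


print(count_arrangements([16, 10, 15, 5, 1, 11, 7, 19, 6, 12, 4]))  # 8
```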

In [20]:
def day_10() -> Solution:
    joltages = sorted(data_to_nums(day=10))

    ways, diffs = defaultdict(int), defaultdict(int)
    prev = 0
    ways[0] = 1
    for joltage in joltages:
        ways[joltage] = sum(ways[joltage - i] for i in [1, 2, 3])
        diffs[joltage - prev] += 1
        prev = joltage
    diffs[3] += 1

    first = diffs[1] * diffs[3]
    second = ways[joltages[-1]]

    return first, second


In [21]:
print_day(10)

Part 1: 1656
Part 2: 56693912375296



## Day 11

**Task**: Run [cellular automaton](https://en.wikipedia.org/wiki/Cellular_automaton)
until stable state is reached.

Game of Life-style problems have been a recurring puzzle theme in previous years, so it was nice to see one in 2020 too. The basic idea is quite simple: create new generations based on the previous generation and a set of rules until two consecutive generations are exactly the same.

An efficient solution needs two things: a fast way to update whether a seat is occupied or empty and a fast way to get neighbours, especially in the second part. With pure Python, the fastest solution I could get ran in around 3 seconds, so I'll let this be the slowest solution so far :)

In [22]:
def day_11() -> Solution:
    lines = data_to_lines(day=11)

    class SeatingSystem:
        def __init__(self, limit: int, mode: Mode) -> None:
            self.mode = mode
            self.limit = limit
            self.occupied = self.grid_to_dict(char="L", indicator=False)
            self.floor = self.grid_to_dict(char=".", indicator=True)
            self.neighbours = self.get_neighbours()

        @staticmethod
        def grid_to_dict(char: str, indicator: bool) -> Dict[complex, bool]:
            h, w = len(lines), len(lines[0])
            return {
                complex(x, y): indicator
                for y in range(h)
                for x in range(w)
                if lines[y][x] == char
            }

        def get_neighbours(self) -> Dict[complex, List[complex]]:
            directions = [
                complex(x, y)
                for x, y in itertools.product((-1, 0, 1), repeat=2)
                if (x, y) != (0, 0)
            ]
            return {
                seat: [
                    neighbour
                    for direction in directions
                    if (neighbour := self.get_neighbour(seat=seat, direction=direction))
                    in self.occupied
                ]
                for seat in self.occupied.keys()
            }

        def get_neighbour(self, seat: complex, direction: complex) -> Optional[complex]:
            if self.mode == "second":
                return next(
                    point
                    for multiplier in itertools.count(start=1)
                    if (point := seat + multiplier * direction) not in self.floor
                )
            return seat + direction

        def run(self):
            while True:
                next_occupied = self.next_generation()
                if next_occupied == self.occupied:
                    return sum(self.occupied.values())
                self.occupied = next_occupied

        def next_generation(self):
            return {seat: self.becomes_occupied(seat) for seat in self.occupied}

        def becomes_occupied(self, seat: complex) -> bool:
            count = sum(self.occupied[neighbour] for neighbour in self.neighbours[seat])
            currently_occupied = self.occupied[seat]
            if currently_occupied and count >= self.limit:
                return False
            if not currently_occupied and count == 0:
                return True
            return currently_occupied

    first = SeatingSystem(limit=4, mode="first").run()
    second = SeatingSystem(limit=5, mode="second").run()
    return first, second


In [23]:
print_day(11)

Part 1: 2281
Part 2: 2085



## Day 12

**Task**: Move the ship by the given rules and calculate the distance between the starting
and ending positions.

Quite a straightforward puzzle: the biggest difference between parts one and two is 
whether the direction (part 1) or the waypoint (part 2) is moved. It also helps again to use
`complex` numbers for easier addition, multiplication and especially for doing
the rotations: the code here uses the fact that multiplying a complex number by $i^n$ 
rotates it around the origin by $n \cdot 90$ degrees counterclockwise (and by $(-i)^n$ clockwise). 
([Here's some info about the math behind it](https://brilliant.org/wiki/complex-exponentiation/))
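
A short sketch of the rotation trick on the waypoint from the puzzle description (10 east, 4 north). The `rotate` helper and the `ROTATIONS` name are made up for this example.

```python
# Multiplying by i turns a point 90 degrees left, multiplying by -i turns it right
ROTATIONS = {"L": complex(0, 1), "R": complex(0, -1)}


def rotate(point: complex, action: str, degrees: int) -> complex:
    """Rotate point around the origin in 90 degree steps."""
    return point * ROTATIONS[action] ** (degrees // 90)


waypoint = complex(10, 4)  # 10 east, 4 north
print(rotate(waypoint, "R", 90))  # (4-10j), i.e. 4 east, 10 south
```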

In [24]:
def day_12() -> Solution:
    instructions = [(line[0], int(line[1:])) for line in data_to_lines(day=12)]

    def solve(mode: Mode) -> int:
        directions = {
            "N": complex(0, 1),
            "E": complex(1, 0),
            "S": complex(0, -1),
            "W": complex(-1, 0),
        }
        rotations = {"L": complex(0, 1), "R": complex(0, -1)}
        point = complex(0, 0)
        other = directions["E"] if mode == "first" else complex(10, 1)

        for action, value in instructions:
            if action in rotations:
                other *= rotations[action] ** (value // 90)
            elif action == "F":
                point += other * value
            elif mode == "first":
                point += directions[action] * value
            elif mode == "second":
                other += directions[action] * value
        return int(abs(point.real) + abs(point.imag))

    return solve(mode="first"), solve(mode="second")


In [25]:
print_day(12)

Part 1: 1010
Part 2: 52742



## Day 13

**Task**: Given a timestamp and a bus schedule, what is the earliest bus you can take
(part 1) / what is the earliest time at which each bus departs exactly n minutes after
the first one, where n is its position in the list (part 2)?

For the input, bus ids and the differences between each bus id and its position in the
schedule are calculated. This helps in part 2.

The first part requires a simple calculation with the remainder of the earliest
time divided by each bus id. The second part, on the other hand, requires some number
theory: the [Chinese remainder theorem](
https://en.wikipedia.org/wiki/Chinese_remainder_theorem). The idea here is that, when the
moduli are pairwise coprime, there is exactly one integer $x$ (modulo the product of the
moduli) that satisfies

$$\begin{aligned}
x &\equiv a_1 \pmod{n_1} \\
x &\equiv a_2 \pmod{n_2} \\
&\;\;\vdots \\
x &\equiv a_k \pmod{n_k}
\end{aligned}$$

where $n$ is the list of bus ids and $a$ is the list of differences between each id and the
bus's position in the given schedule. Using the given example, this means that
`1068781 % 7 == (7-0) % 7`, `1068781 % 13 == (13-1) % 13`, `1068781 % 59 == (59-4) % 59` etc.

Sympy library [provides function](
https://docs.sympy.org/latest/modules/ntheory.html?highlight=baby%20step#sympy.ntheory.modular.crt)
to calculate the integer $x$.
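
The same answer can also be found without sympy by sieving, which may make the theorem more concrete. The sketch below is an alternative technique, not what the solution cell does; it assumes the bus ids are pairwise coprime, and `earliest_timestamp` is a hypothetical helper name.

```python
from typing import List, Tuple


def earliest_timestamp(schedule: str) -> int:
    """Sieve for the first t where the bus at position i departs at t + i."""
    buses: List[Tuple[int, int]] = [
        (int(bus_id), offset)
        for offset, bus_id in enumerate(schedule.split(","))
        if bus_id != "x"
    ]
    timestamp, step = 0, 1
    for bus_id, offset in buses:
        # Advance in steps that keep all earlier buses aligned
        while (timestamp + offset) % bus_id != 0:
            timestamp += step
        # Assumes coprime ids, so the combined period is the product
        step *= bus_id
    return timestamp


print(earliest_timestamp("7,13,x,x,59,x,31,19"))  # 1068781
```

Each bus is aligned with a step equal to the product of the ids handled so far, so the earlier congruences stay satisfied while the next one is searched for.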

In [26]:
def day_13() -> Solution:
    lines = data_to_lines(day=13)
    earliest = int(lines[0])
    bus_ids_and_diffs = [
        (int(bus_id), int(bus_id) - i)
        for i, bus_id in enumerate(lines[1].split(","))
        if bus_id != "x"
    ]

    minutes, bus_id = min(
        (bus_id - earliest % bus_id, bus_id) for bus_id, _ in bus_ids_and_diffs
    )
    first = minutes * bus_id

    bus_ids, diffs = zip(*bus_ids_and_diffs)
    second = crt(bus_ids, diffs)[0]

    return first, second


In [27]:
print_day(13)

Part 1: 3269
Part 2: 672754131923874



## Day 14

**Task**: Given a program with a changing bitmask, memory addresses and values, what is 
the sum of the values in memory after executing the program?

[The mask article in Wikipedia](https://en.wikipedia.org/wiki/Mask_(computing)) is worth 
checking (it's actually linked in the puzzle description as well), since the first part
is just applying two masks: one for setting bits to ones and another for clearing bits
to zeros.
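
Here is a minimal sketch of the two-mask idea, using the mask and values from the puzzle description. The module-level names are chosen for this example only.

```python
MASK = "X" * 29 + "1XXXX0X"  # 36-character mask from the puzzle example

ones_mask = int(MASK.replace("X", "0"), 2)  # 1 only where the mask forces a 1
zeros_mask = int(MASK.replace("X", "1"), 2)  # 0 only where the mask forces a 0


def apply_mask(value: int) -> int:
    """Force the masked bits: OR sets the ones, AND clears the zeros."""
    return (value | ones_mask) & zeros_mask


print(apply_mask(11))  # 73
print(apply_mask(101))  # 101
print(apply_mask(0))  # 64
```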

In the second part, things are a bit harder. I decided to go with the following 
generator: apply the ones mask and left-pad the address with zeros to make it 36 bits wide,
then generate all possible combinations of ones and zeros for the "X"s in the mask and
yield the resulting addresses until all combinations have been consumed.

In [28]:
def day_14() -> Solution:
    lines = data_to_lines(day=14)
    mask, ones_mask, zeros_mask = None, None, None

    def apply_mask(value: int) -> int:
        value |= ones_mask
        value &= zeros_mask
        return value

    def all_memory_addresses(address: int) -> Iterator[int]:
        address_36bit = f"{address | ones_mask:b}".zfill(36)
        for replacements in itertools.product("01", repeat=mask.count("X")):
            it = iter(replacements)
            result = [next(it) if a == "X" else b for a, b in zip(mask, address_36bit)]
            yield int("".join(result), 2)

    regex = re.compile(r"(\d+)")
    first_memory, second_memory = {}, {}
    for line in lines:
        operation, argument = line.split(" = ")
        if operation == "mask":
            mask = argument
            ones_mask = int(argument.replace("X", "0"), 2)
            zeros_mask = int(argument.replace("X", "1"), 2)
        else:
            address, value = [int(x) for x in regex.findall(line)]
            first_memory[address] = apply_mask(value=value)
            for new_address in all_memory_addresses(address=address):
                second_memory[new_address] = value

    return sum(first_memory.values()), sum(second_memory.values())


In [29]:
print_day(14)

Part 1: 17765746710228
Part 2: 4401465949086



## Day 15

**Task**: Play a game based on the [Van Eck sequence](https://www.youtube.com/watch?v=etMJxB-igrc) and return the $n$th number.

The game defined in the puzzle is quite simple: it's best to use a data structure that allows fast lookups of the last index where a number was seen. 

The problem here is that the Van Eck sequence is unpredictable and, as far as I know, there's no algorithm to get the $n$th number of the sequence without calculating the previous $n-1$ numbers.
I wasn't happy with the plain CPython running time (around 7-8 seconds), but using numpy + numba gave some satisfying
results.
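
For reference, the game itself fits in a few lines of plain Python with a dictionary as the lookup structure. This is only a sketch of the slower, unoptimized variant (`nth_spoken` is a hypothetical name), checked against the example from the puzzle description.

```python
from typing import Dict, List


def nth_spoken(starting: List[int], turns: int) -> int:
    """Return the number spoken on the given turn."""
    # Turn on which each number was last spoken, excluding the latest number
    last_seen: Dict[int, int] = {num: i + 1 for i, num in enumerate(starting[:-1])}
    spoken = starting[-1]
    for turn in range(len(starting), turns):
        previous = last_seen.get(spoken)
        last_seen[spoken] = turn
        # A new number is followed by 0, an old one by its age
        spoken = 0 if previous is None else turn - previous
    return spoken


print(nth_spoken([0, 3, 6], turns=2020))  # 436
```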

In [30]:
def day_15() -> Solution:
    numbers = data_to_nums(day=15)

    @nb.njit
    def loop(seen: np.ndarray, start: int, turns: int, spoken: int) -> int:
        for turn in range(start, turns):
            last_seen = seen[spoken]
            seen[spoken] = turn
            spoken = last_seen if last_seen == 0 else turn - last_seen
        return spoken

    def solve(turns) -> int:
        seen = np.zeros(turns, dtype=np.int32)
        for i, num in enumerate(numbers):
            seen[num] = i + 1
        return loop(seen=seen, start=len(numbers), turns=turns, spoken=numbers[-1])

    return solve(turns=2020), solve(turns=30_000_000)


In [31]:
print_day(15)

Part 1: 1618
Part 2: 548531



## Day 16

**Task**: Given a list of rules for ticket fields and some example tickets, find out which rule corresponds to which field. Report the error rate of the invalid example tickets and decode some fields from your own ticket.

Eh, day 16 requires a lot of parsing and is probably one of the longer code cells in this notebook. The idea is the following: in the first part, each field position in each nearby ticket is matched against the given rules (class, row etc.), and field-rule pairs that don't match on all the valid tickets are filtered away. The error rate (the answer for the first part) is calculated here as well. 

The second part is just finding a field position that has only one possible field name, popping it out, removing that name from the other field positions and continuing until all field positions have been matched with some name. 
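
The elimination step can be sketched in isolation. The candidate sets below are hand-written to mirror the small example from the puzzle description, and `resolve_fields` is a hypothetical helper; it assumes that at every step some position has exactly one candidate left, otherwise `next` raises `StopIteration`.

```python
from typing import Dict, Set


def resolve_fields(candidates: Dict[int, Set[str]]) -> Dict[int, str]:
    """Repeatedly fix the position that has exactly one candidate name left."""
    remaining = {position: set(names) for position, names in candidates.items()}
    resolved: Dict[int, str] = {}
    while remaining:
        position = next(p for p, names in remaining.items() if len(names) == 1)
        (name,) = remaining.pop(position)
        resolved[position] = name
        # The fixed name is no longer available for any other position
        for names in remaining.values():
            names.discard(name)
    return resolved


print(resolve_fields({0: {"row"}, 1: {"class", "row"}, 2: {"class", "row", "seat"}}))
# {0: 'row', 1: 'class', 2: 'seat'}
```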

Another way to solve this puzzle would be to use a bipartite graph and search for max flow (NetworkX has some okay tools for this). I already had a solution written, so maybe I'll try a different approach next year.

In [32]:
FieldName = str
FieldPosition = int
Range = Tuple[int, int]
FieldMatches = Dict[FieldPosition, List[FieldName]]


def day_16() -> Solution:
    notes = data_to_blocks(day=16)

    def validate_fields(fields: List[int]) -> Tuple[bool, Union[int, FieldMatches]]:
        field_matches: FieldMatches = defaultdict(list)
        for field_position, value in enumerate(fields):
            matches_for_position = [
                field_name
                for field_name, ranges in rules.items()
                if any(low <= value <= up for low, up in ranges)
            ]
            if not matches_for_position:
                return False, value
            field_matches[field_position].extend(matches_for_position)
        return True, field_matches

    my_ticket = notes[1].splitlines()[1]
    nearby_tickets = notes[2].splitlines()[1:]
    rules: Dict[FieldName, List[Range]] = {}
    all_field_matches: FieldMatches = defaultdict(list)
    potential_fields: Dict[FieldPosition, Set[FieldName]] = {}
    final_fields: Dict[FieldPosition, FieldName] = {}
    regex = re.compile(r"(\d+)-(\d+)")
    error_rate, not_valid_tickets = 0, 0

    for line in notes[0].splitlines():
        field_name, ranges = line.split(": ")
        rules[field_name] = [(int(low), int(up)) for low, up in regex.findall(ranges)]

    for ticket in nearby_tickets:
        fields = [int(x) for x in ticket.split(",")]
        parsing_ok, data = validate_fields(fields=fields)
        if not parsing_ok:
            error_rate += data
            not_valid_tickets += 1
        else:
            for field_position, matches in data.items():
                all_field_matches[field_position].extend(matches)

    valid_amount = len(nearby_tickets) - not_valid_tickets
    for field_position, matches in all_field_matches.items():
        potential_fields[field_position] = {
            field_name
            for field_name, match_amount in Counter(matches).items()
            if match_amount == valid_amount
        }

    while potential_fields:
        field_position = next(
            position
            for position in potential_fields
            if len(potential_fields[position]) == 1
        )
        field_name = list(potential_fields[field_position])[0]
        final_fields[field_position] = field_name
        for field_names in potential_fields.values():
            field_names.discard(field_name)
        del potential_fields[field_position]

    values_from_my_ticket = [
        int(value)
        for field_position, value in enumerate(my_ticket.split(","))
        if final_fields[field_position].startswith("departure")
    ]

    return error_rate, math.prod(values_from_my_ticket)


In [33]:
print_day(16)

Part 1: 20060
Part 2: 2843534243843



## Day 17

**Task**: Simulate [Conway's Game of Life](https://en.wikipedia.org/wiki/Conway%27s_Game_of_Life), but in three and four dimensions.

Just like in day 11, this puzzle is a cellular automaton. The difference here is that a) the grid is not limited, 
b) there are more than two dimensions and c) the number of rounds is given. 

Otherwise, the same idea applies: get the next generation based on the given rules and make calculating neighbours efficient. 
Using sets to represent points shines here, since looping through four-dimensional arrays would be quite slow.
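
The set-based approach works in any number of dimensions, so it can be checked on a familiar two-dimensional pattern. In this sketch (`step` is a hypothetical name, and a non-empty set of cells is assumed) every active cell adds one to the count of each of its neighbours, and the rules are then applied to the counts.

```python
import itertools
from collections import Counter
from typing import Set, Tuple

Cell = Tuple[int, ...]


def step(cells: Set[Cell]) -> Set[Cell]:
    """Compute the next generation for a non-empty set of active cells."""
    dimensions = len(next(iter(cells)))
    directions = [
        d for d in itertools.product((-1, 0, 1), repeat=dimensions) if any(d)
    ]
    # Each active cell "votes" once for every one of its neighbours
    counts = Counter(
        tuple(a + b for a, b in zip(cell, d)) for cell in cells for d in directions
    )
    return {
        cell
        for cell, count in counts.items()
        if count == 3 or (count == 2 and cell in cells)
    }


blinker = {(0, 1), (1, 1), (2, 1)}
print(sorted(step(blinker)))  # [(1, 0), (1, 1), (1, 2)]
```

Only cells that have at least one active neighbour ever appear in the counter, so the unbounded grid never has to be stored.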

In [34]:
Cell = Tuple[int, ...]


def day_17() -> Solution:
    lines = data_to_lines(day=17)

    class GameOfLife:
        def __init__(self, dimensions: int) -> None:
            self.cells = self.initial_cells(dimensions=dimensions)
            self.directions = [
                direction
                for direction in itertools.product((-1, 0, 1), repeat=dimensions)
                if direction != tuple(0 for _ in range(dimensions))
            ]

        @staticmethod
        def initial_cells(dimensions: int) -> Set[Cell]:
            h, w = len(lines), len(lines[0])
            return {
                (x, y, *(0 for _ in range(dimensions - 2)))
                for y in range(h)
                for x in range(w)
                if lines[y][x] == "#"
            }

        def run(self, rounds: int) -> int:
            for _ in range(rounds):
                self.cells = self.next_generation()
            return len(self.cells)

        def next_generation(self) -> Set[Cell]:
            return {
                cell
                for cell, count in Counter(self.get_neighbours()).items()
                if self.becomes_active(cell=cell, count=count)
            }

        def becomes_active(self, cell: Cell, count: int) -> bool:
            return count == 3 or (count == 2 and cell in self.cells)

        def get_neighbours(self) -> Iterator[Cell]:
            return itertools.chain.from_iterable(
                [self.get_neighbour(cell, direction) for direction in self.directions]
                for cell in self.cells
            )

        @staticmethod
        def get_neighbour(cell: Cell, direction: Cell) -> Cell:
            return tuple(a + b for a, b in zip(cell, direction))

    first = GameOfLife(dimensions=3).run(rounds=6)
    second = GameOfLife(dimensions=4).run(rounds=6)
    return first, second


In [35]:
print_day(17)

Part 1: 368
Part 2: 2696



## Day 18

**Task**: Evaluate a bunch of expressions according to weird math rules. 

After some googling about whether operator precedence can be changed in Python (not easy, if possible at all), I decided to use regexes to evaluate expressions in a certain order. The `run` method searches for an expression inside parentheses, evaluates it (according to the rules of part one / two) and replaces it with the evaluated answer. The pattern only matches parentheses that contain no other parentheses, so nested groups get resolved from the innermost one outwards. 

The answer is found by summing the results of all lines.
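
Here's a small standalone sketch of the same reduce-by-regex idea, tried on the examples from the puzzle text. The names `evaluate`, `reduce_all` and `apply` are made up for this illustration, and the sketch computes each match by hand instead of calling `eval`, so it's a variation of the approach rather than the code used below.

```python
import re

PARENS = re.compile(r"\(([^()]*)\)")  # innermost parentheses only
ADD = re.compile(r"\d+ \+ \d+")
MUL = re.compile(r"\d+ \* \d+")
ADD_OR_MUL = re.compile(r"\d+ [+*] \d+")


def apply(match: re.Match) -> str:
    """Compute a single 'a + b' or 'a * b' match and return it as text."""
    left, op, right = match[0].split()
    result = int(left) + int(right) if op == "+" else int(left) * int(right)
    return str(result)


def reduce_all(pattern: re.Pattern, text: str) -> str:
    """Replace the leftmost match with its value until nothing matches."""
    while pattern.search(text):
        text = pattern.sub(apply, text, count=1)
    return text


def evaluate(text: str, addition_first: bool) -> int:
    # Resolve parenthesised groups first, innermost group first.
    while match := PARENS.search(text):
        inner = str(evaluate(match[1], addition_first))
        text = text[: match.start()] + inner + text[match.end() :]
    if addition_first:
        text = reduce_all(MUL, reduce_all(ADD, text))
    else:
        text = reduce_all(ADD_OR_MUL, text)
    return int(text)


print(evaluate("1 + 2 * 3 + 4 * 5 + 6", addition_first=False))  # 71
print(evaluate("1 + 2 * 3 + 4 * 5 + 6", addition_first=True))  # 231
```

Since `search` always returns the leftmost match, repeatedly reducing with the combined pattern gives strict left-to-right evaluation, while reducing all additions before any multiplication gives the part two precedence.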

In [36]:
def day_18():
    lines = data_to_lines(day=18)

    @dataclass
    class Evaluator:
        mode: Mode
        parentheses = re.compile(r"(\([^\(\)]*\))")
        add_or_mul = re.compile(r"\d+ [\+*] \d+")
        add = re.compile(r"\d+ [\+] \d+")
        mul = re.compile(r"\d+ [\*] \d+")

        def run(self, line: str) -> int:
            while expr := self.parentheses.search(line):
                inner_expr = expr[0][1:-1]
                line = self.parentheses.sub(str(self.run(inner_expr)), line, count=1)
            if self.mode == "first":
                while expr := self.add_or_mul.search(line):
                    line = self.add_or_mul.sub(str(eval(expr[0])), line, count=1)
            else:
                while expr := self.add.search(line):
                    line = self.add.sub(str(eval(expr[0])), line, count=1)
                while expr := self.mul.search(line):
                    line = self.mul.sub(str(eval(expr[0])), line, count=1)
            return int(line)

    evaluator = Evaluator(mode="first")
    first = sum(evaluator.run(line=line) for line in lines)
    evaluator.mode = "second"
    second = sum(evaluator.run(line=line) for line in lines)

    return first, second


In [37]:
print_day(18)

Part 1: 4297397455886
Part 2: 93000656194428



## Day 19

**Task**: Given a list of recursively defined rules and a list of messages, count the
messages that match rule number 0.

Since the given rules basically define a context-free grammar, my initial thought was to
use the [Cocke–Younger–Kasami algorithm](https://en.wikipedia.org/wiki/CYK_algorithm) to
parse the messages. After spending way too much time on that, I decided to generate a
regex string recursively instead. That turned out to be a quite compact and good way to
solve this problem.

For the first part, the `get_regex` method joins regex strings together recursively, adding
logical ORs where needed. The changed rules of the second part are handled as follows:

- `8: 42 | 42 8` means that rule 42 can occur one or more times.
- `11: 42 31 | 42 11 31` means that rules 42 and 31 must occur the same number of times,
  all 42s first and then all 31s.

I don't know if it's possible to write a single regex for the latter with Python's `re`
module, so I used a kinda hacky solution instead: the regex matches if there is either one
42 and one 31, two 42s and two 31s, and so on up to a fixed limit. Note that the code loops
over `range(1, max_reps)` with `max_reps = 5`, so the largest repetition count actually
tried is four. The limit comes just from trial and error: it seemed to be the lowest value
that produces the correct answer.
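
The bounded-repetition trick is easier to see on a toy case. Below is a minimal sketch where plain `a` and `b` stand in for the regexes of rules 42 and 31; the helper name `balanced_pair_regex` is made up for this illustration.

```python
import re


def balanced_pair_regex(left: str, right: str, max_reps: int) -> str:
    """Match n copies of `left` followed by n copies of `right`, 1 <= n <= max_reps."""
    options = (
        f"(?:{left}){{{reps}}}(?:{right}){{{reps}}}"
        for reps in range(1, max_reps + 1)
    )
    return "(?:" + "|".join(options) + ")"


pattern = balanced_pair_regex("a", "b", max_reps=3)
print(bool(re.fullmatch(pattern, "aabb")))  # True
print(bool(re.fullmatch(pattern, "aab")))  # False, counts differ
print(bool(re.fullmatch(pattern, "aaaabbbb")))  # False, four reps exceed the limit
```

The last line shows the catch of this approach: anything nested deeper than the limit is silently rejected, so the limit has to be large enough for the longest message in the input.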

In [38]:
def day_19():
    blocks = data_to_blocks(day=19)
    regex = re.compile(r"\d+")

    @dataclass
    class RegexFactor:
        rules: Dict[int, str]
        mode: Mode

        def get_regex(self, rule_id: int) -> str:
            if self.mode == "second":
                if rule_id == 8:
                    return self.get_regex(rule_id=42) + "+"
                if rule_id == 11:
                    max_reps = 5
                    return self.or_join(
                        self.get_regex(rule_id=42)
                        + f"{{{reps}}}"
                        + self.get_regex(rule_id=31)
                        + f"{{{reps}}}"
                        for reps in range(1, max_reps)
                    )
            if self.rules[rule_id] in "ab":
                return self.rules[rule_id]
            return self.or_join(
                "".join(
                    self.get_regex(rule_id=int(rule_id))
                    for rule_id in regex.findall(part)
                )
                for part in self.rules[rule_id].split(" | ")
            )

        @staticmethod
        def or_join(iterable: Iterable) -> str:
            return "(?:" + "|".join(iterable) + ")"

    def count_valid_messages(regex: str) -> int:
        messages = blocks[1].splitlines()
        return sum(bool(re.fullmatch(regex, message)) for message in messages)

    rules: Dict[int, str] = {}
    for line in blocks[0].splitlines():
        rule_num, rule = line.split(": ")
        rules[int(rule_num)] = rule[1] if "a" in rule or "b" in rule else rule

    factor = RegexFactor(rules=rules, mode="first")
    first = count_valid_messages(factor.get_regex(rule_id=0))
    factor.mode = "second"
    second = count_valid_messages(factor.get_regex(rule_id=0))

    return first, second


In [39]:
print_day(19)

Part 1: 195
Part 2: 309



## Day 20

**Task**: 

In [40]:
def day_20():
    first = second = None

    return first, second


In [41]:
print_day(20)

Part 1: None
Part 2: None



## Day 21

**Task**: Parse allergens and match them with foods from an encoded list. Report how many times foods without allergens appear, and list all foods with allergens.

This puzzle is actually almost like the one from day 16, so I decided to take a similar approach: first, find all potentially allergenic foods by filtering them with set intersections. Then find an allergen that matches just one food, pop it out, remove that food from all other allergen sets and continue until there are no allergens left. The while loop 
here is basically a copy from day 16.

The first answer is found by merging the sets in the `potential_allergens` dictionary and summing the appearances of foods that are not in the merged set. The second is just the foods sorted by their allergen.
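
As a rough sketch of the elimination step on its own, here it is applied to the candidate sets that the intersections produce for the example in the puzzle text. The `resolve` helper is made up for this illustration, and it assumes there's always some allergen with exactly one candidate left (otherwise `next` raises `StopIteration`).

```python
from typing import Dict, Set


def resolve(candidates: Dict[str, Set[str]]) -> Dict[str, str]:
    """Pin each allergen to one food by repeatedly taking single-candidate sets."""
    remaining = {allergen: set(foods) for allergen, foods in candidates.items()}
    resolved: Dict[str, str] = {}
    while remaining:
        # Assumption: some allergen always has exactly one candidate left.
        allergen = next(a for a, foods in remaining.items() if len(foods) == 1)
        (food,) = remaining.pop(allergen)
        resolved[allergen] = food
        # A food can hold only one allergen, so drop it from every other set.
        for foods in remaining.values():
            foods.discard(food)
    return resolved


# Candidate sets after intersecting the example's food lists per allergen
candidates = {
    "dairy": {"mxmxvkd"},
    "fish": {"mxmxvkd", "sqjhc"},
    "soy": {"sqjhc", "fvjkl"},
}
resolved = resolve(candidates)
print(",".join(resolved[allergen] for allergen in sorted(resolved)))
# mxmxvkd,sqjhc,fvjkl
```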

In [42]:
Food = Allergen = str


def day_21():
    lines = data_to_lines(day=21)
    potential_allergens: Dict[Allergen, Set[Food]] = defaultdict(set)
    food_appearances = Counter()
    final_allergens: List[Tuple[Allergen, Food]] = []
    regex = re.compile(r"(\w+)")

    for line in lines:
        foods, allergens = [regex.findall(part) for part in line.split("contains")]
        food_appearances.update(foods)
        for allergen in allergens:
            if allergen not in potential_allergens:
                potential_allergens[allergen] |= set(foods)
            potential_allergens[allergen] &= set(foods)

    allergenic_foods = functools.reduce(
        lambda x, y: {*x, *y}, potential_allergens.values()
    )

    while potential_allergens:
        allergen = next(
            allergen
            for allergen in potential_allergens
            if len(potential_allergens[allergen]) == 1
        )
        food = list(potential_allergens[allergen])[0]
        final_allergens.append((allergen, food))
        for foods in potential_allergens.values():
            foods.discard(food)
        del potential_allergens[allergen]

    first = sum(
        appearances
        for food, appearances in food_appearances.items()
        if food not in allergenic_foods
    )
    second = ",".join(food for _, food in sorted(final_allergens))

    return first, second


In [43]:
print_day(21)

Part 1: 2779
Part 2: lkv,lfcppl,jhsrjlj,jrhvk,zkls,qjltjd,xslr,rfpbpn



## Day 22

**Task**: Play games called "Combat" and "Recursive Combat".

Good old space cards from 2019, but this time, luckily, without a lot of arithmetic ([check this out if you want to see more](https://adventofcode.com/2019/day/22)). 

For efficiently simulating cards in decks, it's best to use a deque: it supports $O(1)$ insertions and removals at both ends. A deque can't be sliced directly, but `itertools.islice` gives a quite efficient way to copy the first cards of a deck for a sub-game. This way
it's easy to run the games according to the given rules. 
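
Here's a minimal sketch of the plain (non-recursive) game with deques, using the example decks from the puzzle text. The function name `play_combat` is made up for this illustration; the last lines show how `islice` copies the top of a deck without touching the original.

```python
from collections import deque
from itertools import islice
from typing import Iterable


def play_combat(deck_one: Iterable[int], deck_two: Iterable[int]) -> int:
    """Play Combat until one deck is empty and return the winner's score."""
    decks = [deque(deck_one), deque(deck_two)]
    while decks[0] and decks[1]:
        a, b = decks[0].popleft(), decks[1].popleft()  # O(1) draw from the top
        if a > b:
            decks[0].extend((a, b))  # winner's card first, then the loser's
        else:
            decks[1].extend((b, a))
    winner = decks[0] or decks[1]
    # The bottom card counts once, the next one twice, and so on.
    return sum(i * card for i, card in enumerate(reversed(winner), start=1))


print(play_combat([9, 2, 6, 3, 1], [5, 8, 4, 7, 10]))  # 306

# Copy the first two cards of a deck for a sub-game
deck = deque([2, 6, 3, 1, 9, 5])
sub_deck = deque(islice(deck, 0, 2))
print(sub_deck)  # deque([2, 6])
```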

In [44]:
Winner = str
Score = int


def day_22():
    def score(deck: Deque) -> Score:
        return sum((i + 1) * num for i, num in enumerate(reversed(deck)))

    def play_first(decks: List[Deque]) -> Score:
        while decks[0] and decks[1]:
            cards = [deck.popleft() for deck in decks]
            if cards[0] > cards[1]:
                decks[0].extend(cards)
            else:
                decks[1].extend(reversed(cards))
        return score(deck=decks[0]) if decks[0] else score(deck=decks[1])

    def play_second(decks: List[Deque], at_root=False) -> Union[Winner, Score]:
        seen = [set() for _ in range(2)]
        while decks[0] and decks[1]:
            if any(tuple(decks[i]) in seen[i] for i in range(2)):
                return "Player 1"
            seen[0].add(tuple(decks[0]))
            seen[1].add(tuple(decks[1]))
            cards = [deck.popleft() for deck in decks]
            if all(len(decks[i]) >= cards[i] for i in range(2)):
                new_decks = [
                    deque(itertools.islice(a, 0, b)) for a, b in zip(decks, cards)
                ]
                winner = play_second(decks=new_decks)
            else:
                winner = "Player 1" if cards[0] > cards[1] else "Player 2"
            if winner == "Player 1":
                decks[0].extend(cards)
            else:
                decks[1].extend(reversed(cards))
        winner = "Player 1" if decks[0] else "Player 2"
        if at_root:
            return score(deck=decks[0]) if decks[0] else score(deck=decks[1])
        return winner

    def run(mode):
        decks = [
            deque([int(x) for x in nums.splitlines()[1:]])
            for nums in data_to_blocks(day=22)
        ]
        return (
            play_second(decks=decks, at_root=True)
            if mode == "second"
            else play_first(decks=decks)
        )

    return run(mode="first"), run(mode="second")


In [45]:
print_day(22)

Part 1: 35013
Part 2: 32806



## Day 23

**Task**: Play cups and balls magic trick for $n$ rounds.

This is one of those AoC puzzles where it's easy to write the first version, then the input size 
bumps up (here to $10^6$ cups and $10^7$ moves) and you're left with a solution built on an entirely wrong data structure / algorithm.

The rules themselves are quite simple: take three elements and move them to another place without changing their order, then repeat. A good data structure for this is a linked list, but since
Python doesn't have one in the standard library and only a singly linked list is needed (to know the next cup on the right side), one can simulate a linked list with a regular list. Here, each index 
$i$ in the list `neighbour` tells which label is on the right side of the cup labelled $i$.

And since simulating cup movements is basically just looping over the same code $n$ times, it can
be sped up with numba.
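
To show the list-as-linked-list idea without numba, here's a small pure-Python sketch run on the example from the puzzle text. The `play` function and the name `nxt` are made up for this illustration, and the sketch assumes that the cups are labelled $1…n$ with no gaps.

```python
def play(labels: str, moves: int) -> str:
    """Simulate the cup game and return the labels clockwise after cup 1."""
    cups = [int(c) for c in labels]
    n = len(cups)
    # nxt[i] is the label of the cup right after cup i (index 0 is unused).
    nxt = [0] * (n + 1)
    for left, right in zip(cups, cups[1:] + cups[:1]):
        nxt[left] = right

    curr = cups[0]
    for _ in range(moves):
        first = nxt[curr]
        second = nxt[first]
        third = nxt[second]
        # Destination: the next lower label that was not picked up, wrapping to n.
        dest = curr - 1 or n
        while dest in (first, second, third):
            dest = dest - 1 or n
        # Unlink the three cups and splice them back in after the destination.
        nxt[curr] = nxt[third]
        nxt[third] = nxt[dest]
        nxt[dest] = first
        curr = nxt[curr]

    result, cup = [], nxt[1]
    while cup != 1:
        result.append(str(cup))
        cup = nxt[cup]
    return "".join(result)


print(play("389125467", moves=10))  # 92658374
```

Each move only touches a handful of list entries, so the cost per move stays constant no matter how many cups there are.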

In [167]:
import time


def day_23() -> Solution:
    cups = [int(x) for x in data(day=23)]

    @nb.njit
    def run(curr: int, neighbour: nb.typed.List, repeat: int) -> nb.typed.List:
        n = len(neighbour) - 1
        for _ in range(repeat):
            first = neighbour[curr]
            second = neighbour[first]
            last = neighbour[second]
            values = {first, second, last}

            dest = n if curr == 1 else curr - 1
            while dest in values:
                dest = n if dest == 1 else dest - 1

            neighbour[curr] = neighbour[last]
            neighbour[last] = neighbour[dest]
            neighbour[dest] = first

            curr = neighbour[curr]

        return neighbour

    def solve(mode: Mode) -> Union[int, str]:
        neighbour = nb.typed.List()
        n = len(cups)
        for _ in range(n + 1):
            neighbour.append(0)
        for i in range(n - 1):
            neighbour[cups[i]] = cups[i + 1]
        curr = cups[0]

        start = time.time()
        if mode == "second":
            neighbour[cups[n - 1]] = n + 1
            for i in range(n + 2, 10 ** 6 + 1):
                neighbour.append(i)
            neighbour.append(cups[0])
            repeat = 10 ** 7
        else:
            neighbour[cups[-1]] = cups[0]
            repeat = 100
        print("took:", time.time() - start)

        start = time.time()
        solution = run(curr=curr, neighbour=neighbour, repeat=repeat)
        print("took:", time.time() - start)

        if mode == "second":
            return solution[1] * solution[solution[1]]

        curr, ans = 1, []
        while (curr := solution[curr]) != 1:
            ans.append(str(curr))
        return "".join(ans)

    return solve(mode="first"), solve(mode="second")


In [168]:
print_day(23)

took: 4.291534423828125e-06
took: 0.4045839309692383
took: 1.6664531230926514
took: 4.467679023742676
Part 1: 58427369
Part 2: 111057672960



## Day 25

**Task**: Given two public keys, find out the encryption key they share.

The final day, yeah! The exchange process detailed in the puzzle is a lot like
[Diffie-Hellman key exchange](https://en.wikipedia.org/wiki/Diffie%E2%80%93Hellman_key_exchange),
and in this case, the secret integer (loop size) must be solved.

The transformation is basically taking the $k$th power of the subject number $7$
modulo $20201227$, where $k$ is the loop size. Since the public key $n$ is given, the
loop size can be found from the equation $7^k \equiv n \pmod{20201227}$. 

Instead of brute forcing loop sizes $1…k$ until the correct one is found, it's better to
calculate the [discrete logarithm](https://en.wikipedia.org/wiki/Discrete_logarithm) of
$n$ to the base $7$ modulo $20201227$. Luckily, the Sympy library provides a
[suitable function for this](
https://docs.sympy.org/latest/modules/ntheory.html?#sympy.ntheory.residue_ntheory.discrete_log).
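
For a feel of what solving the discrete logarithm means without Sympy, here's a hedged sketch of the classic baby-step giant-step method in plain Python. This is a swapped-in technique for illustration, not necessarily what `discrete_log` does internally, and the name `discrete_log_bsgs` is made up. The public keys are the example ones from the puzzle text.

```python
import math


def discrete_log_bsgs(target: int, base: int, modulus: int) -> int:
    """Find k with base ** k ≡ target (mod modulus), assuming gcd(base, modulus) == 1."""
    m = math.isqrt(modulus) + 1
    # Baby steps: remember the smallest j for each value of base ** j, j < m.
    table = {}
    value = 1
    for j in range(m):
        table.setdefault(value, j)
        value = value * base % modulus
    # Giant steps: multiply the target by base ** (-m) until a baby step is hit.
    factor = pow(base, -m, modulus)  # modular inverse power, Python 3.8+
    gamma = target
    for i in range(m):
        if gamma in table:
            return i * m + table[gamma]
        gamma = gamma * factor % modulus
    raise ValueError("no discrete logarithm found")


subject_number, modulus = 7, 20201227
card_key, door_key = 5764801, 17807724  # example keys from the puzzle text

card_loop = discrete_log_bsgs(card_key, subject_number, modulus)
door_loop = discrete_log_bsgs(door_key, subject_number, modulus)
print(card_loop, door_loop)  # 8 11

# Both sides end up with the same encryption key
encryption_key = pow(door_key, card_loop, modulus)
print(encryption_key == pow(card_key, door_loop, modulus))  # True
```

This needs roughly $\sqrt{20201227} \approx 4500$ steps per phase instead of up to about $2 \cdot 10^7$ for plain brute force.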

In [46]:
def day_25():
    card_key, door_key = data_to_nums(day=25)
    subject_number, modulo = 7, 20201227

    # Loop size k of the card: subject_number ** k ≡ card_key (mod modulo)
    k = discrete_log(modulo, card_key, subject_number)

    return pow(door_key, k, modulo), None



In [47]:
print_day(25)

Part 1: 3286137
Part 2: None



# Timings
Here's the timing report from a 2018 15" MacBook Pro (2.8 GHz i7):

In [148]:
import time


def timing():
    """Report on timing for all days."""
    print("Day  Secs.")
    print("===  =====")
    for day in range(1, 24):
        start = time.time()
        run_day(day)
        total = time.time() - start
        print(f"{day:2} {total:6.3f}")


%time timing()

Day  Secs.
===  =====
 1  0.122
 2  0.002
 3  0.001
 4  0.002
 5  0.001
 6  0.004
 7  0.017
 8  0.019
 9  0.013
10  0.001
11  2.661
12  0.002
13  0.001
14  0.361
15  0.955
16  0.080
17  0.382
18  0.086
19  0.015
20  0.000
21  0.002
22  1.457


Encountered the use of a type that is scheduled for deprecation: type 'reflected list' found for argument 'neighbour' of function 'day_23.<locals>.run'.

For more information visit https://numba.pydata.org/numba-doc/latest/reference/deprecation.html#deprecation-of-reflection-for-list-and-set-types

File "<ipython-input-146-a9eae53b9846>", line 5:
    @nb.njit
    def run(
    ^



23  5.139
CPU times: user 11.1 s, sys: 136 ms, total: 11.3 s
Wall time: 11.3 s



Take a look at the following solutions by others:
- https://github.com/norvig/pytudes/blob/master/ipynb/Advent-2020.ipynb
- https://github.com/sophiebits/adventofcode/
- https://github.com/arknave/advent-of-code-2020
- https://github.com/mjpieters/adventofcode/tree/master/2020