# Exponential growth in polymers

* <https://adventofcode.com/2021/day/14>

If we were to do as the puzzle suggests and actually build the string up, we'd run out of memory quite fast. A starting template of length $k$ has $k - 1$ pairs and every step doubles the number of pairs, so the output length after $n$ steps is $((k - 1) 2^n) + 1$. At one byte per element, you'd need about $k$ kilobytes after 10 steps, $k$ megabytes after 20 steps, and $k$ gigabytes after 30 steps. The puzzle input gives us a template of 20 characters, so 30 steps would _just_ fit into my laptop's memory, but not comfortably!
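
As a quick sanity check of that length formula, here is a minimal sketch that builds the polymer the naive way for the example template `NNCB` and compares its length against $((k - 1) 2^n) + 1$. The helper names `naive_step` and `expected_length` are my own, for illustration only; the rules are the example rules from the puzzle description.

```python
# Example rules from the puzzle description, as pair -> inserted element.
RULES = dict(
    line.split(" -> ")
    for line in (
        "CH -> B", "HH -> N", "CB -> H", "NH -> C", "HB -> C", "HC -> B",
        "HN -> C", "NN -> C", "BH -> H", "NC -> B", "NB -> B", "BN -> B",
        "BB -> N", "BC -> B", "CC -> N", "CN -> C",
    )
)


def naive_step(polymer: str) -> str:
    """Insert the rule's element between every adjacent pair (illustrative helper)."""
    inserted = (a + RULES[a + b] for a, b in zip(polymer, polymer[1:]))
    # every pair contributed its first element plus the inserted one;
    # the last element of the chain still has to be appended
    return "".join(inserted) + polymer[-1]


def expected_length(k: int, n: int) -> int:
    """Length after n steps for a template of length k."""
    return (k - 1) * 2**n + 1


polymer = "NNCB"
for n in range(1, 11):
    polymer = naive_step(polymer)
    assert len(polymer) == expected_length(len("NNCB"), n)

print(len(polymer))  # 3073 elements after 10 steps
```

This is fine for 10 steps, but the string doubles every step, which is exactly what the pair counting approach avoids.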

So we need an alternative approach. The order of the polymer pairs doesn't actually matter here, so we can just _count pairs_, and map each pair to two new pairs. For example, the first pair in the template, `NN`, maps to two new pairs, `NC` and `CN`. For every _pair -> inserted element_ rule we can generate a _pair -> (pair1, pair2)_ mapping instead, and then count how many of each pair there are at each step.

The input template `NNCB`, for example, should really be read as the pairs `NN`, `NC`, and `CB`, which after one step map to `NC`, `CN`, `NB`, `BC`, `CH` and `HB`, still all unique pairs. The next step then gives us:

| pair | count |
| ---- | ----: |
| `NB` |     2 |
| `BC` |     2 |
| `CC` |     1 |
| `CN` |     1 |
| `BB` |     2 |
| `CB` |     2 |
| `BH` |     1 |
| `HC` |     1 |
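
The two steps for `NNCB` can be reproduced with a small sketch. It only includes the rules those two steps need, already rewritten as _pair -> (pair1, pair2)_; the names `PAIR_RULES` and `step` are illustrative and not part of the solution further down.

```python
from collections import Counter

# Only the example rules needed for two steps of NNCB, written as
# pair -> (left pair, right pair); e.g. NN -> C becomes NN -> (NC, CN).
PAIR_RULES = {
    "NN": ("NC", "CN"),
    "NC": ("NB", "BC"),
    "CB": ("CH", "HB"),
    "CN": ("CC", "CN"),
    "NB": ("NB", "BB"),
    "BC": ("BB", "BC"),
    "CH": ("CB", "BH"),
    "HB": ("HC", "CB"),
}


def step(pairs: Counter) -> Counter:
    """Apply one insertion step to a Counter of pair counts."""
    result = Counter()
    for pair, count in pairs.items():
        # each occurrence of a pair is replaced by its two new pairs
        for new_pair in PAIR_RULES[pair]:
            result[new_pair] += count
    return result


template = "NNCB"
pairs = Counter(a + b for a, b in zip(template, template[1:]))
step1 = step(pairs)
step2 = step(step1)

for pair, count in step2.items():
    print(pair, count)
```

The counts in `step2` are the ones in the table, and they sum to 12 pairs, one fewer than the 13 elements in the chain.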

For the test input we only ever need to keep up to 16 counts this way, one for each possible pair of its 4 elements. The number differs for the actual puzzle input; mine uses 10 unique elements, resulting in 100 rules.

The final step is to turn the pair counts into counts for individual elements. We only count the _first_ element of every pair, as the second element of each pair overlaps with the first element of the next pair. This would leave out the last element of the chain, but that element never changes from the starting template. For the test input template `NNCB`, that last element is `B`, which, like the starting element, stays constant throughout the growing polymer. If you take the first element of each pair, use the pair count as the count for that individual element, and add 1 for the last element `B`, you can see that after step two the 13 elements we have been shown match the following counts:

| elem | count |
| ---- | ----: |
| `B`  |     6 |
| `C`  |     4 |
| `H`  |     1 |
| `N`  |     2 |
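
A minimal sketch of that conversion, using the step 2 pair counts from the earlier table as literal input (the variable names are illustrative):

```python
from collections import Counter

# Pair counts after step 2 of the NNCB example, copied from the pair table.
pair_counts = Counter(
    {"NB": 2, "BC": 2, "CC": 1, "CN": 1, "BB": 2, "CB": 2, "BH": 1, "HC": 1}
)
last = "B"  # final element of the template; it never changes

# Start with the last element, then add the first element of every pair.
elements = Counter({last: 1})
for (first, _second), count in pair_counts.items():
    elements[first] += count

print(sorted(elements.items()))  # [('B', 6), ('C', 4), ('H', 1), ('N', 2)]
print(max(elements.values()) - min(elements.values()))  # 5
```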

We need to pick the maximum and minimum values from these and subtract the minimum from the maximum; that's the puzzle answer.


In [1]:
from __future__ import annotations
from collections import Counter
from dataclasses import dataclass, replace
from functools import cached_property
from itertools import islice
from typing import Iterator, NamedTuple


Rules = dict[str, tuple[str, str]]


class Extremes(NamedTuple):
    min: int
    max: int


@dataclass(frozen=True)
class Polymerization:
    chain: Counter[str]
    last: str
    rules: Rules

    @classmethod
    def from_instructions(cls, instructions: str) -> Polymerization:
        templ, rulelines = instructions.split("\n\n")
        template = Counter(f"{l1}{l2}" for l1, l2 in zip(templ, templ[1:]))
        rule_pairs = (line.split(" -> ") for line in rulelines.splitlines())
        rules = {
            pair: (f"{pair[0]}{target}", f"{target}{pair[1]}")
            for pair, target in rule_pairs
        }
        return cls(template, templ[-1], rules)

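    def __len__(self) -> int:
        # n overlapping pairs form a chain of n + 1 elements
        # (Counter.total() requires Python 3.10 or newer)
        return self.chain.total() + 1

    def __iter__(self) -> Iterator[Polymerization]:
        # Infinite iterator: each item is the state after one more insertion step
        step, rules = self, self.rules
        while True:
            chain = Counter()
            for pair, count in step.chain.items():
                # every occurrence of a pair is replaced by its two new pairs
                for new in rules[pair]:
                    chain[new] += count
            yield (step := replace(step, chain=chain))

    @cached_property
    def extremes(self) -> Extremes:
        # Count only the first element of each pair, as the second overlaps
        # with the next pair; the last element of the chain is added up front
        elems = Counter(self.last)
        for (elem, _), count in self.chain.items():
            elems[elem] += count
        return Extremes(min(elems.values()), max(elems.values()))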


test_instructions = """\
NNCB

CH -> B
HH -> N
CB -> H
NH -> C
HB -> C
HC -> B
HN -> C
NN -> C
BH -> H
NC -> B
NB -> B
BN -> B
BB -> N
BC -> B
CC -> N
CN -> C
"""

test_reaction = Polymerization.from_instructions(test_instructions)
test_step10 = next(islice(test_reaction, 9, None))  # skip 9 steps to get to step 10
assert test_step10.extremes == (161, 1749)


In [2]:
import aocd

reaction = Polymerization.from_instructions(aocd.get_data(day=14, year=2021))
step10 = next(islice(reaction, 9, None))  # skip 9 steps to get to step 10
print("Part 1:", step10.extremes.max - step10.extremes.min)


Part 1: 3259


# Part 2: run it until you would run out of memory

As I suspected, part 2 is to run the reaction 40 times, which for the 20 elements in my puzzle input template would require roughly 19 [TiB](https://en.wikipedia.org/wiki/Byte#Units_based_on_powers_of_2) at one byte per element ($(k - 1) 2^{40} + 1$ is 20,890,720,927,745 elements for $k = 20$). Good thing we are only tracking pair counts!


In [3]:
test_step40 = next(islice(test_step10, 29, None))  # 30 steps onwards from test_step10
assert test_step40.extremes == (3849876073, 2192039569602)


In [4]:
step40 = next(islice(step10, 29, None))  # 30 steps onwards from step10
print("Part 2:", step40.extremes.max - step40.extremes.min)


Part 2: 3459174981021
