In [26]:
with open('19.txt') as f:
    SUBSTRINGS, STRINGS = f.read().split('\n\n')
    SUBSTRINGS = SUBSTRINGS.split(', ')
    STRINGS = STRINGS.split('\n')

len(SUBSTRINGS), len(STRINGS)

(447, 400)

# Part 1: Substring matching

In [36]:
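def m2(string: str, substrings=None):
    # Yield True once for every way `string` can be built from `substrings`
    # (defaults to SUBSTRINGS), so any() tests feasibility and sum() counts ways.
    substrings = SUBSTRINGS if substrings is None else tuple(substrings)
    if string == '':
        yield True
    for s in substrings:
        if string.startswith(s):
            yield from m2(string.removeprefix(s), substrings)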

sum(any(m2(s)) for s in STRINGS)

247

# Part 2: Decomposing the substrings

## Attempt 1: Analysis time

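In principle we could just swap the `any` for `sum` to count every decomposition instead of stopping at the first one. With 447 substrings, though, there is a _lot_ of overlap possible, and the generator would have to walk through every single decomposition one at a time. Let's be a little smart.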

If a substring can be decomposed in 0 ways, it's a "prime substring". But what about composite substrings? Let's decompose each substring into _other_ substrings and check to see how many ways each substring can be decomposed:

In [8]:
from collections import Counter

def without[T](seq: list[T], exclude: T):
    for x in seq:
        if x != exclude:
            yield x

Counter(sum(m2(s, without(SUBSTRINGS, s))) for s in SUBSTRINGS)

Counter({0: 245, 1: 202})

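This is _extremely_ informative! It tells us that 245 substrings are prime and that **each of the 202 composite substrings has exactly one decomposition into other substrings**. In other words, each time a composite substring is present, it can be used whole or in its decomposed form, which would multiply the possible decompositions by two.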
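To make the prime/composite split concrete, here is a minimal sketch on a toy set of substrings. `TOY`, `count_ways` and `ways_without_self` are hypothetical names introduced only for illustration; `count_ways` stands in for a two-argument counting variant of `m2` and is not the notebook's own definition.

```python
from collections import Counter

# Hypothetical toy data, not taken from the puzzle input.
TOY = ['a', 'b', 'c', 'ab', 'bc']

def count_ways(string: str, pieces: tuple[str, ...]) -> int:
    """Count the ways `string` can be written as a concatenation of `pieces`."""
    if string == '':
        return 1
    return sum(
        count_ways(string[len(p):], pieces)
        for p in pieces
        if string.startswith(p)
    )

def ways_without_self(s: str) -> int:
    """Count decompositions of `s` that use only the *other* toy substrings."""
    others = tuple(p for p in TOY if p != s)
    return count_ways(s, others)

toy_counts = {s: ways_without_self(s) for s in TOY}
print(toy_counts)                    # per-substring decomposition counts
print(Counter(toy_counts.values()))  # how many primes (0) vs. composites (1)
```

In this toy set `a`, `b` and `c` are prime, while `ab` and `bc` each have exactly one decomposition, which mirrors the shape of the real result.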

In [9]:
COMPOSITES = {
    s for s in SUBSTRINGS
    if sum(m2(s, without(SUBSTRINGS, s))) == 1}

A naïve solution – and indeed one tried – would be to assume the presence of any composite implies two possibilities (the composite, or its decomposed form); thus we might think
```py
2 ** sum(string.count(c) for c in COMPOSITES)
```
Consider, however, that the string `abc` can be decomposed as `a+b+c`, `ab+c` and `a+bc`, but not as `ab+bc`: because `ab` and `bc` overlap, they are mutually exclusive rather than independent. The naïve formula counts `2 ** 2 = 4` possibilities where only 3 exist, so any time we see `abc` we can take the naïve count of possibilities and multiply by 3/4. So let's do just that!
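The `abc` case is small enough to check by brute force. The sketch below assumes a hypothetical toy set of pieces (`toy_pieces`, `toy_composites`) and a hypothetical helper `decompositions`; none of these come from the puzzle input.

```python
def decompositions(string: str, pieces: list[str]):
    """Yield every way to split `string` into a sequence of `pieces`."""
    if string == '':
        yield []
        return
    for p in pieces:
        if string.startswith(p):
            for rest in decompositions(string[len(p):], pieces):
                yield [p, *rest]

# Hypothetical toy data: three primes and two overlapping composites.
toy_pieces = ['a', 'b', 'c', 'ab', 'bc']
toy_composites = ['ab', 'bc']

actual = list(decompositions('abc', toy_pieces))
naive = 2 ** sum('abc'.count(c) for c in toy_composites)

print(actual)              # every real decomposition of 'abc'
print(naive, len(actual))  # naive estimate vs. true count
```

The naïve estimate comes out at 4 while only 3 decompositions exist, which is where the 3/4 correction factor comes from.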

In [17]:
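def decompose(composite: str):
    # Return the split of `composite` into *other* substrings, backtracking
    # when a chosen prefix leads to a dead end (a purely greedy scan could
    # loop forever in that case).
    others = list(without(SUBSTRINGS, composite))

    def split(string: str):
        if string == '':
            return []
        for s in others:
            if string.startswith(s):
                rest = split(string.removeprefix(s))
                if rest is not None:
                    return [s, *rest]
        return None

    return split(composite)

COMPOSITE_PARTS = [decompose(ss) for ss in COMPOSITES]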
COMPOSITE_PARTS

OVERLAPS = [
    ''.join(ab+bc[1:]) for ab in COMPOSITE_PARTS for bc in COMPOSITE_PARTS
    if ab[-1] == bc[0]
]
len(OVERLAPS)

1166

In [18]:
def possibilities(string: str):
    score = 2 ** sum(string.count(c) for c in COMPOSITES)
    score *= (3/4) ** sum(string.count(o) for o in OVERLAPS)
    return score

sum(map(possibilities, STRINGS))

4563447880091503.0

That is roughly 6.6 times the exact count that Attempt 2 arrives at, so the overlap correction alone does not repair the naïve estimate.

## Attempt 2: Maybe Memoization?

In [44]:
from functools import cache

@cache
def m2(string: str):
    if string == '':
        return 1
    return sum(
        m2(string.removeprefix(s))
        for s in SUBSTRINGS
        if string.startswith(s)
    )

sum(m2(s) for s in STRINGS)

692596560138745
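
Memoization works here because every recursive call receives a suffix of the original string, so there are at most `len(string) + 1` distinct arguments to cache per string. For comparison, the same count can be written bottom-up without recursion. This is only a sketch: `count_arrangements` and `toy_pieces` are hypothetical names, and the data is a toy set rather than the puzzle input.

```python
def count_arrangements(string: str, pieces: list[str]) -> int:
    """Count the ways to build `string` from `pieces` with a prefix table."""
    # ways[i] = number of ways to build the first i characters of `string`
    ways = [0] * (len(string) + 1)
    ways[0] = 1  # the empty prefix can be built in exactly one way
    for i in range(len(string)):
        if ways[i] == 0:
            continue  # position i is unreachable, nothing to extend
        for p in pieces:
            # str.startswith accepts a start offset, so no slicing is needed
            if string.startswith(p, i):
                ways[i + len(p)] += ways[i]
    return ways[-1]

# Hypothetical toy data, not taken from the puzzle input.
toy_pieces = ['a', 'b', 'c', 'ab', 'bc']
print(count_arrangements('abc', toy_pieces))
print(count_arrangements('abd', toy_pieces))
```

On the toy data this gives 3 for `abc` and 0 for `abd`, since no piece covers the trailing `d`.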