I'm attempting Day 19 again. Last time, I had to give up and look at Norvig's solution. It's been a few days now, so I want to see if I can solve it, hopefully without remembering much of his solution. (Though it does help to know that a simple solution exists.)

In [1]:
import re
from typing import Optional
from helpers import data

Rules can either be a character that must match, or a list of sub-rules where at least one sub-rule must match. A sub-rule is a list of rule numbers which must match consecutively. 

We'll store rules as a dictionary where the key is the rule number. The values will either be a character, or a list of tuples, where each tuple is a sub-rule (containing rule numbers), and at least one tuple within the list should match. 

We'll store rules as a dictionary where the key is the rule number. To encode flexible information about rules, we'll have a `Choice` type and an `Ordered` type. `Choice` indicates rule numbers where at least one must match. `Ordered` indicates we need to match the rules (numbers, strings, `Choice`s, or `Ordered`s) in the given order. Values in the `rules` dictionary can be either `Char`, `int`, `Choice`, or `Ordered`. 

We'll store rules as a dictionary where the key is the rule number. Every value will be an `Ordered` type, which can store `Char`s, `int`s, or `Choice`s. 

We'll store the rules as a dictionary where the key is the rule number. Every value will be an `Ordered`, which holds either a single `Char`, a sequence of rule numbers, or a single `Choice` whose alternatives are themselves `Ordered` lists of rule numbers. 
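To make the encoding concrete, here is a small sketch of how a toy grammar would look in this representation. The grammar and the name `toy_rules` are made up for illustration and are not taken from the puzzle input.

```python
Char = str
Choice = tuple
Ordered = list

# Hypothetical toy grammar (not from the puzzle input):
#   0: 1 2
#   1: "a"
#   2: 1 3 | 3 1
#   3: "b"
toy_rules = {
    0: Ordered([1, 2]),                                        # rule numbers in order
    1: Ordered(["a"]),                                         # a single character
    2: Ordered([Choice((Ordered([1, 3]), Ordered([3, 1])))]),  # one Choice of two alternatives
    3: Ordered(["b"]),
}

# Since Ordered is list and Choice is tuple, these are plain Python values.
print(toy_rules[2])  # [([1, 3], [3, 1])]
```

Because the aliases are just `list` and `tuple`, `isinstance` checks are enough to tell the three cases apart later on.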

In [2]:
Char = str
Choice = tuple 
Ordered = list 

In [3]:
def parse_rule(line: str): 
    """
    We'll store rules as a dictionary where the key is the rule number. Every value 
    is an `Ordered`, which holds either a single `Char`, a sequence of rule numbers, 
    or a single `Choice` whose alternatives are `Ordered` lists of rule numbers. 
    
    Current implementation handles at most one '|'. 
        
    >>> parse_rule('21: 45 47 | 110 18')
    (21, [([45, 47], [110, 18])])
    >>> parse_rule('18: "b"')
    (18, ['b'])
    >>> parse_rule('87: 112 107')
    (87, [112, 107])
    >>> parse_rule('8: 42')
    (8, [42])
    """
    # Raw strings avoid invalid-escape warnings for \d and \s
    rule_number = int(re.match(r"\d+", line).group())
    # Remove rule number and : 
    line = re.sub(r"\d+:\s", "", line)

    # Rule is a character
    if line[0] == '"':
        return rule_number, Ordered(line[1])
    
    choices = line.count("|")
    if choices == 0: 
        # Match a specific ordering 
        return rule_number, Ordered( map(int, line.split(" ")) )
    elif choices == 1: 
        divider_index = line.index("|")
        sub_rule_1, sub_rule_2 = line[:divider_index - 1], line[divider_index + 2:]
        return rule_number, Ordered(( Choice(( parse_rule(sub_rule_1)[1], parse_rule(sub_rule_2)[1] )), ))
    else: 
        raise NotImplementedError("At most one | character in line.")
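The one-`|` limit is enough for this input, but the same encoding extends to any number of alternatives. The following is only a sketch of that idea: `parse_rule_any` is a hypothetical name, and it splits on `": "` and `"|"` rather than using the regex and index arithmetic above.

```python
Char = str
Choice = tuple
Ordered = list

def parse_rule_any(line: str):
    """Hypothetical variant of parse_rule that accepts any number of '|'."""
    number, body = line.split(": ", 1)

    # Rule is a single quoted character, e.g. '18: "b"'
    if body.startswith('"'):
        return int(number), Ordered([body[1]])

    # One Ordered list of rule numbers per alternative
    alternatives = [Ordered(map(int, alt.split())) for alt in body.split("|")]
    if len(alternatives) == 1:
        return int(number), alternatives[0]
    return int(number), Ordered([Choice(alternatives)])

print(parse_rule_any('21: 45 47 | 110 18'))  # (21, [([45, 47], [110, 18])])
print(parse_rule_any('1: 2 | 3 | 4'))        # (1, [([2], [3], [4])])
```

For lines with zero or one `|`, this returns the same values as the doctests in `parse_rule`.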

In [4]:
rules_arr, messages = data(19, sep="\n\n", parser=lambda g: g.split("\n"))
rules = dict(parse_rule(rule) for rule in rules_arr)

**Part 1:** How many messages *completely* match rule 0?

In [5]:
rules[0]

[8, 11]

In [6]:
def match(rule: Ordered, message: str) -> Optional[str]:
    """
    Match MESSAGE to RULE, and return remaining substring or None if the match fails. 
    
    To do this recursively, we need to return the remaining substring because we don't 
    know the length each rule might match beforehand. 
    
    One thing I'm confused about: multiple choices in a Choice might work, but this 
    algorithm uses the first one. What if we need a later one that matches? How can 
    we do this greedily?
    """
    if not rule: 
        return message 
    elif not message: 
        return None 
    elif isinstance(rule[0], Char):
        # If character matches, continue
        return match( Ordered(rule[1:]), message[1:] ) if message[0] == rule[0] else None
    elif isinstance(rule[0], int):
        # Lookup rule number to match 
        return match( Ordered(rules[rule[0]] + rule[1:]), message )
    elif isinstance(rule[0], Choice):
        # Try each choice until we get a match
        for choice in rule[0]:
            result = match( Ordered(choice + rule[1:]), message )
            if result is not None: 
                return result 

    return None 
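On the question raised in the docstring: because each choice is concatenated with the rest of the rule before recursing, `match` does move on to a later choice whenever the rest of the rule fails after an earlier one. What it does not do is retry when an earlier choice succeeds but leaves part of the message unconsumed. With a toy grammar such as `0: 1`, `1: 2 | 2 1`, `2: "a"`, `match` returns `'aa'` for the message `'aaa'` rather than `''`. Whether this matters depends on the shape of the rules, so it does not by itself say anything about the counts recorded here. One way to sidestep it, sketched below under the assumption that only complete matches are of interest, is to accept an empty rule only when the message is also empty. The names `matches_fully` and `toy_rules` are hypothetical, and the rules dictionary is passed in explicitly to keep the block self-contained.

```python
Char = str
Choice = tuple
Ordered = list

def matches_fully(rule, message, rules) -> bool:
    """Return True if RULE can consume exactly the whole MESSAGE."""
    if not rule:
        # Nothing left to match: succeed only if the message is used up too
        return message == ""
    if not message:
        return False
    first, rest = rule[0], rule[1:]
    if isinstance(first, Char):
        return message[0] == first and matches_fully(rest, message[1:], rules)
    if isinstance(first, int):
        # Splice the referenced rule in front of what remains
        return matches_fully(rules[first] + rest, message, rules)
    # Choice: a leftover suffix now counts as failure, so later choices get tried
    return any(matches_fully(choice + rest, message, rules) for choice in first)

# Hypothetical toy grammar: 0: 1,  1: 2 | 2 1,  2: "a"
toy_rules = {0: [1], 1: [([2], [2, 1])], 2: ["a"]}
print(matches_fully(toy_rules[0], "aaa", toy_rules))  # True
print(matches_fully(toy_rules[0], "ab", toy_rules))   # False
```

The trade-off is that this variant can no longer report the remaining substring, so it only answers the "completely match" question.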

In [7]:
sum(match(rules[0], message) == '' for message in messages)

120

**Part 2:** Replace rule `8: 42` with `8: 42 | 42 8`. Replace rule `11: 42 31` with `11: 42 31 | 42 11 31`. Now how many messages match rule 0?
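It may help to see what the two new rules expand to. Unrolling the recursion, rule 8 becomes one or more copies of rule 42, and rule 11 becomes some number of 42s followed by the same number of 31s. The sketch below lists the first few expansions as sequences of rule numbers; the function names are hypothetical and exist only for illustration.

```python
def rule_8_expansions(max_repeats: int):
    """First few expansions of 8: 42 | 42 8, as lists of rule numbers."""
    return [[42] * k for k in range(1, max_repeats + 1)]

def rule_11_expansions(max_depth: int):
    """First few expansions of 11: 42 31 | 42 11 31, as lists of rule numbers."""
    return [[42] * k + [31] * k for k in range(1, max_depth + 1)]

print(rule_8_expansions(3))   # [[42], [42, 42], [42, 42, 42]]
print(rule_11_expansions(2))  # [[42, 31], [42, 42, 31, 31]]
```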

In [8]:
rules[8] = Ordered(( Choice(( Ordered((42,)), Ordered((42, 8)) )), ))
rules[11] = Ordered(( Choice(( Ordered((42, 31)), Ordered((42, 11, 31)) )), ))

In [9]:
rules[8]

[([42], [42, 8])]

In [10]:
rules[11]

[([42, 31], [42, 11, 31])]

In [11]:
sum(match(rules[0], message) == '' for message in messages)

350

In [12]:
from doctest import testmod 
testmod()

TestResults(failed=0, attempted=4)