# Day 19

## Part I

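Today's puzzle is fairly hard. For Part I, a depth-first search can decide whether a string matches rule 0. It is implemented recursively here: every successful match returns the length of the matched string, so a simple single-character rule returns 1, and a failed match returns 0.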

Trying to enumerate every possible pattern instead would be very expensive in both time and space.
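
To make the cost of enumeration concrete, here is a minimal sketch of that alternative on a toy grammar. The `expand` function, the `ToyRule` alias and the `toy_rules` dictionary are hypothetical names used only for this illustration; they are not part of the solution that follows.

```python
from itertools import product
from typing import Dict, List, Set, Union

# Hypothetical representation: a rule is either a literal string
# or a list of alternatives, each a list of sub-rule ids.
ToyRule = Union[str, List[List[int]]]

def expand(rules: Dict[int, ToyRule], rule_id: int) -> Set[str]:
    """Return every string that the given rule can produce."""
    rule = rules[rule_id]
    if isinstance(rule, str):
        return {rule}
    results: Set[str] = set()
    for alternative in rule:
        # Expand each sub-rule, then take the Cartesian product of the parts
        parts = [expand(rules, sub_id) for sub_id in alternative]
        results.update(''.join(combo) for combo in product(*parts))
    return results

toy_rules: Dict[int, ToyRule] = {
    0: [[4, 1, 5]],
    1: [[2, 3], [3, 2]],
    2: [[4, 4], [5, 5]],
    3: [[4, 5], [5, 4]],
    4: 'a',
    5: 'b',
}
patterns = expand(toy_rules, 0)
print(len(patterns))
```

Even this six-rule grammar produces 8 strings for rule 0. The number of strings is multiplied along every sequence and summed over every alternative, so the full set grows quickly as the grammar gets deeper.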

In [1]:
from typing import Dict, List

class Rule(object):
    def __init__(self, rule_id: int):
        self.rule_id = rule_id
    
    def add_forward_list(self, forward: List[int]):
        if not hasattr(self, 'forwards'):
            self.forwards = []
        self.forwards.append(forward)
        
    def set_match(self, match: str):
        self.match = match
    
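    def matches(self, rules: Dict[int, 'Rule'], data: str) -> int:
        # Return the length of the matched prefix of data, or 0 if nothing matches.
        if hasattr(self, 'match'):
            # startswith also handles an empty data string, where data[0] would raise IndexError
            return int(data.startswith(self.match))
        for forward in self.forwards:
            index = 0
            for r in forward:
                ret = rules[r].matches(rules, data[index:])
                if ret == 0:
                    break
                index += ret
            else:
                # Every sub-rule in this alternative matched
                return index
        return 0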
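
The return-value convention can be seen in isolation with a small sketch. The function `first_match_len` and the two rule dictionaries below are hypothetical stand-ins that mirror the logic of `Rule.matches` on plain dictionaries; they are assumptions made for illustration only.

```python
from typing import Dict, List, Union

# Hypothetical representation: a literal string or a list of alternatives
ToyRule = Union[str, List[List[int]]]

def first_match_len(rules: Dict[int, ToyRule], rule_id: int, data: str) -> int:
    """Length of the prefix of data matched by the rule, or 0 if none."""
    rule = rules[rule_id]
    if isinstance(rule, str):
        return len(rule) if data.startswith(rule) else 0
    for alternative in rule:
        index = 0
        for sub_id in alternative:
            ret = first_match_len(rules, sub_id, data[index:])
            if ret == 0:
                break
            index += ret
        else:
            # All sub-rules matched: commit to this alternative
            return index
    return 0

toy_rules: Dict[int, ToyRule] = {
    0: [[4, 1, 5]],
    1: [[2, 3], [3, 2]],
    2: [[4, 4], [5, 5]],
    3: [[4, 5], [5, 4]],
    4: 'a',
    5: 'b',
}
print(first_match_len(toy_rules, 4, 'abc'))      # simple rule, match: 1
print(first_match_len(toy_rules, 4, 'b'))        # simple rule, no match: 0
print(first_match_len(toy_rules, 0, 'ababbb'))   # full match: 6
print(first_match_len(toy_rules, 0, 'aaaabbb'))  # prefix match only: 6 of 7

# Caveat: the search commits to the first alternative that matches.
caveat_rules: Dict[int, ToyRule] = {
    0: [[1, 4]],
    1: [[3], [3, 3]],
    3: 'a',
    4: 'b',
}
print(first_match_len(caveat_rules, 0, 'aab'))   # 0, although 'aab' is derivable
```

Because only one length is returned, a later alternative is never tried once an earlier one has matched. That is fine for grammars where the first matching alternative is always the right one, but it is not a general parser.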

Next is the input-reading function shared by both parts:

In [2]:
from typing import Tuple

def read_input(input_file: str) -> Tuple[Dict[int, Rule], List[str]]:
    with open(input_file) as fn:
        data = fn.read()
    part1, part2 = data.split('\n\n')
    rules = {}
    for line in part1.split('\n'):
        rule_id, rhs = line.rstrip().split(': ')
        rule_id = int(rule_id)
        rule = Rule(rule_id)
        if rhs[0] == '"':
            rule.set_match(rhs[1:-1])
        else:
            for forward in rhs.split(' | '):
                rule.add_forward_list([int(x) for x in forward.split(' ')])
        rules[rule_id] = rule
    return rules, [line for line in part2.split('\n') if line]
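
For reference, the file layout that the function expects is a block of rules, a blank line, then one message per line. The sketch below parses an inline sample the same way, but into plain dictionaries; the `sample` text and the names `parsed` and `messages` are assumptions chosen for illustration.

```python
# Hypothetical inline sample in the same layout as the puzzle input
sample = '0: 4 1 5\n1: 2 3 | 3 2\n4: "a"\n5: "b"\n\nababbb\nbababa\n'

rules_text, messages_text = sample.split('\n\n')

parsed = {}
for line in rules_text.split('\n'):
    rule_id, rhs = line.split(': ')
    if rhs.startswith('"'):
        # Simple rule: strip the surrounding quotes
        parsed[int(rule_id)] = rhs[1:-1]
    else:
        # Compound rule: alternatives separated by ' | ', ids separated by spaces
        parsed[int(rule_id)] = [[int(x) for x in alt.split(' ')]
                                for alt in rhs.split(' | ')]

# Drop the empty string left by the trailing newline
messages = [line for line in messages_text.split('\n') if line]
print(parsed)
print(messages)
```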

A quick unit test for the matches method above:

In [3]:
test_rules, test_data = read_input('testcase1.txt')
assert(test_rules[0].matches(test_rules, 'ababbb') == 6)

Then comes the Part I logic: count the strings that conform to rule 0:

In [4]:
def part1_solution(rules: Dict[int, Rule], data: List[str]) -> int:
    return sum(rules[0].matches(rules, line) == len(line) for line in data)

Unit test:

In [5]:
assert(part1_solution(test_rules, test_data) == 2)

The Part I result:

In [6]:
rules, data = read_input('input.txt')
part1_solution(rules, data)

299

## Part II

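Part II requires looking at the data. Rule 0 consists of rules 8 and 11. After the modification, rule 8 matches one or more repetitions of rule 42, and rule 11 matches some number of rule 42 followed by the same number of rule 31. A greedy strategy is therefore used: first match as many rule 42 as possible, say m of them, then match rule 31 against everything that remains, say n of them. The string conforms to rule 0 if m - n > 0, provided n is at least 1 and the whole string is consumed.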
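
The counting argument can be sketched on its own, separately from the rule classes. The sketch assumes that every string matched by rule 42 or rule 31 has the same fixed length `k` and that the two sets of strings are disjoint. The function `greedy_rule0` and the sets `set42` and `set31` are hypothetical names for this illustration, not the notebook's implementation.

```python
from typing import Set

def greedy_rule0(line: str, set42: Set[str], set31: Set[str], k: int) -> bool:
    """Greedy check for m chunks of rule 42 followed by n chunks of rule 31, m > n >= 1."""
    if len(line) % k != 0:
        return False
    chunks = [line[i:i + k] for i in range(0, len(line), k)]
    # Count as many leading rule-42 chunks as possible
    m = 0
    while m < len(chunks) and chunks[m] in set42:
        m += 1
    # Then count the rule-31 chunks that follow
    n = 0
    while m + n < len(chunks) and chunks[m + n] in set31:
        n += 1
    # All chunks must be consumed, with at least one 31 and more 42 than 31
    return n > 0 and m > n and m + n == len(chunks)

# Toy stand-ins: rule 42 matches only 'aa', rule 31 matches only 'bb'
set42, set31 = {'aa'}, {'bb'}
print(greedy_rule0('aaaabb', set42, set31, 2))    # m=2, n=1
print(greedy_rule0('aabb', set42, set31, 2))      # m=1, n=1
print(greedy_rule0('aaaabbbb', set42, set31, 2))  # m=2, n=2
```

The condition m > n follows from the grammar: rule 8 contributes at least one rule 42 on its own, and rule 11 contributes n of rule 42 followed by n of rule 31.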

In [7]:
class Rule(object):
    def __init__(self, rule_id: int):
        self.rule_id = rule_id
    
    def add_forward_list(self, forward: List[int]):
        if not hasattr(self, 'forwards'):
            self.forwards = []
        self.forwards.append(forward)
        
    def set_match(self, match: str):
        self.match = match
    
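    def matches(self, rules: Dict[int, 'Rule'], data: str) -> int:
        # Return the length of the matched prefix of data, or 0 if nothing matches.
        if hasattr(self, 'match'):
            # startswith also handles an empty data string, where data[0] would raise IndexError
            return int(data.startswith(self.match))
        for forward in self.forwards:
            index = 0
            for r in forward:
                ret = rules[r].matches(rules, data[index:])
                if ret == 0:
                    break
                index += ret
            else:
                # Every sub-rule in this alternative matched
                return index
        return 0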
    
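    def multiple_matches(self, rules: Dict[int, 'Rule'], data: str) -> Tuple[int, int]:
        # Count how many consecutive times this rule matches from the start of data.
        # Returns (count, length of one match); this assumes that every match
        # has the same length as the first one.
        match_len = self.matches(rules, data)
        if match_len == 0:
            return 0, 0
        if match_len >= len(data):
            # The single match already consumed the whole string
            return 1, match_len
        counter = 1
        index = match_len
        ret = self.matches(rules, data[index:])
        while ret > 0:
            counter += 1
            index += match_len
            if index >= len(data):
                break
            ret = self.matches(rules, data[index:])
        return counter, match_len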

A quick unit test for the multiple_matches method:

In [8]:
test_rules, test_data = read_input('testcase2.txt')
assert(test_rules[42].multiple_matches(test_rules, 'bbabbbbaabaabba') == (2, 5))

Define a helper function that decides whether a single line matches in Part II:

In [9]:
def part2_match(rules: Dict[int, Rule], line: str) -> bool:
    m, match_len = rules[42].multiple_matches(rules, line)
    if m == 0:
        return False
    
    # If rule 42 has already consumed the whole string, only the last segment needs to be checked against rule 31
    if m * match_len == len(line):
        n, _ = rules[31].multiple_matches(rules, line[-match_len:])
    else:
        n, _ = rules[31].multiple_matches(rules, line[m * match_len:])
    
    # Besides requiring at least one rule 31 match and m - n > 0, also check that the whole string was consumed
    return n > 0 and m - n > 0 and (m + n) * match_len == len(line)

A quick unit test for the part2_match function:

In [10]:
assert(not part2_match(test_rules, 'aaaabbaaaabbaaa'))

The final Part II logic: count the lines that conform to rule 0:

In [11]:
def part2_solution(rules: Dict[int, Rule], data: List[str]) -> int:
    return sum(part2_match(rules, line) for line in data)

Unit test:

In [12]:
assert(part2_solution(test_rules, test_data) == 12)

Finally, the Part II result:

In [13]:
part2_solution(*read_input('input.txt'))

414