# Day 7

Seems to be a text parsing question, so I guess it's time to brush up on regular expressions.

## Part 1:
> An IP supports TLS if it has an Autonomous Bridge Bypass Annotation, or ABBA. An ABBA is any four-character sequence which consists of a pair of two different characters followed by the reverse of that pair, such as xyyx or abba. However, the IP also must not have an ABBA within any hypernet sequences, which are contained by square brackets.

In [4]:
from utils import *
import re

inp = read_input_as_list('day7', '')
head(inp)

['dnwtsgywerfamfv[gwrhdujbiowtcirq]bjbhmuxdcasenlctwgh', 'rnqfzoisbqxbdlkgfh[lwlybvcsiupwnsyiljz]kmbgyaptjcsvwcltrdx[ntrpwgkrfeljpye]jxjdlgtntpljxaojufe', 'jgltdnjfjsbrffzwbv[nclpjchuobdjfrpavcq]sbzanvbimpahadkk[yyoasqmddrzunoyyk]knfdltzlirrbypa', 'vvrchszuidkhtwx[ebqaetowcthddea]cxgxbffcoudllbtxsa']


In [5]:
re_abba = r'[a-z]*([a-z])(?!\1)([a-z])\2\1[a-z]*'
# this matches any number of chars, an ABBA pattern, then any number of chars;
# the (?!\1) lookahead makes sure the two letters in the pair are different

def supports_tls(ip):
    print(ip)
    ip1, hypernet, ip2 = re.split(r'\W+', ip)
    ip_valid = re.search(re_abba, ip1) or re.search(re_abba, ip2)
    hypernet_valid = not re.search(re_abba, hypernet)
    return ip_valid and hypernet_valid

assert supports_tls('abba[mnop]qrst')
assert not supports_tls('abcd[bddb]xyyx')
assert not supports_tls('aaaa[qwer]tyui')
assert supports_tls('ioxxoj[asdfgh]zxcvbn')

sum([1 for ip in inp if supports_tls(ip)])

abba[mnop]qrst
abcd[bddb]xyyx
aaaa[qwer]tyui
ioxxoj[asdfgh]zxcvbn
dnwtsgywerfamfv[gwrhdujbiowtcirq]bjbhmuxdcasenlctwgh
rnqfzoisbqxbdlkgfh[lwlybvcsiupwnsyiljz]kmbgyaptjcsvwcltrdx[ntrpwgkrfeljpye]jxjdlgtntpljxaojufe


ValueError: too many values to unpack (expected 3)

Turns out I misread the question: there can be any number of hypernet sequences in the IP address. Back to the drawing board.
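To see why the three-way unpack blows up, it helps to look at what the split actually returns for an address with two hypernet sequences. A quick sketch, using the second address from the input above (the variable names are just for illustration):

```python
import re

# second address from the puzzle input shown above: it has two hypernet sequences
ip = ('rnqfzoisbqxbdlkgfh[lwlybvcsiupwnsyiljz]kmbgyaptjcsvwcltrdx'
      '[ntrpwgkrfeljpye]jxjdlgtntpljxaojufe')

sections = re.split(r'\W+', ip)
print(len(sections))   # five pieces, so unpacking into three names fails
print(sections[::2])   # pieces outside the brackets
print(sections[1::2])  # pieces inside the brackets
```

As long as the brackets are balanced and never nested, the pieces alternate between outside and inside, which is what the slicing relies on.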

In [6]:
def supports_tls(ip):
    sections = re.split(r'\W+', ip)
    # this isn't an ideal solution, but I'm going to assume every second section is a hypernet
    ip_sections = sections[::2]
    hypernet_sections = sections[1::2]
    ip_valid = any(re.search(re_abba, ip_s) for ip_s in ip_sections)
    hypernet_valid = not any(re.search(re_abba, ht_s) for ht_s in hypernet_sections)
    return ip_valid and hypernet_valid

    
assert supports_tls('abba[mnop]qrst')
assert not supports_tls('abcd[bddb]xyyx')
assert not supports_tls('aaaa[qwer]tyui')
assert supports_tls('ioxxoj[asdfgh]zxcvbn')    

def part1(possible_ips):
    return sum(map(supports_tls, possible_ips))

part1(inp)

105
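The "every second section is a hypernet" assumption holds as long as the brackets are balanced and never nested. One way to avoid leaning on position is to pull the bracketed parts out explicitly. This is only a sketch: `split_ip`, `has_abba` and `supports_tls_explicit` are made-up names, and it assumes lowercase letters and well-formed brackets. The ABBA pattern here also drops the leading and trailing `[a-z]*`, since `search` already scans the whole string.

```python
import re

RE_ABBA = re.compile(r'([a-z])(?!\1)([a-z])\2\1')

def split_ip(ip):
    """Return (supernet_sections, hypernet_sections) for one address."""
    hypernets = re.findall(r'\[([a-z]*)\]', ip)
    # replace each bracketed chunk with a space so the outside parts stay separate
    supernets = re.sub(r'\[[a-z]*\]', ' ', ip).split()
    return supernets, hypernets

def has_abba(s):
    return RE_ABBA.search(s) is not None

def supports_tls_explicit(ip):
    supernets, hypernets = split_ip(ip)
    return any(map(has_abba, supernets)) and not any(map(has_abba, hypernets))

print(split_ip('ioxxoj[asdfgh]zxcvbn'))  # (['ioxxoj', 'zxcvbn'], ['asdfgh'])
```

It should agree with the slicing version on the four example addresses from the puzzle text.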

## Part 2:
Now there's a new thing to match, SSL:
> An IP supports SSL if it has an Area-Broadcast Accessor, or ABA, anywhere in the supernet sequences (outside any square bracketed sections), and a corresponding Byte Allocation Block, or BAB, anywhere in the hypernet sequences. An ABA is any three-character sequence which consists of the same character twice with a different character between them, such as xyx or aba. A corresponding BAB is the same characters but in reversed positions: yxy and bab, respectively.

For a minute I forgot that it's possible to have multiple hypernet sequences and thought this was easy. Basically we have to get all the possible ABA values, then check if any corresponding BAB values exist in the hypernet sequences. I had real trouble coming up with a regex I could use to get the ABA values to check. `search` only returns the first match, and when I used `findall` I kept getting just the letter `z`. What I ended up with was putting each character I wanted in its own group, then wrapping all of those groups in a lookahead so that overlapping ABAs are found too.

In [7]:
ex = 'zazbz'
re_aba = r'(?=([a-z])(?!\1)([a-z])(\1))'
re.findall(re_aba, ex)

[('z', 'a', 'z'), ('z', 'b', 'z')]
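A small sketch of why the extra groups and the lookahead matter. The two intermediate patterns (`one_group` and `three_groups`) are illustrative assumptions, not necessarily the ones tried originally. `findall` returns just the group's text when a pattern has exactly one group, and it only finds non-overlapping matches, so both problems show up on `zazbz`:

```python
import re

ex = 'zazbz'

# one capture group: findall returns only that group's text, and matches
# can't overlap, so the second ABA ('zbz') is never seen
one_group = r'([a-z])(?!\1)[a-z]\1'
print(re.findall(one_group, ex))     # ['z']

# grouping all three characters returns whole triples, but still no overlap
three_groups = r'([a-z])(?!\1)([a-z])(\1)'
print(re.findall(three_groups, ex))  # [('z', 'a', 'z')]

# wrapping it in a lookahead makes each match zero-width, so the scan
# moves on one character at a time and picks up overlapping ABAs
re_aba = r'(?=([a-z])(?!\1)([a-z])(\1))'
print(re.findall(re_aba, ex))        # [('z', 'a', 'z'), ('z', 'b', 'z')]
```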

In [18]:
def bab(aba):
    a, b, _ = aba
    return b + a + b

def contains_bab(bab_l, hypernet_s):
    return any(bab in hypernet_s for bab in bab_l)

def supports_ssl(ip):
    # we get the sections in the same way
    sections = re.split(r'\W+', ip)
    ip_sections = sections[::2]
    hypernet_sections = sections[1::2]
    # first get all the ABAs in the ip_sections
    aba_l = list(chain(*[re.findall(re_aba, ip_s) for ip_s in ip_sections]))
    # generate the BABs from the ABAs
    bab_l = list(map(bab, aba_l))
    # check if any of the BABs are in the hypernet_sections
    valid_hypernet = any(contains_bab(bab_l, hypernet_s) for hypernet_s in hypernet_sections)
    return valid_hypernet

assert supports_ssl('aba[bab]xyz')
assert not supports_ssl('xyx[xyx]xyx')
assert supports_ssl('aaa[kek]eke')
assert supports_ssl('zazbz[bzb]cdb')
assert supports_ssl('neakzsrjrhvixwp[ydbbvlckobfkgbandud]xdynfcpsooblftf[wzyquuvtwnjjrjbuhj]yxlpiloirianyrkzfqe')

def part2(possible_ips):
    return sum(map(supports_ssl, possible_ips))

part2(inp)

258
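A note on the `list(map(...))` in `supports_ssl`: in Python 3, `map` returns a one-shot iterator, so looping over it a second time yields nothing. A minimal sketch of that behaviour, using made-up values rather than anything from the puzzle input:

```python
babs = map(str.upper, ['bab', 'yxy'])  # a lazy iterator, not a list

first_pass = [b for b in babs]
second_pass = [b for b in babs]  # the iterator is already used up

print(first_pass)   # ['BAB', 'YXY']
print(second_pass)  # []
```

In `supports_ssl` the BAB candidates are checked once per hypernet section, so without the `list(...)` only the first section would be checked against the full set.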

That took longer than expected, mostly due to a bug with this line: `bab_l = list(map(bab, aba_l))`. I'd originally left out the list conversion, which gave me 