# Day 16

[Ticket Translation](https://adventofcode.com/2020/day/16)

As you're walking to yet another connecting flight, you realize that one of the legs of your re-routed trip coming up is on a high-speed train. However, the train ticket you were given is in a language you don't understand. You should probably figure out what it says before you get to the train station after the next flight.

Unfortunately, you can't actually read the words on the ticket. You can, however, read the numbers, and so you figure out the fields these tickets must have and the valid ranges for values in those fields.

You collect the rules for ticket fields, the numbers on your ticket, and the numbers on other nearby tickets for the same train service (via the airport security cameras) together into a single document you can reference (your puzzle input).

The rules for ticket fields specify a list of fields that exist somewhere on the ticket and the valid ranges of values for each field. For example, a rule like class: 1-3 or 5-7 means that one of the fields in every ticket is named class and can be any value in the ranges 1-3 or 5-7 (inclusive, such that 3 and 5 are both valid in this field, but 4 is not).
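As a minimal sketch of the rule format described above, the snippet below splits one rule line into a field name and its inclusive ranges, then tests values against them. The helper names `parse_rule` and `is_valid` are hypothetical, introduced only for illustration; they are not part of the puzzle or of the solution further down.

```python
def parse_rule(line):
    """Split 'name: a-b or c-d' into (name, [(a, b), (c, d)])."""
    name, spec = line.split(": ")
    ranges = [tuple(int(n) for n in part.split("-")) for part in spec.split(" or ")]
    return name, ranges


def is_valid(value, ranges):
    """True if value lies inside at least one inclusive (low, high) range."""
    return any(low <= value <= high for low, high in ranges)


name, ranges = parse_rule("class: 1-3 or 5-7")
print(name, ranges)
# 3 and 5 are accepted, 4 falls in the gap between the two ranges
print([v for v in range(1, 8) if is_valid(v, ranges)])
```

Both bounds are treated as inclusive, which matches the rule text: 3 and 5 pass, 4 does not.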

Each ticket is represented by a single line of comma-separated values. The values are the numbers on the ticket in the order they appear; every ticket has the same format. For example, consider this ticket:

.--------------------------------------------------------.
| ????: 101    ?????: 102   ??????????: 103     ???: 104 |
|                                                        |
| ??: 301  ??: 302             ???????: 303      ??????? |
| ??: 401  ??: 402           ???? ????: 403    ????????? |
'--------------------------------------------------------'

Here, ? represents text in a language you don't understand. This ticket might be represented as 101,102,103,104,301,302,303,401,402,403; of course, the actual train tickets you're looking at are much more complicated. In any case, you've extracted just the numbers in such a way that the first number is always the same specific field, the second number is always a different specific field, and so on - you just don't know what each position actually means!

Start by determining which tickets are completely invalid; these are tickets that contain values which aren't valid for any field. Ignore your ticket for now.

For example, suppose you have the following notes:

class: 1-3 or 5-7
row: 6-11 or 33-44
seat: 13-40 or 45-50

your ticket:
7,1,14

nearby tickets:
7,3,47
40,4,50
55,2,20
38,6,12

It doesn't matter which position corresponds to which field; you can identify invalid nearby tickets by considering only whether tickets contain values that are not valid for any field. In this example, the values on the first nearby ticket are all valid for at least one field. This is not true of the other three nearby tickets: the values 4, 55, and 12 are not valid for any field. Adding together all of the invalid values produces your ticket scanning error rate: 4 + 55 + 12 = 71.
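The arithmetic in this example can be checked with a short sketch. The rules and nearby tickets are copied from the notes above; the names `example_rules`, `example_nearby`, and `fits_any_field` are assumptions made for this illustration and do not appear in the solution cells.

```python
# Rules and nearby tickets from the worked example, typed in by hand
example_rules = {
    "class": [(1, 3), (5, 7)],
    "row": [(6, 11), (33, 44)],
    "seat": [(13, 40), (45, 50)],
}
example_nearby = [[7, 3, 47], [40, 4, 50], [55, 2, 20], [38, 6, 12]]


def fits_any_field(value):
    """True if the value is inside some range of some field."""
    return any(
        low <= value <= high
        for ranges in example_rules.values()
        for low, high in ranges
    )


# Collect every value that no field accepts, then add them up
invalid_values = [v for ticket in example_nearby for v in ticket if not fits_any_field(v)]
example_error_rate = sum(invalid_values)
print(invalid_values, example_error_rate)  # [4, 55, 12] 71
```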

Consider the validity of the nearby tickets you scanned. What is your ticket scanning error rate?

In [1]:
data_input = open('input/day 16.txt').read().splitlines()

first_break = data_input.index('')
second_break = data_input.index('', first_break + 1)

rules = data_input[:first_break]
my_ticket = data_input[first_break + 2]
tickets = data_input[second_break + 2:]

In [2]:
rules

['departure location: 29-917 or 943-952',
 'departure station: 50-875 or 884-954',
 'departure platform: 41-493 or 503-949',
 'departure track: 50-867 or 875-966',
 'departure date: 30-655 or 679-956',
 'departure time: 46-147 or 153-958',
 'arrival location: 50-329 or 344-968',
 'arrival station: 42-614 or 623-949',
 'arrival platform: 35-849 or 860-973',
 'arrival track: 42-202 or 214-959',
 'class: 38-317 or 329-968',
 'duration: 44-530 or 539-953',
 'price: 28-713 or 727-957',
 'route: 30-157 or 179-966',
 'row: 38-114 or 136-969',
 'seat: 45-441 or 465-956',
 'train: 44-799 or 824-951',
 'type: 41-411 or 437-953',
 'wagon: 39-79 or 86-969',
 'zone: 48-306 or 317-974']

In [3]:
# collect every value that is valid for at least one field

import re

ticket_numbers = set()
extract_ranges = re.compile(r': (\d+-\d+) or (\d+-\d+)').findall
for rule in rules:
    # findall returns one tuple holding both "low-high" groups
    for group in extract_ranges(rule)[0]:
        lower, higher = group.split('-')
        # ranges are inclusive, hence the + 1
        ticket_numbers.update(range(int(lower), int(higher) + 1))

In [4]:
# sum every nearby-ticket value that is not valid for any field

all_values = [int(x) for x in ','.join(tickets).split(',')]
error_rate = sum(v for v in all_values if v not in ticket_numbers)
error_rate

23925

## Part 2

Now that you've identified which tickets contain invalid values, discard those tickets entirely. Use the remaining valid tickets to determine which field is which.

Using the valid ranges for each field, determine what order the fields appear on the tickets. The order is consistent between all tickets: if seat is the third field, it is the third field on every ticket, including your ticket.

For example, suppose you have the following notes:

class: 0-1 or 4-19
row: 0-5 or 8-19
seat: 0-13 or 16-19

your ticket:
11,12,13

nearby tickets:
3,9,18
15,1,5
5,14,9

Based on the nearby tickets in the above example, the first position must be row, the second position must be class, and the third position must be seat; you can conclude that in your ticket, class is 12, row is 11, and seat is 13.
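One way to reach that conclusion is elimination: compute, for each position, the fields whose ranges accept every value seen there, then repeatedly fix any position that has a single candidate left. The sketch below applies this to the example only. All names in it are hypothetical, and it assumes that at each step some position has exactly one remaining candidate, which holds for this example but is not guaranteed for arbitrary input.

```python
# Rules and tickets from the Part 2 example, typed in by hand
example_rules = {
    "class": [(0, 1), (4, 19)],
    "row": [(0, 5), (8, 19)],
    "seat": [(0, 13), (16, 19)],
}
example_tickets = [[3, 9, 18], [15, 1, 5], [5, 14, 9]]
example_my_ticket = [11, 12, 13]


def fits(value, ranges):
    """True if value lies inside at least one inclusive range."""
    return any(low <= value <= high for low, high in ranges)


# For each position, the fields that accept every value seen in that position
candidates = {
    pos: {
        name
        for name, ranges in example_rules.items()
        if all(fits(ticket[pos], ranges) for ticket in example_tickets)
    }
    for pos in range(len(example_my_ticket))
}

# Fix positions one at a time; assumes a single-candidate position always exists
assigned = {}
while candidates:
    pos = next(p for p, names in candidates.items() if len(names) == 1)
    name = candidates.pop(pos).pop()
    assigned[pos] = name
    for names in candidates.values():
        names.discard(name)

decoded = {assigned[pos]: value for pos, value in enumerate(example_my_ticket)}
print(decoded)  # {'row': 11, 'class': 12, 'seat': 13}
```

Position 0 admits only `row`, which leaves `class` as the sole option for position 1 and then `seat` for position 2.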

Once you work out which field is which, look for the six fields on your ticket that start with the word departure. What do you get if you multiply those six values together?

In [5]:
# keep only the tickets whose values are all valid for at least one field
parsed_tickets = [[int(x) for x in nums.split(',')] for nums in tickets]
valid_tickets = [t for t in parsed_tickets if all(v in ticket_numbers for v in t)]

In [6]:
# map each number to the set of fields it is valid for
import collections
fields = collections.defaultdict(set)
field_list = []
extraction = re.compile(r'(.*): (\d*)-(\d*) or (\d*)-(\d*)').findall
for r in rules:
    fld, l1, h1, l2, h2 = extraction(r)[0]
    field_list.append(fld)
    for n in range(int(l1), int(h1) + 1):
        fields[n].add(fld)
    for n in range(int(l2), int(h2) + 1):
        fields[n].add(fld)

In [12]:
known_columns = {}
n_fields = len(field_list)
while True:
    total = 0  # candidates still open across unresolved columns
    for n in range(n_fields):
        # fields allowed by each value in column n (complete tickets only)
        column_data = [fields[t[n]] for t in valid_tickets if len(t) == n_fields]
        candidates = set(field_list)
        for flds in column_data:
            candidates &= flds
        # drop fields that are already assigned to a column
        for v in known_columns.values():
            candidates.discard(v)
        if len(candidates) == 1:
            known_columns[n] = candidates.pop()
        else:
            total += len(candidates)
    if total < 1:
        break
    else:
        print('To go:', total)
known_columns = {n: known_columns[n] for n in sorted(known_columns)}
print(known_columns)

To go: 187
To go: 161
To go: 126
To go: 105
To go: 87
To go: 44
To go: 23
To go: 13
To go: 5
{0: 'duration', 1: 'departure track', 2: 'departure station', 3: 'arrival platform', 4: 'zone', 5: 'departure location', 6: 'class', 7: 'train', 8: 'arrival track', 9: 'route', 10: 'price', 11: 'departure time', 12: 'row', 13: 'arrival location', 14: 'wagon', 15: 'departure platform', 16: 'type', 17: 'arrival station', 18: 'seat', 19: 'departure date'}


In [17]:
nums = [int(my_ticket.split(',')[i]) for i in range(20) if known_columns[i].startswith('departure')]
nums

[89, 73, 103, 67, 157, 137]

In [18]:
import math
math.prod(nums)

964373157673