# Ticket Translation

--- Day 16: Ticket Translation ---
As you're walking to yet another connecting flight, you realize that one of the legs of your re-routed trip coming up is on a high-speed train. However, the train ticket you were given is in a language you don't understand. You should probably figure out what it says before you get to the train station after the next flight.

Unfortunately, you can't actually read the words on the ticket. You can, however, read the numbers, and so you figure out the fields these tickets must have and the valid ranges for values in those fields.

You collect the rules for ticket fields, the numbers on your ticket, and the numbers on other nearby tickets for the same train service (via the airport security cameras) together into a single document you can reference (your puzzle input).

The rules for ticket fields specify a list of fields that exist somewhere on the ticket and the valid ranges of values for each field. For example, a rule like class: 1-3 or 5-7 means that one of the fields in every ticket is named class and can be any value in the ranges 1-3 or 5-7 (inclusive, such that 3 and 5 are both valid in this field, but 4 is not).
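
As a minimal sketch of how such a rule line could be read, the snippet below splits it into a field name and a list of inclusive ranges. The helper names `parse_rule` and `is_valid` are illustrative assumptions and are not part of the puzzle or of the solution further down.

```python
def parse_rule(line):
    """Split 'class: 1-3 or 5-7' into ('class', [(1, 3), (5, 7)])."""
    name, spec = line.split(": ")
    ranges = []
    for part in spec.split(" or "):
        low, high = part.split("-")
        ranges.append((int(low), int(high)))
    return name, ranges


def is_valid(value, ranges):
    """Return True if value lies inside any of the inclusive ranges."""
    return any(low <= value <= high for low, high in ranges)


name, ranges = parse_rule("class: 1-3 or 5-7")
print(name, ranges)
# Both range ends are inclusive, so 3 and 5 pass while 4 does not
print([v for v in range(1, 8) if is_valid(v, ranges)])
```

Storing `(low, high)` pairs keeps the rule compact; the solution later in this notebook instead expands each rule into a set of every allowed number, which trades memory for simple membership tests.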

Each ticket is represented by a single line of comma-separated values. The values are the numbers on the ticket in the order they appear; every ticket has the same format. For example, consider this ticket:
```
.--------------------------------------------------------.
| ????: 101    ?????: 102   ??????????: 103     ???: 104 |
|                                                        |
| ??: 301  ??: 302             ???????: 303      ??????? |
| ??: 401  ??: 402           ???? ????: 403    ????????? |
'--------------------------------------------------------'
```
Here, ? represents text in a language you don't understand. This ticket might be represented as 101,102,103,104,301,302,303,401,402,403; of course, the actual train tickets you're looking at are much more complicated. In any case, you've extracted just the numbers in such a way that the first number is always the same specific field, the second number is always a different specific field, and so on - you just don't know what each position actually means!

Start by determining which tickets are completely invalid; these are tickets that contain values which aren't valid for any field. Ignore your ticket for now.

For example, suppose you have the following notes:
```
class: 1-3 or 5-7
row: 6-11 or 33-44
seat: 13-40 or 45-50

your ticket:
7,1,14

nearby tickets:
7,3,47
40,4,50
55,2,20
38,6,12
```
It doesn't matter which position corresponds to which field; you can identify invalid nearby tickets by considering only whether tickets contain values that are not valid for any field. In this example, the values on the first nearby ticket are all valid for at least one field. This is not true of the other three nearby tickets: the values 4, 55, and 12 are not valid for any field. Adding together all of the invalid values produces your ticket scanning error rate: 4 + 55 + 12 = 71.
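
The arithmetic in this example can be checked with a short sketch. The rules and nearby tickets are hard-coded from the notes above, and the helper name `fits_any_field` is an assumption chosen for illustration.

```python
# Rules and nearby tickets copied from the example notes
rules = {
    "class": [(1, 3), (5, 7)],
    "row": [(6, 11), (33, 44)],
    "seat": [(13, 40), (45, 50)],
}
nearby_tickets = [[7, 3, 47], [40, 4, 50], [55, 2, 20], [38, 6, 12]]


def fits_any_field(value):
    """Return True if value is inside at least one range of at least one rule."""
    return any(
        low <= value <= high
        for ranges in rules.values()
        for low, high in ranges
    )


# Collect every value that fits no field at all, then sum them
invalid_values = [
    value
    for ticket in nearby_tickets
    for value in ticket
    if not fits_any_field(value)
]
error_rate = sum(invalid_values)
print(invalid_values, error_rate)  # [4, 55, 12] 71
```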

Consider the validity of the nearby tickets you scanned. What is your ticket scanning error rate?

## Second part

Now that you've identified which tickets contain invalid values, discard those tickets entirely. Use the remaining valid tickets to determine which field is which.

Using the valid ranges for each field, determine what order the fields appear on the tickets. The order is consistent between all tickets: if seat is the third field, it is the third field on every ticket, including your ticket.

For example, suppose you have the following notes:
```
class: 0-1 or 4-19
row: 0-5 or 8-19
seat: 0-13 or 16-19

your ticket:
11,12,13

nearby tickets:
3,9,18
15,1,5
5,14,9
```
Based on the nearby tickets in the above example, the first position must be row, the second position must be class, and the third position must be seat; you can conclude that in your ticket, class is 12, row is 11, and seat is 13.
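
One way to reach that conclusion, sketched under the assumption that some field always has exactly one position left, is to compute the candidate positions for each field and then eliminate them one at a time. The names `satisfies`, `candidates`, and `assigned` are illustrative.

```python
# Rules and tickets copied from the example notes
rules = {
    "class": [(0, 1), (4, 19)],
    "row": [(0, 5), (8, 19)],
    "seat": [(0, 13), (16, 19)],
}
your_ticket = [11, 12, 13]
nearby_tickets = [[3, 9, 18], [15, 1, 5], [5, 14, 9]]


def satisfies(value, ranges):
    """Return True if value lies inside any of the inclusive ranges."""
    return any(low <= value <= high for low, high in ranges)


# columns[pos] holds the values found at position pos across all nearby tickets
columns = list(zip(*nearby_tickets))

# A position is a candidate for a field if every ticket's value there fits the rule
candidates = {
    name: {
        pos
        for pos, column in enumerate(columns)
        if all(satisfies(value, ranges) for value in column)
    }
    for name, ranges in rules.items()
}

assigned = {}
while candidates:
    # Pick a field that has exactly one candidate position left
    name = next((f for f, ps in candidates.items() if len(ps) == 1), None)
    if name is None:
        raise ValueError("no field has a unique candidate position")
    (pos,) = candidates.pop(name)
    assigned[name] = pos
    # That position is now taken, so remove it from every other field
    for positions in candidates.values():
        positions.discard(pos)

print(assigned)
print({name: your_ticket[pos] for name, pos in assigned.items()})
```

In this example `seat` is the only field with a single candidate at first (position 2). Removing that position leaves `class` with position 1, and then `row` with position 0.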

Once you work out which field is which, look for the six fields on your ticket that start with the word departure. What do you get if you multiply those six values together?
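
The final multiplication can be sketched as follows. The field-to-position mapping and the ticket values here are made up purely for illustration and do not come from any real puzzle input.

```python
from math import prod

# Hypothetical field positions and ticket values, for illustration only
field_positions = {"departure track": 0, "row": 1, "departure date": 2}
my_ticket = [7, 1, 14]

# Multiply the ticket values of every field whose name starts with 'departure'
answer = prod(
    my_ticket[pos]
    for name, pos in field_positions.items()
    if name.startswith("departure")
)
print(answer)  # 7 * 14 = 98
```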

In [1]:
import os
import time
import numpy as np
import re

In [2]:
DAY = 'Day_16'
FILE_END = '_01.txt'
RUN_WITH = 'input' ## admitted values: sample1 sample2 input 
SAMPLE_DATA = {1:'''class: 1-3 or 5-7
row: 6-11 or 33-44
seat: 13-40 or 45-50

your ticket:
7,1,14

nearby tickets:
7,3,47
40,4,50
55,2,20
38,6,12''', 2: '''class: 0-1 or 4-19
row: 0-5 or 8-19
seat: 0-13 or 16-19

your ticket:
11,12,13

nearby tickets:
3,9,18
15,1,5
5,14,9'''
              }


In [3]:
start = time.time()

In [4]:

input_path = os.path.join(os.path.dirname(os.path.dirname(os.path.dirname(os.getcwd()))), "Inputs")
input_path_day = os.path.join(input_path, DAY)
file_path = os.path.join(input_path_day, DAY+ FILE_END)

### Read input

In [5]:
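if RUN_WITH == 'input':
    with open(file_path, 'r') as f:
        values = f.read()
elif RUN_WITH.startswith('sample'):
    # The trailing digit selects the sample, e.g. 'sample2' -> SAMPLE_DATA[2]
    sample_id = int(RUN_WITH[-1:])
    values = SAMPLE_DATA[sample_id]
else:
    # Fail early: otherwise the split below raises a NameError on 'values'
    raise ValueError(f"No valid input selected: {RUN_WITH!r}")

# Drop blank lines; the 'your ticket' / 'nearby tickets' headers mark the sections
values = values.split('\n')
values = list(filter(None, values))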
values[0:5]

['departure location: 45-422 or 444-950',
 'departure station: 36-741 or 752-956',
 'departure platform: 46-788 or 806-967',
 'departure track: 46-57 or 70-950',
 'departure date: 35-99 or 108-974']

In [6]:
rules = []
my_ticket = []
other_tickets = []

section = 0
for line in values:
    if line.startswith('your') or line.startswith('nearby'):
        section +=1
        continue
    if section == 0:
        rules.append(line)
    if section == 1:
        my_ticket = list(map(int,line.split(',')))
    if section == 2:
        other_tickets.append(list(map(int,line.split(','))))

In [7]:
rules[0:2]

['departure location: 45-422 or 444-950',
 'departure station: 36-741 or 752-956']

In [8]:
rules_dict = {}

for rule in rules:
    # 'class: 1-3 or 5-7' -> name 'class', ranges '1-3 or 5-7'
    name, ranges = rule.split(': ')
    rules_dict[name] = set()
    for boundary in ranges.split(' or '):
        low, high = map(int, boundary.split('-'))
        # Ranges are inclusive on both ends
        rules_dict[name].update(range(low, high + 1))
# Every number that is valid for at least one field
all_valid_numbers = set.union(*rules_dict.values())

## Part I:

In [9]:
error_rate = 0
valid_tickets = set()
invalid_tickets = set()

In [10]:
for val, ticket in enumerate(other_tickets):
    for number in ticket:
        if number not in all_valid_numbers:
            invalid_tickets.add(val)
            error_rate += number
    if val not in invalid_tickets:
        valid_tickets.add(val)

In [11]:
error_rate

21071

## Part II:

In [12]:
# Using only valid tickets, find the potential positions for rules

In [13]:
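# ticket_pos[rule][position] counts the valid tickets whose value at that position fits the rule
ticket_pos = {k: np.zeros(len(rules_dict)) for k in rules_dict}

for ticket in valid_tickets:
    for position, number in enumerate(other_tickets[ticket]):
        # 'valid_numbers' avoids shadowing the input list 'values'
        for rule, valid_numbers in rules_dict.items():
            if number in valid_numbers:
                ticket_pos[rule][position] += 1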

In [14]:
n = len(valid_tickets)

In [15]:
# Solver: pick the rule with only one position valid for all, remove that position for all rules, iterate.

In [16]:
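rules_to_assign = list(rules_dict.keys())
rules_assigned = {}

while rules_to_assign:
    match_ = None
    for rule in rules_to_assign:
        # A position is still possible if all n valid tickets fit the rule there
        options = np.flatnonzero(ticket_pos[rule] == n)
        if len(options) == 1:
            match_, position = rule, int(options[0])
            break
    if match_ is None:
        # Without this guard the loop would fail on remove() with an unclear error
        raise ValueError("No rule with exactly one possible position")
    rules_assigned[match_] = position
    # The position is taken, so rule it out for every rule
    for rule in ticket_pos:
        ticket_pos[rule][position] = 0
    rules_to_assign.remove(match_)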

In [17]:
product = 1
for rule, pos in rules_assigned.items():
    if rule.startswith('depart'):
        product *= my_ticket[pos]

In [18]:
product

3429967441937

In [19]:
end = time.time()
end - start

0.17661404609680176