# Day 4: Passport Processing

https://adventofcode.com/2020/day/4

Passport data is validated in batch files (your puzzle input). Each passport is
represented as a sequence of `key:value` pairs separated by spaces or newlines.
Passports are separated by blank lines.

The expected fields are as follows:

- `byr` (Birth Year)
- `iyr` (Issue Year)
- `eyr` (Expiration Year)
- `hgt` (Height)
- `hcl` (Hair Color)
- `ecl` (Eye Color)
- `pid` (Passport ID)
- `cid` (Country ID)

Here is an example batch file containing four passports:

In [None]:
test_batch_file = """
ecl:gry pid:860033327 eyr:2020 hcl:#fffffd
byr:1937 iyr:2017 cid:147 hgt:183cm

iyr:2013 ecl:amb cid:350 eyr:2023 pid:028048884
hcl:#cfa07d byr:1929

hcl:#ae17e1 iyr:2013
eyr:2024
ecl:brn pid:760753108 byr:1931
hgt:179cm

hcl:#cfa07d eyr:2025 pid:166559648
iyr:2011 ecl:brn hgt:59in
"""

Count the number of **valid** passports - those that have all required fields.
Treat `cid` as optional. **In your batch file, how many passports are valid?**

For the example batch above, the expected answer is `2`.
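
To see why the example yields `2`, the sketch below lists which required fields each example passport lacks. It is only an illustration, not the solution used in this notebook: `EXAMPLE`, `REQUIRED`, and `missing_fields` are names made up for this sketch, and it assumes passports are separated by exactly one blank line.

```python
# Example batch from above, repeated so this sketch runs on its own.
EXAMPLE = """\
ecl:gry pid:860033327 eyr:2020 hcl:#fffffd
byr:1937 iyr:2017 cid:147 hgt:183cm

iyr:2013 ecl:amb cid:350 eyr:2023 pid:028048884
hcl:#cfa07d byr:1929

hcl:#ae17e1 iyr:2013
eyr:2024
ecl:brn pid:760753108 byr:1931
hgt:179cm

hcl:#cfa07d eyr:2025 pid:166559648
iyr:2011 ecl:brn hgt:59in
"""

# Every field except the optional `cid`.
REQUIRED = {'byr', 'iyr', 'eyr', 'hgt', 'hcl', 'ecl', 'pid'}

def missing_fields(record):
  # str.split() with no argument splits on any run of spaces or newlines.
  present = {pair.split(':')[0] for pair in record.split()}
  return sorted(REQUIRED - present)

for number, record in enumerate(EXAMPLE.strip().split('\n\n'), start=1):
  print(number, missing_fields(record))
```

Passports 1 and 3 come back with an empty list, passport 2 lacks `hgt`, and passport 4 lacks `byr`, so two of the four are valid.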

In [None]:
import doctest
import re

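def passports(batch_file):
  # Passports are separated by one or more blank lines.
  records = re.split(r'\n{2,}', batch_file)
  for record in records:
    # Within a passport, key:value pairs are separated by spaces or newlines.
    pairs = re.split(r'\s+', record)
    # Skip the empty strings re.split leaves at leading/trailing whitespace.
    fields = dict(pair.split(':') for pair in pairs if pair)
    yield fields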

def has_required_fields(passport):
  required_fields = {'byr', 'iyr', 'eyr', 'hgt', 'hcl', 'ecl', 'pid'}
  return set(passport.keys()) >= required_fields

def count_valid_passports(batch_file):
  """
  >>> count_valid_passports(test_batch_file)
  2
  """
  return sum(has_required_fields(passport) for passport
             in passports(batch_file))

doctest.testmod()
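
The split on `\n{2,}` above assumes Unix line endings and truly empty separator lines. If the input file might use Windows line endings (`\r\n`) or carry stray spaces on its blank lines, a more tolerant parser could look like the sketch below. `parse_batch` is a hypothetical name and is not used elsewhere in this notebook.

```python
import re

def parse_batch(batch_file):
  """Return a list of field dicts, one per passport (hypothetical helper)."""
  # A separator is a newline, optional whitespace (including '\r'), then
  # another newline, so '\n\n', '\r\n\r\n' and '\n  \n' all split records.
  records = re.split(r'\n\s*\n', batch_file.strip())
  # str.split() drops any leftover '\r' along with spaces and newlines.
  return [dict(pair.split(':', 1) for pair in record.split())
          for record in records if record]

print(parse_batch('byr:1937 iyr:2017\r\n\r\necl:brn\r\nhgt:59in\r\n'))
```

The example input uses `\r\n` throughout and still parses into two passports.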

# Part Two

You can continue to ignore the `cid` field, but every other field now has strict
rules about which values are valid:

- `byr` (Birth Year) - four digits; at least `1920` and at most `2002`.
- `iyr` (Issue Year) - four digits; at least `2010` and at most `2020`.
- `eyr` (Expiration Year) - four digits; at least `2020` and at most `2030`.
- `hgt` (Height) - a number followed by either `cm` or `in`:
    - If `cm`, the number must be at least `150` and at most `193`.
    - If `in`, the number must be at least `59` and at most `76`.
- `hcl` (Hair Color) - a `#` followed by exactly six characters `0`-`9` or
  `a`-`f`.
- `ecl` (Eye Color) - exactly one of: `amb` `blu` `brn` `gry` `grn` `hzl` `oth`.
- `pid` (Passport ID) - a nine-digit number, including leading zeroes.
- `cid` (Country ID) - ignored, missing or not.

Count the number of **valid** passports - those that have all required fields
**and valid values**. Continue to treat `cid` as optional. **In your batch file,
how many passports are valid?**

In [None]:
def valid_as_year(text, at_least, at_most):
  if re.fullmatch(r'\d{4}', text):
    return at_least <= int(text) <= at_most
  return False

def valid_as_height(text):
  match = re.fullmatch(r'(\d+)(cm|in)', text)
  if match:
    height, unit = int(match.group(1)), match.group(2)
    if unit == 'cm':
      return 150 <= height <= 193
    elif unit == 'in':
      return 59 <= height <= 76
  return False

def valid_hair_color(text):
  return bool(re.fullmatch(r'#[0-9a-f]{6}', text))

def valid_eye_color(text):
  eye_colors = {'amb', 'blu', 'brn', 'gry', 'grn', 'hzl', 'oth'}
  return text in eye_colors

def valid_passport_id(text):
  return bool(re.fullmatch(r'\d{9}', text))

def part_two(batch_file):
  """
  >>> part_two(test_batch_file)
  2
  """
  valid_passports = 0
  for passport in passports(batch_file):
    if not has_required_fields(passport): continue
    if not valid_as_year(passport['byr'], 1920, 2002): continue
    if not valid_as_year(passport['iyr'], 2010, 2020): continue
    if not valid_as_year(passport['eyr'], 2020, 2030): continue
    if not valid_as_height(passport['hgt']): continue
    if not valid_hair_color(passport['hcl']): continue
    if not valid_eye_color(passport['ecl']): continue
    if not valid_passport_id(passport['pid']): continue
    valid_passports += 1
  return valid_passports

doctest.testmod()
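
The chain of `if not ...: continue` checks can also be written as a table that maps each field to its rule. The sketch below is one possible restructuring under the Part Two rules listed above, not a replacement for the solution; `RULES`, `year_between`, `height_ok`, and `is_valid` are names invented for this example.

```python
import re

def year_between(at_least, at_most):
  # Build a checker for a four-digit year within an inclusive range.
  def check(text):
    return bool(re.fullmatch(r'\d{4}', text)) and at_least <= int(text) <= at_most
  return check

def height_ok(text):
  match = re.fullmatch(r'(\d+)(cm|in)', text)
  if match is None:
    return False
  at_least, at_most = {'cm': (150, 193), 'in': (59, 76)}[match.group(2)]
  return at_least <= int(match.group(1)) <= at_most

# One rule per required field; `cid` is deliberately absent.
RULES = {
  'byr': year_between(1920, 2002),
  'iyr': year_between(2010, 2020),
  'eyr': year_between(2020, 2030),
  'hgt': height_ok,
  'hcl': lambda text: bool(re.fullmatch(r'#[0-9a-f]{6}', text)),
  'ecl': lambda text: text in {'amb', 'blu', 'brn', 'gry', 'grn', 'hzl', 'oth'},
  'pid': lambda text: bool(re.fullmatch(r'\d{9}', text)),
}

def is_valid(passport):
  # A passport passes only if every rule's field is present and accepted.
  return all(field in passport and rule(passport[field])
             for field, rule in RULES.items())

passport = {'hcl': '#ae17e1', 'iyr': '2013', 'eyr': '2024', 'ecl': 'brn',
            'pid': '760753108', 'byr': '1931', 'hgt': '179cm'}
print(is_valid(passport))                           # True
print(is_valid({**passport, 'hgt': '179'}))         # False: no unit
print(is_valid({**passport, 'pid': '0760753108'}))  # False: ten digits
```

Keeping the rules in one dict means adding or changing a field touches a single entry, and the presence check and the value check happen in the same pass.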

# Running on real input

1. Use the file uploader below to upload your puzzle input as a `.txt` file.
2. Re-run the last cell to compute both answers from the uploaded file.

In [None]:
from IPython.display import display
import ipywidgets as widgets

uploader = widgets.FileUpload(accept='.txt', multiple=False)
display(uploader)

In [None]:
batch_file = list(uploader.value.values())[0]['content'].decode('utf-8')
print('[Part 1] valid passports:', count_valid_passports(batch_file))
print('[Part 2] valid passports:', part_two(batch_file))
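
The last cell assumes ipywidgets 7, where `uploader.value` is a dict keyed by filename. ipywidgets 8 changed `value` to a tuple of per-file dicts whose `content` is a `memoryview`, so the cell may fail there. The sketch below accepts either shape. `uploaded_text` is a hypothetical helper, not an ipywidgets function, and the version behaviour is worth checking against the release you have installed.

```python
def uploaded_text(value, encoding='utf-8'):
  """Decode the first uploaded file (hypothetical helper, not an ipywidgets API)."""
  # ipywidgets 7: {filename: {'content': bytes, ...}}
  # ipywidgets 8 (assumed): ({'name': ..., 'content': memoryview, ...},)
  files = list(value.values()) if isinstance(value, dict) else list(value)
  if not files:
    raise ValueError('no file uploaded yet')
  # bytes() accepts both bytes and memoryview objects.
  return bytes(files[0]['content']).decode(encoding)

# Stand-ins for the two value shapes, so the sketch runs without a widget.
print(uploaded_text({'input.txt': {'content': b'byr:1937'}}))
print(uploaded_text(({'name': 'input.txt', 'content': memoryview(b'byr:1937')},)))
```

With the widget in place, `batch_file = uploaded_text(uploader.value)` would stand in for the first line of the last cell.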