### Day 4: Passport Processing

You arrive at the airport only to realize that you grabbed your North Pole Credentials instead of your passport. While these documents are extremely similar, North Pole Credentials aren't issued by a country and therefore aren't actually valid documentation for travel in most of the world.

It seems like you're not the only one having problems, though; a very long line has formed for the automatic passport scanners, and the delay could upset your travel itinerary.

Due to some questionable network security, you realize you might be able to solve both of these problems at the same time.

The automatic passport scanners are slow because they're having trouble detecting which passports have all required fields. The expected fields are as follows:

    byr (Birth Year)
    iyr (Issue Year)
    eyr (Expiration Year)
    hgt (Height)
    hcl (Hair Color)
    ecl (Eye Color)
    pid (Passport ID)
    cid (Country ID)

Passport data is validated in batch files (your puzzle input). Each passport is represented as a sequence of key:value pairs separated by spaces or newlines. Passports are separated by blank lines.

Here is an example batch file containing four passports:

    ecl:gry pid:860033327 eyr:2020 hcl:#fffffd
    byr:1937 iyr:2017 cid:147 hgt:183cm

    iyr:2013 ecl:amb cid:350 eyr:2023 pid:028048884
    hcl:#cfa07d byr:1929

    hcl:#ae17e1 iyr:2013
    eyr:2024
    ecl:brn pid:760753108 byr:1931
    hgt:179cm

    hcl:#cfa07d eyr:2025 pid:166559648
    iyr:2011 ecl:brn hgt:59in
    
The first passport is valid - all eight fields are present. The second passport is invalid - it is missing hgt (the Height field).

The third passport is interesting; the only missing field is cid, so it looks like data from North Pole Credentials, not a passport at all! Surely, nobody would mind if you made the system temporarily ignore missing cid fields. Treat this "passport" as valid.

The fourth passport is missing two fields, cid and byr. Missing cid is fine, but missing any other field is not, so this passport is invalid.

According to the above rules, your improved system would report 2 valid passports.
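
As a minimal sketch of the rules above, the example batch can be parsed into one dictionary per passport and checked by key. The names `parse_batch`, `has_required_fields`, `REQUIRED_FIELDS` and `EXAMPLE_BATCH` are hypothetical and chosen only for this illustration.

```python
# Hypothetical names for illustration; the sample text is the example batch above.
REQUIRED_FIELDS = {"byr", "iyr", "eyr", "hgt", "hcl", "ecl", "pid"}  # cid is optional

EXAMPLE_BATCH = """\
ecl:gry pid:860033327 eyr:2020 hcl:#fffffd
byr:1937 iyr:2017 cid:147 hgt:183cm

iyr:2013 ecl:amb cid:350 eyr:2023 pid:028048884
hcl:#cfa07d byr:1929

hcl:#ae17e1 iyr:2013
eyr:2024
ecl:brn pid:760753108 byr:1931
hgt:179cm

hcl:#cfa07d eyr:2025 pid:166559648
iyr:2011 ecl:brn hgt:59in
"""


def parse_batch(text):
    """Split a batch file into one {key: value} dict per passport."""
    passports = []
    for block in text.strip().split("\n\n"):
        # str.split() with no argument splits on spaces and newlines alike.
        pairs = (item.split(":", 1) for item in block.split())
        passports.append(dict(pairs))
    return passports


def has_required_fields(passport):
    """True when every required key is present (cid may be missing)."""
    return REQUIRED_FIELDS.issubset(passport)


example_passports = parse_batch(EXAMPLE_BATCH)
print([has_required_fields(p) for p in example_passports])
# [True, False, True, False]
print(sum(has_required_fields(p) for p in example_passports))
# 2
```

Checking dictionary keys rather than searching the raw text means a field name that happens to appear inside a value is never counted as present.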


Count the number of valid passports - those that have all required fields. Treat cid as optional. In your batch file, how many passports are valid?

In [1]:
import re

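# Passports are separated by blank lines; strip the trailing newline first
# so the last passport does not end with a stray space.
with open("data/advent_04.txt", "r") as f:
    passports = f.read().strip().split('\n\n')

# Put each passport on a single line so its fields are space-separated.
passports = [passport.replace('\n', ' ') for passport in passports]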

In [2]:
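# cid is deliberately left out: it is optional.
mandatory_fields = ['byr', 'iyr', 'eyr', 'hgt', 'hcl', 'ecl', 'pid']

# Search for "key:" rather than the bare key so that text inside a value
# cannot be mistaken for a field name.
passports_validated = [all(field + ':' in p for field in mandatory_fields) for p in passports]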
print("Solution: The number of valid passports is {}.".format(sum(passports_validated)))

Solution: The number of valid passports is 213.


<br> 

### Part two

The line is moving more quickly now, but you overhear airport security talking about how passports with invalid data are getting through. Better add some data validation, quick!

You can continue to ignore the cid field, but each other field has strict rules about what values are valid for automatic validation:

    byr (Birth Year) - four digits; at least 1920 and at most 2002.
    iyr (Issue Year) - four digits; at least 2010 and at most 2020.
    eyr (Expiration Year) - four digits; at least 2020 and at most 2030.
    hgt (Height) - a number followed by either cm or in:
        If cm, the number must be at least 150 and at most 193.
        If in, the number must be at least 59 and at most 76.
    hcl (Hair Color) - a # followed by exactly six characters 0-9 or a-f.
    ecl (Eye Color) - exactly one of: amb blu brn gry grn hzl oth.
    pid (Passport ID) - a nine-digit number, including leading zeroes.
    cid (Country ID) - ignored, missing or not.

Your job is to count the passports where all required fields are both present and valid according to the above rules.

    byr valid:   2002
    byr invalid: 2003

    hgt valid:   60in
    hgt valid:   190cm
    hgt invalid: 190in
    hgt invalid: 190

    hcl valid:   #123abc
    hcl invalid: #123abz
    hcl invalid: 123abc

    ecl valid:   brn
    ecl invalid: wat

    pid valid:   000000001
    pid invalid: 0123456789
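
The rules and example values above can be exercised with a small table of per-field predicates. This is only a sketch: `FIELD_RULES`, `EXAMPLES` and the two helper functions are hypothetical names introduced for illustration.

```python
import re

# Hypothetical rule table: one predicate per required field, following the
# rules listed above. cid is ignored, so it has no entry.


def _year_between(low, high):
    """Build a check for a four-digit year within an inclusive range."""
    return lambda v: re.fullmatch(r"[0-9]{4}", v) is not None and low <= int(v) <= high


def _valid_height(v):
    """A number followed by cm or in; the unit selects the allowed range."""
    m = re.fullmatch(r"([0-9]+)(cm|in)", v)
    if m is None:
        return False
    number, unit = int(m.group(1)), m.group(2)
    low, high = (150, 193) if unit == "cm" else (59, 76)
    return low <= number <= high


FIELD_RULES = {
    "byr": _year_between(1920, 2002),
    "iyr": _year_between(2010, 2020),
    "eyr": _year_between(2020, 2030),
    "hgt": _valid_height,
    "hcl": lambda v: re.fullmatch(r"#[0-9a-f]{6}", v) is not None,
    "ecl": lambda v: v in {"amb", "blu", "brn", "gry", "grn", "hzl", "oth"},
    "pid": lambda v: re.fullmatch(r"[0-9]{9}", v) is not None,
}

# (field, value, expected) triples copied from the example values above.
EXAMPLES = [
    ("byr", "2002", True), ("byr", "2003", False),
    ("hgt", "60in", True), ("hgt", "190cm", True),
    ("hgt", "190in", False), ("hgt", "190", False),
    ("hcl", "#123abc", True), ("hcl", "#123abz", False), ("hcl", "123abc", False),
    ("ecl", "brn", True), ("ecl", "wat", False),
    ("pid", "000000001", True), ("pid", "0123456789", False),
]

for field, value, expected in EXAMPLES:
    assert FIELD_RULES[field](value) is expected, (field, value)
```

`re.fullmatch` is used so that a value only passes when the whole string fits the pattern, which is what rejects the ten-digit `pid` and the height with no unit.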
    
    
    
<br>
Count the number of valid passports - those that have all required fields and valid values. Continue to treat cid as optional. In your batch file, how many passports are valid?

In [3]:
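def is_passport_valid(passport):
    """Check the field values; assumes all mandatory keys are present."""
    # Every key is three characters long, so "key:value" splits at fixed offsets.
    passport = {field[:3]: field[4:] for field in passport.split()}

    # Years must be exactly four digits and fall inside an inclusive range.
    year_ranges = {'byr': (1920, 2002), 'iyr': (2010, 2020), 'eyr': (2020, 2030)}
    for key, (low, high) in year_ranges.items():
        if re.fullmatch(r'[0-9]{4}', passport[key]) is None:
            return False
        if not low <= int(passport[key]) <= high:
            return False

    # Height is a number followed by a unit; the unit selects the range.
    if re.fullmatch(r'[0-9]+(cm|in)', passport['hgt']) is None:
        return False
    height = int(passport['hgt'][:-2])
    if passport['hgt'].endswith('cm'):
        if not 150 <= height <= 193:
            return False
    elif not 59 <= height <= 76:
        return False

    # Hair color: '#' followed by exactly six hexadecimal characters.
    if re.fullmatch(r'#[0-9a-f]{6}', passport['hcl']) is None:
        return False

    if passport['ecl'] not in ('amb', 'blu', 'brn', 'gry', 'grn', 'hzl', 'oth'):
        return False

    # Passport ID: exactly nine digits, leading zeroes included.
    if re.fullmatch(r'[0-9]{9}', passport['pid']) is None:
        return False

    return True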

In [4]:
valid = 0
for passport in passports:
    # Check presence first: is_passport_valid raises KeyError on a missing key.
    has_mandatory_keys = all(field + ':' in passport for field in mandatory_fields)
    if has_mandatory_keys and is_passport_valid(passport):
        valid += 1

print("Solution: There are {} valid passports.".format(valid))

Solution: There are 147 valid passports.
