# --- Day 1: Report Repair ---
## Key words: Math calculation, reverse calculation
After saving Christmas five years in a row, you've decided to take a vacation at a nice resort on a tropical island. Surely, Christmas will go on without you.

The tropical island has its own currency and is entirely cash-only. The gold coins used there have a little picture of a starfish; the locals just call them stars. None of the currency exchanges seem to have heard of them, but somehow, you'll need to find fifty of these coins by the time you arrive so you can pay the deposit on your room.

To save your vacation, you need to get all fifty stars by December 25th.

Collect stars by solving puzzles. Two puzzles will be made available on each day in the Advent calendar; the second puzzle is unlocked when you complete the first. Each puzzle grants one star. Good luck!

Before you leave, the Elves in accounting just need you to fix your expense report (your puzzle input); apparently, something isn't quite adding up.

Specifically, they need you to find the two entries that sum to 2020 and then multiply those two numbers together.

For example, suppose your expense report contained the following:

1721<br>
979<br>
366<br>
299<br>
675<br>
1456<br>
In this list, the two entries that sum to 2020 are 1721 and 299. Multiplying them together produces 1721 * 299 = 514579, so the correct answer is 514579.

Of course, your expense report is much larger. Find the two entries that sum to 2020; what do you get if you multiply them together?

Your puzzle answer was 838624.

In [174]:
import pandas as pd

In [181]:
df1 = pd.read_excel('1.xlsx', header=None)
df1.head()

Unnamed: 0,0
0,2008
1,1529
2,1594
3,1422
4,1518


In [176]:
list = df1[0].tolist()

In [177]:
for a in range(len(list)):
    for b in range(len(list) - a - 1):
        if list[a] + list[b + a + 1] == 2020:
            print('Answer: ' + str(list[a] * list[b + a + 1]))

Answer: 838624
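The nested loop above compares every pair of entries. As a possible alternative, a single pass that remembers the values already seen finds the same pair. This is only a sketch run on the example report from the puzzle text; `find_pair_product` and `example` are names made up for illustration and are not part of the notebook.

```python
def find_pair_product(entries, target=2020):
    """Return the product of the two entries that sum to target, or None."""
    seen = set()
    for value in entries:
        complement = target - value
        # If the complement was seen earlier, the pair is complete
        if complement in seen:
            return value * complement
        seen.add(value)
    return None

# Example report from the puzzle description (not the real puzzle input)
example = [1721, 979, 366, 299, 675, 1456]
print('Answer: ' + str(find_pair_product(example)))  # 514579 for the example
```

The set lookup replaces the inner loop, so the report is scanned once instead of once per entry.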


____
# --- Part Two ---
The Elves in accounting are thankful for your help; one of them even offers you a starfish coin they had left over from a past vacation. They offer you a second one if you can find three numbers in your expense report that meet the same criteria.

Using the above example again, the three entries that sum to 2020 are 979, 366, and 675. Multiplying them together produces the answer, 241861950.

In your expense report, what is the product of the three entries that sum to 2020?

Your puzzle answer was 52764180.

Both parts of this puzzle are complete! They provide two gold stars: **

In [178]:
for a in range(len(list)):
    for b in range(len(list) - a - 1):
        for c in range(len(list) - a - b - 2):
            if list[a] + list[b + a + 1] + list[a + b + c + 2] == 2020:
                print ('Answer: ' + str(list[a] * list[b + a + 1] * list[a + b + c + 2]))                 

Answer: 52764180
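The same search can be written for any group size with `itertools.combinations`, which yields each group of entries exactly once. The sketch below assumes Python 3.8+ for `math.prod`; `find_product` and `example` are illustrative names, and the data is the example report from the puzzle text.

```python
from itertools import combinations
from math import prod

def find_product(entries, count, target=2020):
    """Return the product of the first `count` entries that sum to target."""
    for combo in combinations(entries, count):
        if sum(combo) == target:
            return prod(combo)
    return None

# Example report from the puzzle description (not the real puzzle input)
example = [1721, 979, 366, 299, 675, 1456]
print('Answer: ' + str(find_product(example, 3)))  # 241861950 for the example
```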


# --- Day 2: Password Philosophy ---
## Key words: Text analysis, text validation
Your flight departs in a few days from the coastal airport; the easiest way down to the coast from here is via toboggan.

The shopkeeper at the North Pole Toboggan Rental Shop is having a bad day. "Something's wrong with our computers; we can't log in!" You ask if you can take a look.

Their password database seems to be a little corrupted: some of the passwords wouldn't have been allowed by the Official Toboggan Corporate Policy that was in effect when they were chosen.

To try to debug the problem, they have created a list (your puzzle input) of passwords (according to the corrupted database) and the corporate policy when that password was set.

For example, suppose you have the following list:

1-3 a: abcde <br>
1-3 b: cdefg <br>
2-9 c: ccccccccc <br>
Each line gives the password policy and then the password. The password policy indicates the lowest and highest number of times a given letter must appear for the password to be valid. For example, 1-3 a means that the password must contain a at least 1 time and at most 3 times.

In the above example, 2 passwords are valid. The middle password, cdefg, is not; it contains no instances of b, but needs at least 1. The first and third passwords are valid: they contain one a or nine c, both within the limits of their respective policies.

How many passwords are valid according to their policies?

Your puzzle answer was 548.

In [None]:
import pandas as pd

In [200]:
df2 = pd.read_excel('2.xlsx', names = ['Input'])

In [190]:
df2.head()

Unnamed: 0,Input
0,1-14 b: bbbbbbbbbbbbbbbbbbb
1,3-14 v: vvpvvvmvvvvvvvv
2,2-5 m: mfvxmmm
3,15-20 z: zdzzzrjzzzdpzzdzzzzz
4,6-8 g: tggjggggrg


In [201]:
df2[['From', 'Pass']] = df2['Input'].str.split('-', expand = True)
df2[['To', 'Pass']] = df2['Pass'].str.split(n = 1, expand = True)
df2[['Letter', 'Pass']] = df2['Pass'].str.split(':', n = 1, expand = True)
df2.drop('Input', axis =1, inplace = True)
df2 = df2[['From', 'To', 'Letter', 'Pass']]

In [202]:
df2.head()

Unnamed: 0,From,To,Letter,Pass
0,1,14,b,bbbbbbbbbbbbbbbbbbb
1,3,14,v,vvpvvvmvvvvvvvv
2,2,5,m,mfvxmmm
3,15,20,z,zdzzzrjzzzdpzzdzzzzz
4,6,8,g,tggjggggrg


In [203]:
df2['To'] = pd.to_numeric(df2['To'])
df2['From'] = pd.to_numeric(df2['From'])

In [206]:
df2['Count'] = df2.apply(lambda x: x['Pass'].count(x['Letter']), axis = 1)

In [207]:
df2['Correct'] = df2.apply(lambda x: x['From'] <= x['Count'] <= x['To'], axis = 1)

In [209]:
df2.head()

Unnamed: 0,From,To,Letter,Pass,Count,Correct
0,1,14,b,bbbbbbbbbbbbbbbbbbb,19,False
1,3,14,v,vvpvvvmvvvvvvvv,13,True
2,2,5,m,mfvxmmm,4,True
3,15,20,z,zdzzzrjzzzdpzzdzzzzz,14,False
4,6,8,g,tggjggggrg,7,True


In [213]:
print('Answer: ' + str(df2['Correct'].sum()))

Answer: 548
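For comparison, the policy lines can also be parsed without a DataFrame. The sketch below uses one regular expression per line and assumes every line has the `low-high letter: password` shape shown in the puzzle text; `POLICY`, `is_valid_count` and `example` are illustrative names, not part of the notebook.

```python
import re

# low-high, a single letter, a colon, then the password
POLICY = re.compile(r'(\d+)-(\d+) (\w): (\w+)')

def is_valid_count(line):
    """True if the letter occurs between low and high times in the password."""
    low, high, letter, password = POLICY.fullmatch(line.strip()).groups()
    return int(low) <= password.count(letter) <= int(high)

# Example list from the puzzle description (not the real puzzle input)
example = ['1-3 a: abcde', '1-3 b: cdefg', '2-9 c: ccccccccc']
print('Answer: ' + str(sum(is_valid_count(line) for line in example)))  # 2 for the example
```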


____
# --- Part Two ---
While it appears you validated the passwords correctly, they don't seem to be what the Official Toboggan Corporate Authentication System is expecting.

The shopkeeper suddenly realizes that he just accidentally explained the password policy rules from his old job at the sled rental place down the street! The Official Toboggan Corporate Policy actually works a little differently.

Each policy actually describes two positions in the password, where 1 means the first character, 2 means the second character, and so on. (Be careful; Toboggan Corporate Policies have no concept of "index zero"!) Exactly one of these positions must contain the given letter. Other occurrences of the letter are irrelevant for the purposes of policy enforcement.

Given the same example list from above:

1-3 a: abcde is valid: position 1 contains a and position 3 does not.<br>
1-3 b: cdefg is invalid: neither position 1 nor position 3 contains b.<br>
2-9 c: ccccccccc is invalid: both position 2 and position 9 contain c.<br>
How many passwords are valid according to the new interpretation of the policies?<br>

Your puzzle answer was 502.

Both parts of this puzzle are complete! They provide two gold stars: **

In [217]:
# 'Pass' still carries the leading space left by the ':' split above,
# so the 1-based positions from the policy index it directly.
# Exactly one position may hold the letter, hence the != (exclusive or).
df2['Correct_2'] = df2.apply(lambda x:
        (x['Pass'][x['From']] == x['Letter']) != (x['Pass'][x['To']] == x['Letter']),
        axis = 1)

In [220]:
print('Answer: ' + str(df2['Correct_2'].sum()))

Answer: 502
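The "exactly one position" rule is an exclusive or of two comparisons. A minimal sketch on the puzzle's example lines follows; it assumes clean `low-high letter: password` lines with no leading space in the password, so the 1-based positions are shifted by one. `is_valid_position` and `example` are illustrative names.

```python
import re

def is_valid_position(line):
    """True if exactly one of the two 1-based positions holds the letter."""
    first, second, letter, password = re.fullmatch(
        r'(\d+)-(\d+) (\w): (\w+)', line.strip()).groups()
    # Convert the 1-based policy positions to Python's 0-based indices
    at_first = password[int(first) - 1] == letter
    at_second = password[int(second) - 1] == letter
    return at_first != at_second

# Example list from the puzzle description (not the real puzzle input)
example = ['1-3 a: abcde', '1-3 b: cdefg', '2-9 c: ccccccccc']
print('Answer: ' + str(sum(is_valid_position(line) for line in example)))  # 1 for the example
```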


____

# --- Day 4: Passport Processing ---
## Key words: Unstructured data, data validation, regex validation
You arrive at the airport only to realize that you grabbed your North Pole Credentials instead of your passport. While these documents are extremely similar, North Pole Credentials aren't issued by a country and therefore aren't actually valid documentation for travel in most of the world.

It seems like you're not the only one having problems, though; a very long line has formed for the automatic passport scanners, and the delay could upset your travel itinerary.

Due to some questionable network security, you realize you might be able to solve both of these problems at the same time.

The automatic passport scanners are slow because they're having trouble detecting which passports have all required fields. The expected fields are as follows:

byr (Birth Year) <br>
iyr (Issue Year)<br>
eyr (Expiration Year)<br>
hgt (Height)<br>
hcl (Hair Color)<br>
ecl (Eye Color)<br>
pid (Passport ID)<br>
cid (Country ID)<br>
Passport data is validated in batch files (your puzzle input). Each passport is represented as a sequence of key:value pairs separated by spaces or newlines. Passports are separated by blank lines.

Here is an example batch file containing four passports:

ecl:gry pid:860033327 eyr:2020 hcl:#fffffd<br>
byr:1937 iyr:2017 cid:147 hgt:183cm<br>

iyr:2013 ecl:amb cid:350 eyr:2023 pid:028048884<br>
hcl:#cfa07d byr:1929

hcl:#ae17e1 iyr:2013<br>
eyr:2024<br>
ecl:brn pid:760753108 byr:1931<br>
hgt:179cm<br>

hcl:#cfa07d eyr:2025 pid:166559648<br>
iyr:2011 ecl:brn hgt:59in<br><br>
The first passport is valid - all eight fields are present. The second passport is invalid - it is missing hgt (the Height field).

The third passport is interesting; the only missing field is cid, so it looks like data from North Pole Credentials, not a passport at all! Surely, nobody would mind if you made the system temporarily ignore missing cid fields. Treat this "passport" as valid.

The fourth passport is missing two fields, cid and byr. Missing cid is fine, but missing any other field is not, so this passport is invalid.

According to the above rules, your improved system would report 2 valid passports.

Count the number of valid passports - those that have all required fields. Treat cid as optional. In your batch file, how many passports are valid?

Your puzzle answer was 254.

In [231]:
df4 = pd.read_excel('4.xlsx', skip_blank_lines = False, names = ['Input'])

In [232]:
df4.head()

Unnamed: 0,Input
0,iyr:2010 ecl:gry hgt:181cm
1,pid:591597745 byr:1920 hcl:#6b5442 eyr:2029 ci...
2,
3,cid:223 byr:1927
4,hgt:177cm hcl:#602927 iyr:2016 pid:404183620


In [233]:
df4['Pass'] = ''

In [234]:
# A blank row (NaN) separates passports, so bump the passport number each time one is seen
a = 1
for i in range(len(df4['Input'])):
    if pd.isna(df4['Input'][i]):
        a = a + 1
    df4.loc[i, 'Pass'] = a
        

In [235]:
df4_2 = df4.set_index('Pass')['Input'].str.split().apply(pd.Series).stack().reset_index()
df4_2.drop(labels = 'level_1', axis = 1, inplace = True)
df4_2.columns = ['Pass', 'Data']
df4_2[['Metric', 'Value']] = df4_2['Data'].str.split(':', expand = True)
df4_2.drop(['Data'], axis = 1, inplace = True)

In [237]:
df4_pivot = pd.pivot(df4_2, values = 'Value', index = 'Pass', columns = 'Metric')
df4_pivot.head()

Pass,byr,cid,ecl,eyr,hcl,hgt,iyr,pid
1,1920.0,123.0,gry,2029,#6b5442,181cm,2010,591597745
2,1927.0,223.0,amb,2020,#602927,177cm,2016,404183620
3,1998.0,178.0,hzl,2030,#a97842,166cm,2014,594143498
4,,,hzl,2024,#de745c,157cm,2018,795349208
5,1978.0,117.0,hzl,2025,#18171d,159cm,2018,364060467


In [238]:
print('Answer: ' + str(df4_pivot.drop('cid', axis = 1).dropna().shape[0]))

Answer: 254
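The presence check can also be sketched with plain dictionaries: split the batch on blank lines, turn each `key:value` pair into a dictionary entry, and test that the seven required keys are there. The names `REQUIRED`, `parse_passports`, `has_required_fields` and `batch` are made up for this illustration, and the data is the example batch from the puzzle text.

```python
# Every field except the optional cid
REQUIRED = {'byr', 'iyr', 'eyr', 'hgt', 'hcl', 'ecl', 'pid'}

def parse_passports(batch):
    """Split a batch on blank lines and return one dict per passport."""
    passports = []
    for block in batch.strip().split('\n\n'):
        # Pairs are separated by spaces or newlines; split() handles both
        passports.append(dict(pair.split(':', 1) for pair in block.split()))
    return passports

def has_required_fields(passport):
    return REQUIRED.issubset(passport)

# Example batch from the puzzle description (not the real puzzle input)
batch = """ecl:gry pid:860033327 eyr:2020 hcl:#fffffd
byr:1937 iyr:2017 cid:147 hgt:183cm

iyr:2013 ecl:amb cid:350 eyr:2023 pid:028048884
hcl:#cfa07d byr:1929

hcl:#ae17e1 iyr:2013
eyr:2024
ecl:brn pid:760753108 byr:1931
hgt:179cm

hcl:#cfa07d eyr:2025 pid:166559648
iyr:2011 ecl:brn hgt:59in
"""

passports = parse_passports(batch)
print('Answer: ' + str(sum(has_required_fields(p) for p in passports)))  # 2 for the example
```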


____
# --- Part Two ---
The line is moving more quickly now, but you overhear airport security talking about how passports with invalid data are getting through. Better add some data validation, quick!

You can continue to ignore the cid field, but each other field has strict rules about what values are valid for automatic validation:

byr (Birth Year) - four digits; at least 1920 and at most 2002.<br>
iyr (Issue Year) - four digits; at least 2010 and at most 2020.<br>
eyr (Expiration Year) - four digits; at least 2020 and at most 2030.<br>
hgt (Height) - a number followed by either cm or in:<br>
If cm, the number must be at least 150 and at most 193.<br>
If in, the number must be at least 59 and at most 76.<br>
hcl (Hair Color) - a # followed by exactly six characters 0-9 or a-f.<br>
ecl (Eye Color) - exactly one of: amb blu brn gry grn hzl oth.<br>
pid (Passport ID) - a nine-digit number, including leading zeroes.<br>
cid (Country ID) - ignored, missing or not.<br><br>
Your job is to count the passports where all required fields are both present and valid according to the above rules. Here are some example values:

byr valid:   2002<br>
byr invalid: 2003<br>

hgt valid:   60in<br>
hgt valid:   190cm<br>
hgt invalid: 190in<br>
hgt invalid: 190<br>

hcl valid:   #123abc<br>
hcl invalid: #123abz<br>
hcl invalid: 123abc<br>

ecl valid:   brn<br>
ecl invalid: wat<br>

pid valid:   000000001<br>
pid invalid: 0123456789<br>
Here are some invalid passports:

eyr:1972 cid:100<br>
hcl:#18171d ecl:amb hgt:170 pid:186cm iyr:2018 byr:1926<br>

iyr:2019<br>
hcl:#602927 eyr:1967 hgt:170cm<br>
ecl:grn pid:012533040 byr:1946<br>

hcl:dab227 iyr:2012<br>
ecl:brn hgt:182cm pid:021572410 eyr:2020 byr:1992 cid:277<br>

hgt:59cm ecl:zzz<br>
eyr:2038 hcl:74454a iyr:2023<br>
pid:3556412378 byr:2007<br>
Here are some valid passports:

pid:087499704 hgt:74in ecl:grn iyr:2012 eyr:2030 byr:1980<br>
hcl:#623a2f<br>

eyr:2029 ecl:blu cid:129 byr:1989<br>
iyr:2014 pid:896056539 hcl:#a97842 hgt:165cm<br>

hcl:#888785<br>
hgt:164cm byr:2001 iyr:2015 cid:88<br>
pid:545766238 ecl:hzl<br>
eyr:2022<br>

iyr:2010 hgt:158cm hcl:#b6652a ecl:blu byr:1944 eyr:2021 pid:093154719<br>
Count the number of valid passports - those that have all required fields and valid values. Continue to treat cid as optional. In your batch file, how many passports are valid?

Your puzzle answer was 184.

Both parts of this puzzle are complete! They provide two gold stars: **

In [240]:
import re

In [239]:
df4_3 = df4_pivot.drop('cid', axis = 1)
df4_3 = df4_3.dropna()

In [241]:
# hcl must be exactly one '#' followed by six hex digits; pid must be exactly nine digits
df4_3['Test'] = df4_3.apply(lambda x: True if
    1920 <= pd.to_numeric(x['byr']) <= 2002
    and 2010 <= pd.to_numeric(x['iyr']) <= 2020
    and 2020 <= pd.to_numeric(x['eyr']) <= 2030
    and (x['hgt'][-2:] == 'cm' and 150 <= pd.to_numeric(x['hgt'][:-2]) <= 193
        or x['hgt'][-2:] == 'in' and 59 <= pd.to_numeric(x['hgt'][:-2]) <= 76)
    and bool(re.match("^#[0-9a-f]{6}$", x['hcl']))
    and x['ecl'] in ['amb', 'blu', 'brn', 'gry', 'grn', 'hzl', 'oth']
    and bool(re.match("^[0-9]{9}$", x['pid']))
    else False, axis = 1)

In [244]:
print('Answer: ' + str(df4_3['Test'].sum()))

Answer: 184
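The field rules can also be kept as one small validator per field, which makes each rule easy to check against the example values in the puzzle text. The sketch below is one possible layout, not the notebook's method; `valid_year`, `valid_height`, `VALIDATORS` and `is_valid` are illustrative names.

```python
import re

def valid_year(value, low, high):
    """Four digits, within the inclusive range low..high."""
    return re.fullmatch(r'[0-9]{4}', value) is not None and low <= int(value) <= high

def valid_height(value):
    """A number followed by cm (150-193) or in (59-76)."""
    match = re.fullmatch(r'([0-9]+)(cm|in)', value)
    if match is None:
        return False
    number, unit = int(match.group(1)), match.group(2)
    return 150 <= number <= 193 if unit == 'cm' else 59 <= number <= 76

# One check per required field; cid is deliberately absent because it is ignored
VALIDATORS = {
    'byr': lambda v: valid_year(v, 1920, 2002),
    'iyr': lambda v: valid_year(v, 2010, 2020),
    'eyr': lambda v: valid_year(v, 2020, 2030),
    'hgt': valid_height,
    'hcl': lambda v: re.fullmatch(r'#[0-9a-f]{6}', v) is not None,
    'ecl': lambda v: v in {'amb', 'blu', 'brn', 'gry', 'grn', 'hzl', 'oth'},
    'pid': lambda v: re.fullmatch(r'[0-9]{9}', v) is not None,
}

def is_valid(passport):
    """True if every required field is present and passes its check."""
    return all(key in passport and check(passport[key])
               for key, check in VALIDATORS.items())

# Example values from the puzzle description
print(VALIDATORS['hgt']('190cm'), VALIDATORS['hgt']('190in'))
print(VALIDATORS['pid']('000000001'), VALIDATORS['pid']('0123456789'))
```

`re.fullmatch` requires the whole value to match, so no `^` or `$` anchors are needed in the patterns.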


____
# --- Day 5: Binary Boarding ---
## Key words: Binary numbers
You board your plane only to discover a new problem: you dropped your boarding pass! You aren't sure which seat is yours, and all of the flight attendants are busy with the flood of people that suddenly made it through passport control.

You write a quick program to use your phone's camera to scan all of the nearby boarding passes (your puzzle input); perhaps you can find your seat through process of elimination.

Instead of zones or groups, this airline uses binary space partitioning to seat people. A seat might be specified like FBFBBFFRLR, where F means "front", B means "back", L means "left", and R means "right".

The first 7 characters will either be F or B; these specify exactly one of the 128 rows on the plane (numbered 0 through 127). Each letter tells you which half of a region the given seat is in. Start with the whole list of rows; the first letter indicates whether the seat is in the front (0 through 63) or the back (64 through 127). The next letter indicates which half of that region the seat is in, and so on until you're left with exactly one row.

For example, consider just the first seven characters of FBFBBFFRLR:

Start by considering the whole range, rows 0 through 127.<br>
F means to take the lower half, keeping rows 0 through 63.<br>
B means to take the upper half, keeping rows 32 through 63.<br>
F means to take the lower half, keeping rows 32 through 47.<br>
B means to take the upper half, keeping rows 40 through 47.<br>
B keeps rows 44 through 47.<br>
F keeps rows 44 through 45.<br>
The final F keeps the lower of the two, row 44.<br>
The last three characters will be either L or R; these specify exactly one of the 8 columns of seats on the plane (numbered 0 through 7). The same process as above proceeds again, this time with only three steps. L means to keep the lower half, while R means to keep the upper half.

For example, consider just the last 3 characters of FBFBBFFRLR:

Start by considering the whole range, columns 0 through 7.<br>
R means to take the upper half, keeping columns 4 through 7.<br>
L means to take the lower half, keeping columns 4 through 5.<br>
The final R keeps the upper of the two, column 5.<br>
So, decoding FBFBBFFRLR reveals that it is the seat at row 44, column 5.

Every seat also has a unique seat ID: multiply the row by 8, then add the column. In this example, the seat has ID 44 * 8 + 5 = 357.

Here are some other boarding passes:

BFFFBBFRRR: row 70, column 7, seat ID 567.<br>
FFFBBBFRRR: row 14, column 7, seat ID 119.<br>
BBFFBBFRLL: row 102, column 4, seat ID 820.<br>
As a sanity check, look through your list of boarding passes. What is the highest seat ID on a boarding pass?

Your puzzle answer was 888.

In [18]:
import pandas as pd
import math

In [246]:
df5 = pd.read_excel('5.xlsx', header = None, names = ['Seat'])
df5.head()

Unnamed: 0,Seat
0,FBBBFFFLRL
1,BFFBBBBLRL
2,FBFFFBBRRR
3,FBFBBFFLRL
4,FFFBFBBRLR


In [256]:
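def seat_id(x):
    # Each 'B' in the 7 row characters adds 64, 32, ..., 1 to the row;
    # each 'R' in the 3 column characters adds 4, 2, 1 to the column.
    r = 0
    c = 0
    for a in range(7):
        if x[a] == 'B':
            r = r + 128 // 2**(a+1)
    for b in range(3):
        if x[b+7] == 'R':
            c = c + 8 // 2**(b+1)
    return r * 8 + c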

In [257]:
df5['Seat_ID'] = df5.apply(lambda x: seat_id(x['Seat']), axis = 1)

In [258]:
print('Answer: ' + str(max(df5['Seat_ID'])))

Answer: 888
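Because the seat ID is row * 8 + column, and 8 is 2 to the power 3, the whole ten-character pass can be read as a single 10-bit binary number with F and L as 0 and B and R as 1. A short sketch of that shortcut, with `TO_BITS` and `seat_id_binary` as illustrative names:

```python
# Map F/L to 0 and B/R to 1
TO_BITS = str.maketrans('FBLR', '0101')

def seat_id_binary(boarding_pass):
    """Read the boarding pass as a 10-bit binary number."""
    return int(boarding_pass.translate(TO_BITS), 2)

# Boarding passes from the puzzle description
for boarding_pass in ['FBFBBFFRLR', 'BFFFBBFRRR', 'FFFBBBFRRR', 'BBFFBBFRLL']:
    print(boarding_pass, seat_id_binary(boarding_pass))
```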


___
# --- Part Two ---
Ding! The "fasten seat belt" signs have turned on. Time to find your seat.

It's a completely full flight, so your seat should be the only missing boarding pass in your list. However, there's a catch: some of the seats at the very front and back of the plane don't exist on this aircraft, so they'll be missing from your list as well.

Your seat wasn't at the very front or back, though; the seats with IDs +1 and -1 from yours will be in your list.

What is the ID of your seat?

Your puzzle answer was 522.

In [278]:
df5.sort_values(by = 'Seat_ID', inplace = True)
df5.reset_index(drop = True, inplace = True)

In [308]:
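# The first row always has a NaN diff, so skip it; the remaining row sits just after the gap
gap = df5[df5['Seat_ID'].diff() != 1][1:]['Seat_ID']
print('Answer: ' + str(int(gap.iloc[0]) - 1))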

Answer: 522
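The gap can also be found with sets: take every ID between the lowest and highest seen, subtract the IDs that are present, and keep the ones whose neighbours both exist. The IDs in this sketch are invented purely to show the idea; `find_missing_seats` is an illustrative name.

```python
def find_missing_seats(seat_ids):
    """Return missing IDs whose neighbours (ID - 1 and ID + 1) are both present."""
    taken = set(seat_ids)
    candidates = set(range(min(taken), max(taken) + 1)) - taken
    return sorted(s for s in candidates if s - 1 in taken and s + 1 in taken)

# Hypothetical seat IDs, not taken from the puzzle input
print(find_missing_seats([5, 6, 8, 9, 10]))
```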


____
# --- Day 6: Custom Customs ---
## Key words: Text analysis, unstructured data
As your flight approaches the regional airport where you'll switch to a much larger plane, customs declaration forms are distributed to the passengers.

The form asks a series of 26 yes-or-no questions marked a through z. All you need to do is identify the questions for which anyone in your group answers "yes". Since your group is just you, this doesn't take very long.

However, the person sitting next to you seems to be experiencing a language barrier and asks if you can help. For each of the people in their group, you write down the questions for which they answer "yes", one per line. For example:

abcx<br>
abcy<br>
abcz<br>
In this group, there are 6 questions to which anyone answered "yes": a, b, c, x, y, and z. (Duplicate answers to the same question don't count extra; each question counts at most once.)

Another group asks for your help, then another, and eventually you've collected answers from every group on the plane (your puzzle input). Each group's answers are separated by a blank line, and within each group, each person's answers are on a single line. For example:

abc

a<br>
b<br>
c<br>

ab<br>
ac<br>

a<br>
a<br>
a<br>
a<br>

b<br>
This list represents answers from five groups:

The first group contains one person who answered "yes" to 3 questions: a, b, and c.<br>
The second group contains three people; combined, they answered "yes" to 3 questions: a, b, and c.<br>
The third group contains two people; combined, they answered "yes" to 3 questions: a, b, and c.<br>
The fourth group contains four people; combined, they answered "yes" to only 1 question, a.<br>
The last group contains one person who answered "yes" to only 1 question, b.<br>
In this example, the sum of these counts is 3 + 3 + 3 + 1 + 1 = 11.<br>

For each group, count the number of questions to which anyone answered "yes". What is the sum of those counts?

Your puzzle answer was 6703.

In [309]:
import pandas as pd

In [316]:
df6 = pd.read_excel('6.xlsx', header = None, names = ['Answer'], skip_blank_lines = False)
df6.head(10)

Unnamed: 0,Answer
0,w
1,s
2,q
3,s
4,
5,klfrwivqhc
6,w
7,wgyze
8,anw
9,


In [317]:
# A blank row (NaN) separates groups, so bump the group number each time one is seen
a = 1
df6['Group'] = ''
for i in range(len(df6)):
    if pd.isna(df6['Answer'][i]):
        a = a + 1
    df6.loc[i, 'Group'] = a

In [318]:
df6_2 = df6.dropna().groupby(['Group'])['Answer'].apply(lambda x: ''.join(x.astype(str))).reset_index()

In [319]:
print('Answer: ' + str(df6_2.apply(lambda x: len(set(x['Answer'])),axis = 1).sum()))

Answer: 6703
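Counting the questions that anyone answered is a set union per group. The sketch below works directly on the text of the puzzle's example; `count_anyone` and `example` are illustrative names.

```python
def count_anyone(text):
    """Sum, over all groups, the number of distinct questions answered 'yes'."""
    total = 0
    for group in text.strip().split('\n\n'):
        # Union of every person's answers within the group
        total += len(set().union(*group.split()))
    return total

# Example answers from the puzzle description (not the real puzzle input)
example = """abc

a
b
c

ab
ac

a
a
a
a

b
"""

print('Answer: ' + str(count_anyone(example)))  # 11 for the example
```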


___
# --- Part Two ---
As you finish the last group's customs declaration, you notice that you misread one word in the instructions:

You don't need to identify the questions to which anyone answered "yes"; you need to identify the questions to which everyone answered "yes"!

Using the same example as above:

abc

a<br>
b<br>
c<br>

ab<br>
ac<br>

a<br>
a<br>
a<br>
a<br>

b<br>
This list represents answers from five groups:

In the first group, everyone (all 1 person) answered "yes" to 3 questions: a, b, and c.<br>
In the second group, there is no question to which everyone answered "yes".<br>
In the third group, everyone answered yes to only 1 question, a. Since some people did not answer "yes" to b or c, they don't count.<br>
In the fourth group, everyone answered yes to only 1 question, a.<br>
In the fifth group, everyone (all 1 person) answered "yes" to 1 question, b.<br>
In this example, the sum of these counts is 3 + 0 + 1 + 1 + 1 = 6.<br>

For each group, count the number of questions to which everyone answered "yes". What is the sum of those counts?

Your puzzle answer was 3430.

Both parts of this puzzle are complete! They provide two gold stars: **

In [325]:
df6['Count'] = ''
df6_3 = df6.dropna().groupby(['Group'])['Count'].apply(lambda x: x.count()).reset_index()
df6_4 = pd.merge(df6_2, df6_3, on = 'Group')
df6_4.head()

Unnamed: 0,Group,Answer,Count
0,1,wsqs,4
1,2,klfrwivqhcwwgyzeanw,4
2,3,khfraeogtbdscwrdofujgnmydfrgodgjoqmrfyrfdgzpon,5
3,4,xmivszjcnqhdaefwvanjddvjuywnaegjwvndkboa,4
4,5,mraiyzpxngdlynzdmgkxwpaiolr,2


In [326]:
df6_4['Result'] = ''
for a in range(len(df6_4)):
    c = 0
    # A letter was answered by everyone if it appears once per person in the joined answers
    for b in set(df6_4['Answer'][a]):
        if df6_4['Answer'][a].count(b) == df6_4['Count'][a]:
            c = c + 1
    df6_4.loc[a, 'Result'] = c


In [328]:
print('Answer: ' + str(df6_4['Result'].sum()))

Answer: 3430
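Counting the questions that everyone answered is the matching set intersection per group. A minimal sketch on the puzzle's example, with `count_everyone` and `example` as illustrative names:

```python
def count_everyone(text):
    """Sum, over all groups, the number of questions every person answered 'yes'."""
    total = 0
    for group in text.strip().split('\n\n'):
        people = [set(person) for person in group.split()]
        # Keep only the questions common to every person in the group
        total += len(set.intersection(*people))
    return total

# Example answers from the puzzle description (not the real puzzle input)
example = """abc

a
b
c

ab
ac

a
a
a
a

b
"""

print('Answer: ' + str(count_everyone(example)))  # 6 for the example
```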


___
# --- Day 7: Handy Haversacks ---
## Key words: Data manipulation, nested loop calculation
You land at the regional airport in time for your next flight. In fact, it looks like you'll even have time to grab some food: all flights are currently delayed due to issues in luggage processing.

Due to recent aviation regulations, many rules (your puzzle input) are being enforced about bags and their contents; bags must be color-coded and must contain specific quantities of other color-coded bags. Apparently, nobody responsible for these regulations considered how long they would take to enforce!

For example, consider the following rules:

light red bags contain 1 bright white bag, 2 muted yellow bags.<br>
dark orange bags contain 3 bright white bags, 4 muted yellow bags.<br>
bright white bags contain 1 shiny gold bag.<br>
muted yellow bags contain 2 shiny gold bags, 9 faded blue bags.<br>
shiny gold bags contain 1 dark olive bag, 2 vibrant plum bags.<br>
dark olive bags contain 3 faded blue bags, 4 dotted black bags.<br>
vibrant plum bags contain 5 faded blue bags, 6 dotted black bags.<br>
faded blue bags contain no other bags.<br>
dotted black bags contain no other bags.<br>
These rules specify the required contents for 9 bag types. In this example, every faded blue bag is empty, every vibrant plum bag contains 11 bags (5 faded blue and 6 dotted black), and so on.

You have a shiny gold bag. If you wanted to carry it in at least one other bag, how many different bag colors would be valid for the outermost bag? (In other words: how many colors can, eventually, contain at least one shiny gold bag?)

In the above rules, the following options would be available to you:

A bright white bag, which can hold your shiny gold bag directly.<br>
A muted yellow bag, which can hold your shiny gold bag directly, plus some other bags.<br>
A dark orange bag, which can hold bright white and muted yellow bags, either of which could then hold your shiny gold bag.<br>
A light red bag, which can hold bright white and muted yellow bags, either of which could then hold your shiny gold bag.<br>
So, in this example, the number of bag colors that can eventually contain at least one shiny gold bag is 4.

How many bag colors can eventually contain at least one shiny gold bag? (The list of rules is quite long; make sure you get all of it.)

Your puzzle answer was 246.

In [1]:
import pandas as pd

In [2]:
df7 = pd.read_excel('7.xlsx', header = None, names = ['Rule'])
df7.head()

Unnamed: 0,Rule
0,"dark olive bags contain 2 muted brown bags, 1 ..."
1,"faded coral bags contain 3 drab cyan bags, 1 l..."
2,plaid plum bags contain 2 mirrored cyan bags.
3,clear maroon bags contain 1 dotted turquoise b...
4,"plaid coral bags contain 3 posh fuchsia bags, ..."


In [3]:
df7[['Bag', 'Inside']] = df7['Rule'].str.split('contain', expand = True)
df7.drop('Rule', axis = 1, inplace = True)

In [4]:
df7_2 = df7.copy()
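df7_2.replace([' bags', ' bag', r'\.'], "", regex = True, inplace = True)
# regex = True is required on recent pandas versions, where str.replace is literal by default
df7_2['Inside'] = df7_2['Inside'].str.replace(r'\s\d+', '', regex = True)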
df7_2['Inside'] = df7_2['Inside'].str.strip()
df7_2['Bag'] = df7_2['Bag'].str.strip()
df7_2.head()

Unnamed: 0,Bag,Inside
0,dark olive,"muted brown, mirrored tomato, bright black"
1,faded coral,"drab cyan, light aqua"
2,plaid plum,mirrored cyan
3,clear maroon,"dotted turquoise, dim lavender"
4,plaid coral,"posh fuchsia, dotted beige, wavy turquoise"


In [5]:
List_to_check = ['shiny gold']
List_checked = []
List_correct = []

# Pop one colour at a time rather than removing from the list while iterating over it
while List_to_check:
    a = List_to_check.pop()
    List_checked.append(a)
    for i in range(len(df7_2)):
        if a in df7_2['Inside'][i]:
            bag = df7_2['Bag'][i]
            if bag not in List_correct:
                List_correct.append(bag)
            if bag not in List_to_check and bag not in List_checked:
                List_to_check.append(bag)

In [6]:
print('Answer: ' + str(len(List_correct)))

Answer: 246
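The same question can be framed as a graph search: record, for each colour, which colours may directly contain it, then walk outwards from shiny gold. The sketch below runs on the example rules from the puzzle text; `parse_rules`, `count_outer_colours` and `RULES` are illustrative names, and the regular expression assumes every colour is exactly two words, as in the examples.

```python
import re
from collections import defaultdict, deque

def parse_rules(text):
    """Map each outer colour to a list of (quantity, inner colour) pairs."""
    contents = {}
    for line in text.strip().splitlines():
        outer, inner = line.split(' bags contain ')
        contents[outer] = [(int(n), colour)
                           for n, colour in re.findall(r'(\d+) (\w+ \w+) bags?', inner)]
    return contents

def count_outer_colours(contents, target='shiny gold'):
    """Count the colours that can eventually contain the target colour."""
    # Invert the rules: inner colour -> colours that hold it directly
    parents = defaultdict(set)
    for outer, inner in contents.items():
        for _, colour in inner:
            parents[colour].add(outer)
    seen = set()
    queue = deque([target])
    while queue:
        for parent in parents[queue.popleft()]:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return len(seen)

# Example rules from the puzzle description (not the real puzzle input)
RULES = """light red bags contain 1 bright white bag, 2 muted yellow bags.
dark orange bags contain 3 bright white bags, 4 muted yellow bags.
bright white bags contain 1 shiny gold bag.
muted yellow bags contain 2 shiny gold bags, 9 faded blue bags.
shiny gold bags contain 1 dark olive bag, 2 vibrant plum bags.
dark olive bags contain 3 faded blue bags, 4 dotted black bags.
vibrant plum bags contain 5 faded blue bags, 6 dotted black bags.
faded blue bags contain no other bags.
dotted black bags contain no other bags.
"""

print('Answer: ' + str(count_outer_colours(parse_rules(RULES))))  # 4 for the example
```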


___
# --- Part Two ---
It's getting pretty expensive to fly these days - not because of ticket prices, but because of the ridiculous number of bags you need to buy!

Consider again your shiny gold bag and the rules from the above example:

faded blue bags contain 0 other bags.<br>
dotted black bags contain 0 other bags.<br>
vibrant plum bags contain 11 other bags: 5 faded blue bags and 6 dotted black bags.<br>
dark olive bags contain 7 other bags: 3 faded blue bags and 4 dotted black bags.<br>
So, a single shiny gold bag must contain 1 dark olive bag (and the 7 bags within it) plus 2 vibrant plum bags (and the 11 bags within each of those): 1 + 1*7 + 2 + 2*11 = 32 bags!

Of course, the actual rules have a small chance of going several levels deeper than this example; be sure to count all of the bags, even if the nesting becomes topologically impractical!

Here's another example:

shiny gold bags contain 2 dark red bags.<br>
dark red bags contain 2 dark orange bags.<br>
dark orange bags contain 2 dark yellow bags.<br>
dark yellow bags contain 2 dark green bags.<br>
dark green bags contain 2 dark blue bags.<br>
dark blue bags contain 2 dark violet bags.<br>
dark violet bags contain no other bags.<br>
In this example, a single shiny gold bag must contain 126 other bags.<br>

How many individual bags are required inside your single shiny gold bag?

In [101]:
df7_3 = df7.copy()
df7_3.replace([' bags', ' bag', r'\.'], "", regex = True, inplace = True)
df7_3['Inside'] = df7_3['Inside'].str.strip()
df7_3['Bag'] = df7_3['Bag'].str.strip()
df7_3.head()

Unnamed: 0,Bag,Inside
0,dark olive,"2 muted brown, 1 mirrored tomato, 4 bright black"
1,faded coral,"3 drab cyan, 1 light aqua"
2,plaid plum,2 mirrored cyan
3,clear maroon,"1 dotted turquoise, 3 dim lavender"
4,plaid coral,"3 posh fuchsia, 3 dotted beige, 2 wavy turquoise"


In [100]:
def List_def(lst): 
    res = [] 
    for el in lst: 
        sub = el.split(maxsplit = 1) 
        try:
            sub[0] = int(sub[0])
        except ValueError:
            pass
        res.append(sub)  
    return(res)

In [102]:
df7_3['Inside'] = df7_3.apply(lambda x: x['Inside'].split(", "), axis = 1)
df7_3['Inside'] = df7_3.apply(lambda x: List_def(x['Inside']), axis = 1)

In [120]:
df7_3.head()

Unnamed: 0,Bag,Inside
0,dark olive,"[[2, muted brown], [1, mirrored tomato], [4, b..."
1,faded coral,"[[3, drab cyan], [1, light aqua]]"
2,plaid plum,"[[2, mirrored cyan]]"
3,clear maroon,"[[1, dotted turquoise], [3, dim lavender]]"
4,plaid coral,"[[3, posh fuchsia], [3, dotted beige], [2, wav..."


In [121]:
List_to_include = [[1, 'shiny gold']]
x = 0

# Pop entries one at a time instead of removing from the list while iterating over it
while List_to_include:
    a = List_to_include.pop()
    for b in df7_3[df7_3['Bag'] == a[1]]['Inside']:
        for c in b:
            if c[0] != 'no':
                # a[0] copies of this bag each hold c[0] copies of the inner bag
                List_to_include.append([a[0]*c[0], c[1]])
                x += a[0]*c[0]
print('Answer: ', str(x))

Answer:  2976


# --- Day 8: Handheld Halting ---
Your flight to the major airline hub reaches cruising altitude without incident. While you consider checking the in-flight menu for one of those drinks that come with a little umbrella, you are interrupted by the kid sitting next to you.

Their handheld game console won't turn on! They ask if you can take a look.

You narrow the problem down to a strange infinite loop in the boot code (your puzzle input) of the device. You should be able to fix it, but first you need to be able to run the code in isolation.

The boot code is represented as a text file with one instruction per line of text. Each instruction consists of an operation (acc, jmp, or nop) and an argument (a signed number like +4 or -20).

acc increases or decreases a single global value called the accumulator by the value given in the argument. For example, acc +7 would increase the accumulator by 7. The accumulator starts at 0. After an acc instruction, the instruction immediately below it is executed next.<br>
jmp jumps to a new instruction relative to itself. The next instruction to execute is found using the argument as an offset from the jmp instruction; for example, jmp +2 would skip the next instruction, jmp +1 would continue to the instruction immediately below it, and jmp -20 would cause the instruction 20 lines above to be executed next.<br>
nop stands for No OPeration - it does nothing. The instruction immediately below it is executed next.<br>
For example, consider the following program:

nop +0<br>
acc +1<br>
jmp +4<br>
acc +3<br>
jmp -3<br>
acc -99<br>
acc +1<br>
jmp -4<br>
acc +6<br>
These instructions are visited in this order:

nop +0  | 1<br>
acc +1  | 2, 8(!)<br>
jmp +4  | 3<br>
acc +3  | 6<br>
jmp -3  | 7<br>
acc -99 |<br>
acc +1  | 4<br>
jmp -4  | 5<br>
acc +6  |<br>
First, the nop +0 does nothing. Then, the accumulator is increased from 0 to 1 (acc +1) and jmp +4 sets the next instruction to the other acc +1 near the bottom. After it increases the accumulator from 1 to 2, jmp -4 executes, setting the next instruction to the only acc +3. It sets the accumulator to 5, and jmp -3 causes the program to continue back at the first acc +1.

This is an infinite loop: with this sequence of jumps, the program will run forever. The moment the program tries to run any instruction a second time, you know it will never terminate.

Immediately before the program would run an instruction a second time, the value in the accumulator is 5.

Run your copy of the boot code. Immediately before any instruction is executed a second time, what value is in the accumulator?
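
The rule above can be checked against the nine-line example before touching the real input. This is a minimal sketch in plain Python; `EXAMPLE`, `parse` and `run_until_repeat` are hypothetical names introduced here for illustration, and the sketch assumes the program really does loop (as the puzzle guarantees for Part One).

```python
EXAMPLE = """nop +0
acc +1
jmp +4
acc +3
jmp -3
acc -99
acc +1
jmp -4
acc +6"""


def parse(text):
    """Turn 'op arg' lines into (operation, integer argument) pairs."""
    return [(op, int(arg)) for op, arg in (line.split() for line in text.splitlines())]


def run_until_repeat(program):
    """Return the accumulator just before any instruction would run twice."""
    seen = set()   # indices of instructions already executed
    acc = 0
    pointer = 0
    while pointer not in seen:
        seen.add(pointer)
        op, arg = program[pointer]
        if op == 'acc':
            acc += arg
            pointer += 1
        elif op == 'jmp':
            pointer += arg
        else:  # nop
            pointer += 1
    return acc


print(run_until_repeat(parse(EXAMPLE)))  # 5, matching the walkthrough above
```

A set is used for the visited indices because only membership matters here; the notebook cells below keep a list instead, since they later inspect its length.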

In [122]:
import pandas as pd

In [174]:
df8 = pd.read_excel('8.xlsx', header = None, names = ['Input'])
df8.head()

Unnamed: 0,Input
0,acc +8
1,nop +139
2,nop +383
3,jmp +628
4,acc -6


In [175]:
df8[['Command', 'Value']] = df8['Input'].str.split(expand = True)
df8.drop('Input', axis = 1, inplace = True)
df8['Value'] = pd.to_numeric(df8['Value'])

In [224]:
df8.tail()

Unnamed: 0,Command,Value
641,acc,16
642,acc,35
643,nop,-584
644,acc,-12
645,jmp,1


In [219]:
steps = []  # indices of the instructions already executed, in order
acc = 0     # accumulator
a = 0       # index of the instruction to execute next

# Stop just before any instruction would be executed a second time.
while a not in steps:
    steps.append(a)
    command = df8.loc[a, 'Command']
    value = df8.loc[a, 'Value']
    if command == 'acc':
        acc += value
        a += 1
    elif command == 'nop':
        a += 1
    else:  # jmp
        a += value
print('Answer: ', acc)

Answer:  1671


In [221]:
steps.sort()
len(steps)

217

___
# --- Part Two ---
After some careful analysis, you believe that exactly one instruction is corrupted.

Somewhere in the program, either a jmp is supposed to be a nop, or a nop is supposed to be a jmp. (No acc instructions were harmed in the corruption of this boot code.)

The program is supposed to terminate by attempting to execute an instruction immediately after the last instruction in the file. By changing exactly one jmp or nop, you can repair the boot code and make it terminate correctly.

For example, consider the same program from above:

nop +0<br>
acc +1<br>
jmp +4<br>
acc +3<br>
jmp -3<br>
acc -99<br>
acc +1<br>
jmp -4<br>
acc +6<br>
If you change the first instruction from nop +0 to jmp +0, it would create a single-instruction infinite loop, never leaving that instruction. If you change almost any of the jmp instructions, the program will still eventually find another jmp instruction and loop forever.

However, if you change the second-to-last instruction (from jmp -4 to nop -4), the program terminates! The instructions are visited in this order:

nop +0  | 1<br>
acc +1  | 2<br>
jmp +4  | 3<br>
acc +3  |<br>
jmp -3  |<br>
acc -99 |<br>
acc +1  | 4<br>
nop -4  | 5<br>
acc +6  | 6<br>
After the last instruction (acc +6), the program terminates by attempting to run the instruction below the last instruction in the file. With this change, after the program terminates, the accumulator contains the value 8 (acc +1, acc +1, acc +6).

Fix the program so that it terminates normally by changing exactly one jmp (to nop) or nop (to jmp). What is the value of the accumulator after the program terminates?
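
The repair described above amounts to trying each jmp/nop flip in turn and keeping the one that lets the program run off the end. A minimal sketch on the example program, assuming plain Python lists rather than the DataFrame used in the cells below; `parse`, `run` and `repair` are hypothetical helper names, not part of the notebook.

```python
EXAMPLE = """nop +0
acc +1
jmp +4
acc +3
jmp -3
acc -99
acc +1
jmp -4
acc +6"""


def parse(text):
    """Turn 'op arg' lines into (operation, integer argument) pairs."""
    return [(op, int(arg)) for op, arg in (line.split() for line in text.splitlines())]


def run(program):
    """Run until an instruction repeats or the pointer leaves the program.

    Returns (terminated, accumulator); terminated is True only when the
    pointer lands exactly one past the last instruction.
    """
    seen = set()
    acc = 0
    pointer = 0
    while 0 <= pointer < len(program) and pointer not in seen:
        seen.add(pointer)
        op, arg = program[pointer]
        if op == 'acc':
            acc += arg
            pointer += 1
        elif op == 'jmp':
            pointer += arg
        else:  # nop
            pointer += 1
    return pointer == len(program), acc


def repair(program):
    """Flip one jmp/nop at a time; return (index, accumulator) of the first fix."""
    swap = {'jmp': 'nop', 'nop': 'jmp'}
    for i, (op, arg) in enumerate(program):
        if op not in swap:
            continue  # acc instructions are never corrupted
        candidate = program[:i] + [(swap[op], arg)] + program[i + 1:]
        terminated, acc = run(candidate)
        if terminated:
            return i, acc
    return None


print(repair(parse(EXAMPLE)))  # (7, 8): the jmp -4 at zero-based index 7
```

Building a fresh candidate list for every flip avoids having to undo the change afterwards, at the cost of one copy of the program per attempt.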

In [295]:
df8_2 = df8.copy()
df8_3 = df8_2[df8_2['Command'].isin(['jmp', 'nop'])]

In [301]:
n = len(df8_2)  # the program terminates when the pointer reaches n (646 here)

# Flip each jmp/nop in turn until the program terminates normally.
for idx in df8_3.index:
    df8_2['Command'] = df8['Command']  # restore the original program
    df8_2.loc[idx, 'Command'] = 'jmp' if df8_2.loc[idx, 'Command'] == 'nop' else 'nop'
    steps = []
    acc = 0
    a = 0
    # Stop on a repeated instruction or when the pointer leaves the program.
    while a not in steps and 0 <= a < n:
        steps.append(a)
        command = df8_2.loc[a, 'Command']
        value = df8_2.loc[a, 'Value']
        if command == 'acc':
            acc += value
            a += 1
        elif command == 'nop':
            a += 1
        else:  # jmp
            a += value
    if a == n:  # ran off the end: this flip repaired the program
        break
print('Answer: ', acc)

Answer:  892


# --- Day 9: Encoding Error ---
## Key words: Nested loop statement
With your neighbor happily enjoying their video game, you turn your attention to an open data port on the little screen in the seat in front of you.

Though the port is non-standard, you manage to connect it to your computer through the clever use of several paperclips. Upon connection, the port outputs a series of numbers (your puzzle input).

The data appears to be encrypted with the eXchange-Masking Addition System (XMAS) which, conveniently for you, is an old cypher with an important weakness.

XMAS starts by transmitting a preamble of 25 numbers. After that, each number you receive should be the sum of any two of the 25 immediately previous numbers. The two numbers will have different values, and there might be more than one such pair.

For example, suppose your preamble consists of the numbers 1 through 25 in a random order. To be valid, the next number must be the sum of two of those numbers:

26 would be a valid next number, as it could be 1 plus 25 (or many other pairs, like 2 and 24).<br>
49 would be a valid next number, as it is the sum of 24 and 25.<br>
100 would not be valid; no two of the previous 25 numbers sum to 100.<br>
50 would also not be valid; although 25 appears in the previous 25 numbers, the two numbers in the pair must be different.<br>
Suppose the 26th number is 45, and the first number (no longer an option, as it is more than 25 numbers ago) was 20. Now, for the next number to be valid, there needs to be some pair of numbers among 1-19, 21-25, or 45 that add up to it:

26 would still be a valid next number, as 1 and 25 are still within the previous 25 numbers.<br>
65 would not be valid, as no two of the available numbers sum to it.<br>
64 and 66 would both be valid, as they are the result of 19+45 and 21+45 respectively.<br>
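
The claims in the two lists above can be verified mechanically. A small sketch, assuming the preamble is simply the numbers 1 through 25 (their order does not affect validity); `is_valid` is a hypothetical helper introduced here.

```python
from itertools import combinations


def is_valid(window, number):
    """True if two entries of window with different values sum to number."""
    return any(a + b == number and a != b for a, b in combinations(window, 2))


preamble = list(range(1, 26))  # the numbers 1 through 25
print([is_valid(preamble, n) for n in (26, 49, 100, 50)])
# [True, True, False, False]

# Once 45 arrives, the oldest number (20 in the example) drops out of the window.
window = [n for n in preamble if n != 20] + [45]
print([is_valid(window, n) for n in (26, 65, 64, 66)])
# [True, False, True, True]
```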
Here is a larger example which only considers the previous 5 numbers (and has a preamble of length 5):

35<br>
20<br>
15<br>
25<br>
47<br>
40<br>
62<br>
55<br>
65<br>
95<br>
102<br>
117<br>
150<br>
182<br>
127<br>
219<br>
299<br>
277<br>
309<br>
576<br>
In this example, after the 5-number preamble, almost every number is the sum of two of the previous 5 numbers; the only number that does not follow this rule is 127.

The first step of attacking the weakness in the XMAS data is to find the first number in the list (after the preamble) which is not the sum of two of the 25 numbers before it. What is the first number that does not have this property?
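
As a sanity check, the same search can be run on the 20-number example with a preamble of 5. This sketch uses a set lookup per window rather than the nested loops in the cell below; `EXAMPLE` and `first_invalid` are hypothetical names.

```python
EXAMPLE = [35, 20, 15, 25, 47, 40, 62, 55, 65, 95,
           102, 117, 150, 182, 127, 219, 299, 277, 309, 576]


def first_invalid(numbers, preamble):
    """Return the first number that is not the sum of two different
    values among the `preamble` numbers immediately before it."""
    for i in range(preamble, len(numbers)):
        window = set(numbers[i - preamble:i])
        target = numbers[i]
        # target - x must also be in the window and differ from x
        if not any(target - x in window and target - x != x for x in window):
            return target
    return None  # every number followed the rule


print(first_invalid(EXAMPLE, preamble=5))  # 127
```

For the real input the call would use `preamble=25`.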

In [18]:
import pandas as pd

In [39]:
df9 = pd.read_excel('9.xlsx', header = None, names = ['Input'])
df9.head()

Unnamed: 0,Input
0,30
1,17
2,44
3,5
4,10


In [7]:
values = df9['Input'].tolist()
for i in range(25, len(values)):
    target = values[i]
    found = False
    # Check every pair of distinct positions among the previous 25 numbers.
    for a in range(i - 25, i):
        for b in range(a + 1, i):
            if values[a] + values[b] == target:
                found = True
                break
        if found:
            break
    if not found:
        print('Answer: ', target)
        break

Answer:  400480901


# --- Part Two ---
The final step in breaking the XMAS encryption relies on the invalid number you just found: you must find a contiguous set of at least two numbers in your list which sum to the invalid number from step 1.

Again consider the above example:

35<br>
20<br>
15<br>
25<br>
47<br>
40<br>
62<br>
55<br>
65<br>
95<br>
102<br>
117<br>
150<br>
182<br>
127<br>
219<br>
299<br>
277<br>
309<br>
576<br>
In this list, adding up all of the numbers from 15 through 40 produces the invalid number from step 1, 127. (Of course, the contiguous set of numbers in your actual list might be much longer.)

To find the encryption weakness, add together the smallest and largest number in this contiguous range; in this example, these are 15 and 47, producing 62.

What is the encryption weakness in your XMAS-encrypted list of numbers?
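
The example can be reproduced with a sliding window, which is a different technique from the restart-at-every-index loop in the cell below and relies on the assumption that all numbers are non-negative. `EXAMPLE` and `encryption_weakness` are hypothetical names.

```python
EXAMPLE = [35, 20, 15, 25, 47, 40, 62, 55, 65, 95,
           102, 117, 150, 182, 127, 219, 299, 277, 309, 576]


def encryption_weakness(numbers, target):
    """Find a contiguous run of at least two numbers summing to target
    and return the sum of its smallest and largest members.

    Sliding window: grow on the right, and shrink from the left while
    the running sum is too large.
    """
    start = 0
    total = 0
    for end, value in enumerate(numbers):
        total += value
        while total > target and start < end:
            total -= numbers[start]
            start += 1
        if total == target and end > start:
            run = numbers[start:end + 1]
            return min(run) + max(run)
    return None  # no such run exists


print(encryption_weakness(EXAMPLE, 127))  # 62, from the run 15, 25, 47, 40
```

Each number enters and leaves the window at most once, so the scan is linear in the length of the list.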

In [38]:
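target = 400480901  # the invalid number found in Part One
values = df9['Input'].tolist()
for a in range(len(values)):
    b = a
    x = 0
    # Extend the run starting at a until the sum reaches the target
    # (this relies on all numbers being positive).
    while x < target and b < len(values):
        x += values[b]
        b += 1
    # A valid run must contain at least two numbers.
    if x == target and b - a >= 2:
        run = values[a:b]
        print('Answer:', min(run) + max(run))
        break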

Answer: 67587168
