# Day 3: Rucksack Reorganization

One Elf has the important job of loading all of the rucksacks with supplies for the jungle journey. Unfortunately, that Elf didn't quite follow the packing instructions, and so a few items now need to be rearranged.

Each rucksack has two large compartments. All items of a given type are meant to go into exactly one of the two compartments. The Elf that did the packing failed to follow this rule for exactly one item type per rucksack.

The Elves have made a list of all of the items currently in each rucksack (your puzzle input), but they need your help finding the errors. Every item type is identified by a single lowercase or uppercase letter (that is, a and A refer to different types of items).

The list of items for each rucksack is given as characters all on a single line. A given rucksack always has the same number of items in each of its two compartments, so the first half of the characters represent items in the first compartment, while the second half of the characters represent items in the second compartment.

For example, suppose you have the following list of contents from six rucksacks:

```
vJrwpWtwJgWrhcsFMMfFFhFp
jqHRNqRjqzjGDLGLrsFMfFZSrLrFZsSL
PmmdzqPrVvPwwTWBwg
wMqvLMZHhHMvwLHjbvcjnnSBnvTQFn
ttgJtRGJQctTZtZT
CrZsJsPPZsGzwwsLwLmpwMDw
```

- The first rucksack contains the items vJrwpWtwJgWrhcsFMMfFFhFp, which means its first compartment contains the items vJrwpWtwJgWr, while the second compartment contains the items hcsFMMfFFhFp. The only item type that appears in both compartments is lowercase p.
- The second rucksack's compartments contain jqHRNqRjqzjGDLGL and rsFMfFZSrLrFZsSL. The only item type that appears in both compartments is uppercase L.
- The third rucksack's compartments contain PmmdzqPrV and vPwwTWBwg; the only common item type is uppercase P.
- The fourth rucksack's compartments only share item type v.
- The fifth rucksack's compartments only share item type t.
- The sixth rucksack's compartments only share item type s.
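
The compartment split and shared-item lookup described above can be sketched with a set intersection. This is a minimal illustration, not necessarily the approach used later in this notebook; the helper name `shared_item` is hypothetical, and it assumes (as the puzzle states) that each rucksack has exactly one shared item type.

```python
def shared_item(rucksack: str) -> str:
    """Return the single item type present in both halves of a rucksack."""
    midpoint = len(rucksack) // 2
    # Set intersection keeps only the item types found in both compartments.
    common = set(rucksack[:midpoint]) & set(rucksack[midpoint:])
    # Assumption: exactly one shared item type; unpacking fails loudly otherwise.
    (item,) = common
    return item


example = [
    "vJrwpWtwJgWrhcsFMMfFFhFp",
    "jqHRNqRjqzjGDLGLrsFMfFZSrLrFZsSL",
    "PmmdzqPrVvPwwTWBwg",
    "wMqvLMZHhHMvwLHjbvcjnnSBnvTQFn",
    "ttgJtRGJQctTZtZT",
    "CrZsJsPPZsGzwwsLwLmpwMDw",
]
shared = [shared_item(r) for r in example]
print(shared)  # ['p', 'L', 'P', 'v', 't', 's']
```

Unpacking into `(item,)` raises a `ValueError` if a rucksack has zero or several shared item types, which makes malformed input easy to spot.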

To help prioritize item rearrangement, every item type can be converted to a priority:

- Lowercase item types a through z have priorities 1 through 26.
- Uppercase item types A through Z have priorities 27 through 52.

In the above example, the priority of the item type that appears in both compartments of each rucksack is 16 (p), 38 (L), 42 (P), 22 (v), 20 (t), and 19 (s); the sum of these is 157.
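
The arithmetic above can be checked with a small lookup. This sketch assumes a hypothetical helper named `priority` and uses `string.ascii_letters`, which lists the lowercase letters followed by the uppercase letters, so a letter's position in it plus one matches the priority rules.

```python
import string


def priority(item: str) -> int:
    """Map 'a'..'z' to 1..26 and 'A'..'Z' to 27..52."""
    # ascii_letters is 'abc...zABC...Z', so index + 1 is the priority.
    return string.ascii_letters.index(item) + 1


shared_items = ["p", "L", "P", "v", "t", "s"]
priorities = [priority(item) for item in shared_items]
total = sum(priorities)
print(priorities)  # [16, 38, 42, 22, 20, 19]
print(total)       # 157
```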

Find the item type that appears in both compartments of each rucksack. What is the sum of the priorities of those item types?

In [106]:
# Rough idea of solution:

# for each rucksack
#    find char appearing in each half
#    calculate the priority of that

In [107]:
"""
Lowercase item types a through z have priorities 1 through 26.
Uppercase item types A through Z have priorities 27 through 52.

ord() returns values as follows:

A = 65  => maps to 27  - diff of -38
Z = 90  => maps to 52  - diff of -38
a = 97  => maps to 1   - diff of -96
z = 122 => maps to 26  - diff of -96
"""
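def calculate_priority(c: str) -> int:
    char_value = ord(c)
    # Uppercase letters ('A'..'Z', code points 65..90) map to 27..52.
    if char_value <= ord('Z'):
        return char_value - 38
    # Lowercase letters ('a'..'z', code points 97..122) map to 1..26.
    return char_value - 96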

In [108]:
#@title Test calculate_priority function
import unittest

class TestCalculatePriority(unittest.TestCase):
    
    def test_a(self):
        self.assertEqual(calculate_priority('a'), 1)
    
    def test_z(self):
        self.assertEqual(calculate_priority('z'), 26)
    
    def test_A(self):
        self.assertEqual(calculate_priority('A'), 27)
    
    def test_Z(self):
        self.assertEqual(calculate_priority('Z'), 52)

unittest.main(argv=[''], verbosity=2, exit=False)

test_A (__main__.TestCalculatePriority.test_A) ... ok
test_Z (__main__.TestCalculatePriority.test_Z) ... ok
test_a (__main__.TestCalculatePriority.test_a) ... ok
test_z (__main__.TestCalculatePriority.test_z) ... ok
test_case_1 (__main__.TestFindDuplicateItem.test_case_1) ... ok
test_case_2 (__main__.TestFindDuplicateItem.test_case_2) ... ok
test_case_3 (__main__.TestFindDuplicateItem.test_case_3) ... ok
test_case_4 (__main__.TestFindDuplicateItem.test_case_4) ... ok
test_case_5 (__main__.TestFindDuplicateItem.test_case_5) ... ok
test_case_6 (__main__.TestFindDuplicateItem.test_case_6) ... ok
test_case_1 (__main__.TestFindDuplicateItemAcrossRucksacks.test_case_1) ... ok
test_case_2 (__main__.TestFindDuplicateItemAcrossRucksacks.test_case_2) ... ok

----------------------------------------------------------------------
Ran 12 tests in 0.025s

OK


<unittest.main.TestProgram at 0x7f8d25b06350>

In [109]:
from typing import Optional

def find_duplicate_item(rucksack: str) -> Optional[str]:
    """Split the rucksack in two and return the first item type found in both halves.

    Returns None if the halves share no item type.
    """
    # Integer division: both compartments always hold the same number of items.
    midpoint = len(rucksack) // 2
    first_half = rucksack[:midpoint]
    second_half = rucksack[midpoint:]

    for c in first_half:
        if c in second_half:
            return c

    return None

In [110]:
#@title Test find_duplicate_item function
import unittest

class TestFindDuplicateItem(unittest.TestCase):
        
    def test_case_1(self):
        self.assertEqual(find_duplicate_item('vJrwpWtwJgWrhcsFMMfFFhFp'), 'p')
        
    def test_case_2(self):
        self.assertEqual(find_duplicate_item('jqHRNqRjqzjGDLGLrsFMfFZSrLrFZsSL'), 'L')
        
    def test_case_3(self):
        self.assertEqual(find_duplicate_item('PmmdzqPrVvPwwTWBwg'), 'P')
        
    def test_case_4(self):
        self.assertEqual(find_duplicate_item('wMqvLMZHhHMvwLHjbvcjnnSBnvTQFn'), 'v')
        
    def test_case_5(self):
        self.assertEqual(find_duplicate_item('ttgJtRGJQctTZtZT'), 't')
        
    def test_case_6(self):
        self.assertEqual(find_duplicate_item('CrZsJsPPZsGzwwsLwLmpwMDw'), 's')

unittest.main(argv=[''], verbosity=2, exit=False)

test_A (__main__.TestCalculatePriority.test_A) ... ok
test_Z (__main__.TestCalculatePriority.test_Z) ... ok
test_a (__main__.TestCalculatePriority.test_a) ... ok
test_z (__main__.TestCalculatePriority.test_z) ... ok
test_case_1 (__main__.TestFindDuplicateItem.test_case_1) ... ok
test_case_2 (__main__.TestFindDuplicateItem.test_case_2) ... ok
test_case_3 (__main__.TestFindDuplicateItem.test_case_3) ... ok
test_case_4 (__main__.TestFindDuplicateItem.test_case_4) ... ok
test_case_5 (__main__.TestFindDuplicateItem.test_case_5) ... ok
test_case_6 (__main__.TestFindDuplicateItem.test_case_6) ... ok
test_case_1 (__main__.TestFindDuplicateItemAcrossRucksacks.test_case_1) ... ok
test_case_2 (__main__.TestFindDuplicateItemAcrossRucksacks.test_case_2) ... ok

----------------------------------------------------------------------
Ran 12 tests in 0.011s

OK


<unittest.main.TestProgram at 0x7f8d1abaf890>

In [111]:
#@title Imports
import pandas as pd

In [112]:
#@title import the data
df = pd.read_csv('input.txt', sep=' ', header=None)
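
Passing `sep=' '` works here only because no rucksack line contains a space, so pandas puts each whole line into a single column. As a hedged alternative, the lines can be read with the standard library alone. The helper name `read_rucksacks` and the throwaway sample file are illustrative assumptions, not part of the original notebook.

```python
import tempfile
from pathlib import Path


def read_rucksacks(path: Path) -> list[str]:
    """Read one rucksack per line, dropping newlines and blank lines."""
    return [line for line in path.read_text().splitlines() if line]


# Demonstration with a temporary file standing in for the real puzzle input.
with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.txt"
    sample.write_text("vJrwpWtwJgWrhcsFMMfFFhFp\nPmmdzqPrVvPwwTWBwg\n")
    rucksacks = read_rucksacks(sample)

print(rucksacks)  # ['vJrwpWtwJgWrhcsFMMfFFhFp', 'PmmdzqPrVvPwwTWBwg']
```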

In [113]:
#@title Clean and convert the data

# Add Column Names
df.columns = ['Rucksack']

# Find the misplaced item in each rucksack, then convert it to its priority
df['Item'] = df['Rucksack'].apply(find_duplicate_item)
df['Priority'] = df['Item'].apply(calculate_priority)

In [114]:
df['Priority'].sum()

7737

Your puzzle answer was 7737.


## Part Two

As you finish identifying the misplaced items, the Elves come to you with another issue.

For safety, the Elves are divided into groups of three. Every Elf carries a badge that identifies their group. For efficiency, within each group of three Elves, the badge is the only item type carried by all three Elves. That is, if a group's badge is item type B, then all three Elves will have item type B somewhere in their rucksack, and at most two of the Elves will be carrying any other item type.

The problem is that someone forgot to put this year's updated authenticity sticker on the badges. All of the badges need to be pulled out of the rucksacks so the new authenticity stickers can be attached.

Additionally, nobody wrote down which item type corresponds to each group's badges. The only way to tell which item type is the right one is by finding the one item type that is common between all three Elves in each group.

Every set of three lines in your list corresponds to a single group, but each group can have a different badge item type. So, in the above example, the first group's rucksacks are the first three lines:

```
vJrwpWtwJgWrhcsFMMfFFhFp
jqHRNqRjqzjGDLGLrsFMfFZSrLrFZsSL
PmmdzqPrVvPwwTWBwg
```

And the second group's rucksacks are the next three lines:

```
wMqvLMZHhHMvwLHjbvcjnnSBnvTQFn
ttgJtRGJQctTZtZT
CrZsJsPPZsGzwwsLwLmpwMDw
```

In the first group, the only item type that appears in all three rucksacks is lowercase r; this must be their badges. In the second group, their badge item type must be Z.

Priorities for these items must still be found to organize the sticker attachment efforts: here, they are 18 (r) for the first group and 52 (Z) for the second group. The sum of these is 70.
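
The worked example above can be reproduced with a set intersection across each group. This is a sketch under the puzzle's assumption that every group shares exactly one item type; the names `badge` and `priority` are hypothetical helpers introduced only for this illustration.

```python
import string


def badge(group: list[str]) -> str:
    """Return the single item type carried by every Elf in the group."""
    common = set.intersection(*(set(rucksack) for rucksack in group))
    # Assumption: exactly one common item type per group.
    (item,) = common
    return item


def priority(item: str) -> int:
    """Map 'a'..'z' to 1..26 and 'A'..'Z' to 27..52."""
    return string.ascii_letters.index(item) + 1


groups = [
    ["vJrwpWtwJgWrhcsFMMfFFhFp", "jqHRNqRjqzjGDLGLrsFMfFZSrLrFZsSL", "PmmdzqPrVvPwwTWBwg"],
    ["wMqvLMZHhHMvwLHjbvcjnnSBnvTQFn", "ttgJtRGJQctTZtZT", "CrZsJsPPZsGzwwsLwLmpwMDw"],
]
badges = [badge(g) for g in groups]
badge_total = sum(priority(b) for b in badges)
print(badges)       # ['r', 'Z']
print(badge_total)  # 70
```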

Find the item type that corresponds to the badges of each three-Elf group. What is the sum of the priorities of those item types?

In [115]:
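result = []

with open('input.txt') as f:
    temp = []
    for line in f:
        # Start a new group once the current one holds three rucksacks.
        if len(temp) == 3:
            result.append(temp)
            temp = []
        # Note: each line keeps its trailing '\n' (if any), which is why it
        # shows up in the DataFrame displayed below.
        temp.append(line)

    # Flush the final group; a partial group means the input is malformed.
    if len(temp) == 3:
        result.append(temp)
    elif len(temp) > 0:
        raise RuntimeError(f'Leftover elements not divisible by 3: {temp}')

df = pd.DataFrame(result, columns=['Sack 1', 'Sack 2', 'Sack 3'])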
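
The same grouping can also be written with list slicing, stripping the newlines along the way. This is a sketch rather than a drop-in replacement: the function name `chunk_into_groups` and its `size` parameter are assumptions made for illustration.

```python
def chunk_into_groups(lines: list[str], size: int = 3) -> list[list[str]]:
    """Strip each line and split the result into consecutive groups of `size`."""
    cleaned = [line.strip() for line in lines if line.strip()]
    if len(cleaned) % size != 0:
        raise ValueError(f"{len(cleaned)} rucksacks cannot be split into groups of {size}")
    # Step through the list `size` items at a time.
    return [cleaned[i:i + size] for i in range(0, len(cleaned), size)]


raw_lines = [
    "vJrwpWtwJgWrhcsFMMfFFhFp\n",
    "jqHRNqRjqzjGDLGLrsFMfFZSrLrFZsSL\n",
    "PmmdzqPrVvPwwTWBwg\n",
    "wMqvLMZHhHMvwLHjbvcjnnSBnvTQFn\n",
    "ttgJtRGJQctTZtZT\n",
    "CrZsJsPPZsGzwwsLwLmpwMDw\n",
]
grouped = chunk_into_groups(raw_lines)
print(len(grouped))   # 2
print(grouped[1][1])  # ttgJtRGJQctTZtZT
```

Stripping the lines first keeps newline characters out of the later membership checks, so a newline can never be mistaken for a shared item.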

In [116]:
df

|  | Sack 1 | Sack 2 | Sack 3 |
|---|---|---|---|
| 0 | rZTmmqbBrmBvSTCwDDtlwjqnqnnq\n | dhgQHhPfVgPlPdFzFzFgdptCQjtnwCntjsCppRtRND\n | lVdVHWGPvTvmrrBW\n |
| 1 | GmJBqwPLhfPBfJfvfffFmwtjDprpzVpVMpcDrVjzzcjpML\n | HgWnRnWggWbNTWbCnPCgCnsjcVDrjMrdzjprMMrzcHDrDr\n | SsSsRsCSPSPBvtJt\n |
| 2 | BLtwwTBmLSTlMsjdZmFZZP\n | hzbzNNrbqQbhQDDrhCprbhDCpvFJJPjMZJZgjjPdlvZjZv... | fbNrqrVDfdfGzqcHTBTVHwcTSHcH\n |
| 3 | lcDdcrCDCRHJHBllPR\n | tNGQwQhtzLhJBRHbPMBMjGBj\n | NZZZpVqqFqpQpCTZcTnrTrnJJC\n |
| 4 | BSNLNzbLLsMGSDLSsSBdVwTVQFdTVTtqgTwNVN\n | lRjplvmWpgrqwlVFFr\n | mfRCpWwvChZCBGSzZG\n |
| ... | ... | ... | ... |
| 95 | zjfgjMhhgMJdfHQHWdVQvR\n | CrmpmpZpHQptHHHQ\n | CnwcFbNCqQBFwwFFsPslJgsjhMlMcDJP\n |
| 96 | HpnStLpnQnHnqQLQqpMSSWWZbswNcNqwbNsfwqGGZc\n | dVRRTCTVJNLcfJcJFb\n | gzjTRCddgLDdzdjCCrBjjdhhBnQPSSBhvlSBQvMhQMnt\n |
| 97 | lFTlwMwZlblSjrCpVvvsptspZpps\n | nHRPPnqnhPRqJHhqqhfdPqLCHvBCvvscvVNczztCCvsvtm\n | RJDghDhRhhGPPqGhsPhhFSbbwGSFjGQlWTrbwQbW\n |
| 98 | RRjgNPTRFhglgNNjTsmGqCCGZfzmHCnZGnZCqq\n | SppWLbtbCzZMpHMZ\n | dSDbbJdVVlHFNlll\n |


In [117]:
from typing import Optional

def find_duplicate_item_across_rucksacks(rucksacks: list) -> Optional[str]:
    """Return the first item type in the shortest rucksack that is also in the other two.

    Returns None if the three rucksacks share no item type.
    """
    # Sort ascending by length so the loop scans the shortest rucksack.
    rucksacks = sorted(rucksacks, key=len)

    for c in rucksacks[0]:
        if c in rucksacks[1] and c in rucksacks[2]:
            return c

    return None

In [118]:
#@title Test find_duplicate_item_across_rucksacks function
import unittest

class TestFindDuplicateItemAcrossRucksacks(unittest.TestCase):
        
    def test_case_1(self):
        rucksacks = ['vJrwpWtwJgWrhcsFMMfFFhFp', 'jqHRNqRjqzjGDLGLrsFMfFZSrLrFZsSL', 'PmmdzqPrVvPwwTWBwg']
        self.assertEqual(find_duplicate_item_across_rucksacks(rucksacks), 'r')
        
    def test_case_2(self):
        rucksacks = ['wMqvLMZHhHMvwLHjbvcjnnSBnvTQFn', 'ttgJtRGJQctTZtZT', 'CrZsJsPPZsGzwwsLwLmpwMDw']
        self.assertEqual(find_duplicate_item_across_rucksacks(rucksacks), 'Z')
        
unittest.main(argv=[''], verbosity=2, exit=False)

test_A (__main__.TestCalculatePriority.test_A) ... ok
test_Z (__main__.TestCalculatePriority.test_Z) ... ok
test_a (__main__.TestCalculatePriority.test_a) ... ok
test_z (__main__.TestCalculatePriority.test_z) ... ok
test_case_1 (__main__.TestFindDuplicateItem.test_case_1) ... ok
test_case_2 (__main__.TestFindDuplicateItem.test_case_2) ... ok
test_case_3 (__main__.TestFindDuplicateItem.test_case_3) ... ok
test_case_4 (__main__.TestFindDuplicateItem.test_case_4) ... ok
test_case_5 (__main__.TestFindDuplicateItem.test_case_5) ... ok
test_case_6 (__main__.TestFindDuplicateItem.test_case_6) ... ok
test_case_1 (__main__.TestFindDuplicateItemAcrossRucksacks.test_case_1) ... ok
test_case_2 (__main__.TestFindDuplicateItemAcrossRucksacks.test_case_2) ... ok

----------------------------------------------------------------------
Ran 12 tests in 0.013s

OK


<unittest.main.TestProgram at 0x7f8d1aba4b50>

In [119]:
#@title Process the data

# Find each group's badge (the item common to all three sacks), then its priority
df['Duplicate'] = df.apply(lambda x: find_duplicate_item_across_rucksacks([x['Sack 1'], x['Sack 2'], x['Sack 3']]), axis=1)
df['Priority'] = df.apply(lambda x: calculate_priority(x['Duplicate']), axis=1)

In [120]:
df['Priority'].sum()

2697