Advent of Code, Day 5

Day 5 is the first puzzle where we need to put real effort into data ingestion and processing, because the parsed structure forms the basis for our logic.

Although the brackets depict the containers beautifully, they serve no functional purpose, so we'll strip them from our file (the row of stack numbers gets dropped once the data is loaded). In doing so, we end up with a structure that can be read reliably as whitespace-separated, fixed-width columns. We'll also break out the second part of the input into a move instructions file.

In [1]:
with open("data.csv") as file:
    lines = file.readlines()

In [2]:
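with open("container_layout.csv", "w") as file:
    for line in lines:
        # swap brackets for spaces so each crate letter stays in its column
        line = line.replace('[', ' ')
        line = line.replace(']', ' ')
        file.write(line)
        # the first blank line marks the end of the layout section
        if not line.strip():
            break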

In [3]:
with open("move_instructions.csv", "w") as file:
    for line in lines:
        if line.startswith('move'):
            file.write(line)
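
As a side note, the same split can be done in memory without writing intermediate files. The following is only a minimal sketch under stated assumptions: `RAW` is a made-up miniature input in the same shape as the puzzle file, and `split_sections` is a hypothetical helper name, not something used elsewhere in this notebook.

```python
# Hypothetical miniature input in the same shape as the puzzle file.
RAW = (
    "    [D]    \n"
    "[N] [C]    \n"
    "[Z] [M] [P]\n"
    " 1   2   3 \n"
    "\n"
    "move 1 from 2 to 1\n"
    "move 3 from 1 to 3\n"
)


def split_sections(raw):
    """Split raw puzzle text at the first blank line into (layout, moves)."""
    layout_text, _, moves_text = raw.partition("\n\n")
    # Swap brackets for spaces so every crate letter keeps its column position.
    layout_text = layout_text.replace("[", " ").replace("]", " ")
    layout_lines = layout_text.split("\n")
    move_lines = [ln for ln in moves_text.splitlines() if ln.startswith("move")]
    return layout_lines, move_lines


layout_lines, move_lines = split_sections(RAW)
print(layout_lines)
print(move_lines)
```

Splitting on the first blank line mirrors the `break` in the file-based version above: everything before it is layout, everything after it is instructions.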

With a separate file for each section of the input, we can now work on ingesting each in the way we need. We'll start with the container layout. Using pandas' `read_fwf` (read fixed-width file) makes this a breeze. We'll drop the last row, which holds the stack numbers, since the dataframe's own labels effectively give us the stack designation.

In [4]:
import pandas as pd

df = pd.read_fwf('container_layout.csv', header=None)

# drop the last row
df = df[:-1]
df

Unnamed: 0,0,1,2,3,4,5,6,7,8
0,,G,R,,,,,P,
1,,H,W,,T,P,,H,
2,,F,T,P,B,D,,N,
3,L,T,M,Q,L,C,,Z,
4,C,C,N,V,S,H,,V,G
5,G,L,F,D,M,V,T,J,H
6,M,D,J,F,F,N,C,S,F
7,Q,R,V,J,N,R,H,G,Z


To work with these stacks, let's transpose the dataframe so that each stack becomes a row we can treat as a list (we'll see why that is helpful shortly).

In [5]:
df = df.transpose()
df

Unnamed: 0,0,1,2,3,4,5,6,7
0,,,,L,C,G,M,Q
1,G,H,F,T,C,L,D,R
2,R,W,T,M,N,F,J,V
3,,,P,Q,V,D,F,J
4,,T,B,L,S,M,F,N
5,,P,D,C,H,V,N,R
6,,,,,,T,C,H
7,P,H,N,Z,V,J,S,G
8,,,,,G,H,F,Z


Now let's reverse the order of the columns so that each row reads from the bottom of the stack to the top, matching our original stack orientation.

In [6]:
df = df.iloc[:, ::-1]
df

Unnamed: 0,7,6,5,4,3,2,1,0
0,Q,M,G,C,L,,,
1,R,D,L,C,T,F,H,G
2,V,J,F,N,M,T,W,R
3,J,F,D,V,Q,P,,
4,N,F,M,S,L,B,T,
5,R,N,V,H,C,D,P,
6,H,C,T,,,,,
7,G,S,J,V,Z,N,H,P
8,Z,F,H,G,,,,


Now we'll convert our dataframe to a plain Python dictionary, keyed by stack number, with the stacks represented as lists.

In [7]:
df.index += 1
d = df.T.to_dict('list')
d

{1: ['Q', 'M', 'G', 'C', 'L', nan, nan, nan],
 2: ['R', 'D', 'L', 'C', 'T', 'F', 'H', 'G'],
 3: ['V', 'J', 'F', 'N', 'M', 'T', 'W', 'R'],
 4: ['J', 'F', 'D', 'V', 'Q', 'P', nan, nan],
 5: ['N', 'F', 'M', 'S', 'L', 'B', 'T', nan],
 6: ['R', 'N', 'V', 'H', 'C', 'D', 'P', nan],
 7: ['H', 'C', 'T', nan, nan, nan, nan, nan],
 8: ['G', 'S', 'J', 'V', 'Z', 'N', 'H', 'P'],
 9: ['Z', 'F', 'H', 'G', nan, nan, nan, nan]}

Working with built-in data structures keeps things lightweight and narrows the operations we'll need down to familiar list methods rather than library-specific functions. We'll go through our dictionary and remove the `nan` entries to complete the cleanup of our data.

In [8]:
# keep only real crates; pd.notna filters out the NaN padding in shorter stacks
for k, v in d.items():
    d[k] = [val for val in v if pd.notna(val)]
d

{1: ['Q', 'M', 'G', 'C', 'L'],
 2: ['R', 'D', 'L', 'C', 'T', 'F', 'H', 'G'],
 3: ['V', 'J', 'F', 'N', 'M', 'T', 'W', 'R'],
 4: ['J', 'F', 'D', 'V', 'Q', 'P'],
 5: ['N', 'F', 'M', 'S', 'L', 'B', 'T'],
 6: ['R', 'N', 'V', 'H', 'C', 'D', 'P'],
 7: ['H', 'C', 'T'],
 8: ['G', 'S', 'J', 'V', 'Z', 'N', 'H', 'P'],
 9: ['Z', 'F', 'H', 'G']}
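
For comparison, the transpose, reverse, and cleanup steps above can also be sketched with built-ins only. This is a rough illustration rather than the approach used in this notebook: the `rows` sample is made up, `build_stacks` is a hypothetical helper, and it assumes crate letters sit at character offsets 1, 5, 9, and so on, as they do once the brackets are replaced by spaces.

```python
from itertools import zip_longest

# Hypothetical layout rows: brackets already replaced by spaces,
# stack-number row already dropped.
rows = [
    "     D",
    " N   C",
    " Z   M   P",
]


def build_stacks(rows):
    """Return {stack_number: crates listed from bottom to top}."""
    # Assumption: crate letters sit at character offsets 1, 5, 9, ...
    cells = [row[1::4] for row in rows]
    stacks = {}
    # zip_longest transposes rows into columns, padding short rows with spaces.
    for number, column in enumerate(zip_longest(*cells, fillvalue=" "), start=1):
        # Reverse so the bottom crate comes first, then drop the empty slots.
        stacks[number] = [c for c in reversed(column) if c != " "]
    return stacks


stacks = build_stacks(rows)
print(stacks)
```

Here `zip_longest` plays the role of `transpose`, `reversed` plays the role of the column reversal, and the space filter stands in for removing the `nan` entries.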

With our container layout data prepared to our liking, let's read in our move instructions.

In [9]:
# each line reads "move N from S to T", so fields 1, 3 and 5 hold the numbers
moves = pd.read_csv("move_instructions.csv", header=None, sep=' ',
                    usecols=[1, 3, 5], names=['move', 'source', 'target'])
moves.head()

Unnamed: 0,move,source,target
0,5,8,2
1,2,4,5
2,3,3,9
3,4,1,8
4,5,9,1
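
If we wanted to skip pandas for this step, the same three numbers could be pulled out of each line with a regular expression. The sketch below is an alternative, not what the notebook does; `parse_move` is a hypothetical helper, and the sample lines simply restate the first three rows shown above.

```python
import re

# Sample instruction lines matching the first three rows shown above.
move_lines = [
    "move 5 from 8 to 2",
    "move 2 from 4 to 5",
    "move 3 from 3 to 9",
]


def parse_move(line):
    """Return (count, source, target) as integers from one instruction line."""
    # Assumption: every instruction line contains exactly three integers.
    count, source, target = (int(n) for n in re.findall(r"\d+", line))
    return count, source, target


parsed = [parse_move(line) for line in move_lines]
print(parsed)
```

The resulting tuples carry the same information as the `move`, `source`, and `target` columns of the dataframe.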


Having spent the bulk of our time in data prep, applying the move instructions is simple: for each instruction, we pop crates off the source stack one at a time and append them to the target stack!

In [10]:
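for idx, row in moves.iterrows():
    # row.move is the number of crates; they are moved one at a time
    for m in range(row.move):
        # skip the pop if the source stack is already empty
        if d[row.source]:
            d[row.target].append(d[row.source].pop())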

d

{1: ['D', 'N', 'H', 'M', 'T', 'Z', 'W', 'R', 'V'],
 2: ['C'],
 3: ['T'],
 4: ['D', 'F'],
 5: ['T'],
 6: ['N', 'P', 'J'],
 7: ['J', 'G', 'N', 'R', 'H', 'G', 'Q'],
 8: ['D',
  'L',
  'F',
  'N',
  'V',
  'H',
  'S',
  'Q',
  'M',
  'P',
  'H',
  'H',
  'F',
  'V',
  'C',
  'L',
  'F',
  'J',
  'C',
  'M',
  'R',
  'T',
  'Z',
  'L',
  'V',
  'G',
  'C'],
 9: ['F', 'P', 'S', 'B', 'G']}
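
To see the pop-and-append mechanics on something small enough to trace by hand, here is a minimal sketch on a three-stack toy layout. The names `apply_moves`, `top_crates`, `toy_stacks`, and `toy_moves` are hypothetical and only serve this illustration; the last step reads off the crate on top of each stack, which is what the puzzle asks for.

```python
def apply_moves(stacks, moves):
    """Move crates one at a time, so each moved group arrives in reverse order."""
    # Copy the lists so the caller's dictionary is left untouched.
    result = {key: list(crates) for key, crates in stacks.items()}
    for count, source, target in moves:
        for _ in range(count):
            if result[source]:  # skip if the source stack is already empty
                result[target].append(result[source].pop())
    return result


def top_crates(stacks):
    """Join the top (last) crate of every non-empty stack, in stack order."""
    return "".join(stacks[key][-1] for key in sorted(stacks) if stacks[key])


# Toy layout, each stack listed from bottom to top.
toy_stacks = {1: ["Z", "N"], 2: ["M", "C", "D"], 3: ["P"]}
# Each tuple is (count, source, target).
toy_moves = [(1, 2, 1), (3, 1, 3), (2, 2, 1), (1, 1, 2)]

final = apply_moves(toy_stacks, toy_moves)
print(final)
print(top_crates(final))
```

Because each stack is stored bottom to top, the top crate is always the last list element, so the same `[-1]` lookup applies to the dictionary `d` printed above.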