[Advent of Code 2022 - Day 7](https://adventofcode.com/2022/day/7)

**Assumption:** No file or folder is named `size` (true for this puzzle input), because each folder's `dict` stores its total under the key `size` alongside its children.

In [None]:
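# Read all puzzle input lines, without their trailing newlines
# (a trailing '\n' would make the `command == 'ls'` check below fail)
with open('day7_input.txt') as f:
    lines = f.read().splitlines()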

The puzzle input is parsed here into 2 outputs:
* `file_tree` (`dict`): A *virtual file tree* in which each folder is a `dict` holding its total `size` (`int`), a `..` link to its parent folder, and one entry per child (a nested `dict` for a sub-folder, an `int` size for a file).
* `file_flat` (`dict`): A flattened version of `file_tree` that maps each file or folder path to the file's size (`int`) or to the folder's `dict`, whose `size` entry holds the folder's total size.
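
To make the two shapes concrete, here is a hand-built sketch for a small made-up listing (a root folder containing a folder `a` and a file `b.txt`, with `a` containing `c.txt`). The names and sizes are assumptions for illustration only and do not come from the puzzle input.

```python
# Hypothetical listing: / holds folder a and file b.txt (100); a holds c.txt (200)
root = {'..': None, 'size': 300}             # the root folder has no parent
a = {'..': root, 'size': 200, 'c.txt': 200}  # a sub-folder links back to its parent
root['a'] = a
root['b.txt'] = 100

file_tree = {'/': root}

# The flat view reuses the very same folder dicts, keyed by path
file_flat = {'///a': a, '///b.txt': 100, '///a/c.txt': 200}

print(file_flat['///a'] is file_tree['/']['a'])  # same object, so sizes stay in sync
```

Because both structures share the folder `dict` objects, a size added to a folder through the tree is also visible through the flat mapping.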

In [None]:
file_flat = {}                                                 # { folder_path: {...folder-tree...}, file_path: size_int }
file_tree = { '/': { '..': None, 'size': 0 } }                 # { folder_name: {...folder-tree...}, file_name: size_int }
cwd, pwd = file_tree, ''
i = 0
while i < len(lines):
    command = lines[i].removeprefix('$ ')
    print(i, 'COMMAND:', command)
    i += 1
    if command == 'ls':
        while i < len(lines):
            if lines[i].startswith('$ '):
                break
            print(i, 'DIR LISTING:', lines[i])
            description, name = lines[i].split()
            i += 1
            if description == 'dir':
                file_flat[pwd + '/' + name] = cwd[name] = { '..': cwd, 'size': 0 }
            else:
                size = int(description)
                file_flat[pwd + '/' + name] = cwd[name] = size
                folder = cwd
                while folder is not None:
                    folder['size'] += size
                    folder = folder['..']
    else:  # the only other command is `cd <folder_name>`
        folder_name = command.removeprefix('cd ')
        cwd = cwd[folder_name]
        if folder_name == '..':
            pwd = pwd.rpartition('/')[0]
        else:
            pwd += '/' + folder_name
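
As a possible cross-check of the folder totals, the sketch below uses a different technique from the loop above: a stack of currently open folder paths, with every file size credited to each folder on the stack. The function name `folder_sizes` and the sample listing are assumptions for illustration. Paths here use a single leading `/`, so they will not match the `file_flat` keys, and folders that contain no files at any depth do not appear in the result.

```python
from collections import defaultdict

def folder_sizes(lines):
    """Return {folder_path: total_size} using a stack of open folders."""
    sizes = defaultdict(int)
    stack = []  # paths of the folders from the root down to the current one
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # ignore blank lines
        if parts[:2] == ['$', 'cd']:
            target = parts[2]
            if target == '..':
                stack.pop()
            elif target == '/':
                stack = ['/']
            else:
                stack.append(stack[-1].rstrip('/') + '/' + target)
        elif parts[0] not in ('$', 'dir'):
            # A file line: credit its size to every folder on the stack
            for path in stack:
                sizes[path] += int(parts[0])
    return dict(sizes)

# Made-up listing, not taken from the puzzle input
sample = [
    '$ cd /', '$ ls', 'dir a', '100 b.txt',
    '$ cd a', '$ ls', '200 c.txt', 'dir d',
    '$ cd d', '$ ls', '50 e.txt',
    '$ cd ..', '$ cd ..',
]
sample_sizes = folder_sizes(sample)
print(sample_sizes)
```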

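Convert `file_flat` to a pandas `DataFrame` for analysis. Every path starts with `///` because `pwd` becomes `//` after the initial `cd /`, and each entry is then joined to it with another `/`.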

In [45]:
import pandas as pd
df = pd.DataFrame(
    {
        'name': name,
        'type': 'folder' if isinstance(info, dict) else 'file',
        'size': info['size'] if isinstance(info, dict) else info
    }
    for name, info in file_flat.items()
)
df

| | name | type | size |
|---|---|---|---|
| 0 | ///cgw | folder | 1994012 |
| 1 | ///fbhz | folder | 31535668 |
| 2 | ///lvrzvt | folder | 1447647 |
| 3 | ///vngq | file | 224312 |
| 4 | ///vwlps | folder | 13546432 |
| ... | ... | ... | ... |
| 454 | ///vwlps/rfb/mtzbmlnp/ttsjtcc/tjbnz | folder | 73261 |
| 455 | ///vwlps/rfb/mtzbmlnp/ttsjtcc/fbhz/slzn.jls | file | 112216 |
| 456 | ///vwlps/rfb/mtzbmlnp/ttsjtcc/tjbnz/fbhz.wtd | file | 73261 |
| 457 | ///vwlps/rfb/nsq/srvhswd.mcg | file | 93689 |


**Answer to Part 1** -> Sum of sizes of all folders having size at most 100,000

In [46]:
df[(df['type'] == 'folder') & (df['size'] <= 100_000)]['size'].sum()

1315285
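
The same filter can also be sketched without pandas, directly over a `file_flat`-shaped mapping. The helper name `sum_small_folders` and the toy data are assumptions for illustration, not part of the notebook.

```python
def sum_small_folders(file_flat, limit=100_000):
    """Sum the sizes of all folders whose total size is at most `limit`."""
    return sum(
        info['size']
        for info in file_flat.values()
        if isinstance(info, dict) and info['size'] <= limit  # folders are dicts
    )

# Toy mapping: folders are dicts with a 'size' entry, files are plain ints
toy_flat = {
    '///a': {'size': 250},
    '///a/d': {'size': 50},
    '///b.txt': 100,
    '///big': {'size': 200_000},
}
print(sum_small_folders(toy_flat))
```

Nested folders are counted independently here, so a file inside `///a/d` contributes to both `///a` and `///a/d`, which matches what the puzzle asks for.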

In [47]:
# Space still to be freed: the 30M needed minus the currently unused space (70M total - used)
free_space_required = 30_000_000 - (70_000_000 - file_tree['/']['size'])
free_space_required

8748071
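
The arithmetic can be spelled out as a small helper. This is only a sketch: the function name `space_to_free` and its parameter names are assumptions, and the clamp to zero is an addition the notebook does not need for this input. The root size used below is the one implied by the output above (8,748,071 + 40,000,000).

```python
def space_to_free(used, total=70_000_000, needed=30_000_000):
    """Return how much must be deleted so that `needed` units are unused."""
    unused = total - used            # space that is free right now
    return max(0, needed - unused)   # nothing to delete if enough is already free

root_size = 8_748_071 + 40_000_000   # implied by the notebook output
print(space_to_free(root_size))
```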

**Answer to Part 2** -> Size of the smallest folder whose deletion would free up sufficient space

In [48]:
# idxmin() returns an index label, so look the row up with .loc rather than .iloc
df.loc[df[(df['type'] == 'folder') & (df['size'] >= free_space_required)]['size'].idxmin()]

name    ///fbhz/zhj/dqzgfcd/vwhnlp
type                        folder
size                       9847279
Name: 130, dtype: object
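
For comparison, a pandas-free sketch of the same selection over a `file_flat`-shaped mapping. The helper name `smallest_sufficient_folder` and the toy data are assumptions for illustration.

```python
def smallest_sufficient_folder(file_flat, required):
    """Return (size, path) of the smallest folder with size >= required, or None."""
    candidates = (
        (info['size'], path)
        for path, info in file_flat.items()
        if isinstance(info, dict) and info['size'] >= required  # folders only
    )
    return min(candidates, default=None)

# Toy mapping: folders are dicts with a 'size' entry, files are plain ints
toy_flat = {
    '///a': {'size': 250},
    '///a/d': {'size': 50},
    '///b.txt': 100,
    '///big': {'size': 200_000},
}
print(smallest_sufficient_folder(toy_flat, required=100))
```

Returning `None` when no folder is large enough avoids the exception that `min` would otherwise raise on an empty sequence.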