# Day 7

## Part 1

To begin, find all of the directories with a total size of at most 100000, then calculate the sum of their total sizes. In the example above, these directories are a and e; the sum of their total sizes is 95437 (94853 + 584). (As in this example, this process can count files more than once!)
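The arithmetic in the example can be checked with a minimal sketch. The directory totals are the ones the puzzle text quotes for its example (a and e here, d and / in Part 2); the names `example_sizes` and `limit` are introduced only for illustration.

```python
# Directory totals quoted in the puzzle's example (variable names are illustrative).
example_sizes = {"/": 48381165, "a": 94853, "d": 24933642, "e": 584}
limit = 100000

# Keep every directory whose total size is at most the limit.
small = {name: size for name, size in example_sizes.items() if size <= limit}
total = sum(small.values())

print(sorted(small))  # ['a', 'e']
print(total)          # 95437
```

In the puzzle's example, e is nested inside a, so e's 584 is included in both totals. That is the double counting the text mentions.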

Find all of the directories with a total size of at most 100000. What is the sum of the total sizes of those directories?

In [19]:
from anytree import Node, PreOrderIter

# splitlines() avoids the empty last entry that split("\n") yields for a trailing newline
with open("input.txt", "r") as file:
    contents = file.read().splitlines()

main = Node("/")
current_node = main

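def get_or_create_dir(parent: Node, name: str) -> Node:
    """Return the child directory `name`, creating it the first time it is seen."""
    for child in parent.children:
        if child.name == name:
            return child
    return Node(name, parent=parent)

# load tree from input
for line in contents:
    if not line:
        continue  # ignore blank lines
    if line.startswith("$"):
        if line == "$ cd /":
            current_node = main
        elif line == "$ cd ..":
            current_node = current_node.parent
        elif line.startswith("$ cd "):
            # reuse the node created by an earlier "dir" listing instead of duplicating it
            current_node = get_or_create_dir(current_node, line[5:])
        # "$ ls" needs no action: its output lines are handled below
    elif line.startswith("dir "):
        get_or_create_dir(current_node, line[4:])
    else:
        # files are stored as "file <size>", e.g. "file 584"
        Node("file " + line.split(" ")[0], parent=current_node)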

def get_filesize_per_node(node: Node) -> int:
    """Total size of a directory: its own files plus all nested directories."""
    size = sum(int(child.name.split(" ")[1]) for child in node.children if child.name.startswith("file "))
    children = [child for child in node.children if not child.name.startswith("file ")]
    return size + sum(get_filesize_per_node(child) for child in children)

threshold = 100000

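def get_filesize_with_threshold(main: Node, threshold: int) -> int:
    """Sum the total sizes of all directories no larger than `threshold`."""
    # match on "file " (with the space) so a directory named e.g. "files" is not skipped
    nodes = [node for node in PreOrderIter(main) if not node.name.startswith("file ")]
    sizes = [get_filesize_per_node(node) for node in nodes]
    sizes_filtered = [size for size in sizes if size <= threshold]
    return sum(sizes_filtered)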

print(get_filesize_with_threshold(main, threshold))



1517599
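As a cross-check on the tree-based approach, the same totals can be computed without anytree by keeping a stack of directory names and adding each file's size to every directory on the current path. This is a sketch of an alternative, not the method used in the cell above. The `EXAMPLE` transcript is an assumption: it is meant to mirror the puzzle's example, which is not reproduced in this notebook, and `directory_sizes` is a hypothetical helper name.

```python
from collections import defaultdict

# Assumed transcript mirroring the puzzle's example (not part of this notebook).
EXAMPLE = """\
$ cd /
$ ls
dir a
14848514 b.txt
8504156 c.dat
dir d
$ cd a
$ ls
dir e
29116 f
2557 g
62596 h.lst
$ cd e
$ ls
584 i
$ cd ..
$ cd ..
$ cd d
$ ls
4060174 j
8033020 d.log
5626152 d.ext
7214296 k
"""

def directory_sizes(transcript: str) -> dict:
    """Map each directory path (a tuple of names) to its total size."""
    sizes = defaultdict(int)
    path = []  # stack of directory names from the root down
    for line in transcript.splitlines():
        parts = line.split()
        if not parts:
            continue  # ignore blank lines
        if parts[0] == "$":
            if parts[1] == "cd":
                if parts[2] == "/":
                    path = ["/"]
                elif parts[2] == "..":
                    path.pop()
                else:
                    path.append(parts[2])
            # "$ ls" needs no action
        elif parts[0] != "dir":
            # A file: add its size to the current directory and every ancestor.
            for depth in range(1, len(path) + 1):
                sizes[tuple(path[:depth])] += int(parts[0])
    return dict(sizes)

sizes = directory_sizes(EXAMPLE)
print(sizes[("/",)])                                           # 48381165
print(sum(size for size in sizes.values() if size <= 100000))  # 95437
```

Directories that contain no files at any depth never appear in `sizes`; since they would contribute 0, the Part 1 sum is unaffected.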


## Part 2

Now, you're ready to choose a directory to delete.

The total disk space available to the filesystem is 70000000. To run the update, you need unused space of at least 30000000. You need to find a directory you can delete that will free up enough space to run the update.

In the example above, the total size of the outermost directory (and thus the total amount of used space) is 48381165; this means that the size of the unused space must currently be 21618835, which isn't quite the 30000000 required by the update. Therefore, the update still requires a directory with total size of at least 8381165 to be deleted before it can run.
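The figures in this paragraph follow from two subtractions, sketched below. The variable names are illustrative; the numbers are the ones quoted in the puzzle text.

```python
total_disk = 70000000     # total space available to the filesystem
needed_unused = 30000000  # unused space the update requires
used = 48381165           # total size of / in the example

unused = total_disk - used          # space that is currently free
must_free = needed_unused - unused  # minimum size of the directory to delete

print(unused)     # 21618835
print(must_free)  # 8381165
```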

To achieve this, you have the following options:

- Delete directory e, which would increase unused space by 584.
- Delete directory a, which would increase unused space by 94853.
- Delete directory d, which would increase unused space by 24933642.
- Delete directory /, which would increase unused space by 48381165.

Directories e and a are both too small; deleting them would not free up enough space. However, directories d and / are both big enough! Between these, choose the smallest: d, increasing unused space by 24933642.
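The selection described above amounts to a filter followed by a minimum. A minimal sketch using the four directory totals from the example and the required 8381165 (the names `example_sizes`, `must_free`, `candidates`, and `best` are illustrative):

```python
# Directory totals from the example's list of options.
example_sizes = {"e": 584, "a": 94853, "d": 24933642, "/": 48381165}
must_free = 8381165  # space the update still needs, from the example

# Keep only directories big enough, then take the smallest of those.
candidates = {name: size for name, size in example_sizes.items() if size >= must_free}
best = min(candidates, key=candidates.get)

print(sorted(candidates))      # ['/', 'd']
print(best, candidates[best])  # d 24933642
```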

Find the smallest directory that, if deleted, would free up enough space on the filesystem to run the update. What is the total size of that directory?

In [20]:
from anytree import Node, PreOrderIter

# splitlines() avoids the empty last entry that split("\n") yields for a trailing newline
with open("input.txt", "r") as file:
    contents = file.read().splitlines()

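main = Node("/")
current_node = main

def get_or_create_dir(parent: Node, name: str) -> Node:
    """Return the child directory `name`, creating it the first time it is seen."""
    for child in parent.children:
        if child.name == name:
            return child
    return Node(name, parent=parent)

# load tree from input
for line in contents:
    if not line:
        continue  # ignore blank lines
    if line.startswith("$"):
        if line == "$ cd /":
            current_node = main
        elif line == "$ cd ..":
            current_node = current_node.parent
        elif line.startswith("$ cd "):
            # reuse the node created by an earlier "dir" listing instead of duplicating it
            current_node = get_or_create_dir(current_node, line[5:])
        # "$ ls" needs no action: its output lines are handled below
    elif line.startswith("dir "):
        get_or_create_dir(current_node, line[4:])
    else:
        # files are stored as "file <size>", e.g. "file 584"
        Node("file " + line.split(" ")[0], parent=current_node)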

def get_filesize_per_node(node: Node) -> int:
    """Total size of a directory: its own files plus all nested directories."""
    size = sum(int(child.name.split(" ")[1]) for child in node.children if child.name.startswith("file "))
    children = [child for child in node.children if not child.name.startswith("file ")]
    return size + sum(get_filesize_per_node(child) for child in children)

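total_size = 70000000   # total disk space available to the filesystem
size_target = 30000000  # unused space required by the update
current_size = get_filesize_per_node(main)  # space currently in use
# space that must be freed: the required unused space minus what is already unused
threshold = size_target - (total_size - current_size)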

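def get_smallest_folder_size(main: Node, threshold: int) -> int:
    """Size of the smallest directory whose total size is at least `threshold`."""
    # match on "file " (with the space) so a directory named e.g. "files" is not skipped
    nodes = [node for node in PreOrderIter(main) if not node.name.startswith("file ")]
    sizes = [get_filesize_per_node(node) for node in nodes]
    sizes_filtered = [size for size in sizes if size >= threshold]
    return min(sizes_filtered)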

print(get_smallest_folder_size(main, threshold))

2481982
