## No Space Left On Device

You can hear birds chirping and raindrops hitting leaves as the expedition proceeds. Occasionally, you can even hear much louder sounds in the distance; how big do the animals get out here, anyway?

The device the Elves gave you has problems with more than just its communication system. You try to run a system update:

```
$ system-update --please --pretty-please-with-sugar-on-top
Error: No space left on device
```

Perhaps you can delete some files to make space for the update?

You browse around the filesystem to assess the situation and save the resulting terminal output (your puzzle input). For example:

```
$ cd /
$ ls
dir a
14848514 b.txt
8504156 c.dat
dir d
$ cd a
$ ls
dir e
29116 f
2557 g
62596 h.lst
$ cd e
$ ls
584 i
$ cd ..
$ cd ..
$ cd d
$ ls
4060174 j
8033020 d.log
5626152 d.ext
7214296 k
```

The filesystem consists of a tree of files (plain data) and directories (which can contain other directories or files). The outermost directory is called /. You can navigate around the filesystem, moving into or out of directories and listing the contents of the directory you're currently in.

Within the terminal output, lines that begin with $ are commands you executed, very much like some modern computers:

- `cd` means change directory. This changes which directory is the current directory, but the specific result depends on the argument:

  - `cd x` moves in one level: it looks in the current directory for the
    directory named `x` and makes it the current directory.

  - `cd ..` moves out one level: it finds the directory that contains
    the current directory, then makes that directory the current directory.

  - `cd /` switches the current directory to the outermost directory, /.
  
- `ls` means list. It prints out all of the files and directories immediately contained by the current directory:
        
  - `123 abc` means that the current directory contains a file named
    `abc` with size 123.
        
  - `dir xyz` means that the current directory contains a directory
    named `xyz`.
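
The three forms of `cd` described above can be sketched with a plain list acting as the current path. This is only an illustration: `apply_cd` is a hypothetical helper, not part of the notebook's code, and it assumes the current directory is represented by the list of directory names below `/`.

```python
def apply_cd(path, arg):
    """Return the new path (a list of directory names) after `cd arg`."""
    if arg == "/":
        return []             # back to the outermost directory
    if arg == "..":
        return path[:-1]      # up one level; stays at the root if already there
    return path + [arg]       # down one level into `arg`


path = []
# The cd arguments from the example terminal output, in order
for arg in ["/", "a", "e", "..", "..", "d"]:
    path = apply_cd(path, arg)
    print("/" + "/".join(path))
```

Returning a new list instead of mutating the old one keeps each step easy to inspect; a real implementation could just as well use `append` and `pop`.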

Given the commands and output in the example above, you can determine that the filesystem looks visually like this:

```
- / (dir)
  - a (dir)
    - e (dir)
      - i (file, size=584)
    - f (file, size=29116)
    - g (file, size=2557)
    - h.lst (file, size=62596)
  - b.txt (file, size=14848514)
  - c.dat (file, size=8504156)
  - d (dir)
    - j (file, size=4060174)
    - d.log (file, size=8033020)
    - d.ext (file, size=5626152)
    - k (file, size=7214296)
```

Here, there are four directories: `/` (the outermost directory), `a` and `d` (which are in `/`), and `e` (which is in `a`). These directories also contain files of various sizes.
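
One possible way to picture this tree in code, before building dedicated classes, is a nested dictionary. The layout below is an assumption made for illustration (directories as dicts, files as integer sizes), and `directory_paths` is a hypothetical helper that lists the four directories named above.

```python
# Directories are dicts, files are integer sizes (an illustrative layout).
example_tree = {
    "a": {
        "e": {"i": 584},
        "f": 29116,
        "g": 2557,
        "h.lst": 62596,
    },
    "b.txt": 14848514,
    "c.dat": 8504156,
    "d": {
        "j": 4060174,
        "d.log": 8033020,
        "d.ext": 5626152,
        "k": 7214296,
    },
}


def directory_paths(tree, prefix="/"):
    """Yield the path of `tree` and of every directory nested below it."""
    yield prefix
    for name, entry in tree.items():
        if isinstance(entry, dict):
            yield from directory_paths(entry, prefix.rstrip("/") + "/" + name)


paths = list(directory_paths(example_tree))
print(paths)
```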

In [1]:
import pytest

In [2]:
def load_input(filename):
    with open(filename, 'r') as f_in:
        # Iterate over the file lazily instead of reading every line up front
        for line in f_in:
            yield line.strip()
        
src = list(load_input('sample.txt'))
assert len(src) == 23
assert src[0] == '$ cd /'
assert src[-1] == '7214296 k'

In [3]:
from dataclasses import dataclass, field

ROOT = '/'

_We'll need a Directory class_

In [4]:
@dataclass
class Directory:
    name: str
    content: dict = field(default_factory=dict)
    
    def __len__(self):
        return len(self.content)
    
    def is_file(self):
        return False
    
    def is_dir(self):
        return True


d = Directory('a')
assert d.name == 'a'
assert d.content == {}
assert d.is_dir() is True
assert d.is_file() is False

_We'll need a File class_

In [5]:
@dataclass
class File:
    name: str
    size: int
    
    def is_file(self):
        return True
    
    def is_dir(self):
        return False

f = File('a.txt', 123)
assert f.name == 'a.txt'
assert f.size == 123
assert f.is_dir() is False
assert f.is_file() is True  

_We'll need a FileSystem class_

In [6]:
class FileSystem:
    
    def __init__(self):
        self.root = self.current_dir = Directory(
            name=ROOT,
            content={},
        )
        self.cwd = [self.root]
        
        
    def cd_root(self):
        self.current_dir = self.root
        self.cwd = [self.root]
        
    def cd(self, path):
        assert isinstance(path, str)
        assert path in self.current_dir.content
        new_dir = self.current_dir.content[path]
        assert new_dir.is_dir()
        self.cwd.append(new_dir)
        self.current_dir = new_dir
        
    def cd_pop(self):
        if len(self.cwd) > 1:
            self.cwd.pop()
        self.current_dir = self.cwd[-1]
        
    def add_file(self, file_name: str, file_size: int):
        self.current_dir.content[file_name] = File(file_name,file_size)
        
    def add_directory(self, dir_name):
        self.current_dir.content[dir_name] = Directory(dir_name)

    def tree(self, entries=None, level=0):
        if entries is None and level == 0:
            print('/ (dir)')
            entries = self.root.content
        for name, entry in entries.items():
            print(
                ' ' * level,
                '-',
                name,
                '(dir)' if entry.is_dir() else f' (file, size={entry.size})',
            )
            if entry.is_dir():
                self.tree(entry.content, level=level+2)
        
        
        
fs = FileSystem()
assert fs.cwd == [fs.current_dir]
assert fs.current_dir.name == ROOT
assert fs.current_dir.content == {}
fs.add_directory('etc')
fs.add_file('a.txt', 123)
fs.add_file('b.txt', 456)
assert fs.current_dir.content['etc'] == Directory('etc')
assert fs.current_dir.content['a.txt'] == File('a.txt', 123)
fs.cd('etc')
assert len(fs.current_dir) == 0
fs.add_file('passwd', 9532)
assert len(fs.current_dir) == 1
assert fs.current_dir.content['passwd'] == File('passwd', 9532)
assert fs.cwd[-1].name == 'etc'
fs.cd_pop()
assert fs.cwd == [fs.current_dir]
fs.cd_pop() # No effect in root
assert fs.cwd == [fs.current_dir]
fs.tree()

/ (dir)
 - etc (dir)
   - passwd  (file, size=9532)
 - a.txt  (file, size=123)
 - b.txt  (file, size=456)


In [7]:
fs = FileSystem()
fs.add_directory('a')
fs.cd('a')
fs.add_directory('b')
fs.cd('b')
fs.add_directory('c')
fs.cd('c')
fs.tree()

/ (dir)
 - a (dir)
   - b (dir)
     - c (dir)


_We'll need a function to parse one line_

In [8]:
def parse_line(line):
    if line[0] == '$':  # is a command
        cmd = line[2:]
        if cmd.startswith('cd '):
            return tuple(cmd.split(' ', 1))
        elif cmd == 'ls':
            return cmd
        raise ValueError(f"Command {cmd} not valid")
    else:  # Is a file/dir entry
        dir_or_size, name = line.split(' ')
        if dir_or_size.isnumeric():
            return 'file', name, int(dir_or_size)
        elif dir_or_size == 'dir':
            return 'dir', name
        raise ValueError(f"Can't understand line {line}")
    

assert parse_line('$ cd /') == ('cd', '/')
assert parse_line('$ cd ..') == ('cd', '..')
assert parse_line('$ cd abc') == ('cd', 'abc')
assert parse_line('$ ls') == 'ls'
assert parse_line('dir etc') == ('dir', 'etc')
assert parse_line('58932 passwd') == ('file', 'passwd', 58932)
    

_And finally we'll need a function to read, parse and execute lines_

In [9]:
def execute(file_name):
    fs = FileSystem()
    with open(file_name, 'r') as f_input:
        for line in f_input.readlines():
            line = line.strip()
            match parse_line(line):
                case 'cd', '/':
                    fs.cd_root()
                case 'cd', '..':
                    fs.cd_pop()
                case 'cd', dir_name:
                    fs.cd(dir_name)
                case 'file', name, size:
                    fs.add_file(name, size)
                case 'dir', name:
                    fs.add_directory(name)
    return fs

In [10]:
fs = execute('sample.txt')
fs.tree()

/ (dir)
 - a (dir)
   - e (dir)
     - i  (file, size=584)
   - f  (file, size=29116)
   - g  (file, size=2557)
   - h.lst  (file, size=62596)
 - b.txt  (file, size=14848514)
 - c.dat  (file, size=8504156)
 - d (dir)
   - j  (file, size=4060174)
   - d.log  (file, size=8033020)
   - d.ext  (file, size=5626152)
   - k  (file, size=7214296)


Since the disk is full, your first step should probably be to find directories that are good candidates for deletion. To do this, you need to determine the total size of each directory. The total size of a directory is the sum of the sizes of the files it contains, directly or indirectly. (Directories themselves do not count as having any intrinsic size.)

The total sizes of the directories above can be found as follows:

- The total size of directory `e` is 584 because it contains a single file `i` of size 584 and no other directories.

- The directory `a` has total size 94853 because it contains files `f` (size 29116), `g` (size 2557), and `h.lst` (size 62596), plus file `i` indirectly (`a` contains `e` which contains `i`).

- Directory `d` has total size 24933642.
    
As the outermost directory, `/` contains every file. Its total size is 48381165, the sum of the size of every file.
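
The totals quoted above can be checked with plain arithmetic. The variable names are illustrative only; the numbers are the file sizes from the example listing.

```python
size_e = 584                                      # e holds only file i
size_a = 29116 + 2557 + 62596 + size_e            # f + g + h.lst + everything in e
size_d = 4060174 + 8033020 + 5626152 + 7214296    # j + d.log + d.ext + k
size_root = size_a + 14848514 + 8504156 + size_d  # a + b.txt + c.dat + d

print(size_e, size_a, size_d, size_root)
```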

To begin, **find all of the directories with a total size of at most 100000, then calculate the sum of their total sizes**.

In [11]:
def get_directory_sizes(filename):
    stack = []  # (directory name, total size) pairs, children before parents

    def get_sizes(directory):
        assert directory.is_dir()
        size = 0
        for item in directory.content.values():
            if item.is_file():
                size += item.size
            else:
                size += get_sizes(item)
        # Post-order: a directory is recorded after all of its subdirectories
        stack.append((directory.name, size))
        return size

    fs = execute(filename)
    get_sizes(fs.root)
    return stack

In the example above, the directories with a total size of at most 100000 are `a` and `e`; the sum of their total sizes is 95437 (94853 + 584). (As in this example, this process can count files more than once: file `i` contributes to both `e` and `a`.)

In [12]:
print(get_directory_sizes('sample.txt'))

[('e', 584), ('a', 94853), ('d', 24933642), ('/', 48381165)]
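
The pairs above are keyed by bare directory name. That is enough here, but a larger input may reuse the same name in different parts of the tree, in which case a mapping keyed by name alone would merge unrelated directories. As a hedged alternative, the sketch below keys sizes by full path and works directly on the terminal lines. `sizes_by_path` and the tiny listing are hypothetical, invented for illustration; the sketch assumes each directory is listed only once, and directories that contain no files at any depth do not appear in the result.

```python
from collections import defaultdict


def sizes_by_path(lines):
    """Map each directory's full path to its total size."""
    sizes = defaultdict(int)
    path = []  # directory names below the root
    for line in lines:
        parts = line.split()
        if parts[0] == "$":
            if parts[1] == "cd":
                if parts[2] == "/":
                    path = []
                elif parts[2] == "..":
                    path = path[:-1]
                else:
                    path = path + [parts[2]]
        elif parts[0] != "dir":
            # A file counts toward its own directory and every ancestor
            for depth in range(len(path) + 1):
                sizes["/" + "/".join(path[:depth])] += int(parts[0])
    return dict(sizes)


# Hypothetical listing in which the name `x` occurs under both `a` and `b`
lines = [
    "$ cd /", "$ ls", "dir a", "dir b",
    "$ cd a", "$ ls", "dir x", "10 f",
    "$ cd x", "$ ls", "1 g",
    "$ cd ..", "$ cd ..",
    "$ cd b", "$ ls", "dir x",
    "$ cd x", "$ ls", "200 h",
]
result = sizes_by_path(lines)
print(result)
```

Here `/a/x` and `/b/x` stay separate entries, which a name-keyed dictionary could not guarantee.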


In [13]:
def solution_one(filename):
    return sum(
        size
        for _name, size in get_directory_sizes(filename)
        if size <= 100000
    )
assert solution_one('sample.txt') == 95437

In [14]:
sol = solution_one('input.txt')
print(f"Solution part one: {sol}")

Solution part one: 1723892


## Part two

Now, you're ready to choose a directory to delete.

The total disk space available to the filesystem is 70000000. To run the update, you need unused space of at least 30000000. You need to find a directory you can delete that will free up enough space to run the update.

In the example above, the total size of the outermost directory (and thus the total amount of used space) is 48381165; this means that the size of the unused space must currently be 21618835, which isn't quite the 30000000 required by the update. Therefore, the update still requires a directory with total size of at least 8381165 to be deleted before it can run.

To achieve this, you have the following options:

- Delete directory `e`, which would increase unused space by 584.
- Delete directory `a`, which would increase unused space by 94853.
- Delete directory `d`, which would increase unused space by 24933642.
- Delete directory `/`, which would increase unused space by 48381165.

Directories `e` and `a` are both too small; deleting them would not free up enough space. However, directories `d` and `/` are both big enough! Between these, choose the smallest: `d`, increasing unused space by 24933642.

Find the smallest directory that, if deleted, would free up enough space on the filesystem to run the update. What is the total size of that directory?
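
The arithmetic and the choice described above can be sketched on the example numbers alone. The constant and variable names are illustrative, and `min` with a filter is just one way to pick the smallest sufficient directory; sorting the candidates works equally well.

```python
TOTAL_SPACE = 70_000_000
NEEDED_SPACE = 30_000_000

# Total sizes of the example directories, as listed in the options above
sizes = {"e": 584, "a": 94853, "d": 24933642, "/": 48381165}

unused = TOTAL_SPACE - sizes["/"]   # space currently free
to_free = NEEDED_SPACE - unused     # minimum size a deleted directory must have

# Smallest directory that is still large enough
name, size = min(
    ((n, s) for n, s in sizes.items() if s >= to_free),
    key=lambda pair: pair[1],
)
print(unused, to_free, name, size)
```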

In [15]:
target = 30000000 - (70000000 - 48381165)

In [16]:
memory_size = 70000000
needed_space = 30000000
sizes = get_directory_sizes('sample.txt')
used_space = dict(sizes)['/']
assert used_space == 48381165
target = needed_space - (memory_size - used_space)
assert target == 8381165

In [17]:
def solution_two(filename):
    memory_size = 70000000
    needed_space = 30000000
    sizes = get_directory_sizes(filename)
    used_space = dict(sizes)['/']
    target = needed_space - (memory_size - used_space)
    options = sorted(
        (size, name)
        for name, size in sizes
        if size >= target
    )
    return options[0]


assert solution_two('sample.txt') == (24933642, 'd')

In [18]:
sol, _ = solution_two('input.txt')
print(f"Solution part two: {sol}")

Solution part two: 8474158
