- cd means change directory. This changes which directory is the current directory, but the specific result depends on the argument:
  - cd x moves in one level: it looks in the current directory for the directory named x and makes it the current directory.
  - cd .. moves out one level: it finds the directory that contains the current directory, then makes that directory the current directory.
  - cd / switches the current directory to the outermost directory, /.
- ls means list. It prints out all of the files and directories immediately contained by the current directory:
  - 123 abc means that the current directory contains a file named abc with size 123.
  - dir xyz means that the current directory contains a directory named xyz.
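
The four kinds of line described above can be told apart with a few string checks. The following is a minimal sketch, not part of the notebook's solution; the function name classify_line and the tuple layout it returns are assumptions made for illustration.

```python
def classify_line(line):
    """Return a tuple describing one line of terminal output (hypothetical helper)."""
    parts = line.split()
    if parts[:2] == ['$', 'cd']:
        return ('cd', parts[2])           # "$ cd x", "$ cd .." or "$ cd /"
    if parts == ['$', 'ls']:
        return ('ls',)                    # the listing itself follows on later lines
    if parts[0] == 'dir':
        return ('dir', parts[1])          # "dir xyz": a subdirectory named xyz
    # Anything else is assumed to be "<size> <name>", e.g. "123 abc".
    return ('file', parts[1], int(parts[0]))


for line in ['$ cd /', '$ ls', 'dir xyz', '123 abc', '$ cd ..']:
    print(classify_line(line))
```

Only the cd and file lines matter for computing sizes; ls and dir lines carry no size information of their own.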
Given the commands and output in the example above, you can determine that the filesystem looks visually like this:

- / (dir)
  - a (dir)
    - e (dir)
      - i (file, size=584)
    - f (file, size=29116)
    - g (file, size=2557)
    - h.lst (file, size=62596)
  - b.txt (file, size=14848514)
  - c.dat (file, size=8504156)
  - d (dir)
    - j (file, size=4060174)
    - d.log (file, size=8033020)
    - d.ext (file, size=5626152)
    - k (file, size=7214296)
    
The total sizes of the directories above can be found as follows:

- The total size of directory e is 584 because it contains a single file i of size 584 and no other directories.
- The directory a has total size 94853 because it contains files f (size 29116), g (size 2557), and h.lst (size 62596), plus file i indirectly (a contains e which contains i).
- Directory d has total size 24933642.
- As the outermost directory, / contains every file. Its total size is 48381165, the sum of the size of every file.
To begin, find all of the directories with a total size of at most 100000, then calculate the sum of their total sizes. In the example above, these directories are a and e; the sum of their total sizes is 95437 (94853 + 584). (As in this example, this process can count files more than once!)
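
The totals quoted above can be checked with a short recursive sum over the example tree. This is a sketch under stated assumptions: the nested-dict encoding (a dict is a directory, an int is a file size) and the names tree, total_sizes, and sizes are illustrative choices, not the representation used by the notebook cells below.

```python
# The example tree: a dict is a directory, an int is a file size.
tree = {
    'a': {
        'e': {'i': 584},
        'f': 29116,
        'g': 2557,
        'h.lst': 62596,
    },
    'b.txt': 14848514,
    'c.dat': 8504156,
    'd': {
        'j': 4060174,
        'd.log': 8033020,
        'd.ext': 5626152,
        'k': 7214296,
    },
}


def total_sizes(node, name, sizes):
    """Store the total size of directory `name` in `sizes` and return it."""
    total = 0
    for child, value in node.items():
        if isinstance(value, dict):
            # Subdirectory: its whole total counts towards the parent as well.
            total += total_sizes(value, f'{name}{child}/', sizes)
        else:
            total += value
    sizes[name] = total
    return total


sizes = {}
total_sizes(tree, '/', sizes)
print(sizes)

small_sum = sum(size for size in sizes.values() if size <= 100000)
print(small_sum)  # 95437 = 94853 (a) + 584 (e)
```

File i is counted in both e and a, which is the double counting the puzzle text mentions.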

Find all of the directories with a total size of at most 100000. What is the sum of the total sizes of those directories?

In [76]:
# Read the puzzle input as a list of lines without trailing newlines.
with open('data/7.txt', 'r') as f:
    s = f.read().splitlines()

In [77]:
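from collections import defaultdict

# Total size of each directory, keyed by its path joined with '/'.
# The root itself is named '/', so nested keys start with '//'.
files = defaultdict(lambda: 0)
path = []  # stack of directory names from the root to the current directory
for row in s:
    if row.startswith('$ cd'):
        loc = row.split('$ cd ')[1]
        if loc != '..':
            path.append(loc)  # move in one level
        else:
            path.pop()  # move out one level
    elif row.startswith('dir'):
        continue  # a subdirectory gets its entry once files beneath it are seen
    elif row[:1].isdigit():
        # File entry "<size> <name>": add the size to every enclosing directory.
        size = int(row.split()[0])
        for i in range(1, len(path) + 1):
            files['/'.join(path[:i])] += size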
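
The loop above treats / as an ordinary directory name, which is why keys such as //gbjh appear in the output further down, and why a cd / in the middle of a session would be pushed onto the path instead of resetting it. The following is a hedged variant, not the notebook's cell: it resets the path on cd / and runs on a transcript reconstructed from the example tree. The names directory_sizes and example, and the transcript itself, are assumptions for illustration.

```python
from collections import defaultdict

# Transcript reconstructed from the example tree (an assumption for illustration).
example = """\
$ cd /
$ ls
dir a
14848514 b.txt
8504156 c.dat
dir d
$ cd a
$ ls
dir e
29116 f
2557 g
62596 h.lst
$ cd e
$ ls
584 i
$ cd ..
$ cd ..
$ cd d
$ ls
4060174 j
8033020 d.log
5626152 d.ext
7214296 k
""".splitlines()


def directory_sizes(lines):
    """Map each directory path to its total size (hypothetical helper)."""
    sizes = defaultdict(int)
    path = ['']  # '' stands for the root, so joined paths look like '/a/e'
    for row in lines:
        parts = row.split()
        if parts[:2] == ['$', 'cd']:
            if parts[2] == '/':
                path = ['']          # jump straight back to the root
            elif parts[2] == '..':
                if len(path) > 1:
                    path.pop()       # never pop the root itself
            else:
                path.append(parts[2])
        elif parts and parts[0].isdigit():
            size = int(parts[0])
            # Credit the file to the current directory and all of its ancestors.
            for depth in range(1, len(path) + 1):
                sizes['/'.join(path[:depth]) or '/'] += size
    return dict(sizes)


example_sizes = directory_sizes(example)
print(example_sizes)
print(sum(v for v in example_sizes.values() if v <= 100000))  # 95437
```

On well-formed input that starts with cd / and never returns to it, both versions produce the same totals; only the key spelling differs.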

In [78]:
sum(i for i in files.values() if i <= 100000)

2061777

In [79]:
files

defaultdict(<function __main__.<lambda>()>,
            {'/': 44125990,
             '//gbjh': 9976905,
             '//gbjh/jtgbg': 1361617,
             '//gbjh/jtgbg/hzjcc': 262498,
             '//gbjh/jtgbg/jgpnm': 572718,
             '//gbjh/jtgbg/jgpnm/php': 289068,
             '//gbjh/jtgbg/jgpnm/rlp': 283650,
             '//gbjh/jtgbg/jgpnm/rlp/dhlspmh': 249350,
             '//gbjh/jtgbg/jgpnm/rlp/mlsqrz': 31876,
             '//gbjh/jtgbg/jgpnm/rlp/slwhsqw': 2424,
             '//gbjh/jtgbg/smb': 29124,
             '//gbjh/jtgbg/vvhmmn': 40455,
             '//gbjh/pzdn': 8292589,
             '//gbjh/pzdn/bpdbclp': 65147,
             '//gbjh/pzdn/gvvgncqh': 2584721,
             '//gbjh/pzdn/gvvgncqh/fdcdh': 285507,
             '//gbjh/pzdn/gvvgncqh/jnfhsqrl': 1563456,
             '//gbjh/pzdn/gvvgncqh/jnfhsqrl/ddzqtsvf': 741961,
             '//gbjh/pzdn/gvvgncqh/jnfhsqrl/ddzqtsvf/fdcdh': 741961,
             '//gbjh/pzdn/gvvgncqh/jnfhsqrl/ddzqtsvf/fdcdh/fhmpzq': 28

Now, you're ready to choose a directory to delete.

The total disk space available to the filesystem is 70000000. To run the update, you need unused space of at least 30000000. You need to find a directory you can delete that will free up enough space to run the update.

In the example above, the total size of the outermost directory (and thus the total amount of used space) is 48381165; this means that the size of the unused space must currently be 21618835, which isn't quite the 30000000 required by the update. Therefore, the update still requires a directory with total size of at least 8381165 to be deleted before it can run.

To achieve this, you have the following options:

- Delete directory e, which would increase unused space by 584.
- Delete directory a, which would increase unused space by 94853.
- Delete directory d, which would increase unused space by 24933642.
- Delete directory /, which would increase unused space by 48381165.
Directories e and a are both too small; deleting them would not free up enough space. However, directories d and / are both big enough! Between these, choose the smallest: d, increasing unused space by 24933642.
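
The arithmetic for the example can be written out directly. This is a small sketch using the four directory totals quoted above; the constant and variable names are assumptions for illustration.

```python
TOTAL_DISK = 70000000
REQUIRED_FREE = 30000000

# Directory totals from the example.
example_totals = {'/': 48381165, 'a': 94853, 'e': 584, 'd': 24933642}

unused = TOTAL_DISK - example_totals['/']   # 21618835
shortfall = REQUIRED_FREE - unused          # 8381165

# A directory qualifies if deleting it frees at least the shortfall.
candidates = {name: size for name, size in example_totals.items()
              if size >= shortfall}
best = min(candidates, key=candidates.get)

print(unused, shortfall)
print(best, candidates[best])  # d 24933642
```

The comparison is "at least", so a directory whose size equals the shortfall exactly would also qualify.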

Find the smallest directory that, if deleted, would free up enough space on the filesystem to run the update. What is the total size of that directory?

In [80]:
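required = 30000000            # unused space the update needs
free = 70000000 - files['/']   # space that is currently unused
target = required - free       # minimum amount that must be freed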

In [81]:
target

4125990

In [82]:
min(i for i in files.values() if i >= target)

4473403