In [31]:
# Read the task descriptions, one per line, closing the file when done
with open('../../data/04-preparation.txt') as f:
    tasks_text = [l.strip() for l in f]

In [32]:
def parse(n_text):
    ns = n_text.split()
    return {'name': ns[0], 'time': int(ns[1]), 'prerequisites': ns[2:]}
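
As a quick check of `parse`, here is a minimal sketch that assumes each input line has the form `name time prerequisite ...`, separated by whitespace. The sample lines are made up for illustration and are not taken from the data file.

```python
def parse(n_text):
    ns = n_text.split()
    return {'name': ns[0], 'time': int(ns[1]), 'prerequisites': ns[2:]}

# Hypothetical input line: task name, duration, then zero or more prerequisite names.
example = parse('paint 30 sand prime')
print(example)
# {'name': 'paint', 'time': 30, 'prerequisites': ['sand', 'prime']}

# A task with no prerequisites gets an empty list.
no_prereqs = parse('sand 15')
print(no_prereqs['prerequisites'])
# []
```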

In [33]:
tasks = [parse(l) for l in tasks_text]

# Part 1
If I do all the tasks myself, one after another, the total time is the sum of the times of the individual tasks.

In [34]:
sum(n['time'] for n in tasks)

2215

# Part 2
I use two data structures to keep track of the `handled` and `unhandled` tasks. 

`handled` tasks have all their prerequisites completed and have a known end time: they're stored in a `dict` of task name -> end time. 

`unhandled` tasks are those without a known end time: they're stored in a copy of the `tasks` list.

In [45]:
handled = {}
unhandled = tasks[:]

The `candidate` tasks are the unhandled tasks whose end time we can now work out: those whose prerequisites are all in `handled`.

In [36]:
def candidates(unhandled, handled):
    return [task for task in unhandled 
            if all(p in handled for p in task['prerequisites'])]
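
To see what `candidates` selects, here is a small sketch on a made-up task list (the task names and times are hypothetical, chosen only for illustration).

```python
def candidates(unhandled, handled):
    return [task for task in unhandled
            if all(p in handled for p in task['prerequisites'])]

# Made-up tasks for illustration only.
demo_tasks = [
    {'name': 'sand', 'time': 15, 'prerequisites': []},
    {'name': 'prime', 'time': 20, 'prerequisites': ['sand']},
    {'name': 'paint', 'time': 30, 'prerequisites': ['sand', 'prime']},
]

# Nothing handled yet: only the task without prerequisites qualifies.
first = [t['name'] for t in candidates(demo_tasks, {})]
print(first)   # ['sand']

# Once 'sand' has an end time, 'prime' becomes a candidate;
# 'paint' still waits on 'prime'.
second = [t['name'] for t in candidates(demo_tasks[1:], {'sand': 15})]
print(second)  # ['prime']
```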

For each `candidate`, the earliest we can start the task is the latest end time of any of its prerequisites (or zero, for those tasks with no prerequisites). We then record its end time in the `handled` dict, and remove the task from `unhandled`. 

In [46]:
while unhandled:
    for candidate in candidates(unhandled, handled):
        start_time = max([0] + [handled[p] for p in candidate['prerequisites']])
        handled[candidate['name']] = start_time + candidate['time']
        unhandled.remove(candidate)

We can now look up the largest end time in `handled`.

In [47]:
max(handled.values())

413
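
The whole Part 2 procedure can be sketched on a small, made-up task list. The function name `finish_times` and the demo tasks are my own for illustration. The sketch also adds one guard that the loop above does not have: if no task is ready but some remain, the prerequisites are cyclic or refer to an unknown task, so it raises an error instead of looping forever.

```python
def finish_times(tasks):
    """Return a dict of task name -> earliest possible end time."""
    handled = {}
    unhandled = tasks[:]
    while unhandled:
        ready = [t for t in unhandled
                 if all(p in handled for p in t['prerequisites'])]
        if not ready:
            # Assumed guard, not in the original loop.
            raise ValueError('cyclic or missing prerequisites')
        for task in ready:
            # Start when the last prerequisite ends (or at 0 if there are none).
            start = max([0] + [handled[p] for p in task['prerequisites']])
            handled[task['name']] = start + task['time']
            unhandled.remove(task)
    return handled

# Made-up tasks: 'prime' and 'varnish' can run in parallel once 'sand' is done.
demo_tasks = [
    {'name': 'sand', 'time': 15, 'prerequisites': []},
    {'name': 'prime', 'time': 20, 'prerequisites': ['sand']},
    {'name': 'varnish', 'time': 10, 'prerequisites': ['sand']},
    {'name': 'paint', 'time': 30, 'prerequisites': ['prime']},
]

ends = finish_times(demo_tasks)
print(ends)
# {'sand': 15, 'prime': 35, 'varnish': 25, 'paint': 65}

print(max(ends.values()))                    # 65, with tasks done in parallel
print(sum(t['time'] for t in demo_tasks))    # 75, with one person doing everything
```

The parallel finish time (65) is the length of the longest chain of dependent tasks, `sand` then `prime` then `paint`, which is why it is shorter than the Part 1 style sum (75).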