# Task dependency tutorial

The goal of this exercise is to show how to pass object IDs into remote
functions to encode dependencies between tasks. In this case, we construct
several sequences of tasks, where each task depends on the previous one in
its sequence. Within each sequence, tasks are executed serially, but multiple
sequences can be executed in parallel.

EXERCISE: This script is too slow. Use Ray to parallelize the computation
below.

In [2]:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np
import ray
import time

In [3]:
ray.init(num_cpus=4, redirect_output=True)

Waiting for redis server at 127.0.0.1:59459 to respond...
Starting local scheduler with 4 CPUs and 0 GPUs.
The web UI failed to start.


{'local_scheduler_socket_names': ['/tmp/scheduler62881597'],
 'node_ip_address': '127.0.0.1',
 'object_store_addresses': [ObjectStoreAddress(name='/tmp/plasma_store5568531', manager_name='/tmp/plasma_manager25854145', manager_port=31919)],
 'redis_address': '127.0.0.1:59459'}

In [4]:
# This function is a proxy for a more interesting and computationally
# intensive function.
def slow_function(i):
    time.sleep(np.random.uniform(0, 0.1))
    return i + 1

In [7]:
start_time = time.time()

# This loop is too slow. Some of the calls to slow_function should happen in
# parallel. However, they cannot all happen in parallel, because some of the
# calls use the outputs of other calls. The underlying computation graph
# encoding the dependencies between these tasks consists of four chains of
# length twenty.
results = []
for i in range(4):
    x = 100 * i
    for j in range(20):
        x = slow_function(x)
    results.append(x)

end_time = time.time()
duration = end_time - start_time

In [8]:
assert results == [20, 120, 220, 320]
assert duration < 1.3, ("The loop took {} seconds. This is too slow."
                      .format(duration))

print("Success! The example took {} seconds.".format(duration))

AssertionError: 
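
The 1.3 second threshold in the assertion can be motivated with a rough
estimate. The sketch below is a back-of-the-envelope calculation, not a
measurement: it assumes the only cost is the `time.sleep` call in
`slow_function` and ignores scheduling overhead. The variable names are
illustrative.

```python
import math

num_chains = 4
chain_length = 20

# time.sleep(np.random.uniform(0, 0.1)) sleeps for 0.05 seconds on average.
mean_sleep = (0.0 + 0.1) / 2
# Standard deviation of a uniform distribution on [0, 0.1].
std_sleep = (0.1 - 0.0) / math.sqrt(12)

# Serial loop: all 80 calls run one after another.
serial_estimate = num_chains * chain_length * mean_sleep

# Fully parallel chains: each chain still runs its 20 calls in order, so a
# single chain takes about 1 second on average.
chain_estimate = chain_length * mean_sleep
# Spread of a single chain's duration (sum of 20 independent sleeps).
chain_std = math.sqrt(chain_length) * std_sleep

print("serial estimate:    {:.2f} s".format(serial_estimate))
print("per-chain estimate: {:.2f} s".format(chain_estimate))
print("per-chain std:      {:.2f} s".format(chain_std))
```

Under these assumptions the serial loop needs about 4 seconds, which is why
the assertion fails above. With the four chains running in parallel, the total
is set by the slowest chain, so it should land somewhat above 1 second, which
the 1.3 second limit leaves room for.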