# Exercise 7 - Process Tasks in Order of Completion

**GOAL:** The goal of this exercise is to show how to use `ray.wait` to process tasks in the order that they finish.

See the documentation for ray.wait at https://ray.readthedocs.io/en/latest/package-ref.html?highlight=ray.wait#ray.wait.

## Concepts for this exercise - `ray.wait`

After launching a number of tasks, you may want to process the results in the order in which the tasks finish, rather than the order in which they were launched. To do so, we build on exercise 6 and use `ray.wait` to retrieve each result as soon as it is ready.

We are able to use `ray.wait` this way because **the two lists returned by `ray.wait` maintain the ordering of the input list**. That is, if `f` is a remote function, the code
```python
ready_list, remain_list = ray.wait([f.remote(i) for i in range(100)], num_returns=10)
```
blocks until 10 of the tasks have finished and returns the pair `(ready_list, remain_list)`. The `ObjectID`s within each list appear in the same relative order as in the input list, that is, ordered by the argument `i` passed to `f` above.

In [1]:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np
import ray
import time

In [2]:
ray.init(num_cpus=5, include_webui=False, ignore_reinit_error=True)

# Sleep a little to improve the accuracy of the timing measurements used below,
# because some workers may still be starting up in the background.
time.sleep(2.0)

2020-05-07 19:32:16,881	INFO resource_spec.py:205 -- Starting Ray with 9.62 GiB memory available for workers and up to 4.82 GiB for objects. You can adjust these settings with ray.remote(memory=<bytes>, object_store_memory=<bytes>).


In [3]:
@ray.remote
def f():
    time.sleep(np.random.uniform(0, 5))
    return time.time()

**EXERCISE:** Change the code below to use `ray.wait` to get the results of the tasks in the order that they complete.

**NOTE:** It would be a simple modification to maintain a pool of 10 experiments and to start a new experiment whenever one finishes.

In [8]:
start_time = time.time()

remaining_result_ids = [f.remote() for _ in range(10)]

# Get the results.
results = []
while len(remaining_result_ids) > 0:
    # EXERCISE: Instead of simply waiting for the first result from
    # remaining_result_ids, use ray.wait to get the first one to finish.
    # ray.wait returns a list of ready IDs (here of length 1) and a list
    # of the IDs that are still pending.
    ready_result_ids, remaining_result_ids = ray.wait(
        remaining_result_ids, num_returns=1)
    result_id = ready_result_ids[0]
    result = ray.get(result_id)
    results.append(result)
    print('Processing result which finished after {} seconds.'
          .format(result - start_time))

end_time = time.time()
duration = end_time - start_time

Processing result which finished after 0.29426145553588867 seconds.
Processing result which finished after 1.1585915088653564 seconds.
Processing result which finished after 1.2386021614074707 seconds.
Processing result which finished after 1.7017056941986084 seconds.
Processing result which finished after 2.547107219696045 seconds.
Processing result which finished after 2.768010377883911 seconds.
Processing result which finished after 2.8139185905456543 seconds.
Processing result which finished after 3.458484172821045 seconds.
Processing result which finished after 4.396401405334473 seconds.
Processing result which finished after 4.55148458480835 seconds.


**VERIFY:** Run some checks to verify that the changes you made to the code were correct. Some of the checks should fail when you initially run the cells. After completing the exercises, the checks should pass.

In [9]:
assert results == sorted(results), ('The results were not processed in the '
                                    'order that they finished.')

print('Success! The example took {} seconds.'.format(duration))

Success! The example took 4.55344820022583 seconds.
