<img src="https://raw.githubusercontent.com/dask/dask/main/docs/source/images/dask_horizontal_no_pad.svg"
     width="30%"
     alt="Dask logo" />
     
This notebook was inspired by the materials from:

- https://github.com/coiled/pydata-global-dask/

     
# Schedulers

So far we have seen the power of `dask.delayed`, become familiar with the idea of task graphs, and learned that these task graphs need to be executed to get the results of our computation. But what does it mean "to be executed"? Who takes care of this? As you might have guessed from the title of this notebook, this is the job of the Dask task scheduler.


<img src="https://raw.githubusercontent.com/coiled/pydata-global-dask/master/images/grid_search_schedule.gif"
     width="95%"
     alt="Grid search schedule" />


There are different task schedulers in Dask. They all compute the same result, but they may perform differently. There are two classes of schedulers: single-machine and distributed.


## Single Machine Schedulers

Single-machine schedulers require no setup, use only the Python standard library, and provide basic features on a local process or thread pool. Dask provides several single-machine schedulers:


- "threads": The threaded scheduler executes computations with a local `concurrent.futures.ThreadPoolExecutor`. The threaded scheduler is the default choice for Dask arrays, Dask DataFrames, and Dask delayed.

- "processes": The multiprocessing scheduler executes computations with a local `concurrent.futures.ProcessPoolExecutor`. The multiprocessing scheduler is the default choice for Dask Bag.

- "single-threaded": The single-threaded synchronous scheduler executes all computations in the local thread, with no parallelism at all. This is particularly valuable for debugging and profiling, which are more difficult when using threads or processes.
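As a minimal sketch of how these names are used, the scheduler can be chosen for a single call with the `scheduler=` keyword of `.compute()`, or for a block of code with `dask.config.set`. The `square` helper is a made-up toy task, and only the threaded and single-threaded schedulers are exercised here to keep the snippet small.

```python
import dask
from dask import delayed


@delayed
def square(x):
    """Toy task used only for illustration."""
    return x * x


# Build a small task graph: nothing runs until .compute() is called.
total = delayed(sum)([square(i) for i in range(4)])

# Option 1: choose the scheduler for a single call.
threaded = total.compute(scheduler="threads")

# Option 2: choose the scheduler for everything inside a block.
with dask.config.set(scheduler="single-threaded"):
    serial = total.compute()

# Both schedulers execute the same graph, so the results agree.
print(threaded, serial)
```

Whichever scheduler runs the graph, the result is the same; only how the tasks are executed changes.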

### Single machine schedulers in action

Using the same examples we used in the Delayed lesson, let's see how we can modify the scheduler and how this affects the performance of our computations. 

In [1]:
import dask
from dask import delayed
from time import sleep

In [2]:
@delayed
def inc(x):
    """Increments x by one"""
    sleep(1)
    return x + 1

In [3]:
data = list(range(8))

results = []
for i in data:
    y = inc(i)         
    results.append(y)
    
total = delayed(sum)(results)
total

Delayed('sum-de7db1d6-1e32-477b-aded-a34ba2c60cd9')

### The multi-threading scheduler (default)

In [4]:
%%time 
dask.config.set(scheduler='threads')
total.compute()

CPU times: user 2.09 ms, sys: 1.19 ms, total: 3.29 ms
Wall time: 1.01 s


36

In [5]:
%%time 
dask.config.set(scheduler='threads', num_workers=4)  #setting num_workers
total.compute()

CPU times: user 4.79 ms, sys: 2.42 ms, total: 7.21 ms
Wall time: 2.01 s


36

### The multi-process scheduler 

Notice that we can also set the scheduler with a context manager, so the setting only applies inside the `with` block.

In [6]:
%%time
with dask.config.set(scheduler='processes'): 
    total.compute()   

CPU times: user 10.6 ms, sys: 19 ms, total: 29.7 ms
Wall time: 6.19 s


### The single-threaded scheduler 

Tools like `pdb` do not work well with multiple threads or processes, but you can work around this by using the single-threaded scheduler when debugging.

In [7]:
%%time
total.compute(scheduler="single-threaded")  

CPU times: user 5.29 ms, sys: 1.45 ms, total: 6.74 ms
Wall time: 8.04 s


36
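The wall times above for the threaded and single-threaded schedulers are close to what a back-of-the-envelope model predicts: if every task takes the same time and scheduling overhead is ignored, the tasks run in waves of `num_workers` tasks each. The sketch below encodes that assumption. `estimated_wall_time` is a hypothetical helper, not part of Dask, and the 8-worker case assumes the default thread pool had 8 threads on the machine that produced these timings.

```python
import math


def estimated_wall_time(n_tasks, n_workers, task_seconds=1.0):
    """Idealized wall time: tasks run in waves of n_workers, no overhead."""
    waves = math.ceil(n_tasks / n_workers)
    return waves * task_seconds


# Eight one-second `inc` tasks, as in the example above.
for workers in (8, 4, 1):
    print(workers, "workers ->", estimated_wall_time(8, workers), "s")
```

The multi-process run is slower than this model suggests. That is consistent with the extra cost of starting processes and moving data between them, which the model deliberately leaves out.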

For more information about single-machine schedulers and which one to choose, see the Dask documentation on [single-machine schedulers](https://docs.dask.org/en/latest/setup/single-machine.html).

## Distributed Scheduler

Despite having "distributed" in its name, the Dask distributed scheduler also works well on a single machine. We recommend using it because it offers more features and diagnostics. You can think of the distributed scheduler as an "advanced scheduler".

The distributed scheduler can be used on a cluster as well as locally. Deploying a remote Dask cluster involves additional setup, which you can read more about in the Dask [setup documentation](https://docs.dask.org/en/latest/setup.html). Alternatively, you can use [Coiled](https://docs.coiled.io/user_guide/index.html#what-is-coiled), which provides cluster-as-a-service functionality to provision hosted Dask clusters on demand, and which you can try for free.

For now, we will set up the scheduler locally. To do this we need to create a `Client` object, which lets us interact with the "cluster" (local threads or processes on your machine).

In [8]:
from dask.distributed import Client

In [9]:
client = Client(n_workers=4)
client

Client
Connection method: Cluster object
Cluster type: distributed.LocalCluster
Dashboard: http://127.0.0.1:8787/status

LocalCluster
Dashboard: http://127.0.0.1:8787/status
Workers: 4
Total threads: 8
Total memory: 16.00 GiB
Status: running
Using processes: True

Scheduler
Comm: tcp://127.0.0.1:57649
Workers: 4
Dashboard: http://127.0.0.1:8787/status
Total threads: 8
Started: Just now
Total memory: 16.00 GiB

Worker
Comm: tcp://127.0.0.1:57665
Total threads: 2
Dashboard: http://127.0.0.1:57666/status
Memory: 4.00 GiB
Nanny: tcp://127.0.0.1:57652
Local directory: /Users/ncclementi/Documents/git/dask-mini-tutorial/notebooks/dask-worker-space/worker-dop_ge5z

Worker
Comm: tcp://127.0.0.1:57668
Total threads: 2
Dashboard: http://127.0.0.1:57669/status
Memory: 4.00 GiB
Nanny: tcp://127.0.0.1:57654
Local directory: /Users/ncclementi/Documents/git/dask-mini-tutorial/notebooks/dask-worker-space/worker-1wivjaz7

Worker
Comm: tcp://127.0.0.1:57660
Total threads: 2
Dashboard: http://127.0.0.1:57662/status
Memory: 4.00 GiB
Nanny: tcp://127.0.0.1:57651
Local directory: /Users/ncclementi/Documents/git/dask-mini-tutorial/notebooks/dask-worker-space/worker-l9wsw5bz

Worker
Comm: tcp://127.0.0.1:57659
Total threads: 2
Dashboard: http://127.0.0.1:57661/status
Memory: 4.00 GiB
Nanny: tcp://127.0.0.1:57653
Local directory: /Users/ncclementi/Documents/git/dask-mini-tutorial/notebooks/dask-worker-space/worker-6c3pjpi7


When we create a distributed scheduler `Client`, by default it registers itself as the default Dask scheduler. From now on, all `.compute()` calls will use the distributed scheduler unless otherwise specified.
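A minimal sketch of that default behaviour, and of overriding it for one call, could look like the following. It assumes `dask.distributed` is installed and starts a small throwaway client whose workers are threads in the current process (`processes=False`) with the dashboard disabled. It is best tried in a separate Python session, since starting a second client in this notebook may replace the default one. `add_one` and `demo_client` are made-up names.

```python
from dask import delayed
from dask.distributed import Client


@delayed
def add_one(x):
    """Toy task used only for illustration."""
    return x + 1


total = delayed(sum)([add_one(i) for i in range(4)])

# A throwaway client that runs its workers as threads in this process.
demo_client = Client(processes=False, n_workers=1, threads_per_worker=2,
                     dashboard_address=None)
try:
    # With a client registered, a plain .compute() goes to the distributed scheduler.
    via_distributed = total.compute()
    # The scheduler= keyword still overrides the default for a single call.
    via_threads = total.compute(scheduler="threads")
finally:
    demo_client.close()

print(via_distributed, via_threads)
```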

The distributed scheduler has many features that you can learn more about in the [Dask distributed documentation](https://distributed.dask.org/en/latest/), but a nice one to explore is the diagnostic dashboard. We will take a look at the dashboard as we perform computations; for a brief overview of its main components, see the Dask documentation on [diagnosing performance](https://distributed.dask.org/en/latest/diagnosing-performance.html).

If you click the dashboard link in the output of the cell above and then run the computation of `total` as we did before, you will see some activity on the dashboard.

In [10]:
total.compute()

36

### Futures interface

The distributed scheduler enables another Dask Collection -- the Futures Interface. The Dask distributed scheduler implements a superset of Python's [`concurrent.futures`](https://docs.python.org/3/library/concurrent.futures.html) interface that provides finer control and asynchronous computations.

In [11]:
import time

def inc(x):
    time.sleep(1)
    return x + 1


We can run this locally as serial code:

In [12]:
inc(1)

2

Or we can submit this to the cluster as

In [13]:
future = client.submit(inc, 1)
future

The `Client.submit` function sends a function and its arguments to the distributed scheduler for processing. It returns a `Future` object that refers to remote data on the cluster. `Client.submit` returns immediately while the computation runs remotely in the background, so the local Python session is not blocked.

If you wait a moment, and then check on the future again, you'll see that it has finished.

In [14]:
future

You can retrieve the result of a Future by calling the `.result()` method. If the status of the Future is "finished", meaning the task has been successfully run on one of the workers, then calling `.result()` will return almost immediately. Conversely, if the Future is still "pending" then calling `.result()` will block the current Python process and wait until the task has been run and then return the result.
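Because Dask's interface is modelled on `concurrent.futures`, the same submit / check / result pattern can be tried with the standard library alone. The sketch below uses a `ThreadPoolExecutor` instead of a Dask `Client`, so it illustrates the shared interface rather than Dask itself; `slow_inc` is a made-up name.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_inc(x):
    """Toy task that takes a noticeable amount of time."""
    time.sleep(0.2)
    return x + 1


with ThreadPoolExecutor(max_workers=1) as executor:
    start = time.perf_counter()
    future = executor.submit(slow_inc, 1)   # returns without waiting for the task
    submit_seconds = time.perf_counter() - start

    value = future.result()                 # blocks until the task has finished
    finished = future.done()                # True once a result is available

print(f"submit took {submit_seconds:.3f}s, result={value}, done={finished}")
```

The call to `submit` should return in a small fraction of the task's 0.2 seconds, while `result()` is where the waiting happens.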

In [15]:
future.result()

2

Similar to `Client.submit`, there's also a `Client.map` function for running a function across an iterable of inputs (similar to Python's built-in `map` function).

In [16]:
futures = client.map(inc, list(range(8)))
futures

[<Future: pending, key: inc-b1931637320391d8b1983fb525c4ce93>,
 <Future: finished, type: int, key: inc-63421aeb5ab5dd13590f4327d8784c2d>,
 <Future: pending, key: inc-314071431921bb692f2db6f89fee8f76>,
 <Future: pending, key: inc-d843f447d456d5c2efa4eb5d7671ac50>,
 <Future: pending, key: inc-ffb8bc2e4d057b88f458515b0f8dd4ba>,
 <Future: pending, key: inc-ecdbaebe1a7b9c8492fd5c5ce4d70093>,
 <Future: pending, key: inc-fb91c0313069af4aa7f03cf8d67582a7>,
 <Future: pending, key: inc-7252462e26c872c1f4ced6564c2ba259>]

`Client.map` returns a list of `Future` objects, one per input that was mapped over. To get the results we can use a list comprehension like `[f.result() for f in futures]` or use the `Client.gather` method.

In [17]:
results = client.gather(futures)
results

[1, 2, 3, 4, 5, 6, 7, 8]

Futures obey standard Python garbage collection. The data a Future points to will continue to live on a Dask worker until there are no more references to the Future, at which point it will be deleted from the cluster.

To drop your reference to a Future explicitly you can use `del`, for example:

In [18]:
del futures[1]

In [19]:
len(futures)

7

### Exercise: parallelize a for-loop

Parallelize the following piece of code using `Client.submit()`.

In [20]:
def inc(x):
    time.sleep(0.5)
    return x + 1

def double(x):
    time.sleep(0.5)
    return 2 * x

def add(x, y):
    time.sleep(0.5)
    return x + y

In [21]:
# parallelize me
output = []
for x in range(10):
    a = inc(x)
    b = double(x)
    c = add(a, b)
    output.append(c)

total = sum(output)
total

145

In [22]:
# solution
output = []
for x in range(10):
    a = client.submit(inc, x)
    b = client.submit(double, x)
    c = client.submit(add, a, b)
    output.append(c)

total = client.submit(sum, output)
total.result()

145
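For comparison, the same loop can also be written with `dask.delayed` from the previous lesson. This is a sketch of an alternative, not the exercise's intended solution. The `sleep` calls are left out so it runs quickly, and `scheduler="threads"` is passed explicitly so it does not depend on a running client.

```python
from dask import delayed


def inc(x):
    return x + 1


def double(x):
    return 2 * x


def add(x, y):
    return x + y


output = []
for x in range(10):
    a = delayed(inc)(x)
    b = delayed(double)(x)
    c = delayed(add)(a, b)      # delayed objects can be passed as arguments
    output.append(c)

total = delayed(sum)(output)    # still lazy: only the task graph exists
result = total.compute(scheduler="threads")
print(result)
```

The main difference is when work starts: `client.submit` begins executing as soon as it is called, whereas the delayed version does nothing until `.compute()`.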

As good practice, we close the client when we are done.

In [None]:
client.close()
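One way to make sure the client is always closed is to use it as a context manager. The sketch below assumes `dask.distributed` is installed and uses a small in-process client with the dashboard disabled; these settings are illustrative choices, not the configuration used earlier in this notebook.

```python
from dask.distributed import Client


def inc(x):
    return x + 1


# Leaving the `with` block closes the client, even if an error is raised inside it.
with Client(processes=False, n_workers=1, threads_per_worker=2,
            dashboard_address=None) as client:
    futures = client.map(inc, range(4))
    results = client.gather(futures)   # collect results before the client closes

print(results)
```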

## Extra resources

- [Dask documentation on scheduling](https://docs.dask.org/en/latest/scheduling.html)
- Example of dynamic computations using Futures: [PyData Global Dask tutorial - schedulers](https://github.com/coiled/pydata-global-dask/blob/master/3-schedulers.ipynb)
- Advanced Delayed with the distributed scheduler: [Dask tutorial - Advanced delayed](https://github.com/dask/dask-tutorial/blob/main/06_distributed_advanced.ipynb)
- [Futures Documentation](https://docs.dask.org/en/latest/futures.html)