# Dask jobqueue adaptivity

More: https://docs.dask.org/en/latest/setup/adaptive.html

## Monte-Carlo estimate with multiple Dask batch job workers

We define a Dask jobqueue cluster with Dask workers that each have 4 CPUs and 24 GB of memory.

In [1]:
from time import sleep

import dask, dask.distributed
import dask_jobqueue

In [2]:
cluster = dask_jobqueue.SLURMCluster(

    # Dask worker size
    cores=4, memory='24GB',
    processes=1, # Dask workers per job
    
    # SLURM job script things
    queue='cluster', walltime='00:15:00',
    
    # Dask worker network and temporary storage
    interface='ib0', local_directory='$TMPDIR'
)

client = dask.distributed.Client(cluster)
cluster.scale(jobs=1)

In [3]:
client

| Client | Cluster |
|---|---|
| Scheduler: tcp://172.18.4.100:38187 | Workers: 0 |
| Dashboard: http://172.18.4.100:8787/status | Cores: 0 |
| | Memory: 0 B |


### Let's scale up the cluster

In [4]:
cluster.scale(jobs=8)

In [5]:
client

| Client | Cluster |
|---|---|
| Scheduler: tcp://172.18.4.100:38187 | Workers: 8 |
| Dashboard: http://172.18.4.100:8787/status | Cores: 32 |
| | Memory: 192.00 GB |
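
The totals reported above follow directly from the per-job settings: with `processes=1`, each batch job contributes one worker with 4 cores and 24 GB. A minimal sketch of that arithmetic, where `cluster_totals` is a hypothetical helper for illustration and not part of dask_jobqueue:

```python
def cluster_totals(jobs, cores_per_job=4, memory_gb_per_job=24, processes_per_job=1):
    """Return (workers, cores, memory in GB) for a number of running jobs."""
    workers = jobs * processes_per_job    # Dask workers per job
    cores = jobs * cores_per_job          # cores are shared by a job's workers
    memory_gb = jobs * memory_gb_per_job  # memory is shared by a job's workers
    return workers, cores, memory_gb

print(cluster_totals(jobs=8))  # (8, 32, 192)
```

The numbers only match once all requested jobs have left the batch queue and their workers have connected to the scheduler.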


### From here everything is the same as with LocalCluster

In [6]:
import numpy, dask.array

def calculate_pi(size_in_bytes, number_of_chunks):
    
    """Calculate pi using a Monte Carlo method."""
    
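    # a float64 takes 8 bytes and every position has 2 coordinates (x, y)
    array_shape = (int(size_in_bytes / 8 / 2), 2)
    # split the rows evenly across the requested number of chunks
    chunk_size = (int(array_shape[0] / number_of_chunks), 2)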
    
    # 2D random positions array using dask.array
    xy = dask.array.random.uniform(
        low=0.0, high=1.0, size=array_shape,
        # specify chunk size, i.e. task number
        chunks=chunk_size )
  
    xy_inside_circle = (xy ** 2).sum(axis=1) < 1 # boolean

    pi = 4 * xy_inside_circle.sum() / xy_inside_circle.size
    
    # start Dask calculation
    pi = pi.compute()

    print(f"\nfrom {xy.nbytes / 1e9} GB randomly chosen positions")
    print(f"   pi estimate: {pi}")
    print(f"   pi error: {abs(pi - numpy.pi)}\n")
    display(xy)
    
    return pi
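
To see the method without any cluster, the same estimate can be sketched in plain Python with the standard library only. The function name `estimate_pi` and the fixed seed are assumptions made for this illustration; the Dask version above applies the same idea to chunked arrays.

```python
import math
import random

def estimate_pi(number_of_points, seed=0):
    """Estimate pi from random points in the unit square."""
    rng = random.Random(seed)  # seeded so that repeated runs agree
    inside = 0
    for _ in range(number_of_points):
        x, y = rng.random(), rng.random()
        # the point lies inside the quarter circle if x^2 + y^2 < 1
        if x * x + y * y < 1.0:
            inside += 1
    # the quarter circle covers pi / 4 of the unit square
    return 4 * inside / number_of_points

pi_estimate = estimate_pi(100_000)
print(f"pi estimate: {pi_estimate}")
print(f"pi error: {abs(pi_estimate - math.pi)}")
```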

### Let's calculate again...

In [7]:
%time pi = calculate_pi(size_in_bytes=10_000_000_000, number_of_chunks=100) # 10 GB


from 10.0 GB randomly chosen positions
   pi estimate: 3.141683744
   pi error: 9.109041020671782e-05



| | Array | Chunk |
|---|---|---|
| Bytes | 10.00 GB | 100.00 MB |
| Shape | (625000000, 2) | (6250000, 2) |
| Count | 100 Tasks | 100 Chunks |
| Type | float64 | numpy.ndarray |


CPU times: user 321 ms, sys: 18.6 ms, total: 339 ms
Wall time: 1.64 s
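
The array and chunk shapes in this output follow from the two arguments of `calculate_pi`. The sketch below mirrors that arithmetic; `chunk_layout` is a hypothetical helper introduced here, not part of the notebook or of Dask.

```python
def chunk_layout(size_in_bytes, number_of_chunks, bytes_per_value=8, columns=2):
    """Mirror the shape arithmetic of calculate_pi for float64 (x, y) pairs."""
    rows = size_in_bytes // bytes_per_value // columns
    rows_per_chunk = rows // number_of_chunks
    chunk_bytes = rows_per_chunk * columns * bytes_per_value
    return (rows, columns), (rows_per_chunk, columns), chunk_bytes

array_shape, chunk_shape, chunk_bytes = chunk_layout(10_000_000_000, 100)
print(array_shape, chunk_shape, chunk_bytes / 1e6, "MB")
# (625000000, 2) (6250000, 2) 100.0 MB
```

Fewer chunks mean larger tasks: at 100 GB and 250 chunks each chunk holds 400 MB, which still fits comfortably into the 24 GB of a worker.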


In [8]:
%time pi = calculate_pi(size_in_bytes=100_000_000_000, number_of_chunks=250) # 100 GB


from 100.0 GB randomly chosen positions
   pi estimate: 3.1415723744
   pi error: 2.0279189793193098e-05



| | Array | Chunk |
|---|---|---|
| Bytes | 100.00 GB | 400.00 MB |
| Shape | (6250000000, 2) | (25000000, 2) |
| Count | 250 Tasks | 250 Chunks |
| Type | float64 | numpy.ndarray |


CPU times: user 1.49 s, sys: 97.1 ms, total: 1.58 s
Wall time: 9.71 s
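
The smaller error of the larger run is what a Monte Carlo estimate should show: each point falls inside the quarter circle with probability pi / 4, so the standard error of the estimate shrinks like one over the square root of the number of points. A minimal sketch of that expectation, with `expected_pi_error` being a hypothetical helper name:

```python
import math

def expected_pi_error(number_of_points):
    """One-standard-deviation error of the Monte Carlo pi estimate."""
    p = math.pi / 4  # probability that a point falls inside the quarter circle
    # standard error of the mean of a Bernoulli variable, scaled by 4
    return 4 * math.sqrt(p * (1 - p) / number_of_points)

for size_in_bytes in (10_000_000_000, 100_000_000_000):
    points = size_in_bytes // 16  # two float64 values (16 bytes) per point
    print(f"{size_in_bytes / 1e9:.0f} GB: expected error ~ {expected_pi_error(points):.1e}")
```

This gives roughly 6.6e-05 for 10 GB and 2.1e-05 for 100 GB, the same order of magnitude as the measured errors. A single run can land above or below these values, since they describe a typical deviation and not a bound.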


### And we can scale up the cluster whenever needed

In [9]:
cluster.scale(jobs=16)

In [10]:
client

| Client | Cluster |
|---|---|
| Scheduler: tcp://172.18.4.100:38187 | Workers: 11 |
| Dashboard: http://172.18.4.100:8787/status | Cores: 44 |
| | Memory: 264.00 GB |


### Let's calculate again...

In [11]:
%time pi = calculate_pi(size_in_bytes=100_000_000_000, number_of_chunks=250) # 100 GB


from 100.0 GB randomly chosen positions
   pi estimate: 3.14160667648
   pi error: 1.4022890206799588e-05



| | Array | Chunk |
|---|---|---|
| Bytes | 100.00 GB | 400.00 MB |
| Shape | (6250000000, 2) | (25000000, 2) |
| Count | 250 Tasks | 250 Chunks |
| Type | float64 | numpy.ndarray |


CPU times: user 1.61 s, sys: 92 ms, total: 1.7 s
Wall time: 6.25 s


### Let's scale adaptively

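Dask jobqueue can scale the number of workers with the amount of pending work. You can also specify a target duration, i.e. how long a computation should ideally take; this determines how aggressively the cluster scales up.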

More on Dask's adaptivity [can be found in the docs](https://docs.dask.org/en/latest/setup/adaptive.html).

In [12]:
ca = cluster.adapt(
    minimum=2, maximum=32,
    #minimum_jobs=2, maximum_jobs=32,
    target_duration="80s",  # measured in CPU time per worker
                             # -> 20 seconds at 4 cores / worker
    wait_count=7  # scale down less aggressively
                  # this prevents shutting down workers before they can send out
                  # their results
);

sleep(10)  # Allow for scale-down
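
One way to read these settings: the scheduler estimates how much CPU time the pending tasks need and requests enough workers to finish within `target_duration`, never going below `minimum` or above `maximum`. The sketch below is a simplified model of that idea under the assumptions stated in the comments; it is not Dask's actual implementation, and `desired_workers` and `pending_cpu_seconds` are hypothetical names.

```python
import math

def desired_workers(pending_cpu_seconds, target_duration=80.0, minimum=2, maximum=32):
    """Simplified model: workers needed to clear the pending work in time."""
    # assumption: target_duration counts CPU seconds per worker
    wanted = math.ceil(pending_cpu_seconds / target_duration)
    # clamp to the bounds passed to cluster.adapt
    return max(minimum, min(maximum, wanted))

print(desired_workers(0))       # 2: idle cluster stays at the minimum
print(desired_workers(800))     # 10
print(desired_workers(10_000))  # 32: capped at the maximum

# 80 s of CPU time per worker spread over 4 cores is 20 s of wall time
print(80 / 4)  # 20.0
```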

In [13]:
%time pi = calculate_pi(size_in_bytes=100_000_000_000, number_of_chunks=250) # 100 GB


from 100.0 GB randomly chosen positions
   pi estimate: 3.14160180736
   pi error: 9.153770206715706e-06



| | Array | Chunk |
|---|---|---|
| Bytes | 100.00 GB | 400.00 MB |
| Shape | (6250000000, 2) | (25000000, 2) |
| Count | 250 Tasks | 250 Chunks |
| Type | float64 | numpy.ndarray |


CPU times: user 4.66 s, sys: 199 ms, total: 4.86 s
Wall time: 19.3 s


In [14]:
%time pi = calculate_pi(size_in_bytes=1_000_000_000_000, number_of_chunks=2_000) # 1 TB


from 1000.0 GB randomly chosen positions
   pi estimate: 3.141595756672
   pi error: 3.103082206745711e-06



| | Array | Chunk |
|---|---|---|
| Bytes | 1000.00 GB | 500.00 MB |
| Shape | (62500000000, 2) | (31250000, 2) |
| Count | 2000 Tasks | 2000 Chunks |
| Type | float64 | numpy.ndarray |


CPU times: user 18.8 s, sys: 814 ms, total: 19.6 s
Wall time: 30.6 s
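
To compare the runs so far, it helps to convert each wall time into a throughput. The sketch below uses only the sizes and wall times printed above; the `throughput_gb_per_s` helper and the run labels are introduced here for illustration.

```python
def throughput_gb_per_s(size_gb, wall_time_s):
    """Generated data volume per second of wall time."""
    return size_gb / wall_time_s

runs = [
    # (label, size in GB, wall time in seconds), copied from the outputs above
    ("10 GB, after scale(jobs=8)", 10, 1.64),
    ("100 GB, after scale(jobs=8)", 100, 9.71),
    ("100 GB, after scale(jobs=16)", 100, 6.25),
    ("100 GB, adaptive", 100, 19.3),
    ("1 TB, adaptive", 1000, 30.6),
]

for label, size_gb, wall_time_s in runs:
    print(f"{label}: {throughput_gb_per_s(size_gb, wall_time_s):.1f} GB/s")
```

The first adaptive run is the slowest per GB, most likely because the cluster had scaled down and new jobs had to start before the work could spread out. The 1 TB run, which gives the cluster time to grow, reaches the highest throughput.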


In [15]:
sleep(10)

In [16]:
%time pi = calculate_pi(size_in_bytes=3_000_000_000_000, number_of_chunks=5_000) # 3 TB

In [1]:
client.close()
cluster.close()
