## Dask on Mio test!

**Warning: the following command starts the number of workers given by `N`:**

    cluster.scale(N)
    
**It starts the nodes immediately and keeps them running even when they have nothing to do.**

It is recommended to use

    cluster.adapt()
    
instead, which automatically scales the number of jobs up and down with the load.

In [1]:
from dask_jobqueue import SLURMCluster

# The values passed to this constructor can be set in ~/.config/dask/jobqueue.yaml instead
# A copy of my config is included in this repo
# More info: https://jobqueue.dask.org/en/latest/configuration.html
cluster = SLURMCluster(cores=24, # cores per job
                       memory="100GB", # memory per job, not sure what the mio nodes have
                       #processes = sqrt(cores) # cut the job into this many processes. Default is good
                       #queue='geop,compute', # prefer geop nodes, but accept compute
                       walltime='02:00:00', # time we are reserving the nodes for
                       log_directory="./logs", # directory for logs
                       #local_directory="~/scratch/dask_test" # directory for file spilling in case things get big
                      )

# cluster.scale(n=2,jobs=2)  # Start 2 workers in 2 jobs that match the description above
cluster.adapt(maximum_jobs=20) # automatically launches and kills nodes based on load
 
from dask.distributed import Client
client = Client(cluster)    # Connect to that cluster
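
As the comments in the cell above note, the same settings can live in `~/.config/dask/jobqueue.yaml` rather than in the constructor call. The fragment below is only a sketch that mirrors the arguments used above; it is not the config file shipped with this repo, and the values (especially `memory`) should be adjusted to what the Mio nodes actually provide.

```yaml
# ~/.config/dask/jobqueue.yaml
# Sketch only: mirrors the SLURMCluster(...) arguments from the cell above.
jobqueue:
  slurm:
    cores: 24               # cores per job
    memory: 100GB           # memory per job
    walltime: '02:00:00'    # time reserved for each job
    log-directory: ./logs   # directory for job logs
```

With defaults like these in place, `SLURMCluster()` should pick them up when called without arguments, and any keyword passed explicitly still overrides the file.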

## Do something Dask

In [2]:
import dask.array as da
x = da.random.random((50000, 50000), chunks=(1000, 1000))
x

| | Array | Chunk |
|---|---|---|
| Bytes | 20.00 GB | 8.00 MB |
| Shape | (50000, 50000) | (1000, 1000) |
| Count | 2500 Tasks | 2500 Chunks |
| Type | float64 | numpy.ndarray |
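
The byte and chunk counts in this summary follow directly from the shape, the chunk shape, and the 8-byte `float64` item size. A minimal sketch of that arithmetic in plain Python follows; `array_footprint` is a hypothetical helper written for illustration, not a Dask function.

```python
import math

def array_footprint(shape, chunks, itemsize=8):
    """Return total bytes, bytes per chunk, and chunk count for a chunked array.

    itemsize=8 assumes float64, as in the array above.
    """
    total_bytes = math.prod(shape) * itemsize
    chunk_bytes = math.prod(chunks) * itemsize
    # Number of chunks along each axis, rounding up for ragged edges
    n_chunks = math.prod(math.ceil(s / c) for s, c in zip(shape, chunks))
    return {"total_bytes": total_bytes, "chunk_bytes": chunk_bytes, "n_chunks": n_chunks}

info = array_footprint((50000, 50000), (1000, 1000))
print(info["total_bytes"] / 1e9, "GB")   # 20.0 GB
print(info["chunk_bytes"] / 1e6, "MB")   # 8.0 MB
print(info["n_chunks"], "chunks")        # 2500 chunks
```

The full array is about 20 GB, which is why it is split into 8 MB chunks that are spread across the workers.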


In [3]:
y = x + x.T
z = y.mean(axis=1)
z

| | Array | Chunk |
|---|---|---|
| Bytes | 400.00 kB | 8.00 kB |
| Shape | (50000,) | (1000,) |
| Count | 10900 Tasks | 50 Chunks |
| Type | float64 | numpy.ndarray |


In [4]:
z.compute()

array([0.99734266, 1.0000075 , 1.00243606, ..., 1.00151471, 1.00230303,
       0.99873359])
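
Values close to 1.0 are what one would expect here: each entry of `x` is uniform on [0, 1) with mean 0.5, so each entry of `x + x.T` has mean 1.0, and averaging along a row tightens the spread around that value. The sketch below reproduces the same calculation at a much smaller size using only the standard library; the helper name, the size `n=200`, and the seed are assumptions chosen for illustration.

```python
import random

def row_means_of_symmetrized(n, seed=0):
    """Row means of x + x.T for an n-by-n matrix of uniform [0, 1) samples."""
    rng = random.Random(seed)
    x = [[rng.random() for _ in range(n)] for _ in range(n)]
    # Entry (i, j) of x + x.T is x[i][j] + x[j][i]; average over j for each row i
    return [sum(x[i][j] + x[j][i] for j in range(n)) / n for i in range(n)]

means = row_means_of_symmetrized(200)
print(min(means), max(means))  # both should be close to 1.0
```

With 50000 columns instead of 200, the row means cluster even more tightly around 1.0, which matches the output above.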

## Cleanup

In [5]:
# This also kills Dask monitoring, so if you're using the status page, run this only when you're all done
client.close() # Release the client
cluster.close() # Release the nodes