# Dask and DaskGateway Tutorial

 - [dask gateway](https://gateway.dask.org/)
 - [dask](https://tutorial.dask.org/)
 
Dask is a framework for easily parallelizing Python code.

In [None]:
# core object for interacting with dask_gateway
from dask_gateway import Gateway
gateway = Gateway()

## List active clusters

QHub has a rich authorization model, so it is possible that your user does not have access to Dask Gateway. Consult your administrator if you need access.

In [None]:
gateway.list_clusters()
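
If a cluster is already running, it can be reused instead of creating a new one. The following is a minimal sketch, assuming `gateway.list_clusters()` returns report objects with a `name` attribute and that `gateway.connect(name)` attaches to a running cluster, as in the Dask Gateway client. The helper name `connect_to_first_cluster` is hypothetical.

```python
def connect_to_first_cluster(gateway):
    """Return a handle to the first running cluster, or None if there is none."""
    reports = gateway.list_clusters()
    if not reports:
        # Nothing is running (or the user cannot see any clusters).
        return None
    # Attach to the existing cluster by name rather than starting a new one.
    return gateway.connect(reports[0].name)
```

With the `gateway` object created earlier, `connect_to_first_cluster(gateway)` would return either a cluster handle or `None`.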

# Dask Gateway Options

When spinning up a Dask cluster there are many options. For QHub we have simplified these options into:
  - the environment in which your Dask scheduler and workers run (ensure this matches the Jupyter kernel shown in the top right of this notebook)
  - the size of the scheduler and workers. This is configurable by your administrator and typically controls the CPU and RAM per worker
  - environment variables to set on the workers

In [None]:
options = gateway.cluster_options()
options
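
The options can also be filled in from code rather than through the widget. The sketch below assumes only that the options object supports ordinary attribute access. The available field names are defined by your administrator, so the `profile` field in the usage comment is hypothetical, as is the helper `apply_options`.

```python
def apply_options(options, **overrides):
    """Set each override on the options object, rejecting unknown field names."""
    for name, value in overrides.items():
        if not hasattr(options, name):
            # Fail early on a misspelled or unsupported field.
            raise AttributeError(f"cluster options have no field named {name!r}")
        setattr(options, name, value)
    return options

# Hypothetical usage; the real field names depend on the QHub deployment:
# options = apply_options(gateway.cluster_options(), profile="Small Worker")
```

Displaying `options` in a cell shows which fields your deployment actually exposes.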

# Create a new Dask cluster

Clusters are created by asking Dask Gateway for a new cluster. The cluster will be created with the options selected above. Once the cluster widget appears, you can click the dashboard link to open the Dask scheduler dashboard, which is useful for debugging. You can also scale the cluster up and down from the menu in the widget below.

Scaling the workers up or down often requires QHub to create new nodes, which in the cloud takes a few minutes (5-6 minutes).

In [None]:
cluster = gateway.new_cluster(options)
cluster
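
Scaling can also be requested from code instead of the widget. This is a minimal sketch, assuming the cluster object exposes `scale(n)` for a fixed worker count and `adapt(minimum=..., maximum=...)` for adaptive scaling, as Dask Gateway clusters do. The helper `resize` is hypothetical.

```python
def resize(cluster, workers, max_workers=None):
    """Request a fixed number of workers, or adaptive scaling up to max_workers."""
    if workers < 0:
        raise ValueError("workers must be non-negative")
    if max_workers is None:
        # Fixed size: ask for exactly this many workers.
        cluster.scale(workers)
    else:
        if max_workers < workers:
            raise ValueError("max_workers must be >= workers")
        # Adaptive: let the scheduler grow and shrink within the bounds.
        cluster.adapt(minimum=workers, maximum=max_workers)
```

For example, `resize(cluster, 2)` asks for two workers, while `resize(cluster, 1, max_workers=5)` lets the cluster vary between one and five. As noted above, new workers may take several minutes to appear if nodes have to be created.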

# Create a Dask client

The client object is what Dask uses for all computations. From this point on, every computation is an ordinary Dask computation.

In [None]:
client = cluster.get_client()
client

# Sample Computation

In [None]:
import dask.array as da
x = da.random.random((10000, 10000), chunks=(1000, 1000))
x
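
The `chunks` argument splits the array into blocks that workers can process independently. The arithmetic below is a plain-Python sketch of the resulting layout, assuming 8-byte float64 elements. The helper `chunk_layout` is hypothetical and does not use Dask itself.

```python
import math

def chunk_layout(shape, chunks, itemsize=8):
    """Return (number of blocks, bytes per full block, total bytes)."""
    # Blocks along each axis; a trailing partial block still counts as one.
    blocks_per_axis = [math.ceil(s / c) for s, c in zip(shape, chunks)]
    n_blocks = math.prod(blocks_per_axis)
    # Size of a full block; edge blocks may be smaller when the shape
    # is not an exact multiple of the chunk size.
    block_bytes = math.prod(chunks) * itemsize
    total_bytes = math.prod(shape) * itemsize
    return n_blocks, block_bytes, total_bytes

n_blocks, block_bytes, total_bytes = chunk_layout((10000, 10000), (1000, 1000))
print(n_blocks, block_bytes / 1e6, total_bytes / 1e6)  # 100 8.0 800.0
```

So the array above consists of 100 blocks of 8 MB each, 800 MB in total.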

In [None]:
y = x + x.T
z = y[::2, 5000:].mean(axis=1)
z
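
The expressions above only build a task graph; nothing is computed until `compute()` is called. To make the indexing concrete, here is a plain-Python sketch of the same operations on a small 4x4 list of lists, with a column offset of 2 standing in for 5000. It illustrates the semantics only, not how Dask evaluates the graph, and the helper name is hypothetical.

```python
def symmetrized_row_means(x, col_start):
    """Mimic z = (x + x.T)[::2, col_start:].mean(axis=1) for a square list of lists."""
    n = len(x)
    # y = x + x.T: add each element to its transposed counterpart.
    y = [[x[i][j] + x[j][i] for j in range(n)] for i in range(n)]
    # y[::2, col_start:]: every second row, columns from col_start onward.
    sub = [row[col_start:] for row in y[::2]]
    # .mean(axis=1): average across the columns of each remaining row.
    return [sum(row) / len(row) for row in sub]

x_small = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
print(symmetrized_row_means(x_small, 2))  # [12.5, 22.5]
```

By the same logic, `z` in the cell above has one entry for every second row of `y`, which gives 5000 values.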

In [None]:
z.compute()
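
When the work is finished, shutting the cluster down frees its workers for other users. This is a minimal sketch, assuming the client has `close()` and the cluster has `shutdown()`, as the Dask and Dask Gateway clients do. The helper `release` is hypothetical.

```python
def release(client, cluster):
    """Close the client, then shut the cluster down even if closing fails."""
    try:
        client.close()
    finally:
        # Always attempt the shutdown so workers are not left running.
        cluster.shutdown()
```

Calling `release(client, cluster)` at the end of the notebook would disconnect the client and stop the scheduler and workers.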