# Dask Gateway tutorial

<div class="alert alert-warning">
Warning

Dask Gateway creates schedulers and workers on the Purdue Hammer cluster via Slurm.

Therefore, all analysis code that uses Dask Gateway must be stored in a storage volume accessible from both the Hammer cluster and the Purdue Analysis Facility. At the moment, there is only one such volume: **Purdue Depot storage**.

Depot is accessible only to users with a Purdue account, so **CERN and FNAL users cannot use Dask Gateway at the moment**.
</div>

- The default conda environments `python3` and `python3-ml` have all the necessary software installed. To use Dask Gateway in your own environment, make sure it contains the `dask-gateway`, `ipykernel`, and `ipywidgets` packages.
- For more information, refer to [Dask Gateway documentation](https://gateway.dask.org).
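
To check whether a custom environment contains the packages listed above, something like the following could be run in a notebook cell. This is a minimal sketch using only the standard library; the helper name `missing_packages` is hypothetical, and the check only confirms that the modules can be found, not that their versions are compatible.

```python
from importlib.util import find_spec

# Import names of the packages listed above
# (the `dask-gateway` package is imported as `dask_gateway`).
REQUIRED_MODULES = ("dask_gateway", "ipykernel", "ipywidgets")


def missing_packages(modules=REQUIRED_MODULES):
    """Return the modules that cannot be found in the current environment."""
    return [name for name in modules if find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All required packages are available.")
```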

**1. Initialize the `gateway` object.** It will be used to interact with your Dask clusters.

In [1]:
from dask_gateway import Gateway
gateway = Gateway()

**2. Configure cluster.** There are two ways to configure a Dask cluster; choose whichever works better for your case:

1. Using the `options` object via an interactive Jupyter widget
2. Using keyword arguments (these override `options`)

In [2]:
# Run this cell to launch the "options" widget (the widget is displayed properly if
# the cell is executed using a kernel with `ipywidgets` installed).
# Changes to parameters in the widget are automatically applied to the "options" object.
options = gateway.cluster_options()
options


*An example of what the Gateway options widget looks like:*
<div>
<img src="../images/dask-gateway-widget-options.png" width="600"/>
</div>


**3. Create a new cluster.**
If the Slurm job is not scheduled within `cluster_start_timeout`, cluster creation fails. In that case, try increasing the timeout or using a different queue.

In [3]:
# 1. using "options" object

cluster = gateway.new_cluster(options)

# 2. using keywords (will override values set in "options")

# cluster = gateway.new_cluster(
#     options, # not required
#     conda_env = "/depot/cms/kernels/python3-ml",
#     queue = "cms",
#     worker_cores = 1,
#     worker_memory = 4,
#     env = {"KEY1": "VALUE1", "KEY2": "VALUE2"},
#     cluster_start_timeout = 60,
# )

cluster


*An example of what the Gateway cluster widget looks like:*
<div>
<img src="../images/dask-gateway-widget-cluster.png" width="600"/>
</div>
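
If cluster creation fails because the Slurm job is not scheduled in time, one option is to retry with progressively longer timeouts. The sketch below is only an illustration: the helper name `new_cluster_with_retry` and the timeout values are hypothetical, and it assumes that a failed start surfaces as an exception raised by `gateway.new_cluster()`.

```python
def new_cluster_with_retry(gateway, timeouts=(60, 120, 240), **kwargs):
    """Call gateway.new_cluster() with progressively longer start timeouts."""
    last_error = None
    for timeout in timeouts:
        try:
            # `cluster_start_timeout` is the keyword shown in the cell above.
            return gateway.new_cluster(cluster_start_timeout=timeout, **kwargs)
        except Exception as err:  # assumption: a failed start raises an exception
            print(f"Cluster did not start with cluster_start_timeout={timeout}: {err}")
            last_error = err
    raise RuntimeError("Cluster could not be started") from last_error


# Example usage (requires access to the gateway):
# cluster = new_cluster_with_retry(gateway, queue="cms")
```

If every attempt fails, switching to a different queue may help more than increasing the timeout further.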

Clusters can be scaled either via the Jupyter widget or with the `cluster.scale()` and `cluster.adapt()` methods.
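
As an illustration of programmatic scaling, the sketch below wraps the two calls in a small helper. The helper name `configure_workers` and the default limits are hypothetical; `cluster` is assumed to be the object returned by `gateway.new_cluster()` or `gateway.connect()`.

```python
def configure_workers(cluster, n_workers=None, minimum=0, maximum=10):
    """Request a fixed number of workers, or adaptive scaling if none is given."""
    if n_workers is not None:
        # Fixed size: ask for exactly `n_workers` workers.
        cluster.scale(n_workers)
    else:
        # Adaptive: let the number of workers vary between the two limits.
        cluster.adapt(minimum=minimum, maximum=maximum)


# Example usage (requires a running cluster):
# configure_workers(cluster, n_workers=4)           # fixed size
# configure_workers(cluster, minimum=1, maximum=8)  # adaptive
```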

**3a. Connect to an existing cluster.**
Use the commands below instead of `gateway.new_cluster()` to connect to an existing cluster.

In [4]:
# List available clusters
clusters = gateway.list_clusters()
print(clusters)

[ClusterReport<name=ef4fac36f4524184ba21ab67f842a9c7, status=RUNNING>]


In [5]:
# Connect to an existing cluster by name
cluster_name = "ef4fac36f4524184ba21ab67f842a9c7"   # paste cluster name here
cluster = gateway.connect(cluster_name)
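
To avoid copying the cluster name by hand, the listing and connection steps can be combined. The following is a rough sketch: the helper name `get_or_create_cluster` is hypothetical, and it simply reuses the first cluster returned by `gateway.list_clusters()`, which may not be the one you want if several are running.

```python
def get_or_create_cluster(gateway, **kwargs):
    """Connect to the first listed cluster, or create a new one if none exist."""
    reports = gateway.list_clusters()
    if reports:
        # Reuse an existing cluster instead of submitting a new Slurm job.
        return gateway.connect(reports[0].name)
    return gateway.new_cluster(**kwargs)


# Example usage (requires access to the gateway):
# cluster = get_or_create_cluster(gateway, queue="cms")
```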

**4. Connect a client to a cluster.**

In [6]:
client = cluster.get_client()

# Or connect to a specific cluster by name:
# cluster_name = "ef4fac36f4524184ba21ab67f842a9c7"   # paste cluster name here
# client = gateway.connect(cluster_name).get_client()
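
Once a client is connected, work can be sent to the cluster through it. The sketch below is a toy example rather than a realistic analysis; the function names `square` and `sum_of_squares` are hypothetical, and `client` is assumed to be the object returned by `cluster.get_client()`.

```python
def square(x):
    """Toy task that is executed on the Dask workers."""
    return x * x


def sum_of_squares(client, values):
    """Distribute `square` over `values` and add up the gathered results."""
    futures = client.map(square, values)  # one task per input value
    results = client.gather(futures)      # block until all tasks finish
    return sum(results)


# Example usage (requires a connected client and at least one worker):
# total = sum_of_squares(client, range(10))
```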

**5. Shut down the cluster.**

In [7]:
cluster.shutdown()

# Or shut down a specific cluster by name:
# cluster_name = "ef4fac36f4524184ba21ab67f842a9c7"   # paste cluster name here
# gateway.connect(cluster_name).shutdown()

**Shut down all clusters:**

In [8]:
for cluster_info in gateway.list_clusters():
    gateway.connect(cluster_info.name).shutdown()
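
To make sure a cluster is shut down even when the analysis code raises an error, the create, connect, and shut-down steps can be wrapped in `try`/`finally`. This is a minimal sketch; the helper name `run_on_temporary_cluster` and the `task` callable are hypothetical.

```python
def run_on_temporary_cluster(gateway, task, **cluster_kwargs):
    """Create a cluster, run task(client) on it, and always shut it down."""
    cluster = gateway.new_cluster(**cluster_kwargs)
    try:
        client = cluster.get_client()
        return task(client)
    finally:
        # Runs on success and on failure, so Slurm resources are released.
        cluster.shutdown()


# Example usage (requires access to the gateway):
# result = run_on_temporary_cluster(
#     gateway, lambda client: client.submit(sum, [1, 2, 3]).result(), queue="cms"
# )
```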