# Running Dask on AzureML

This notebook shows how to run a batch job on a Dask cluster hosted on an AzureML Compute cluster.
For instructions on setting up your Python environment, see the [Readme](../README.md).

## Starting the cluster

In [51]:
from azureml.core import Workspace, Experiment
from azureml.train.estimator import Estimator
from azureml.widgets import RunDetails
from azureml.core.runconfig import MpiConfiguration
from azureml.core import VERSION
import uuid
import time
VERSION


'1.0.74'

In [52]:
ws = Workspace.from_config()

In [53]:
# Assumes the AzureML compute cluster 'dask-DS12-V2' has already been created
dask_cluster = ws.compute_targets['dask-DS12-V2']
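
The lookup above fails with a `KeyError` if the compute target does not exist yet. A get-or-create helper along the following lines may be useful in that case. This is a minimal sketch and not part of the original notebook: the name `get_or_create_cluster`, the VM size `STANDARD_DS12_V2` (guessed from the cluster name), and the node limits are all assumptions. It relies on `AmlCompute.provisioning_configuration` and `ComputeTarget.create` from `azureml.core.compute`.

```python
def get_or_create_cluster(ws, name, vm_size="STANDARD_DS12_V2", max_nodes=10):
    """Return the compute target called `name`, creating it if it is missing.

    Hypothetical helper: `vm_size` and `max_nodes` are illustrative defaults,
    not values taken from the original notebook.
    """
    targets = ws.compute_targets  # dict mapping target name -> compute target
    if name in targets:
        return targets[name]

    # Imported only on the creation path, where the SDK is actually needed.
    from azureml.core.compute import AmlCompute, ComputeTarget

    config = AmlCompute.provisioning_configuration(
        vm_size=vm_size,
        min_nodes=0,          # allow the cluster to scale down to zero when idle
        max_nodes=max_nodes,  # should be at least the node_count of the job
    )
    target = ComputeTarget.create(ws, name, config)
    target.wait_for_completion(show_output=True)
    return target
```

With this helper, the cell above could instead read `dask_cluster = get_or_create_cluster(ws, 'dask-DS12-V2')`.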

Start the Dask cluster using an `Estimator` with `MpiConfiguration`. Make sure the compute cluster can scale up to 10 nodes, or change `node_count` below.

In [54]:
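est = Estimator('dask',  # source directory containing the entry script
                compute_target=dask_cluster,
                entry_script='startDask.py',
                conda_dependencies_file='environment.yml',
                # arguments passed to startDask.py
                script_params={'--datastore': ws.get_default_datastore(),
                               '--script': 'batch.py'},
                node_count=10,  # must not exceed the cluster's maximum node count
                distributed_training=MpiConfiguration())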

run = Experiment(ws, 'dask').submit(est)
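
The contents of `startDask.py` are not shown in this notebook. As a rough illustration of why an MPI job is a convenient way to start a Dask cluster, the sketch below shows one way an entry script could pick a role per node from its MPI rank: rank 0 starts the scheduler, every other rank starts a worker. The function names, the use of the `OMPI_COMM_WORLD_RANK` variable (set by Open MPI), and the rank-to-role mapping are assumptions for illustration; the real script may differ.

```python
import os

SCHEDULER_PORT = 8786  # Dask's default scheduler port


def dask_command(rank, scheduler_ip, port=SCHEDULER_PORT):
    """Return the command line that a node with the given MPI rank would launch.

    Hypothetical helper: rank 0 runs the scheduler, all other ranks run workers.
    """
    if rank == 0:
        return ["dask-scheduler", "--port", str(port)]
    return ["dask-worker", f"tcp://{scheduler_ip}:{port}"]


def current_rank(environ=os.environ):
    """Read this process's MPI rank, defaulting to 0 outside an MPI job."""
    # Assumption: the job is launched by Open MPI, which sets this variable.
    return int(environ.get("OMPI_COMM_WORLD_RANK", "0"))
```

An entry script could then run something like `subprocess.Popen(dask_command(current_rank(), scheduler_ip))`. How the workers learn `scheduler_ip` (for example through an MPI broadcast from rank 0) is left out of this sketch.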

In [55]:
RunDetails(run).show()


## Shutting the cluster down

In [50]:
# Cancel every run of the 'dask' experiment that is still running
for run in ws.experiments['dask'].get_runs():
    if run.get_status() == "Running":
        print(f'cancelling run {run.id}')
        run.cancel()

cancelling run dask_1575974502_b8643732
cancelling run dask_1575973181_99433e88


### Getting the latest running run

In [87]:
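# get_runs() yields runs in reverse chronological order,
# so the first running run found is the most recent one
for run in ws.experiments['dask'].get_runs():
    if run.get_status() == "Running":
        print(f'latest running run is {run.id}')
        break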

latest running run is dask_1574792066_49c85fe4
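
Note that when no run is currently running, the loop above leaves `run` bound to the last (oldest) run it inspected. A small helper that returns `None` in that case may be safer. The sketch below is not part of the original notebook, and the name `first_with_status` is hypothetical; it only assumes that each run exposes `get_status()`, as in the cells above.

```python
def first_with_status(runs, status="Running"):
    """Return the first run in `runs` whose status equals `status`, or None.

    Because the runs are expected newest first, the first match is the
    most recent run with that status.
    """
    return next((r for r in runs if r.get_status() == status), None)
```

Usage would look like `run = first_with_status(ws.experiments['dask'].get_runs())`, followed by a check for `run is None` before using it.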
