# Dask - Batch Jobs

In this notebook, we'll learn how to run [Dask](https://dask.org) batch jobs on Azure Machine Learning (AML) using [``dask-mpi``](https://mpi.dask.org).

## Get AML Workspace

You can use the AML workspace to retrieve datastores and keyvaults for accessing data credentials securely.
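As a minimal sketch of that idea, the helper below reads the workspace's default datastore and one secret from its default key vault. It takes a loaded `Workspace` object such as the `ws` created in the next cell. The function name `load_storage_secret` and the secret name `"storage-account-key"` are hypothetical and only serve as an illustration; the secret is assumed to have been stored in the key vault beforehand.

```python
def load_storage_secret(ws, secret_name="storage-account-key"):
    """Return (default datastore name, secret value) for a loaded workspace.

    `secret_name` is a hypothetical placeholder; use the name of a secret
    that actually exists in your workspace key vault.
    """
    # Default datastore registered with the workspace.
    datastore = ws.get_default_datastore()
    # Default key vault attached to the workspace.
    keyvault = ws.get_default_keyvault()
    # Fetch the secret value without hard-coding it in the notebook.
    secret_value = keyvault.get_secret(name=secret_name)
    return datastore.name, secret_value
```

Avoid printing the returned secret value in a notebook, since cell outputs are saved with the file.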

In [2]:
from azureml.core import Workspace

ws = Workspace.from_config()
ws

If you run your code in unattended mode, i.e., where you can't give a user input, then we recommend to use ServicePrincipalAuthentication or MsiAuthentication.
Please refer to aka.ms/aml-notebook-auth for different authentication mechanisms in azureml-sdk.


Workspace.create(name='default', subscription_id='6560575d-fa06-4e7d-95fb-f962e74efd7a', resource_group='azureml-examples')
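The warning above suggests `ServicePrincipalAuthentication` for unattended runs. The following is a rough sketch of how that might look, assuming the service principal credentials are supplied through environment variables. The variable names `AZURE_TENANT_ID`, `AZURE_CLIENT_ID`, and `AZURE_CLIENT_SECRET` and both helper names are assumptions made for this example, not part of the original notebook.

```python
import os


def service_principal_settings(env=os.environ):
    """Collect service principal credentials from environment variables.

    The variable names are an assumption for this sketch.
    """
    return {
        "tenant_id": env["AZURE_TENANT_ID"],
        "service_principal_id": env["AZURE_CLIENT_ID"],
        "service_principal_password": env["AZURE_CLIENT_SECRET"],
    }


def workspace_from_service_principal(name, subscription_id, resource_group,
                                     env=os.environ):
    """Load a workspace without interactive login (hypothetical helper)."""
    # Imported lazily so the settings helper works without azureml-core.
    from azureml.core import Workspace
    from azureml.core.authentication import ServicePrincipalAuthentication

    auth = ServicePrincipalAuthentication(**service_principal_settings(env))
    return Workspace.get(
        name=name,
        auth=auth,
        subscription_id=subscription_id,
        resource_group=resource_group,
    )
```

Keeping the credentials in environment variables (or a secret store) means they never appear in the notebook itself.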

In [33]:
%%writefile environment.yml
name: dask-mpi-env
channels:
  - defaults
  - conda-forge
dependencies:
  - python=3.8
  - pip
  - pip:
    - azureml-core
    - azureml-mlflow
    - dask
    - distributed
    - adlfs
    - mpi4py
    - dask-mpi

Overwriting environment.yml


In [34]:
%%writefile dask-job.py
from dask_mpi import initialize
initialize()

from distributed import Client

c = Client()
print(c)

Overwriting dask-job.py
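To make the script above easier to reason about: with its default arguments, `dask_mpi.initialize()` starts the Dask scheduler on MPI rank 0, lets rank 1 continue running the rest of the script as the client, and turns every remaining rank into a worker. The pure-Python sketch below only models that assignment; it does not call MPI or Dask, and the helper names are illustrative.

```python
from collections import Counter


def dask_mpi_role(rank):
    """Role a given MPI rank takes under dask-mpi's default layout."""
    if rank == 0:
        return "scheduler"
    if rank == 1:
        return "client"
    return "worker"


def role_counts(n_ranks):
    """Count how many ranks end up in each role for an MPI job of n_ranks."""
    return dict(Counter(dask_mpi_role(rank) for rank in range(n_ranks)))
```

Assuming one MPI process per node, a 10-node job gives `role_counts(10)` of one scheduler, one client, and eight workers, so only eight of the ten nodes execute Dask tasks.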


In [31]:
from azureml.core import ScriptRunConfig, Experiment, Environment
from azureml.core.runconfig import MpiConfiguration


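# Build the environment from the conda specification written above.
env = Environment.from_conda_specification("dask-mpi", "environment.yml")

# Request 10 nodes for the MPI job.
mpi_config = MpiConfiguration(node_count=10)

# "cpu-cluster" must name an existing compute target in the workspace.
src = ScriptRunConfig(
    source_directory=".",
    script="dask-job.py",
    compute_target="cpu-cluster",
    environment=env,
    distributed_job_config=mpi_config,
)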

run = Experiment(ws, "dask-mpi-tutorial").submit(src)
run

| Experiment | Id | Type | Status | Details Page | Docs Page |
| --- | --- | --- | --- | --- | --- |
| dask-mpi-tutorial | dask-mpi-tutorial_1607743357_78c685f0 | azureml.scriptrun | Starting | Link to Azure Machine Learning studio | Link to Documentation |
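The run is reported as `Starting` right after submission. When the notebook is executed as a script rather than interactively, it can be convenient to block until the job finishes. A minimal sketch, assuming `run` is the object returned by `submit` above (the helper name `wait_and_report` is hypothetical):

```python
def wait_and_report(run):
    """Block until the run finishes, streaming its logs, then return its status."""
    # show_output=True streams the job logs to stdout while waiting.
    run.wait_for_completion(show_output=True)
    # Final status string as reported by the service.
    return run.get_status()
```

Calling `wait_and_report(run)` returns the final status string, which a calling script can check before moving on to later steps.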


In [32]:
from azureml.widgets import RunDetails

RunDetails(run).show()
