# Distributed LightGBM (CPU)

Scaling out on AmlCompute is simple! The code from the previous notebook has been adapted in [src/run.py](src/run.py). The changes are:

- import and initialize dask_mpi
- use argparse to accept command-line arguments
- log with mlflow
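
As an illustration of the argparse change, here is a minimal sketch of how a script could accept the hyperparameters that this notebook passes at submission time. It is not the actual contents of src/run.py: the function name `parse_args` and the default values are assumptions, and the dask_mpi and mlflow parts are left out.

```python
import argparse


def parse_args(argv=None):
    """Parse LightGBM hyperparameters from the command line (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Distributed LightGBM training (sketch)")
    # Flag names mirror the arguments list used in this notebook;
    # the defaults are assumptions chosen for illustration.
    parser.add_argument("--boosting", type=str, default="gbdt")
    parser.add_argument("--num_iterations", type=int, default=100)
    parser.add_argument("--learning_rate", type=float, default=0.1)
    parser.add_argument("--num_leaves", type=int, default=31)
    return parser.parse_args(argv)


# Command-line values always arrive as strings; argparse converts them.
args = parse_args(
    ["--boosting", "gbdt", "--num_iterations", "100", "--learning_rate", "0.2", "--num_leaves", "31"]
)
print(args.boosting, args.num_iterations, args.learning_rate, args.num_leaves)
```

Declaring a `type` for each flag means the training code receives an `int` or `float` rather than a string, whatever types were used when the argument list was built.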

The [environment.yml](environment.yml) file contains the conda environment specification.

## Get Workspace

In [1]:
from azureml.core import Workspace

ws = Workspace.from_config()
ws

If you run your code in unattended mode, i.e., where you can't give a user input, then we recommend to use ServicePrincipalAuthentication or MsiAuthentication.
Please refer to aka.ms/aml-notebook-auth for different authentication mechanisms in azureml-sdk.


Workspace.create(name='default', subscription_id='6560575d-fa06-4e7d-95fb-f962e74efd7a', resource_group='azureml-examples')
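
The warning above recommends `ServicePrincipalAuthentication` for unattended runs. A minimal sketch of that setup follows. The environment variable names and both helper functions are assumptions made for illustration, not part of this notebook; the azureml imports are deferred so the first helper can be used without the SDK installed.

```python
import os


def service_principal_kwargs(env=None):
    """Collect service principal credentials from environment variables.

    The variable names below are an assumption; use whatever your
    deployment defines.
    """
    env = os.environ if env is None else env
    return {
        "tenant_id": env["AZURE_TENANT_ID"],
        "service_principal_id": env["AZURE_CLIENT_ID"],
        "service_principal_password": env["AZURE_CLIENT_SECRET"],
    }


def get_workspace_unattended():
    """Load the workspace from config.json without an interactive login."""
    # Imported lazily so this module loads even where azureml-sdk is absent.
    from azureml.core import Workspace
    from azureml.core.authentication import ServicePrincipalAuthentication

    auth = ServicePrincipalAuthentication(**service_principal_kwargs())
    return Workspace.from_config(auth=auth)
```

With the three variables set, `ws = get_workspace_unattended()` would stand in for the `Workspace.from_config()` call in the cell above.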

## Distributed Remotely

Simply create an ``MpiConfiguration`` with the desired node count and pass it to ``ScriptRunConfig`` as ``distributed_job_config``.

In [2]:
from azureml.core import ScriptRunConfig, Experiment, Environment
from azureml.core.runconfig import MpiConfiguration

arguments = ["--boosting", "gbdt", "--num_iterations", 100, "--learning_rate", 0.2, "--num_leaves", 31]
env = Environment.from_conda_specification("lightgbm-cpu-tutorial", "environment.yml")
mpi_config = MpiConfiguration(node_count=30)
src = ScriptRunConfig(
    source_directory="src",
    script="run.py",
    arguments=arguments,
    compute_target="cpu-cluster",
    environment=env,
    distributed_job_config=mpi_config,
)
run = Experiment(ws, "lightgbm-cpu-tutorial").submit(src)
run

| Experiment | Id | Type | Status | Details Page | Docs Page |
| --- | --- | --- | --- | --- | --- |
| lightgbm-cpu-tutorial | lightgbm-cpu-tutorial_1610141157_4f54e34e | azureml.scriptrun | Starting | Link to Azure Machine Learning studio | Link to Documentation |
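
The flat `arguments` list in the submission cell alternates flag names and values, which becomes hard to read as the number of hyperparameters grows. One possible way to build it from a dict is sketched below; `build_arguments` is a hypothetical helper, not part of the Azure ML SDK.

```python
def build_arguments(params):
    """Flatten {"name": value} into ["--name", value, ...], preserving dict order."""
    arguments = []
    for name, value in params.items():
        arguments.extend([f"--{name}", value])
    return arguments


# Same hyperparameters as the submission cell in this notebook.
params = {
    "boosting": "gbdt",
    "num_iterations": 100,
    "learning_rate": 0.2,
    "num_leaves": 31,
}
arguments = build_arguments(params)
print(arguments)
```

The resulting list can be passed to `ScriptRunConfig(arguments=...)` unchanged, and the dict is easier to vary when submitting several runs.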


## View Widget

Optionally, view the output in the run widget.

In [3]:
from azureml.widgets import RunDetails

RunDetails(run).show()


For testing, wait for the run to complete.

In [None]:
run.wait_for_completion(show_output=True)