# Azure ML Local Run
In this notebook, we create an Azure ML workspace and use it to run the training script locally.
## Imports and definitions

In [None]:
import os
import shutil
import json
from azureml.core import Workspace, Experiment, ScriptRunConfig
from azureml.core.runconfig import RunConfiguration
from azureml.core.conda_dependencies import CondaDependencies
import azureml.core
print('azureml.core.VERSION={}'.format(azureml.core.VERSION))

## Azure subscription
If you have multiple subscriptions, select the one you want to use.

In [None]:
%env selected_subscription=Boston Team Danielle

Log in to Azure if you are not already logged in.

In [None]:
%%bash
# Log in only if no accounts are cached; otherwise list the available subscriptions.
list=$(az account list -o table)
if [ "$list" == '[]' ] || [ "$list" == '' ]; then
  az login -o table
else
  az account list -o table
fi

Set the selected subscription as the default.

In [None]:
%%bash
az account set --subscription "$selected_subscription"
az account show -o table

Get the information for the selected Azure subscription.

In [None]:
account_json = !az account show
account = json.loads(''.join(account_json))
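
To make the parsing step concrete, here is a minimal sketch of what it does, runnable outside IPython. The `!az account show` syntax returns the command's output as a list of lines, which is why the lines are joined before parsing. The JSON below is a stand-in with placeholder values, not real output, and it shows only a few of the fields the CLI reports.

```python
import json

# Stand-in for the list of output lines that `!az account show` returns in IPython.
# All values are placeholders, not real identifiers.
account_json = [
    '{',
    '  "id": "00000000-0000-0000-0000-000000000000",',
    '  "isDefault": true,',
    '  "name": "Example Subscription",',
    '  "state": "Enabled",',
    '  "tenantId": "11111111-1111-1111-1111-111111111111"',
    '}',
]

# Join the lines back into one string and parse it as JSON.
account = json.loads(''.join(account_json))

# The `id` field is the subscription ID used when creating the workspace.
subscription_id = account['id']
print(subscription_id)
```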

## Create an Azure ML workspace
Create the workspace if it does not already exist, and write its details to `config.json` so that other notebooks can load the same workspace.

In [None]:
ws = Workspace.create(name='maboutest',
                      subscription_id=account['id'],
                      resource_group='maboutest',
                      create_resource_group=True,
                      location='eastus2',
                      exist_ok=True)
ws.write_config()
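
As a rough illustration of what gets shared between notebooks, the sketch below writes and reads a file shaped like the one `ws.write_config()` produces. It assumes the file holds the three keys `subscription_id`, `resource_group`, and `workspace_name`; the exact file location depends on the SDK version, so a temporary directory is used here. The helper `read_workspace_config` and the subscription ID value are hypothetical.

```python
import json
import os
import tempfile

REQUIRED_KEYS = {'subscription_id', 'resource_group', 'workspace_name'}

def read_workspace_config(path):
    """Load a workspace config file and check that the expected keys are present."""
    with open(path) as f:
        config = json.load(f)
    missing = REQUIRED_KEYS - set(config)
    if missing:
        raise KeyError('config is missing keys: {}'.format(sorted(missing)))
    return config

# Placeholder content; the subscription ID is not a real identifier.
example = {
    'subscription_id': '00000000-0000-0000-0000-000000000000',
    'resource_group': 'maboutest',
    'workspace_name': 'maboutest',
}

# Write the placeholder config to a temporary directory and read it back.
with tempfile.TemporaryDirectory() as tmp:
    config_path = os.path.join(tmp, 'config.json')
    with open(config_path, 'w') as f:
        json.dump(example, f)
    config = read_workspace_config(config_path)

print(config['workspace_name'])
```

In a later notebook, `Workspace.from_config()` is the SDK call that loads the workspace from this file.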

## Define the run configuration
Define a system-managed run configuration. This configuration is used to create the environment in which the script will run. Building the environment may take a while the first time a run configuration is used, but the environment is then reused until the run configuration changes. Note that `azureml-sdk` is included in `pip_packages` because our script uses it.

In [None]:
run_config = RunConfiguration()
run_config.environment.python.user_managed_dependencies = False
run_config.auto_prepare_environment = True
run_config.environment.python.conda_dependencies = CondaDependencies.create(
    conda_packages=['pandas==0.23.4',
                    'scikit-learn==0.20.0'],
    pip_packages=['azureml-sdk',
                  'lightgbm==2.1.2'])

## Define the script configuration
Specify the script to be run on your local machine. The path to the `data` directory must be absolute.

In [None]:
src = ScriptRunConfig(source_directory=os.path.join('.', 'scripts'), 
                      script='TrainTestClassifier.py', 
                      arguments=['--inputs', os.path.abspath('data'),
                                 '--estimators', '1000',
                                 '--match', '5',
                                 '--ngrams', '2',
                                 '--min_child_samples', '10'],
                      run_config=run_config)
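
The arguments are passed to the script as a flat list of strings, so the script has to parse and convert them itself. `TrainTestClassifier.py` is not shown in this notebook; the sketch below is a hypothetical parser for the same flags, assuming the numeric options are integers. It is not the script's actual code.

```python
import argparse

def parse_args(argv):
    """Hypothetical parser for the flags passed in `arguments` above."""
    parser = argparse.ArgumentParser(description='Train and test a classifier.')
    parser.add_argument('--inputs', required=True,
                        help='absolute path to the data directory')
    # The types below are assumptions; the flags arrive as strings.
    parser.add_argument('--estimators', type=int)
    parser.add_argument('--match', type=int)
    parser.add_argument('--ngrams', type=int)
    parser.add_argument('--min_child_samples', type=int)
    return parser.parse_args(argv)

# The path here is a placeholder for os.path.abspath('data').
args = parse_args(['--inputs', '/abs/path/to/data',
                   '--estimators', '1000',
                   '--match', '5',
                   '--ngrams', '2',
                   '--min_child_samples', '10'])
print(args.inputs, args.estimators, args.min_child_samples)
```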

## Create an experiment
Get an experiment to run the script; create it if it doesn't already exist.

In [None]:
exp = Experiment(workspace=ws, name='mabouhypelocal')

## Run the script
Submit the script to be run. This should return almost immediately.

In [None]:
exp.submit(src)

The call returns a table with a link to the `Details Page` in the Azure portal. That page lets you monitor the status of this run and of previous runs of the experiment. By clicking on a particular run, you can see its details, the files output by the script, and the logs of the run, including `driver.log`, which contains the script's printed output.

Get the object associated with the latest run. With this object, you can control the job programmatically. It is the same object that the `exp.submit(src)` call returned.

In [None]:
run_local = list(exp.get_runs())[0]

Wait for the run to complete. This returns a `dict` with detailed information about the run. At this point, the run's status is either `Finalizing` or `Completed`. Other states include `Running` and `Failed`.

In [None]:
run_status = run_local.wait_for_completion()
run_status['status']
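
In a script rather than a notebook, you may want to act on the status instead of just displaying it. The following is a minimal sketch that works on any dictionary with a `status` key, using only the states named above; the helper `check_run_status` is hypothetical and not part of the SDK.

```python
def check_run_status(details):
    """Return True if the run finished successfully, raise if it failed."""
    status = details.get('status')
    if status == 'Failed':
        raise RuntimeError('run failed; see driver.log for details')
    # `Finalizing` means the script is done and the run is being wrapped up.
    return status in ('Completed', 'Finalizing')

# Stand-ins for the dict returned by wait_for_completion().
print(check_run_status({'status': 'Completed'}))
print(check_run_status({'status': 'Running'}))
```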

We can now get the metrics logged by the script during its execution.

In [None]:
run_local.get_metrics()
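
The metrics come back as a dictionary keyed by metric name. A metric logged once maps to a single value, while a metric logged several times under the same name maps to a list of values. The sketch below reduces such a dictionary to the last value of each metric. The metric names and numbers are made up for illustration and do not come from the training script; `latest_values` is a hypothetical helper.

```python
def latest_values(metrics):
    """Map each metric name to its most recently logged value."""
    latest = {}
    for name, value in metrics.items():
        # Metrics logged more than once arrive as a list; keep the last entry.
        latest[name] = value[-1] if isinstance(value, list) else value
    return latest

# Made-up example in the shape described above.
example_metrics = {
    'accuracy': 0.5,
    'loss': [0.75, 0.5, 0.25],
}

print(latest_values(example_metrics))
```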