# Azure ML Local Run
In this notebook, we create an Azure ML workspace and use it to run the training script locally.

The steps in this notebook are
- [import libraries](#import),
- [set the Azure subscription](#subscription),
- [create an Azure ML workspace](#workspace),
- [create an estimator](#estimator),
- [create an experiment](#experiment),
- [submit the estimator](#submit),
- [get the results](#results), and
- [evaluate the model](#evaluate).

## Imports  <a id='import'></a>

In [None]:
import os
from azure.common.credentials import get_cli_profile
from azureml.core import Workspace, Experiment
from azureml.train.estimator import Estimator
import azureml.core
from get_auth import get_auth
print('azureml.core.VERSION={}'.format(azureml.core.VERSION))

## Azure subscription <a id='subscription'></a>
If you have multiple subscriptions, select the one you want to use by supplying either its name or its ID. To run in a different location, enter one that supports HyperDrive. You can also set the name of the resource group in which this tutorial will add resources. *IMPORTANT NOTE:* The last notebook in this example will delete this resource group and all associated resources. We also define the number of estimators to use for the local run.

In [None]:
subscription_name="YOUR_SUBSCRIPTION_NAME"
subscription_id="YOUR_SUBSCRIPTION_ID"
location="eastus"
resource_group="hypetuning"
estimators = 1000

Check that we have either a subscription name or ID. If the ID has been supplied, then use that value.

In [None]:
if (subscription_name == "YOUR_SUBSCRIPTION_NAME"
    and subscription_id == "YOUR_SUBSCRIPTION_ID"):
    raise Exception("At least one of a subscription's name or ID must be supplied")
if subscription_id != "YOUR_SUBSCRIPTION_ID":
    print("A subscription ID has been supplied, so it will be used")
    subscription_name = subscription_id
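
The selection rule in the cell above can also be written as a small function, which makes it easy to check in isolation. This is a minimal sketch: `resolve_subscription` and the two placeholder constants are hypothetical names introduced here for illustration, not part of the notebook or of any Azure library.

```python
# Placeholder values that mean "not supplied" (copied from the cell above).
NAME_PLACEHOLDER = "YOUR_SUBSCRIPTION_NAME"
ID_PLACEHOLDER = "YOUR_SUBSCRIPTION_ID"


def resolve_subscription(name, sub_id):
    """Return the value to pass to `az account set --subscription`.

    The ID wins when both are supplied; an error is raised when neither is.
    """
    if name == NAME_PLACEHOLDER and sub_id == ID_PLACEHOLDER:
        raise ValueError("At least one of a subscription's name or ID must be supplied")
    return sub_id if sub_id != ID_PLACEHOLDER else name


# Only a name was supplied, so the name is returned.
print(resolve_subscription("My Subscription", ID_PLACEHOLDER))
```

Raising `ValueError` rather than a bare `Exception` is a design choice of this sketch; it lets callers catch the bad-input case specifically.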

Log in to Azure if not already logged in.

In [None]:
%%bash
list=`az account list -o table`
if [ "$list" == '[]' ] || [ "$list" == '' ]; then 
  az login -o table
else
  az account list -o table 
fi

Set the selected subscription as the default.

In [None]:
%%bash -s "$subscription_name"
az account set --subscription "$1"
az account show -o table

Get the ID for the selected Azure subscription.

In [None]:
az_profile = get_cli_profile()
subscription_id = az_profile.get_subscription_id()
print("Using subscription ID", subscription_id)

## Create an Azure ML workspace <a id='workspace'></a>
Create the workspace if it does not already exist, or retrieve it if it does, and write its details to `config.json` so that later notebooks can reference it. The first run can take about a minute.

In [None]:
auth = get_auth()
ws = Workspace.create(name='hypetuning',
                      subscription_id=subscription_id,
                      resource_group=resource_group,
                      create_resource_group=True,
                      exist_ok=True,
                      location=location,
                      auth=auth)
ws.write_config()
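
To make the role of `config.json` concrete, the sketch below writes and reads back a file with the three fields we assume it holds: `subscription_id`, `resource_group`, and `workspace_name`. The field names and the placeholder values are assumptions for illustration, and the exact location of the file can vary with the SDK version. In practice, a later notebook would typically call `Workspace.from_config()` rather than parse the file by hand.

```python
import json
import os
import tempfile

# Assumed layout of the file written by ws.write_config(); values are placeholders.
config = {'subscription_id': 'YOUR_SUBSCRIPTION_ID',
          'resource_group': 'hypetuning',
          'workspace_name': 'hypetuning'}

# Use a temporary directory so that a real config.json is not overwritten.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'config.json')
    with open(path, 'w') as f:
        json.dump(config, f, indent=4)
    # A later notebook would read the same three fields back.
    with open(path) as f:
        loaded = json.load(f)

print(sorted(loaded))
```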

## Create an estimator <a id='estimator'></a>
Create an estimator that specifies the location of the script, sets up its parameters, and specifies the packages needed to run the script. It may take a while to prepare the run environment the first time an estimator is used, but that environment will be reused until the list of packages is changed.

In [None]:
est = Estimator(source_directory=os.path.join('.', 'scripts'), 
                entry_script='TrainClassifier.py',
                script_params={'--data-folder': os.path.abspath('.'),
                               '--estimators': estimators,
                               '--match': 5,
                               '--ngrams': 2,
                               '--min_child_samples': 10,
                               "--save": "local_model"},
                compute_target='local',
                conda_packages=['pandas==0.23.4',
                                'scikit-learn==0.21.3',
                                'lightgbm==2.2.1'],
                use_docker=False)
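
The entries of `script_params` reach the entry script as command-line arguments. The sketch below shows one way such a dictionary can be flattened into an argument list and parsed on the script side. The parser is hypothetical: `TrainClassifier.py` is not shown in this notebook, so its actual argument handling, types, and defaults may differ, and the data folder value is illustrative.

```python
import argparse

# The same keys as the estimator's script_params; the values are illustrative.
script_params = {'--data-folder': '.',
                 '--estimators': 1000,
                 '--match': 5,
                 '--ngrams': 2,
                 '--min_child_samples': 10,
                 '--save': 'local_model'}

# Flatten the dictionary into an argument list: ['--data-folder', '.', ...].
argv = [str(token) for pair in script_params.items() for token in pair]

# Hypothetical parser; the real one in TrainClassifier.py may differ.
parser = argparse.ArgumentParser()
parser.add_argument('--data-folder', dest='data_folder', type=str)
parser.add_argument('--estimators', type=int)
parser.add_argument('--match', type=int)
parser.add_argument('--ngrams', type=int)
parser.add_argument('--min_child_samples', type=int)
parser.add_argument('--save', type=str)

args = parser.parse_args(argv)
print(args.estimators, args.save)
```

Declaring `type=int` for the numeric options matters because every command-line argument arrives as a string.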

## Create an experiment <a id='experiment'></a>
Get an experiment to run the script; create it if it doesn't already exist.

In [None]:
exp = Experiment(workspace=ws, name='hypetuning')

## Submit the script <a id='submit'></a>
Submit the estimator containing the script to be run. The call should return almost immediately, and the returned value is an object that lets you control the run programmatically.

In [None]:
run = exp.submit(est)
run

Displaying the run object shows a table with a link to the `Details Page` in the Azure Portal. That page lets you monitor the status of this run of the experiment, and that of previous runs of the experiment. By clicking on a particular run, you can see its details, the files output by the script, and the logs of the run, including the `driver.log` with the script's printed output.

## Get the results <a id='results'></a>
Wait for the run to complete, which takes about three minutes. The call to `wait_for_completion` returns a `dict` with detailed information about the run. Here, we see that the run is either `Finalizing` or has `Completed`. Other statuses include `Queued`, `Preparing`, `Initializing`, `Running`, and `Failed`.

In [None]:
%%time

run_status = run.wait_for_completion()
run_status['status']
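
The status names listed above can be grouped so that later code can branch on the outcome. This is a minimal sketch: `describe_status` is a hypothetical helper, and it covers only the statuses named in this notebook, so any other value is reported as unknown.

```python
# Status names taken from the text above.
IN_PROGRESS = {'Queued', 'Preparing', 'Initializing', 'Running', 'Finalizing'}
SUCCEEDED = {'Completed'}
FAILED = {'Failed'}


def describe_status(status):
    """Map a run status string to a coarse outcome."""
    if status in SUCCEEDED:
        return 'succeeded'
    if status in FAILED:
        return 'failed'
    if status in IN_PROGRESS:
        return 'in progress'
    return 'unknown'


print(describe_status('Completed'))
```

In the notebook, this could be called as `describe_status(run_status['status'])`.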

We can now get the metrics on the tune dataset that the script logged during its execution.

In [None]:
run.get_metrics()

## Evaluate the model using the test data <a id='evaluate'></a>
Download the trained model.

In [None]:
run.download_file(os.path.join('outputs', 'local_model.pkl'),
                  os.path.join('outputs', 'local_model.pkl'))

Look at the model's performance on the held-out test data. This can take a couple of minutes.

In [None]:
%run -t scripts/TestClassifier.py --model local_model

In [the next notebook](04_Hyperparameter_Random_Search.ipynb), we use the AML SDK to tune the set of hyperparameters.