Copyright (c) Microsoft Corporation. All rights reserved.

Licensed under the MIT License.

# Train Locally
In this notebook, you will perform the following using Azure Machine Learning.
* Load workspace.
* Configure & execute a local run in a user-managed Python environment.
* Configure & execute a local run in a system-managed Python environment.
* Configure & execute a local run in a Docker environment.
* Register model for operationalization.

In [1]:
import os

from azure_utils.machine_learning.utils import get_workspace_from_config
from azureml.core import Experiment
from azureml.core import ScriptRunConfig
from azureml.core.runconfig import RunConfiguration

#from notebooks import directory

If you run your code in unattended mode, i.e., where you cannot provide user input, we recommend using ServicePrincipalAuthentication or MsiAuthentication.
See aka.ms/aml-notebook-auth for the authentication mechanisms available in azureml-sdk.
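
For unattended runs, a service principal could be wired in roughly as follows. This is a minimal sketch, not the notebook's own method: the helper names and the environment variable names (`AML_TENANT_ID`, `AML_PRINCIPAL_ID`, `AML_PRINCIPAL_PASSWORD`) are assumptions chosen for illustration. Only `ServicePrincipalAuthentication` and its three keyword arguments come from the SDK.

```python
import os


def service_principal_kwargs(env=None):
    """Collect service principal settings from environment variables.

    The variable names below are hypothetical; use whatever your setup defines.
    """
    env = os.environ if env is None else env
    required = {
        "tenant_id": "AML_TENANT_ID",
        "service_principal_id": "AML_PRINCIPAL_ID",
        "service_principal_password": "AML_PRINCIPAL_PASSWORD",
    }
    missing = [var for var in required.values() if var not in env]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {arg: env[var] for arg, var in required.items()}


def get_service_principal_auth(env=None):
    """Build a ServicePrincipalAuthentication object (requires azureml-core)."""
    # Imported lazily so the helper above works without the SDK installed.
    from azureml.core.authentication import ServicePrincipalAuthentication

    return ServicePrincipalAuthentication(**service_principal_kwargs(env))
```

The object returned by `get_service_principal_auth()` can be passed as the `auth` argument of `Workspace(...)` in place of an `MsiAuthentication` instance.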


## Initialize Model Hyperparameters

The training script used by this notebook relies on 
[lightgbm](https://lightgbm.readthedocs.io/en/latest/Python-API.html#scikit-learn-api). 
Here we set the number of estimators, which is later passed to the script as a command-line argument. 

In [2]:
directory="/home/riversand/notebooks/az-ml-realtime-score/notebooks"

In [3]:
num_estimators = "10"

## Initialize Workspace

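Initialize a workspace object. The commented-out `get_workspace_from_config()` call would load it from a persisted configuration file; the cell below instead constructs the `Workspace` explicitly using MSI authentication.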

In [6]:
#ws = get_workspace_from_config()
from azureml.core.authentication import MsiAuthentication
from azureml.core import Workspace

msi_auth = MsiAuthentication()
ws = Workspace(subscription_id="109e56d8-d599-4905-ab89-be3f6c7e1662",
               resource_group="trial2",
               workspace_name="experiment2ml",
               auth=msi_auth)
print(ws.name, ws.resource_group, ws.location, sep="\n")

experiment2ml
trial2
eastus


## Create An Experiment
An **Experiment** is a logical container in an Azure ML Workspace. It hosts run records, which can include run metrics 
and output artifacts from your experiments.

In [7]:
experiment_name = "mlaks-train-on-local"
exp = Experiment(workspace=ws, name=experiment_name)

## Configure & Run

For demonstration purposes, this section shows three different ways of training your model locally through the 
Azure ML SDK. Any one of these runs is sufficient to register the model.


### User-managed environment
Below, we use a user-managed run, which means you are responsible for ensuring that all the necessary packages are 
available in the Python environment you choose to run the script in. We will use the environment created for this 
tutorial, which has the Azure ML SDK and other dependencies installed.

In [8]:
# Edit run configuration properties on the fly.
run_config_user_managed = RunConfiguration()

run_config_user_managed.environment.python.user_managed_dependencies = True

# Choose the specific Python environment of this tutorial by pointing to the Python path
run_config_user_managed.environment.python.interpreter_path = "/anaconda/envs/az-ml-realtime-score/bin/python"
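
Because the interpreter path is hard-coded, it can help to confirm that it exists before submitting a run. The following is a minimal sketch under that assumption; `resolve_interpreter` is a hypothetical helper, not part of the Azure ML SDK, and falling back to the current interpreter is one possible policy among several.

```python
import os
import sys


def resolve_interpreter(preferred_path):
    """Return preferred_path if it is an executable file, else the current interpreter."""
    if os.path.isfile(preferred_path) and os.access(preferred_path, os.X_OK):
        return preferred_path
    # Fall back to the interpreter that is running this notebook.
    return sys.executable


interpreter = resolve_interpreter("/anaconda/envs/az-ml-realtime-score/bin/python")
print(interpreter)
```

The resolved value could then be assigned to `run_config_user_managed.environment.python.interpreter_path`.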


#### Submit script to run in the user-managed environment
Note that the whole `script` folder is submitted for execution, so every file placed in it is sent along with 
`create_model.py`. The model will be written to the `outputs` directory, a special directory whose entire content 
is automatically uploaded to your workspace. 
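
To illustrate the `outputs` convention, here is a minimal sketch of how a training script might save a model there. The `save_model` helper, the use of `pickle`, and the placeholder dictionary standing in for a trained model are assumptions for illustration; the actual serialization done by `create_model` is not shown in this notebook.

```python
import os
import pickle


def save_model(model, output_dir="outputs", filename="model.pkl"):
    """Serialize `model` into output_dir, creating the directory if needed."""
    os.makedirs(output_dir, exist_ok=True)
    path = os.path.join(output_dir, filename)
    with open(path, "wb") as handle:
        pickle.dump(model, handle)
    return path


# Placeholder object standing in for a trained model (hypothetical).
placeholder_model = {"n_estimators": 10}
saved_path = save_model(placeholder_model)
print(saved_path)
```

Anything written under `outputs` during a submitted run is uploaded with the run record, which is what later allows the model file to be registered from the run.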

In [10]:
if not os.path.isdir("script"):
    os.mkdir("script")

In [11]:
%%writefile script/create_model.py
from azure_utils.machine_learning import create_model

if __name__ == '__main__':
    create_model.main()


Writing script/create_model.py


In [12]:
script = "create_model.py"
args = [
    "--inputs",
    os.path.abspath(directory + "/data_folder"),
    "--outputs",
    "outputs",
    "--estimators",
    num_estimators,
    "--match",
    "5",
]
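
The arguments above are passed to the script as strings. As a rough sketch of how a script could receive them, the parser below mirrors the four flags used in this notebook. It is a hypothetical stand-in: the real parser inside `create_model` is not shown here, and the types, defaults, and help texts are assumptions.

```python
import argparse


def parse_args(argv=None):
    """Parse the command-line flags used in this notebook (hypothetical parser)."""
    parser = argparse.ArgumentParser(description="Hypothetical training-script CLI")
    parser.add_argument("--inputs", required=True, help="Folder containing the input data")
    parser.add_argument("--outputs", default="outputs", help="Folder for the trained model")
    # type=int converts the string "10" passed by the notebook into an integer.
    parser.add_argument("--estimators", type=int, default=10, help="Number of estimators")
    parser.add_argument("--match", type=int, default=5, help="Value of the --match flag")
    return parser.parse_args(argv)


parsed = parse_args(
    ["--inputs", "data_folder", "--outputs", "outputs", "--estimators", "10", "--match", "5"]
)
print(parsed)
```

This is why `num_estimators` can be defined as the string `"10"`: the conversion to a number happens on the script side.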

In [13]:
src = ScriptRunConfig(
    source_directory="./script",
    script=script,
    arguments=args,
    run_config=run_config_user_managed,
)

run = exp.submit(src)    
run.wait_for_completion(show_output=True)
run.get_file_names()
run.get_metrics()

RunId: mlaks-train-on-local_1586860619_26212aa8
Web View: https://ml.azure.com/experiments/mlaks-train-on-local/runs/mlaks-train-on-local_1586860619_26212aa8?wsid=/subscriptions/109e56d8-d599-4905-ab89-be3f6c7e1662/resourcegroups/trial2/workspaces/experiment2ml

Streaming azureml-logs/60_control_log.txt

Streaming log file azureml-logs/60_control_log.txt
Running: ['/bin/bash', '/tmp/azureml_runs/mlaks-train-on-local_1586860619_26212aa8/azureml-environment-setup/conda_env_checker.sh']

Streaming azureml-logs/70_driver_log.txt

Starting the daemon thread to refresh tokens in background for process with pid = 11468
Entering Run History Context Manager.
Preparing to call script [ create_model.py ] with arguments: ['--inputs', '/home/riversand/notebooks/az-ml-realtime-score/notebooks/data_folder', '--outputs', 'outputs', '--estimators', '10', '--match', '5']
After variable expansion, calling script [ create_model.py ] with arguments: ['--inputs', '/home/riversand/notebooks/az-ml-realtime-sc

{'Accuracy @1': 0.0,
 'Accuracy @2': 0.24036697247706423,
 'Accuracy @3': 0.3327217125382263,
 'Mean Rank': 26.640061162079512}
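
The notebook does not show how these metrics are computed. As an illustration only, the sketch below uses one common definition: "Accuracy @k" as the fraction of queries whose correct match is ranked within the top k, and "Mean Rank" as the average rank of the correct match. The function names and the example ranks are made up and may differ from what `create_model` does.

```python
def accuracy_at_k(ranks, k):
    """Fraction of queries whose correct match has rank <= k (ranks start at 1)."""
    return sum(1 for rank in ranks if rank <= k) / len(ranks)


def mean_rank(ranks):
    """Average rank of the correct match across all queries."""
    return sum(ranks) / len(ranks)


# Made-up ranks for five queries, purely for illustration.
example_ranks = [2, 3, 10, 2, 40]

example_metrics = {
    "Accuracy @1": accuracy_at_k(example_ranks, 1),
    "Accuracy @2": accuracy_at_k(example_ranks, 2),
    "Accuracy @3": accuracy_at_k(example_ranks, 3),
    "Mean Rank": mean_rank(example_ranks),
}
print(example_metrics)
```

Inside a submitted script, values like these are typically recorded with `Run.get_context().log(name, value)`, which makes them available through `run.get_metrics()`.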

## Register Model

We now register the model with the workspace so that we can later deploy the model.

In [15]:
# Supply a model name and the path to the serialized model file within the run's outputs.
model = run.register_model(model_name="question_match_model", model_path="./outputs/model.pkl")

In [16]:
print(model.name, model.version, model.url, sep="\n")

question_match_model
1
aml://asset/db6729f4e7e742408d8cf72f3f6f16df


We can now move on to the [Develop Scoring Script](03_DevelopScoringScript.ipynb) notebook to develop the scoring 
script for the model we just registered.