Copyright (c) Microsoft Corporation. All rights reserved.

Licensed under the MIT License.

# 04. Train in a remote Linux VM
* Create Workspace
* Create `train.py` file
* Create (or attach) a DSVM as a compute resource
* Configure & execute a run in a few different ways
    - Use system-built conda
    - Use existing Python environment
    - Use Docker 
* Find the best model in the run

## Prerequisites
Make sure you go through the [00. Installation and Configuration](00.configuration.ipynb) notebook first if you haven't already.

In [None]:
# Check core SDK version number
import azureml.core

print("SDK version:", azureml.core.VERSION)

## Initialize Workspace

Initialize a workspace object from persisted configuration.

In [None]:
from azureml.core import Workspace

ws = Workspace.from_config()
print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep='\n')
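
`Workspace.from_config()` reads the workspace identity from a JSON file persisted earlier. As a rough sketch, assuming that file holds `subscription_id`, `resource_group`, and `workspace_name` keys (the layout is an assumption, and `load_workspace_config`, the temporary path, and the placeholder values are hypothetical), such a file can be checked with the standard library alone:

```python
import json
import tempfile
from pathlib import Path

# Assumed layout of the persisted workspace configuration; values are placeholders.
sample_config = {
    "subscription_id": "<subscription-id>",
    "resource_group": "<resource-group>",
    "workspace_name": "<workspace-name>",
}

REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")


def load_workspace_config(path):
    """Read a workspace config file and verify the assumed keys are present."""
    with open(path, "r") as f:
        config = json.load(f)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError("config file is missing keys: " + ", ".join(missing))
    return config


# Write the sample to a temporary file, then read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    config_path = Path(tmp_dir) / "config.json"
    config_path.write_text(json.dumps(sample_config, indent=2))
    loaded = load_workspace_config(config_path)

print(loaded["workspace_name"])
```

A check like this can help tell a malformed configuration file apart from an authentication problem when `from_config()` fails.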

## Create Experiment

An **Experiment** is a logical container in an Azure ML Workspace. It hosts run records, which can include run metrics and output artifacts from your experiments.

In [None]:
experiment_name = 'train-on-remote-vm'

from azureml.core import Experiment
exp = Experiment(workspace=ws, name=experiment_name)

## View `train.py`

For convenience, we created a training script for you. It is printed below as text, but you can also run `%pfile ./train.py` in a cell to show the file.

In [None]:
with open('./train.py', 'r') as training_script:
    print(training_script.read())

## Create Linux DSVM as a compute target

**Note**: If creation fails with a message about Marketplace purchase eligibility, go to portal.azure.com, start creating a DSVM there, and select "Want to create programmatically" to enable programmatic creation. Once you have enabled it, you can exit without actually creating the VM.
 
**Note**: By default, SSH runs on port 22 and you don't need to specify it. If you switch to a different port for security reasons (such as 5022), you can pass the port number explicitly, as in the example below.

In [None]:
from azureml.core.compute import DsvmCompute
from azureml.core.compute_target import ComputeTargetException

compute_target_name = 'mysupervm'

try:
    dsvm_compute = DsvmCompute(workspace=ws, name=compute_target_name)
    print('found existing:', dsvm_compute.name)
except ComputeTargetException:
    print('creating new.')
    dsvm_config = DsvmCompute.provisioning_configuration(vm_size="Standard_D2_v2", ssh_port="5022")
    dsvm_compute = DsvmCompute.create(ws, name=compute_target_name, provisioning_configuration=dsvm_config)
    dsvm_compute.wait_for_completion(show_output=True)

## Attach an existing Linux DSVM
You can also attach an existing Linux VM as a compute target. The default SSH port is 22, but below we set it to 5022.

In [None]:
from azureml.core.compute import RemoteCompute 
# To connect with an SSH key instead of a password, provide the
# private_key_file and private_key_passphrase parameters.
attached_dsvm_compute = RemoteCompute.attach(workspace=ws,
                                             name="attached_vm",
                                             username='<ssh-username>',
                                             address='<ip_address>',
                                             ssh_port=5022,
                                             password='<password>')
attached_dsvm_compute.wait_for_completion(show_output=True)
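
The attach call accepts either a password or an SSH private key. The following is a minimal sketch of how the two variants might be assembled; `build_attach_kwargs` is a hypothetical helper written for this illustration (it is not part of the SDK), the angle-bracket values are placeholders, and only the parameter names mentioned in this notebook are used.

```python
def build_attach_kwargs(name, username, address, ssh_port=22,
                        password=None, private_key_file=None,
                        private_key_passphrase=None):
    """Assemble keyword arguments for RemoteCompute.attach (hypothetical helper).

    Exactly one of `password` or `private_key_file` must be given.
    """
    if (password is None) == (private_key_file is None):
        raise ValueError("provide exactly one of password or private_key_file")
    kwargs = {"name": name, "username": username,
              "address": address, "ssh_port": ssh_port}
    if password is not None:
        kwargs["password"] = password
    else:
        kwargs["private_key_file"] = private_key_file
        if private_key_passphrase is not None:
            kwargs["private_key_passphrase"] = private_key_passphrase
    return kwargs


key_kwargs = build_attach_kwargs(
    name="attached_vm",
    username="<ssh-username>",
    address="<ip_address>",
    ssh_port=5022,
    private_key_file="<path-to-private-key>",
    private_key_passphrase="<passphrase>",
)
print(sorted(key_kwargs))

# With the SDK installed and a workspace loaded, the call would then be:
# attached_dsvm_compute = RemoteCompute.attach(workspace=ws, **key_kwargs)
```

Building the arguments separately keeps credentials out of the attach call itself and makes it harder to pass a password and a key by mistake.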

## Configure & Run
There are several ways to execute a script on a remote VM.

### Conda run
You can ask the system to build a conda environment based on your dependency specification, and submit your script to run there. Once the environment is built, and if you don't change your dependencies, it will be reused in subsequent runs.

In [None]:
from azureml.core.runconfig import RunConfiguration
from azureml.core.conda_dependencies import CondaDependencies

# create a new RunConfig object
conda_run_config = RunConfiguration(framework="python")

# Set compute target to the Linux DSVM
conda_run_config.target = dsvm_compute.name

# specify CondaDependencies obj
conda_run_config.environment.python.conda_dependencies = CondaDependencies.create(conda_packages=['scikit-learn'])

In [None]:
from azureml.core import Run
from azureml.core import ScriptRunConfig

src = ScriptRunConfig(source_directory='.', script='train.py', run_config=conda_run_config)
run = exp.submit(config=src)

In [None]:
run

In [None]:
run.wait_for_completion(show_output=True)
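
The conda environment is reused as long as the dependency specification stays the same. One way to picture that behavior is a cache keyed by a fingerprint of the specification. The sketch below only illustrates the idea and is not how Azure ML implements environment caching; `environment_key`, `get_or_build`, and the in-memory dictionary are hypothetical.

```python
import hashlib
import json


def environment_key(conda_packages, pip_packages=()):
    """Return a stable fingerprint for a dependency specification."""
    spec = {
        "conda": sorted(conda_packages),
        "pip": sorted(pip_packages),
    }
    canonical = json.dumps(spec, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


built_environments = {}  # fingerprint -> label of an already-built environment


def get_or_build(conda_packages, pip_packages=()):
    """Reuse an environment when the fingerprint matches, else 'build' one."""
    key = environment_key(conda_packages, pip_packages)
    if key in built_environments:
        return built_environments[key], "reused"
    built_environments[key] = "env-" + key[:8]
    return built_environments[key], "built"


print(get_or_build(["scikit-learn"]))           # first run: built
print(get_or_build(["scikit-learn"]))           # same spec: reused
print(get_or_build(["scikit-learn", "numpy"]))  # changed spec: built again
```

Sorting the package lists before hashing means that reordering dependencies does not trigger a rebuild in this sketch, while adding or removing a package does.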

### Native VM run
You can also use an existing Python environment in the VM to execute the script, without asking the system to create a conda environment for you.

In [None]:
# create a new RunConfig object
vm_run_config = RunConfiguration(framework="python")

# Set compute target to the Linux DSVM
vm_run_config.target = dsvm_compute.name

# Let system know that you will configure the Python environment yourself.
vm_run_config.environment.python.user_managed_dependencies = True

The run below will likely fail because `train.py` depends on `azureml`, `scikit-learn`, and other packages that are not found in that Python environment.

In [None]:
from azureml.core import Run
from azureml.core import ScriptRunConfig

src = ScriptRunConfig(source_directory='.', script='train.py', run_config=vm_run_config)
run = exp.submit(config=src)
run.wait_for_completion(show_output=True)
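
A user-managed run succeeds only if the interpreter on the VM already has everything the script imports. One hedged way to check this up front is a small preflight script run with that interpreter. `missing_modules` is a hypothetical helper, and the module names in `required` are assumptions about what `train.py` imports.

```python
import importlib.util


def missing_modules(module_names):
    """Return the names in `module_names` that cannot be imported here."""
    missing = []
    for name in module_names:
        try:
            # find_spec imports parent packages of dotted names, so a missing
            # parent raises ImportError rather than returning None.
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            found = False
        if not found:
            missing.append(name)
    return missing


# Import names are assumptions: scikit-learn is imported as `sklearn`.
required = ["azureml.core", "sklearn"]
print("Missing modules:", missing_modules(required))
```

An empty list suggests the environment is ready for `train.py`. Otherwise, install the listed packages on the VM or fall back to a system-built conda environment.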

In [None]:
%%writefile ./train2.py

print('Hello World (without Azure ML SDK)!')

Now let's try again with `train2.py`, which does not need the Azure ML SDK. This time it should work fine.

In [None]:
src = ScriptRunConfig(source_directory='.', script='train2.py', run_config=vm_run_config)
run = exp.submit(config=src)
run.wait_for_completion(show_output=True)

Note that even in this case you get a run record with some basic statistics.

In [None]:
run

### Configure a Docker run with new conda environment on the VM
You can also execute the script in a Docker container on the VM. If you choose this option, the system pulls down a base Docker image, builds a new conda environment in it if you ask for one (you can skip this step if you are using a custom Docker image with a preconfigured Python environment), starts a container, and runs your script there. The image is also uploaded to the ACR (Azure Container Registry) associated with your workspace and reused in subsequent runs if your dependencies don't change.

In [None]:
from azureml.core.runconfig import RunConfiguration
from azureml.core.conda_dependencies import CondaDependencies


# create a new RunConfig object
docker_run_config = RunConfiguration(framework="python")

# Set compute target to the Linux DSVM
docker_run_config.target = dsvm_compute.name

# Use Docker in the remote VM
docker_run_config.environment.docker.enabled = True

# Use CPU base image from DockerHub
docker_run_config.environment.docker.base_image = azureml.core.runconfig.DEFAULT_CPU_IMAGE
print('Base Docker image is:', docker_run_config.environment.docker.base_image)

# Ask the system to provision a new conda environment based on the conda_dependencies.yml file
docker_run_config.environment.python.user_managed_dependencies = False

# Prepare the Docker and conda environment automatically when executing for the first time.
docker_run_config.prepare_environment = True

# specify CondaDependencies obj
docker_run_config.environment.python.conda_dependencies = CondaDependencies.create(conda_packages=['scikit-learn'])

### Submit the Experiment
Submit the script to run in the Docker image on the remote VM. If you are running this for the first time, the system downloads the base image, layers the packages specified in the `conda_dependencies.yml` file on top of it, creates a container, and then executes the script in the container.

In [None]:
src = ScriptRunConfig(source_directory='.', script='train.py', run_config=docker_run_config)
run = exp.submit(config=src)
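
The first-run sequence described above can be summarized as a small decision function. This is a simplified reading of the text in this notebook, not a description of the service's internals. `docker_run_steps` and its two flags are hypothetical names chosen for the illustration.

```python
def docker_run_steps(user_managed_dependencies, image_cached):
    """List the steps for a Docker run, following the description in the text."""
    if image_cached:
        # Dependencies unchanged since an earlier run: the stored image is reused.
        steps = ["reuse the image stored in the workspace container registry"]
    else:
        steps = ["pull the base Docker image"]
        if not user_managed_dependencies:
            steps.append("build a conda environment from the dependency spec")
        steps.append("upload the resulting image to the workspace container registry")
    steps.append("start a container")
    steps.append("run the script in the container")
    return steps


# First run with system-managed dependencies.
for step in docker_run_steps(user_managed_dependencies=False, image_cached=False):
    print("-", step)
```

In this reading, only the first run pays the cost of pulling the base image and building the environment. Later runs with unchanged dependencies go straight to starting the container.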

### View run history details

In [None]:
run

In [None]:
run.wait_for_completion(show_output=True)

### Find the best model

Now that we have tried various execution modes, we can find the best model from the last run.

In [None]:
# get all metrics logged in the run
metrics = run.get_metrics()

In [None]:
# find the index where MSE is the smallest
indices = list(range(0, len(metrics['mse'])))
min_mse_index = min(indices, key=lambda x: metrics['mse'][x])

print('When alpha is {1:0.2f}, we have min MSE {0:0.2f}.'.format(
    metrics['mse'][min_mse_index], 
    metrics['alpha'][min_mse_index]
))
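
The same selection can be tried without a completed run. The sketch below uses made-up numbers (for illustration only, not results from this experiment) in the shape the cell above expects: two parallel lists under the keys `alpha` and `mse`. It pairs the lists with `zip` instead of working with indices.

```python
# Hypothetical metrics in the same shape the notebook expects from
# run.get_metrics(): two parallel lists keyed by 'alpha' and 'mse'.
sample_metrics = {
    "alpha": [0.1, 0.2, 0.3, 0.4],
    "mse": [3425.0, 3410.5, 3398.2, 3402.7],
}

# Pair each MSE with its alpha and keep the pair with the smallest MSE.
best_mse, best_alpha = min(zip(sample_metrics["mse"], sample_metrics["alpha"]))

print("When alpha is {0:0.2f}, we have min MSE {1:0.2f}.".format(best_alpha, best_mse))
```

Because tuples compare element by element, `min` orders the pairs by MSE first and uses alpha only to break ties.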

## Clean up compute resource

In [None]:
dsvm_compute.delete()