Copyright (c) Microsoft Corporation. All rights reserved.  
Licensed under the MIT License.

## Goal

The goal of this notebook is to use `Experiment.submit()` to run a simple script that executes a Jupyter notebook. In a second [notebook](simple-pm-run-as-pipeline.ipynb), I show how to set up a pipeline to do the same thing and how its behavior differs.


## Dependencies

This notebook requires that `azureml-sdk` is installed in the environment in which it is run.

## Workload

This notebook will submit the `papermill_run_notebook.py` script in the `./projectDir` directory, which will, in turn, execute the `hello_world.ipynb` notebook (also in the `./projectDir` directory).

This entry script (`papermill_run_notebook.py`) is intended as a simplified version of the one used in the [Microsoft/Recommenders repository](https://github.com/Microsoft/Recommenders/blob/jumin/dnn/reco_utils/aml/wide_deep.py), so that I can document and test effects of parameters in the second [notebook](simple-pm-run-as-pipeline.ipynb).

### Azure Machine Learning Imports

In this first code cell, we import key Azure Machine Learning modules that we will use below. 

In [None]:
import os

import azureml.core
from azureml.core import Workspace, Run, Experiment
from azureml.core.compute import AmlCompute
from azureml.core.compute import ComputeTarget
from azureml.core import ScriptRunConfig
from azureml.core.runconfig import CondaDependencies, RunConfiguration
from azureml.core.runconfig import DEFAULT_CPU_IMAGE

# Check core SDK version number
print("SDK version:", azureml.core.VERSION)

### Initialize Workspace

Initialize a [workspace](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.workspace(class%29) object from persisted configuration, or retrieve it from Azure using the subscription, resource group, and workspace name read from environment variables.

In [None]:
aml_compute_target = "aml-compute-d2" ## 2-16 characters
exp_name = 'papermill-as-exp-run'
# project folder
project_folder = './projectDir'
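
The comment above notes that the compute target name must be 2-16 characters long. A minimal sketch of checking that up front, before any call to Azure, might look like the following. The helper name `check_compute_name` is hypothetical, and only the length rule stated in the comment is checked; any character restrictions the service applies are not covered here.

```python
def check_compute_name(name: str) -> str:
    """Return the name unchanged if its length is within 2-16 characters.

    Hypothetical helper: it only enforces the length rule mentioned in the
    notebook comment, not any other naming rule the service may apply.
    """
    if not 2 <= len(name) <= 16:
        raise ValueError(
            f"compute target name {name!r} has {len(name)} characters; expected 2-16"
        )
    return name


# The name used in this notebook has 14 characters, so it passes.
print(check_compute_name("aml-compute-d2"))
```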

In [None]:
if os.path.isdir('aml_config'):
    print('Loading Workspace information from configuration')
    ws = Workspace.from_config()
else:
    print('Getting Workspace information from Variables. You must set these or this will fail!')
    SUBSCRIPTION_ID = os.getenv("AZ_SUB","")
    RESOURCE_GROUP = os.getenv("RESOURCE_GROUP","")
    WS_NAME = os.getenv("WS_NAME","")
    WS_LOCATION = 'eastus'
    ws=Workspace.get(name=WS_NAME,
                    resource_group=RESOURCE_GROUP,
                    subscription_id=SUBSCRIPTION_ID)

print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\n')
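
The fallback branch above silently passes empty strings to `Workspace.get()` when an environment variable is unset. As a rough sketch, the variables could be checked first so that the failure names what is missing. The helper `read_workspace_settings` is hypothetical; the variable names (`AZ_SUB`, `RESOURCE_GROUP`, `WS_NAME`) and keyword names are the ones used in the cell above.

```python
import os

# Maps each Workspace.get() keyword used above to the environment variable
# that this notebook reads it from.
REQUIRED_VARS = {
    "subscription_id": "AZ_SUB",
    "resource_group": "RESOURCE_GROUP",
    "name": "WS_NAME",
}


def read_workspace_settings(env=os.environ):
    """Return the workspace settings as a dict, or raise if any are unset.

    Hypothetical helper; `env` is any mapping, which makes it easy to test.
    """
    missing = [var for var in REQUIRED_VARS.values() if not env.get(var)]
    if missing:
        raise EnvironmentError(
            "Set these environment variables first: " + ", ".join(missing)
        )
    return {kwarg: env[var] for kwarg, var in REQUIRED_VARS.items()}
```

With the variables set, the resulting dictionary could be unpacked into the same call the notebook makes, for example `Workspace.get(**read_workspace_settings())`.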


## Compute Targets
A compute target specifies where to execute your program, such as a local environment, a Docker container on a remote VM, or a cluster. A compute target needs to be addressable and accessible by you.

We will walk through a few examples.


## Local Compute - User managed

The simplest compute corresponds to a Python executable in an environment where the dependencies are already installed. In this example, we will use a conda environment that was created from the `papermill_conda.yml` file on the local machine. If you need to, you can create the conda environment with:

```
conda env create -f papermill_conda.yml
```

Then, set `interpreter_path` in the cell below to the location of the Python executable within that environment.

In [None]:
local_run_config_user_managed = RunConfiguration()
local_run_config_user_managed.environment.python.user_managed_dependencies = True
local_run_config_user_managed.environment.python.interpreter_path = 'C:\\Users\\jeremr\\AppData\\Local\\conda\\conda\\envs\\pm_simple\\python.exe'

# local_run_config.environment.docker.enabled = False
# local_run_config.environment.docker.base_image = DEFAULT_CPU_IMAGE
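
The interpreter path above is specific to one machine. One way to find the right value is to activate the conda environment and print `sys.executable`. The sketch below also shows how the path could be derived from an environment's directory, relying on the usual conda layout (`python.exe` at the environment root on Windows, `bin/python` elsewhere). The helper name `interpreter_path_for_env` and the example directories are hypothetical.

```python
import sys
from pathlib import PurePosixPath, PureWindowsPath


def interpreter_path_for_env(env_prefix: str, windows: bool) -> str:
    """Build the interpreter path for a conda environment directory.

    Hypothetical helper that assumes the standard conda layout.
    """
    if windows:
        return str(PureWindowsPath(env_prefix) / "python.exe")
    return str(PurePosixPath(env_prefix) / "bin" / "python")


# Run inside the activated environment, this prints the path to use directly.
print(sys.executable)

# Deriving the path from an (illustrative) environment directory instead.
print(interpreter_path_for_env("/opt/conda/envs/pm_simple", windows=False))
```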

## Run as a single submission of an experiment

Simply create a `ScriptRunConfig`, then `Experiment.submit()` it to the local, user-managed configuration.

In this example, `source_directory` is the directory containing the script to execute the notebook, the notebook, and any other dependencies. All files in this directory get mounted to the container in AmlCompute.

The `script` parameter refers to the path to the script you want to run, relative to the root of `source_directory`, and `run_config` is the run configuration you just created.


In [None]:
## Can just run it as an experiment locally
my_config = ScriptRunConfig(source_directory=project_folder, 
                            script='papermill_run_notebook.py', 
                            run_config=local_run_config_user_managed
                            )
## submit it...
Experiment(ws, exp_name).submit(my_config)

### Run again (still locally), with different parameters passed to the notebook.

In [None]:
## Specify arguments:
argumentlist = ["-x", 2, "-y", 3]
## pass the arguments:
my_config = ScriptRunConfig(source_directory=project_folder, 
                            script='papermill_run_notebook.py', 
                            run_config=local_run_config_user_managed,
                            arguments=argumentlist
                            )
## submit it...
Experiment(ws, exp_name).submit(my_config)
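
The contents of `papermill_run_notebook.py` are not shown in this notebook. As a minimal sketch of how an entry script might receive the `-x` and `-y` arguments, the example below parses them with `argparse` and collects them into a dictionary. The function name and the default values are assumptions made for illustration, not the actual script.

```python
import argparse


def parse_notebook_parameters(argv=None):
    """Parse -x and -y from the command line into a parameters dict.

    Hypothetical sketch: the defaults are illustrative only. Arguments
    arrive as strings on the command line, so type=int converts them.
    """
    parser = argparse.ArgumentParser(description="Collect parameters for a notebook run")
    parser.add_argument("-x", type=int, default=1, help="value for notebook parameter x")
    parser.add_argument("-y", type=int, default=1, help="value for notebook parameter y")
    args = parser.parse_args(argv)
    return {"x": args.x, "y": args.y}


# Simulates the arguments submitted above.
print(parse_notebook_parameters(["-x", "2", "-y", "3"]))  # prints {'x': 2, 'y': 3}
```

A dictionary of this shape is what `papermill.execute_notebook()` accepts through its `parameters` argument, so an entry script could forward it to the notebook that way.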

### Use local docker environment

Create a new RunConfiguration and change a few values.

**NOTE**: This requires that Docker is installed and running on the host machine.
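
A quick way to check this requirement before submitting is sketched below. It looks for the `docker` executable on the `PATH` and then runs `docker info`, treating a zero exit code as a sign that the daemon is reachable. The helper name `docker_is_running` is hypothetical, and the exit-code behavior is an assumption based on recent Docker CLI versions.

```python
import shutil
import subprocess


def docker_is_running(timeout_seconds: float = 10.0) -> bool:
    """Return True if the docker CLI is found and `docker info` succeeds.

    Hypothetical helper; it assumes `docker info` exits non-zero when the
    daemon cannot be reached.
    """
    docker = shutil.which("docker")
    if docker is None:
        return False  # CLI not installed or not on PATH
    try:
        result = subprocess.run(
            [docker, "info"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_seconds,
        )
    except (OSError, subprocess.TimeoutExpired):
        return False
    return result.returncode == 0


print("Docker available:", docker_is_running())
```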

In [None]:
## need ipykernel so that there is a kernel installed.
## need azureml-sdk for logging metrics
cd = CondaDependencies.create(pip_packages=["ipykernel", "papermill", "azureml-sdk"])

my_local_docker_run_config = RunConfiguration(conda_dependencies=cd)
my_local_docker_run_config.environment.docker.enabled = True
my_local_docker_run_config.environment.docker.base_image = DEFAULT_CPU_IMAGE
my_local_docker_run_config.auto_prepare_environment = True


In [None]:
## Run in Local docker
argumentlist = ["-x", 2, "-y", 3]
my_config = ScriptRunConfig(source_directory=project_folder, 
                            script='papermill_run_notebook.py', 
                            run_config=my_local_docker_run_config,
                            arguments=argumentlist
                            )
## submit it...
Experiment(ws, exp_name).submit(my_config)

## Use AmlCompute

First, list the Azure-based compute targets.

#### List of Compute Targets on the workspace

In [None]:
cts = ws.compute_targets
for ct in cts:
    print(ct)

#### Retrieve or create an Azure Machine Learning compute
Azure Machine Learning Compute is a service for provisioning and managing clusters of Azure virtual machines for running machine learning workloads. Let's create a new Azure Machine Learning Compute in the current workspace, if it doesn't already exist. We will then run the training script on this compute target.

If we could not find the compute with the given name in the previous cell, then we will create a new compute here. We will create an Azure Machine Learning Compute containing **STANDARD_D2_V2 CPU VMs**. This process is broken down into the following steps:

1. Create the configuration
2. Create the Azure Machine Learning compute

**This process takes about 3 minutes and prints only sparse output while it runs. Please wait until the call returns before moving to the next cell.**

In [None]:

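from azureml.core.compute_target import ComputeTargetException

try:
    # Reuse the compute target if it already exists in the workspace
    aml_compute = AmlCompute(ws, aml_compute_target)
    print("found existing compute target.")
except ComputeTargetException:
    # Catch only the "not found" style of failure instead of every exception
    print("creating new compute target")
    provisioning_config = AmlCompute.provisioning_configuration(vm_size = "STANDARD_D2_V2",
                                                                min_nodes = 1, 
                                                                max_nodes = 4)    
    aml_compute = ComputeTarget.create(ws, aml_compute_target, provisioning_config)
    # Block until provisioning finishes (or the 20-minute timeout is reached)
    aml_compute.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)
    
print("Azure Machine Learning Compute attached")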


### Create a New AmlCompute RunConfiguration

The only real change from the local Docker configuration is that you need to set the `target` field to `aml_compute.name`.

In [None]:
my_aml_run_config = RunConfiguration(conda_dependencies=cd)
my_aml_run_config.target = aml_compute.name
my_aml_run_config.environment.docker.enabled = True
my_aml_run_config.environment.docker.base_image = DEFAULT_CPU_IMAGE

### Run the same script against AmlCompute

In [None]:
## Run in AmlCompute
argumentlist = ["-x", 3, "-y", 3]
my_config = ScriptRunConfig(source_directory=project_folder, 
                            script='papermill_run_notebook.py', 
                            run_config=my_aml_run_config,
                            arguments=argumentlist
                            )
## submit it...
Experiment(ws, exp_name).submit(my_config)

In [None]:
## pass the arguments, and then run across multiple run_configs
argumentlist = ["-x", 2, "-y", 2]
run_configs = [my_local_docker_run_config, local_run_config_user_managed, my_aml_run_config]
for rc in run_configs:
    my_config = ScriptRunConfig(source_directory=project_folder, 
                            script='papermill_run_notebook.py', 
                            run_config=rc,
                            arguments=argumentlist
                            )
    ## submit it...
    Experiment(ws, exp_name).submit(my_config)
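
The loop above discards the run objects that `Experiment.submit()` returns. A minimal sketch of keeping them and waiting for each to finish is shown below. The helper `submit_and_wait` is hypothetical; it relies only on the `submit()`, `wait_for_completion()`, and `get_status()` methods, and passes `raise_on_error=False` so that one failed run does not stop the others from being checked.

```python
def submit_and_wait(experiment, configs):
    """Submit every config, wait for each run, and return the final statuses.

    Hypothetical helper: `experiment` is expected to behave like an
    azureml.core.Experiment and each submitted run like an azureml.core.Run.
    """
    # Submit everything first so the runs can proceed concurrently.
    runs = [experiment.submit(config) for config in configs]
    statuses = []
    for run in runs:
        # raise_on_error=False lets us record a failed status instead of raising.
        run.wait_for_completion(show_output=False, raise_on_error=False)
        statuses.append(run.get_status())
    return statuses
```

In this notebook it could be called with `Experiment(ws, exp_name)` and a list holding one `ScriptRunConfig` per entry in `run_configs`.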

## Save the runconfig for AmlCompute to use in another notebook

In [None]:
## this will save a file to ./aml_config/<aml_compute.name>.runconfig
my_aml_run_config.save(path='.', name=aml_compute.name)