<i>Copyright (c) Microsoft Corporation. All rights reserved.</i>

<i>Licensed under the MIT License.</i>

# Submit an Existing Notebook to AzureML
This notebook provides a scaffold for submitting an existing notebook directly to an AzureML compute target. You don't have to rewrite the notebook as a training script: replace the file name below and submit the notebook as is.

### Advantages of using AzureML:
- Manage cloud resources for monitoring, logging, and organizing your machine learning experiments.
- Train models either locally or by using cloud resources, including GPU-accelerated model training.
- Scale out easily as the dataset grows by creating a new compute target and pointing the run at it.

### Prerequisites
   - **Azure Subscription**
     - If you don't have an Azure subscription, create a free account before you begin. Try the [free or paid version of Azure Machine Learning service today](https://azure.microsoft.com/en-us/free/services/machine-learning/).
     - You get credits to spend on Azure services, which will easily cover the cost of running this example notebook. After they're used up, you can keep the account and use [free Azure services](https://azure.microsoft.com/en-us/free/). Your credit card is never charged unless you explicitly change your settings and ask to be charged. Or [activate MSDN subscriber benefits](https://azure.microsoft.com/en-us/pricing/member-offers/credit-for-visual-studio-subscribers/), which give you credits every month that you can use for paid Azure services.

# Set up environment

### Install azureml.contrib.notebook package
We need `notebook_run_config` from `azureml.contrib.notebook` to run this notebook. Because `azureml.contrib.notebook` contains experimental components, it is not included in the generated .yaml file by default.

In [None]:
#!pip install "azureml.contrib.notebook>=1.0.21.1"

In [None]:
import sys
sys.path.append("../")
import os
import azureml.core
from azureml.core import Workspace
from os import path, makedirs

from azureml.core import Experiment
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.widgets import RunDetails
from azureml.core.runconfig import RunConfiguration
import azureml.widgets as widgets
from azureml.contrib.notebook.notebook_run_config import NotebookRunConfig, PapermillExecutionHandler
from azureml.core.runconfig import DEFAULT_GPU_IMAGE
from azureml.core.conda_dependencies import CondaDependencies

print("azureml.core version: {}".format(azureml.core.VERSION))

### Connect to an AzureML workspace

An [AzureML Workspace](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.workspace.workspace?view=azure-ml-py) is an Azure resource that organizes and coordinates the actions of many other Azure resources to assist in executing and sharing machine learning workflows. In particular, an AzureML Workspace coordinates storage, databases, and compute resources, providing added functionality for machine learning experimentation, deployment, inferencing, and the monitoring of deployed models.

The function below will get or create an AzureML Workspace and save the configuration to `aml_config/config.json`.

By default, it uses the provided input parameters or environment variables for the Workspace configuration values. Otherwise, it uses an existing configuration file (either at `./aml_config/config.json` or at a path specified by the `config_path` parameter).

Lastly, if the workspace does not exist, one will be created for you. See [this tutorial](https://docs.microsoft.com/en-us/azure/machine-learning/service/setup-create-workspace#portal) to locate information such as the subscription ID.

In [None]:
ws = Workspace.setup()
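The lookup order described above (explicit parameters, then environment variables, then a saved configuration file) can be sketched in plain Python. This is only an illustration: `resolve_workspace_settings` is a hypothetical helper, not an SDK function, the environment variable names are assumed to be the upper-cased keys, and the configuration file is assumed to hold `subscription_id`, `resource_group`, and `workspace_name`.

```python
import json
import os


def resolve_workspace_settings(subscription_id=None, resource_group=None,
                               workspace_name=None,
                               config_path="./aml_config/config.json",
                               env=None):
    """Resolve settings from arguments, then environment, then config file.

    Hypothetical helper for illustration; not part of the AzureML SDK.
    """
    env = os.environ if env is None else env
    explicit = {
        "subscription_id": subscription_id,
        "resource_group": resource_group,
        "workspace_name": workspace_name,
    }
    # Assumed variable names: the upper-cased key, e.g. SUBSCRIPTION_ID.
    settings = {key: value or env.get(key.upper())
                for key, value in explicit.items()}

    # Fall back to the saved configuration file for anything still missing.
    if not all(settings.values()) and os.path.isfile(config_path):
        with open(config_path) as f:
            saved = json.load(f)
        for key in settings:
            settings[key] = settings[key] or saved.get(key)

    missing = [key for key, value in settings.items() if not value]
    if missing:
        raise ValueError("Missing workspace settings: " + ", ".join(missing))
    return settings
```

Resolving the values up front like this makes it easier to see which source supplied each setting before any Azure call is made.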

### Create or Attach Azure Machine Learning Compute 

We create a cluster as our **remote compute target**; the configuration below uses a GPU VM size (`STANDARD_NC6`). If a cluster with the same name already exists in your workspace, the script loads it instead. Read [Set up compute targets for model training](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-set-up-training-targets) to learn more about setting up compute targets in different locations. You can also choose a larger VM size when the model needs it.

In [None]:
# Remote compute (cluster) configuration. To reduce cost, choose a smaller VM size and fewer nodes.
VM_SIZE = 'STANDARD_NC6'
# Cluster nodes
MIN_NODES = 0
MAX_NODES = 2

CLUSTER_NAME = 'gpuclusternc6'

from azureml.core.compute_target import ComputeTargetException

try:
    compute_target = ComputeTarget(workspace=ws, name=CLUSTER_NAME)
    print("Found existing compute target")
except ComputeTargetException:
    # Raised when no compute target with this name exists in the workspace
    print("Creating a new compute target...")
    # Specify the configuration for the new cluster
    compute_config = AmlCompute.provisioning_configuration(
        vm_size=VM_SIZE,
        min_nodes=MIN_NODES,
        max_nodes=MAX_NODES
    )
    # Create the cluster with the specified name and configuration
    compute_target = ComputeTarget.create(ws, CLUSTER_NAME, compute_config)
    # Wait for provisioning to complete and show the output log
    compute_target.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)
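The get-or-create logic in the cell above can be sketched without the SDK. In this minimal illustration, `NotFoundError`, `get_or_create`, and the in-memory `registry` are all hypothetical stand-ins. The point is that only a "not found" error should trigger creation, so that unrelated failures (for example, authentication problems) are not silently turned into a new cluster.

```python
class NotFoundError(Exception):
    """Stand-in for a 'compute target not found' error (hypothetical)."""


def get_or_create(name, lookup, create):
    """Return (target, created): reuse an existing target or create one."""
    try:
        return lookup(name), False
    except NotFoundError:
        # Only a missing target triggers creation; other errors propagate.
        return create(name), True


# In-memory registry standing in for the workspace's compute targets.
registry = {"existing": "target-existing"}


def lookup(name):
    if name not in registry:
        raise NotFoundError(name)
    return registry[name]


def create(name):
    registry[name] = "target-" + name
    return registry[name]


print(get_or_create("existing", lookup, create))       # ('target-existing', False)
print(get_or_create("gpuclusternc6", lookup, create))  # ('target-gpuclusternc6', True)
```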

In [None]:
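NOTEBOOK_NAME = '01_training_introduction.ipynb' # use a different file name to run another notebook
# splitext drops only the extension; str.strip(".ipynb") would also strip matching characters from the name itself
experiment_name = os.path.splitext(NOTEBOOK_NAME)[0]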

exp = Experiment(workspace=ws, name=experiment_name)

run_config = RunConfiguration()
run_config.target = CLUSTER_NAME # possible to use alternative compute targets
run_config.environment.docker.enabled = True
run_config.environment.docker.base_image = DEFAULT_GPU_IMAGE
run_config.environment.python.user_managed_dependencies = False
run_config.auto_prepare_environment = True
run_config.environment.python.conda_dependencies = CondaDependencies.create(
    python_version='3.6.2',
    pip_packages=['matplotlib', 'scrapbook', 'fastai', 'scikit-learn', 'bqplot', 'torch', 'azureml-sdk']
)

cfg = NotebookRunConfig(source_directory='../../',
                        notebook='classification/notebooks/' + NOTEBOOK_NAME,
                        output_notebook='classification/notebooks/out.ipynb',
                        run_config=run_config)
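Deriving the experiment name from the notebook file name is easy to get subtly wrong, because `str.strip` removes a set of characters from both ends rather than a suffix. The sketch below contrasts the two approaches; `experiment_name_from_notebook` is a hypothetical helper name used only for illustration.

```python
import os


def experiment_name_from_notebook(notebook_name):
    """Drop the directory and file extension from a notebook path."""
    return os.path.splitext(os.path.basename(notebook_name))[0]


name = "01_training_introduction.ipynb"

# strip() treats ".ipynb" as the character set {., i, p, y, n, b},
# so it also eats the trailing "n" of "introduction".
print(name.strip(".ipynb"))                 # 01_training_introductio

# splitext() removes only the extension.
print(experiment_name_from_notebook(name))  # 01_training_introduction
```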

`exp.submit()` uploads the `source_directory` folder and runs the designated notebook on the compute target.

In [None]:
run = exp.submit(cfg)
widgets.RunDetails(run).show()

In [None]:
# Run this cell after the run completes; until then, the metrics are empty
metrics = run.get_metrics()
print(metrics)
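Because metrics are only meaningful once the run has finished, it can help to wait for a terminal status before reading them. The SDK's `run.wait_for_completion()` does this directly. The sketch below shows the same idea as an explicit polling loop, assuming that `"Completed"`, `"Failed"`, and `"Canceled"` are the terminal statuses. `metrics_when_finished` and `FakeRun` are hypothetical names, and the metric value is made up for the demonstration.

```python
import time

# Assumed set of terminal run statuses.
TERMINAL_STATES = {"Completed", "Failed", "Canceled"}


def metrics_when_finished(run, poll_seconds=30, timeout_seconds=3600):
    """Poll run.get_status() until terminal, then return run.get_metrics()."""
    deadline = time.monotonic() + timeout_seconds
    while run.get_status() not in TERMINAL_STATES:
        if time.monotonic() >= deadline:
            raise TimeoutError("Run did not finish within the timeout")
        time.sleep(poll_seconds)
    return run.get_metrics()


class FakeRun:
    """Stand-in exposing the two methods used above; not an SDK class."""

    def __init__(self, statuses, metrics):
        self._statuses = list(statuses)
        self._metrics = metrics

    def get_status(self):
        # Step through the scripted statuses, staying on the last one.
        if len(self._statuses) > 1:
            return self._statuses.pop(0)
        return self._statuses[0]

    def get_metrics(self):
        return self._metrics


fake = FakeRun(["Running", "Running", "Completed"], {"accuracy": 0.9})
print(metrics_when_finished(fake, poll_seconds=0))  # {'accuracy': 0.9}
```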

# Deprovision compute resource
To avoid unnecessary charges, if you created a compute target that doesn't scale down to 0 nodes, make sure to deprovision it after use.

In [None]:
# delete() deprovisions and deletes the AmlCompute target.
# Do not run this before the experiment completes.

# compute_target.delete()

# Deletion takes a few minutes. You can check progress in the Azure Portal / Computing tab.

In [None]:
# clean up the temporary directory, if one was created earlier in the session
if "tmp_dir" in globals():
    tmp_dir.cleanup()