Copyright (c) Microsoft Corporation. All rights reserved.

Licensed under the MIT License.

# Preparing the environment for the course

In this Notebook, we cover the necessary steps to provision an Ubuntu Data Science VM (DSVM) and install the necessary Python libraries and Jupyter Notebook. 

Below, we describe how you can:
1. Set up Jupyter Notebook access
2. Clone this repository
3. Install Python Dependencies
4. Create your Azure Machine Learning (AML) workspace

## Set up Jupyter Notebook access

**Note**: You can most likely skip this step if you are in the classroom.

Three steps are necessary to create a Jupyter Notebook server with remote access:
1. Configure the Jupyter server for remote access
2. Open an incoming TCP port in the firewall so you can access the Jupyter server
3. Start the Jupyter server

### Configure the Jupyter Server

Log into your DSVM using your preferred client:
- SSH terminal (e.g. [Putty](http://www.putty.org/))
- Remote desktop (e.g. [X2Go](http://wiki.x2go.org/doku.php/doc:installation:x2goclient))

From the command line run the following commands:

~~~~
jupyter notebook --generate-config # generate a jupyter config file
cd ~/.jupyter
rm jupyter_notebook_config.py
wget https://sethmottstore.blob.core.windows.net/predmaint/jupyter_notebook_config.py
~~~~

### Configure firewall

From the Azure portal, navigate to your virtual machine, click the **Networking** tab, then **Add inbound security rule**, and then **Add**. By default, this opens port 8080, which we can use to access the Jupyter server.
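
To confirm that the port is reachable from your own machine, a quick TCP connection test can help. The following is a minimal sketch, not part of the course material: the helper name `is_port_open` and the placeholder address are assumptions for illustration, and the check only succeeds once the Jupyter server is actually listening on the port.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False


if __name__ == "__main__":
    # Hypothetical placeholder: replace with your VM's public IP address.
    vm_ip = "127.0.0.1"
    print("Port 8080 reachable:", is_port_open(vm_ip, 8080))
```

A `False` result does not necessarily mean the firewall rule is wrong; it is also what you would see if the rule is in place but no server is running yet.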

### Clone the course repo

Run the following commands to clone the GitHub repository for the course into the VM:
~~~~
cd ~
git clone https://github.com/Azure/LearnAI-ADPM.git
~~~~

### Start Jupyter server

Return to the terminal, navigate to the root directory of the cloned repository for this course, and start the Jupyter server by typing the following command:

~~~~
cd LearnAI-ADPM
jupyter notebook
~~~~

### Connect to your Jupyter Server

Copy the **access token** shown in the terminal (the string following `?token=` in the provided URL). Find the **VM IP address** (shown on the **Overview** tab of `labvm` in the Azure portal). Now, from any machine, we can go to the following URL in a browser to access the Jupyter server:

`http://<VM_IP_ADDRESS>:8080/?token=<ACCESS_TOKEN>`
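
If you prefer not to assemble the URL by hand, it can be built from the URL that Jupyter prints in the terminal. This is a minimal sketch using only the Python standard library; the function names, the sample token, and the sample IP address are hypothetical and serve only as illustration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def extract_token(jupyter_url: str) -> str:
    """Return the value of the `token` query parameter from a URL printed by Jupyter."""
    query = parse_qs(urlsplit(jupyter_url).query)
    return query["token"][0]


def remote_url(vm_ip: str, token: str, port: int = 8080) -> str:
    """Build the URL used to reach the Jupyter server from another machine."""
    return f"http://{vm_ip}:{port}/?{urlencode({'token': token})}"


# Hypothetical values for illustration only.
printed = "http://localhost:8080/?token=abc123"
print(remote_url("203.0.113.10", extract_token(printed)))
# prints: http://203.0.113.10:8080/?token=abc123
```

Parsing the query string rather than splitting on a fixed marker means the token is found regardless of where it appears among the URL parameters.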

## Install or upgrade Python dependencies

**NOTE**: For the instructions in this section, instead of using an SSH terminal to log into the server, we can access a terminal directly from Jupyter by selecting **New > Terminal** from the menu.

Install a new Conda environment called `amladpm` using the provided conda environment configuration file `environment.yml` (this file is located in the root directory of this repository).

We can execute the following cell to see what the conda environment configuration file looks like:

In [None]:
!cat ../environment.yml

Navigate to the root directory of this repository, and run the following command to install the python dependencies:

~~~~
conda env create -f environment.yml
~~~~

If you have followed these steps successfully, you should be able to use the kernel `Python [conda env:amladpm]` for this lab.
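
Once the kernel is selected, you can check from inside the notebook that it really runs in the new environment. The sketch below assumes the default conda layout, in which a named environment lives in a directory with the same name (for example `.../envs/amladpm`); the helper name `env_name_from_prefix` is hypothetical.

```python
import os
import sys


def env_name_from_prefix(prefix: str) -> str:
    """Return the last path component of an interpreter prefix.

    For example, '/anaconda/envs/amladpm' gives 'amladpm'.
    """
    return os.path.basename(os.path.normpath(prefix))


expected = "amladpm"  # environment name used in this course
current = env_name_from_prefix(sys.prefix)
if current == expected:
    print(f"Running inside the '{expected}' environment.")
else:
    print(f"Kernel prefix is {sys.prefix!r}; expected an environment named '{expected}'.")
```

If the second message appears, switch the kernel from the **Kernel > Change kernel** menu and run the cell again.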

## Creating an Azure ML Workspace

Before you continue, switch the kernel of this notebook to the conda environment you installed above (e.g. `Python [conda env:amladpm]`).

If you already have access to an Azure ML Workspace, you can skip this section.  Otherwise, we will create one under the subscription provided (assuming we have the correct permissions for the given subscription ID).

Workspace creation will fail when:
1. You do not have permission to create an Azure ML Workspace in the resource group.
2. You are not a subscription owner or contributor and no Azure ML Workspace has ever been created in this subscription.

If workspace creation fails for any reason other than the workspace already existing, please work with your IT admin to obtain the appropriate permissions or to provision the required resources.

**Note:** The workspace creation can take several minutes.

Let's start by checking which core SDK version we have, both to validate the installation and for debugging purposes:

In [None]:
import azureml.core

print("SDK Version:", azureml.core.VERSION)

An AML Workspace is an Azure resource that organizes and coordinates the actions of many other Azure resources to assist in executing and sharing machine learning workflows.  In particular, an AML Workspace coordinates storage, databases, and compute resources providing added functionality for machine learning experimentation, operationalization, and the monitoring of operationalized models.

In [None]:
import os

home_dir = os.path.expanduser('~')
config_path = os.path.join(home_dir, 'aml_config')
os.makedirs(config_path,  exist_ok = True)

Here's how we can create a Workspace using the Azure ML SDK. As noted above, this can take several minutes to run.

In the following cell, fill in the empty quotes with your `subscription_id` and `resource_group` from the Azure portal.

For the workspace region, we recommend `eastus2`. Other supported regions include `westcentralus`, `southeastasia`, `westeurope`, `australiaeast`, although their support might lag behind.

In [None]:
from azureml.core import Workspace

# if subscription ID is an environment variable, use `subscription_id = os.environ.get("SUBSCRIPTION_ID")`
subscription_id = '' # run `!az account show` to get it or go to the Azure portal
resource_group = '' # use an existing resource group or create a new one if you can
workspace_name = 'predmaintws' # pick a name for your workspace (or keep the one here)
workspace_region = 'eastus2' # pick a region for your workspace (should be same as resource group region)

# create the workspace (can take several minutes)
ws = Workspace.create(name = workspace_name,
                      subscription_id = subscription_id,
                      resource_group = resource_group, 
                      location = workspace_region,
                      exist_ok = True)

# print details of workspace configuration
ws.get_details()

# Once we have a workspace, we can create a config file for it using `write_config()`.
ws.write_config(path = config_path)

Alternatively, we can follow the instructions [here](https://docs.microsoft.com/en-us/azure/machine-learning/service/how-to-configure-environment) to create an Azure ML Workspace from the Azure portal.

If we ran the above code successfully, we should have a new file called `~/aml_config/config.json`.

In [None]:
!cat ~/aml_config/config.json
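
Beyond printing the file, it can be useful to check that the fields left blank earlier were actually filled in. The following is a minimal sketch: it assumes the config file is a JSON object with `subscription_id`, `resource_group`, and `workspace_name` entries at the path stated above, and the helper name `missing_config_keys` is hypothetical.

```python
import json
import os

# Keys expected in the workspace config file (an assumption of this sketch).
REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")


def missing_config_keys(config_file: str) -> list:
    """Return the required keys that are absent or empty in a workspace config file."""
    with open(config_file) as f:
        config = json.load(f)
    return [key for key in REQUIRED_KEYS if not config.get(key)]


config_file = os.path.join(os.path.expanduser("~"), "aml_config", "config.json")
if os.path.exists(config_file):
    print("Missing or empty keys:", missing_config_keys(config_file))
else:
    print("No config file found at", config_file)
```

An empty list means all three entries are present; any key listed points to a value that still needs to be set before the workspace can be loaded.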

## Configuring the local environment

From now on, we can load the workspace from this config file in any notebook by passing its path to `Workspace.from_config()`.

In [None]:
import os
from azureml.core import Workspace

home_dir = os.path.expanduser('~')
config_path = os.path.join(home_dir, 'aml_config')

my_workspace = Workspace.from_config(path=os.path.join(config_path, 'config.json'))
my_workspace.get_details()

Now that we have a workspace, we create one or more **experiments** to run the remaining Notebooks in the course. Experiments allow us to track runs in the workspace. A workspace can have multiple experiments; an experiment must belong to a workspace. To create an experiment, we simply choose a name for it and run `exp = Experiment(workspace=ws, name=experiment_name)`, where `ws` points to our workspace.
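
As a minimal sketch of the experiment step, the call can be wrapped in a small helper. The helper name `get_experiment` and the experiment name in the usage comment are hypothetical; the `Experiment(workspace=..., name=...)` call itself is the one described above, and it requires the Azure ML SDK to be installed in the active kernel.

```python
def get_experiment(ws, experiment_name: str):
    """Return an Experiment with the given name in workspace `ws`."""
    # Imported lazily so that this cell can be defined even before
    # the Azure ML SDK is installed; the import runs on first call.
    from azureml.core import Experiment

    return Experiment(workspace=ws, name=experiment_name)


# Usage (assumes `my_workspace` was loaded with Workspace.from_config above;
# the experiment name is a hypothetical example):
# exp = get_experiment(my_workspace, "predmaint-setup")
# print(exp.name)
```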

# The end

## Troubleshooting

If you have trouble logging into Azure from Jupyter, try running the following command in the terminal.

~~~~
sudo -i az extension remove --name azure-ml-admin-cli # needed to run `az login` from jupyter
~~~~

