# Submodule 0 Tutorial 3: Notebooks in Azure ML
------------------------------------------------------

## Overview
This tutorial is intended to get you started on Azure ML with a "compute instance."

The process is not friendly to novice users *the first time.* You will have to set up a few things (an account, a subscription, and a workspace), but these become your defaults afterward. After that, it is easy to return to AzureML or to use it for other cloud computing modules provided at the NIGMS Sandbox.

The setup steps below might take 10-15 minutes.

## Learning Objectives
After this tutorial, you should be able to:
1. Understand what Azure is
2. Create an Azure account
3. Create a subscription
4. Create a workspace
5. Import Jupyter notebooks to Azure ML
6. Start & Stop a compute instance

## Prerequisites

You will need to know your Microsoft account login information, if you already have an account.


# Background: What are Azure and AzureML?

Microsoft Azure is a platform that lets you use remote computers ("the cloud") to run programs and store data without needing the physical hardware. Azure can provide a virtual environment where you can run Jupyter Notebooks, Python or R programs, or other tools.

Azure "Machine Learning" (ml.azure.com) is a service within Azure that makes it easy to set up and manage Jupyter notebooks in the cloud. It’s useful if you need to run data analysis or machine learning experiments on more powerful machines than your local computer. You can choose different numbers of virtual "cores" (the numbers of parallel computers processing your program) depending on your needs. 

The remote computers are secure and cost a nominal amount to use. 
<ul>
    <li> If you have an NIH Cloud Account, the costs are covered when you use your Cloud credentials and subscription. See tutorial 3b to learn how, then skip to Step 4.</li>
    <li> Otherwise, you will have to provide payment information for your work. </li>
    <li> After that, you will need to create a workspace to be able to run the NIGMS tutorials now and, hopefully, your own data analysis in the future. At first, this seems overly complex since you only need one place to work. Remember that this is a tool for programmers and for companies who may have lots of projects running.</li>
</ul>


# Step 1: Create a Microsoft Azure account

You need to make an [Azure account](https://azure.microsoft.com/en-us/pricing/purchase-options/azure-account/).

The tutorials offered at the NIGMS Sandbox are all short enough (computing time, your time) that you should be able to use only the free tier of an Azure account.

![free_azure.png](./images/free_azure.png)

After you select the free account button, Microsoft will take you to the login page (if you already use a Microsoft account). If you do not have an account with them, you can make one (or sign in) at this menu.

![azure_signin.png](./images/azure_signin.png)



# Step 2: Set up your subscription, if not provided by your University or the NIH
You need a subscription to use Azure computing (required for running Python or R code). If one is not provided for you through your organization or the NIH, this is probably the most complicated step.

The process assumes that you are creating an account for a long-term relationship with a substantial budget. Generally, you can accept the default information.

Cost Details:
- Azure will cover the first $200 of charges, which is an enormous amount of computing if you are just learning and running notebooks.
- The cost is low (about 15 cents/hour for the smallest virtual computer).
- You will need a credit card so that Azure can charge you for computing after the trial period has expired.
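
To put these numbers in perspective, the short Python sketch below estimates how many hours of computing the credit could cover. It only uses the approximate figures quoted above ($200 of credit, about 15 cents/hour); the function name `hours_covered` is made up for this illustration, and real prices depend on the region and the size of the virtual computer.

```python
# Assumed figures, copied from the cost details above; actual Azure prices vary.
FREE_CREDIT_USD = 200.00   # credit covered by Azure
HOURLY_RATE_USD = 0.15     # approximate rate for the smallest virtual computer


def hours_covered(credit_usd: float, hourly_rate_usd: float) -> float:
    """Return how many compute hours a credit covers at a flat hourly rate."""
    if hourly_rate_usd <= 0:
        raise ValueError("hourly_rate_usd must be positive")
    return credit_usd / hourly_rate_usd


hours = hours_covered(FREE_CREDIT_USD, HOURLY_RATE_USD)
print(f"About {hours:.0f} hours of compute")
```

At these assumed figures, the credit covers roughly 1,300 hours on the smallest virtual computer, far more than the Sandbox tutorials should need.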

### To create the subscription
To create an Azure Machine Learning (AzureML) subscription where you can run the Jupyter Notebook tutorials, you need to first create a regular Azure subscription if you don't already have one. 

1. Select the subscription icon

![azure_services.png](./images/azure_services.png)

2. In the new page, select add (to create a new subscription)

![add_subscription.png](./images/add_subscription.png)

3. Choose Your First Subscription Type
You will likely be offered one of the following options (see table). The student account, if applicable, is ideal because the $100 credit does not have a time limit. 

|Subscription Type    | Description | Recommended for |
|:---------------|:----------------------|:------------------------|
|Free Trial|$200 credit for 30 days |First-time users & experimentation |
|Pay-As-You-Go |Pay only for what you use, no upfront cost |Faculty or staff|
|**Student (Azure for Students)**| **Free $100 credit (no credit card needed) + free services**|**Verified students (with .edu email)**|
|NIH Subscription|For CloudLab|See below|

4. Provide a credit card for billing that subscription, if applicable

-------------------------------------------------------------------------------------

## Step 2b: If you have an NIH-provided subscription

It is hard to predict exactly what you will see when YOU log in. To switch to the NIH CloudLab subscription, follow these steps.

1. In the upper right corner of the Azure Machine Learning Studio page is a button with your initials:

![where2switch_subscription.png](./images/where2switch_subscription.png)

2. The menu that will appear should give you the option to switch the account to an NIH CloudLab account:

![switchSubscriptNIH.png](./images/switchSubscriptNIH.png)

You will have to log in again.

# Step 3: Create a workspace

AzureML requires that you do all of your cloud computing in a "workspace." The design assumes that you might have several distinct projects, but you *can* run all of your NIH tutorials in a single workspace.

To create your workspace, select this button from the home screen of a different part of Azure: [Azure ML](https://ml.azure.com/)

![create_workspace_button.png](./images/create_workspace_button.png)

It will open another menu where you need to provide an official name that is unique within your group (e.g., Tutorials). You do not have to make a "friendly name."

The pull-down menu should have your subscription option.
![create_new_workspace.png](./images/create_new_workspace.png)

When you return, [Azure ML](https://ml.azure.com/) will likely open directly to this workspace, OR the workspace will be the top option under "Workspaces" in the left menu.



# Step 4: Loading the tutorial into AzureML

### Notebooks
Azure uses its "Machine Learning Studio" to view Jupyter notebooks.

1. Go to [ml.azure.com](http://ml.azure.com)
2. Select the notebook option. (If your system opens to the list of workspaces, click on a workspace and it will take you to this menu list.)

![notebooks.png](./images/notebooks.png)

3. To add the tutorial folders/files, upload them with this button (OR clone a repository using the "elegant" directions in [GitHub download tutorial 1](Submodule_0_Tutorial_1_GithubDownload.ipynb)).

![upload.png](./images/upload.png)

With the + button, you can upload the modules that you downloaded from GitHub as zip files. Unzip each one first: the entire file structure will be reproduced in Azure notebooks when you upload the whole unzipped folder.

![upload_folder.png](./images/upload_folder.png)

<div class="alert alert-block alert-warning"> <b>Attention:</b> The Sandbox tutorials assume that you will have the folders that they provide. Hyperlinks usually refer to the folder, not to a web address. Please upload the whole tutorial folder. </div>
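
If you would rather unzip the download with Python on your own computer before uploading, a minimal sketch using the standard `zipfile` module is shown below. The file name `module-main.zip`, the output folder `unzipped_module`, and the helper name `extract_module` are placeholders invented for this example; replace them with your own.

```python
import zipfile
from pathlib import Path


def extract_module(zip_path: str, dest_dir: str) -> list[str]:
    """Unzip a downloaded module and return the names of its top-level entries."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest)  # keeps the folder structure stored in the zip
    return sorted(p.name for p in dest.iterdir())


if __name__ == "__main__":
    # Hypothetical file name; replace it with the zip file you downloaded.
    zip_file = Path("module-main.zip")
    if zip_file.exists():
        print(extract_module(str(zip_file), "unzipped_module"))
    else:
        print(f"{zip_file} not found; edit the file name above.")
```

The folder printed by the script is the one to upload, so that the links between notebooks and images keep working.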

## Step 4b: Open a Jupyter notebook

In Azure, after you've loaded some notebooks, open the folder, then double-click a notebook to open it. In the image below, the arrow shows where you can find the list of notebooks. Circled is the button to expand the notebook and shrink the file structure.

![opening_notebook.png](./images/opening_notebook.png)

NOTE: You cannot **run** any code boxes in the Jupyter notebook yet! If you click on the "play" arrows, nothing will happen until you start a compute instance (see Step 5).


# Step 5: Start & Stop compute instances
While you can read a Jupyter notebook in AzureML, or even in GitHub, you cannot run any of the Python code without "computing." This takes additional resources, so Azure makes you <u>initiate a "compute instance."</u> (Some places on the NIGMS Sandbox say "spin up a compute instance," but it really all just means "start up" or "turn on.")

This is a technical name for starting a cloud workstation, which is basically a computer operating through the internet.

This is the thing for which Azure will charge you, so you start it and stop it as needed.

For NIGMS Sandbox notebooks, there are two main types of "kernels": Python or R, usually with a version number.
<br>
![List_of_kernels.png](./images/List_of_kernels.png)


## To START the compute instance
It is simple to start the compute instance. Above the notebook, on the toolbar, there is a "play" button (a right-pointing triangle).

![startCompute.png](./images/startCompute.png)


When you click on that, it will take 1-2 minutes for the cloud workstation to start up. During that time, you *can* run code boxes and the system will queue those boxes until the instance begins.

<br>
TYPICALLY, the correct type of kernel will start up automatically with the next step. If it's Python, you should see the following banner in the upper right:
<br>

![type_of_kernel.png](./images/type_of_kernel.png)
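
Once the instance is running, one quick way to confirm that a Python kernel is attached is to run a small code box like the sketch below. It uses only the Python standard library; the helper name `describe_kernel` is invented for this example, and the version it reports depends on the kernel you selected.

```python
import platform
import sys


def describe_kernel() -> str:
    """Return a one-line summary of the Python interpreter behind this kernel."""
    return f"Python {platform.python_version()} at {sys.executable}"


print(describe_kernel())
```

If the code box prints a line starting with "Python", the compute instance and the kernel are both working. This check does not apply to R kernels.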

## To STOP the compute instance
When you have finished with the Jupyter notebook, you should **stop** the compute instance because you are being charged for the whole time it is running, whether or not you are actively using the tool.

To stop this, use the stop button that is in the same spot as the start ("play") arrow. It is a square in a circle on the toolbar.

![stopCompute](./images/stopCompute.png)

Sometimes a compute instance will start automatically, so it's important to check that it is not "playing" when you don't want it to be.

*Azure can be configured to stop the compute instance automatically after some time, but be sure to STOP it at the end of your work time to avoid being charged by the hour.*
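
To see why stopping matters, the sketch below compares a session that is stopped promptly with one that is left running over a weekend. It assumes a flat rate of about 15 cents/hour (the approximate figure quoted in Step 2) and hypothetical dates; the helper name `running_cost` is invented for this example, and actual Azure billing may differ.

```python
from datetime import datetime

HOURLY_RATE_USD = 0.15  # assumed rate, taken from the cost details in Step 2


def running_cost(started: datetime, stopped: datetime,
                 hourly_rate_usd: float = HOURLY_RATE_USD) -> float:
    """Return the charge for keeping an instance on from `started` to `stopped`."""
    if stopped < started:
        raise ValueError("stopped must not be earlier than started")
    hours = (stopped - started).total_seconds() / 3600
    return hours * hourly_rate_usd


# Hypothetical example: a two-hour session on a Friday afternoon...
session = running_cost(datetime(2025, 1, 3, 14, 0), datetime(2025, 1, 3, 16, 0))
# ...versus forgetting to stop the instance until Monday at 9:00.
forgotten = running_cost(datetime(2025, 1, 3, 14, 0), datetime(2025, 1, 6, 9, 0))
print(f"Stopped promptly: ${session:.2f}")
print(f"Left running over the weekend: ${forgotten:.2f}")
```

Even at the lowest rate, the forgotten instance costs many times more than the working session itself.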

## Conclusion
After this tutorial, you should have all you need to run the tutorials in AzureML.

You can now:
+ Learn how to [leverage GitHub for FAIR data practices](./Submodule_0_Tutorial_4_Github4You.md) in your research lab
+ Return to the folder list and proceed to Submodule 1


## Clean up
<div class="alert alert-block alert-warning"> <b>Attention:</b> To avoid unnecessary charges, please STOP your compute instance if you started one. </div>
(You did not NEED a compute instance to run anything on this page.)