# Part 0 - Lab Overview and Setup
This lab demonstrates how to use Azure Machine Learning service to orchestrate an end-to-end Deep Learning workflow from data preparation to model operationalization.


## Scenario

We will train a custom image classification model to automatically classify the type of land shown in aerial images of 224-meter x 224-meter plots. Land use classification models can be used to track urbanization, deforestation, loss of wetlands, and other major environmental trends using periodically collected aerial imagery. The images used in this lab are based on imagery from the U.S. National Land Cover Database, which defines six primary classes of land use: *Developed*, *Barren*, *Forested*, *Grassland*, *Shrub*, and *Cultivated*. Example images of each land use class are shown here:

Developed | Cultivated | Barren
--------- | ------ | ----------
![Developed](https://github.com/jakazmie/AIDays/raw/master/DataScientistTrack/02-AML-EndToEndWalkthrough/images/developed1.png) | ![Cultivated](https://github.com/jakazmie/AIDays/raw/master/DataScientistTrack/02-AML-EndToEndWalkthrough/images/cultivated1.png) | ![Barren](https://github.com/jakazmie/AIDays/raw/master/DataScientistTrack/02-AML-EndToEndWalkthrough/images/barren1.png)

Forested | Grassland | Shrub
---------| ----------| -----
![Forested](https://github.com/jakazmie/AIDays/raw/master/DataScientistTrack/02-AML-EndToEndWalkthrough/images/forest1.png) | ![Grassland](https://github.com/jakazmie/AIDays/raw/master/DataScientistTrack/02-AML-EndToEndWalkthrough/images/grassland1.png) | ![Shrub](https://github.com/jakazmie/AIDays/raw/master/DataScientistTrack/02-AML-EndToEndWalkthrough/images/shrub1.png)

We are going to employ a machine learning technique called transfer learning. Transfer learning is one of the fastest ways, in both coding effort and run time, to start using deep learning. It lets you reuse knowledge gained while solving one problem on a different but related problem. For example, knowledge gained while learning to recognize cars could apply when trying to recognize trucks. Transfer learning makes it feasible to train very effective ML models on relatively small training data sets, which is our case.

The primary goal of this lab is to understand how to use Azure ML to orchestrate Deep Learning workflows rather than to dive into Deep Learning techniques. If you want to better understand the approach used in the lab, ask the instructor.


## Lab flow

During the lab we will walk through a full end-to-end machine learning workflow.

![AMLWorkflow](https://github.com/jakazmie/AIDays/raw/master/DataScientistTrack/02-AML-EndToEndWalkthrough/images/aml.png)

- First, we will pre-process the training images using a pretrained Deep Neural Network to extract a relatively small number of powerful features, the so-called bottleneck features.

- Then we will use the extracted bottleneck features to train a small Fully Connected Neural Network to recognize land plot images.

- Finally, we will operationalize the model as a REST web service and deploy it to Azure.
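
The first two steps above can be illustrated with a toy sketch. It is not the lab's implementation: the lab uses a pretrained deep network and Keras, while this stand-in uses only the Python standard library, a fixed random projection in place of the pretrained network, and a logistic-regression head in place of the fully connected network. All names (`FROZEN_WEIGHTS`, `extract_bottleneck_features`, `train_head`) and the synthetic data are assumptions made for illustration.

```python
import math
import random

random.seed(0)  # fixed seed so the sketch is reproducible

INPUT_DIM = 16       # stand-in for the pixel values of one image
BOTTLENECK_DIM = 4   # far fewer values than the raw input

# Stand-in for a pretrained network: these weights are fixed and never trained.
FROZEN_WEIGHTS = [
    [random.uniform(-1.0, 1.0) for _ in range(INPUT_DIM)]
    for _ in range(BOTTLENECK_DIM)
]


def extract_bottleneck_features(image):
    """Stage 1: map a raw input to a short feature vector using frozen weights."""
    scale = math.sqrt(INPUT_DIM)
    return [
        math.tanh(sum(w * x for w, x in zip(row, image)) / scale)
        for row in FROZEN_WEIGHTS
    ]


def logit(weights, bias, features):
    return sum(w * f for w, f in zip(weights, features)) + bias


def mean_loss(weights, bias, features, labels):
    """Mean logistic loss, written in a numerically stable form."""
    total = 0.0
    for f, y in zip(features, labels):
        s = logit(weights, bias, f) * (1.0 if y == 1 else -1.0)
        total += max(0.0, -s) + math.log1p(math.exp(-abs(s)))
    return total / len(labels)


def train_head(features, labels, steps=200, learning_rate=0.5):
    """Stage 2: fit a small classifier on the extracted features only."""
    weights, bias = [0.0] * BOTTLENECK_DIM, 0.0
    n = len(labels)
    for _ in range(steps):
        grad_w, grad_b = [0.0] * BOTTLENECK_DIM, 0.0
        for f, y in zip(features, labels):
            p = 1.0 / (1.0 + math.exp(-logit(weights, bias, f)))
            error = p - y
            grad_b += error / n
            for j in range(BOTTLENECK_DIM):
                grad_w[j] += error * f[j] / n
        bias -= learning_rate * grad_b
        weights = [w - learning_rate * g for w, g in zip(weights, grad_w)]
    return weights, bias


# Synthetic stand-in data: class 0 inputs are darker, class 1 inputs are brighter.
images = [[random.uniform(0.0, 0.5) for _ in range(INPUT_DIM)] for _ in range(20)]
images += [[random.uniform(0.5, 1.0) for _ in range(INPUT_DIM)] for _ in range(20)]
labels = [0] * 20 + [1] * 20

# Features are extracted once; only the small head is trained afterwards.
features = [extract_bottleneck_features(img) for img in images]
initial_loss = mean_loss([0.0] * BOTTLENECK_DIM, 0.0, features, labels)
weights, bias = train_head(features, labels)
final_loss = mean_loss(weights, bias, features, labels)
print(f"loss before training: {initial_loss:.3f}, after: {final_loss:.3f}")
```

The point of the split is that the expensive network runs only once per image, and the part that is actually trained sees short feature vectors rather than raw pixels.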

Each part of the lab runs as a Jupyter notebook:

* Part 1 - Feature extraction - `01-feature-engineering.ipynb`
* Part 2 - Training and evaluation - `02-train.ipynb`
* Part 3 - Model operationalization - `03-deploy.ipynb`

We will use Azure Machine Learning service and Azure compute and storage resources to orchestrate this workflow.


## What is Azure Machine Learning service?

Azure Machine Learning service is a cloud service that you can use to develop and deploy machine learning models. Using Azure Machine Learning service, you can track your models as you build, train, deploy, and manage them, all at the broad scale that the cloud provides.

Azure Machine Learning service fully supports open-source technologies, so you can use tens of thousands of open-source Python packages with machine learning components such as TensorFlow, PyTorch, MXNet and scikit-learn. 

In this lab, you are going to use TensorFlow, specifically its high-level API, Keras.

Azure Machine Learning service helps you orchestrate machine learning workflows using the architecture depicted in the diagram below.

![AML workflow](https://github.com/jakazmie/AIDays/raw/master/DataScientistTrack/02-AML-EndToEndWalkthrough/images/amlarch.png)


1. Data preparation and model training logic are coded as Python scripts utilizing any of the hundreds of supported libraries and frameworks. The scripts can be instrumented with AML API calls to help with capturing and managing records of training runs, such as performance measures, logs, and serialized models.

2. The scripts can execute in your local environment or on a remote Compute Target. You would usually develop and debug code in your local environment using a small development dataset, and then train on the full training dataset on a remote Compute Target. The primary remote targets are Azure VMs and Azure Batch AI clusters. The training and validation data accessed by Compute Targets is stored in AML Datastores that are backed by Azure Blob Storage or Azure Data Lake.

3. As you run training iterations (a.k.a. runs), run records are stored in an Azure ML service Experiment. You can query the Experiment's content using Python APIs or browse through it using Azure Portal.

4. When your model is ready for deployment, you register it in the Model Registry. The Model Registry maintains versions of the model, including the model's serialized files and metadata.

5. Depending on your deployment target, AML will create an optimized Docker image and store it in a private Azure Container Registry. The image includes the model, the scoring file to invoke the model, and all required runtime dependencies.

6. The image can be deployed to any of the supported targets, including Azure Container Instances, Azure Kubernetes Service, or Azure IoT Edge.
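
Step 5 mentions a scoring file. As a rough sketch, an AML scoring file defines an `init()` function that runs once when the service starts and a `run()` function that handles each request. The version below is an illustration only: the argmax "model", the `data` and `predictions` JSON keys, and the six-score input format are assumptions, and a real scoring file would load the registered model's serialized files in `init()` instead.

```python
import json

CLASS_NAMES = ["Developed", "Barren", "Forested", "Grassland", "Shrub", "Cultivated"]

model = None  # populated once by init()


def init():
    """Called once at service start-up: load the model into memory."""
    global model

    # Hypothetical stand-in: pick the class with the highest score.
    # A real scoring file would deserialize the registered model here.
    def argmax_model(scores):
        return CLASS_NAMES[scores.index(max(scores))]

    model = argmax_model


def run(raw_data):
    """Called per request: parse the JSON payload, score each row, reply in JSON."""
    try:
        rows = json.loads(raw_data)["data"]
        return json.dumps({"predictions": [model(row) for row in rows]})
    except Exception as exc:
        # Report the failure to the caller instead of crashing the service.
        return json.dumps({"error": str(exc)})


# Local smoke test, no Azure resources involved.
init()
request_body = json.dumps({"data": [[0.1, 0.2, 0.9, 0.0, 0.0, 0.0]]})
response = run(request_body)
print(response)
```

Because the two functions are plain Python, a scoring file like this can be exercised locally before it is packaged into an image.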


All Azure ML components are managed within a top-level container, the Azure Machine Learning Workspace.
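
Steps 1 and 3 above describe instrumenting a script so that each run leaves a record. A minimal sketch of that idea follows. The helper `record_training_run` and the `InMemoryRun` class are hypothetical; `InMemoryRun` only mimics the `log`, `upload_file`, and `complete` methods of the SDK's `Run` object so that the example runs without an Azure subscription. The metric values and the file name `model.h5` are made up.

```python
import os


class InMemoryRun:
    """Hypothetical local stand-in for the three Run methods used below."""

    def __init__(self):
        self.metrics = {}
        self.files = {}
        self.completed = False

    def log(self, name, value):
        self.metrics.setdefault(name, []).append(value)

    def upload_file(self, name, path_or_stream):
        self.files[name] = path_or_stream

    def complete(self):
        self.completed = True


def record_training_run(run, metrics, model_file=None):
    """Log scalar metrics, optionally attach a model file, and close the run."""
    for name, value in metrics.items():
        run.log(name, value)
    if model_file is not None:
        # Keep the serialized model with the run record, under outputs/.
        run.upload_file(name="outputs/" + os.path.basename(model_file),
                        path_or_stream=model_file)
    run.complete()


demo_run = InMemoryRun()
record_training_run(demo_run, {"accuracy": 0.91, "loss": 0.27}, model_file="model.h5")
print(demo_run.metrics, demo_run.files, demo_run.completed)
```

With the SDK and a workspace object `ws`, a real run can be obtained with `Experiment(workspace=ws, name='<experiment name>').start_logging()` and passed in place of `InMemoryRun()`.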




## Create Azure Machine Learning Workspace

In this step, you will provision and configure the Azure ML Workspace that will be used throughout the lab.

You can create an AML Workspace using the AML Python SDK, the AML CLI, or Azure Portal. The Jupyter cell below shows the Python SDK code to create the workspace.

Replace the placeholders in the code below with your values for Azure subscription ID, a workspace name, a resource group name, and a region. 

When you execute the cell you will be asked to log in to Azure. Follow the printed instructions and use your Azure credentials to complete authentication.


***Note. We have observed issues with the Python SDK `Workspace` API. As a temporary workaround, create the workspace in Azure Portal and then execute the cell below.***

***Use this URL to create the workspace in Azure Portal***

https://ms.portal.azure.com/#create/Microsoft.MachineLearningServices

In [None]:
from azureml.core import Workspace

subscription_id = '<your subscription id>'
resource_group = '<your resource group>'
workspace_name = '<your workspace name>'
workspace_region = '<your region>'

try:
    # Connect to the workspace if it already exists.
    ws = Workspace(subscription_id=subscription_id,
                   resource_group=resource_group,
                   workspace_name=workspace_name)
    print('Workspace configuration succeeded. You are all set!')
except Exception:
    # Otherwise create it, along with the resource group if needed.
    print('Workspace not found. Creating...')
    ws = Workspace.create(name=workspace_name,
                          subscription_id=subscription_id,
                          resource_group=resource_group,
                          location=workspace_region,
                          create_resource_group=True,
                          exist_ok=True)

ws.write_config()
ws.get_details()

## Next Step
Your AML Workspace is ready and the configuration has been written to a JSON config file. You can now proceed to the first part of the lab, Feature Extraction.
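
If you want to confirm what was saved, the config file can be inspected with the standard library alone. The sketch below assumes the file holds the keys `subscription_id`, `resource_group`, and `workspace_name`; its location on disk depends on the SDK version, so the path is left as a parameter. The helper name `read_workspace_config` is hypothetical, and the demo uses a throwaway file with placeholder values rather than a real workspace.

```python
import json
import tempfile
from pathlib import Path

REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")


def read_workspace_config(path):
    """Load a workspace config file and check that the expected keys are set."""
    config = json.loads(Path(path).read_text())
    missing = [key for key in REQUIRED_KEYS if not config.get(key)]
    if missing:
        raise ValueError(f"{path} is missing: {', '.join(missing)}")
    return config


# Demo against a temporary file with placeholder values (not a real workspace).
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "config.json"
    sample_path.write_text(json.dumps({
        "subscription_id": "<your subscription id>",
        "resource_group": "<your resource group>",
        "workspace_name": "<your workspace name>",
    }))
    sample_config = read_workspace_config(sample_path)

print(sorted(sample_config))
```

In notebooks that use the SDK, the usual way to reconnect from this file is `Workspace.from_config()`.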

Start the `01-feature-engineering.ipynb` notebook.