For this workshop, you need:

* An Azure Machine Learning workspace. 
* The Azure Machine Learning CLI v2 installed.

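To install the CLI, you can either:

* Create a compute instance, which comes with the latest Azure ML CLI preinstalled and is preconfigured for ML workflows.
* Run the following commands to install the Azure ML CLI v2 extension and sign in. The `--identity` flag signs in with the managed identity of an Azure resource such as a compute instance; on a local machine, use plain `az login` instead: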

```bash
az extension add --name ml
az login --identity
```



In [None]:
!az extension add --name ml
!az login --identity
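
To confirm that the extension is in place before continuing, we can inspect the JSON printed by `az extension list --output json`. The following is a minimal sketch, assuming that output is a JSON array of objects with a `name` field; the sample string and its version numbers are illustrative, not real output from this workshop.

```python
import json


def has_ml_extension(extension_list_json: str) -> bool:
    """Return True if an extension named "ml" appears in the JSON list."""
    extensions = json.loads(extension_list_json)
    return any(ext.get("name") == "ml" for ext in extensions)


# Illustrative sample, trimmed to the fields used here (versions are made up).
sample = '[{"name": "ml", "version": "2.0.0"}, {"name": "account", "version": "0.2.5"}]'
print(has_ml_extension(sample))
```

In a notebook cell, the text to pass in could be captured with `out = !az extension list --output json` and joined with `"\n".join(out)`.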

# Model Training

## (Optional) 1. Create Managed Compute

A compute is a designated compute resource where you run your job or host your endpoint. Azure Machine Learning supports the following types of compute:

- **Compute instance** - a fully configured and managed development environment in the cloud. You can use the instance as a training or inference compute for development and testing. It's similar to a virtual machine on the cloud.

- **Compute cluster** - a managed-compute infrastructure that allows you to easily create a cluster of CPU or GPU compute nodes in the cloud.

- **Kubernetes cluster** - used to deploy trained machine learning models to Azure Kubernetes Service. You can create an Azure Kubernetes Service (AKS) cluster from your Azure ML workspace, or attach an existing AKS cluster.

- **Attached compute** - You can attach your own compute resources to your workspace and use them for training and inference.

You can create a compute using the Studio, the CLI, or the SDK.

<hr>

We can create a **compute instance** with cli v2 or sdk v2 using the following syntax:

<center>
<img src="../../imgs/create_compute_instance.png" width = "700px" alt="Create Compute Instance cli vs sdk">
</center>


<hr>

We can create a **compute cluster** with cli v2 or sdk v2 using the following syntax:

<center>
<img src="../../imgs/create_compute_cluster.png" width = "700px" alt="Create Compute Cluster cli vs sdk">
</center>


Let's create a managed compute cluster for the training workload.

``` python
# Create train job compute cluster
!az ml compute create --file train/compute.yml
```
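
The contents of `train/compute.yml` are not shown here. As a rough sketch of what a compute cluster specification can look like, the snippet below writes a hypothetical spec to a temporary folder. The cluster name, VM size, and instance limits are placeholders chosen for illustration, not the workshop's actual values.

```python
from pathlib import Path
import tempfile

# Hypothetical spec: name, size, and limits are placeholders,
# not the contents of the workshop's train/compute.yml.
COMPUTE_SPEC = """\
name: cpu-cluster
type: amlcompute
size: Standard_DS3_v2
min_instances: 0
max_instances: 4
idle_time_before_scale_down: 120
"""


def write_spec(directory: Path, text: str = COMPUTE_SPEC) -> Path:
    """Write the YAML spec to <directory>/compute.yml and return its path."""
    path = directory / "compute.yml"
    path.write_text(text, encoding="utf-8")
    return path


spec_path = write_spec(Path(tempfile.mkdtemp()))
print(spec_path.read_text(encoding="utf-8"))
```

Setting `min_instances` to 0 lets the cluster scale down to no nodes when idle, so we only pay while jobs are running. A file like this could then be passed to `az ml compute create --file`.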

## 2. Register Data Asset

**Datastore** - Azure Machine Learning Datastores securely keep the connection information to your data storage on Azure, so you don't have to code it in your scripts.

An Azure Machine Learning datastore is a **reference** to an **existing** storage account on Azure. The benefits of creating and using a datastore are:
* A common and easy-to-use API to interact with different storage types.
* Easier to discover useful datastores when working as a team.
* When using credential-based access (service principal/SAS/key), the connection information is secured so you don't have to code it in your scripts.

Supported Data Resources: 

* Azure Storage blob container
* Azure Storage file share
* Azure Data Lake Storage Gen1
* Azure Data Lake Storage Gen2


It is not a requirement to use Azure Machine Learning datastores - you can use storage URIs directly assuming you have access to the underlying data.
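
To make the difference concrete, here is a small sketch that builds both kinds of reference for the same file: one that goes through a datastore and one that points at blob storage directly. The storage account, container, and file path are hypothetical; `workspaceblobstore` is the default datastore name in a workspace.

```python
def datastore_uri(datastore: str, relative_path: str) -> str:
    """Reference a file through a workspace datastore."""
    return f"azureml://datastores/{datastore}/paths/{relative_path.lstrip('/')}"


def blob_uri(account: str, container: str, relative_path: str) -> str:
    """Reference the same file directly in Azure Blob Storage."""
    return (
        f"https://{account}.blob.core.windows.net/"
        f"{container}/{relative_path.lstrip('/')}"
    )


# Hypothetical account, container, and path, for illustration only.
print(datastore_uri("workspaceblobstore", "taxi/taxi-data.csv"))
print(blob_uri("mystorageaccount", "taxi-container", "taxi/taxi-data.csv"))
```

With the datastore form, the script never needs to know the account name or any credentials; with the direct form, the caller must already have access to the storage account.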

You can create a datastore using the Studio, the CLI, or the SDK.

<hr>

We can create a **datastore** with cli v2 or sdk v2 using the following syntax:

<center>
<img src="../../imgs/create_datastore.png" width = "700px" alt="Create Datastore cli vs sdk">
</center>



**Data asset** - Create data assets in your workspace to share with team members, version, and track data lineage.

By creating a data asset, you create a reference to the data source location, along with a copy of its metadata. 

The benefits of creating data assets are:

* You can **share and reuse data** with other members of the team such that they do not need to remember file locations.
* You can **seamlessly access data** during model training (on any supported compute type) without worrying about connection strings or data paths.
* You can **version** the data.
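
The idea of a named, versioned reference can be illustrated with a toy in-memory registry. This is only a conceptual model and not how Azure ML stores assets; the class names, the asset name `taxi-data`, and the paths are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class DataAsset:
    name: str
    version: int
    path: str  # a reference to where the data lives; nothing is copied


@dataclass
class ToyAssetRegistry:
    assets: dict = field(default_factory=dict)

    def register(self, name: str, path: str) -> DataAsset:
        """Store a new version of `name` that points at `path`."""
        latest = max((v for n, v in self.assets if n == name), default=0)
        asset = DataAsset(name, latest + 1, path)
        self.assets[(name, asset.version)] = asset
        return asset

    def get(self, name: str, version: Optional[int] = None) -> DataAsset:
        """Look up a specific version, or the latest one if none is given."""
        if version is None:
            version = max((v for n, v in self.assets if n == name), default=0)
        return self.assets[(name, version)]  # raises KeyError if unknown


registry = ToyAssetRegistry()
registry.register("taxi-data", "azureml://datastores/workspaceblobstore/paths/taxi/2023.csv")
registry.register("taxi-data", "azureml://datastores/workspaceblobstore/paths/taxi/2024.csv")

print(registry.get("taxi-data"))                  # latest version
print(registry.get("taxi-data", version=1).path)  # pinned to version 1
```

Team members ask for `taxi-data` by name, and a training job can pin a version so that later updates to the data do not silently change its inputs.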

<hr>

We can create a **data asset** with cli v2 or sdk v2 using the following syntax:

<center>
<img src="../../imgs/create_data_asset.png" width = "700px" alt="Create Data Asset cli vs sdk">
</center>

In [None]:
# Register data asset 
!az ml data create --file train/data.yml

## 3. Register Train Environment

Azure Machine Learning environments define the execution environments for your **jobs** or **deployments** and encapsulate the dependencies for your code. 

Azure ML uses the environment specification to create the Docker container that your **training** or **scoring code** runs in on the specified compute target.

You can create an environment from one of the following:
* conda specification
* Docker image
* Docker build context

There are two types of environments in Azure ML: **curated** and **custom environments**. Curated environments are predefined environments containing popular ML frameworks and tooling. Custom environments are user-defined.

<hr>

We can register an **environment** with cli v2 or sdk v2 using the following syntax:

<center>
<img src="../../imgs/create_environment.png" width = "700px" alt="Create Environment cli vs sdk">
</center>

In [None]:
# Register train environment 
!az ml environment create --file train/environment.yml

## 4. Create Pipeline Job

**AML Job**:

Azure ML provides several ways to train your models, from code-first solutions to low-code solutions:

* Azure ML supports training scripts written in Python, R, Java, Julia, or C#. To submit them, you only need to learn the YAML job format and the CLI commands.

* Distributed Training: AML integrates with popular frameworks such as PyTorch and TensorFlow. Both frameworks support data parallelism and model parallelism for distributed training.

* Automated ML - Train models without extensive data science or programming knowledge.

* Designer - a drag-and-drop, web-based UI.

<hr>

We can submit a **job** with cli v2 or sdk v2 using the following syntax:

<center>
<img src="../../imgs/create_job.png" width = "700px" alt="Create Job cli vs sdk">
</center>

<br>
    
**AML Pipelines**:

An AML pipeline is an independently executable workflow of a complete machine learning task. It helps standardize best practices for producing a machine learning model. The core idea is to split a complete machine learning task into a multistep workflow, where each step is a manageable component that can be developed, optimized, configured, and automated individually.
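
The multistep idea can be sketched in plain Python, independent of Azure ML: each step is a function, the dependencies between steps form a graph, and a runner executes the steps in dependency order. The step names and the toy "model" (a simple mean) are hypothetical and only stand in for real components.

```python
from graphlib import TopologicalSorter


def prep_data(inputs):
    # Stand-in for a data preparation component.
    return [1.0, 2.0, 3.0, 4.0]


def train_model(inputs):
    # Stand-in for training: the "model" is just the mean of the data.
    data = inputs["prep_data"]
    return sum(data) / len(data)


def evaluate_model(inputs):
    # Stand-in for evaluation: distance of the model from a target value.
    return abs(inputs["train_model"] - 2.5)


steps = {
    "prep_data": prep_data,
    "train_model": train_model,
    "evaluate_model": evaluate_model,
}

# Each step maps to the set of steps whose outputs it consumes.
depends_on = {
    "prep_data": set(),
    "train_model": {"prep_data"},
    "evaluate_model": {"train_model"},
}


def run_pipeline(steps, depends_on):
    """Run every step after its dependencies and collect the outputs."""
    order = list(TopologicalSorter(depends_on).static_order())
    outputs = {}
    for name in order:
        inputs = {dep: outputs[dep] for dep in depends_on[name]}
        outputs[name] = steps[name](inputs)
    return order, outputs


order, outputs = run_pipeline(steps, depends_on)
print(order)
print(outputs)
```

Because each step only sees the outputs of the steps it depends on, a single step can be swapped or re-tuned without touching the others, which is the property the pipeline YAML relies on.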

<hr>

We can submit a **pipeline job** with cli v2 or sdk v2 using the following syntax:

<center>
<img src="../../imgs/create_pipeline.png" width = "700px" alt="Create Pipeline cli vs sdk">
</center>

In [None]:
# Create pipeline job
!az ml job create --file train/pipeline.yml

# Online Endpoint

Online endpoints are endpoints that are used for online (real-time) inferencing. They receive data from clients and can send responses back in real time.

An **endpoint** is an HTTPS endpoint that clients can call to receive the inferencing (scoring) output of a trained model. It provides:
* Key-based and token-based authentication
* SSL termination
* A stable scoring URI (endpoint-name.region.inference.ml.azure.com)

A **deployment** is a set of resources required for hosting the model that does the actual inferencing.
A single endpoint can contain multiple deployments.

Features of the managed online endpoint:

* **Test and deploy locally** for faster debugging
* **Traffic mirroring**: traffic to one deployment can be mirrored (copied) to another deployment
* **Application Insights integration**
* Security
* Authentication with keys and Azure ML tokens
* Automatic scaling
* Visual Studio Code debugging

**Blue-green deployment**: an approach in which a new version of a web service is introduced to production by routing a small subset of users or requests to it before rolling it out fully.

<center>
<img src="../../imgs/endpoint_concept.png" width = "500px" alt="Online Endpoint Concept cli vs sdk">
</center>

## 1. Create Online Endpoint

We can create an **online endpoint** with cli v2 or sdk v2 using the following syntax:

<center>
<img src="../../imgs/create_online_endpoint.png" width = "700px" alt="Create Online Endpoint cli vs sdk">
</center>

In [None]:
# create online endpoint
!az ml online-endpoint create --file deploy/online/online-endpoint.yml

## 2. Create Online Deployment

To create a deployment for an online endpoint, you need to specify the following elements:

* Model files (or specify a registered model in your workspace)
* Scoring script - code needed to do scoring/inferencing
* Environment - a Docker image with Conda dependencies, or a Dockerfile
* Compute instance and scale settings

Note that if you're deploying **MLflow models**, there's no need to provide **a scoring script** and execution **environment**, as both are autogenerated.

We can create an **online deployment** with cli v2 or sdk v2 using the following syntax:

<center>
<img src="../../imgs/create_online_deployment.png" width = "700px" alt="Create Online Deployment cli vs sdk">
</center>

In [None]:
# create online deployment
!az ml online-deployment create --file deploy/online/online-deployment.yml 

## 3. Allocate Traffic

In [None]:
# allocate traffic
!az ml online-endpoint update --name taxi-online-endpoint-en --traffic blue=100
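
The command above sends all requests to the `blue` deployment. A gradual blue-green rollout would shift the percentages in stages. The following is a minimal sketch that builds the `--traffic` value for each stage, assuming a second deployment named `green` (hypothetical) and assuming the percentages are expected to total 100.

```python
def format_traffic(split: dict[str, int]) -> str:
    """Turn {"blue": 90, "green": 10} into the string "blue=90 green=10"."""
    if any(pct < 0 for pct in split.values()):
        raise ValueError("traffic percentages must not be negative")
    total = sum(split.values())
    if total != 100:
        raise ValueError(f"traffic must total 100, got {total}")
    return " ".join(f"{name}={pct}" for name, pct in split.items())


# Hypothetical rollout plan; "green" stands for a new model version.
rollout = [
    {"blue": 100, "green": 0},
    {"blue": 90, "green": 10},
    {"blue": 0, "green": 100},
]

for stage in rollout:
    print(
        "az ml online-endpoint update --name taxi-online-endpoint-en "
        f'--traffic "{format_traffic(stage)}"'
    )
```

Validating the split before calling the CLI catches typos such as `blue=90 green=5` early, instead of waiting for the service to reject the update.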

## 4. Invoke and Test Endpoint

We can invoke the **online endpoint** with cli v2 or sdk v2 using the following syntax:

<center>
<img src="../../imgs/invoke_online_endpoint.png" width = "700px" alt="Invoke online endpoint cli vs sdk">
</center>

In [None]:
# invoke and test endpoint
!az ml online-endpoint invoke --name taxi-online-endpoint-en --request-file ../../data/taxi-request.json
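
Because the endpoint is a plain HTTPS service, any HTTP client can call it as well. The sketch below only builds the request and does not send it. It assumes the scoring URI follows the pattern described earlier with a `/score` path, that the key is passed as a bearer token, and it uses a hypothetical region, key, and payload; in practice the URI and key would come from `az ml online-endpoint show` and `az ml online-endpoint get-credentials`.

```python
import json
import urllib.request


def build_scoring_request(endpoint_name, region, key, payload):
    """Build (but do not send) a POST request to the scoring URI."""
    url = f"https://{endpoint_name}.{region}.inference.ml.azure.com/score"
    body = json.dumps(payload).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {key}",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Hypothetical region, key, and payload; the real payload is the
# content of ../../data/taxi-request.json.
request = build_scoring_request(
    endpoint_name="taxi-online-endpoint-en",
    region="eastus",
    key="<endpoint-key>",
    payload={"input_data": [[1, 2, 3]]},
)
print(request.get_method(), request.full_url)
```

Sending the request is then a matter of calling `urllib.request.urlopen(request)` and reading the JSON response.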

# Model Batch Endpoint

## 1. Create Batch Compute Cluster

In [None]:
# create compute cluster to be used by the batch endpoint
!az ml compute create -n batch-cluster --type amlcompute --min-instances 0 --max-instances 3

## 2. Create Batch Endpoint

We can create the **batch endpoint** with cli v2 or sdk v2 using the following syntax:


<center>
<img src="../../imgs/create_batch_endpoint.png" width = "700px" alt="Create batch endpoint cli vs sdk">
</center>

In [None]:
# create batch endpoint
!az ml batch-endpoint create --file deploy/batch/batch-endpoint.yml

## 3. Create Batch Deployment

We can create the **batch deployment** with cli v2 or sdk v2 using the following syntax:

<center>
<img src="../../imgs/create_batch_deployment.png" width = "700px" alt="Create batch deployment cli vs sdk">
</center>

Note that if you're deploying **MLflow models**, there's no need to provide **a scoring script** and execution **environment**, as both are autogenerated.

In [None]:
# create batch deployment
!az ml batch-deployment create --file deploy/batch/batch-deployment.yml --set-default

## 4. Invoke and Test Endpoint

We can invoke the **batch deployment** with cli v2 or sdk v2 using the following syntax:

<center>
<img src="../../imgs/invoke_batch_deployment.png" width = "700px" alt="Invoke batch deployment cli vs sdk">
</center>

In [None]:
# invoke and test endpoint
!az ml batch-endpoint invoke --name taxi-batch-endpoint --input ../../data/taxi-batch.csv --input-type uri_file
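
Unlike an online endpoint, invoking a batch endpoint starts an asynchronous job rather than returning predictions. The sketch below wraps the same command in Python and extracts the job name from the CLI output, assuming the JSON output contains a `name` field identifying the job. The `runner` parameter and the stand-in runner are hypothetical helpers so the example runs without an Azure connection.

```python
import json
import subprocess
from types import SimpleNamespace


def invoke_batch_endpoint(endpoint, input_path, input_type="uri_file",
                          runner=subprocess.run):
    """Invoke the batch endpoint via the CLI and return the job name."""
    cmd = [
        "az", "ml", "batch-endpoint", "invoke",
        "--name", endpoint,
        "--input", input_path,
        "--input-type", input_type,
        "--output", "json",
    ]
    result = runner(cmd, capture_output=True, text=True, check=True)
    return json.loads(result.stdout)["name"]


def fake_runner(cmd, **kwargs):
    # Stand-in for subprocess.run that returns canned JSON output.
    return SimpleNamespace(stdout='{"name": "batchjob-example"}')


job_name = invoke_batch_endpoint(
    "taxi-batch-endpoint", "../../data/taxi-batch.csv", runner=fake_runner
)
print(job_name)
```

With the default `runner`, the function calls the real CLI; the returned job name can then be passed to `az ml job stream --name <job-name>` to follow the scoring job until it finishes.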