# Study Note - Building AI Solutions with Azure Machine Learning
This notebook collects the notes taken through the course of **[Build AI solutions with Azure Machine Learning](https://docs.microsoft.com/en-us/learn/paths/build-ai-solutions-with-azure-ml-service/)** offered by Microsoft, with supplements from the **[documentation of Azure Machine Learning SDK for Python](https://docs.microsoft.com/en-us/python/api/overview/azure/ml/?view=azure-ml-py)**.

## Guideline vs Learning Sections

### Set up an Azure Machine Learning Workspace (30-35%)

#### Create an Azure Machine Learning workspace (Lab 01)
- create an Azure Machine Learning workspace
- configure workspace settings
- manage a workspace by using Azure Machine Learning studio

#### Manage data objects in an Azure Machine Learning workspace (Lab 03)
- register and maintain data stores
- create and manage datasets

#### Manage experiment compute contexts (Lab 01 & 04)
- create a compute instance
- determine appropriate compute specifications for a training workload
- create compute targets for experiments and training

### Run Experiments and Train Models (25-30%)

#### Create models by using Azure Machine Learning Designer (Create no-code predictive models with Azure Machine Learning)
- create a training pipeline by using Azure Machine Learning designer
- ingest data in a designer pipeline
- use designer modules to define a pipeline data flow
- use custom code modules in designer

#### Run training scripts in an Azure Machine Learning workspace (Lab 02)
- create and run an experiment by using the Azure Machine Learning SDK
- consume data from a data store in an experiment by using the Azure Machine Learning SDK
- consume data from a dataset in an experiment by using the Azure Machine Learning SDK
- choose an estimator for a training experiment

#### Generate metrics from an experiment run (Lab 01)
- log metrics from an experiment run
- retrieve and view experiment outputs
- use logs to troubleshoot experiment run errors

#### Automate the model training process (Lab 05)
- create a pipeline by using the SDK
- pass data between steps in a pipeline
- run a pipeline
- monitor pipeline runs

### Optimize and Manage Models (20-25%)

#### Use Automated ML to create optimal models (Lab 09)
- use the Automated ML interface in Azure Machine Learning studio
- use Automated ML from the Azure Machine Learning SDK
- select scaling functions and pre-processing options
- determine algorithms to be searched
- define a primary metric
- get data for an Automated ML run
- retrieve the best model

#### Use Hyperdrive to tune hyperparameters (Lab 08)
- select a sampling method
- define the search space
- define the primary metric 
- define early termination options
- find the model that has optimal hyperparameter values

#### Use model explainers to interpret models (Lab 10)
- select a model interpreter
- generate feature importance data

#### Manage models (Lab 02)
- register a trained model
- monitor model history
- monitor data drift

### Deploy and Consume Models (20-25%)

#### Create production compute targets (Lab 06)
- consider security for deployed services
- evaluate compute options for deployment

#### Deploy a model as a service (Lab 06)
- configure deployment settings
- consume a deployed service
- troubleshoot deployment container issues

#### Create a pipeline for batch inferencing (Lab 07)
- publish a batch inferencing pipeline
- run a batch inferencing pipeline and obtain outputs

#### Publish a designer pipeline as a web service (Lab 06)
- create a target compute resource
- configure an Inference pipeline
- consume a deployed endpoint

## 01 Getting Started with Azure Machine Learning

The Azure ML SDK for Python provides classes you can use to work with Azure ML in your Azure subscription.

### azureml-core package
**High level process:**
1. **create a new <font color='blue'>*workspace*</font> or connect to an existing workspace** 
2. **create an Azure ML <font color='blue'>*experiment*</font> in workspace**
3. **create a <font color='blue'>*run*</font> to run code**

#### Workspace 

A **workspace** is a context for the **experiments, data, compute targets, and other assets** associated with **a machine learning workload**. Workspaces are Azure resources, and as such they are defined within a resource group in an Azure subscription, along with other related Azure resources that are required to support the workspace. A Workspace is a fundamental **resource** for machine learning in Azure Machine Learning. You use a workspace to **experiment, train, and deploy machine learning models**.

```python
from azureml.core import Workspace
```
- All experiments and associated resources are managed within your Azure ML workspace. You can connect to an existing workspace, create a new one using the Azure ML SDK, or load the workspace from the configuration file.

```python
# Load an existing workspace
ws = Workspace.get(name="myworkspace", subscription_id='<azure-subscription-id>', resource_group='myresourcegroup')

# Create a new one
ws = Workspace.create(name='myworkspace',
                      subscription_id='<azure-subscription-id>',
                      resource_group='myresourcegroup',
                      create_resource_group=True,
                      location='eastus2'
                     )

# Load from a configuration file
ws = Workspace.from_config()

```

- In most cases, you should store the workspace configuration in a JSON configuration file. This makes it easier to reconnect without needing to remember details like your Azure subscription ID.

```python
ws.write_config(path="./file-path", file_name="ws_config.json")
```

- You can download the JSON configuration file from the blade for your workspace in the Azure portal, but ***if you're using a Compute Instance within your workspace, the configuration file has already been downloaded to the root folder.***
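- By default, `Workspace.from_config()` searches for the configuration file starting from the current directory, so on a Compute Instance it finds the file in the root folder and uses it to connect to your workspace. To load a configuration file saved elsewhere, pass its path explicitly: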

```python
ws_other_environment = Workspace.from_config(path="./file-path/ws_config.json")
```
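To make the idea of a saved workspace configuration concrete, here is a minimal sketch that writes and reads such a JSON file with the standard library only. It does not call the Azure ML SDK. The key names (`subscription_id`, `resource_group`, `workspace_name`) reflect my understanding of what the downloaded file contains, and the helper names `save_config` and `load_config` are hypothetical.

```python
import json
import os
import tempfile

# Placeholder values; a real file would hold your own subscription and workspace details.
workspace_details = {
    "subscription_id": "<azure-subscription-id>",
    "resource_group": "myresourcegroup",
    "workspace_name": "myworkspace",
}


def save_config(details, folder, file_name="ws_config.json"):
    """Write the workspace details to a JSON file and return the file path."""
    os.makedirs(folder, exist_ok=True)
    path = os.path.join(folder, file_name)
    with open(path, "w") as f:
        json.dump(details, f, indent=4)
    return path


def load_config(path):
    """Read the workspace details back from the JSON file."""
    with open(path) as f:
        return json.load(f)


config_dir = tempfile.mkdtemp()
config_path = save_config(workspace_details, config_dir)
loaded = load_config(config_path)
print(loaded["workspace_name"])
```

Because the file only stores identifiers, it can be shared between notebooks and scripts without repeating the subscription ID in code.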

#### Experiment

In Azure Machine Learning, an **experiment** is a **named process**, usually the running of a script or a pipeline, that can generate metrics and outputs and be tracked in the Azure Machine Learning workspace. An experiment can be run multiple times, with different data, code, or settings; and Azure Machine Learning tracks each run, enabling you to view run history and compare results for each run.

When you submit an experiment, you use its run context to initialize and end the experiment run that is tracked in Azure Machine Learning.

```python
from azureml.core import Experiment

# create an experiment variable
experiment = Experiment(workspace=ws, name='test-experiment')

# start the experiment
run = experiment.start_logging()

# experiment code goes here

# end the experiment
run.complete()

```

After the experiment run has completed, you can view the details of the run in the **Experiments** tab in Azure Machine Learning studio.

#### Run
A run represents a single trial of an experiment.

- There are two ways to create a run:
    1. `experiment.start_logging()`, as in the previous example
    2. `azureml.core.Run`, used inside an experiment script
    
```python
from azureml.core import Run
```
- Create a separate script from experiment, store it in a folder along with any other files it needs, and then use Azure ML to run the experiment based on the script in the folder.
- `Run.get_context()` method to *retrieve the experiment run context when the script is run*.

<ins>**After a run object is created, use various `.log*()` methods to log the outputs.**</ins>

```python
# An experiment script, experiment.py, saved in the experiment_files folder
from azureml.core import Run
import pandas as pd
import matplotlib.pyplot as plt
import os

# Get the experiment run context
run = Run.get_context()

# load the diabetes dataset
data = pd.read_csv('data.csv')

# Count the rows and log the result
row_count = (len(data))
run.log('observations', row_count)

# Save a sample of the data
os.makedirs('outputs', exist_ok=True)
data.sample(100).to_csv("outputs/sample.csv", index=False, header=True)

# Complete the run
run.complete()
```

#### Configuration

To run a script as an experiment, you must define a script configuration that defines the script to be run and the Python environment in which to run it. This is implemented by using a **ScriptRunConfig** object.

```python
from azureml.core import Experiment, ScriptRunConfig

# Create a script config
script_config = ScriptRunConfig(source_directory=experiment_folder,
                                script='experiment.py') 

# submit the experiment
experiment = Experiment(workspace = ws, name = 'my-experiment')
run = experiment.submit(config=script_config)
run.wait_for_completion(show_output=True)
```
- Please refer to [previous note] for better understanding

### Note: How to create a simple machine learning workflow

1. Create a new workspace or load an existing workspace
2. Create an experiment script and save it in a folder along with the other files it needs
3. Configure the script run and submit the experiment

### [Lab: Getting Started with Azure Machine Learning](https://github.com/MicrosoftDocs/mslearn-aml-labs/blob/master/labdocs/Lab01.md)

- In this lab, we need to create a **workspace** in Azure Portal and then use **Azure Machine Learning studio** to manage the workspace.
    - Create a compute instance under the workspace. ***When creating a Compute Instance, a virtual machine is created.***
    - The cheapest virtual machine is STANDARD_D2S_V3.
    - After the compute instance is created, click its **Jupyter link** to open Jupyter Notebooks on the VM.
- **[IMPORTANT!!]** When you have finished the lab, **close all Jupyter tabs and *Stop* your compute instance** to avoid incurring unnecessary costs.

### MLflow

**MLflow** is an open source platform for managing machine learning processes. It's **commonly (but not exclusively) used in Databricks environments** to coordinate experiments and track metrics. In Azure Machine Learning experiments, you can use MLflow to track metrics instead of the native log functionality if you desire.

```python
import mlflow
```
- Refer to the notebook code in the official GitHub repository for more details.


## 02 Training Models with Parameters

In Azure Machine Learning, you can use a **Run Configuration** and a **Script Run Configuration** to run a script-based experiment that trains a machine learning model. However, depending on the machine learning framework being used and the dependencies it requires, the run configuration may become complex.

Azure Machine Learning also provides a higher-level abstraction called an **Estimator** that encapsulates a **run configuration** and a **script configuration** in a single object. There are also pre-defined, framework-specific variants that already include the package dependencies for common machine learning frameworks such as *Scikit-Learn, PyTorch, and TensorFlow*.

#### Note: 
- The only difference is that the estimator replaces `script_config`: create an estimator object and pass it as the `config` argument of `experiment.submit()`.
- The rest of the process for running a model as an experiment is basically the same.

### Steps:
#### Create a training script and log key metrics of modeling performance
#### Run the script as experiment
- Option 1: Use an Estimator

```python
from azureml.train.estimator import Estimator
from azureml.core import Experiment

# Create an estimator
estimator = Estimator(source_directory='experiment_folder',
                      entry_script='training_script.py',
                      compute_target='local',
                      conda_packages=['scikit-learn']
                      )

# Create and run an experiment
experiment = Experiment(workspace = ws, name = 'training_experiment')
run = experiment.submit(config=estimator)
```

- Option 2: Use a framework-specific estimator

```python
from azureml.train.sklearn import SKLearn
from azureml.core import Experiment

# Create an estimator
estimator = SKLearn(source_directory='experiment_folder',
                    entry_script='training_script.py',
                    compute_target='local'
                    )

# Create and run an experiment
experiment = Experiment(workspace = ws, name = 'training_experiment')
run = experiment.submit(config=estimator)
```

#### Register the trained model to the workspace

Note that **the outputs of the experiment include the trained model file (model.pkl)**. You can register this model in your Azure Machine Learning workspace, making it possible to track model versions and retrieve them later.

Model registration enables you to track multiple versions of a model, and retrieve models for ***inferencing (predicting label values from new data)***. When you register a model, you can specify a name, description, tags, framework (such as Scikit-Learn or PyTorch), framework version, custom properties, and other useful metadata. Registering a model with the same name as an existing model automatically creates a new version of the model, starting with 1 and increasing in units of 1.

- Option 1: **register** method of **Model** object

```python
from azureml.core import Model

model = Model.register(workspace=ws,
                       model_name='classification_model',
                       model_path='model.pkl', # local path
                       description='A classification model',
                       tags={'dept': 'sales'},
                       model_framework=Model.Framework.SCIKITLEARN,
                       model_framework_version='0.20.3')
```

- Option 2: reference to the **Run**

```python
run.register_model( model_name='classification_model',
                    model_path='outputs/model.pkl', # run outputs path
                    description='A classification model',
                    tags={'dept': 'sales'},
                    model_framework=Model.Framework.SCIKITLEARN,
                    model_framework_version='0.20.3')
```

#### Viewing registered models
```python
from azureml.core import Model

for model in Model.list(ws):
    # Get model name and auto-generated version
    print(model.name, 'version:', model.version)
```
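The versioning rule described above (registering under an existing name creates the next version) can be pictured with a toy in-memory registry. This is only an illustration of the numbering behaviour; `ToyModelRegistry` is a hypothetical class and has nothing to do with how the Azure ML model registry is implemented.

```python
class ToyModelRegistry:
    """Toy registry that numbers versions per model name, starting at 1."""

    def __init__(self):
        self._entries = {}  # model name -> list of (version, metadata)

    def register(self, name, **metadata):
        # The next version is one more than the number of versions already stored.
        version = len(self._entries.get(name, [])) + 1
        self._entries.setdefault(name, []).append((version, metadata))
        return version

    def list(self):
        """Return (name, version) pairs for every registered model version."""
        return [(name, version)
                for name, entries in self._entries.items()
                for version, _ in entries]


registry = ToyModelRegistry()
v1 = registry.register("classification_model", tags={"dept": "sales"})
v2 = registry.register("classification_model", tags={"dept": "sales"})
v_other = registry.register("regression_model")

for name, version in registry.list():
    print(name, "version:", version)
```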
    

### Also: using script parameters

#### Add argument into script
Adding parameters to your script enables you to repeat the same training experiment with different settings.
To use parameters in a script, you must use a library such as **argparse** to read the arguments passed to the script and assign them to variables.

```python
import argparse
from azureml.core import Run
from sklearn.linear_model import LogisticRegression
# also import other packages as necessary

# Get the experiment run context
run = Run.get_context()

# Set regularization hyperparameter
parser = argparse.ArgumentParser()
parser.add_argument('--reg_rate', type=float, dest='reg', default=0.01)
args = parser.parse_args()
reg = args.reg

# Prepare the dataset

# Train a logistic regression model
model = LogisticRegression(C=1/reg, solver="liblinear").fit(X_train, y_train)

# The rest of the script
```

#### Passing Script Arguments to an Estimator
```python
from azureml.train.sklearn import SKLearn
from azureml.core import Experiment

# Configure/create an estimator
estimator = SKLearn(source_directory='experiment_folder',
                    entry_script='training_script.py',
                    script_params = {'--reg_rate': 0.1},
                    compute_target='local'
                    )

# Create and run an experiment
experiment = Experiment(workspace = ws, name = 'training_experiment')
run = experiment.submit(config=estimator)
```
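To see how a value such as `--reg_rate` ends up in the script, the argparse part can be exercised on its own with an explicit argument list. The sketch below reuses the argument definition from the training script above; the wrapper function `parse_training_args` is a hypothetical name added only so the parser can be called without a real command line.

```python
import argparse


def parse_training_args(argv=None):
    """Parse the training script arguments from argv (or sys.argv when argv is None)."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--reg_rate', type=float, dest='reg', default=0.01)
    return parser.parse_args(argv)


# No arguments passed: the default regularization rate is used.
default_args = parse_training_args([])

# The same flag and value that the estimator example passes as a script parameter.
custom_args = parse_training_args(['--reg_rate', '0.1'])

print(default_args.reg, custom_args.reg)
```

Note that `dest='reg'` means the value is read as `args.reg`, not `args.reg_rate`.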

### Side Note: Revisit how to interpret ROC
- The Y axis plots the True Positive Rate, whose base is the actual True instances (Ex: 80 True instances)
- The X axis plots the False Positive Rate, whose base is the actual False instances (Ex: 20 False instances)
    - If we label instances as True at random, each pick is a true instance with probability 0.8 and a false instance with probability 0.2. Both rates then grow in proportion to their own base, so TPR and FPR increase at around the same pace.
    - However, if we build a good predictive model, the probability of selecting a true instance should increase, skewing the curve to the top-left. ***The better the capability of the model to predict true positives, the higher the AUC.***

#### Don’t confuse the concepts of AUC and Accuracy.
- AUC shows **the capability of a model to predict true positives**, and each axis has a different base.
- The base of accuracy includes both true and false instances. It doesn’t take into account the capability of predicting true positives.
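One way to see the difference is to compute both numbers by hand on a tiny example. The sketch below uses plain Python and made-up labels and scores (an assumption, not data from the course). It builds the ROC points threshold by threshold, integrates them with the trapezoidal rule, and compares the result with accuracy at a fixed 0.5 threshold.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs, one per threshold, from the strictest to the loosest."""
    positives = sum(y_true)
    negatives = len(y_true) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        predicted = [s >= threshold for s in scores]
        tp = sum(1 for p, y in zip(predicted, y_true) if p and y == 1)
        fp = sum(1 for p, y in zip(predicted, y_true) if p and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Trapezoidal area under the (FPR, TPR) curve."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


def accuracy(y_true, scores, threshold=0.5):
    """Share of instances whose thresholded prediction matches the label."""
    correct = sum(1 for y, s in zip(y_true, scores) if (s >= threshold) == (y == 1))
    return correct / len(y_true)


# Made-up example: every true instance scores higher than every false instance,
# but two true instances score below 0.5.
y_true = [1, 1, 1, 1, 0, 0]
scores = [0.9, 0.8, 0.45, 0.4, 0.3, 0.1]

auc_value = auc(roc_points(y_true, scores))
accuracy_value = accuracy(y_true, scores)
print("AUC:", auc_value)
print("Accuracy at threshold 0.5:", round(accuracy_value, 3))
```

Here the model ranks the instances perfectly, so the AUC is 1.0, yet accuracy at the 0.5 threshold is only about 0.667 because two true instances fall below the cut-off.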


## 03 Work with Data in Azure Machine Learning

### [IMPORTANT NOTE] Datastores are *file locations* while datasets are *real data*.

### Datastores
In Azure Machine Learning, ***datastores*** are abstractions for cloud data sources / storage locations.

```python
from azureml.core import Workspace, Datastore

ws = Workspace.from_config()

# Register a new datastore
blob_ds = Datastore.register_azure_blob_container(workspace=ws,
    datastore_name='blob_data',
    container_name='data_container',
    account_name='az_store_acct',
    account_key='123456abcde789…')    

# Get a reference to a datastore
blob_store = Datastore.get(ws, datastore_name='blob_data')
default_store = ws.get_default_datastore()
ws.set_default_datastore('blob_data')

# Working directly with a datastore
blob_ds.upload(src_dir='/files',
               target_path='/data/files',
               overwrite=True, show_progress=True)

blob_ds.download(target_path='downloads',
                 prefix='/data',
                 show_progress=True)
```

When you want to use a datastore in an experiment script, you must pass a data reference to the script. The data reference is configured for one of the following data access modes: **download, upload, and mount.**

```python
# Get a data reference
data_ref = blob_ds.path('data/files').as_download(path_on_compute='training_data')

# Configuration
estimator = SKLearn(source_directory='experiment_folder',
                    entry_script='training_script.py',
                    compute_target='local',
                    script_params = {'--data_folder': data_ref})
```

In your training script, you can retrieve the parameter and use it like a local folder:
```python
import os
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--data_folder', type=str, dest='data_folder')
args = parser.parse_args()
data_files = os.listdir(args.data_folder)
```

### Datasets
***Datasets*** are versioned packaged data objects that can be easily consumed in experiments and pipelines. Datasets are the recommended way to work with data, and are the primary mechanism for advanced Azure Machine Learning capabilities like data labeling and data drift monitoring.

Datasets are typically based on **files in a datastore**, though they can also be based on URLs and other sources. You can create the following types of dataset: **tabular and file**.

```python
# Create - Type 1: Creating and registering tabular datasets
from azureml.core import Dataset

blob_ds = ws.get_default_datastore()

    # The dataset in this example includes data from two file paths within the default datastore
csv_paths = [(blob_ds, 'data/files/current_data.csv'),
             (blob_ds, 'data/files/archive/*.csv')]
tab_ds = Dataset.Tabular.from_delimited_files(path=csv_paths)

    # After creating the dataset, the code registers it in the workspace with the name csv_table.
tab_ds = tab_ds.register(workspace=ws, name='csv_table')

# Create - Type 2: Creating and registering file datasets
file_ds = Dataset.File.from_files(path=(blob_ds, 'data/files/images/*.jpg'))
file_ds = file_ds.register(workspace=ws, name='img_files')

# Retrieve a registered dataset
import azureml.core
from azureml.core import Workspace, Dataset

    # Load the workspace from the saved config file
ws = Workspace.from_config()

    # Get a dataset from the workspace datasets collection (dictionary attribute)
ds1 = ws.datasets['csv_table']

    # Get a dataset by name from the datasets class (method)
ds2 = Dataset.get_by_name(ws, 'img_files')

# Dataset versioning - specifying the create_new_version property
img_paths = [(blob_ds, 'data/files/images/*.jpg'),
             (blob_ds, 'data/files/images/*.png')]
file_ds = Dataset.File.from_files(path=img_paths)
file_ds = file_ds.register(workspace=ws, name='img_files', create_new_version=True)

# Retrieving a specific dataset version
img_ds = Dataset.get_by_name(workspace=ws, name='img_files', version=2)
```

You can read data directly from a dataset, or you can pass a dataset as a named input to a script configuration or estimator.
```python
# Working with a dataset directly
    # Tabular
df = tab_ds.to_pandas_dataframe()
# code to work with dataframe goes here

    # File
for file_path in file_ds.to_path():
    print(file_path)
```

When you need to access a dataset in an experiment script, you can pass the dataset as an input to a **ScriptRunConfig** or an **Estimator**. Since the script will need to work with a Dataset object, you must include either **the full azureml-sdk package** or **the azureml-dataprep package with the pandas extra library** in the script's compute environment.

For example, the following code passes a tabular dataset to an estimator:

```python
estimator = SKLearn( source_directory='experiment_folder',
                     entry_script='training_script.py',
                     compute_target='local',
                     inputs=[tab_ds.as_named_input('csv_data')],
                     pip_packages=['azureml-dataprep[pandas]'])
```

In the experiment script itself, you can access the input and work with the Dataset object it references like this:

```python
run = Run.get_context()
data = run.input_datasets['csv_data'].to_pandas_dataframe()
```

When passing a file dataset, you must **specify the access mode**. For large volumes of data, you'd generally use the **as_mount** method to stream the files directly from the dataset source; but when running on local compute (as we are in this example), you need to use the **as_download** option to download the dataset files to a local folder.

```python
estimator = Estimator(source_directory='experiment_folder',
                      entry_script='training_script.py',
                      compute_target='local',
                      inputs=[img_ds.as_named_input('img_data').as_download(path_on_compute='data')],
                      pip_packages=['azureml-dataprep[pandas]'])
```

## 04 Work with Compute in Azure Machine Learning
The runtime context for each experiment run consists of two elements:
1. The *environment* for the script, which includes all packages used in the script.
2. The *compute target* on which the environment will be deployed and the script run. This could be the local workstation from which the experiment run is initiated, or a remote compute target such as a training cluster that is provisioned on-demand.
    - In Azure Machine Learning, *Compute Targets* are **physical or virtual computers on which experiments are run**.

When you run a Python script as an experiment in Azure Machine Learning, a Conda environment is created to define the execution context for the script. Azure Machine Learning provides a default environment that includes many common packages, including the **azureml-defaults** package that contains the libraries necessary for working with an experiment run, as well as popular packages like **pandas** and **numpy**.


You can also define your own environment and add packages by using **conda** or **pip**, to ensure your experiment has access to all the libraries it requires.
```python
estimator = Estimator (source_directory=experiment_folder,
                       inputs=[diabetes_ds.as_named_input('diabetes')],
                       script_params=script_params,
                       compute_target = 'local', # OR compute_target = cluster_name # Run on the remote compute target
                       environment_definition = diabetes_env, # environment
                       entry_script='diabetes_training.py')
```

## 05 Orchestrate machine learning with pipelines

### Definition
The term pipeline is used extensively in machine learning, often with different meanings.
- Scikit-Learn pipeline
- Azure Machine Learning pipelines: a workflow of machine learning tasks in which each task is implemented as a *step*. **Check bookmark: Introduction to Pipelines**
- Azure DevOps pipelines: the build and configuration tasks required to deliver software.

### Key objects for building a pipeline
- Steps
```python
from azureml.pipeline.steps import PythonScriptStep, EstimatorStep
```
    - A pipeline is like an experiment, and each step is like a part of the experiment. 
- PipelineData
```python
from azureml.pipeline.core import PipelineData
```
    - To use a PipelineData object to pass data between steps, you must:
        - Define a named PipelineData object that references **a location** in a datastore. **(BIG NOTE: The PipelineData is only a reference to a location. It is NOT a dataset.)**
        - Specify the PipelineData object as an input or output for the steps that use it.
        - Pass the PipelineData object as **a script parameter** in steps that run scripts (and include code in those scripts to read or write data)**(Note: Specify the location parameter in script.)**
- Pipeline
```python
from azureml.pipeline.core import Pipeline
```
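The point that PipelineData is only a reference to a location can be illustrated without the SDK. In the sketch below, two plain functions stand in for pipeline steps, and a folder path stands in for the PipelineData object that both steps receive as a script parameter. The names `prep_step`, `train_step`, and `prepped.csv` are hypothetical.

```python
import os
import tempfile


def prep_step(output_folder):
    """First step: write intermediate data into the folder it is given."""
    os.makedirs(output_folder, exist_ok=True)
    with open(os.path.join(output_folder, "prepped.csv"), "w") as f:
        f.write("x,y\n1,2\n3,4\n")


def train_step(input_folder):
    """Second step: read whatever the first step left in the same folder."""
    with open(os.path.join(input_folder, "prepped.csv")) as f:
        rows = f.read().strip().splitlines()
    return len(rows) - 1  # number of data rows, excluding the header


# The shared folder plays the role of the PipelineData reference:
# it carries no data itself, only the location where data is written and read.
shared_location = os.path.join(tempfile.mkdtemp(), "prepped_data")

prep_step(shared_location)               # output of step 1
row_count = train_step(shared_location)  # input of step 2
print(row_count)
```

In a real pipeline the two functions would be separate scripts, and the order of execution would follow from one step listing the location as an output and the other listing it as an input.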

### [Pattern for creating and using pipelines](https://docs.microsoft.com/en-us/python/api/overview/azure/ml/?view=azure-ml-py#pattern-for-creating-and-using-pipelines)

- **An Azure Machine Learning pipeline is associated with an <ins>Azure Machine Learning workspace</ins>.**
- **A pipeline step is associated with a <ins>compute target</ins> within that workspace.**

A common pattern for pipeline steps is:

1. Specify workspace, compute, and storage
2. Configure your input and output data using
    - Dataset which makes available an existing Azure datastore
    - PipelineDataset which encapsulates typed tabular data
    - PipelineData which is used for intermediate file or directory data written by one step and intended to be consumed by another
3. Define one or more pipeline steps
4. Instantiate a pipeline using your workspace and steps
5. Create an experiment to which you submit the pipeline
6. Monitor the experiment results

## 06 Deploy real-time machine learning services with Azure Machine Learning
In machine learning, *inferencing* refers to the use of a trained model to predict labels for new data on which the model has not been trained. In Azure Machine Learning, you can create **real-time inferencing solutions by deploying a model as a service**, hosted in a **containerized platform** such as **Azure Kubernetes Service (AKS)**.

**Notes:** 
- **We can also deploy the model on Azure Container Instances (ACI) Web Service or local Docker-based service during development and testing.**
- **An ACI web service is best for small-scale testing and quick deployments, while AKS is for deployment as a production-scale web service.**

To deploy a model as a real-time inferencing service, you must perform the following tasks:
1.	Register a trained model
2.	Define an inference configuration
    1.	Create an **entry script**: The entry script receives data submitted to a deployed web service and passes it to the model. It then takes the response returned by the model and returns that to the client. *The script is specific to your model.* It must understand the data that the model expects and returns.
        1.	`init()`: Called when the service is initialized - Typically, this function loads the model into a global object. This function is run only once, when the Docker container for your web service is started.
        2.	`run(input_data)`: Called when new data is submitted to the service - This function uses the model to predict a value based on the input data. Inputs and outputs of the run typically use JSON for serialization and deserialization. You can also work with raw binary data. You can transform the data before sending it to the model or before returning it to the client.

            ```python
            [To include entry script codes]
            ```
        
    2.	Create an environment
    3.	Combine the script and environment in an InferenceConfig

    ```python
    from azureml.core.model import InferenceConfig

    classifier_inference_config = InferenceConfig(runtime= "python",
                                                  source_directory = 'service_files',
                                                  entry_script="score.py",
                                                  conda_file="env.yml")
    ```

3.	Define a deployment configuration on the chosen compute target
    - AksCompute
    ```python
    from azureml.core.compute import ComputeTarget, AksCompute
    from azureml.core.webservice import AksWebservice
    ```

    - ACI deployment
    ```python
    from azureml.core.webservice import AciWebservice
    ```
    
    - local Docker-based service
    ```python
    from azureml.core.webservice import LocalWebservice
    ```
4.	Deploy the model
```python
service = Model.deploy(workspace=ws,
                       name = 'classifier-service',
                       models = [model], #1. Model registered
                       inference_config = classifier_inference_config, # 2. Inference Configuration
                       deployment_config = classifier_deploy_config, # 3. deployment configuration
                       deployment_target = production_cluster) # (Optional) 3. deployment configuration
service.wait_for_deployment(show_output = True)
print(service.state)
```
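The `init()` / `run()` structure of the entry script described in step 2 above can be sketched as follows. This is a minimal illustration, not the course's scoring script: `StubModel` is a hypothetical stand-in for a trained model (a real script would load the registered model file in `init()`), and the `"data"` key in the JSON payload is an assumed convention.

```python
import json

model = None  # set once by init(), then reused by every call to run()


class StubModel:
    """Stand-in for a trained model: predicts 1 when the feature values sum above 1."""

    def predict(self, rows):
        return [1 if sum(row) > 1.0 else 0 for row in rows]


def init():
    """Called once when the service starts: load the model into a global object."""
    global model
    # A real entry script would load the registered model file here.
    model = StubModel()


def run(raw_data):
    """Called for each request: deserialize the input, predict, serialize the output."""
    data = json.loads(raw_data)["data"]
    predictions = model.predict(data)
    return json.dumps(predictions)


# Simulate what the web service does: initialize once, then score a request.
init()
response = run(json.dumps({"data": [[0.2, 0.3], [0.9, 0.8]]}))
print(response)
```

Keeping the model in a global object means the potentially slow loading step happens only once per container, not once per request.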

To delete a deployed web service, use `service.delete()`. To delete a registered model, use `model.delete()`.

#### [Additional Topic: Create an endpoint](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-deploy-azure-kubernetes-service#create-an-endpoint)

**To create an endpoint, use `AksEndpoint.deploy_configuration` instead of `AksWebservice.deploy_configuration()`.**

```python
import azureml.core
from azureml.core import Model
from azureml.core.webservice import AksEndpoint
from azureml.core.compute import AksCompute
from azureml.core.compute import ComputeTarget
# select a created compute
compute = ComputeTarget(ws, 'myaks')
namespace_name = "endpointnamespace"
# define the endpoint and version name
endpoint_name = "mynewendpoint"
version_name = "versiona"
# create the deployment config and define the scoring traffic percentile for the first deployment
endpoint_deployment_config = AksEndpoint.deploy_configuration(cpu_cores = 0.1, memory_gb = 0.2,
                                                              enable_app_insights = True,
                                                              tags = {'scikitlearn':'demo'},
                                                              description = "testing versions",
                                                              version_name = version_name,
                                                              traffic_percentile = 20)
# deploy the model and endpoint
endpoint = Model.deploy(ws, endpoint_name, [model], inference_config, endpoint_deployment_config, compute)
# Wait for the process to complete
endpoint.wait_for_deployment(True)
```

To *consume* a deployed real-time service (or model or endpoint), we’ll need the following: **(Note: Recall the Consume tab in AML studio.)**
-	The REST URL, which is called with an HTTP POST request **(Note: Recall the step to copy the REST URL in AML studio and paste it in the script.)**
-	The key **(Note: Recall the step to copy the primary key in AML studio and paste it in the script.)**
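As a rough sketch of how the URL and key fit together, the code below builds (but does not send) an HTTP POST request with the standard library. The URL and key are placeholders, the `"data"` payload key is an assumed convention that must match what the entry script expects, and the helper `build_scoring_request` is a hypothetical name. The key is passed as a bearer token in the `Authorization` header.

```python
import json
import urllib.request

# Placeholder values; copy the real ones from the Consume tab of the deployed service.
endpoint_url = "http://example.com/score"
service_key = "<primary-key>"


def build_scoring_request(url, key, rows):
    """Build a POST request carrying the rows as JSON and the key as a bearer token."""
    body = json.dumps({"data": rows}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + key,
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


request = build_scoring_request(endpoint_url, service_key, [[0.2, 0.3]])
print(request.get_method(), request.full_url)

# Sending the request is left commented out because the URL above is a placeholder:
# with urllib.request.urlopen(request) as resp:
#     predictions = json.loads(resp.read())
```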

## 07 Deploy batch inference pipelines with Azure Machine Learning
The steps are not very consistent between lectures and lab codes. Refer to the lab codes when there’s inconsistency.
1.	Register a model
2.	Create a scoring script and define a run context that includes the dependencies required by the script
3.	Create a pipeline with **ParallelRunStep** (As the chapter name suggests, we’re going to create a step in a pipeline)
```python
from azureml.pipeline.steps import ParallelRunConfig, ParallelRunStep
# Define the parallel run step configuration
# Create the parallel run step
```
4.	Run the pipeline and retrieve the step output
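
For step 2, the scoring script used by a parallel run step exposes an `init()` function (called once to load the model) and a `run(mini_batch)` function (called for each mini-batch). A rough sketch, assuming the input is a file dataset so that `mini_batch` is a list of file paths; the "model" here is a stand-in that sums the values in each file, not a registered model.

```python
import os
import tempfile

model = None


def init():
    """Called once per worker process: load the model."""
    global model
    # Stand-in for loading a registered model (for example with joblib);
    # this placeholder "model" just sums the numbers in a row.
    model = lambda row: sum(row)


def run(mini_batch):
    """Called per mini-batch: returns one result per input file."""
    results = []
    for file_path in mini_batch:
        with open(file_path) as f:
            row = [float(value) for value in f.read().split(",")]
        prediction = model(row)
        results.append(f"{os.path.basename(file_path)}: {prediction}")
    return results


# Local smoke test with a temporary file standing in for the batch data
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "sample.csv")
    with open(path, "w") as f:
        f.write("1.0,2.0,3.5")
    init()
    batch_results = run([path])
```

Calling `init()` and `run()` locally like this is a quick way to check the script before wiring it into the pipeline.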


## 08 Tune hyperparameters with Azure Machine Learning
For a discrete parameter, use a **choice** from a list of explicit values. Example: `'--batch_size': choice(16, 32, 64)`
### Types of sampling
- Grid sampling
- Random sampling
- Bayesian sampling
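
To make the difference between the first two concrete, here is a plain-Python sketch (not the SDK's `GridParameterSampling` / `RandomParameterSampling` classes). Grid sampling tries every combination of discrete values, while random sampling draws each value at random. Bayesian sampling, which picks new values based on how earlier ones performed, is not sketched here. The search space and helper names are illustrative.

```python
import itertools
import random

# Hypothetical discrete search space (the values are illustrative)
search_space = {
    "--batch_size": [16, 32, 64],
    "--learning_rate": [0.01, 0.1],
}


def grid_samples(space):
    """Grid sampling: every combination of the listed values."""
    names = list(space)
    return [dict(zip(names, combo)) for combo in itertools.product(*space.values())]


def random_samples(space, n, seed=0):
    """Random sampling: n combinations, each value drawn independently."""
    rng = random.Random(seed)
    return [{name: rng.choice(values) for name, values in space.items()} for _ in range(n)]


grid = grid_samples(search_space)          # 3 * 2 = 6 combinations
sampled = random_samples(search_space, 4)  # 4 randomly drawn combinations
```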

### Early termination
- Bandit policy: stops a run if its target performance metric underperforms the best run so far by a specified margin
```python
from azureml.train.hyperdrive import BanditPolicy
```
- Median stopping policy: abandons runs where the target performance metric is worse than the median of the running averages for all runs
```python
from azureml.train.hyperdrive import MedianStoppingPolicy
```
- Truncation selection policy: cancels the lowest performing X% of runs at each evaluation interval based on the `truncation_percentage` value you specify for X
```python
from azureml.train.hyperdrive import TruncationSelectionPolicy
```
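
The decision rules behind the three policies can be sketched in plain Python. This is a simplification for intuition only: it assumes a metric where higher is better, uses an absolute margin for the bandit rule, and ignores the evaluation interval and delay settings of the real policies. The function names and metric values are made up.

```python
import statistics


def bandit_should_stop(run_metric, best_metric, slack_amount):
    """Bandit: stop a run that trails the best run by more than slack_amount."""
    return run_metric < best_metric - slack_amount


def median_should_stop(run_metric, running_averages):
    """Median stopping: stop a run that is worse than the median of all runs' running averages."""
    return run_metric < statistics.median(running_averages)


def truncation_cancelled(metrics_by_run, truncation_percentage):
    """Truncation selection: return the ids of the lowest-performing X% of runs."""
    n_cancel = int(len(metrics_by_run) * truncation_percentage / 100)
    ranked = sorted(metrics_by_run, key=metrics_by_run.get)  # worst first
    return ranked[:n_cancel]


# Illustrative metric values for five runs
metrics = {"run_a": 0.91, "run_b": 0.80, "run_c": 0.86, "run_d": 0.70, "run_e": 0.88}
best = max(metrics.values())

stop_b_bandit = bandit_should_stop(metrics["run_b"], best, slack_amount=0.05)
stop_e_bandit = bandit_should_stop(metrics["run_e"], best, slack_amount=0.05)
stop_b_median = median_should_stop(metrics["run_b"], list(metrics.values()))
cancelled = truncation_cancelled(metrics, truncation_percentage=40)
```

With these numbers, `run_b` is stopped by both the bandit and median rules, `run_e` survives the bandit rule, and truncation at 40% cancels the two worst runs.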

### Running a hyperparameter tuning experiment
- Create a training script that
    - Includes an argument for each hyperparameter you want to vary (covered in previous lab)
    - Logs the target performance metric (covered in previous lab)
- Configure and run a hyperdrive experiment
```python
from azureml.train.hyperdrive import HyperDriveConfig, PrimaryMetricGoal
```
- Monitor and review hyperdrive runs 
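
A bare-bones sketch of the training-script side, using only `argparse`. The hyperparameter names, defaults, and the `train` placeholder are assumptions for illustration; in a real script the metric would be logged through the run context (`Run.get_context()` and `run.log(...)`) rather than printed, under the same name that the hyperdrive configuration uses as its primary metric.

```python
import argparse


def parse_hyperparameters(argv=None):
    """One command-line argument per hyperparameter that hyperdrive will vary."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--batch_size", type=int, default=32)
    parser.add_argument("--learning_rate", type=float, default=0.01)
    return parser.parse_args(argv)


def train(batch_size, learning_rate):
    """Placeholder for real training; returns a dummy metric value."""
    return 0.0


def log_metric(name, value):
    """Stand-in for logging to the run; a real script would call run.log(name, value)."""
    print(f"{name}: {value}")


# Simulate the arguments hyperdrive would pass for one child run
args = parse_hyperparameters(["--batch_size", "64", "--learning_rate", "0.1"])
metric = train(args.batch_size, args.learning_rate)
log_metric("Accuracy", metric)
```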


## 09 Automate machine learning model selection with Azure Machine Learning
Automated Machine Learning (Automated ML) is one of the two big features in AML studio, the other being Designer. You can use this capability through the visual interface in Azure Machine Learning studio or through the SDK. The SDK gives you greater control over the settings for the automated machine learning experiment, but the visual interface is easier to use.

Configure an Automated Machine Learning experiment: 
```python
from azureml.train.automl import AutoMLConfig
```

## 10 Explain machine learning models with Azure Machine Learning

### Types of Feature Importance
- **Global feature importance** quantifies the relative importance of each feature in the test dataset as a whole.
- **Local feature importance** measures the influence of each feature value for a specific individual prediction.

### Explainers
Using explainers – install the **azureml-interpret** package
- MimicExplainer – An explainer that creates a *global surrogate model* that approximates your trained model and can be used to generate explanations.
```python
from interpret.ext.blackbox import MimicExplainer
from interpret.ext.glassbox import DecisionTreeExplainableModel # MimicExplainer takes the most arguments, including explainable_model
```
- TabularExplainer – An explainer that acts as a wrapper around various SHAP explainer algorithms, automatically choosing the one that is most appropriate for your model architecture.
```python
from interpret.ext.blackbox import TabularExplainer # Does not require explainable_model
```
- PFIExplainer – a *Permutation Feature Importance* explainer that analyzes feature importance by shuffling feature values and measuring the impact on prediction performance.
```python
from interpret.ext.blackbox import PFIExplainer # Does not require explainable_model or initialization_examples
```
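
The idea behind permutation feature importance can be sketched without the `azureml-interpret` package: shuffle one feature column at a time and record how much the performance metric drops. This is a toy illustration, not the PFIExplainer implementation; it assumes accuracy as the metric, and the model and data are made up.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model predicts the correct label."""
    correct = sum(1 for row, label in zip(rows, labels) if model(row) == label)
    return correct / len(labels)


def permutation_importance(model, rows, labels, n_features, seed=0):
    """Importance of feature j = baseline accuracy minus accuracy with column j shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        rng.shuffle(column)
        shuffled_rows = [row[:j] + [value] + row[j + 1:] for row, value in zip(rows, column)]
        importances.append(baseline - accuracy(model, shuffled_rows, labels))
    return importances


# Toy model: predicts 1 when feature 0 is positive and ignores feature 1
toy_model = lambda row: int(row[0] > 0)
rows = [[-2, 5], [-1, 3], [1, 4], [2, 6], [-3, 1], [3, 2]]
labels = [0, 0, 1, 1, 0, 1]

importances = permutation_importance(toy_model, rows, labels, n_features=2)
```

Because the toy model never looks at feature 1, shuffling that column cannot change its predictions, so its importance is exactly zero; any drop in accuracy can only come from shuffling feature 0.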

## 11 Detect and mitigate unfairness in models with Azure Machine Learning

### Disparity
[To add more notes]


A model with lower disparity in predictive performance between sensitive feature groups might be preferable to a model with higher overall accuracy but higher disparity.
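
One simple way to put a number on disparity is the gap between the best and worst per-group value of a metric. The sketch below assumes accuracy as the metric and a single sensitive feature with two groups; the labels, predictions, and helper names are made up for illustration.

```python
def group_accuracy(y_true, y_pred, groups):
    """Accuracy computed separately for each sensitive feature group."""
    totals, correct = {}, {}
    for truth, pred, group in zip(y_true, y_pred, groups):
        totals[group] = totals.get(group, 0) + 1
        correct[group] = correct.get(group, 0) + int(truth == pred)
    return {group: correct[group] / totals[group] for group in totals}


def disparity(per_group):
    """Gap between the best-served and worst-served group."""
    return max(per_group.values()) - min(per_group.values())


# Illustrative labels, predictions, and group membership
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

per_group = group_accuracy(y_true, y_pred, groups)
gap = disparity(per_group)
```

Here the overall accuracy is 0.75, but it hides the fact that group A is predicted perfectly while group B is right only half the time.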

#### Side Note – in what situations might we choose a model with lower accuracy/AUC over a higher one?
- Time required for training
- Interpretability
- Lower disparity between sensitive feature groups


## 12 Monitor a Model
To capture telemetry data for Application Insights, you can write any values to the standard output log in the scoring script for your service by using a `print` statement.
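
A small sketch of what that looks like in a scoring script, assuming Application Insights is enabled for the service and the request body has a `data` field. The prediction logic is a stand-in (it just counts the values in each row) so the snippet runs without a model.

```python
import json


def init():
    # A real scoring script would load the registered model here
    print("Model initialized")


def run(raw_data):
    data = json.loads(raw_data)["data"]
    # Stand-in prediction: the number of values in each row
    predictions = [len(row) for row in data]
    # Anything printed here is written to the standard output log
    print("Data:", data, "- Predictions:", predictions)
    return json.dumps(predictions)


# Local check of the script
init()
response = run(json.dumps({"data": [[1, 2, 3], [4, 5]]}))
```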

Summarize the whole workflow of building, deploying, consuming, and monitoring a model.
- Refer to the Jupyter Notebook for the complete code
