# Study Note - Building AI Solutions with Azure Machine Learning
This notebook collects the notes taken through the course of **[Build AI solutions with Azure Machine Learning](https://docs.microsoft.com/en-us/learn/paths/build-ai-solutions-with-azure-ml-service/)** offered by Microsoft, with supplements from the **[documentation of Azure Machine Learning SDK for Python](https://docs.microsoft.com/en-us/python/api/overview/azure/ml/?view=azure-ml-py)**.

## 01 Getting Started with Azure Machine Learning

The Azure ML SDK for Python provides classes you can use to work with Azure ML in your Azure subscription.
### azureml-core package
**High level process:**
1. **create a new <font color='blue'>*workspace*</font> or connect to an existing workspace** 
2. **create an Azure ML <font color='blue'>*experiment*</font> in workspace**
3. **create a <font color='blue'>*run*</font> to run code**

#### Workspace   
```python
from azureml.core import Workspace
```
- All experiments and associated resources are managed within your Azure ML workspace. You can connect to an existing workspace, or create a new one using the Azure ML SDK.

```python
# Load an existing workspace
ws = Workspace.get(name="myworkspace", subscription_id='<azure-subscription-id>', resource_group='myresourcegroup')

# Create a new one
ws = Workspace.create(name='myworkspace',
                      subscription_id='<azure-subscription-id>',
                      resource_group='myresourcegroup',
                      create_resource_group=True,
                      location='eastus2'
                     )
```

- In most cases, you should store the workspace configuration in a JSON configuration file. This makes it easier to reconnect without needing to remember details like your Azure subscription ID.
```python
ws.write_config(path="./file-path", file_name="ws_config.json")
```
- You can download the JSON configuration file from the blade for your workspace in the Azure portal, but <ins>if you're using a Compute Instance within your workspace, the configuration file has already been downloaded to the root folder.</ins>
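- `Workspace.from_config()` finds and uses the configuration file to connect to your workspace. Called without arguments, it starts searching from the current folder; you can also pass the path to the file explicitly.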
```python
ws_other_environment = Workspace.from_config(path="./file-path/ws_config.json")
```
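
For reference, the configuration file is plain JSON. The sketch below is a rough stand-in for the save/load round trip using only the standard library; it is not the SDK's implementation. The key names (`subscription_id`, `resource_group`, `workspace_name`) are what the v1 SDK writes as far as I know, the values are placeholders, and `save_config` / `load_config` are hypothetical helper names.

```python
import json
import os
import tempfile

# Placeholder values; replace with your own workspace details.
config = {
    "subscription_id": "<azure-subscription-id>",
    "resource_group": "myresourcegroup",
    "workspace_name": "myworkspace",
}

def save_config(folder, file_name="ws_config.json"):
    """Write the workspace settings to <folder>/<file_name> and return the path."""
    os.makedirs(folder, exist_ok=True)
    path = os.path.join(folder, file_name)
    with open(path, "w") as f:
        json.dump(config, f, indent=2)
    return path

def load_config(path):
    """Read the workspace settings back as a dict."""
    with open(path) as f:
        return json.load(f)

config_path = save_config(tempfile.mkdtemp())
loaded = load_config(config_path)
print(loaded["workspace_name"])
```

Keeping these three values in a file means scripts and notebooks never need the subscription ID hard-coded.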

#### Experiment
```python
from azureml.core import Experiment
```    
- We use an Azure ML experiment to run Python code and record values extracted from data.
- Create an Azure ML experiment in workspace
```python
experiment = Experiment(workspace=ws, name='test-experiment')
```

#### Run
A run represents a single trial of an experiment.
- There are two ways to create a run:
    1. `experiment.start_logging()`
    2. `azureml.core.Run`
        - Write the experiment as a separate script, store it in a folder along with any other files it needs, and then use Azure ML to run the experiment based on the script in that folder.
        - Inside the script, call `Run.get_context()` to retrieve the experiment run context when the script is run.
```python
from azureml.core import Run
```

<ins>**After a run object is created, use various `.log*()` methods to log the outputs.**</ins>

#### Configuration
```python
from azureml.core import Experiment, RunConfiguration, ScriptRunConfig
```
- Please refer to [previous note] for better understanding

## MLflow

**MLflow** is an open source platform for managing machine learning processes. It's **commonly (but not exclusively) used in Databricks environments** to coordinate experiments and track metrics. In Azure Machine Learning experiments, you can use MLflow to track metrics instead of the native log functionality if you desire.

```python
import mlflow
```
- Refer to the notebook code in the official GitHub repository for more details.


## 02 Training Models
### Key Concepts:
- Create a training script and log key metrics of modeling performance
- Use an Estimator to run the script as an experiment
```python 
from azureml.train.estimator import Estimator
```
- View logged metrics
```python
from azureml.widgets import RunDetails
```
- Register the trained model to the workspace
```python
from azureml.core import Model
```
- Create a parameterized training script: Adding parameters to your script enables you to repeat the same training experiment with different settings
```python
import argparse
parser = argparse.ArgumentParser()
parser.add_argument('--reg_rate', type=float, dest='reg', default=0.01)
args = parser.parse_args()
reg = args.reg
```
- Use a framework-specific Estimator
```python
from azureml.train.sklearn import SKLearn
```
- Register a new version of the model
    - Registering a model under a name that already exists creates a new version of that model; the version number is incremented.
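
The parameterized-script idea above can be tried locally without Azure ML. This is a minimal sketch that reuses the same `--reg_rate` argument; `parse_args` is a hypothetical helper, and passing an explicit argument list stands in for the script parameters that the estimator would supply.

```python
import argparse

def parse_args(argv=None):
    """Parse the training script's hyperparameters (argv=None reads sys.argv)."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--reg_rate', type=float, dest='reg', default=0.01)
    return parser.parse_args(argv)

# Same script, two different settings
default_args = parse_args([])                     # falls back to the default
tuned_args = parse_args(['--reg_rate', '0.1'])    # overrides the default
print(default_args.reg, tuned_args.reg)
```

Because the value arrives as an argument, the same script can be submitted repeatedly with different regularization rates and the runs compared afterwards.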

### Side Note: Revisit how to interpret ROC
- The Y axis is the True Positive Rate (TPR); its base is the actual True instances (Ex: 80 True instances).
- The X axis is the False Positive Rate (FPR); its base is the actual False instances (Ex: 20 False instances).
    - If we predict True at random, each pick is a True instance with probability 0.8 and a False instance with probability 0.2. Because each rate is divided by its own base, TPR and FPR increase at around the same pace, and the curve follows the diagonal.
    - However, if we build a good predictive model, the instances it ranks highest are more likely to be True, skewing the curve to the top-left. ***The better the model separates True from False instances, the higher the AUC.***

#### Don’t confuse the concepts of AUC and accuracy.
- AUC shows **how well a model ranks true instances above false ones across all thresholds**, and each axis has a different base.
- Accuracy is computed at a single threshold, and its base includes both true and false instances. With imbalanced classes it can look high even when the model is poor at identifying true positives.
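
To make the difference concrete, here is a small pure-Python sketch (not how scikit-learn computes it) that builds the ROC points by lowering the threshold one instance at a time and integrates them with the trapezoid rule. The labels and scores are made-up illustration data, and the helper names are hypothetical.

```python
def roc_points(labels, scores):
    """Return (FPR, TPR) points, lowering the threshold one instance at a time."""
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), reverse=True)
    tp = fp = 0
    points = [(0.0, 0.0)]
    for _, label in ranked:
        if label == 1:
            tp += 1   # TPR uses the True instances as its base
        else:
            fp += 1   # FPR uses the False instances as its base
        points.append((fp / negatives, tp / positives))
    return points

def auc(points):
    """Area under the curve via the trapezoid rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

def accuracy(labels, scores, threshold=0.5):
    """Share of correct predictions at one fixed threshold."""
    return sum((s >= threshold) == bool(y) for y, s in zip(labels, scores)) / len(labels)

labels = [1, 1, 0, 1, 0, 0]                      # made-up ground truth
perfect_scores = [0.9, 0.8, 0.3, 0.7, 0.2, 0.1]  # every True ranked above every False
mixed_scores = [0.9, 0.4, 0.8, 0.7, 0.2, 0.1]    # one False ranked above two Trues

auc_perfect = auc(roc_points(labels, perfect_scores))
auc_mixed = auc(roc_points(labels, mixed_scores))
acc_mixed = accuracy(labels, mixed_scores)
print(auc_perfect, auc_mixed, acc_mixed)
```

AUC depends only on the ranking of the scores, while accuracy changes as soon as the threshold moves.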


## 03 Work with Data in Azure Machine Learning
### Datastores
- In Azure Machine Learning, ***datastores*** are abstractions for cloud data sources / storage locations.
    - Upload data
        - We can upload data to a datastore. Note that uploading files with `.upload_files()` returns a *data reference*.
    - Configure data reference
        - We can configure the data reference for ***download***; in other words, it can be used to download the contents of the folder to the compute context where the data reference is being used.
        - When working with remote compute, you can also configure a data reference to ***mount*** the datastore location and read data directly from the data source.
    - Train a model from a datastore
        - The **data reference** can then be passed to a training script as a parameter.

### Datasets
- ***Datasets*** are versioned packaged data objects that can be easily consumed in experiments and pipelines.
    - Create datasets
        - While you can **read data directly from datastores**, Azure Machine Learning provides a further abstraction for data in the form of *datasets* (as opposed to data references).
        - A dataset is a versioned reference to a specific set of data that you may want to use in an experiment. Datasets can be *tabular* or *file-based*.
        - It's easy to convert a tabular dataset to a Pandas dataframe, enabling you to work with the data using common python techniques.
    - Register datasets
    - Train a model from a dataset
        - Pass the dataset as an input in the estimator being used to run the script 
            - In the script: `run.input_datasets['diabetes'].to_pandas_dataframe()`
                - **Note: This line grabs the input dataset named ‘diabetes’ from the run submitted.**
            - `diabetes_ds = ws.datasets.get("diabetes dataset")`
                - **Note: Before configuring a run, we need to set up a dataset object from the dataset registered in the workspace.**
            - In SKLearn(): `inputs=[diabetes_ds.as_named_input('diabetes')]`
                - **Note: The dataset object can then be passed to the inputs argument when configuring Estimator**

## 04 Work with Compute in Azure Machine Learning
The runtime context for each experiment run consists of two elements:
1. The *environment* for the script, which includes all packages used in the script.
2. The *compute target* on which the environment will be deployed and the script run. This could be the local workstation from which the experiment run is initiated, or a remote compute target such as a training cluster that is provisioned on-demand.
    - In Azure Machine Learning, *Compute Targets* are **physical or virtual computers on which experiments are run**.

When you run a Python script as an experiment in Azure Machine Learning, a Conda environment is created to define the execution context for the script. Azure Machine Learning provides a default environment that includes many common packages, including the **azureml-defaults** package that contains the libraries necessary for working with an experiment run, as well as popular packages like **pandas** and **numpy**.


You can also define your own environment and add packages by using **conda** or **pip**, to ensure your experiment has access to all the libraries it requires.
```python
estimator = Estimator(source_directory=experiment_folder,
                      inputs=[diabetes_ds.as_named_input('diabetes')],
                      script_params=script_params,
                      compute_target='local',  # or compute_target=cluster_name to run on the remote compute target
                      environment_definition=diabetes_env,  # the custom environment
                      entry_script='diabetes_training.py')
```

## 05 Orchestrate machine learning with pipelines

### Definition
The term pipeline is used extensively in machine learning, often with different meanings.
- Scikit-Learn pipeline
- Azure Machine Learning pipelines: a workflow of machine learning tasks in which each task is implemented as a *step*. **Check bookmark: Introduction to Pipelines**
- Azure DevOps pipelines: the build and configuration tasks required to deliver software.

### Key objects for building a pipeline
- Steps
```python
from azureml.pipeline.steps import PythonScriptStep, EstimatorStep
```
    - A pipeline is like an experiment, and each step is like a part of the experiment. 
- PipelineData
```python
from azureml.pipeline.core import PipelineData
```
    - To use a PipelineData object to pass data between steps, you must:
        - Define a named PipelineData object that references **a location** in a datastore. **(BIG NOTE: The PipelineData is only a reference to a location. It is NOT a dataset.)**
        - Specify the PipelineData object as an input or output for the steps that use it.
        - Pass the PipelineData object as **a script parameter** in steps that run scripts, and include code in those scripts to read or write data. **(Note: Specify the location parameter in the script.)**
- Pipeline
```python
from azureml.pipeline.core import Pipeline
```
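
Since a PipelineData object is only a reference to a location, the step scripts simply receive a folder path as an argument. The sketch below imitates that hand-off locally with a temporary folder; the argument names (`--out_folder`, `--in_folder`), the file name, and both step functions are hypothetical and only illustrate the pattern, not the SDK itself.

```python
import argparse
import os
import tempfile

def prep_step(argv):
    """First step: write intermediate data into the folder passed as a script parameter."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--out_folder', type=str, dest='folder')
    args = parser.parse_args(argv)
    os.makedirs(args.folder, exist_ok=True)  # the location may not exist yet
    path = os.path.join(args.folder, 'prepped.csv')
    with open(path, 'w') as f:
        f.write('x,y\n1,2\n3,4\n')
    return path

def train_step(argv):
    """Second step: read what the first step wrote to the same location."""
    parser = argparse.ArgumentParser()
    parser.add_argument('--in_folder', type=str, dest='folder')
    args = parser.parse_args(argv)
    with open(os.path.join(args.folder, 'prepped.csv')) as f:
        rows = f.read().strip().splitlines()[1:]  # skip the header line
    return len(rows)

# Locally, a temp folder plays the role of the datastore location
shared_location = os.path.join(tempfile.mkdtemp(), 'prepped_data')
prep_step(['--out_folder', shared_location])
row_count = train_step(['--in_folder', shared_location])
print(row_count)
```

In a real pipeline, the same PipelineData object would be listed as an output of the first step and an input of the second, and passed to each script through its arguments.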

### [Pattern for creating and using pipelines](https://docs.microsoft.com/en-us/python/api/overview/azure/ml/?view=azure-ml-py#pattern-for-creating-and-using-pipelines)

- **An Azure Machine Learning pipeline is associated with an <ins>Azure Machine Learning workspace</ins>.**
- **A pipeline step is associated with a <ins>compute target</ins> within that workspace.**

A common pattern for pipeline steps is:

1. Specify workspace, compute, and storage
2. Configure your input and output data using
    - Dataset which makes available an existing Azure datastore
    - PipelineDataset which encapsulates typed tabular data
    - PipelineData which is used for intermediate file or directory data written by one step and intended to be consumed by another
3. Define one or more pipeline steps
4. Instantiate a pipeline using your workspace and steps
5. Create an experiment to which you submit the pipeline
6. Monitor the experiment results

## 06 Deploy real-time machine learning services with Azure Machine Learning
In machine learning, *inferencing* refers to the use of a trained model to predict labels for new data on which the model has not been trained. In Azure Machine Learning, you can create **real-time inferencing solutions by deploying a model as a service**, hosted in a **containerized platform** such as **Azure Kubernetes Service (AKS)**.

**Notes:** 
- **We can also deploy the model on Azure Container Instances (ACI) Web Service or local Docker-based service during development and testing.**
- **ACI web service is best for small-scale testing and quick deployments, and AKS is for deployments as a production-scale web service.**

To deploy a model as a real-time inferencing service, you must perform the following tasks:
1. Register a trained model
2. Define an inference configuration
    1. Create an **entry script**: The entry script receives data submitted to a deployed web service and passes it to the model. It then takes the response returned by the model and returns that to the client. *The script is specific to your model.* It must understand the data that the model expects and returns.
        1. `init()`: Called when the service is initialized. Typically, this function loads the model into a global object. This function is run only once, when the Docker container for your web service is started.
        2. `run(input_data)`: Called when new data is submitted to the service. This function uses the model to predict a value based on the input data. Inputs and outputs of the run typically use JSON for serialization and deserialization. You can also work with raw binary data. You can transform the data before sending it to the model or before returning it to the client.

            ```python
            [To include entry script codes]
            ```
        
    2.	Create an environment
    3.	Combine the script and environment in an InferenceConfig

    ```python
    from azureml.core.model import InferenceConfig

    classifier_inference_config = InferenceConfig(runtime= "python",
                                                  source_directory = 'service_files',
                                                  entry_script="score.py",
                                                  conda_file="env.yml")
    ```

3.	Define a deployment configuration on the chosen compute target
    - AksCompute
    ```python
    from azureml.core.compute import ComputeTarget, AksCompute
    from azureml.core.webservice import AksWebservice
    ```

    - ACI deployment
    ```python
    from azureml.core.webservice import AciWebservice
    ```
    
    - local Docker-based service
    ```python
    from azureml.core.webservice import LocalWebservice
    ```
4.	Deploy the model
```python
service = Model.deploy(workspace=ws,
                       name = 'classifier-service',
                       models = [model], #1. Model registered
                       inference_config = classifier_inference_config, # 2. Inference Configuration
                       deployment_config = classifier_deploy_config, # 3. deployment configuration
                       deployment_target = production_cluster) # (Optional) 3. deployment configuration
service.wait_for_deployment(show_output = True)
print(service.state)
```
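
The entry script described in step 2 can be sketched as follows. This is a minimal illustration, not the course's script: `MeanThresholdModel` is a hypothetical stand-in for the registered model (a real script would load the model file from disk in `init()`), and the `{"data": [...]}` payload shape is an assumption.

```python
import json

model = None  # global object populated once by init()

class MeanThresholdModel:
    """Hypothetical stand-in for a trained classifier."""
    def predict(self, rows):
        return [1 if sum(row) / len(row) > 0.5 else 0 for row in rows]

def init():
    """Called once when the service starts: load the model into a global."""
    global model
    model = MeanThresholdModel()

def run(raw_data):
    """Called for every request: deserialize, predict, serialize."""
    data = json.loads(raw_data)['data']
    predictions = model.predict(data)
    return json.dumps(predictions)

# Local smoke test of the two functions
init()
response = run(json.dumps({'data': [[0.9, 0.8], [0.1, 0.2]]}))
print(response)
```

Calling `init()` and `run()` directly like this is a quick way to check the script before it is packaged into a container.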

To delete a deployed web service, use `service.delete()`. To delete a registered model, use `model.delete()`.

#### [Additional Topic: Create an endpoint](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-deploy-azure-kubernetes-service#create-an-endpoint)

**To create an endpoint, use `AksEndpoint.deploy_configuration()` instead of `AksWebservice.deploy_configuration()`.**

```python
import azureml.core
from azureml.core import Model
from azureml.core.webservice import AksEndpoint
from azureml.core.compute import AksCompute
from azureml.core.compute import ComputeTarget

# `ws`, `model`, and `inference_config` are assumed to be defined in the earlier steps

# Select an existing AKS compute target
compute = ComputeTarget(ws, 'myaks')
namespace_name = "endpointnamespace"
# Define the endpoint and version name
endpoint_name = "mynewendpoint"
version_name = "versiona"
# Create the deployment config and define the scoring traffic percentile for the first deployment
endpoint_deployment_config = AksEndpoint.deploy_configuration(cpu_cores=0.1, memory_gb=0.2,
                                                              enable_app_insights=True,
                                                              tags={'sckitlearn': 'demo'},
                                                              description="testing versions",
                                                              version_name=version_name,
                                                              traffic_percentile=20)
# Deploy the model and endpoint
endpoint = Model.deploy(ws, endpoint_name, [model], inference_config, endpoint_deployment_config, compute)
# Wait for the process to complete
endpoint.wait_for_deployment(True)
```

To *consume* a deployed real-time service (or model or endpoint), we’ll need the following: **(Note: Recall the consume tab in AML Studio.)**
- The REST URL of the service, called with an HTTP POST request **(Note: Recall the step to copy the REST URL on AML studio and paste it in the script.)**
- Key **(Note: Recall the step to copy the primary key on AML studio and paste it in the script.)**
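
Putting the URL and key together, a client request might be assembled as below. This is a sketch using only the standard library: the endpoint URL and key are placeholders, `build_scoring_request` is a hypothetical helper, and the `{"data": [...]}` body must match whatever the entry script expects. The request is built but deliberately not sent.

```python
import json
import urllib.request

def build_scoring_request(endpoint, key, rows):
    """Build an authenticated POST request for a scoring endpoint."""
    body = json.dumps({'data': rows}).encode('utf-8')
    headers = {
        'Content-Type': 'application/json',
        'Authorization': 'Bearer ' + key,  # key copied from the service's consume tab
    }
    return urllib.request.Request(endpoint, data=body, headers=headers, method='POST')

# Placeholder URL and key; replace with the values of your own service
request = build_scoring_request('http://example.invalid/score', '<primary-key>', [[0.9, 0.8]])
print(request.get_method(), request.full_url)

# To actually call the service:
# predictions = json.loads(urllib.request.urlopen(request).read())
```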

## 07 Deploy batch inference pipelines with Azure Machine Learning
The steps are not very consistent between lectures and lab codes. Refer to the lab codes when there’s inconsistency.
1. Register a model
2. Create a scoring script and define a run context that includes the dependencies required by the script
3. Create a pipeline with **ParallelRunStep** (as the chapter name suggests, we’re going to create a step in a pipeline)
```python
from azureml.pipeline.steps import ParallelRunConfig, ParallelRunStep
# Define the parallel run step configuration
# Create the parallel run step
```
4. Run the pipeline and retrieve the step output
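
The scoring script in step 2 follows the same `init()` / `run()` shape as a real-time entry script, except that `run()` receives a mini-batch and returns one result per item. Below is a minimal sketch assuming a file-based input where each mini-batch is a list of file paths; `SumSignModel`, the file contents, and the result format are hypothetical.

```python
import os
import tempfile

model = None  # set once per worker by init()

class SumSignModel:
    """Hypothetical stand-in for the registered model."""
    def predict(self, rows):
        return [1 if sum(row) > 0 else 0 for row in rows]

def init():
    """Called once when the worker starts: load the model."""
    global model
    model = SumSignModel()

def run(mini_batch):
    """Score every file in the mini-batch and return one result per file."""
    results = []
    for file_path in mini_batch:
        with open(file_path) as f:
            values = [float(v) for v in f.read().strip().split(',')]
        prediction = model.predict([values])[0]
        results.append('{}: {}'.format(os.path.basename(file_path), prediction))
    return results

# Local smoke test with two small files standing in for the batch data
batch_folder = tempfile.mkdtemp()
mini_batch = []
for name, content in [('1.csv', '0.5,0.7'), ('2.csv', '-1.0,0.2')]:
    path = os.path.join(batch_folder, name)
    with open(path, 'w') as f:
        f.write(content)
    mini_batch.append(path)

init()
batch_results = run(mini_batch)
print(batch_results)
```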


## 08 Tune hyperparameters with Azure Machine Learning
For a discrete parameter, use a **choice** from a list of explicit values. Example: `'--batch_size': choice(16, 32, 64)`
### Types of sampling:
- Grid sampling
- Random sampling
- Bayesian sampling
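
The difference between the first two strategies can be illustrated in plain Python. This is a conceptual sketch, not the HyperDrive implementation: the search space and helper names are hypothetical, and Bayesian sampling (which picks new values based on how earlier runs performed) is left out.

```python
import itertools
import random

# Hypothetical discrete search space
search_space = {
    'batch_size': [16, 32, 64],
    'learning_rate': [0.01, 0.1],
}

def grid_sample(space):
    """Grid sampling: try every combination of the discrete values."""
    names = list(space)
    combos = itertools.product(*(space[n] for n in names))
    return [dict(zip(names, combo)) for combo in combos]

def random_sample(space, n_runs, seed=0):
    """Random sampling: pick each value independently for a fixed number of runs."""
    rng = random.Random(seed)
    return [{name: rng.choice(values) for name, values in space.items()}
            for _ in range(n_runs)]

grid_runs = grid_sample(search_space)
random_runs = random_sample(search_space, n_runs=4)
print(len(grid_runs), len(random_runs))
```

Grid sampling grows multiplicatively with every added hyperparameter, whereas random sampling lets you cap the number of runs up front.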

### Early termination
- Bandit policy: stop a run if the target performance metric underperforms the best run so far by a specified margin
```python
from azureml.train.hyperdrive import BanditPolicy
```
- Median stopping policy: abandons runs where the target performance metric is worse than the median of the running averages for all runs
```python
from azureml.train.hyperdrive import MedianStoppingPolicy
```
- Truncation selection policy: cancels the lowest performing X% of runs at each evaluation interval based on the truncation_percentage value you specify for X
```python
from azureml.train.hyperdrive import TruncationSelectionPolicy
```
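
The decision rules behind two of these policies can be sketched in a few lines. This is a simplified illustration that assumes a higher metric is better and ignores settings such as the evaluation interval and delay; the function names and metric values are hypothetical, and the real policies may round or compare differently.

```python
def bandit_should_stop(run_metric, best_metric, slack_amount):
    """Bandit-style rule: stop a run that trails the best run by more than the slack."""
    return run_metric < best_metric - slack_amount

def truncation_selection(metrics, truncation_percentage):
    """Truncation-style rule: return the lowest-performing X% of runs."""
    n_cancel = int(len(metrics) * truncation_percentage / 100)
    ranked = sorted(metrics, key=metrics.get)  # worst first
    return ranked[:n_cancel]

# Made-up metrics reported at one evaluation interval
metrics = {'run1': 0.91, 'run2': 0.72, 'run3': 0.85, 'run4': 0.60, 'run5': 0.88}
best = max(metrics.values())

stopped_by_bandit = sorted(r for r, m in metrics.items()
                           if bandit_should_stop(m, best, slack_amount=0.1))
cancelled_by_truncation = truncation_selection(metrics, truncation_percentage=40)
print(stopped_by_bandit, cancelled_by_truncation)
```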

### Running a hyperparameter tuning experiment
- Create a training script that
    - Includes an argument for each hyperparameter you want to vary (covered in previous lab)
    - Logs the target performance metric (covered in previous lab)
- Configure and run hyperdrive experiment
```python
from azureml.train.hyperdrive import HyperDriveConfig, PrimaryMetricGoal
```
- Monitor and review hyperdrive runs 


## 09 Automate machine learning model selection with Azure Machine Learning
Automated Machine Learning is one of the two big features in AML studio, alongside Designer. You can use the visual interface in Azure Machine Learning studio or the SDK to leverage this capability. The SDK gives you greater control over the settings for the automated machine learning experiment, but the visual interface is easier to use.

Configure an Automated Machine Learning experiment: 
```python
from azureml.train.automl import AutoMLConfig
```

## 10 Explain machine learning models with Azure Machine Learning

### Types of Feature Importance
- **Global feature importance** quantifies the relative importance of each feature in the test dataset as a whole.
- **Local feature importance** measures the influence of each feature value for a specific individual prediction.

### Explainers
Using explainers – install the **azureml-interpret** package
- MimicExplainer – An explainer that creates a *global surrogate model* that approximates your trained model and can be used to generate explanations.
```python
from interpret.ext.blackbox import MimicExplainer
from interpret.ext.glassbox import DecisionTreeExplainableModel # MimicExplainer requires the most arguments, including explainable_model
```
- TabularExplainer – An explainer that acts as a wrapper around various SHAP explainer algorithms, automatically choosing the one that is most appropriate for your model architecture.
```python
from interpret.ext.blackbox import TabularExplainer # Does not require explainable_model
```
- PFIExplainer – a *Permutation Feature Importance* explainer that analyzes feature importance by shuffling feature values and measuring the impact on prediction performance.
```python
from interpret.ext.blackbox import PFIExplainer # Does not require explainable_model or initialization_examples
```
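
The idea behind permutation feature importance can be shown without the `interpret` package. The sketch below is a conceptual illustration rather than what `PFIExplainer` does internally: it shuffles one column at a time and records the drop in accuracy. `toy_model` and the data are hypothetical, and the toy model deliberately ignores the second feature.

```python
import random

def accuracy(model, rows, labels):
    """Share of rows for which the model's prediction matches the label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(labels)

def permutation_importance(model, rows, labels, seed=0):
    """Importance of a feature = accuracy lost when its column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for col in range(len(rows[0])):
        shuffled = [r[col] for r in rows]
        rng.shuffle(shuffled)
        permuted = [r[:col] + [v] + r[col + 1:] for r, v in zip(rows, shuffled)]
        importances.append(baseline - accuracy(model, permuted, labels))
    return importances

def toy_model(row):
    """Hypothetical model that only looks at the first feature."""
    return 1 if row[0] > 0.5 else 0

rows = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3], [0.1, 0.9], [0.8, 0.5], [0.3, 0.4]]
labels = [toy_model(r) for r in rows]  # labels the toy model predicts perfectly
importances = permutation_importance(toy_model, rows, labels)
print(importances)
```

Shuffling the ignored feature cannot change any prediction, so its importance is exactly zero; shuffling the feature the model relies on can only keep or lower the accuracy.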

## 11 Detect and mitigate unfairness in models with Azure Machine Learning

### Disparity
[To add more notes]


A model with lower disparity in predictive performance between sensitive feature groups might be preferable to a model with higher disparity, even if the latter has higher overall accuracy.

#### Side Note – when might we choose a model with lower accuracy/AUC over one with a higher score?
- Time required for training
- Interpretability
- Lower disparity between sensitive feature groups
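
The trade-off between overall accuracy and disparity can be made concrete with a small calculation. This is a sketch with made-up labels, predictions, and groups; disparity is taken here as the gap between the best and worst per-group accuracy, which is only one of several possible definitions.

```python
def group_accuracy(y_true, y_pred, groups):
    """Accuracy computed separately for each sensitive feature group."""
    totals, correct = {}, {}
    for t, p, g in zip(y_true, y_pred, groups):
        totals[g] = totals.get(g, 0) + 1
        correct[g] = correct.get(g, 0) + (t == p)
    return {g: correct[g] / totals[g] for g in totals}

def disparity(per_group):
    """Gap between the best-served and worst-served group."""
    return max(per_group.values()) - min(per_group.values())

def overall_accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Made-up data: two groups of four instances each
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
groups = ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B']
model_1_pred = [1, 0, 1, 0, 0, 0, 1, 0]  # perfect on A, one error on B
model_2_pred = [1, 0, 0, 0, 1, 0, 0, 0]  # one error in each group

disparity_1 = disparity(group_accuracy(y_true, model_1_pred, groups))
disparity_2 = disparity(group_accuracy(y_true, model_2_pred, groups))
print(overall_accuracy(y_true, model_1_pred), disparity_1)
print(overall_accuracy(y_true, model_2_pred), disparity_2)
```

Here the first model is more accurate overall (0.875 vs 0.75) but treats the groups unequally, while the second is less accurate yet performs identically for both groups.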


## 12 Monitor a Model
To capture telemetry data for Application Insights, you can write any values to the standard output log in the scoring script for your service by using a `print` statement.

Summarize the whole workflow from building, deploying, consuming, to monitoring a model.
- Refer to the Jupyter Notebook for the complete code
