For this workshop, you need:

* An Azure Machine Learning workspace. 
* The Azure Machine Learning Python SDK v2 installed. 

To install the SDK, you can either:

* Create a compute instance, which comes with the latest Azure Machine Learning Python SDK preinstalled and is preconfigured for ML workflows.
* Use the following command to install the Azure Machine Learning Python SDK v2 yourself:

```bash
pip install azure-ai-ml --upgrade
```
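To confirm that the installation worked, a small check along the following lines can help. This is a minimal sketch that only uses the Python standard library; the helper name `installed_version` is made up for illustration.

```python
from importlib import metadata


def installed_version(package: str = "azure-ai-ml"):
    """Return the installed version string of `package`, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("azure-ai-ml is not installed; run the pip command above.")
else:
    print(f"azure-ai-ml {version} is installed.")
```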

## Connect to ML Client

To connect to a workspace, you need to provide a subscription ID, a resource group, and a workspace name. These details are passed to `MLClient` from `azure.ai.ml` to get a handle to the required Azure Machine Learning workspace.

The following example uses the default Azure authentication and reads the workspace details from a `config.json` file, if one exists in your folder structure. If no `config.json` is found, you need to enter `subscription_id`, `resource_group`, and `workspace` manually when creating `MLClient`.

```python
from azure.identity import DefaultAzureCredential
from azure.ai.ml import MLClient

credential = DefaultAzureCredential()
ml_client = None
try:
    ml_client = MLClient.from_config(credential)
except Exception as ex:
    print(ex)
    # Enter details of your AzureML workspace
    subscription_id = "<SUBSCRIPTION_ID>"
    resource_group = "<RESOURCE_GROUP>"
    workspace = "<AZUREML_WORKSPACE_NAME>"
    ml_client = MLClient(credential, subscription_id, resource_group, workspace)
```
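For reference, a `config.json` file for a workspace typically holds three keys. The sketch below writes such a file with placeholder values into a temporary folder, which stands in for your project folder, and reads it back. The values are placeholders, not real identifiers.

```python
import json
import tempfile
from pathlib import Path

# Placeholder values; replace them with the details of your own workspace.
workspace_config = {
    "subscription_id": "<SUBSCRIPTION_ID>",
    "resource_group": "<RESOURCE_GROUP>",
    "workspace_name": "<AZUREML_WORKSPACE_NAME>",
}

# A temporary directory stands in for your project folder in this sketch.
config_path = Path(tempfile.mkdtemp()) / "config.json"
config_path.write_text(json.dumps(workspace_config, indent=2))

# Read the file back to confirm that it is valid JSON with the expected keys.
loaded = json.loads(config_path.read_text())
print(sorted(loaded))
```

If the file lives outside the folders that `MLClient.from_config` searches, passing its location through the `path` argument should work, though it is worth checking this against the SDK version you have installed.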


In [1]:
import os
import re

from azure.identity import DefaultAzureCredential
from azure.ai.ml import MLClient

# Retrieve details of the Azure ML workspace from the environment variables of your compute instance
# Extract the Azure subscription ID from the MLflow tracking URI
subscription_id = re.search(
    r"subscriptions/(.*)/resourceGroups", os.environ["MLFLOW_TRACKING_URI"]
).group(1)
resource_group = os.environ["CI_RESOURCE_GROUP"]
workspace_name = os.environ["CI_WORKSPACE"]

# Connect to Azure ML workspace
ml_client = MLClient(
    credential=DefaultAzureCredential(),
    subscription_id=subscription_id,
    resource_group_name=resource_group,
    workspace_name=workspace_name,
)

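The regular expression in the cell above can be tried offline. The following sketch applies it to a hypothetical tracking URI; the region, IDs, and names in it are made up and only mimic the general shape of the value found in `MLFLOW_TRACKING_URI`.

```python
import re

# Hypothetical tracking URI with made-up identifiers, shaped like the value
# a compute instance exposes in MLFLOW_TRACKING_URI.
tracking_uri = (
    "azureml://westeurope.api.azureml.ms/mlflow/v1.0/"
    "subscriptions/00000000-0000-0000-0000-000000000000/"
    "resourceGroups/my-resource-group/"
    "providers/Microsoft.MachineLearningServices/workspaces/my-workspace"
)

# Capture everything between "subscriptions/" and "/resourceGroups".
match = re.search(r"subscriptions/(.*)/resourceGroups", tracking_uri)
if match is None:
    raise ValueError("The URI does not contain a subscription segment.")

subscription_id = match.group(1)
print(subscription_id)
```

Checking for `None` before calling `group(1)` gives a clearer error if the environment variable ever has an unexpected format.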


# Batch Endpoint

**Batch endpoints** are endpoints used to run batch inference on large volumes of data over a period of time.

**Batch endpoints** receive pointers to data and run jobs asynchronously, processing the data in parallel on compute clusters. They store the outputs in a datastore for further analysis.

<center>
<img src="../../../imgs/concept_batch_endpoint.png" width = "700px" alt="Concept batch endpoint">
</center>
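The idea of splitting the input into mini-batches and processing them in parallel can be illustrated with plain Python. This is a conceptual sketch only, not how Azure Machine Learning implements batch endpoints: worker threads stand in for cluster nodes, and `split_into_mini_batches` and `score_mini_batch` are hypothetical helpers.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_mini_batches(items, mini_batch_size):
    """Split a list into consecutive chunks of at most `mini_batch_size` items."""
    return [items[i:i + mini_batch_size] for i in range(0, len(items), mini_batch_size)]


def score_mini_batch(mini_batch):
    """Hypothetical scoring step: returns the length of each file name."""
    return [len(name) for name in mini_batch]


# Pointers to data (here just made-up file names), not the data itself.
file_pointers = [f"file_{i}.csv" for i in range(10)]
mini_batches = split_into_mini_batches(file_pointers, mini_batch_size=4)

# Worker threads stand in for the nodes of a compute cluster.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(score_mini_batch, mini_batches))

# Collect all outputs in one place, as a batch job does in its output store.
predictions = [value for batch in results for value in batch]
print(len(mini_batches), len(predictions))
```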

## 1. Create Batch Compute Cluster

In [None]:
# create the compute cluster that the batch deployment will run on
from azure.ai.ml.entities import AmlCompute

my_cluster = AmlCompute(
    name=### TO DO ###,
    type=### TO DO ###, 
    size="STANDARD_DS3_V2", 
    min_instances=0, 
    max_instances=3,
    location="westeurope", 	
)
ml_client.compute.begin_create_or_update(my_cluster)

## 2. Create Batch Endpoint

We can create the **batch endpoint** with the CLI v2 or the SDK v2 using the following syntax:


<center>
<img src="../../../imgs/create_batch_endpoint.png" width = "700px" alt="Create batch endpoint cli vs sdk">
</center>

In [None]:
# create batch endpoint
from azure.ai.ml.entities import BatchEndpoint

batch_endpoint = BatchEndpoint(
    name=### TO DO ###,
    description=### TO DO ###,
    tags={"model": ### TO DO ###},
)

ml_client.begin_create_or_update(batch_endpoint)

## 3. Create Batch Deployment

We can create the **batch deployment** with the CLI v2 or the SDK v2 using the following syntax:

<center>
<img src="../../../imgs/create_batch_deployment.png" width = "700px" alt="Create batch deployment cli vs sdk">
</center>

Note that if you're deploying **MLflow models**, there's no need to provide a **scoring script** or an execution **environment**, as both are generated automatically.
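For models that are not in MLflow format, the scoring script conventionally defines two functions: `init()`, which runs once per worker to load the model, and `run(mini_batch)`, which receives a list of file paths and returns the predictions. The following is a minimal sketch of that shape. The stub model, the `trip_distance` column, and the output format are assumptions made for illustration and are not taken from the workshop's actual script.

```python
import csv
import os
import tempfile

model = None


def _stub_model(distance):
    """Hypothetical stand-in for a trained model: 2.5 fare units per distance unit."""
    return 2.5 * distance


def init():
    """Runs once per worker process; load the model here."""
    global model
    # A real script would load the registered model from the folder given by
    # the AZUREML_MODEL_DIR environment variable. This sketch uses a stub.
    model = _stub_model


def run(mini_batch):
    """Runs once per mini-batch; `mini_batch` is a list of file paths."""
    rows = []
    for file_path in mini_batch:
        with open(file_path, newline="") as handle:
            for record in csv.DictReader(handle):
                prediction = model(float(record["trip_distance"]))
                rows.append(f"{os.path.basename(file_path)},{prediction}")
    return rows


# Local smoke test with a made-up input file.
sample_path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(sample_path, "w", newline="") as handle:
    handle.write("trip_distance\n1.0\n4.0\n")

init()
print(run([sample_path]))
```

Calling `init()` and `run()` locally like this is a cheap way to catch errors before the script is deployed to a cluster.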

In [None]:
# create batch deployment
from azure.ai.ml.entities import (
    BatchDeployment,
    CodeConfiguration,
    Environment,
    Model,
)
from azure.ai.ml.constants import BatchDeploymentOutputAction


env = Environment(
    conda_file="../../../data-science/environment/deploy-batch-conda.yml",
    image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest",
)

batch_deployment = BatchDeployment(
    name=### TO DO ###,
    description=### TO DO ###,
    endpoint_name=### TO DO ###,
    model=### TO DO ###,
    environment=### TO DO ###,
    code_configuration=CodeConfiguration(
        code="../../../data-science/src/score", scoring_script=### TO DO ###
    ),
    compute=### TO DO ###,
    instance_count=2,
    max_concurrency_per_instance=4,
    mini_batch_size=200,
    output_action=BatchDeploymentOutputAction.APPEND_ROW,
    output_file_name="predictions.csv",
)

ml_client.begin_create_or_update(batch_deployment)
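To get a feel for how `instance_count`, `max_concurrency_per_instance`, and `mini_batch_size` interact, the arithmetic can be sketched as follows. It assumes that `mini_batch_size` counts input files per mini-batch and that all mini-batches take roughly the same time. The `plan_batch_job` helper and the figure of 5,000 files are made up for illustration.

```python
import math


def plan_batch_job(n_files, mini_batch_size, instance_count, max_concurrency_per_instance):
    """Estimate mini-batch count and how many rounds of parallel work are needed."""
    n_mini_batches = math.ceil(n_files / mini_batch_size)
    parallel_slots = instance_count * max_concurrency_per_instance
    rounds = math.ceil(n_mini_batches / parallel_slots)
    return n_mini_batches, parallel_slots, rounds


# Deployment settings from the cell above; the 5,000 input files are a made-up figure.
print(plan_batch_job(5000, mini_batch_size=200, instance_count=2, max_concurrency_per_instance=4))
```

Under these assumptions, 5,000 files give 25 mini-batches, which 8 parallel slots work through in 4 rounds. A single input file yields one mini-batch, so the extra slots stay idle.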


Set this deployment as the endpoint's default deployment:

In [19]:
batch_endpoint = ml_client.batch_endpoints.get(### TO DO ###)
batch_endpoint.defaults.deployment_name = batch_deployment.name
ml_client.batch_endpoints.begin_create_or_update(batch_endpoint)


## 4. Invoke and Test Endpoint

We can invoke the **batch deployment** with the CLI v2 or the SDK v2 using the following syntax:

<center>
<img src="../../../imgs/invoke_batch_deployment.png" width = "700px" alt="Invoke batch deployment cli vs sdk">
</center>

In [None]:
# invoke and test endpoint
from azure.ai.ml import Input
from azure.ai.ml.constants import AssetTypes, InputOutputModes

# use a name that does not shadow the built-in input() function
batch_input = Input(
    path="../../../data/taxi-batch.csv",
    type=AssetTypes.URI_FILE,
    mode=InputOutputModes.DOWNLOAD,
)


# invoke the endpoint for batch scoring job
ml_client.batch_endpoints.invoke(
    endpoint_name=### TO DO ###,
    input=batch_input,
    deployment_name=### TO DO ###
)
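Because the batch job runs asynchronously, invoking the endpoint returns before scoring has finished. One way to wait for the result is a simple polling loop, sketched below. The `wait_for_job` helper is hypothetical, and the status source is injected as a callable so that the loop can be tried offline with fake statuses.

```python
import time

# Status strings that commonly mark a finished Azure ML job.
TERMINAL_STATUSES = {"Completed", "Failed", "Canceled"}


def wait_for_job(get_status, poll_seconds=10.0, timeout_seconds=3600.0):
    """Poll `get_status()` until it returns a terminal status or time runs out."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Job still in status {status!r} after {timeout_seconds} s")
        time.sleep(poll_seconds)


# Offline demonstration with a fake status source instead of a real job.
fake_statuses = iter(["Queued", "Running", "Running", "Completed"])
final_status = wait_for_job(lambda: next(fake_statuses), poll_seconds=0.0)
print(final_status)
```

With a real job, and assuming you keep the object returned by `invoke` in a variable such as `job`, the callable could be `lambda: ml_client.jobs.get(job.name).status`.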