# E2E Machine Learning Workflow on Azure ML using the Python SDK v2 pt.2

##### Model deployment and inferencing

## Deploy the model as an online endpoint

Now deploy your machine learning model as a web service in the Azure cloud, an [`online endpoint`](https://docs.microsoft.com/azure/machine-learning/concept-endpoints).

To deploy a machine learning service, you usually need:

* The model assets (files, metadata) that you want to deploy. You've already registered these assets in your training component.
* Some code to run as a service. This entry script receives the data submitted to the deployed web service, passes it to the model, and returns the model's response to the client. The script is specific to your model: it must understand the data that the model expects and returns. When you use an MLflow model, as in this tutorial, this script is created for you automatically. Samples of scoring scripts can be found [here](https://github.com/Azure/azureml-examples/tree/sdk-preview/sdk/endpoints/online).
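
For models that are not in MLflow format, the entry script is usually written by hand. The following is a minimal sketch of its shape, assuming the common `init()` / `run()` convention. The `_ThresholdModel` class and the `_load_model` helper are hypothetical stand-ins so that the sketch runs without a trained model; a real script would deserialize the registered model instead.

```python
import json
import os


class _ThresholdModel:
    """Hypothetical stand-in for a real model: predicts 1 when the row sum is positive."""

    def predict(self, rows):
        return [1 if sum(row) > 0 else 0 for row in rows]


def _load_model(model_dir):
    # Hypothetical helper: a real script would load the model files found in model_dir.
    return _ThresholdModel()


model = None


def init():
    """Called once when the deployment starts: load the model into memory."""
    global model
    # AZUREML_MODEL_DIR points to the folder that holds the registered model files.
    model_dir = os.getenv("AZUREML_MODEL_DIR", ".")
    model = _load_model(model_dir)


def run(raw_data):
    """Called for every request: parse the JSON body, score it, return the predictions."""
    rows = json.loads(raw_data)["input_data"]
    return model.predict(rows)
```

The `"input_data"` key mirrors the request format used later in this tutorial; the rest of the payload handling depends on your model.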



### Step 1: Create the endpoint

The endpoint name needs to be unique in the entire Azure region. For this tutorial, you'll create a unique name using [`UUID`](https://en.wikipedia.org/wiki/Universally_unique_identifier).
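
As a rough sketch, the name generation can be wrapped in a helper that also checks the result before anything is sent to Azure. The naming rule encoded below (starts with a letter, then letters, digits, or hyphens, 3 to 32 characters in total) is an assumption; check the current service limits before relying on it. The helper name `make_endpoint_name` is hypothetical.

```python
import re
import uuid

# Assumed naming rule: a letter first, then letters, digits, or hyphens, 3 to 32 characters in total.
ENDPOINT_NAME_PATTERN = re.compile(r"^[A-Za-z][A-Za-z0-9-]{2,31}$")


def make_endpoint_name(prefix="credit-endpoint-", suffix_length=8):
    """Append a random hexadecimal suffix to the prefix and validate the result."""
    suffix = uuid.uuid4().hex[:suffix_length]
    name = prefix + suffix
    if not ENDPOINT_NAME_PATTERN.match(name):
        raise ValueError(f"{name!r} does not satisfy the assumed naming rule")
    return name


print(make_endpoint_name())
```

A random suffix makes a collision unlikely but does not rule it out, so creating the endpoint can still fail if the name is already taken.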

In [None]:
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

subscription_id = ''
resource_group = ''
workspace = ''

ml_client = MLClient(DefaultAzureCredential(), subscription_id, resource_group, workspace)

In [None]:
import uuid
from azure.ai.ml.entities import ManagedOnlineEndpoint

# Create a unique name for the endpoint
online_endpoint_name = "credit-endpoint-" + str(uuid.uuid4())[:8]

# define an online endpoint
endpoint = ManagedOnlineEndpoint(
    name=online_endpoint_name,
    description="this is an online endpoint",
    auth_mode="key",
)

# create the online endpoint - takes approximately 2 minutes.

endpoint = ml_client.online_endpoints.begin_create_or_update(endpoint).result()

### Step 2: Deploy the model to the endpoint
Once the endpoint is created, deploy the model with the entry script. Each endpoint can have multiple deployments, and rules specify how incoming traffic is split between them. Here you'll create a single deployment that handles 100% of the incoming traffic. The deployment is named `blue`; color names such as blue, green, and red are a common but arbitrary convention.
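
To build intuition for what a traffic rule means, here is a small local simulation of percentage-based routing. It is only an illustration and does not call Azure; the function name `pick_deployment` is hypothetical, and the check that the percentages add up to 100 is a simplifying assumption.

```python
import random
from collections import Counter


def pick_deployment(traffic, rng):
    """Pick a deployment name with probability proportional to its traffic percentage."""
    if sum(traffic.values()) != 100:
        raise ValueError("traffic percentages should add up to 100")
    names = list(traffic)
    weights = [traffic[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed so that the simulation is repeatable

# One deployment taking all the traffic, as in this step
single = Counter(pick_deployment({"blue": 100}, rng) for _ in range(1000))
# A 90/10 split between two deployments
split = Counter(pick_deployment({"blue": 90, "green": 10}, rng) for _ in range(1000))

print(single)
print(split)
```

With a single deployment at 100%, every simulated request lands on `blue`; with a 90/10 split, roughly one request in ten goes to `green`.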

You can check the Models page in Azure ML studio to identify the latest version of your registered model. Alternatively, the code below retrieves the latest version number for you.
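
A slightly more defensive variant of the version lookup is sketched here. It compares versions as integers (so that 10 sorts after 9) and fails with a clear message when no model is found, for example because the model name is misspelled. The helper name `latest_version` is hypothetical, and the `SimpleNamespace` objects stand in for the items returned by `ml_client.models.list(...)`.

```python
from types import SimpleNamespace


def latest_version(models):
    """Return the highest numeric version among objects that have a `version` attribute."""
    versions = [int(m.version) for m in models if str(m.version).isdigit()]
    if not versions:
        raise LookupError("no numerically versioned model found; check the registered model name")
    return max(versions)


# Stand-in objects instead of a real model listing
fake_models = [SimpleNamespace(version=v) for v in ("1", "9", "10")]
print(latest_version(fake_models))  # 10
```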

In [None]:
registered_model_name = ""

# Let's pick the latest version of the model
latest_model_version = max(
    [int(m.version) for m in ml_client.models.list(name=registered_model_name)]
)

print(latest_model_version)

In [None]:
from azure.ai.ml.entities import ManagedOnlineDeployment

# Choose the latest version of the registered model for deployment
model = ml_client.models.get(name=registered_model_name, version=latest_model_version)

# define an online deployment
# if you run into an out of quota error, change the instance_type to a comparable VM that is available.
# Learn more on https://azure.microsoft.com/en-us/pricing/details/machine-learning/.
blue_deployment = ManagedOnlineDeployment(
    name="blue",
    endpoint_name=online_endpoint_name,
    model=model,
    instance_type="Standard_DS3_v2",
    instance_count=1,
)

In [None]:
# create the online deployment
# expect the deployment to take approximately 8 to 10 minutes.
blue_deployment = ml_client.online_deployments.begin_create_or_update(
    blue_deployment
).result()

# blue deployment takes 100% traffic
endpoint.traffic = {"blue": 100}
ml_client.online_endpoints.begin_create_or_update(endpoint).result()

### Step 3: Test the endpoint with a single sample point

In [None]:
from azure.ai.ml.entities import Data
from azure.ai.ml.constants import AssetTypes

credit_data_latest = Data(
    name="credit_card_default_data_test",
    path="../data/credit_card_default_data_test.csv",
    type=AssetTypes.URI_FILE,
    description="Dataset for credit card defaults",
    version="1",
)

ml_client.data.create_or_update(credit_data_latest)

In [None]:
test_sample = ml_client.data.get(name="credit_card_default_data_test", version="1")
print(f"Data asset URI: {test_sample.path}")

In [None]:
import pandas as pd

raw_df = pd.read_csv(test_sample.path, header=0)
test_df = raw_df.drop(columns=["default payment next month"])
test_df.head()

In [None]:
import json
import urllib.error
import urllib.request

# index of the test row to score
idx = 1
data = {"input_data": [test_df.iloc[idx].to_list()]}

body = str.encode(json.dumps(data))

# Replace this with the scoring URI of the endpoint
url = ''
# Replace this with the primary/secondary key or AMLToken for the endpoint
api_key = ''
if not api_key:
    raise Exception("A key should be provided to invoke the endpoint")

# the azureml-model-deployment header routes the request to the blue deployment
headers = {
    'Content-Type': 'application/json',
    'Authorization': ('Bearer ' + api_key),
    'azureml-model-deployment': 'blue',
}

req = urllib.request.Request(url, body, headers)

try:
    response = urllib.request.urlopen(req)
    result = response.read()
    # print the prediction next to the true label of the same row
    print(result, raw_df.iloc[idx]["default payment next month"])
except urllib.error.HTTPError as error:
    print("The request failed with status code: " + str(error.code))
    print(error.info())
    print(error.read().decode("utf8", 'ignore'))
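
Instead of copying the scoring URI and key from the studio, they can also be looked up through the SDK. The sketch below assumes an endpoint created with `auth_mode="key"`, as in Step 1, and a client object like the `ml_client` used throughout this notebook; the helper name `get_scoring_credentials` is hypothetical.

```python
def get_scoring_credentials(ml_client, endpoint_name):
    """Return (scoring_uri, primary_key) for a key-authenticated online endpoint."""
    # The endpoint object carries the URL that accepts scoring requests.
    endpoint = ml_client.online_endpoints.get(name=endpoint_name)
    # The keys are fetched with a separate call.
    keys = ml_client.online_endpoints.get_keys(name=endpoint_name)
    return endpoint.scoring_uri, keys.primary_key


# Usage with the objects from this notebook (requires a deployed endpoint):
# url, api_key = get_scoring_credentials(ml_client, online_endpoint_name)
```

Treat the returned key like a password: avoid printing it or committing it to source control.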

### Step 4: Testing the endpoint natively with the Python SDK v2

Now that the model is deployed to the endpoint, you can run inference with it.

Create a sample request file in the format expected by the `run` method of the scoring script.
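
If the target folder does not exist yet, writing the file from a notebook magic is likely to fail. As an alternative sketch, the request file can be written from plain Python, which also creates the folder when needed. The helper name `write_request_file` is hypothetical; the `"input_data"` key follows the request format used in this tutorial.

```python
import json
import os


def write_request_file(rows, path):
    """Write rows of feature values as an {"input_data": [...]} JSON request file."""
    directory = os.path.dirname(path)
    if directory:
        # create the target folder if it is missing
        os.makedirs(directory, exist_ok=True)
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"input_data": [list(row) for row in rows]}, f)
    return path


# Usage with one row taken from the test data frame of the previous step:
# write_request_file([test_df.iloc[0].to_list()], "inference/sample-request.json")
```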

In [None]:
%%writefile inference/sample-request.json
{
  "input_data": [[24, 20000, 2, 2, 1, 24, 2, 2, -1, -1, -2, -2, 3913, 3102, 689, 0, 0, 0, 0, 689, 0, 0, 0, 0]]
}

In [None]:
# test the blue deployment with some sample data

ml_client.online_endpoints.invoke(
    endpoint_name=online_endpoint_name,
    request_file="./inference/sample-request.json",
    deployment_name="blue",
)

### Step 5: Safe roll-out of a new version of the model

We will now look at the safe rollout of a new version of the model using a blue/green deployment, in which a small share of traffic is directed to the newer version first.
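
A blue/green rollout can be thought of as a sequence of traffic splits that gradually moves requests from the old deployment to the new one. The sketch below only computes such a sequence locally; the percentages chosen and the helper name `rollout_steps` are illustrative assumptions, not a recommended schedule.

```python
def rollout_steps(green_percentages=(10, 50, 100), old="blue", new="green"):
    """Return one traffic dictionary per rollout stage."""
    steps = []
    for pct in green_percentages:
        if not 0 <= pct <= 100:
            raise ValueError("percentages must be between 0 and 100")
        # whatever does not go to the new deployment stays on the old one
        steps.append({old: 100 - pct, new: pct})
    return steps


for traffic in rollout_steps():
    print(traffic)
# {'blue': 90, 'green': 10}
# {'blue': 50, 'green': 50}
# {'blue': 0, 'green': 100}
```

Each dictionary has the same shape as the value assigned to `endpoint.traffic`, so a stage can be applied once the previous one has been observed to behave well.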

#### Step 5.1 Training and registering a new version of the initial model

We begin by creating a new dataset, with more training data, that will be used for the newer version of the model.

In [None]:
from azure.ai.ml.entities import Data
from azure.ai.ml.constants import AssetTypes

credit_data_latest = Data(
    name="credit_card_default_data",
    path="../data/credit_card_default_data_v2.csv",
    type=AssetTypes.URI_FILE,
    description="Dataset for credit card defaults",
    version="2",
)

ml_client.data.create_or_update(credit_data_latest)

In [None]:
credit_data_latest = ml_client.data.get(name="credit_card_default_data", version="2")
print(f"Data asset URI: {credit_data_latest.path}")

In [None]:
from azure.ai.ml import dsl, Input, Output

# load the components registered in the first part of this tutorial
clean_component = ml_client.components.get("data_clean_credit_card_defaults", "1")
prep_component = ml_client.components.get("data_prep_credit_card_defaults", "1")
train_component = ml_client.components.get("credit_default_model_training", "1")

@dsl.pipeline(
    compute="ej-cluster2",  # "serverless" value runs pipeline on serverless compute
    description="E2E data_prep-train pipeline",
    force_rerun=True,
)
def credit_defaults_pipeline(
    pipeline_job_data_input,
    pipeline_job_learning_rate,
    pipeline_job_registered_model_name,
):
    # call each component like a Python function with its own inputs
    data_clean_job = clean_component(
        data=pipeline_job_data_input
    )

    data_prep_job = prep_component(
        data=data_clean_job.outputs.cleaned_data,  # note: using outputs from previous step
    )

    train_job = train_component(
        train_data=data_prep_job.outputs.train_data,  # note: using outputs from previous step
        test_data=data_prep_job.outputs.test_data,  # note: using outputs from previous step
        learning_rate=pipeline_job_learning_rate,  # note: using a pipeline input as parameter
        registered_model_name=pipeline_job_registered_model_name,
    )
    # a pipeline returns a dictionary of outputs
    # the keys become the names of the pipeline outputs
    return {
        "pipeline_job_cleaned_data": data_clean_job.outputs.cleaned_data,
        "pipeline_job_train_data": data_prep_job.outputs.train_data,
        "pipeline_job_test_data": data_prep_job.outputs.test_data,
    }
 

In [None]:
registered_model_name = "credit_defaults_model"

# Let's instantiate the pipeline with the parameters of our choice
pipeline = credit_defaults_pipeline(
    pipeline_job_data_input=Input(type="uri_file", path=credit_data_latest.path),
    pipeline_job_learning_rate=0.05,
    pipeline_job_registered_model_name=registered_model_name
)

In [None]:
from azure.ai.ml.entities import PipelineJob, PipelineJobSettings

pipeline_job = ml_client.jobs.create_or_update(
    pipeline,
    experiment_name="e2e_registered_components",
)
ml_client.jobs.stream(pipeline_job.name)

#### Step 5.2 Update the endpoint

Now that the latest model is trained and registered, we can deploy it as part of our managed endpoint. The next series of cells shows how to route a small share of traffic to the newer model before completely migrating the endpoint to it.

In [None]:
# scale the deployment
blue_deployment = ml_client.online_deployments.get(
    name="blue", endpoint_name=online_endpoint_name
)
blue_deployment.instance_count = 2
ml_client.online_deployments.begin_create_or_update(blue_deployment).result()

In [None]:
registered_model_name = "credit_defaults_model"

# Let's pick the latest version of the model
latest_model_version = max(
    [int(m.version) for m in ml_client.models.list(name=registered_model_name)]
)

print(latest_model_version)

In [None]:
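# fetch the newly registered model version so that green serves the new model
model = ml_client.models.get(name=registered_model_name, version=latest_model_version)

green_deployment = ManagedOnlineDeployment(
    name="green",
    endpoint_name=online_endpoint_name,
    model=model,
    instance_type="Standard_DS3_v2",
    instance_count=1,
)

ml_client.online_deployments.begin_create_or_update(green_deployment).result()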

In [None]:
# test the green deployment with some sample data

ml_client.online_endpoints.invoke(
    endpoint_name=online_endpoint_name,
    request_file="./inference/sample-request.json",
    deployment_name="green",
)

In [None]:
# send 10% of the live traffic to green and keep 90% on blue
endpoint.traffic = {"blue": 90, "green": 10}
ml_client.online_endpoints.begin_create_or_update(endpoint).result()

In [None]:
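# move all traffic to green first, then remove the blue deployment
endpoint.traffic = {"blue": 0, "green": 100}
ml_client.online_endpoints.begin_create_or_update(endpoint).result()

ml_client.online_deployments.begin_delete(
    name="blue", endpoint_name=online_endpoint_name
).wait()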

## Clean up resources

If you're not going to use the endpoint, delete it to stop using the resource. Make sure no other deployments are using the endpoint before you delete it.
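
One way to check what is still attached to the endpoint is to list its deployments before deleting it. This is a minimal sketch that assumes a client object like the `ml_client` used throughout this notebook; the helper name `remaining_deployments` is hypothetical.

```python
def remaining_deployments(ml_client, endpoint_name):
    """Return the names of the deployments that still exist under the endpoint."""
    deployments = ml_client.online_deployments.list(endpoint_name=endpoint_name)
    return [deployment.name for deployment in deployments]


# Usage with the objects from this notebook (requires an existing endpoint):
# print(remaining_deployments(ml_client, online_endpoint_name))
```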


> [!NOTE]
> Expect this step to take approximately 6 to 8 minutes.

In [None]:
# wait for the deletion to finish
ml_client.online_endpoints.begin_delete(name=online_endpoint_name).wait()
