**Safe Deployment: Deploy and score a machine learning model after testing the online endpoint**

Test the deployment with mirrored traffic
Test the deployment with a percentage of live traffic
Once satisfied with the deployment, assign it 100% of live traffic

Workflow

1. Initialize the workspace and create a workspace handle
2. Initialize the environment
3. Configure the endpoint
4. Get a reference to the registered model
5. Create a scoring script
6. Configure the deployment
7. Assign traffic to the deployment
8. Test the deployment
9. Create a new deployment with a new or enhanced model
10. Test the new deployment with mirrored traffic. After testing, set the mirrored traffic to 0
11. Send a percentage of live traffic to the new deployment
12. Roll out the new deployment
13. Remove the old deployment
14. Delete the endpoint
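
The traffic changes in steps 7, 10, 11 and 12 can be sketched as plain dictionaries before any Azure call is made. This is only an illustration: the helper `traffic_split` and the 10% figure for step 11 are assumptions (the workflow only says "some percentage"), while the deployment names and the 30% mirror value come from the cells in this notebook.

```python
def traffic_split(current, candidate, candidate_percent):
    """Return a live-traffic dict that sends candidate_percent to the new deployment."""
    if not 0 <= candidate_percent <= 100:
        raise ValueError("candidate_percent must be between 0 and 100")
    return {current: 100 - candidate_percent, candidate: candidate_percent}


blue, green = "blue-deployment", "green"

# Each stage: (workflow step, live traffic, mirrored traffic)
rollout_stages = [
    ("step 7: all live traffic to blue", {blue: 100}, {}),
    ("step 10: mirror 30% of traffic to green", {blue: 100}, {green: 30}),
    ("step 10: stop mirroring after testing", {blue: 100}, {green: 0}),
    # 10% is a hypothetical starting value, not taken from the notebook
    ("step 11: part of live traffic to green", traffic_split(blue, green, 10), {green: 0}),
    ("step 12: full rollout to green", traffic_split(blue, green, 100), {green: 0}),
]

for step, traffic, mirror in rollout_stages:
    print(f"{step}: traffic={traffic}, mirror_traffic={mirror}")
```

In the notebook, each pair would be assigned to `endpoint.traffic` and `endpoint.mirror_traffic`, followed by `ml_client.begin_create_or_update(endpoint).result()`.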



##### Step 1: Initialize Workspace and Create Workspace handle

In [1]:
from azureml.core import Workspace
from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential

# Initialize the workspace (SDK v1 class, used here only to read the workspace config)
ws = Workspace.from_config()

# Get a handle to the workspace (SDK v2 client)
credential = DefaultAzureCredential()  # authenticate
ml_client = MLClient(
    credential=credential,
    subscription_id=ws.subscription_id,
    resource_group_name=ws.resource_group,
    workspace_name=ws.name,
)


##### Step 2: Initialize Environment

In [2]:
import os
from azure.ai.ml.entities import Environment

custom_env_name  = "ENV-SDKv2"
# dependencies_dir = '../dependencies'
# env = Environment( name=custom_env_name,
#                    description="Environment for python SDKv2 Execution",
#                    conda_file=os.path.join(dependencies_dir, "conda.yaml"),
#                    image="mcr.microsoft.com/azureml/openmpi4.1.0-ubuntu20.04:latest",
#                  )
# env = ml_client.environments.create_or_update(env)

# GET ENVIRONMENT
# use 'label' parameter to get latest environment for example label='latest'
# use 'version' parameter to get specific version environment, for example version=2
env = ml_client.environments.get(name=custom_env_name, label='latest') 

print(f"Environment with name {env.name} is registered to workspace, the environment version is {env.version}")

Environment with name ENV-SDKv2 is registered to workspace, the environment version is 6


##### Step 3: Configure Endpoint
Note: `auth_mode` is either **key** or **token**. The main difference is that a **token** expires and has to be refreshed, while a **key** does not expire.

In [4]:
from azure.ai.ml.entities import ManagedOnlineEndpoint

# online endpoint
online_endpoint_name = "pima-endpoint-SDKv2-v5"

endpoint = ManagedOnlineEndpoint(
    name=online_endpoint_name,
    description="this is a sample online endpoint for mlflow model",
    auth_mode="key",
    tags={"SDK": "2"},
)
# create endpoint
ml_client.begin_create_or_update(endpoint).result()




##### Step 4: Get reference to Registered Model

In [4]:
# referencing model from model registry
model_name = 'pima_model_SDKv2_02'
model = ml_client.models.get(name=model_name, label="latest")

##### Step 5: Create a Scoring Script and save in 'src' folder

##### Step 6: Configure Deployment

**Instance type**: The VM size to use for the deployment. For the list of supported sizes, [click here](https://learn.microsoft.com/en-us/azure/machine-learning/reference-managed-online-endpoints-vm-sku-list?view=azureml-api-2).

In [5]:
from azure.ai.ml.entities import ManagedOnlineDeployment, CodeConfiguration

deployment_name = "blue-deployment"
deployment_blue = ManagedOnlineDeployment(
    name=deployment_name,  # deployment name
    endpoint_name=online_endpoint_name,
    model=model,
    environment=env,
    code_configuration=CodeConfiguration(
        code="../src", scoring_script="pima_scoreOnlineEndpoint_SDKv2.py"
    ),
    instance_type="Standard_F4s_v2",
    instance_count=1,
)

# create the deployment on the managed endpoint (a local deployment would need local=True)
ml_client.online_deployments.begin_create_or_update(deployment=deployment_blue).result()


Check: endpoint pima-endpoint-SDKv2_v5 exists


ResourceNotFoundError: (ResourceNotFound) The Resource 'Microsoft.MachineLearningServices/workspaces/us_azure/onlineEndpoints/pima-endpoint-SDKv2_v5' under resource group 'us_azure_practice' was not found. For more details please go to https://aka.ms/ARMResourceNotFoundFix
Code: ResourceNotFound
Message: The Resource 'Microsoft.MachineLearningServices/workspaces/us_azure/onlineEndpoints/pima-endpoint-SDKv2_v5' under resource group 'us_azure_practice' was not found. For more details please go to https://aka.ms/ARMResourceNotFoundFix

##### Step 7: Assign traffic to the deployment

In [None]:
# assign traffic to the deployment
endpoint.traffic = {deployment_name: 100}
ml_client.begin_create_or_update(endpoint).result()

# Step 8: test the endpoint
# send sample data to the blue deployment
ml_client.online_endpoints.invoke(
    endpoint_name=online_endpoint_name,
    deployment_name=deployment_name,
    request_file="../../data/sample_request.json",
)


##### Step 9: Create New Deployment - green
# Get a reference to your new or enhanced model from the model registry
model_name = 'credit_defaults_model_SDKv2'
model2 = ml_client.models.get(name=model_name, label='latest')


# configure and create your new deployment
deployment_name2 = 'green' 
green_deployment = ManagedOnlineDeployment(
    name=deployment_name2,
    endpoint_name=online_endpoint_name,
    model=model2,
    environment=env,
    code_configuration=CodeConfiguration(
        code="../src", scoring_script="scoreOnlineEndpoint_SDKv2.py"
    ),
    instance_type="Standard_DS2_v2",
    instance_count=1,
)

ml_client.online_deployments.begin_create_or_update(deployment=green_deployment).result() 


##### Step 10: Test the deployment with mirrored traffic
# Once you've tested your `green` deployment, you can copy (or 'mirror') a percentage of the live traffic to it. Mirroring doesn't change the results returned to clients: requests still flow 100% to the blue deployment, and the mirrored percentage is copied and submitted to the `green` deployment so you can gather metrics and logs. This is useful for validating a new deployment without impacting clients, for example to check that latency is within acceptable bounds and that there are no HTTP errors.

endpoint.mirror_traffic = {deployment_name2: 30}
ml_client.begin_create_or_update(endpoint).result()



##### Step 8: Test the endpoint with sample data
We will send a sample request using a [JSON](../../data/sample_request.json) file.
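
The contents of the request file are not shown in this notebook, so here is a hedged sketch of how such a file could be produced. The `{"data": [[...]]}` shape and the eight illustrative feature values are assumptions; the payload must match whatever the scoring script's `run()` function expects. The file is written to a temporary folder here rather than to `../../data`.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical payload: one row of eight numeric features (illustrative values only).
sample_request = {"data": [[6, 148, 72, 35, 0, 33.6, 0.627, 50]]}

# Write the request to a temporary folder so the sketch runs anywhere.
request_path = Path(tempfile.mkdtemp()) / "sample_request.json"
request_path.write_text(json.dumps(sample_request, indent=2))

print(f"Wrote {request_path}")
print(request_path.read_text())
```

The resulting path is what would be passed as `request_file` to `ml_client.online_endpoints.invoke(...)`.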