### This notebook does the following

>   Tunes hyperparameters for model training with a sweep job.

>   Creates a managed online endpoint for the best saved model.

>   Deploys an MLflow model.

>   Tests the endpoint.

#### Getting Started

> In the compute environment, clone the repo
  
  `git clone https://github.com/PrabalMukherjee/mslearn-azure-ml.git azure-ml-labs`

> Run the setup script to create the compute

  `./setup.sh`

> Open the ML workspace and, in the compute instance's standalone terminal, run the following

  `pip install azure-ai-ml`

  `git clone https://github.com/PrabalMukherjee/mslearn-azure-ml.git azure-ml-labs`

In [1]:
pip show azure-ai-ml

Name: azure-ai-ml
Version: 1.15.0
Summary: Microsoft Azure Machine Learning Client Library for Python
Home-page: https://github.com/Azure/azure-sdk-for-python
Author: Microsoft Corporation
Author-email: azuresdkengsysadmins@microsoft.com
License: MIT License
Location: /anaconda/envs/azureml_py38/lib/python3.8/site-packages
Requires: pyyaml, msrest, azure-storage-blob, azure-core, jsonschema, marshmallow, tqdm, azure-storage-file-share, colorama, opencensus-ext-logging, opencensus-ext-azure, pydash, pyjwt, azure-storage-file-datalake, typing-extensions, azure-mgmt-core, strictyaml, azure-common, isodate
Required-by: 
Note: you may need to restart the kernel to use updated packages.


In [25]:
# Setting up Environment
import random
import string

from azure.identity import InteractiveBrowserCredential, DefaultAzureCredential
from azure.ai.ml import MLClient
from azure.ai.ml import command, Input
from azure.ai.ml.constants import AssetTypes
from azure.ai.ml.sweep import Choice

from azure.ai.ml.entities import (
    Data,
    ManagedOnlineEndpoint,
    ManagedOnlineDeployment,
    Model,
    Environment,
    CodeConfiguration,
)


#### Connect to workspace       

In [3]:
#Connect to Azure ML Workspace

try:
    credential = DefaultAzureCredential()
    credential.get_token('https://management.azure.com/.default')
    print("DefaultAzureCredential works")
except Exception as e:
    print("DefaultAzureCredential does not work", e)
    credential = InteractiveBrowserCredential()

#Get a handle to the workspace
ml_client = MLClient.from_config(credential=credential)


Found the config file in: /config.json


DefaultAzureCredential works
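
`MLClient.from_config` reads the workspace coordinates from a `config.json` file, which is why the output above reports where the file was found. As a minimal sketch, the file is expected to carry three keys: `subscription_id`, `resource_group`, and `workspace_name`. The values below are placeholders (assumptions, not the ones used in this notebook), and the file is written to a temporary directory so that nothing real is overwritten.

```python
import json
import tempfile
from pathlib import Path

# Placeholder values, for illustration only.
workspace_config = {
    "subscription_id": "<your-subscription-id>",
    "resource_group": "<your-resource-group>",
    "workspace_name": "<your-workspace-name>",
}

# Write the file to a scratch directory rather than the working directory.
config_dir = Path(tempfile.mkdtemp())
config_path = config_dir / "config.json"
config_path.write_text(json.dumps(workspace_config, indent=2))

# Read it back and confirm the three expected keys are present.
loaded = json.loads(config_path.read_text())
missing = {"subscription_id", "resource_group", "workspace_name"} - loaded.keys()
print("Missing keys:", sorted(missing))
```

On a compute instance the file is normally provided already, so a sketch like this should only be needed when working from another machine.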


#### Create a Data Asset for Input

In [4]:
input_data = Data(
    name="diabetes-data",
    type=AssetTypes.URI_FILE,
    path="data/diabetes.csv",
    version="1"
    )

ml_client.data.create_or_update(input_data)

# get the data asset
data_asset_from_registry = ml_client.data.get(
    name="diabetes-data", version=1
)
print(data_asset_from_registry)

Uploading diabetes.csv (< 1 MB): 100%|██████████| 518k/518k [00:00<00:00, 11.7MB/s]

creation_context:
  created_at: '2024-03-29T17:40:40.597226+00:00'
  created_by: Prabal Mukherjee
  created_by_type: User
  last_modified_at: '2024-03-29T17:40:40.606774+00:00'
id: /subscriptions/058d8843-d1d3-47ed-bfd3-0a4b79db7001/resourceGroups/rg-dp100-l9216695a5c62456390/providers/Microsoft.MachineLearningServices/workspaces/mlw-dp100-l9216695a5c62456390/data/diabetes-data/versions/1
name: diabetes-data
path: azureml://subscriptions/058d8843-d1d3-47ed-bfd3-0a4b79db7001/resourcegroups/rg-dp100-l9216695a5c62456390/workspaces/mlw-dp100-l9216695a5c62456390/datastores/workspaceblobstore/paths/LocalUpload/d52d15a0d7d1e95b90a03f146099424a/diabetes.csv
properties: {}
tags: {}
type: uri_file
version: '1'
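
A registered data asset can be referenced by the short form `azureml:<name>:<version>`, which is how the command job in the next section points at this file. The sketch below builds and splits such a reference; the helper names `asset_reference` and `parse_asset_reference` are hypothetical and are not part of the SDK.

```python
from typing import Tuple


def asset_reference(name: str, version: str) -> str:
    """Build the short-form reference for one version of an asset."""
    return f"azureml:{name}:{version}"


def parse_asset_reference(reference: str) -> Tuple[str, str]:
    """Split 'azureml:<name>:<version>' into (name, version)."""
    parts = reference.split(":")
    if len(parts) != 3 or parts[0] != "azureml":
        raise ValueError(f"not a short-form asset reference: {reference!r}")
    return parts[1], parts[2]


ref = asset_reference("diabetes-data", "1")
print(ref)
print(parse_asset_reference(ref))
```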



#### Configure Command Job and Submit

In [5]:
# configure job
# path="data/diabetes.csv"

job = command(
    code="./src",
    command="python train.py --training_data ${{inputs.diabetes_data}} --reg_rate ${{inputs.reg_rate}}",
    inputs={
        "diabetes_data": Input(
            type=AssetTypes.URI_FILE, 
            path="azureml:diabetes-data:1"
            ),
        "reg_rate": 0.01,
    },
    environment="AzureML-sklearn-0.24-ubuntu18.04-py37-cpu@latest",
    compute="aml-cluster",
    display_name="diabetes-train-mlflow",
    experiment_name="diabetes-training", 
    tags={"model_type": "LogisticRegression"}
    )

# submit job
returned_job = ml_client.create_or_update(job)
aml_url = returned_job.studio_url
print("Monitor your job at", aml_url)

Class AutoDeleteSettingSchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class AutoDeleteConditionSchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class BaseAutoDeleteSettingSchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class IntellectualPropertySchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class ProtectionLevelSchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Class BaseIntellectualPropertySchema: This is an experimental class, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.
Uploading src (0.0 MBs): 100%|███

Monitor your job at https://ml.azure.com/runs/great_sheep_3q4klfr3dx?wsid=/subscriptions/058d8843-d1d3-47ed-bfd3-0a4b79db7001/resourcegroups/rg-dp100-l9216695a5c62456390/workspaces/mlw-dp100-l9216695a5c62456390&tid=49887c6a-e179-452b-a5a2-ba227a9ce9f7
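
The command string passes two arguments to `train.py`: `--training_data` and `--reg_rate`. The script itself is not shown in this notebook, so the following is only a sketch of how such a script might read them. The function name `parse_args` and the default value are assumptions.

```python
import argparse
from typing import List, Optional


def parse_args(argv: Optional[List[str]] = None) -> argparse.Namespace:
    """Parse the two arguments that the command job passes to the script."""
    parser = argparse.ArgumentParser()
    # Path to the CSV file that the job mounts or downloads for the input.
    parser.add_argument("--training_data", type=str, required=True)
    # Regularization rate; the default here is an assumption.
    parser.add_argument("--reg_rate", type=float, default=0.01)
    return parser.parse_args(argv)


# Simulate the command line that the job would produce.
args = parse_args(["--training_data", "data/diabetes.csv", "--reg_rate", "0.1"])
print(args.training_data, args.reg_rate)
```

Declaring `--reg_rate` as a float matters for the sweep that follows, because each trial substitutes a different numeric value into the same command.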


#### Hyperparameter tuning: create and run a sweep job

In [6]:

command_job_for_sweep = job(
    reg_rate=Choice(values=[0.01, 0.1, 1]),
)

# apply the sweep parameter to obtain the sweep_job
sweep_job = command_job_for_sweep.sweep(
    compute="aml-cluster",
    sampling_algorithm="grid",
    primary_metric="training_accuracy_score",
    goal="Maximize",
)

# set the name of the sweep job experiment
sweep_job.experiment_name="sweep-diabetes"

# define the limits for this sweep
sweep_job.set_limits(max_total_trials=4, max_concurrent_trials=2, timeout=7200)

returned_sweep_job = ml_client.create_or_update(sweep_job)
aml_url = returned_sweep_job.studio_url
print("Monitor your job at", aml_url)

Monitor your job at https://ml.azure.com/runs/boring_house_mc3qjjkx8g?wsid=/subscriptions/058d8843-d1d3-47ed-bfd3-0a4b79db7001/resourcegroups/rg-dp100-l9216695a5c62456390/workspaces/mlw-dp100-l9216695a5c62456390&tid=49887c6a-e179-452b-a5a2-ba227a9ce9f7
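
To see how the search space and the trial limit interact, the sketch below enumerates the combinations a grid over `reg_rate` would cover and caps them at `max_total_trials`. It is a plain-Python illustration of the idea, not how the service schedules trials.

```python
from itertools import islice, product

# Same discrete values as the Choice above.
search_space = {"reg_rate": [0.01, 0.1, 1]}
max_total_trials = 4

# A grid covers every combination of the discrete values.
names = list(search_space)
all_combinations = [
    dict(zip(names, values))
    for values in product(*(search_space[name] for name in names))
]

# The trial limit only bites when the grid is larger than the limit.
trials = list(islice(all_combinations, max_total_trials))
print(len(all_combinations), "combinations;", len(trials), "trials within the limit")
for trial in trials:
    print(trial)
```

With a single parameter and three values, the grid has three combinations, so a limit of four trials leaves room for all of them.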


In [7]:
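# submit the sweep (note: the previous cell already submitted this sweep job,
# so this call starts a second run with its own name and URL)
returned_sweep_job = ml_client.create_or_update(sweep_job)
# get a URL for the status of the job
returned_sweep_job.services["Studio"].endpoint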

'https://ml.azure.com/runs/loving_owl_k0pbcx5rk3?wsid=/subscriptions/058d8843-d1d3-47ed-bfd3-0a4b79db7001/resourcegroups/rg-dp100-l9216695a5c62456390/workspaces/mlw-dp100-l9216695a5c62456390&tid=49887c6a-e179-452b-a5a2-ba227a9ce9f7'

| Experiment | Name | Type | Status | Details Page |
| --- | --- | --- | --- | --- |
| sweep-diabetes | boring_house_mc3qjjkx8g | sweep | Completed | Link to Azure Machine Learning studio |


In [10]:
# Download best trial model output
#returned_sweep_job1 = ml_client.jobs.get("boring_house_mc3qjjkx8g")
ml_client.jobs.download("boring_house_mc3qjjkx8g", output_name="model")

#### Create a managed online endpoint

In [19]:

# Creating a unique endpoint name by including a random suffix
allowed_chars = string.ascii_lowercase + string.digits
endpoint_suffix = "".join(random.choice(allowed_chars) for x in range(5))
endpoint_name = "diabetes-endpoint-" + endpoint_suffix

print(f"Endpoint name: {endpoint_name}")
# create an online endpoint
endpoint = ManagedOnlineEndpoint(
    name=endpoint_name,
    description="Online endpoint",
    auth_mode="key"
)

ml_client.begin_create_or_update(endpoint).result()

Endpoint name: diabetes-endpoint-k915z


ManagedOnlineEndpoint({'public_network_access': 'Enabled', 'provisioning_state': 'Succeeded', 'scoring_uri': 'https://diabetes-endpoint-k915z.eastus.inference.ml.azure.com/score', 'openapi_uri': 'https://diabetes-endpoint-k915z.eastus.inference.ml.azure.com/swagger.json', 'name': 'diabetes-endpoint-k915z', 'description': 'Online endpoint', 'tags': {}, 'properties': {'azureml.onlineendpointid': '/subscriptions/058d8843-d1d3-47ed-bfd3-0a4b79db7001/resourcegroups/rg-dp100-l9216695a5c62456390/providers/microsoft.machinelearningservices/workspaces/mlw-dp100-l9216695a5c62456390/onlineendpoints/diabetes-endpoint-k915z', 'AzureAsyncOperationUri': 'https://management.azure.com/subscriptions/058d8843-d1d3-47ed-bfd3-0a4b79db7001/providers/Microsoft.MachineLearningServices/locations/eastus/mfeOperationsStatus/oe:8f524591-87cb-4bb1-bd78-2ffcce4ceaaf:5365a1ff-3c6b-40da-8d9d-8e272c3b3b2a?api-version=2022-02-01-preview'}, 'print_as_yaml': False, 'id': '/subscriptions/058d8843-d1d3-47ed-bfd3-0a4b79db70

#### Create the scoring script and environment

In [21]:
%%writefile src/score.py

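import json
import os

import joblib
import numpy as np


# called when the deployment is created or updated
def init():
    global model
    # AZUREML_MODEL_DIR points to the folder where the deployed model is mounted;
    # the file name assumes the model was registered from a single model.pkl file
    model_path = os.path.join(os.getenv("AZUREML_MODEL_DIR"), "model.pkl")
    model = joblib.load(model_path)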

# called when a request is received
def run(raw_data):
    # get the input data as a numpy array
    data = np.array(json.loads(raw_data)['data'])
    # get a prediction from the model
    predictions = model.predict(data)
    # return the predictions as any JSON serializable format
    return predictions.tolist()

Writing src/score.py
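
The `run` function above assumes the request body is JSON with a `data` key holding a list of feature rows. A minimal sketch of checking that shape before calling the model is shown below; the helper name `parse_rows` and the error messages are assumptions, not part of the scoring contract.

```python
import json
from typing import List


def parse_rows(raw_data: str) -> List[list]:
    """Return the rows under "data", rejecting malformed request bodies."""
    payload = json.loads(raw_data)
    if not isinstance(payload, dict) or "data" not in payload:
        raise ValueError('request body must be an object with a "data" key')
    rows = payload["data"]
    if not rows or not all(isinstance(row, list) for row in rows):
        raise ValueError('"data" must be a non-empty list of rows')
    # Every row needs the same number of features to form a 2-D array.
    if len({len(row) for row in rows}) != 1:
        raise ValueError("all rows must have the same number of features")
    return rows


rows = parse_rows('{"data": [[1, 2, 3], [4, 5, 6]]}')
print(len(rows), "rows,", len(rows[0]), "features each")
```

Failing early with a clear message tends to be easier to debug from endpoint logs than an error raised inside `model.predict`.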


In [22]:
%%writefile src/conda.yml

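name: basic-env-cpu
channels:
  - conda-forge
dependencies:
  - python=3.7
  - scikit-learn
  - pandas
  - numpy
  - matplotlib
  - pip
  - pip:
      # inference server used by managed online deployments to host score.py
      - azureml-inference-server-http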


Writing src/conda.yml


In [24]:
env = Environment(
    image="mcr.microsoft.com/azureml/openmpi3.1.2-ubuntu18.04",
    conda_file="./src/conda.yml",
    name="deployment-environment",
    description="Environment created from a Docker image plus Conda environment.",
)
ml_client.environments.create_or_update(env)

Environment({'arm_type': 'environment_version', 'latest_version': None, 'image': 'mcr.microsoft.com/azureml/openmpi3.1.2-ubuntu18.04', 'intellectual_property': None, 'is_anonymous': False, 'auto_increment_version': False, 'auto_delete_setting': None, 'name': 'deployment-environment', 'description': 'Environment created from a Docker image plus Conda environment.', 'tags': {}, 'properties': {'azureml.labels': 'latest'}, 'print_as_yaml': False, 'id': '/subscriptions/058d8843-d1d3-47ed-bfd3-0a4b79db7001/resourceGroups/rg-dp100-l9216695a5c62456390/providers/Microsoft.MachineLearningServices/workspaces/mlw-dp100-l9216695a5c62456390/environments/deployment-environment/versions/1', 'Resource__source_path': '', 'base_path': '/mnt/batch/tasks/shared/LS_root/mounts/clusters/ci9216695a5c62456390/code/Users/prabalmukherjee/azure-ml-labs/Labs/My_Own/Deploy_MLFlowModel', 'creation_context': <azure.ai.ml.entities._system_data.SystemData object at 0x7f0c7cfea550>, 'serialize': <msrest.serialization.Se

In [26]:
# Creating a unique endpoint name by including a random suffix
# (note: this regenerates endpoint_name, so it no longer matches the endpoint created above)
allowed_chars = string.ascii_lowercase + string.digits
endpoint_suffix = "".join(random.choice(allowed_chars) for x in range(5))
endpoint_name = "diabetes-endpoint-" + endpoint_suffix

print(f"Endpoint name: {endpoint_name}")

Endpoint name: diabetes-endpoint-dowuz


#### Deploy the model to the endpoint

In [None]:
model = Model(path="./model/model.pkl", name="diabetes_model")

blue_deployment = ManagedOnlineDeployment(
    name="blue",
    # use the endpoint created earlier; the name must match an existing endpoint
    endpoint_name=endpoint.name,
    model=model,
    environment="deployment-environment",
    code_configuration=CodeConfiguration(
        code="./src", scoring_script="score.py"
    ),
    instance_type="Standard_DS3_v2",
    instance_count=1,
)

ml_client.online_deployments.begin_create_or_update(blue_deployment).result()
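
The notebook lists testing the endpoint as its final step. A minimal sketch of that step follows: it writes a sample request file in the `{"data": [...]}` shape that `score.py` expects and defines a small wrapper around `ml_client.online_endpoints.invoke`. The feature values, the count of eight features, the file name, and the wrapper name `invoke_blue` are all assumptions; the wrapper is defined but not called here, since calling it requires a live endpoint.

```python
import json
import tempfile
from pathlib import Path

# One hypothetical row; the number and order of features must match what the
# model was trained on (eight values here is an assumption).
sample_request = {"data": [[2, 180, 74, 24, 21, 23.9, 1.49, 22]]}

# Write the request to a scratch file, since invoke() takes a file path.
request_path = Path(tempfile.mkdtemp()) / "sample-request.json"
request_path.write_text(json.dumps(sample_request))


def invoke_blue(ml_client, endpoint_name: str, request_file: str) -> str:
    """Send the request file to the 'blue' deployment and return the raw response."""
    return ml_client.online_endpoints.invoke(
        endpoint_name=endpoint_name,
        deployment_name="blue",
        request_file=request_file,
    )


print("Request written to", request_path.name)
```

Once the deployment has finished, something like `invoke_blue(ml_client, endpoint.name, str(request_path))` should return the predictions as a JSON string. Naming the deployment explicitly sends the request to `blue` regardless of how endpoint traffic is allocated.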