# 1 - Business Problem

- We use the [Heart Disease Data Set](https://archive.ics.uci.edu/ml/datasets/heart+disease) from the UCI Machine Learning Repository.
- The goal is to distinguish the presence of heart disease (values 1, 2, 3, 4) from its absence (value 0).
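The presence/absence framing can be sketched as a simple label mapping. This is a minimal illustration on plain Python lists; the helper name `binarize_target` and the sample codes are hypothetical and are not taken from the notebook.

```python
def binarize_target(values):
    """Map diagnosis codes 0-4 to 0 (absence) or 1 (presence)."""
    return [1 if value > 0 else 0 for value in values]


# Hypothetical sample of original diagnosis codes
original_codes = [0, 2, 1, 0, 4, 3]
binary_target = binarize_target(original_codes)
print(binary_target)
```

Any non-zero code collapses to the positive class, which turns the task into binary classification.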

# 2 - Importing required libraries

In [1]:
pip install xgboost

Note: you may need to restart the kernel to use updated packages.


In [2]:
pip show xgboost

Name: xgboost
Version: 2.0.3
Summary: XGBoost Python Package
Home-page: 
Author: 
Author-email: Hyunsu Cho <chohyu01@cs.washington.edu>, Jiaming Yuan <jm.yuan@outlook.com>
License: Apache-2.0
Location: /anaconda/envs/azureml_py310_sdkv2/lib/python3.10/site-packages
Requires: numpy, scipy
Required-by: 
Note: you may need to restart the kernel to use updated packages.


In [3]:
pip install --upgrade xgboost

Note: you may need to restart the kernel to use updated packages.


In [5]:
pip install mlflow

Note: you may need to restart the kernel to use updated packages.


In [6]:
pip show mlflow

Name: mlflow
Version: 2.13.0
Summary: MLflow is an open source platform for the complete machine learning lifecycle
Home-page: 
Author: 
Author-email: 
License: Copyright 2018 Databricks, Inc.  All rights reserved.

In [7]:
pip install --upgrade mlflow

Note: you may need to restart the kernel to use updated packages.


In [None]:
# Restart the kernel here so that the newly installed packages are picked up

In [1]:
import xgboost
import mlflow
from mlflow.tracking.client import MlflowClient
import sklearn
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [2]:
import warnings
warnings.simplefilter("ignore")

# 3 - Configuring the experiment

In [3]:
import mlflow
mlflow.set_experiment(experiment_name="heart-condition-classifier")

<Experiment: artifact_location='', creation_time=1716321271009, experiment_id='d5ea7a5a-307f-471c-acc1-9a9887d19cc7', last_update_time=None, lifecycle_stage='active', name='heart-condition-classifier', tags={}>

# 4 - Exploring data

In [4]:
import pandas as pd

In [5]:
file_url = "https://azuremlexampledata.blob.core.windows.net/data/heart-disease-uci/data/heart.csv"
df = pd.read_csv(file_url)

In [6]:
df.head()

Unnamed: 0,age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,target
0,63,1,1,145,233,1,2,150,0,2.3,3,0,fixed,0
1,67,1,4,160,286,0,2,108,1,1.5,2,3,normal,1
2,67,1,4,120,229,0,2,129,1,2.6,2,2,reversible,0
3,37,1,3,130,250,0,0,187,0,3.5,3,0,normal,0
4,41,0,2,130,204,0,2,172,0,1.4,1,0,normal,0


In [7]:
df.shape

(303, 14)

In [8]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 303 entries, 0 to 302
Data columns (total 14 columns):
 #   Column    Non-Null Count  Dtype  
---  ------    --------------  -----  
 0   age       303 non-null    int64  
 1   sex       303 non-null    int64  
 2   cp        303 non-null    int64  
 3   trestbps  303 non-null    int64  
 4   chol      303 non-null    int64  
 5   fbs       303 non-null    int64  
 6   restecg   303 non-null    int64  
 7   thalach   303 non-null    int64  
 8   exang     303 non-null    int64  
 9   oldpeak   303 non-null    float64
 10  slope     303 non-null    int64  
 11  ca        303 non-null    int64  
 12  thal      303 non-null    object 
 13  target    303 non-null    int64  
dtypes: float64(1), int64(12), object(1)
memory usage: 33.3+ KB


In [9]:
df['thal'].nunique()

5

In [10]:
df['thal'].unique()

array(['fixed', 'normal', 'reversible', '1', '2'], dtype=object)

In [11]:
df['thal'] = df['thal'].astype("category").cat.codes

In [12]:
df['thal'].unique()

array([2, 3, 4, 0, 1], dtype=int8)
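To make the encoding above easier to read, here is a pure-Python sketch of the mapping that `astype("category").cat.codes` produces for string values: the unique values are sorted and numbered from 0. The helper `category_codes` is hypothetical and only mimics the pandas behavior for this simple case.

```python
def category_codes(values):
    """Mimic Series.astype("category").cat.codes for plain strings."""
    categories = sorted(set(values))
    lookup = {category: code for code, category in enumerate(categories)}
    return [lookup[value] for value in values], lookup


codes, lookup = category_codes(["fixed", "normal", "reversible", "1", "2"])
print(lookup)
print(codes)
```

Note that the stray values `'1'` and `'2'` become categories of their own, so they may deserve a closer look before training.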

In [13]:
# split dataset into train and test set

from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(df.drop("target", axis=1), df["target"], test_size=0.3)
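The split above is not seeded, so each execution yields a different partition and slightly different metrics. In scikit-learn this is usually addressed by passing `random_state` to `train_test_split`. The idea can be sketched with the standard library alone; `seeded_split` is a hypothetical helper, and rounding the test size is a simplification.

```python
import random


def seeded_split(rows, test_size=0.3, seed=42):
    """Shuffle row indices with a fixed seed and split them into train/test."""
    rng = random.Random(seed)
    indices = list(range(len(rows)))
    rng.shuffle(indices)
    n_test = round(len(rows) * test_size)
    test_indices = set(indices[:n_test])
    train = [row for i, row in enumerate(rows) if i not in test_indices]
    test = [row for i, row in enumerate(rows) if i in test_indices]
    return train, test


rows = list(range(303))  # stand-in for the 303 rows of the dataset
train, test = seeded_split(rows)
print(len(train), len(test))
```

Calling `seeded_split` again with the same seed returns the same partition, which makes reported metrics reproducible.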

# 5 - Training the model

In [14]:
mlflow.xgboost.autolog()

In [15]:
from xgboost import XGBClassifier
# use_label_encoder is deprecated in recent XGBoost releases and is omitted here
model = XGBClassifier(eval_metric="logloss")

In [16]:
run = mlflow.start_run()

In [17]:
model.fit(X_train, y_train, eval_set=[(X_test, y_test)], verbose=False)

# 6 - Logging extra metrics

In [18]:
y_pred = model.predict(X_test)



In [19]:
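from sklearn.metrics import accuracy_score, recall_score

accuracy = accuracy_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)

# Log the extra metrics to the active MLflow run
mlflow.log_metrics({"accuracy": accuracy, "recall": recall})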



In [20]:
print("Accuracy: %.2f%%" % (accuracy * 100.0))
print("Recall: %.2f%%" % (recall * 100.0))

Accuracy: 80.22%
Recall: 62.96%
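For reference, both metrics follow directly from the confusion-matrix counts. The sketch below uses hypothetical counts (17 true positives, 10 false negatives, 8 false positives, 56 true negatives) chosen only because they are consistent with the percentages printed above; the actual counts of this run are not shown in the notebook.

```python
def accuracy_and_recall(y_true, y_pred):
    """Compute accuracy and recall for binary labels (1 = positive class)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    recall = tp / (tp + fn)
    return accuracy, recall


# Hypothetical labels: 27 positives (17 found, 10 missed), 64 negatives (8 false alarms)
y_true_demo = [1] * 27 + [0] * 64
y_pred_demo = [1] * 17 + [0] * 10 + [1] * 8 + [0] * 56

accuracy_demo, recall_demo = accuracy_and_recall(y_true_demo, y_pred_demo)
print(f"Accuracy: {accuracy_demo:.2%}")
print(f"Recall: {recall_demo:.2%}")
```

A recall well below the accuracy indicates that the model misses a sizeable share of the positive cases, which matters for a screening task.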


In [21]:
mlflow.end_run()

# 7 - Register Model

In [22]:
# Ensure you have the dependencies for this notebook
%pip install -r model_management.txt

Note: you may need to restart the kernel to use updated packages.


In [23]:
# Define naming conventions

experiment_name = "heart-condition-classifier"
model_name = "heart-classifier"
artifact_path = "model"

In [24]:
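# Search for the most recent run of the experiment
# (sort explicitly by start time so the first result is the latest run)

exp = mlflow.get_experiment_by_name(experiment_name)
last_run = mlflow.search_runs(
    [exp.experiment_id],
    order_by=["start_time DESC"],
    max_results=1,
    output_format="list",
)[0]
print(last_run.info.run_id)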

be501b18-92a7-43ab-8bf1-de0076ca6d08


In [25]:
# Register the model logged by the run in the MLflow model registry

mlflow.register_model(f"runs:/{last_run.info.run_id}/{artifact_path}", model_name)

Registered model 'heart-classifier' already exists. Creating a new version of this model...
2024/05/22 18:39:45 INFO mlflow.store.model_registry.abstract_store: Waiting up to 300 seconds for model version to finish creation. Model name: heart-classifier, version 2
Created version '2' of model 'heart-classifier'.


<ModelVersion: aliases=[], creation_timestamp=1716403185137, current_stage='None', description='', last_updated_timestamp=1716403185137, name='heart-classifier', run_id='be501b18-92a7-43ab-8bf1-de0076ca6d08', run_link='', source='azureml://westus.api.azureml.ms/mlflow/v2.0/subscriptions/3b57d2fe-08b1-4fe9-b535-f5c4387b9a66/resourceGroups/mlflow-rg98/providers/Microsoft.MachineLearningServices/workspaces/mlflow-ws98/experiments/d5ea7a5a-307f-471c-acc1-9a9887d19cc7/runs/be501b18-92a7-43ab-8bf1-de0076ca6d08/artifacts/model', status='READY', status_message='', tags={}, user_id='', version='2'>
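Two URI schemes appear in this notebook: `runs:/<run_id>/<artifact_path>` points at an artifact logged by a run, and `models:/<name>/<version>` points at a registered model version. The small helpers below are hypothetical and only illustrate how the strings are assembled.

```python
def run_model_uri(run_id, artifact_path):
    """URI of a model artifact logged under a specific run."""
    return f"runs:/{run_id}/{artifact_path}"


def registered_model_uri(name, version):
    """URI of a specific version of a registered model."""
    return f"models:/{name}/{version}"


source_uri = run_model_uri("be501b18-92a7-43ab-8bf1-de0076ca6d08", "model")
deploy_uri = registered_model_uri("heart-classifier", 2)
print(source_uri)
print(deploy_uri)
```

The first form is what gets registered; the second form is what a deployment refers to once the version exists.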

In [26]:
# check if the model is registered

mlflow_client = MlflowClient()
model_versions = mlflow_client.search_model_versions(filter_string=f"name = '{model_name}'")

In [27]:
model_versions

[<ModelVersion: aliases=[], creation_timestamp=1716403185137, current_stage='None', description='', last_updated_timestamp=1716403185137, name='heart-classifier', run_id='be501b18-92a7-43ab-8bf1-de0076ca6d08', run_link='', source='azureml://westus.api.azureml.ms/mlflow/v2.0/subscriptions/3b57d2fe-08b1-4fe9-b535-f5c4387b9a66/resourceGroups/mlflow-rg98/providers/Microsoft.MachineLearningServices/workspaces/mlflow-ws98/experiments/d5ea7a5a-307f-471c-acc1-9a9887d19cc7/runs/be501b18-92a7-43ab-8bf1-de0076ca6d08/artifacts/model', status='READY', status_message='', tags={}, user_id='', version='2'>]

In [28]:
# Use the latest registered version; register the model if no version exists

if model_versions:
    version = max(int(mv.version) for mv in model_versions)
else:
    registered_model = mlflow_client.create_model_version(
        name=model_name, source=f"file://{artifact_path}"
    )
    version = registered_model.version

In [29]:
print(f"We are going to deploy model {model_name} with version {version}")

We are going to deploy model heart-classifier with version 2


# 8 - Create an Online Endpoint

### a) Configure the endpoint

In [30]:
import random
import string

# Creating a unique endpoint name by including a random suffix
allowed_chars = string.ascii_lowercase + string.digits
endpoint_suffix = "".join(random.choice(allowed_chars) for x in range(5))
endpoint_name = "heart-classifier-" + endpoint_suffix

print(f"Endpoint name: {endpoint_name}")

Endpoint name: heart-classifier-wnr0u
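A slightly more defensive variant of the name generator might validate the result before it is sent to the service. The naming rule encoded in `NAME_PATTERN` (starts with a letter, letters, digits and hyphens only, 3 to 32 characters) is an assumption made for this sketch and should be checked against the Azure Machine Learning documentation; `make_endpoint_name` is a hypothetical helper.

```python
import re
import secrets
import string

# Assumed naming rule, not taken from the notebook
NAME_PATTERN = re.compile(r"^[A-Za-z][A-Za-z0-9-]{2,31}$")


def make_endpoint_name(prefix, suffix_length=5):
    """Append a random lowercase/digit suffix and validate the resulting name."""
    alphabet = string.ascii_lowercase + string.digits
    suffix = "".join(secrets.choice(alphabet) for _ in range(suffix_length))
    name = f"{prefix}-{suffix}"
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError(f"endpoint name {name!r} violates the assumed naming rule")
    return name


demo_endpoint_name = make_endpoint_name("heart-classifier")
print(demo_endpoint_name)
```

Failing early on an invalid name is cheaper than waiting for the endpoint creation call to be rejected.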


### b) Create an Online Endpoint

In [31]:
# create an MLflow deployment client for Azure Machine Learning

from mlflow.deployments import get_deploy_client
deployment_client = get_deploy_client(mlflow.get_tracking_uri())

In [32]:
# create the endpoint with basic configuration

endpoint = deployment_client.create_endpoint(endpoint_name)

In [33]:
#  get the scoring URI from the endpoint

scoring_uri = deployment_client.get_endpoint(endpoint=endpoint_name)["properties"]["scoringUri"]
print(scoring_uri)

https://heart-classifier-wnr0u.westus.inference.ml.azure.com/score


### c) Create a deployment

In [34]:
deployment_name = "default"

In [35]:
deploy_config = {
    "instance_type": "Standard_DS2_v2",
    "instance_count": 1,
}

In [36]:
import json

deployment_config_path = "deployment_config.json"

with open(deployment_config_path, "w") as outfile:
    outfile.write(json.dumps(deploy_config))

In [37]:
deployment = deployment_client.create_deployment(
    name=deployment_name,
    endpoint=endpoint_name,
    model_uri=f"models:/{model_name}/{version}",
    config={"deploy-config-file": deployment_config_path},
)

.................................................................................................................

### d) Assign all the traffic to the created deployment

In [38]:
# Traffic Routing Configuration

traffic_config = {"traffic": {deployment_name: 100}}

In [39]:
traffic_config_path = "traffic_config.json"

with open(traffic_config_path, "w") as outfile:
    outfile.write(json.dumps(traffic_config))
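When more than one deployment sits behind an endpoint, the traffic shares have to be split between them. The sketch below checks a traffic mapping before it is written to disk. It assumes that shares are integers that sum to 100; the deployment name `candidate` and the helper `validate_traffic` are hypothetical.

```python
import json


def validate_traffic(traffic):
    """Check each share is an integer in 0..100 and that the shares total 100."""
    for name, share in traffic.items():
        if not isinstance(share, int) or not 0 <= share <= 100:
            raise ValueError(f"invalid share for {name!r}: {share!r}")
    total = sum(traffic.values())
    if total != 100:
        raise ValueError(f"traffic shares sum to {total}, expected 100")
    return {"traffic": dict(traffic)}


single = validate_traffic({"default": 100})
# Hypothetical canary rollout: 10% of requests go to a second deployment
canary = validate_traffic({"default": 90, "candidate": 10})
print(json.dumps(canary))
```

The returned dictionary has the same shape as `traffic_config`, so it could be serialized to the JSON file in the same way.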

In [40]:
deployment_client.update_endpoint(
    endpoint=endpoint_name,
    config={"endpoint-config-file": traffic_config_path},
)

{'id': '/subscriptions/3b57d2fe-08b1-4fe9-b535-f5c4387b9a66/resourceGroups/mlflow-rg98/providers/Microsoft.MachineLearningServices/workspaces/mlflow-ws98/onlineEndpoints/heart-classifier-wnr0u',
 'name': 'heart-classifier-wnr0u',
 'type': 'Microsoft.MachineLearningServices/workspaces/onlineEndpoints',
 'systemData': {'createdBy': 'Vijay Gadhave',
  'createdAt': '2024-05-22T18:40:18.521869Z',
  'lastModifiedAt': '2024-05-22T18:40:18.521869Z'},
 'tags': {},
 'location': 'westus',
 'identity': {'principalId': '38a22f59-7250-4483-82fa-77cce4d95c09',
  'tenantId': '843e1e65-e875-4fbd-8ddc-ce696f3a551d',
  'type': 'SystemAssigned'},
 'kind': 'Managed',
 'properties': {'authMode': 'AMLToken',
  'properties': {'azureml.mlflow_client_endpoint': 'True',
   'azureml.onlineendpointid': '/subscriptions/3b57d2fe-08b1-4fe9-b535-f5c4387b9a66/resourcegroups/mlflow-rg98/providers/microsoft.machinelearningservices/workspaces/mlflow-ws98/onlineendpoints/heart-classifier-wnr0u',
   'AzureAsyncOperationUri'

# 9 - Test the deployment

In [41]:
df = pd.read_csv("data/heart.csv")
df

Unnamed: 0,age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal,target
0,67,1,4,120,229,0,2,129,1,2.6,2,2,4,0
1,41,0,2,130,204,0,2,172,0,1.4,1,0,3,0
2,62,0,4,140,268,0,2,160,0,3.6,3,2,3,1
3,63,1,4,130,254,0,2,147,0,1.4,2,1,4,1
4,57,1,4,140,192,0,0,148,0,0.4,2,0,2,0


In [42]:
df.drop(columns=['target'], inplace=True)
df

Unnamed: 0,age,sex,cp,trestbps,chol,fbs,restecg,thalach,exang,oldpeak,slope,ca,thal
0,67,1,4,120,229,0,2,129,1,2.6,2,2,4
1,41,0,2,130,204,0,2,172,0,1.4,1,0,3
2,62,0,4,140,268,0,2,160,0,3.6,3,2,3
3,63,1,4,130,254,0,2,147,0,1.4,2,1,4
4,57,1,4,140,192,0,0,148,0,0.4,2,0,2


In [43]:
deployment_client.predict(endpoint=endpoint_name, df=df)

array([1, 0, 1, 1, 0])

# 10 - Making REST requests

In [44]:
# Imports

from azure.ai.ml import MLClient
from azure.identity import DefaultAzureCredential
import json
import requests

In [45]:
# Azure Machine Learning (AML) Configuration

subscription_id = "3b57d2fe-08b1-4fe9-b535-f5c4387b9a66"
resource_group = "mlflow-rg98"
workspace = "mlflow-ws98"

ml_client = MLClient(DefaultAzureCredential(), subscription_id, resource_group, workspace)

In [46]:
# Retrieve the endpoint access token (the endpoint uses AMLToken authentication)
endpoint_secret_key = ml_client.online_endpoints.get_keys(name=endpoint_name).access_token

In [47]:
# Preparing the Request for Prediction

headers = {
    "Content-Type": "application/json",
    "Authorization": ("Bearer " + endpoint_secret_key),
    "azureml-model-deployment": "default",
}

sample_request = {
    "input_data": json.loads(df.to_json(orient="split", index=False))
}
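It can help to see the shape of the payload that `df.to_json(orient="split", index=False)` produces: a `columns` list and a `data` list of rows. The sketch below builds the same structure by hand for the first test row; `build_request` is a hypothetical helper, not part of any library.

```python
import json

columns = ["age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
           "thalach", "exang", "oldpeak", "slope", "ca", "thal"]
rows = [[67, 1, 4, 120, 229, 0, 2, 129, 1, 2.6, 2, 2, 4]]


def build_request(columns, rows):
    """Wrap column names and rows in the 'input_data' envelope used above."""
    for row in rows:
        if len(row) != len(columns):
            raise ValueError("row length does not match the number of columns")
    return {"input_data": {"columns": list(columns), "data": [list(r) for r in rows]}}


demo_request = build_request(columns, rows)
print(json.dumps(demo_request, indent=2))
```

Checking the row length up front catches a missing feature before the request reaches the endpoint.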

In [48]:
# Sending the Prediction Request (Method 1: Using requests library)

import requests
req = requests.post(scoring_uri, json=sample_request, headers=headers)
req.json()

[1, 0, 1, 1, 0]
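The same request can also be prepared with the standard library only, which is handy where `requests` is not installed. This is a sketch under assumptions: the token is a placeholder, the payload is abbreviated to a single column, and `build_scoring_request` is a hypothetical helper. The request is constructed but not sent.

```python
import json
import urllib.request


def build_scoring_request(scoring_uri, token, payload, deployment=None):
    """Build (but do not send) a POST request for the scoring endpoint."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {token}",
    }
    if deployment is not None:
        # Optional header that routes the call to a named deployment
        headers["azureml-model-deployment"] = deployment
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(scoring_uri, data=body, headers=headers, method="POST")


demo_http_request = build_scoring_request(
    "https://heart-classifier-wnr0u.westus.inference.ml.azure.com/score",
    token="<access-token>",  # placeholder, not a real credential
    payload={"input_data": {"columns": ["age"], "data": [[67]]}},  # abbreviated
    deployment="default",
)
print(demo_http_request.get_method(), demo_http_request.full_url)
```

Sending it would be a matter of passing the object to `urllib.request.urlopen` with a timeout and decoding the JSON response.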

In [49]:
# Sending the Prediction Request (Method 2: Using shell command)

with open("sample.json", "w") as f:
    f.write(json.dumps(sample_request))

authentication_header = f"'Authorization: Bearer {endpoint_secret_key}'"

!cat sample.json | curl $scoring_uri \
                        --request POST \
                        --header 'Content-Type: application/json' \
                        --header $authentication_header \
                        --data-binary @-

[1, 0, 1, 1, 0]