# Data Science in the Cloud: The "Azure ML SDK" way

**Introduction**
In this notebook, we will learn how to use the Azure ML SDK to train, deploy and consume a model through Azure ML.

*Pre-requisites:*
1. You have created an Azure ML workspace.
2. You have loaded the Heart Failure dataset into Azure ML.
3. You have uploaded this notebook into Azure ML Studio.

This notebook walks through the following steps:

1. Create an Experiment in an existing Workspace.
2. Create a Compute cluster.
3. Load the dataset.
4. Configure AutoML using AutoMLConfig.
5. Run the AutoML experiment.
6. Explore the results and get the best model.
7. Register the best model.
8. Deploy the best model.
9. Consume the endpoint.

# Azure Machine Learning SDK-specific imports

In [1]:
from azureml.core import Workspace, Experiment
from azureml.core.compute import AmlCompute
from azureml.train.automl import AutoMLConfig
from azureml.widgets import RunDetails
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import AciWebservice

# Initialize workspace

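Initialize a workspace object from the persisted configuration. `Workspace.from_config()` reads the subscription, resource group and workspace name from `config.json`, so make sure the file is present at `.\config.json`.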

In [2]:
ws = Workspace.from_config()
print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\n')

mesdedatos-aml
mesdedatosrg
westeurope
a0c759b9-5d15-4a68-b18d-708855e1e11c
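
If `Workspace.from_config()` fails, the cause is often a missing or incomplete `config.json`. The following is a minimal, SDK-free sketch that checks the file before you try to connect. `check_workspace_config` is a hypothetical helper, not part of the Azure ML SDK, and it assumes the three keys found in the `config.json` downloaded from the Azure portal.

```python
import json
from pathlib import Path

# Keys assumed to be present in the config.json downloaded from the Azure portal.
REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")


def check_workspace_config(path="config.json"):
    """Return the parsed config, or raise if the file or a required key is missing."""
    config_path = Path(path)
    if not config_path.is_file():
        raise FileNotFoundError(f"No workspace config found at {config_path}")
    config = json.loads(config_path.read_text(encoding="utf-8"))
    # Treat empty values the same as missing keys.
    missing = [key for key in REQUIRED_KEYS if not config.get(key)]
    if missing:
        raise KeyError(f"config.json is missing: {', '.join(missing)}")
    return config
```

Calling `check_workspace_config()` right before `Workspace.from_config()` turns a vague connection error into a message that names the missing key.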


# Create an Azure ML experiment

Let's create an experiment named 'aml-experiment' in the workspace we just initialized.

In [3]:
experiment_name = 'aml-experiment'
experiment = Experiment(ws, experiment_name)
experiment

| Name | Workspace | Report Page | Docs Page |
| --- | --- | --- | --- |
| aml-experiment | mesdedatos-aml | Link to Azure Machine Learning studio | Link to Documentation |


# Create a Compute cluster

You will need to create (or retrieve) a compute target for your AutoML run.

In [4]:
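from azureml.core.compute_target import ComputeTargetException

aml_name = "mesdedatos-vm"
try:
    # Reuse the cluster if it already exists in the workspace.
    aml_compute = AmlCompute(ws, aml_name)
    print('Found existing AML compute context.')
except ComputeTargetException:
    # The cluster does not exist yet, so provision it.
    print('Creating new AML compute context.')
    # min_nodes=1 keeps one node allocated even when the cluster is idle.
    aml_config = AmlCompute.provisioning_configuration(vm_size = "Standard_D2_v2", min_nodes=1, max_nodes=3)
    aml_compute = AmlCompute.create(ws, name = aml_name, provisioning_configuration = aml_config)
    aml_compute.wait_for_completion(show_output = True)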

cts = ws.compute_targets
compute_target = cts[aml_name]

Found existing AML compute context.
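
The cell above follows a get-or-create pattern: look the cluster up, and provision it only when the lookup fails. The sketch below shows the same pattern without the SDK, using a plain dictionary as a stand-in for the workspace. `get_or_create` and `fake_workspace` are illustrative names, not SDK objects. Catching only the "not found" exception (`ComputeTargetException` in the SDK, `KeyError` here) keeps unrelated failures, such as authentication errors, visible.

```python
def get_or_create(lookup, create, not_found):
    """Return (resource, created) using a look-up-then-create strategy."""
    try:
        return lookup(), False
    except not_found:
        # Only the "does not exist" error triggers provisioning;
        # any other exception propagates to the caller.
        return create(), True


# Stand-in for a workspace: a plain dict of compute targets (illustrative only).
fake_workspace = {"existing-cluster": "cluster-object"}

found, created = get_or_create(
    lookup=lambda: fake_workspace["existing-cluster"],
    create=lambda: "new-cluster-object",
    not_found=KeyError,
)
print(found, created)  # prints: cluster-object False
```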


# Data

Make sure the dataset is registered in your Azure ML workspace and that `key` matches the name it was registered under.

In [5]:
key = 'heart-failure-records'
dataset = ws.datasets[key]
df = dataset.to_pandas_dataframe()
df.describe()

| | age | creatinine_phosphokinase | ejection_fraction | platelets | serum_creatinine | serum_sodium | time |
| --- | --- | --- | --- | --- | --- | --- | --- |
| count | 299.0 | 299.0 | 299.0 | 299.0 | 299.0 | 299.0 | 299.0 |
| mean | 60.833893 | 581.839465 | 38.083612 | 263358.029264 | 1.39388 | 136.625418 | 130.26087 |
| std | 11.894809 | 970.287881 | 11.834841 | 97804.236869 | 1.03451 | 4.412477 | 77.614208 |
| min | 40.0 | 23.0 | 14.0 | 25100.0 | 0.5 | 113.0 | 4.0 |
| 25% | 51.0 | 116.5 | 30.0 | 212500.0 | 0.9 | 134.0 | 73.0 |
| 50% | 60.0 | 250.0 | 38.0 | 262000.0 | 1.1 | 137.0 | 115.0 |
| 75% | 70.0 | 582.0 | 45.0 | 303500.0 | 1.4 | 140.0 | 203.0 |
| max | 95.0 | 7861.0 | 80.0 | 850000.0 | 9.4 | 148.0 | 285.0 |


# AutoML Configuration

In [6]:
automl_settings = {
    "experiment_timeout_minutes": 20,   # stop the whole experiment after 20 minutes
    "max_concurrent_iterations": 3,     # matches max_nodes of the compute cluster
    "primary_metric": "AUC_weighted"    # metric AutoML optimizes when ranking models
}

automl_config = AutoMLConfig(compute_target=compute_target,
                             task="classification",
                             training_data=dataset,
                             label_column_name="DEATH_EVENT",
                             enable_early_stopping=True,
                             featurization="auto",
                             debug_log="automl_errors.log",
                             **automl_settings
                            )
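
Before submitting a run, it can be worth checking the settings against the cluster you created. The sketch below is a hypothetical pre-flight check, not an SDK feature: `check_automl_settings` is an illustrative name, the metric list reflects the classification primary metrics documented for `AutoMLConfig` at the time of writing, and the rule that concurrency should not exceed the cluster's `max_nodes` is taken from the same documentation.

```python
# Primary metrics assumed to be valid for classification tasks in AutoML.
CLASSIFICATION_METRICS = {
    "accuracy",
    "AUC_weighted",
    "average_precision_score_weighted",
    "norm_macro_recall",
    "precision_score_weighted",
}


def check_automl_settings(settings, max_nodes):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    metric = settings.get("primary_metric")
    if metric not in CLASSIFICATION_METRICS:
        problems.append(f"unknown classification metric: {metric!r}")
    concurrency = settings.get("max_concurrent_iterations", 1)
    if concurrency > max_nodes:
        problems.append(
            f"max_concurrent_iterations={concurrency} exceeds max_nodes={max_nodes}"
        )
    return problems


# Same values as the settings used in this notebook.
example_settings = {
    "experiment_timeout_minutes": 20,
    "max_concurrent_iterations": 3,
    "primary_metric": "AUC_weighted",
}
print(check_automl_settings(example_settings, max_nodes=3))  # prints []
```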

# AutoML Run

In [7]:
remote_run = experiment.submit(automl_config)

Submitting remote run.


| Experiment | Id | Type | Status | Details Page | Docs Page |
| --- | --- | --- | --- | --- | --- |
| aml-experiment | AutoML_0d699f45-e7b2-4248-8bc5-507c4e5b9135 | automl | NotStarted | Link to Azure Machine Learning studio | Link to Documentation |


In [8]:
RunDetails(remote_run).show()

_AutoMLWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': False, 'log_level': 'INFO', 's…

_UserRunWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': False, 'log_level': 'INFO', '…

# Save the best model

In [9]:
best_run, fitted_model = remote_run.get_output()

In [10]:
best_run.get_properties()

{'runTemplate': 'automl_child',
 'pipeline_id': '__AutoML_Ensemble__',
 'pipeline_spec': '{"pipeline_id":"__AutoML_Ensemble__","objects":[{"module":"azureml.train.automl.ensemble","class_name":"Ensemble","spec_class":"sklearn","param_args":[],"param_kwargs":{"automl_settings":"{\'task_type\':\'classification\',\'primary_metric\':\'AUC_weighted\',\'verbosity\':20,\'ensemble_iterations\':15,\'is_timeseries\':False,\'name\':\'aml-experiment\',\'compute_target\':\'mesdedatos-vm\',\'subscription_id\':\'a0c759b9-5d15-4a68-b18d-708855e1e11c\',\'region\':\'westeurope\',\'spark_service\':None}","ensemble_run_id":"AutoML_0d699f45-e7b2-4248-8bc5-507c4e5b9135_44","experiment_name":"aml-experiment","workspace_name":"mesdedatos-aml","subscription_id":"a0c759b9-5d15-4a68-b18d-708855e1e11c","resource_group_name":"mesdedatosrg"}}]}',
 'training_percent': '100',
 'predicted_cost': None,
 'iteration': '44',
 '_aml_system_scenario_identification': 'Remote.Child',
 '_azureml.ComputeTargetType': 'amlctrain'

In [18]:
model_name = best_run.properties['model_name']
script_file_name = 'inference/score.py'
model_path = 'outputs/model.pkl'

# Download the AutoML-generated scoring script and the serialized model from the best run.
best_run.download_file('outputs/scoring_file_v_1_0_0.py', script_file_name)
best_run.download_file(model_path, model_path)

# Register the model in the workspace so it can be deployed.
description = "aml heart failure project sdk"
model = best_run.register_model(model_name = model_name,
                                description = description,
                                model_path = model_path,
                                tags = None)

# Deploy the Best Model

Run the following code to deploy the best model to Azure Container Instances (ACI). You can follow the state of the deployment in Azure ML studio. This step can take several minutes.

In [19]:
inference_config = InferenceConfig(entry_script=script_file_name, environment=best_run.get_environment())

aciconfig = AciWebservice.deploy_configuration(cpu_cores = 1,
                                               memory_gb = 1,
                                               tags = {'type': "automl-heart-failure-prediction"},
                                               description = 'Sample service for AutoML Heart Failure Prediction')

aci_service_name = 'automl-hf-sdk'
aci_service = Model.deploy(ws, aci_service_name, [model], inference_config, aciconfig)
aci_service.wait_for_deployment(True)
print(aci_service.state)



Tips: You can try get_logs(): https://aka.ms/debugimage#dockerlog or local deployment: https://aka.ms/debugimage#debug-locally to debug if deployment takes longer than 10 minutes.
Running
2023-03-15 16:19:24+00:00 Creating Container Registry if not exists..
2023-03-15 16:29:24+00:00 Registering the environment.
2023-03-15 16:29:25+00:00 Use the existing image..
2023-03-15 16:29:26+00:00 Submitting deployment to compute..
2023-03-15 16:29:35+00:00 Checking the status of deployment automl-hf-sdk..
2023-03-15 16:31:51+00:00 Checking the status of inference endpoint automl-hf-sdk.
Succeeded
ACI service creation operation finished, operation "Succeeded"
Healthy


# Consume the Endpoint

The sample below contains a single record. To score several patients in one request, append more dictionaries with the same keys to the `data` list.

In [21]:
import json

data = {
    "data":
    [
        {
            'age': "60",
            'anaemia': "false",
            'creatinine_phosphokinase': "500",
            'diabetes': "false",
            'ejection_fraction': "38",
            'high_blood_pressure': "false",
            'platelets': "260000",
            'serum_creatinine': "1.40",
            'serum_sodium': "137",
            'sex': "false",
            'smoking': "false",
            'time': "130",
        },
    ],
}

test_sample = str.encode(json.dumps(data))
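
When you assemble requests by hand, a misspelled or missing feature name is easy to overlook. The sketch below builds the same `{"data": [...]}` body while checking each record against the feature names used in the sample above. `build_payload` and `FEATURES` are hypothetical names introduced for illustration, and the check assumes the endpoint expects exactly these twelve columns.

```python
import json

# Feature names copied from the input sample above.
FEATURES = (
    "age", "anaemia", "creatinine_phosphokinase", "diabetes",
    "ejection_fraction", "high_blood_pressure", "platelets",
    "serum_creatinine", "serum_sodium", "sex", "smoking", "time",
)


def build_payload(records):
    """Serialize records to the {"data": [...]} request body, as UTF-8 bytes."""
    records = list(records)
    for index, record in enumerate(records):
        missing = [name for name in FEATURES if name not in record]
        extra = [name for name in record if name not in FEATURES]
        if missing or extra:
            raise ValueError(
                f"record {index}: missing={missing}, unexpected={extra}"
            )
    return json.dumps({"data": records}).encode("utf-8")


# Two placeholder records, only to show the shape of the call.
placeholder = dict.fromkeys(FEATURES, "0")
payload = build_payload([placeholder, placeholder])
print(len(json.loads(payload)["data"]))  # prints 2
```

The returned bytes can be passed to `aci_service.run(input_data=...)` in the same way as `test_sample`.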

In [22]:
response = aci_service.run(input_data=test_sample)
response

'{"result": [false]}'
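
The service returns its prediction as a JSON string, so it needs decoding before you can use it in Python. The following is a minimal sketch, assuming the response keeps the `{"result": [...]}` shape shown above with one entry per input record. `parse_predictions` is a hypothetical helper, not part of the SDK.

```python
import json


def parse_predictions(response):
    """Return the list of predictions from a scoring response."""
    # Accept either the raw JSON text or an already-decoded dictionary.
    body = json.loads(response) if isinstance(response, (str, bytes)) else response
    if not isinstance(body, dict) or "result" not in body:
        raise ValueError(f"unexpected response: {body!r}")
    return body["result"]


predictions = parse_predictions('{"result": [false]}')
print(predictions)  # prints [False]
```

Each entry corresponds to one record in the request, in the same order, and holds the predicted value of the `DEATH_EVENT` label.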