# Data Science in the Cloud: The "Azure ML SDK" way 

## Introduction

In this notebook, we will learn how to use the Azure ML SDK to train, deploy and consume a model through Azure ML.

Pre-requisites:
1. You created an Azure ML workspace.
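2. You loaded the dataset into Azure ML. The cells below use an email spam dataset registered as `UdacityPrjEmailSpamDataSet`; substitute your own registered name, for example if you use the [Heart Failure dataset](https://www.kaggle.com/andrewmvd/heart-failure-clinical-data) instead.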
3. You uploaded this notebook into Azure ML Studio.

The next steps are:

1. Create an Experiment in an existing Workspace.
2. Create a Compute cluster.
3. Load the dataset.
4. Configure AutoML using AutoMLConfig.
5. Run the AutoML experiment.
6. Explore the results and get the best model.
7. Register the best model.
8. Deploy the best model.
9. Consume the endpoint.

## Azure Machine Learning SDK-specific imports

In [1]:
from azureml.core import Workspace, Experiment
from azureml.core.compute import AmlCompute
from azureml.train.automl import AutoMLConfig
from azureml.widgets import RunDetails
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import AciWebservice

## Initialize Workspace
Initialize a workspace object from the persisted configuration. Make sure the config file is present at `.\config.json`.

In [2]:
ws = Workspace.from_config()
print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep = '\n')

nahmed30-azureml-workspace
epe-poc-nazeer
centralus
16bc73b5-82be-47f2-b5ab-f2373344794c
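
If `Workspace.from_config()` fails, a quick check of the config file can narrow down the cause. The sketch below assumes the usual layout of a downloaded workspace `config.json` (the three keys `subscription_id`, `resource_group`, `workspace_name`); the helper names `missing_config_keys` and `load_config` are hypothetical and not part of the Azure ML SDK.

```python
import json
from pathlib import Path

# Keys a downloaded workspace config.json is expected to contain.
REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")


def missing_config_keys(config):
    """Return the required keys that are absent from a parsed config dict."""
    return [key for key in REQUIRED_KEYS if key not in config]


def load_config(path="config.json"):
    """Read and parse the config file (hypothetical helper, plain JSON only)."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


# Example with an incomplete, made-up config: the workspace name is missing.
example = {"subscription_id": "<your-subscription-id>", "resource_group": "<your-group>"}
print(missing_config_keys(example))
```

An empty list means all three keys are present; it does not verify that the values point at a workspace you can access.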


## Create an Azure ML experiment

Let's create an experiment named 'emailspam-aml-experiment' in the workspace we just initialized.

In [3]:
experiment_name = 'emailspam-aml-experiment'
experiment = Experiment(ws, experiment_name)
experiment

| Name | Workspace | Report Page | Docs Page |
|---|---|---|---|
| emailspam-aml-experiment | nahmed30-azureml-workspace | Link to Azure Machine Learning studio | Link to Documentation |


## Create a Compute Cluster
You will need to create a [compute target](https://docs.microsoft.com/azure/machine-learning/concept-azure-machine-learning-architecture#compute-target) for your AutoML run.

In [4]:
aml_name = "cpu-cluster"
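# Reuse the cluster if it already exists; otherwise create it.
from azureml.core.compute_target import ComputeTargetException

try:
    aml_compute = AmlCompute(ws, aml_name)
    print('Found existing AML compute context.')
except ComputeTargetException:
    # Raised when no compute target with this name exists in the workspace.
    print('Creating new AML compute context.')
    aml_config = AmlCompute.provisioning_configuration(vm_size = "Standard_D2_v2", min_nodes=1, max_nodes=3)
    aml_compute = AmlCompute.create(ws, name = aml_name, provisioning_configuration = aml_config)
    aml_compute.wait_for_completion(show_output = True)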

cts = ws.compute_targets
compute_target = cts[aml_name]

Found existing AML compute context.


## Data
Make sure you have uploaded the dataset to Azure ML and that `key` matches the name under which the dataset is registered.

In [5]:
key = 'UdacityPrjEmailSpamDataSet'
dataset = ws.datasets[key]
df = dataset.to_pandas_dataframe()
df.describe()

| | v1 | v2 | Column3 | Column4 | Column5 |
|---|---|---|---|---|---|
| count | 5572 | 5572 | 50 | 12 | 6 |
| unique | 2 | 5169 | 43 | 10 | 5 |
| top | ham | Sorry, I'll call later | bt not his girlfrnd... G o o d n i g h t . . .@" | MK17 92H. 450Ppw 16" | GNT:-)" |
| freq | 4825 | 30 | 3 | 2 | 2 |
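
The summary above also shows how the labels are distributed: `v1` has two unique values, and the most frequent one, `ham`, appears 4825 times in 5572 rows. The short calculation below uses only those numbers and is a sketch, not part of the original run. The resulting imbalance is one plausible reason to prefer a metric such as `AUC_weighted` over plain accuracy, since always predicting `ham` would already be right for most rows.

```python
# Counts taken from the df.describe() output above.
total_rows = 5572   # 'count' of the label column v1
ham_rows = 4825     # 'freq' of the most common label ('ham')

# v1 has exactly two unique values, so the remainder is the other label.
other_rows = total_rows - ham_rows
ham_share = ham_rows / total_rows
other_share = other_rows / total_rows

print(f"ham:   {ham_rows} rows ({ham_share:.1%})")
print(f"other: {other_rows} rows ({other_share:.1%})")
```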


## AutoML Configuration

In [6]:
automl_settings = {
    "experiment_timeout_minutes": 20,
    "max_concurrent_iterations": 3,
    "primary_metric" : 'AUC_weighted'
}

automl_config = AutoMLConfig(compute_target=compute_target,
                             task = "classification",
                             training_data=dataset,
                             label_column_name="v1",
                             enable_early_stopping= True,
                             featurization= 'auto',
                             debug_log = "emailspam_automl_errors.log",
                             **automl_settings
                            )
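
The `**automl_settings` argument above unpacks the dictionary into keyword arguments, so each key must be a parameter name that `AutoMLConfig` accepts. The sketch below illustrates only the unpacking mechanism; `describe_config` is a hypothetical stand-in that echoes its arguments and does not call Azure ML.

```python
def describe_config(task, primary_metric=None, experiment_timeout_minutes=None,
                    max_concurrent_iterations=None):
    """Hypothetical stand-in for a config constructor; it only echoes its arguments."""
    return {
        "task": task,
        "primary_metric": primary_metric,
        "experiment_timeout_minutes": experiment_timeout_minutes,
        "max_concurrent_iterations": max_concurrent_iterations,
    }


settings = {
    "experiment_timeout_minutes": 20,
    "max_concurrent_iterations": 3,
    "primary_metric": "AUC_weighted",
}

# Unpacking the dict is equivalent to writing each keyword argument by hand.
unpacked = describe_config(task="classification", **settings)
explicit = describe_config(task="classification",
                           experiment_timeout_minutes=20,
                           max_concurrent_iterations=3,
                           primary_metric="AUC_weighted")
print(unpacked == explicit)
```

Keeping tunable values in one dictionary makes it easy to reuse or log them, while fixed arguments stay in the call itself.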

## AutoML Run

In [7]:
remote_run = experiment.submit(automl_config)

Submitting remote run.


| Experiment | Id | Type | Status | Details Page | Docs Page |
|---|---|---|---|---|---|
| emailspam-aml-experiment | AutoML_74d98dfa-3fa9-4c59-9fee-a8de8a4e0459 | automl | NotStarted | Link to Azure Machine Learning studio | Link to Documentation |


In [8]:
RunDetails(remote_run).show()

_AutoMLWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': False, 'log_level': 'INFO', 's…

## Save the best model

In [9]:
best_run, fitted_model = remote_run.get_output()

Package:azureml-automl-runtime, training version:1.44.0, current version:1.41.0
Package:azureml-core, training version:1.44.0, current version:1.41.0
Package:azureml-dataprep, training version:4.2.2, current version:3.1.1
Package:azureml-dataprep-rslex, training version:2.8.1, current version:2.5.2
Package:azureml-dataset-runtime, training version:1.44.0, current version:1.41.0
Package:azureml-defaults, training version:1.44.0, current version:1.41.0
Package:azureml-inference-server-http, training version:0.7.4, current version:0.4.13
Package:azureml-interpret, training version:1.44.0, current version:1.41.0
Package:azureml-mlflow, training version:1.44.0, current version:1.41.0
Package:azureml-pipeline-core, training version:1.44.0, current version:1.41.0
Package:azureml-responsibleai, training version:1.44.0, current version:1.41.0
Package:azureml-telemetry, training version:1.44.0, current version:1.41.0
Package:azureml-train-automl-client, training version:1.44.0, current version:1

In [10]:
best_run.get_properties()

{'runTemplate': 'automl_child',
 'pipeline_id': '__AutoML_Ensemble__',
 'pipeline_spec': '{"pipeline_id":"__AutoML_Ensemble__","objects":[{"module":"azureml.train.automl.ensemble","class_name":"Ensemble","spec_class":"sklearn","param_args":[],"param_kwargs":{"automl_settings":"{\'task_type\':\'classification\',\'primary_metric\':\'AUC_weighted\',\'verbosity\':20,\'ensemble_iterations\':15,\'is_timeseries\':False,\'name\':\'emailspam-aml-experiment\',\'compute_target\':\'cpu-cluster\',\'subscription_id\':\'16bc73b5-82be-47f2-b5ab-f2373344794c\',\'region\':\'centralus\',\'spark_service\':None}","ensemble_run_id":"AutoML_74d98dfa-3fa9-4c59-9fee-a8de8a4e0459_57","experiment_name":"emailspam-aml-experiment","workspace_name":"nahmed30-azureml-workspace","subscription_id":"16bc73b5-82be-47f2-b5ab-f2373344794c","resource_group_name":"epe-poc-nazeer"}}]}',
 'training_percent': '100',
 'predicted_cost': None,
 'iteration': '57',
 '_aml_system_scenario_identification': 'Remote.Child',
 '_azureml.

In [11]:
for child_run in remote_run.get_children():
    print(child_run,"\n********************\n")

Run(Experiment: emailspam-aml-experiment,
Id: AutoML_74d98dfa-3fa9-4c59-9fee-a8de8a4e0459_58,
Type: azureml.scriptrun,
Status: Completed) 
********************

Run(Experiment: emailspam-aml-experiment,
Id: AutoML_74d98dfa-3fa9-4c59-9fee-a8de8a4e0459_57,
Type: azureml.scriptrun,
Status: Completed) 
********************

Run(Experiment: emailspam-aml-experiment,
Id: AutoML_74d98dfa-3fa9-4c59-9fee-a8de8a4e0459_56,
Type: azureml.scriptrun,
Status: Completed) 
********************

Run(Experiment: emailspam-aml-experiment,
Id: AutoML_74d98dfa-3fa9-4c59-9fee-a8de8a4e0459_55,
Type: azureml.scriptrun,
Status: Completed) 
********************

Run(Experiment: emailspam-aml-experiment,
Id: AutoML_74d98dfa-3fa9-4c59-9fee-a8de8a4e0459_54,
Type: azureml.scriptrun,
Status: Canceled) 
********************

Run(Experiment: emailspam-aml-experiment,
Id: AutoML_74d98dfa-3fa9-4c59-9fee-a8de8a4e0459_53,
Type: azureml.scriptrun,
Status: Canceled) 
********************

Run(Experiment: emailspam-aml-experi

In [12]:
model_name = best_run.properties['model_name']
script_file_name = 'emailspam_inference/score.py'
# Download the scoring script that AutoML generated for the best run.
best_run.download_file('outputs/scoring_file_v_1_0_0.py', script_file_name)
description = "aml email spam project sdk"


In [13]:
model_name

'AutoML74d98dfa357'

In [14]:
import os

os.makedirs('./outputs',exist_ok=True)

In [15]:
# Save the best model locally with joblib
import joblib
joblib.dump(fitted_model,filename= "outputs/automl.joblib")

['outputs/automl.joblib']

In [16]:

# Register the best model from the AutoML run in the workspace.
# (Model was already imported in the first cell.)
reg_model = remote_run.register_model(model_name = model_name, description=description)


In [17]:
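from azureml.automl.core.shared import constants
# Environment of the best run; the deployment cell below requests it again.
env = best_run.get_environment()
script_file = "./outputs/score.py"

# Keep a second local copy of the scoring script, plus the conda environment file.
best_run.download_file('outputs/scoring_file_v_1_0_0.py', script_file)
best_run.download_file(constants.CONDA_ENV_FILE_PATH, 'env.yml')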

## Deploy the Best Model

Run the following code to deploy the best model. You can see the state of the deployment in the Azure ML portal. This step can take a few minutes.

In [18]:
script_file_name

'emailspam_inference/score.py'

In [19]:
inference_config = InferenceConfig(entry_script=script_file_name, environment=best_run.get_environment())

aciconfig = AciWebservice.deploy_configuration(cpu_cores = 1,
                                               memory_gb = 1,
                                               tags = {'type': "automl-email-spam-prediction"},
                                               description = 'Sample service for AutoML Email Spam Prediction')

aci_service_name = 'automl-es-sdk'
aci_service = Model.deploy(ws, aci_service_name, [reg_model], inference_config, aciconfig)
aci_service.wait_for_deployment(True)
print(aci_service.state)

Tips: You can try get_logs(): https://aka.ms/debugimage#dockerlog or local deployment: https://aka.ms/debugimage#debug-locally to debug if deployment takes longer than 10 minutes.
Running
2022-09-02 07:13:04+00:00 Creating Container Registry if not exists.
2022-09-02 07:13:04+00:00 Registering the environment.
2022-09-02 07:13:04+00:00 Use the existing image.
2022-09-02 07:13:05+00:00 Submitting deployment to compute.
2022-09-02 07:13:08+00:00 Checking the status of deployment automl-es-sdk..
2022-09-02 07:15:38+00:00 Checking the status of inference endpoint automl-es-sdk.
Succeeded
ACI service creation operation finished, operation "Succeeded"
Healthy
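
The tip in the output above points to `get_logs()` for debugging slow or failed deployments. A minimal sketch of how that could be wrapped is shown below; it assumes only that the service object exposes `.state` and `.get_logs()`, as the deployed service above does. `deployment_report` and `FakeService` are hypothetical names, and `FakeService` exists only so the sketch runs without Azure.

```python
def deployment_report(service):
    """Return a short status string; include container logs if not healthy."""
    if service.state == "Healthy":
        return "Service is healthy."
    return f"Service state: {service.state}\n{service.get_logs()}"


class FakeService:
    """Hypothetical stand-in for a deployed web service object."""

    def __init__(self, state, logs=""):
        self.state = state
        self._logs = logs

    def get_logs(self):
        return self._logs


print(deployment_report(FakeService("Healthy")))
print(deployment_report(FakeService("Failed", "container crashed")))
```

In the notebook, the same function could be called as `deployment_report(aci_service)` after `wait_for_deployment` returns.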


## Consume the Endpoint
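You can add more records to the `data` list in the following input sample. The keys of each record should match the feature columns of the training data, that is, every column except the label `v1`.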

In [20]:
scoring_uri = aci_service.scoring_uri
print(scoring_uri)

http://7a49a89c-4825-45af-bb02-59fd30c5a1d3.centralus.azurecontainer.io/score


In [21]:
import requests
import json

 
data = {
  "data": [
    {
      "v2": "Click link below to collect $10000",
      "Column4": "example_value",
      "Column5": "example_value",
      "Column6": "example_value"
    }
  ],
  "method": "predict"
}
    
# Convert to JSON string
input_data = json.dumps(data)
with open("data.json", "w") as _f:
    _f.write(input_data)

# Set the content type
headers = {'Content-Type': 'application/json'}
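# If authentication is enabled, set the authorization header.
# Use the service's own key here (api_key is a placeholder), not the dataset name stored in `key` above.
#headers['Authorization'] = f'Bearer {api_key}'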

# Make the request and display the response
resp = requests.post(scoring_uri, input_data, headers=headers)
print("prediction is :" , resp.json())

prediction is : {"result": ["ham"]}
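
The printed response above uses double quotes, which suggests that `resp.json()` returned a JSON-encoded string rather than a Python dict. The sketch below builds a request body in the same shape as the sample and extracts the predictions from either form. `build_payload` and `parse_result` are hypothetical helpers, and the single-key record is for illustration only; a real request needs every feature column.

```python
import json


def build_payload(records, method="predict"):
    """Serialize a list of feature dicts into the request body format used above."""
    return json.dumps({"data": list(records), "method": method})


def parse_result(body):
    """Return the list of predictions from a response body.

    Accepts a dict or a JSON-encoded string, because the service response
    may arrive as a string that needs a second decoding step.
    """
    if isinstance(body, str):
        body = json.loads(body)
    return body["result"]


payload = build_payload([{"v2": "Click link below to collect $10000"}])
print(payload)
print(parse_result('{"result": ["ham"]}'))
```

With the notebook's variables, this would be used as `parse_result(resp.json())`.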
