# Automated ML

TODO: Import Dependencies. In the cell below, import all the dependencies that you will need to complete the project.

In [6]:
import csv
import logging
import os

import joblib
import numpy as np
import pandas as pd
import pkg_resources
from matplotlib import pyplot as plt
from sklearn import datasets

import azureml.core
from azureml.core import Dataset, Environment, Experiment, Workspace
from azureml.core.compute import AmlCompute, ComputeTarget
from azureml.core.compute_target import ComputeTargetException
from azureml.core.model import InferenceConfig, Model
from azureml.core.webservice import AciWebservice, LocalWebservice
from azureml.data.dataset_factory import TabularDatasetFactory
from azureml.train.automl import AutoMLConfig
from azureml.widgets import RunDetails


## Dataset

### Overview
TODO: In this markdown cell, give an overview of the dataset you are using. Also mention the task you will be performing.

The dataset I am using is a binary classification dataset built from wine chemical data. Each wine is labelled 'good' or 'bad' according to professional tasters, and the input features describe consistency and taste parameters of the wine. The AutoML run below trains an array of models and ranks them by their accuracy scores.
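As a rough illustration of the task, the sketch below encodes 'good'/'bad' labels as 1/0 and computes accuracy, the metric used for ranking. The helper names and the sample predictions are hypothetical and are not part of the notebook.

```python
# Hypothetical helpers for illustration only; not part of the notebook.

def encode_quality(labels):
    """Map 'good' to 1 and every other label to 0."""
    return [1 if label == "good" else 0 for label in labels]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


y_true = encode_quality(["good", "bad", "good", "good"])
y_pred = [1, 0, 0, 1]  # made-up predictions, three of four correct
print(y_true)
print(accuracy(y_true, y_pred))
```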

TODO: Get data. In the cell below, write code to access the data you will be using in this project. Remember that the dataset needs to be external.

In [7]:
def preprocess(data):
    # split off the label column and encode 'good' as 1, anything else as 0
    y_df = data.pop("quality").apply(lambda x: 1 if x == "good" else 0)
    return data, y_df

ws = Workspace.from_config()
# download the csv from the git source
dataset = pd.read_csv("https://raw.githubusercontent.com/cpaulisi/nd00333-capstone/master/starter_file/wine.csv")
x, y = preprocess(dataset)
# re-attach the encoded label as column 'y'
x['y'] = y
os.makedirs('data', exist_ok=True)
# write the data to parquet, upload it to the default datastore,
# and load it back as a tabular dataset
x.to_parquet('data/wine.parquet')
dataref = ws.get_default_datastore().upload('data')
dataset = Dataset.Tabular.from_parquet_files(path=dataref.path('wine.parquet'))

# choose a name for experiment
experiment_name = 'auto-ml-exp-1'

experiment=Experiment(ws, experiment_name)

Uploading an estimated of 1 files
Target already exists. Skipping upload for wine.parquet
Uploaded 0 files


## AutoML Configuration

TODO: Explain why you chose the AutoML settings and configuration you used below.

In [8]:
# TODO: Put your automl settings here
automl_settings = {
    "experiment_timeout_minutes": 20,
    "max_concurrent_iterations": 3,
    "primary_metric" : 'accuracy'
}

cluster_name = 'automl-cluster'
try:
    cpu_cluster = ComputeTarget(workspace=ws, name=cluster_name)
    print("Using existing cluster...")
except ComputeTargetException:
    print("Creating cluster " + cluster_name + "...")
    compute_config = AmlCompute.provisioning_configuration(
        vm_size='Standard_D2_V2',
        vm_priority='lowpriority',
        max_nodes=4
    )
    cpu_cluster = ComputeTarget.create(ws, cluster_name, compute_config)

cpu_cluster.wait_for_completion(show_output=True)

# TODO: Put your automl config here
automl_config = AutoMLConfig(
    compute_target=cpu_cluster,
    task="classification",
    training_data=dataset,
    label_column_name="y",
    enable_early_stopping=True,
    featurization='auto',
    debug_log="automl_errors.log",
    **automl_settings
)

Using existing cluster...
Succeeded
AmlCompute wait for completion finished

Minimum number of nodes requested have been provisioned
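One point to check in the settings above is that `max_concurrent_iterations` does not exceed the number of nodes the cluster can scale to, since each concurrent iteration runs on its own node. The sketch below shows one way to make that check explicit. `check_concurrency` is a hypothetical helper, and the fallback of 1 when the key is missing is an assumption.

```python
# Hypothetical sanity check; not part of the Azure ML SDK.

def check_concurrency(settings, max_nodes):
    """Return the requested concurrency, or raise if the cluster cannot serve it."""
    requested = settings.get("max_concurrent_iterations", 1)  # assumed fallback
    if requested > max_nodes:
        raise ValueError(
            f"max_concurrent_iterations={requested} exceeds max_nodes={max_nodes}"
        )
    return requested


automl_settings = {
    "experiment_timeout_minutes": 20,
    "max_concurrent_iterations": 3,
    "primary_metric": "accuracy",
}

# The cluster in this notebook is provisioned with max_nodes=4.
print(check_concurrency(automl_settings, max_nodes=4))
```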


In [9]:
# TODO: Submit your experiment
remote_run = experiment.submit(automl_config)

Submitting remote run.


| Experiment | Id | Type | Status | Details Page | Docs Page |
| --- | --- | --- | --- | --- | --- |
| auto-ml-exp-1 | AutoML_78274bb4-124e-4998-af9b-06d52616b981 | automl | NotStarted | Link to Azure Machine Learning studio | Link to Documentation |


## Run Details

OPTIONAL: Write about the different models trained and their performance. Why do you think some models did better than others?

TODO: In the cell below, use the `RunDetails` widget to show the different experiments.

In [10]:

RunDetails(remote_run).show()

_AutoMLWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': False, 'log_level': 'INFO', 's…

## Best Model

TODO: In the cell below, get the best model from the automl experiments and display all the properties of the model.



In [13]:
best_run, best_model = remote_run.get_output()
best_run_metrics = best_run.get_metrics()
for metric_name in best_run_metrics:
    metric = best_run_metrics[metric_name]
    print(metric_name, metric)



matthews_correlation 0.6451530958913828
recall_score_micro 0.823639774859287
log_loss 0.4716338272766596
accuracy 0.823639774859287
recall_score_weighted 0.823639774859287
norm_macro_recall 0.6450921621198836
precision_score_micro 0.823639774859287
weighted_accuracy 0.8247044865753158
recall_score_macro 0.8225460810599418
precision_score_macro 0.8226073247641011
AUC_micro 0.8825344639649312
precision_score_weighted 0.8237921915024323
average_precision_score_macro 0.8744731165349752
f1_score_macro 0.8224829603896847
AUC_macro 0.8826492677396987
average_precision_score_weighted 0.8759498858136404
AUC_weighted 0.8826492677396987
f1_score_weighted 0.8236225035901755
average_precision_score_micro 0.8761029435668672
balanced_accuracy 0.8225460810599418
f1_score_micro 0.823639774859287
confusion_matrix aml://artifactId/ExperimentRun/dcid.AutoML_cbb7afad-8fcd-41c1-a98c-2f50357f20a8_34/confusion_matrix
accuracy_table aml://artifactId/ExperimentRun/dcid.AutoML_cbb7afad-8fcd-41c1-a98c-2f50357f20a
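Several of the metrics printed above are related by definition, which gives a quick consistency check. The sketch below assumes that normalized macro recall is macro recall rescaled so that chance level (1/C for C classes, so 0.5 here) maps to 0, and it re-derives `norm_macro_recall` from the values printed above. The function name is hypothetical.

```python
import math

# Values copied from the metrics printed for the best run.
metrics = {
    "accuracy": 0.823639774859287,
    "recall_score_micro": 0.823639774859287,
    "recall_score_macro": 0.8225460810599418,
    "balanced_accuracy": 0.8225460810599418,
    "norm_macro_recall": 0.6450921621198836,
}


def normalized_macro_recall(recall_macro, n_classes):
    """Rescale macro recall so random guessing scores 0 and a perfect model 1."""
    chance = 1.0 / n_classes
    return (recall_macro - chance) / (1.0 - chance)


recomputed = normalized_macro_recall(metrics["recall_score_macro"], n_classes=2)
# Compare with a tolerance rather than relying on exact float equality.
print(math.isclose(recomputed, metrics["norm_macro_recall"], abs_tol=1e-12))
# Balanced accuracy is macro-averaged recall by definition.
print(metrics["balanced_accuracy"] == metrics["recall_score_macro"])
# For single-label classification, micro-averaged recall equals accuracy.
print(metrics["accuracy"] == metrics["recall_score_micro"])
```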

In [14]:
#TODO: Save the best model
os.makedirs('models', exist_ok=True)
joblib.dump(best_model, filename='./models/model.pkl')
model_name = best_run.properties['model_name']
print(model_name)


AutoMLcbb7afad834


## Model Deployment

Remember you have to deploy only one of the two models you trained but you still need to register both the models. Perform the steps in the rest of this notebook only if you wish to deploy this model.

TODO: In the cell below, register the model, create an inference config and deploy the model as a web service.
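The inference config used below points at an entry script that is not shown in this notebook. Azure ML entry scripts expose an `init()` function that runs once at start-up and a `run(raw_data)` function that runs per request. The sketch below illustrates that contract only. The stand-in model, its threshold, and the feature list (taken from the test payload used later) are assumptions, and this is not the actual `echo_score.py`.

```python
import json

# Feature names assumed from the request payload used later in this notebook.
FEATURES = [
    "fixed acidity", "volatile acidity", "citric acid", "residual sugar",
    "chlorides", "free sulfur dioxide", "total sulfur dioxide", "density",
    "pH", "sulphates", "alcohol",
]


class _ThresholdModel:
    """Stand-in for a trained model; the 10.5 threshold is made up."""

    def predict(self, rows):
        alcohol = FEATURES.index("alcohol")
        return [1 if row[alcohol] >= 10.5 else 0 for row in rows]


model = None


def init():
    """Called once when the service starts."""
    global model
    # A real script would load the registered model file here instead.
    model = _ThresholdModel()


def run(raw_data):
    """Called per request with the raw JSON body; returns a serialisable dict."""
    try:
        record = json.loads(raw_data)
        row = [float(record[name]) for name in FEATURES]
        return {"classification": int(model.predict([row])[0])}
    except Exception as exc:  # report the failure to the caller
        return {"error": str(exc)}


init()
sample = dict.fromkeys(FEATURES, 1.0)
sample["alcohol"] = 9.4
print(run(json.dumps(sample)))
```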

In [16]:
best_model

TODO: In the cell below, send a request to the web service you deployed to test it.

In [54]:
# test with a local service instance
# register the best run's model first so that `model` is defined for the deployment
model = best_run.register_model(model_name='automl-model', model_path='outputs/model.pkl')
env = best_run.get_environment()
dummy_inference_config = InferenceConfig(
    environment=env,
    entry_script="echo_score.py",
)
deployment_config = LocalWebservice.deploy_configuration(port=6789)
service = Model.deploy(
    ws,
    "local-wine",
    [model],
    dummy_inference_config,
    deployment_config,
    overwrite=True,
)
service.wait_for_deployment(show_output=True)

Downloading model automl-model:4 to /tmp/azureml_3_ygeytx/automl-model/4
Generating Docker build context.
Package creation Succeeded
Logging into Docker registry viennaglobal.azurecr.io
Logging into Docker registry viennaglobal.azurecr.io
Building Docker image from Dockerfile...
Step 1/5 : FROM viennaglobal.azurecr.io/azureml/azureml_e587ea2e7eb04895e6a9947f351d1672
 ---> f4b133118dd9
Step 2/5 : COPY azureml-app /var/azureml-app
 ---> 3a55bee39c75
Step 3/5 : RUN mkdir -p '/var/azureml-app' && echo eyJhY2NvdW50Q29udGV4dCI6eyJzdWJzY3JpcHRpb25JZCI6IjVhNGFiMmJhLTZjNTEtNDgwNS04MTU1LTU4NzU5YWQ1ODlkOCIsInJlc291cmNlR3JvdXBOYW1lIjoiYW1sLXF1aWNrc3RhcnRzLTE4NTkxNyIsImFjY291bnROYW1lIjoicXVpY2stc3RhcnRzLXdzLTE4NTkxNyIsIndvcmtzcGFjZUlkIjoiNWI3ODY3NjUtN2NhMC00YzFhLTg0ZjEtZGJjNDM4MTUzOTM5In0sIm1vZGVscyI6e30sIm1vZGVsc0luZm8iOnt9fQ== | base64 --decode > /var/azureml-app/model_config_map.json
 ---> Running in 2ebaae216f81
 ---> 97161b9fc41c
Step 4/5 : RUN mv '/var/azureml-app/tmpd_f1c047.py' /var/azureml

In [55]:
import requests
import json

uri = service.scoring_uri
# confirm the local service responds before posting to the scoring endpoint
requests.get("http://localhost:6789")
headers = {"Content-Type": "application/json"}
data = {
    "fixed acidity": 7.3,
    "volatile acidity": 0.9,
    "citric acid": 0.1,
    "residual sugar": 1.9,
    "chlorides": 0.076,
    "free sulfur dioxide": 11.0,
    "total sulfur dioxide": 34.0,
    "density": 0.9978,
    "pH": 3.49,
    "sulphates": 0.50,
    "alcohol": 9.4
}
data = json.dumps(data)
response = requests.post(uri, data=data, headers=headers)
print(response.text)
print(response.json())


"{\"classification\": \"0\"}"
{"classification": "0"}
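The raw response text above is a JSON string that itself contains JSON, so a single decode yields a `str` rather than a `dict`. The sketch below shows one way to handle that by decoding a second time when needed. `decode_response` is a hypothetical helper, and the sample text is copied from the output above.

```python
import json


def decode_response(text):
    """Decode a JSON body, decoding once more if the result is still a string."""
    payload = json.loads(text)
    if isinstance(payload, str):
        payload = json.loads(payload)
    return payload


# Raw body as printed by response.text above.
raw_text = r'"{\"classification\": \"0\"}"'

result = decode_response(raw_text)
print(type(result).__name__)
print(int(result["classification"]))
```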


In [62]:
# register model
best_child = remote_run.get_best_child()
model = best_child.register_model(model_name='automl-model',model_path='outputs/model.pkl')
print(model.name, model.id, model.version, sep=' ') 

# deploy full instance
service_name = "wine-service"
env = best_run.get_environment()
print(env)

inference_config = InferenceConfig(
    environment=env,
    entry_script="echo_score.py",
)

aci_config = AciWebservice.deploy_configuration(
    cpu_cores = 0.5, 
    memory_gb=1, 
    auth_enabled=True
)
web_service = Model.deploy(
    workspace=ws, 
    name=service_name,
    models=[model], 
    inference_config=inference_config, 
    deployment_config=aci_config, 
    overwrite=True
)
web_service.wait_for_deployment(show_output=True)

automl-model automl-model:6 6
Environment(Name: AzureML-AutoML,
Version: 98)
Tips: You can try get_logs(): https://aka.ms/debugimage#dockerlog or local deployment: https://aka.ms/debugimage#debug-locally to debug if deployment takes longer than 10 minutes.
Running
2022-02-16 04:20:36+00:00 Creating Container Registry if not exists.
2022-02-16 04:20:36+00:00 Registering the environment.
2022-02-16 04:20:36+00:00 Use the existing image.
2022-02-16 04:20:37+00:00 Submitting deployment to compute.
2022-02-16 04:20:41+00:00 Checking the status of deployment wine-service..
2022-02-16 04:24:54+00:00 Checking the status of inference endpoint wine-service.
Succeeded
ACI service creation operation finished, operation "Succeeded"


In [64]:
uri = web_service.scoring_uri
# fetch the primary key from the service instead of hardcoding it in the notebook
api_key, _ = web_service.get_keys()
headers = {"Content-Type": "application/json", "Authorization": "Bearer " + api_key}
data = {
    "fixed acidity": 7.3,
    "volatile acidity": 0.9,
    "citric acid": 0.1,
    "residual sugar": 1.9,
    "chlorides": 0.076,
    "free sulfur dioxide": 11.0,
    "total sulfur dioxide": 34.0,
    "density": 0.9978,
    "pH": 3.49,
    "sulphates": 0.50,
    "alcohol": 9.4
}
data = json.dumps(data)
response = requests.post(uri, data=data, headers=headers)
print(response.json())


{"classification": 0}
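Because the ACI service is deployed with `auth_enabled=True`, every request must carry the key as a bearer token. The sketch below builds such a request with the standard library and does not send it. The helper name, the `WINE_SERVICE_KEY` environment variable, and the example URI are hypothetical.

```python
import json
import os
import urllib.request


def build_scoring_request(uri, features, api_key):
    """Build (but do not send) an authenticated JSON POST request."""
    body = json.dumps(features).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": "Bearer " + api_key,
    }
    return urllib.request.Request(uri, data=body, headers=headers, method="POST")


# Hypothetical: read the key from the environment rather than the notebook source.
api_key = os.environ.get("WINE_SERVICE_KEY", "placeholder-key")
request = build_scoring_request(
    "http://localhost:6789/score",  # placeholder URI for illustration
    {"alcohol": 9.4},
    api_key,
)
print(request.get_method())
print(request.get_header("Authorization").startswith("Bearer "))
```

Passing the request to `urllib.request.urlopen` would send it; that step is left out here so the sketch runs without a deployed service.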


TODO: In the cell below, print the logs of the web service and delete the service

In [None]:
print(web_service.get_logs())
web_service.delete()
model.delete()

**Submission Checklist**
- I have registered the model.
- I have deployed the model with the best accuracy as a webservice.
- I have tested the webservice by sending a request to the model endpoint.
- I have deleted the webservice and shut down all the computes that I have used.
- I have taken a screenshot showing the model endpoint as active.
- The project includes a file containing the environment details.
