##Requirements
Before running this notebook make sure you have the following:
- An Azure ML workspace
- A compute instance

# Automated ML

TODO: Import Dependencies. In the cell below, import all the dependencies that you will need to complete the project.

In [1]:
from azureml.core import Workspace, Experiment, Dataset
from azureml.train.automl import AutoMLConfig
from azureml.core.compute import ComputeTarget, AmlCompute
from azureml.core.compute_target import ComputeTargetException
from azureml.core import Dataset, Datastore
from azureml.data.datapath import DataPath
from azureml.core import Environment
from azureml.core.model import InferenceConfig
from azureml.core.webservice import LocalWebservice
from azureml.data.dataset_type_definitions import PromoteHeadersBehavior
from azureml.widgets import RunDetails
from azureml.core.webservice import AciWebservice
from azureml.core.model import Model


import requests
import json
import pandas as pd
import os

## Dataset

### Overview
TODO: In this markdown cell, give an overview of the dataset you are using. Also mention the task you will be performing.


TODO: Get data. In the cell below, write code to access the data you will be using in this project. Remember that the dataset needs to be external.

In [2]:
#Retrieve current workspace
ws = Workspace.from_config()

# choose a name for experiment
experiment_name = 'Tuscan_Wine_Forecast'

experiment=Experiment(ws, experiment_name)

print('Workspace name: ' + ws.name, 
      'Azure region: ' + ws.location, 
      'Subscription id: ' + ws.subscription_id, 
      'Resource group: ' + ws.resource_group, sep = '\n')



Performing interactive authentication. Please follow the instructions on the terminal.
To sign in, use a web browser to open the page https://microsoft.com/devicelogin and enter the code F4WFHSVHQ to authenticate.
You have logged in. Now let us find all the subscriptions to which you have access...
Interactive authentication successfully completed.
Workspace name: quick-starts-ws-162405
Azure region: southcentralus
Subscription id: 3e42d11f-d64d-4173-af9b-12ecaa1030b3
Resource group: aml-quickstarts-162405


## Compute Cluster
If compute cluster already exists, it will just identify it and end.


In [3]:
#Create Cluster if does not already exist

cluster_name = "myCluster"
try:
    cluster = ComputeTarget(workspace=ws, name=cluster_name)
    print("Cluster already created")
except ComputeTargetException:
    compute_config = AmlCompute.provisioning_configuration(vm_size="STANDARD_DS12_V2",min_nodes=1, max_nodes=6)
    cluster = ComputeTarget.create(ws,cluster_name, compute_config) #creates the actual cluster

cluster.wait_for_completion(show_output=True)  #Allows to continue on other threads while cluster is being created


#SOURCE / HELP: https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.compute.amlcompute.amlcompute?view=azure-ml-py

Cluster already created
Succeeded
AmlCompute wait for completion finished

Minimum number of nodes requested have been provisioned


## Dataset
A Dataset is created from the csv file in the folder "data".

In [5]:
# Create a dataset from delimited files with header

#Retrieves default Datastore and display info
datastore = ws.get_default_datastore()
print("Default Datastore Info:\n") 
print(datastore)

#Google Drive path to dl csv
url_data = "https://docs.google.com/spreadsheets/d/1X9M3eNuBDv0ZKsOdidkdaBx9W1NubNDz3kxXuULsXmo/export?format=csv"

#Loads into Dataframe
df = pd.read_csv(url_data)

#Cleaning / Transform
df.drop("vintage") # nor relevance for prediction was just needed to match weather with history

#Dictionary to rearrange rankings
points

#Save into Datastore AND Register as dataset
dataset = Dataset.Tabular.register_pandas_dataframe(df,datastore,'Dataset_Wine',description="A Dataset composed of Tuscan wines with ratings, vintage, grapes and corresponding season avg weather")

#dataset = Dataset.Tabular.from_delimited_files(path=url_data,validate=True, include_path=False,header=PromoteHeadersBehavior.ONLY_FIRST_FILE_HAS_HEADERS, support_multi_line=False)

datasetWineTuscan = dataset.register(workspace=ws, name='Dataset_Wine', description="A Dataset composed of Tuscan wines with ratings, vintage, grapes and corresponding season avg weather")




#----Previous exploration to upload file from within Azure

directory = os.getcwd()
print("\nCurrent Working Directory:") 
print(directory)
#datastore.upload(src_dir='./data',
#                  target_path='datasets/',
#                  overwrite=True)

Default Datastore Info:

{
  "name": "workspaceblobstore",
  "container_name": "azureml-blobstore-e9d6e0b1-debf-4f50-9a27-3037af827375",
  "account_name": "mlstrg162405",
  "protocol": "https",
  "endpoint": "core.windows.net"
}


Method register_pandas_dataframe: This is an experimental method, and may change at any time. Please see https://aka.ms/azuremlexperimental for more information.


Validating arguments.
Arguments validated.
Successfully obtained datastore reference and path.
Uploading file to managed-dataset/f50898e8-c931-46ce-b914-697b28ba2439/
Successfully uploaded file to datastore.
Creating and registering a new dataset.
Successfully created and registered a new dataset.
Current Working Directory:

/mnt/batch/tasks/shared/LS_root/mounts/clusters/amlcomputealeaume/code


In [21]:
#Display top 10 lines
dataset.take(10).to_pandas_dataframe()



Unnamed: 0,vintage,points,variety,winery,avg winter temp,avg spring temp,avg summer temp,avg fall temp,avg winter sun hour,avg spring hour,avg summer sun hour,avg fall sun hour,avg daily precip winter,avg daily precip spring,avg daily precip summer,avg daily precip fall
0,2010,85,Sangiovese,Casa Raia,4.661111,13.25,25.271739,15.082418,6.451111,11.759783,14.097826,8.967033,2.06,1.720652,1.076087,1.461538
1,2010,89,Sangiovese,Molinari Carlo,4.661111,13.25,25.271739,15.082418,6.451111,11.759783,14.097826,8.967033,2.06,1.720652,1.076087,1.461538
2,2010,92,Sangiovese,Castelli del Grevepesa,4.661111,13.25,25.271739,15.082418,6.451111,11.759783,14.097826,8.967033,2.06,1.720652,1.076087,1.461538
3,2010,93,Sangiovese,Le Gode,4.661111,13.25,25.271739,15.082418,6.451111,11.759783,14.097826,8.967033,2.06,1.720652,1.076087,1.461538
4,2010,97,Cabernet Franc,Le Macchiole,4.661111,13.25,25.271739,15.082418,6.451111,11.759783,14.097826,8.967033,2.06,1.720652,1.076087,1.461538
5,2010,90,Red Blend,Toscolo,4.661111,13.25,25.271739,15.082418,6.451111,11.759783,14.097826,8.967033,2.06,1.720652,1.076087,1.461538
6,2010,88,Sangiovese Grosso,Tenute Silvio Nardi,4.661111,13.25,25.271739,15.082418,6.451111,11.759783,14.097826,8.967033,2.06,1.720652,1.076087,1.461538
7,2010,88,Sangiovese,Mocali,4.661111,13.25,25.271739,15.082418,6.451111,11.759783,14.097826,8.967033,2.06,1.720652,1.076087,1.461538
8,2010,88,Chardonnay,Vigliano,4.661111,13.25,25.271739,15.082418,6.451111,11.759783,14.097826,8.967033,2.06,1.720652,1.076087,1.461538
9,2010,93,Red Blend,Boscarelli,4.661111,13.25,25.271739,15.082418,6.451111,11.759783,14.097826,8.967033,2.06,1.720652,1.076087,1.461538


In [None]:

#Display Basic Stats
df.describe()

## AutoML Configuration

TODO: Explain why you chose the automl settings and cofiguration you used below.

In [12]:
# TODO: Put your automl settings here
#automl_settings = {}

# TODO: Put your automl config here
automl_config = AutoMLConfig(
    experiment_timeout_minutes=45,
    task='classification',
    primary_metric='AUC_weighted',
    training_data= dataset,
    validation_size = 0.20,
    label_column_name='points',
    compute_target = cluster,
    n_cross_validations=5
    )


#metric = normalized_root_mean_squared_error => recommended for review score prediction // for regression scenario

#AUC_weighted for classification

In [13]:
# TODO: Submit your experiment
runAutoML = experiment.submit(automl_config, show_output=True)

Submitting remote run.
No run_configuration provided, running on myCluster with default configuration
Running on remote compute: myCluster


Experiment,Id,Type,Status,Details Page,Docs Page
Tuscan_Wine_Forecast,AutoML_c00a5743-3b17-4db7-941f-7228df25ebbf,automl,NotStarted,Link to Azure Machine Learning studio,Link to Documentation



Current status: FeaturesGeneration. Generating features for the dataset.
Current status: DatasetBalancing. Performing class balancing sweeping
Current status: DatasetCrossValidationSplit. Generating individually featurized CV splits.
Current status: ModelSelection. Beginning model selection.

****************************************************************************************************
DATA GUARDRAILS: 

TYPE:         Class balancing detection
STATUS:       ALERTED
DESCRIPTION:  To decrease model bias, please cancel the current run and fix balancing problem.
              Learn more about imbalanced data: https://aka.ms/AutomatedMLImbalancedData
DETAILS:      Imbalanced data can lead to a falsely perceived positive effect of a model's accuracy because the input data has bias towards one class.
+---------------------------------+---------------------------------+--------------------------------------+
|Size of the smallest class       |Name/Label of the smallest class |Number of 

## Run Details

OPTIONAL: Write about the different models trained and their performance. Why do you think some models did better than others?

TODO: In the cell below, use the `RunDetails` widget to show the different experiments.

In [14]:
RunDetails(runAutoML).show()

_AutoMLWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': False, 'log_level': 'INFO', 's…

## Best Model

TODO: In the cell below, get the best model from the automl experiments and display all the properties of the model.



In [9]:
#WARNING:root:The model you attempted to retrieve requires 'azureml-train-automl-runtime' to be installed at '==1.34.1'. 
#Please install 'azureml-train-automl-runtime==1.34.1' (e.g. `pip install azureml-train-automl-runtime==1.34.1`) and then rerun the previous command.

import subprocess
import sys

def install(package):
    subprocess.check_call([sys.executable, "-m", "pip", "install", 'azureml-train-automl-runtime==1.34.1'])


In [18]:
best_run, fitted_model = runAutoML.get_output()
print(best_run)



print(fitted_model)

RunDetails(best_run).show()

Package:azureml-core, training version:1.35.0, current version:1.34.0
Package:azureml-dataprep, training version:2.23.2, current version:2.22.2
Package:azureml-dataprep-rslex, training version:1.21.2, current version:1.20.1
Package:azureml-dataset-runtime, training version:1.35.0, current version:1.34.0
Package:azureml-defaults, training version:1.35.0, current version:1.34.0
Package:azureml-interpret, training version:1.35.0, current version:1.34.0
Package:azureml-mlflow, training version:1.35.0, current version:1.34.0
Package:azureml-pipeline-core, training version:1.35.0, current version:1.34.0
Package:azureml-telemetry, training version:1.35.0, current version:1.34.0
Package:azureml-train-automl-client, training version:1.35.0, current version:1.34.0
Package:azureml-train-core, training version:1.35.0, current version:1.34.0
Package:azureml-train-restclients-hyperdrive, training version:1.35.0, current version:1.34.0
Package:azureml-responsibleai, training version:1.35.0
Package:az

Run(Experiment: Tuscan_Wine_Forecast,
Id: AutoML_c00a5743-3b17-4db7-941f-7228df25ebbf_30,
Type: azureml.scriptrun,
Status: Completed)
None


_UserRunWidget(widget_settings={'childWidgetDisplay': 'popup', 'send_telemetry': False, 'log_level': 'INFO', '…

In [19]:
#TODO: Save the best model

# Register run as Model

model_name = "AleaumeModelAutoML"
description = "Best AutoML Model"
model_path ="outputs/modelAutoML.pkl"

#model = best_run.register_model(model_name = model_name, description = description, model_path = model_path )

model = runAutoML.register_model(model_name = model_name, description = description)

#SOURCE / HELP: https://docs.microsoft.com/en-us/azure/machine-learning/how-to-deploy-and-where?tabs=python

#Unfortunately so far unsuccessful to register the model via SDK and get a default generated swagger.json uri. Also,  MSFT support recommends to do so via AML studio...

INFO:interpret_community.common.explanation_utils:Using default datastore for uploads
INFO:interpret_community.common.explanation_utils:Using default datastore for uploads


## Model Deployment

Remember you have to deploy only one of the two models you trained but you still need to register both the models. Perform the steps in the rest of this notebook only if you wish to deploy this model.

TODO: In the cell below, register the model, create an inference config and deploy the model as a web service.

In [20]:
#To create a local deployment configuration
deployment_config = LocalWebservice.deploy_configuration(port=9064)

#Create Environment

env = Environment(name="AzureML-AutoML")
#myenv=env.clone("myenv")
#myenv.python.conda_dependencies.add_pip_package("joblib==1.1.0")

#Uploaded manually entry scripts in the below defined "entry_script" path.

my_inference_config = InferenceConfig(
    environment=env,
    source_directory= './',
    entry_script="./Users/odl_user_162405/score.py"
    #entry_script="./score.py"
)



# Deploy the service locally

service = model.deploy(ws, "local-service", [model], my_inference_config, deployment_config)
service.reload()
print(service.get_logs())

print(service.scoring_uri)

service.wait_for_deployment(show_output=True)


Downloading model AleaumeModelAutoML:1 to /tmp/azureml_kcwitrdd/AleaumeModelAutoML/1
Generating Docker build context.


INFO:interpret_community.common.explanation_utils:Using default datastore for uploads
INFO:interpret_community.common.explanation_utils:Using default datastore for uploads
INFO:interpret_community.common.explanation_utils:Using default datastore for uploads
INFO:interpret_community.common.explanation_utils:Using default datastore for uploads
INFO:interpret_community.common.explanation_utils:Using default datastore for uploads
INFO:interpret_community.common.explanation_utils:Using default datastore for uploads
INFO:interpret_community.common.explanation_utils:Using default datastore for uploads
INFO:interpret_community.common.explanation_utils:Using default datastore for uploads
INFO:interpret_community.common.explanation_utils:Using default datastore for uploads
INFO:interpret_community.common.explanation_utils:Using default datastore for uploads
INFO:interpret_community.common.explanation_utils:Using default datastore for uploads
INFO:interpret_community.common.explanation_utils:Usin

Package creation Succeeded
Logging into Docker registry viennaglobal.azurecr.io
Logging into Docker registry viennaglobal.azurecr.io
Building Docker image from Dockerfile...
Step 1/5 : FROM viennaglobal.azurecr.io/azureml/azureml_744753b4f832c74a6086091ff294e4a6


INFO:interpret_community.common.explanation_utils:Using default datastore for uploads
INFO:interpret_community.common.explanation_utils:Using default datastore for uploads
INFO:interpret_community.common.explanation_utils:Using default datastore for uploads


 ---> 7101fb6a62f0
Step 2/5 : COPY azureml-app /var/azureml-app
 ---> 452d255c1b69
Step 3/5 : RUN mkdir -p '/var/azureml-app' && echo eyJhY2NvdW50Q29udGV4dCI6eyJzdWJzY3JpcHRpb25JZCI6IjNlNDJkMTFmLWQ2NGQtNDE3My1hZjliLTEyZWNhYTEwMzBiMyIsInJlc291cmNlR3JvdXBOYW1lIjoiYW1sLXF1aWNrc3RhcnRzLTE2MjQwNSIsImFjY291bnROYW1lIjoicXVpY2stc3RhcnRzLXdzLTE2MjQwNSIsIndvcmtzcGFjZUlkIjoiZTlkNmUwYjEtZGViZi00ZjUwLTlhMjctMzAzN2FmODI3Mzc1In0sIm1vZGVscyI6e30sIm1vZGVsc0luZm8iOnt9fQ== | base64 --decode > /var/azureml-app/model_config_map.json
 ---> Running in 65b1c1f3d190


INFO:interpret_community.common.explanation_utils:Using default datastore for uploads


 ---> aa3f82133e1c
Step 4/5 : RUN mv '/var/azureml-app/tmpu3oaxgh4.py' /var/azureml-app/main.py
 ---> Running in 2e0e92394e0c
 ---> 6a5d52327f09
Step 5/5 : CMD ["runsvdir","/var/runit"]
 ---> Running in f064d81bc341
 ---> 7a4425d2eb8c
Successfully built 7a4425d2eb8c
Successfully tagged local-service:latest
Starting Docker container...
Docker container running.
Container has been successfully cleaned up.
Starting Docker container...
Docker container running.
2021-10-30T15:57:40,806216906+00:00 - gunicorn/run 
2021-10-30T15:57:40,806351511+00:00 - rsyslog/run 
Dynamic Python package installation is disabled.
Starting HTTP server
2021-10-30T15:57:40,812209622+00:00 - nginx/run 
2021-10-30T15:57:40,812495833+00:00 - iot-server/run 
rsyslogd: /azureml-envs/azureml_705720c76ff57b57c77d577152dabb18/lib/libuuid.so.1: no version information available (required by rsyslogd)

http://localhost:9064/score
Checking container health...
Local webservice is running at http://localhost:9064


In [29]:
#Call model to test 

#service.update(enable_app_insights=True)


uri = service.scoring_uri
requests.get("http://localhost:9064")
headers = {"Content-Type": "application/json"}
data = {
            "vintage": 2016,
            "variety": "Sangiovese",
            "winery": "Casa Raia",
            "avg winter temp": 4.2,
            "avg spring temp": 15,
            "avg summer temp": 27,
			"avg fall temp":16,
			"avg winter sun hour":7,
			"avg spring hour":12,
			"avg summer sun hour":16,
			"avg fall sun hour":9,
			"avg daily precip winter":3,
			"avg daily precip spring":3,
			"avg daily precip summer":1,
			"avg daily precip fall":2
}
data = json.dumps(data)
response = requests.post(uri, data=data, headers=headers)
print(response.json())

JSONDecodeError: Expecting value: line 1 column 1 (char 0)

INFO:interpret_community.common.explanation_utils:Using default datastore for uploads


In [25]:
#Deploy to ACI

deployment_config = AciWebservice.deploy_configuration(
    cpu_cores=1, memory_gb=1, auth_enabled=True
)

service = model.deploy(
    ws,
    "mywebservice",
    [model],
    my_inference_config,
    deployment_config,
    overwrite=True,
)
service.wait_for_deployment(show_output=True)

print(service.get_logs())

INFO:interpret_community.common.explanation_utils:Using default datastore for uploads


Tips: You can try get_logs(): https://aka.ms/debugimage#dockerlog or local deployment: https://aka.ms/debugimage#debug-locally to debug if deployment takes longer than 10 minutes.
Running
2021-10-30 16:10:38+00:00 Creating Container Registry if not exists.
2021-10-30 16:10:39+00:00 Registering the environment.
2021-10-30 16:10:39+00:00 Use the existing image.
2021-10-30 16:10:39+00:00 Generating deployment configuration.
2021-10-30 16:10:40+00:00 Submitting deployment to compute.
2021-10-30 16:10:44+00:00 Checking the status of deployment mywebservice.

INFO:interpret_community.common.explanation_utils:Using default datastore for uploads
INFO:interpret_community.common.explanation_utils:Using default datastore for uploads
INFO:interpret_community.common.explanation_utils:Using default datastore for uploads
INFO:interpret_community.common.explanation_utils:Using default datastore for uploads
INFO:interpret_community.common.explanation_utils:Using default datastore for uploads
INFO:interpret_community.common.explanation_utils:Using default datastore for uploads
INFO:interpret_community.common.explanation_utils:Using default datastore for uploads


.
2021-10-30 16:15:10+00:00 Checking the status of inference endpoint mywebservice.
Succeeded
ACI service creation operation finished, operation "Succeeded"
2021-10-30T16:14:55,115690200+00:00 - rsyslog/run 
2021-10-30T16:14:55,125918600+00:00 - iot-server/run 
2021-10-30T16:14:55,157314500+00:00 - nginx/run 
rsyslogd: /azureml-envs/azureml_705720c76ff57b57c77d577152dabb18/lib/libuuid.so.1: no version information available (required by rsyslogd)
2021-10-30T16:14:55,167991600+00:00 - gunicorn/run 
Dynamic Python package installation is disabled.
Starting HTTP server
EdgeHubConnectionString and IOTEDGE_IOTHUBHOSTNAME are not set. Exiting...
2021-10-30T16:14:55,467553000+00:00 - iot-server/finish 1 0
2021-10-30T16:14:55,468959200+00:00 - Exit code 1 is normal. Not restarting iot-server.
Starting gunicorn 20.1.0
Listening at: http://127.0.0.1:31311 (72)
Using worker: sync
worker timeout is set to 300
Booting worker with pid: 99
SPARK_HOME not set. Skipping PySpark Initialization.
Initializ

TODO: In the cell below, send a request to the web service you deployed to test it.

In [32]:
import requests
import json

primary, secondary = service.get_keys()

# URL for the web service
scoring_uri = service.scoring_uri
# If the service is authenticated, set the key or token
key = primary

# Two sets of data to score, so we get two results back
data = {
			"vintage": 2016,
			"variety": "Sangiovese",
			"winery": "Casa Raia",
			"avg winter temp": 4.2,
			"avg spring temp": 15,
			"avg summer temp": 27,
			"avg fall temp":16,
			"avg winter sun hour":7,
			"avg spring hour":12,
			"avg summer sun hour":16,
			"avg fall sun hour":9,
			"avg daily precip winter":3,
			"avg daily precip spring":3,
			"avg daily precip summer":1,
			"avg daily precip fall":2
	
}
# Convert to JSON string
input_data = json.dumps(data)

# Set the content type
headers = {'Content-Type': 'application/json'}
# If authentication is enabled, set the authorization header
headers['Authorization'] = f'Bearer {key}'

# Make the request and display the response
resp = requests.post(scoring_uri, input_data, headers=headers)
print(resp.text)

{"data": "'data'", "message": "Failed to predict "}


INFO:interpret_community.common.explanation_utils:Using default datastore for uploads


TODO: In the cell below, print the logs of the web service and delete the service

In [25]:
# Deleting the webservice

service.delete()

**Submission Checklist**
- I have registered the model.
- I have deployed the model with the best accuracy as a webservice.
- I have tested the webservice by sending a request to the model endpoint.
- I have deleted the webservice and shutdown all the computes that I have used.
- I have taken a screenshot showing the model endpoint as active.
- The project includes a file containing the environment details.
