
### Prerequisites

This notebook assumes you have:

* An Azure subscription
* An Azure Machine Learning workspace in that subscription
* A provisioned CPU cluster in that workspace

### Before Running this Notebook

Configure the *config.json* file with the relevant Azure subscription ID, Azure resource group, and Azure Machine Learning workspace name.
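As a quick sanity check before running the notebook, the sketch below verifies that a config file contains the three values that the workspace cell reads (`subscription_id`, `resource_group`, `workspace_name`). The helper name `load_workspace_config` and the placeholder values are illustrative assumptions, not part of the Azure ML SDK, and the demo writes to a temporary directory so it does not touch your real *config.json*.

```python
import json
import tempfile
from pathlib import Path

# Keys read from config.json by the workspace cell in this notebook.
REQUIRED_KEYS = ("subscription_id", "resource_group", "workspace_name")


def load_workspace_config(path="config.json"):
    """Load the config file and fail early if a required value is absent or empty."""
    config = json.loads(Path(path).read_text())
    missing = [key for key in REQUIRED_KEYS if not config.get(key)]
    if missing:
        raise ValueError(f"{path} is missing values for: {', '.join(missing)}")
    return config


# Placeholder values for illustration only; substitute your own.
example = {
    "subscription_id": "<your-subscription-id>",
    "resource_group": "<your-resource-group>",
    "workspace_name": "<your-workspace-name>",
}

# Demonstrate on a throwaway copy rather than the real config.json.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "config.json"
    demo_path.write_text(json.dumps(example, indent=2))
    print(load_workspace_config(demo_path))
```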

Create a conda environment from the provided *myenv.yml* file:

`conda env create --name myenv --file myenv.yml`

### Running the Notebook

This notebook will:
* Connect to your Azure ML workspace
* Create a new experiment
* Create a registered dataset from a naively engineered version of the Titanic dataset
* Publish a decision tree classifier model to the workspace
* Deploy a webservice for inferencing the model (see *score.py* for inferencing configuration and input schema)

There is also a testing cell at the end to verify the service is operational.
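The test request in the final cell sends a `data` list of records with the keys `sex`, `pclass`, `age`, and `unaccompanied`. As a rough sketch of how a scoring script might turn such a request into feature rows before calling the model, consider the following. The names `FEATURES` and `rows_from_request` are hypothetical, the feature order is an assumption based on that test request, and the real *score.py* may be organized differently.

```python
import json

# Feature order assumed from the test request in this notebook;
# the real score.py may expect a different order.
FEATURES = ["sex", "pclass", "age", "unaccompanied"]


def rows_from_request(raw_request):
    """Convert a JSON request body into a list of feature rows."""
    payload = json.loads(raw_request)
    rows = []
    for index, record in enumerate(payload["data"]):
        missing = [name for name in FEATURES if name not in record]
        if missing:
            raise ValueError(f"record {index} is missing fields: {missing}")
        rows.append([record[name] for name in FEATURES])
    return rows


request = json.dumps({"data": [
    {"sex": 0, "pclass": 3, "age": 3, "unaccompanied": 1},
    {"sex": 1, "pclass": 1, "age": 2, "unaccompanied": 1},
]})
print(rows_from_request(request))
```

Rejecting incomplete records up front gives the caller a readable error instead of a shape mismatch inside the model.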

### Troubleshooting

#### My Azure ML Workspace is not in my default tenant.

If your Azure Machine Learning workspace is not in the default tenant you see when you log in to the Azure Portal, you will need to log in manually outside of this notebook, using either the Azure CLI (installed with the conda environment above) or Azure PowerShell.

For Azure CLI (replace {tenant_id} with your tenant GUID):

`az login -t {tenant_id}`

For Azure PowerShell:

`Connect-AzAccount -Tenant '{tenant_id}'`
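A mistyped tenant GUID is an easy way for this login step to fail. The sketch below, which assumes you want to check the value from Python before pasting it into a terminal, validates the GUID with the standard `uuid` module and builds the Azure CLI arguments shown above. The helper name `az_login_command` is hypothetical, and the all-zero GUID is a placeholder.

```python
import uuid


def az_login_command(tenant_id):
    """Return the Azure CLI argument list for logging in to a specific tenant."""
    # uuid.UUID raises ValueError if the string is not a well-formed GUID.
    tenant = str(uuid.UUID(tenant_id))
    return ["az", "login", "-t", tenant]


# Placeholder tenant GUID for illustration only.
print(" ".join(az_login_command("00000000-0000-0000-0000-000000000000")))
```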



In [2]:
from azureml.core import Webservice, Workspace, Dataset, Datastore, Experiment, Run
from azureml.core.model import InferenceConfig, Model
import azureml.dataprep
import math, random, pickle
import pandas as pd
import numpy as np

In [3]:
experiment_name = "titanic_classifier"
webservice_name = 'titanic-classifier' # only accepts alphanumerics and dashes
dataset_name = "titanic_ds"
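The comment on `webservice_name` can be turned into an early check, so that an invalid name is caught before deployment starts. This is a minimal sketch that encodes only the rule stated in that comment (alphanumerics and dashes). The helper name `is_valid_service_name` is hypothetical, and Azure may enforce further limits, such as length or casing, that are not checked here.

```python
import re

# Rule taken from the comment on webservice_name: alphanumerics and dashes only.
# Any additional service-side limits are not covered by this pattern.
_NAME_PATTERN = re.compile(r"[A-Za-z0-9-]+")


def is_valid_service_name(name):
    """Return True if the name contains only alphanumerics and dashes."""
    return _NAME_PATTERN.fullmatch(name) is not None


print(is_valid_service_name("titanic-classifier"))  # dashes are allowed
print(is_valid_service_name("titanic_classifier"))  # underscores are not
```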

In [4]:
from azureml.core.authentication import AzureCliAuthentication
import json

try:
    ws = Workspace.from_config()  # only works if your ML workspace is in your default tenant
except Exception:
    # workaround: log in with "az login -t {tenant_id}" in the CLI first
    cli_auth = AzureCliAuthentication()

    with open("./config.json") as json_file:
        config = json.load(json_file)

    ws = Workspace(subscription_id=config['subscription_id'],
                   resource_group=config['resource_group'],
                   workspace_name=config['workspace_name'],
                   auth=cli_auth)

Performing interactive authentication. Please follow the instructions on the terminal.
Interactive authentication successfully completed.


In [35]:
experiment = Experiment(workspace = ws, name = experiment_name)

In [17]:
datastore = Datastore.get_default(workspace=ws)
datastore

{
  "name": "workspaceblobstore",
  "container_name": "azureml-blobstore-6f7dfc08-44b5-438d-a6bc-9c804e0bdd76",
  "account_name": "kylemhaleamlsa",
  "protocol": "https",
  "endpoint": "core.windows.net"
}

In [51]:
#Upload and register our engineered dataset

datastore.upload(src_dir='./data/uploads', target_path='data', overwrite=True)

dataset = Dataset.Tabular.from_delimited_files(datastore.path('data/titanic-engineered.csv'))
dataset = dataset.register(workspace=ws, name=dataset_name, description="Titanic training data", create_new_version=True)

Uploading an estimated of 1 files
Uploading ./data/uploads\titanic-engineered.csv
Uploaded ./data/uploads\titanic-engineered.csv, 1 files out of an estimated total of 1
Uploaded 1 files


In [1]:
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

run = experiment.start_logging(snapshot_directory=None)


titanic_df = dataset.to_pandas_dataframe()

# separate dependent and independent variables
X = titanic_df.iloc[ : , :-1].values
y = titanic_df.iloc[ : , 4].values

# 1/3 testing, 2/3 training
X_train, X_test, Y_train, Y_test = train_test_split(X, y, test_size=0.33, random_state=51)

# simple decision tree for demo purposes
decision_tree = DecisionTreeClassifier() 
decision_tree.fit(X_train, Y_train)  
Y_pred = decision_tree.predict(X_test)
# accuracy on the held-out test split, as a percentage
acc_decision_tree = round(decision_tree.score(X_test, Y_test) * 100, 2)

# Log final results
run.log("Decision tree accuracy", acc_decision_tree)

import os

filename = 'outputs/finalized_model.sav'
os.makedirs('outputs', exist_ok=True)  # open() fails if the directory is missing
with open(filename, 'wb') as model_file:
    pickle.dump(decision_tree, model_file)
run.upload_file(name=filename, path_or_stream=filename)

# Complete tracking and get link to details
run.complete()
print("Run completed")

NameError: name 'experiment' is not defined

In [57]:
model = run.register_model(model_name = "titanic_classifier_model", model_path = "outputs/finalized_model.sav")

In [59]:
inference_config = InferenceConfig(entry_script='score.py', runtime='python', conda_file='service-env.yml')


In [64]:
from azureml.core.webservice import AciWebservice

try:
    # update the service in place if it already exists
    service = Webservice(ws, webservice_name)
    service.update(models=[model], inference_config=inference_config)
except Exception:
    # otherwise deploy a new service on Azure Container Instances
    aci_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)

    service = Model.deploy(workspace=ws,
                           name=webservice_name,
                           models=[model],
                           inference_config=inference_config,
                           deployment_config=aci_config,
                           deployment_target=None)
    service.update(description='Binary classifier for Titanic')
    service.wait_for_deployment(show_output=True)

print(f'Service State: {service.state}')

Succeeded
ACI service creation operation finished, operation "Succeeded"
Service State: Transitioning


In [70]:
# Testing our webservice

import json
from azureml.core import Webservice

service = Webservice(workspace=ws, name=webservice_name)

request = json.dumps({"data": [{"sex": 0, "pclass": 3, "age": 3, "unaccompanied": 1},
                               {"sex": 1, "pclass": 1, "age": 2, "unaccompanied": 1}]})
response = service.run(request)
response  # should receive an array with 2 predicted survival values

[1, 1]