### Operationalizing ML Models
John Hoff  
Machine Learning Architect  
jhoff@productiveedge.com
# Step 5: Collect Model Usage
![Step 5: Collect Model Usage](https://drive.google.com/uc?export=view&id=1F2S7oLDKU4TRGVofFgeIyvmYENHW3sam)

This step collects the model inputs that were sent to the deployed model in Step 4 and saves them to the `bank_marketing_usage` Hive table for use in Step 6.

_**Please Note: The ACI and AKS deployment scenarios are not currently supported in this notebook.  This notebook will be updated once Databricks and Microsoft have sorted out the current issues with native Spark MLlib model deployments.**_

_Please Note: Do not use "Run All". Run only the cells under the option that matches the deployment scenario used in Step 4._

In [2]:
from azureml.core import Workspace
from azureml.core.webservice import Webservice
import pandas as pd
import re

In [3]:
workspace_name = 'ai-saturday'
workspace_location = 'Central US'
resource_group = 'databricks-ai-saturday'
subscription_id = 'f11ca4c4-08ba-41d9-8102-89797ec8f810'

mlflow_run_id = '42640d86a1c24df0adadb0a09c2655f9'

aci_webservice_name = 'acisparkmodel'
aks_cluster_name = 'akssparkcluster'
aks_webservice_name = 'akssparkmodel'

## Option 1: Model Was Deployed Locally
Run the following cell to save the usage data for the locally deployed model. It copies the `bank_marketing_skewed` table into `bank_marketing_usage`.

In [5]:
usage_data = spark.sql("select * from bank_marketing_skewed")
usage_data.write.saveAsTable("bank_marketing_usage", mode='overwrite')

## Option 2: Model Was Deployed to ACI
Run the following cells to collect the usage data sent through the ACI-deployed model from the web service logs.

In [7]:
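# Connect to the Azure ML workspace (exist_ok=True reuses it if it already exists).
workspace = Workspace.create(name=workspace_name,
                             location=workspace_location,
                             resource_group=resource_group,
                             subscription_id=subscription_id,
                             exist_ok=True)

# Look up the ACI web service by name and fail early if it is missing.
aci_webservice = None
for service in Webservice.list(workspace):
  if service.name == aci_webservice_name:
    aci_webservice = service
    break
if aci_webservice is None:
  raise ValueError('Web service not found: ' + aci_webservice_name)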
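
The name lookup above can also be factored into a small helper that reports which services do exist when the requested one is missing. This is only a sketch: `find_service` is a hypothetical helper, not part of the Azure ML SDK, and the stand-in objects below merely mimic the `.name` attribute of the services returned by `Webservice.list(workspace)`.

```python
from types import SimpleNamespace


def find_service(services, name):
    """Return the first service whose .name equals `name`, else raise LookupError."""
    services = list(services)  # allow generators; the list is scanned twice on failure
    for service in services:
        if service.name == name:
            return service
    available = sorted(s.name for s in services)
    raise LookupError(f"No web service named {name!r}; available: {available}")


# Stand-in objects for illustration only; in the notebook, pass Webservice.list(workspace).
demo_services = [SimpleNamespace(name='acisparkmodel'),
                 SimpleNamespace(name='akssparkmodel')]
print(find_service(demo_services, 'akssparkmodel').name)
```

In the notebook this would be called as `find_service(Webservice.list(workspace), aci_webservice_name)`, so that a misspelled service name fails immediately rather than in a later cell.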

In [8]:
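import io

# Extract every logged request payload from the ACI web service logs.
input_lines = re.findall(r'Received input: (\{.*\})', aci_webservice.get_logs(1000000))
# Each payload is a split-oriented JSON DataFrame; parse them all, then concatenate once.
frames = [pd.read_json(io.StringIO(payload), orient='split') for payload in input_lines]
if not frames:
  raise ValueError('No "Received input" entries found in the ACI web service logs')
pandas_df = pd.concat(frames, ignore_index=True)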
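
To make the log-parsing step easier to follow, here is a minimal, standard-library-only sketch of what the cell above does. The log excerpt is hypothetical: the timestamps, the surrounding messages, and the `age` and `job` columns are assumptions for illustration, and real service logs will look different. The one thing taken from the notebook is that each request appears after `Received input: ` as a split-oriented JSON object with `columns`, `index`, and `data` keys.

```python
import json
import re

# Hypothetical log excerpt; real timestamps, messages, and columns will differ.
sample_logs = """\
2019-05-01T12:00:00 INFO Starting scoring service
2019-05-01T12:00:05 INFO Received input: {"columns": ["age", "job"], "index": [0, 1], "data": [[41, "technician"], [35, "admin."]]}
2019-05-01T12:00:06 INFO Prediction complete
2019-05-01T12:00:09 INFO Received input: {"columns": ["age", "job"], "index": [0], "data": [[52, "retired"]]}
"""


def extract_rows(log_text):
    """Return one dict per input row found in 'Received input:' log lines."""
    rows = []
    # The capture group keeps only the JSON payload, not the log prefix.
    for payload in re.findall(r'Received input: (\{.*\})', log_text):
        frame = json.loads(payload)
        for values in frame["data"]:
            rows.append(dict(zip(frame["columns"], values)))
    return rows


rows = extract_rows(sample_logs)
print(len(rows))  # 3
print(rows[0])    # {'age': 41, 'job': 'technician'}
```

Because `.` does not match newlines, each log line yields at most one payload, and lines without the `Received input: ` marker are ignored.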

In [9]:
usage_data = spark.createDataFrame(pandas_df)
usage_data.write.saveAsTable("bank_marketing_usage", mode='overwrite')

## Option 3: Model Was Deployed to AKS
Run the following cells to collect the usage data sent through the AKS-deployed model from the web service logs.

In [11]:
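# Connect to the Azure ML workspace (exist_ok=True reuses it if it already exists).
workspace = Workspace.create(name=workspace_name,
                             location=workspace_location,
                             resource_group=resource_group,
                             subscription_id=subscription_id,
                             exist_ok=True)

# Look up the AKS web service by name and fail early if it is missing.
aks_webservice = None
for service in Webservice.list(workspace):
  if service.name == aks_webservice_name:
    aks_webservice = service
    break
if aks_webservice is None:
  raise ValueError('Web service not found: ' + aks_webservice_name)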

In [12]:
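import io

# Extract every logged request payload from the AKS web service logs.
input_lines = re.findall(r'Received input: (\{.*\})', aks_webservice.get_logs(1000000))
# Each payload is a split-oriented JSON DataFrame; parse them all, then concatenate once.
frames = [pd.read_json(io.StringIO(payload), orient='split') for payload in input_lines]
if not frames:
  raise ValueError('No "Received input" entries found in the AKS web service logs')
pandas_df = pd.concat(frames, ignore_index=True)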

In [13]:
usage_data = spark.createDataFrame(pandas_df)
usage_data.write.saveAsTable("bank_marketing_usage", mode='overwrite')