# Working with Azure Machine Learning Studio engine

This notebook shows how to log the payload for a model deployed on Microsoft Azure serving engine using AI OpenScale python sdk.

Contents
- [1. Setup](#setup)
- [2. Binding machine learning engine](#binding)
- [3. Subscriptions](#subscription)
- [4. Scoring and payload logging](#scoring)
- [5. Feedback logging](#feedback)
- [6. Data Mart](#datamart)

<a id="setup"></a>
## 1. Setup

### 1.0 Sample model creation using [Azure Machine Learning Studio](https://studio.azureml.net)

- Download training data set from [here](https://github.com/pmservice/wml-sample-models/raw/master/spark/product-line-prediction/data/GoSales_Tx.csv)
- [Create an experiment in Azure ML Studio](https://docs.microsoft.com/en-us/azure/machine-learning/studio/create-experiment) using the diagram below. (You can search for each module in the palette by name)
- When you get to the `Train Model` module, select the `Product Line` column as the label.
- Run the experiment to train the model.
- [Create (deploy) web service](https://docs.microsoft.com/en-us/azure/machine-learning/studio/publish-a-machine-learning-web-service) (Choose the `new` NOT `classic`)

<img src="https://github.com/pmservice/ai-openscale-tutorials/raw/master/notebooks/images/azure_product_line_model.png" align="left" alt="experiment">

**NOTE:** Classic web services are not supported.

### 1.1 Installation and authentication

In [46]:
!pip install --upgrade ibm-ai-openscale --no-cache | tail -n 1

Successfully installed ibm-ai-openscale-1.0.456


Import and initiate.

In [47]:
from ibm_ai_openscale import APIClient
from ibm_ai_openscale.supporting_classes import PayloadRecord
from ibm_ai_openscale.supporting_classes.enums import InputDataType, ProblemType
from ibm_ai_openscale.engines import *
from ibm_ai_openscale.utils import *

#### ACTION: Get AI OpenScale `instance_guid` and `apikey`

[Install IBM Cloud (bluemix) console](https://console.bluemix.net/docs/cli/reference/ibmcloud/download_cli.html#install_use)

Use the IBM Cloud CLI to get an api key:
```bash
ibmcloud login --sso
ibmcloud iam api-key-create 'my_key'
```

Get your AI OpenScale instance GUID:
> if your resource group is different than `default` switch to the resource group containing AI OpenScale instance

```bash
ibmcloud target -g <myResourceGroup>
```

Get details of the instance:
```bash
ibmcloud resource service-instance `AI-OpenScale-instance_name`
```

#### Let's define some constants required to set up data mart:

- AIOS_CREDENTIALS
- POSTGRES_CREDENTIALS
- SCHEMA_NAME

In [48]:
AIOS_CREDENTIALS = {
  "url": "https://api.aiopenscale.cloud.ibm.com",
  "instance_guid": "****",
  "apikey": "****"
}

In [49]:
POSTGRES_CREDENTIALS = {
    "db_type": "postgresql",
    "uri_cli_1": "xxx",
    "maps": [],
    "instance_administration_api": {
        "instance_id": "xxx",
        "root": "xxx",
        "deployment_id": "xxx"
    },
    "name": "xxx",
    "uri_cli": "xxx",
    "uri_direct_1": "xxx",
    "ca_certificate_base64": "xxx",
    "deployment_id": "xxx",
    "uri": "xxx"
}


In [50]:
SCHEMA_NAME = 'data_mart_for_azure'

Create schema for data mart.

In [51]:
create_postgres_schema(postgres_credentials=POSTGRES_CREDENTIALS, schema_name=SCHEMA_NAME)

In [52]:
client = APIClient(AIOS_CREDENTIALS)

In [53]:
client.version

'1.0.429'

### 1.2 DataMart setup

In [55]:
client.data_mart.setup(db_credentials=POSTGRES_CREDENTIALS, schema=SCHEMA_NAME)

In [56]:
data_mart_details = client.data_mart.get_details()

<a id="binding"></a>
## 2. Bind machine learning engines

### 2.1 Bind  `Azure` machine learning engine

Provide credentials using following fields:
- `client_id`
- `client_secret`
- `subscription_id`
- `tenant`

In [57]:
AZURE_ENGINE_CREDENTIALS = {
    "client_id": "***",
    "client_secret": "***",
    "subscription_id": "***",
    "tenant": "***"
}

In [58]:
binding_uid = client.data_mart.bindings.add('My Azure ML Studio engine', AzureMachineLearningInstance(AZURE_ENGINE_CREDENTIALS))

In [59]:
bindings_details = client.data_mart.bindings.get_details()

In [60]:
client.data_mart.bindings.list()

0,1,2,3
149502ff-65b3-45c9-a145-a7b2ac25ba32,My Azure ML Studio engine,azure_machine_learning,2019-01-29T21:54:44.760Z


<a id="subsciption"></a>
## 3. Subscriptions

### 3.1 Add subscriptions

List available deployments.

**Note:** Depending on the number of assets it may take some time.

In [61]:
client.data_mart.bindings.list_assets()

0,1,2,3,4,5,6
986fd3e779b52d0e23a2bde5b6da996c,ScottdaAzureML12.2019.1.29.21.29.50.528,2019-01-29T21:30:38.2982264Z,model,,149502ff-65b3-45c9-a145-a7b2ac25ba32,False
0c32322ed02bb81b286b16ef5de94466,GermanCreditRiskFastpath.2019.1.24.19.50.12.591,2019-01-24T19:52:49.7309075Z,model,,149502ff-65b3-45c9-a145-a7b2ac25ba32,False
9b6abb9a934cecf33a465f07d73aabe6,MultiClass-ProductLineTestAutomationNC,2019-01-21T06:50:06.8313304Z,model,,149502ff-65b3-45c9-a145-a7b2ac25ba32,False
33ee79fb8e64881e9182dd4fd3475586,ClaimInsuranceRegressionTestAutomationCat,2019-01-21T06:46:49.6811946Z,model,,149502ff-65b3-45c9-a145-a7b2ac25ba32,False
97521fdfaf0830239fadfbac0d5d0091,ClaimInsuranceRegressionTestAutomationNC,2019-01-21T06:41:04.8380274Z,model,,149502ff-65b3-45c9-a145-a7b2ac25ba32,False
a7474537b6166d3cc01481b647423769,Germancreditrisk.2019.1.17.14.9.9.994,2019-01-17T14:09:46.5811815Z,model,,149502ff-65b3-45c9-a145-a7b2ac25ba32,False
085460ef94636166aea5800e9ea26168,GermanCreditRisk.2019.1.9.10.41.58.611,2019-01-09T10:42:59.7933412Z,model,,149502ff-65b3-45c9-a145-a7b2ac25ba32,False
77919091f862ab6a128edc60ccafc064,ASHDrugModelPred.2019.1.8.9.48.15.590,2019-01-08T09:48:33.9184573Z,model,,149502ff-65b3-45c9-a145-a7b2ac25ba32,False
920beeba5513b8059799900ecfcbdeca,MultiClass-ProductLineTestAutomation,2019-01-05T05:55:41.6061263Z,model,,149502ff-65b3-45c9-a145-a7b2ac25ba32,False
56f60f0e2f997ed58d851e5a063eb554,ClaimInsuranceRegressionTestAutomation,2019-01-04T10:43:02.5032489Z,model,,149502ff-65b3-45c9-a145-a7b2ac25ba32,False


**Action:** Assign your source_uid to `source_uid` variable below.

In [62]:
source_uid = '986fd3e779b52d0e23a2bde5b6da996c'

In [63]:
subscription = client.data_mart.subscriptions.add(
            AzureMachineLearningAsset(source_uid=source_uid,
                                      binding_uid=binding_uid,
                                      input_data_type=InputDataType.STRUCTURED,
                                      problem_type=ProblemType.MULTICLASS_CLASSIFICATION,
                                      label_column='PRODUCT_LINE',
                                      prediction_column='Scored Labels'))

#### Get subscriptions list

In [64]:
subscriptions = client.data_mart.subscriptions.get_details()

In [65]:
subscriptions_uids = client.data_mart.subscriptions.get_uids()
print(subscriptions_uids)

['986fd3e779b52d0e23a2bde5b6da996c']


#### List subscriptions

In [66]:
client.data_mart.subscriptions.list()

0,1,2,3,4
986fd3e779b52d0e23a2bde5b6da996c,ScottdaAzureML12.2019.1.29.21.29.50.528,model,149502ff-65b3-45c9-a145-a7b2ac25ba32,2019-01-29T21:57:34.818Z


<a id="scoring"></a>
## 4. Scoring and payload logging

### 4.1 Score the product line model and measure response time

In [67]:
import requests
import time
import json

subscription_details = subscription.get_details()
scoring_url = subscription_details['entity']['deployments'][0]['scoring_endpoint']['url']

data = {
    "Inputs": {
        "input1":
            [
                {
                    'GENDER': "F",
                    'AGE': 27,
                    'MARITAL_STATUS': "Single",
                    'PROFESSION': "Professional",
                    'PRODUCT_LINE': "Personal Accessories",
                }
            ],
    },
    "GlobalParameters": {
    }
}

body = str.encode(json.dumps(data))

token = subscription_details['entity']['deployments'][0]['scoring_endpoint']['credentials']['token']
headers = subscription_details['entity']['deployments'][0]['scoring_endpoint']['request_headers']
headers['Authorization'] = ('Bearer ' + token)

start_time = time.time()
response = requests.post(url=scoring_url, data=body, headers=headers)
response_time = int(time.time() - start_time)*1000
result = response.json()

print(json.dumps(result, indent=2))

{
  "Results": {
    "output1": [
      {
        "GENDER": "F",
        "AGE": "27",
        "MARITAL_STATUS": "Single",
        "PROFESSION": "Professional",
        "PRODUCT_LINE": "Personal Accessories",
        "Scored Probabilities for Class \"Camping Equipment\"": "0",
        "Scored Probabilities for Class \"Golf Equipment\"": "0",
        "Scored Probabilities for Class \"Mountaineering Equipment\"": "0.0570687164231906",
        "Scored Probabilities for Class \"Outdoor Protection\"": "0",
        "Scored Probabilities for Class \"Personal Accessories\"": "0.942931283576809",
        "Scored Labels": "Personal Accessories"
      }
    ]
  }
}


### 4.2 Store the request and response in payload logging table

#### Transform the model's input and output to the format compatible with AI OpenScale standard.

In [68]:
request_data = {'fields': list(data['Inputs']['input1'][0]),
           'values': [list(x.values()) for x in data['Inputs']['input1']]}

response_data = {'fields': list(result['Results']['output1'][0]),
            'values': [list(x.values()) for x in result['Results']['output1']]}

#### Store the payload using Python SDK

**Hint:** You can embed payload logging code into your custom deployment so it is logged automatically each time you score the model.

In [69]:
records_list = [PayloadRecord(request=request_data, response=response_data, response_time=response_time), 
                PayloadRecord(request=request_data, response=response_data, response_time=response_time)]

for i in range(1, 10):
    records_list.append(PayloadRecord(request=request_data, response=response_data, response_time=response_time))

subscription.payload_logging.store(records=records_list)

#### Store the payload using REST API

Get the token first.

In [70]:
token_endpoint = "https://iam.bluemix.net/identity/token"
headers = {
    "Content-Type": "application/x-www-form-urlencoded",
    "Accept": "application/json"
}

data = {
    "grant_type":"urn:ibm:params:oauth:grant-type:apikey",
    "apikey":AIOS_CREDENTIALS["apikey"]
}

req = requests.post(token_endpoint, data=data, headers=headers)
token = req.json()['access_token']

Store the payload.

In [71]:
import requests, uuid

PAYLOAD_STORING_HREF_PATTERN = '{}/v1/data_marts/{}/scoring_payloads'
endpoint = PAYLOAD_STORING_HREF_PATTERN.format(AIOS_CREDENTIALS['url'], AIOS_CREDENTIALS['data_mart_id'])

payload = [{
    'binding_id': binding_uid, 
    'deployment_id': subscription.get_details()['entity']['deployments'][0]['deployment_id'], 
    'subscription_id': subscription.uid, 
    'scoring_id': str(uuid.uuid4()), 
    'response': response_data,
    'request': request_data
}]


headers = {"Authorization": "Bearer " + token}
      
req_response = requests.post(endpoint, json=payload, headers = headers)

print("Request OK: " + str(req_response.ok))

Request OK: True


<a id="feedback"></a>
## 5. Feedback logging & quality (accuracy) monitoring

### Enable quality monitoring

You need to provide the monitoring `threshold` and `min_records` (minimal number of feedback records).

In [72]:
subscription.quality_monitoring.enable(threshold=0.7, min_records=10)

### Feedback records logging

Feedback records are used to evaluate your model. The predicted values are compared to real values (feedback records).

You can check the schema of feedback table using the below method.

In [73]:
subscription.feedback_logging.print_table_schema()

0,1,2
GENDER,string,True
AGE,integer,True
MARITAL_STATUS,string,True
PROFESSION,string,True
PRODUCT_LINE,string,True
_training,timestamp,False


The feedback records can be sent to the feedback table using the code below.

In [74]:
fields = ["GENDER", "AGE", "MARITAL_STATUS", "PROFESSION", "PRODUCT_LINE"]

records = [
    ["F", "27", "Single", "Professional", "Personal Accessories"],
    ["M", "27", "Single", "Professional", "Personal Accessories"]]

for i in range(1,10):
    records.append(["F", "27", "Single", "Professional", "Personal Accessories"])

subscription.feedback_logging.store(feedback_data=records, fields=fields)

### Run quality monitoring on demand

By default, quality monitoring is run on hourly schedule. You can also trigger it on demand using the code below.

In [75]:
run_details = subscription.quality_monitoring.run()

Since the monitoring runs in the background you can use method below to check the status of the job.

In [76]:
status = run_details['status']
id = run_details['id']

print("Run status: {}".format(status))

start_time = time.time()
elapsed_time = 0

while status != 'completed' and elapsed_time < 60:
    time.sleep(10)
    run_details = subscription.quality_monitoring.get_run_details(run_uid=id)
    status = run_details['status']
    elapsed_time = time.time() - start_time
    print("Run status: {}".format(status))

Run status: initializing
Run status: completed


### Show the quality metrics

In [77]:
subscription.quality_monitoring.show_table()

0,1,2,3,4,5,6,7
2019-01-29 22:01:07.307000+00:00,0.9090909090909092,0.7,149502ff-65b3-45c9-a145-a7b2ac25ba32,986fd3e779b52d0e23a2bde5b6da996c,3bf505608b3fce0eaa5c89fc1db08457,Accuracy_evaluation_04039cff-20a5-4786-90c2-826c402796a3,


Get all calculated metrics.

In [78]:
deployment_uids = subscription.get_deployment_uids()

In [79]:
subscription.quality_monitoring.get_metrics(deployment_uid=deployment_uids[0])

{'start': '2019-01-29T20:57:34.818Z',
 'end': '2019-01-29T22:02:04.471142Z',
 'metrics': [{'timestamp': '2019-01-29T22:01:07.307Z',
   'value': {'metrics': [{'name': 'weightedTruePositiveRate',
      'value': 0.9090909090909091},
     {'name': 'accuracy', 'value': 0.9090909090909091},
     {'name': 'weightedFMeasure', 'value': 0.8658008658008658},
     {'name': 'weightedRecall', 'value': 0.9090909090909091},
     {'name': 'weightedFalsePositiveRate', 'value': 0.9090909090909091},
     {'name': 'weightedPrecision', 'value': 0.8264462809917354}],
    'quality': 0.9090909090909091,
    'threshold': 0.7},
   'process': 'Accuracy_evaluation_04039cff-20a5-4786-90c2-826c402796a3'}]}

<a id="datamart"></a>
## 6. Get the logged data

### 6.1 Payload logging

#### Print schema of payload_logging table

In [80]:
subscription.payload_logging.print_table_schema()

0,1,2
scoring_id,string,False
scoring_timestamp,timestamp,False
deployment_id,string,False
asset_revision,string,True
GENDER,string,True
AGE,integer,True
MARITAL_STATUS,string,True
PROFESSION,string,True
PRODUCT_LINE,string,True
"Scored Probabilities for Class ""Camping Equipment""",string,True


#### Show (preview) the table

In [81]:
subscription.payload_logging.describe_table()

        AGE
count   8.0
mean   27.0
std     0.0
min    27.0
25%    27.0
50%    27.0
75%    27.0
max    27.0


#### Return the table content as pandas dataframe

In [82]:
pandas_df = subscription.payload_logging.get_table_content(format='pandas')

### 6.2 Feddback logging

Check the schema of table.

In [83]:
subscription.feedback_logging.print_table_schema()

0,1,2
GENDER,string,True
AGE,integer,True
MARITAL_STATUS,string,True
PROFESSION,string,True
PRODUCT_LINE,string,True
_training,timestamp,False


Preview table content.

In [84]:
subscription.feedback_logging.show_table()

0,1,2,3,4,5
F,27,Single,Professional,Personal Accessories,2019-01-29 22:01:03.429000+00:00
M,27,Single,Professional,Personal Accessories,2019-01-29 22:01:03.429000+00:00
F,27,Single,Professional,Personal Accessories,2019-01-29 22:01:03.429000+00:00
F,27,Single,Professional,Personal Accessories,2019-01-29 22:01:03.429000+00:00
F,27,Single,Professional,Personal Accessories,2019-01-29 22:01:03.429000+00:00
F,27,Single,Professional,Personal Accessories,2019-01-29 22:01:03.429000+00:00
F,27,Single,Professional,Personal Accessories,2019-01-29 22:01:03.429000+00:00
F,27,Single,Professional,Personal Accessories,2019-01-29 22:01:03.429000+00:00
F,27,Single,Professional,Personal Accessories,2019-01-29 22:01:03.429000+00:00
F,27,Single,Professional,Personal Accessories,2019-01-29 22:01:03.429000+00:00


Describe table (calulcate basic statistics).

In [85]:
subscription.feedback_logging.describe_table()

        AGE
count  11.0
mean   27.0
std     0.0
min    27.0
25%    27.0
50%    27.0
75%    27.0
max    27.0


Get table content.

In [86]:
feedback_pd = subscription.feedback_logging.get_table_content(format='pandas')

### 6.3 Quality metrics table

In [87]:
subscription.quality_monitoring.print_table_schema()

0,1,2
ts,timestamp,False
quality,float,False
quality_threshold,float,False
binding_id,string,False
subscription_id,string,False
deployment_id,string,True
process,string,False
asset_revision,string,True


In [88]:
subscription.quality_monitoring.show_table()

0,1,2,3,4,5,6,7
2019-01-29 22:01:07.307000+00:00,0.9090909090909092,0.7,149502ff-65b3-45c9-a145-a7b2ac25ba32,986fd3e779b52d0e23a2bde5b6da996c,3bf505608b3fce0eaa5c89fc1db08457,Accuracy_evaluation_04039cff-20a5-4786-90c2-826c402796a3,


### 6.4 Performance metrics table

In [89]:
subscription.performance_monitoring.print_table_schema()

0,1,2
ts,timestamp,False
scoring_time,float,False
scoring_records,object,False
binding_id,string,False
subscription_id,string,False
deployment_id,string,True
process,string,False
asset_revision,string,True


In [90]:
subscription.performance_monitoring.show_table()

0,1,2,3,4,5,6,7
2019-01-29 22:00:37.073801+00:00,0.0,1,149502ff-65b3-45c9-a145-a7b2ac25ba32,986fd3e779b52d0e23a2bde5b6da996c,3bf505608b3fce0eaa5c89fc1db08457,,
2019-01-29 22:00:37.073920+00:00,0.0,1,149502ff-65b3-45c9-a145-a7b2ac25ba32,986fd3e779b52d0e23a2bde5b6da996c,3bf505608b3fce0eaa5c89fc1db08457,,
2019-01-29 22:00:37.073780+00:00,0.0,1,149502ff-65b3-45c9-a145-a7b2ac25ba32,986fd3e779b52d0e23a2bde5b6da996c,3bf505608b3fce0eaa5c89fc1db08457,,
2019-01-29 22:00:37.073901+00:00,0.0,1,149502ff-65b3-45c9-a145-a7b2ac25ba32,986fd3e779b52d0e23a2bde5b6da996c,3bf505608b3fce0eaa5c89fc1db08457,,
2019-01-29 22:00:37.073670+00:00,0.0,1,149502ff-65b3-45c9-a145-a7b2ac25ba32,986fd3e779b52d0e23a2bde5b6da996c,3bf505608b3fce0eaa5c89fc1db08457,,
2019-01-29 22:00:37.073842+00:00,0.0,1,149502ff-65b3-45c9-a145-a7b2ac25ba32,986fd3e779b52d0e23a2bde5b6da996c,3bf505608b3fce0eaa5c89fc1db08457,,
2019-01-29 22:00:37.073757+00:00,0.0,1,149502ff-65b3-45c9-a145-a7b2ac25ba32,986fd3e779b52d0e23a2bde5b6da996c,3bf505608b3fce0eaa5c89fc1db08457,,
2019-01-29 22:00:37.073881+00:00,0.0,1,149502ff-65b3-45c9-a145-a7b2ac25ba32,986fd3e779b52d0e23a2bde5b6da996c,3bf505608b3fce0eaa5c89fc1db08457,,
2019-01-29 22:00:37.073727+00:00,0.0,1,149502ff-65b3-45c9-a145-a7b2ac25ba32,986fd3e779b52d0e23a2bde5b6da996c,3bf505608b3fce0eaa5c89fc1db08457,,
2019-01-29 22:00:37.073862+00:00,0.0,1,149502ff-65b3-45c9-a145-a7b2ac25ba32,986fd3e779b52d0e23a2bde5b6da996c,3bf505608b3fce0eaa5c89fc1db08457,,


### 6.5 Data Mart measurement facts table

In [91]:
client.data_mart.get_deployment_metrics()

{'deployment_metrics': [{'subscription': {'subscription_id': '986fd3e779b52d0e23a2bde5b6da996c',
    'url': '/v1/data_marts/b873054a-9264-48c4-bcfe-c462ac3b8cf8/service_bindings/149502ff-65b3-45c9-a145-a7b2ac25ba32/subscriptions/986fd3e779b52d0e23a2bde5b6da996c'},
   'asset': {'name': 'ScottdaAzureML12.2019.1.29.21.29.50.528',
    'asset_id': '986fd3e779b52d0e23a2bde5b6da996c',
    'url': 'https://ussouthcentral.services.azureml.net/subscriptions/744bca722299451cb682ed6fb75fb671/services/24490adc9dfe4958a6e3480039d0e151/swagger.json',
    'asset_type': 'model',
    'created_at': '2019-01-29T21:30:38.2982264Z'},
   'deployment': {'name': 'ScottdaAzureML12.2019.1.29.21.29.50.528',
    'url': 'https://ussouthcentral.services.azureml.net:443/subscriptions/744bca722299451cb682ed6fb75fb671/services/24490adc9dfe4958a6e3480039d0e151/execute?api-version=2.0&format=swagger',
    'deployment_type': 'online',
    'scoring_endpoint': {'url': 'https://ussouthcentral.services.azureml.net:443/subscrip

---

### Authors
Lukasz Cmielowski, PhD, is an Automation Architect and Data Scientist at IBM with a track record of developing enterprise-level applications that substantially increases clients' ability to turn data into actionable knowledge.