# Working with Azure Machine Learning Studio engine

This notebook shows how to log the payload for a model deployed on the Microsoft Azure serving engine using the AI OpenScale Python SDK.

Contents
- [1. Setup](#setup)
- [2. Bind machine learning engines](#binding)
- [3. Subscriptions](#subscription)
- [4. Scoring and payload logging](#scoring)
- [5. Feedback logging & quality (accuracy) monitoring](#feedback)
- [6. Get the logged data](#datamart)

<a id="setup"></a>
## 1. Setup

### 1.0 Sample model creation using [Azure Machine Learning Studio](https://studio.azureml.net)

- Download the training data set from [here](https://github.com/pmservice/wml-sample-models/raw/master/spark/product-line-prediction/data/GoSales_Tx.csv).
- [Create an experiment in Azure ML Studio](https://docs.microsoft.com/en-us/azure/machine-learning/studio/create-experiment) using the diagram below. (You can search for each module in the palette by name.)
- When you get to the `Train Model` module, select the `Product Line` column as the label.
- Run the experiment to train the model.
- [Create (deploy) a web service](https://docs.microsoft.com/en-us/azure/machine-learning/studio/publish-a-machine-learning-web-service) (choose `new`, not `classic`).

<img src="https://github.com/pmservice/ai-openscale-tutorials/raw/master/notebooks/images/azure_product_line_model.png" align="left" alt="experiment">

**NOTE:** Classic web services are not supported.

### 1.1 Installation and authentication

In [None]:
!pip install --upgrade ibm-ai-openscale --no-cache | tail -n 1

Import and initiate.

In [None]:
from ibm_ai_openscale import APIClient
from ibm_ai_openscale.supporting_classes import PayloadRecord
from ibm_ai_openscale.supporting_classes.enums import InputDataType, ProblemType
from ibm_ai_openscale.engines import *
from ibm_ai_openscale.utils import *

#### ACTION: Get AI OpenScale `instance_guid` and `apikey`

[Install the IBM Cloud (Bluemix) CLI](https://console.bluemix.net/docs/cli/reference/ibmcloud/download_cli.html#install_use)

Use the IBM Cloud CLI to get an API key:
```bash
ibmcloud login --sso
ibmcloud iam api-key-create 'my_key'
```

Get your AI OpenScale instance GUID:
> If your resource group is not `default`, switch to the resource group that contains the AI OpenScale instance.

```bash
ibmcloud target -g <myResourceGroup>
```

Get details of the instance:
```bash
ibmcloud resource service-instance <AI-OpenScale-instance_name>
```

#### Define the constants required to set up the data mart:

- AIOS_CREDENTIALS
- POSTGRES_CREDENTIALS
- SCHEMA_NAME

In [None]:
AIOS_CREDENTIALS = {
  "url": "https://api.aiopenscale.cloud.ibm.com",
  "instance_guid": "****",
  "apikey": "****"
}
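
To keep secrets out of a shared notebook, the credentials could be read from environment variables instead of being pasted into the cell. The following is a minimal sketch under that assumption; the helper `load_aios_credentials` and the variable names `AIOS_URL`, `AIOS_INSTANCE_GUID` and `AIOS_APIKEY` are hypothetical and not defined by the SDK.

```python
import os


def load_aios_credentials(env=None):
    """Build the AI OpenScale credentials dict from environment variables.

    The variable names used here are hypothetical; pick whatever names
    suit your environment.
    """
    env = os.environ if env is None else env
    required = ("AIOS_INSTANCE_GUID", "AIOS_APIKEY")
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {
        # Fall back to the public endpoint used elsewhere in this notebook
        "url": env.get("AIOS_URL", "https://api.aiopenscale.cloud.ibm.com"),
        "instance_guid": env["AIOS_INSTANCE_GUID"],
        "apikey": env["AIOS_APIKEY"],
    }


# Example with an explicit mapping instead of the real environment
example_credentials = load_aios_credentials(
    {"AIOS_INSTANCE_GUID": "my-guid", "AIOS_APIKEY": "my-key"}
)
```

The returned dict has the same keys as `AIOS_CREDENTIALS`, so it could be passed to `APIClient` in the same way.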

In [None]:
POSTGRES_CREDENTIALS = {
    "db_type": "postgresql",
    "uri_cli_1": "xxx",
    "maps": [],
    "instance_administration_api": {
        "instance_id": "xxx",
        "root": "xxx",
        "deployment_id": "xxx"
    },
    "name": "xxx",
    "uri_cli": "xxx",
    "uri_direct_1": "xxx",
    "ca_certificate_base64": "xxx",
    "deployment_id": "xxx",
    "uri": "xxx"
}


In [None]:
SCHEMA_NAME = 'data_mart_for_azure'

Create schema for data mart.

In [None]:
create_postgres_schema(postgres_credentials=POSTGRES_CREDENTIALS, schema_name=SCHEMA_NAME)

In [None]:
client = APIClient(AIOS_CREDENTIALS)

In [None]:
client.version

### 1.2 DataMart setup

In [None]:
client.data_mart.setup(db_credentials=POSTGRES_CREDENTIALS, schema=SCHEMA_NAME)

In [None]:
data_mart_details = client.data_mart.get_details()

<a id="binding"></a>
## 2. Bind machine learning engines

### 2.1 Bind `Azure` machine learning engine

Provide credentials using the following fields:
- `client_id`
- `client_secret`
- `subscription_id`
- `tenant`

In [None]:
AZURE_ENGINE_CREDENTIALS = {
    "client_id": "***",
    "client_secret": "***",
    "subscription_id": "***",
    "tenant": "***"
}

In [None]:
binding_uid = client.data_mart.bindings.add('My Azure ML Studio engine', AzureMachineLearningInstance(AZURE_ENGINE_CREDENTIALS))

In [None]:
bindings_details = client.data_mart.bindings.get_details()

In [None]:
client.data_mart.bindings.list()

<a id="subscription"></a>
## 3. Subscriptions

### 3.1 Add subscriptions

List available deployments.

**Note:** Depending on the number of assets, this may take some time.

In [None]:
client.data_mart.bindings.list_assets()

**Action:** Assign the `source_uid` of your deployment (from the list above) to the `source_uid` variable below.

In [None]:
source_uid = 'xxxxxx'

In [None]:
subscription = client.data_mart.subscriptions.add(
            AzureMachineLearningAsset(source_uid=source_uid,
                                      binding_uid=binding_uid,
                                      input_data_type=InputDataType.STRUCTURED,
                                      problem_type=ProblemType.MULTICLASS_CLASSIFICATION,
                                      label_column='PRODUCT_LINE',
                                      prediction_column='Scored Labels'))

#### Get subscriptions list

In [None]:
subscriptions = client.data_mart.subscriptions.get_details()

In [None]:
subscriptions_uids = client.data_mart.subscriptions.get_uids()
print(subscriptions_uids)

#### List subscriptions

In [None]:
client.data_mart.subscriptions.list()

<a id="scoring"></a>
## 4. Scoring and payload logging

### 4.1 Score the product line model and measure response time

In [None]:
import requests
import time
import json

subscription_details = subscription.get_details()
scoring_url = subscription_details['entity']['deployments'][0]['scoring_endpoint']['url']

data = {
    "Inputs": {
        "input1":
            [
                {
                    'GENDER': "F",
                    'AGE': 27,
                    'MARITAL_STATUS': "Single",
                    'PROFESSION': "Professional",
                    'PRODUCT_LINE': "Personal Accessories",
                }
            ],
    },
    "GlobalParameters": {
    }
}

body = str.encode(json.dumps(data))

token = subscription_details['entity']['deployments'][0]['scoring_endpoint']['credentials']['token']
headers = subscription_details['entity']['deployments'][0]['scoring_endpoint']['request_headers']
headers['Authorization'] = ('Bearer ' + token)

start_time = time.time()
response = requests.post(url=scoring_url, data=body, headers=headers)
response_time = int((time.time() - start_time) * 1000)  # elapsed time in milliseconds
result = response.json()

print(json.dumps(result, indent=2))
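
Response time is easiest to get right when the subtraction and the conversion to milliseconds happen in one place. The following is a minimal sketch of such a helper; `timed_call` is a hypothetical name, and `time.sleep` stands in for the scoring request so the example runs without a deployed model.

```python
import time


def timed_call(func, *args, **kwargs):
    """Call func and return (result, elapsed time in whole milliseconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    # Convert to milliseconds before truncating, so sub-second calls are not lost
    elapsed_ms = int((time.perf_counter() - start) * 1000)
    return result, elapsed_ms


# Stand-in for the scoring request: sleep for 50 ms
result, elapsed_ms = timed_call(time.sleep, 0.05)
print("Elapsed: {} ms".format(elapsed_ms))
```

In the scoring cell, the same helper could wrap `requests.post` with the `url`, `data` and `headers` arguments used above.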

### 4.2 Store the request and response in payload logging table

#### Transform the model's input and output into the format expected by AI OpenScale.

In [None]:
request_data = {'fields': list(data['Inputs']['input1'][0]),
           'values': [list(x.values()) for x in data['Inputs']['input1']]}

response_data = {'fields': list(result['Results']['output1'][0]),
            'values': [list(x.values()) for x in result['Results']['output1']]}
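
The transformation above relies on every row dict listing its keys in the same order. A slightly more defensive version, sketched below, looks each value up by field name instead; `to_fields_values` is a hypothetical helper, not part of the SDK.

```python
def to_fields_values(rows):
    """Convert a list of dicts into {'fields': [...], 'values': [[...], ...]}."""
    if not rows:
        return {"fields": [], "values": []}
    fields = list(rows[0])
    # Look values up by field name so every row follows the same column order
    values = [[row[name] for name in fields] for row in rows]
    return {"fields": fields, "values": values}


rows = [
    {"GENDER": "F", "AGE": 27},
    {"AGE": 31, "GENDER": "M"},  # different key order on purpose
]
converted = to_fields_values(rows)
print(converted)
```

Both rows end up with their values in `GENDER`, `AGE` order, regardless of how each dict was built.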

#### Store the payload using Python SDK

**Hint:** You can embed payload logging code into your custom deployment so it is logged automatically each time you score the model.

In [None]:
records_list = [PayloadRecord(request=request_data, response=response_data, response_time=response_time), 
                PayloadRecord(request=request_data, response=response_data, response_time=response_time)]

for i in range(1, 10):
    records_list.append(PayloadRecord(request=request_data, response=response_data, response_time=response_time))

subscription.payload_logging.store(records=records_list)
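
The hint about embedding payload logging into a custom deployment could be realized as a thin wrapper around the scoring call. This is only a sketch under assumptions: `score_and_log`, `score_fn` and `log_fn` are hypothetical names, and the stand-in functions below replace the real scoring request and payload logger so the example runs on its own.

```python
import time


def score_and_log(score_fn, log_fn, request):
    """Score a request, then pass request, response and latency (ms) to log_fn."""
    start = time.perf_counter()
    response = score_fn(request)
    response_time_ms = int((time.perf_counter() - start) * 1000)
    log_fn(request, response, response_time_ms)
    return response


# Stand-ins for a real scoring call and a real payload logger
logged = []


def fake_score(request):
    return {"fields": ["Scored Labels"], "values": [["Personal Accessories"]]}


def fake_log(request, response, response_time_ms):
    logged.append((request, response, response_time_ms))


response = score_and_log(fake_score, fake_log, {"fields": ["AGE"], "values": [[27]]})
```

In this notebook, `log_fn` could build a `PayloadRecord` from its three arguments and pass it to `subscription.payload_logging.store`, as the cell above does.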

#### Store the payload using REST API

Get the token first.

In [None]:
token_endpoint = "https://iam.bluemix.net/identity/token"
headers = {
    "Content-Type": "application/x-www-form-urlencoded",
    "Accept": "application/json"
}

data = {
    "grant_type":"urn:ibm:params:oauth:grant-type:apikey",
    "apikey":AIOS_CREDENTIALS["apikey"]
}

req = requests.post(token_endpoint, data=data, headers=headers)
token = req.json()['access_token']

Store the payload.

In [None]:
import requests, uuid

PAYLOAD_STORING_HREF_PATTERN = '{}/v1/data_marts/{}/scoring_payloads'
# The data mart ID is the AI OpenScale instance GUID
endpoint = PAYLOAD_STORING_HREF_PATTERN.format(AIOS_CREDENTIALS['url'], AIOS_CREDENTIALS['instance_guid'])

payload = [{
    'binding_id': binding_uid, 
    'deployment_id': subscription.get_details()['entity']['deployments'][0]['deployment_id'], 
    'subscription_id': subscription.uid, 
    'scoring_id': str(uuid.uuid4()), 
    'response': response_data,
    'request': request_data
}]


headers = {"Authorization": "Bearer " + token}
      
req_response = requests.post(endpoint, json=payload, headers = headers)

print("Request OK: " + str(req_response.ok))
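
When several records are sent through the REST API, each one needs its own `scoring_id`. The sketch below builds such a list using the same keys as the payload above; `build_scoring_payload` is a hypothetical helper and the IDs passed to it are placeholders.

```python
import uuid


def build_scoring_payload(binding_id, deployment_id, subscription_id,
                          request, response, count=1):
    """Return `count` payload records, each with its own random scoring_id."""
    return [
        {
            "binding_id": binding_id,
            "deployment_id": deployment_id,
            "subscription_id": subscription_id,
            "scoring_id": str(uuid.uuid4()),
            "request": request,
            "response": response,
        }
        for _ in range(count)
    ]


# Placeholder IDs and data, for illustration only
example_payload = build_scoring_payload(
    "binding-1", "deployment-1", "subscription-1",
    {"fields": ["AGE"], "values": [[27]]},
    {"fields": ["Scored Labels"], "values": [["Personal Accessories"]]},
    count=3,
)
```

The resulting list could be passed as the `json` argument of `requests.post`, in place of the single-record `payload`.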

<a id="feedback"></a>
## 5. Feedback logging & quality (accuracy) monitoring

### Enable quality monitoring

You need to provide the monitoring `threshold` and `min_records` (the minimum number of feedback records).

In [None]:
subscription.quality_monitoring.enable(threshold=0.7, min_records=10)

### Feedback records logging

Feedback records are used to evaluate your model. The predicted values are compared to real values (feedback records).
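
To make the comparison concrete, the sketch below computes plain accuracy over a handful of illustrative labels and checks it against the `threshold` of 0.7 used above. It illustrates the idea only; the metric the service actually computes may differ, and the `accuracy` helper and sample labels are assumptions.

```python
def accuracy(predicted, actual):
    """Fraction of predictions that match the corresponding feedback label."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not actual:
        raise ValueError("at least one record is required")
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return correct / len(actual)


# Illustrative labels only
predicted = ["Personal Accessories", "Camping Equipment",
             "Golf Equipment", "Personal Accessories"]
actual = ["Personal Accessories", "Camping Equipment",
          "Mountaineering Equipment", "Personal Accessories"]

score = accuracy(predicted, actual)  # 3 of 4 predictions match
below_threshold = score < 0.7
print("accuracy={:.2f}, below threshold: {}".format(score, below_threshold))
```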

You can check the schema of the feedback table using the method below.

In [None]:
subscription.feedback_logging.print_table_schema()

The feedback records can be sent to the feedback table using the code below.

In [None]:
fields = ["GENDER", "AGE", "MARITAL_STATUS", "PROFESSION", "PRODUCT_LINE"]

records = [
    ["F", "27", "Single", "Professional", "Personal Accessories"],
    ["M", "27", "Single", "Professional", "Personal Accessories"]]

for i in range(1,10):
    records.append(["F", "27", "Single", "Professional", "Personal Accessories"])

subscription.feedback_logging.store(feedback_data=records, fields=fields)
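
Each feedback record must supply one value per entry in `fields`. A quick length check before storing can catch malformed rows early; the sketch below uses a hypothetical `validate_records` helper and sample data that includes one deliberately short record.

```python
def validate_records(fields, records):
    """Return the indexes of records whose length differs from len(fields)."""
    expected = len(fields)
    return [i for i, record in enumerate(records) if len(record) != expected]


sample_fields = ["GENDER", "AGE", "MARITAL_STATUS", "PROFESSION", "PRODUCT_LINE"]
sample_records = [
    ["F", "27", "Single", "Professional", "Personal Accessories"],
    ["M", "27", "Single"],  # too short on purpose
]

bad_indexes = validate_records(sample_fields, sample_records)
print("Malformed records at indexes:", bad_indexes)
```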

### Run quality monitoring on demand

By default, quality monitoring runs on an hourly schedule. You can also trigger it on demand using the code below.

In [None]:
run_details = subscription.quality_monitoring.run()

Because the monitoring runs in the background, you can use the code below to check the status of the job.

In [None]:
status = run_details['status']
run_uid = run_details['id']  # avoid shadowing the built-in id()

print("Run status: {}".format(status))

start_time = time.time()
elapsed_time = 0

# Poll every 10 seconds until the run completes or 60 seconds have passed
while status != 'completed' and elapsed_time < 60:
    time.sleep(10)
    run_details = subscription.quality_monitoring.get_run_details(run_uid=run_uid)
    status = run_details['status']
    elapsed_time = time.time() - start_time
    print("Run status: {}".format(status))
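
The polling pattern above can be factored into a reusable helper that also stops on other terminal states. This is a sketch under assumptions: `wait_for_status` is a hypothetical name, `'completed'` comes from the loop above, and the `'failed'`, `'initializing'` and `'running'` status values are assumed for illustration.

```python
import time


def wait_for_status(get_status, done=("completed", "failed"), timeout=60, interval=10):
    """Poll get_status() until it returns a value in `done` or the timeout passes."""
    deadline = time.monotonic() + timeout
    status = get_status()
    while status not in done and time.monotonic() < deadline:
        time.sleep(interval)
        status = get_status()
    return status


# Stand-in status source; the status names are assumed for illustration
statuses = iter(["initializing", "running", "completed"])
final_status = wait_for_status(lambda: next(statuses), timeout=5, interval=0.01)
print("Final status:", final_status)
```

With the SDK, `get_status` could be a small function that calls `subscription.quality_monitoring.get_run_details` and returns its `'status'` field.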

### Show the quality metrics

In [None]:
subscription.quality_monitoring.show_table()

Get all calculated metrics.

In [None]:
deployment_uids = subscription.get_deployment_uids()

In [None]:
subscription.quality_monitoring.get_metrics(deployment_uid=deployment_uids[0])

<a id="datamart"></a>
## 6. Get the logged data

### 6.1 Payload logging

#### Print schema of payload_logging table

In [None]:
subscription.payload_logging.print_table_schema()

#### Describe the table (calculate basic statistics)

In [None]:
subscription.payload_logging.describe_table()

#### Return the table content as pandas dataframe

In [None]:
pandas_df = subscription.payload_logging.get_table_content(format='pandas')

### 6.2 Feedback logging

Check the schema of table.

In [None]:
subscription.feedback_logging.print_table_schema()

Preview table content.

In [None]:
subscription.feedback_logging.show_table()

Describe table (calculate basic statistics).

In [None]:
subscription.feedback_logging.describe_table()

Get table content.

In [None]:
feedback_pd = subscription.feedback_logging.get_table_content(format='pandas')

### 6.3 Quality metrics table

In [None]:
subscription.quality_monitoring.print_table_schema()

In [None]:
subscription.quality_monitoring.show_table()

### 6.4 Performance metrics table

In [None]:
subscription.performance_monitoring.print_table_schema()

In [None]:
subscription.performance_monitoring.show_table()

### 6.5 Data Mart measurement facts table

In [None]:
client.data_mart.get_deployment_metrics()

---

### Authors
Lukasz Cmielowski, PhD, is an Automation Architect and Data Scientist at IBM with a track record of developing enterprise-level applications that substantially increase clients' ability to turn data into actionable knowledge.