# Running Certifai Pro scans on Azure ML models

## CONTENTS

1. Install Certifai Pro and set it up in an Azure cloud environment. Setup includes configuring your Certifai Pro instance with storage parameters for a pre-existing container in an Azure Storage Account.

2. Install the required Certifai Toolkit Python libraries, which run scans that evaluate the fairness, robustness and explainability of your ML models with the [CERTIFAI framework](https://cognitivescale.github.io/cortex-certifai/docs/about).

3. Create sklearn models to classify [German credit loan risk](https://archive.ics.uci.edu/ml/datasets/Statlog+%28German+Credit+Data%29) (predict whether a loan will be granted or not).

4. Register the models and deploy them as web services in ACI (Azure Container Instance) with authentication

5. Test the deployed webservice

6. Construct Certifai Scan Definitions for this Binary Classification model

7. Upload the required datasets for the scan into the Azure Storage Account container used to configure Certifai Pro in Step 1

## <a id='prereqs'></a>Prerequisites - Certifai Pro

Cortex Certifai Pro is a cloud based offering from [CognitiveScale](https://www.cognitivescale.com/certifai/) that allows data scientists to define, scan and analyse their models to determine their Fairness, Robustness and Explainability measures using the [CERTIFAI framework](https://arxiv.org/abs/1905.07857). 

You can find more details about Certifai on the [official documentation site](https://cognitivescale.github.io/cortex-certifai/docs/about).

This tutorial helps users of **Azure Machine Learning resources (Hosted Notebooks/Models/Endpoints)** set up their models and ready them for scanning with Certifai Pro.

Certifai Pro is a single-user version of Cortex Certifai that runs on an Azure VM and can be installed from the [Azure Marketplace](https://azuremarketplace.microsoft.com/en-us/marketplace/apps/cognitive-scale.cortex-certifai-pro). Certifai Pro VMs can run scans on your own models with the aid of the Certifai Toolkit, a downloadable set of Python packages and CLI tools that can run Certifai scans on your personal machines. The toolkit also enables you to connect to and run scans remotely on the Certifai Pro VM. You can find more details in the [Certifai Toolkit](https://cognitivescale.github.io/cortex-certifai/docs/toolkit/setup/download-toolkit) documentation.

This guide walks you through using the Certifai Python API to build Scan Definitions that can be passed to the Certifai Pro instance, along with any Datasets and Secrets they need.


### Install Certifai Pro from the Azure Marketplace

You can find and create a personal instance of Cortex Certifai Pro from the [Azure Marketplace](https://azuremarketplace.microsoft.com/en-us/marketplace/apps/cognitive-scale.cortex-certifai-pro). Please follow the instructions from the official [Certifai docs](https://cognitivescale.github.io/cortex-certifai/docs/platforms/azure/azure-setup) for the Azure platform to get up and running.

A brief summary of this process includes:
- Certifai Pro instance setup from the Marketplace and initial authentication workflows
- Configure your Certifai Pro instance with blob storage containers and credentials for an Azure Storage account of your choice.
    - You may also install sample reports for a variety of use cases in Finance, Healthcare and Insurance to see how the AI Trust Index scores generated by Certifai, together with the accompanying reports on Fairness, Robustness and Explainability, can be used to improve your machine learning models
- Configure Custom SSL certificates (if needed)

Go to the Azure Marketplace listing for [Cortex Certifai Pro](https://azuremarketplace.microsoft.com/en-us/marketplace/apps/cognitive-scale.cortex-certifai-pro), set up your instance of Certifai Pro, and complete the detailed Azure instructions in the [official Certifai docs](https://cognitivescale.github.io/cortex-certifai/docs/platforms/azure/azure-setup).


Once you're done, return to this tutorial, which walks through the widely known [German Credit dataset](https://archive.ics.uci.edu/ml/datasets/Statlog+%28German+Credit+Data%29) and builds two classification models that determine whether a bank should extend lines of credit to customers based on demographic, financial and employment features.

Both of the classification models (SVM and Logistic Regression) we'll build today use the `Scikit-learn` Python module. You can follow along to get a good sense of the contracts/patterns needed to deploy a machine learning model that your Certifai Pro installation understands and connects to (for inference).

The following notebook cells walk you through the routine Data Science workflow (Pre-process, Data Splits, Model Training and Model Deployment). Feel free to use these notebooks as a starting point in your journey to scan your machine learning models with the CERTIFAI framework and gain a deeper understanding of their performance and behavior in terms of the following quantities:

    1. Robustness to Data Variations
    2. Explainability of Model Predictions
    3. Fairness By Group

Refer to the [Certifai Quickstart](https://cognitivescale.github.io/cortex-certifai/docs/quickstart) for a refresher on each of these topics, what they mean in general, and what they can mean for your machine learning model.


### Download the Cortex Certifai Toolkit from your Certifai Pro VM

Follow the instructions described in [Certifai Pro Azure Setup](https://cognitivescale.github.io/cortex-certifai/docs/platforms/azure/azure-setup) to finish initial setup for your Certifai Pro VM. Now, click on the Help icon (top right) and select `Download Toolkit` to download a zip file containing the Cortex Certifai Toolkit to your computer.

If you're running this notebook from an Azure Hosted Notebook, you'll need to upload the Certifai Toolkit zip file to the Hosted Notebook and make note of its path.

### Storage Account Configuration for Certifai Pro VMs

Follow the instructions described in the [Azure Storage Setup](https://cognitivescale.github.io/cortex-certifai/docs/platforms/azure/azure-setup#certifai-console-storage-setup) to configure your Azure based Certifai Console hosted via the Certifai Pro VM with an Azure Blob Container Storage Account of your choice.

## Prerequisites - Notebook Dependencies

If you are using an Azure Machine Learning Notebook VM, you are all set. Otherwise, make sure you go through the 
[configuration-notebook](https://github.com/Azure/MachineLearningNotebooks/blob/c520bd1d4130d9a01ee46e0937459e2de95d15ec/configuration.ipynb) to create an Azure workspace. Creating local and remote environments/dependencies will be covered in the notebook.

**Please note**: to step through this notebook, make sure you have the necessary dependencies installed locally:

- python>=3.6.2,<3.7
- scikit-learn=0.20.3
- numpy=1.16.2
- pandas
- azureml-sdk=1.4.0
- ipython
- matplotlib
- jupyter

You can also use [Conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/) to create the local environment using the `certifai_azure_model_env.yml` file provided with the notebook

Open your favorite terminal and `cd` into the folder where this notebook is located, then run the commands below:

- `conda env create -f certifai_azure_model_env.yml`: creates a local conda env with the Python packages needed to work through the notebook
- `jupyter-notebook`: launches a Jupyter notebook session


**Note**: Installing `Cortex-Certifai` packages will be covered separately

## Certifai Pro on Azure

First, we follow the instructions detailed on the official [Cortex Certifai documentation site](https://cognitivescale.github.io/cortex-certifai/docs/platforms/azure/azure-setup) to tick off the following items from our Pre-requisites:

- [ ] [Install Certifai Pro](https://cognitivescale.github.io/cortex-certifai/docs/platforms/azure/azure-setup) from the Azure Marketplace into an Azure Virtual Machine
- [ ] [Configure](https://cognitivescale.github.io/cortex-certifai/docs/platforms/azure/azure-setup#certifai-console-storage-setup) your Certifai Pro instance with storage credentials for a blob container inside an Azure Storage Account.

Once you've configured your Certifai Pro instance with the Azure Storage Account credentials, download the Certifai Toolkit by clicking on the Help icon (top right of the site) and selecting `Download Toolkit`
- [ ] Download the Certifai Toolkit and upload it into your hosted Azure ML Notebook and make note of the path

### Set Cortex Certifai Toolkit path
- update the `certifai_toolkit_path` to point to your downloaded Certifai Toolkit
- this will be used later to install cortex certifai python packages

In [None]:
from os.path import expanduser, isfile
home = expanduser("~")
certifai_toolkit_path = f'{home}/Downloads/toolkit'
certifai_toolkit_path
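
Before moving on, it can help to confirm that the path really points at an unzipped toolkit. The sketch below assumes the toolkit unpacks to a folder containing a `packages` directory, which is the layout the install commands later in this notebook rely on. `check_toolkit_path` is a hypothetical helper written for this tutorial, not part of the Certifai Toolkit.

```python
import os


def check_toolkit_path(toolkit_path):
    """Return the sorted sub-folders of <toolkit_path>/packages, or raise if the folder is missing."""
    packages_dir = os.path.join(toolkit_path, 'packages')
    if not os.path.isdir(packages_dir):
        raise FileNotFoundError(
            f"No 'packages' folder under {toolkit_path!r}; "
            "update certifai_toolkit_path to point at the unzipped toolkit folder")
    # list only directories, e.g. the per-Python-version package folders
    return sorted(
        name for name in os.listdir(packages_dir)
        if os.path.isdir(os.path.join(packages_dir, name)))
```

Calling `check_toolkit_path(certifai_toolkit_path)` right after the cell above should list the package folders, or fail early with a readable message if the path is wrong.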

## Creating a [german credit](https://archive.ics.uci.edu/ml/datasets/Statlog+%28German+Credit+Data%29) prediction model using sklearn

In [None]:
# required imports for model building and persistence

import pandas as pd
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn import svm
from sklearn.model_selection import GridSearchCV
from sklearn.model_selection import train_test_split
import random
from sklearn.externals import joblib

### Test to confirm correct version of scikit-learn and numpy are installed

In [None]:
import sklearn as sklearn_version_test
assert sklearn_version_test.__version__ == '0.20.3', 'scikit-learn version mismatch, `pip install scikit-learn==0.20.3` to install right sklearn version for this notebook'
assert np.__version__                   == '1.16.2', 'numpy version mismatch, `pip install numpy==1.16.2` to install right numpy version for this notebook'

In [None]:
# special import - 
# for multiprocessing to work in a Notebook,  pickled classes must be in a separate package or notebook
# hence, the model encoder class has to be somewhere other than the current notebook

import os
import sys
sys.path.insert(0, os.path.abspath(os.path.join('.')))
from scripts.cat_encoder import CatEncoder

###  load data in dataframe

In [None]:
# load the dataset into memory
df = pd.read_csv('data/german_credit_eval.csv')

### define features 

In [None]:
cat_columns = [
    'checkingstatus',
    'history',
    'purpose',
    'savings',
    'employ',
    'status',
    'others',
    'property',
    'age',
    'otherplans',
    'housing',
    'job',
    'telephone',
    'foreign'
    ]

label_column = 'outcome'

### separate features and target variable

In [None]:
y = df[label_column]
X = df.drop(label_column, axis=1)

### split dataset into the training and test set

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=42)

### encode and scale features

In [None]:
encoder = CatEncoder(cat_columns, X)

### build and train model using sklearn

In [None]:
def build_model(data, name, model_family, test=None):
    if test is None:
        test = data
        
    if model_family == 'SVM':
        parameters = {'kernel':('linear', 'rbf', 'poly'), 'C':[0.1, .5, 1, 2, 4, 10], 'gamma':['auto']}
        m = svm.SVC()
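    elif model_family == 'logistic':
        parameters = {'C': (0.5, 1.0, 2.0), 'solver': ['lbfgs'], 'max_iter': [1000]}
        m = LogisticRegression()
    else:
        # fail early instead of hitting an undefined `m` below
        raise ValueError(f"Unsupported model_family: {model_family!r}")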
    model = GridSearchCV(m, parameters, cv=3)
    model.fit(data[0], data[1])

    # Assess on the test data
    accuracy = model.score(test[0], test[1].values)
    print(f"Model '{name}' accuracy is {accuracy}")
    return model

svm_model_name      = 'german_credit_svm'
logistic_model_name = 'german_credit_logit'

svm_model = build_model((encoder(X_train.values), y_train),
                        svm_model_name,
                        'SVM',
                        test=(encoder(X_test.values), y_test))

logistic_model = build_model((encoder(X_train.values), y_train),
                        logistic_model_name,
                        'logistic',
                        test=(encoder(X_test.values), y_test))

### dump the trained models (along with corresponding encoder object) to disk 

In [None]:
# the encoder object is dumped along with the trained model so the same transformation is applied during prediction
def dump_model(model_name, model_obj, encoder_obj=encoder):
    model_path = f'{model_name}.pkl'
    bundle = {
        "model": model_obj,
        "encoder": encoder_obj
    }
    joblib.dump(value=bundle, filename=model_path)
    print(f'model saved on disk at {model_path}')
    return model_path

# persist models to disk
svm_model_disk_path      = dump_model(svm_model_name,svm_model)
logistic_model_disk_path = dump_model(logistic_model_name,logistic_model)

## In the section below we will:

1. Configure Azure workspace
2. Register models (built above) to the workspace
3. Create a prediction environment in the remote Azure workspace
4. Deploy the models as web services for prediction

### Configure and Initialize Azure workspace

- Follow the instructions listed here [creating and managing azure-ml workspace](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-manage-workspace) to create an azure-ml workspace

**Once you have created the workspace, the easiest way to run through the remaining steps is to download its `config.json` to the current directory, replacing the existing `config.json`.**
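
`Workspace.from_config()` reads the workspace details from `config.json`. As a quick sanity check before calling it, the sketch below verifies that the file carries the three keys the Azure ML SDK looks for (`subscription_id`, `resource_group`, `workspace_name`). `missing_workspace_keys` is a hypothetical helper and the values shown are placeholders, not real identifiers.

```python
import json

REQUIRED_KEYS = ('subscription_id', 'resource_group', 'workspace_name')


def missing_workspace_keys(config_text):
    """Return the required keys that are absent or empty in a config.json document."""
    config = json.loads(config_text)
    return [key for key in REQUIRED_KEYS if not config.get(key)]


# Placeholder values for illustration only
example = json.dumps({
    'subscription_id': '<your-subscription-id>',
    'resource_group': '<your-resource-group>',
    'workspace_name': '<your-workspace-name>',
})
print(missing_workspace_keys(example))  # an empty list means every key is present
```

To check the real file, pass its contents instead: `missing_workspace_keys(open('config.json').read())`.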

### Create a [Workspace](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.workspace%28class%29?view=azure-ml-py) object from the persisted configuration.

In [None]:
from azureml.core import Workspace
ws = Workspace.from_config()

### Register models to the created workspace

Register a file or folder as a model by calling [Model.register()](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.model.model?view=azure-ml-py#register-workspace--model-path--model-name--tags-none--properties-none--description-none--datasets-none--model-framework-none--model-framework-version-none--child-paths-none-).

In addition to the content of the model file itself (model + encoder object), our registered model will also store model metadata such as the model description and tags, which will be useful when managing and deploying models in our workspace.

In [None]:
from azureml.core.model import Model

logistic_model_azure = Model.register(model_path=logistic_model_disk_path,
                       model_name=logistic_model_name,
                       tags={'area': "banking credit risk", 'type': "classification"},
                       description="Logistic Classifier model to predict credit loan approval",
                       workspace=ws)

svm_model_azure = Model.register(model_path=svm_model_disk_path,
                       model_name=svm_model_name,
                       tags={'area': "banking credit risk", 'type': "classification"},
                       description="Support Vector Machine Classifier model to predict credit loan approval",
                       workspace=ws)

### Create a custom prediction environment inside azure-ml workspace

If we want control over how our model is run, if it uses another framework, or if it has special runtime requirements, we can specify our own environment and scoring method. Custom environments can be used for any model we want to deploy.

Specify the model's runtime environment by creating an [Environment](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.environment%28class%29?view=azure-ml-py) object and providing the [CondaDependencies](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.conda_dependencies.condadependencies?view=azure-ml-py) needed by the model

In this example we will create a conda environment for our german credit model from file **myenv.yml** and register it to our workspace


In [None]:
with open("myenv.yml", 'r') as f:
    print(f.read())

In [None]:
from azureml.core.conda_dependencies import CondaDependencies
from azureml.core.environment import Environment

environment = Environment("german-credit-env")
environment.python.conda_dependencies = CondaDependencies("myenv.yml")
environment.register(workspace=ws)


## Create Inference Configuration and deploy webservice

**Inference Configuration** will contain:

1. Scoring script
2. Environment (created above)

We create a scoring script for each model (`logistic_score.py` and `svm_score.py` in the `scripts` folder). The web service runs this script to load the model and serve predictions.

Each scoring script includes two required functions:

1. The `init()` function, which typically loads the model into a global object. This function is run only once when the Docker container is started.

2. The `run(data)` function uses the model to predict a value based on the input data. Inputs and outputs to the run typically use JSON for serialization and de-serialization, but other formats are also supported.
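
The `init()`/`run(data)` contract described above can be sketched as follows. This is not the notebook's actual scoring script: `_StubModel` is a stand-in that keeps the example runnable without Azure, and the response shape `{"payload": {"predictions": [...]}}` is an assumption that mirrors the `{"payload": {"instances": [...]}}` request format used later in this notebook.

```python
import json

model = None  # populated once by init()


class _StubModel:
    """Stand-in for the unpickled model object; always predicts 1 (hypothetical)."""

    def predict(self, rows):
        return [1 for _ in rows]


def init():
    # Runs once when the container starts. A real script would load the
    # registered pickle (model + encoder) here instead of creating a stub.
    global model
    model = _StubModel()


def run(data):
    # Runs on every request: parse the JSON body, predict, and return
    # a JSON-serializable result. Errors are reported in the response body.
    try:
        instances = json.loads(data)['payload']['instances']
        predictions = model.predict(instances)
        return {'payload': {'predictions': list(predictions)}}
    except Exception as exc:
        return {'error': str(exc)}
```

Keeping the model in a module-level global means the potentially slow load happens once in `init()`, while `run(data)` stays cheap per request.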




Deploy the registered model in the custom environment by providing an [InferenceConfig](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.model.inferenceconfig?view=azure-ml-py) object to [Model.deploy()](https://docs.microsoft.com/en-us/python/api/azureml-core/azureml.core.model.model?view=azure-ml-py#deploy-workspace--name--models--inference-config--deployment-config-none--deployment-target-none-). In this case we are also using the [AciWebservice.deploy_configuration()](https://docs.microsoft.com/python/api/azureml-core/azureml.core.webservice.aci.aciwebservice#deploy-configuration-cpu-cores-none--memory-gb-none--tags-none--properties-none--description-none--location-none--auth-enabled-none--ssl-enabled-none--enable-app-insights-none--ssl-cert-pem-file-none--ssl-key-pem-file-none--ssl-cname-none--dns-name-label-none--) method to generate a custom deploy configuration
        
**Note**: This step can take several minutes.


In [None]:
with open('scripts/svm_score.py') as f:
    print(f.read())

In [None]:
from azureml.core.model import InferenceConfig
from azureml.core import Webservice
from azureml.exceptions import WebserviceException
from azureml.core.webservice import AciWebservice

inference_config_logistic = InferenceConfig(entry_script="logistic_score.py",
                                   environment=environment,source_directory="scripts")
inference_config_svm = InferenceConfig(entry_script="svm_score.py",
                                   environment=environment,source_directory="scripts")

logistic_service_name = 'german-credit-logistic-service'
svm_service_name = 'german-credit-svm-service'

aci_deployment_config = AciWebservice.deploy_configuration(auth_enabled=True)

# Remove any existing services under the same name.
try:
    Webservice(ws, logistic_service_name).delete()
except WebserviceException:
    pass

try:
    Webservice(ws, svm_service_name).delete()
except WebserviceException:
    pass


service_logistic = Model.deploy(ws, logistic_service_name, [logistic_model_azure],inference_config=inference_config_logistic,deployment_config=aci_deployment_config)
service_svm      = Model.deploy(ws, svm_service_name,      [svm_model_azure],     inference_config=inference_config_svm,     deployment_config=aci_deployment_config)
service_logistic.wait_for_deployment(show_output=True)
service_svm.wait_for_deployment(show_output=True)

## Test the webservice

1. Get the webservice endpoint using `service.scoring_uri` :: string
2. Get the authentication keys using `service.get_keys()` :: tuple


In [None]:
service_logistic_uri  = service_logistic.scoring_uri
service_logistic_keys = service_logistic.get_keys()

service_svm_uri       = service_svm.scoring_uri
service_svm_keys      = service_svm.get_keys()

In [None]:
# create json test data sample(from csv)

import json
sample_input = json.dumps({
"payload": {
    "instances": [
        [
            "... < 0 DM",
            6,
            "critical account/ other credits existing (not at this bank)",
            "radio/television",
            1169,
            "unknown/ no savings account",
            ".. >= 7 years",
            4,
            "male : single",
            "others - none",
            4,
            "real estate",
            "> 25 years",
            "none",
            "own",
            2,
            "skilled employee / official",
            1,
            "phone - yes, registered under the customers name",
            "foreign - yes"
        ]
    ]
}
})
sample_input
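
Rather than typing a sample by hand, the same payload can be derived from rows of the CSV. The sketch below is one way to do that with the standard library; `build_payload` and `_coerce` are hypothetical helpers, and the sketch assumes the label column is named `outcome` (as in this notebook) and that numeric features are integers.

```python
import csv
import io
import json


def _coerce(value):
    # CSV fields are read as text; turn integer-looking fields back into ints
    try:
        return int(value)
    except ValueError:
        return value


def build_payload(csv_text, label_column='outcome', n_rows=1):
    """Build a {"payload": {"instances": [...]}} JSON string from the first n_rows of a CSV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    instances = []
    for i, row in enumerate(reader):
        if i >= n_rows:
            break
        # drop the label: the model only expects feature values, in column order
        instances.append([_coerce(v) for k, v in row.items() if k != label_column])
    return json.dumps({'payload': {'instances': instances}})
```

For the evaluation dataset this could be called as `build_payload(open('data/german_credit_eval.csv').read())`.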

In [None]:
import requests
import json

headers = {
    'Content-Type': 'application/json',
    'Authorization': f'Bearer {service_svm_keys[0]}'          
          }

response = requests.post(
    service_svm_uri, data=sample_input, headers=headers)
print(response.status_code)
print(response.elapsed)
print(response.json())

## Running Cortex Certifai Scan locally

1. Install the cortex certifai packages required to initiate model scan

2. Configure scan details and execute


## Installing Cortex Certifai python packages

Initiating a Cortex Certifai scan requires the following Python packages to be installed in the current local environment

`required-packages`

- cortex-certifai-scanner
- cortex-certifai-engine
- cortex-certifai-common

`optional-packages`

- cortex-certifai-client
- cortex-certifai-console

Download [certifai toolkit](https://www.cognitivescale.com/download-certifai) and follow instructions in the `Readme.md` to install the python-packages in the current environment

### Install required certifai packages (optional packages are left for user to install)


In [None]:
!find $certifai_toolkit_path/packages/all       -type f ! -name "*console-*" | xargs -I % sh -c 'pip install % ' ;
!find $certifai_toolkit_path/packages/python3.6 -type f   -name "*engine-*"                      | xargs -I % sh -c 'pip install % ' ;
!find $certifai_toolkit_path/packages/all -type f   -name "*client-*"                      | xargs -I % sh -c 'pip install % ' ;

In [None]:
from certifai.client.remote import remote_config, remote_list

class Namespace:
    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)

remote_alias = 'cpro-az'
pro_kubeconf_locn = 'certifai-kubeconfig.json'

args = Namespace(file=pro_kubeconf_locn,
                alias=remote_alias,
                context='current-context',
                namespace='certifai',
                timeout=15)

r_conf = remote_config.remote_config(args)

### List scan jobs on the Certifai Pro remote instance
We use the following code block to list all scan jobs on the configured `remote_alias` using the `certifai-client` python package


#### Equivalent CLI command
```
certifai remote list -a <remote_alias>
```

In [None]:
from certifai.client.remote import remote_list
list_args = Namespace(alias=remote_alias)
remote_list.remote_list(list_args)

### Build Scan definition for our models

We'll now use the `certifai-scanner` python package to build a scan definition for the SVM and Logistic Regression models (via the Service Endpoints created earlier).

Our Certifai Scan needs some mandatory parameters like:

1. Prediction Task Outcomes and Values
2. Model Details (names, endpoints and more)
3. Datasets to evaluate the models on

And optional parameters that depend on the desired evaluation reports. Evaluation types include:

1. Fairness
2. Robustness
3. Explainability
4. SHAP reports

In [None]:
# make sure certifai package was installed correctly
from certifai.scanner.version import get_version
get_version()

### Using Cortex Certifai Client python-package to launch a Remote Scan

In the remote configuration cell (the one that calls `remote_config`), set the variable named `pro_kubeconf_locn` to the location of the KubeConfig file downloaded from your Certifai Pro instance. We use the Python library behind the installed `certifai` CLI tool to configure and save the Kubernetes connection settings for your Certifai Pro instance under the alias stored in the variable `remote_alias`.

We can now use this `remote_alias` (on successful connection) to list, submit and delete Certifai Scans on the running Certifai Pro instance

#### Equivalent CLI commands

```
certifai remote config --file certifai-kubeconfig.json --alias <remote_alias>
```

In [None]:
# necessary imports for creating a scan

from certifai.scanner.builder import (CertifaiScanBuilder, CertifaiModel, CertifaiModelMetric,
                                      CertifaiDataset, CertifaiGroupingFeature, CertifaiDatasetSource,
                                      CertifaiPredictionTask, CertifaiTaskOutcomes, CertifaiOutcomeValue)
from certifai.scanner.report_utils import scores, construct_scores_dataframe


### define cortex certifai task type

- `CertifaiTaskOutcomes`: Cortex Certifai supports classification as well as regression models. Here we have an example of binary classification (e.g. predict whether a loan should be granted or not)
- `CertifaiOutcomeValue`: defines the different outcomes possible from the model predictions. Here we have a model that predicts either 1 (loan granted) or 2 (loan denied)

**Note**: Please refer to [Certifai Api Docs](https://cognitivescale.github.io/cortex-certifai/certifai-api-ref/certifai.scanner.builder.html) for more details

In [None]:
# Create the scan object from scratch using the ScanBuilder class with tasks and outcomes

# First define the possible prediction outcomes
task = CertifaiPredictionTask(CertifaiTaskOutcomes.classification(
    [
        CertifaiOutcomeValue(1, name='Loan granted', favorable=True),
        CertifaiOutcomeValue(2, name='Loan denied')
    ]),
    prediction_description='Determine whether a loan should be granted')

#  create a certifai scan object and add the certifai task created above
scan = CertifaiScanBuilder.create('model_auth_demo',
                                  prediction_task=task)


In [None]:
!pip install azure-storage-blob

In [None]:
import os, uuid
from azure.storage.blob import BlobServiceClient, BlobClient, ContainerClient
# set credentials for azure storage account
az_credentials = 'REDACTED'
# set connection string as env var so the Certifai Library can pick it up
os.environ['AZURE_CONNECTION_STRING'] = az_credentials
# create a client for the Azure Storage Account (the dataset itself is uploaded in the next cell)
client = BlobServiceClient.from_connection_string(az_credentials)
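
To confirm which storage account the environment variable points at without printing the secret key, the connection string can be parsed locally. This is a minimal sketch that assumes an account-key style connection string (`;`-separated `key=value` pairs that include `AccountName`); `storage_account_name` is a hypothetical helper, not part of the Azure SDK.

```python
import os


def storage_account_name(env_var='AZURE_CONNECTION_STRING'):
    """Return the AccountName from the connection string stored in env_var."""
    conn_str = os.environ.get(env_var)
    if not conn_str:
        raise RuntimeError(f'{env_var} is not set')
    # Split each pair on the first '=' only, because base64 account keys can end in '='
    parts = dict(p.split('=', 1) for p in conn_str.split(';') if '=' in p)
    if 'AccountName' not in parts:
        raise ValueError(f'{env_var} does not contain an AccountName entry')
    return parts['AccountName']
```

Comparing the returned name with the Storage Account configured in the Certifai Console is a cheap way to catch a mismatched connection string before uploading data.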

In [None]:
# upload our eval dataset to the blob storage container
az_container_name = 'set_container_name' # this container should already exist. You can create one from the Azure Portal
german_credit_eval_data_file = "data/german_credit_eval.csv"
az_german_credit_blob_name = 'az-certifai-examples/german_credit_eval.csv'

blob_client = client.get_blob_client(container=az_container_name, blob=az_german_credit_blob_name)
with open(german_credit_eval_data_file, 'rb') as f:
    blob_client.upload_blob(f)

container_client = client.get_container_client(az_container_name)
blob_list = container_client.list_blobs()
for blob in blob_list:
    print("\n" + blob.name)



### add logistic and svm models (created above) to scan object

Additional parameters that may be provided to the `CertifaiModel` class are listed in the [API Reference for CertifaiModel](https://cognitivescale.github.io/cortex-certifai/certifai-api-ref-1.2.14/certifai.scanner.builder.html#certifai.scanner.builder.CertifaiModel)

or `?CertifaiModel`

In [None]:
# Create a Certifai Model Object using the web service (from earlier) by passing the deployed web service url
first_model = CertifaiModel('SVM',
                            predict_endpoint=service_svm_uri)
scan.add_model(first_model)

second_model = CertifaiModel('logistic',
                            predict_endpoint=service_logistic_uri)
scan.add_model(second_model)

# Add corresponding model headers for service authentication and content-type

# add the default headers applicable to all models
scan.add_model_header(header_name='Content-Type',header_value='application/json')

# add defined headers corresponding to auth keys for respective model services
scan.add_model_header(header_name='Authorization', header_value=f'Bearer {service_svm_keys[0]}', model_id='SVM')
scan.add_model_header(header_name='Authorization', header_value=f'Bearer {service_logistic_keys[0]}', model_id='logistic')
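
To make the default/defined split concrete, the sketch below shows which headers a request to each model would end up carrying. It only illustrates the idea and is not Certifai's implementation; `headers_for_model` is hypothetical, the keys are placeholders, and letting a model-specific header override a default with the same name is an assumption.

```python
def headers_for_model(model_id, default_headers, defined_headers):
    """Combine the default headers with the headers defined for one model."""
    merged = dict(default_headers)
    for entry in defined_headers:
        if entry['model_id'] == model_id:
            # assumption: a model-specific header replaces a default of the same name
            merged[entry['name']] = entry['value']
    return merged


default_headers = {'Content-Type': 'application/json'}
defined_headers = [
    {'model_id': 'SVM', 'name': 'Authorization', 'value': 'Bearer <svm-key>'},
    {'model_id': 'logistic', 'name': 'Authorization', 'value': 'Bearer <logistic-key>'},
]
svm_headers = headers_for_model('SVM', default_headers, defined_headers)
```

Here `svm_headers` holds the shared `Content-Type` plus the SVM service's own `Authorization` value, which is why each model needs its own `model_id` entry.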



### add the evaluation dataset to scan object

- `evaluation dataset` dataset to be used by cortex certifai to evaluate the model against

In [None]:
# create an evaluation object and pass the evaluation dataset(csv) here 
eval_dataset = CertifaiDataset('evaluation',
                               CertifaiDatasetSource.csv(url=f'abfs://{az_container_name}/{az_german_credit_blob_name}'))
scan.add_dataset(eval_dataset)

### evaluating model fairness 

- add `fairness` as evaluation type to scan object
- set `evaluation_dataset_id` to refer to the evaluation dataset added above

In [None]:
# Setup an evaluation for fairness on the above dataset using the model
# We'll look at disparity between groups defined by marital status and age
scan.add_fairness_grouping_feature(CertifaiGroupingFeature('age'))
scan.add_fairness_grouping_feature(CertifaiGroupingFeature('status'))
scan.add_evaluation_type('fairness')
scan.evaluation_dataset_id = 'evaluation'

In [None]:
# Because the dataset contains a ground truth outcome column which the model does not
# expect to receive as input we need to state that in the dataset schema (since it cannot
# be inferred from the CSV)
scan.dataset_schema.outcome_feature_name = 'outcome'

In [None]:
local_scan_definition_file = 'data/german_credit_scan_definition.yaml'
model_headers_template = f"""
model_headers:
  default:
  - name: Content-Type
    value: application/json
  - name: accept
    value: application/json
  defined:
  - model_id: SVM
    name: Authorization
    value: Bearer {service_svm_keys[0]}
  - model_id: logistic
    name: Authorization
    value: Bearer {service_logistic_keys[0]}
"""

# save the scan definition built above to a local YAML file
with open(local_scan_definition_file, 'w') as f:
    scan.save(f)
# we also need to add the model headers section separately
with open(local_scan_definition_file, 'a') as f:
    f.write(model_headers_template)


In [None]:
# run a remote scan
from certifai.client.remote import remote_scan

reports_folder = f'abfs://{az_container_name}/az_certifai_examples/reports'

!certifai remote scan --alias $remote_alias --definition-file $local_scan_definition_file --output $reports_folder

In [None]:
# Check the status of the triggered remote scan job
# (replace the job name after -n with the name of your own scan job, e.g. from `certifai remote list`)
!certifai remote logs -a $remote_alias -n certifai-scanner-9a695932

## View the reports from this Remote Scan

Now, open the URL of the Certifai Console on the Certifai Pro VM instance we created earlier in this tutorial. Use the `User Icon` on the top right and select `Storage Settings` from the dropdown. Update the `Scan Directory` field to the value of the `reports_folder` variable set in the remote scan cell above, omitting the `abfs://` prefix when you paste it.

Now, save your settings and wait while the page reloads and loads the reports from the previous scan.
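
The `Scan Directory` field in the Certifai Console takes the reports location without the `abfs://` scheme. A small sketch of deriving that value from an `abfs://` URL such as `reports_folder`; both helpers are hypothetical and `mycontainer` is a placeholder container name.

```python
ABFS_PREFIX = 'abfs://'


def abfs_url(container, path):
    """Build an abfs:// URL for a path inside a blob container."""
    return f"{ABFS_PREFIX}{container}/{path.strip('/')}"


def scan_directory_value(url):
    """Return the URL without its abfs:// scheme, as pasted into the Scan Directory field."""
    return url[len(ABFS_PREFIX):] if url.startswith(ABFS_PREFIX) else url


example_reports_folder = abfs_url('mycontainer', 'az_certifai_examples/reports')
example_scan_directory = scan_directory_value(example_reports_folder)
```

With the placeholder container, `example_scan_directory` is the container name followed by the reports path, which is the form the `Scan Directory` field expects.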


## Resource Cleanup


- In this tutorial we
 - created and registered `logistic_model_azure` and `svm_model_azure` models to our Azure workspace
 - created `german-credit-logistic-service` and `german-credit-svm-service` ACI (Azure Container Instance) webservices 

- Once the Cortex Certifai evaluation is complete, make sure to delete all Azure resources you no longer need to avoid unnecessary cost
- Follow the [Azure ML resource cleanup docs][1] to remove all resources created above

[1]:https://docs.microsoft.com/en-us/azure/machine-learning/tutorial-1st-experiment-sdk-train#clean-up-resources