## Introduction

This tutorial introduces Aito's explainability and its different query languages to developers who are interested in an AI system that can explain its results and allow the user to make ad-hoc adjustments. The tutorial does not require a background in data science.

The tutorial uses the [German Credit Dataset](https://archive.ics.uci.edu/ml/datasets/statlog+(german+credit+data)) to predict a person's credit rating based on a set of attributes.

## Predict with explanation

Aito offers the [Predict API](https://aito.ai/docs/api/), which predicts a feature based on a set of hypotheses. In this example, the predicted feature is the credit rating and the hypotheses are the applicant's attributes. An example query is shown below, with the attributes given in the `"where"` clause. We run the query with the [AitoClient](https://aitodotai.github.io/aito-python-tools/api/aito_client.html?highlight=aitoclient#) from the [Aito Python SDK](https://aitodotai.github.io/aito-python-tools/index.html).

In [None]:
!pip install aitoai

In [1]:
from aito.sdk.aito_client import AitoClient
import json

In [2]:
client = AitoClient('https://public-1.api.aito.ai', 'bvss2i2dIkaWUfBCdzEO89LpxUkwO3A24hYg8MBq')
predict_query = {
    "from": "german_credit_rating",
    "where": {
        "existing_checking_account_status": "< 0",
        "duration": {"$numeric": 48},
        "credit_history": "existing credits paid back duly till now",
        "purpose": "business",
        "credit_amount": {"$numeric": 4308},
        "savings_account_or_bonds": "< 100",
        "present_employment_since": "< 1",
        "installment_rate_in_percentage_of_disposable_income": 3,
        "personal_status_and_sex": "female : divorced \/ separated \/ married",
        "other_debtors_or_guarantors": "none",
        "present_residence_since": 4,
        "property": "building society savings agreement \/ life insurance",
        "age": {"$numeric": 24},
        "other_installment_plans": "none",
        "housing": "rent",
        "number_of_existing_credits_at_this_bank": 1,
        "job": "skilled employee \/ official",
        "number_of_people_being_liable_to_provide_maintenance_for": 1,
        "telephone": "none",
        "foreign_worker": "yes",
    },
    "predict": "credit_rating",
}
resp = client.request('POST', '/api/v1/_predict', predict_query)
print(json.dumps(resp, indent=4))

{
    "offset": 0,
    "total": 2,
    "hits": [
        {
            "$p": 0.7751513479468853,
            "field": "credit_rating",
            "feature": "bad"
        },
        {
            "$p": 0.22484865205311466,
            "field": "credit_rating",
            "feature": "good"
        }
    ]
}


Aito predicts that the person has a bad credit rating with a probability of approximately 77.5%.
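The ranking can also be read programmatically. The sketch below is illustrative only: `top_prediction` is a hypothetical helper, not part of the Aito SDK, and `sample_resp` is copied from the response printed above so that the snippet runs without a network call.

```python
def top_prediction(response):
    """Return (feature, probability) of the most probable hit."""
    best = max(response["hits"], key=lambda hit: hit["$p"])
    return best["feature"], best["$p"]


# Copied from the response printed above.
sample_resp = {
    "offset": 0,
    "total": 2,
    "hits": [
        {"$p": 0.7751513479468853, "field": "credit_rating", "feature": "bad"},
        {"$p": 0.22484865205311466, "field": "credit_rating", "feature": "good"},
    ],
}

label, prob = top_prediction(sample_resp)
print(f"{label}: {prob:.1%}")  # bad: 77.5%
```

Taking the maximum over `$p` avoids relying on the order in which the hits are returned.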

The customer might feel this is unfair and want to understand why the model reached this conclusion. Aito's explanation can be requested by adding the [$why](https://aito.ai/docs/api/#schema-why-field) field to the `select` clause of the query.

In [3]:
predict_query_with_expl = {**predict_query, 'select': ['$p', 'feature', '$why']}
resp_with_expl = client.request('POST', '/api/v1/_predict', predict_query_with_expl)
print(json.dumps(resp_with_expl, indent=4))

{
    "offset": 0,
    "total": 2,
    "hits": [
        {
            "$p": 0.7751513479468853,
            "feature": "bad",
            "$why": {
                "type": "product",
                "factors": [
                    {
                        "type": "baseP",
                        "value": 0.3003992015968064,
                        "proposition": {
                            "credit_rating": {
                                "$has": "bad"
                            }
                        }
                    },
                    {
                        "type": "product",
                        "factors": [
                            {
                                "type": "normalizer",
                                "name": "exclusiveness",
                                "value": 0.9520592512864998
                            },
                            {
                                "type": "normalizer",
                                "name": 

We can see the different components that contribute to the prediction. For instance, the following component:
```json
{
    "type": "relatedPropositionLift",
    "proposition": {
        "existing_checking_account_status": {
            "$has": "< 0"
        }
    },
    "value": 1.6374480578376625
}
```
explains that because the customer does not have money in the existing checking account, it is about 64% more likely that the customer has a bad credit rating (the lift of roughly 1.64 multiplies the probability).
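To list every lift in an explanation, the `$why` tree can be walked recursively. The sketch below is a minimal illustration under stated assumptions: it assumes each node is either a `"product"` with a `factors` list or a leaf carrying a `value`, as in the fragments shown above; a full response may contain node types that this sketch ignores. `collect_lifts`, `describe_lift`, and the hand-built `sample_why` tree are hypothetical and not part of the Aito SDK.

```python
def collect_lifts(node):
    """Recursively gather (proposition, lift) pairs from a $why tree."""
    if node.get("type") == "product":
        lifts = []
        for factor in node.get("factors", []):
            lifts.extend(collect_lifts(factor))
        return lifts
    if node.get("type") == "relatedPropositionLift":
        return [(node["proposition"], node["value"])]
    return []  # baseP, normalizer, or any node type not handled here


def describe_lift(value):
    """Turn a multiplicative lift into a percentage statement."""
    change = (value - 1.0) * 100
    direction = "more" if change >= 0 else "less"
    return f"{abs(change):.1f}% {direction} likely"


# Hand-built fragment using values quoted above; the real nesting may differ.
sample_why = {
    "type": "product",
    "factors": [
        {"type": "baseP", "value": 0.3003992015968064,
         "proposition": {"credit_rating": {"$has": "bad"}}},
        {"type": "normalizer", "name": "exclusiveness", "value": 0.9520592512864998},
        {"type": "relatedPropositionLift",
         "proposition": {"existing_checking_account_status": {"$has": "< 0"}},
         "value": 1.6374480578376625},
    ],
}

for proposition, lift in collect_lifts(sample_why):
    print(proposition, "->", describe_lift(lift))
```

A lift above 1 raises the probability of the predicted value, and a lift below 1 lowers it.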

At this point, the customer might not want the model to use the existing account status. This can be done by simply removing the hypothesis.

In [4]:
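import copy
# Deep copy: a shallow .copy() shares the nested "where" dict, so pop() would also alter predict_query
predict_query_filtered = copy.deepcopy(predict_query_with_expl)
predict_query_filtered['where'].pop('existing_checking_account_status')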
resp_filtered = client.request('POST', '/api/v1/_predict', predict_query_filtered)
print(json.dumps(resp_filtered, indent=4))

{
    "offset": 0,
    "total": 2,
    "hits": [
        {
            "$p": 0.6165272150554144,
            "feature": "bad",
            "$why": {
                "type": "product",
                "factors": [
                    {
                        "type": "baseP",
                        "value": 0.3003992015968064,
                        "proposition": {
                            "credit_rating": {
                                "$has": "bad"
                            }
                        }
                    },
                    {
                        "type": "product",
                        "factors": [
                            {
                                "type": "normalizer",
                                "name": "exclusiveness",
                                "value": 0.9336904833350448
                            },
                            {
                                "type": "normalizer",
                                "name": 

## Instant Evaluation

Removing a hypothesis might reduce the accuracy. It is crucial that the system can signal when the accuracy drops and a human check might be necessary.
Aito supports this with the [Evaluate API](https://aito.ai/docs/api/#post-api-v1-evaluate), which separates the data into a training set and a test set and evaluates the query on the test set.
The following query evaluates the performance of the original query on 10% of the data:

In [8]:
eval_orig_query = {
    "test": {"$index": {"$mod": [10, 0]}},
    "evaluate": {
        "from": "german_credit_rating",
        "where": {
            "existing_checking_account_status": {"$get": "existing_checking_account_status"},
            "duration": {"$numeric": {"$get": "duration"}},
            "credit_history": {"$get": "credit_history"},
            "purpose": {"$get": "purpose"},
            "credit_amount": {"$numeric": {"$get": "credit_amount"}},
            "savings_account_or_bonds": {"$get": "savings_account_or_bonds"},
            "present_employment_since": {"$get": "present_employment_since"},
            "installment_rate_in_percentage_of_disposable_income": {"$get": "installment_rate_in_percentage_of_disposable_income"},
            "personal_status_and_sex": {"$get": "personal_status_and_sex"},
            "other_debtors_or_guarantors": {"$get": "other_debtors_or_guarantors"},
            "present_residence_since": {"$get": "present_residence_since"},
            "property": {"$get": "property"},
            "age": {"$numeric": {"$get": "age"}},
            "other_installment_plans": {"$get": "other_installment_plans"},
            "housing": {"$get": "housing"},
            "number_of_existing_credits_at_this_bank": {"$get": "number_of_existing_credits_at_this_bank"},
            "job": {"$get": "job"},
            "number_of_people_being_liable_to_provide_maintenance_for": {"$get": "number_of_people_being_liable_to_provide_maintenance_for"},
            "telephone": {"$get": "telephone"},
            "foreign_worker": {"$get": "foreign_worker"},
        },
        "predict": "credit_rating"
    }
}
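The `where` clause above repeats the same pattern for every column, so it can also be generated. The sketch below is one possible way to do that: `build_eval_where` and the two column lists are hypothetical helpers introduced for illustration, not part of the Aito SDK. The last lines show our reading of `{"$mod": [10, 0]}` as "rows whose index is divisible by 10"; this is an assumption, although it would match the `"n": 100` reported by the evaluation for this 1000-row dataset.

```python
# Columns wrapped in "$numeric" in the query above.
NUMERIC_COLUMNS = ["duration", "credit_amount", "age"]

# Columns matched by their exact value in the query above.
PLAIN_COLUMNS = [
    "existing_checking_account_status", "credit_history", "purpose",
    "savings_account_or_bonds", "present_employment_since",
    "installment_rate_in_percentage_of_disposable_income",
    "personal_status_and_sex", "other_debtors_or_guarantors",
    "present_residence_since", "property", "other_installment_plans",
    "housing", "number_of_existing_credits_at_this_bank", "job",
    "number_of_people_being_liable_to_provide_maintenance_for",
    "telephone", "foreign_worker",
]


def build_eval_where(plain, numeric):
    """Build a where clause that reads each value from the test row via $get."""
    where = {col: {"$get": col} for col in plain}
    where.update({col: {"$numeric": {"$get": col}} for col in numeric})
    return where


where = build_eval_where(PLAIN_COLUMNS, NUMERIC_COLUMNS)
generated_eval_query = {
    "test": {"$index": {"$mod": [10, 0]}},
    "evaluate": {
        "from": "german_credit_rating",
        "where": where,
        "predict": "credit_rating",
    },
}

# Assumed meaning of {"$mod": [10, 0]}: keep rows where index % 10 == 0.
test_rows = [i for i in range(1000) if i % 10 == 0]
print(len(where), len(test_rows))  # 20 100
```

Generating the clause also makes it easy to drop a column later by removing it from one of the lists.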

In [9]:
eval_resp = client.job_request('/api/v1/jobs/_evaluate', eval_orig_query)
print(json.dumps(eval_resp, indent=4))

{
    "mxe": 0.7908998174190437,
    "baseAccuracy": 0.75,
    "meanUs": 22499.24575,
    "accuracyGain": 0.030000000000000027,
    "n": 100,
    "medianUs": 18931.018,
    "accurateOffsets": [
        0,
        2,
        3,
        4,
        5,
        6,
        7,
        9,
        10,
        11,
        14,
        15,
        16,
        17,
        20,
        21,
        22,
        25,
        26,
        27,
        28,
        29,
        30,
        33,
        34,
        35,
        36,
        37,
        38,
        39,
        40,
        41,
        42,
        43,
        44,
        45,
        46,
        48,
        49,
        50,
        51,
        52,
        53,
        55,
        56,
        57,
        59,
        60,
        61,
        62,
        63,
        64,
        66,
        67,
        68,
        69,
        71,
        73,
        75,
        76,
        77,
        80,
        81,
        82,
        83,
        84,
        86,
        87

With the original query, Aito reaches an accuracy of 78%.

In [10]:
import copy
# Deep copy: a shallow .copy() shares the nested dicts, so pop() would also alter eval_orig_query
eval_filtered_query = copy.deepcopy(eval_orig_query)
eval_filtered_query['evaluate']['where'].pop('existing_checking_account_status')
eval_filtered_query_resp = client.job_request('/api/v1/jobs/_evaluate', eval_filtered_query)
print(json.dumps(eval_filtered_query_resp, indent=4))

{
    "mxe": 0.85463675334963,
    "baseAccuracy": 0.75,
    "meanUs": 17865.90243,
    "accuracyGain": -0.020000000000000018,
    "n": 100,
    "medianUs": 15218.562,
    "accurateOffsets": [
        0,
        2,
        3,
        4,
        5,
        6,
        7,
        9,
        10,
        11,
        14,
        15,
        16,
        17,
        20,
        21,
        22,
        25,
        26,
        27,
        28,
        29,
        30,
        33,
        34,
        35,
        36,
        37,
        38,
        39,
        40,
        41,
        42,
        43,
        44,
        45,
        46,
        48,
        49,
        52,
        53,
        55,
        56,
        59,
        60,
        62,
        63,
        66,
        67,
        68,
        69,
        71,
        73,
        74,
        75,
        76,
        77,
        80,
        81,
        82,
        83,
        86,
        87,
        88,
        89,
        91,
        92,
        93,

After removing the existing checking account information, the accuracy drops to 73%.
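The 78% and 73% figures are consistent with adding `accuracyGain` to `baseAccuracy` in each response. The sketch below uses that reading to flag a drop that may warrant a human check. It is an illustration only: `accuracy`, `needs_human_check`, and the `max_drop` threshold of 2 percentage points are hypothetical choices, and the two dictionaries copy just the relevant fields from the responses above.

```python
def accuracy(eval_response):
    """Accuracy as baseline accuracy plus the gain reported by the evaluation."""
    return eval_response["baseAccuracy"] + eval_response["accuracyGain"]


def needs_human_check(original, filtered, max_drop=0.02):
    """Return True when the filtered query loses more than max_drop accuracy."""
    return accuracy(original) - accuracy(filtered) > max_drop


# Relevant fields copied from the two evaluation responses above.
orig_eval = {"baseAccuracy": 0.75, "accuracyGain": 0.030000000000000027}
filtered_eval = {"baseAccuracy": 0.75, "accuracyGain": -0.020000000000000018}

print(round(accuracy(orig_eval), 2), round(accuracy(filtered_eval), 2))  # 0.78 0.73
print(needs_human_check(orig_eval, filtered_eval))  # True
```

The threshold is a policy decision; a stricter or looser value may suit a different application.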

## Summary

We believe that explainability is a crucial aspect of building a fair AI system. We hope this example demonstrates how Aito lets the user determine whether the system is fair and make adjustments for fairer results.