### What is fairness and why do we care?

Fairness is one of the most important subjects, if not the most important, in the field of artificial intelligence (AI). AI fairness has gained traction in the past few years due to the increasing number of incidents in which AI systems made unfair decisions. The trend is visible in this [comparison graph](https://trends.google.com/trends/explore?date=all&q=ai%20fairness,fainess) of the interest over time in the search terms "AI" and "AI Fairness".

While we all have a general idea of what is fair, there is no universal definition of fairness. Fairness depends on context. For instance, a model that rates your job application based on your gender can be considered unfair, while a model that recommends your diet should take your gender into account (if you allow it). We believe that no single metric can tell whether an AI system is fair or unfair. Instead, the system should be as transparent and as agile as possible, so that not only the developers but also the users can judge the system's fairness and adjust the system accordingly.

Aito cares about fairness and we would like to enable developers to create a fair system by:
- Detecting bias in the training data
- Exposing the model's reasoning behind each result
- Allowing full control of the model so that it can be adjusted to achieve fairer results

#### Detecting bias

Before building an AI system, we should try to detect whether the training data contains any bias. This is cost-efficient: it avoids having to readjust or even reimplement the system if it is found to be unfair after it has been built.

One way to detect bias is the statistical parity difference metric. Assume that your data has a "protected" variable. In the example below, we use the [German Credit Dataset](http://archive.ics.uci.edu/ml/datasets/Statlog+%28German+Credit+Data%29), and the protected group is older people (age >= 65), because we are concerned that a model would assign a higher credit risk to them. The statistical parity difference, in this case, is the difference between the probability that an older person is given a bad credit rating and the probability that a person who is not old is given a bad credit rating. It can be formulated as:
```
    P(bad credit rating | old age) - P(bad credit rating | ! old age)
```
We can use the [Relate Query](https://aito.ai/docs/api/#post-api-v1-relate) to calculate the statistical parity difference simply by taking the difference between ```ps.pOnCondition``` and ```ps.pOnNotCondition``` in the response:

In [1]:
from aito.sdk.aito_client import AitoClient
import json
client = AitoClient('http://localhost:9005', '')
relate_query = {
    'from': 'german_credit_rating',
    'where': {
        'age': {'$gte': 65}
    },
    'relate': {'credit_rating': 'bad'}
}
resp = client.request('POST', '/api/v1/_relate', relate_query)
print(json.dumps(resp, indent=4))

{
    "offset": 0,
    "total": 1,
    "hits": [
        {
            "related": {
                "credit_rating": {
                    "$has": "bad"
                }
            },
            "lift": 0.9042169024575606,
            "condition": {
                "age": {
                    "$lte": 65
                }
            },
            "fs": {
                "f": 300.0,
                "fOnCondition": 6.0,
                "fOnNotCondition": 294.0,
                "fCondition": 23.0,
                "n": 1000.0
            },
            "ps": {
                "p": 0.3003992015968064,
                "pOnCondition": 0.27162603556858855,
                "pOnNotCondition": 0.30078340248505925,
                "pCondition": 0.023251498292873325
            },
            "info": {
                "h": 0.8817783329695427,
                "mi": 0.0028968828765771684,
                "miTrue": -0.03945618301348357,
                "miFalse": 0.042353065890060736
            

In [2]:
statistical_parity = resp['hits'][0]['ps']['pOnCondition'] - resp['hits'][0]['ps']['pOnNotCondition']
statistical_parity

-0.029157366916470695

The ideal statistical parity difference is 0. Usually, we aim for a value in the range (-0.1, 0.1).
In this case, the value is approximately -0.03, which lies within that range. The negative sign means that, in this data, older people are slightly less likely than others to be given a bad credit rating, so this metric shows no notable bias against older people.
This is an example of using Aito as a statistical machine to calculate fairness metrics of the training data.
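
As a cross-check, the same metric can be sketched in plain Python from the raw counts in the ```fs``` block of the Relate response. The function names below are hypothetical, and the reading of the counts (```fOnCondition``` as bad ratings within the group, ```fCondition``` as the group size, ```f``` as all bad ratings, ```n``` as all rows) is an assumption. The raw ratios need not match the ```ps``` probabilities that Aito reports, so the result can differ slightly from the value above.

```python
def statistical_parity_difference(bad_in_group, group_size, bad_total, total):
    """P(bad | group) - P(bad | not group), estimated from raw counts."""
    p_group = bad_in_group / group_size
    p_rest = (bad_total - bad_in_group) / (total - group_size)
    return p_group - p_rest


def within_tolerance(value, tolerance=0.1):
    """True when the metric lies inside the open range (-tolerance, tolerance)."""
    return -tolerance < value < tolerance


# Counts copied from the "fs" block of the Relate response
# (interpretation of each count is an assumption, see the text above).
spd = statistical_parity_difference(
    bad_in_group=6, group_size=23, bad_total=300, total=1000
)
print(spd, within_tolerance(spd))
```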

#### Fairness by reasoning

Beyond detecting bias and adjusting the training data, we envision a world where not only the system developers but also the end-users have full knowledge of the reasoning behind each of the model's decisions. Furthermore, they would be able to use that knowledge to make ad-hoc changes to the model. Aito realizes this vision through its reasoning layer and its query language.

Let's first take a look at Aito's reasoning layer. The example below shows how Aito explains a credit rating prediction based on the given customer information (which is passed in the ``where`` clause of the prediction query).

In [3]:
predict_query = {
    "from": "german_credit_rating",
    "where": {
        "age": { "$numeric": 25 },
        "credit_amount": { "$numeric": 1295 },
        "credit_history": "existing credits paid back duly till now",
        "duration": { "$numeric": 12 },
        "existing_checking_account_status": "[0, 200)",
        "foreign_worker": "yes",
        "housing": "rent",
        "installment_rate_in_percentage_of_disposable_income": { "$numeric": 3 },
        "job": "skilled employee / official",
        "number_of_existing_credits_at_this_bank": { "$numeric": 1 },
        "number_of_people_being_liable_to_provide_maintenance_for": { "$numeric": 1 },
        "other_debtors_or_guarantors": "none",
        "other_installment_plans": "none",
        "personal_status_and_sex": "female : divorced / separated / married",
        "present_employment_since": "< 1",
        "present_residence_since": { "$numeric": 1 },
        "property": "car or other, not in attribute 6",
        "purpose": "car (new)",
        "savings_account_or_bonds": "< 100",
        "telephone": "none"
    },
    "predict": "credit_rating",
    "select": ["$p", "feature", "$why"]}
resp = client.request('POST', '/api/v1/_predict', predict_query)
print(json.dumps(resp, indent=4))

{
    "offset": 0,
    "total": 2,
    "hits": [
        {
            "$p": 0.7700940269366429,
            "feature": "bad",
            "$why": {
                "type": "product",
                "factors": [
                    {
                        "type": "baseP",
                        "value": 0.3003992015968064,
                        "proposition": {
                            "credit_rating": {
                                "$has": "bad"
                            }
                        }
                    },
                    {
                        "type": "normalizer",
                        "name": "exclusiveness",
                        "value": 1.0107261952705602
                    },
                    {
                        "type": "relatedPropositionLift",
                        "proposition": {
                            "existing_checking_account_status": {
                                "$has": "[0, 200)"
                            

From the result, you can see that the model predicts a bad credit rating for the given customer with a probability of 0.77 (```"$p": 0.7700940269366429```), and the explanation is given in the ```factors``` field of ```$why```.

One of the explanation factors is:

```
{
    "type": "relatedPropositionLift",
    "proposition": {
        "age": {
            "$numeric": 25.0
        }
    },
    "value": 1.2571934477042366
}
```

which means that, because the age of the customer is approximately 25, the estimated chance of a bad credit rating is multiplied by about 1.26, an increase of roughly 26%.
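
To make the reading of a lift concrete, here is a minimal sketch that pulls the ```relatedPropositionLift``` factors out of a ```$why``` object and expresses each lift as a percentage change. The helper names are hypothetical, and the sample ```why``` dictionary contains only the three factors quoted in this post, not the full explanation returned by Aito.

```python
def proposition_lifts(why):
    """Collect (proposition, lift) pairs from a `$why` object."""
    return [
        (factor["proposition"], factor["value"])
        for factor in why.get("factors", [])
        if factor.get("type") == "relatedPropositionLift"
    ]


def lift_as_percent_change(lift):
    """A lift of 1.0 means no change; 1.25 means +25%; 0.8 means -20%."""
    return (lift - 1.0) * 100.0


# Partial explanation: only the factors shown in this post.
why = {
    "type": "product",
    "factors": [
        {"type": "baseP", "value": 0.3003992015968064,
         "proposition": {"credit_rating": {"$has": "bad"}}},
        {"type": "normalizer", "name": "exclusiveness",
         "value": 1.0107261952705602},
        {"type": "relatedPropositionLift",
         "proposition": {"age": {"$numeric": 25.0}},
         "value": 1.2571934477042366},
    ],
}

# List the strongest influences first, in either direction.
ranked = sorted(
    proposition_lifts(why), key=lambda item: abs(item[1] - 1.0), reverse=True
)
for proposition, lift in ranked:
    print(proposition, f"{lift_as_percent_change(lift):+.1f}%")
```

Sorting by distance from 1.0 is a design choice for display: it surfaces the propositions that move the prediction most, whether they raise or lower it.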

Assume that the system shows this explanation to the end-users and they want the model to leave their age out of the decision. The system simply has to change the query by removing the age proposition from the ```where``` clause and request a new prediction:

In [4]:
predict_query['where'].pop('age')
resp = client.request('POST', '/api/v1/_predict', predict_query)
print(json.dumps(resp, indent=4))

{
    "offset": 0,
    "total": 2,
    "hits": [
        {
            "$p": 0.7003891527556237,
            "feature": "bad",
            "$why": {
                "type": "product",
                "factors": [
                    {
                        "type": "baseP",
                        "value": 0.3003992015968064,
                        "proposition": {
                            "credit_rating": {
                                "$has": "bad"
                            }
                        }
                    },
                    {
                        "type": "normalizer",
                        "name": "exclusiveness",
                        "value": 1.009684912758094
                    },
                    {
                        "type": "relatedPropositionLift",
                        "proposition": {
                            "existing_checking_account_status": {
                                "$has": "[0, 200)"
                            }
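
The cell above removes the proposition with ```pop```, which changes ```predict_query``` in place. If the original query should stay available for comparison, one option is a small helper that returns a modified copy. This is only a sketch: ```opt_out``` is a hypothetical name, and the sample query is shortened to three of the fields used in this post.

```python
import copy


def opt_out(query, fields):
    """Return a copy of `query` whose `where` clause omits the given fields."""
    new_query = copy.deepcopy(query)  # leave the caller's query untouched
    where = new_query.get("where", {})
    for field in fields:
        where.pop(field, None)  # ignore fields that are not present
    return new_query


# Shortened version of the prediction query, for illustration only.
sample_query = {
    "from": "german_credit_rating",
    "where": {
        "age": {"$numeric": 25},
        "credit_amount": {"$numeric": 1295},
        "housing": "rent",
    },
    "predict": "credit_rating",
}

without_age = opt_out(sample_query, ["age"])
print(sorted(without_age["where"]))
```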

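Without the age proposition, the predicted probability of a bad credit rating drops from about 0.77 to about 0.70. We would also like to know how leaving out the age proposition affects prediction accuracy. This can be measured with the [Evaluate Query](https://aito.ai/docs/api/#post-api-v1-evaluate).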

In [5]:
import numpy as np

def k_fold_evaluation_predict_credit_rating(k, propositions):
    """Evaluate credit_rating prediction over k folds and return 'mean +- std' of the accuracy."""
    accuracy = []

    for batch_idx in range(k):
        print(f'evaluating batch {batch_idx}')
        eval_query = {
            # Test on the rows whose index modulo k equals batch_idx
            'test': {'$index': {'$mod': [k, batch_idx]}},
            'evaluate': {
                'from': 'german_credit_rating',
                'where': propositions,
                'predict': 'credit_rating'
            }
        }
        # Evaluation can take a while, so it is submitted as a job
        resp = client.job_request('/api/v1/jobs/_evaluate', eval_query)
        accuracy.append(resp['accuracy'])
    return f'{np.average(accuracy)} +- {np.std(accuracy)}'
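
The ```$mod``` selector used in the evaluation query can be pictured with a small plain-Python sketch. It assumes that ```{'$index': {'$mod': [k, i]}}``` picks the rows whose index leaves remainder ```i``` when divided by ```k```; the function name ```mod_folds``` is hypothetical and is not part of Aito.

```python
def mod_folds(n_rows, k):
    """Group row indices 0..n_rows-1 into k folds by index modulo k."""
    return [[i for i in range(n_rows) if i % k == fold] for fold in range(k)]


folds = mod_folds(n_rows=1000, k=10)

# Every row lands in exactly one test fold, so the k evaluations
# together test each row once.
all_indices = sorted(i for fold in folds for i in fold)
assert all_indices == list(range(1000))

print([len(fold) for fold in folds])
```

Under that assumption, each of the 10 evaluations tests on a different tenth of the data, which is what makes the averaged accuracy a k-fold estimate.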


In [None]:
columns_schema = client.get_table_schema('german_credit_rating')['columns']
proposition_cols = [col for col in columns_schema if col != 'credit_rating']
original_propositions = {
    col: ({'$numeric': {'$get': col}} if columns_schema[col]['type'] == 'Int' else {'$get': col})  
    for col in proposition_cols
}
print(f'Original accuracy: {k_fold_evaluation_predict_credit_rating(10, original_propositions)}')
opted_age_propositions = {field: original_propositions[field] for field in original_propositions if field != 'age'}
print(f'Opted out age accuracy: {k_fold_evaluation_predict_credit_rating(10, opted_age_propositions)}')

evaluating batch 0
evaluating batch 1
evaluating batch 2
evaluating batch 3
evaluating batch 4
evaluating batch 5


### Summary

While no single approach, tool, or product can ensure that you build a fair AI system, we hope that Aito will be a useful tool for promoting fairness at every step of developing an AI system, from examining the training data to building a transparent and agile model.