# Conditioned Insights

## Overview

Howso Engine enables powerful prediction insights through its many views of prediction stats. It includes a variety of ways to view and condition these statistics.

For this recipe, we will use the `Adult` dataset and explore how the prediction performance differs by `sex`.

In [1]:
import os

import numpy as np
import pandas as pd
import plotly.io as pio
import plotly.express as px
from pmlb import fetch_data

from howso.engine import Trainee
from howso.utilities import infer_feature_attributes

pio.renderers.default = os.getenv("HOWSO_RECIPE_RENDERER", "notebook")

# Section 1: Load, Train, Analyze

The [basic workflow guide](https://docs.howso.com/user_guide/basics/basic_workflow.html) covers the individual steps of this section in more detail. This recipe focuses on the insights.

In [18]:
df = fetch_data('adult', local_cache_dir="../../data/adult")

# Infer feature attributes from the full dataset before subsampling
features = infer_feature_attributes(df)

# Subsample the data to ensure the example runs quickly
df = df.sample(2000)

# Split out the last row for a prediction set
test_case = df.iloc[[-1]].copy()

# Set the sex to always be female. Assign to the whole column: after sampling,
# the row label is unlikely to be 0, so `.at[0, 'sex']` would append a new row.
test_case['sex'] = 0

# Remove the test case from the training data and drop the Action Feature from it
df = df.drop(df.index[-1])
test_case = test_case.drop('target', axis=1)

In [17]:
type(test_case)

pandas.core.frame.DataFrame

In [3]:
t = Trainee(features=features)

action_features = ['target']
context_features = features.get_names(without=action_features)

t.train(df)

t.analyze(context_features=context_features, action_features=action_features)

t.react_into_trainee(action_feature=action_features[0], residuals=True)

The following parameters from configuration file will override the Amalgam parameters set in the code: {'library_path', 'trace'}


# Prediction Stats

Howso provides a variety of prediction stats based on the variable type, whether it is continuous or nominal. These prediction stats give insight into the predictive performance of the Trainee as well as into the data itself.

The [bias_mitigation.ipynb](https://github.com/howsoai/howso-engine-recipes/blob/main/recipes/2-Workflows/bias_mitigation/bias_mitigation.ipynb) recipe highlights looking for bias in the data through the analysis of individual features. This recipe demonstrates how insights like overall data bias can be explored.

### Global Prediction Stats


Global prediction stats provide an overall view of how accurate the Trainee is. This can be a great introductory look into the usability of the data and Trainee. Most machine learning models are evaluated on similar global metrics.

For more information on the Global statistics, see the [Global vs Local](https://docs.howso.com/user_guide/concepts/global_vs_local.html) documentation.


In [4]:
global_stats = t.get_prediction_stats()['target']
global_stats

accuracy          0.840000
mcc               0.520827
precision         0.725543
recall            0.800676
mae               0.267338
spearman_coeff         NaN
rmse                   NaN
r2                     NaN
Name: target, dtype: float64
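
As a reminder of what the nominal metrics above measure, here is a minimal sketch that computes accuracy, precision, recall, and MCC from a confusion matrix using their standard textbook definitions for a binary target. The helper `binary_stats` and the toy labels are hypothetical, and this is not Howso's implementation, which may aggregate these values differently (for example, by averaging across classes).

```python
import math


def binary_stats(y_true, y_pred, positive=1):
    """Accuracy, precision, recall, and MCC for binary labels (hypothetical helper)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    def ratio(num, den):
        # Return NaN instead of raising when a metric is undefined.
        return num / den if den else float("nan")

    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "mcc": ratio(tp * tn - fp * fn, mcc_den),
    }


# Toy labels for illustration only; these are not taken from the Adult dataset.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1, 0, 1]
stats = binary_stats(y_true, y_pred)
print(stats)
```

Unlike accuracy, MCC accounts for all four cells of the confusion matrix, which makes it a useful companion metric when the classes are imbalanced.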

### Local Prediction Stats

In addition to global prediction stats, Howso has the unique ability to provide local prediction stats. These prediction stats are more finely tuned to the exact cases in which you are interested. This can be extremely valuable for workflows such as data exploration, and it provides more nuanced performance metrics. The more the performance varies from region to region of the dataset, the more powerful this ability is.

In [5]:
results = t.react(
    test_case,
    context_features=context_features,
    action_features=action_features,
    details = {
        "prediction_stats": True
    }
)

local_stats = results['details']['prediction_stats'][0]
local_stats['accuracy'][action_features[0]]

0.7941176470588235

We can see how the local prediction stats differ from the global stats. Using local metrics, we can drill into specific cases. For example, for our test case, which is manually set to `female`, we can get insights into how well the Trainee fits similar cases. This does not guarantee that all of the other cases in the local space are also female; however, it increases the chances that they are.

If the local accuracy drops well below the global accuracy, this suggests that the Trainee's predictive performance in this region of the data is weaker than across the data as a whole.
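
To make the global versus local distinction concrete, here is a minimal sketch that compares accuracy over all cases with accuracy over only the `k` cases nearest to a query point. It assumes that "local" can be approximated as "the nearest cases on a single numeric feature"; the helper `local_accuracy`, the feature `x`, and the toy cases are all hypothetical, and this is not how Howso computes its local stats internally.

```python
def local_accuracy(cases, query, k):
    """Accuracy over the k cases closest to `query` (hypothetical helper)."""
    nearest = sorted(cases, key=lambda c: abs(c["x"] - query))[:k]
    return sum(c["actual"] == c["predicted"] for c in nearest) / k


# Toy cases: predictions are reliable near x = 1-3 and unreliable near x = 7-8.
cases = [
    {"x": 1.0, "actual": 0, "predicted": 0},
    {"x": 1.5, "actual": 0, "predicted": 0},
    {"x": 2.0, "actual": 0, "predicted": 0},
    {"x": 2.5, "actual": 0, "predicted": 0},
    {"x": 7.0, "actual": 1, "predicted": 0},
    {"x": 7.5, "actual": 0, "predicted": 1},
    {"x": 8.0, "actual": 1, "predicted": 1},
    {"x": 8.5, "actual": 1, "predicted": 1},
]

global_acc = sum(c["actual"] == c["predicted"] for c in cases) / len(cases)
local_acc = local_accuracy(cases, query=7.2, k=4)
print(f"Global accuracy: {global_acc}")
print(f"Local accuracy near x=7.2: {local_acc}")
```

In this toy data the global figure hides a region where half of the predictions are wrong, which is the kind of gap the comparison above is meant to surface.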

# Conditioned Prediction Stats

Zooming back out to global prediction stats, Howso can also condition the prediction stats by providing conditions for the context set and action set. Similar to context features and action features, the context set is the set queried to make predictions and the action set is the set that the predictions are made for. In other words, the context set is the known data, and the action set is the unknown data that you are trying to predict using the context set.

This conditioning lets us segment the context and action sets to gain more detailed insights into the relationships between groups and into the Trainee's predictive performance across them.

### Action Condition Only

By conditioning on an action condition only, prediction stats for a certain segment are returned by holding out each case from the action set in turn and making predictions in a Leave One Out (LOO) fashion. Thus, using this method, the context set consists of every case except the case being predicted at the time. After each case from the action set is predicted, the case is returned to the context set, making it available to be queried for other predictions.

In [None]:
sex_0_accuracy = t.get_prediction_stats(action_condition={'sex': 0})['target']['accuracy'].round(2)
sex_1_accuracy = t.get_prediction_stats(action_condition={'sex': 1})['target']['accuracy'].round(2)
print(f"Sex 0 accuracy: {sex_0_accuracy}")
print(f"Sex 1 accuracy: {sex_1_accuracy}")

If we see a difference in the model performance between males and females, this may indicate further bias that needs to be investigated, and/or that this Trainee is not suitable for inference on both sexes.

If this Trainee and data were used to make decisions on loan acceptance based on income, these differences in performance may indicate that this Trainee and data may be suitable for predictions on males but not females. This can lead to actions like gathering better data that captures the characteristics of females better.
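
The leave-one-out procedure described in this section can be sketched on toy data. This is a minimal illustration under stated assumptions, not Howso's implementation: it swaps in a plain 1-nearest-neighbor predictor on a single numeric feature, and the names `nearest_label`, `loo_accuracy`, `group`, `x`, and `y` are hypothetical.

```python
def nearest_label(context, x):
    """Predict the label of the closest context case (1-nearest neighbor)."""
    return min(context, key=lambda c: abs(c["x"] - x))["y"]


def loo_accuracy(cases, action_condition):
    """Leave-one-out accuracy over the cases matching `action_condition`."""
    action_set = [
        c for c in cases
        if all(c[key] == value for key, value in action_condition.items())
    ]
    correct = 0
    for case in action_set:
        # Hold out only the case being predicted; every other case,
        # from either group, stays available as context.
        context = [c for c in cases if c is not case]
        correct += nearest_label(context, case["x"]) == case["y"]
    return correct / len(action_set)


# Toy data: group 0 is cleanly separable, group 1 is not.
cases = [
    {"group": 0, "x": 1, "y": 0},
    {"group": 0, "x": 2, "y": 0},
    {"group": 0, "x": 9, "y": 1},
    {"group": 0, "x": 10, "y": 1},
    {"group": 1, "x": 3, "y": 0},
    {"group": 1, "x": 5, "y": 1},
    {"group": 1, "x": 7, "y": 0},
    {"group": 1, "x": 12, "y": 1},
]

group_0_acc = loo_accuracy(cases, {"group": 0})
group_1_acc = loo_accuracy(cases, {"group": 1})
print(f"Group 0 accuracy: {group_0_acc}")
print(f"Group 1 accuracy: {group_1_acc}")
```

Because each held-out case is returned to the context before the next prediction, cases from both groups contribute to every prediction; only the set being scored is restricted by the condition.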

### Action Condition and Context Condition

Another way to see how different groups can differ is by specifying both action and context conditions. Unlike when only the action condition is specified, specifying both conditions will hold out `ALL` of the action set. After a case from the action set is used to make a prediction, unlike when just the action condition is specified, that case will continue to be held out. Thus, using this method, the cases from the context set and action set remain separate during the entire process.

In [26]:
sex_1_conditioned_accuracy = t.get_prediction_stats(context_condition={'sex': 0}, action_condition={'sex': 1})['target']['accuracy'].round(2)
sex_0_conditioned_accuracy = t.get_prediction_stats(context_condition={'sex': 1}, action_condition={'sex': 0})['target']['accuracy'].round(2)
print(f"Sex 1 accuracy: {sex_1_conditioned_accuracy}")
print(f"Sex 0 accuracy: {sex_0_conditioned_accuracy}")

Sex 1 accuracy: 0.69
Sex 0 accuracy: 0.87
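
The fully separated evaluation can also be sketched on toy data. As a minimal illustration under stated assumptions, and not Howso's implementation, the snippet below uses a plain 1-nearest-neighbor predictor on a single numeric feature, with the context set and action set kept disjoint for the whole run. The names `matching`, `nearest_label`, `conditioned_accuracy`, `group`, `x`, and `y` are hypothetical.

```python
def matching(cases, condition):
    """Return the cases whose fields equal every value in `condition`."""
    return [
        c for c in cases
        if all(c[key] == value for key, value in condition.items())
    ]


def nearest_label(context, x):
    """Predict the label of the closest context case (1-nearest neighbor)."""
    return min(context, key=lambda c: abs(c["x"] - x))["y"]


def conditioned_accuracy(cases, context_condition, action_condition):
    """Accuracy on the action set when predicting from the context set only."""
    context = matching(cases, context_condition)
    action_set = matching(cases, action_condition)
    # The action cases are never added to the context, so the two sets
    # stay separate for every prediction.
    correct = sum(nearest_label(context, c["x"]) == c["y"] for c in action_set)
    return correct / len(action_set)


# Toy data with two groups for illustration only.
cases = [
    {"group": 0, "x": 1, "y": 0},
    {"group": 0, "x": 2, "y": 0},
    {"group": 0, "x": 9, "y": 1},
    {"group": 0, "x": 10, "y": 1},
    {"group": 1, "x": 3, "y": 0},
    {"group": 1, "x": 5, "y": 1},
    {"group": 1, "x": 7, "y": 0},
    {"group": 1, "x": 12, "y": 1},
]

group_1_from_0 = conditioned_accuracy(cases, {"group": 0}, {"group": 1})
group_0_from_1 = conditioned_accuracy(cases, {"group": 1}, {"group": 0})
print(f"Group 1 predicted from group 0: {group_1_from_0}")
print(f"Group 0 predicted from group 1: {group_0_from_1}")
```

The two directions need not agree: one group can describe the other better than the reverse, which is the same kind of asymmetry seen in the accuracies printed above.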


This method may reveal extra insights that conditioning on the action set alone may not. Since the entire action set is held out, the prediction stats are based purely on the context set. Differences between the sets may also be explored using this method. For example, if only male cases are used to predict female cases and the performance drops greatly from the global prediction stats, this provides further indication of a difference in the underlying data and relationships between females and males. A difference in the performance may also shed light into 