# Counterfactual explanations

In [1]:
import trustyai

trustyai.init(
    # JVM settings specific for mybinder.org's limitations
    "-Xmx128m", "-XX:+UseG1GC", "-XX:MaxGCPauseMillis=1000",
    path=[
        "../dep/org/kie/kogito/explainability-core/1.8.0.Final/*",
        "../dep/org/slf4j/slf4j-api/1.7.30/slf4j-api-1.7.30.jar",
        "../dep/org/apache/commons/commons-lang3/3.12.0/commons-lang3-3.12.0.jar",
        "../dep/org/optaplanner/optaplanner-core/8.8.0.Final/optaplanner-core-8.8.0.Final.jar",
        "../dep/org/apache/commons/commons-math3/3.6.1/commons-math3-3.6.1.jar",
        "../dep/org/kie/kie-api/7.55.0.Final/kie-api-7.55.0.Final.jar",
        "../dep/io/micrometer/micrometer-core/1.6.6/micrometer-core-1.6.6.jar",
    ]
)

## Simple example

We start by defining our black-box model, typically represented by

$$
f(\mathbf{x}) = \mathbf{y}
$$

Where $\mathbf{x}=\{x_1, x_2, \dots,x_m\}$ and $\mathbf{y}=\{y_1, y_2, \dots,y_n\}$.

Our example toy model takes an all-numerical input $\mathbf{x}$ and returns a $\mathbf{y}$ of `true` if the sum of the $\mathbf{x}$ components is within a threshold $\epsilon$ of a point $\mathbf{C}$, and `false` otherwise, that is:

$$
f(\mathbf{x}, \epsilon, \mathbf{C})=\begin{cases}
\text{true},\qquad \text{if}\ \mathbf{C}-\epsilon<\sum_{i=1}^m x_i <\mathbf{C}+\epsilon \\
\text{false},\qquad \text{otherwise}
\end{cases}
$$

This model is provided in the `TestUtils` module. We instantiate it with $\mathbf{C}=500$ and $\epsilon=1.0$.

In [2]:
from trustyai.utils import TestUtils

center = 500.0
epsilon = 1.0

model = TestUtils.getSumThresholdModel(center, epsilon)
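
For intuition, the decision rule above can be sketched in plain Python. This is only an illustration of the formula, not the `TestUtils` implementation; the function name `sum_threshold` and the example inputs are hypothetical.

```python
def sum_threshold(x, center, epsilon):
    """Return True when sum(x) lies strictly within epsilon of center."""
    total = sum(x)
    return center - epsilon < total < center + epsilon


# Hypothetical inputs: the first sums to 500.5, the second to 20.0.
print(sum_threshold([100.0, 200.0, 200.5], 500.0, 1.0))
print(sum_threshold([5.0, 5.0, 5.0, 5.0], 500.0, 1.0))
```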

Next we need to define a **goal**.
If our model is $f(\mathbf{x'})=\mathbf{y'}$ we are then defining our $\mathbf{y'}$ and the counterfactual result will be the $\mathbf{x'}$ which satisfies $f(\mathbf{x'})=\mathbf{y'}$.

We will define our goal as `true`, that is, the sum is within the vicinity of the point $\mathbf{C}$. The goal is a list of `Output` objects, each of which takes the following parameters:

- The feature name
- The feature type
- The feature value (wrapped in `Value`)
- A confidence threshold, which we will leave at zero (no threshold)

In [3]:
from trustyai.model import Output, Type, Value

goal = [Output("inside", Type.BOOLEAN, Value(True), 0.0)]

We will now define our initial features, $\mathbf{x}$. Each feature can be instantiated by using `FeatureFactory` and in this case we want to use numerical features, so we'll use `FeatureFactory.newNumericalFeature`.

In [4]:
import random
from trustyai.model import FeatureFactory

features = [
    FeatureFactory.newNumericalFeature(f"x{i+1}", random.random() * 10.0)
    for i in range(4)
]

As we can see, the sum of the features will not be within $\epsilon$ (1.0) of $\mathbf{C}$ (500.0). As such, the model prediction will be `false`:

In [5]:
feature_sum = 0.0
for f in features:
    value = f.getValue().asNumber()
    print(f"Feature {f.getName()} has value {value}")
    feature_sum += value
print(f"\nFeatures sum is {feature_sum}")

Feature x1 has value 1.0797757631109761
Feature x2 has value 7.783907841267885
Feature x3 has value 0.8743785923461056
Feature x4 has value 9.467068375377035

Features sum is 19.205130572102004


The next step is to specify the **constraints** of the features, i.e. which features can be changed and which should be fixed. Since we want all features to be able to change, we specify `False` for all of them:

In [6]:
constraints = [False] * 4

Finally, we specify the **bounds** for the counterfactual search. Typically these are set either using domain-specific knowledge or taken from the data. In this case we simply specify an arbitrary (sensible) range, e.g. all the features can vary between `0` and `1000`.

In [7]:
from trustyai.model.domain import NumericalFeatureDomain

feature_boundaries = [NumericalFeatureDomain.create(0.0, 1000.0)] * 4
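
Note that `[obj] * 4` does not create four separate domains: in Python it produces a list holding four references to the same object. That is harmless here provided the domain is never modified after creation (an assumption about `NumericalFeatureDomain`, not something this notebook verifies). The sketch below uses a hypothetical stand-in class, `Bounds`, to show the behaviour.

```python
class Bounds:
    """Hypothetical stand-in for a feature domain (not a TrustyAI class)."""

    def __init__(self, low, high):
        self.low = low
        self.high = high


shared = [Bounds(0.0, 1000.0)] * 4
independent = [Bounds(0.0, 1000.0) for _ in range(4)]

# All four entries of `shared` are the same object...
print(all(b is shared[0] for b in shared))
# ...while the comprehension builds four distinct objects.
print(len({id(b) for b in independent}))
```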

In order to use the boundaries in the explainer we need to wrap all of them in a `DataDomain` class:

In [8]:
from trustyai.model import DataDomain

data_domain = DataDomain(feature_boundaries)

We can now instantiate the **explainer** itself.

To do so, we will need to configure the termination criteria. For this example we will specify that the counterfactual search should execute a maximum of 10,000 iterations before stopping and returning the best result found so far.

In [9]:
from org.optaplanner.core.config.solver.termination import TerminationConfig
from org.kie.kogito.explainability.local.counterfactual import (
    CounterfactualConfigurationFactory,
)
from java.lang import Long

termination_config = TerminationConfig().withScoreCalculationCountLimit(
    Long.valueOf(10_000)
)

solver_config = (
    CounterfactualConfigurationFactory.builder()
    .withTerminationConfig(termination_config)
    .build()
)

We can now instantiate the explainer itself using `CounterfactualExplainer` and our `solver_config` configuration.

In [10]:
from org.kie.kogito.explainability.local.counterfactual import CounterfactualExplainer

explainer = CounterfactualExplainer.builder().withSolverConfig(solver_config).build()

SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details.


We will now express the counterfactual problem as defined above.

- `original` represents our $\mathbf{x}$, which we know gives a prediction of `False`
- `goals` represents our $\mathbf{y'}$, that is our desired prediction (`True`)
- `domain` represents the boundaries for the counterfactual search

In [11]:
from trustyai.model import PredictionFeatureDomain, PredictionInput, PredictionOutput

original = PredictionInput(features)
goals = PredictionOutput(goal)
domain = PredictionFeatureDomain(data_domain.getFeatureDomains())

We wrap these quantities in a `CounterfactualPrediction` (the UUID is simply to label the search instance):

In [12]:
import uuid
from trustyai.model import CounterfactualPrediction

prediction = CounterfactualPrediction(
    original, goals, domain, constraints, None, uuid.uuid4()
)

We now request the counterfactual $\mathbf{x'}$ which is closest to $\mathbf{x}$ and which satisfies $f(\mathbf{x'}, \epsilon, \mathbf{C})=\mathbf{y'}$:

In [13]:
explanation_async = explainer.explainAsync(prediction, model)

The counterfactual explainer API operates in an asynchronous way, so we need to `.get()` the result:

In [14]:
explanation = explanation_async.get()
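
The call pattern resembles Python's own `concurrent.futures`: submitting work returns a future immediately, and asking for the result blocks until it is ready. The sketch below is an analogy in pure Python only; `slow_search` is a hypothetical placeholder, and Python futures use `.result()` where the Java object returned by `explainAsync` uses `.get()`.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def slow_search(target):
    """Hypothetical stand-in for a long-running counterfactual search."""
    time.sleep(0.1)
    return target - 0.5


with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(slow_search, 500.0)  # returns immediately
    result = future.result()  # blocks until the search finishes

print(result)
```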

We can now inspect the counterfactual $\mathbf{x'}$ and check that the sum of its components falls within $\epsilon$ of $\mathbf{C}$:

In [15]:
feature_sum = 0.0
for entity in explanation.getEntities():
    print(entity)
    feature_sum += entity.getProposedValue()

print(f"\nFeature sum is {feature_sum}")

java.lang.DoubleFeature{value=1.0797757631109761, intRangeMinimum=0.0, intRangeMaximum=1000.0, id='x1'}
java.lang.DoubleFeature{value=7.783907841267885, intRangeMinimum=0.0, intRangeMaximum=1000.0, id='x2'}
java.lang.DoubleFeature{value=0.8743785923461056, intRangeMinimum=0.0, intRangeMaximum=1000.0, id='x3'}
java.lang.DoubleFeature{value=489.6443087689171, intRangeMinimum=0.0, intRangeMaximum=1000.0, id='x4'}

Feature sum is 499.38237096564205
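
As a quick arithmetic check, the proposed values printed above can be summed by hand and compared with the threshold. The numbers below are copied from the output of this particular run (the features are random, so another run will give different values).

```python
center, epsilon = 500.0, 1.0

# Original and proposed feature values, copied from the cell outputs
original = [1.0797757631109761, 7.783907841267885, 0.8743785923461056, 9.467068375377035]
proposed = [1.0797757631109761, 7.783907841267885, 0.8743785923461056, 489.6443087689171]

total = sum(proposed)
inside = center - epsilon < total < center + epsilon

# Indices (0-based) of the features the search actually modified
changed = [i for i, (old, new) in enumerate(zip(original, proposed)) if old != new]

print(f"sum={total:.4f}, inside={inside}, changed={changed}")
```

Only `x4` moved in this run, which is consistent with the search looking for the counterfactual closest to the original input.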


### Constrained features

As we've seen, it is possible to constrain a specific feature $x_i$ by setting the corresponding element of the _constraints_ list to `True`.

In this example, we now want to fix $x_1$ and $x_4$. That is, these features should have the same value in the counterfactual $\mathbf{x'}$ as in the original $\mathbf{x}$.

In [16]:
constraints = [True, False, False, True]  # x1, x2, x3 and x4

We simply need to wrap the previous quantities with the new constraints:

In [17]:
prediction = CounterfactualPrediction(
    original, goals, domain, constraints, None, uuid.uuid4()
)

And request a new counterfactual explanation:

In [18]:
explanation = explainer.explainAsync(prediction, model).get()

We can see that $x_1$ and $x_4$ have the same values as in the original and that the counterfactual satisfies the model's condition.

In [19]:
print(f"Original x1: {features[0].getValue()}")
print(f"Original x4: {features[3].getValue()}\n")

for entity in explanation.getEntities():
    print(entity)

Original x1: 1.0797757631109761
Original x4: 9.467068375377035

java.lang.DoubleFeature{value=1.0797757631109761, intRangeMinimum=1.0797757631109761, intRangeMaximum=1.0797757631109761, id='x1'}
java.lang.DoubleFeature{value=487.8482733174738, intRangeMinimum=0.0, intRangeMaximum=1000.0, id='x2'}
java.lang.DoubleFeature{value=0.8743785923461056, intRangeMinimum=0.0, intRangeMaximum=1000.0, id='x3'}
java.lang.DoubleFeature{value=9.467068375377035, intRangeMinimum=9.467068375377035, intRangeMaximum=9.467068375377035, id='x4'}
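
A small helper can make the "fixed features stay fixed" check explicit instead of relying on visual inspection. This is a sketch over plain Python lists, with values copied from the output above; `violations` is a hypothetical name, not part of TrustyAI.

```python
def violations(original, proposed, constraints):
    """Return indices of constrained features whose value changed."""
    return [
        i
        for i, (old, new, fixed) in enumerate(zip(original, proposed, constraints))
        if fixed and old != new
    ]


original = [1.0797757631109761, 7.783907841267885, 0.8743785923461056, 9.467068375377035]
proposed = [1.0797757631109761, 487.8482733174738, 0.8743785923461056, 9.467068375377035]
constraints = [True, False, False, True]

# An empty list means every constrained feature kept its original value
print(violations(original, proposed, constraints))
# The proposed sum should still land within epsilon (1.0) of the centre (500.0)
print(abs(sum(proposed) - 500.0) < 1.0)
```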


## Using Python models



We will now show how to use a custom Python model with TrustyAI counterfactual explanations.

The model will be an [XGBoost](https://github.com/dmlc/xgboost) one trained with the `credit-bias` dataset (available [here](https://github.com/ruivieira/benchmark-models/tree/main/credit-bias)).

For convenience, the model is pre-trained and serialised with `joblib`, so for this example we simply need to deserialise it.

In [20]:
import joblib

xg_model = joblib.load("models/credit-bias-xgboost.joblib")
print(xg_model)

XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
              colsample_bynode=1, colsample_bytree=1, gamma=0, gpu_id=-1,
              importance_type='gain', interaction_constraints='',
              learning_rate=0.07, max_delta_step=0, max_depth=8,
              min_child_weight=1, missing=nan, monotone_constraints='()',
              n_estimators=200, n_jobs=12, num_parallel_tree=1, random_state=27,
              reg_alpha=0, reg_lambda=1, scale_pos_weight=0.9861206227457426,
              seed=27, subsample=1, tree_method='exact', validate_parameters=1,
              verbosity=None)


This model has as a single **output** a boolean `PaidLoan`, which will predict whether a certain loan applicant will repay the loan in time or not. The model is slightly more complex than the previous examples, with **input** features:

|Input feature         | Type    | Note        |
|----------------------|---------|-------------|
|`NewCreditCustomer`   |boolean  ||
|`Amount`              |numerical||
|`Interest`            |numerical||
|`LoanDuration`        |numerical|In months|
|`Education`           |numerical|Level (1, 2, 3..)|
|`NrOfDependants`      |numerical|Integer|
|`EmploymentDurationCurrentEmployer`|numerical|Integer (years)|
|`IncomeFromPrincipalEmployer`|numerical||
|`IncomeFromPension`   |numerical||
|`IncomeFromFamilyAllowance`|numerical||
|`IncomeFromSocialWelfare`|numerical||
|`IncomeFromLeavePay`|numerical||
|`IncomeFromChildSupport`|numerical||
|`IncomeOther`|numerical||
|`ExistingLiabilities`|numerical|Integer|
|`RefinanceLiabilities`|numerical|Integer|
|`DebtToIncome`|numerical||
|`FreeCash`|numerical||
|`CreditScoreEeMini`|numerical|Integer|
|`NoOfPreviousLoansBeforeLoan`|numerical|Integer|
|`AmountOfPreviousLoansBeforeLoan`|numerical||
|`PreviousRepaymentsBeforeLoan`|numerical||
|`PreviousEarlyRepaymentsBefoleLoan`|numerical||
|`PreviousEarlyRepaymentsCountBeforeLoan`|numerical|Integer|
|`Council_house`|boolean||
|`Homeless`|boolean||
|`Joint_ownership`|boolean||
|`Joint_tenant`|boolean||
|`Living_with_parents`|boolean||
|`Mortgage`|boolean||
|`Other`|boolean||
|`Owner`|boolean||
|`Owner_with_encumbrance`|boolean||
|`Tenant`|boolean||
|`Entrepreneur`|boolean||
|`Fully`|boolean||
|`Partially`|boolean||
|`Retiree`|boolean||
|`Self_employed`|boolean||

We will start by testing the model with an input that, based on the original data, we are quite sure will be predicted as `false`:

In [21]:
x = [
    [
        False,
        2125.0,
        20.97,
        60,
        4.0,
        0.0,
        6.0,
        0.0,
        301.0,
        0.0,
        53.0,
        0.0,
        0.0,
        0.0,
        8,
        6,
        26.29,
        10.92,
        1000.0,
        1.0,
        500.0,
        590.95,
        0.0,
        0.0,
        0,
        0,
        0,
        0,
        0,
        0,
        0,
        1,
        0,
        0,
        0,
        0,
        0,
        1,
        0,
    ]
]

We can see that this application will be rejected with a probability of $\sim77\%$:

In [22]:
import numpy as np

print(xg_model.predict_proba(np.array(x)))
print(f"Paid loan is predicted as: {xg_model.predict(np.array(x))}")

[[0.7770493  0.22295067]]
Paid loan is predicted as: [False]


We will now prepare the XGBoost model to be used from the TrustyAI counterfactual engine.

To do so, we first need to create a prediction function which:

- Takes a `java.util.List` of `PredictionInput` as input
- Returns a `java.util.List` of `PredictionOutput` as output

If these two conditions are met, the inner workings of this function can be anything (including calling an XGBoost Python model for prediction, as in our case):

In [23]:
from typing import List
from trustyai.utils import toJList


def predict(inputs: List[PredictionInput]) -> List[PredictionOutput]:
    values = [feature.getValue().asNumber() for feature in inputs.get(0).getFeatures()]
    result = xg_model.predict_proba(np.array([values]))
    false_prob, true_prob = result[0]
    if false_prob > true_prob:
        prediction = (False, false_prob)
    else:
        prediction = (True, true_prob)
    output = Output("PaidLoan", Type.BOOLEAN, Value(prediction[0]), prediction[1])
    return toJList([PredictionOutput([output])])
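
The label-and-score logic inside `predict` can be isolated and tested without a JVM. The helper below is a sketch with a hypothetical name (`label_and_score`); it assumes, as the function above does, that the probabilities are ordered `[P(false), P(true)]`.

```python
def label_and_score(probabilities):
    """Map [P(false), P(true)] to (predicted label, probability of that label)."""
    false_prob, true_prob = probabilities
    if false_prob > true_prob:
        return False, false_prob
    return True, true_prob


# Probabilities printed earlier for the rejected application
print(label_and_score([0.7770493, 0.22295067]))
```

As in `predict`, a tie is resolved in favour of `True`, since the comparison is a strict `>`.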

Once the prediction function is created, we wrap it in a `PredictionProvider` class.

This class takes care of all the JVM's asynchronous plumbing for us.

In [24]:
from trustyai.model import PredictionProvider

model = PredictionProvider(predict)

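We will now express the inputs in terms of `Feature`s, so that we can use them in the counterfactual search. Note that the housing flags below are not identical to those in `x` (here `Owner_with_encumbrance` and `Tenant` are set, whereas `x` sets `Owner`), which likely accounts for the slightly different score reported in the next check: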

In [25]:
def make_feature(name, value):
    if type(value) is bool:
        return FeatureFactory.newBooleanFeature(name, value)
    else:
        return FeatureFactory.newNumericalFeature(name, value)


features = [
    make_feature(p[0], p[1])
    for p in [
        ("NewCreditCustomer", False),
        ("Amount", 2125.0),
        ("Interest", 20.97),
        ("LoanDuration", 60.0),
        ("Education", 4.0),
        ("NrOfDependants", 0.0),
        ("EmploymentDurationCurrentEmployer", 6.0),
        ("IncomeFromPrincipalEmployer", 0.0),
        ("IncomeFromPension", 301.0),
        ("IncomeFromFamilyAllowance", 0.0),
        ("IncomeFromSocialWelfare", 53.0),
        ("IncomeFromLeavePay", 0.0),
        ("IncomeFromChildSupport", 0.0),
        ("IncomeOther", 0.0),
        ("ExistingLiabilities", 8.0),
        ("RefinanceLiabilities", 6.0),
        ("DebtToIncome", 26.29),
        ("FreeCash", 10.92),
        ("CreditScoreEeMini", 1000.0),
        ("NoOfPreviousLoansBeforeLoan", 1.0),
        ("AmountOfPreviousLoansBeforeLoan", 500.0),
        ("PreviousRepaymentsBeforeLoan", 590.95),
        ("PreviousEarlyRepaymentsBefoleLoan", 0.0),
        ("PreviousEarlyRepaymentsCountBeforeLoan", 0.0),
        ("Council_house", False),
        ("Homeless", False),
        ("Joint_ownership", False),
        ("Joint_tenant", False),
        ("Living_with_parents", False),
        ("Mortgage", False),
        ("Other", False),
        ("Owner", False),
        ("Owner_with_encumbrance", True),
        ("Tenant", True),
        ("Entrepreneur", False),
        ("Fully", False),
        ("Partially", False),
        ("Retiree", True),
        ("Self_employed", False),
    ]
]
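
`make_feature` tests for booleans with `type(value) is bool` before falling back to the numerical case. The order matters in Python because `bool` is a subclass of `int`, so a generic numeric check would also match `True` and `False`. A minimal sketch (the function `kind_of` is hypothetical and only returns a label):

```python
def kind_of(value):
    """Classify a raw value as 'boolean' or 'numerical', checking booleans first."""
    if type(value) is bool:
        return "boolean"
    return "numerical"


# bool is a subclass of int, so isinstance alone cannot tell them apart
print(isinstance(True, int))
print(kind_of(False), kind_of(0), kind_of(0.0))
```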

We can confirm now, with the newly created `PredictionProvider` model that this input will lead to a `false` `PaidLoan` prediction:

In [26]:
from trustyai.utils import toJList

model.predictAsync(toJList([PredictionInput(features)])).get()[0].getOutputs()[
    0
].toString()

'Output{value=false, type=boolean, score=0.7835956811904907, name='PaidLoan'}'

### Unconstrained basic search

To get started we will search for a counterfactual with no constraints at all. This is not a realistic use case, but we will use it as a baseline.

In [27]:
n_features = len(features)

constraints = [False] * n_features

We will also create a set of equal bounds for all the features. Again, this is not realistic, but we do it to establish a baseline. Note that boolean features will ignore the bounds anyway, so we can just create a set such as:

In [28]:
features_boundaries = [NumericalFeatureDomain.create(0.0, 10000.0)] * n_features

Next, we create a termination criterion for the search. We will use a 10-second time limit:

In [29]:
termination_config = TerminationConfig().withSecondsSpentLimit(Long.valueOf(10))

solver_config = (
    CounterfactualConfigurationFactory.builder()
    .withTerminationConfig(termination_config)
    .build()
)

We can now instantiate the counterfactual explainer:

In [30]:
explainer = CounterfactualExplainer.builder().withSolverConfig(solver_config).build()

We want our **goal** to be the model predicting the loan will be paid (`PaidLoan=true`), so we specify it as:

In [31]:
goal = [Output("PaidLoan", Type.BOOLEAN, Value(True), 0.0)]

We now wrap all this context in a `CounterfactualPrediction` object:

In [32]:
original = PredictionInput(features)
goals = PredictionOutput(goal)
domain = PredictionFeatureDomain(features_boundaries)

prediction = CounterfactualPrediction(
    original, goals, domain, constraints, None, uuid.uuid4()
)

We are now ready to search for a counterfactual:

In [33]:
explanation = explainer.explainAsync(prediction, model).get()

First we will confirm that our counterfactual changes the outcome, by predicting its outcome using the model:

In [34]:
testf = [f.asFeature() for f in explanation.getEntities()]
model.predictAsync(toJList([PredictionInput(testf)])).get()[0].getOutputs()[
    0
].toString()

'Output{value=true, type=boolean, score=0.6006738543510437, name='PaidLoan'}'

And indeed it changes. We will now verify _which_ features were changed:

In [35]:
def show_changes(explanation, original):
    entities = explanation.getEntities()
    N = len(original)
    for i in range(N):
        name = original[i].getName()
        original_value = original[i].getValue()
        new_value = entities[i].asFeature().getValue()
        if original_value != new_value:
            print(f"Feature '{name}': {original_value} -> {new_value}")


show_changes(explanation, features)

Feature 'IncomeFromSocialWelfare': 53.0 -> 53.31125429433703
Feature 'RefinanceLiabilities': 6.0 -> 1.230474777192958
Feature 'PreviousEarlyRepaymentsCountBeforeLoan': 0.0 -> 6.0
Feature 'Owner': false -> true
Feature 'Owner_with_encumbrance': true -> false


Here we can see the problem with the unconstrained search.

Some of the fields that were changed (_e.g._ `IncomeFromSocialWelfare`, `RefinanceLiabilities`, etc) might be difficult to change in practice.

### Constrained search

We will now try a more realistic search, which incorporates domain-specific knowledge (and common sense).

To do so, we will constrain the features we feel shouldn't (or mustn't) change and specify sensible search bounds.
We will start with the constraints:

In [36]:
constraints = [
    True,  # NewCreditCustomer
    False,  # Amount
    True,  # Interest
    False,  # LoanDuration
    True,  # Education
    True,  # NrOfDependants
    False,  # EmploymentDurationCurrentEmployer
    False,  # IncomeFromPrincipalEmployer
    False,  # IncomeFromPension
    False,  # IncomeFromFamilyAllowance
    False,  # IncomeFromSocialWelfare
    False,  # IncomeFromLeavePay
    False,  # IncomeFromChildSupport
    False,  # IncomeOther
    True,  # ExistingLiabilities
    True,  # RefinanceLiabilities
    False,  # DebtToIncome
    False,  # FreeCash
    False,  # CreditScoreEeMini
    True,  # NoOfPreviousLoansBeforeLoan
    True,  # AmountOfPreviousLoansBeforeLoan
    True,  # PreviousRepaymentsBeforeLoan
    True,  # PreviousEarlyRepaymentsBefoleLoan
    True,  # PreviousEarlyRepaymentsCountBeforeLoan
    False,  # Council_house
    False,  # Homeless
    False,  # Joint_ownership
    False,  # Joint_tenant
    False,  # Living_with_parents
    False,  # Mortgage
    False,  # Other
    False,  # Owner
    False,  # Owner_with_encumbrance
    False,  # Tenant
    False,  # Entrepreneur
    False,  # Fully
    False,  # Partially
    False,  # Retiree
    False,  # Self_employed
]

The constraints should be self-explanatory, but in essence the features were divided into three groups:

- Attributes you **cannot** or **should not** change (protected), for instance age, education level, etc.
- Attributes you **can** change, for instance loan duration, loan amount, etc.
- Attributes you probably won't be able to change, but which might be informative to vary. For instance, you might not be able to easily change your income, but you might be interested in how much it would need to be in order to get a favourable prediction.
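
Because the constraints list is positional, a single misplaced entry silently constrains the wrong feature. One way to reduce that risk is to derive the flags from a set of names. The sketch below is an alternative construction, not what the notebook does; it is intended to reproduce the same 39 flags as the list above.

```python
feature_names = [
    "NewCreditCustomer", "Amount", "Interest", "LoanDuration", "Education",
    "NrOfDependants", "EmploymentDurationCurrentEmployer",
    "IncomeFromPrincipalEmployer", "IncomeFromPension",
    "IncomeFromFamilyAllowance", "IncomeFromSocialWelfare",
    "IncomeFromLeavePay", "IncomeFromChildSupport", "IncomeOther",
    "ExistingLiabilities", "RefinanceLiabilities", "DebtToIncome", "FreeCash",
    "CreditScoreEeMini", "NoOfPreviousLoansBeforeLoan",
    "AmountOfPreviousLoansBeforeLoan", "PreviousRepaymentsBeforeLoan",
    "PreviousEarlyRepaymentsBefoleLoan", "PreviousEarlyRepaymentsCountBeforeLoan",
    "Council_house", "Homeless", "Joint_ownership", "Joint_tenant",
    "Living_with_parents", "Mortgage", "Other", "Owner",
    "Owner_with_encumbrance", "Tenant", "Entrepreneur", "Fully", "Partially",
    "Retiree", "Self_employed",
]

# Names of the features that must keep their original value
fixed = {
    "NewCreditCustomer", "Interest", "Education", "NrOfDependants",
    "ExistingLiabilities", "RefinanceLiabilities",
    "NoOfPreviousLoansBeforeLoan", "AmountOfPreviousLoansBeforeLoan",
    "PreviousRepaymentsBeforeLoan", "PreviousEarlyRepaymentsBefoleLoan",
    "PreviousEarlyRepaymentsCountBeforeLoan",
}

# Guard against typos: every fixed name must be a known feature
assert fixed <= set(feature_names)

constraints = [name in fixed for name in feature_names]
print(len(constraints), sum(constraints))
```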

In [37]:
features_boundaries = [
    None,  # NewCreditCustomer
    NumericalFeatureDomain.create(0.0, 1000.0),  # Amount
    None,  # Interest
    NumericalFeatureDomain.create(0.0, 120.0),  # LoanDuration
    None,  # Education
    None,  # NrOfDependants
    NumericalFeatureDomain.create(0.0, 40.0),  # EmploymentDurationCurrentEmployer
    NumericalFeatureDomain.create(0.0, 1000.0),  # IncomeFromPrincipalEmployer
    NumericalFeatureDomain.create(0.0, 1000.0),  # IncomeFromPension
    NumericalFeatureDomain.create(0.0, 1000.0),  # IncomeFromFamilyAllowance
    NumericalFeatureDomain.create(0.0, 1000.0),  # IncomeFromSocialWelfare
    NumericalFeatureDomain.create(0.0, 1000.0),  # IncomeFromLeavePay
    NumericalFeatureDomain.create(0.0, 1000.0),  # IncomeFromChildSupport
    NumericalFeatureDomain.create(0.0, 1000.0),  # IncomeOther
    None,  # ExistingLiabilities
    None,  # RefinanceLiabilities
    NumericalFeatureDomain.create(0.0, 100.0),  # DebtToIncome
    NumericalFeatureDomain.create(0.0, 100.0),  # FreeCash
    NumericalFeatureDomain.create(0.0, 10000.0),  # CreditScoreEeMini
    None,  # NoOfPreviousLoansBeforeLoan
    None,  # AmountOfPreviousLoansBeforeLoan
    None,  # PreviousRepaymentsBeforeLoan
    None,  # PreviousEarlyRepaymentsBefoleLoan
    None,  # PreviousEarlyRepaymentsCountBeforeLoan
    None,  # Council_house
    None,  # Homeless
    None,  # Joint_ownership
    None,  # Joint_tenant
    None,  # Living_with_parents
    None,  # Mortgage
    None,  # Other
    None,  # Owner
    None,  # Owner_with_encumbrance
    None,  # Tenant
    None,  # Entrepreneur
    None,  # Fully
    None,  # Partially
    None,  # Retiree
    None,  # Self_employed
]
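
One sanity check that may be worth running before a search is whether each original value falls inside its own search bounds. The sketch uses plain tuples in place of `NumericalFeatureDomain` and covers only the bounded features, with values copied from the cells above; `outside_bounds` is a hypothetical helper.

```python
# name -> (original value, (lower, upper)) for the features given explicit bounds
bounded = {
    "Amount": (2125.0, (0.0, 1000.0)),
    "LoanDuration": (60.0, (0.0, 120.0)),
    "EmploymentDurationCurrentEmployer": (6.0, (0.0, 40.0)),
    "IncomeFromPrincipalEmployer": (0.0, (0.0, 1000.0)),
    "IncomeFromPension": (301.0, (0.0, 1000.0)),
    "IncomeFromFamilyAllowance": (0.0, (0.0, 1000.0)),
    "IncomeFromSocialWelfare": (53.0, (0.0, 1000.0)),
    "IncomeFromLeavePay": (0.0, (0.0, 1000.0)),
    "IncomeFromChildSupport": (0.0, (0.0, 1000.0)),
    "IncomeOther": (0.0, (0.0, 1000.0)),
    "DebtToIncome": (26.29, (0.0, 100.0)),
    "FreeCash": (10.92, (0.0, 100.0)),
    "CreditScoreEeMini": (1000.0, (0.0, 10000.0)),
}


def outside_bounds(features):
    """Return the names whose original value lies outside [lower, upper]."""
    return [
        name
        for name, (value, (low, high)) in features.items()
        if not low <= value <= high
    ]


print(outside_bounds(bounded))
```

With the values used here, `Amount` (2125.0) lies above its upper bound of 1000.0, so it may be worth widening that bound if the loan amount is meant to be searchable. How the solver treats a starting value outside its domain is not covered in this notebook.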

As before, we wrap this data in a `CounterfactualPrediction`:

In [38]:
original = PredictionInput(features)
goals = PredictionOutput(goal)
domain = PredictionFeatureDomain(features_boundaries)

prediction = CounterfactualPrediction(
    original, goals, domain, constraints, None, uuid.uuid4()
)

And we start a new search:

In [39]:
explanation = explainer.explainAsync(prediction, model).get()

We test that the counterfactual does change the outcome:

In [40]:
testf = [f.asFeature() for f in explanation.getEntities()]
model.predictAsync(toJList([PredictionInput(testf)])).get()[0].getOutputs()[
    0
].toString()

'Output{value=true, type=boolean, score=0.5038489103317261, name='PaidLoan'}'

And we confirm that no constrained features were changed:

In [41]:
show_changes(explanation, features)

Feature 'LoanDuration': 60.0 -> 56.947228037333545
Feature 'EmploymentDurationCurrentEmployer': 6.0 -> 6.382654342876313
Feature 'IncomeFromSocialWelfare': 53.0 -> 60.0
Feature 'FreeCash': 10.92 -> 10.914352713171315


### Minimum counterfactual probabilities

We can see that the previous answer is very close to $50\%$.

With TrustyAI we have the possibility to specify a minimum probability for the result (when the model supports prediction confidences).

Let's say we want a result that is at least $75\%$ confident that the loan will be repaid. We can just encode the **minimum probability** as the last argument of each `Output`. A minimum probability of $0$ (as we have used so far) simply means that any desired outcome will be accepted, regardless of its probability.

In [42]:
goal = [Output("PaidLoan", Type.BOOLEAN, Value(True), 0.75)]
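
Conceptually, a candidate now has to clear two conditions: the predicted label must match the goal, and its score must reach the minimum probability. The sketch below only illustrates that acceptance rule; it is not TrustyAI's internal scoring, and `meets_goal` is a hypothetical name.

```python
def meets_goal(label, score, goal_label=True, min_probability=0.0):
    """Accept a prediction when the label matches and the score is high enough."""
    return label == goal_label and score >= min_probability


# Score taken from the earlier constrained search output
previous_score = 0.5038489103317261

print(meets_goal(True, previous_score, min_probability=0.0))
print(meets_goal(True, previous_score, min_probability=0.75))
```

Under this rule the earlier counterfactual would have been accepted with a threshold of zero but not with a threshold of 0.75, which is why the search has to be re-run.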

We can then re-run the search with all the data as defined previously:

In [43]:
original = PredictionInput(features)
goals = PredictionOutput(goal)
domain = PredictionFeatureDomain(features_boundaries)

prediction = CounterfactualPrediction(
    original, goals, domain, constraints, None, uuid.uuid4()
)

In [44]:
explanation = explainer.explainAsync(prediction, model).get()

As previously, we check that the answer is what we are looking for:

In [45]:
testf = [f.asFeature() for f in explanation.getEntities()]
model.predictAsync(toJList([PredictionInput(testf)])).get()[0].getOutputs()[
    0
].toString()

'Output{value=true, type=boolean, score=0.7572674751281738, name='PaidLoan'}'

And we show which features need to be changed for said desired outcome:

In [46]:
show_changes(explanation, features)

Feature 'LoanDuration': 60.0 -> 14.899149688096976
Feature 'EmploymentDurationCurrentEmployer': 6.0 -> 5.8223107382429395
Feature 'FreeCash': 10.92 -> 10.942602612323316
