# Counterfactuals

Counterfactuals show what input we would need to get a desired output.  
In our case, we might want to know what would have to change about an item for the model to predict a `price` above a chosen threshold.  
We will use TrustyAI to test exactly this and to see how much the input would need to change.
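
To make the idea concrete, here is a minimal sketch on a toy linear model. It is not the Keras model used in this notebook, and the weights, bias, and feature names are made up for illustration. For a linear score, the change needed in a single feature to reach a target can be computed directly.

```python
# Toy linear model: score = sum(w_i * x_i) + bias.
# All weights and feature names below are hypothetical.
WEIGHTS = {"width": 2.0, "depth": 0.5, "height": 1.0}
BIAS = -1.0


def score(features):
    """Return the linear score for a dict of feature values."""
    return sum(WEIGHTS[name] * value for name, value in features.items()) + BIAS


def required_change(features, feature_name, target):
    """Change needed in one feature so that the score equals `target`."""
    return (target - score(features)) / WEIGHTS[feature_name]


original = {"width": 0.2, "depth": 0.4, "height": 0.1}
delta = required_change(original, "width", target=0.0)

# The counterfactual is the original input with only that one feature changed.
counterfactual = {**original, "width": original["width"] + delta}
```

A real model is rarely linear, which is why a search-based explainer such as the one in TrustyAI is used in the rest of this notebook.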

In [None]:
!pip -q install "numpy==1.26.4" "tensorflow==2.18.0"

In [None]:
import pickle
import pandas as pd
import numpy as np
import keras

In [3]:
import warnings

# Ignore UserWarnings
warnings.filterwarnings("ignore", category=UserWarning)

Let's start by choosing the output feature we want to target.  
We also pick the threshold that the predicted value must reach before we count the goal as met.  

In [4]:
OUTPUT_FEATURE = "price"
LABEL_THRESHOLD = 0.0
SAVE_FOLDER = "new_model"

We then load our model, as well as our pre- and post-processing artifacts.  

In [None]:
keras_model = keras.saving.load_model(f"{SAVE_FOLDER}/model.keras")

with open(f'{SAVE_FOLDER}/scalers.pkl', 'rb') as handle:
    scalers = pickle.load(handle)

### Data

Then we pick a sample from the test set whose prediction we want to change.  
The input must be processed the same way as during training (for example, scaled) so that the model receives values it understands. 
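
As an illustration of that scaling step, the sketch below implements min-max scaling and its inverse in plain Python. It assumes the saved scalers behave like min-max scalers that map each feature to the range 0 to 1, which this notebook does not confirm. The helper names and the sample values are hypothetical.

```python
def fit_min_max(values):
    """Return the (min, max) pair that defines the scaling."""
    return min(values), max(values)


def scale(value, bounds):
    """Map a raw value into [0, 1] using the fitted bounds."""
    low, high = bounds
    return (value - low) / (high - low)


def unscale(scaled, bounds):
    """Invert `scale`, recovering a value in the original units."""
    low, high = bounds
    return scaled * (high - low) + low


raw_widths = [40.0, 80.0, 120.0, 200.0]  # made-up training values
bounds = fit_min_max(raw_widths)

scaled_width = scale(80.0, bounds)               # what the model would see
restored_width = unscale(scaled_width, bounds)   # back in original units
```

The same round trip is what `inverse_transform` performs on the model output later in the notebook: the model works in scaled units, and the result is converted back before it is compared with the threshold.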

In [None]:
test_data = pd.read_parquet(f'{SAVE_FOLDER}/X_test.parquet')
strange_prediction = test_data.loc[[532]].drop(OUTPUT_FEATURE, axis=1)
strange_prediction

### Counterfactual analysis

Now that we have all of this in place, we will set up our counterfactual analysis.  
First, we need to create a predict function (if your model takes and returns pandas DataFrames by default, this is not needed).  
Then we create a TrustyAI "Model", which wraps our predict function and is used by TrustyAI to iterate over different input values.  
Finally, we define a TrustyAI "domain" for each of our inputs. This tells TrustyAI which range or set of values each input is allowed to take.
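
The pieces described above can be sketched without TrustyAI or Keras. In this hedged illustration, `stub_model`, `PRICE_BOUNDS`, and the `width` feature are hypothetical stand-ins. The point is only the shape of the predict wrapper (raw score in, boolean label out) and of the per-feature domains.

```python
THRESHOLD = 0.0
PRICE_BOUNDS = (-50.0, 150.0)  # hypothetical range used to unscale the output


def stub_model(row):
    """Stand-in for the trained model: returns a scaled score."""
    return 0.6 * row["width"] + 0.4 * float(row["other_colors"])


def predict_label(row):
    """Unscale the model output and compare it with the threshold."""
    low, high = PRICE_BOUNDS
    unscaled = stub_model(row) * (high - low) + low
    return unscaled >= THRESHOLD


# A domain describes what each feature may be: a (min, max) range for
# numeric features, or an explicit list of choices for boolean ones.
domains = {"width": (0.0, 1.0), "other_colors": [False, True]}


def in_domain(row):
    """Check that every feature value respects its domain."""
    for name, domain in domains.items():
        if isinstance(domain, tuple):
            if not domain[0] <= row[name] <= domain[1]:
                return False
        elif row[name] not in domain:
            return False
    return True


row = {"width": 0.2, "other_colors": False}
label = predict_label(row)
```

The cells below do the same thing with the real model: `pred` plays the role of `predict_label`, and `feature_domain` builds the domains.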

In [7]:
def pred(x):
    # Run the model and convert its output back to the original units
    prediction = keras_model.predict(x)
    unscaled_pred = scalers[OUTPUT_FEATURE].inverse_transform(prediction)[0][0]
    print(unscaled_pred)
    # The label is True when the unscaled prediction reaches the threshold
    label = bool(unscaled_pred >= LABEL_THRESHOLD)
    return pd.DataFrame([{OUTPUT_FEATURE: label}])

In [8]:
from trustyai.model import Model

model = Model(pred, output_names=[OUTPUT_FEATURE])

In [None]:
# Test the model
model(strange_prediction.to_numpy())

In [10]:
from trustyai.model.domain import feature_domain

domains = {}

for key in strange_prediction.columns:
    if "category" in key or "sellable_online" in key or "other_colors" in key:
        # Boolean features: only False or True are allowed
        domains[key] = feature_domain([False, True])
        strange_prediction[[key]] = strange_prediction[[key]].astype("bool")
    else:
        # Numeric features: any value between 0 and 1
        domains[key] = feature_domain((0, 1))

domains = list(domains.values())

In [11]:
from trustyai.model import output
goal = [output(name=OUTPUT_FEATURE, dtype="bool", value=True)]

Once we have the model, the domains, and the goal, we can start searching through possible inputs to find one that gives us the output we want.  
When the search has finished, we can see how much the proposed input differs from the original input (the sample we chose at the start).  
This gives us a good idea of what we would need to change for the prediction to reach our threshold.
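
To show what such a search does in principle, here is a minimal sketch that uses plain random sampling. TrustyAI's explainer uses its own search, so random sampling is swapped in here only for illustration. `toy_score` and the `width` feature are hypothetical.

```python
import random

THRESHOLD = 0.0


def toy_score(row):
    """Hypothetical scoring function; not the notebook's Keras model."""
    return 100.0 * row["width"] + 30.0 * float(row["other_colors"]) - 60.0


def distance(a, b):
    """Sum of absolute per-feature changes between two rows."""
    return sum(abs(float(a[k]) - float(b[k])) for k in a)


def search_counterfactual(original, steps, seed=0):
    """Sample candidates inside the domains; keep the closest one that meets the goal."""
    rng = random.Random(seed)
    best = None
    for _ in range(steps):
        candidate = {
            "width": rng.uniform(0.0, 1.0),             # numeric domain (0, 1)
            "other_colors": rng.choice([False, True]),  # boolean domain
        }
        if toy_score(candidate) < THRESHOLD:
            continue  # goal not met, discard this candidate
        if best is None or distance(original, candidate) < distance(original, best):
            best = candidate
    return best


original = {"width": 0.3, "other_colors": False}
proposed = search_counterfactual(original, steps=500)

# Report only the features that had to change, like `df.difference != 0.0` below.
changes = {
    k: float(proposed[k]) - float(original[k])
    for k in original
    if proposed[k] != original[k]
}
```

Keeping the candidate closest to the original is what makes the result useful: it points to the smallest change found that still reaches the goal.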

In [None]:
strange_prediction

In [None]:
from trustyai.explainers import CounterfactualExplainer

STEPS = 50
explainer = CounterfactualExplainer(steps=STEPS)
explanation = explainer.explain(inputs=strange_prediction, goal=goal, model=model, feature_domains=domains)

In [None]:
model(explanation.proposed_features_dataframe.to_numpy())

In [None]:
explanation.as_dataframe()

In [None]:
df = explanation.as_dataframe()
df[df.difference != 0.0]

In [None]:
if not df[df.difference != 0.0].empty:
    explanation.plot()
else:
    print(f"We did not manage to make '{OUTPUT_FEATURE}' larger than '{LABEL_THRESHOLD}'")