# Global Causal Analysis (GCA)
We run experiments that apply GCA to a dataset of high-level features $Z$ (constructed with `gca.data.prepare_data()`) and the predicted labels $\hat{Y}$ of a fine-tuned `DistilRoBERTa-base` model (trained and applied with `gca.model.train_and_apply()`).

## 0. Installing requirements
We first install the supplied `gca` package used for reproducing the experiments.

In [None]:
!pip3 install -e .

If you infer the features yourself, remember to download the spaCy model `en_core_web_sm`.

In [None]:
#!python3 -m spacy download en_core_web_sm

If you are on Google Colab or Linux, uncomment this line to download the `NRC Sentiment Emotion Lexicon (EmoLex)`:

In [None]:
#!wget -nc http://saifmohammad.com/WebDocs/Lexicons/NRC-Suite-of-Sentiment-Emotion-Lexicons.zip

## 1. Load the data and define labels
To avoid retraining the model (which takes approximately 1.5 hours on a GPU), we provide the dataset with high-level features $Z$ and the predicted label $\hat{Y}$ in `'data/go_emotions_xai-distilroberta-base.csv'`.

In [None]:
import pandas as pd
from gca.data import get_data

# Download data with PRED_label column
dataset = get_data('data/go_emotions_xai-distilroberta-base.csv')

# Take subset of columns
columns_to_select = ['all_lower', 'flesch_grade', 'is_active', 'subreddit',
                     'len_chr', 'len_tok', 'len_snt',
                     'has_name', 'has_emoji', 'has_religion',
                     'NRC_anger', 'NRC_anticipation', 'NRC_disgust', 'NRC_fear',
                     'NRC_joy', 'NRC_sadness', 'NRC_surprise', 'NRC_trust',
                     'NRC_valence', 'NRC_arousal', 'NRC_dominance',
                     'male_words', 'female_words', 'non-binary_words',
                     'PRED_label']
df = dataset.with_format('pandas')[:][columns_to_select]

# One-hot encode labels (for class-wise contrastive explanation)
df = pd.concat([df, pd.get_dummies(df['PRED_label'], prefix='PRED')], axis=1)

# Names of labels
LABELS = ['label', 'positive', 'neutral', 'negative', 'ambiguous']

# Place to hold all results
results = {}
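
As a minimal sketch of the one-hot step above, the toy frame below (made-up rows, not taken from the dataset) shows which columns `pd.get_dummies(..., prefix='PRED')` appends. This is why each class name in `LABELS` maps to a `PRED_<label>` column, while `PRED_label` itself remains the original multi-class column.

```python
import pandas as pd

# Toy predictions (illustrative values only, not from the GoEmotions data)
toy = pd.DataFrame({'PRED_label': ['positive', 'neutral', 'negative', 'positive']})

# Append one indicator column per class, named PRED_<class>
toy = pd.concat([toy, pd.get_dummies(toy['PRED_label'], prefix='PRED')], axis=1)

# Names of the newly added indicator columns
dummy_columns = sorted(c for c in toy.columns if c != 'PRED_label')
print(dummy_columns)

# Number of rows predicted as 'positive'
print(int(toy['PRED_positive'].sum()))
```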

In [None]:
df

## 2. Experiments
### 2a. Task-related features
Use GCA to estimate a global explanatory graph $\mathcal{P}$ over $V=(Z_{task}, y)$ for $y \in \{\hat{Y}, \hat{Y}_{positive}, \hat{Y}_{negative}, \hat{Y}_{ambiguous}, \hat{Y}_{neutral}\}$.

In [None]:
from gca.data.tasks import TASK_FEATURES
TASK_FEATURES

In [None]:
from gca import generate_and_evaluate

results['task'] = [generate_and_evaluate(df[TASK_FEATURES + [f'PRED_{label}']],
                                         continuous=['NRC_valence', 'NRC_arousal', 'NRC_dominance'],
                                         n_trials=0)
                   for label in LABELS]

### 2b. Robustness-related features
Use GCA to estimate a global explanatory graph $\mathcal{P}$ over $V=(Z_{robust}, y)$ for $y \in \{\hat{Y}, \hat{Y}_{positive}, \hat{Y}_{negative}, \hat{Y}_{ambiguous}, \hat{Y}_{neutral}\}$.

In [None]:
from gca.data.tasks import ROBUSTNESS_FEATURES
ROBUSTNESS_FEATURES

In [None]:
from gca import generate_and_evaluate

results['robustness'] = [generate_and_evaluate(df[ROBUSTNESS_FEATURES + [f'PRED_{label}']], n_trials=0)
                         for label in LABELS]

### 2c. Fairness-related features
Use GCA to estimate a global explanatory graph $\mathcal{P}$ over $V=(Z_{fair}, y)$ for $y \in \{\hat{Y}, \hat{Y}_{positive}, \hat{Y}_{negative}, \hat{Y}_{ambiguous}, \hat{Y}_{neutral}\}$.

In [None]:
from gca.data.tasks import FAIRNESS_FEATURES
FAIRNESS_FEATURES

In [None]:
from gca import generate_and_evaluate

results['fairness'] = [generate_and_evaluate(df[FAIRNESS_FEATURES + [f'PRED_{label}']], n_trials=0)
                       for label in LABELS]

### 2d. Combined (all aspects)
Use GCA to estimate a global explanatory graph $\mathcal{P}$ over $V=(Z, y)$ for $y \in \{\hat{Y}, \hat{Y}_{positive}, \hat{Y}_{negative}, \hat{Y}_{ambiguous}, \hat{Y}_{neutral}\}$, where $Z = Z_{task} \cup Z_{robust} \cup Z_{fair}$.

In [None]:
FEATURES = TASK_FEATURES + ROBUSTNESS_FEATURES + FAIRNESS_FEATURES
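
The combined set is defined as a union, whereas list concatenation keeps duplicates. Whether the three lists in `gca.data.tasks` overlap is not shown here; if they did, selecting `df[FEATURES + ...]` would return the repeated column twice. A sketch of an order-preserving union, using hypothetical list contents:

```python
# Hypothetical feature lists; the real ones come from gca.data.tasks
task = ['NRC_joy', 'NRC_valence', 'has_emoji']
robust = ['len_chr', 'has_emoji']  # 'has_emoji' deliberately overlaps with `task`
fair = ['male_words', 'female_words']

# Order-preserving union: dict keys keep first-seen order and drop repeats
features = list(dict.fromkeys(task + robust + fair))
print(features)
```

If the lists are already disjoint, this yields the same result as plain concatenation.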

In [None]:
from gca import generate_and_evaluate

results['all'] = [generate_and_evaluate(df[FEATURES + [f'PRED_{label}']],
                                        continuous=['NRC_valence', 'NRC_arousal', 'NRC_dominance'],
                                        color=True,
                                        depth=2,
                                        n_trials=0)
                  for label in LABELS]

### 2.+ Save results locally

In [None]:
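import pickle
from datetime import datetime, timezone

# Timestamp the file name in UTC (datetime.utcnow() is deprecated since Python 3.12)
timestamp = datetime.now(timezone.utc).strftime('%Y-%m-%d-%H%M')
with open(f'results-{timestamp}.pickle', 'wb') as f:
    pickle.dump(results, f)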
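
To reuse saved results in a later session, the pickle file can be read back with `pickle.load`. The sketch below round-trips a stand-in dictionary through a temporary file; the file name and values are hypothetical. For the real `results`, the `gca` package must be importable when loading, because pickle stores custom objects by reference to their class.

```python
import os
import pickle
import tempfile

# Stand-in for the `results` dict; the real values are gca result objects
toy_results = {'task': [0.91, 0.88], 'fairness': [0.52, 0.49]}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'results-demo.pickle')  # hypothetical file name

    # Save, mirroring the cell above
    with open(path, 'wb') as f:
        pickle.dump(toy_results, f)

    # Reload; only unpickle files from a source you trust
    with open(path, 'rb') as f:
        reloaded = pickle.load(f)

print(reloaded == toy_results)
```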

In [None]:
with open('all_all.svg', 'wb') as f:
    f.write(results['all'][0].svg)

In [None]:
with open('task_positive.svg', 'wb') as f:
    f.write(results['task'][1].svg)

In [None]:
with open('robustness_neutral.svg', 'wb') as f:
    f.write(results['robustness'][2].svg)

## 3. Results
Tabulate the results of the experiments conducted above.

Show the label distributions of $\hat{Y}$:

In [None]:
df['PRED_label'].value_counts()

In [None]:
import pandas as pd

def latexify_df(df, **kwargs):
    """Render `df` as LaTeX, with its index as an italicized 'Aspect' column."""
    table = df.reset_index().rename(columns={'index': 'Aspect'})
    return table.to_latex(index=False, **kwargs).replace('Aspect', '\\textit{Aspect}')

def results_to_table(to_select,
                     results=results,
                     round_to=None,
                     percentage=False,
                     columns=LABELS,
                     to_latex=True):
    df = pd.DataFrame.from_dict(
        {aspect: [to_select(res) for res in aspect_results]
         for aspect, aspect_results in results.items()},
        orient='index',
        columns=columns
    )
    df = df.sort_index()

    if percentage:
        df *= 100
    if round_to is not None:
        df = df.round(round_to)

    return latexify_df(df) if to_latex else df
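
To illustrate the table layout that `results_to_table` builds, the sketch below applies the same `pd.DataFrame.from_dict(..., orient='index')` pattern to stand-in result objects. The `SimpleNamespace` objects and their values are hypothetical placeholders for the objects returned by `generate_and_evaluate`.

```python
from types import SimpleNamespace

import pandas as pd

labels = ['label', 'positive']

# Stand-in results: one list per aspect, one entry per label
toy_results = {
    'task': [SimpleNamespace(z_f1=0.9), SimpleNamespace(z_f1=0.8)],
    'fairness': [SimpleNamespace(z_f1=0.5), SimpleNamespace(z_f1=0.4)],
}

# Rows are aspects (sorted alphabetically), columns are labels
table = pd.DataFrame.from_dict(
    {aspect: [res.z_f1 for res in runs] for aspect, runs in toy_results.items()},
    orient='index',
    columns=labels,
).sort_index()

# Same post-processing as percentage=True, round_to=2
table_pct = (table * 100).round(2)
print(table_pct)
```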

### 3.1 $Z$-fidelity
$Z$-fidelity estimates how well the selected high-level features $Z$ predict the model behavior $\hat{Y}$. We report the $F_1$-score because the label distribution is imbalanced, but other metrics can be used instead.
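
As a minimal sketch of why $F_1$ suits imbalanced labels, the example below compares accuracy with the binary $F_1 = 2PR/(P+R)$ (precision $P$, recall $R$) for a predictor that always outputs the majority class. The helper `binary_f1` and the toy labels are illustrative assumptions; how `z_f1` is computed and averaged inside `gca` is not shown here.

```python
def binary_f1(y_true, y_pred, positive=1):
    """F1 = 2PR / (P + R) for a single positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Imbalanced toy labels: 9 negatives, 1 positive
y_true = [0] * 9 + [1]
y_majority = [0] * 10  # always predicts the majority class

accuracy = sum(t == p for t, p in zip(y_true, y_majority)) / len(y_true)
f1 = binary_f1(y_true, y_majority)
print(accuracy, f1)
```

The majority-class predictor reaches high accuracy while its $F_1$ for the minority class is zero, so accuracy alone would overstate fidelity.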

In [None]:
# Z-Fidelity (f1-score)
print(results_to_table(lambda res: res.z_f1, percentage=True, round_to=2))

### 3.2 Structural fit & stability
The relative MVEE indicates the structural fit and stability of the generated explanatory graph.

In [None]:
# Relative MVEE (PHD with strategy MVEE)
print(results_to_table(lambda res: res.mvee_relative, round_to=2))

In [None]:
# Mean relative MVEE, SD
df_mvee_relative = results_to_table(lambda res: res.mvee_relative, to_latex=False).stack()

print(f'mean: {df_mvee_relative.mean().round(3)} | SD: {df_mvee_relative.std().round(3)}')
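
The cell above flattens the aspect-by-label table with `.stack()` before taking the mean and SD, so both statistics run over all aspect-label combinations at once. A small sketch with made-up numbers; note that pandas `std()` computes the sample standard deviation (`ddof=1`) by default.

```python
import pandas as pd

# Toy aspect-by-label table (values are illustrative only)
toy = pd.DataFrame({'label': [1.0, 2.0], 'positive': [3.0, 4.0]},
                   index=['fairness', 'task'])

flat = toy.stack()  # Series with one entry per (aspect, label) pair
mean = flat.mean()
sd = flat.std()     # sample SD (ddof=1) by default

print(len(flat), mean, round(sd, 3))
```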

In [None]:
# Absolute MVEE (PHD with strategy MVEE)
df_abs = results_to_table(lambda res: res.mvee, to_latex=False)
df_abs.insert(0, 'nodes', results_to_table(lambda res: res.n_features + 1, to_latex=False)['label'])
print(latexify_df(df_abs))

### 3.+ Time
We also report the time taken to generate the explanatory graphs, for each aspect-label combination.

In [None]:
# Time (seconds)
print(results_to_table(lambda res: res.elapsed_time, round_to=5))

In [None]:
# Mean time
results_to_table(lambda res: res.elapsed_time, to_latex=False).mean(axis=1).round(2)