# Fairness Experiment — Selector-Based Models (Good vs Bad)

This notebook:

- Loads the dataset from: `../data/synth_data_for_training.csv`
- Defines selectors:
  - **good model** → uses safe prefixes (`valid_prefixes`)
  - **bad model** → uses discriminatory prefixes (`biased_prefixes`)
- Trains both models *without removing columns from the dataframe*
- Wraps models with a `SelectedModel` that applies the selector at prediction time
- Runs ALL partition tests (using full unmodified X_test)
- Runs ALL metamorphic tests (using full unmodified X_test)

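Because the test data is never altered, the testers can still see every column, including those a model's selector ignores, so the partition conditions and metamorphic transformations apply to both models unchanged.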


In [1]:
import numpy as np
import pandas as pd
import sys
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report

from partition_tests_2 import PartitionTester
from metamorphic_tests import MetamorphicTester

print("Imports OK.")

Imports OK.


## Load Dataset + Define Feature Selectors

In [2]:
DATA_PATH = "../data/synth_data_for_training.csv"

data = pd.read_csv(DATA_PATH)
y = data['checked']

X_full = data.drop(columns=['checked']).astype(np.float32)

# Allowed (good) prefixes
valid_prefixes = [
    "afspraak_",
    "contacten_soort_",
    "instrument_",
    "deelname_",
    "pla_",
    "typering_",
    "ontheffing_"
]

good_features = [
    col for col in X_full.columns
    if any(col.startswith(p) for p in valid_prefixes)
]

# Biased (bad) prefixes
biased_prefixes = [
    "adres_",
    "persoonlijke_eigenschappen_spreektaal",
    "persoonlijke_eigenschappen_nl_",
    "persoonlijke_eigenschappen_taaleis_",
    "relatie_",
    "belemmering_",
    "beschikbaarheid_",
    "contacten_"
]

biased_features = [
    col for col in X_full.columns
    if any(col.startswith(p) for p in biased_prefixes)
]

print("GOOD model feature count:", len(good_features))
print("BAD model feature count:", len(biased_features))

GOOD model feature count: 132
BAD model feature count: 164
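
Note that the biased prefix `contacten_` is itself a prefix of the allowed prefix `contacten_soort_`, so any column matching the latter also matches the former. Whether the real dataset contains such columns is not shown in the output above. The sketch below uses hypothetical column names and a hypothetical helper `select` to show how the overlap between the two selectors could be checked:

```python
# Shortened prefix lists taken from the cell above
valid_prefixes = ["afspraak_", "contacten_soort_"]
biased_prefixes = ["adres_", "contacten_"]

# Hypothetical column names, for illustration only
columns = [
    "afspraak_aantal",
    "contacten_soort_telefoon",
    "contacten_onderwerp_x",
    "adres_wijk",
]


def select(columns, prefixes):
    """Return the columns that start with any of the given prefixes."""
    return [c for c in columns if c.startswith(tuple(prefixes))]


good = select(columns, valid_prefixes)
bad = select(columns, biased_prefixes)

# Columns that would be fed to BOTH models
overlap = sorted(set(good) & set(bad))
print("Overlap:", overlap)
```

Running the same set intersection on `good_features` and `biased_features` would show whether the two models actually share inputs.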


## Train/Test Split
The split is done on the full feature matrix `X_full`; column selection happens later, inside the model wrapper.

In [3]:
RANDOM_STATE = 42

X_train, X_test, y_train, y_test = train_test_split(
    X_full, y, test_size=0.25, random_state=RANDOM_STATE, stratify=y
)

print("Train/Test sizes:")
print(X_train.shape, X_test.shape)

Train/Test sizes:
(9483, 315) (3162, 315)


## Selector-Based Model Wrapper

This class lets us:
- Train on a subset of columns
- Predict using only that subset
- Keep full dataset intact for testers


In [4]:
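class SelectedModel:
    """Wrap an estimator so it only ever sees the columns named in `selector`."""

    def __init__(self, model, selector):
        self.model = model
        self.selector = list(selector)  # column names the wrapped model may use

    def fit(self, X, y):
        # Train on the selected columns only; return the wrapper itself,
        # following the scikit-learn convention that fit() returns self
        self.model.fit(X[self.selector], y)
        return self

    def predict(self, X):
        # X may contain extra columns; they are dropped before prediction
        return self.model.predict(X[self.selector])

print("Selector model class ready.")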

Selector model class ready.


## Train BAD Model (uses biased features)

In [5]:
bad_base = DecisionTreeClassifier(max_depth=None, min_samples_leaf=1, random_state=RANDOM_STATE)
bad_model = SelectedModel(bad_base, biased_features)

bad_model.fit(X_train, y_train)
print("Bad model trained.")

Bad model trained.


## Train GOOD Model (uses safe/allowed features)

In [6]:
good_base = DecisionTreeClassifier(max_depth=None, min_samples_leaf=1, random_state=RANDOM_STATE)
good_model = SelectedModel(good_base, good_features)

good_model.fit(X_train, y_train)
print("Good model trained.")

Good model trained.


## Evaluation on Test Set
`predict` applies each model's selector internally, so the full `X_test` is passed in unchanged.

In [7]:
print("### BAD MODEL PERFORMANCE ###")
pred_bad = bad_model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, pred_bad))
print(classification_report(y_test, pred_bad, zero_division=0))

print("\n### GOOD MODEL PERFORMANCE ###")
pred_good = good_model.predict(X_test)
print("Accuracy:", accuracy_score(y_test, pred_good))
print(classification_report(y_test, pred_good, zero_division=0))

### BAD MODEL PERFORMANCE ###
Accuracy: 0.823213156230234
              precision    recall  f1-score   support

           0       0.91      0.89      0.90      2846
           1       0.19      0.24      0.21       316

    accuracy                           0.82      3162
   macro avg       0.55      0.56      0.56      3162
weighted avg       0.84      0.82      0.83      3162


### GOOD MODEL PERFORMANCE ###
Accuracy: 0.8513598987982289
              precision    recall  f1-score   support

           0       0.92      0.91      0.92      2846
           1       0.27      0.28      0.27       316

    accuracy                           0.85      3162
   macro avg       0.59      0.60      0.60      3162
weighted avg       0.85      0.85      0.85      3162
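
For context when reading the accuracies above: the test set is imbalanced (2846 negatives and 316 positives according to the `support` column), so a trivial classifier that always predicts class 0 already scores high on accuracy. The short calculation below, using only the numbers printed in the reports, makes that baseline explicit. The variable names are illustrative.

```python
# Class supports copied from the classification reports above
support = {0: 2846, 1: 316}
total = sum(support.values())

# Accuracy of a classifier that always predicts the majority class
majority_baseline = max(support.values()) / total

# Accuracies copied from the cell output above
reported = {"bad": 0.823213156230234, "good": 0.8513598987982289}

for name, acc in reported.items():
    diff = acc - majority_baseline
    print(f"{name}: accuracy={acc:.3f} baseline={majority_baseline:.3f} diff={diff:+.3f}")
```

Both models fall below this baseline, which suggests that the per-class precision and recall are more informative here than overall accuracy.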



## Safe Test Runners

The testers always receive the **full X**; the model wrapper applies its selector at prediction time.

In [8]:
def safe_partition_run(tester, model):
    print("\n### Partition Tests ###")
    for part in tester.partitions:
        name = part['name']
        cond = part['condition']
        try:
            mask = cond(tester.X_test)
        except Exception as e:
            print(f"Skipping {name}: {e}")
            continue
        dfp = tester.X_test[mask]
        if dfp.empty:
            print(f"Skipping {name}: no rows")
            continue
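        preds = np.asarray(model.predict(dfp))
        # Align labels with the partition rows, then compare as plain arrays
        labels = tester.y_test.loc[dfp.index].to_numpy()
        # Confusion-matrix counts for this partition
        TP = np.sum((preds == 1) & (labels == 1))
        TN = np.sum((preds == 0) & (labels == 0))
        FP = np.sum((preds == 1) & (labels == 0))
        FN = np.sum((preds == 0) & (labels == 1))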
        print(f"\n=== {name} ===")
        print(f"Rows: {len(dfp)}  TP={TP} TN={TN} FP={FP} FN={FN}")
    print("Done.")


def safe_metamorphic_run(meta, model):
    print("\n### Metamorphic Tests ###")
    tests = [
        'test_gender_flip',
        'test_language_flip',
        'test_neighborhood_shuffle'
    ]
    for t in tests:
        try:
            # Look up the test inside the try block so a missing method is skipped too
            fn = getattr(meta, t)
            fn(model)
        except Exception as e:
            print(f"Skipping {t}: {e}")
    print("Done.")
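
The metamorphic tests called here live in `metamorphic_tests`, whose source is not shown in this notebook. As a rough sketch of the idea behind a flip test (not the actual `MetamorphicTester` implementation), the snippet below flips a binary column and measures how often predictions change. The model class `ThresholdModel`, the function `flip_violation_rate`, and both column names are hypothetical.

```python
import numpy as np
import pandas as pd


class ThresholdModel:
    """Hypothetical stand-in model: predicts 1 when `column` exceeds 0.5."""

    def __init__(self, column):
        self.column = column

    def predict(self, X):
        return (X[self.column] > 0.5).to_numpy().astype(int)


def flip_violation_rate(model, X, column):
    """Flip a binary 0/1 column and return the share of rows whose prediction changes."""
    X_flipped = X.copy()
    X_flipped[column] = 1 - X_flipped[column]
    before = np.asarray(model.predict(X))
    after = np.asarray(model.predict(X_flipped))
    return float(np.mean(before != after))


# Toy data; both column names are made up for this example
X = pd.DataFrame({
    "gender": [0.0, 1.0, 0.0, 1.0],
    "appointments": [0.2, 0.9, 0.7, 0.1],
})

uses_gender = ThresholdModel("gender")
ignores_gender = ThresholdModel("appointments")

rate_biased = flip_violation_rate(uses_gender, X, "gender")
rate_neutral = flip_violation_rate(ignores_gender, X, "gender")
print("Violation rate (uses gender):   ", rate_biased)
print("Violation rate (ignores gender):", rate_neutral)
```

A model that never reads the flipped column cannot change its prediction, so its violation rate is zero by construction. That is the behavior a selector-wrapped model would be expected to show for columns outside its selector.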

## Run BAD Model Through Tests

In [9]:
print("\n===== BAD MODEL TESTING =====")

pt_bad = PartitionTester(DATA_PATH)
pt_bad.X_test = X_test.copy()
pt_bad.y_test = y_test.copy()

safe_partition_run(pt_bad, bad_model)

mt_bad = MetamorphicTester(DATA_PATH)
mt_bad.X_base = X_test.copy()
safe_metamorphic_run(mt_bad, bad_model)


===== BAD MODEL TESTING =====



### Partition Tests ###

=== men ===
Rows: 1660  TP=38 TN=1320 FP=179 FN=123

=== women ===
Rows: 1502  TP=38 TN=1207 FP=140 FN=117

=== young_adults ===
Rows: 131  TP=13 TN=78 FP=18 FN=22

=== middle_aged ===
Rows: 2470  TP=56 TN=1984 FP=254 FN=176

=== seniors ===
Rows: 561  TP=7 TN=465 FP=47 FN=42

=== single_parents ===
Rows: 1052  TP=41 TN=797 FP=106 FN=108

=== married_with_children ===
Rows: 110  TP=1 TN=86 FP=13 FN=10

=== no_children_no_partner ===
Rows: 1933  TP=33 TN=1592 FP=192 FN=116

=== currently_married ===
Rows: 177  TP=2 TN=138 FP=21 FN=16

=== currently_unmarried_with_partner ===
Rows: 262  TP=7 TN=211 FP=24 FN=20

=== currently_single ===
Rows: 2744  TP=67 TN=2195 FP=276 FN=206
Skipping multiple_unmarried_partners: no rows

=== likely_divorced ===
Rows: 514  TP=18 TN=408 FP=46 FN=42

=== likely_divorced_with_children ===
Rows: 239  TP=12 TN=187 FP=19 FN=21

=== likely_divorced_no_children ===
Rows: 275  TP=6 TN=221 FP=27 FN=21

=== divorced_women ===
Rows: 227  TP=


### Metamorphic Tests ###
>>> Running MR: Gender Flip
Test: Gender Flip
-----------------------------------
Total rows tested: 3162
Prediction Flips:  0
Violation Rate:    0.00%
-----------------------------------

>>> Running MR: Language Proficiency Flip
Test: Language Proficiency Flip
-----------------------------------
Total rows tested: 3162
Prediction Flips:  24
Violation Rate:    0.76%
-----------------------------------

>>> Running MR: Neighborhood Swap (Feijenoord <-> Kralingen)
Test: Neighborhood Swap
-----------------------------------
Total rows tested: 712
Prediction Flips:  8
Violation Rate:    1.12%
-----------------------------------

Done.


## Run GOOD Model Through Tests

In [10]:
print("\n===== GOOD MODEL TESTING =====")

pt_good = PartitionTester(DATA_PATH)
pt_good.X_test = X_test.copy()
pt_good.y_test = y_test.copy()

safe_partition_run(pt_good, good_model)

mt_good = MetamorphicTester(DATA_PATH)
mt_good.X_base = X_test.copy()
safe_metamorphic_run(mt_good, good_model)


===== GOOD MODEL TESTING =====



### Partition Tests ###

=== men ===
Rows: 1660  TP=48 TN=1341 FP=158 FN=113

=== women ===
Rows: 1502  TP=41 TN=1262 FP=85 FN=114

=== young_adults ===
Rows: 131  TP=16 TN=89 FP=7 FN=19

=== middle_aged ===
Rows: 2470  TP=66 TN=2034 FP=204 FN=166

=== seniors ===
Rows: 561  TP=7 TN=480 FP=32 FN=42

=== single_parents ===
Rows: 1052  TP=49 TN=835 FP=68 FN=100

=== married_with_children ===
Rows: 110  TP=2 TN=92 FP=7 FN=9

=== no_children_no_partner ===
Rows: 1933  TP=35 TN=1617 FP=167 FN=114

=== currently_married ===
Rows: 177  TP=5 TN=151 FP=8 FN=13

=== currently_unmarried_with_partner ===
Rows: 262  TP=5 TN=218 FP=17 FN=22

=== currently_single ===
Rows: 2744  TP=80 TN=2251 FP=220 FN=193
Skipping multiple_unmarried_partners: no rows

=== likely_divorced ===
Rows: 514  TP=17 TN=407 FP=47 FN=43

=== likely_divorced_with_children ===
Rows: 239  TP=12 TN=187 FP=19 FN=21

=== likely_divorced_no_children ===
Rows: 275  TP=5 TN=220 FP=28 FN=22

=== divorced_women ===
Rows: 227  TP=6 TN=1


### Metamorphic Tests ###
>>> Running MR: Gender Flip
Test: Gender Flip
-----------------------------------
Total rows tested: 3162
Prediction Flips:  0
Violation Rate:    0.00%
-----------------------------------

>>> Running MR: Language Proficiency Flip
Test: Language Proficiency Flip
-----------------------------------
Total rows tested: 3162
Prediction Flips:  0
Violation Rate:    0.00%
-----------------------------------

>>> Running MR: Neighborhood Swap (Feijenoord <-> Kralingen)
Test: Neighborhood Swap
-----------------------------------
Total rows tested: 712
Prediction Flips:  0
Violation Rate:    0.00%
-----------------------------------

Done.
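
The partition output reports raw confusion-matrix counts, which are hard to compare across groups of different sizes. As a sketch of one way to read them, the snippet below converts the `men` and `women` counts printed above into false-positive and false-negative rates. The helper `error_rates` is hypothetical and not part of `PartitionTester`.

```python
def error_rates(tp, tn, fp, fn):
    """Return (false-positive rate, false-negative rate) from confusion counts."""
    fpr = fp / (fp + tn) if (fp + tn) else float("nan")
    fnr = fn / (fn + tp) if (fn + tp) else float("nan")
    return fpr, fnr


# Counts copied from the partition test output above
counts = {
    ("bad", "men"): dict(tp=38, tn=1320, fp=179, fn=123),
    ("bad", "women"): dict(tp=38, tn=1207, fp=140, fn=117),
    ("good", "men"): dict(tp=48, tn=1341, fp=158, fn=113),
    ("good", "women"): dict(tp=41, tn=1262, fp=85, fn=114),
}

rates = {key: error_rates(**c) for key, c in counts.items()}

for (model, group), (fpr, fnr) in rates.items():
    print(f"{model:>4} model, {group:<5}: FPR={fpr:.3f} FNR={fnr:.3f}")
```

Comparing the rates between groups for the same model gives a group-level view of error disparities that complements the per-row metamorphic checks.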
