
FACTS method bug: feature_weights parameter of the method is not propagated correctly #532

Closed
phantom-duck opened this issue May 20, 2024 · 0 comments

@phantom-duck (Contributor)

Inside the fit method of the FACTS detector, the function calc_costs is called to calculate the costs of the recourses, but the params argument (which includes the feature_weights) is not passed. As a result, the defaults are always used, which assign a weight of 1 to every feature.
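To make the failure mode concrete, here is a small self-contained toy sketch (not the actual aif360 code; the function and variable names are made up for illustration): a costing helper that accepts per-feature weights, and a caller that forgets to forward them, so the default weight of 1 is silently used.

# toy illustration of the bug, not the real FACTS implementation
def calc_costs_toy(changed_features, feature_weights=None):
    # cost of an action = sum of the weights of the features it changes,
    # falling back to a weight of 1 when no weights are supplied
    weights = feature_weights or {}
    return sum(weights.get(f, 1) for f in changed_features)

def fit_buggy(changed_features, feature_weights):
    # feature_weights is accepted but never forwarded, mirroring the reported bug
    return calc_costs_toy(changed_features)

def fit_fixed(changed_features, feature_weights):
    # forwarding the weights gives the expected behaviour
    return calc_costs_toy(changed_features, feature_weights=feature_weights)

weights = {"age": 10, "hours-per-week": 10}
print(fit_buggy(["age", "hours-per-week"], weights))   # 2  (weights ignored)
print(fit_fixed(["age", "hours-per-week"], weights))   # 20 (weights respected)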

Example to reproduce:

from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

from aif360.sklearn.datasets.openml_datasets import fetch_adult
from aif360.sklearn.detectors.facts.clean import clean_dataset
from aif360.sklearn.detectors.facts import FACTS
from aif360.sklearn.detectors.facts.predicate import Predicate
import pandas as pd

random_seed = 131313 # to produce the expected if-then clause

# load the adult dataset and perform some simple preprocessing steps
# See output for a glimpse of the final dataset's characteristics
X, y, sample_weight = fetch_adult()
data = clean_dataset(X.assign(income=y), "adult")

# split into train-test data
y = data['income']
X = data.drop('income', axis=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.7, random_state=random_seed, stratify=y)

#### here, we incrementally build the example model. It consists of one preprocessing step,
#### which is to turn categorical features into the respective one-hot encodings, and
#### a simple scikit-learn logistic regressor.
categorical_features = X.select_dtypes(include=["object", "category"]).columns.to_list()
categorical_features_onehot_transformer = ColumnTransformer(
    transformers=[
        ("one-hot-encoder", OneHotEncoder(), categorical_features)
    ],
    remainder="passthrough"
)
model = Pipeline([
    ("one-hot-encoder", categorical_features_onehot_transformer),
    ("clf", LogisticRegression(max_iter=1500))
])

#### train the model
model = model.fit(X_train, y_train)

detector = FACTS(
    clf=model,
    prot_attr="sex",
    freq_itemset_min_supp=0.08,
    feature_weights={f: 10 for f in X.columns},
    feats_not_allowed_to_change=[]
)

detector = detector.fit(X_test)

print(detector.rules_by_if[Predicate.from_dict({"age": pd.Interval(26., 34.), "hours-per-week": "FullTime"})]["Female"][1][0])

The output of the final command is (Predicate(features=['age', 'hours-per-week'], values=[Interval(41.0, 50.0, closed='right'), 'OverTime']), 0.07728337236533955, 2.0). The last element shows that the action of changing the features "age" and "hours-per-week" (both of which are categorical after preprocessing) is counted as having a cost of 2.0, i.e. 1 per feature. The correct value, however, would be 20.0, since we assigned a weight of 10 to every feature.
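As a quick check, continuing from the example above and assuming the tuple layout shown in the output (then-clause, effectiveness, cost), the discrepancy can be verified directly:

# the recourse tuple unpacks into the then-clause, its effectiveness, and its cost
ifclause = Predicate.from_dict({"age": pd.Interval(26., 34.), "hours-per-week": "FullTime"})
action, effectiveness, cost = detector.rules_by_if[ifclause]["Female"][1][0]
print(cost)                        # 2.0 with the current code (default weight of 1 per feature)
print(10 * len(action.features))   # 20.0, the value expected given feature_weights of 10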
