In [1]:
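import os

# Make CUDA kernel launches synchronous. This has to be an environment
# variable (a plain Python variable has no effect) and must be set before
# CUDA is initialized.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"

import numpy as np
import torch
from regression_dataset import make_regression_dataset
from regression_network import RegressionNetwork

# Deterministic execution.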
seed = 42
torch.manual_seed(seed)
np.random.seed(seed)
torch.backends.cudnn.benchmark = False
torch.backends.cudnn.deterministic = True

# Example for the Application of the Feature Steering Method Presented in “Beyond Debiasing: Actively Steering Feature Selection via Loss Regularization”

This Jupyter notebook demonstrates our method on the redundant regression dataset presented in our paper.

You can generate feature attributions with the feature attribution method provided by Reimers et al., based on either **contextual decomposition** or **conditional mutual information**. You can also set other hyperparameters, such as the weight factor $\lambda$ and the norm that is applied (L1 or L2).

## Dataset
We create a small regression dataset with redundant variables as described in our paper. That is, the created dataset has 9 input variables with a redundancy of 3 variables. In total, we generate 2000 samples, of which 1400 are used for training, 300 for testing, and 300 for validation.

*Note:* For the evaluations in our paper, we generate not just one but 9 datasets with different seeds.
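
As a quick sanity check, the numbers in this section can be reproduced from the configuration values used in the next cell. This is only a sketch: reading the redundancy as `n_high_dim - n_informative_low_dim` and assuming the dataloader keeps a final partial batch are assumptions, not statements about `make_regression_dataset`.

```python
import math

# Values copied from the dataset configuration cell.
n_train, n_test, n_validation = 1400, 300, 300
n_high_dim, n_informative_low_dim = 9, 6
batch_size = 100

# Total number of generated samples.
n_samples = n_train + n_test + n_validation

# Assumption: redundancy = high-dimensional inputs minus informative low-dimensional features.
n_redundant = n_high_dim - n_informative_low_dim

# Assumption: a final partial batch (if any) is kept, hence the ceiling.
batches_per_epoch = math.ceil(n_train / batch_size)

print(n_samples, n_redundant, batches_per_epoch)
```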

In [2]:
# Configuration of the datasets.
high_dim_transform = True
normalize = True  # Should only be set if high_dim_transform is True
n_informative_low_dim = 6
n_high_dim = 9
n_train, n_test, n_validation = 1400, 300, 300
n_uninformative_low_dim = 0
dataset_seed = 42
batch_size = 100
n_datasets = 9

noise_on_output = 0.0
noise_on_high_dim_snrdb = None

In [3]:
# Create and load the regression dataset.
train_dataloader, test_dataloader, validation_dataloader, _ = make_regression_dataset(
    high_dim_transform=high_dim_transform,
    n_features_low_dim=n_informative_low_dim,
    n_uninformative_low_dim=n_uninformative_low_dim,
    n_high_dim=n_high_dim,
    noise_on_high_dim_snrdb=noise_on_high_dim_snrdb,
    noise_on_output=noise_on_output,
    n_train=n_train,
    n_test=n_test,
    n_validation=n_validation,
    normalize=normalize,
    seed=dataset_seed,
    batch_size=batch_size,
)

## Network
We follow the paper and create a network with a single hidden layer of size 9 and input size 9.
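
To get a feeling for how small this network is, the following sketch counts the trainable parameters of a fully connected network with the sizes given above. It assumes a single scalar output and a bias term in every layer; `mlp_parameter_count` is a hypothetical helper, and the actual layout of `RegressionNetwork` may differ.

```python
def mlp_parameter_count(input_size, hidden_dim_size, n_hidden_layers, output_size=1):
    """Count weights and biases of a fully connected network (hypothetical helper)."""
    # Layer widths from input to output, e.g. [9, 9, 1] for one hidden layer.
    sizes = [input_size] + [hidden_dim_size] * n_hidden_layers + [output_size]
    # Each layer contributes n_in * n_out weights and n_out biases.
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))


n_params = mlp_parameter_count(input_size=9, hidden_dim_size=9, n_hidden_layers=1)
print(n_params)
```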

In [4]:
# Network architecture.
input_size = n_high_dim
hidden_dim_size = n_high_dim
n_hidden_layers = 1
device = "cuda:0" if torch.cuda.is_available() else "cpu"

# Create Network.
mlp = RegressionNetwork(
    input_shape=input_size,
    n_hidden_layers=n_hidden_layers,
    hidden_dim_size=hidden_dim_size,
    device=device,
)

## Training

After creating the network, we can train it with the *feature steering loss*.

Recall from the paper that our method to steer feature usage is implemented via loss regularization. Let $D$ refer to the set of features that should be discouraged and $E$ to the set of features that should be encouraged. With $c_i$ being a measure of the influence of feature $i$ on the model's prediction process, $\lambda \in \mathbb{R}_{\ge 0}$ as a weight factor and $\mathcal{L}$ as the standard maximum-likelihood loss for network parameters $\theta$, our model is trained with the following loss function:

$$ \mathcal{L}'(\theta) = \mathcal{L}(\theta) + \lambda \left( \sum_{i \in D} || c_i || - \sum_{i \in E} || c_i || \right) .$$
For $|| \cdot ||$, we consider the L1 and L2 norms.
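
The arithmetic of this loss can be sketched in plain Python. The sketch assumes that each attribution $c_i$ is available as a list of numbers (for example, one value per sample), which is an assumption made for illustration. The names `norm` and `steering_loss` are hypothetical and not part of our implementation, where the same expression would be computed on tensors.

```python
import math


def norm(values, p):
    """Return the L1 (p=1) or L2 (p=2) norm of a sequence of floats."""
    if p == 1:
        return sum(abs(v) for v in values)
    if p == 2:
        return math.sqrt(sum(v * v for v in values))
    raise ValueError("p must be 1 or 2")


def steering_loss(base_loss, attributions, discourage, encourage, lam, p=2):
    """Compute L' = L + lambda * (sum_{i in D} ||c_i|| - sum_{i in E} ||c_i||)."""
    penalty = sum(norm(attributions[i], p) for i in discourage)  # features in D
    reward = sum(norm(attributions[i], p) for i in encourage)    # features in E
    return base_loss + lam * (penalty - reward)


# Made-up attributions for three features, purely for illustration.
attributions = {0: [3.0, 4.0], 1: [1.0, 0.0], 2: [0.0, 2.0]}
loss = steering_loss(
    base_loss=1.0,
    attributions=attributions,
    discourage=[0],
    encourage=[1, 2],
    lam=0.1,
    p=2,
)
print(loss)
```

With empty sets $D$ and $E$, the regularizer vanishes and the function returns the unmodified loss $\mathcal{L}(\theta)$.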

**Parameters:**

Our implementation allows you to choose several *hyperparameters* for the feature steering process. You can adapt the following aspects of the calculation of the loss function:

* The feature attributions $c_i$ are generated with the feature attribution method proposed by Reimers et al. Two attribution modes are available: `cmi` for attribution based on the (transformed) conditional mutual information, and `contextual_decomposition` for attribution based on contextual decomposition.
* Feature steering can use either the L1 norm (`loss_l1`) or the L2 norm (`loss_l2`) of the feature attributions. This selects the norm applied for $|| \cdot ||$.
* The indices of the features that shall be encouraged or discouraged (defining $E$ and $D$) are passed as lists under `encourage` and `discourage`.
* The weight factor $\lambda$ is specified as `lambda`.
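
Before training, it can help to check a configuration against the options listed above. The following is a minimal sketch of such a check; `validate_steering_config` is a hypothetical helper that is not part of our implementation, and the requirement that $D$ and $E$ do not overlap is an assumption.

```python
ATTRIB_MODES = {"cmi", "contextual_decomposition"}
STEERING_MODES = {"loss_l1", "loss_l2"}


def validate_steering_config(config, n_features):
    """Raise ValueError if the steering configuration is inconsistent (hypothetical helper)."""
    if config["attrib_mode"] not in ATTRIB_MODES:
        raise ValueError(f"unknown attrib_mode: {config['attrib_mode']!r}")
    if config["steering_mode"] not in STEERING_MODES:
        raise ValueError(f"unknown steering_mode: {config['steering_mode']!r}")

    encourage = set(config["encourage"])
    discourage = set(config["discourage"])
    # Every index must refer to an existing input feature.
    for idx in encourage | discourage:
        if not isinstance(idx, int) or not 0 <= idx < n_features:
            raise ValueError(f"feature index out of range: {idx!r}")
    # Assumption: a feature cannot be encouraged and discouraged at the same time.
    if encourage & discourage:
        raise ValueError(f"features in both sets: {sorted(encourage & discourage)}")
    # The weight factor lambda must be non-negative.
    if config["lambda"] < 0:
        raise ValueError("lambda must be non-negative")
    return True


example_config = {
    "attrib_mode": "cmi",
    "steering_mode": "loss_l2",
    "encourage": [0, 1, 2],
    "discourage": [],
    "lambda": 100.0,
}
config_ok = validate_steering_config(example_config, n_features=9)
```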

In [9]:
# Training configuration.
learning_rate = 0.01
epochs = 90
feat_steering_config = {
    "attrib_mode": "cmi",
    "steering_mode": "loss_l2",
    "encourage": [0, 1, 2],
    "discourage": [],
    "lambda": 100.0,  # Adapt to the attribution mode (CMI / contextual decomposition)
}

# Train the network.
mlp.train(train_dataloader, feat_steering_config, epochs, learning_rate)

Loss (per sample) after epoch 1: 4712089.607142857
Loss (per sample) after epoch 2: 4480544.142857143
Loss (per sample) after epoch 3: 4258867.017857143
Loss (per sample) after epoch 4: 4050848.214285714
Loss (per sample) after epoch 5: 3851129.5714285714
Loss (per sample) after epoch 6: 3662716.375
Loss (per sample) after epoch 7: 3484030.3214285714
Loss (per sample) after epoch 8: 3317511.035714286
Loss (per sample) after epoch 9: 3158147.5714285714
Loss (per sample) after epoch 10: 3006144.9821428573
Loss (per sample) after epoch 11: 2864496.8035714286
Loss (per sample) after epoch 12: 2727674.410714286
Loss (per sample) after epoch 13: 2597680.910714286
Loss (per sample) after epoch 14: 2478867.535714286
Loss (per sample) after epoch 15: 2361367.4553571427
Loss (per sample) after epoch 16: 2251085.125
Loss (per sample) after epoch 17: 2148403.6160714286


KeyboardInterrupt: 