# Tabular Concept Bottleneck Models (TabCBMs): Synthetic Nonlinear Task Example

This very short notebook will showcase how to set up a Tabular Concept Bottleneck
Model (TabCBM) using our library and train it on the Synthetic Nonlinear
(Synth-Nonlin) dataset used in our TabCBM TMLR 2023 paper. This example will
showcase how to train TabCBMs when **no concept supervision** is provided, and,
therefore, concept learning is entirely unsupervised.

Our example is composed by four different steps:
1. Loading the dataset of interest in a format that can be "digested" by our models.
2. Instantiating a TabCBM with the embedding size and encoder/decoder architectures we want to use.
3. Pretraining the latent code encoder and label predictor.
4. Pretraining TabCBM's mask generators using Self-supervised Learning (SSL) on the Synth-Nonlin dataset.
5. Training TabCBM to discover tabular concepts and predict labels in the Synth-Nonlin dataset.
6. Evaluating TabCBM's task accuracy.
7. Finding ground-truth concepts that are strongly correlated with discovered TabCBM concepts.
8. Intervening on strongly aligned concepts during test time.

## Part 1: Load Data

As a first step, we will show you how one can generate a dataset from scratch
that is compatible with how our training pipeline is set.

In practice, you can train any TabCBM using our library as long as
your dataset is structured such that samples, task labels, and concept labels
(if any) are stored in numpy arrays.

Below, we show how we do this for the Synth-Nonlin dataset. For details on the
actual dataset, please refer to our paper.

In [1]:
%load_ext tensorboard
%matplotlib inline

In [2]:
import matplotlib.pyplot as plt
import numpy as np
from sklearn.model_selection import train_test_split

# A helper function to generate our synthetic tabular datasets
def generate_tabular_synth_data(
    n_features,
    spacing=10,
    n_concepts=2,
    dataset_size=10000,
    latent_map=lambda x: x,
    test_percent=0.2,
    overlap=0,
    seed=0,
):
    np.random.seed(seed)
    latent = np.random.normal(size=(dataset_size, n_features)).astype(
        np.float32
    )
    X_train = latent_map(latent)
    c_train = np.zeros((dataset_size, n_concepts), dtype=np.int32)
    ground_truth_concept_masks = np.zeros(
        shape=(n_concepts, n_features), dtype=np.int32
    )
    for i in range(n_concepts):
        start = i * spacing
        start = max(start - overlap, 0)
        end = (i + 1) * spacing
        end = min(end + overlap, latent.shape[-1])
        c_train[:, i] = (
            np.sum(latent[:, start:end], axis=-1) > 0
        ).astype(np.int32)
        ground_truth_concept_masks[i, start:end] = 1
    y_train = np.zeros((dataset_size,), dtype=np.int32)
    for i in range(dataset_size):
        bin_str = ''
        for c in c_train[i, :]:
            bin_str += str(c)
        y_train[i] = int(bin_str, 2)
    X_train, X_test, y_train, y_test, c_train, c_test = train_test_split(
        X_train,
        y_train,
        c_train,
        test_size=test_percent,
        random_state=seed,
    )
    return (
        X_train,
        X_test,
        y_train,
        y_test,
        c_train,
        c_test,
        ground_truth_concept_masks
    )

def generate_tabular_synth_nonlinear_data(seed):
    n_ground_truth_concepts = 2
    extra_hyperparameters = {
        'n_ground_truth_concepts': n_ground_truth_concepts,
    }
    data = generate_tabular_synth_data(
        dataset_size=15000,
        n_features=100,
        spacing=5,
        n_concepts=2,
        latent_map=lambda x: np.sin(x) + x,
        seed=seed,
    )
    return data, extra_hyperparameters



In [3]:
data, extra_hypers  = generate_tabular_synth_nonlinear_data(
    seed=42,
)
x_train, x_test, y_train, y_test, c_train, c_test, ground_truth_concept_masks = data

In [4]:
print("x_train has shape", x_train.shape, "and type", x_train.dtype)
print("y_train has shape", y_train.shape, "and type", y_train.dtype)
print("c_train has shape", c_train.shape, "and type", c_train.dtype)
print("Ground truth concept masks are:")
for i, mask in enumerate(ground_truth_concept_masks):
    print("\tConcept", i + 1, "depends on the following features", mask)

print("x_test has shape", x_test.shape, "and type", x_test.dtype)
print("y_test has shape", y_test.shape, "and type", y_test.dtype)
print("c_test has shape", c_test.shape, "and type", c_test.dtype)

x_train has shape (12000, 100) and type float32
y_train has shape (12000,) and type int32
c_train has shape (12000, 2) and type int32
Ground truth concept masks are:
	Concept 1 depends on the following features [1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
	Concept 2 depends on the following features [0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
x_test has shape (3000, 100) and type float32
y_test has shape (3000,) and type int32
c_test has shape (3000, 2) and type int32


## Part 2: TabCBM Training

### Step 1: Create Main Components for TabCBM


Now that we have our dataset, we can proceed to construct our TabCBM object. For this,
we will first need to construct TabCBM's feature-to-latent-code encoder (i.e., $\phi$) and its 
concept-to-label predictor (i.e., $f$).

We will proceed to construct and pretrain these TF models first before using them
to construct our `TabCBM` model:


In [5]:

# First define a key set of parameters of importance here and throughout this
# notebook

import tensorflow as tf

################################################################################
# Parameters defining the architecture we will use
################################################################################

input_shape = x_train.shape[1:]
num_outputs = len(set(y_train))
encoder_units = [16, 16]
decoder_units = [16]
latent_dims = 16
learning_rate = 0.001
validation_size = 0.1

In [6]:

################################################################################
# Next, we build the feature to latent code encoder model (i.e., phi)
################################################################################

encoder_inputs = tf.keras.Input(shape=input_shape)
encoder_compute_graph = encoder_inputs

# Include the fully connected bottleneck here
for i, units in enumerate(encoder_units):
    encoder_compute_graph = tf.keras.layers.Dense(
        units,
        activation='relu',
        name=f"encoder_dense_{i}",
    )(encoder_compute_graph)

# TIme to generate the latent code here
encoder_compute_graph = tf.keras.layers.Dense(
    latent_dims,
    activation=None,
    name="encoder_bypass_channel",
)(encoder_compute_graph)

encoder = tf.keras.Model(
    encoder_inputs,
    encoder_compute_graph,
    name="encoder",
)
encoder.summary()

Model: "encoder"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_1 (InputLayer)        [(None, 100)]             0         
                                                                 
 encoder_dense_0 (Dense)     (None, 16)                1616      
                                                                 
 encoder_dense_1 (Dense)     (None, 16)                272       
                                                                 
 encoder_bypass_channel (Den  (None, 16)               272       
 se)                                                             
                                                                 
Total params: 2,160
Trainable params: 2,160
Non-trainable params: 0
_________________________________________________________________


In [7]:
################################################################################
# Then, we build the concept to label model  (i.e., the label predictor f)
################################################################################

decoder_inputs = tf.keras.Input(shape=[latent_dims])
decoder_layers = [
    tf.keras.layers.Dense(
        units,
        activation=tf.nn.relu,
        name=f"decoder_dense_{i+1}",
    ) for i, units in enumerate(decoder_units)
]
decoder_graph = tf.keras.Sequential(decoder_layers + [
    tf.keras.layers.Dense(
        num_outputs if num_outputs > 2 else 1,
        activation=None,
        name="decoder_model_output",
    )
])
decoder = tf.keras.Model(
    decoder_inputs,
    decoder_graph(decoder_inputs),
    name="decoder",
)
decoder.summary()

Model: "decoder"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_2 (InputLayer)        [(None, 16)]              0         
                                                                 
 sequential (Sequential)     (None, 4)                 340       
                                                                 
Total params: 340
Trainable params: 340
Non-trainable params: 0
_________________________________________________________________


In [8]:
################################################################################
# We then put them both together to make an end-to-end model we can pretrain
################################################################################

end_to_end_inputs = tf.keras.Input(shape=input_shape)
latent = encoder(end_to_end_inputs)
end_to_end_model_compute_graph = decoder(latent)
# Now time to collapse all the concepts again back into a single vector
end_to_end_model = tf.keras.Model(
    end_to_end_inputs,
    end_to_end_model_compute_graph,
    name="complete_model",
)
end_to_end_model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate),
    loss=(
        tf.keras.losses.BinaryCrossentropy(from_logits=True) if (num_outputs <= 2)
        else tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
    ),
    metrics=[
        "accuracy" if (num_outputs <= 2)
        else "sparse_categorical_accuracy"
    ],
)
end_to_end_model.summary()

Model: "complete_model"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_3 (InputLayer)        [(None, 100)]             0         
                                                                 
 encoder (Functional)        (None, 16)                2160      
                                                                 
 decoder (Functional)        (None, 4)                 340       
                                                                 
Total params: 2,500
Trainable params: 2,500
Non-trainable params: 0
_________________________________________________________________


### Step 2: Encoder and Decoder Pretraining

As discussed in our paper, for ease of convergence and stability, we first 
pretrain $\phi$ and $f$ by training and end-to-end model $f(\phi(x))$ that maps
inputs to the downstream tasks:

In [9]:
################################################################################
## Latent code model pre-training (using end-to-end model)
################################################################################

# Number of epochs to pretain our encoder and decoder models (see values used
# for each dataset and model in our paper's appendix)
pretrain_epochs = 50
batch_size = 1024
pretrain_hist = end_to_end_model.fit(
    x=x_train,
    y=y_train,
    epochs=pretrain_epochs,
    batch_size=batch_size,
    validation_split=validation_size,
    verbose=1,
)

Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50
Epoch 33/50
Epoch 34/50
Epoch 35/50
Epoch 36/50
Epoch 37/50
Epoch 38/50
Epoch 39/50
Epoch 40/50
Epoch 41/50
Epoch 42/50
Epoch 43/50
Epoch 44/50
Epoch 45/50
Epoch 46/50
Epoch 47/50
Epoch 48/50
Epoch 49/50
Epoch 50/50


We can evaluate this pre-trained model to at least make sure it is sensical:

In [10]:
import scipy
import sklearn

# We will accumulate all metrics/results in the same dictionary
results = {}

# Make test predictions for the test set
end_to_end_preds = end_to_end_model.predict(
    x_test,
    batch_size=batch_size,
)

# Get accuracy/AUC using the corresponding test labels
if ((len(end_to_end_preds.shape) == 2)) and (end_to_end_preds.shape[-1] >= 2):
    # Then we are using multi-class outputs
    preds = scipy.special.softmax(
        end_to_end_preds,
        axis=-1,
    )

    one_hot_labels = tf.keras.utils.to_categorical(y_test)
    results['pre_train_acc'] = sklearn.metrics.accuracy_score(
        y_test,
        np.argmax(preds, axis=-1),
    )

    # And select just the labels that are in fact being used
    results['pre_train_auc'] = sklearn.metrics.roc_auc_score(
        one_hot_labels,
        preds,
        multi_class='ovo',
    )
else:
    # Otherwise we are dealing with simple binary outputs
    if np.min(end_to_end_preds) < 0.0 or np.max(end_to_end_preds) > 1:
        # Then we assume that we have outputed logits
        end_to_end_preds = tf.math.sigmoid(end_to_end_preds).numpy()
    end_to_end_preds = (end_to_end_preds >= 0.5).astype(np.int32)
    results['pre_train_acc'] = sklearn.metrics.accuracy_score(
        y_test,
        end_to_end_preds,
    )
    results['pre_train_auc'] = sklearn.metrics.roc_auc_score(
        y_test,
        end_to_end_preds,
    )
print(f"Pretrained model task accuracy: {results['pre_train_acc']*100:.2f}%")

Pretrained model task accuracy: 88.43%


### Step 3: Construct TabCBM

We are now ready to construct a TabCBM. For this, we will first compute the
empirical covariance matrix in order for us to learn useful masks using a
similar approach to that proposed by SEFS:

In [11]:
# Construct the training set's empirical covariance matrix
# NOTE: This step can be very computationally expensive/intractable in large
#       datasets. In those cases, one may ignore the covariance matrix when
#       performing TabCBM's pretraining at the potential cost of performance or
#       more accurate concept discovery.
cov_mat = np.corrcoef(x_train.T)
print(cov_mat)

[[ 1.00000000e+00  6.21229800e-03 -2.84060534e-05 ... -3.36486944e-03
  -4.50653273e-04 -7.42122047e-03]
 [ 6.21229800e-03  1.00000000e+00  1.58402942e-03 ... -2.80296350e-03
   4.41576906e-03  8.81597038e-04]
 [-2.84060534e-05  1.58402942e-03  1.00000000e+00 ...  5.41576832e-03
  -3.71076596e-03 -2.14195915e-02]
 ...
 [-3.36486944e-03 -2.80296350e-03  5.41576832e-03 ...  1.00000000e+00
   1.80182253e-03  1.88755588e-02]
 [-4.50653273e-04  4.41576906e-03 -3.71076596e-03 ...  1.80182253e-03
   1.00000000e+00 -9.81476912e-03]
 [-7.42122047e-03  8.81597038e-04 -2.14195915e-02 ...  1.88755588e-02
  -9.81476912e-03  1.00000000e+00]]


We then use this covariance matrix, together with our encoder and decoder models that
have been pretrained above, to construct a TabCBM model:

In [12]:
from tabcbm.models.tabcbm import TabCBM

# Number of concepts we want to discover
n_concepts = 2

# Set the weights for the different regularisers in the loss
coherence_reg_weight = 0.1  # $lambda_{co}
diversity_reg_weight = 5 # $lambda_{div}
feature_selection_reg_weight = 5 # $lambda_{spec}
gate_estimator_weight = 10 # Gate prediction regularizer for SEFS's pre-text task

# Select how many neighbors to use for the coherency loss (must be less than
# the batch size!)
top_k = 256

# Generate a dictionary with the parameters to use for TabCBM as we will have
# to use the same parameters twice:
tab_cbm_params = dict(
    features_to_concepts_model=encoder,  # The $\phi$ sub-model
    concepts_to_labels_model=decoder,  # The $f$ sub-model
    latent_dims=latent_dims,  # The dimensionality of the concept embeddings $m$
    n_concepts=n_concepts,  # The number of concepts to discover $k^\prime$
    cov_mat=cov_mat,  # The empirical covariance matrix
    loss_fn=end_to_end_model.loss,  # The downstream task loss function
    # Then we provide all the regularizers weights
    coherence_reg_weight=coherence_reg_weight,
    diversity_reg_weight=diversity_reg_weight,
    feature_selection_reg_weight=feature_selection_reg_weight,
    gate_estimator_weight=gate_estimator_weight,
    top_k=top_k,

    # And indicate that we will not be providing any supervised concepts! Change
    # this is training concepts (e.g., `c_train`) are provided/known during
    # training
    n_supervised_concepts=0,
    concept_prediction_weight=0,

    # The accuracy metric to use for logging performance
    acc_metric=(
        lambda y_true, y_pred: tf.math.reduce_mean(
            tf.keras.metrics.sparse_categorical_accuracy(
                y_true,
                y_pred,
            )
        )
    ),

    # ANd architectural details of the self-supervised reconstruction modules
    concept_generator_units=[64],
    rec_model_units=[64],
)

### Step 4: Mask Generator Self-supervised Training

Next, we proceed to do the self-supervised training of the mask generators for
TabCBM. For this, we will follow a similar approach to that of SEFS. Our TabCBM
module allows one to do this by setting the `self_supervised_mode` flag to
`True` before calling the `.fit()` method:

In [13]:
# We can now construct our TabCBM model which we will first self-supervise!
ss_tabcbm = TabCBM(
    self_supervised_mode=True,
    **tab_cbm_params,
)
# Compile it with the appropiate optimizer
ss_tabcbm.compile(
    optimizer=tf.keras.optimizers.Adam(
        learning_rate,
    )
)

# Let's do a dummy call to initialize the model so that we can inspect it
ss_tabcbm._compute_self_supervised_loss(
    x_train[:2, :],
)
ss_tabcbm(x_train[:2, :])

# And generate a summary
ss_tabcbm.summary()

Model: "tab_cbm"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 decoder (Functional)        (None, 4)                 340       
                                                                 
 encoder (Functional)        (None, 16)                2160      
                                                                 
 sequential_1 (Sequential)   (None, 16)                9516      
                                                                 
 concept_generators_0 (Seque  (2, 16)                  7504      
 ntial)                                                          
                                                                 
 concept_generators_1 (Seque  (2, 16)                  7504      
 ntial)                                                          
                                                                 
 rec_values_model_0 (Sequent  (2, 100)                 7588

In [14]:
# And we are ready to do the SS pretraining of the mask generators for a total
# of 50 epochs
self_supervised_train_epochs = 50
print("TabCBM self-supervised training stage...")
ss_tabcbm_hist = ss_tabcbm.fit(
    x=x_train,
    y=y_train,
    validation_split=validation_size,
    epochs=self_supervised_train_epochs,
    batch_size=batch_size,
    verbose=1,
)
print("\tTabCBM self-supervised training completed")

TabCBM self-supervised training stage...
Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50
Epoch 33/50
Epoch 34/50
Epoch 35/50
Epoch 36/50
Epoch 37/50
Epoch 38/50
Epoch 39/50
Epoch 40/50
Epoch 41/50
Epoch 42/50
Epoch 43/50
Epoch 44/50
Epoch 45/50
Epoch 46/50
Epoch 47/50
Epoch 48/50
Epoch 49/50
Epoch 50/50
	TabCBM self-supervised training completed


### Step 5: End-to-end Training Stage

Finally, we proceed to do an end-to-end training of all of TabCBM's components
using its composite loss. This will yield a fully trained TabCBM from which
learnt concept masks could be analyzed and predictions could be made:

In [15]:
# First we will instantiate a new TabCBM that is NOT in self-supervised mode
# and we will load its weights so that they are the same as the model whose
# mask generators have been pre-trained using the SS loss.
tabcbm = TabCBM(
    self_supervised_mode=False,
    # Notice how we provide as concept generators the concept generators of the
    # SS TabCBM:
    concept_generators=ss_tabcbm.concept_generators,
    # as well as the feature probability masks:
    prior_masks=ss_tabcbm.feature_probabilities,
    **tab_cbm_params,
)
tabcbm.compile(optimizer=tf.keras.optimizers.Adam(learning_rate))
tabcbm._compute_supervised_loss(
    x_train[:2, :],
    y_train[:2],
    c_true=None,
)
tabcbm(x_train[:2, :])
tabcbm.summary()

Model: "tab_cbm_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 decoder (Functional)        (None, 4)                 340       
                                                                 
 encoder (Functional)        (None, 16)                2160      
                                                                 
 sequential_2 (Sequential)   (None, 16)                9516      
                                                                 
 concept_generators_0 (Seque  (None, 16)               7504      
 ntial)                                                          
                                                                 
 concept_generators_1 (Seque  (None, 16)               7504      
 ntial)                                                          
                                                                 
Total params: 27,448
Trainable params: 27,224
Non-trainab

In [16]:
#####################
## Next, we perform the end-to-end training of this architecture
#####################

# Number of maximum epochs to train
max_epochs = 1500

# Time to do the end-to-end training!
tabcbm_hist = tabcbm.fit(
    x=x_train,
    y=y_train,
    validation_split=validation_size,
    epochs=max_epochs,
    batch_size=batch_size,
    verbose=1,
)
print("\tTabCBM supervised training completed")

Epoch 1/1500
Epoch 2/1500
Epoch 3/1500
Epoch 4/1500
Epoch 5/1500
Epoch 6/1500
Epoch 7/1500
Epoch 8/1500
Epoch 9/1500
Epoch 10/1500
Epoch 11/1500
Epoch 12/1500
Epoch 13/1500
Epoch 14/1500
Epoch 15/1500
Epoch 16/1500
Epoch 17/1500
Epoch 18/1500
Epoch 19/1500
Epoch 20/1500
Epoch 21/1500
Epoch 22/1500
Epoch 23/1500
Epoch 24/1500
Epoch 25/1500
Epoch 26/1500
Epoch 27/1500
Epoch 28/1500
Epoch 29/1500
Epoch 30/1500
Epoch 31/1500
Epoch 32/1500
Epoch 33/1500
Epoch 34/1500
Epoch 35/1500
Epoch 36/1500
Epoch 37/1500
Epoch 38/1500
Epoch 39/1500
Epoch 40/1500
Epoch 41/1500
Epoch 42/1500
Epoch 43/1500
Epoch 44/1500
Epoch 45/1500
Epoch 46/1500
Epoch 47/1500
Epoch 48/1500
Epoch 49/1500
Epoch 50/1500
Epoch 51/1500
Epoch 52/1500
Epoch 53/1500
Epoch 54/1500
Epoch 55/1500
Epoch 56/1500
Epoch 57/1500
Epoch 58/1500
Epoch 59/1500
Epoch 60/1500
Epoch 61/1500
Epoch 62/1500
Epoch 63/1500
Epoch 64/1500
Epoch 65/1500
Epoch 66/1500
Epoch 67/1500
Epoch 68/1500
Epoch 69/1500
Epoch 70/1500
Epoch 71/1500
Epoch 72/1500
E

## Part 3: Evaluate Model

Once the TabCBM has been trained, you can (1) inspect its learn concepts (
through their learnt masks), (2) evaluate its performance on a test set, and (3)
see if its concepts align to any known ground truth concepts; if so, then you
can intervene on them too! Here we will show how each of these things can be
done.


First, it is important to know how to interact with a trained TabCBM. A TabCBM
can be called with any input sample of shape `(batch_size, ...)` using TF's
functional API:
```python
y_pred, concept_scores = tabcbm(x)
```
Where:
1. `y_pred` is a $(\text{batch\_size}, L)$-dimensional vector where the $i$-th
dimension is proportional to the probability that the i-th label is predicted
for the current sample (the model outputs logits by default). If the downstream
task is binary, then the TabCBM will output a $(\text{batch\_size})$-dimensional
vector where each entry is the logit of the probability of the downstream class
being $1$.
2. `concept_scores` is a $(\text{batch\_size}, k^\prime)$-dimensional vector whose
entries, all in $[0, 1]$, represent the activation of each of the $k^\prime$ discovered
concepts for all samples in the provided input.

This allows us to compute some metrics of interest. Below, we will use
this API to run inference in batches in a GPU and obtain all test activations:

In [22]:
# Compute the test task label predictions and the test set concept scores
test_y_pred, test_concept_scores = tabcbm.predict(
    x_test,
    batch_size=batch_size,
)
if ((len(test_y_pred.shape) == 2)) and (test_y_pred.shape[-1] >= 2):
    # Then lets apply a softmax activation over all the probability
    # classes
    preds = scipy.special.softmax(
        test_y_pred,
        axis=-1,
    )

    one_hot_labels = tf.keras.utils.to_categorical(y_test)
    results['acc'] = sklearn.metrics.accuracy_score(
        y_test,
        np.argmax(preds, axis=-1),
    )

    # And select just the labels that are in fact being used
    results['auc'] = sklearn.metrics.roc_auc_score(
        one_hot_labels,
        preds,
        multi_class='ovo',
    )
else:
    test_preds = test_y_pred
    if np.min(test_preds) < 0.0 or np.max(test_preds) > 1:
        # Then we assume that we have outputed logits
        test_preds = tf.math.sigmoid(test_preds).numpy()
    test_preds = (test_preds >= 0.5).astype(np.int32)
    results['acc'] = sklearn.metrics.accuracy_score(
        y_test,
        test_preds,
    )
    results['auc'] = sklearn.metrics.roc_auc_score(
        y_test,
        test_preds,
    )

print(
    f"Accuracy is {results['acc']*100:.2f}%"
)

Accuracy is 95.93%


We can also look at the learnt masks using the `feature_probabilities` field in our TabCBM object:

In [23]:
# The masks are stored as logits, so we need to turn them to probabilities using
# a sigmoid
masks = tf.sigmoid(tabcbm.feature_probabilities).numpy()
print("Thresholded concept masks learnt by TabCBM:")
for i, mask in enumerate((masks>0.5).astype(np.int32)):
    print("\tFor concept", i, "we are selecting the following features", mask)
print("-" * 80)
print("-" * 80)
print("For comparison, the ground truth concept masks are")
for i, mask in enumerate(ground_truth_concept_masks):
    print("\tFor GROUND-TRUTH concept", i, " the following features are relevant", mask)


Thresholded concept masks learnt by TabCBM:
	For concept 0 we are selecting the following features [1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
	For concept 1 we are selecting the following features [0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0]
--------------------------------------------------------------------------------
--------------------------------------------------------------------------------
For comparison, the ground truth concept masks are
	For GROUND-TRUTH concept 0  the following features are relevant [1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0

We can also look at the learnt concept scores and see if they are closely
correlated with the activation of a known ground truth concept:

In [24]:
import tabcbm.metrics as metrics

# We will do this using the training set to avoid information leakage from
# the test set
train_y_pred, train_concept_scores = tabcbm.predict(
    x_train,
    batch_size=batch_size,
)
best_concept_alignment, best_concept_aligment_corr = \
    metrics.find_best_independent_alignment(
        scores=train_concept_scores,
        c_train=c_train,
    )

Here, `best_concept_alignment` will be a vector of size $k^\prime$ such that
`best_concept_alignment[i]` indicates which ground truth concept is most highly
(linearly) correlated with discovered concept `i`. In contrast, `best_concept_aligment_corr`
is a vector of size $k^\prime$ telling you the absolute Pearson correlation coefficient
between ground truth concept `best_concept_alignment[i]` and the scores of the
$i$-th discovered concept.

In [25]:
for discovered_concept_idx, gt_concept_idx in enumerate(best_concept_alignment):
    print(
        "Discovered concept",
        discovered_concept_idx,
        "is most closely aligned with ground truth concept",
        gt_concept_idx,
        "and has an absolute Pearson correlation of",
        best_concept_aligment_corr[discovered_concept_idx],
    )

Discovered concept 0 is most closely aligned with ground truth concept 0 and has an absolute Pearson correlation of 0.9200275111879257
Discovered concept 1 is most closely aligned with ground truth concept 1 and has an absolute Pearson correlation of 0.9285000848460767


We can use this to compute the effect of concept interventions on these algined
concepts:

In [26]:
# We first define a threshold of correlation under which we consider a
# correlation to be strong enough that we can intervene on it. As in our paper,
# we will consider a discovered concept to be strongly aligned with a ground
# truth concept is its absolute Pearson correlation coefficient is at least
# 0.85
thresh = 0.85

# Next, figure out which discovered concepts are strongly aligned with known
# ground truth concepts
n_ground_truth_concepts = c_train.shape[-1]
selected_concepts = best_concept_aligment_corr >= thresh
selected_concepts_idxs = np.array(list(range(n_concepts)))[selected_concepts]
best_concept_alignment = best_concept_alignment[selected_concepts]

# This lets us know how many concepts at most we will intervene on during
# testing
interveneable_concepts = np.sum(selected_concepts)
print(
    f"Number of concepts we will intervene on " +
    f"is {interveneable_concepts}/{n_concepts}"
)

# At this point, we can predict the concept scores/bottleneck for all of the
# test samples. This will be useful as we will update these bottlenecks as a
# way to try out interventions
_, test_bottleneck = tabcbm.predict_bottleneck(x_test)
test_bottleneck = test_bottleneck.numpy()
one_hot_labels = tf.keras.utils.to_categorical(y_test)
intervention_accs = []

# And time to make interventions starting with NO interventions up to
# intervening on all `interveneable_concepts` concepts.
for num_intervened_concepts in range(0, interveneable_concepts + 1):
    # We will average interventions over `intervention_trials`  random interventions
    # with `num_intervened_concepts` being intervened to get an estimate of the
    # effect of interveningon `num_intervened_concepts` randomly selected concepts
    intervention_trials = 5
    avg = 0.0
    for _ in range(intervention_trials):
        # For each trial, randomly select `num_intervened_concepts` concepts out
        # of the set of concepts we considered strongly aligned
        current_sel = np.random.permutation(
            list(range(len(selected_concepts_idxs)))
        )[:num_intervened_concepts]

        # Look at the ground truth concepts that correspond to these learnt
        # concepts
        fixed_used_concept_idxs = selected_concepts_idxs[current_sel]
        real_corr_concept_idx = best_concept_alignment[current_sel]

        # And update the bottleneck accordingly
        new_test_bottleneck = test_bottleneck[:, :]
        # We need to figure out the "direction" of the intervention:
        #     There is not reason why a learnt concept aligned such that its
        #     corresponding ground truth concept is high when the learnt concept
        #     is high. Because they are binary, it could perfectly be the case
        #     that the alignment happend with the complement.
        for learnt_concept_idx, real_concept_idx in zip(
            fixed_used_concept_idxs,
            real_corr_concept_idx,
        ):
            correlation = np.corrcoef(
                train_concept_scores[:, learnt_concept_idx],
                c_train[:, real_concept_idx],
            )[0, 1]
            pos_score = np.percentile(
                train_concept_scores[:, learnt_concept_idx],
                95
            )
            neg_score = np.percentile(
                train_concept_scores[:, learnt_concept_idx],
                5
            )
            if correlation > 0:
                # Then this is a positive alignment
                new_test_bottleneck[:, learnt_concept_idx] = \
                    c_test[:, real_concept_idx] * pos_score + (
                        (1 - c_test[:, real_concept_idx]) * neg_score
                    )
            else:
                # Else we are aligned with the complement
                new_test_bottleneck[:, learnt_concept_idx] =  \
                    (1 - c_test[:, real_concept_idx]) * pos_score + (
                        c_test[:, real_concept_idx] * neg_score
                    )

        # and time to compute the accuracy with the updated bottleneck:
        partial_acc = sklearn.metrics.accuracy_score(
            y_test,
            np.argmax(
                scipy.special.softmax(
                    tabcbm.from_bottleneck(new_test_bottleneck),
                    axis=-1,
                ),
                axis=-1
            ),
        )
        avg += partial_acc
    avg = avg / intervention_trials
    intervention_accs.append(avg)
    print(
        f"\tIntervention accuracy after intervening on {num_intervened_concepts} "
        f"concepts (thresh = {thresh} with "
        f"{interveneable_concepts} interveneable concepts): "
        f"{avg * 100:.2f}%"
    )

Number of concepts we will intervene on is 2/2
	Intervention accuracy after intervening on 0 concepts (thresh = 0.85 with 2 interveneable concepts): 95.93%
	Intervention accuracy after intervening on 1 concepts (thresh = 0.85 with 2 interveneable concepts): 99.57%
	Intervention accuracy after intervening on 2 concepts (thresh = 0.85 with 2 interveneable concepts): 100.00%
