### Random Baseline for CEBaB

This is an interesting baseline: for each model architecture, we take a randomly initialized model and evaluate its CEBaB score. This differs from the `RandomExplainer` mentioned in the paper, because here the model itself has random weights.

This script randomly initializes the different models and saves them to disk for evaluation.

**Note**: There are two ways to obtain a random baseline model: (1) take the pretrained weights, which are really bad at classifying things, or (2) use a randomly initialized model. For the `LSTM` model this is a little tricky, but I don't think it makes much difference in this case.
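
To make option (2) concrete, here is a minimal sketch of a seeded random initialization. It uses a toy `torch.nn.Linear` layer as a stand-in for the IIT model classes, and `random_init` is a hypothetical helper that does not exist in this repository. The only point is that seeding torch before construction makes the "random" weights reproducible for a given seed.

```python
import torch
from torch import nn


def random_init(seed: int, in_dim: int = 8, num_labels: int = 5) -> nn.Linear:
    """Build a freshly initialized layer after seeding torch's global RNG.

    Hypothetical helper: the toy layer stands in for a full model.
    """
    torch.manual_seed(seed)
    return nn.Linear(in_dim, num_labels)


a = random_init(42)
b = random_init(42)
c = random_init(66)

# Same seed gives identical weights; a different seed gives different ones.
same_seed_match = torch.equal(a.weight, b.weight)
diff_seed_match = torch.equal(a.weight, c.weight)
print(same_seed_match, diff_seed_match)
```

Without the `torch.manual_seed` call, the seed would only label the run and would not determine the weights.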

In [1]:
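import itertools
import torch
from torch import nn
from transformers import AutoConfig, AutoTokenizer

# Project-local helpers and the IIT model wrappers.
from libs import *
from modelings.modelings_bert import *
from modelings.modelings_roberta import *
from modelings.modelings_gpt2 import *
from modelings.modelings_lstm import *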

In [2]:
"""
The following blocks create and save baseline models for
all the combinations of the following conditions.
"""
grid = {
    "seed": [42, 66, 77],
    "class_num": [5],
    "model_arch": ["bert-base-uncased"],
}

keys, values = zip(*grid.items())
permutations_dicts = [dict(zip(keys, v)) for v in itertools.product(*values)]
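
The last two lines of this cell expand the grid into one dictionary per combination. As a small sketch of what that produces, the example below uses a hypothetical, larger grid (two seeds and two architectures) so that the Cartesian product is visible. The values are for illustration only and are not the settings used in this notebook.

```python
import itertools

# Hypothetical grid for illustration: two seeds and two architectures.
grid = {
    "seed": [42, 66],
    "class_num": [5],
    "model_arch": ["bert-base-uncased", "lstm"],
}

# Split the grid into parallel tuples of names and candidate value lists.
keys, values = zip(*grid.items())
# itertools.product yields every combination; zip each back onto the names.
permutations_dicts = [dict(zip(keys, v)) for v in itertools.product(*values)]

for params in permutations_dicts:
    print(params)
```

With the grid defined in the cell above (three seeds, one class count, one architecture), the same expansion yields three dictionaries, one per seed.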

#### Random Baseline

In [4]:
for params in permutations_dicts:
    seed = params["seed"]
    class_num = params["class_num"]
    model_arch = params["model_arch"]
    # Seed torch so that each random initialization is reproducible.
    torch.manual_seed(seed)
    # NOTE: h_dim and interchange_layer are only set in the BERT branch; the
    # other branches need their own values before they are added to the grid.
    if model_arch == "bert-base-uncased":
        model_path = "BERT-baseline-random"
        interchange_layer = 10
        h_dim = 192
    elif model_arch == "roberta-base":
        model_path = "RoBERTa-baseline-random"
    elif model_arch == "gpt2":
        model_path = "gpt2-baseline-random"
    elif model_arch == "lstm":
        model_path = "lstm-baseline-random"

    output_dir = f'../proxy_training_results/{model_path}/'\
                 f'cebab.alpha.0.0.beta.0.0.gemma.0.0.'\
                 f'lr.8e-05.dim.{h_dim}.hightype.{model_arch}.'\
                 f'CEBaB.cls.dropout.0.1.enc.dropout.0.1.counter.type.'\
                 f'approximate.k.0.int.layer.{interchange_layer}.'\
                 f'seed_{seed}/'
    print("outputting to: ", output_dir)
    
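    # The LSTM branch reuses the bert-base-uncased config and tokenizer;
    # every other architecture loads its own by name.
    config_name = None
    tokenizer_name = None
    if model_arch == "lstm":
        config_name = "bert-base-uncased"
        tokenizer_name = "bert-base-uncased"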
        
    config = AutoConfig.from_pretrained(
        config_name if config_name else model_arch,
        num_labels=class_num,
        cache_dir="../huggingface_cache/",
    )
    config.intervention_h_dim = h_dim
    config.interchange_hidden_layer = interchange_layer
    
    tokenizer = AutoTokenizer.from_pretrained(
        tokenizer_name if tokenizer_name else model_arch,
        cache_dir="../huggingface_cache/",
        use_fast=True,
    )
    
    if "bert-base-uncased" in model_arch:
        model_serving_module = IITBERTForSequenceClassification
    elif "gpt2" in model_arch:
        model_serving_module = IITGPT2ForSequenceClassification
    elif "roberta" in model_arch:
        model_serving_module = IITRobertaForSequenceClassification
    elif "lstm" in model_arch:
        model_serving_module = IITLSTMForSequenceClassification
        config.update_embeddings=False
        config.bidirectional=True
        config.num_hidden_layers=1
        config.hidden_size=300
    model = model_serving_module(
        config=config,
    )
    if "lstm" in model_arch:
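        # Load the fastText embedding file only to read its shape, then replace
        # the word embeddings with a randomly initialized matrix of that shape.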
        fasttext_embeddings = torch.load("../eval_pipeline/customized_models/lstm/embeddings.bin")
        model.lstm.embeddings.word_embeddings.weight.data = nn.Embedding(
            fasttext_embeddings.shape[0], fasttext_embeddings.shape[1]
        ).weight.data
    # some post-editing for customized models.
    if model_arch == "gpt2":
        # Define a padding token
        model.config.pad_token_id = tokenizer.pad_token_id
    model.save_pretrained(
        output_dir,
    )

outputting to:  ../proxy_training_results/BERT-baseline-random/cebab.alpha.0.0.beta.0.0.gemma.0.0.lr.8e-05.dim.192.hightype.bert-base-uncased.CEBaB.cls.dropout.0.1.enc.dropout.0.1.counter.type.approximate.k.0.int.layer.10.seed_42/
outputting to:  ../proxy_training_results/BERT-baseline-random/cebab.alpha.0.0.beta.0.0.gemma.0.0.lr.8e-05.dim.192.hightype.bert-base-uncased.CEBaB.cls.dropout.0.1.enc.dropout.0.1.counter.type.approximate.k.0.int.layer.10.seed_66/
outputting to:  ../proxy_training_results/BERT-baseline-random/cebab.alpha.0.0.beta.0.0.gemma.0.0.lr.8e-05.dim.192.hightype.bert-base-uncased.CEBaB.cls.dropout.0.1.enc.dropout.0.1.counter.type.approximate.k.0.int.layer.10.seed_77/
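
The directory name encodes the run settings, and the evaluation code presumably locates models by this exact string, so it helps to build it in one place. The sketch below wraps the f-string from the cell above in a hypothetical helper, `build_output_dir`, which is not part of the repository. The fixed fields are copied verbatim from the notebook, including the `gemma` spelling.

```python
def build_output_dir(model_path, model_arch, h_dim, interchange_layer, seed, lr="8e-05"):
    """Assemble the results directory name used by the cells in this notebook.

    Hypothetical helper; the fixed fields are copied from the notebook's f-string.
    """
    return (
        f"../proxy_training_results/{model_path}/"
        f"cebab.alpha.0.0.beta.0.0.gemma.0.0."
        f"lr.{lr}.dim.{h_dim}.hightype.{model_arch}."
        f"CEBaB.cls.dropout.0.1.enc.dropout.0.1.counter.type."
        f"approximate.k.0.int.layer.{interchange_layer}."
        f"seed_{seed}/"
    )


random_dir = build_output_dir("BERT-baseline-random", "bert-base-uncased", 192, 10, 42)
blackbox_dir = build_output_dir(
    "BERT-baseline-blackbox", "bert-base-uncased", 192, 10, 42, lr="8e-5"
)
print(random_dir)
print(blackbox_dir)
```

Note that the black-box cell in this notebook writes `lr.8e-5` while the random cell writes `lr.8e-05`, which is why the learning-rate field is a parameter here rather than a constant.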


#### Black-box Baseline

In [6]:
for params in permutations_dicts:
    seed = params["seed"]
    class_num = params["class_num"]
    model_arch = params["model_arch"]
    h_dim = 75 if "lstm" in model_arch else 192
    # NOTE: interchange_layer is only set in the BERT branch; the other
    # branches need their own value before they are added to the grid.
    if model_arch == "bert-base-uncased":
        model_path = "BERT-baseline-blackbox"
        interchange_layer = 10
    elif model_arch == "roberta-base":
        model_path = "RoBERTa-baseline-blackbox"
    elif model_arch == "gpt2":
        model_path = "gpt2-baseline-blackbox"
    elif model_arch == "lstm":
        model_path = "lstm-baseline-blackbox"
        
    blackbox_model_path = f'../saved_models/{model_arch}.opentable.CEBaB.sa.'\
                          f'{class_num}-class.exclusive.seed_{seed}'
    output_dir = f'../proxy_training_results/{model_path}/'\
                 f'cebab.alpha.0.0.beta.0.0.gemma.0.0.'\
                 f'lr.8e-5.dim.{h_dim}.hightype.{model_arch}.'\
                 f'CEBaB.cls.dropout.0.1.enc.dropout.0.1.counter.type.'\
                 f'approximate.k.0.int.layer.{interchange_layer}.'\
                 f'seed_{seed}/'
    
    print("outputting to: ", output_dir)
    config = AutoConfig.from_pretrained(
        blackbox_model_path,
        num_labels=class_num,
        cache_dir="../huggingface_cache/",
    )
    config.intervention_h_dim = h_dim
    config.interchange_hidden_layer = interchange_layer
    
    tokenizer = AutoTokenizer.from_pretrained(
        blackbox_model_path,
        cache_dir="../../huggingface_cache/",
        use_fast=True,
    )

    if "bert-base-uncased" in model_arch:
        model_serving_module = IITBERTForSequenceClassification
    elif "gpt2" in model_arch:
        model_serving_module = IITGPT2ForSequenceClassification
    elif "roberta" in model_arch:
        model_serving_module = IITRobertaForSequenceClassification
    elif "lstm" in model_arch:
        model_serving_module = IITLSTMForSequenceClassification
        config.update_embeddings=False
        config.bidirectional=True
        config.num_hidden_layers=1
        config.hidden_size=300
    model = model_serving_module.from_pretrained(
        blackbox_model_path,
        config=config,
        cache_dir="../../huggingface_cache"
    )
    # some post-editing for customized models.
    if model_arch == "gpt2":
        # Define a padding token
        model.config.pad_token_id = tokenizer.pad_token_id

    model.save_pretrained(
        output_dir,
    )

outputting to:  ../proxy_training_results/BERT-baseline-blackbox/cebab.alpha.0.0.beta.0.0.gemma.0.0.lr.8e-5.dim.192.hightype.bert-base-uncased.CEBaB.cls.dropout.0.1.enc.dropout.0.1.counter.type.approximate.k.0.int.layer.10.seed_42/


Some weights of IITBERTForSequenceClassification were not initialized from the model checkpoint at ../saved_models/bert-base-uncased.opentable.CEBaB.sa.5-class.exclusive.seed_42 and are newly initialized: ['multitask_classifier.out_proj.weight', 'multitask_classifier.dense.bias', 'multitask_classifier.dense.weight', 'multitask_classifier.out_proj.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


outputting to:  ../proxy_training_results/BERT-baseline-blackbox/cebab.alpha.0.0.beta.0.0.gemma.0.0.lr.8e-5.dim.192.hightype.bert-base-uncased.CEBaB.cls.dropout.0.1.enc.dropout.0.1.counter.type.approximate.k.0.int.layer.10.seed_66/


Some weights of IITBERTForSequenceClassification were not initialized from the model checkpoint at ../saved_models/bert-base-uncased.opentable.CEBaB.sa.5-class.exclusive.seed_66 and are newly initialized: ['multitask_classifier.out_proj.weight', 'multitask_classifier.dense.bias', 'multitask_classifier.dense.weight', 'multitask_classifier.out_proj.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


outputting to:  ../proxy_training_results/BERT-baseline-blackbox/cebab.alpha.0.0.beta.0.0.gemma.0.0.lr.8e-5.dim.192.hightype.bert-base-uncased.CEBaB.cls.dropout.0.1.enc.dropout.0.1.counter.type.approximate.k.0.int.layer.10.seed_77/


Some weights of IITBERTForSequenceClassification were not initialized from the model checkpoint at ../saved_models/bert-base-uncased.opentable.CEBaB.sa.5-class.exclusive.seed_77 and are newly initialized: ['multitask_classifier.out_proj.weight', 'multitask_classifier.dense.bias', 'multitask_classifier.dense.weight', 'multitask_classifier.out_proj.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
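
The warnings above say that the `multitask_classifier` parameters exist in the IIT wrapper but not in the saved black-box checkpoint, so they are left at their fresh initialization. The sketch below reproduces that situation with plain PyTorch and `load_state_dict(strict=False)` rather than Hugging Face's `from_pretrained`. `Checkpointed` and `Wrapper` are hypothetical toy modules, not classes from this repository.

```python
import torch
from torch import nn


class Checkpointed(nn.Module):
    """Toy stand-in for the saved black-box model: a single classifier layer."""

    def __init__(self):
        super().__init__()
        self.classifier = nn.Linear(4, 5)


class Wrapper(nn.Module):
    """Toy stand-in for the IIT wrapper: same classifier plus an extra head."""

    def __init__(self):
        super().__init__()
        self.classifier = nn.Linear(4, 5)
        self.multitask_classifier = nn.Linear(4, 2)


checkpoint = Checkpointed().state_dict()
wrapper = Wrapper()

# strict=False loads the matching tensors and reports the rest instead of raising.
result = wrapper.load_state_dict(checkpoint, strict=False)
print(result.missing_keys)
```

In this toy setting the shared `classifier` weights are copied from the checkpoint, while the keys reported as missing keep their random initial values.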
