## Setup

### Clone the [Repo](https://github.com/neil-dandekar/capstone)

In [1]:
!git clone https://github.com/neil-dandekar/capstone.git

fatal: destination path 'capstone' already exists and is not an empty directory.


In [2]:
import os
os.chdir("/content/capstone/classification")

In [3]:
!pip install -r requirements.txt



### Clone Checkpoints

In [4]:
!git lfs install
!git clone https://huggingface.co/cesun/cbllm-classification temp_repo
!mv temp_repo/mpnet_acs .
!rm -rf temp_repo

Updated git hooks.
Git LFS initialized.
Cloning into 'temp_repo'...
remote: Enumerating objects: 138, done.
remote: Counting objects: 100% (3/3), done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 138 (delta 0), reused 0 (delta 0), pack-reused 135 (from 1)
Receiving objects: 100% (138/138), 20.01 KiB | 20.01 MiB/s, done.
Filtering content: 100% (117/117), 9.05 GiB | 156.63 MiB/s, done.
mv: cannot move 'temp_repo/mpnet_acs' to './mpnet_acs': Directory not empty
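
The `mv` error above shows that `mpnet_acs` already existed when the cell ran, so re-running the cell repeats the 9.05 GiB download without replacing anything. A minimal sketch of a guarded version is given below. `is_populated` and `fetch_checkpoints` are hypothetical helper names, not part of the repository, and the `run` parameter is an assumption added only so the logic can be exercised without network access.

```python
import os
import shutil
import subprocess

CHECKPOINT_REPO = "https://huggingface.co/cesun/cbllm-classification"

def is_populated(path):
    """Return True if `path` is a directory with at least one entry."""
    return os.path.isdir(path) and len(os.listdir(path)) > 0

def fetch_checkpoints(dest="mpnet_acs", run=subprocess.check_call):
    """Download the checkpoints only if `dest` is missing or empty.

    Hypothetical helper: mirrors the shell commands in the cell above.
    Returns True if a download happened, False if it was skipped.
    """
    if is_populated(dest):
        print(f"{dest} already present, skipping download")
        return False
    run(["git", "lfs", "install"])
    run(["git", "clone", CHECKPOINT_REPO, "temp_repo"])
    if os.path.isdir(dest):
        os.rmdir(dest)  # dest is empty here; remove it so the move is a rename
    shutil.move(os.path.join("temp_repo", dest), dest)
    shutil.rmtree("temp_repo")
    return True

# Usage in a notebook cell (performs the real download when needed):
# fetch_checkpoints()
```

The same existence check could guard the repository clone in the first cell, which fails for the same reason on a second run.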


## Run Evaluations

The following cells replicate Table 2 of the paper.

The first cell runs a single Table 2 evaluation: the CB-LLM w/ ACC checkpoint for SST2.

In [5]:
!python test_CBLLM.py --cbl_path mpnet_acs/SetFit_sst2/roberta_cbm/cbl_acc.pt

2025-11-08 04:04:31.618276: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2025-11-08 04:04:31.637795: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:467] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1762574671.660282    9279 cuda_dnn.cc:8579] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1762574671.667280    9279 cuda_blas.cc:1407] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
W0000 00:00:1762574671.684813    9279 computation_placer.cc:177] computation placer already registered. Please check linkage and avoid linking 

To cover the full table, we run every evaluation command, parse the accuracy from each output, and display the results in a layout like Table 2.

In [7]:
import subprocess
import json
import pandas as pd
import re

def run(cmd):
    """Run a shell command, echo its stdout, and return the accuracy it reports."""
    print(f"\n=== Running: {cmd} ===\n")
    # check_output raises CalledProcessError if the command exits with a non-zero status
    out = subprocess.check_output(cmd, shell=True, text=True)
    print(out)

    # extract accuracy from patterns like:
    # {'accuracy': 0.9406919275123559}
    match = re.search(r"\{'accuracy':\s*([0-9.]+)\}", out)
    if match:
        return float(match.group(1))
    raise ValueError(f"Could not parse accuracy from output of: {cmd}")

results = []
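
The parsing step can be checked on its own, without running an evaluation. The sketch below applies the same pattern to the sample line quoted in the comment above. `parse_accuracy` is a hypothetical helper name, and returning `None` instead of raising is an assumption made here so a caller can decide how to handle a missing match.

```python
import re

# Same pattern as in run(): matches text such as {'accuracy': 0.94}
ACCURACY_PATTERN = re.compile(r"\{'accuracy':\s*([0-9.]+)\}")

def parse_accuracy(text):
    """Return the first accuracy value found in `text`, or None if absent."""
    match = ACCURACY_PATTERN.search(text)
    return float(match.group(1)) if match else None

# Sample line taken from the comment in run(); surrounding text is ignored.
sample = "some log line\n{'accuracy': 0.9406919275123559}\n"
print(parse_accuracy(sample))
print(parse_accuracy("no metrics here"))
```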

In [8]:
commands = [
    # CB-LLM (no ACC)
    ("CB-LLM", "SST2",        "python test_CBLLM.py --cbl_path mpnet_acs/SetFit_sst2/roberta_cbm/cbl.pt"),
    ("CB-LLM", "YelpP",       "python test_CBLLM.py --cbl_path mpnet_acs/SetFit_yelp_polarity/roberta_cbm/cbl.pt --dataset yelp_polarity"),
    ("CB-LLM", "AGnews",      "python test_CBLLM.py --cbl_path mpnet_acs/SetFit_ag_news/roberta_cbm/cbl.pt --dataset ag_news"),
    ("CB-LLM", "DBpedia",     "python test_CBLLM.py --cbl_path mpnet_acs/SetFit_dbpedia_14/roberta_cbm/cbl.pt --dataset dbpedia_14"),

    # CB-LLM w/ ACC
    ("CB-LLM w/ ACC", "SST2",    "python test_CBLLM.py --cbl_path mpnet_acs/SetFit_sst2/roberta_cbm/cbl_acc.pt"),
    ("CB-LLM w/ ACC", "YelpP",   "python test_CBLLM.py --cbl_path mpnet_acs/SetFit_yelp_polarity/roberta_cbm/cbl_acc.pt --dataset yelp_polarity"),
    ("CB-LLM w/ ACC", "AGnews",  "python test_CBLLM.py --cbl_path mpnet_acs/SetFit_ag_news/roberta_cbm/cbl_acc.pt --dataset ag_news"),
    ("CB-LLM w/ ACC", "DBpedia", "python test_CBLLM.py --cbl_path mpnet_acs/SetFit_dbpedia_14/roberta_cbm/cbl_acc.pt --dataset dbpedia_14"),

    # TBM & C3M
    ("TBM&C3M", "SST2",      "python test_black_box.py --model_path baseline_models/tbmc3m/backbone_finetuned_sst2.pt"),
    ("TBM&C3M", "YelpP",     "python test_black_box.py --model_path baseline_models/tbmc3m/backbone_finetuned_yelp_polarity.pt --dataset yelp_polarity"),
    ("TBM&C3M", "AGnews",    "python test_black_box.py --model_path baseline_models/tbmc3m/backbone_finetuned_ag_news.pt --dataset ag_news"),
    ("TBM&C3M", "DBpedia",   "python test_black_box.py --model_path baseline_models/tbmc3m/backbone_finetuned_dbpedia_14.pt --dataset dbpedia_14"),

    # Roberta black-box baseline
    ("Roberta", "SST2",      "python test_black_box.py --model_path baseline_models/roberta/backbone_finetuned_sst2.pt"),
    ("Roberta", "YelpP",     "python test_black_box.py --model_path baseline_models/roberta/backbone_finetuned_yelp_polarity.pt --dataset yelp_polarity"),
    ("Roberta", "AGnews",    "python test_black_box.py --model_path baseline_models/roberta/backbone_finetuned_ag_news.pt --dataset ag_news"),
    ("Roberta", "DBpedia",   "python test_black_box.py --model_path baseline_models/roberta/backbone_finetuned_dbpedia_14.pt --dataset dbpedia_14"),
]
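
The sixteen entries above follow two path templates, so the list can also be generated instead of typed out. This is a sketch under that assumption. `DATASETS`, `dataset_flag`, and `build_commands` are hypothetical names, and omitting `--dataset` for SST2 simply copies what the commands above do.

```python
# Table label -> dataset name used in paths and in the --dataset flag
DATASETS = {
    "SST2": "sst2",
    "YelpP": "yelp_polarity",
    "AGnews": "ag_news",
    "DBpedia": "dbpedia_14",
}

def dataset_flag(name):
    # The SST2 commands in the hand-written list carry no --dataset flag
    return "" if name == "sst2" else f" --dataset {name}"

def build_commands():
    """Rebuild the (method, dataset, command) tuples from the two templates."""
    commands = []
    # Concept-bottleneck checkpoints, with and without ACC
    for method, ckpt in [("CB-LLM", "cbl.pt"), ("CB-LLM w/ ACC", "cbl_acc.pt")]:
        for label, name in DATASETS.items():
            path = f"mpnet_acs/SetFit_{name}/roberta_cbm/{ckpt}"
            cmd = f"python test_CBLLM.py --cbl_path {path}{dataset_flag(name)}"
            commands.append((method, label, cmd))
    # Baselines evaluated with test_black_box.py
    for method, folder in [("TBM&C3M", "tbmc3m"), ("Roberta", "roberta")]:
        for label, name in DATASETS.items():
            path = f"baseline_models/{folder}/backbone_finetuned_{name}.pt"
            cmd = f"python test_black_box.py --model_path {path}{dataset_flag(name)}"
            commands.append((method, label, cmd))
    return commands

for method, label, cmd in build_commands():
    print(f"{method:14s} {label:8s} {cmd}")
```

Generating the list keeps the method and dataset labels in step with the paths, at the cost of hiding the literal commands from the reader.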

In [None]:
for method, dataset, cmd in commands:
    acc = run(cmd)
    results.append((method, dataset, acc))

df = pd.DataFrame(results, columns=["Method", "Dataset", "Accuracy"])
pivot = df.pivot(index="Method", columns="Dataset", values="Accuracy")

print("\n=== FINAL ACCURACY TABLE ===\n")
print(pivot)


=== Running: python test_CBLLM.py --cbl_path mpnet_acs/SetFit_sst2/roberta_cbm/cbl.pt ===
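
Once the loop has finished, the pivot table may not list methods and datasets in the order used in the `commands` list. The sketch below shows one way to fix the row and column order and to display accuracies as percentages. `to_table` is a hypothetical helper, the percentage format with two decimals is an assumption about presentation, and the numbers are placeholders for illustration only, not results from the paper or from this notebook.

```python
import pandas as pd

# Placeholder accuracies for illustration only; NOT real evaluation results.
placeholder_results = [
    ("CB-LLM", "SST2", 0.5),
    ("CB-LLM", "YelpP", 0.25),
    ("Roberta", "SST2", 0.75),
    ("Roberta", "YelpP", 1.0),
]

def to_table(results, method_order, dataset_order):
    """Pivot (method, dataset, accuracy) tuples into a fixed-order percentage table."""
    df = pd.DataFrame(results, columns=["Method", "Dataset", "Accuracy"])
    pivot = df.pivot(index="Method", columns="Dataset", values="Accuracy")
    # Impose an explicit order instead of relying on the pivot's own ordering
    pivot = pivot.reindex(index=method_order, columns=dataset_order)
    return (pivot * 100).round(2)

table = to_table(placeholder_results, ["CB-LLM", "Roberta"], ["SST2", "YelpP"])
print(table)
```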

