Implementation of Window-Based Comparison (WBC) - a membership inference attack against fine-tuned Large Language Models using localized window-based analysis.
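The exact scoring rule lives in `attacks/wbc.py`; as a generic illustration of localized window-based analysis, one can slide fixed-length windows over a sample's per-token losses and summarize each window. The min-of-window-means aggregation below is an illustrative assumption, not necessarily the repository's statistic:

```python
def window_scores(token_losses, window_lengths):
    """For each window length, slide over the per-token losses and
    record the mean loss of the lowest-loss window.

    Illustrative aggregation only; the actual WBC statistic is
    defined in attacks/wbc.py.
    """
    scores = {}
    for w in window_lengths:
        if w > len(token_losses):
            continue  # window longer than the sequence
        means = [
            sum(token_losses[i:i + w]) / w
            for i in range(len(token_losses) - w + 1)
        ]
        scores[w] = min(means)  # best-"memorized" window
    return scores

# Toy per-token losses: a low-loss region in positions 1-3.
losses = [2.1, 0.3, 0.2, 0.4, 1.8, 1.9]
print(window_scores(losses, [2, 3, 4]))
```

The intuition is that memorization can be concentrated in short spans, so a local window statistic can separate members from non-members even when the full-sequence loss does not.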
```bash
# Clone repository
git clone https://github.com/Stry233/WBC
cd WBC

# Create environment
python -m venv venv
source venv/bin/activate

# Install dependencies
pip install -r requirements.txt
```

Create balanced member/non-member splits from a HuggingFace dataset:
```bash
python dataset/prep.py \
    --dataset_name "HuggingFaceTB/cosmopedia" \
    --config "khanacademy" \
    --num_samples 20000 \
    --min_length 512 \
    --output_dir "cosmopedia-khanacademy-subset"
```

This creates `train.json` (members) and `test.json` (non-members) in the output directory.
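The record schema is defined by `dataset/prep.py`; assuming each split is a JSON list of objects with a `text` field (an assumption worth verifying against that script), the splits can be loaded like this, demonstrated here on a throwaway file in place of the real `train.json`:

```python
import json
import os
import tempfile

# Assumed record schema: a JSON list of {"text": ...} objects
# (hypothetical; verify against dataset/prep.py).
def load_split(path):
    with open(path) as f:
        return [r["text"] for r in json.load(f)]

# Demo with a temporary stand-in for train.json / test.json.
tmp = tempfile.mkdtemp()
path = os.path.join(tmp, "train.json")
with open(path, "w") as f:
    json.dump([{"text": "member sample 1"}, {"text": "member sample 2"}], f)

members = load_split(path)
```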
Fine-tune a model on the member data:
```bash
python trainer/get_target.py \
    --config_path configs/config_all.yaml \
    --base_path ./weights \
    --train_subset_size 10000 \
    --ref_subset_size 10000
```

To generate the YAML file for your setup, modify and run `trainer/configs/prep.py` following the instructions in that file.
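All of the attacks below ultimately score a sample from the target model's per-token losses, i.e. next-token cross-entropy. The quantity itself is easy to state; the sketch below computes it from toy logits standing in for a causal LM's real outputs:

```python
import math

def per_token_losses(logits, token_ids):
    """Next-token negative log-likelihood at each position.

    logits[i] scores the prediction of token_ids[i + 1]; here these
    are toy lists standing in for a causal LM's output tensor.
    """
    losses = []
    for i, target in enumerate(token_ids[1:]):
        row = logits[i]
        # log-softmax: log sum(exp(row)) - row[target]
        log_z = math.log(sum(math.exp(x) for x in row))
        losses.append(log_z - row[target])
    return losses

toy_logits = [[0.0, 0.0], [2.0, 0.0]]  # vocab of 2, two prediction steps
toy_ids = [0, 1, 0]
losses = per_token_losses(toy_logits, toy_ids)
```

With real models you would obtain the logits from the fine-tuned target (and, for reference-based attacks, from the reference model) rather than hand-written lists.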
Execute WBC and baseline attacks:
```bash
python run.py \
    --config configs/config_all.yaml \
    --output results/ \
    --base-dir ./weights \
    --seed 42
```

Example `configs/config_all.yaml`:

```yaml
global:
  target_model: "./path/to/target"
  reference_model_path: "EleutherAI/pythia-2.8b"
  datasets:
    - json_train_path: "data/train.json"
      json_test_path: "data/test.json"
  batch_size: 1
  max_length: 512
  fpr_thresholds: [0.1, 0.01, 0.001]
  n_bootstrap_samples: 100

# WBC attack settings
Wbc:
  module: "wbc"
  reference_model_path: "EleutherAI/pythia-2.8b"
  context_window_lengths: [2, 3, 4, 6, 9, 13, 18, 25, 32, 40]
```

Enable/disable attacks by commenting them out in `configs/config_all.yaml`:
```yaml
# Reference-free attacks
loss:
  module: loss
zlib:
  module: zlib

# Reference-based attacks
ratio:
  module: ratio
  reference_model_path: "EleutherAI/pythia-2.8b"

# Our method
Wbc:
  module: "wbc"
  # ... configuration
```

To prepare a custom dataset, point `dataset/prep.py` at it:

```bash
python dataset/prep.py \
    --dataset_name "your_dataset" \
    --text_column "text" \
    --split "train" \
    --num_samples 20000 \
    --min_length 512 \
    --tokenizer_name "EleutherAI/pythia-2.8b"
```

To add a new attack, create a file in `attacks/`:
```python
from attacks import AbstractAttack

class YourAttack(AbstractAttack):
    def __init__(self, name, model, tokenizer, config, device):
        super().__init__(name, model, tokenizer, config, device)

    def _process_batch(self, batch):
        # Implement your attack logic
        scores = compute_membership_scores(batch)
        return {self.name: scores}
```

Then add it to `configs/config_all.yaml`:
```yaml
your_attack:
  module: your_attack
  # your parameters
```

The attack produces:
- **Metadata file**: `metadata_[timestamp]_[config].pkl`, containing:
  - Attack scores for all methods
  - Ground truth labels
  - AUC and TPR metrics
  - Configuration details
- **Console output**: a results table with AUC and TPR@FPR metrics
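These metrics can be recomputed from the saved scores and labels. The key names inside the `.pkl` are not documented here, so this dependency-free sketch takes raw score/label arrays directly, assuming the convention that a higher score means "more likely a member":

```python
def auc_and_tpr_at_fpr(scores, labels, fpr_threshold):
    """ROC AUC via pairwise comparison, plus TPR at a fixed FPR.

    Assumed convention: higher score => more likely a member (label 1).
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # AUC = P(member score > non-member score); ties count half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    auc = wins / (len(pos) * len(neg))
    # TPR@FPR: threshold set so at most fpr_threshold of non-members pass.
    k = int(fpr_threshold * len(neg))
    thresh = sorted(neg, reverse=True)[k] if k < len(neg) else min(neg) - 1
    tpr = sum(p > thresh for p in pos) / len(pos)
    return auc, tpr

# Toy scores: one member (0.35) scores below one non-member (0.4).
scores = [0.9, 0.8, 0.35, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
auc, tpr = auc_and_tpr_at_fpr(scores, labels, fpr_threshold=0.25)
```

In practice you would use a library ROC implementation; the point here is only what the reported numbers mean.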
Repository layout:

```
├── attacks/              # Attack implementations
│   ├── wbc.py            # WBC attack implementation
│   └── misc/
│       └── utils.py      # Loss computation utilities
├── trainer/              # Model fine-tuning
│   ├── get_target.py     # Main training script
│   └── configs/          # Training configurations
├── configs/              # Attack configurations
├── dataset/              # Dataset preparation
├── scripts/              # Automation scripts
├── run.py                # Main attack runner
└── utils.py              # Shared utilities
```
- **GPU Memory**:
  - Minimum: 8GB (for Pythia-160M)
  - Recommended: 40GB
- **Disk Space**: ~5GB per fine-tuned model
- **Python Packages**: see `requirements.txt`
TBD.