UCSC-REAL/SCPL

[ACL 2026 Main] Observations and Remedies for Large Language Model Bias in Self-Consuming Performative Loop (SCPL)

Official implementation for:

Observations and Remedies for Large Language Model Bias in Self-Consuming Performative Loop
Yaxuan Wang, Zhongteng Cai, Yujia Bao, Xueru Zhang, Yang Liu
arXiv:2601.05184

This repository studies bias dynamics in self-consuming loops and provides training, data generation/selection, and mitigation pipelines across multiple tasks (News, NuminaMath, and Preference).

What Is Included

  • Performative incremental finetuning and retraining loops.
  • Seven data-generation variants used in the paper.
  • Mitigation methods (VRS, TPP, TOP, and our reweight strategy).
  • DPO experiment scripts for News and Numina.
  • Reproducible run scripts under runs/.

Installation

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt

Optional (recommended) cache setup:

export SCPL_CACHE_DIR=/path/to/hf_cache
export HF_HOME=$SCPL_CACHE_DIR

Data Requirements

News

Expected source files:

  • data/news/right_news.json
  • data/news/left_news.json
  • data/news/right_news_test.json
  • data/news/left_news_test.json
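
A quick way to confirm these files are in place before launching a News run (a minimal sketch; the paths are exactly the four listed above):

```shell
# Verify the four expected News source files exist before a run.
NEWS_FILES="data/news/right_news.json data/news/left_news.json \
data/news/right_news_test.json data/news/left_news_test.json"
missing=0
for f in $NEWS_FILES; do
  [ -f "$f" ] || { echo "missing: $f"; missing=$((missing + 1)); }
done
echo "$missing missing file(s)"
```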

Preference (Round-0 initialization)

Round-0 preference data is built by init_preference_data.py from Hugging Face datasets (defaults are configured in the script).

NuminaMath

Numina round-0 data is prepared by init_numina_data.py.

Bias Classifier Preparation

Before running News bias evaluation, train the bias classifier and make sure the checkpoint path configured in evaluate_bias_vllm.py points to the generated artifact.

python bias_classifier.py
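
Before evaluation, it can help to sanity-check that the configured checkpoint is actually on disk. A sketch; the default path below is hypothetical, so substitute whatever bias_classifier.py wrote out:

```shell
# CLASSIFIER_CKPT is a hypothetical placeholder; point it at the artifact
# produced by bias_classifier.py and referenced in evaluate_bias_vllm.py.
CLASSIFIER_CKPT="${CLASSIFIER_CKPT:-exps/bias_classifier}"
if [ -e "$CLASSIFIER_CKPT" ]; then
  echo "found classifier checkpoint: $CLASSIFIER_CKPT"
else
  echo "checkpoint not found: $CLASSIFIER_CKPT (update evaluate_bias_vllm.py)"
fi
```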

Main Experiment Settings and Variants

Two settings

  1. Incremental finetuning loop
  2. Retraining loop

Seven variants

  • real_only_no_dynamic
  • syn_only_no_dynamic
  • real_dynamic
  • syn_dynamic
  • syn_dynamic_fr (fixed-ratio dynamic synthetic)
  • real_dynamic_accumulation
  • syn_dynamic_accumulation
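
The RUN_ALL=1 commands in the next section are equivalent to iterating the seven variants yourself (a sketch, assuming the run scripts read VARIANT from the environment; printed here as a dry run):

```shell
# Dry run: print one launch command per variant (drop the echo to execute).
VARIANTS="real_only_no_dynamic syn_only_no_dynamic real_dynamic syn_dynamic \
syn_dynamic_fr real_dynamic_accumulation syn_dynamic_accumulation"
for v in $VARIANTS; do
  echo "VARIANT=$v MAX_ROUND=3 bash runs/news/finetuning.sh"
done
```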

Quick Commands

News (finetuning/retraining)

# One variant
VARIANT=syn_dynamic MAX_ROUND=3 bash runs/news/finetuning.sh
VARIANT=syn_dynamic MAX_ROUND=3 bash runs/news/retrain.sh

# All 7 variants
RUN_ALL=1 MAX_ROUND=3 bash runs/news/finetuning.sh
RUN_ALL=1 MAX_ROUND=3 bash runs/news/retrain.sh

News mitigation

# finetuning mode (default)
METHOD=reweight MAX_ROUND=3 bash runs/mitigation/news_mitigation.sh
RUN_ALL=1 MAX_ROUND=3 bash runs/mitigation/news_mitigation.sh

# retraining mode
TRAIN_MODE=retrain RUN_ALL=1 MAX_ROUND=3 bash runs/mitigation/news_mitigation.sh
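
To sweep the mitigation methods manually, a loop like the following works, assuming the METHOD names match those accepted by the script; only reweight is confirmed by the examples above, and the lowercase vrs/tpp/top spellings are guesses (printed as a dry run):

```shell
# Hypothetical method names except "reweight"; check news_mitigation.sh
# for the exact spellings before dropping the echo.
METHODS="vrs tpp top reweight"
for m in $METHODS; do
  echo "METHOD=$m MAX_ROUND=3 bash runs/mitigation/news_mitigation.sh"
done
```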

NuminaMath

VARIANT=syn_dynamic MAX_ROUND=3 bash runs/numina/finetuning.sh
VARIANT=syn_dynamic MAX_ROUND=3 bash runs/numina/retrain.sh
RUN_ALL=1 MAX_ROUND=3 bash runs/numina/finetuning.sh

Preference (training + next-round data generation only)

Preference evaluation is handled by an external codebase: GAIR-NLP/Preference-Dissection

# finetuning / retraining loops
VARIANT=syn_dynamic MAX_ROUND=5 bash runs/preference/finetuning.sh
RUN_ALL=1 MAX_ROUND=5 bash runs/preference/retrain.sh

# mitigation
METHOD=reweight TRAIN_MODE=finetune bash runs/preference/mitigation.sh

# Evaluation lives in the external repository: see `Preference-Dissection/visualization/evaluate.sh`.

DPO

Use the scripts under runs/dpo/.

Multi-source

bash runs/multi-source.sh

Minimal end-to-end demo

bash runs/easy_run.sh

Main Entry Points

  • finetune_news.py: News SFT loop training.
  • finetune_harm.py: training entry for Numina/Preference (and related tasks).
  • news_selection_vllm.py: News next-round data generation and sampling.
  • data_selection.py: Numina/Preference data selection and generation.
  • train_dpo.py: DPO training entry.
  • evaluate_bias_vllm.py: News bias evaluation.
  • init_news_data.py, init_numina_data.py, init_preference_data.py: round-0 data initialization.

Repository Layout

  • runs/: clean runnable experiment scripts.
  • scripts/: original/legacy experiment scripts and references.
  • config/: Hydra configs and model settings.
  • dpo/: DPO utilities.
  • data/: input datasets.
  • docs/: experiment and repository documentation.
  • exps/: generated outputs (checkpoints, per-round data, logs).

Additional Documentation

  • docs/README.md: documentation index.
  • docs/acl_camera_ready_repro_commands.md: one-page camera-ready reproducibility commands.
  • docs/repository_guide.md: detailed repository guide (workflow, modules, directories, outputs).

Notes

  • scripts/ is kept for traceability and may include environment-specific assumptions.
  • exps/ is intentionally gitignored.
  • For public release, prefer running and reporting from runs/.

Citation

@article{wang2026observations,
  title={Observations and Remedies for Large Language Model Bias in Self-Consuming Performative Loop},
  author={Wang, Yaxuan and Cai, Zhongteng and Bao, Yujia and Zhang, Xueru and Liu, Yang},
  journal={arXiv preprint arXiv:2601.05184},
  year={2026}
}
