personalization-trap

Personalization Trap

This repository is an anonymous, sata-bench-style scaffold for evaluating memory-conditioned LLM behavior on a demographic fairness dataset and fitting random-effects models over the resulting outcomes described in The Personalization Trap: How User Memory Alters Emotional Reasoning in LLMs.

Repository Structure

src/personalization_trap/
├── evaluation/
│   ├── dataset/          # dataset loading and preparation
│   └── metrics/          # accuracy / flip-rate helpers
├── methods/
│   ├── inference/        # vLLM inference pipeline
│   ├── random_effects/   # mixed-effects analysis
│   └── utils/            # prompt / extraction / IO helpers
├── configs/              # example TOML configs
scripts/
├── run_inference.py
└── run_random_effects.py
tests/

What Is Included

Standard inference code for groupfairnessllm/random_effect_example.
vLLM inference over system_prompt + user_prompt, with support for open models such as Qwen/Qwen3-4B.
A second-stage extraction pipeline to normalize raw generations into scored labels, with multiple extraction logics:
- choice_letter
- yes_no
- regex
- json_field
- label_map
- llm_extract
Random-effects modeling to estimate how gender, age, religion, and ethnicity influence correctness.

Quick Start

python -m venv .venv
source .venv/bin/activate
pip install -e .

Run Inference

python scripts/run_inference.py \
  --dataset groupfairnessllm/random_effect_example \
  --split train \
  --model Qwen/Qwen3-4B \
  --output-dir outputs/qwen3_4b \
  --extractor choice_letter \
  --extraction-prompt-key steu_choice \
  --tensor-parallel-size 1

Fit The Random-Effects Model

python scripts/run_random_effects.py \
  --input outputs/qwen3_4b/predictions.csv \
  --output-dir outputs/qwen3_4b/random_effects \
  --group-col question_id

Expected Dataset Columns

The inference pipeline expects at least:

system_prompt
user_prompt
question_id
gold_label
gender
age
religion
ethnicity

If your dataset uses different names, pass --column-map with a TOML/JSON config or update the CLI flags.

Example:

python scripts/run_inference.py \
  --dataset groupfairnessllm/random_effect_example \
  --split train \
  --model Qwen/Qwen3-4B \
  --output-dir outputs/qwen3_4b \
  --column-map src/personalization_trap/configs/column_map.example.toml

Notes

The project is intentionally anonymous and does not include paper-identifying metadata.
The mixed-effects implementation uses statsmodels with a binomial mixed model and question-level random intercepts.
The code is designed to be extended for both emotional understanding and recommendation-style tasks.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly