DorLitvak/severance


The Severance Problem

Code and data release for "The Severance Problem: LLMs are Unaware of the Person Beyond the Prompt".

The Severance Schema is a prompt-level structural prior that gives an LLM an explicit inventory of which categories of person-context exist for the user it is serving, even when no personal data is filled in. We show that this single structural change (i) cuts harm and sycophancy across all five model families tested, (ii) recovers most of the safety cost introduced by bullet-style memory (hallucination 3.7–11.7% → 1.7–4.0%), (iii) holds across fill levels from 0% to 100%, and (iv) is the only condition under which a model's clarifying questions translate into a usefulness gain on the next turn.
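
To make "an explicit inventory even when no personal data is filled in" concrete, here is a minimal sketch of how such a prompt block could be rendered. The six dimension names below are hypothetical placeholders, not the schema shipped in `core.py` (the release's `data/claims_by_dimension.json` defines the actual dimensions):

```python
# Illustrative sketch only: these dimension names are hypothetical
# placeholders, not the actual Severance Schema from core.py.
DIMENSIONS = [
    "health", "finances", "relationships",
    "work", "living_situation", "values",
]

def render_severance_schema(profile: dict) -> str:
    """Render the person-context inventory, keeping unfilled slots explicit."""
    lines = ["Person-context inventory for this user:"]
    for dim in DIMENSIONS:
        value = profile.get(dim)
        lines.append(f"- {dim}: {value if value else '[unknown - not provided]'}")
    return "\n".join(lines)

# Even a fully empty profile still yields an explicit structural prior:
print(render_severance_schema({}))
```

The point of the sketch is that the category labels are always present; only the values vary with fill level.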

This release contains everything needed to reproduce the four experiments reported in the paper.


Repository layout

release/
├── core.py                 # Prompts, schema, profile-routing, scenario logic
├── run_all.py              # Single entrypoint: generation, evaluation, tables
├── run_all_batch.py        # Optional: batch-mode judge (Anthropic Batch API)
├── data/                   # Profiles, scenarios, claims (the benchmark)
├── results/                # Pre-computed model outputs + judge scores (134 MB)
├── assets/                 # Bootstrap, table-builder, figure scripts
├── examples/               # Single qualitative transcript (Fig. 4)
├── requirements.txt
└── LICENSE

Setup

pip install -r requirements.txt

# Portkey is used for ALL judging and for Claude/GPT subject generation.
# Set up two virtual keys in your Portkey dashboard (one fronting Anthropic,
# one fronting OpenAI) and export them along with your Portkey API key:
export PORTKEY_API_KEY=...
export PORTKEY_VK_ANTHROPIC=...      # Portkey virtual key for Anthropic
export PORTKEY_VK_OPENAI=...         # Portkey virtual key for OpenAI

# Together is only needed for the open-weight subject models:
export TOGETHER_API_KEY=...
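
Before launching long runs, it can be worth a quick preflight check that the keys above are actually exported. This is a small hypothetical helper, not part of the release:

```python
import os

# The environment variables the release's setup instructions require.
REQUIRED = [
    "PORTKEY_API_KEY",
    "PORTKEY_VK_ANTHROPIC",
    "PORTKEY_VK_OPENAI",
    "TOGETHER_API_KEY",  # only needed for the open-weight subject models
]

def missing_keys(env=os.environ) -> list[str]:
    """Return the required environment variables that are unset or empty."""
    return [k for k in REQUIRED if not env.get(k)]

missing = missing_keys()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All API keys present.")
```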

Models referenced in the paper:

| Slug | Provider |
|------|----------|
| `claude-sonnet-4-20250514` | Portkey/Anthropic |
| `meta-llama/Llama-3.3-70B-Instruct-Turbo` | Together |
| `deepseek-ai/DeepSeek-V3` | Together |
| `google/gemma-4-31B-it` | Together |
| `Qwen/Qwen3-235B-A22B-Instruct-2507-tput` | Together |
| `gpt-5.2-2025-12-11` (judge) | Portkey/OpenAI |

The four experiments

Each experiment is a three-stage pipeline:

  1. generate subject-model responses
  2. evaluate them with one or both judges
  3. tables / figures / bootstrap to produce the paper artifacts

The pre-computed outputs of stages 1–2 are shipped under results/, so a reviewer can skip directly to stage 3 if they trust the API outputs.

1 — exp_outie (Sec. 3.1, Tab. 2, Fig. fig_b_cross_family_6panel)

Five subject models × four conditions (No Schema, Memory, Severance Schema, Severance Schema + Mem.).

# Generate (one command per subject model):
python run_all.py exp_outie --model claude-sonnet-4-20250514 --concurrency 8
python run_all.py exp_outie --model meta-llama/Llama-3.3-70B-Instruct-Turbo --concurrency 8
python run_all.py exp_outie --model deepseek-ai/DeepSeek-V3 --concurrency 8
python run_all.py exp_outie --model google/gemma-4-31B-it --concurrency 8
python run_all.py exp_outie --model Qwen/Qwen3-235B-A22B-Instruct-2507-tput --concurrency 8

# Evaluate (per subject, with both judges):
python run_all.py evaluate exp_outie --model <subject> --judge-model gpt-5.2-2025-12-11
python run_all.py evaluate exp_outie --model <subject> --judge-model claude-sonnet-4-20250514

2 — exp_cal (Sec. 3.2, Fig. fig_c_fill_curve)

Claude Sonnet 4 across five fill levels (0%, 25%, 50%, 75%, 100%) × 2 formats (Memory, Severance Schema).

python run_all.py exp_cal --model claude-sonnet-4-20250514 --concurrency 8
python run_all.py evaluate exp_cal --model claude-sonnet-4-20250514 --judge-model gpt-5.2-2025-12-11

3 — exp_multi_natural (Sec. 3.3, Tab. exp_multi-main)

Claude Sonnet 4, two-turn natural-asking protocol across four conditions. The "natural" suffix marks the protocol used in the paper: turn 1 is the bare scenario question (no instruction to ask anything), and turn 2 re-runs the same scenario with the model's own clarifying questions answered from the profile.

python run_all.py exp_multi_natural \
    --model claude-sonnet-4-20250514 \
    --extractor-model claude-haiku-4-5-20251001 \
    --concurrency 8
python run_all.py evaluate exp_multi_natural --model claude-sonnet-4-20250514 --judge-model gpt-5.2-2025-12-11
python run_all.py evaluate exp_multi_natural --model claude-sonnet-4-20250514 --judge-model claude-sonnet-4-20250514

The paper's exp_multi-main table reports T2 (from this experiment) and Δ = T2 − T1, where T1 is the same condition's row from exp_outie (no asking, no retrieval). assets/build_tables.py does the join automatically.
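
The join is simple in spirit. Here is an illustrative sketch of the Δ = T2 − T1 computation (field names and numbers are made up for illustration; `assets/build_tables.py` defines the real ones):

```python
def usefulness_delta(t1_scores: dict[str, float],
                     t2_scores: dict[str, float]) -> dict[str, float]:
    """Join T1 rows (exp_outie: no asking, no retrieval) with T2 rows
    (exp_multi_natural) by condition and report the gain Δ = T2 − T1."""
    shared = t1_scores.keys() & t2_scores.keys()
    return {cond: t2_scores[cond] - t1_scores[cond] for cond in shared}

# Illustrative numbers only, not the paper's results:
t1 = {"No Schema": 0.5, "Severance Schema": 0.5}
t2 = {"No Schema": 0.5, "Severance Schema": 0.75}
print(usefulness_delta(t1, t2))
```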

4 — exp_ablation (App., Tab. exp-ablation)

Claude Sonnet 4, three ablation arms (Format Only, Content Only, full Severance Schema) plus the No Schema baseline.

python run_all.py exp_ablation --model claude-sonnet-4-20250514 --concurrency 8
python run_all.py evaluate exp_ablation --model claude-sonnet-4-20250514 --judge-model gpt-5.2-2025-12-11

Reproducing the paper's tables and figures

All headline numbers in the paper are produced from the JSONs in results/. No API calls are needed.

# Main paper tables (Tab. 2, exp_cal-main, exp_multi-main, app-ablation, ...):
python assets/build_tables.py

# Per-cell appendix tables and full breakdowns:
python assets/full_tables.py

# Cluster-bootstrap CIs (B = 10,000, clustered on profile × scenario):
python assets/bootstrap.py
# → writes assets/bootstrap_cis.txt (the file the paper transcribes verbatim)
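
For reference, the cluster-bootstrap idea can be sketched as resampling whole profile × scenario clusters with replacement, so within-cluster correlation is preserved. This is a simplified illustration, not the release's `assets/bootstrap.py`:

```python
import random
from collections import defaultdict

def cluster_bootstrap_ci(rows, b=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean score, resampling
    (profile, scenario) clusters with replacement.
    rows: iterable of ((profile, scenario), score) pairs."""
    clusters = defaultdict(list)
    for key, score in rows:
        clusters[key].append(score)
    keys = list(clusters)
    rng = random.Random(seed)
    means = []
    for _ in range(b):
        # Draw len(keys) clusters with replacement, pool their scores.
        sample = [s for k in rng.choices(keys, k=len(keys)) for s in clusters[k]]
        means.append(sum(sample) / len(sample))
    means.sort()
    lo = means[int((alpha / 2) * b)]
    hi = means[int((1 - alpha / 2) * b) - 1]
    return lo, hi
```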

# Cross-judge agreement statistics (App. judge):
python assets/compute_judge_agreement.py

# Figures used in the paper (output: release/figures/):
python assets/figures.py
# Produces:
#   fig_b_cross_family_6panel.pdf   (Fig. 2 — exp_outie cross-family sweep)
#   fig_c_fill_curve.pdf            (Fig. 3 — exp_cal fill-level curve)
# The qualitative example (Fig. 4, example_lin_vaccines_mem.pdf) is shipped
# pre-rendered under release/examples/ — it is hand-composed, not generated.

Reproducing the headline number end-to-end (smoke test)

If you want to verify a single cell of Tab. 2 from scratch with a tiny budget:

python run_all.py exp_outie --model claude-sonnet-4-20250514 --max-profiles 2 --max-scenarios 5
python run_all.py evaluate exp_outie --model claude-sonnet-4-20250514 --judge-model gpt-5.2-2025-12-11
python run_all.py tables exp_outie --model claude-sonnet-4-20250514 --judge-model gpt-5.2-2025-12-11

This runs ~40 generations and ~40 judge calls.
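
The call count follows directly from the truncated benchmark size, assuming all four exp_outie conditions run per profile × scenario pair:

```python
# --max-profiles 2, --max-scenarios 5, four conditions per pair (assumed)
profiles, scenarios, conditions = 2, 5, 4
generations = profiles * scenarios * conditions
print(generations)  # 40 subject-model responses, then ~one judge call each
```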


The benchmark (data/)

| File | Contents |
|------|----------|
| `profiles.json` | 10 synthetic person profiles, organized by the schema's six dimensions |
| `scenarios.json` | 30 advisory scenarios + per-scenario claim lists |
| `scenario_variants.json` | Profile-routed variants (e.g. `s21_grandkid` for childless personas) |
| `claims_by_dimension.json` | The 52 claim labels grouped by dimension |

Per-scenario claim assignments (which claims are decision-flip vs. good-answer for each scenario) are visualized in assets/scenario_claim_matrix.pdf.


License

MIT. See LICENSE.
