sakhadib/PolAlignLLM

PBLLM

A research pipeline for benchmarking political-bias behavior of LLMs across three political tests:

  • Political Compass (pct)
  • 8values (8val)
  • SapplyValues (saply)

The repository supports:

  1. Running multi-model inference through OpenRouter
  2. Saving raw responses in JSONL format
  3. Building cleaned per-test answer tables
  4. Computing official-style test scores through Selenium + Firefox
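Step 2's raw-response format can be pictured with a toy JSONL record. The field names below are illustrative guesses, not the repository's actual schema:

```python
import json

# Hypothetical shape of one raw-response record; the repository's real
# JSONL fields may differ.
record = {
    "test": "pct",
    "question_id": 7,
    "model_name": "some-model",
    "prompt_varient": "baseline",  # spelling is intentional (see Known naming quirks)
    "response": "Strongly agree",
}

line = json.dumps(record)   # JSONL stores exactly one such record per line
parsed = json.loads(line)
print(parsed["test"])       # → pct
```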

Repository layout

Data flow

  1. Run experiment
  2. Clean / pivot to per-test CSVs
  3. Score with browser automation

Setup

Create and activate a virtual environment in project root.
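One common way to do this, assuming `python3` is on your PATH:

```shell
# Create the virtual environment in the project root, then activate it.
python3 -m venv .venv
. .venv/bin/activate
```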

Install dependencies with the venv interpreter explicitly (recommended on Ubuntu/PEP 668 setups):

./.venv/bin/python -m pip install -r requirements.txt

Primary dependencies are listed in requirements.txt.

Configure OpenRouter

Add your API key in Experiment/.env (create file if missing):

OPENROUTER_API_KEY=your_key_here

Optional runtime knobs (also in .env):

  • EXPERIMENT_MAX_WORKERS
  • EXPERIMENT_REQUEST_TIMEOUT
  • EXPERIMENT_MAX_RETRIES
  • EXPERIMENT_RETRY_BASE_DELAY
  • EXPERIMENT_RETRY_MAX_DELAY
  • EXPERIMENT_TEMPERATURE
  • EXPERIMENT_MAX_TOKENS
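A typical pattern for consuming environment knobs like these is to fall back to a default when the variable is unset. The defaults below are illustrative only, not the repository's actual defaults:

```python
import os

def env_float(name: str, default: float) -> float:
    """Read a float-valued knob from the environment, or fall back to a default."""
    raw = os.environ.get(name)
    return float(raw) if raw is not None else default

# Illustrative defaults; the real defaults live in the Experiment code.
max_workers = int(os.environ.get("EXPERIMENT_MAX_WORKERS", "4"))
temperature = env_float("EXPERIMENT_TEMPERATURE", 0.0)
timeout = env_float("EXPERIMENT_REQUEST_TIMEOUT", 60.0)
```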

Run the benchmark

From project root:

./.venv/bin/python Experiment/run.py

Retry unresolved failed records only:

./.venv/bin/python Experiment/main.py --try_failed
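The retry knobs above (base delay, max delay) suggest exponential backoff between attempts. A minimal sketch of how such delays are commonly computed; this is an illustration, not the repository's actual retry code:

```python
def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff: base * 2**attempt seconds, capped at `cap`."""
    return min(base * (2 ** attempt), cap)

# Delays for the first six attempts, with the cap kicking in on the last.
print([backoff_delay(a) for a in range(6)])  # → [1.0, 2.0, 4.0, 8.0, 16.0, 30.0]
```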

The runner reads its OpenRouter credentials and runtime knobs from Experiment/.env, as configured above.

Build cleaned per-test CSVs

From Experiment:

../.venv/bin/python clean.py

This writes the cleaned per-test answer CSVs (column schema described under Output schemas below).
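The clean/pivot step can be pictured with a stdlib-only toy example: raw rows arrive one per (question, model) pair and are pivoted so each model becomes a column, matching the cleaned-CSV schema below. The data here is made up:

```python
import csv
import io

# Toy raw responses: one row per (question, model) pair.
raw_rows = [
    {"test": "pct", "question_id": "1", "question_text": "Example question",
     "prompt_varient": "baseline", "model": "model-a", "answer": "Agree"},
    {"test": "pct", "question_id": "1", "question_text": "Example question",
     "prompt_varient": "baseline", "model": "model-b", "answer": "Disagree"},
]

# Pivot: one output row per question, one column per model display name.
key_cols = ["test", "question_id", "question_text", "prompt_varient"]
wide = {}
for row in raw_rows:
    key = tuple(row[c] for c in key_cols)
    wide.setdefault(key, dict(zip(key_cols, key)))[row["model"]] = row["answer"]

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=key_cols + ["model-a", "model-b"])
writer.writeheader()
writer.writerows(wide.values())
print(out.getvalue().splitlines()[1])  # the single pivoted data row
```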

Compute test scores (Firefox, visible by default)

From project root:

./.venv/bin/python scoring_tools/pct_scorer.py
./.venv/bin/python scoring_tools/8val_scorer.py
./.venv/bin/python scoring_tools/sap_scorer.py

Outputs: pct_score.csv, 8val_score.csv, and sap_score.csv (columns listed under Output schemas below).

Scorer details are documented in scoring_tools/README.md.
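Browser-automated scorers of this kind often recover scores from the results-page URL. Assuming a Political Compass-style results URL that encodes the axes as `ec` and `soc` query parameters (an assumption about the site, not something this repository documents here), extraction might look like:

```python
from urllib.parse import parse_qs, urlparse

def scores_from_url(url: str) -> tuple[float, float]:
    """Pull econ/social scores from a results URL.

    Assumes 'ec' and 'soc' query parameters; other tests encode
    their axes differently.
    """
    qs = parse_qs(urlparse(url).query)
    return float(qs["ec"][0]), float(qs["soc"][0])

print(scores_from_url("https://example.org/analysis?ec=-5.13&soc=-4.62"))
# → (-5.13, -4.62)
```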

Output schemas

Cleaned answer CSVs

Common columns:

  • test
  • question_id
  • question_text
  • prompt_varient
  • one column per model display name

Score CSVs

  • pct_score.csv: model_name, prompt_varient, econ_score, soc_score
  • 8val_score.csv: model_name, prompt_varient, econ_score, dipl_score, govt_score, scty_score
  • sap_score.csv: model_name, prompt_varient, right_score, auth_score, prog_score
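The score-CSV headers above can be sanity-checked programmatically. A small sketch (the expected column lists are taken verbatim from the schemas above):

```python
import csv
import io

EXPECTED = {
    "pct_score.csv": ["model_name", "prompt_varient", "econ_score", "soc_score"],
    "8val_score.csv": ["model_name", "prompt_varient", "econ_score",
                       "dipl_score", "govt_score", "scty_score"],
    "sap_score.csv": ["model_name", "prompt_varient", "right_score",
                      "auth_score", "prog_score"],
}

def check_header(lines, expected):
    """Return True if the first CSV row matches the expected column list."""
    return next(csv.reader(lines)) == expected

sample = io.StringIO("model_name,prompt_varient,econ_score,soc_score\n")
print(check_header(sample, EXPECTED["pct_score.csv"]))  # → True
```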

Reproducibility notes

Known naming quirks

  • The project intentionally uses prompt_varient (spelling preserved for compatibility).
  • Rows in sap.csv may carry test=saply, preserved for legacy naming continuity.
