SIM-PANEL — Synthetic Panel Datasets for Agent Evaluation

Current version: v0.1.0-public — research prototype release, May 2026

Documentation / project site: https://bingchen-wang.github.io/sim-panel/

SIM-PANEL is a reproducible research-engineering toolkit for generating, validating, analyzing, and comparing panel-style event datasets.

It is designed as an engineering scaffold for LLM-based agent simulation, preference reconstruction, and verifiable behavioral evaluation. The package focuses on transparent data generation, schema validation, YAML-configured experiments, and distributional diagnostics rather than large-scale black-box simulation.

What SIM-PANEL does

SIM-PANEL supports workflows where simulated panelists evaluate products, interventions, or other candidate items over time.

Core capabilities include:

versioned event-level schemas;
YAML-configured generation runs;
randomized, manual, and self-selection exposure policies;
deterministic seeding for non-LLM components;
JSONL-first outputs with metadata and data dictionaries;
optional CSV export;
source adapters for importing real review-style data;
benchmark subset construction from imported real data;
single-run analysis, optional regression diagnostics, and multi-condition comparison;
optional LLM-backed enrichment, selection, and outcome generation.

Synthetic data in SIM-PANEL is intended for schema debugging, pipeline testing, ablation scaffolding, and simulation-design prototyping. It should not be interpreted as a substitute for primary empirical validation.
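
The deterministic seeding listed among the core capabilities can be illustrated with a minimal sketch. This is not SIM-PANEL's implementation: the function name `draw_ratings` and the 1-5 rating scale are assumptions chosen for illustration. The point is only that a locally seeded generator makes a non-LLM component repeatable.

```python
import random


def draw_ratings(seed: int, n_events: int) -> list[int]:
    """Draw ratings on a hypothetical 1-5 scale from a locally seeded RNG."""
    # A local Random instance avoids depending on (or mutating) global state.
    rng = random.Random(seed)
    return [rng.randint(1, 5) for _ in range(n_events)]


run_a = draw_ratings(seed=0, n_events=10)
run_b = draw_ratings(seed=0, n_events=10)
print(run_a == run_b)  # same seed, same draws
```

Passing the generator (or the seed) explicitly into each component, rather than calling the module-level `random` functions, keeps runs reproducible even when components are reordered.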

Core concepts

Concept	Meaning
Panelist	A simulated respondent, user, customer, or agent.
Product	An item, intervention, treatment, or candidate object being evaluated.
Event	One schema-valid row in `events.jsonl`.
Policy	Exposure logic determining how panelists encounter products.
Outcome	Structured evaluation result, such as rating or purchase intent.
Trace	Optional auxiliary text, rationale, source provenance, or debug payload.

SIM-PANEL currently supports three exposure policies:

Policy	Description
`random`	Products are assigned to panelists exogenously.
`manual`	Product-panelist assignments are loaded from a schedule or mapping.
`self_selection`	Panelists choose products from a shown choice set.
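
To make the difference between the three policies concrete, here is a small sketch under stated assumptions. The function names, argument shapes, and the `choose` callback are hypothetical and do not reflect SIM-PANEL's internal API; only the policy names and their descriptions come from the table above.

```python
import random


def assign_random(panelists, products, rng):
    """random: each panelist receives one exogenously drawn product."""
    return {p: rng.choice(products) for p in panelists}


def assign_manual(panelists, schedule):
    """manual: assignments are read from a provided panelist -> product mapping."""
    return {p: schedule[p] for p in panelists}


def assign_self_selection(panelists, products, rng, choice_set_size, choose):
    """self_selection: show each panelist a choice set, then let them pick."""
    result = {}
    for p in panelists:
        shown = rng.sample(products, k=choice_set_size)
        result[p] = {"shown": shown, "chosen": choose(p, shown)}
    return result


panelists = ["p1", "p2"]
products = ["prod_a", "prod_b", "prod_c"]
rng = random.Random(0)

random_plan = assign_random(panelists, products, rng)
manual_plan = assign_manual(panelists, {"p1": "prod_c", "p2": "prod_a"})
# Hypothetical choice rule: pick the alphabetically first shown product.
selection_plan = assign_self_selection(
    panelists, products, rng, choice_set_size=2, choose=lambda p, shown: min(shown)
)
```

In the first two policies the assignment does not depend on the panelist's preferences, whereas under `self_selection` the observed product is itself an outcome of the panelist's choice.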

Module structure

SIM-PANEL keeps ingestion, generation, analysis, and comparison separate:

sources/
  raw external data -> imported canonical artifacts

benchmarks/
  imported artifacts -> frozen real-data subsets

generators/
  panelists + products + policies + outcomes -> synthetic events

analysis/
  one run -> summaries, metrics, plots, reports, optional regression

analysis/compare/
  multiple conditions or reference subsets -> comparison metrics and reports

The shorthand is:

Sources ingest. Benchmarks freeze. Generation simulates. Analysis inspects. Comparison evaluates.

Outputs

A standard generation run writes:

outputs/run_001/
  events.jsonl
  metadata.json
  data_dictionary.json

Optional CSV export writes:

outputs/run_001/
  events.csv
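
One way to picture the relationship between the JSONL and CSV artifacts is the sketch below, which flattens nested event fields into dotted column names. The column layout SIM-PANEL actually writes is not specified here, so the `flatten` and `jsonl_to_csv` helpers and the dotted naming are assumptions for illustration only.

```python
import csv
import io
import json


def flatten(row, prefix=""):
    """Flatten nested dicts into a single level with dotted keys."""
    flat = {}
    for key, value in row.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def jsonl_to_csv(lines):
    """Convert JSONL text lines into CSV text with a sorted union of columns."""
    rows = [flatten(json.loads(line)) for line in lines if line.strip()]
    columns = sorted({col for row in rows for col in row})
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical rows; only the field names come from the README.
example_lines = [
    '{"event_id": "e1", "outcomes": {"rating": 4}}',
    '{"event_id": "e2", "outcomes": {"rating": 2}}',
]
csv_text = jsonl_to_csv(example_lines)
```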

events.jsonl is the canonical dataset artifact. Each row is a schema-valid event with fields such as:

schema_version
event_id
event_type
policy
panelist_id
product_id
t
outcomes
traces
panelist_features
product_features

Self-selection runs may also include selection events and linked evaluation events via selection_id.
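
The linkage between selection and evaluation events can be sketched as follows. Only the field names listed above and `selection_id` come from this README; the example values, the `event_type` labels, and the placement of `selection_id` at the top level of a row are assumptions.

```python
from collections import defaultdict

# Hypothetical rows for illustration; real rows carry more fields.
events = [
    {"event_id": "e1", "event_type": "selection", "policy": "self_selection",
     "panelist_id": "p1", "product_id": "prod_b", "t": 0, "selection_id": "s1"},
    {"event_id": "e2", "event_type": "evaluation", "policy": "self_selection",
     "panelist_id": "p1", "product_id": "prod_b", "t": 0, "selection_id": "s1",
     "outcomes": {"rating": 4}},
    {"event_id": "e3", "event_type": "evaluation", "policy": "random",
     "panelist_id": "p2", "product_id": "prod_a", "t": 0,
     "outcomes": {"rating": 2}},
]


def link_by_selection_id(rows):
    """Group event_ids by selection_id, skipping rows that carry none."""
    linked = defaultdict(list)
    for row in rows:
        selection_id = row.get("selection_id")
        if selection_id is not None:
            linked[selection_id].append(row["event_id"])
    return dict(linked)


links = link_by_selection_id(events)
print(links)  # {'s1': ['e1', 'e2']}
```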

Installation

Clone the repository:

git clone https://github.com/bingchen-wang/sim-panel.git
cd sim-panel

Create and activate a virtual environment:

python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip

Install SIM-PANEL:

pip install -e .

For development and documentation work:

pip install -e ".[dev,docs]"

Verify the CLI:

sim-panel --help

Quickstart

Generate a small dataset from a YAML config:

sim-panel generate \
  --config examples/configs/minimal.yaml \
  --output-dir outputs/run_001

Validate the generated events:

sim-panel validate --input outputs/run_001/events.jsonl
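
For intuition about what row-level validation involves, the following sketch checks that each JSONL line parses and carries a set of top-level fields. It is not the `validate` command's implementation: the real schema is versioned and likely stricter, and the choice of which fields to treat as required here is an assumption based on the field list in the Outputs section.

```python
import json

# Assumed required fields, taken from the field list in the Outputs section.
REQUIRED_FIELDS = (
    "schema_version", "event_id", "event_type", "policy",
    "panelist_id", "product_id", "t",
)


def find_problems(lines):
    """Return (line_number, message) pairs for rows that fail basic checks."""
    problems = []
    for lineno, line in enumerate(lines, start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            row = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((lineno, f"invalid JSON: {exc.msg}"))
            continue
        if not isinstance(row, dict):
            problems.append((lineno, "row is not a JSON object"))
            continue
        missing = [field for field in REQUIRED_FIELDS if field not in row]
        if missing:
            problems.append((lineno, "missing fields: " + ", ".join(missing)))
    return problems
```

A caller could run it over an open file, for example `find_problems(open("outputs/run_001/events.jsonl", encoding="utf-8"))`, and treat a non-empty result as a failed check.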

Sample a few rows:

sim-panel sample \
  --input outputs/run_001/events.jsonl \
  --n 5 \
  --seed 0
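
The effect of combining `--n` with `--seed` can be approximated in a few lines of Python. This is a sketch of seeded sampling in general, not the `sample` command's actual logic; `sample_rows` is a hypothetical helper, and there is no guarantee it selects the same rows as the CLI for a given seed.

```python
import json
import random


def sample_rows(lines, n, seed):
    """Return up to n parsed rows, chosen reproducibly for a given seed."""
    rows = [json.loads(line) for line in lines if line.strip()]
    rng = random.Random(seed)
    # Sampling without replacement; cap n at the number of available rows.
    return rng.sample(rows, k=min(n, len(rows)))


# Hypothetical input standing in for the lines of events.jsonl.
lines = [json.dumps({"event_id": f"e{i}"}) for i in range(20)]
first = sample_rows(lines, n=5, seed=0)
second = sample_rows(lines, n=5, seed=0)
print(first == second)  # True: the seed fixes the sample
```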

Run single-run analysis:

sim-panel analyze --config examples/configs/analysis.yaml

Compare multiple conditions or compare synthetic outputs against a reference:

sim-panel compare --config examples/configs/compare.yaml
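
As an illustration of the kind of distributional comparison this step is meant to support, the sketch below computes the total variation distance between two empirical rating distributions. The metrics that `compare` actually reports are not listed in this README, so this metric, the helper names, and the example ratings are assumptions rather than a description of the tool's output.

```python
from collections import Counter


def empirical_dist(values):
    """Map each observed value to its relative frequency."""
    counts = Counter(values)
    total = len(values)
    return {value: count / total for value, count in counts.items()}


def total_variation(p, q):
    """Total variation distance between two discrete distributions."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in support)


# Hypothetical ratings from a synthetic run and a reference subset.
synthetic_ratings = [5, 4, 4, 3, 5, 2]
reference_ratings = [5, 5, 4, 4, 3, 1]

distance = total_variation(
    empirical_dist(synthetic_ratings), empirical_dist(reference_ratings)
)
print(round(distance, 4))  # 0.1667
```

The distance is 0 when the two distributions coincide and 1 when they share no support, which makes it a convenient first-pass diagnostic before looking at more detailed reports.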

Source import and benchmark subsets

SIM-PANEL can import external review-style datasets into canonical artifacts. The current source layer includes an Amazon Reviews'23 adapter.

A typical real-data workflow is:

sim-panel import --config examples/configs/import_amazon.yaml

sim-panel benchmark-subset --config examples/configs/benchmark_subset.yaml

This produces a frozen real-data subset that can be used by the comparison layer.

CLI commands

Command	Purpose
`make-data`	Generate demo persona/product datasets.
`generate`	Generate synthetic event rows from a run config.
`validate`	Validate an events JSONL file.
`sample`	Print sampled rows from an events JSONL file.
`import`	Import an external source dataset.
`benchmark-subset`	Freeze a benchmark-ready real-data subset.
`analyze`	Run single-run analysis.
`compare`	Compare multiple conditions or synthetic outputs against a reference.

Documentation

The Sphinx documentation lives under:

docs/source/

Build locally with:

sphinx-build -b html docs/source docs/build/html

Once GitHub Pages is enabled for the public repository, the documentation site is served at:

https://bingchen-wang.github.io/sim-panel/

Development

Run tests:

python -m pytest

Build docs with warnings treated as errors:

sphinx-build -b html docs/source docs/build/html -W

Check the CLI:

sim-panel --help
sim-panel generate --help
sim-panel validate --help
sim-panel analyze --help
sim-panel compare --help

Do not commit raw external data, generated outputs, local benchmark runs, or built documentation artifacts.

Project status and scope

SIM-PANEL is an ongoing research-engineering project. The public API and schema may evolve.

The current emphasis is on:

clean event schemas;
deterministic local generation;
modular policies and outcome models;
real-data ingestion scaffolds;
frozen reference subsets;
transparent diagnostics and comparison reports.

SIM-PANEL does not claim that synthetic panelists can substitute for human subjects or for primary empirical validation.

Contact

For reproducible bugs, feature requests, or documentation issues, please use the GitHub issue tracker.

For research-related inquiries, contact Bingchen Wang at bw2506 [at] columbia [dot] edu.

Acknowledgements

SIM-PANEL is developed and maintained by Bingchen Wang as an independent research-engineering project.

The project benefited from early discussions with Bruno Abrahao and Teutly Correia on agent-based product evaluation workflows. These discussions helped motivate the beer-demo example and informed the Amazon Reviews'23 ingestion and benchmarking direction. Bruno Abrahao also contributed initial commits to an early prototype.

Any errors, design choices, or limitations remain the responsibility of the maintainer.

License

Apache-2.0. See LICENSE and NOTICE.
