Code, configs, and data for an empirical audit of how GPT-5-chat and Claude Sonnet 4.6 respond to neurodivergence (ND) context supplied via the system prompt.
See spec.md for the full experimental design and research questions.
configs/ # profiles, prompts, queries — single source of truth for the experiment
ndbench/ # runner, metrics, judges, analysis code
data/responses/ # raw model outputs (JSONL), one per (model, condition, profile)
paper/ # LaTeX source
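
The raw outputs under data/responses/ can be inspected with a few lines of Python. The following is a minimal sketch, not code from ndbench: it assumes the files use a `.jsonl` extension and that every non-empty line is a JSON object. The function name `iter_responses` and the `_source` key are hypothetical, introduced here only for illustration.

```python
import json
from pathlib import Path
from typing import Iterator


def iter_responses(root: str = "data/responses") -> Iterator[dict]:
    """Yield one dict per non-empty JSONL line found directly under root."""
    # Assumption: response files end in ".jsonl"; adjust the pattern if they do not.
    for path in sorted(Path(root).glob("*.jsonl")):
        with path.open(encoding="utf-8") as fh:
            for line_no, line in enumerate(fh, start=1):
                line = line.strip()
                if not line:
                    continue  # tolerate blank lines between records
                record = json.loads(line)
                # "_source" is added by this sketch; it is not part of the repo's schema.
                record["_source"] = f"{path.name}:{line_no}"
                yield record
```

Calling `list(iter_responses())` from the repository root would collect every record into memory; iterating lazily is likely preferable if the response files are large.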
1. cp .env.example .env and paste real API keys into .env (never commit).
2. python -m venv .venv && source .venv/bin/activate && pip install -r requirements.txt
3. python -m ndbench.runner: dispatches all model calls, caches to data/responses/.
4. python -m ndbench.metrics.run: computes structural, surface, and harm metrics.
5. python -m ndbench.analyze: fits models, emits figures and tables to paper/figures/.
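
The three `python -m` stages listed above can also be chained from a single script. This is a hedged sketch, not a script shipped with the repository: the module names come from the steps above, while `STAGES`, `stage_commands`, and `run_pipeline` are hypothetical names. It assumes the virtual environment is already active and .env is populated.

```python
import subprocess
import sys

# Module names as listed in the setup steps; the order follows that list.
STAGES = ["ndbench.runner", "ndbench.metrics.run", "ndbench.analyze"]


def stage_commands(python: str = sys.executable) -> list[list[str]]:
    """Build one argv list per stage, using the given interpreter."""
    return [[python, "-m", module] for module in STAGES]


def run_pipeline() -> None:
    """Run every stage in order, stopping at the first failure."""
    for cmd in stage_commands():
        print("running:", " ".join(cmd))
        # check=True raises CalledProcessError on a non-zero exit code,
        # so later stages never run on incomplete inputs.
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline()` from the repository root would execute the stages back to back; running them individually remains the simpler option when only one stage needs to be repeated.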
- gpt-5-chat-latest (OpenAI, non-reasoning chat variant)
- claude-sonnet-4-6 (Anthropic)
Code: MIT. Data and paper artifacts: CC-BY-4.0.