This repo contains datasets, code, and other assets for each module of the Braintrust Eval Course.
- 03 — Build a Simple Eval in the Braintrust UI — Dataset CSV, prompts, and scorer
- 06 — Build a Simple Eval in Code — Python eval script
- 07 — Nondeterminism — Eval with trial runs
- 10 — Building a Multi-Turn Chat App — Chat app with Braintrust logging
- 11 — Analyzing Multi-Turn Traces — Batch script to score full conversations at the trace level
- 12 — Online Scoring — Conversation generator script + online scoring configuration in the Braintrust UI
- 13 — Analyzing Production Logs — Script to generate production logs + Topics setup walkthrough
- 14 — The Improvement Loop — Baseline and fixed eval scripts to verify a prompt change resolves a regression
- Sign up for a free account at braintrust.dev
- Set your API keys:
export BRAINTRUST_API_KEY="your-api-key"
export OPENAI_API_KEY="your-openai-api-key"
- Navigate to the module directory you're working on and follow the README there.
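Before running a module's scripts, it can help to confirm that both keys are actually visible to Python. The following is a minimal sketch, not part of the repo: it assumes only the two variable names from the setup step above, and the helpers `missing_keys` and `check_message` are hypothetical names introduced for illustration.

```python
import os

# The two variable names exported in the setup step above.
REQUIRED_KEYS = ("BRAINTRUST_API_KEY", "OPENAI_API_KEY")


def missing_keys(env=None):
    """Return the required key names that are unset or empty.

    `env` defaults to the current process environment; passing a plain
    dict makes the check easy to try without touching real variables.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]


def check_message(env=None):
    """Build a one-line, human-readable status for the required keys."""
    missing = missing_keys(env)
    if missing:
        return "Missing environment variables: " + ", ".join(missing)
    return "All required API keys are set."


# Report on the current shell's environment.
print(check_message())
```

Running this in the same shell where the `export` commands were issued should report that both keys are set. If it lists a missing variable, the export likely happened in a different terminal session.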