Pair-programming skills for machine learning engineering at Good Outcomes. Each skill is a self-contained, guided exercise designed to be used with an AI coding agent.
| Skill | What it tests | Needs API key? |
|---|---|---|
| calibration-audit | Model confidence calibration, threshold optimization, post-hoc calibration | No |
| prompt-sensitivity | Prompt robustness, experimental design, LLM classification evaluation | Yes |
- Clone this repo
- Pick a skill and `cd` into its directory
- Open it in your editor with your preferred AI coding agent (Claude Code, Codex, Cursor, etc.)
- Run the starting script and let `SKILL.md` guide you
Each skill has a `README.md` with setup instructions and a `SKILL.md` that your agent will pick up automatically.
skills/
├── calibration-audit/ # Model calibration & threshold optimization
│ ├── SKILL.md
│ ├── README.md
│ ├── scripts/
│ └── references/
└── prompt-sensitivity/ # LLM prompt robustness testing
    ├── SKILL.md
    ├── README.md
    ├── scripts/
    └── references/
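
To confirm that a checkout matches the layout above, a small check along these lines may help. It is a minimal sketch and not part of the repo: the function name `missing_skill_files` and the constant `REQUIRED_FILES` are hypothetical, and it assumes only what this README states, namely that every directory under `skills/` contains a `SKILL.md` and a `README.md`.

```python
from pathlib import Path

# Files this README says every skill directory contains.
REQUIRED_FILES = ("SKILL.md", "README.md")


def missing_skill_files(skills_root):
    """Return {skill_name: [missing file names]} for incomplete skills.

    Hypothetical helper: a skill is any directory directly under
    `skills_root`. Skills that have every required file are omitted.
    """
    root = Path(skills_root)
    report = {}
    for skill_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        missing = [
            name for name in REQUIRED_FILES
            if not (skill_dir / name).is_file()
        ]
        if missing:
            report[skill_dir.name] = missing
    return report


if __name__ == "__main__":
    # Assumes the script is run from the repo root, next to `skills/`.
    root = Path("skills")
    if not root.is_dir():
        print("No skills/ directory found in the current directory")
    else:
        problems = missing_skill_files(root)
        if not problems:
            print("All skills have SKILL.md and README.md")
        for skill, files in problems.items():
            print(f"{skill}: missing {', '.join(files)}")
```

Run from the repo root, this prints one line per incomplete skill. The `scripts/` and `references/` directories are not checked, since this README does not say what they must contain.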