Releases: intertwine/dspy-agent-skills
v0.2.3
DSPy 3.2.1 refresh
- Retargeted install and maintainer validation guidance from exact DSPy `3.2.0` to current `3.2.1`, while keeping committed example artifacts labeled by the DSPy version that produced them.
- Added `scripts/check_dspy_surface.py` to validate the live DSPy API surface taught by the skills (`GEPA`, `BetterTogether`, `Evaluate`, `LM`, `SIMBA`, `Embedder`, `configure_cache`, and current primitives).
- Updated GEPA guidance for current upstream best practices: train-heavy GEPA splits, GPT-5-class reflection model shape, literal-dict metric mismatch, supported `component_selector` strings, and when to try `dspy.SIMBA`.
- Tightened the evaluation-harness reference around DSPy 3.2.1 semantics: GEPA-compatible five-argument metric signatures and aggregation-safe metric return shapes.
- Added production cache guidance for `dspy.configure_cache(restrict_pickle=True)`, project-local `DSPY_CACHEDIR`, and provider-side prompt caching.
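To make the "five-argument metric signature" and "aggregation-safe return shape" items concrete, here is a minimal sketch. It assumes the GEPA-style argument order `(gold, pred, trace, pred_name, pred_trace)`. `ScoredFeedback` is a hypothetical stand-in for a `dspy.Prediction(score=..., feedback=...)` return value, used so the snippet runs without DSPy installed; it is not code from this repository.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class ScoredFeedback:
    """Hypothetical stand-in for dspy.Prediction(score=..., feedback=...)."""

    score: float
    feedback: str

    def __float__(self) -> float:
        # Lets an evaluator average results without unpacking a dict.
        return float(self.score)


def exact_match_metric(gold, pred, trace=None, pred_name=None, pred_trace=None):
    """Five-argument metric; the last three arguments are optional."""
    correct = gold.answer.strip().lower() == pred.answer.strip().lower()
    score = 1.0 if correct else 0.0
    feedback = (
        "Correct."
        if correct
        else f"Expected '{gold.answer}', got '{pred.answer}'."
    )
    return ScoredFeedback(score=score, feedback=feedback)


# Toy gold/prediction pairs (assumed data, for illustration only).
examples = [
    (SimpleNamespace(answer="Paris"), SimpleNamespace(answer="paris")),
    (SimpleNamespace(answer="Rome"), SimpleNamespace(answer="Milan")),
]
results = [exact_match_metric(gold, pred) for gold, pred in examples]
mean_score = sum(float(r) for r in results) / len(results)
```

Returning an object that converts to a number, rather than a plain dict, is what keeps the same metric usable both for aggregate scoring and for feedback-driven optimization in this sketch.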
Validation
- `uv run --with pytest python -m pytest tests/ -v` -> 114 passed
- `env -u UV_EXCLUDE_NEWER uv run --with dspy==3.2.1 python scripts/check_dspy_surface.py` -> passed
- `for f in skills/*/example_*.py; do env -u UV_EXCLUDE_NEWER uv run --with dspy==3.2.1 python "$f" --dry-run; done` -> passed
- `git diff --check HEAD` -> passed
Review
PR: #8
The release branch passed a read-only adversarial subagent review. Initial blockers were fixed and the targeted rereview returned: No blocking findings. Approved for release.
v0.2.2
Test suite hardening
- Extended regression guards (`.overall_score`, dict metrics, stale RLM defaults, stale BetterTogether API) to cover `articles/**/*.md` in addition to `skills/` and `docs/`. Uses recursive glob for parity with the `skills/**` pattern.
- Added version consistency test: asserts `plugin.json`, `marketplace.json`, and `README.md` all carry the same version string. Regex is anchored to the `## Version` heading to avoid false matches on changelog or prose mentions.
- Added `reference.md` presence test: every skill directory must ship a `reference.md` for progressive disclosure.
- Moved `_ANTIPATTERN_MARKERS` and `_is_antipattern_context()` above Rule 1 so both Rule 1 (`.overall_score`) and Rule 2 (dict metrics) share the same anti-pattern context check. Added `"enforces"` marker to allow meta-references that describe prohibitions. Dropped the overly broad `"no "` marker; `"enforces"` alone covers the article line that triggered it.
- Test count: 87 → 105.
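The version consistency check described above could look roughly like the following. This is a sketch under stated assumptions, not the repository's test: it assumes both manifests expose a top-level `"version"` key and that the README puts the version on the first non-blank line after the `## Version` heading. The function names are hypothetical.

```python
import json
import re

# Anchored to the heading so version-like strings in prose are ignored.
VERSION_HEADING = re.compile(r"^## Version\s+`?v?(\d+\.\d+\.\d+)", re.MULTILINE)


def readme_version(readme_text):
    """Return the version under '## Version', or None if absent."""
    match = VERSION_HEADING.search(readme_text)
    return match.group(1) if match else None


def versions_agree(plugin_json, marketplace_json, readme_text):
    """True only if all three sources carry one identical version string."""
    found = {
        json.loads(plugin_json).get("version"),  # assumed manifest layout
        json.loads(marketplace_json).get("version"),  # assumed manifest layout
        readme_version(readme_text),
    }
    return len(found) == 1 and None not in found


README = "# Pack\n\nChangelog: fixed in 0.2.1.\n\n## Version\n\n0.2.2\n"
print(versions_agree('{"version": "0.2.2"}', '{"version": "0.2.2"}', README))
```

Anchoring on the heading is the design point: an unanchored pattern would pick up `0.2.1` from the changelog sentence first and report a false mismatch.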
Example artifacts
- Re-ran `examples/01-rag-qa` as a clean DSPy 3.1.3 vs 3.2.0 comparison on the same model pair; the current clean DSPy 3.2.0 result is `80.47 -> 100.00`.
- Kept `examples/03-invoice-extraction` on its historical DSPy 3.1.3 artifact after a clean probe: the 3.1.3 GEPA run was stopped before completion after finding a `0.944` candidate, and the 3.2.0 baseline on the same model pair already reached `0.944`.
- Updated README, examples index, and per-example `version_comparison.{md,json}` files so the published docs describe the clean comparison path and no longer depend on `.venv-dspy313` / `.venv-dspy320` state.
New content
- Created `skills/dspy-advanced-workflow/reference.md`; this was the only skill still missing one. It covers step-by-step failure modes, `auto` level selection, plateau debugging, export format tradeoffs, `BetterTogether` chaining, and sub-skill cross-references.
- Marked the GEPA constructor snippet as a subset, with a pointer to the full surface in `dspy-gepa-optimizer/reference.md`.
- Annotated the `reflection_minibatch_size` guidance with symptom context (plateau vs. oscillation) to avoid contradicting the advice in the GEPA optimizer reference.
Installer
- Added `--verify` flag to `scripts/install.sh`: validates each expected skill exists at the destination, checks symlink targets or directory presence, and reports pass/fail per skill.
- Updated `docs/installation.md` verification section to reference `--verify`.
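The checks that `--verify` is described as performing can be illustrated in Python. The real flag lives in a shell script, so this is only a sketch of the logic as summarized above, not a port of it. The skill names come from the pack's skill list; the `verify_install` function and its report strings are hypothetical.

```python
from pathlib import Path

# Skill names taken from the pack's skill table.
EXPECTED_SKILLS = [
    "dspy-fundamentals",
    "dspy-evaluation-harness",
    "dspy-gepa-optimizer",
    "dspy-rlm-module",
    "dspy-advanced-workflow",
]


def verify_install(dest: Path, skills=EXPECTED_SKILLS) -> dict:
    """Map each skill name to 'pass' or 'fail: <reason>'."""
    report = {}
    for name in skills:
        entry = dest / name
        if entry.is_symlink():
            # A symlink passes only if its target still exists as a directory.
            ok = entry.resolve().is_dir()
            report[name] = "pass" if ok else "fail: broken symlink"
        elif entry.is_dir():
            report[name] = "pass"
        else:
            report[name] = "fail: missing"
    return report


def all_passed(report: dict) -> bool:
    """Overall result: every skill must pass."""
    return all(status == "pass" for status in report.values())
```

A caller would pass the install destination (for example, a skills directory under the user's home) and print one line per skill from the returned report.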
Validation
- `uv run --with pytest python -m pytest tests/ -v` -> 108 passed
- Live reruns/probes:
  - `examples/01-rag-qa` -> `80.47 -> 100.00` with `openrouter/mistralai/ministral-3b-2512`
  - `examples/03-invoice-extraction` -> clean probe recorded `0.944` baseline under DSPy 3.2.0; historical artifact retained
v0.2.1
Patch release for Vercel skills CLI compatibility.
- Documented the supported install path: `npx skills add intertwine/dspy-agent-skills`.
- Clarified that bare `npx skills add dspy-agent-skills` currently requires an upstream CLI alias and is not repo-resolvable by itself.
- Fixed `dspy-evaluation-harness` frontmatter so strict YAML parsers discover all five skills.
- Added a regression guard for inline YAML frontmatter values containing `:`.
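A guard of the kind described in the last item might work along these lines. This is a heuristic sketch, not the repository's test: it flags unquoted single-line frontmatter values that contain `: ` or end in `:`, which strict YAML parsers typically reject as a nested mapping. The function name and patterns are assumptions.

```python
import re

FRONTMATTER = re.compile(r"\A---\n(.*?)\n---", re.DOTALL)
KEY_VALUE = re.compile(r"^([A-Za-z_][\w-]*):[ \t]+(.+)$")


def risky_frontmatter_keys(markdown_text):
    """Return keys whose unquoted inline value contains a colon separator."""
    match = FRONTMATTER.match(markdown_text)
    if not match:
        return []
    risky = []
    for line in match.group(1).splitlines():
        kv = KEY_VALUE.match(line)
        if not kv:
            continue
        key, value = kv.group(1), kv.group(2).strip()
        if not value:
            continue
        quoted = value[0] in "\"'"
        block_scalar = value[0] in "|>"
        if not quoted and not block_scalar and (": " in value or value.endswith(":")):
            risky.append(key)
    return risky


bad = "---\nname: demo\ndescription: Use when: the metric fails\n---\n# Body\n"
good = '---\nname: demo\ndescription: "Use when: the metric fails"\n---\n# Body\n'
print(risky_frontmatter_keys(bad), risky_frontmatter_keys(good))
```

Quoting the value, as in the `good` sample, is the usual fix; a regex check like this avoids depending on any particular YAML parser's strictness.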
Validation:
- `uv run --with pytest python -m pytest tests/ -v` -> 87 passed.
- `npx --yes skills add . --list` -> Found 5 skills.
- Temp Codex install wrote all five `SKILL.md` files under `.agents/skills`.
v0.2.0
DSPy 3.2.x refresh for the skill pack. This release candidate moves the skills, references, manifests, and regression guards from DSPy 3.1.x assumptions to the real DSPy 3.2.0 surface, while adding a concrete example for the biggest new optimizer-facing capability.
Highlights
- Retargeted the repo from DSPy 3.1.x / 3.1.3 to DSPy 3.2.x / 3.2.0 across README, skill docs, manifests, and maintainer guidance.
- Added `skills/dspy-gepa-optimizer/example_bettertogether.py`, a dry-run-capable example of DSPy 3.2.0's generalized `dspy.BetterTogether(metric=..., bootstrap=..., gepa=...)` API.
- Updated `dspy-fundamentals` to document 3.2.x type-mismatch warnings, `warn_on_type_mismatch=False`, and the new `dspy.BaseLM` capability / `ContextWindowExceededError` guidance for custom backends.
- Updated `dspy-rlm-module` for DSPy 3.2.0's `max_output_chars=10_000` default and kwargs-only tool dispatch.
- Updated `dspy-gepa-optimizer` to explain the new BetterTogether chaining model while keeping plain GEPA as the default recommendation.
- Added a regression guard against stale BetterTogether constructor guidance and flipped the RLM default guard to the 3.2.0 value.
- Refreshed `examples/01-rag-qa` and `examples/02-math-reasoning` with clean DSPy 3.2.0 live reruns, and added per-example `version_comparison.{md,json}` files to make the old-vs-new story explicit.
- Kept `examples/03-invoice-extraction` on its historical DSPy 3.1.3 artifact, with the 3.2.0 probe sweep documented instead of forcing a misleading saturated or unstable rerun.
- Validated the install path end to end, including `scripts/install.sh --dry-run`, a temp-`HOME` install, and new guidance for `UV_EXCLUDE_NEWER` when `uv` hides DSPy 3.2.0.
Validation
- `uv run --with pytest python -m pytest tests/ -v` → full suite passed
- All 6 skill examples executed via `--dry-run` under DSPy 3.2.0
- All 3 end-to-end examples executed via `--dry-run` under DSPy 3.2.0
- Live reruns under DSPy 3.2.0:
  - `examples/01-rag-qa` → `75.77 -> 100.00` with `openrouter/mistralai/ministral-3b-2512`
  - `examples/02-math-reasoning` → `85.00 -> 93.33` with `openrouter/mistralai/ministral-3b-2512`
  - `examples/03-invoice-extraction` → probe sweep recorded saturation or instability; historical artifact retained
- `scripts/install.sh --dry-run` and a temp-`HOME` install both matched the documented dual-target install flow
- During release prep, local `uv run --with dspy` still resolved DSPy `3.1.3` on this machine, so the 3.2.0 smoke tests were run in an isolated environment installed from the official 3.2.0 wheel.
0.1.0
First public release of dspy-agent-skills — a spec-compliant pack of 5 agent skills + 3 validated end-to-end examples that teach Claude Code and Codex CLI to build, optimize, and ship DSPy 3.1.x programs.
Skills
| Skill | Purpose |
|---|---|
| `dspy-fundamentals` | Signatures, Modules, Predict/ChainOfThought/ReAct, save/load |
| `dspy-evaluation-harness` | Rich-feedback metrics + `dspy.Evaluate` |
| `dspy-gepa-optimizer` | `dspy.GEPA` reflective optimization |
| `dspy-rlm-module` | `dspy.RLM` long-context / REPL reasoning |
| `dspy-advanced-workflow` | End-to-end pipeline orchestration |
Validated end-to-end examples
All three run on free OpenRouter models — $0 reproduction:
| Example | Task LM | Baseline | Optimized | Δ |
|---|---|---|---|---|
| 01-rag-qa | GLM 4.5 Air (32B) | 81.15 | 100.00 | +18.85 |
| 02-math-reasoning | Liquid LFM 2.5 (1.2B) | 45.00 | 70.00 | +25.00 |
| 03-invoice-extraction | Liquid LFM 2.5 (1.2B) | 0.833 | 0.931 | +0.098 |
Install
/plugin marketplace add intertwine/dspy-agent-skills
/plugin install dspy-agent-skills@dspy-agent-skills
Or for Claude Code + Codex CLI together:
git clone https://github.com/intertwine/dspy-agent-skills
cd dspy-agent-skills
./scripts/install.sh
What's covered by tests
60 validators across: SKILL.md frontmatter spec, JSON manifest schemas, Python AST on every example, and regression guards that prevent subtle teaching-material drift (e.g. dict-returning metrics, stale DSPy attribute names).
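As an illustration of how an AST-based guard against dict-returning metrics could work, consider the sketch below. It is an assumption about the approach, not the repository's validator: the helper name is hypothetical, and it only catches literal `return {...}` statements inside function bodies.

```python
import ast


def dict_returning_functions(source):
    """Return names of functions containing a literal `return {...}`."""
    tree = ast.parse(source)
    flagged = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            for child in ast.walk(node):
                if isinstance(child, ast.Return) and isinstance(child.value, ast.Dict):
                    flagged.append(node.name)
                    break
    return flagged


SAMPLE = '''
def bad_metric(gold, pred, trace=None):
    return {"score": 1.0, "feedback": "ok"}

def good_metric(gold, pred, trace=None):
    return 1.0
'''

print(dict_returning_functions(SAMPLE))
```

Parsing with `ast` also doubles as a syntax check, since `ast.parse` raises `SyntaxError` on an example that would not run. Note that a nested function returning a dict would also flag its enclosing function in this simple version.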
Compatibility
- DSPy 3.1.x (tested against 3.1.3)
- Claude Code skill spec as of 2026-04-17
- Codex CLI Agent Skills format
- Python 3.10+