Releases · intertwine/dspy-agent-skills

25 May 05:29

intertwine

v0.2.3

f2f7055

v0.2.3 Latest

DSPy 3.2.1 refresh

Retargeted install and maintainer validation guidance from exact DSPy 3.2.0 to current 3.2.1, while keeping committed example artifacts labeled by the DSPy version that produced them.
Added scripts/check_dspy_surface.py to validate the live DSPy API surface taught by the skills (GEPA, BetterTogether, Evaluate, LM, SIMBA, Embedder, configure_cache, and current primitives).
Updated GEPA guidance for current upstream best practices: train-heavy GEPA splits, GPT-5-class reflection model shape, literal-dict metric mismatch, supported component_selector strings, and when to try dspy.SIMBA.
Tightened the evaluation-harness reference around DSPy 3.2.1 semantics: GEPA-compatible five-argument metric signatures and aggregation-safe metric return shapes.
Added production cache guidance for dspy.configure_cache(restrict_pickle=True), project-local DSPY_CACHEDIR, and provider-side prompt caching.
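The five-argument metric shape referred to above can be sketched roughly as follows. This is a dependency-free illustration, not code from the repository: `ScoreWithFeedback` is a hypothetical stand-in for the `dspy.Prediction(score=..., feedback=...)` object a real metric would return, and the `answer` field and exact-match rule are assumptions made for the example.

```python
from dataclasses import dataclass
from types import SimpleNamespace


@dataclass
class ScoreWithFeedback:
    """Hypothetical stand-in for dspy.Prediction(score=..., feedback=...)."""
    score: float
    feedback: str


def exact_match_metric(gold, pred, trace=None, pred_name=None, pred_trace=None):
    """Five-argument metric; the last three arguments default to None, so the
    function also works when it is called with only (gold, pred)."""
    correct = gold.answer.strip().lower() == pred.answer.strip().lower()
    score = 1.0 if correct else 0.0
    if correct:
        feedback = "Answer matches the reference."
    else:
        feedback = f"Expected {gold.answer!r} but got {pred.answer!r}."
    if pred_name is not None:
        # Assumption: predictor-level feedback is requested via pred_name.
        feedback += f" (feedback requested for predictor {pred_name!r})"
    # Return an object with a numeric .score attribute rather than a plain dict.
    return ScoreWithFeedback(score=score, feedback=feedback)


gold = SimpleNamespace(answer="Paris")
good = exact_match_metric(gold, SimpleNamespace(answer=" paris "))
bad = exact_match_metric(gold, SimpleNamespace(answer="Lyon"), pred_name="generate")
print(good.score, bad.score)
```

Returning an object with a numeric `score` attribute, rather than a literal dict, is what the notes above mean by an aggregation-safe return shape, as far as this sketch interprets them.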

Validation

uv run --with pytest python -m pytest tests/ -v -> 114 passed
env -u UV_EXCLUDE_NEWER uv run --with dspy==3.2.1 python scripts/check_dspy_surface.py -> passed
for f in skills/*/example_*.py; do env -u UV_EXCLUDE_NEWER uv run --with dspy==3.2.1 python "$f" --dry-run; done -> passed
git diff --check HEAD -> passed

Review

PR: #8

The release branch passed a read-only adversarial subagent review. Initial blockers were fixed, and the targeted re-review returned: No blocking findings. Approved for release.

25 May 05:06

intertwine

v0.2.2

fb70151

Test suite hardening

Extended regression guards (.overall_score, dict metrics, stale RLM defaults, stale BetterTogether API) to cover articles/**/*.md in addition to skills/ and docs/. Uses recursive glob for parity with the skills/** pattern.
Added version consistency test: asserts plugin.json, marketplace.json, and README.md all carry the same version string. Regex is anchored to the ## Version heading to avoid false matches on changelog or prose mentions.
Added reference.md presence test: every skill directory must ship a reference.md for progressive disclosure.
Moved _ANTIPATTERN_MARKERS and _is_antipattern_context() above Rule 1 so both Rule 1 (.overall_score) and Rule 2 (dict metrics) share the same anti-pattern context check. Added "enforces" marker to allow meta-references that describe prohibitions. Dropped the overly broad "no " marker — "enforces" alone covers the article line that triggered it.
Test count: 87 → 105.
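A version consistency check of the kind described above might look roughly like this. It is a sketch under stated assumptions: the file contents are inlined as hypothetical strings, and the JSON layouts (`version` at the top level of plugin.json, under `plugins[0]` in marketplace.json) are guesses rather than the repository's actual schema.

```python
import json
import re

# Hypothetical file contents; a real test would read these from disk.
PLUGIN_JSON = '{"name": "dspy-agent-skills", "version": "0.2.2"}'
MARKETPLACE_JSON = '{"plugins": [{"name": "dspy-agent-skills", "version": "0.2.2"}]}'
README_MD = """# dspy-agent-skills

The changelog mentions 0.1.0 in prose, which must not be matched.

## Version

0.2.2
"""

# Anchor on the "## Version" heading so that version-like strings elsewhere
# (changelog entries, prose) are ignored.
VERSION_RE = re.compile(r"^## Version\s+v?(\d+\.\d+\.\d+)", re.MULTILINE)


def readme_version(text: str) -> str:
    match = VERSION_RE.search(text)
    if match is None:
        raise ValueError("no '## Version' heading followed by a version string")
    return match.group(1)


def collect_versions() -> dict:
    return {
        "plugin.json": json.loads(PLUGIN_JSON)["version"],
        "marketplace.json": json.loads(MARKETPLACE_JSON)["plugins"][0]["version"],
        "README.md": readme_version(README_MD),
    }


versions = collect_versions()
assert len(set(versions.values())) == 1, f"version mismatch: {versions}"
print(versions)
```

Anchoring the pattern to the heading is the design choice that matters here: an unanchored `\d+\.\d+\.\d+` search would pick up the `0.1.0` in the prose line first.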

Example artifacts

Re-ran examples/01-rag-qa as a clean DSPy 3.1.3 vs 3.2.0 comparison on the same model pair; the current clean DSPy 3.2.0 result is 80.47 -> 100.00.
Kept examples/03-invoice-extraction on its historical DSPy 3.1.3 artifact after a clean probe: the 3.1.3 GEPA run was stopped before completion after finding a 0.944 candidate, and the 3.2.0 baseline on the same model pair already reached 0.944.
Updated README, examples index, and per-example version_comparison.{md,json} files so the published docs describe the clean comparison path and no longer depend on .venv-dspy313 / .venv-dspy320 state.

New content

Created skills/dspy-advanced-workflow/reference.md — the only skill that was missing one. Covers step-by-step failure modes, auto level selection, plateau debugging, export format tradeoffs, BetterTogether chaining, and sub-skill cross-references.
GEPA constructor snippet marked as a subset with pointer to the full surface in dspy-gepa-optimizer/reference.md.
reflection_minibatch_size guidance annotated with symptom context (plateau vs. oscillation) to avoid contradicting the advice in the GEPA optimizer reference.

Installer

Added --verify flag to scripts/install.sh: validates each expected skill exists at the destination, checks symlink targets or directory presence, and reports pass/fail per skill.
Updated docs/installation.md verification section to reference --verify.
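The per-skill check that `--verify` performs could be approximated as below. This is a Python sketch of the logic only, not the shell implementation in scripts/install.sh; the temporary destination directory is a placeholder, and the skill names are taken from the 0.1.0 release notes.

```python
import tempfile
from pathlib import Path

# Skill names as listed in the 0.1.0 release notes.
EXPECTED_SKILLS = [
    "dspy-fundamentals",
    "dspy-evaluation-harness",
    "dspy-gepa-optimizer",
    "dspy-rlm-module",
    "dspy-advanced-workflow",
]


def verify_install(dest: Path) -> dict:
    """Return {skill: bool}; a skill passes if it is a directory, or a
    symlink whose target exists."""
    results = {}
    for skill in EXPECTED_SKILLS:
        path = dest / skill
        if path.is_symlink():
            ok = path.exists()  # exists() follows the link; False if dangling
        else:
            ok = path.is_dir()
        results[skill] = ok
    return results


with tempfile.TemporaryDirectory() as tmp:
    dest = Path(tmp)
    for skill in EXPECTED_SKILLS[:-1]:  # deliberately leave one skill missing
        (dest / skill).mkdir()
    report = verify_install(dest)

for skill, ok in report.items():
    print(f"{'PASS' if ok else 'FAIL'}  {skill}")
```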

Validation

uv run --with pytest python -m pytest tests/ -v -> 108 passed
Live reruns/probes:
- examples/01-rag-qa: 80.47 -> 100.00 (baseline -> optimized) with openrouter/mistralai/ministral-3b-2512
- examples/03-invoice-extraction: clean probe recorded a 0.944 baseline under DSPy 3.2.0; historical artifact retained

28 Apr 12:11

intertwine

v0.2.1

dbf0613

Patch release for Vercel skills CLI compatibility.

Documented the supported install path: npx skills add intertwine/dspy-agent-skills.
Clarified that bare npx skills add dspy-agent-skills currently requires an upstream CLI alias and is not repo-resolvable by itself.
Fixed dspy-evaluation-harness frontmatter so strict YAML parsers discover all five skills.
Added a regression guard for inline YAML frontmatter values containing a colon followed by a space (`: `).
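A guard of this kind could be sketched as follows. It is a simplified illustration, not the repository's test: the function name and the sample frontmatter strings are hypothetical, and the check only looks at top-level `key: value` lines, so flow-style mappings would be flagged as false positives.

```python
import re

FRONTMATTER_RE = re.compile(r"\A---\n(.*?)\n---\n", re.DOTALL)


def risky_frontmatter_lines(skill_md: str) -> list:
    """Return frontmatter lines whose unquoted inline value contains ': '."""
    match = FRONTMATTER_RE.match(skill_md)
    if match is None:
        return []
    risky = []
    for line in match.group(1).splitlines():
        key, sep, value = line.partition(": ")
        if not sep or line.startswith((" ", "#", "-")):
            continue  # not a top-level "key: value" line
        value = value.strip()
        quoted = value[:1] in ('"', "'")
        block_scalar = value[:1] in ("|", ">")
        if ": " in value and not quoted and not block_scalar:
            risky.append(line)
    return risky


# Hypothetical frontmatter: the unquoted description trips strict parsers.
BAD = (
    "---\n"
    "name: dspy-evaluation-harness\n"
    "description: Metrics: rich feedback for dspy.Evaluate\n"
    "---\n# Body\n"
)
GOOD = BAD.replace(
    "Metrics: rich feedback for dspy.Evaluate",
    '"Metrics: rich feedback for dspy.Evaluate"',
)
print(risky_frontmatter_lines(BAD))
print(risky_frontmatter_lines(GOOD))
```

Quoting the value, as in `GOOD`, is the usual way to make such a line parse the same under strict and lenient YAML parsers.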

Validation:

uv run --with pytest python -m pytest tests/ -v -> 87 passed.
npx --yes skills add . --list -> Found 5 skills.
Temp Codex install wrote all five SKILL.md files under .agents/skills.

21 Apr 21:02

intertwine

v0.2.0

7261609

DSPy 3.2.x refresh for the skill pack. This release candidate moves the skills, references, manifests, and regression guards from DSPy 3.1.x assumptions to the real DSPy 3.2.0 surface, while adding a concrete example for the biggest new optimizer-facing capability.

Highlights

Retargeted the repo from DSPy 3.1.x / 3.1.3 to DSPy 3.2.x / 3.2.0 across README, skill docs, manifests, and maintainer guidance.
Added skills/dspy-gepa-optimizer/example_bettertogether.py, a dry-run-capable example of DSPy 3.2.0's generalized dspy.BetterTogether(metric=..., bootstrap=..., gepa=...) API.
Updated dspy-fundamentals to document 3.2.x type-mismatch warnings, warn_on_type_mismatch=False, and the new dspy.BaseLM capability/ContextWindowExceededError guidance for custom backends.
Updated dspy-rlm-module for DSPy 3.2.0's max_output_chars=10_000 default and kwargs-only tool dispatch.
Updated dspy-gepa-optimizer to explain the new BetterTogether chaining model while keeping plain GEPA as the default recommendation.
Added a regression guard against stale BetterTogether constructor guidance and flipped the RLM default guard to the 3.2.0 value.
Refreshed examples/01-rag-qa and examples/02-math-reasoning with clean DSPy 3.2.0 live reruns, and added per-example version_comparison.{md,json} files to make the old-vs-new story explicit.
Kept examples/03-invoice-extraction on its historical DSPy 3.1.3 artifact, with the 3.2.0 probe sweep documented instead of forcing a misleading saturated or unstable rerun.
Validated the install path end to end, including scripts/install.sh --dry-run, a temp-HOME install, and new guidance for UV_EXCLUDE_NEWER when uv hides DSPy 3.2.0.

Validation

uv run --with pytest python -m pytest tests/ -v → full suite passed
All 6 skill examples executed via --dry-run under DSPy 3.2.0
All 3 end-to-end examples executed via --dry-run under DSPy 3.2.0
Live reruns under DSPy 3.2.0:
- examples/01-rag-qa → 75.77 -> 100.00 with openrouter/mistralai/ministral-3b-2512
- examples/02-math-reasoning → 85.00 -> 93.33 with openrouter/mistralai/ministral-3b-2512
- examples/03-invoice-extraction → probe sweep recorded saturation or instability; historical artifact retained
scripts/install.sh --dry-run and a temp-HOME install both matched the documented dual-target install flow
During release prep, local uv run --with dspy still resolved DSPy 3.1.3 on this machine, so the 3.2.0 smoke tests were run in an isolated environment installed from the official 3.2.0 wheel.

20 Apr 01:09

intertwine

0.1.0

2c83049

First public release of dspy-agent-skills — a spec-compliant pack of 5 agent skills + 3 validated end-to-end examples that teach Claude Code and Codex CLI to build, optimize, and ship DSPy 3.1.x programs.

Skills

| Skill | Purpose |
| --- | --- |
| `dspy-fundamentals` | Signatures, Modules, Predict/ChainOfThought/ReAct, save/load |
| `dspy-evaluation-harness` | Rich-feedback metrics + `dspy.Evaluate` |
| `dspy-gepa-optimizer` | `dspy.GEPA` reflective optimization |
| `dspy-rlm-module` | `dspy.RLM` long-context / REPL reasoning |
| `dspy-advanced-workflow` | End-to-end pipeline orchestration |

Validated end-to-end examples

All three run on free OpenRouter models — $0 reproduction:

| Example | Task LM | Baseline | Optimized | Δ |
| --- | --- | --- | --- | --- |
| 01-rag-qa | GLM 4.5 Air (32B) | 81.15 | 100.00 | +18.85 |
| 02-math-reasoning | Liquid LFM 2.5 (1.2B) | 45.00 | 70.00 | +25.00 |
| 03-invoice-extraction | Liquid LFM 2.5 (1.2B) | 0.833 | 0.931 | +0.098 |
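The Δ column can be re-derived from the baseline and optimized scores with a few lines of Python. The figures are copied from the table above; `Decimal` is used only to avoid floating-point noise in the printed differences. Note that the first two rows appear to be on a 0 to 100 scale and the third on a 0 to 1 scale, so the deltas are not directly comparable across rows.

```python
from decimal import Decimal

# (example, baseline, optimized) copied from the table above.
RESULTS = [
    ("01-rag-qa", "81.15", "100.00"),
    ("02-math-reasoning", "45.00", "70.00"),
    ("03-invoice-extraction", "0.833", "0.931"),
]

deltas = {name: Decimal(opt) - Decimal(base) for name, base, opt in RESULTS}
for name, delta in deltas.items():
    print(f"{name}: +{delta}")
```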

Install

/plugin marketplace add intertwine/dspy-agent-skills
/plugin install dspy-agent-skills@dspy-agent-skills

Or for Claude Code + Codex CLI together:

git clone https://github.com/intertwine/dspy-agent-skills
cd dspy-agent-skills
./scripts/install.sh

What's covered by tests

60 validators across: SKILL.md frontmatter spec, JSON manifest schemas, Python AST on every example, and regression guards that prevent subtle teaching-material drift (e.g. dict-returning metrics, stale DSPy attribute names).
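As an illustration of what an AST-based guard against dict-returning metrics might look like, here is a minimal sketch. The function name and the sample source are hypothetical, and the repository's actual validators may work differently (for instance, by scanning documentation text rather than parsing Python).

```python
import ast


def dict_returning_functions(source: str) -> list:
    """Names of functions that contain a `return {...}` dict literal.

    Simplification: a dict-returning nested function also flags its parent.
    """
    offenders = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            for child in ast.walk(node):
                if isinstance(child, ast.Return) and isinstance(child.value, ast.Dict):
                    offenders.append(node.name)
                    break
    return offenders


# Hypothetical teaching snippet containing one bad and one acceptable metric.
SAMPLE = '''
def bad_metric(gold, pred, trace=None):
    return {"score": 1.0, "feedback": "ok"}

def good_metric(gold, pred, trace=None):
    return float(gold == pred)
'''

print(dict_returning_functions(SAMPLE))
```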

Compatibility

DSPy 3.1.x (tested against 3.1.3)
Claude Code skill spec as of 2026-04-17
Codex CLI Agent Skills format
Python 3.10+
