release: v0.18.0b1 — Multi-Agent Orchestration + Composable Pipelines (Beta) by johnnichev · Pull Request #33 · johnnichev/selectools

johnnichev · 2026-03-26T16:12:09Z

Summary

The biggest release since launch. Two headline features, both in plain Python:

Multi-Agent Orchestration

AgentGraph — directed graph of agent nodes with plain Python routing
SupervisorAgent — 4 strategies (plan_and_execute, round_robin, dynamic, magentic)
HITL via generator nodes — resumes at exact yield point (LangGraph restarts the node)
Parallel execution with 3 merge policies, checkpointing with 3 backends
Subgraph composition, loop/stall detection, budget propagation

Composable Pipelines

@step decorator on plain functions — zero learning curve
| operator for composition — thin sugar, not a DSL
parallel() and branch() primitives
Auto-tracing, retry with skip, kwargs filtering
Bridge to AgentGraph (pipeline as graph node)

Also in this release

License switched from LGPL-3.0 to Apache-2.0
35 bug fixes across 7 bug hunt rounds
Anthropic SYSTEM role fix (400 error with prompt compression/entity memory)
31 real-API e2e tests across OpenAI, Anthropic, Gemini
5 config.model → _effective_model fixes (coherence, summarization, entity/KG extraction)

Stats

Tests: 2275 → 2435 (+160)
Examples: 54 → 61 (+7)
StepTypes: 17 → 27 (+10)
Observer events: 32/29 → 45/42 sync/async (+13 each)
Bug fixes: 35
New source files: 7

Beta

This is a pre-release (0.18.0b1). Install with pip install selectools==0.18.0b1.
Regular pip install selectools stays on 0.17.7 until stable.

Checklist

All tests pass (2435 passed, 85 skipped)
Lint clean (black, isort, flake8, mypy, bandit)
Docs updated (README, ROADMAP, CHANGELOG, ARCHITECTURE, QUICKSTART, module docs, index)
Cross-reference audit passed (all counts consistent)
mkdocs build clean
31 real-API evals pass across all 3 providers
7 bug hunt rounds — codebase clean

AgentGraph engine with plain Python routing, SupervisorAgent with 4 strategies (plan_and_execute, round_robin, dynamic, magentic), HITL via generator nodes that resume at exact yield point, parallel execution with 3 merge policies, checkpointing with 3 backends, subgraph composition, loop/stall detection, AgentGraph.chain() one-liner. License switched from LGPL-3.0 to Apache-2.0. 10 new StepTypes (27 total), 13 new observer events (45/42 sync/async), 7 examples, 2 module docs, full roadmap detail through v0.21.0. Tests: 2397 passing (+122), Examples: 61 (+7)

Prompt compression, entity memory, knowledge graph, and knowledge memory all inject Message(role=Role.SYSTEM) into conversation history. Anthropic requires system instructions in the top-level `system` parameter only — passing role="system" in the messages array returns 400. Fix: AnthropicProvider._format_messages() now converts SYSTEM messages to user role. GeminiProvider._format_contents() gets the same explicit handling (was accidentally working via the else fallback). 2 regression tests added.

…Gemini) Validates AgentGraph with actual API calls against all 3 providers: - Linear graphs, conditional routing, callable nodes, cross-provider graphs - SYSTEM message injection (the exact bug reported by user) - Mixed OpenAI+Anthropic pipeline in one graph All 9 pass. Run with: pytest tests/test_orchestration_e2e.py --run-e2e

…cks) Replaced all mock-based evals with actual API calls against OpenAI, Anthropic, and Gemini. Every test parametrized across all 3 providers. Validates real scenarios: - Tool calling accuracy (correct tool for capital/math questions) - Multi-step pipelines (2-agent chains produce coherent output) - SYSTEM message survival (prompt compression, entity memory injection) - Cross-provider pipelines (OpenAI -> Anthropic in one graph) - HITL interrupt/resume with real LLM agent after gate - Parallel execution with real agents 22 tests, 0 mocks, all pass. Run: pytest tests/test_orchestration_evals.py --run-e2e

astream()/arun() parity (5 fixes): - astream() now fires all observer events (on_graph_start/end, on_node_start/end, etc.) - astream() enforces max_visits per node - astream() increments stall_count and records GRAPH_STALL trace steps - astream() records GRAPH_LOOP_DETECTED trace steps - astream() wraps _resolve_next_node in try-except Routing fixes (2): - _Update return type from routers now handled (applies patch, follows edge) - Router returning list of non-Scatter objects raises clear error Security (1): - FileCheckpointStore sanitizes graph_id to prevent path traversal Observer/trace (2): - Sync generator nodes now fire on_graph_interrupt event - on_graph_resume event fires when loading from checkpoint Supervisor (3): - _looks_complete() uses end-of-string matching (no more false positives) - _call_planner() logs exceptions instead of swallowing silently - Error messages updated for _Update support

Tests that would have caught the 15 bugs from the bug hunt: - astream() observer events fire (caught bugs 1-4) - astream() enforces max_visits (caught bug 2) - astream() tracks stall count (caught bug 3) - astream() routing error yields ERROR event (caught bug 10) - astream() result matches arun() output (catches parity drift) - update() routing applies patch and follows edge (caught bug 5) - Router returning list of strings raises (caught bug 6) - on_graph_resume fires on checkpoint load (caught bug 15) - FileCheckpointStore rejects path traversal (caught bug 7) These are framework tests (no API calls) that verify plumbing, not LLM behavior. They complement the real-API evals.

Critical: - SimpleStepObserver: swapped arg order in 13 graph event callbacks (was run_id, event_name — should be event_name, run_id) High: - Supervisor: "DONE" check now case-insensitive (handles "done", "Done") Medium: - Magentic strategy: tries "task" field before "reason" from LLM - Removed unused ThreadPoolExecutor import in graph.py - _scatter_patches cleaned up on error (try-finally) - Checkpoint: deep copy _interrupt_responses (was shallow)

Agent core: - _acall_provider() fired BOTH sync and async on_llm_start/end — now only fires async observers in async path (was double-notifying) - _astreaming_call() sync fallback used bare `if chunk:` instead of `isinstance(chunk, str)` — ToolCall objects could be stringified - Entity/KG extraction used config.model instead of _effective_model (wrong model when model_selector active) Providers: - OpenAI/Ollama _format_content() now guards None content with `or ""` - Gemini tool_result guards None content with `or ""` - Anthropic tool_result guards None content with `or ""`

Reverted the _acall_provider observer change — sync and async observers are DIFFERENT types that both need to fire in async paths. The "duplicate observer" report was a false positive. Remaining 5 real fixes from full-system bug hunt: - _astreaming_call() sync fallback: isinstance(chunk, str) guard - Entity/KG extraction: _effective_model instead of config.model - OpenAI/Ollama _format_content(): None guard with `or ""` - Gemini tool_result: None guard with `or ""` - Anthropic tool_result: None guard with `or ""`

Plain Python composability layer (anti-LCEL). No Runnable protocol, no base class, no paid debugger required. - @step decorator wraps plain functions (callable as normal Python) - | operator creates Pipeline (thin list of callables) - parallel() fans out to multiple steps, returns dict of results - branch() routes to named steps via classifier function - retry and on_error="skip" per step - Pipeline.__call__ bridges to AgentGraph (usable as graph node) - Async support: pipeline.arun() awaits async steps - Auto-tracing: every step records name, duration, status 36 tests, all pass.

- on_error="skip" no longer increments steps_run (was counting skipped steps) - Removed unreachable else blocks in retry for-loop (dead code) - branch() no longer calls asyncio.run() (was crashing in async contexts) - Removed dead code in async arun() retry path

- _filter_kwargs() inspects function signature before passing kwargs. Steps without **kwargs no longer crash when pipeline has extra kwargs. Applied to _execute_step, _aexecute_step, parallel(), and branch(). - Replaced global test counter with make_flaky() factory for isolation. - Added 5 kwargs tests. Tests: 2435 passed (+5)

Same pattern as the entity/KG extraction fix: coherence_model and summarize_model fallbacks used config.model (static) instead of _effective_model (respects model_selector). 3 locations fixed.

- CLAUDE.md: added pipeline.py to tree, test count 2435, roadmap updated - README.md: test count 2435, composable pipelines in What's New + features - CHANGELOG.md: added composable pipelines section, 35 bug fixes, stats - docs/index.md: test count 2435, pipeline in feature table + learning path - ROADMAP.md: v0.18.0 includes pipelines, v0.18.x now "Advanced Composition" - landing/index.html: test 2435, composable pipelines card, pills updated - mkdocs build: clean (no warnings)

Version 0.18.0b1 (beta). Regular pip install stays on 0.17.7. Install with: pip install selectools==0.18.0b1 Stable 0.18.0 will ship after real-world validation.

- README.md: added beta install note at top of What's New - CHANGELOG.md: header changed to [0.18.0b1] with beta note - landing/index.html: footer version updated to v0.18.0b1 - docs/CHANGELOG.md: synced

@tool

Three small fixes from the latest review pass. 1. **pip install terminal full-width on desktop** (Image #32): Removed `max-width: 440px` from `.terminal-install`. The terminal now spans 100% of the hero-content column, matching the width of the "Try the Builder" + "Read the Docs" button row directly below it. The hero grid (1fr 1fr split until 1024px) still constrains the column width itself, so this doesn't span the entire viewport on wide screens — it spans the column where the buttons live, which is what aligns them. 2. **SVG flow lines lingering after nodes fade out** (Image #33): Root cause: in the hero flow scene transitions, nodes fade via CSS `transition: opacity 0.3s` when `el.style.opacity = '0'`, but the SVG `<line>` elements (and `<circle>` pulses inside the same `<svg>`) had no fade animation — they stayed fully opaque until the next scene's `buildScene()` cleared the SVG via `svg.textContent = ''`. The visible result: nodes vanish at t=300ms, lines hang in the air alone for ~100ms, then snap to the next scene at t=400ms. Fix: added `transition: opacity 0.3s var(--ease)` to `.hf-svg`, and in playScene's "all done" handler set `svg.style.opacity = '0'` at the same time as `el.style.opacity = '0'` on the nodes. In buildScene, reset `svg.style.opacity = '1'` before drawing the next scene's lines so the new content fades in cleanly. Lines and nodes now vanish on the same curve. 3. **CONTRIBUTING.md stale facts** (user pointed out the version): The user noticed v0.19.2 in the header. While I was there I caught several other stale items and fixed them in the same pass: - Header version: v0.19.2 → v0.20.1 (current) - Header Python: "3.9+" → "3.9 – 3.13" (matches actual classifiers) - Header test status: "100%" → "95% coverage" (matches reality) - Project structure: "24 pre-built tools" → "33 pre-built tools" (verified via `grep -c '^@tool' src/selectools/toolbox/*.py`) - Project structure: "61 numbered examples (01–61)" → "88 numbered examples" (verified via `find examples -maxdepth 1 -name '*.py' | wc -l`) - Release script examples: 0.5.1 → 0.20.2 (current minor + 1) - Test command: `python tests/test_framework.py` → `pytest tests/` (the legacy single-file runner doesn't exist anymore) - Provider test path: `tests/test_framework.py` → `tests/providers/ test_your_provider.py` (current convention) - Section header: "Adding RAG Features (New in v0.8.0!)" → "Adding RAG Features" (v0.8.0 was many releases ago) The biggest substantive fix: rewrote the "Adding a New Tool" example. The old example used `Tool(name=..., parameters=[ ToolParameter(...)])`, which is the legacy class-based API. Selectools has used the `@tool()` decorator pattern for many versions now, where the function signature and docstring are introspected automatically. The old example was actively misleading new contributors into using a deprecated API. New example shows the modern decorator pattern with proper docstring conventions. What's NOT in this PR: - Full rewrite of the project structure block — only the genuinely stale numeric facts (24, 61) were fixed; the listed file names are still mostly accurate and a full architectural audit is out of scope for "the version is outdated" - No CHANGELOG entry — these are doc fixes, not user-facing code - No version bump

@tool

Three small fixes from the latest review pass. 1. **pip install terminal full-width on desktop** (Image #32): Removed `max-width: 440px` from `.terminal-install`. The terminal now spans 100% of the hero-content column, matching the width of the "Try the Builder" + "Read the Docs" button row directly below it. The hero grid (1fr 1fr split until 1024px) still constrains the column width itself, so this doesn't span the entire viewport on wide screens — it spans the column where the buttons live, which is what aligns them. 2. **SVG flow lines lingering after nodes fade out** (Image #33): Root cause: in the hero flow scene transitions, nodes fade via CSS `transition: opacity 0.3s` when `el.style.opacity = '0'`, but the SVG `<line>` elements (and `<circle>` pulses inside the same `<svg>`) had no fade animation — they stayed fully opaque until the next scene's `buildScene()` cleared the SVG via `svg.textContent = ''`. The visible result: nodes vanish at t=300ms, lines hang in the air alone for ~100ms, then snap to the next scene at t=400ms. Fix: added `transition: opacity 0.3s var(--ease)` to `.hf-svg`, and in playScene's "all done" handler set `svg.style.opacity = '0'` at the same time as `el.style.opacity = '0'` on the nodes. In buildScene, reset `svg.style.opacity = '1'` before drawing the next scene's lines so the new content fades in cleanly. Lines and nodes now vanish on the same curve. 3. **CONTRIBUTING.md stale facts** (user pointed out the version): The user noticed v0.19.2 in the header. While I was there I caught several other stale items and fixed them in the same pass: - Header version: v0.19.2 → v0.20.1 (current) - Header Python: "3.9+" → "3.9 – 3.13" (matches actual classifiers) - Header test status: "100%" → "95% coverage" (matches reality) - Project structure: "24 pre-built tools" → "33 pre-built tools" (verified via `grep -c '^@tool' src/selectools/toolbox/*.py`) - Project structure: "61 numbered examples (01–61)" → "88 numbered examples" (verified via `find examples -maxdepth 1 -name '*.py' | wc -l`) - Release script examples: 0.5.1 → 0.20.2 (current minor + 1) - Test command: `python tests/test_framework.py` → `pytest tests/` (the legacy single-file runner doesn't exist anymore) - Provider test path: `tests/test_framework.py` → `tests/providers/ test_your_provider.py` (current convention) - Section header: "Adding RAG Features (New in v0.8.0!)" → "Adding RAG Features" (v0.8.0 was many releases ago) The biggest substantive fix: rewrote the "Adding a New Tool" example. The old example used `Tool(name=..., parameters=[ ToolParameter(...)])`, which is the legacy class-based API. Selectools has used the `@tool()` decorator pattern for many versions now, where the function signature and docstring are introspected automatically. The old example was actively misleading new contributors into using a deprecated API. New example shows the modern decorator pattern with proper docstring conventions. What's NOT in this PR: - Full rewrite of the project structure block — only the genuinely stale numeric facts (24, 61) were fixed; the listed file names are still mostly accurate and a full architectural audit is out of scope for "the version is outdated" - No CHANGELOG entry — these are doc fixes, not user-facing code - No version bump

johnnichev added 16 commits March 26, 2026 00:10

fix: coherence check + memory summarization use _effective_model

4f2f206

Same pattern as the entity/KG extraction fix: coherence_model and summarize_model fallbacks used config.model (static) instead of _effective_model (respects model_selector). 3 locations fixed.

release: v0.18.0b1 — beta pre-release

59f4e50

Version 0.18.0b1 (beta). Regular pip install stays on 0.17.7. Install with: pip install selectools==0.18.0b1 Stable 0.18.0 will ship after real-world validation.

docs: reflect 0.18.0b1 beta in user-facing docs

3b4cc02

- README.md: added beta install note at top of What's New - CHANGELOG.md: header changed to [0.18.0b1] with beta note - landing/index.html: footer version updated to v0.18.0b1 - docs/CHANGELOG.md: synced

johnnichev merged commit ad60a6b into main Mar 26, 2026
3 of 8 checks passed

johnnichev deleted the feat/v0.18.0-multi-agent branch March 26, 2026 16:16

johnnichev mentioned this pull request Apr 7, 2026

fix: terminal full-width, flow line fade sync, CONTRIBUTING staleness #45

Merged

13 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

release: v0.18.0b1 — Multi-Agent Orchestration + Composable Pipelines (Beta)#33

release: v0.18.0b1 — Multi-Agent Orchestration + Composable Pipelines (Beta)#33
johnnichev merged 16 commits intomainfrom
feat/v0.18.0-multi-agent

johnnichev commented Mar 26, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

johnnichev commented Mar 26, 2026

Summary

Stats

Beta

Checklist

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant