release: v0.18.0b1 — Multi-Agent Orchestration + Composable Pipelines (Beta)#33
Merged
johnnichev merged 16 commits into main on Mar 26, 2026
AgentGraph engine with plain Python routing, SupervisorAgent with 4 strategies (plan_and_execute, round_robin, dynamic, magentic), HITL via generator nodes that resume at exact yield point, parallel execution with 3 merge policies, checkpointing with 3 backends, subgraph composition, loop/stall detection, AgentGraph.chain() one-liner. License switched from LGPL-3.0 to Apache-2.0. 10 new StepTypes (27 total), 13 new observer events (45/42 sync/async), 7 examples, 2 module docs, full roadmap detail through v0.21.0. Tests: 2397 passing (+122), Examples: 61 (+7)
Prompt compression, entity memory, knowledge graph, and knowledge memory all inject Message(role=Role.SYSTEM) into conversation history. Anthropic requires system instructions in the top-level `system` parameter only — passing role="system" in the messages array returns 400. Fix: AnthropicProvider._format_messages() now converts SYSTEM messages to user role. GeminiProvider._format_contents() gets the same explicit handling (was accidentally working via the else fallback). 2 regression tests added.
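A minimal sketch of the conversion described above. The `Role`, `Message`, and `format_messages_for_anthropic` names here are illustrative stand-ins, not the library's actual classes or the body of `AnthropicProvider._format_messages()`; the sketch only shows the idea of rewriting SYSTEM entries to the user role before the messages array is sent.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class Role(str, Enum):
    # Hypothetical stand-in for the framework's role enum.
    SYSTEM = "system"
    USER = "user"
    ASSISTANT = "assistant"


@dataclass
class Message:
    # Hypothetical stand-in for the framework's message type.
    role: Role
    content: str


def format_messages_for_anthropic(messages: List[Message]) -> List[Dict[str, str]]:
    """Build a messages array in which no entry carries role="system"."""
    formatted = []
    for msg in messages:
        # SYSTEM messages injected mid-conversation are downgraded to user role.
        role = "user" if msg.role is Role.SYSTEM else msg.role.value
        formatted.append({"role": role, "content": msg.content})
    return formatted


history = [
    Message(Role.USER, "What is the capital of France?"),
    Message(Role.SYSTEM, "Known entities: France (country)"),
    Message(Role.ASSISTANT, "Paris."),
]
formatted = format_messages_for_anthropic(history)
```

The top-level `system` parameter would still carry the agent's main system prompt; only the injected in-history messages are rewritten in this sketch.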
…Gemini)

Validates AgentGraph with actual API calls against all 3 providers:
- Linear graphs, conditional routing, callable nodes, cross-provider graphs
- SYSTEM message injection (the exact bug reported by user)
- Mixed OpenAI+Anthropic pipeline in one graph

All 9 pass. Run with: pytest tests/test_orchestration_e2e.py --run-e2e
…cks)

Replaced all mock-based evals with actual API calls against OpenAI, Anthropic, and Gemini. Every test is parametrized across all 3 providers. Validates real scenarios:
- Tool calling accuracy (correct tool for capital/math questions)
- Multi-step pipelines (2-agent chains produce coherent output)
- SYSTEM message survival (prompt compression, entity memory injection)
- Cross-provider pipelines (OpenAI -> Anthropic in one graph)
- HITL interrupt/resume with real LLM agent after gate
- Parallel execution with real agents

22 tests, 0 mocks, all pass. Run: pytest tests/test_orchestration_evals.py --run-e2e
astream()/arun() parity (5 fixes):
- astream() now fires all observer events (on_graph_start/end, on_node_start/end, etc.)
- astream() enforces max_visits per node
- astream() increments stall_count and records GRAPH_STALL trace steps
- astream() records GRAPH_LOOP_DETECTED trace steps
- astream() wraps _resolve_next_node in try-except

Routing fixes (2):
- _Update return type from routers now handled (applies patch, follows edge)
- Router returning list of non-Scatter objects raises clear error

Security (1):
- FileCheckpointStore sanitizes graph_id to prevent path traversal

Observer/trace (2):
- Sync generator nodes now fire on_graph_interrupt event
- on_graph_resume event fires when loading from checkpoint

Supervisor (3):
- _looks_complete() uses end-of-string matching (no more false positives)
- _call_planner() logs exceptions instead of swallowing silently
- Error messages updated for _Update support
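One way the graph_id sanitization could look, as a sketch only. The helper names `sanitize_graph_id` and `checkpoint_path`, the allowed character set, and the `.json` suffix are assumptions for illustration and are not taken from FileCheckpointStore itself.

```python
import re
import tempfile
from pathlib import Path

# Assumption: anything outside this conservative set is replaced with "_".
_UNSAFE_CHARS = re.compile(r"[^A-Za-z0-9_-]")


def sanitize_graph_id(graph_id: str) -> str:
    """Strip path separators, dots, and other characters that could escape the store directory."""
    cleaned = _UNSAFE_CHARS.sub("_", graph_id)
    if not cleaned:
        raise ValueError("graph_id must not be empty")
    return cleaned


def checkpoint_path(base_dir: Path, graph_id: str) -> Path:
    """Resolve the checkpoint file and verify that it stays inside base_dir."""
    base = base_dir.resolve()
    candidate = (base / f"{sanitize_graph_id(graph_id)}.json").resolve()
    # Defense in depth: reject anything that resolves outside the store directory.
    if base not in candidate.parents:
        raise ValueError(f"checkpoint path escapes store directory: {graph_id!r}")
    return candidate


base = Path(tempfile.gettempdir()) / "checkpoints"
safe = checkpoint_path(base, "../../etc/passwd")
```

With this approach a hostile id such as `../../etc/passwd` still maps to a file directly under the store directory instead of a path outside it.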
Tests that would have caught the 15 bugs from the bug hunt:
- astream() observer events fire (caught bugs 1-4)
- astream() enforces max_visits (caught bug 2)
- astream() tracks stall count (caught bug 3)
- astream() routing error yields ERROR event (caught bug 10)
- astream() result matches arun() output (catches parity drift)
- update() routing applies patch and follows edge (caught bug 5)
- Router returning list of strings raises (caught bug 6)
- on_graph_resume fires on checkpoint load (caught bug 15)
- FileCheckpointStore rejects path traversal (caught bug 7)

These are framework tests (no API calls) that verify plumbing, not LLM behavior. They complement the real-API evals.
Critical:
- SimpleStepObserver: swapped arg order in 13 graph event callbacks (was run_id, event_name; should be event_name, run_id)

High:
- Supervisor: "DONE" check now case-insensitive (handles "done", "Done")

Medium:
- Magentic strategy: tries "task" field before "reason" from LLM
- Removed unused ThreadPoolExecutor import in graph.py
- _scatter_patches cleaned up on error (try-finally)
- Checkpoint: deep copy _interrupt_responses (was shallow)
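A hedged sketch of a completion check that is both case-insensitive and anchored to the end of the text, combining this fix with the earlier end-of-string change. The `looks_complete` function and its regular expression are assumptions for illustration; the supervisor's real `_looks_complete()` may match differently.

```python
import re

# Assumption: the completion marker is the word "done", optionally followed by
# whitespace or closing punctuation, at the very end of the model output.
_DONE_AT_END = re.compile(r"\bdone\b[\s.!]*$", re.IGNORECASE)


def looks_complete(text: str) -> bool:
    """Return True only when the text ends with the completion marker."""
    return bool(_DONE_AT_END.search(text))


examples = {
    "All tasks finished. DONE": looks_complete("All tasks finished. DONE"),
    "Done.": looks_complete("Done."),
    "I am not done yet, continuing": looks_complete("I am not done yet, continuing"),
}
```

Anchoring at the end avoids treating a mid-sentence "done" as a completion signal, and `re.IGNORECASE` covers "done" and "Done" as well as "DONE".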
Agent core:
- _acall_provider() fired BOTH sync and async on_llm_start/end; now only fires async observers in async path (was double-notifying)
- _astreaming_call() sync fallback used bare `if chunk:` instead of `isinstance(chunk, str)`, so ToolCall objects could be stringified
- Entity/KG extraction used config.model instead of _effective_model (wrong model when model_selector active)

Providers:
- OpenAI/Ollama _format_content() now guards None content with `or ""`
- Gemini tool_result guards None content with `or ""`
- Anthropic tool_result guards None content with `or ""`
Reverted the _acall_provider observer change: sync and async observers are DIFFERENT types that both need to fire in async paths. The "duplicate observer" report was a false positive.

Remaining 5 real fixes from full-system bug hunt:
- _astreaming_call() sync fallback: isinstance(chunk, str) guard
- Entity/KG extraction: _effective_model instead of config.model
- OpenAI/Ollama _format_content(): None guard with `or ""`
- Gemini tool_result: None guard with `or ""`
- Anthropic tool_result: None guard with `or ""`
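The two guard patterns named above can be illustrated in isolation. This is a sketch under assumptions: `ToolCall`, `collect_text`, and `format_content` are hypothetical stand-ins, not the framework's own types or method bodies.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional, Union


@dataclass
class ToolCall:
    # Hypothetical stand-in for a streamed tool-call object.
    name: str
    arguments: dict = field(default_factory=dict)


def collect_text(chunks: Iterable[Union[str, ToolCall]]) -> str:
    """Join only the string chunks of a stream."""
    parts = []
    for chunk in chunks:
        # A bare `if chunk:` is also truthy for ToolCall objects, which could then
        # be stringified into the text; isinstance() keeps them out.
        if isinstance(chunk, str):
            parts.append(chunk)
    return "".join(parts)


def format_content(content: Optional[str]) -> str:
    """Guard against None content so the payload always carries a string."""
    return content or ""


text = collect_text(["Hel", ToolCall("search", {"q": "paris"}), "lo"])
empty = format_content(None)
```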
Plain Python composability layer (anti-LCEL). No Runnable protocol, no base class, no paid debugger required.
- @step decorator wraps plain functions (callable as normal Python)
- | operator creates Pipeline (thin list of callables)
- parallel() fans out to multiple steps, returns dict of results
- branch() routes to named steps via classifier function
- retry and on_error="skip" per step
- Pipeline.__call__ bridges to AgentGraph (usable as graph node)
- Async support: pipeline.arun() awaits async steps
- Auto-tracing: every step records name, duration, status

36 tests, all pass.
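To make the `@step` and `|` idea concrete, here is a self-contained toy version. It is not the selectools implementation: the `Step` and `Pipeline` classes and the `run()` method below are written from scratch for illustration, and the real layer's names and signatures may differ.

```python
from typing import Any, Callable, List


class Pipeline:
    """Toy pipeline: a thin list of callables run in order."""

    def __init__(self, steps: List[Callable[[Any], Any]]) -> None:
        self.steps = list(steps)

    def __or__(self, other: Callable[[Any], Any]) -> "Pipeline":
        # pipeline | step appends one more callable.
        return Pipeline(self.steps + [other])

    def run(self, value: Any) -> Any:
        for fn in self.steps:
            value = fn(value)
        return value


class Step:
    """Toy wrapper that stays callable like the plain function it wraps."""

    def __init__(self, fn: Callable[[Any], Any]) -> None:
        self.fn = fn
        self.name = fn.__name__

    def __call__(self, value: Any) -> Any:
        return self.fn(value)

    def __or__(self, other: Callable[[Any], Any]) -> Pipeline:
        # step | step starts a new pipeline.
        return Pipeline([self, other])


def step(fn: Callable[[Any], Any]) -> Step:
    return Step(fn)


@step
def trim(text: str) -> str:
    return text.strip()


@step
def upper(text: str) -> str:
    return text.upper()


@step
def exclaim(text: str) -> str:
    return text + "!"


pipeline = trim | upper | exclaim
result = pipeline.run("  hello ")
```

Because each wrapped step remains an ordinary callable, `trim("  hi ")` works outside any pipeline, which is the "callable as normal Python" property the commit describes.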
- on_error="skip" no longer increments steps_run (was counting skipped steps)
- Removed unreachable else blocks in retry for-loop (dead code)
- branch() no longer calls asyncio.run() (was crashing in async contexts)
- Removed dead code in async arun() retry path
- _filter_kwargs() inspects function signature before passing kwargs. Steps without **kwargs no longer crash when pipeline has extra kwargs. Applied to _execute_step, _aexecute_step, parallel(), and branch().
- Replaced global test counter with make_flaky() factory for isolation.
- Added 5 kwargs tests.

Tests: 2435 passed (+5)
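Signature-based filtering of this kind can be sketched with the standard library's `inspect.signature`. The `filter_kwargs` function below is an illustrative approximation of the idea, not the body of `_filter_kwargs()`; the sample `shout` and `passthrough` functions are made up for the demonstration.

```python
import inspect
from typing import Any, Callable, Dict


def filter_kwargs(fn: Callable[..., Any], kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Keep only the keyword arguments that fn can actually accept."""
    params = inspect.signature(fn).parameters
    # A function declaring **kwargs accepts everything, so pass all of it through.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)
    accepted = {
        name
        for name, p in params.items()
        if p.kind in (
            inspect.Parameter.POSITIONAL_OR_KEYWORD,
            inspect.Parameter.KEYWORD_ONLY,
        )
    }
    return {key: value for key, value in kwargs.items() if key in accepted}


def shout(text: str, suffix: str = "!") -> str:
    return text.upper() + suffix


def passthrough(text: str, **extra: Any) -> str:
    return text


extras = {"suffix": "?", "trace_id": "abc"}
filtered = filter_kwargs(shout, extras)
output = shout("hi", **filtered)
```

Calling `shout("hi", **extras)` directly would raise a TypeError because of the unexpected `trace_id` keyword; filtering first avoids that crash.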
Same pattern as the entity/KG extraction fix: coherence_model and summarize_model fallbacks used config.model (static) instead of _effective_model (respects model_selector). 3 locations fixed.
- CLAUDE.md: added pipeline.py to tree, test count 2435, roadmap updated
- README.md: test count 2435, composable pipelines in What's New + features
- CHANGELOG.md: added composable pipelines section, 35 bug fixes, stats
- docs/index.md: test count 2435, pipeline in feature table + learning path
- ROADMAP.md: v0.18.0 includes pipelines, v0.18.x now "Advanced Composition"
- landing/index.html: test count 2435, composable pipelines card, pills updated
- mkdocs build: clean (no warnings)
Version 0.18.0b1 (beta). Regular pip install stays on 0.17.7.

Install with: pip install selectools==0.18.0b1

Stable 0.18.0 will ship after real-world validation.
- README.md: added beta install note at top of What's New
- CHANGELOG.md: header changed to [0.18.0b1] with beta note
- landing/index.html: footer version updated to v0.18.0b1
- docs/CHANGELOG.md: synced
johnnichev added a commit that referenced this pull request on Apr 7, 2026
Three small fixes from the latest review pass.

1. **pip install terminal full-width on desktop** (Image #32): Removed `max-width: 440px` from `.terminal-install`. The terminal now spans 100% of the hero-content column, matching the width of the "Try the Builder" + "Read the Docs" button row directly below it. The hero grid (1fr 1fr split until 1024px) still constrains the column width itself, so this doesn't span the entire viewport on wide screens; it spans the column where the buttons live, which is what aligns them.

2. **SVG flow lines lingering after nodes fade out** (Image #33): Root cause: in the hero flow scene transitions, nodes fade via CSS `transition: opacity 0.3s` when `el.style.opacity = '0'`, but the SVG `<line>` elements (and `<circle>` pulses inside the same `<svg>`) had no fade animation, so they stayed fully opaque until the next scene's `buildScene()` cleared the SVG via `svg.textContent = ''`. The visible result: nodes vanish at t=300ms, lines hang in the air alone for ~100ms, then snap to the next scene at t=400ms.

   Fix: added `transition: opacity 0.3s var(--ease)` to `.hf-svg`, and in playScene's "all done" handler set `svg.style.opacity = '0'` at the same time as `el.style.opacity = '0'` on the nodes. In buildScene, reset `svg.style.opacity = '1'` before drawing the next scene's lines so the new content fades in cleanly. Lines and nodes now vanish on the same curve.

3. **CONTRIBUTING.md stale facts** (user pointed out the version): The user noticed v0.19.2 in the header. While I was there I caught several other stale items and fixed them in the same pass:
   - Header version: v0.19.2 → v0.20.1 (current)
   - Header Python: "3.9+" → "3.9 – 3.13" (matches actual classifiers)
   - Header test status: "100%" → "95% coverage" (matches reality)
   - Project structure: "24 pre-built tools" → "33 pre-built tools" (verified via `grep -c '^@tool' src/selectools/toolbox/*.py`)
   - Project structure: "61 numbered examples (01–61)" → "88 numbered examples" (verified via `find examples -maxdepth 1 -name '*.py' | wc -l`)
   - Release script examples: 0.5.1 → 0.20.2 (current minor + 1)
   - Test command: `python tests/test_framework.py` → `pytest tests/` (the legacy single-file runner doesn't exist anymore)
   - Provider test path: `tests/test_framework.py` → `tests/providers/test_your_provider.py` (current convention)
   - Section header: "Adding RAG Features (New in v0.8.0!)" → "Adding RAG Features" (v0.8.0 was many releases ago)

The biggest substantive fix: rewrote the "Adding a New Tool" example. The old example used `Tool(name=..., parameters=[ToolParameter(...)])`, which is the legacy class-based API. Selectools has used the `@tool()` decorator pattern for many versions now, where the function signature and docstring are introspected automatically. The old example was actively misleading new contributors into using a deprecated API. The new example shows the modern decorator pattern with proper docstring conventions.

What's NOT in this PR:
- Full rewrite of the project structure block: only the genuinely stale numeric facts (24, 61) were fixed; the listed file names are still mostly accurate and a full architectural audit is out of scope for "the version is outdated"
- No CHANGELOG entry: these are doc fixes, not user-facing code
- No version bump
Summary
The biggest release since launch. Two headline features, both in plain Python:
Multi-Agent Orchestration
- `AgentGraph`: directed graph of agent nodes with plain Python routing
- `SupervisorAgent`: 4 strategies (plan_and_execute, round_robin, dynamic, magentic)

Composable Pipelines
- `@step` decorator on plain functions; zero learning curve
- `|` operator for composition; thin sugar, not a DSL
- `parallel()` and `branch()` primitives

Also in this release
- `config.model` → `_effective_model` fixes (coherence, summarization, entity/KG extraction)

Stats
Beta
This is a pre-release (`0.18.0b1`). Install with `pip install selectools==0.18.0b1`. Regular `pip install selectools` stays on 0.17.7 until stable.

Checklist