Release v0.29.0 #925
…concept (#902)

A PipeLLM whose output concept refines the native JSON concept resolves to a structured-output model carrying a dict[str, Any] field inherited from JSONContent. SchemaToModelFactory generates that model with `from __future__ import annotations`, turning the field annotation into the string "dict[str, Any]", then rebuilds each class to resolve the strings. The rebuild namespace was filtered to `type` instances plus a hand-listed Literal, dropping typing.Any (a special form, not a type); model_rebuild then raised PydanticUndefinedAnnotation.

The rebuild namespace is now the exec namespace itself (minus __builtins__), so it carries exactly the names the generated source was written against and cannot drift as codegen emits other typing constructs. Covers the sender path (make_from_json_schema) and the cross-process receiver path (make_types_from_source). Adds unit tests for both paths plus an e2e .mthds bundle regression that exercises the LIVE structured-output build.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
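
The namespace problem described above can be reproduced without Pydantic. The following is a minimal sketch using only the standard library: `typing.get_type_hints` stands in for `model_rebuild`, the generated source is hypothetical, and `Optional` is added purely as an illustrative typing construct that is not a `type` instance. It is not the code of SchemaToModelFactory.

```python
from typing import Any, Optional, get_type_hints

# Hypothetical generated source. With the __future__ import, every
# annotation is stored as a string and must be resolved later.
GENERATED_SOURCE = '''
from __future__ import annotations
from typing import Any, Optional

class GeneratedModel:
    payload: dict[str, Any]
    note: Optional[str]
'''

exec_namespace: dict = {}
exec(GENERATED_SOURCE, exec_namespace)
generated_cls = exec_namespace["GeneratedModel"]

# Filtered namespace: keeps only `type` instances, so typing constructs
# that are not classes are silently dropped.
types_only = {
    name: value for name, value in exec_namespace.items() if isinstance(value, type)
}

# Full namespace: everything the generated source was written against,
# minus __builtins__.
full_namespace = {
    name: value for name, value in exec_namespace.items() if name != "__builtins__"
}


def resolves(namespace: dict) -> bool:
    """Return True if every string annotation can be evaluated in `namespace`."""
    try:
        # Pass a copy: evaluating annotations may add __builtins__ to the dict.
        get_type_hints(generated_cls, globalns=dict(namespace))
    except NameError:
        return False
    return True
```

Because the full namespace is derived from the exec namespace rather than from a hand-maintained allow-list, any new name the generated source imports is available for resolution automatically.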
* docs: replace TODOS.md with offline-mode implementation plan
  Lays out a TDD plan for offline-safe Pipelex setup: cache remote config on first init, fall back to cache when network is unavailable, and fail clearly when a referenced gateway model is missing from both fresh and cached specs.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs: tighten offline-mode plan after eng review
  Apply 9 review-driven edits to TODOS.md: move source provenance off GatewayConfig, cache raw JSON, keep RemoteConfigFetchError, require fresh data for doc generators, add retry-exhaustion and regression tests, replace test env-var backdoor with PIPELEX_REMOTE_CONFIG_URL.
* feat: implement RemoteConfigCache for offline mode support and add integration tests
* feat: Implement new remote config fetching logic with cache fallback and provenance tracking
  - Refactored `RemoteConfigFetcher.fetch_remote_config()` to return a `RemoteConfigResult` containing the fetched config, source of the config (FRESH or CACHED), and cache timestamp.
  - Introduced `RemoteConfigUnavailableError` for scenarios where both network fetch and cache fallback fail, providing user-facing error messages with remediation steps.
  - Added `RemoteConfigStaleWarning` to indicate when a cached config is used due to network issues.
  - Updated all existing callers of `fetch_remote_config()` to accommodate the new return type and error handling.
  - Enhanced tests to cover new behaviors, including success cases, network failures, and validation errors.
  - Ensured that the internal retry logic raises `RemoteConfigFetchError` while the outer layer handles user-facing errors appropriately.
* feat: Introduce GatewayUnknownModelError for missing models in gateway specs
  - Added GatewayUnknownModelError to handle cases where a model referenced in the deck is not found in the active gateway specs.
  - Enhanced model manager to enforce gateway model membership, raising the new error when discrepancies are detected.
  - Updated remote config fetcher to include source provenance (FRESH vs CACHED) for better error messaging and telemetry control.
  - Refactored related tests to ensure proper coverage for the new error handling and gateway configuration scenarios.
  - Introduced RemoteConfigSource enum to streamline source tracking for remote configurations.
* feat: Implement remote config cache priming for offline mode in agent CLI
* feat: Enhance offline mode support with new remote config handling and E2E tests
* feat: Add offline mode support and error handling for remote config issues
* fix: Improve error message clarity for RemoteConfigUnavailableError when cache is refused
* feat: Enhance offline mode support with improved cache priming and error handling
* test: Add unit tests for cycle detection in alias and waterfall handling
* refactor: Remove RemoteConfigFetchError references and improve error handling for remote config issues
* docs: document offline-mode envelope warnings and init cache priming
  Adds the `warnings` field to the agent CLI JSON success contract in agent-cli.md (was missing the field this branch introduces), and notes remote-config cache priming in init.md.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs: add Offline Behavior section to gateway.md
  Documents how Pipelex stays usable when the Gateway remote config service is unreachable: BYOK skips the fetch entirely, Gateway mode falls back to the primed on-disk cache, and only live inference still needs the network.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: recognize .pipelex/ config dir as a project root marker
  A directory containing a .pipelex/ config dir is now recognized as a project root. Previously such a directory fell through to the global ~/.pipelex/ config, silently ignoring the project's own overrides.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* plan archive
* fix: refuse stale/malformed gateway config in offline paths
  Address two P1 review findings on the offline-mode work:
  - remote_config_fetcher: a cache with a valid wrapper but a malformed raw_config let a raw Pydantic ValidationError escape the offline fallback. Catch it and raise RemoteConfigUnavailableError with the normal remediation. Reword the message to "no usable local cache" so it is accurate for both missing and unusable caches.
  - preprocess_test_models_cmd: _fetch_gateway_models swallowed require_fresh refusals into empty model lists, letting offline fixture generation proceed without any pipelex_gateway entries. Let the error propagate and surface a clear offline-mode panel.
  Adds regression tests for both paths.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: verify cache exists after priming to avoid misreporting success
  A successful remote-config fetch does not guarantee the on-disk cache was written: RemoteConfigFetcher treats the cache write as opportunistic and swallows OSErrors (read-only / full cache dir) with only a stderr warning. attempt_prime_remote_config_cache trusted the fetch result alone, so it could return primed=True while no usable cache existed, making `pipelex-agent init` emit `cache_primed: true` and leaving later offline runs to fail with RemoteConfigUnavailableError. Verify a usable cache exists via RemoteConfigCache.load() after the fetch; report priming failure with a clear remediation message otherwise.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix: re-validate cached payload when priming remote config cache
  The priming read-back check treated RemoteConfigCache.load() as a usability check, but load() only validates the cache wrapper, not the inner raw_config payload. A malformed payload could still report primed=True.
  Now call to_remote_config() and treat a ValidationError as a non-primed result, matching the existing check in RemoteConfigFetcher.fetch_remote_config.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
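
The offline-mode commits above describe a fetch that falls back to an on-disk cache, reports where the config came from, and refuses a cache whose inner payload is malformed. The sketch below illustrates that flow under stated assumptions. The names `RemoteConfigResult`, `RemoteConfigSource`, `RemoteConfigUnavailableError`, `raw_config` and `require_fresh` come from the commit messages; the function `fetch_with_cache_fallback`, its signature, the field names, the cache file layout and the `validate` rule are all hypothetical and do not reflect the actual Pipelex implementation.

```python
import json
import tempfile
import time
from dataclasses import dataclass
from enum import Enum
from pathlib import Path
from typing import Callable, Optional


class RemoteConfigSource(str, Enum):
    FRESH = "fresh"
    CACHED = "cached"


class RemoteConfigUnavailableError(Exception):
    """Neither the network nor a usable local cache could provide a config."""


@dataclass
class RemoteConfigResult:
    config: dict
    source: RemoteConfigSource
    cached_at: Optional[float] = None


def validate(raw_config: object) -> dict:
    # Hypothetical stand-in for real schema validation of the payload.
    if not isinstance(raw_config, dict) or "models" not in raw_config:
        raise ValueError("malformed remote config")
    return raw_config


def fetch_with_cache_fallback(
    fetch: Callable[[], dict], cache_path: Path, require_fresh: bool = False
) -> RemoteConfigResult:
    try:
        raw_config = fetch()
    except OSError as exc:  # network failure
        if require_fresh:
            raise RemoteConfigUnavailableError("fresh config required; cache refused") from exc
        try:
            wrapper = json.loads(cache_path.read_text())
            # Validate the inner payload too, not just the wrapper.
            config = validate(wrapper["raw_config"])
            return RemoteConfigResult(config, RemoteConfigSource.CACHED, wrapper["cached_at"])
        except (OSError, ValueError, KeyError, TypeError) as cache_exc:
            raise RemoteConfigUnavailableError(
                "network unreachable and no usable local cache"
            ) from cache_exc
    config = validate(raw_config)
    try:
        # Cache the raw JSON; the write is opportunistic and may fail silently.
        cache_path.write_text(json.dumps({"cached_at": time.time(), "raw_config": raw_config}))
    except OSError:
        pass
    return RemoteConfigResult(config, RemoteConfigSource.FRESH)


def demo() -> list:
    """Fetch once online (which primes the cache), then once offline."""
    def online() -> dict:
        return {"models": ["model-a"]}

    def offline() -> dict:
        raise ConnectionError("network unreachable")

    with tempfile.TemporaryDirectory() as tmp:
        cache_path = Path(tmp) / "remote_config.json"
        first = fetch_with_cache_fallback(online, cache_path)
        second = fetch_with_cache_fallback(offline, cache_path)
        return [first.source.value, second.source.value]
```

Because the cache write can fail without raising, a caller that wants to report "primed" would need to read the cache back and re-validate the payload, which is the check the last two fixes above add.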
…eration collapse (#891)

* Plans
* Drop text-then-object structuring path from PipeLLM stack
  Remove the entire "text then object" mechanism from PipeLLM down through cogt and Temporal layers. The StructuringMethod.PRELIMINARY_TEXT enum value stays so a future implementation can opt in; selecting it at runtime now raises NotImplementedError. Rename make_object_direct -> make_object and make_object_list_direct -> make_object_list since the "_direct" suffix only existed to contrast with the deleted text-then-object variants.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs and plan
* fix image gen test
* answer plan questions
* TODO: Collapse `tprl_content_generation/` Workflow Layer
* TODOS: complete Phase 0 audit, lock in activity-id strategy (i)
  Audit confirms today's per-workflow uniqueness invariant holds for the collapse refactor: every operator-side ContentGeneratorProtocol method is invoked at most once per WfPipeRouter execution (mutually-exclusive branches in PipeLLM/PipeImgGen, single unconditional call in PipeExtract, no calls from PipeCompose/StructuredContentComposer). Strategy (i) is adopted with two mitigations for Phase 1: split the duplicate "craft-image" default between make_single_image/make_image_list, and construct distinct activity_ids inside make_extract_pages for its two inner activity dispatches.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* plan
* plan reviewed
* plan review
* Phases 1-4: build ContentGeneratorInWorkflow behind feature flag
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Reintroduce preliminary_text via bundle elaboration + PipeStructure
  Bring back structuring_method = "preliminary_text" through a build-time elaboration pass instead of a runtime branch, and ship a reusable PipeStructure (Text -> StructuredConcept) operator.
  - New PipeStructure operator: blueprint, factory, runtime, spec, registered in PipeBlueprintUnion, PipeType, CoreRegistryModels, PipeSpecUnion / pipe_spec_map, and the MTHDS schema generator. Ships a structuring_prompt template under [cogt.llm_config.generic_templates].
  - New BundleElaborator (pipelex/core/interpreter/bundle_elaborator.py) that rewrites a PipeLLMBlueprint with structuring_method = preliminary_text into a PipeSequence[PipeLLM(text), PipeStructure]. Synthetic pipes are recorded on a new excluded side-table PipelexBundleBlueprint.elaboration_metadata so the language surface stays clean. Wired into PipelexInterpreter.
  - Runtime PipeLLM no longer knows about structuring_method; the field stays on PipeLLMBlueprint as a build-time directive. A blueprint-level model_validator surfaces "preliminary_text + Text output" errors at authoring time; the elaborator's check stays as defense-in-depth.
  - Tests: unit + integration coverage for PipeStructure, the elaboration pass (including image-input flow, multiplicity preservation, synthetic-name collision, main_pipe regression, defense-in-depth via model_construct), the spec round-trip, and an updated mthds-schema test that derives expected blueprint names from PipeType so future additions don't break it.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Phase 5: flip default to ContentGeneratorInWorkflow
  Rename env flag PIPELEX_USE_IN_WORKFLOW_CONTENT_GENERATOR -> PIPELEX_USE_LEGACY_CONTENT_GENERATOR and invert polarity so the new direct-activity content generator is the default under temporal.is_enabled. The old name's polarity (set = new) was awkward once new became default; the new name reads as a clear opt-out for the legacy ContentGeneratorChild. Mirrored in both Temporal integration conftests so explicit set_content_generator(...) tracks the production gate.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Add Phase 8 round-out tests for PipeStructure + preliminary_text
  Cover kajson round-trip for PipeStructureBlueprint and elaborated bundles, end-to-end interpreter parsing of structuring_method = preliminary_text, and integration tests exercising PipeStructure inside hand-authored PipeSequence and PipeBatch as well as the full elaborated path.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Phase 6: delete legacy tprl_content_generation/ surface; add real-PDF page-views test
  Collapses the tprl_content_generation/ workflow layer per TODOS.md Phase 6:
  - Removes 11 source files: 6 WfMake* / WfRenderPageViews workflow wrappers, ContentGeneratorTop + factory, ContentGeneratorChild + factory, and content_generator_models.py.
  - Removes 4 superseded test files (2 obsolete crafter tests + 2 already-commented-out historical tests).
  - Drops the PIPELEX_USE_LEGACY_CONTENT_GENERATOR env-flag branch from pipelex.py and both Temporal conftests; ContentGeneratorInWorkflowFactory is now wired unconditionally when temporal.is_enabled.
  - Drops the WfMake* entries from the crafting TaskPack (workflow_list=[]); the activity_list is unchanged.
  - Removes the top_crafter / child_crafter fixtures from the content_generation conftest.
  Also fills the previously-deferred test gap from Phase 4 by adding real-PDF end-to-end coverage for make_extract_pages with document_uri + should_include_page_views=True:
  - New WfTestContentGeneratorPdfPageViews workflow (registered in TEMPORAL_TEST_WORKFLOWS) exercises act_extract_gen_extract_pages plus act_render_page_views and asserts each PageContent.page_view is set.
  - New TestTprlContentGeneratorPdfPageViews integration test runs against a 2-page local PDF to catch attachment-loop ordering bugs.
  - Marked @extract @inference @dry_runnable @temporal.
  TODOS.md Phase 6 marked complete; deploy/ship sections of Phase 8 removed (handled separately when the branch lands).
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Strengthen PipeStructure e2e tests: multiplicity, LLM-call count, inline-structure example
  Cover all three output multiplicities (single, dynamic list, fixed list) on the elaborated preliminary_text path, assert exactly two LLM calls per run via the reporting registry, and add a parallel e2e fixture whose concept is declared entirely inline in the .mthds (HikingTripReport, 12 fields). Replace the toy SimpleResult with a richer RestaurantReview Python class for the non-inline tests, and switch all prompts to everyday non-AI topics (restaurants, hiking) for clearer signal in live runs.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Document PipeStructure operator and preliminary_text elaboration
  Adds the user-facing PipeStructure page and an Under-the-Hood page explaining the BundleElaborator mechanism, rewires the PipeLLM page around the new build-time elaboration model, and records the change in CHANGELOG.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Fix PipeStructure error_type for non-Text input
  The single input is present but its concept is incompatible; classify as INPUT_STUFF_SPEC_MISMATCH instead of MISSING_INPUT_VARIABLE so tooling and logging classify the failure correctly.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* plans
* Phase 7: cross-process e2e coverage + inference dispatch stopgap
  Extracts _inference_dispatch_kwargs as the single deletion point for the provisional inference_task_queue model, adds Tier 9/11 cross-process regression tests (object-gen JSON round-trip + extract two-activity contract), promotes split workers to required for image-gen tiers in the temporal-e2e-validate skill, and refreshes the per-activity routing v1 design with explicit upgrade targets for the new tests.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Address pre-landing review findings on PipeStructure / preliminary_text
  - Load synthetic helpers (`__draft_text`, `__structure`) alongside their exported parent in `LibraryManager._load_single_dependency`; without this, exported `preliminary_text` pipes ship a wrapping PipeSequence whose helpers are filtered out by the manifest, breaking consumers at runtime.
  - Emit `PipeLLMSpec.structuring_method` and `PipeStructureSpec` in the spec-to-TOML serializers (`builder/operations/pipe_ops.py` and `cli/agent_cli/commands/pipe_cmd.py`); both fields were silently dropped.
  - Reject multiplicity inputs (`Text[]`, `Text[N]`) on `PipeStructureBlueprint` to fail fast at parse time instead of crashing inside `working_memory.get_stuff_as_str` at runtime.
  - Document the process-local lifetime of `elaboration_metadata` (survives `model_copy`, dropped on `model_dump`/`model_validate`) on the field, in `under-the-hood/build-time-elaboration.md`, and via a regression test; capture future cross-boundary persistence as TODO #10.
  - Note the `StructuringMethod` import-path move in `[Unreleased]/Changed` and document the third (library-time / concept) layer of the output-Text guard.
  - Add unit coverage for the new contracts: spec-to-TOML round-trip for `structuring_method` and `PipeStructureSpec.model`; multiplicity-input rejection; `validate_output_with_library` rejecting Text-refining concepts; `validate_inputs_with_library` accepting Text-refining concepts; `PipeLLMBlueprint.validate_preliminary_text_output` direct test; `_load_single_dependency` synthetic-helper loading.
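
The build-time elaboration these commits refer to (a `preliminary_text` LLM pipe rewritten into a two-step sequence, with the synthetic helpers recorded in a side table) can be sketched as follows. `LLMPipe`, `StructurePipe` and `SequencePipe` are simplified hypothetical stand-ins, not the real blueprint classes, and the `{code}__draft_text` / `{code}__structure` naming is an assumption based on the helper suffixes mentioned above.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class LLMPipe:
    code: str
    output: str
    prompt: str
    structuring_method: Optional[str] = None


@dataclass(frozen=True)
class StructurePipe:
    code: str
    input: str
    output: str


@dataclass(frozen=True)
class SequencePipe:
    code: str
    steps: tuple
    output: str


def elaborate(pipes: dict) -> tuple:
    """Return (elaborated pipes, metadata mapping synthetic code -> parent code)."""
    elaborated: dict = {}
    metadata: dict = {}
    for code, pipe in pipes.items():
        is_target = isinstance(pipe, LLMPipe) and pipe.structuring_method == "preliminary_text"
        if not is_target:
            elaborated[code] = pipe
            continue
        if pipe.output == "Text":
            # Structuring Text into Text makes no sense; fail at build time.
            raise ValueError(f"pipe '{code}': preliminary_text requires a non-Text output")
        draft_code = f"{code}__draft_text"
        structure_code = f"{code}__structure"
        for synthetic in (draft_code, structure_code):
            if synthetic in pipes:
                raise ValueError(f"synthetic pipe name collides with '{synthetic}'")
        # Step 1: the original LLM pipe, now producing plain text.
        elaborated[draft_code] = replace(
            pipe, code=draft_code, output="Text", structuring_method=None
        )
        # Step 2: structure that text into the originally requested concept.
        elaborated[structure_code] = StructurePipe(structure_code, "Text", pipe.output)
        # The public code now names the wrapping sequence.
        elaborated[code] = SequencePipe(code, (draft_code, structure_code), pipe.output)
        metadata[draft_code] = code
        metadata[structure_code] = code
    return elaborated, metadata
```

The metadata table is what lets a loader bring the synthetic helpers along with their exported parent, which is the failure the first bullet above fixes.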
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Phase 7 follow-up: fix Tier 11 SKILL.md marker filter
  The Tier 11 pytest command in temporal-e2e-validate skill used `-m "extract and temporal"` but the test is only marked `@pytest.mark.temporal` (the `@extract` marker was deliberately dropped because the substitute activities mean no real Azure Document Intelligence or pypdfium2 dependency is exercised; see TODOS.md Phase 7). Running the documented command would deselect every test (pytest exit 5) and the operator would conclude "Tier 11 has no tests." Switch the filter to `-m temporal` and rewrite the surrounding paragraph to make the substitute-fixture rationale explicit so a reader does not assume real OCR credentials are required.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Phase 7 follow-up: code-review nits
  Three small cleanups from the post-Phase-7 code review:
  1. Tighten test_split_worker_extract_pages assertion to filter scheduled activity events by activity_type.name rather than activity_id suffix. The previous endswith(("-pages", "-render-page-views")) filter would false-positive against any future test passing a wfid like "my-pages" to a generator method on the same fixture workflow. Pinning to the activity name (act_extract_gen_extract_pages, act_render_page_views) is strictly more robust. Failure message also now includes (type, id) pairs so a regression is easier to triage.
  2. Drop ConfigDict(arbitrary_types_allowed=True) from the new FixtureLineItem / FixtureCustomer / FixtureInvoice models: every field is a primitive or another BaseModel, so the config is dead weight. Person keeps its config; that line predates this branch.
  3. Document why # noqa: TC001 is intentional on ContentGeneratorProtocol in wf_test_structured_output_cross_process.py: the import sits inside workflow.unsafe.imports_passed_through() and must stay runtime so Temporal can replay history.
  Adds a one-line comment so a future reader does not "fix" it under TYPE_CHECKING and break replay.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Add PR brief HTML for preliminary_text + PipeStructure work
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Phase 7 follow-up: address PR #878 review comments
  - Scope `_seen_activity_ids` cache by `(workflow_id, run_id)` so retries, `continue_as_new`, and id-reuse policies don't inherit prior-run entries.
  - Add `@update_job_metadata` to `make_templated_text` for consistent `content_generation_job_id` tracking.
  - Clarify `_inference_dispatch_kwargs` docstring scope (LLM text only).
  - Update SKILL.md hang-debug references to split-worker sessions.
  - Add regression tests for run_id scoping and templated_text job metadata.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Fix @variable regex to not match emails; add Tier 9b bug analysis
  Add negative lookbehind to @variable patterns in template preprocessor so emails like alice@example.com stay literal. Document cross-process ListContent decode bug in wip/.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Fix cross-process Temporal decode of ListContent for dynamic concepts
  Rename per-item type markers in WorkingMemory.dump_for_temporal() from kajson's reserved `__class__` / `__module__` to pipelex-private `__pipelex_class__` / `__pipelex_module__`. Kajson's universal decoder gates strictly on `__class__`, so nested dicts now pass through the Temporal data converter untouched and class binding stays inside pipelex's hydrator where the per-workflow ClassRegistry lives. Extends CLEAN_JSON_FIELDS_TO_SKIP to strip both marker families. Adds unit tests pinning the wire-format contract and the kajson-isolation invariant in both directions.
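
The marker rename in the last fix above can be illustrated with a toy decoder. This is only a sketch: `universal_object_hook` is a hypothetical stand-in for a decoder that acts on the reserved `__class__` key (its failure on unknown classes is an assumption made for illustration, not documented kajson behavior), and `hydrate` stands in for a hydrator that binds classes from a per-workflow registry.

```python
import json
from dataclasses import dataclass

PRIVATE_MARKERS = ("__pipelex_class__", "__pipelex_module__")


class UnknownClassError(LookupError):
    """Raised by the toy universal decoder for classes it cannot resolve."""


def universal_object_hook(obj: dict):
    # Toy universal decoder: it reacts only to the reserved `__class__` key.
    # Assumption: it has no access to dynamically generated classes.
    if "__class__" in obj:
        raise UnknownClassError(f"cannot resolve {obj.get('__module__')}.{obj['__class__']}")
    return obj


@dataclass
class Invoice:
    number: str
    total: float


def hydrate(value, registry: dict):
    """Bind privately marked dicts to classes from a local registry."""
    if isinstance(value, list):
        return [hydrate(item, registry) for item in value]
    if isinstance(value, dict):
        fields = {
            key: hydrate(item, registry)
            for key, item in value.items()
            if key not in PRIVATE_MARKERS
        }
        class_name = value.get("__pipelex_class__")
        if class_name is None:
            return fields
        return registry[class_name](**fields)
    return value


def make_wire(class_key: str, module_key: str) -> str:
    item = {class_key: "Invoice", module_key: "dynamic", "number": "A1", "total": 12.5}
    return json.dumps({"items": [item]})


private_wire = make_wire("__pipelex_class__", "__pipelex_module__")
reserved_wire = make_wire("__class__", "__module__")

# Private markers pass through the universal decoder as plain dicts...
passed_through = json.loads(private_wire, object_hook=universal_object_hook)
# ...and are bound later, where the right registry is available.
hydrated = hydrate(passed_through, {"Invoice": Invoice})
```

Keeping the markers private means the generic layer never attempts class binding, so the only component that instantiates dynamic classes is the one that holds their registry.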
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* lint comment
* Phase 8: docs + docstrings reflect direct-activity dispatch
  Removes all `WfMake*` / `wf_make_*` / `WfRenderPageViews` / `ActLLMGenText` references from `docs/under-the-hood/` and the split-worker test docstring, and adds the `[Unreleased] Changed` entry documenting the `tprl_content_generation/` workflow-layer collapse, including the surfaced page-views-augmentation fix in Temporal mode that was previously a silent no-op via `WfMakeExtract`.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Move TODOS.md into wip/ as executed v2-plan; add co-dev HTML brief
  TODOS.md was the live execution log for the collapse-content-generation workflow-layer refactor, now in its final state with all 9 phases checked and decisions/follow-ups recorded. Move it next to its v2 analysis as collapse-content-generation-workflow-layer-v2-plan.md (overwriting the stale pre-execution plan). Add an HTML brief summarizing the refactor for co-developer onboarding.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Fix docs links to deleted llm-structured-generation-config page
  Remove dangling references to the doc page deleted alongside the text-then-object structuring path. Reframe the llm-integration Structured Output section around direct provider-native structured outputs, which is the only supported approach.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* move plan
* Refine per-activity queue routing plan: cold-start checklist + TTO orthogonality note
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Per-activity, per-handle Temporal queue routing (v1)
  Replace the provisional `inference_task_queue` (LLM-text-only override) with a general `activity_queues` table on `WorkerConfig`.
  Each entry declares a per-activity `default` queue and an optional `by_handle` map keyed by runtime handle (LLM model handle, `img_gen_handle`, `extract_handle`). `WorkerConfig.resolve_queue(activity_name, routing_key)` walks three layers: per-handle override, activity default, worker-wide `task_queue`. Every `ContentGeneratorInWorkflow` dispatch site now passes `task_queue=resolve_queue(...)` uniformly; the asymmetric LLM-text kwarg and the `_inference_dispatch_kwargs` stopgap are gone, and `inference_task_queue` is deleted (no backward-compat shim).
  Tests: rewrote unit pins to assert uniform task_queue contract; added `test_worker_config_resolve_queue.py` covering all three resolution layers; migrated the split-worker LLM-text integration fixture to the new config; added `route_activities_to(...)` helper so object/extract substitute-activity tests route back to their UUID queue.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Routing validation: Step 8 battery + TOML examples + parsing test
  Expand the temporal-e2e-validate skill with a self-contained Step 8 that proves per-activity, per-handle routing works end-to-end against a live Temporal server. Replaces the previous Tier 10 (which was awkwardly embedded inside Tier 4/5 and only covered image-gen) with three sub-tiers:
  - 10a (multi-activity isolation): act_llm_gen_text (activity default) + act_img_gen_images dispatched to dedicated workers, runner sees 0 hits.
  - 10b (per-handle routing): same activity, two distinct model handles (claude-4.6-sonnet vs gemini-flash-latest) land on by_handle workers, proving the per-handle layer wins over the activity default.
  - 10c (two activities sharing one route): act_extract_gen_extract_pages + act_render_page_views (routing_key=None) both land on q_extract, exercising the activity-default fallback for handle-less activities.
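
The three-layer lookup that `resolve_queue` is described as performing in the v1 routing commit can be sketched as below. The class and field names follow the commit message (`WorkerConfig`, `ActivityRouteConfig`, `activity_queues`, `default`, `by_handle`, `task_queue`); the dataclass shapes and the queue names in the usage lines are simplified assumptions, not the real Pydantic config models.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ActivityRouteConfig:
    default: str      # queue for this activity when no handle override applies
    by_handle: dict   # runtime handle -> queue (passed explicitly, may be {})


@dataclass
class WorkerConfig:
    task_queue: str        # worker-wide fallback queue
    activity_queues: dict  # activity name -> ActivityRouteConfig

    def resolve_queue(self, activity_name: str, routing_key: Optional[str]) -> str:
        route = self.activity_queues.get(activity_name)
        if route is None:
            # Layer 3: no entry for this activity, use the worker-wide queue.
            return self.task_queue
        if routing_key is not None and routing_key in route.by_handle:
            # Layer 1: a per-handle override wins over everything else.
            return route.by_handle[routing_key]
        # Layer 2: the activity's own default queue.
        return route.default


config = WorkerConfig(
    task_queue="q_main",
    activity_queues={
        "act_llm_gen_text": ActivityRouteConfig(
            default="q_llm", by_handle={"claude-4.6-sonnet": "q_claude"}
        ),
        "act_extract_gen_extract_pages": ActivityRouteConfig(default="q_extract", by_handle={}),
    },
)
```

With this configuration, a handle listed in `by_handle` lands on its dedicated queue, any other handle for the same activity uses the activity default, and an activity with no entry falls back to `task_queue`.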
  Add commented activity_queues examples to pipelex.toml, .pipelex/pipelex.toml, and pipelex/kit/configs/pipelex.toml so operators have a copy-paste invitation to override routing per deployment. Add a unit test that parses a representative activity_queues TOML fragment into WorkerConfig and exercises resolve_queue on it (regression guard for the commented examples). New fixtures: per_handle_routing.mthds (Tier 10b) and pdf_extract_page_views.mthds + pdf_extract_inputs.json (Tier 10c). All three sub-tiers validated live against PR #879's resolver.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* temporal-e2e-validate: document Pipelex Gateway path for Tier 10c
  Tier 10c was marked conditional pending direct Azure Document Intelligence credentials, but PIPELEX_GATEWAY_API_KEY + PIPELEX_INFERENCE_API_KEY proxy that backend. Document the gateway path with the existing pdf_extract_page_views.mthds bundle, an inputs JSON template, and a warning not to substitute mistral-ocr/deepseek-ocr.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* change-doc and improve config
* Move activity_queues default to main TOML, drop Python default
  Per the project rule that defaults live in pipelex/pipelex.toml (not in the class definition), set activity_queues = {} explicitly in the main config and remove Field(default_factory=dict) from both WorkerConfig.activity_queues and ActivityRouteConfig.by_handle. Commented routing examples now live only in the kit config copy that pipelex init config surfaces to users. Test fixtures and helper call sites updated to pass by_handle={} explicitly.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* plans
* Temporal queue options + worker-runtime profiles (v2 config)
  Phase 0-6 of the queue-options-and-worker-profiles plan.
  Adds per-queue submitter options (timeouts, retry, server-side rate limit) and per-worker runtime profiles (concurrency slots, pollers, worker-local rate limit) on top of v1 per-activity routing.
  - Schema: QueueOptions, HandleOptions, WorkerRuntimeProfile, WorkerRuntimeProfilesConfig, WorkerTuningMode, DispatchOptions. Renamed worker_config.task_queue -> default_task_queue.
  - Resolver: WorkerConfig.resolve_dispatch() composes baseline -> queue_options -> handle_options last-wins for scalars, additively for non_retryable_error_types. Every workflow.execute_activity in ContentGeneratorInWorkflow now goes through the resolver, fixing the workflow_execution_timeout-as-activity-timeout bug.
  - Worker tuning: TemporalTaskManager.make_worker reads a WorkerRuntimeProfile (--profile CLI flag). Queue-level max_task_queue_activities_per_second flows from queue_options into Worker(...).
  - Validation: lenient warn at config load on routing entries naming queues with no queue_options entry; strict fail at worker CLI startup on unknown --task-queue (Levenshtein 'did you mean?' suggestion). Overlay layers reject non-empty non_retryable_error_types (must use _extra) to prevent silent drops.
  - Tracing: is_dispatch_resolution_traced flag emits per-call resolver trace lines with source-layer attribution.
  - Specialized scopes: runner-llm / runner-img-gen / runner-extract / runner-jinja2 for deployment manifests with one worker pool per backend class.
  - E2E skill: SKILL.md Step 9 documents v2 scenarios A-F.
  - Cleanup: deleted dead pipelex/temporal/wrapper/.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Use tomli. cleanenv not erase lock
* Address PR #880 review comments (lazy temporalio + queue fallback + retry layer split)
  Fix four actionable review-bot threads:
  1. Restore lazy temporalio import (chatgpt-codex P1). RetryPolicy moves back under TYPE_CHECKING; DispatchOptions becomes a @dataclass so the module loads without the optional temporal extra installed.
  2. Hybrid workflow-local queue fallback (chatgpt-codex P1). When activity_queues is empty (default config), resolve_queue returns None and to_execute_kwargs omits task_queue; Temporal then uses the workflow's own queue, restoring the with_conditional_worker test isolation pattern. With any routing configured, the prior explicit-routing semantic is preserved.
  3. Derive content-generation activity names from CRAFTING task pack (cubic-dev-ai P2). Test no longer hardcodes the set; new content-gen activities are automatically tracked.
  4. Split RetryPolicyConfig into baseline + overlay classes (greptile P2). ConfigModel's extra="forbid" now enforces the layer asymmetry at the type level; baseline non_retryable_error_types_extra entries that used to be silently dropped now fail at config load. Removes two now-redundant overlay-side validators; drops the misleading baseline _extra TOML stub.
  Adds an AST regression test that fails if temporalio appears at module level in config_temporal.py, plus per-fix regression assertions in the dispatch / resolve_queue / TOML parsing test suites.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* fix Makefile
* Address PR #880 follow-up review comments (allowed-tools + default_task_queue rename docs)
  - Add `temporal` and `jq` to allowed-tools in the temporal-e2e-validate skill so Step 9's workflow-history checks (which shell out to those commands to read per-queue `start_to_close_timeout` from the live server) are actually executable under the permission policy (cubic-dev-ai P1).
  - Update stale `worker_config.task_queue` references to the renamed `default_task_queue` field in `.claude/skills/temporal-e2e-validate/SKILL.md` and `CHANGELOG.md` so the documentation matches the v2 config surface (cubic-dev-ai P2 × 2).
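
An AST regression check of the kind this log describes (flag imports of an optional package that execute at module import time, while tolerating `TYPE_CHECKING` guards and lazy imports inside functions) might look like the sketch below. It is an illustration built on the standard `ast` module, not the project's actual test; the function name `find_import_time_imports` and the sample source are hypothetical.

```python
import ast
from typing import Optional


def find_import_time_imports(source: str, package: str) -> list:
    """Return line numbers of imports of `package` that run at module import time."""
    hits: list = []

    def matches(name: Optional[str]) -> bool:
        return name is not None and (name == package or name.startswith(package + "."))

    def is_type_checking(test: ast.expr) -> bool:
        # Recognizes `if TYPE_CHECKING:` and `if typing.TYPE_CHECKING:`.
        if isinstance(test, ast.Name):
            return test.id == "TYPE_CHECKING"
        return isinstance(test, ast.Attribute) and test.attr == "TYPE_CHECKING"

    def scan(statements: list) -> None:
        for node in statements:
            if isinstance(node, ast.Import):
                if any(matches(alias.name) for alias in node.names):
                    hits.append(node.lineno)
            elif isinstance(node, ast.ImportFrom):
                if node.level == 0 and matches(node.module):
                    hits.append(node.lineno)
            elif isinstance(node, ast.If):
                # A TYPE_CHECKING body never runs at runtime; any other branch may.
                if not is_type_checking(node.test):
                    scan(node.body)
                scan(node.orelse)
            elif isinstance(node, ast.Try):
                scan(node.body)
                for handler in node.handlers:
                    scan(handler.body)
                scan(node.orelse)
                scan(node.finalbody)
            # Function and class bodies are skipped on purpose, matching the
            # scanner described in this log: lazy imports there are intended.

    scan(ast.parse(source).body)
    return hits


SAMPLE = '''
from typing import TYPE_CHECKING
if TYPE_CHECKING:
    from temporalio.common import RetryPolicy
try:
    from temporalio import workflow
except ImportError:
    workflow = None
def lazy():
    import temporalio
'''
```

Scanning `SAMPLE` for `temporalio` flags only the import inside the `try` block: the `TYPE_CHECKING` import and the import inside `lazy()` are both ignored.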
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Improve AST regression scanner to catch guarded top-level temporalio imports
  Address cubic-dev-ai follow-up on the AST regression test for PR #880 #1: the original scanner only walked `tree.body`, so a `try: from temporalio... ` or `if SOME_FLAG: from temporalio...` block at the module top level (which still executes at import time) would slip through unflagged. The scanner now recurses into `ast.If` (except `if TYPE_CHECKING:` bodies, which never run at runtime) and `ast.Try` (body, handlers, orelse, finalbody). Function and class bodies stay skipped; lazy imports inside them only fire when the function is called, which is the intended pattern.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Lock in extra="forbid" on RetryPolicyConfig split classes
  Type-design review of PR #880's RetryPolicyConfig split flagged that the layer asymmetry invariant (baseline owns the main list, overlays own _extra) is load-bearing on ConfigModel's ambient extra="forbid" setting. If a future contributor ever flipped that to "allow" on either subclass, the silent-drop bug for baseline non_retryable_error_types_extra would silently come back. Add a one-line assertion per class so the invariant fails loudly at unit-test time instead of regressing at runtime.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Polishing
* plans
* Temporal IDs + observability redesign (workflow_id, activity_id, search attributes)
  Workflow IDs derive directly from pipeline_run_id ({env_prefix}{pipeline_run_id}); session/random/class-name components are removed. Child workflow IDs use `/` as the path separator. Activity IDs are no longer customized; the Temporal SDK assigns deterministic integers per workflow run, removing the duplicate-id failure mode and the LRU + replay-short-circuit machinery that defended against it.
  The wfid parameter is dropped from PipeRunProtocol, PipeRouterProtocol, ContentGeneratorProtocol, and every implementation. Every workflow start sets five Keyword search attributes (PipeCode, PipelineRunId, SessionId, UserId, DomainCode), a static_summary, and static_details; every execute_activity call carries a per-call summary= built from the new observability helpers. A soft-fail bootstrap check at worker boot warns when the namespace is missing the required search attributes, including the registration command. Tests cover the new helpers, the executor passthrough, the workflow-id construction, the search-attribute dict, and the bootstrap check.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Migrate Temporal search attributes to TypedSearchAttributes + unify child-spawn paths
  Phase 5 of the Temporal IDs/Naming redesign: replace deprecated dict-based search attributes with TypedSearchAttributes throughout the workflow layer. Adds five module-level SearchAttributeKey constants in observability.py and flips the type annotations on the WorkflowExecutor surface. As a follow-up, unifies the last raw workflow.execute_child_workflow call in wf_pipe_run.py to route through WorkflowExecutor.execute_child_workflow, matching the pattern already used by TemporalPipeRouter's child branch.
  Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* Pre-Phase-6 cleanup: tighten exception handling + WfPipeRun failure-path test
  - Replace catch-all except Exception in workflow_caller.py with named SDK exceptions (WorkflowAlreadyStartedError, RPCError, WorkflowFailureError on the client path; ChildWorkflowError-only on the child path).
  - Add failure-path integration test for WfPipeRun pinning the exception-type shift from ChildWorkflowError to WorkflowExecutionError introduced by the Phase 5 child-spawn-path unification.
- Fix latent production hang: register WorkflowExecutionError via workflow_failure_exception_types on the production Worker (and the test Worker). Without this, any workflow re-raising WorkflowExecutionError triggers indefinite workflow-task retry instead of failing terminally, because WorkflowExecutionError is not a temporalio.exceptions.FailureError subclass. - Fix TODOS.md doc path: docs/under-the-hood/temporal-deployment.md. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Add wip note for Temporal error-handling revamp Captures the deferred design work to make WorkflowExecutionError inherit from temporalio.exceptions.ApplicationError, which would remove the workflow_failure_exception_types Worker-side registration added in 117bbe01. Documents scope, open questions, and trigger conditions for the eventual cleanup. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Phase 6: hard-fail worker boot + configurable search attributes + setup CLI Flip the Temporal search-attribute boot check from warn-and-continue to hard-fail on reachable namespaces — the previous framing was dishonest because real clusters reject every workflow start that references an unregistered attribute. Add a [temporal.search_attributes] config block (master enabled toggle + opt-in subset of the five built-ins), and a new `pipelex setup-temporal-namespace` CLI that wraps the registration via the same connection config the worker uses, with a permission-denied fallback runbook for Temporal Cloud namespaces. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fixes * Sync docs with reverted child-spawn unification TODOS.md, the WfPipeRun failure-path test docstring, the worker workflow_failure_exception_types comment, and the WorkflowExecutor child-spawn wrapper docstrings still described the briefly-unified state from the Phase 5 follow-up. 
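Illustration for the Phase 6 boot check above: a minimal sketch of a hard-fail check, assuming the worker already holds the set of attribute names registered on a reachable namespace. `check_search_attributes` and `MissingSearchAttributesError` are hypothetical names; only the five attribute names, the `enabled` toggle, and the `pipelex setup-temporal-namespace` command come from the commit messages. The unreachable-namespace case is not modeled here.

```python
from typing import Iterable

# The five built-in Keyword search attributes named in the commit message.
BUILTIN_SEARCH_ATTRIBUTES = (
    "PipeCode",
    "PipelineRunId",
    "SessionId",
    "UserId",
    "DomainCode",
)


class MissingSearchAttributesError(RuntimeError):
    """Hypothetical error raised at worker boot when attributes are unregistered."""


def check_search_attributes(
    registered: Iterable[str],
    enabled: bool = True,
    selected: Iterable[str] = BUILTIN_SEARCH_ATTRIBUTES,
) -> None:
    """Fail loudly if any selected search attribute is missing from the namespace."""
    if not enabled:
        # Master toggle off: custom attributes are not used, so nothing to check.
        return
    registered_names = set(registered)
    missing = [name for name in selected if name not in registered_names]
    if missing:
        raise MissingSearchAttributesError(
            "Temporal namespace is missing search attributes: "
            + ", ".join(missing)
            + ". Run `pipelex setup-temporal-namespace` to register them."
        )
```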
The Phase 6 follow-up (commit ac8e2335) reverted that unification for replay-determinism reasons — WorkflowExecutorFactory.create_executor seeds config-derived options that would be baked into the recorded StartChildWorkflowExecution command — but the surrounding prose was not updated to match. Updated narrative to reflect the current state: both child-spawn sites call workflow.execute_child_workflow(...) directly and wrap ChildWorkflowError as WorkflowExecutionError in-place; the unused WorkflowExecutor.execute_child_workflow / start_child_workflow wrapper methods now carry warnings explaining the in-workflow replay-determinism trap. No code-behavior changes. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Reorganize wip/temporal-primitives/: archive shipped plans, renumber the rest Move plans for shipped work (id-and-naming-plan pre-checkpoints, collapse-content-generation v2 plan+analysis+HTML, per-activity-queue-routing-v1, queue-options-and-worker-profiles plan+design, text-then-object brief, operators-as-activities analysis, workflow-and-activity-ids problem statement) into wip/archive/. Renumber the four surviving files with sortable prefixes: 00-temporal-id-primitives (evergreen reference), 01-id-and-naming-design (refreshed status: "Implemented"), 02-id-and-naming-plan (formerly top-level TODOS.md; refreshed status: "Phases 1-6 shipped"), 03-temporal-error-handling-revamp (the only deferred open item). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs * docs and plans * Expand deferred-items: capture why snapshot-on-workflow-input was wrong, point at Worker Versioning Document the rejected `TemporalDispatchSnapshot` approach (architectural inversion, payload bloat, defeats central-config purpose, not Temporal-idiomatic) and the three-option roadmap (docs+replay test / Worker Versioning / thin search-attrs-only snapshot) so future readers don't re-attempt the same wrong-shaped fix. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * small fixes * recap * polish * docs * fix skill for tests and fix bugs * add CV batch screening pipeline and tests for deeply-nested controller validation * fix PR #891 review comments: task-routing TOML path, outdated routing claim, contradictory escape-hatch example, skill allowed-tools gaps + unbounded worker-start waits - docs/distributed-execution/task-routing.md: correct activity_queues path to [temporal.worker_config.activity_queues.*] throughout (per-activity routing, per-handle overlays, resolution-order list) - docs/under-the-hood/pipe-routing-and-execution.md: replace stale act_llm_gen_text-only claim with current resolve_dispatch behavior across all content-generation activities - wip/temporal-primitives/id-and-naming.html: flip escape-hatch example to enabled=false so it matches the "turn off custom attributes entirely" header - .claude/skills/temporal-e2e-validate/SKILL.md: add timeout/pkill/sleep/echo/tail/seq to allowed-tools; replace unbounded `until ... grep ...; do sleep 1; done` with bounded 30s waits that dump last 50 lines and exit 1 on timeout (both two-scoped-workers and single-worker blocks) - wip/temporal-next/01-deferred-items.md: cross-cite cubic-dev-ai on the replay-determinism deferral so both reviewers' flags are visible * fix cv_batch_screening_job missing live-mode reporting registry; dedup with shared helper The _cv_job_iter helper was copy-pasted from pipe_job_from_library but dropped the open_registry/close_registry bracket and let build_pipe_job mint its own random pipeline_run_id, so --pipe-run-mode live runs would fail in reporting. Extended pipe_job_from_library with an optional working_memory_builder hook, deleted the duplicate, and routed cv_batch_screening_job through the single source of truth. 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * hide elaboration_metadata from JSON schema and document missing PipeExtract / ReasoningEffort fields Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * refactor: move kit-config sync into pipelex-dev CLI (#915) * fix: update configuration guidelines and defaults in pipelex.toml and documentation * refactor: move kit-config sync into pipelex-dev CLI Replace the Makefile `up-kit-configs` rsync recipe with a native `pipelex-dev sync-kit-configs` command. Add a reusable `mirror_dir` utility (recursive copy + delete) that derives its exclude list from the single source of truth in pipelex/kit/paths.py — the same sets `check-config-sync` enforces, so a sync is always followed by a passing check. Drops the rsync dependency (not portable to Windows) and the `$(shell python -c ...)` indirection. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: address PR #915 review-agent comments on kit-config sync - mirror_dir: validate source_dir is an existing directory before the delete pass, so an invalid source can no longer wipe the target tree - mirror_dir: unlink target-only directory symlinks instead of passing them to shutil.rmtree, which raised and aborted the sync - mirror_dir: record created directories in MirrorDirResult.created_dirs so an added empty directory counts as a change and dry-run reports it - sync-kit-configs: CLI pre-check uses is_dir(); display created dirs - update/check-gateway-models: write and verify the gateway model docs in both .pipelex/ and the packaged pipelex/kit/configs/ locations so the shipped kit copy can no longer silently go stale - pipelex.toml: replace activity_queues = {} inline table with the [temporal.worker_config.activity_queues] empty table header so users can add routing sub-tables without a TOML parse error Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: replace target symlinks in mirror_dir 
instead of writing through them When a target path was a symlink, is_file() followed the link and copy_file() wrote through it, corrupting the external file the symlink pointed to. Pass 2 now unlinks target symlinks before copying, so the mirror tree always holds real files. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Feature/error handling 2 (#913) * docs: reorganize error-handling wip docs into track-based structure Replaces phase-numbered error-handling docs with a track-based layout under wip/error-handling/. Each track is a self-contained concern (metadata model, worker classification, retry, CLI delivery, Temporal integration, testing) with current state, open gaps, and followups — no implicit ordering between tracks. Refreshes stale file paths and worker-tier framings to match the current codebase. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * plans * feat: implement UNKNOWN error category and enhance error handling - Added `UNKNOWN` category to `InferenceErrorCategory` to prevent misclassification of unrecognized SDK exceptions. - Introduced `extract_underlying_sdk_exception` function to recover SDK exceptions from `InstructorRetryException`. - Updated `AnthropicLLMWorker` to utilize the new extraction method and categorize errors correctly. - Created shared test helpers for instructor-related tests and added unit tests for new functionality. * feat: ProviderErrorMetadata on inference errors (Phase 3) Adds structured SDK metadata (status_code, request_id, retry_after, provider_error_code, body) to every CogtError via a new ProviderErrorMetadata Pydantic model, plus extract_anthropic_metadata helper. Anthropic worker now populates provider_metadata on every categorized raise. ErrorReport serializes it through to_error_report(). 
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Refactor user action handling in error reporting - Introduced structured `UserAction` and `UserActionKind` to replace free-form strings in error handling across various workers. - Updated error handling in Google, Hugging Face, Linkup, Mistral, and OpenAI plugins to utilize the new `UserAction` model for user guidance. - Enhanced tests to validate the new user action structure and ensure consistent error reporting. - Adjusted existing tests to check for `UserAction` details instead of string comparisons for user actions. * feat(openai): enhance error handling and metadata extraction for OpenAI SDK exceptions - Added `extract_openai_metadata` function to distill OpenAI SDK exceptions into `ProviderErrorMetadata`, accommodating both `APIStatusError` subclasses and connection-related errors. - Implemented `_raise_categorized_openai_sdk_error` method in `OpenAICompletionsLLMWorker` to categorize and raise appropriate `LLMCompletionError` based on the type of OpenAI SDK exception encountered. - Updated error handling in `_gen_object` method to utilize the new categorization method, improving clarity and maintainability. - Introduced comprehensive unit tests for `extract_openai_metadata` to ensure correct extraction of metadata from various OpenAI SDK exceptions. - Added tests for structured-generation error handling in `OpenAICompletionsLLMWorker`, verifying that wrapped exceptions are correctly unwrapped and categorized. * feat(openai): Phase 6 — Responses LLM unwrap + metadata + semantic user_action Brings the OpenAI Responses worker up to the beyond-reference standard set by Anthropic (reference) and Phase 5 (Completions). The Responses-specific specialization is preserved: NotFoundError raises LLMModelNotFoundError (carrying model_handle so callers can swap models), while every other recognized SDK exception raises LLMCompletionError. 
- Added _raise_categorized_openai_sdk_error helper on the Responses worker (mirrors the Completions helper but raises LLMModelNotFoundError for NotFoundError). Both _gen_text and _gen_object now dispatch through it. - _gen_object's InstructorRetryException catch unwraps the underlying SDK exception via extract_underlying_sdk_exception and routes through the same helper, so transient/capacity/auth/not-found wrapped errors are no longer flattened to CONTENT. - Every raise carries provider_metadata=extract_openai_metadata(sdk_exc) and a semantic UserActionKind (WAIT_AND_RETRY / CHECK_BILLING / CHECK_CREDENTIALS / CHANGE_INPUT / CHANGE_MODEL). - SDK coverage now uniform with Phase 5: added InternalServerError and PermissionDeniedError to the tuple-catch. - Migrated `from instructor.exceptions` to `from instructor.core`. - Extended ModelNotFoundError.__init__ to accept and forward error_category, user_action, and provider_metadata kwargs to CogtError.__init__ so LLMModelNotFoundError can carry them end-to-end. ModelWaterfallError continues to work via class-level error_category. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * test(cogt): lock in ModelWaterfallError + LLMModelNotFoundError contract Adds two unit tests for the ModelNotFoundError.__init__ widening done in Phase 6: 1. ModelWaterfallError: when constructed without the new optional kwargs, the class-level error_category = CONFIGURATION must survive — i.e., the None defaults forwarded up to CogtError.__init__ must not clobber the class attribute (guarded by `if error_category is not None` in CogtError.__init__). 2. LLMModelNotFoundError: when worker-side categorization passes user_action and provider_metadata kwargs, they reach the instance. 
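The contract those two tests lock in can be sketched as follows. This is a simplified stand-in, not the real classes: the class hierarchy, the enum values, and the constructor signatures are assumptions made for illustration; only the `if error_category is not None` guard and the class-level `CONFIGURATION` default are taken from the commit message.

```python
from enum import Enum
from typing import Any, Optional


class InferenceErrorCategory(str, Enum):
    # Enum values are illustrative assumptions.
    TRANSIENT = "transient"
    CONFIGURATION = "configuration"
    CONTENT = "content"


class CogtError(Exception):
    # Class-level default that subclasses may override.
    error_category: Optional[InferenceErrorCategory] = None

    def __init__(
        self,
        message: str,
        error_category: Optional[InferenceErrorCategory] = None,
        user_action: Any = None,
        provider_metadata: Any = None,
    ) -> None:
        super().__init__(message)
        # Only override when a category was passed explicitly, so a forwarded
        # None does not clobber a subclass's class-level attribute.
        if error_category is not None:
            self.error_category = error_category
        self.user_action = user_action
        self.provider_metadata = provider_metadata


class ModelNotFoundError(CogtError):
    def __init__(
        self,
        message: str,
        error_category: Optional[InferenceErrorCategory] = None,
        user_action: Any = None,
        provider_metadata: Any = None,
    ) -> None:
        # Forward the optional kwargs unchanged, None defaults included.
        super().__init__(
            message,
            error_category=error_category,
            user_action=user_action,
            provider_metadata=provider_metadata,
        )


class ModelWaterfallError(ModelNotFoundError):
    error_category = InferenceErrorCategory.CONFIGURATION


class LLMModelNotFoundError(ModelNotFoundError):
    pass
```

Without the guard, constructing `ModelWaterfallError("...")` would set an instance attribute of `None` that shadows the class-level `CONFIGURATION`.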
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(mistral): enhance error handling with structured metadata extraction - Implemented `extract_mistral_metadata` to distill Mistral SDK exceptions into `ProviderErrorMetadata`, accommodating both flat and nested error body structures. - Refactored `_classify_mistral_error` to utilize the new metadata extraction, ensuring uniform handling of error categories and user actions across different error types. - Added comprehensive unit tests for `extract_mistral_metadata` to validate behavior against various Mistral error shapes, including handling of non-JSON bodies and `NoResponseError`. - Developed tests for `MistralLLMWorker` to ensure correct categorization of wrapped `MistralError` instances, verifying that transient, capacity, and configuration errors are handled appropriately with attached metadata. * Implement Google LLM error handling and metadata extraction - Added `extract_google_metadata` function to distill Google GenAI SDK exceptions into `ProviderErrorMetadata`, accommodating Google's unique error structure. - Refactored `_classify_google_client_error` to include structured `provider_metadata` and semantic `UserActionKind` values for better error categorization. - Introduced `_raise_categorized_google_sdk_error` helper to streamline error handling for `ServerError` and `ClientError`. - Updated `GoogleLLMWorker` to utilize the new error handling methods, ensuring proper categorization of errors wrapped in `InstructorRetryException`. - Created unit tests for `extract_google_metadata` covering various error scenarios and responses. - Developed comprehensive tests for `GoogleLLMWorker` to validate structured generation error handling, ensuring correct behavior for different error types. 
* feat: enhance error handling across LLM workers with structured metadata and user actions * Add unit tests for error handling in image generation workers - Implement tests for Google ImgGen worker to validate provider_metadata and UserActionKind for various error scenarios. - Create tests for Hugging Face image generation worker to ensure proper extraction of metadata and handling of errors. - Add tests for OpenAI Completions ImgGen worker to verify SDK exception handling and error categorization. - Introduce tests for OpenAI ImgGen worker to check error handling, including rate limits, quota issues, and authentication errors. * feat: Phase 11 extract worker audits + Bedrock LLM upgrade Brings Bedrock LLM and every extract worker (Mistral, Docling, Linkup, Gateway, pypdfium2) up to the beyond-reference standard: each raise carries ProviderErrorMetadata and a semantic UserActionKind so downstream consumers (retry, CLI, telemetry) get a uniform shape across providers. New helpers in error_classification.py: - extract_bedrock_metadata for botocore ClientError shape (Error.Code, ResponseMetadata.HTTPStatusCode/RequestId, retry-after header) - extract_linkup_metadata (Linkup SDK exposes only exception class — no HTTP response metadata) - extract_local_extract_metadata for non-HTTP local extractors (Docling, pypdfium2) — surfaces only sdk_exception_type / provider_error_code Per-worker changes: - Bedrock LLM: collapsed inline branches into _classify_bedrock_client_error helper; added ResourceNotFoundException 404 branch - Mistral extract: refactored _classify_mistral_error to mirror the LLM worker shape; every branch now carries a semantic user_action - Docling / pypdfium2: each exception branch carries provider_metadata + semantic UserActionKind via the shared local-extract helper - Linkup extract: introduced _classify_linkup_error method (mirrors search worker) with a single tuple-catch in _extract_pages - Gateway extract: adopted extract_gateway_metadata + 
GatewayFactory.make_user_action_from_portkey_error on both _extract_web_fetch and _extract_base64_url paths New tests: per-worker semantic test files asserting category + user_action.kind + provider_metadata across all SDK exception types; test_extract_bedrock_metadata.py covers the metadata helper directly. make agent-check clean (0 errors); 1122 plugins/cogt unit tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: address Phase 11 review nits Two minor follow-ups from independent code review of de61d4b9: 1. extract_bedrock_metadata: one-line comment noting that botocore lowercases HTTPHeaders keys, so retry-after is the canonical lookup. 2. Capture the FileNotFoundError category-vs-user-action mismatch (CONFIGURATION + CHANGE_INPUT) in Docling and pypdfium2 workers as a deferred item. The existing pre-Phase-11 tests lock in CONFIGURATION, so flipping the category was out of scope; documented options and trade-offs for a future revisit. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat(openai): implement shared error classification for OpenAI SDK exceptions - Added `openai_error_classification.py` to centralize error handling for OpenAI SDK exceptions across different workers. - Updated `OpenAIImgGenWorker` and `OpenAIResponsesLLMWorker` to utilize the new error classification method. - Refactored error handling logic to categorize unhandled 4xx APIStatusErrors as CONFIGURATION instead of TRANSIENT. - Enhanced tests to cover new error handling scenarios, including unhandled 4xx errors and retry-after logic for RateLimitError. - Updated dependencies in `pyproject.toml` for compatibility with the latest instructor version. * uv lock * feat: Phase 12 search worker audits Bring both search workers up to the beyond-reference error-handling standard: every raised error carries structured provider_metadata and a semantic UserActionKind. 
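As one concrete instance of the `provider_metadata` shape, the Bedrock helper described in Phase 11 reads the botocore `ClientError` response dictionary. The sketch below works on that dictionary directly so it runs without botocore installed. `bedrock_metadata_from_response` is a hypothetical stand-in for `extract_bedrock_metadata`, and the dataclass is a reduced stand-in for the Pydantic `ProviderErrorMetadata` model (the `body` field is omitted).

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass(frozen=True)
class ProviderErrorMetadata:
    """Reduced stand-in for the real Pydantic model."""

    status_code: Optional[int] = None
    request_id: Optional[str] = None
    retry_after: Optional[float] = None
    provider_error_code: Optional[str] = None


def bedrock_metadata_from_response(response: Mapping[str, Any]) -> ProviderErrorMetadata:
    """Distill a botocore ClientError `.response` dict into structured metadata."""
    error = response.get("Error") or {}
    response_metadata = response.get("ResponseMetadata") or {}
    # botocore lowercases HTTPHeaders keys, so "retry-after" is the lookup key.
    headers = response_metadata.get("HTTPHeaders") or {}

    retry_after: Optional[float] = None
    raw_retry_after = headers.get("retry-after")
    if raw_retry_after is not None:
        try:
            retry_after = float(raw_retry_after)
        except (TypeError, ValueError):
            # The HTTP-date form of Retry-After is not parsed in this sketch.
            retry_after = None

    return ProviderErrorMetadata(
        status_code=response_metadata.get("HTTPStatusCode"),
        request_id=response_metadata.get("RequestId"),
        retry_after=retry_after,
        provider_error_code=error.get("Code"),
    )
```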
- Linkup search: _classify_linkup_error now attaches extract_linkup_metadata and a semantic UserActionKind on every branch (timeout, invalid-request, and fallback previously had no user_action). - Gateway search: _call_relay passes user_action + provider_metadata to GatewaySearchResponseError, reusing the Portkey helpers from Phase 10. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * deferred * feat: enhance error handling across image generation and extraction workers with structured user actions and metadata * feat: update error handling in image generation workers to classify errors as UNKNOWN and suggest changing models * landing the plan * feat: improve error handling across various workers to categorize APIStatusErrors and enhance user actions * refactor: enable ruff BLE001 and sweep broad except Exception catches Remove BLE001 from the ruff ignore list so broad `except Exception` catches are now permanently lint-guarded. Exempt tests/ via per-file ignores (Phase 1 is scoped to non-test code). Narrow three genuinely-narrowable catches to specific exceptions: - pipe_func.py: get_stuff catch -> WorkingMemoryStuffNotFoundError - func_registry.py: get_type_hints catch -> (NameError, TypeError) - model_deck.py: backend-TOML load catch -> (TomlError, OSError) Annotate the remaining legitimate broad catches (CLI/dev/agent roots, Temporal and async-task roots, telemetry exporters, best-effort teardown cleanup, defensive utility fallbacks) with # noqa: BLE001. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * plans * feat: enhance error handling across various modules by narrowing exception catches - Updated exception handling in `customize_backends_config` to catch specific exceptions (EOFError, OSError, TOMLKitError) instead of a broad Exception. - Modified `get_currently_enabled_backends` to handle TomlError and OSError, providing a silent failure for file read issues. 
- Refined error handling in `do_show_backends` to catch MarkupError specifically, improving clarity in error reporting. - Adjusted `WorkingMemoryFactory` to log warnings for mock creation failures while maintaining a best-effort approach. - Enhanced `output_renderer` to provide context-specific comments for exception handling, focusing on dynamic concept rendering. - Narrowed exception handling in `LibraryManager` to specific exceptions (NameError, PydanticUserError) during model rebuilding. - Improved error handling in `DeliveryExecutor` methods to ensure failures are logged without disrupting the delivery process. - Updated `dry_run` to catch ValidationError specifically, ensuring fallback to TextContent is clear. - Enhanced teardown error handling in `GatewayExtractWorker` and `GoogleImgGenWorker` to ensure cleanup failures are logged but do not halt execution. - Refined exception handling in `MistralFactory` to catch ValueError specifically for base64 cleaning. - Updated `act_assemble_graph` to clarify the best-effort observability approach for graph assembly failures. - Narrowed exception handling in `json_utils` to specific exceptions (TypeError, UnijsonEncoderError) during JSON purification. - Adjusted `can_inject_text` to clarify the safety of f-string formatting on uncertain input types. - Improved error handling in `are_classes_equivalent` to catch Pydantic-specific exceptions during schema comparison. * Enhance error handling across CLI commands - Added structured error handling in various CLI commands to ensure unexpected failures are captured and reported consistently. - Introduced comments to clarify the purpose of error handling at command boundaries, emphasizing that failures are converted into structured error payloads. - Updated error handling in commands related to agent operations, model checks, input processing, and validation to improve robustness and user feedback. 
- Ensured that telemetry and logging mechanisms do not disrupt application flow during error scenarios. * feat: add error_domain to the error model and class-level exception metadata Phases 2-4 of the error-handling plan (Checkpoint B): - Phase 2: add the ErrorDomain StrEnum and an error_domain field on PipelexError / ErrorReport, forwarded through to_error_report(). - Phase 3: set class-level error_domain (and user_action where the track doc gives concrete text) on the key non-CogtError exceptions; set error_category on the uncategorized prompt-* CogtError families. - Phase 4: agent_error() reads error_domain report-first with the lookup dict as fallback; remove the now-redundant agent-CLI dict entries; add a drift test guarding the dicts against stale keys and double sources. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * Improve Phase 5 plan * feat: update cold-start status section and add detailed phase execution instructions * feat: PipeRouter transient-retry loop (resilience without Temporal) PipeRouter now retries InferenceErrorCategory.TRANSIENT failures with exponential backoff, driven by four new pipeline_execution_config settings (max_transient_retries defaults to 3; 0 disables retry). Retry moves out of the two gateway workers into the dispatch layer: the tenacity AsyncRetrying wrappers, _make_retryer/_is_retryable/_log_retry helpers, and the TenacityConfig model + [cogt.tenacity_config] block are removed. The tenacity dependency stays (FAL polling, remote-config fetch). PipeRouterProtocol carries the retry policy as a TransientRetrySettings instance attribute (new dependency-free pipe_run/transient_retry.py), populated from config by each concrete router at construction — reading config inside the protocol directly would form a config->hub->protocol import cycle. A CogtError out of pipe execution is now reported to the observer on the failing path and re-raised as-is (cause chain preserved, not wrapped). 
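The retry loop described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the router's code: `run_with_transient_retry` is a hypothetical name, `CogtError` and the category enum are stand-ins, and of the four settings only `max_transient_retries` (default 3, 0 disables) is named in the commit message; the three delay fields are invented for the sketch.

```python
import asyncio
from dataclasses import dataclass
from enum import Enum
from typing import Awaitable, Callable, Optional, TypeVar

T = TypeVar("T")


class InferenceErrorCategory(str, Enum):
    TRANSIENT = "transient"
    CONTENT = "content"


class CogtError(Exception):
    """Stand-in for the real exception; carries an optional category."""

    def __init__(self, message: str, error_category: Optional[InferenceErrorCategory] = None) -> None:
        super().__init__(message)
        self.error_category = error_category


@dataclass(frozen=True)
class TransientRetrySettings:
    max_transient_retries: int = 3      # from the commit message; 0 disables retry
    initial_delay_seconds: float = 1.0  # hypothetical field
    backoff_multiplier: float = 2.0     # hypothetical field
    max_delay_seconds: float = 30.0     # hypothetical field


async def run_with_transient_retry(
    run_once: Callable[[], Awaitable[T]],
    settings: TransientRetrySettings,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    """Call `run_once`, retrying TRANSIENT failures with exponential backoff."""
    retries_done = 0
    while True:
        try:
            return await run_once()
        except CogtError as exc:
            is_transient = exc.error_category is InferenceErrorCategory.TRANSIENT
            if not is_transient or retries_done >= settings.max_transient_retries:
                # Re-raise as-is so the cause chain is preserved, not wrapped.
                raise
            delay = min(
                settings.initial_delay_seconds * settings.backoff_multiplier**retries_done,
                settings.max_delay_seconds,
            )
            retries_done += 1
            await sleep(delay)
```

Passing `sleep` as a parameter is a testing convenience in this sketch, so the backoff schedule can be asserted without waiting.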
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: preserve transient retry of gateway deployment-propagation 404s Code review of the retry-loop commit found a regression: the deleted gateway-worker retry predicate had treated a NotFoundError reading "specified deployment could not be found" as a transient Portkey deployment-propagation race. classify_error_category() mapped every NotFoundError to CONFIGURATION (non-retryable), so that case would no longer retry. Add GatewayFactory._is_deployment_propagation_race() and special-case that 404 to TRANSIENT (and WAIT_AND_RETRY for the user action). Add a CLASSIFY_CASES test case for it. Also restore the field bounds the removed TenacityConfig provided: PipelineExecutionConfig retry fields now carry Field(ge=0) so a malformed pipelex.toml fails at config load. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat: bounded fan-out concurrency for PipeBatch (Phase 5.5) PipeBatch over many items no longer spawns every branch — every coroutine, every deep-copied working memory, every inference call — at once. Branches now run in bounded chunks via the new gather_bounded helper, driven by the max_concurrency config (default 8). This is the second resilience-without-Temporal pillar beside transient retry (Phase 5): it keeps a large workload from overwhelming asyncio, memory, and provider rate limits. - gather_bounded (pipelex/tools/misc/async_utils.py): generic chunked fan-out over factories, not coroutines, so each deep copy is materialized only when its chunk runs — bounding memory, not just execution. Results preserve input order; first error by input index wins; the failing chunk is drained and no later chunk starts. max_concurrency is int | None, None meaning unbounded. - max_concurrency on PipelineExecutionConfig, beside the retry fields, typed Annotated[int, Field(ge=1)] | Literal["unbounded"] — the bound is disabled by the explicit "unbounded" literal, not a magic 0. 
- PipeBatch builds one factory per branch and maps the "unbounded" config literal to the helper's None; large batches log an advisory pointing at the Temporal track as the durable, rate-limited path. - PipeParallel is left unbounded — it fans over a fixed branch set. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat: category-aware Temporal retry + ErrorReport details payload (Phase 6) The Temporal error bridge now derives its retry decision from InferenceErrorCategory.is_retryable — the same signal the in-process PipeRouter retry loop consults — instead of a static class-name list. - TemporalError.__init__ gains non_retryable + error_report passthrough; error_report is packed as the ApplicationError.details payload. - from_message_exception: category-aware retryability for CogtError carrying a category; class-name-list fallback for category-less exceptions (via the new _is_non_retryable helper). - from_app_error: recovers the ErrorReport dict from details and preserves the round-tripped non_retryable flag, so error_category / user_action / model / provider survive the activity -> workflow boundary. - Logging extracted into _log_critical / _log_error classmethods for unit-testability (workflow_log needs a live workflow event loop). - config_temporal docstrings: non_retryable_error_types is documented as a fallback for category-less exceptions and an override mechanism. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat: agent-CLI markdown delivery + error_domain HTTP-status mapping (Phase 7) Phase 7 of the error-handling plan — errors and results surface cleanly on every delivery surface. - ErrorReport gains a documented, authoritative error_domain -> HTTP-status mapping: error_domain_to_http_status() (pure domain table) and the ErrorReport.http_status property (provider-429 passthrough on top). The library stays HTTP-agnostic; downstream APIs call the helper. 
- The agent CLI emits markdown by default for run / validate / init, with --format json for the structured payload. agent_error() dispatches JSON or markdown via a per-invocation ContextVar, so every existing call site follows --format. agent_error_markdown() renders heading / hint callout / details / source block. - validate bundle's graph-format option renamed --format -> --graph-format so --format is uniformly the output-format flag (breaking CLI change). - error_handlers.py: extracted display_error_panel() for the field-based Rich handlers. make agent-check clean; make agent-test passed (full suite). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * feat: full-chain error-delivery coverage + cause-chain ErrorReport enrichment (Phase 8) Phase 8 of the error-handling plan — full-chain integration coverage. The full-chain test exposed the wiring gap the plan anticipated: a worker LLMCompletionError (CogtError, carries error_category/retryable/model/ provider) is wrapped by PipeLLM -> PipeRouter -> PipelexRunner into plain PipelexError subclasses, and PipelexError.to_error_report() did not consult the __cause__ chain — so the agent CLI lost the worker's classification once the error was wrapped. GREEN fix, at the source: enrichment is a protected method PipelexError._enrich_error_report_from_cause(report) that fills every None classification field from the underlying exception. The base to_error_report() calls it, and CogtError's @override calls it too, so the "enrich from the __cause__ chain" contract is uniform across the hierarchy. The recursion propagates the deepest CogtError's metadata up through every wrapper; a wrapper keeps its own error_type/message and non-None fields. Tests: - tests/integration/pipelex/cli/agent_cli/test_run_error_chain.py — runs the real run_pipe_cmd with the LLM worker mocked to fail; asserts JSON and markdown error output carry error_category/retryable/model/provider and the ordered error_source chain. 
- tests/unit/pipelex/cli/test_error_handlers_snapshot.py — exact-match snapshots of two display_error_panel handlers, guarding the Phase 7 refactor. Flagged as a follow-up (out of Phase 8 scope, by decision): PipeLLM wraps LLMCompletionError into a plain PipeRunError before the router sees it, so the Phase 5 router retry loop (except CogtError) is bypassed for the LLM path. make agent-check clean; make agent-test passed; Temporal integration suite passed (94 passed, 4 xpassed, 0 failures). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: ErrorReport.http_status tolerates an unrecognized error_domain `http_status` converted `error_domain` with `ErrorDomain(self.error_domain)`, which raises `ValueError` on any string the running version doesn't know. A downstream HTTP API receiving an ErrorReport serialized by a newer Pipelex would crash while rendering the error response instead of falling back to the documented 500 for unclassified domains. Catch the ValueError and treat the unknown domain as unclassified. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * docs: add minimal TODO for wiring from_message_exception into activities Stub follow-up for Temporal integration Followup 5: Phase 6 built TemporalError.from_message_exception() but no activity calls it, so the category-aware retry decision and ErrorReport details packing are dead in production. The file carries the gap statement and pointers only — the RED/GREEN/REFACTOR plan is to be designed in a fresh session. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * fix: correct two worker error-classification miscategorizations (#907) * docs: cold-start TODOS for two error-classification fixes Plan branch for two deferred classification bugs from the error-handling sweep: LinkupNoResultError mis-classified TRANSIENT (should be CONTENT — now wastes real router retry budget) and FileNotFoundError mis-classified CONFIGURATION (should be CONTENT). 
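The two target classifications can be sketched as below. This is an illustration only: the enums are reduced stand-ins, `LinkupNoResultError` is redefined locally instead of being imported from the Linkup SDK, both function names are hypothetical, and treating TRANSIENT as the only retryable category is an assumption that matches the three categories shown here. The `WAIT_AND_RETRY` action on the fallback branch is also assumed.

```python
from enum import Enum


class InferenceErrorCategory(str, Enum):
    TRANSIENT = "transient"
    CONTENT = "content"
    CONFIGURATION = "configuration"

    @property
    def is_retryable(self) -> bool:
        # Assumption for this sketch: only TRANSIENT failures are worth retrying.
        return self is InferenceErrorCategory.TRANSIENT


class UserActionKind(str, Enum):
    WAIT_AND_RETRY = "wait_and_retry"
    CHANGE_INPUT = "change_input"


class LinkupNoResultError(Exception):
    """Local stand-in for the SDK exception of the same name."""


def classify_linkup_error(exc: Exception) -> tuple[InferenceErrorCategory, UserActionKind]:
    """Explicit no-result branch first; anything unrecognized falls back to TRANSIENT."""
    if isinstance(exc, LinkupNoResultError):
        # Retrying the identical query yields no result again.
        return InferenceErrorCategory.CONTENT, UserActionKind.CHANGE_INPUT
    return InferenceErrorCategory.TRANSIENT, UserActionKind.WAIT_AND_RETRY


def classify_missing_input_file(exc: FileNotFoundError) -> tuple[InferenceErrorCategory, UserActionKind]:
    """A missing input file is a content problem, paired with CHANGE_INPUT."""
    return InferenceErrorCategory.CONTENT, UserActionKind.CHANGE_INPUT
```

The first change alters retry behavior (TRANSIENT to CONTENT); the second does not, because CONFIGURATION and CONTENT are both non-retryable.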
TDD plan, RED to GREEN per bug, checkpoint between.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: classify Linkup no-result as CONTENT, not TRANSIENT

LinkupNoResultError had no explicit branch in _classify_linkup_error and fell through to the TRANSIENT fallback. A no-result search is not transient (retrying the identical query yields no result again), so the PipeRouter retry loop burned its budget and backoff sleeps on a query that cannot succeed. Add an explicit LinkupNoResultError branch to both Linkup workers (search and extract), classifying it CONTENT + CHANGE_INPUT. Behavior change: a no-result search is now non-retryable (CONTENT.is_retryable is False). Also marks item 1 of the search-worker-review-followups deferred doc as landed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: classify missing input file as CONTENT, not CONFIGURATION

The `except FileNotFoundError` branch in the Docling and pypdfium2 extract workers classified a missing input file as CONFIGURATION + CHANGE_INPUT. CONFIGURATION is reserved for setup problems elsewhere (paired with CHECK_CREDENTIALS / CHANGE_MODEL / CONTACT_SUPPORT); the CONFIGURATION + CHANGE_INPUT pairing was internally inconsistent. A missing input file is a content problem: flip the branch to CONTENT, matching its sibling branches. No behavior change: CONFIGURATION and CONTENT are both non-retryable. Also adds the combined CHANGELOG entry for both error-classification fixes and deletes the resolved file-not-found-category-mismatch deferred doc.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: retry transient LLM failures that bypassed the PipeRouter retry loop (#903)

* fix: retry transient LLM failures that bypassed the PipeRouter retry loop

PipeRouter's application-level transient-retry loop was dead for the most common case: an LLM call.
PipeLLM and PipeStructure catch the worker's LLMCompletionError (a CogtError) and re-raise it as a plain PipeRunError; the router's `except CogtError` retry branch never saw it, and the `except PipeRunError` branch wrapped into PipeRouterError without retrying. A transient LLM failure (rate limit, timeout, brief outage) was never retried in-process despite max_transient_retries defaulting to 3.

The router now unifies its retry decision in a single `except (CogtError, PipeRunError)` branch: the retry classification is derived from the exception itself when it is a CogtError, or from its __cause__ when it is a PipeRunError (the operator wrap). On exhaustion a PipeRunError still wraps into PipeRouterError, preserving the pipe location context and the full-chain error shape.

Coverage: new integration test runs a real PipeLLM and PipeStructure through the router with the worker mocked to raise a TRANSIENT failure; a cause-chain case added to the router retry unit tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: address PR #903 review comments on retry loop

Two confirmed issues from PR #903 review agents, plus a deferred TODO:
- gather_bounded: a factory raising synchronously bypassed chunk draining and aborted the gather before it ran, orphaning the chunk's other coroutines. Each factory is now invoked inside an _invoke() coroutine, so a synchronous raise is captured per-task by gather(return_exceptions=True) and the chunk still drains.
- format_run_markdown: when main_stuff carries an empty markdown (the API runner cannot render it), the result was dropped entirely as an excluded envelope key, leaving the "## Result" section showing only metadata. It now falls back to main_stuff's structured json payload.
- Deferred: the retry loop reruns run_pipe() after on_pipe_end_error has recorded the failed attempt, so a successful retried run carries a stale error node. Tracked in todos-retry-graph-trace.md.
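
The gather_bounded fix described in this commit can be illustrated with a minimal sketch. It is not the Pipelex implementation: the function signature, the `chunk_size` parameter, and the chunking loop are assumptions made for illustration. Only the idea of invoking each factory inside an `_invoke()` coroutine, so that `asyncio.gather(..., return_exceptions=True)` captures a synchronous raise per task, comes from the commit message.

```python
import asyncio
from typing import Any, Awaitable, Callable, List, Sequence

# Hypothetical type alias and signature; the real gather_bounded may differ.
CoroFactory = Callable[[], Awaitable[Any]]


async def gather_bounded(factories: Sequence[CoroFactory], chunk_size: int) -> List[Any]:
    """Run factories in chunks of at most chunk_size, preserving input order."""

    async def _invoke(factory: CoroFactory) -> Any:
        # Calling the factory inside a coroutine turns a synchronous raise
        # into a per-task exception instead of aborting before gather() runs.
        return await factory()

    results: List[Any] = []
    for start in range(0, len(factories), chunk_size):
        chunk = factories[start : start + chunk_size]
        outcomes = await asyncio.gather(
            *(_invoke(factory) for factory in chunk),
            return_exceptions=True,
        )
        # Every task of the chunk has finished here; surface the first error by index.
        for outcome in outcomes:
            if isinstance(outcome, BaseException):
                raise outcome
        results.extend(outcomes)
    return results
```

In this sketch a factory that raises before returning a coroutine no longer prevents its siblings in the same chunk from running to completion; the error is re-raised only after the chunk has drained.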
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: inference-failure ErrorReports carry model/provider in production (#904)

* docs: cold-start TODOS for the ErrorReport model/provider fix

Inference-failure errors (LLMCompletionError, ImgGenGenerationError, ExtractJobFailureError, SearchJobFailureError) reach to_error_report() with model/provider = None in production: CogtError.to_error_report() duck-types model_handle/backend_name via getattr and nothing sets them. The Phase 8 full-chain test only passes because it setattrs them by hand. This TODOS is the cold-start plan to fix it: verified facts checked against the code, the A-vs-B design decision (recommendation: centralize enrichment at the worker base class), and the RED/GREEN/REFACTOR steps.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: inference-failure ErrorReports carry model/provider in production

A real LLM / image-gen / extract / search failure used to surface an ErrorReport with model = None and provider = None: the leaf errors (LLMCompletionError, ImgGenGenerationError, ExtractJobFailureError, SearchJobFailureError, ExtractOutputError) carry no model or provider of their own, and CogtError.to_error_report() only duck-typed whatever attributes happened to be set on the exception.

Option B (centralized enrichment at the worker base class):
- Declare model_handle / backend_name on CogtError; to_error_report() reads them directly instead of getattr. New fill_model_and_provider() fills them only when unset, never overwrites an inner error's value, and skips the "unknown" placeholde…
* feat: add support for Gemini 3.5 flash model in Google backend configurations

* changelog: note gemini-3.5-flash addition on google backend

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: add gemini-3.5-flash model to Pipelex Gateway documentation

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…#923)

* feat: lean CLAUDE.md by default, standalone variant for contributors

- Drop tdd.md from all generated rule sets
- Add [agent_rules.targets.claude.sets] with `all` (lean) and `standalone` (full); CLAUDE.md no longer duplicates Python and pytest standards that workspace users already get from .claude/rules/
- Add --targets CLI filter to `pipelex-dev kit rules` to constrain which preferred targets get regenerated
- Add `make rules-claude-standalone` for contributors without the Pipelex workspace
- Document the lean/standalone split in CONTRIBUTING.md
- Cover lean/standalone behavior with a new integration test

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: enhance cleanup behavior for cursor rules and agent rules synchronization

* feat: enhance remove_cursor_rules to support deprecated rule stems for cleanup

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
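
As a reading aid for the `ErrorReport.http_status` fix in the commit list above, here is a minimal sketch of tolerating an unrecognized `error_domain`. The enum's string values and the simplified dataclass are assumptions for illustration, not the library's actual definitions; only the `ValueError` fallback to 500 and the `INPUT` → 422 / everything else → 500 table come from this PR's text.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ErrorDomain(str, Enum):
    # Assumed string values; the real enum may use different ones.
    INPUT = "input"
    CONFIG = "config"
    RUNTIME = "runtime"


def error_domain_to_http_status(domain: Optional[ErrorDomain]) -> int:
    """Pure domain table: INPUT maps to 422, everything else to 500."""
    if domain is ErrorDomain.INPUT:
        return 422
    return 500


@dataclass
class ErrorReport:
    # Simplified stand-in for the real ErrorReport model.
    error_domain: Optional[str] = None

    @property
    def http_status(self) -> int:
        try:
            domain = ErrorDomain(self.error_domain) if self.error_domain else None
        except ValueError:
            # A newer sender may emit a domain this version does not know;
            # treat it as unclassified instead of crashing the error response.
            domain = None
        return error_domain_to_http_status(domain)
```

With this shape, a report carrying a domain string from a newer release still renders as a 500 response rather than raising while the error is being rendered.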
3 issues found across 567 files
…ols with xhigh level
1 issue found across 3 files (changes from recent commits).
@greptileai review this
Release v0.29.0
Bumps version from `0.28.0` to `0.29.0`.

Changelog
[v0.29.0] - 2026-05-20
Added
- `gemini-3.5-flash` model added to the `google` backend. Adaptive thinking mode, supports text / images / pdf in and structured outputs, `max_prompt_images = 3000`, costs `{ input = 1.5, output = 9.0 }` per million tokens.
- New
`PipeStructure` operator that turns text into a structured concept via a single LLM call. Takes one `Text`-compatible input (or a domain concept that `refines = "Text"`), produces any structured output with the usual multiplicity options (`Foo`, `Foo[]`, `Foo[N]`). Useful whenever the source text comes from a PDF extraction, a search result, an upstream pipe, or any non-LLM origin. Documented at building-methods/pipes/pipe-operators/PipeStructure.md.
- Custom search attributes on the Temporal namespace, with a strict prerequisite at worker boot. Pipelex sets five custom Keyword search attributes on every workflow start:
`PipeCode`, `PipelineRunId`, `SessionId`, `UserId`, `DomainCode`, so the Temporal dashboard can filter, group, and audit workflows by their actual semantic identity rather than by opaque workflow ids. A real Temporal cluster rejects every `StartWorkflowExecution` RPC that references an unregistered attribute, so the check at worker boot is a hard fail (raises `SearchAttributeRegistrationError`) on a reachable namespace; the previous "warn and continue" framing was dishonest because workflows would then fail at every dispatch with a much less actionable error. The exception message embeds both the `pipelex setup-temporal-namespace` invocation and the equivalent raw `temporal operator search-attribute create` command. `RPCError` (control-plane unreachable) stays a soft fail; the in-process test server is registered up front by the integration conftest. Documented at distributed-execution/cluster-setup.md.
- `[temporal.search_attributes]` config block: master `enabled` toggle and an `attributes` subset selector for the five built-ins. `enabled = false` skips both the worker-boot registration check and the per-workflow attribute attachment (use this when the namespace policy is owned by a third party who will not register attributes). The validator rejects unknown attribute names so typos like `"PipelineRunID"` surface at config load instead of silently producing no attribute. The Pydantic model lives in `pipelex/temporal/config_temporal.py` so the validator can reference `BUILTIN_SEARCH_ATTRIBUTES` without pulling `temporalio` into the config-load path.
- `pipelex setup-temporal-namespace` CLI command: wraps `OperatorService.AddSearchAttributes` against the configured server profile so operators don't need a separate `temporal`/`tcld` install for the common case.
Reads the same `[temporal.search_attributes].attributes` list and `[temporal.temporal_config]` block the worker uses, so the names registered here can never drift from the names the worker will dispatch with. `--dry-run` prints the equivalent raw `temporal operator search-attribute create` command without executing. `--server <profile>` targets a non-default server profile. On Temporal Cloud namespaces where the worker API key lacks `OperatorService.AddSearchAttributes` permission, the command catches `RPCError(PERMISSION_DENIED)` and prints the fallback runbook (raw `temporal` CLI command, `tcld` for Cloud, Cloud UI link).
- Per-workflow static summary and details in the Temporal dashboard. Every top-level workflow start now sets
`static_summary` (200-byte Markdown like `translate_doc — Translate a document from English to French`) and `static_details` (Markdown table of pipe code, domain, pipeline run id, user, session, optional library crate, optional inputs). Every `workflow.execute_activity(...)` call sets a per-call `summary=` carrying the call-specific meaning (`LLM text · pipe=translate_doc · model=gpt-4o`, `Img gen N× · pipe=cover · model=fal-flux-dev · n=3`, etc.). The formatting policy lives in the new `pipelex/temporal/tprl/observability.py` module.
- Documented
`render_js` and `include_raw_html` on `PipeExtract`. Both fields were already shipping on `PipeExtractBlueprint`; only docs were missing. `render_js = true` asks the extraction backend to render JavaScript before fetching web-page content; `include_raw_html = true` populates each extracted `Page`'s `raw_html` field with the fetched HTML. Added to building-methods/pipes/pipe-operators/PipeExtract.md.
- Documented
`XHIGH` value on `ReasoningEffort`. The enum already shipped seven levels; the `under-the-hood/reasoning-controls.md` table listed only six. `XHIGH` sits between `HIGH` and `MAX` and maps to provider-specific xhigh values where supported.
- Bounded fan-out concurrency for
`PipeBatch`. A `PipeBatch` over many items no longer spawns every branch (every coroutine, every deep-copied working memory, every inference call) at once. Branches now run in bounded chunks driven by the new `[pipelex.pipeline_execution_config]` setting `max_concurrency` (default `8`; set to the literal `"unbounded"` for unbounded fan-out, the previous behavior). This is the resilience-without-Temporal pillar: it keeps a large workload (one pipe over thousands of documents) from overwhelming asyncio, memory, and provider rate limits. Results still preserve input order, and a branch failure still propagates (first error by input index wins). When a batch fans out over a large number of items, an advisory log line points at the Temporal track as the durable, rate-limited path. `PipeParallel`, which fans over a fixed pipe-defined branch set, is unchanged.
- Authoritative
`error_domain` → HTTP-status mapping. `pipelex/base_exceptions.py` now owns the mapping that downstream HTTP APIs (`pipelex-relay`, `pipelex-back-office`) need to render an `ErrorReport` as an HTTP response: `error_domain_to_http_status()` (the pure domain table: `INPUT` → 422, `CONFIG`/`RUNTIME`/unknown → 500) and the `ErrorReport.http_status` property, which adds a provider-429 passthrough so the API can emit a `Retry-After` header from `provider_metadata.retry_after_seconds`. The library stays HTTP-agnostic: no web-framework dependency, just the mapping table; downstream FastAPI handlers call the helper instead of reinventing the contract.
- Category-aware error boundary on Temporal activities. Every in-scope Temporal activity is now decorated with
`@convert_pipelex_errors` (new module `pipelex/temporal/tprl/activity_error_boundary.py`), which converts a `PipelexError` raised inside the activity into a `TemporalError` at the activity boundary. This packs the structured `ErrorReport` into `ApplicationError.details` and derives `non_retryable` from the error's `InferenceErrorCategory`, so the workflow-side `TemporalError.from_app_error` keeps `error_category`, `user_action`, `model` and `provider` instead of landing in its `error_report is None` fallback, and a non-retryable `CogtError` (`CONFIGURATION`, `CONTENT`, `CAPACITY`) is no longer retried. Resilience for inference failures lives entirely on the Temporal track; direct (non-Temporal) execution makes a single pipeline-level attempt. `act_assemble_graph` is deliberately left unwired: it is best-effort observability that swallows every failure and degrades to `None`. As part of this, `TemporalError._log_critical` / `_log_error` now select `activity_log` vs `workflow_log` based on `activity.in_activity()`, since `workflow.logger` raises `_NotInWorkflowEventLoopError` outside a workflow event loop.
- `AMBIGUOUS` inference-error category for outcome-uncertain failures. New `InferenceErrorCategory.AMBIGUOUS` for failures where the error type is known but the outcome is not: the operation may or may not have committed (e.g. a connection dropped mid-request). It is non-retryable, like `CONFIGURATION` / `CONTENT` / `CAPACITY`, but semantically distinct from `UNKNOWN`, which means the error could not be classified at all. The `azure_rest` image-generation worker now raises `AMBIGUOUS` for mid-request transport failures (`ReadError` / `WriteError` / `RemoteProtocolError` and `ReadTimeout` / `WriteTimeout`) so Temporal does not auto-retry a non-idempotent, billable image submit whose outcome is unknown; pre-request failures (`ConnectError` / `ConnectTimeout` / `PoolTimeout`) stay `TRANSIENT` and retryable.
- Explicit, uniform Tier 1 transport retry across every inference worker. A new
`[cogt]` setting `transport_max_retries` (default `2`) is now wired explicitly into every inference SDK client factory (Anthropic, OpenAI / Azure OpenAI, the Portkey-backed gateway clients, Mistral, and Google) instead of each factory silently inheriting whatever retry posture its SDK happens to default to. The two SDK families that default to no transport retry are brought up to the same floor: the Mistral client gets a bounded-backoff `RetryConfig` (`retry_connection_errors=True`) and the Google GenAI client gets `HttpOptions(retry_options=...)`. The genuinely SDK-less path (the `azure_rest` image-generation worker, which talks to Azure over raw `httpx`) gets a `tenacity`-based transport-retry wrapper (new module `pipelex/cogt/inference/transport_retry.py`) that retries connection failures and transient HTTP statuses (408/409/429/5xx) and honors `Retry-After`. When no `Retry-After` header is present, the fallback backoff uses full jitter (`wait_random_exponential`) so a burst of failures retried together does not re-fire in lockstep. On a non-idempotent submit-style POST it narrows the retry to failures that prove the server did no billable work (a request that was never delivered, or a 408/429 rejection), withholding an ambiguous 5xx, a 409 conflict, and a post-delivery timeout such as a `ReadTimeout`. Transport retry is now a deliberate, configured, uniform policy rather than a per-provider accident. This is "Tier 1" of the retry model: direct (non-Temporal) execution still makes a single pipeline-level attempt on top of this transport floor; durable resilience remains the Temporal track. (The `portkey-ai` SDK does not expose a retry knob, since it carries its own internal retry, so the gateway's `AsyncPortkey` client is left as-is; only its underlying OpenAI clients are wired.)
- Offline mode for Pipelex Gateway setup and dry-run. When the gateway is enabled but the remote config service is temporarily unreachable, Pipelex now falls back to a previously primed on-disk cache (
`~/.pipelex/cache/remote_config.json`, schema-versioned) instead of failing setup outright. Dry-run, validation, and `pipelex-agent run bundle --dry-run` complete normally; only the actual inference call still needs the network at runtime. The cache is primed on every successful fetch and on `pipelex init` while online. When the gateway is disabled (BYOK), no remote fetch is attempted at all, so setup is fully offline. A new `RemoteConfigStaleWarning` (`UserWarning`) is emitted whenever stale cache is in use; the agent CLI surfaces it on the JSON envelope as `warnings: [{"type": "RemoteConfigStale", ...}]`. Telemetry is suppressed (no-op) when running on a cached config so stale model identities don't pollute metrics. The doc/fixture generators (`pipelex-dev update-gateway-models`, `preprocess_test_models_cmd`) refuse the cache fallback via a new `require_fresh=True` flag, so committed reference docs and test fixtures never bake in stale data.
- `GatewayUnknownModelError` (`pipelex.cogt.exceptions`). Raised at setup time when the active model deck references a gateway model handle that isn't present in the (fresh or cached) gateway specs. Carries the model name and the config source (`RemoteConfigSource.FRESH|CACHED`); the message branches on source so a cached-source failure suggests `pipelex init` while online and a fresh-source failure points at deck/typo fixes. Wired through both the Rich CLI (`handle_gateway_unknown_model_error` in `error_handlers.py`) and the agent CLI (`AGENT_ERROR_HINTS` / `AGENT_ERROR_DOMAINS`).
- `RemoteConfigUnavailableError` (`pipelex.system.pipelex_service.exceptions`). User-facing offline-mode error: raised only when the network fetch fails AND no usable cached fallback exists. The message names the cache file path and the two remediation paths (run `pipelex init` while online to prime the cache; or disable `pipelex_gateway` in `backends.toml` for permanent BYOK operation).
Distinct from the internal `RemoteConfigFetchError`, which is kept as the retry-layer exception.
- `PIPELEX_REMOTE_CONFIG_URL` environment variable. Overrides the default remote-config URL. Useful for staging/testing environments; defaults to the production URL when unset.

Changed
- Inference error handling refactored to Extract / Classify / Render (internal). Every inference worker's SDK-exception block now collapses to
`metadata = extract_*_metadata(exc); classification = classify_inference_error(metadata); raise render_*_error(...) from exc`. New modules: `pipelex/cogt/inference/error_classify.py` (single shared `classify_inference_error()` returning `ClassificationResult(category, user_action_kind, is_model_not_found)`), `pipelex/cogt/inference/error_render.py` (single shared `render_llm_error` / `render_img_gen_error` / `render_extract_error` / `render_search_error`, picking the `CogtError` subclass from an `InferenceErrorFamily` tag plus the `is_model_not_found` flag), and `pipelex/cogt/inference/provider_name.py` (the `ProviderName` enum keying the extract-fn registry). `ProviderErrorMetadata` gains a `message` field plus `is_quota_exhaustion` / `is_content_policy_violation` / `is_network_error` `@property` accessors; the per-provider `extract_*_metadata` functions are now the only plugin-local piece, since Classify and Render live once. Mistral and the gateway-search worker specialize HTTP 404 to `ExtractModelNotFoundError` / `SearchModelNotFoundError`; Azure img-gen keeps two worker-specific `AMBIGUOUS` branches for mid-request transport failures. Removed: the per-provider `*_error_classification.py` modules and their tests, `AnthropicCredentialsError`, `GatewayFactory.classify_error_category` / `make_user_action_from_portkey_error` / `make_error_summary_from_portkey_error`, and every inline `_classify_*_error` / `_raise_categorized_*` worker method. New unit-test `tests/unit/pipelex/cogt/inference/test_provider_classification_parity.py` walks every `ProviderName` against the extract-fn registry + worker-family map, so an unwired new provider fails fast. No user-facing API change: the structured `ErrorReport` contract is unchanged. Documented at under-the-hood/error-model.md.
- `instructor`'s structured-output retry no longer re-runs completions on transport errors.
When passed a bare `int`, `instructor`'s `max_retries` builds a retry loop whose predicate retries any exception, so the structured-output path (`PipeLLM` / `PipeStructure`) was re-running the whole completion on transport / API errors, a second retry loop nested on top of the SDK client's own transport retry. Each `instructor` call site now passes a `tenacity.AsyncRetrying` (built by the new `pipelex/cogt/llm/instructor_retry.py` helper) whose retry predicate matches only validation failures (`pydantic.ValidationError`, `json.JSONDecodeError`, and `instructor`'s own validation-error types). A transport error now propagates immediately as the raw SDK exception for the worker's `except` clause to classify: transport retry is the SDK client floor (Tier 1) alone, and `instructor`'s retry is confined to genuine schema re-ask. As part of this the Google and Mistral structured-generation workers gained an `httpx.TransportError` `except` clause: those SDKs let raw connection / timeout errors propagate outside their own exception hierarchies, so with `instructor` no longer wrapping them they must be caught and classified directly. The Mistral structured path now also gets schema re-ask, which it previously lacked entirely because it passed no `max_retries`.
- Inference schema-retry setting moved and renamed:
`[cogt.llm_config.llm_job_config] max_retries` → `[cogt.llm_config] schema_reask_max_attempts` (breaking). The setting is `instructor`'s schema re-ask budget for structured-output validation failures. The old name `max_retries` gave no hint of that scope and collided conceptually with the new top-level `cogt.transport_max_retries`, which counts retries beyond the initial attempt, whereas this one is a total attempt count (`stop_after_attempt`). The new name makes both the scope (schema re-ask) and the unit (attempts, not retries) explicit. The single-key `[cogt.llm_config.llm_job_config]` sub-table is also dropped; the setting now sits directly under `[cogt.llm_config]`. Projects that override this key in their own `pipelex.toml` must move and rename it.
- Agent CLI
`run` / `validate` / `init` default to markdown output, with independent success/error format options (breaking). These commands previously always emitted JSON; they now accept `--format markdown|json` and default to markdown, matching `models` / `check-model` / `doctor`. A second flag `--error-format markdown|json` controls error reporting on stderr independently from success output: it defaults to the value of `--format`, so `--format json` still flips both, but the two can now be set separately (e.g. `--format markdown --error-format json` for human-readable success with machine-parseable errors). Internally, only the error format is carried in a `ContextVar`; the success format is threaded explicitly to `agent_success_formatted()`. The `inputs` / `concept` / `pipe` / `fmt` / `lint` / `accept-gateway-terms` commands are unaffected (always JSON / raw passthrough).
- `pipelex-agent validate bundle` graph-format option renamed `--format` → `--graph-format` (breaking). The `--format` / `-f` flag that selected the graph renderer (`mermaidflow` / `reactflow` / `both`) is now `--graph-format` / `-f`, freeing `--format` to be the uniform markdown/json output-format flag across every agent-CLI command.
- Top-level Workflow ID shape change (breaking). Workflow IDs go from
`{env}{session5}-{rand5}-{ClassName}` (e.g. `EdgdJ-HR5fd-TemporalPipeRun-pipe-router`) to `{env_prefix}{pipeline_run_id}` (e.g. `ut-3f9c8b2a-1e4d-4f5b-9c7a-2d8e1f0a6b3c`). The session-id and random-id components and the calling class name are gone; identity flows entirely from Pipelex's existing `JobMetadata.pipeline_run_id`. Operational tooling that grepped for the old shape must update.
- Pipeline run chain semantics shift (breaking, behavioral). Because the Workflow ID is now derived from
`pipeline_run_id`, callers that pass a stable `pipeline_run_id` to `PipelineFactory.make_pipeline(pipeline_run_id=...)` and re-execute now land on the same Temporal Workflow Execution Chain (with a fresh `run_id` per execution under the SDK-default `ALLOW_DUPLICATE` reuse policy). The previous behavior produced a fresh workflow_id per execution by accident, via the truncated session id and random shortuuid components, not by design. This is now documented behavior, not a bug.
- Child Workflow ID separator change (breaking). Child workflow ids use
`/` instead of `-` as the separator. Examples: the fixed-role `wf_pipe_router` child of `wf_pipe_run` is `{parent}/pipe-router`; a dynamic sub-pipe spawned by a router is `{parent}/{pipe_code}-{8-hex-chars}` (the 8 hex chars come from `workflow.uuid4()` for replay-safety). Operational tooling parsing the nested-id format must update.
- Activity ID change (breaking). Pipelex no longer customizes
`activity_id`. The Temporal Python SDK auto-assigns deterministic sequential integers (`"1"`, `"2"`, …) per workflow run, which guarantees per-`(workflow_id, run_id)` uniqueness by construction and is replay-safe (assigned by history position). Per-call meaning that previously lived in `activity_id` now lives in the per-activity `summary=`. Anything that filtered or grouped Event History by the old semantic strings (`"craft-text"`, `"craft-object-direct"`, `"jinja2-text"`, `"extract-pages"`, etc.) must read the per-activity `summary` or the Activity Type instead.
- `wfid` parameter removed (breaking). The `wfid: str | None = None` parameter is dropped from `PipeRunProtocol`, `PipeRouterProtocol`, `ContentGeneratorProtocol`, and every implementation (`PipeRun`, `PipeRouter`, `DryPipeRouter`, `TemporalPipeRun`, `TemporalPipeRouter`, `ContentGenerator`, `ContentGeneratorDry`, `ContentGeneratorInWorkflow`). The parameter was a legacy artifact: `wfid` originally seeded child workflow ids, was later co-opted as a default `activity_id`, and had no production callers. Identity now flows entirely via `JobMetadata.pipeline_run_id`; observability is auto-derived from `pipe_code` / `domain_code`. No deprecation cycle; per Pipelex's no-backward-compat policy, callers update at once.
- `ContentGeneratorInWorkflow` LRU + replay short-circuit deleted. The worker-singleton `_seen_activity_ids: OrderedDict[(workflow_id, run_id), set[str]]` cache, the `_MAX_SEEN_RUNS` bound, and `_record_activity_id` (with its `workflow.unsafe.is_replaying()` short-circuit) are gone. They existed solely to defend against the duplicate-activity-id failure mode that no longer exists now that Pipelex does not customize `activity_id`. Future contributors: do not reintroduce worker-singleton state on the activity-dispatch path; the determinism guarantee is the SDK assigning ids by history position.
- `structuring_method = "preliminary_text"` on `PipeLLM` works again, via build-time elaboration.
When `PipelexInterpreter` parses a `.mthds` file with a `PipeLLM` carrying `structuring_method = "preliminary_text"`, the new `BundleElaborator` rewrites it before any pipe runs into a `PipeSequence` of two synthetic pipes: a `PipeLLM` producing `Text` (step 1, inheriting the original prompt + inputs + model) and a `PipeStructure` producing the original output (step 2, optionally using `model_to_structure`). The synthetic codes are tracked in a `PipelexBundleBlueprint.elaboration_metadata` side-table (excluded from serialization). Output multiplicity is preserved: step 1 always emits a single `Text`; step 2 emits `Foo`, `Foo[]`, or `Foo[N]` according to the original output. Two LLM calls are issued per invocation. The user-facing pipe code is unchanged, so callers, `main_pipe`, and the run API keep working as before. Mechanism documented at under-the-hood/build-time-elaboration.md.
- `PipeLLM` runtime no longer knows about `structuring_method`. The field is now a pure build-time directive. The runtime `PipeLLM` class no longer carries `structuring_method`, the validator that rejected mismatched output concepts has moved to `PipeLLMBlueprint` (where it fires at parse time), and the `NotImplementedError` previously raised at runtime when `preliminary_text` was selected is gone, because the elaborator handles it. `PipeLLMSpec` exposes `structuring_method` so AI agents authoring via specs can still opt in.
- `StructuringMethod` enum import path moved. The enum is now defined in `pipelex.pipe_operators.llm.pipe_llm_blueprint` (it lives next to the blueprint that consumes it) instead of `pipelex.pipe_operators.llm.pipe_llm`. Direct importers must update their `from pipelex.pipe_operators.llm.pipe_llm import StructuringMethod` line to `from pipelex.pipe_operators.llm.pipe_llm_blueprint import StructuringMethod`.
- Collapsed the
`tprl_content_generation/` workflow layer. Each content-generation call (LLM text/object/object-list, image generation, Jinja2 rendering, document extraction, PDF page-view rendering) previously went through a dedicated child workflow (`WfMakeLLMText`, `WfMakeObject`, `WfMakeImages`, `WfMakeJinja2Text`, `WfMakeExtract`, `WfRenderPageViews`) that wrapped a single `act_*` activity. The wrappers added one workflow-scheduled / started / completed history-event triplet per call without changing durability guarantees, since every retry still happens at the activity level. `ContentGeneratorInWorkflow` now calls `workflow.execute_activity(act_*, ...)` directly from inside `WfPipeRouter`, deleting all `WfMake*` / `WfRenderPageViews` workflows, both `ContentGeneratorTop` and `ContentGeneratorChild` generators along with their factories, and the `content_generator_models.py` assignment-types module. Per-step meaning in the Temporal Web UI now comes from per-activity `summary=` (and Activity Type), replacing the old semantic `activity_id` strings. Page-views augmentation (`should_include_page_views`) now works in Temporal mode: `make_extract_pages` dispatches `act_render_page_views` for `document_uri` inputs and builds an inline single-element `[ImageContent(url=...)]` list for `image_uri` inputs, then attaches each page-view to its `PageContent`. Previously the direct-mode generator handled both branches but the child-workflow path (`WfMakeExtract`) only ran the extract activity and silently skipped the page-views step, so `should_include_page_views=True` was a no-op under Temporal.
- Per-activity, per-handle Temporal task-queue routing. Replaced the provisional
`WorkerConfig.inference_task_queue` (one-off LLM-text override) with a general `activity_queues: dict[str, ActivityRouteConfig]` table. Each entry declares a per-activity `default` queue and an optional `by_handle` map keyed by runtime handle (LLM model handle for text/object/object-list, `img_gen_handle` for image generation, `extract_handle` for document extraction). At dispatch time, `WorkerConfig.resolve_queue(activity_name, routing_key)` walks three layers: (1) `activity_queues[activity_name].by_handle[routing_key]`, (2) `activity_queues[activity_name].default`, (3) the worker-wide `default_task_queue` default. Every `ContentGeneratorInWorkflow` call site now passes `task_queue=worker_config.resolve_queue(...)` uniformly: the asymmetric "LLM-text-only kwarg" is gone, the `_inference_dispatch_kwargs` stopgap is deleted, and `WorkerConfig.inference_task_queue` is removed (no backward-compat shim). Deployments can now scale OpenAI vs Anthropic worker pools independently, isolate image generation onto dedicated runners, or route specific OCR backends to dedicated queues, all via TOML config, with no code change required. Default `pipelex.toml` ships with an empty `activity_queues` table, so existing single-queue deployments are unaffected.
- Per-queue submitter options and named worker-runtime profiles. Layers on top of the per-activity routing above. (1)
`[temporal.queue_options.<queue>]` declares per-queue `start_to_close_timeout`, `schedule_to_close_timeout`, `schedule_to_start_timeout`, `heartbeat_timeout`, `retry_policy_config`, and cluster-wide `max_task_queue_activities_per_second`. (2) `activity_queues.<activity>.handle_options.<handle>` declares rare per-handle overrides for a single model/backend variant. (3) `[temporal.worker_runtime_profiles.profiles.<name>]` declares concurrency slots, pollers, heartbeat throttles, worker-local rate cap, and graceful shutdown timeout; one worker process selects one profile via `pipelex worker --profile <name>`. The new `WorkerConfig.resolve_dispatch(activity_name, routing_key)` composes baseline → `queue_options[resolved_queue]` → `handle_options[routing_key]` last-wins for scalars, and unions `non_retryable_error_types` additively across all three layers (baseline main list + queue `_extra` + handle `_extra`). The `RetryPolicyConfig` schema is split into a baseline class (owns `non_retryable_error_types`) and an overlay class (owns `non_retryable_error_types_extra`); `extra="forbid"` rejects the wrong field on each layer so the additive composition rule is enforced at config load. Every `workflow.execute_activity(...)` site in `ContentGeneratorInWorkflow` now splats `worker_config.resolve_dispatch(...).to_execute_kwargs()`, fixing a long-standing bug where the workflow-level `workflow_execution_timeout` was used as every activity's `start_to_close_timeout` (a 1h budget on a jinja2 render, same 1h on a slow PDF extract).

Strict
`--task-queue` validation at worker CLI startup. `pipelex worker --task-queue X` now fast-fails with `WorkerTaskQueueUnknownError` (and a Levenshtein "Did you mean?" suggestion) when `X` isn't declared in `default_task_queue`, any `activity_queues` entry, or any `queue_options` key. This is the strict counterpart to the lenient config-load WARN on routing entries that name unknown queues.

Dispatch resolution tracing. Set
`temporal.temporal_config.temporal_log_config.is_dispatch_resolution_traced = true` to emit one INFO log line per `workflow.execute_activity` call, including the resolved queue, timeout, retry attempts, and the layer (baseline/queue_options/handle_options) each scalar came from. Off by default.

Specialized worker scopes. New
`runner-llm`, `runner-img-gen`, `runner-extract`, `runner-jinja2` scopes under `[temporal.worker_scopes.scopes]` for deployment manifests that want one worker pool per backend class. Each scope registers only its activities (e.g. `runner-llm` registers `act_llm_gen_text`, `act_llm_gen_object`, `act_llm_gen_object_list`). `act_render_page_views` is registered under both `runner-img-gen` and `runner-extract` (belt-and-suspenders for the two paths that need it).

`temporal.worker_config.task_queue` renamed to `default_task_queue`. Migration map entry under `[migration.migration_maps.config]` maps the old key. New required field `default_activity_start_to_close_timeout` (baseline activity timeout, default `"0:10:00"`) replaces the previous accidental reuse of `workflow_execution_timeout` for activities.

Cluster-wide queue rate cap moved from Python constant to TOML overlay.
`TemporalTaskManager.make_worker` previously hardcoded `max_task_queue_activities_per_second=1000` on the Temporal `Worker(...)` constructor for every queue. It now reads this knob from `queue_options[task_queue].max_task_queue_activities_per_second`; the shipping default `[temporal.queue_options.temporal_task_queue]` sets it to `1000`, so out-of-the-box behavior on the default queue is unchanged. Deployments using non-default queue names should add their own `[temporal.queue_options.<queue>]` entries with the cap appropriate for that backend pool. The per-worker `max_activities_per_second = 1000` lives on the runtime profile and is unchanged.

Removed dead code:
`pipelex/temporal/wrapper/`. The `start_tprl_activity` wrapper had zero callers.

Retry removed from the gateway workers;
`[cogt.tenacity_config]` removed (breaking). The `tenacity`-based retry inside `GatewayExtractWorker` and `GatewaySearchWorker` is gone, along with the `[cogt.tenacity_config]` config block and its `TenacityConfig` model. Because config models forbid unknown keys, an existing `~/.pipelex/pipelex.toml` (or any layered override) that still carries `[cogt.tenacity_config]` will fail to load; remove that block from your config. Transient transport blips are left to the provider SDK clients' own retry; the `tenacity` library itself is still used elsewhere (FAL job polling, remote-config fetch) and remains a dependency.

`PipeRouter.run()` now reports `CogtError` failures to the observer. A `CogtError` raised out of pipe execution previously propagated past `run()` without an `observe_after_failing_run` notification (only `PipeRunError` was caught). It is now observed on the failing path, then re-raised as-is: the cause chain is preserved and it is not wrapped into `PipeRouterError` (only the `PipeRunError` path still wraps).

Provider HTTP 404s now raise dedicated
`*ModelNotFoundError` across LLM, image-gen, extract, and search. A model-or-deployment-not-found 404 from any LLM or image-gen provider, or from the Pipelex Gateway extract / search workers, now raises a dedicated `*ModelNotFoundError` (`LLMModelNotFoundError`, `ImgGenModelNotFoundError`, `ExtractModelNotFoundError`, `SearchModelNotFoundError`; all are `ModelNotFoundError` subclasses in the `CONFIGURATION` category) instead of a generic `LLMCompletionError` / `ImgGenGenerationError` / `ExtractJobFailureError` / `GatewaySearchResponseError`. Because these are siblings of the generic errors, not subclasses, a 404 propagates past the operator's generic `except` to `except ModelNotFoundError` in `PipeOperator._live_run_pipe`, which re-raises `PipeOperatorModelAvailabilityError` carrying the unavailable `model_handle`.

`RemoteConfigFetcher.fetch_remote_config()` now returns a `RemoteConfigResult` carrying `config`, `source` (`fresh` | `cached`), and `cached_at`, instead of a bare `RemoteConfig`. Callers unwrap `.config` for the payload and may branch on `.source` to know whether the config is fresh or restored from cache. The fetcher accepts a new keyword-only `require_fresh: bool = False`; when `True`, the fetcher raises `RemoteConfigUnavailableError` instead of falling back to the cache. `RemoteConfigValidationError` is never satisfied by the cache (server-side schema breaks must surface loudly).

`ModelManager.setup()` and `BackendLibrary._load_gateway_model_specs()` accept a new `gateway_config_source: RemoteConfigSource | None` parameter. It is passed through from `Pipelex.setup()` so the deck-level gateway membership check can branch its error message on `FRESH` vs `CACHED`. `GatewayConfig` itself stays `extra="forbid"` and source-free: provenance is plumbed alongside, not baked in.

`RemoteConfigUnavailableError` message branches on whether the cache was refused vs absent. When `require_fresh=True` (dev-CLI generators) refuses to fall back to an existing cache, the message now reads "the local cache at `<path>` was refused because a fresh fetch is required" instead of the previously misleading "no local cache is available at `<path>`".
The cache-truly-missing path keeps its original wording.

Fixed

Inference-failure
`ErrorReport`s now carry `model` and `provider` in production. A real LLM / image-gen / extract / search failure used to surface an `ErrorReport` with `model = None` and `provider = None`: the leaf errors (`LLMCompletionError`, `ImgGenGenerationError`, `ExtractJobFailureError`, `SearchJobFailureError`, `ExtractOutputError`) carry no model or provider of their own, and `CogtError.to_error_report()` only duck-typed whatever attributes happened to be set on the exception. Each inference worker family now fills `model_handle`/`backend_name` from the worker (where both are unambiguously known) at its public-method chokepoint (`gen_text`/`gen_object`, `gen_image`/`gen_image_list`, `extract_pages`, `search_sourced_answer`/`search_structured`), via the new `CogtError.fill_model_and_provider()`. The fill never overwrites a value an inner error already set and skips the `"unknown"` placeholder external plugins report. `model_handle`/`backend_name` are now declared on `CogtError` (so `to_error_report()` reads them directly instead of `getattr`), making them uniformly `str | None` across the exception hierarchy.

Two worker error-classification miscategorizations corrected.
`LinkupNoResultError` (a search or fetch that returned nothing) had no explicit branch in either Linkup worker's `_classify_linkup_error` and fell through to the `TRANSIENT` catch-all, marking a query that cannot succeed by retrying as retryable. It is now classified `CONTENT` + `CHANGE_INPUT` in both the search and the extract worker. Behavior change: a no-result search is now non-retryable (`CONTENT.is_retryable` is `False`), so Temporal's activity retry policy no longer retries a query that returns nothing on every attempt. Separately, the `FileNotFoundError` branch in the Docling and pypdfium2 extract workers was classified `CONFIGURATION`, which was inconsistent with its sibling branches (`CONTENT` + `CHANGE_INPUT`) and with the rest of the codebase, where `CONFIGURATION` is reserved for setup problems. A missing input file is a content problem; it is now `CONTENT`. This second change does not alter retry behavior: `CONFIGURATION` and `CONTENT` are both non-retryable.

Wrapped exceptions now surface the underlying inference error's classification.
`PipelexError.to_error_report()` enriches its report from the `__cause__` chain, so a `PipelineExecutionError` (or any wrapper around a transient `CogtError`) now reports `error_category`, `retryable`, `model`, and `provider` instead of dropping them. Previously the agent-CLI JSON / markdown error output for a failed pipeline run lost the worker's category and retryability once the error had been wrapped by the pipe operator → router → runner layers, leaving an agent unable to tell a transient failure from a fatal one.

MTHDS JSON schema no longer leaks the
`elaboration_metadata` side-table. `PipelexBundleBlueprint.elaboration_metadata` carried `Field(exclude=True)` so it stayed out of `model_dump()`/`model_validate()` round-trips, but Pydantic v2's `exclude=True` does not affect `model_json_schema()`, so `derived/mthds_schema.json` ended up declaring a top-level `elaboration_metadata` property plus `$defs/ElaborationMetadata` and `$defs/StepRole` entries, even though the side-table is process-local in the reference runtime. Wrapped the field type with `SkipJsonSchema[...]` so it is hidden from the schema. After regenerating, the three entries are gone from the public schema, matching the spec's silence on them.

Cross-process Temporal decode of
`ListContent` payloads for dynamic concepts. `WorkingMemory.dump_for_temporal()` was tagging each list item with `__class__`/`__module__` markers so the receiving worker could rebuild the right subclass for `Anything[]` outputs. Those keys are also kajson's universal-decoder protocol, so the Temporal data converter tried to eagerly bind dynamic-concept classes (e.g. `structured_output_test__Invoice`) at the payload boundary, before the child workflow's per-workflow `ClassRegistry` was loaded. In a true 3-process topology the class is not in the global registry (the child workflow tore down its scoped registry in `finally`), so `kajson.loads` raised `KajsonDecoderError` → `RuntimeError: Failed decoding arguments` and any `--temporal` run that produced a list of dynamic-structured-concept items hung. Renamed the markers to the pipelex-private namespace `__pipelex_class__`/`__pipelex_module__` in `WorkingMemory.dump_for_temporal()` and `_hydrate_list_item()`; kajson's decoder gates strictly on `__class__` (`kajson/json_decoder.py:132`) so pipelex's nested dicts now pass through untouched and class binding stays inside pipelex's hydrator where the per-workflow registry lives. Also extended `CLEAN_JSON_FIELDS_TO_SKIP` in `pipelex/tools/misc/json_utils.py` to strip both marker families from user-facing JSON output.

A directory containing a
`.pipelex/` config dir is now recognized as a project root. `.pipelex` was added to `PROJECT_ROOT_MARKERS`, so project-level config (e.g. `.pipelex/inference/backends.toml`) is honored even when the directory has no `.git`, `pyproject.toml`, or other source-project marker. Previously such a directory fell through to the global `~/.pipelex/` config, silently ignoring the project's own overrides, so a backend disabled in the project's `backends.toml` could still demand credentials because the global config (where it was enabled) was loaded instead. The home directory remains excluded from project-root detection, so the global `~/.pipelex/` is unaffected.

`PipeLLM` outputting a concept that refines the native `JSON` concept no longer crashes with `NameError: name 'Any' is not defined`. On a LIVE run, such a concept resolves to a structured-output model carrying a `dict[str, Any]` field inherited from `JSONContent`. `SchemaToModelFactory` generates that model as source with `from __future__ import annotations` (every annotation becomes a string) and then rebuilds each class to resolve the string annotations. The rebuild namespace was assembled from the exec'd user types plus a hand-listed `Literal`, but `typing.Any` is a special form, not a `type`, so it was filtered out, and `model_rebuild` then raised `PydanticUndefinedAnnotation` evaluating `"dict[str, Any]"`. The rebuild namespace is now the exec namespace itself (minus `__builtins__`), so it carries exactly the names the generated source was written against and cannot drift as codegen emits other typing constructs. Fixed in `pipelex/cogt/content_generation/schema_to_model_factory.py`; covers both the sender path (`make_from_json_schema`) and the cross-process receiver path (`make_types_from_source`).

Summary by cubic
Release v0.29.0 focuses on reliability, observability, and control for both direct and Temporal runs. It adds
`PipeStructure`, offline Gateway setup, per-activity task routing, uniform transport retry, and a safer error model, with a few breaking CLI and Temporal changes.

New Features
- `PipeStructure` operator + build-time elaboration: `PipeLLM` with `structuring_method = "preliminary_text"` is rewritten into `PipeLLM(text)` → `PipeStructure(object)`; no runtime changes needed.
- `pipelex setup-temporal-namespace`), human-readable workflow/activity summaries, per-activity/per-handle routing (`[temporal.worker_config.activity_queues]`), per-queue submitter options and named worker runtime profiles, and collapsed content generation to direct activities.
- `GatewayUnknownModelError` and `RemoteConfigUnavailableError`; `PIPELEX_REMOTE_CONFIG_URL` override.
- `[cogt].transport_max_retries` applied across providers and raw `httpx` paths; honors `Retry-After`.
- `PipeBatch` fan-out via `[pipelex.pipeline_execution_config].max_concurrency` (default `8`), plus in-process transient retry in `PipeRouter`.
- `error_domain → HTTP` mapping, `ErrorReport` now carries `model`/`provider`, new `AMBIGUOUS` category, category-aware Temporal retry and details bridge, consistent provider metadata and user actions across workers.
- `--format` and `--error-format`; `validate ... --graph-format` replaces `--format` for graph selection.
- `gemini-3.5-flash` (text/images/pdf; structured output; adaptive thinking).
- `ReasoningEffort` adds `XHIGH` between `HIGH` and `MAX` (mapped where supported).

Migration
- `pipelex setup-temporal-namespace` (or use the printed `temporal operator`/`tcld` command).
- `pipeline_run_id`; child IDs use `/`; activity IDs are SDK-assigned integers; update any dashboard tooling that parsed the old shapes or relied on semantic `activity_id`.
- `worker_config.task_queue` → `default_task_queue`; add/adjust `[temporal.worker_config.activity_queues]`, `[temporal.queue_options.<queue>]`, and `[temporal.worker_runtime_profiles.profiles.<name>]`; `pipelex worker` now supports `--profile`; `--task-queue` is strictly validated.
- `--format json` for machine reads, or set `--error-format` independently. `validate ... --graph-format` replaces `--format` for graph renderer selection.
- `[cogt.llm_config.llm_job_config].max_retries` → `[cogt.llm_config].schema_reask_max_attempts` (total attempts). Remove `[cogt.tenacity_config]` (deleted). Use `[cogt].transport_max_retries` for transport and `[pipelex.pipeline_execution_config]` for transient retries.
- `StructuringMethod` import path moved to `pipelex.pipe_operators.llm.pipe_llm_blueprint`.
- `pipelex init` while online to prime the cache; set `PIPELEX_REMOTE_CONFIG_URL` for non-prod endpoints.
- `.pipelex/pipelex.toml`.

Written for commit bfcb011. Summary will update on new commits.
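
For reviewers, the three-layer lookup that the changelog describes for `WorkerConfig.resolve_queue(activity_name, routing_key)` can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the code in this PR: `ActivityRouteSketch` and `WorkerConfigSketch` are hypothetical stand-ins for the real config models, and the queue and handle names other than `temporal_task_queue` are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class ActivityRouteSketch:
    """Hypothetical stand-in for one `activity_queues` entry."""

    default: str  # per-activity default queue
    by_handle: Dict[str, str] = field(default_factory=dict)  # handle -> queue


@dataclass
class WorkerConfigSketch:
    """Hypothetical stand-in for the worker config; not the real class."""

    default_task_queue: str
    activity_queues: Dict[str, ActivityRouteSketch] = field(default_factory=dict)

    def resolve_queue(self, activity_name: str, routing_key: Optional[str]) -> str:
        route = self.activity_queues.get(activity_name)
        if route is not None:
            # Layer 1: per-handle override for this activity.
            if routing_key is not None and routing_key in route.by_handle:
                return route.by_handle[routing_key]
            # Layer 2: per-activity default queue.
            return route.default
        # Layer 3: worker-wide default queue.
        return self.default_task_queue


config = WorkerConfigSketch(
    default_task_queue="temporal_task_queue",
    activity_queues={
        "act_llm_gen_text": ActivityRouteSketch(
            default="llm-queue",  # invented queue name
            by_handle={"example-model-handle": "llm-dedicated-queue"},
        ),
    },
)

by_handle_queue = config.resolve_queue("act_llm_gen_text", "example-model-handle")
activity_default_queue = config.resolve_queue("act_llm_gen_text", "other-handle")
worker_default_queue = config.resolve_queue("act_render_page_views", None)
```

With an empty `activity_queues` table every call falls through to the third layer, which matches the note above that existing single-queue deployments are unaffected.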