Release v0.29.0 #925

Merged
lchoquel merged 10 commits into main from release/v0.29.0
May 20, 2026

@lchoquel lchoquel commented May 20, 2026

Release v0.29.0

Bumps version from 0.28.0 to 0.29.0.

Changelog

[v0.29.0] - 2026-05-20

Added

  • gemini-3.5-flash model added to the google backend. Adaptive thinking mode, supports text / images / pdf in and structured outputs, max_prompt_images = 3000, costs { input = 1.5, output = 9.0 } per million tokens.

  • New PipeStructure operator that turns text into a structured concept via a single LLM call. Takes one Text-compatible input (or a domain concept that refines = "Text"), produces any structured output with the usual multiplicity options (Foo, Foo[], Foo[N]). Useful whenever the source text comes from a PDF extraction, a search result, an upstream pipe, or any non-LLM origin. Documented at building-methods/pipes/pipe-operators/PipeStructure.md.

  • Custom search attributes on the Temporal namespace, with a strict prerequisite at worker boot. Pipelex sets five custom Keyword search attributes on every workflow start — PipeCode, PipelineRunId, SessionId, UserId, DomainCode — so the Temporal dashboard can filter, group, and audit workflows by their actual semantic identity rather than by opaque workflow ids. A real Temporal cluster rejects every StartWorkflowExecution RPC that references an unregistered attribute, so the check at worker boot is a hard fail (raises SearchAttributeRegistrationError) on a reachable namespace — the previous "warn and continue" framing was dishonest because workflows would then fail at every dispatch with a much less actionable error. The exception message embeds both the pipelex setup-temporal-namespace invocation and the equivalent raw temporal operator search-attribute create command. RPCError (control-plane unreachable) stays a soft fail; the in-process test server is registered up front by the integration conftest. Documented at distributed-execution/cluster-setup.md.

  • [temporal.search_attributes] config block — master enabled toggle and an attributes subset selector for the five built-ins. enabled = false skips both the worker-boot registration check and the per-workflow attribute attachment (use this when the namespace policy is owned by a third party who will not register attributes). The validator rejects unknown attribute names so typos like "PipelineRunID" surface at config load instead of silently producing no attribute. The Pydantic model lives in pipelex/temporal/config_temporal.py so the validator can reference BUILTIN_SEARCH_ATTRIBUTES without pulling temporalio into the config-load path.
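The unknown-name check in the entry above can be sketched in plain Python. This is a minimal illustration, not the Pydantic validator itself: `BUILTIN_SEARCH_ATTRIBUTES` and the five attribute names come from the changelog, while `validate_attribute_names` and the use of `ValueError` are assumptions made for the example.

```python
# The five built-in names are taken from the changelog entry above.
BUILTIN_SEARCH_ATTRIBUTES = frozenset(
    {"PipeCode", "PipelineRunId", "SessionId", "UserId", "DomainCode"}
)


def validate_attribute_names(names: list[str]) -> list[str]:
    """Hypothetical helper: reject any name that is not a built-in attribute."""
    unknown = [name for name in names if name not in BUILTIN_SEARCH_ATTRIBUTES]
    if unknown:
        # Failing here means a typo surfaces at config load, not at dispatch.
        raise ValueError(f"Unknown search attribute(s): {', '.join(unknown)}")
    return names
```

With this shape, a subset such as `["PipeCode", "UserId"]` passes, while the typo `"PipelineRunID"` raises immediately.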

  • pipelex setup-temporal-namespace CLI command — wraps OperatorService.AddSearchAttributes against the configured server profile so operators don't need a separate temporal / tcld install for the common case. Reads the same [temporal.search_attributes].attributes list and [temporal.temporal_config] block the worker uses, so the names registered here can never drift from the names the worker will dispatch with. --dry-run prints the equivalent raw temporal operator search-attribute create command without executing. --server <profile> targets a non-default server profile. On Temporal Cloud namespaces where the worker API key lacks OperatorService.AddSearchAttributes permission, the command catches RPCError(PERMISSION_DENIED) and prints the fallback runbook (raw temporal CLI command, tcld for Cloud, Cloud UI link).

  • Per-workflow static summary and details in the Temporal dashboard. Every top-level workflow start now sets static_summary (200-byte Markdown like translate_doc — Translate a document from English to French) and static_details (Markdown table of pipe code, domain, pipeline run id, user, session, optional library crate, optional inputs). Every workflow.execute_activity(...) call sets a per-call summary= carrying the call-specific meaning (LLM text · pipe=translate_doc · model=gpt-4o, Img gen N× · pipe=cover · model=fal-flux-dev · n=3, etc.). The formatting policy lives in the new pipelex/temporal/tprl/observability.py module.

  • Documented render_js and include_raw_html on PipeExtract. Both fields were already shipping on PipeExtractBlueprint; only docs were missing. render_js = true asks the extraction backend to render JavaScript before fetching web-page content; include_raw_html = true populates each extracted Page's raw_html field with the fetched HTML. Added to building-methods/pipes/pipe-operators/PipeExtract.md.

  • Documented XHIGH value on ReasoningEffort. The enum already shipped seven levels; the under-the-hood/reasoning-controls.md table listed only six. XHIGH sits between HIGH and MAX and maps to provider-specific xhigh values where supported.

  • Bounded fan-out concurrency for PipeBatch. A PipeBatch over many items no longer spawns every branch — every coroutine, every deep-copied working memory, every inference call — at once. Branches now run in bounded chunks driven by the new [pipelex.pipeline_execution_config] setting max_concurrency (default 8; set to the literal "unbounded" for unbounded fan-out, the previous behavior). This is the resilience-without-Temporal pillar: it keeps a large workload (one pipe over thousands of documents) from overwhelming asyncio, memory, and provider rate limits. Results still preserve input order, and a branch failure still propagates (first error by input index wins). When a batch fans out over a large number of items, an advisory log line points at the Temporal track as the durable, rate-limited path. PipeParallel, which fans over a fixed pipe-defined branch set, is unchanged.
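A minimal sketch of chunked fan-out as described in the entry above, using only `asyncio`. The function name `run_in_chunks` and its signature are hypothetical; only the default of 8, the order-preserving results, and the "first error by input index wins" rule come from the changelog.

```python
import asyncio
from collections.abc import Awaitable, Callable
from typing import TypeVar

T = TypeVar("T")
R = TypeVar("R")


async def run_in_chunks(
    items: list[T],
    branch: Callable[[T], Awaitable[R]],
    max_concurrency: int = 8,
) -> list[R]:
    """Run `branch` over `items`, at most `max_concurrency` at a time."""
    results: list[R] = []
    for start in range(0, len(items), max_concurrency):
        chunk = items[start : start + max_concurrency]
        # return_exceptions=True lets the whole chunk settle, so the failure
        # with the lowest input index is raised rather than the fastest one.
        outcomes = await asyncio.gather(
            *(branch(item) for item in chunk), return_exceptions=True
        )
        for outcome in outcomes:
            if isinstance(outcome, BaseException):
                raise outcome
            results.append(outcome)
    return results
```

Because `asyncio.gather` returns outcomes in argument order, the result list matches input order even when later branches finish first.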

  • Authoritative error_domain → HTTP-status mapping. pipelex/base_exceptions.py now owns the mapping that downstream HTTP APIs (pipelex-relay, pipelex-back-office) need to render an ErrorReport as an HTTP response: error_domain_to_http_status() (the pure domain table — INPUT → 422, CONFIG/RUNTIME/unknown → 500) and the ErrorReport.http_status property, which adds a provider-429 passthrough so the API can emit a Retry-After header from provider_metadata.retry_after_seconds. The library stays HTTP-agnostic — no web-framework dependency, just the mapping table; downstream FastAPI handlers call the helper instead of reinventing the contract.
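The mapping above could look roughly like the following. This is a sketch under assumptions: the string form of the domain, the `ErrorReportSketch` class, its `provider_status` field, and the `response_headers` helper are all hypothetical; only the 422 / 500 table, the 429 passthrough, and `retry_after_seconds` come from the changelog.

```python
from __future__ import annotations

import math
from dataclasses import dataclass


def error_domain_to_http_status(error_domain: str | None) -> int:
    """Pure domain table: INPUT maps to 422, everything else to 500."""
    if error_domain == "INPUT":
        return 422
    return 500  # CONFIG, RUNTIME, and unknown domains


@dataclass
class ErrorReportSketch:
    """Hypothetical stand-in for ErrorReport, reduced to the relevant fields."""

    error_domain: str | None = None
    provider_status: int | None = None  # assumed field name
    retry_after_seconds: float | None = None

    @property
    def http_status(self) -> int:
        # Provider-429 passthrough takes precedence over the domain table.
        if self.provider_status == 429:
            return 429
        return error_domain_to_http_status(self.error_domain)


def response_headers(report: ErrorReportSketch) -> dict[str, str]:
    """What a downstream handler might emit; rounds up to whole seconds."""
    if report.http_status == 429 and report.retry_after_seconds is not None:
        return {"Retry-After": str(math.ceil(report.retry_after_seconds))}
    return {}
```

Keeping the table in a pure function means a web handler only needs the integer and the optional header, with no framework import on the library side.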

  • Category-aware error boundary on Temporal activities. Every in-scope Temporal activity is now decorated with @convert_pipelex_errors (new module pipelex/temporal/tprl/activity_error_boundary.py), which converts a PipelexError raised inside the activity into a TemporalError at the activity boundary. This packs the structured ErrorReport into ApplicationError.details and derives non_retryable from the error's InferenceErrorCategory — so the workflow-side TemporalError.from_app_error keeps error_category, user_action, model and provider instead of landing in its error_report is None fallback, and a non-retryable CogtError (CONFIGURATION, CONTENT, CAPACITY) is no longer retried. Resilience for inference failures lives entirely on the Temporal track — direct (non-Temporal) execution makes a single pipeline-level attempt. act_assemble_graph is deliberately left unwired — it is best-effort observability that swallows every failure and degrades to None. As part of this, TemporalError._log_critical / _log_error now select activity_log vs workflow_log based on activity.in_activity(), since workflow.logger raises _NotInWorkflowEventLoopError outside a workflow event loop.

  • AMBIGUOUS inference-error category for outcome-uncertain failures. New InferenceErrorCategory.AMBIGUOUS for failures where the error type is known but the outcome is not — the operation may or may not have committed (e.g. a connection dropped mid-request). It is non-retryable, like CONFIGURATION / CONTENT / CAPACITY, but semantically distinct from UNKNOWN, which means the error could not be classified at all. The azure_rest image-generation worker now raises AMBIGUOUS for mid-request transport failures — ReadError / WriteError / RemoteProtocolError and ReadTimeout / WriteTimeout — so Temporal does not auto-retry a non-idempotent, billable image submit whose outcome is unknown; pre-request failures (ConnectError / ConnectTimeout / PoolTimeout) stay TRANSIENT and retryable.

  • Explicit, uniform Tier 1 transport retry across every inference worker. A new [cogt] setting transport_max_retries (default 2) is now wired explicitly into every inference SDK client factory — Anthropic, OpenAI / Azure OpenAI, the Portkey-backed gateway clients, Mistral, and Google — instead of each factory silently inheriting whatever retry posture its SDK happens to default to. The two SDK families that default to no transport retry are brought up to the same floor: the Mistral client gets a bounded-backoff RetryConfig (retry_connection_errors=True) and the Google GenAI client gets HttpOptions(retry_options=...). The genuinely SDK-less path — the azure_rest image-generation worker, which talks to Azure over raw httpx — gets a tenacity-based transport-retry wrapper (new module pipelex/cogt/inference/transport_retry.py) that retries connection failures and transient HTTP statuses (408/409/429/5xx) and honors Retry-After. When no Retry-After header is present, the fallback backoff uses full jitter (wait_random_exponential) so a burst of failures retried together does not re-fire in lockstep. On a non-idempotent submit-style POST it narrows the retry to failures that prove the server did no billable work — a request that was never delivered, or a 408/429 rejection — withholding an ambiguous 5xx, a 409 conflict, and a post-delivery timeout such as a ReadTimeout. Transport retry is now a deliberate, configured, uniform policy rather than a per-provider accident. This is "Tier 1" of the retry model — direct (non-Temporal) execution still makes a single pipeline-level attempt on top of this transport floor; durable resilience remains the Temporal track. (The portkey-ai SDK does not expose a retry knob — it carries its own internal retry — so the gateway's AsyncPortkey client is left as-is; only its underlying OpenAI clients are wired.)
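The retry decision and the jittered fallback described above can be approximated with the standard library alone. This is not the tenacity-based wrapper from `transport_retry.py`: the function names, the `base` and `cap` values, and the `idempotent` flag are assumptions; the status sets and the full-jitter rule follow the entry.

```python
from __future__ import annotations

import random

TRANSIENT_STATUSES = {408, 409, 429}  # any 5xx is handled separately below
SAFE_FOR_SUBMIT = {408, 429}  # rejections that prove no billable work was done


def should_retry_status(status: int, *, idempotent: bool) -> bool:
    """Decide whether an HTTP status is worth a transport-level retry."""
    if idempotent:
        return status in TRANSIENT_STATUSES or 500 <= status <= 599
    # Non-idempotent submit: withhold ambiguous 5xx and 409 conflicts.
    return status in SAFE_FOR_SUBMIT


def backoff_seconds(
    attempt: int,
    retry_after: float | None,
    *,
    base: float = 1.0,  # assumed value
    cap: float = 30.0,  # assumed value
    rng: random.Random | None = None,
) -> float:
    """Honor Retry-After when present, otherwise use full jitter."""
    if retry_after is not None:
        return retry_after
    source = rng if rng is not None else random
    # Full jitter: uniform over [0, min(cap, base * 2**attempt)].
    return source.uniform(0.0, min(cap, base * 2**attempt))
```

Drawing the delay uniformly from zero up to the exponential bound is what keeps a burst of simultaneous failures from retrying in lockstep.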

  • Offline mode for Pipelex Gateway setup and dry-run. When the gateway is enabled but the remote config service is temporarily unreachable, Pipelex now falls back to a previously primed on-disk cache (~/.pipelex/cache/remote_config.json, schema-versioned) instead of failing setup outright. Dry-run, validation, and pipelex-agent run bundle --dry-run complete normally; only the actual inference call still needs the network at runtime. The cache is primed on every successful fetch and on pipelex init while online. When the gateway is disabled (BYOK), no remote fetch is attempted at all — setup is fully offline. A new RemoteConfigStaleWarning (UserWarning) is emitted whenever stale cache is in use; the agent CLI surfaces it on the JSON envelope as warnings: [{"type": "RemoteConfigStale", ...}]. Telemetry is suppressed (no-op) when running on a cached config so stale model identities don't pollute metrics. The doc/fixture generators (pipelex-dev update-gateway-models, preprocess_test_models_cmd) refuse the cache fallback via a new require_fresh=True flag, so committed reference docs and test fixtures never bake in stale data.
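The fallback order described above (fresh fetch, then primed cache, then a user-facing error) might be sketched as follows. The classes here are simplified stand-ins that reuse names from the changelog; `load_remote_config`, the injected `fetch` callable, and the use of `OSError` as the network-failure signal are assumptions, and schema versioning and the stale-cache warning are left out.

```python
from __future__ import annotations

import json
from dataclasses import dataclass
from pathlib import Path
from typing import Callable


class RemoteConfigUnavailableError(Exception):
    """Stand-in for the error named in the changelog."""


@dataclass
class RemoteConfigResult:
    config: dict
    source: str  # "fresh" or "cached"


def load_remote_config(
    fetch: Callable[[], dict], cache_path: Path, *, require_fresh: bool = False
) -> RemoteConfigResult:
    try:
        config = fetch()
    except OSError as exc:  # assumed failure signal; ConnectionError is an OSError
        if not cache_path.exists():
            raise RemoteConfigUnavailableError(
                f"no local cache is available at {cache_path}"
            ) from exc
        if require_fresh:
            raise RemoteConfigUnavailableError(
                f"the local cache at {cache_path} was refused "
                "because a fresh fetch is required"
            ) from exc
        return RemoteConfigResult(json.loads(cache_path.read_text()), "cached")
    # Prime the cache on every successful fetch.
    cache_path.write_text(json.dumps(config))
    return RemoteConfigResult(config, "fresh")
```

Callers that must never bake in stale data would pass `require_fresh=True`, which turns a cache hit into an error instead of a silent fallback.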

  • GatewayUnknownModelError (pipelex.cogt.exceptions). Raised at setup time when the active model deck references a gateway model handle that isn't present in the (fresh or cached) gateway specs. Carries the model name and the config source (RemoteConfigSource.FRESH | CACHED); the message branches on source so a cached-source failure suggests pipelex init while online and a fresh-source failure points at deck/typo fixes. Wired through both the Rich CLI (handle_gateway_unknown_model_error in error_handlers.py) and the agent CLI (AGENT_ERROR_HINTS / AGENT_ERROR_DOMAINS).

  • RemoteConfigUnavailableError (pipelex.system.pipelex_service.exceptions). User-facing offline-mode error: raised only when the network fetch fails AND no usable cached fallback exists. The message names the cache file path and the two remediation paths (run pipelex init while online to prime the cache; or disable pipelex_gateway in backends.toml for permanent BYOK operation). Distinct from the internal RemoteConfigFetchError, which is kept as the retry-layer exception.

  • PIPELEX_REMOTE_CONFIG_URL environment variable. Overrides the default remote-config URL. Useful for staging/testing environments; defaults to the production URL when unset.

Changed

  • Inference error handling refactored to Extract / Classify / Render (internal). Every inference worker's SDK-exception block now collapses to metadata = extract_*_metadata(exc); classification = classify_inference_error(metadata); raise render_*_error(...) from exc. New modules: pipelex/cogt/inference/error_classify.py (single shared classify_inference_error() returning ClassificationResult(category, user_action_kind, is_model_not_found)), pipelex/cogt/inference/error_render.py (single shared render_llm_error / render_img_gen_error / render_extract_error / render_search_error, picking the CogtError subclass from an InferenceErrorFamily tag plus the is_model_not_found flag), and pipelex/cogt/inference/provider_name.py (the ProviderName enum keying the extract-fn registry). ProviderErrorMetadata gains a message field plus is_quota_exhaustion / is_content_policy_violation / is_network_error @property accessors; the per-provider extract_*_metadata functions are now the only plugin-local piece — Classify and Render live once. Mistral and the gateway-search worker specialize HTTP 404 to ExtractModelNotFoundError / SearchModelNotFoundError; Azure img-gen keeps two worker-specific AMBIGUOUS branches for mid-request transport failures. Removed: the per-provider *_error_classification.py modules and their tests, AnthropicCredentialsError, GatewayFactory.classify_error_category / make_user_action_from_portkey_error / make_error_summary_from_portkey_error, and every inline _classify_*_error / _raise_categorized_* worker method. New unit-test tests/unit/pipelex/cogt/inference/test_provider_classification_parity.py walks every ProviderName against the extract-fn registry + worker-family map, so an unwired new provider fails fast. No user-facing API change — the structured ErrorReport contract is unchanged. Documented at under-the-hood/error-model.md.

  • instructor's structured-output retry no longer re-runs completions on transport errors. Passed a bare int, instructor's max_retries builds a retry loop whose predicate retries any exception — so the structured-output path (PipeLLM / PipeStructure) was re-running the whole completion on transport / API errors, a second retry loop nested on top of the SDK client's own transport retry. Each instructor call site now passes a tenacity.AsyncRetrying (built by the new pipelex/cogt/llm/instructor_retry.py helper) whose retry predicate matches only validation failures (pydantic.ValidationError, json.JSONDecodeError, and instructor's own validation-error types). A transport error now propagates immediately as the raw SDK exception for the worker's except clause to classify — transport retry is the SDK client floor (Tier 1) alone, and instructor's retry is confined to genuine schema re-ask. As part of this the Google and Mistral structured-generation workers gained an httpx.TransportError except clause: those SDKs let raw connection / timeout errors propagate outside their own exception hierarchies, so with instructor no longer wrapping them they must be caught and classified directly. The Mistral structured path also now gets schema re-ask at all — it previously passed no max_retries.

  • Inference schema-retry setting moved and renamed: [cogt.llm_config.llm_job_config] max_retries → [cogt.llm_config] schema_reask_max_attempts (breaking). The setting is instructor's schema re-ask budget for structured-output validation failures. The old name max_retries gave no hint of that scope and collided conceptually with the new top-level cogt.transport_max_retries, which counts retries beyond the initial attempt, whereas this one is a total attempt count (stop_after_attempt). The new name makes both the scope (schema re-ask) and the unit (attempts, not retries) explicit. The single-key [cogt.llm_config.llm_job_config] sub-table is also dropped: the setting now sits directly under [cogt.llm_config]. Projects that override this key in their own pipelex.toml must move and rename it.

  • Agent CLI run / validate / init default to markdown output, with independent success/error format options (breaking). These commands previously always emitted JSON; they now accept --format markdown|json and default to markdown, matching models / check-model / doctor. A second flag --error-format markdown|json controls error reporting on stderr independently from success output — it defaults to the value of --format, so --format json still flips both, but the two can now be set separately (e.g. --format markdown --error-format json for human-readable success with machine-parseable errors). Internally, only the error format is carried in a ContextVar; the success format is threaded explicitly to agent_success_formatted(). The inputs / concept / pipe / fmt / lint / accept-gateway-terms commands are unaffected (always JSON / raw passthrough).

  • pipelex-agent validate bundle graph-format option renamed --format → --graph-format (breaking). The --format/-f flag that selected the graph renderer (mermaidflow / reactflow / both) is now --graph-format/-f, freeing --format to be the uniform markdown/json output-format flag across every agent-CLI command.

  • Top-level Workflow ID shape change (breaking). Workflow IDs go from {env}{session5}-{rand5}-{ClassName} (e.g. EdgdJ-HR5fd-TemporalPipeRun-pipe-router) to {env_prefix}{pipeline_run_id} (e.g. ut-3f9c8b2a-1e4d-4f5b-9c7a-2d8e1f0a6b3c). The session-id and random-id components and the calling class name are gone; identity flows entirely from Pipelex's existing JobMetadata.pipeline_run_id. Operational tooling that grepped for the old shape must update.

  • Pipeline run chain semantics shift (breaking, behavioral). Because the Workflow ID is now derived from pipeline_run_id, callers that pass a stable pipeline_run_id to PipelineFactory.make_pipeline(pipeline_run_id=...) and re-execute now land on the same Temporal Workflow Execution Chain (with a fresh run_id per execution under the SDK-default ALLOW_DUPLICATE reuse policy). The previous behavior produced a fresh workflow_id per execution by accident, via the truncated session id and random shortuuid components — not by design. This is now documented behavior, not a bug.

  • Child Workflow ID separator change (breaking). Child workflow ids use / instead of - as the separator. Examples: the fixed-role wf_pipe_router child of wf_pipe_run is {parent}/pipe-router; a dynamic sub-pipe spawned by a router is {parent}/{pipe_code}-{8-hex-chars} (the 8 hex chars come from workflow.uuid4() for replay-safety). Operational tooling parsing the nested-id format must update.
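The new ID shapes from the three entries above can be illustrated with plain string builders. The function names are hypothetical, and taking the first 8 hex characters of the UUID is an assumption (the changelog only says 8 hex chars from `workflow.uuid4()`); the UUID is passed in so the sketch runs outside a workflow.

```python
import uuid


def top_level_workflow_id(env_prefix: str, pipeline_run_id: str) -> str:
    """{env_prefix}{pipeline_run_id}"""
    return f"{env_prefix}{pipeline_run_id}"


def fixed_child_workflow_id(parent_id: str, role: str) -> str:
    """Fixed-role child, e.g. {parent}/pipe-router."""
    return f"{parent_id}/{role}"


def dynamic_child_workflow_id(
    parent_id: str, pipe_code: str, unique: uuid.UUID
) -> str:
    """Dynamic sub-pipe: {parent}/{pipe_code}-{8-hex-chars}.

    Inside a workflow the UUID would come from workflow.uuid4() so that the
    id is replay-safe; which 8 hex chars are kept is assumed here.
    """
    return f"{parent_id}/{pipe_code}-{unique.hex[:8]}"
```

With the changelog's example run id, `top_level_workflow_id("ut-", "3f9c8b2a-1e4d-4f5b-9c7a-2d8e1f0a6b3c")` yields the documented `ut-3f9c8b2a-1e4d-4f5b-9c7a-2d8e1f0a6b3c`, and tooling can split nested ids on `/`.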

  • Activity ID change (breaking). Pipelex no longer customizes activity_id. The Temporal Python SDK auto-assigns deterministic sequential integers ("1", "2", …) per workflow run, which guarantees per-(workflow_id, run_id) uniqueness by construction and is replay-safe (assigned by history position). Per-call meaning that previously lived in activity_id now lives in the per-activity summary=. Anything that filtered or grouped Event History by the old semantic strings ("craft-text", "craft-object-direct", "jinja2-text", "extract-pages", etc.) must read the per-activity summary or the Activity Type instead.

  • wfid parameter removed (breaking). The wfid: str | None = None parameter is dropped from PipeRunProtocol, PipeRouterProtocol, ContentGeneratorProtocol, and every implementation (PipeRun, PipeRouter, DryPipeRouter, TemporalPipeRun, TemporalPipeRouter, ContentGenerator, ContentGeneratorDry, ContentGeneratorInWorkflow). The parameter was a legacy artifact — wfid originally seeded child workflow ids, was later co-opted as a default activity_id, and had no production callers. Identity now flows entirely via JobMetadata.pipeline_run_id; observability is auto-derived from pipe_code / domain_code. No deprecation cycle; per Pipelex's no-backward-compat policy, callers update at once.

  • ContentGeneratorInWorkflow LRU + replay short-circuit deleted. The worker-singleton _seen_activity_ids: OrderedDict[(workflow_id, run_id), set[str]] cache, the _MAX_SEEN_RUNS bound, and _record_activity_id (with its workflow.unsafe.is_replaying() short-circuit) are gone. They existed solely to defend against the duplicate-activity-id failure mode that no longer exists now that Pipelex does not customize activity_id. Future contributors: do not reintroduce worker-singleton state on the activity-dispatch path — the determinism guarantee is the SDK assigning ids by history position.

  • structuring_method = "preliminary_text" on PipeLLM works again, via build-time elaboration. When PipelexInterpreter parses a .mthds file with a PipeLLM carrying structuring_method = "preliminary_text", the new BundleElaborator rewrites it before any pipe runs into a PipeSequence of two synthetic pipes: a PipeLLM producing Text (step 1, inheriting the original prompt + inputs + model) and a PipeStructure producing the original output (step 2, optionally using model_to_structure). The synthetic codes are tracked in a PipelexBundleBlueprint.elaboration_metadata side-table (excluded from serialization). Output multiplicity is preserved: step 1 always emits a single Text; step 2 emits Foo, Foo[], or Foo[N] according to the original output. Two LLM calls are issued per invocation. The user-facing pipe code is unchanged, so callers, main_pipe, and the run API keep working as before. Mechanism documented at under-the-hood/build-time-elaboration.md.

  • PipeLLM runtime no longer knows about structuring_method. The field is now a pure build-time directive. The runtime PipeLLM class no longer carries structuring_method, the validator that rejected mismatched output concepts has moved to PipeLLMBlueprint (where it fires at parse time), and the NotImplementedError previously raised at runtime when preliminary_text was selected is gone — the elaborator handles it. PipeLLMSpec exposes structuring_method so AI agents authoring via specs can still opt in.

  • StructuringMethod enum import path moved. The enum is now defined in pipelex.pipe_operators.llm.pipe_llm_blueprint (it lives next to the blueprint that consumes it) instead of pipelex.pipe_operators.llm.pipe_llm. Direct importers must update their from pipelex.pipe_operators.llm.pipe_llm import StructuringMethod line to from pipelex.pipe_operators.llm.pipe_llm_blueprint import StructuringMethod.

  • Collapsed the tprl_content_generation/ workflow layer. Each content-generation call (LLM text/object/object-list, image generation, Jinja2 rendering, document extraction, PDF page-view rendering) previously went through a dedicated child workflow (WfMakeLLMText, WfMakeObject, WfMakeImages, WfMakeJinja2Text, WfMakeExtract, WfRenderPageViews) that wrapped a single act_* activity. The wrappers added one workflow-scheduled / started / completed history-event triplet per call without changing durability guarantees — every retry still happens at the activity level. ContentGeneratorInWorkflow now calls workflow.execute_activity(act_*, ...) directly from inside WfPipeRouter, deleting all WfMake* / WfRenderPageViews workflows, both ContentGeneratorTop and ContentGeneratorChild generators along with their factories, and the content_generator_models.py assignment-types module. Per-step meaning in the Temporal Web UI now comes from per-activity summary= (and Activity Type), replacing the old semantic activity_id strings. Page-views augmentation (should_include_page_views) now works in Temporal mode: make_extract_pages dispatches act_render_page_views for document_uri inputs and builds an inline single-element [ImageContent(url=...)] list for image_uri inputs, then attaches each page-view to its PageContent. Previously the direct-mode generator handled both branches but the child-workflow path (WfMakeExtract) only ran the extract activity and silently skipped the page-views step, so should_include_page_views=True was a no-op under Temporal.

  • Per-activity, per-handle Temporal task-queue routing. Replaced the provisional WorkerConfig.inference_task_queue (one-off LLM-text override) with a general activity_queues: dict[str, ActivityRouteConfig] table. Each entry declares a per-activity default queue and an optional by_handle map keyed by runtime handle (LLM model handle for text/object/object-list, img_gen_handle for image generation, extract_handle for document extraction). At dispatch time, WorkerConfig.resolve_queue(activity_name, routing_key) walks three layers: (1) activity_queues[activity_name].by_handle[routing_key], (2) activity_queues[activity_name].default, (3) the worker-wide default_task_queue default. Every ContentGeneratorInWorkflow call site now passes task_queue=worker_config.resolve_queue(...) uniformly — the asymmetric "LLM-text-only kwarg" is gone, the _inference_dispatch_kwargs stopgap is deleted, and WorkerConfig.inference_task_queue is removed (no backward-compat shim). Deployments can now scale OpenAI vs Anthropic worker pools independently, isolate image generation onto dedicated runners, or route specific OCR backends to dedicated queues — all via TOML config, with no code change required. Default pipelex.toml ships with an empty activity_queues table, so existing single-queue deployments are unaffected.
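The three-layer lookup in the entry above can be sketched with two dataclasses. The class and method names mirror the changelog, but the field shapes are assumptions and the queue and handle names in any usage are invented for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ActivityRouteConfig:
    default: str | None = None
    by_handle: dict[str, str] = field(default_factory=dict)


@dataclass
class WorkerConfig:
    default_task_queue: str
    activity_queues: dict[str, ActivityRouteConfig] = field(default_factory=dict)

    def resolve_queue(self, activity_name: str, routing_key: str | None) -> str:
        route = self.activity_queues.get(activity_name)
        if route is not None:
            # Layer 1: per-handle override for this activity.
            if routing_key is not None and routing_key in route.by_handle:
                return route.by_handle[routing_key]
            # Layer 2: per-activity default.
            if route.default is not None:
                return route.default
        # Layer 3: worker-wide default queue.
        return self.default_task_queue
```

An empty `activity_queues` table falls straight through to layer 3, which is why single-queue deployments see no change.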

  • Per-queue submitter options and named worker-runtime profiles. Layers on top of the per-activity routing above. (1) [temporal.queue_options.<queue>] declares per-queue start_to_close_timeout, schedule_to_close_timeout, schedule_to_start_timeout, heartbeat_timeout, retry_policy_config, and cluster-wide max_task_queue_activities_per_second. (2) activity_queues.<activity>.handle_options.<handle> declares rare per-handle overrides for a single model/backend variant. (3) [temporal.worker_runtime_profiles.profiles.<name>] declares concurrency slots, pollers, heartbeat throttles, worker-local rate cap, and graceful shutdown timeout; one worker process selects one profile via pipelex worker --profile <name>. The new WorkerConfig.resolve_dispatch(activity_name, routing_key) composes baseline → queue_options[resolved_queue] → handle_options[routing_key], last-wins for scalars, and unions non_retryable_error_types additively across all three layers (baseline main list + queue _extra + handle _extra). The RetryPolicyConfig schema split into a baseline class (owns non_retryable_error_types) and an overlay class (owns non_retryable_error_types_extra); extra="forbid" rejects the wrong field on each layer so the additive composition rule is enforced at config load. Every workflow.execute_activity(...) site in ContentGeneratorInWorkflow now splats worker_config.resolve_dispatch(...).to_execute_kwargs(), fixing a long-standing bug where the workflow-level workflow_execution_timeout was used as every activity's start_to_close_timeout (a 1h budget on a jinja2 render, same 1h on a slow PDF extract).
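The composition rule in the entry above (last-wins for scalars, additive union for non-retryable error types) can be shown on flat dictionaries. This is a simplification: the real configuration nests the retry policy inside `retry_policy_config`, and `compose_dispatch` is a hypothetical name, not `WorkerConfig.resolve_dispatch`.

```python
from __future__ import annotations


def compose_dispatch(
    baseline: dict,
    queue_overlay: dict | None = None,
    handle_overlay: dict | None = None,
) -> dict:
    """Compose baseline, then queue options, then handle options."""
    resolved = dict(baseline)  # copy so the baseline is never mutated
    non_retryable = list(baseline.get("non_retryable_error_types", []))
    for overlay in (queue_overlay or {}, handle_overlay or {}):
        for key, value in overlay.items():
            if key == "non_retryable_error_types_extra":
                # Additive union: overlays can only extend the list.
                for error_type in value:
                    if error_type not in non_retryable:
                        non_retryable.append(error_type)
            else:
                # Scalars: the most specific layer wins.
                resolved[key] = value
    resolved["non_retryable_error_types"] = non_retryable
    return resolved
```

Splitting the list field into a baseline name and an `_extra` overlay name is what makes it impossible for an overlay to silently drop an error type that the baseline declared.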

  • Strict --task-queue validation at worker CLI startup. pipelex worker --task-queue X now fast-fails with WorkerTaskQueueUnknownError (and a Levenshtein "Did you mean?" suggestion) when X isn't declared in default_task_queue, any activity_queues entry, or any queue_options key. Strict counterpart to the lenient config-load WARN on routing entries that name unknown queues.
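A "Did you mean?" suggestion based on Levenshtein distance, as mentioned above, can be sketched in a few lines. The helper names and the `max_distance` threshold of 3 are assumptions; the changelog does not say how the actual suggestion is chosen.

```python
from __future__ import annotations


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance, one row at a time."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(
                min(
                    previous[j] + 1,  # deletion
                    current[j - 1] + 1,  # insertion
                    previous[j - 1] + (char_a != char_b),  # substitution
                )
            )
        previous = current
    return previous[-1]


def suggest_queue(
    unknown: str, declared: list[str], max_distance: int = 3
) -> str | None:
    """Return the closest declared queue name, or None if nothing is close."""
    if not declared:
        return None
    best = min(declared, key=lambda name: levenshtein(unknown, name))
    return best if levenshtein(unknown, best) <= max_distance else None
```

A typo such as `runner-lmm` would be one edit away from `runner-llm`, so the suggestion is offered; an unrelated name returns `None` and no hint is printed.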

  • Dispatch resolution tracing. Set temporal.temporal_config.temporal_log_config.is_dispatch_resolution_traced = true to emit one INFO log line per workflow.execute_activity call, including the resolved queue, timeout, retry attempts, and the layer (baseline / queue_options / handle_options) each scalar came from. Off by default.

  • Specialized worker scopes. New runner-llm, runner-img-gen, runner-extract, runner-jinja2 scopes under [temporal.worker_scopes.scopes] for deployment manifests that want one worker pool per backend class. Each scope registers only its activities (e.g. runner-llm registers act_llm_gen_text, act_llm_gen_object, act_llm_gen_object_list). act_render_page_views is registered under both runner-img-gen and runner-extract (belt-and-suspenders for the two paths that need it).

  • temporal.worker_config.task_queue renamed to default_task_queue. Migration map entry under [migration.migration_maps.config] maps the old key. New required field default_activity_start_to_close_timeout (baseline activity timeout, default "0:10:00") replaces the previous accidental reuse of workflow_execution_timeout for activities.

  • Cluster-wide queue rate cap moved from Python constant to TOML overlay. TemporalTaskManager.make_worker previously hardcoded max_task_queue_activities_per_second=1000 on the Temporal Worker(...) constructor for every queue. It now reads this knob from queue_options[task_queue].max_task_queue_activities_per_second; the shipping default [temporal.queue_options.temporal_task_queue] sets it to 1000, so out-of-the-box behavior on the default queue is unchanged. Deployments using non-default queue names should add their own [temporal.queue_options.<queue>] entries with the cap appropriate for that backend pool. The per-worker max_activities_per_second = 1000 lives on the runtime profile and is unchanged.

  • Removed dead code: pipelex/temporal/wrapper/. The start_tprl_activity wrapper had zero callers.

  • Retry removed from the gateway workers; [cogt.tenacity_config] removed (breaking). The tenacity-based retry inside GatewayExtractWorker and GatewaySearchWorker is gone, along with the [cogt.tenacity_config] config block and its TenacityConfig model. Because config models forbid unknown keys, an existing ~/.pipelex/pipelex.toml (or any layered override) that still carries [cogt.tenacity_config] will fail to load — remove that block from your config. Transient transport blips are left to the provider SDK clients' own retry; the tenacity library itself is still used elsewhere (FAL job polling, remote-config fetch) and remains a dependency.

  • PipeRouter.run() now reports CogtError failures to the observer. A CogtError raised out of pipe execution previously propagated past run() without an observe_after_failing_run notification (only PipeRunError was caught). It is now observed on the failing path, then re-raised as-is — the cause chain is preserved and it is not wrapped into PipeRouterError (only the PipeRunError path still wraps).

  • Provider HTTP 404s now raise dedicated *ModelNotFoundError across LLM, image-gen, extract, and search. A model-or-deployment-not-found 404 from any LLM or image-gen provider, or from the Pipelex Gateway extract / search workers, now raises a dedicated *ModelNotFoundError (LLMModelNotFoundError, ImgGenModelNotFoundError, ExtractModelNotFoundError, SearchModelNotFoundError — all ModelNotFoundError subclasses, CONFIGURATION category) instead of a generic LLMCompletionError / ImgGenGenerationError / ExtractJobFailureError / GatewaySearchResponseError. Because these are siblings of the generic errors — not subclasses — a 404 propagates past the operator's generic except to except ModelNotFoundError in PipeOperator._live_run_pipe, which re-raises PipeOperatorModelAvailabilityError carrying the unavailable model_handle.
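Why the sibling relationship matters can be seen in a small stand-in hierarchy. The class names are taken from the changelog, but the bodies, the `run_operator` function, and the `RuntimeError` used for the generic path are illustrative assumptions, not the actual `PipeOperator._live_run_pipe` code.

```python
class CogtError(Exception):
    """Stand-in base class."""


class LLMCompletionError(CogtError):
    """Generic completion failure."""


class ModelNotFoundError(CogtError):
    """Sibling of the generic errors, not a subclass of them."""


class LLMModelNotFoundError(ModelNotFoundError):
    """Raised for a model-or-deployment-not-found 404."""


class PipeOperatorModelAvailabilityError(Exception):
    def __init__(self, model_handle: str) -> None:
        super().__init__(f"Model '{model_handle}' is not available")
        self.model_handle = model_handle


def run_operator(call, model_handle: str):
    try:
        try:
            return call()
        except LLMCompletionError as exc:
            # The operator's generic handler: a 404 never lands here
            # because LLMModelNotFoundError is not an LLMCompletionError.
            raise RuntimeError("generic completion failure") from exc
    except ModelNotFoundError as exc:
        raise PipeOperatorModelAvailabilityError(model_handle) from exc
```

Had the not-found error subclassed the generic one, the inner `except` would have swallowed it and the unavailable `model_handle` would never reach the caller.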

  • RemoteConfigFetcher.fetch_remote_config() now returns a RemoteConfigResult carrying config, source (fresh | cached), and cached_at, instead of a bare RemoteConfig. Callers unwrap .config for the payload and may branch on .source to know whether the config is fresh or restored from cache. The fetcher accepts a new keyword-only require_fresh: bool = False — when True, a cached fallback raises RemoteConfigUnavailableError instead. RemoteConfigValidationError is never satisfied by the cache (server-side schema breaks must surface loudly).

  • ModelManager.setup() and BackendLibrary._load_gateway_model_specs() accept a new gateway_config_source: RemoteConfigSource | None parameter. Passed through from Pipelex.setup() so the deck-level gateway membership check can branch its error message on FRESH vs CACHED. GatewayConfig itself stays extra="forbid" and source-free — provenance is plumbed alongside, not baked in.

  • RemoteConfigUnavailableError message branches on whether the cache was refused vs absent. When require_fresh=True (dev-CLI generators) refuses to fall back to an existing cache, the message now reads "the local cache at <path> was refused because a fresh fetch is required" instead of the previously misleading "no local cache is available at <path>". The cache-truly-missing path keeps its original wording.
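The sibling-versus-subclass point in the *ModelNotFoundError entry above can be pictured with a minimal sketch. The class names are borrowed from the changelog, but the class bodies, the base classes, and the three helper functions are hypothetical stand-ins, not the Pipelex implementation.

```python
class CogtError(Exception):
    """Stand-in root for inference errors (assumed base, for illustration)."""


class LLMCompletionError(CogtError):
    """Generic completion failure."""


class ModelNotFoundError(CogtError):
    """Base for model-or-deployment-not-found 404s."""


class LLMModelNotFoundError(ModelNotFoundError):
    """Sibling of LLMCompletionError, not a subclass of it."""


class PipeOperatorModelAvailabilityError(Exception):
    def __init__(self, model_handle: str) -> None:
        super().__init__(f"model '{model_handle}' is not available")
        self.model_handle = model_handle


def call_provider(model_handle: str) -> str:
    # Hypothetical provider call that always answers with a 404.
    raise LLMModelNotFoundError(f"404: no deployment for {model_handle}")


def operator_step(model_handle: str) -> str:
    try:
        return call_provider(model_handle)
    except LLMCompletionError as exc:
        # Not reached for a 404: LLMModelNotFoundError is not an LLMCompletionError.
        return f"generic handler: {exc}"


def live_run_pipe(model_handle: str) -> str:
    try:
        return operator_step(model_handle)
    except ModelNotFoundError as exc:
        # The 404 skipped the generic handler and is re-raised with the handle attached.
        raise PipeOperatorModelAvailabilityError(model_handle) from exc
```

Had LLMModelNotFoundError subclassed LLMCompletionError, the generic handler in operator_step would have swallowed the 404 and the model handle would never have reached the caller.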

Fixed

  • Inference-failure ErrorReports now carry model and provider in production. A real LLM / image-gen / extract / search failure used to surface an ErrorReport with model = None and provider = None: the leaf errors (LLMCompletionError, ImgGenGenerationError, ExtractJobFailureError, SearchJobFailureError, ExtractOutputError) carry no model or provider of their own, and CogtError.to_error_report() only duck-typed whatever attributes happened to be set on the exception. Each inference worker family now fills model_handle / backend_name from the worker — where both are unambiguously known — at its public-method chokepoint (gen_text / gen_object, gen_image / gen_image_list, extract_pages, search_sourced_answer / search_structured), via the new CogtError.fill_model_and_provider(). The fill never overwrites a value an inner error already set and skips the "unknown" placeholder external plugins report. model_handle / backend_name are now declared on CogtError (so to_error_report() reads them directly instead of getattr), making them uniformly str | None across the exception hierarchy.

  • Two worker error-classification miscategorizations corrected. LinkupNoResultError (a search or fetch that returned nothing) had no explicit branch in either Linkup worker's _classify_linkup_error and fell through to the TRANSIENT catch-all — marking a query that cannot succeed by retrying as retryable. It is now classified CONTENT + CHANGE_INPUT in both the search and the extract worker. Behavior change: a no-result search is now non-retryable (CONTENT.is_retryable is False), so Temporal's activity retry policy no longer retries a query that returns nothing on every attempt. Separately, the FileNotFoundError branch in the Docling and pypdfium2 extract workers was classified CONFIGURATION — inconsistent with its sibling branches (CONTENT + CHANGE_INPUT) and with the rest of the codebase, where CONFIGURATION is reserved for setup problems. A missing input file is a content problem; it is now CONTENT. This second change does not alter retry behavior — CONFIGURATION and CONTENT are both non-retryable.

  • Wrapped exceptions now surface the underlying inference error's classification. PipelexError.to_error_report() enriches its report from the __cause__ chain, so a PipelineExecutionError — or any wrapper around a transient CogtError — now reports error_category, retryable, model, and provider instead of dropping them. Previously the agent-CLI JSON / markdown error output for a failed pipeline run lost the worker's category and retryability once the error had been wrapped by the pipe operator → router → runner layers, leaving an agent unable to tell a transient failure from a fatal one.

  • MTHDS JSON schema no longer leaks the elaboration_metadata side-table. PipelexBundleBlueprint.elaboration_metadata carried Field(exclude=True) so it stayed out of model_dump() / model_validate() round-trips, but Pydantic v2's exclude=True does not affect model_json_schema() — so derived/mthds_schema.json ended up declaring a top-level elaboration_metadata property plus $defs/ElaborationMetadata and $defs/StepRole entries, even though the side-table is process-local in the reference runtime. Wrapped the field type with SkipJsonSchema[...] so it is hidden from the schema. After regenerating, the three entries are gone from the public schema, matching the spec's silence on them.

  • Cross-process Temporal decode of ListContent payloads for dynamic concepts. WorkingMemory.dump_for_temporal() was tagging each list item with __class__ / __module__ markers so the receiving worker could rebuild the right subclass for Anything[] outputs. Those keys are also kajson's universal-decoder protocol, so the Temporal data converter tried to eagerly bind dynamic-concept classes (e.g. structured_output_test__Invoice) at the payload boundary, before the child workflow's per-workflow ClassRegistry was loaded. In a true 3-process topology the class is not in the global registry (the child workflow tore down its scoped registry in finally), so kajson.loads raised KajsonDecoderError → RuntimeError: Failed decoding arguments, and any --temporal run that produced a list of dynamic-structured-concept items hung. Renamed the markers to the pipelex-private namespace __pipelex_class__ / __pipelex_module__ in WorkingMemory.dump_for_temporal() and _hydrate_list_item(); kajson's decoder gates strictly on __class__ (kajson/json_decoder.py:132) so pipelex's nested dicts now pass through untouched and class binding stays inside pipelex's hydrator where the per-workflow registry lives. Also extended CLEAN_JSON_FIELDS_TO_SKIP in pipelex/tools/misc/json_utils.py to strip both marker families from user-facing JSON output.

  • A directory containing a .pipelex/ config dir is now recognized as a project root. .pipelex was added to PROJECT_ROOT_MARKERS, so project-level config (e.g. .pipelex/inference/backends.toml) is honored even when the directory has no .git, pyproject.toml, or other source-project marker. Previously such a directory fell through to the global ~/.pipelex/ config, silently ignoring the project's own overrides — so a backend disabled in the project's backends.toml could still demand credentials because the global config (where it was enabled) was loaded instead. The home directory remains excluded from project-root detection, so the global ~/.pipelex/ is unaffected.

  • PipeLLM outputting a concept that refines the native JSON concept no longer crashes with NameError: name 'Any' is not defined. On a LIVE run, such a concept resolves to a structured-output model carrying a dict[str, Any] field inherited from JSONContent. SchemaToModelFactory generates that model as source with from __future__ import annotations (every annotation becomes a string) and then rebuilds each class to resolve the string annotations. The rebuild namespace was assembled from the exec'd user types plus a hand-listed Literal, but typing.Any is a special form, not a type, so it was filtered out — model_rebuild then raised PydanticUndefinedAnnotation evaluating "dict[str, Any]". The rebuild namespace is now the exec namespace itself (minus __builtins__), so it carries exactly the names the generated source was written against and cannot drift as codegen emits other typing constructs. Fixed in pipelex/cogt/content_generation/schema_to_model_factory.py; covers both the sender path (make_from_json_schema) and the cross-process receiver path (make_types_from_source).
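The namespace problem in the entry above can be reproduced with the standard library alone. This is a simplified sketch, not the SchemaToModelFactory code: it uses plain eval instead of Pydantic's model_rebuild, and the hand_listed dict is an assumed approximation of a namespace assembled from user types plus Literal.

```python
import typing

# Stand-in for generated source: with the __future__ import, every
# annotation is stored as a string instead of being evaluated.
GENERATED_SOURCE = '''
from __future__ import annotations
from typing import Any, Literal

class Payload:
    data: dict[str, Any]
    kind: Literal["json"]
'''


def resolve_annotations(cls: type, namespace: dict) -> dict:
    """Evaluate each string annotation of cls against the given namespace."""
    return {name: eval(text, dict(namespace)) for name, text in cls.__annotations__.items()}


exec_namespace: dict = {}
exec(GENERATED_SOURCE, exec_namespace)
payload_cls = exec_namespace["Payload"]

# Hand-assembled namespace (illustrative): the user class plus Literal, but no Any.
hand_listed = {"Payload": payload_cls, "Literal": typing.Literal}
try:
    resolve_annotations(payload_cls, hand_listed)
    hand_listed_error = None
except NameError as exc:
    hand_listed_error = str(exc)

# The fix described above: reuse the exec namespace itself, minus __builtins__.
full_namespace = {key: value for key, value in exec_namespace.items() if key != "__builtins__"}
resolved = resolve_annotations(payload_cls, full_namespace)
```

Because the second namespace is whatever the generated source imported, it stays correct if the generator later emits other typing constructs.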

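The cause-chain enrichment described in the Fixed entries above (wrapped exceptions surfacing the underlying classification) might look roughly like the following. ErrorReport here is a hypothetical dataclass, and WorkerError / WrapperError are illustrative stand-ins for a worker error and a wrapping pipeline error; only the idea of walking __cause__ comes from the changelog.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ErrorReport:
    message: str
    error_category: Optional[str] = None
    retryable: Optional[bool] = None
    model: Optional[str] = None
    provider: Optional[str] = None


class WorkerError(Exception):
    """Hypothetical leaf error that knows its classification."""

    def __init__(self, message: str, error_category: str, retryable: bool, model: str, provider: str) -> None:
        super().__init__(message)
        self.error_category = error_category
        self.retryable = retryable
        self.model = model
        self.provider = provider


class WrapperError(Exception):
    """Hypothetical outer wrapper that carries no classification of its own."""


def to_error_report(exc: BaseException) -> ErrorReport:
    report = ErrorReport(message=str(exc))
    current: Optional[BaseException] = exc
    seen: set = set()
    # Walk outward-in; the first non-None value found for each field wins.
    while current is not None and id(current) not in seen:
        seen.add(id(current))
        for field_name in ("error_category", "retryable", "model", "provider"):
            if getattr(report, field_name) is None:
                value = getattr(current, field_name, None)
                if value is not None:
                    setattr(report, field_name, value)
        current = current.__cause__
    return report


def build_wrapped_error() -> WrapperError:
    try:
        try:
            raise WorkerError("rate limited", "TRANSIENT", True, "some-model", "some-provider")
        except WorkerError as inner:
            raise WrapperError("pipeline failed") from inner
    except WrapperError as outer:
        return outer
```

The report keeps the outer message while inheriting the category and retryability from the innermost error that defines them.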

Summary by cubic

Release v0.29.0 focuses on reliability, observability, and control for both direct and Temporal runs. It adds PipeStructure, offline Gateway setup, per-activity task routing, uniform transport retry, and a safer error model — with a few breaking CLI and Temporal changes.

  • New Features

    • PipeStructure operator + build-time elaboration: PipeLLM with structuring_method = "preliminary_text" is rewritten into PipeLLM(text) → PipeStructure(object); no runtime changes needed.
    • Temporal upgrades: strict custom search attributes (with pipelex setup-temporal-namespace), human-readable workflow/activity summaries, per-activity/per-handle routing ([temporal.worker_config.activity_queues]), per-queue submitter options and named worker runtime profiles, and collapsed content generation to direct activities.
    • Gateway offline mode: uses a primed on-disk cache for setup/validate/dry-run when remote config is unreachable; new GatewayUnknownModelError and RemoteConfigUnavailableError; PIPELEX_REMOTE_CONFIG_URL override.
    • Uniform Tier‑1 transport retry: [cogt].transport_max_retries applied across providers and raw httpx paths; honors Retry-After.
    • Resilience without Temporal: bounded PipeBatch fan‑out via [pipelex.pipeline_execution_config].max_concurrency (default 8), plus in-process transient retry in PipeRouter.
    • Error model and delivery: authoritative error_domain → HTTP mapping, ErrorReport now carries model/provider, new AMBIGUOUS category, category‑aware Temporal retry and details bridge, consistent provider metadata and user actions across workers.
    • Agent CLI UX: markdown is the default; independent --format and --error-format; validate ... --graph-format replaces --format for graph selection.
    • Model updates: add Google gemini-3.5-flash (text/images/pdf; structured output; adaptive thinking).
    • Reasoning controls: ReasoningEffort adds XHIGH between HIGH and MAX (mapped where supported).
  • Migration

    • Temporal
      • Register search attributes once per namespace: run pipelex setup-temporal-namespace (or use the printed temporal operator/tcld command).
      • IDs/observability: workflow IDs now derive from pipeline_run_id; child IDs use /; activity IDs are SDK-assigned integers — update any dashboard tooling that parsed the old shapes or relied on semantic activity_id.
      • Config: worker_config.task_queue → default_task_queue; add/adjust [temporal.worker_config.activity_queues], [temporal.queue_options.<queue>], and [temporal.worker_runtime_profiles.profiles.<name>]; pipelex worker now supports --profile; --task-queue is strictly validated.
    • CLI
      • Agent: default output is markdown; use --format json for machine reads, or set --error-format independently. validate ... --graph-format replaces --format for graph renderer selection.
    • LLM structuring and retry
      • Rename/move: [cogt.llm_config.llm_job_config].max_retries → [cogt.llm_config].schema_reask_max_attempts (total attempts). Remove [cogt.tenacity_config] (deleted). Use [cogt].transport_max_retries for transport and [pipelex.pipeline_execution_config] for transient retries.
      • StructuringMethod import path moved to pipelex.pipe_operators.llm.pipe_llm_blueprint.
    • Gateway
      • To enable offline setup/validate/dry-run, run pipelex init while online to prime the cache; set PIPELEX_REMOTE_CONFIG_URL for non-prod endpoints.
    • Docs
      • Distributed execution: corrected config path references to .pipelex/pipelex.toml.
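The bounded fan-out listed under New Features (max_concurrency, default 8) is commonly implemented with a semaphore. The sketch below shows that general technique with asyncio; run_bounded and its parameters are hypothetical names, and nothing here is taken from the PipeBatch source.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence, TypeVar

ItemT = TypeVar("ItemT")
ResultT = TypeVar("ResultT")


async def run_bounded(
    items: Sequence[ItemT],
    worker: Callable[[ItemT], Awaitable[ResultT]],
    max_concurrency: int = 8,
) -> List[ResultT]:
    """Run worker over every item with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(item: ItemT) -> ResultT:
        # Each branch waits for a free slot before starting its work.
        async with semaphore:
            return await worker(item)

    # gather preserves input order in its result list.
    return list(await asyncio.gather(*(guarded(item) for item in items)))
```

All branches are scheduled at once, but only max_concurrency of them hold the semaphore at any moment, which caps the number of simultaneous provider calls.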

Written for commit bfcb011. Summary will update on new commits.

lchoquel and others added 8 commits May 15, 2026 20:47
…concept (#902)

A PipeLLM whose output concept refines the native JSON concept resolves to a
structured-output model carrying a dict[str, Any] field inherited from
JSONContent. SchemaToModelFactory generates that model with
`from __future__ import annotations`, turning the field annotation into the
string "dict[str, Any]", then rebuilds each class to resolve the strings. The
rebuild namespace was filtered to `type` instances plus a hand-listed Literal,
dropping typing.Any (a special form, not a type) — model_rebuild then raised
PydanticUndefinedAnnotation.

The rebuild namespace is now the exec namespace itself (minus __builtins__), so
it carries exactly the names the generated source was written against and
cannot drift as codegen emits other typing constructs. Covers the sender path
(make_from_json_schema) and the cross-process receiver path
(make_types_from_source).

Adds unit tests for both paths plus an e2e .mthds bundle regression that
exercises the LIVE structured-output build.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* docs: replace TODOS.md with offline-mode implementation plan

Lays out a TDD plan for offline-safe Pipelex setup: cache remote config
on first init, fall back to cache when network is unavailable, and fail
clearly when a referenced gateway model is missing from both fresh and
cached specs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: tighten offline-mode plan after eng review

Apply 9 review-driven edits to TODOS.md: move source provenance off
GatewayConfig, cache raw JSON, keep RemoteConfigFetchError, require
fresh data for doc generators, add retry-exhaustion and regression
tests, replace test env-var backdoor with PIPELEX_REMOTE_CONFIG_URL.

* feat: implement RemoteConfigCache for offline mode support and add integration tests

* feat: Implement new remote config fetching logic with cache fallback and provenance tracking

- Refactored `RemoteConfigFetcher.fetch_remote_config()` to return a `RemoteConfigResult` containing the fetched config, source of the config (FRESH or CACHED), and cache timestamp.
- Introduced `RemoteConfigUnavailableError` for scenarios where both network fetch and cache fallback fail, providing user-facing error messages with remediation steps.
- Added `RemoteConfigStaleWarning` to indicate when a cached config is used due to network issues.
- Updated all existing callers of `fetch_remote_config()` to accommodate the new return type and error handling.
- Enhanced tests to cover new behaviors, including success cases, network failures, and validation errors.
- Ensured that the internal retry logic raises `RemoteConfigFetchError` while the outer layer handles user-facing errors appropriately.
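A rough sketch of the result shape and fallback described in this commit follows. The names RemoteConfigResult, RemoteConfigSource, RemoteConfigUnavailableError and require_fresh come from the text; the function signature, the injected fetch / load_cache callables, the use of OSError as the network-failure signal, and the message wording are assumptions made for illustration. Validation-error handling is left out.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Any, Callable, Optional


class RemoteConfigSource(str, Enum):
    FRESH = "fresh"
    CACHED = "cached"


@dataclass(frozen=True)
class RemoteConfigResult:
    config: dict[str, Any]
    source: RemoteConfigSource
    cached_at: Optional[datetime] = None


class RemoteConfigUnavailableError(Exception):
    """Neither a fresh fetch nor an acceptable cache is available."""


CachedEntry = tuple[dict[str, Any], datetime]


def fetch_remote_config(
    fetch: Callable[[], dict[str, Any]],
    load_cache: Callable[[], Optional[CachedEntry]],
    *,
    require_fresh: bool = False,
) -> RemoteConfigResult:
    try:
        return RemoteConfigResult(config=fetch(), source=RemoteConfigSource.FRESH)
    except OSError as exc:  # assumed stand-in for "network unavailable"
        cached = load_cache()
        if cached is None:
            raise RemoteConfigUnavailableError("no local cache is available") from exc
        if require_fresh:
            # A cache exists, but the caller refuses to use it.
            raise RemoteConfigUnavailableError("local cache refused: a fresh fetch is required") from exc
        config, cached_at = cached
        return RemoteConfigResult(config=config, source=RemoteConfigSource.CACHED, cached_at=cached_at)
```

Callers read `.config` for the payload and can branch on `.source` when they need to know whether the data came from the network or from disk.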

* feat: Introduce GatewayUnknownModelError for missing models in gateway specs

- Added GatewayUnknownModelError to handle cases where a model referenced in the deck is not found in the active gateway specs.
- Enhanced model manager to enforce gateway model membership, raising the new error when discrepancies are detected.
- Updated remote config fetcher to include source provenance (FRESH vs CACHED) for better error messaging and telemetry control.
- Refactored related tests to ensure proper coverage for the new error handling and gateway configuration scenarios.
- Introduced RemoteConfigSource enum to streamline source tracking for remote configurations.

* feat: Implement remote config cache priming for offline mode in agent CLI

* feat: Enhance offline mode support with new remote config handling and E2E tests

* feat: Add offline mode support and error handling for remote config issues

* fix: Improve error message clarity for RemoteConfigUnavailableError when cache is refused

* feat: Enhance offline mode support with improved cache priming and error handling

* test: Add unit tests for cycle detection in alias and waterfall handling

* refactor: Remove RemoteConfigFetchError references and improve error handling for remote config issues

* docs: document offline-mode envelope warnings and init cache priming

Adds the `warnings` field to the agent CLI JSON success contract in
agent-cli.md (was missing the field this branch introduces), and notes
remote-config cache priming in init.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: add Offline Behavior section to gateway.md

Documents how Pipelex stays usable when the Gateway remote config
service is unreachable: BYOK skips the fetch entirely, Gateway mode
falls back to the primed on-disk cache, and only live inference still
needs the network.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: recognize .pipelex/ config dir as a project root marker

A directory containing a .pipelex/ config dir is now recognized as a
project root. Previously such a directory fell through to the global
~/.pipelex/ config, silently ignoring the project's own overrides.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* plan archive

* fix: refuse stale/malformed gateway config in offline paths

Address two P1 review findings on the offline-mode work:

- remote_config_fetcher: a cache with a valid wrapper but a malformed
  raw_config let a raw Pydantic ValidationError escape the offline
  fallback. Catch it and raise RemoteConfigUnavailableError with the
  normal remediation. Reword the message to "no usable local cache"
  so it is accurate for both missing and unusable caches.
- preprocess_test_models_cmd: _fetch_gateway_models swallowed
  require_fresh refusals into empty model lists, letting offline
  fixture generation proceed without any pipelex_gateway entries.
  Let the error propagate and surface a clear offline-mode panel.

Adds regression tests for both paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: verify cache exists after priming to avoid misreporting success

A successful remote-config fetch does not guarantee the on-disk cache was
written: RemoteConfigFetcher treats the cache write as opportunistic and
swallows OSErrors (read-only / full cache dir) with only a stderr warning.
attempt_prime_remote_config_cache trusted the fetch result alone, so it
could return primed=True while no usable cache existed, making
`pipelex-agent init` emit `cache_primed: true` and leaving later offline
runs to fail with RemoteConfigUnavailableError.

Verify a usable cache exists via RemoteConfigCache.load() after the fetch;
report priming failure with a clear remediation message otherwise.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: re-validate cached payload when priming remote config cache

The priming read-back check treated RemoteConfigCache.load() as a
usability check, but load() only validates the cache wrapper, not the
inner raw_config payload. A malformed payload could still report
primed=True. Now call to_remote_config() and treat a ValidationError as
a non-primed result, matching the existing check in
RemoteConfigFetcher.fetch_remote_config.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…eration collapse (#891)

* Plans

* Drop text-then-object structuring path from PipeLLM stack

Remove the entire "text then object" mechanism from PipeLLM down through cogt
and Temporal layers. The StructuringMethod.PRELIMINARY_TEXT enum value stays
so a future implementation can opt in; selecting it at runtime now raises
NotImplementedError. Rename make_object_direct -> make_object and
make_object_list_direct -> make_object_list since the "_direct" suffix only
existed to contrast with the deleted text-then-object variants.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs and plan

* fix image gen test

* answer plan questions

* TODO: Collapse `tprl_content_generation/` Workflow Layer

* TODOS: complete Phase 0 audit, lock in activity-id strategy (i)

Audit confirms today's per-workflow uniqueness invariant holds for the
collapse refactor: every operator-side ContentGeneratorProtocol method
is invoked at most once per WfPipeRouter execution (mutually-exclusive
branches in PipeLLM/PipeImgGen, single unconditional call in PipeExtract,
no calls from PipeCompose/StructuredContentComposer). Strategy (i) is
adopted with two mitigations for Phase 1: split the duplicate
"craft-image" default between make_single_image/make_image_list, and
construct distinct activity_ids inside make_extract_pages for its two
inner activity dispatches.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* plan

* plan reviewed

* plan review

* Phases 1-4: build ContentGeneratorInWorkflow behind feature flag

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Reintroduce preliminary_text via bundle elaboration + PipeStructure

Bring back structuring_method = "preliminary_text" through a build-time
elaboration pass instead of a runtime branch, and ship a reusable
PipeStructure (Text -> StructuredConcept) operator.

- New PipeStructure operator: blueprint, factory, runtime, spec,
  registered in PipeBlueprintUnion, PipeType, CoreRegistryModels,
  PipeSpecUnion / pipe_spec_map, and the MTHDS schema generator.
  Ships a structuring_prompt template under [cogt.llm_config.generic_templates].

- New BundleElaborator (pipelex/core/interpreter/bundle_elaborator.py)
  that rewrites a PipeLLMBlueprint with structuring_method =
  preliminary_text into a PipeSequence[PipeLLM(text), PipeStructure].
  Synthetic pipes are recorded on a new excluded side-table
  PipelexBundleBlueprint.elaboration_metadata so the language surface
  stays clean. Wired into PipelexInterpreter.

- Runtime PipeLLM no longer knows about structuring_method; the field
  stays on PipeLLMBlueprint as a build-time directive. A blueprint-level
  model_validator surfaces "preliminary_text + Text output" errors at
  authoring time; the elaborator's check stays as defense-in-depth.

- Tests: unit + integration coverage for PipeStructure, the
  elaboration pass (including image-input flow, multiplicity
  preservation, synthetic-name collision, main_pipe regression,
  defense-in-depth via model_construct), the spec round-trip, and an
  updated mthds-schema test that derives expected blueprint names from
  PipeType so future additions don't break it.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Phase 5: flip default to ContentGeneratorInWorkflow

Rename env flag PIPELEX_USE_IN_WORKFLOW_CONTENT_GENERATOR ->
PIPELEX_USE_LEGACY_CONTENT_GENERATOR and invert polarity so the new
direct-activity content generator is the default under temporal.is_enabled.
The old name's polarity (set = new) was awkward once new became default;
the new name reads as a clear opt-out for the legacy ContentGeneratorChild.

Mirrored in both Temporal integration conftests so explicit
set_content_generator(...) tracks the production gate.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Add Phase 8 round-out tests for PipeStructure + preliminary_text

Cover kajson round-trip for PipeStructureBlueprint and elaborated bundles,
end-to-end interpreter parsing of structuring_method = preliminary_text,
and integration tests exercising PipeStructure inside hand-authored
PipeSequence and PipeBatch as well as the full elaborated path.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Phase 6: delete legacy tprl_content_generation/ surface; add real-PDF page-views test

Collapses the tprl_content_generation/ workflow layer per TODOS.md Phase 6:
- Removes 11 source files: 6 WfMake* / WfRenderPageViews workflow wrappers,
  ContentGeneratorTop + factory, ContentGeneratorChild + factory, and
  content_generator_models.py.
- Removes 4 superseded test files (2 obsolete crafter tests + 2
  already-commented-out historical tests).
- Drops the PIPELEX_USE_LEGACY_CONTENT_GENERATOR env-flag branch from
  pipelex.py and both Temporal conftests; ContentGeneratorInWorkflowFactory
  is now wired unconditionally when temporal.is_enabled.
- Drops the WfMake* entries from the crafting TaskPack (workflow_list=[]);
  the activity_list is unchanged.
- Removes the top_crafter / child_crafter fixtures from the
  content_generation conftest.

Also fills the previously-deferred test gap from Phase 4 by adding
real-PDF end-to-end coverage for make_extract_pages with document_uri +
should_include_page_views=True:
- New WfTestContentGeneratorPdfPageViews workflow (registered in
  TEMPORAL_TEST_WORKFLOWS) exercises act_extract_gen_extract_pages plus
  act_render_page_views and asserts each PageContent.page_view is set.
- New TestTprlContentGeneratorPdfPageViews integration test runs against
  a 2-page local PDF to catch attachment-loop ordering bugs.
- Marked @extract @inference @dry_runnable @temporal.

TODOS.md Phase 6 marked complete; deploy/ship sections of Phase 8
removed (handled separately when the branch lands).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Strengthen PipeStructure e2e tests: multiplicity, LLM-call count, inline-structure example

Cover all three output multiplicities (single, dynamic list, fixed list) on
the elaborated preliminary_text path, assert exactly two LLM calls per run via
the reporting registry, and add a parallel e2e fixture whose concept is
declared entirely inline in the .mthds (HikingTripReport, 12 fields). Replace
the toy SimpleResult with a richer RestaurantReview Python class for the
non-inline tests, and switch all prompts to everyday non-AI topics
(restaurants, hiking) for clearer signal in live runs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Document PipeStructure operator and preliminary_text elaboration

Adds the user-facing PipeStructure page and an Under-the-Hood page
explaining the BundleElaborator mechanism, rewires the PipeLLM page
around the new build-time elaboration model, and records the change
in CHANGELOG.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Fix PipeStructure error_type for non-Text input

The single input is present but its concept is incompatible — classify as
INPUT_STUFF_SPEC_MISMATCH instead of MISSING_INPUT_VARIABLE so tooling and
logging classify the failure correctly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* plans

* Phase 7: cross-process e2e coverage + inference dispatch stopgap

Extracts _inference_dispatch_kwargs as the single deletion point for the
provisional inference_task_queue model, adds Tier 9/11 cross-process
regression tests (object-gen JSON round-trip + extract two-activity
contract), promotes split workers to required for image-gen tiers in the
temporal-e2e-validate skill, and refreshes the per-activity routing v1
design with explicit upgrade targets for the new tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Address pre-landing review findings on PipeStructure / preliminary_text

- Load synthetic helpers (`__draft_text`, `__structure`) alongside their
  exported parent in `LibraryManager._load_single_dependency`; without this,
  exported `preliminary_text` pipes ship a wrapping PipeSequence whose
  helpers are filtered out by the manifest, breaking consumers at runtime.
- Emit `PipeLLMSpec.structuring_method` and `PipeStructureSpec` in the
  spec-to-TOML serializers (`builder/operations/pipe_ops.py` and
  `cli/agent_cli/commands/pipe_cmd.py`); both fields were silently dropped.
- Reject multiplicity inputs (`Text[]`, `Text[N]`) on `PipeStructureBlueprint`
  to fail fast at parse time instead of crashing inside
  `working_memory.get_stuff_as_str` at runtime.
- Document the process-local lifetime of `elaboration_metadata` (survives
  `model_copy`, dropped on `model_dump`/`model_validate`) on the field, in
  `under-the-hood/build-time-elaboration.md`, and via a regression test;
  capture future cross-boundary persistence as TODO #10.
- Note the `StructuringMethod` import-path move in `[Unreleased]/Changed`
  and document the third (library-time / concept) layer of the
  output-Text guard.
- Add unit coverage for the new contracts: spec-to-TOML round-trip for
  `structuring_method` and `PipeStructureSpec.model`; multiplicity-input
  rejection; `validate_output_with_library` rejecting Text-refining
  concepts; `validate_inputs_with_library` accepting Text-refining
  concepts; `PipeLLMBlueprint.validate_preliminary_text_output` direct
  test; `_load_single_dependency` synthetic-helper loading.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Phase 7 follow-up: fix Tier 11 SKILL.md marker filter

The Tier 11 pytest command in temporal-e2e-validate skill used
`-m "extract and temporal"` but the test is only marked `@pytest.mark.temporal`
(the `@extract` marker was deliberately dropped because the substitute
activities mean no real Azure Document Intelligence or pypdfium2 dependency
is exercised — see TODOS.md Phase 7). Running the documented command would
deselect every test (pytest exit 5) and the operator would conclude "Tier 11
has no tests."

Switch the filter to `-m temporal` and rewrite the surrounding paragraph to
make the substitute-fixture rationale explicit so a reader does not assume
real OCR credentials are required.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Phase 7 follow-up: code-review nits

Three small cleanups from the post-Phase-7 code review:

1. Tighten test_split_worker_extract_pages assertion to filter scheduled
   activity events by activity_type.name rather than activity_id suffix.
   The previous endswith(("-pages", "-render-page-views")) filter would
   false-positive against any future test passing a wfid like "my-pages"
   to a generator method on the same fixture workflow. Pinning to the
   activity name (act_extract_gen_extract_pages, act_render_page_views)
   is strictly more robust. Failure message also now includes (type, id)
   pairs so a regression is easier to triage.

2. Drop ConfigDict(arbitrary_types_allowed=True) from the new
   FixtureLineItem / FixtureCustomer / FixtureInvoice models — every
   field is a primitive or another BaseModel, so the config is dead
   weight. Person keeps its config; that line predates this branch.

3. Document why # noqa: TC001 is intentional on ContentGeneratorProtocol
   in wf_test_structured_output_cross_process.py — the import sits inside
   workflow.unsafe.imports_passed_through() and must stay runtime so
   Temporal can replay history. Adds a one-line comment so a future
   reader does not "fix" it under TYPE_CHECKING and break replay.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Add PR brief HTML for preliminary_text + PipeStructure work

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Phase 7 follow-up: address PR #878 review comments

- Scope `_seen_activity_ids` cache by `(workflow_id, run_id)` so retries,
  `continue_as_new`, and id-reuse policies don't inherit prior-run entries.
- Add `@update_job_metadata` to `make_templated_text` for consistent
  `content_generation_job_id` tracking.
- Clarify `_inference_dispatch_kwargs` docstring scope (LLM text only).
- Update SKILL.md hang-debug references to split-worker sessions.
- Add regression tests for run_id scoping and templated_text job metadata.
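The run-scoped cache from the first bullet can be sketched as a dictionary keyed by the (workflow_id, run_id) pair. ActivityIdTracker and its register method are hypothetical names; only the keying scheme reflects the commit message.

```python
from collections import defaultdict
from typing import DefaultDict, Set, Tuple


class ActivityIdTracker:
    """Tracks activity ids per workflow run rather than per workflow id."""

    def __init__(self) -> None:
        # Key is (workflow_id, run_id): a retry or continue_as_new gets a new
        # run_id and therefore starts from an empty set.
        self._seen: DefaultDict[Tuple[str, str], Set[str]] = defaultdict(set)

    def register(self, workflow_id: str, run_id: str, activity_id: str) -> bool:
        """Return True the first time activity_id is seen within this run."""
        seen_in_run = self._seen[(workflow_id, run_id)]
        if activity_id in seen_in_run:
            return False
        seen_in_run.add(activity_id)
        return True
```

Keying on workflow_id alone would make the second run of a reused workflow id look like a duplicate dispatch.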

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Fix @variable regex to not match emails; add Tier 9b bug analysis

Add negative lookbehind to @variable patterns in template preprocessor
so emails like alice@example.com stay literal. Document cross-process
ListContent decode bug in wip/.
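A negative lookbehind of the kind described here can be illustrated as follows. The pattern is a hypothetical one written for this example, not the expression used in the Pipelex template preprocessor, and the dotted-name syntax it accepts is an assumption.

```python
import re
from typing import List

# Hypothetical pattern: "@" only starts a variable when it is NOT preceded by
# a word character or a dot, which is what an email's local part ends with.
VARIABLE_PATTERN = re.compile(r"(?<![\w.])@([A-Za-z_]\w*(?:\.\w+)*)")


def find_variables(template: str) -> List[str]:
    """Return the variable names referenced as @name or @name.attr."""
    return VARIABLE_PATTERN.findall(template)
```

In `alice@example.com` the `@` follows the letter `e`, so the lookbehind rejects it, while `@topic` after a space still matches.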

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Fix cross-process Temporal decode of ListContent for dynamic concepts

Rename per-item type markers in WorkingMemory.dump_for_temporal() from
kajson's reserved `__class__` / `__module__` to pipelex-private
`__pipelex_class__` / `__pipelex_module__`. Kajson's universal decoder
gates strictly on `__class__`, so nested dicts now pass through the
Temporal data converter untouched and class binding stays inside
pipelex's hydrator where the per-workflow ClassRegistry lives. Extends
CLEAN_JSON_FIELDS_TO_SKIP to strip both marker families. Adds unit
tests pinning the wire-format contract and the kajson-isolation
invariant in both directions.
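The isolation argument above can be shown with a toy decoder. universal_decode is a simplified stand-in for a decoder that binds any dict carrying `__class__`; it is not kajson's code, and hydrate_list_item is a hypothetical helper that mirrors the role the text gives to pipelex's hydrator.

```python
from dataclasses import dataclass
from typing import Any, Dict


class DecoderError(Exception):
    """Raised by the stand-in decoder when a class cannot be bound."""


@dataclass
class Invoice:
    number: str
    total: float


def universal_decode(value: Any, global_registry: Dict[str, type]) -> Any:
    """Eagerly bind every dict that carries a '__class__' key."""
    if isinstance(value, list):
        return [universal_decode(item, global_registry) for item in value]
    if isinstance(value, dict):
        decoded = {key: universal_decode(item, global_registry) for key, item in value.items()}
        if "__class__" in decoded:
            class_name = decoded["__class__"]
            if class_name not in global_registry:
                raise DecoderError(f"cannot bind class '{class_name}'")
            fields = {k: v for k, v in decoded.items() if k not in ("__class__", "__module__")}
            return global_registry[class_name](**fields)
        return decoded
    return value


def hydrate_list_item(item: Dict[str, Any], scoped_registry: Dict[str, type]) -> Any:
    """Bind the class later, where the per-workflow registry is available."""
    cls = scoped_registry[item["__pipelex_class__"]]
    fields = {k: v for k, v in item.items() if k not in ("__pipelex_class__", "__pipelex_module__")}
    return cls(**fields)


old_payload = [{"__class__": "Invoice", "__module__": "dynamic", "number": "A-1", "total": 12.5}]
new_payload = [{"__pipelex_class__": "Invoice", "__pipelex_module__": "dynamic", "number": "A-1", "total": 12.5}]
```

With an empty global registry, decoding old_payload fails at the boundary, whereas new_payload passes through as plain dicts and is bound afterwards by hydrate_list_item against the scoped registry.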

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* lint comment

* Phase 8: docs + docstrings reflect direct-activity dispatch

Removes all `WfMake*` / `wf_make_*` / `WfRenderPageViews` / `ActLLMGenText`
references from `docs/under-the-hood/` and the split-worker test docstring,
and adds the `[Unreleased] Changed` entry documenting the
`tprl_content_generation/` workflow-layer collapse — including the
surfaced page-views-augmentation fix in Temporal mode that was previously
a silent no-op via `WfMakeExtract`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Move TODOS.md into wip/ as executed v2-plan; add co-dev HTML brief

TODOS.md was the live execution log for the collapse-content-generation
workflow-layer refactor — now in its final state with all 9 phases checked
and decisions/follow-ups recorded. Move it next to its v2 analysis as
collapse-content-generation-workflow-layer-v2-plan.md (overwriting the
stale pre-execution plan). Add an HTML brief summarizing the refactor
for co-developer onboarding.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Fix docs links to deleted llm-structured-generation-config page

Remove dangling references to the doc page deleted alongside the
text-then-object structuring path. Reframe the llm-integration
Structured Output section around direct provider-native structured
outputs, which is the only supported approach.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* move plan

* Refine per-activity queue routing plan: cold-start checklist + TTO orthogonality note

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Per-activity, per-handle Temporal queue routing (v1)

Replace the provisional `inference_task_queue` (LLM-text-only override) with
a general `activity_queues` table on `WorkerConfig`. Each entry declares a
per-activity `default` queue and an optional `by_handle` map keyed by
runtime handle (LLM model handle, `img_gen_handle`, `extract_handle`).
`WorkerConfig.resolve_queue(activity_name, routing_key)` walks three layers:
per-handle override, activity default, worker-wide `task_queue`. Every
`ContentGeneratorInWorkflow` dispatch site now passes
`task_queue=resolve_queue(...)` uniformly; the asymmetric LLM-text kwarg
and the `_inference_dispatch_kwargs` stopgap are gone, and
`inference_task_queue` is deleted (no backward-compat shim).
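
The three-layer walk described above could look roughly like this. It is a sketch only: `ActivityRoute` and `WorkerRouting` are hypothetical stand-ins for the real config classes, and the queue names are invented:

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ActivityRoute:
    """Hypothetical stand-in for one entry of the activity_queues table."""
    default: str
    by_handle: dict[str, str] = field(default_factory=dict)


@dataclass
class WorkerRouting:
    """Hypothetical stand-in for the routing part of WorkerConfig."""
    task_queue: str
    activity_queues: dict[str, ActivityRoute] = field(default_factory=dict)

    def resolve_queue(self, activity_name: str, routing_key: Optional[str]) -> str:
        route = self.activity_queues.get(activity_name)
        if route is None:
            # Layer 3: no entry for this activity, use the worker-wide queue.
            return self.task_queue
        if routing_key is not None and routing_key in route.by_handle:
            # Layer 1: the per-handle override wins.
            return route.by_handle[routing_key]
        # Layer 2: the activity default.
        return route.default


routing = WorkerRouting(
    task_queue="q_main",
    activity_queues={
        "act_llm_gen_text": ActivityRoute(default="q_llm", by_handle={"model-a": "q_llm_a"}),
    },
)
print(routing.resolve_queue("act_llm_gen_text", "model-a"))  # q_llm_a
print(routing.resolve_queue("act_llm_gen_text", "model-b"))  # q_llm
print(routing.resolve_queue("act_img_gen_images", None))     # q_main
```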

Tests: rewrote unit pins to assert uniform task_queue contract; added
`test_worker_config_resolve_queue.py` covering all three resolution
layers; migrated the split-worker LLM-text integration fixture to the new
config; added `route_activities_to(...)` helper so object/extract
substitute-activity tests route back to their UUID queue.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Routing validation: Step 8 battery + TOML examples + parsing test

Expand the temporal-e2e-validate skill with a self-contained Step 8 that
proves per-activity, per-handle routing works end-to-end against a live
Temporal server. Replaces the previous Tier 10 (which was awkwardly
embedded inside Tier 4/5 and only covered image-gen) with three sub-tiers:

- 10a: multi-activity isolation — act_llm_gen_text (activity default) +
  act_img_gen_images dispatched to dedicated workers, runner sees 0 hits.
- 10b: per-handle routing — same activity, two distinct model handles
  (claude-4.6-sonnet vs gemini-flash-latest) land on by_handle workers,
  proving the per-handle layer wins over the activity default.
- 10c: two activities sharing one route — act_extract_gen_extract_pages +
  act_render_page_views (routing_key=None) both land on q_extract,
  exercising the activity-default fallback for handle-less activities.

Add commented activity_queues examples to pipelex.toml,
.pipelex/pipelex.toml, and pipelex/kit/configs/pipelex.toml so operators
have a copy-paste invitation to override routing per deployment. Add a
unit test that parses a representative activity_queues TOML fragment
into WorkerConfig and exercises resolve_queue on it — regression guard
for the commented examples.

New fixtures: per_handle_routing.mthds (Tier 10b) and
pdf_extract_page_views.mthds + pdf_extract_inputs.json (Tier 10c).

All three sub-tiers validated live against PR #879's resolver.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* temporal-e2e-validate: document Pipelex Gateway path for Tier 10c

Tier 10c was marked conditional pending direct Azure Document Intelligence
credentials, but PIPELEX_GATEWAY_API_KEY + PIPELEX_INFERENCE_API_KEY proxy
that backend. Document the gateway path with the existing
pdf_extract_page_views.mthds bundle, an inputs JSON template, and a warning
not to substitute mistral-ocr/deepseek-ocr.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* change-doc and improve config

* Move activity_queues default to main TOML, drop Python default

Per the project rule that defaults live in pipelex/pipelex.toml (not in
the class definition), set activity_queues = {} explicitly in the main
config and remove Field(default_factory=dict) from both
WorkerConfig.activity_queues and ActivityRouteConfig.by_handle. Commented
routing examples now live only in the kit config copy that pipelex init
config surfaces to users. Test fixtures and helper call sites updated to
pass by_handle={} explicitly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* plans

* Temporal queue options + worker-runtime profiles (v2 config)

Phase 0-6 of the queue-options-and-worker-profiles plan. Adds per-queue
submitter options (timeouts, retry, server-side rate limit) and per-worker
runtime profiles (concurrency slots, pollers, worker-local rate limit) on
top of v1 per-activity routing.

- Schema: QueueOptions, HandleOptions, WorkerRuntimeProfile,
  WorkerRuntimeProfilesConfig, WorkerTuningMode, DispatchOptions.
  Renamed worker_config.task_queue -> default_task_queue.
- Resolver: WorkerConfig.resolve_dispatch() composes baseline ->
  queue_options -> handle_options last-wins for scalars, additively for
  non_retryable_error_types. Every workflow.execute_activity in
  ContentGeneratorInWorkflow now goes through the resolver, fixing the
  workflow_execution_timeout-as-activity-timeout bug.
- Worker tuning: TemporalTaskManager.make_worker reads a
  WorkerRuntimeProfile (--profile CLI flag). Queue-level
  max_task_queue_activities_per_second flows from queue_options into
  Worker(...).
- Validation: lenient warn at config load on routing entries naming
  queues with no queue_options entry; strict fail at worker CLI startup
  on unknown --task-queue (Levenshtein 'did you mean?' suggestion).
  Overlay layers reject non-empty non_retryable_error_types (must use
  _extra) to prevent silent drops.
- Tracing: is_dispatch_resolution_traced flag emits per-call resolver
  trace lines with source-layer attribution.
- Specialized scopes: runner-llm / runner-img-gen / runner-extract /
  runner-jinja2 for deployment manifests with one worker pool per
  backend class.
- E2E skill: SKILL.md Step 9 documents v2 scenarios A-F.
- Cleanup: deleted dead pipelex/temporal/wrapper/.
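
To illustrate the resolver bullet above (last-wins for scalars, additive for non-retryable error types), here is a simplified sketch. `DispatchLayer` and `compose_layers` are hypothetical names, and the sketch uses a single list field per layer rather than the separate `_extra` field the real overlays require:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DispatchLayer:
    """Hypothetical layer: None means 'not set at this layer'."""
    start_to_close_timeout_s: Optional[float] = None
    maximum_attempts: Optional[int] = None
    non_retryable_error_types: tuple[str, ...] = ()


def compose_layers(*layers: DispatchLayer) -> DispatchLayer:
    """Compose layers in order: later scalars win, error types accumulate."""
    timeout: Optional[float] = None
    attempts: Optional[int] = None
    error_types: list[str] = []
    for layer in layers:
        if layer.start_to_close_timeout_s is not None:
            timeout = layer.start_to_close_timeout_s
        if layer.maximum_attempts is not None:
            attempts = layer.maximum_attempts
        for error_type in layer.non_retryable_error_types:
            if error_type not in error_types:
                error_types.append(error_type)
    return DispatchLayer(timeout, attempts, tuple(error_types))


baseline = DispatchLayer(300.0, 3, ("ValidationError",))
queue_layer = DispatchLayer(start_to_close_timeout_s=120.0, non_retryable_error_types=("QuotaError",))
handle_layer = DispatchLayer(maximum_attempts=5)
resolved = compose_layers(baseline, queue_layer, handle_layer)
print(resolved)
```

Here the queue layer overrides the timeout, the handle layer overrides the attempt count, and both error types survive.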

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Use tomli. cleanenv not erase lock

* Address PR #880 review comments (lazy temporalio + queue fallback + retry layer split)

Fix four actionable review-bot threads:

1. Restore lazy temporalio import (chatgpt-codex P1). RetryPolicy moves back
   under TYPE_CHECKING; DispatchOptions becomes a @dataclass so the module
   loads without the optional temporal extra installed.

2. Hybrid workflow-local queue fallback (chatgpt-codex P1). When activity_queues
   is empty (default config), resolve_queue returns None and to_execute_kwargs
   omits task_queue — Temporal then uses the workflow's own queue, restoring
   the with_conditional_worker test isolation pattern. With any routing
   configured, the prior explicit-routing semantics are preserved.

3. Derive content-generation activity names from CRAFTING task pack
   (cubic-dev-ai P2). Test no longer hardcodes the set; new content-gen
   activities are automatically tracked.

4. Split RetryPolicyConfig into baseline + overlay classes (greptile P2).
   ConfigModel's extra="forbid" now enforces the layer asymmetry at the
   type level; baseline non_retryable_error_types_extra entries that used
   to be silently dropped now fail at config load. Removes two now-redundant
   overlay-side validators; drops the misleading baseline _extra TOML stub.
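
The hybrid fallback in item 2 can be pictured as follows. This is a sketch under assumptions: the function signatures are simplified and hypothetical, and the timeout key is a placeholder rather than the real kwarg set:

```python
from typing import Any, Optional


def resolve_queue(
    activity_queues: dict[str, str],
    activity_name: str,
    default_task_queue: str,
) -> Optional[str]:
    """Return None when no routing is configured, else an explicit queue."""
    if not activity_queues:
        # Default config: let Temporal use the workflow's own queue.
        return None
    return activity_queues.get(activity_name, default_task_queue)


def to_execute_kwargs(task_queue: Optional[str], timeout_s: float) -> dict[str, Any]:
    """Build activity kwargs, omitting task_queue when it resolved to None."""
    kwargs: dict[str, Any] = {"timeout_s": timeout_s}
    if task_queue is not None:
        kwargs["task_queue"] = task_queue
    return kwargs


unrouted = to_execute_kwargs(resolve_queue({}, "act_llm_gen_text", "q_main"), 60.0)
routed = to_execute_kwargs(
    resolve_queue({"act_llm_gen_text": "q_llm"}, "act_llm_gen_text", "q_main"), 60.0
)
print(unrouted)  # no task_queue key
print(routed)    # task_queue is q_llm
```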

Adds an AST regression test that fails if temporalio appears at module level
in config_temporal.py, plus per-fix regression assertions in the dispatch /
resolve_queue / TOML parsing test suites.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix Makefile

* Address PR #880 follow-up review comments (allowed-tools + default_task_queue rename docs)

- Add `temporal` and `jq` to allowed-tools in the temporal-e2e-validate skill
  so Step 9's workflow-history checks (which shell out to those commands to
  read per-queue `start_to_close_timeout` from the live server) are actually
  executable under the permission policy (cubic-dev-ai P1).

- Update stale `worker_config.task_queue` references to the renamed
  `default_task_queue` field in `.claude/skills/temporal-e2e-validate/SKILL.md`
  and `CHANGELOG.md` so the documentation matches the v2 config surface
  (cubic-dev-ai P2 × 2).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Improve AST regression scanner to catch guarded top-level temporalio imports

Address cubic-dev-ai follow-up on the AST regression test for PR #880 #1:
the original scanner only walked `tree.body`, so a `try: from temporalio... `
or `if SOME_FLAG: from temporalio...` block at the module top level (which
still executes at import time) would slip through unflagged.

The scanner now recurses into `ast.If` (except `if TYPE_CHECKING:` bodies,
which never run at runtime) and `ast.Try` (body, handlers, orelse, finalbody).
Function and class bodies stay skipped — lazy imports inside them only fire
when the function is called, which is the intended pattern.
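
A scanner of this shape could be sketched as below. `find_eager_imports` and its helpers are hypothetical names, not the test's actual code; the recursion rules follow the description above:

```python
import ast


def _is_type_checking_guard(test: ast.expr) -> bool:
    """True for `TYPE_CHECKING` or `typing.TYPE_CHECKING` used as an if-test."""
    if isinstance(test, ast.Name):
        return test.id == "TYPE_CHECKING"
    if isinstance(test, ast.Attribute):
        return test.attr == "TYPE_CHECKING"
    return False


def _imports_package(node: ast.stmt, package: str) -> bool:
    if isinstance(node, ast.Import):
        return any(a.name == package or a.name.startswith(package + ".") for a in node.names)
    if isinstance(node, ast.ImportFrom):
        module = node.module or ""
        return node.level == 0 and (module == package or module.startswith(package + "."))
    return False


def find_eager_imports(source: str, package: str) -> list[int]:
    """Line numbers of imports of `package` that execute at module import time."""
    hits: list[int] = []

    def visit(statements: list[ast.stmt]) -> None:
        for node in statements:
            if _imports_package(node, package):
                hits.append(node.lineno)
            elif isinstance(node, ast.If):
                if not _is_type_checking_guard(node.test):
                    visit(node.body)
                visit(node.orelse)  # the else branch does run at runtime
            elif isinstance(node, ast.Try):
                visit(node.body)
                for handler in node.handlers:
                    visit(handler.body)
                visit(node.orelse)
                visit(node.finalbody)
            # Function and class bodies are skipped on purpose (lazy imports).

    visit(ast.parse(source).body)
    return hits


SAMPLE = """\
from typing import TYPE_CHECKING
if TYPE_CHECKING:
    from temporalio.common import RetryPolicy
try:
    from temporalio import workflow
except ImportError:
    workflow = None
def lazy():
    import temporalio
"""
print(find_eager_imports(SAMPLE, "temporalio"))  # prints [5]
```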

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Lock in extra="forbid" on RetryPolicyConfig split classes

Type-design review of PR #880's RetryPolicyConfig split flagged that the layer
asymmetry invariant (baseline owns the main list, overlays own _extra) is
load-bearing on ConfigModel's ambient extra="forbid" setting. If a future
contributor ever flipped that to "allow" on either subclass, the silent-drop
bug for baseline non_retryable_error_types_extra would silently come back.

Add a one-line assertion per class so the invariant fails loudly at unit-test
time instead of regressing at runtime.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Polishing

* plans

* Temporal IDs + observability redesign (workflow_id, activity_id, search attributes)

Workflow IDs derive directly from pipeline_run_id ({env_prefix}{pipeline_run_id});
session/random/class-name components are removed. Child workflow IDs use `/` as
the path separator. Activity IDs are no longer customized — the Temporal SDK
assigns deterministic integers per workflow run, removing the duplicate-id
failure mode and the LRU + replay-short-circuit machinery that defended against
it. The wfid parameter is dropped from PipeRunProtocol, PipeRouterProtocol,
ContentGeneratorProtocol, and every implementation. Every workflow start sets
five Keyword search attributes (PipeCode, PipelineRunId, SessionId, UserId,
DomainCode), a static_summary, and static_details; every execute_activity call
carries a per-call summary= built from the new observability helpers. A
soft-fail bootstrap check at worker boot warns when the namespace is missing
the required search attributes, including the registration command. Tests
cover the new helpers, the executor passthrough, the workflow-id construction,
the search-attribute dict, and the bootstrap check.
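
As a rough illustration of the ID scheme above, assuming hypothetical helper names and an invented child segment (the commit does not say what the child path component contains):

```python
def make_workflow_id(pipeline_run_id: str, env_prefix: str = "") -> str:
    """Top-level workflow id: {env_prefix}{pipeline_run_id}."""
    return f"{env_prefix}{pipeline_run_id}"


def make_child_workflow_id(parent_workflow_id: str, child_segment: str) -> str:
    """Child workflow id: parent id, then '/' as the path separator."""
    return f"{parent_workflow_id}/{child_segment}"


parent_id = make_workflow_id("run-123", env_prefix="dev-")
child_id = make_child_workflow_id(parent_id, "step-a")
print(parent_id)  # dev-run-123
print(child_id)   # dev-run-123/step-a
```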

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Migrate Temporal search attributes to TypedSearchAttributes + unify child-spawn paths

Phase 5 of the Temporal IDs/Naming redesign: replace deprecated dict-based
search attributes with TypedSearchAttributes throughout the workflow layer.
Adds five module-level SearchAttributeKey constants in observability.py and
flips the type annotations on the WorkflowExecutor surface. As a follow-up,
unifies the last raw workflow.execute_child_workflow call in wf_pipe_run.py
to route through WorkflowExecutor.execute_child_workflow, matching the
pattern already used by TemporalPipeRouter's child branch.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Pre-Phase-6 cleanup: tighten exception handling + WfPipeRun failure-path test

- Replace catch-all except Exception in workflow_caller.py with named SDK
  exceptions (WorkflowAlreadyStartedError, RPCError, WorkflowFailureError
  on the client path; ChildWorkflowError-only on the child path).
- Add failure-path integration test for WfPipeRun pinning the
  exception-type shift from ChildWorkflowError to WorkflowExecutionError
  introduced by the Phase 5 child-spawn-path unification.
- Fix latent production hang: register WorkflowExecutionError via
  workflow_failure_exception_types on the production Worker (and the
  test Worker). Without this, any workflow re-raising
  WorkflowExecutionError triggers indefinite workflow-task retry
  instead of failing terminally, because WorkflowExecutionError is not
  a temporalio.exceptions.FailureError subclass.
- Fix TODOS.md doc path: docs/under-the-hood/temporal-deployment.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Add wip note for Temporal error-handling revamp

Captures the deferred design work to make WorkflowExecutionError inherit
from temporalio.exceptions.ApplicationError, which would remove the
workflow_failure_exception_types Worker-side registration added in
117bbe01. Documents scope, open questions, and trigger conditions for
the eventual cleanup.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Phase 6: hard-fail worker boot + configurable search attributes + setup CLI

Flip the Temporal search-attribute boot check from warn-and-continue to
hard-fail on reachable namespaces — the previous framing was dishonest
because real clusters reject every workflow start that references an
unregistered attribute. Add a [temporal.search_attributes] config block
(master enabled toggle + opt-in subset of the five built-ins), and a new
`pipelex setup-temporal-namespace` CLI that wraps the registration via
the same connection config the worker uses, with a permission-denied
fallback runbook for Temporal Cloud namespaces.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fixes

* Sync docs with reverted child-spawn unification

TODOS.md, the WfPipeRun failure-path test docstring, the worker
workflow_failure_exception_types comment, and the WorkflowExecutor
child-spawn wrapper docstrings still described the briefly-unified
state from the Phase 5 follow-up. The Phase 6 follow-up (commit
ac8e2335) reverted that unification for replay-determinism reasons —
WorkflowExecutorFactory.create_executor seeds config-derived options
that would be baked into the recorded StartChildWorkflowExecution
command — but the surrounding prose was not updated to match.

Updated narrative to reflect the current state: both child-spawn
sites call workflow.execute_child_workflow(...) directly and wrap
ChildWorkflowError as WorkflowExecutionError in-place; the unused
WorkflowExecutor.execute_child_workflow / start_child_workflow
wrapper methods now carry warnings explaining the in-workflow
replay-determinism trap. No code-behavior changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Reorganize wip/temporal-primitives/: archive shipped plans, renumber the rest

Move plans for shipped work (id-and-naming-plan pre-checkpoints, collapse-content-generation v2 plan+analysis+HTML, per-activity-queue-routing-v1, queue-options-and-worker-profiles plan+design, text-then-object brief, operators-as-activities analysis, workflow-and-activity-ids problem statement) into wip/archive/. Renumber the four surviving files with sortable prefixes: 00-temporal-id-primitives (evergreen reference), 01-id-and-naming-design (refreshed status: "Implemented"), 02-id-and-naming-plan (formerly top-level TODOS.md; refreshed status: "Phases 1-6 shipped"), 03-temporal-error-handling-revamp (the only deferred open item).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs

* docs and plans

* Expand deferred-items: capture why snapshot-on-workflow-input was wrong, point at Worker Versioning

Document the rejected `TemporalDispatchSnapshot` approach (architectural
inversion, payload bloat, defeats central-config purpose, not
Temporal-idiomatic) and the three-option roadmap (docs+replay test /
Worker Versioning / thin search-attrs-only snapshot) so future readers
don't re-attempt the same wrong-shaped fix.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* small fixes

* recap

* polish

* docs

* fix skill for tests and fix bugs

* add CV batch screening pipeline and tests for deeply-nested controller validation

* fix PR #891 review comments: task-routing TOML path, outdated routing claim, contradictory escape-hatch example, skill allowed-tools gaps + unbounded worker-start waits

- docs/distributed-execution/task-routing.md: correct activity_queues path to [temporal.worker_config.activity_queues.*] throughout (per-activity routing, per-handle overlays, resolution-order list)
- docs/under-the-hood/pipe-routing-and-execution.md: replace stale act_llm_gen_text-only claim with current resolve_dispatch behavior across all content-generation activities
- wip/temporal-primitives/id-and-naming.html: flip escape-hatch example to enabled=false so it matches the "turn off custom attributes entirely" header
- .claude/skills/temporal-e2e-validate/SKILL.md: add timeout/pkill/sleep/echo/tail/seq to allowed-tools; replace unbounded `until ... grep ...; do sleep 1; done` with bounded 30s waits that dump last 50 lines and exit 1 on timeout (both two-scoped-workers and single-worker blocks)
- wip/temporal-next/01-deferred-items.md: cross-cite cubic-dev-ai on the replay-determinism deferral so both reviewers' flags are visible

* fix cv_batch_screening_job missing live-mode reporting registry; dedup with shared helper

The _cv_job_iter helper was copy-pasted from pipe_job_from_library but dropped the
open_registry/close_registry bracket and let build_pipe_job mint its own random
pipeline_run_id, so --pipe-run-mode live runs would fail in reporting.

Extended pipe_job_from_library with an optional working_memory_builder hook, deleted
the duplicate, and routed cv_batch_screening_job through the single source of truth.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* hide elaboration_metadata from JSON schema and document missing PipeExtract / ReasoningEffort fields

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* refactor: move kit-config sync into pipelex-dev CLI (#915)

* fix: update configuration guidelines and defaults in pipelex.toml and documentation

* refactor: move kit-config sync into pipelex-dev CLI

Replace the Makefile `up-kit-configs` rsync recipe with a native
`pipelex-dev sync-kit-configs` command. Add a reusable `mirror_dir`
utility (recursive copy + delete) that derives its exclude list from
the single source of truth in pipelex/kit/paths.py — the same sets
`check-config-sync` enforces, so a sync is always followed by a
passing check. Drops the rsync dependency (not portable to Windows)
and the `$(shell python -c ...)` indirection.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: address PR #915 review-agent comments on kit-config sync

- mirror_dir: validate source_dir is an existing directory before the
  delete pass, so an invalid source can no longer wipe the target tree
- mirror_dir: unlink target-only directory symlinks instead of passing
  them to shutil.rmtree, which raised and aborted the sync
- mirror_dir: record created directories in MirrorDirResult.created_dirs
  so an added empty directory counts as a change and dry-run reports it
- sync-kit-configs: CLI pre-check uses is_dir(); display created dirs
- update/check-gateway-models: write and verify the gateway model docs
  in both .pipelex/ and the packaged pipelex/kit/configs/ locations so
  the shipped kit copy can no longer silently go stale
- pipelex.toml: replace activity_queues = {} inline table with the
  [temporal.worker_config.activity_queues] empty table header so users
  can add routing sub-tables without a TOML parse error

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: replace target symlinks in mirror_dir instead of writing through them

When a target path was a symlink, is_file() followed the link and
copy_file() wrote through it, corrupting the external file the symlink
pointed to. Pass 2 now unlinks target symlinks before copying, so the
mirror tree always holds real files.
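
The behaviour described in these two fix commits can be sketched as follows. This is a simplified, hypothetical version: it omits the exclude list, the dry-run mode, and the `MirrorDirResult` bookkeeping, and it assumes the source tree holds only regular files and directories:

```python
import shutil
from pathlib import Path


def mirror_dir(source_dir: Path, target_dir: Path) -> None:
    """Make target_dir a copy of source_dir (recursive copy + delete)."""
    # Validate the source before any delete, so a bad path cannot wipe the target.
    if not source_dir.is_dir():
        raise NotADirectoryError(f"{source_dir} is not an existing directory")
    target_dir.mkdir(parents=True, exist_ok=True)

    # Delete pass: remove entries that exist only in the target.
    source_names = {entry.name for entry in source_dir.iterdir()}
    for target_entry in target_dir.iterdir():
        if target_entry.name in source_names:
            continue
        if target_entry.is_symlink() or target_entry.is_file():
            target_entry.unlink()  # unlink symlinks, never rmtree them
        else:
            shutil.rmtree(target_entry)

    # Copy pass: replace target symlinks instead of writing through them.
    for source_entry in source_dir.iterdir():
        target_entry = target_dir / source_entry.name
        if source_entry.is_dir():
            if target_entry.is_symlink() or target_entry.is_file():
                target_entry.unlink()
            mirror_dir(source_entry, target_entry)
        else:
            if target_entry.is_symlink():
                target_entry.unlink()
            elif target_entry.is_dir():
                shutil.rmtree(target_entry)
            shutil.copy2(source_entry, target_entry)
```

Checking `is_symlink()` before `is_file()` or `is_dir()` matters because the latter two follow links.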

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Feature/error handling 2 (#913)

* docs: reorganize error-handling wip docs into track-based structure

Replaces phase-numbered error-handling docs with a track-based layout
under wip/error-handling/. Each track is a self-contained concern
(metadata model, worker classification, retry, CLI delivery, Temporal
integration, testing) with current state, open gaps, and followups —
no implicit ordering between tracks. Refreshes stale file paths and
worker-tier framings to match the current codebase.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* plans

* feat: implement UNKNOWN error category and enhance error handling

- Added `UNKNOWN` category to `InferenceErrorCategory` to prevent misclassification of unrecognized SDK exceptions.
- Introduced `extract_underlying_sdk_exception` function to recover SDK exceptions from `InstructorRetryException`.
- Updated `AnthropicLLMWorker` to utilize the new extraction method and categorize errors correctly.
- Created shared test helpers for instructor-related tests and added unit tests for new functionality.

* feat: ProviderErrorMetadata on inference errors (Phase 3)

Adds structured SDK metadata (status_code, request_id, retry_after,
provider_error_code, body) to every CogtError via a new
ProviderErrorMetadata Pydantic model, plus extract_anthropic_metadata
helper. Anthropic worker now populates provider_metadata on every
categorized raise. ErrorReport serializes it through to_error_report().

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Refactor user action handling in error reporting

- Introduced structured `UserAction` and `UserActionKind` to replace free-form strings in error handling across various workers.
- Updated error handling in Google, Hugging Face, Linkup, Mistral, and OpenAI plugins to utilize the new `UserAction` model for user guidance.
- Enhanced tests to validate the new user action structure and ensure consistent error reporting.
- Adjusted existing tests to check for `UserAction` details instead of string comparisons for user actions.

* feat(openai): enhance error handling and metadata extraction for OpenAI SDK exceptions

- Added `extract_openai_metadata` function to distill OpenAI SDK exceptions into `ProviderErrorMetadata`, accommodating both `APIStatusError` subclasses and connection-related errors.
- Implemented `_raise_categorized_openai_sdk_error` method in `OpenAICompletionsLLMWorker` to categorize and raise appropriate `LLMCompletionError` based on the type of OpenAI SDK exception encountered.
- Updated error handling in `_gen_object` method to utilize the new categorization method, improving clarity and maintainability.
- Introduced comprehensive unit tests for `extract_openai_metadata` to ensure correct extraction of metadata from various OpenAI SDK exceptions.
- Added tests for structured-generation error handling in `OpenAICompletionsLLMWorker`, verifying that wrapped exceptions are correctly unwrapped and categorized.

* feat(openai): Phase 6 — Responses LLM unwrap + metadata + semantic user_action

Brings the OpenAI Responses worker up to the beyond-reference standard set
by Anthropic (reference) and Phase 5 (Completions). The Responses-specific
specialization is preserved: NotFoundError raises LLMModelNotFoundError
(carrying model_handle so callers can swap models), while every other
recognized SDK exception raises LLMCompletionError.

- Added _raise_categorized_openai_sdk_error helper on the Responses worker
  (mirrors the Completions helper but raises LLMModelNotFoundError for
  NotFoundError). Both _gen_text and _gen_object now dispatch through it.
- _gen_object's InstructorRetryException catch unwraps the underlying SDK
  exception via extract_underlying_sdk_exception and routes through the
  same helper, so transient/capacity/auth/not-found wrapped errors are no
  longer flattened to CONTENT.
- Every raise carries provider_metadata=extract_openai_metadata(sdk_exc)
  and a semantic UserActionKind (WAIT_AND_RETRY / CHECK_BILLING /
  CHECK_CREDENTIALS / CHANGE_INPUT / CHANGE_MODEL).
- SDK coverage now uniform with Phase 5: added InternalServerError and
  PermissionDeniedError to the tuple-catch.
- Migrated `from instructor.exceptions` to `from instructor.core`.
- Extended ModelNotFoundError.__init__ to accept and forward
  error_category, user_action, and provider_metadata kwargs to
  CogtError.__init__ so LLMModelNotFoundError can carry them end-to-end.
  ModelWaterfallError continues to work via class-level error_category.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* test(cogt): lock in ModelWaterfallError + LLMModelNotFoundError contract

Adds two unit tests for the ModelNotFoundError.__init__ widening done in
Phase 6:

1. ModelWaterfallError: when constructed without the new optional kwargs,
   the class-level error_category = CONFIGURATION must survive — i.e., the
   None defaults forwarded up to CogtError.__init__ must not clobber the
   class attribute (guarded by `if error_category is not None` in
   CogtError.__init__).
2. LLMModelNotFoundError: when worker-side categorization passes
   user_action and provider_metadata kwargs, they reach the instance.
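
The contract in item 1 reduces to the pattern below. The class names are hypothetical stand-ins for `CogtError` and `ModelWaterfallError`, shown only to illustrate the `is not None` guard:

```python
from enum import Enum
from typing import Optional


class ErrorCategory(str, Enum):
    UNKNOWN = "unknown"
    CONFIGURATION = "configuration"
    TRANSIENT = "transient"


class BaseError(Exception):
    """Stand-in for the base error: category defaults at class level."""
    error_category: ErrorCategory = ErrorCategory.UNKNOWN

    def __init__(self, message: str, error_category: Optional[ErrorCategory] = None) -> None:
        super().__init__(message)
        # Only shadow the class attribute when a category was actually passed.
        if error_category is not None:
            self.error_category = error_category


class WaterfallError(BaseError):
    """Stand-in for a subclass that pins its category at class level."""
    error_category = ErrorCategory.CONFIGURATION


print(WaterfallError("no model left").error_category.value)  # configuration
print(BaseError("boom", ErrorCategory.TRANSIENT).error_category.value)  # transient
```

Without the guard, the forwarded `None` default would overwrite the class-level value on every instance.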

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(mistral): enhance error handling with structured metadata extraction

- Implemented `extract_mistral_metadata` to distill Mistral SDK exceptions into `ProviderErrorMetadata`, accommodating both flat and nested error body structures.
- Refactored `_classify_mistral_error` to utilize the new metadata extraction, ensuring uniform handling of error categories and user actions across different error types.
- Added comprehensive unit tests for `extract_mistral_metadata` to validate behavior against various Mistral error shapes, including handling of non-JSON bodies and `NoResponseError`.
- Developed tests for `MistralLLMWorker` to ensure correct categorization of wrapped `MistralError` instances, verifying that transient, capacity, and configuration errors are handled appropriately with attached metadata.

* Implement Google LLM error handling and metadata extraction

- Added `extract_google_metadata` function to distill Google GenAI SDK exceptions into `ProviderErrorMetadata`, accommodating Google's unique error structure.
- Refactored `_classify_google_client_error` to include structured `provider_metadata` and semantic `UserActionKind` values for better error categorization.
- Introduced `_raise_categorized_google_sdk_error` helper to streamline error handling for `ServerError` and `ClientError`.
- Updated `GoogleLLMWorker` to utilize the new error handling methods, ensuring proper categorization of errors wrapped in `InstructorRetryException`.
- Created unit tests for `extract_google_metadata` covering various error scenarios and responses.
- Developed comprehensive tests for `GoogleLLMWorker` to validate structured generation error handling, ensuring correct behavior for different error types.

* feat: enhance error handling across LLM workers with structured metadata and user actions

* Add unit tests for error handling in image generation workers

- Implement tests for Google ImgGen worker to validate provider_metadata and UserActionKind for various error scenarios.
- Create tests for Hugging Face image generation worker to ensure proper extraction of metadata and handling of errors.
- Add tests for OpenAI Completions ImgGen worker to verify SDK exception handling and error categorization.
- Introduce tests for OpenAI ImgGen worker to check error handling, including rate limits, quota issues, and authentication errors.

* feat: Phase 11 extract worker audits + Bedrock LLM upgrade

Brings Bedrock LLM and every extract worker (Mistral, Docling, Linkup,
Gateway, pypdfium2) up to the beyond-reference standard: each raise carries
ProviderErrorMetadata and a semantic UserActionKind so downstream consumers
(retry, CLI, telemetry) get a uniform shape across providers.

New helpers in error_classification.py:
- extract_bedrock_metadata for botocore ClientError shape (Error.Code,
  ResponseMetadata.HTTPStatusCode/RequestId, retry-after header)
- extract_linkup_metadata (Linkup SDK exposes only exception class — no
  HTTP response metadata)
- extract_local_extract_metadata for non-HTTP local extractors (Docling,
  pypdfium2) — surfaces only sdk_exception_type / provider_error_code
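
As an illustration of the botocore `ClientError` shape mentioned in the first helper, here is a sketch that reads the same fields from a plain response dict. `MetadataSketch` and `extract_metadata_from_response` are hypothetical names, not the real `ProviderErrorMetadata` model or helper, and the lowercase `retry-after` key is an assumption about the header dict:

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional


@dataclass
class MetadataSketch:
    """Hypothetical subset of the provider metadata fields."""
    status_code: Optional[int]
    request_id: Optional[str]
    retry_after: Optional[float]
    provider_error_code: Optional[str]


def extract_metadata_from_response(response: Mapping[str, Any]) -> MetadataSketch:
    """Read Error.Code and ResponseMetadata fields from a ClientError-style dict."""
    error = response.get("Error") or {}
    meta = response.get("ResponseMetadata") or {}
    headers = meta.get("HTTPHeaders") or {}
    retry_after: Optional[float] = None
    raw_retry_after = headers.get("retry-after")  # assumed lowercase key
    if raw_retry_after is not None:
        try:
            retry_after = float(raw_retry_after)
        except (TypeError, ValueError):
            retry_after = None  # HTTP-date values are not handled in this sketch
    return MetadataSketch(
        status_code=meta.get("HTTPStatusCode"),
        request_id=meta.get("RequestId"),
        retry_after=retry_after,
        provider_error_code=error.get("Code"),
    )


sample_response = {
    "Error": {"Code": "ThrottlingException", "Message": "Too many requests"},
    "ResponseMetadata": {
        "HTTPStatusCode": 429,
        "RequestId": "req-abc",
        "HTTPHeaders": {"retry-after": "2"},
    },
}
print(extract_metadata_from_response(sample_response))
```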

Per-worker changes:
- Bedrock LLM: collapsed inline branches into _classify_bedrock_client_error
  helper; added ResourceNotFoundException 404 branch
- Mistral extract: refactored _classify_mistral_error to mirror the LLM
  worker shape; every branch now carries a semantic user_action
- Docling / pypdfium2: each exception branch carries provider_metadata +
  semantic UserActionKind via the shared local-extract helper
- Linkup extract: introduced _classify_linkup_error method (mirrors search
  worker) with a single tuple-catch in _extract_pages
- Gateway extract: adopted extract_gateway_metadata +
  GatewayFactory.make_user_action_from_portkey_error on both _extract_web_fetch
  and _extract_base64_url paths

New tests: per-worker semantic test files asserting category +
user_action.kind + provider_metadata across all SDK exception types;
test_extract_bedrock_metadata.py covers the metadata helper directly.

make agent-check clean (0 errors); 1122 plugins/cogt unit tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: address Phase 11 review nits

Two minor follow-ups from independent code review of de61d4b9:

1. extract_bedrock_metadata: one-line comment noting that botocore
   lowercases HTTPHeaders keys, so retry-after is the canonical lookup.

2. Capture the FileNotFoundError category-vs-user-action mismatch
   (CONFIGURATION + CHANGE_INPUT) in Docling and pypdfium2 workers as a
   deferred item. The existing pre-Phase-11 tests lock in CONFIGURATION,
   so flipping the category was out of scope; documented options and
   trade-offs for a future revisit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(openai): implement shared error classification for OpenAI SDK exceptions

- Added `openai_error_classification.py` to centralize error handling for OpenAI SDK exceptions across different workers.
- Updated `OpenAIImgGenWorker` and `OpenAIResponsesLLMWorker` to utilize the new error classification method.
- Refactored error handling logic to categorize unhandled 4xx APIStatusErrors as CONFIGURATION instead of TRANSIENT.
- Enhanced tests to cover new error handling scenarios, including unhandled 4xx errors and retry-after logic for RateLimitError.
- Updated dependencies in `pyproject.toml` for compatibility with the latest instructor version.

* uv lock

* feat: Phase 12 search worker audits

Bring both search workers up to the beyond-reference error-handling
standard: every raised error carries structured provider_metadata and a
semantic UserActionKind.

- Linkup search: _classify_linkup_error now attaches extract_linkup_metadata
  and a semantic UserActionKind on every branch (timeout, invalid-request,
  and fallback previously had no user_action).
- Gateway search: _call_relay passes user_action + provider_metadata to
  GatewaySearchResponseError, reusing the Portkey helpers from Phase 10.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* deferred

* feat: enhance error handling across image generation and extraction workers with structured user actions and metadata

* feat: update error handling in image generation workers to classify errors as UNKNOWN and suggest changing models

* landing the plan

* feat: improve error handling across various workers to categorize APIStatusErrors and enhance user actions

* refactor: enable ruff BLE001 and sweep broad except Exception catches

Remove BLE001 from the ruff ignore list so broad `except Exception`
catches are now permanently lint-guarded. Exempt tests/ via per-file
ignores (Phase 1 is scoped to non-test code).

Narrow three genuinely-narrowable catches to specific exceptions:
- pipe_func.py: get_stuff catch -> WorkingMemoryStuffNotFoundError
- func_registry.py: get_type_hints catch -> (NameError, TypeError)
- model_deck.py: backend-TOML load catch -> (TomlError, OSError)

Annotate the remaining legitimate broad catches (CLI/dev/agent roots,
Temporal and async-task roots, telemetry exporters, best-effort
teardown cleanup, defensive utility fallbacks) with # noqa: BLE001.
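
A minimal sketch of the two patterns this commit distinguishes: a catch narrowed to specific exceptions, and a deliberate broad catch annotated for the linter. The function names `safe_type_hints` and `best_effort_close` are hypothetical and are not taken from the codebase; only `typing.get_type_hints` and the `# noqa: BLE001` directive are real.

```python
import typing
from typing import Any, Callable, Dict


def safe_type_hints(func: Callable[..., Any]) -> Dict[str, Any]:
    """Narrowed catch: list only the failures expected from get_type_hints."""
    try:
        return typing.get_type_hints(func)
    except (NameError, TypeError):
        # NameError: an annotation refers to a name that cannot be resolved.
        # TypeError: e.g. the object is not one that carries annotations.
        return {}


def best_effort_close(resource: Any) -> bool:
    """Deliberate broad catch: teardown must not mask the original failure."""
    try:
        resource.close()
    except Exception:  # noqa: BLE001
        return False
    return True
```

With BLE001 enabled, the first form passes the linter as written, while the second passes only because of the explicit annotation, which makes each remaining broad catch a visible decision.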

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* plans

* feat: enhance error handling across various modules by narrowing exception catches

- Updated exception handling in `customize_backends_config` to catch specific exceptions (EOFError, OSError, TOMLKitError) instead of a broad Exception.
- Modified `get_currently_enabled_backends` to handle TomlError and OSError, failing silently on file read issues.
- Refined error handling in `do_show_backends` to catch MarkupError specifically, improving clarity in error reporting.
- Adjusted `WorkingMemoryFactory` to log warnings for mock creation failures while maintaining a best-effort approach.
- Enhanced `output_renderer` to provide context-specific comments for exception handling, focusing on dynamic concept rendering.
- Narrowed exception handling in `LibraryManager` to specific exceptions (NameError, PydanticUserError) during model rebuilding.
- Improved error handling in `DeliveryExecutor` methods to ensure failures are logged without disrupting the delivery process.
- Updated `dry_run` to catch ValidationError specifically, ensuring fallback to TextContent is clear.
- Enhanced teardown error handling in `GatewayExtractWorker` and `GoogleImgGenWorker` to ensure cleanup failures are logged but do not halt execution.
- Refined exception handling in `MistralFactory` to catch ValueError specifically for base64 cleaning.
- Updated `act_assemble_graph` to clarify the best-effort observability approach for graph assembly failures.
- Narrowed exception handling in `json_utils` to specific exceptions (TypeError, UnijsonEncoderError) during JSON purification.
- Adjusted `can_inject_text` to clarify the safety of f-string formatting on uncertain input types.
- Improved error handling in `are_classes_equivalent` to catch Pydantic-specific exceptions during schema comparison.

* Enhance error handling across CLI commands

- Added structured error handling in various CLI commands to ensure unexpected failures are captured and reported consistently.
- Introduced comments to clarify the purpose of error handling at command boundaries, emphasizing that failures are converted into structured error payloads.
- Updated error handling in commands related to agent operations, model checks, input processing, and validation to improve robustness and user feedback.
- Ensured that telemetry and logging mechanisms do not disrupt application flow during error scenarios.

* feat: add error_domain to the error model and class-level exception metadata

Phases 2-4 of the error-handling plan (Checkpoint B):

- Phase 2: add the ErrorDomain StrEnum and an error_domain field on
  PipelexError / ErrorReport, forwarded through to_error_report().
- Phase 3: set class-level error_domain (and user_action where the track
  doc gives concrete text) on the key non-CogtError exceptions; set
  error_category on the uncategorized prompt-* CogtError families.
- Phase 4: agent_error() reads error_domain report-first with the lookup
  dict as fallback; remove the now-redundant agent-CLI dict entries; add
  a drift test guarding the dicts against stale keys and double sources.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* Improve Phase 5 plan

* feat: update cold-start status section and add detailed phase execution instructions

* feat: PipeRouter transient-retry loop (resilience without Temporal)

PipeRouter now retries InferenceErrorCategory.TRANSIENT failures with
exponential backoff, driven by four new pipeline_execution_config
settings (max_transient_retries defaults to 3; 0 disables retry).
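
A minimal sketch of a transient-retry loop with exponential backoff, under stated assumptions. `CategorySketch`, `ClassifiedError`, and `RetrySettingsSketch` are hypothetical stand-ins for InferenceErrorCategory, CogtError, and TransientRetrySettings. Only `max_transient_retries` (default 3, 0 disables retry) comes from the commit; the other three setting names and their defaults are invented for illustration.

```python
import asyncio
from dataclasses import dataclass
from enum import Enum
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


class CategorySketch(Enum):  # hypothetical stand-in for InferenceErrorCategory
    TRANSIENT = "transient"
    CONFIGURATION = "configuration"


class ClassifiedError(Exception):  # hypothetical stand-in for CogtError
    def __init__(self, message: str, category: CategorySketch) -> None:
        super().__init__(message)
        self.category = category


@dataclass(frozen=True)
class RetrySettingsSketch:
    max_transient_retries: int = 3    # from the commit; 0 disables retry
    initial_delay: float = 0.5        # assumed name and value
    backoff_multiplier: float = 2.0   # assumed name and value
    max_delay: float = 30.0           # assumed name and value


async def run_with_transient_retry(
    call: Callable[[], Awaitable[T]],
    settings: RetrySettingsSketch,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> T:
    delay = settings.initial_delay
    retries_done = 0
    while True:
        try:
            return await call()
        except ClassifiedError as exc:
            out_of_budget = retries_done >= settings.max_transient_retries
            if exc.category is not CategorySketch.TRANSIENT or out_of_budget:
                raise  # re-raised as-is, so the cause chain is preserved
            retries_done += 1
            await sleep(min(delay, settings.max_delay))
            delay *= settings.backoff_multiplier
```

Passing `sleep` as a parameter is a choice made for this sketch only: it lets a test record the backoff delays without actually waiting.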

Retry moves out of the two gateway workers into the dispatch layer:
the tenacity AsyncRetrying wrappers, _make_retryer/_is_retryable/_log_retry
helpers, and the TenacityConfig model + [cogt.tenacity_config] block are
removed. The tenacity dependency stays (FAL polling, remote-config fetch).

PipeRouterProtocol carries the retry policy as a TransientRetrySettings
instance attribute (new dependency-free pipe_run/transient_retry.py),
populated from config by each concrete router at construction — reading
config inside the protocol directly would form a config->hub->protocol
import cycle.

A CogtError out of pipe execution is now reported to the observer on the
failing path and re-raised as-is (cause chain preserved, not wrapped).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: preserve transient retry of gateway deployment-propagation 404s

Code review of the retry-loop commit found a regression: the deleted
gateway-worker retry predicate had treated a NotFoundError reading
"specified deployment could not be found" as a transient Portkey
deployment-propagation race. classify_error_category() mapped every
NotFoundError to CONFIGURATION (non-retryable), so that case would no
longer retry.

Add GatewayFactory._is_deployment_propagation_race() and special-case
that 404 to TRANSIENT (and WAIT_AND_RETRY for the user action). Add a
CLASSIFY_CASES test case for it.

Also restore the field bounds the removed TenacityConfig provided:
PipelineExecutionConfig retry fields now carry Field(ge=0) so a
malformed pipelex.toml fails at config load.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: bounded fan-out concurrency for PipeBatch (Phase 5.5)

PipeBatch over many items no longer spawns every branch — every
coroutine, every deep-copied working memory, every inference call — at
once. Branches now run in bounded chunks via the new gather_bounded
helper, driven by the max_concurrency config (default 8).

This is the second resilience-without-Temporal pillar beside transient
retry (Phase 5): it keeps a large workload from overwhelming asyncio,
memory, and provider rate limits.

- gather_bounded (pipelex/tools/misc/async_utils.py): generic chunked
  fan-out over factories, not coroutines, so each deep copy is
  materialized only when its chunk runs — bounding memory, not just
  execution. Results preserve input order; first error by input index
  wins; the failing chunk is drained and no later chunk starts.
  max_concurrency is int | None, None meaning unbounded.
- max_concurrency on PipelineExecutionConfig, beside the retry fields,
  typed Annotated[int, Field(ge=1)] | Literal["unbounded"] — the bound
  is disabled by the explicit "unbounded" literal, not a magic 0.
- PipeBatch builds one factory per branch and maps the "unbounded"
  config literal to the helper's None; large batches log an advisory
  pointing at the Temporal track as the durable, rate-limited path.
- PipeParallel is left unbounded — it fans over a fixed branch set.
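
The chunked fan-out described above can be sketched as follows. This is an illustration of the technique, not the actual `gather_bounded` in pipelex/tools/misc/async_utils.py; the name `gather_bounded_sketch` and its exact signature are assumptions.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional, Sequence, TypeVar

T = TypeVar("T")


async def gather_bounded_sketch(
    factories: Sequence[Callable[[], Awaitable[T]]],
    max_concurrency: Optional[int],
) -> List[T]:
    async def _invoke(factory: Callable[[], Awaitable[T]]) -> T:
        # The factory is called only when its chunk runs, so whatever it
        # allocates (for example a deep copy) is not created up front.
        return await factory()

    # None means unbounded: a single chunk holding every factory.
    chunk_size = max(1, len(factories) if max_concurrency is None else max_concurrency)
    results: List[T] = []
    for start in range(0, len(factories), chunk_size):
        chunk = factories[start : start + chunk_size]
        # return_exceptions=True lets the whole chunk finish (drain) before
        # any error is surfaced.
        outcomes = await asyncio.gather(*(_invoke(f) for f in chunk), return_exceptions=True)
        for outcome in outcomes:
            if isinstance(outcome, BaseException):
                raise outcome  # first error by input index; later chunks never start
            results.append(outcome)
    return results
```

Taking factories rather than coroutines is what bounds memory as well as execution: an already-created coroutine would hold its arguments alive even while waiting for its chunk.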

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: category-aware Temporal retry + ErrorReport details payload (Phase 6)

The Temporal error bridge now derives its retry decision from
InferenceErrorCategory.is_retryable — the same signal the in-process
PipeRouter retry loop consults — instead of a static class-name list.

- TemporalError.__init__ gains non_retryable + error_report passthrough;
  error_report is packed as the ApplicationError.details payload.
- from_message_exception: category-aware retryability for CogtError
  carrying a category; class-name-list fallback for category-less
  exceptions (via the new _is_non_retryable helper).
- from_app_error: recovers the ErrorReport dict from details and
  preserves the round-tripped non_retryable flag, so error_category /
  user_action / model / provider survive the activity -> workflow boundary.
- Logging extracted into _log_critical / _log_error classmethods for
  unit-testability (workflow_log needs a live workflow event loop).
- config_temporal docstrings: non_retryable_error_types is documented as
  a fallback for category-less exceptions and an override mechanism.
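
A simplified sketch of the retry decision described above, written without the Temporal SDK so it stays self-contained. `CategorySketch` and `is_non_retryable_sketch` are hypothetical names, and the `error_category` attribute lookup is an assumption about how the category is carried; the real bridge packs its result into an ApplicationError.

```python
from enum import Enum
from typing import Collection


class CategorySketch(Enum):  # hypothetical stand-in for InferenceErrorCategory
    TRANSIENT = "transient"
    CONTENT = "content"
    CONFIGURATION = "configuration"

    @property
    def is_retryable(self) -> bool:
        return self is CategorySketch.TRANSIENT


def is_non_retryable_sketch(exc: BaseException, non_retryable_error_types: Collection[str]) -> bool:
    category = getattr(exc, "error_category", None)
    if isinstance(category, CategorySketch):
        # Category-aware path: the same signal the in-process retry loop uses.
        return not category.is_retryable
    # Fallback for category-less exceptions: the static class-name list.
    return type(exc).__name__ in non_retryable_error_types
```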

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: agent-CLI markdown delivery + error_domain HTTP-status mapping (Phase 7)

Phase 7 of the error-handling plan — errors and results surface cleanly on
every delivery surface.

- ErrorReport gains a documented, authoritative error_domain -> HTTP-status
  mapping: error_domain_to_http_status() (pure domain table) and the
  ErrorReport.http_status property (provider-429 passthrough on top). The
  library stays HTTP-agnostic; downstream APIs call the helper.
- The agent CLI emits markdown by default for run / validate / init, with
  --format json for the structured payload. agent_error() dispatches JSON or
  markdown via a per-invocation ContextVar, so every existing call site
  follows --format. agent_error_markdown() renders heading / hint callout /
  details / source block.
- validate bundle's graph-format option renamed --format -> --graph-format so
  --format is uniformly the output-format flag (breaking CLI change).
- error_handlers.py: extracted display_error_panel() for the field-based Rich
  handlers.

make agent-check clean; make agent-test passed (full suite).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: full-chain error-delivery coverage + cause-chain ErrorReport enrichment (Phase 8)

Phase 8 of the error-handling plan — full-chain integration coverage.

The full-chain test exposed the wiring gap the plan anticipated: a worker
LLMCompletionError (CogtError, carries error_category/retryable/model/
provider) is wrapped by PipeLLM -> PipeRouter -> PipelexRunner into plain
PipelexError subclasses, and PipelexError.to_error_report() did not consult
the __cause__ chain — so the agent CLI lost the worker's classification once
the error was wrapped.

GREEN fix, at the source: enrichment is a protected method
PipelexError._enrich_error_report_from_cause(report) that fills every None
classification field from the underlying exception. The base to_error_report()
calls it, and CogtError's @override calls it too, so the "enrich from the
__cause__ chain" contract is uniform across the hierarchy. The recursion
propagates the deepest CogtError's metadata up through every wrapper; a wrapper
keeps its own error_type/message and non-None fields.
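
The enrichment contract can be sketched as below. `ReportSketch` and `PipelexErrorSketch` are hypothetical stand-ins for ErrorReport and PipelexError, and the set of classification fields is reduced to three for brevity; this illustrates the recursion and is not the library's code.

```python
from dataclasses import dataclass
from typing import Optional

CLASSIFICATION_FIELDS = ("error_category", "model", "provider")


@dataclass
class ReportSketch:  # hypothetical stand-in for ErrorReport
    error_type: str
    message: str
    error_category: Optional[str] = None
    model: Optional[str] = None
    provider: Optional[str] = None


class PipelexErrorSketch(Exception):  # hypothetical stand-in for PipelexError
    def __init__(
        self,
        message: str,
        *,
        error_category: Optional[str] = None,
        model: Optional[str] = None,
        provider: Optional[str] = None,
    ) -> None:
        super().__init__(message)
        self.error_category = error_category
        self.model = model
        self.provider = provider

    def to_error_report(self) -> ReportSketch:
        report = ReportSketch(
            error_type=type(self).__name__,
            message=str(self),
            error_category=self.error_category,
            model=self.model,
            provider=self.provider,
        )
        self._enrich_error_report_from_cause(report)
        return report

    def _enrich_error_report_from_cause(self, report: ReportSketch) -> None:
        cause = self.__cause__
        if not isinstance(cause, PipelexErrorSketch):
            return
        # Recursion: the cause's report is itself already enriched from its
        # own cause, so the deepest metadata propagates upward.
        inner = cause.to_error_report()
        for name in CLASSIFICATION_FIELDS:
            if getattr(report, name) is None:
                setattr(report, name, getattr(inner, name))
```

A wrapper raised with `raise Wrapper(...) from worker_error` then reports its own error_type and message while inheriting any classification field it left as None.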

Tests:
- tests/integration/pipelex/cli/agent_cli/test_run_error_chain.py — runs the
  real run_pipe_cmd with the LLM worker mocked to fail; asserts JSON and
  markdown error output carry error_category/retryable/model/provider and the
  ordered error_source chain.
- tests/unit/pipelex/cli/test_error_handlers_snapshot.py — exact-match
  snapshots of two display_error_panel handlers, guarding the Phase 7 refactor.

Flagged as a follow-up (out of Phase 8 scope, by decision): PipeLLM wraps
LLMCompletionError into a plain PipeRunError before the router sees it, so the
Phase 5 router retry loop (except CogtError) is bypassed for the LLM path.

make agent-check clean; make agent-test passed; Temporal integration suite
passed (94 passed, 4 xpassed, 0 failures).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: ErrorReport.http_status tolerates an unrecognized error_domain

`http_status` converted `error_domain` with `ErrorDomain(self.error_domain)`,
which raises `ValueError` on any string the running version doesn't know.
A downstream HTTP API receiving an ErrorReport serialized by a newer
Pipelex would crash while rendering the error response instead of falling
back to the documented 500 for unclassified domains. Catch the ValueError
and treat the unknown domain as unclassified.
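
A minimal sketch of the tolerant conversion. The enum members and the status codes in the table are invented for illustration; only the 500 fallback for unclassified domains and the ValueError raised by the enum constructor come from the commit message.

```python
from enum import Enum
from typing import Dict, Optional

UNCLASSIFIED_STATUS = 500  # the documented fallback for unclassified domains


class ErrorDomainSketch(str, Enum):  # hypothetical members
    INPUT = "input"
    CONFIG = "config"


# Hypothetical table: the real mapping lives in error_domain_to_http_status().
_STATUS_BY_DOMAIN: Dict[ErrorDomainSketch, int] = {
    ErrorDomainSketch.INPUT: 422,
    ErrorDomainSketch.CONFIG: 500,
}


def http_status_sketch(error_domain: Optional[str]) -> int:
    if error_domain is None:
        return UNCLASSIFIED_STATUS
    try:
        domain = ErrorDomainSketch(error_domain)
    except ValueError:
        # A report serialized by a newer version may carry a domain string
        # this version does not know; treat it as unclassified.
        return UNCLASSIFIED_STATUS
    return _STATUS_BY_DOMAIN.get(domain, UNCLASSIFIED_STATUS)
```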

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: add minimal TODO for wiring from_message_exception into activities

Stub follow-up for Temporal integration Followup 5: Phase 6 built
TemporalError.from_message_exception() but no activity calls it, so the
category-aware retry decision and ErrorReport details packing are dead in
production. The file carries the gap statement and pointers only — the
RED/GREEN/REFACTOR plan is to be designed in a fresh session.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: correct two worker error-classification miscategorizations (#907)

* docs: cold-start TODOS for two error-classification fixes

Plan branch for two deferred classification bugs from the error-handling sweep: LinkupNoResultError is misclassified as TRANSIENT (should be CONTENT; it now wastes real router retry budget) and FileNotFoundError is misclassified as CONFIGURATION (should be CONTENT). TDD plan: RED to GREEN per bug, with a checkpoint between them.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: classify Linkup no-result as CONTENT, not TRANSIENT

LinkupNoResultError had no explicit branch in _classify_linkup_error
and fell through to the TRANSIENT fallback. A no-result search is not
transient — retrying the identical query yields no result again — so
the PipeRouter retry loop burned its budget and backoff sleeps on a
query that cannot succeed.

Add an explicit LinkupNoResultError branch to both Linkup workers
(search and extract), classifying it CONTENT + CHANGE_INPUT. Behavior
change: a no-result search is now non-retryable (CONTENT.is_retryable
is False).

Also marks item 1 of the search-worker-review-followups deferred doc
as landed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: classify missing input file as CONTENT, not CONFIGURATION

The `except FileNotFoundError` branch in the Docling and pypdfium2
extract workers classified a missing input file as CONFIGURATION +
CHANGE_INPUT. CONFIGURATION is reserved for setup problems elsewhere
(paired with CHECK_CREDENTIALS / CHANGE_MODEL / CONTACT_SUPPORT); the
CONFIGURATION + CHANGE_INPUT pairing was internally inconsistent. A
missing input file is a content problem — flip the branch to CONTENT,
matching its sibling branches.

No behavior change: CONFIGURATION and CONTENT are both non-retryable.

Also adds the combined CHANGELOG entry for both error-classification
fixes and deletes the resolved file-not-found-category-mismatch
deferred doc.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: retry transient LLM failures that bypassed the PipeRouter retry loop (#903)

* fix: retry transient LLM failures that bypassed the PipeRouter retry loop

PipeRouter's application-level transient-retry loop was dead for the most
common case — an LLM call. PipeLLM and PipeStructure catch the worker's
LLMCompletionError (a CogtError) and re-raise it as a plain PipeRunError;
the router's `except CogtError` retry branch never saw it, and the
`except PipeRunError` branch wrapped into PipeRouterError without
retrying. A transient LLM failure (rate limit, timeout, brief outage) was
never retried in-process despite max_transient_retries defaulting to 3.

The router now unifies its retry decision in a single
`except (CogtError, PipeRunError)` branch: the retry classification is
derived from the exception itself when it is a CogtError, or from its
__cause__ when it is a PipeRunError (the operator wrap). On exhaustion a
PipeRunError still wraps into PipeRouterError, preserving the pipe
location context and the full-chain error shape.
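
The unified retry decision can be sketched as a single lookup. `CogtErrorSketch` and `PipeRunErrorSketch` are hypothetical stand-ins, and the boolean `retryable` attribute is a simplification of the category-based classification.

```python
class CogtErrorSketch(Exception):  # hypothetical stand-in for CogtError
    def __init__(self, message: str, retryable: bool) -> None:
        super().__init__(message)
        self.retryable = retryable


class PipeRunErrorSketch(Exception):  # hypothetical stand-in for PipeRunError
    pass


def should_retry(exc: BaseException) -> bool:
    if isinstance(exc, CogtErrorSketch):
        return exc.retryable  # classification read from the exception itself
    if isinstance(exc, PipeRunErrorSketch) and isinstance(exc.__cause__, CogtErrorSketch):
        return exc.__cause__.retryable  # classification read from the operator wrap's cause
    return False
```

An operator that does `raise PipeRunErrorSketch("pipe failed") from worker_error` therefore stays retryable whenever the wrapped worker error is.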

Coverage: new integration test runs a real PipeLLM and PipeStructure
through the router with the worker mocked to raise a TRANSIENT failure;
a cause-chain case added to the router retry unit tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: address PR #903 review comments on retry loop

Two confirmed issues from PR #903 review agents, plus a deferred TODO:

- gather_bounded: a factory raising synchronously bypassed chunk
  draining and aborted the gather before it ran, orphaning the chunk's
  other coroutines. Each factory is now invoked inside an _invoke()
  coroutine, so a synchronous raise is captured per-task by
  gather(return_exceptions=True) and the chunk still drains.

- format_run_markdown: when main_stuff carries an empty markdown (the
  API runner cannot render it), the result was dropped entirely as an
  excluded envelope key, leaving the "## Result" section showing only
  metadata. It now falls back to main_stuff's structured json payload.

- Deferred: the retry loop reruns run_pipe() after on_pipe_end_error
  has recorded the failed attempt, so a successful retried run carries
  a stale error node. Tracked in todos-retry-graph-trace.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: inference-failure ErrorReports carry model/provider in production (#904)

* docs: cold-start TODOS for the ErrorReport model/provider fix

Inference-failure errors (LLMCompletionError, ImgGenGenerationError,
ExtractJobFailureError, SearchJobFailureError) reach to_error_report()
with model/provider = None in production — CogtError.to_error_report()
duck-types model_handle/backend_name via getattr and nothing sets them.
The Phase 8 full-chain test only passes because it setattrs them by hand.

This TODOS file is the cold-start plan to fix it: verified facts checked
against the code, the A-vs-B design decision (recommendation: centralize
enrichment at the worker base class), and the RED/GREEN/REFACTOR steps.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: inference-failure ErrorReports carry model/provider in production

A real LLM / image-gen / extract / search failure used to surface an
ErrorReport with model = None and provider = None: the leaf errors
(LLMCompletionError, ImgGenGenerationError, ExtractJobFailureError,
SearchJobFailureError, ExtractOutputError) carry no model or provider of
their own, and CogtError.to_error_report() only duck-typed whatever
attributes happened to be set on the exception.

Option B (centralized enrichment at the worker base class):

- Declare model_handle / backend_name on CogtError; to_error_report()
  reads them directly instead of getattr. New fill_model_and_provider()
  fills them only when unset, never overwrites an inner error's value,
  and skips the "unknown" placeholde…
* feat: add support for Gemini 3.5 flash model in Google backend configurations

* changelog: note gemini-3.5-flash addition on google backend

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: add gemini-3.5-flash model to Pipelex Gateway documentation

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…#923)

* feat: lean CLAUDE.md by default, standalone variant for contributors

- Drop tdd.md from all generated rule sets
- Add [agent_rules.targets.claude.sets] with `all` (lean) and `standalone` (full); CLAUDE.md no longer duplicates Python and pytest standards that workspace users already get from .claude/rules/
- Add --targets CLI filter to `pipelex-dev kit rules` to constrain which preferred targets get regenerated
- Add `make rules-claude-standalone` for contributors without the Pipelex workspace
- Document the lean/standalone split in CONTRIBUTING.md
- Cover lean/standalone behavior with a new integration test

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat: enhance cleanup behavior for cursor rules and agent rules synchronization

* feat: enhance remove_cursor_rules to support deprecated rule stems for cleanup

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 37e7c4fdbe

Comment thread pipelex/temporal/tprl_pipe/hydration.py
@lchoquel
Member Author

@greptileai


@cubic-dev-ai cubic-dev-ai Bot left a comment

3 issues found across 567 files

Note: This PR contains a large number of files. cubic only reviews up to 100 files per PR, so some files may not have been reviewed. cubic prioritizes the most important files to review.

Comment thread Makefile
Comment thread docs/features/distributed-execution.md Outdated
Comment thread docs/under-the-hood/reasoning-controls.md

@cubic-dev-ai cubic-dev-ai Bot left a comment

1 issue found across 3 files (changes from recent commits).

Comment thread docs/features/distributed-execution.md
@lchoquel
Member Author

@greptileai

@lchoquel
Member Author

@greptileai review this

Member

@thomashebrard thomashebrard left a comment

congrats

@lchoquel lchoquel merged commit 1a216ef into main May 20, 2026
30 checks passed
@github-actions github-actions Bot locked and limited conversation to collaborators May 20, 2026
@lchoquel lchoquel deleted the release/v0.29.0 branch May 20, 2026 15:59