feat: Offline mode for Pipelex Gateway setup and dry-run #900

Merged
lchoquel merged 20 commits into dev from fix/Offline-mode
May 17, 2026

@lchoquel lchoquel commented May 14, 2026

Summary

  • Adds a schema-versioned on-disk cache at ~/.pipelex/cache/remote_config.json that Pipelex Gateway falls back to when the remote config service is unreachable. Setup, validation, and pipelex-agent run bundle --dry-run complete normally offline; only the actual inference call still needs the network at runtime. The cache is primed on every successful fetch and at pipelex init while online. When the gateway is disabled (BYOK), no remote fetch is attempted at all.
  • Adds RemoteConfigStaleWarning (UserWarning) surfaced on the agent-CLI JSON envelope as warnings: [{"type": "RemoteConfigStale", ...}]; suppresses telemetry (no-op) on stale cache; refuses cache for doc/fixture generators via a new require_fresh=True flag.
  • Adds two user-facing exceptions: RemoteConfigUnavailableError (offline + cold cache, with two-path remediation) and GatewayUnknownModelError (source-aware messaging when a deck references a gateway handle absent from the fresh-or-cached specs). Both wired through the Rich CLI (error_handlers.py) and the agent CLI (AGENT_ERROR_HINTS / AGENT_ERROR_DOMAINS).
  • RemoteConfigFetcher.fetch_remote_config() now returns a RemoteConfigResult(config, source, cached_at); ModelManager.setup() and BackendLibrary._load_gateway_model_specs() accept a gateway_config_source: RemoteConfigSource | None parameter so the membership check can branch its error message on FRESH vs CACHED. GatewayConfig stays extra="forbid" and source-free — provenance is plumbed alongside.
  • Adds PIPELEX_REMOTE_CONFIG_URL env var to override the default URL (useful for staging/testing).
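
The fetch-then-fallback flow summarised above can be sketched roughly as follows. This is a minimal illustration, not the Pipelex implementation: the function `fetch_with_fallback`, the injected `fetch` callable, the cache file layout, and the choice of `cached_at=None` for fresh results are all assumptions; only the names `RemoteConfigResult`, `RemoteConfigSource`, `RemoteConfigStaleWarning`, `RemoteConfigUnavailableError`, and `require_fresh` come from the PR.

```python
import json
import time
import warnings
from dataclasses import dataclass
from enum import Enum
from pathlib import Path
from typing import Any, Callable, Dict, Optional


class RemoteConfigSource(str, Enum):
    FRESH = "fresh"
    CACHED = "cached"


class RemoteConfigStaleWarning(UserWarning):
    """Emitted when a cached config is used because the fetch failed."""


class RemoteConfigUnavailableError(Exception):
    """Raised when the fetch failed and no usable cache exists."""


@dataclass(frozen=True)
class RemoteConfigResult:
    config: Dict[str, Any]
    source: RemoteConfigSource
    cached_at: Optional[float]  # assumption: None when the config is fresh


def fetch_with_fallback(
    fetch: Callable[[], Dict[str, Any]],  # hypothetical network call
    cache_path: Path,
    require_fresh: bool = False,
) -> RemoteConfigResult:
    try:
        config = fetch()
    except OSError as exc:
        if require_fresh:
            # Doc/fixture generators must never consume stale data.
            raise RemoteConfigUnavailableError("fresh config required") from exc
        try:
            payload = json.loads(cache_path.read_text(encoding="utf-8"))
            cached_config, cached_at = payload["config"], payload["cached_at"]
        except (OSError, ValueError, KeyError, TypeError) as cache_exc:
            raise RemoteConfigUnavailableError("no usable local cache") from cache_exc
        warnings.warn("using cached remote config", RemoteConfigStaleWarning, stacklevel=2)
        return RemoteConfigResult(cached_config, RemoteConfigSource.CACHED, cached_at)
    try:
        # Prime the cache on every successful fetch; a failed write is not fatal.
        cache_path.parent.mkdir(parents=True, exist_ok=True)
        cache_path.write_text(
            json.dumps({"config": config, "cached_at": time.time()}), encoding="utf-8"
        )
    except OSError:
        pass
    return RemoteConfigResult(config, RemoteConfigSource.FRESH, None)
```

Returning the provenance next to the config, rather than inside it, mirrors the PR's decision to keep `GatewayConfig` source-free.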

See TODOS.md for the full phased implementation plan (Phases 0–7) and the per-checkpoint status blocks with rationale, decisions, and verification notes.

Related

  • Unblocks the codex-sandbox handoff scenario described in mthds-plugins/wip/codex-sandbox-escalation.md: --dry-run no longer requires escalation after Pipelex has been initialised online once.

Deferred follow-ups

Tracked at the bottom of TODOS.md:

  1. pipelex doctor cache reporting (presence, age, missing-cache hint).
  2. Codex Cloud cache-first short-circuit (today returns a dummy unconditionally).
  3. Cross-repo update to mthds-plugins/wip/codex-sandbox-escalation.md.
  4. Cache TTL revisit (currently uncapped).
  5. Schema-version migration runbook.
  6. pipelex-agent run bundle --output-dir <path> flag for read-only mounted bundles.
  7. Pin GatewayUnknownModelError end-to-end via a deck override that points a preset at a missing handle.

Test plan

  • make agent-check — clean (ruff fix-imports, ruff format, plxt fmt, ruff lint, plxt lint, pyright, mypy).
  • make agent-test — full suite green.
  • Manual reproduction of the behaviour matrix:
    • BYOK offline → success (E2E test_byok_offline_succeeds).
    • Gateway online, no cache → success, cache written (integration + cache-priming tests).
    • Gateway offline, cache present → success with stale warning (E2E test_gateway_known_with_cache_succeeds_offline).
    • Gateway offline, no cache → clear RemoteConfigUnavailableError (E2E test_gateway_no_cache_no_network_fails_with_unavailable).
    • Bundle references unknown gateway model → clear GatewayUnknownModelError (integration test_gateway_unknown_model.py + E2E).
    • pipelex-dev update-gateway-models offline → clear refusal, no stale docs written (manual via PIPELEX_REMOTE_CONFIG_URL=http://127.0.0.1:1/..., pinned in test_require_fresh_refuses_cache).

Documentation

Docs updated in this branch:

  • docs/tools/cli/agent-cli.md — added the warnings array to the agent CLI JSON success contract. The branch introduces this top-level envelope field (warnings: [{"type": "RemoteConfigStale", ...}]) but the machine-facing output contract didn't document it. Added a JSON example and noted that RemoteConfigStale is emitted on offline cache fallback.
  • docs/tools/cli/init.md — added an "Offline cache priming" note: pipelex init now primes ~/.pipelex/cache/remote_config.json when the gateway is enabled, and warns (without failing) when run offline.
  • docs/features/gateway.md — added an "Offline Behavior" section explaining BYOK vs Gateway offline modes, the cache fallback, the stale warning, and the RemoteConfigUnavailableError cold-cache case.
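
As a rough illustration of how a stale-cache warning could end up in the JSON envelope, the sketch below records Python warnings while a command runs and copies them into a top-level `warnings` array. Only the `warnings` key, the `"type": "RemoteConfigStale"` entry, and the `RemoteConfigStaleWarning` class name come from the PR; `run_with_envelope` and the `success`, `result`, and `message` fields are hypothetical.

```python
import json
import warnings
from typing import Any, Callable, Dict


class RemoteConfigStaleWarning(UserWarning):
    """Stand-in for the warning class named in the PR."""


def run_with_envelope(command: Callable[[], Dict[str, Any]]) -> Dict[str, Any]:
    # Record every warning raised while the command runs.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        result = command()
    envelope: Dict[str, Any] = {"success": True, "result": result}
    stale = [w for w in caught if issubclass(w.category, RemoteConfigStaleWarning)]
    if stale:
        # Only add the key when there is something to report.
        envelope["warnings"] = [
            {"type": "RemoteConfigStale", "message": str(w.message)} for w in stale
        ]
    return envelope


def stale_dry_run() -> Dict[str, Any]:
    warnings.warn("remote config served from cache", RemoteConfigStaleWarning)
    return {"dry_run": "ok"}


if __name__ == "__main__":
    print(json.dumps(run_with_envelope(stale_dry_run), indent=2))
```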

CHANGELOG [Unreleased] already documents the feature accurately and follows the repo's Keep-a-Changelog style — no voice changes needed.

Documentation Debt

Remaining gap — shipped surface with no coverage in the docs site:

  • ⚠️ PIPELEX_REMOTE_CONFIG_URL — the new env var is not documented under docs/configuration/. Reference gap; niche (staging/testing override), low priority.
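
Until the variable is documented, a sketch of the override behaviour may help. The default URL below is a placeholder, the helper `resolve_remote_config_url` is hypothetical, and treating an empty value as unset is an assumption; only the variable name `PIPELEX_REMOTE_CONFIG_URL` comes from the PR.

```python
import os
from typing import Mapping, Optional

# Placeholder only: the real default URL is not stated in this PR.
DEFAULT_REMOTE_CONFIG_URL = "https://example.invalid/remote_config.json"


def resolve_remote_config_url(environ: Optional[Mapping[str, str]] = None) -> str:
    """Return the override from the environment, else the default URL."""
    env = os.environ if environ is None else environ
    override = env.get("PIPELEX_REMOTE_CONFIG_URL", "").strip()
    # Assumption: an empty or whitespace-only value means "not set".
    return override or DEFAULT_REMOTE_CONFIG_URL
```

Pointing the variable at an unreachable address is how the test plan above simulates an offline machine.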

No architecture diagrams reference the changed modules — no diagram drift.

🤖 Generated with Claude Code


Summary by cubic

Adds offline setup, validation, and dry-run for the Pipelex Gateway by caching the remote config to disk and falling back when the service is unreachable; only live inference still needs the network. Also surfaces stale-cache warnings in the agent CLI JSON and prefers a local .pipelex/ project config over the global one.

  • New Features

    • Cache the gateway config to ~/.pipelex/cache/remote_config.json, primed on successful fetches and during pipelex init; RemoteConfigFetcher.fetch_remote_config(require_fresh=False) returns RemoteConfigResult { config, source (FRESH|CACHED), cached_at }, and doc/fixture tools use require_fresh=True to refuse cached data.
    • Emit RemoteConfigStaleWarning when falling back to cache and attach it to the agent CLI success envelope as warnings: [...]; disable telemetry on cached configs; BYOK skips remote fetch and runs fully offline; support PIPELEX_REMOTE_CONFIG_URL override.
    • Raise GatewayUnknownModelError when a deck references a missing gateway model, with source-aware hints based on fresh vs cached specs.
  • Bug Fixes

    • Recognize .pipelex/ as a project root marker so local config isn’t ignored; harden alias/waterfall resolution with cycle detection and full fallback expansion.
    • Consolidate remote-config failures under RemoteConfigUnavailableError; cached JSON validation issues no longer leak raw errors; preprocess-test-models now propagates require_fresh=True refusals instead of silently generating empty entries.
    • pipelex init cache priming now verifies the cache exists and re-validates the cached payload; if the cache write or validation fails, it reports clear remediation instead of misreporting success.

Written for commit 2170259. Summary will update on new commits.

lchoquel and others added 9 commits May 14, 2026 13:28
Lays out a TDD plan for offline-safe Pipelex setup: cache remote config
on first init, fall back to cache when network is unavailable, and fail
clearly when a referenced gateway model is missing from both fresh and
cached specs.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Apply 9 review-driven edits to TODOS.md: move source provenance off
GatewayConfig, cache raw JSON, keep RemoteConfigFetchError, require
fresh data for doc generators, add retry-exhaustion and regression
tests, replace test env-var backdoor with PIPELEX_REMOTE_CONFIG_URL.
…and provenance tracking

- Refactored `RemoteConfigFetcher.fetch_remote_config()` to return a `RemoteConfigResult` containing the fetched config, source of the config (FRESH or CACHED), and cache timestamp.
- Introduced `RemoteConfigUnavailableError` for scenarios where both network fetch and cache fallback fail, providing user-facing error messages with remediation steps.
- Added `RemoteConfigStaleWarning` to indicate when a cached config is used due to network issues.
- Updated all existing callers of `fetch_remote_config()` to accommodate the new return type and error handling.
- Enhanced tests to cover new behaviors, including success cases, network failures, and validation errors.
- Ensured that the internal retry logic raises `RemoteConfigFetchError` while the outer layer handles user-facing errors appropriately.
…y specs

- Added GatewayUnknownModelError to handle cases where a model referenced in the deck is not found in the active gateway specs.
- Enhanced model manager to enforce gateway model membership, raising the new error when discrepancies are detected.
- Updated remote config fetcher to include source provenance (FRESH vs CACHED) for better error messaging and telemetry control.
- Refactored related tests to ensure proper coverage for the new error handling and gateway configuration scenarios.
- Introduced RemoteConfigSource enum to streamline source tracking for remote configurations.
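
A minimal sketch of the source-aware membership check described in this commit, assuming the deck's gateway handles and the available specs are plain sets of strings. The function `check_gateway_membership` and the hint wording are illustrative; `GatewayUnknownModelError` and `RemoteConfigSource` are named in the PR, but their real definitions may differ.

```python
from enum import Enum
from typing import Iterable, Optional, Set


class RemoteConfigSource(str, Enum):
    FRESH = "fresh"
    CACHED = "cached"


class GatewayUnknownModelError(Exception):
    """Stand-in for the user-facing error named in the PR."""


def check_gateway_membership(
    referenced: Iterable[str],
    available: Set[str],
    source: Optional[RemoteConfigSource],
) -> None:
    if source is None:
        # Assumption: no source means no gateway specs were loaded, so skip.
        return
    missing = sorted(set(referenced) - available)
    if not missing:
        return
    if source is RemoteConfigSource.CACHED:
        hint = "Specs came from the offline cache and may be outdated; reconnect and retry."
    else:
        hint = "Specs were fetched fresh; check the handle for typos."
    raise GatewayUnknownModelError(f"Unknown gateway model(s): {', '.join(missing)}. {hint}")
```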

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment


💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: dbaa02743d


Comment thread pipelex/cogt/models/model_manager.py
Comment thread pipelex/system/pipelex_service/remote_config_fetcher.py Outdated
greptile-apps Bot commented May 14, 2026

Greptile Summary

This PR adds offline support for Pipelex Gateway setup, validation, and dry-run flows. The main changes are:

  • Schema-versioned remote-config cache under ~/.pipelex/cache/remote_config.json.
  • Fresh-or-cached provenance from RemoteConfigFetcher, with fresh-only mode for docs and fixture generation.
  • Stale-cache warnings surfaced in agent CLI JSON output.
  • Source-aware gateway model validation and user-facing offline error messages.
  • Init-time cache priming and .pipelex/ project-root detection.

Confidence Score: 5/5

This looks safe to merge.

  • No blocking issues found in the changed code.

Important Files Changed

| Filename | Overview |
| --- | --- |
| pipelex/system/pipelex_service/remote_config_fetcher.py | Adds remote-config provenance, cache fallback, fresh-only refusal, and opportunistic cache writes. |
| pipelex/system/pipelex_service/remote_config_cache.py | Adds schema-versioned on-disk cache loading and atomic cache writes. |
| pipelex/pipelex.py | Plumbs gateway config source through setup, emits stale-cache warnings, and disables telemetry for cached specs. |
| pipelex/cogt/models/model_manager.py | Validates gateway-referenced model handles and resolves aliases/waterfalls with cycle protection. |
| pipelex/cli/commands/init/command.py | Primes the remote-config cache during init and verifies that the persisted cache is usable. |
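
The schema-versioned loading and atomic writes attributed to remote_config_cache.py could look roughly like this. It is a sketch under assumptions: the wrapper field names `schema_version` and `cached_at`, the version number, and the helpers `write_cache` and `load_cache` are hypothetical; `raw_config` is the payload field mentioned in the commit messages.

```python
import json
import os
import tempfile
import time
from pathlib import Path
from typing import Any, Dict, Optional

CACHE_SCHEMA_VERSION = 1  # hypothetical version number


def write_cache(path: Path, raw_config: Dict[str, Any]) -> None:
    """Write the wrapper to a temp file, then atomically swap it into place."""
    path.parent.mkdir(parents=True, exist_ok=True)
    wrapper = {
        "schema_version": CACHE_SCHEMA_VERSION,
        "cached_at": time.time(),
        "raw_config": raw_config,
    }
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(wrapper, handle)
        # os.replace is atomic on the same filesystem: readers never see a partial file.
        os.replace(tmp_name, path)
    except BaseException:
        try:
            os.unlink(tmp_name)
        except FileNotFoundError:
            pass
        raise


def load_cache(path: Path) -> Optional[Dict[str, Any]]:
    """Return the wrapper, or None if missing, unreadable, or from another schema."""
    try:
        wrapper = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return None
    if not isinstance(wrapper, dict) or wrapper.get("schema_version") != CACHE_SCHEMA_VERSION:
        return None
    return wrapper
```

Treating a version mismatch as a cache miss keeps old files from being misread after a format change, at the cost of requiring one online run to re-prime.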

Reviews (8): Last reviewed commit: "fix: re-validate cached payload when pri..."

Comment thread pipelex/system/pipelex_service/remote_config_fetcher.py Outdated
Comment thread pipelex/cogt/models/model_manager.py Outdated
Comment thread pipelex/system/pipelex_service/remote_config_fetcher.py Outdated
Comment thread tests/e2e/agent_cli/test_offline_run_dry.py

@cubic-dev-ai cubic-dev-ai Bot left a comment


8 issues found across 36 files

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="pipelex/cogt/models/model_manager.py">

<violation number="1" location="pipelex/cogt/models/model_manager.py:143">
P2: Gateway membership collection misses extract/img_gen/search choice defaults, so invalid default handles for those model types are not validated.</violation>

<violation number="2" location="pipelex/cogt/models/model_manager.py:183">
P1: Bare HANDLE references are not resolved through alias/waterfall mappings, which can raise `GatewayUnknownModelError` for valid deck references.</violation>

<violation number="3" location="pipelex/cogt/models/model_manager.py:195">
P1: Add cycle detection in the WATERFALL resolution path; without a visited guard, a self-referential or cyclic waterfall can loop indefinitely and hang setup.</violation>

<violation number="4" location="pipelex/cogt/models/model_manager.py:200">
P1: Waterfall membership validation only inspects the first fallback, causing false unknown-model errors when later fallbacks are valid.</violation>
</file>

<file name="pipelex/cli/commands/init/command.py">

<violation number="1" location="pipelex/cli/commands/init/command.py:82">
P2: Cache priming checks gateway enablement from layered/project-preferred config instead of the init target directory, so global vs local init can prime (or skip) based on the wrong backends.toml.</violation>
</file>

<file name="pipelex/pipelex.py">

<violation number="1" location="pipelex/pipelex.py:219">
P2: Do not mark the dummy no-model-specs path as `FRESH`; setting a source here triggers gateway membership validation against empty placeholder specs and can break commands that intentionally skip spec loading.</violation>
</file>

<file name="pipelex/system/pipelex_service/remote_config_fetcher.py">

<violation number="1" location="pipelex/system/pipelex_service/remote_config_fetcher.py:237">
P1: Handle `OSError` around cache persistence so an unwritable `~/.pipelex/cache` does not fail an otherwise successful fresh remote-config fetch.</violation>
</file>

<file name="tests/e2e/agent_cli/test_offline_run_dry.py">

<violation number="1" location="tests/e2e/agent_cli/test_offline_run_dry.py:83">
P3: Parse and return the last JSON object in the CLI output, not the first decodable one, so preamble JSON fragments don't get mistaken for the final agent envelope.</violation>
</file>
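
For the P3 finding on test_offline_run_dry.py, one way to pick the last JSON object out of mixed CLI output is to scan with `json.JSONDecoder.raw_decode` and keep the final successful decode. The helper name `last_json_object` is hypothetical and this is not the code in the test file.

```python
import json
from typing import Any, Dict, Optional


def last_json_object(output: str) -> Optional[Dict[str, Any]]:
    """Return the last top-level JSON object found in the text, or None."""
    decoder = json.JSONDecoder()
    last: Optional[Dict[str, Any]] = None
    index = 0
    while index < len(output):
        start = output.find("{", index)
        if start == -1:
            break
        try:
            value, end = decoder.raw_decode(output, start)
        except json.JSONDecodeError:
            # Not valid JSON at this brace: move past it and keep scanning.
            index = start + 1
            continue
        if isinstance(value, dict):
            last = value
        # Skip the whole decoded object so nested objects are not revisited.
        index = end
    return last
```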


Comment thread pipelex/cogt/models/model_manager.py Outdated
Comment thread pipelex/cogt/models/model_manager.py Outdated
Comment thread pipelex/system/pipelex_service/remote_config_fetcher.py Outdated
Comment thread pipelex/cogt/models/model_manager.py Outdated
Comment thread pipelex/cogt/models/model_manager.py
Comment thread pipelex/cli/commands/init/command.py Outdated
Comment thread pipelex/pipelex.py Outdated
Comment thread tests/e2e/agent_cli/test_offline_run_dry.py Outdated

@cubic-dev-ai cubic-dev-ai Bot left a comment


1 issue found across 11 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="pipelex/cogt/models/model_manager.py">

<violation number="1" location="pipelex/cogt/models/model_manager.py:243">
P2: Cycle detection in `_collect_candidates` uses only the reference name, so alias/waterfall entries with the same identifier are falsely treated as cycles. This can produce an empty candidate list and incorrectly skip gateway membership validation.</violation>
</file>
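
The findings on alias/waterfall resolution (expand every fallback, guard against cycles, and key the guard on more than the name) can be illustrated with the sketch below. It assumes aliases map one name to one target and waterfalls map one name to an ordered list of fallbacks; the real deck structures and the actual `_collect_candidates` in model_manager.py may differ, so the data shapes and the function here are hypothetical.

```python
from typing import Dict, List, Set, Tuple


def collect_candidates(
    reference: str,
    aliases: Dict[str, str],
    waterfalls: Dict[str, List[str]],
) -> List[str]:
    """Expand a reference into every concrete handle it can resolve to."""
    candidates: List[str] = []
    # Key on (kind, name): an alias and a waterfall may share an identifier.
    visited: Set[Tuple[str, str]] = set()

    def visit(name: str) -> None:
        if name in waterfalls and ("waterfall", name) not in visited:
            visited.add(("waterfall", name))
            for fallback in waterfalls[name]:  # expand all fallbacks, not just the first
                visit(fallback)
            return
        if name in aliases and ("alias", name) not in visited:
            visited.add(("alias", name))
            visit(aliases[name])
            return
        if name not in aliases and name not in waterfalls and name not in candidates:
            candidates.append(name)  # a concrete handle

    visit(reference)
    return candidates
```

With a name-only guard, a waterfall `smart` whose first fallback is the alias `smart` would be dropped as a false cycle; keying on the kind as well lets the alias resolve while true cycles still terminate.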


Comment thread pipelex/cogt/models/model_manager.py
lchoquel and others added 6 commits May 15, 2026 10:20
Adds the `warnings` field to the agent CLI JSON success contract in
agent-cli.md (was missing the field this branch introduces), and notes
remote-config cache priming in init.md.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Documents how Pipelex stays usable when the Gateway remote config
service is unreachable: BYOK skips the fetch entirely, Gateway mode
falls back to the primed on-disk cache, and only live inference still
needs the network.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
A directory containing a .pipelex/ config dir is now recognized as a
project root. Previously such a directory fell through to the global
~/.pipelex/ config, silently ignoring the project's own overrides.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
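
Project-root detection of this kind is commonly a walk up the directory tree looking for marker entries. The sketch below assumes that approach; the helper `find_project_root` and every marker other than `.pipelex` are hypothetical, since the commit message only states that `.pipelex/` is now recognized.

```python
from pathlib import Path
from typing import Optional

# Only ".pipelex" is confirmed by the commit; the other markers are illustrative.
PROJECT_ROOT_MARKERS = (".git", "pyproject.toml", ".pipelex")


def find_project_root(start: Path) -> Optional[Path]:
    """Return the nearest ancestor (or start itself) that holds a marker."""
    current = start.resolve()
    for candidate in (current, *current.parents):
        if any((candidate / marker).exists() for marker in PROJECT_ROOT_MARKERS):
            return candidate
    # No marker found: callers would fall back to the global ~/.pipelex/ config.
    return None
```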
Comment thread pipelex/system/pipelex_service/remote_config_fetcher.py
Comment thread pipelex/cli/dev_cli/commands/preprocess_test_models_cmd.py Outdated
lchoquel and others added 2 commits May 17, 2026 22:28
Address two P1 review findings on the offline-mode work:

- remote_config_fetcher: a cache with a valid wrapper but a malformed
  raw_config let a raw Pydantic ValidationError escape the offline
  fallback. Catch it and raise RemoteConfigUnavailableError with the
  normal remediation. Reword the message to "no usable local cache"
  so it is accurate for both missing and unusable caches.
- preprocess_test_models_cmd: _fetch_gateway_models swallowed
  require_fresh refusals into empty model lists, letting offline
  fixture generation proceed without any pipelex_gateway entries.
  Let the error propagate and surface a clear offline-mode panel.

Adds regression tests for both paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Comment thread pipelex/cli/commands/init/command.py
A successful remote-config fetch does not guarantee the on-disk cache was
written: RemoteConfigFetcher treats the cache write as opportunistic and
swallows OSErrors (read-only / full cache dir) with only a stderr warning.
attempt_prime_remote_config_cache trusted the fetch result alone, so it
could return primed=True while no usable cache existed, making
`pipelex-agent init` emit `cache_primed: true` and leaving later offline
runs to fail with RemoteConfigUnavailableError.

Verify a usable cache exists via RemoteConfigCache.load() after the fetch;
report priming failure with a clear remediation message otherwise.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
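
The failure mode fixed here, a successful fetch followed by a silently failed cache write, can be reproduced in a small sketch. The helpers `write_cache_opportunistically` and `prime_cache` are hypothetical stand-ins, not the functions in command.py, and the cache file here holds the bare config rather than the real wrapper.

```python
import json
import sys
from pathlib import Path
from typing import Any, Callable, Dict


def write_cache_opportunistically(path: Path, config: Dict[str, Any]) -> None:
    """Best-effort write: an unwritable cache dir must not fail the fetch."""
    try:
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(json.dumps(config), encoding="utf-8")
    except OSError as exc:
        print(f"warning: could not write cache {path}: {exc}", file=sys.stderr)


def prime_cache(fetch: Callable[[], Dict[str, Any]], path: Path) -> bool:
    """Return True only if a readable cache exists after the fetch."""
    config = fetch()
    write_cache_opportunistically(path, config)
    try:
        # Read back instead of trusting the fetch result alone.
        json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        return False
    return True
```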

@cubic-dev-ai cubic-dev-ai Bot left a comment


1 issue found across 2 files (changes from recent commits).

Prompt for AI agents (unresolved issues)

Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="pipelex/cli/commands/init/command.py">

<violation number="1" location="pipelex/cli/commands/init/command.py:112">
P2: The new priming read-back check treats `RemoteConfigCache.load()` as a usability check, but it only validates the cache wrapper. This can still report `primed=True` with an unusable cached payload.</violation>
</file>


Comment thread pipelex/cli/commands/init/command.py Outdated
The priming read-back check treated RemoteConfigCache.load() as a
usability check, but load() only validates the cache wrapper, not the
inner raw_config payload. A malformed payload could still report
primed=True. Now call to_remote_config() and treat a ValidationError as
a non-primed result, matching the existing check in
RemoteConfigFetcher.fetch_remote_config.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
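
The distinction between a valid wrapper and a valid payload is easy to show with a two-stage check. The sketch uses a hand-written validator in place of the Pydantic model the PR relies on; `PayloadValidationError`, the `models` field, and `is_cache_primed` are hypothetical, while `to_remote_config` and `raw_config` are names taken from the commit messages.

```python
from typing import Any, Dict, List, Optional


class PayloadValidationError(ValueError):
    """Stand-in for the Pydantic ValidationError mentioned in the commit."""


def to_remote_config(wrapper: Dict[str, Any]) -> Dict[str, List[str]]:
    """Validate the inner payload; the wrapper is assumed to be loaded already."""
    raw = wrapper.get("raw_config")
    if not isinstance(raw, dict):
        raise PayloadValidationError("raw_config must be an object")
    models = raw.get("models")  # hypothetical payload field
    if not isinstance(models, list) or not all(isinstance(m, str) for m in models):
        raise PayloadValidationError("raw_config.models must be a list of strings")
    return {"models": models}


def is_cache_primed(wrapper: Optional[Dict[str, Any]]) -> bool:
    if wrapper is None:
        return False  # stage 1: no loadable wrapper
    try:
        to_remote_config(wrapper)  # stage 2: payload must validate too
    except PayloadValidationError:
        return False
    return True
```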
@lchoquel lchoquel merged commit d58fade into dev May 17, 2026
27 checks passed
@github-actions github-actions Bot locked and limited conversation to collaborators May 17, 2026
@lchoquel lchoquel deleted the fix/Offline-mode branch May 17, 2026 22:17