feat(openapi): cache spec fetches (URL index + conditional GET) #1247

Open
RhysSullivan wants to merge 3 commits into executor-cache-primitive from spec-fetch-caching

RhysSullivan commented Jul 2, 2026 (Owner)

Summary

Stacked on #1248 (executor.cache primitive).

OpenAPI spec documents were re-downloaded on every step of every flow: the add form's debounced previewSpec re-fetched per keystroke-burst, addSpec re-fetched the URL the preview had just downloaded, detect fetched it again, and updateSpec always pulled the full document even when nothing changed upstream.

Spec-fetch cache (spec_source index over the blob store)

The content-addressed blob store already dedupes spec storage, but its key is the SHA-256 of the body, so it could never skip a download. The new org-owned spec_source plugin-storage collection remembers what each URL last resolved to (url -> { specHash, etag, lastModified, fetchedAt }):

  • Fresh entry (5-minute TTL) → serve spec/<hash> from the store, zero network. This collapses detect → preview → preview → addSpec into one download.
  • Stale entry with validators → conditional GET (If-None-Match / If-Modified-Since); a 304 revalidates the stored blob without a body transfer.
  • updateSpec always revalidates (never trusts the TTL) — an explicit refresh must see a changed upstream immediately; an unchanged one now costs a bodyless 304 instead of a multi-MB download.
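The three cases above can be sketched as a pure decision function. This is a minimal illustration, not the code in spec-cache.ts: the entry fields, the 5-minute TTL, and the `prefer-cache` / `revalidate` mode names come from this PR, while `planFetch`, `FetchPlan`, and the millisecond timestamp are hypothetical names and assumptions chosen for the example.

```typescript
// Hypothetical shapes: field names follow the PR description, the rest is assumed.
interface SpecSourceEntry {
  specHash: string;
  etag?: string;
  lastModified?: string;
  fetchedAt: number; // assumed to be epoch milliseconds
}

type Freshness = "prefer-cache" | "revalidate";

type FetchPlan =
  | { kind: "serve-cached"; specHash: string }
  | { kind: "conditional-get"; headers: Record<string, string> }
  | { kind: "full-download" };

const TTL_MS = 5 * 60 * 1000; // the 5-minute TTL from the description

function planFetch(
  entry: SpecSourceEntry | undefined,
  mode: Freshness,
  now: number,
): FetchPlan {
  // No index row for this URL yet: nothing to reuse or revalidate.
  if (entry === undefined) return { kind: "full-download" };

  // Only the add flow ("prefer-cache") trusts the TTL; an explicit
  // refresh ("revalidate") always goes to the network.
  const fresh = now - entry.fetchedAt < TTL_MS;
  if (mode === "prefer-cache" && fresh) {
    return { kind: "serve-cached", specHash: entry.specHash };
  }

  // Stale or forced revalidation: send whichever validators were stored.
  const headers: Record<string, string> = {};
  if (entry.etag !== undefined) headers["If-None-Match"] = entry.etag;
  if (entry.lastModified !== undefined) {
    headers["If-Modified-Since"] = entry.lastModified;
  }

  // Without validators a conditional GET is impossible, so download in full.
  return Object.keys(headers).length > 0
    ? { kind: "conditional-get", headers }
    : { kind: "full-download" };
}
```

Keeping the decision separate from the I/O makes the TTL and mode rules easy to test without a server; the actual plugin may well interleave them with the store lookups.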

Inline blob specs bypass the cache entirely. The index guards against key-hash collisions by storing the full URL in each row, and a concurrent first-fetch race degrades to a harmless duplicate insert that is ignored.
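To illustrate the collision guard and the ignored duplicate insert, here is a small in-memory sketch. It assumes the row is keyed by some hash of the URL; `makeSpecSourceIndex`, `insertIfAbsent`, and the injected `keyOf` function are hypothetical names for this example and are not the store's real API (the PR's store exposes getSpecSource / putSpecSource).

```typescript
interface SpecSourceRow {
  url: string; // full URL stored next to the hashed key
  specHash: string;
}

// keyOf stands in for whatever key hash the real collection uses (assumption).
function makeSpecSourceIndex(keyOf: (url: string) => string) {
  const rows = new Map<string, SpecSourceRow>();

  return {
    get(url: string): SpecSourceRow | undefined {
      const row = rows.get(keyOf(url));
      // A row whose stored URL differs means two URLs share a key:
      // treat it as a miss rather than serving another URL's spec.
      return row !== undefined && row.url === url ? row : undefined;
    },

    // Returns false when a row already exists for the key, which is how a
    // losing writer in a first-fetch race would see its insert ignored.
    insertIfAbsent(row: SpecSourceRow): boolean {
      const key = keyOf(row.url);
      if (rows.has(key)) return false;
      rows.set(key, row);
      return true;
    },
  };
}
```

Passing a deliberately weak `keyOf` (for example, one based on URL length) is a cheap way to force collisions in a test and confirm that a lookup for the second URL misses.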

Auth invariant: spec fetches were already structurally unauthenticated (the credentials param on fetchSpecText was dead code — never passed by any call site; integration headers only apply at invoke time). That dead param is removed and the invariant is now documented at the fetch site: the fetched text is cached tenant-wide, so a future authed-spec feature must bypass this cache, not extend the fetch.

Why not executor.cache? The URL index must be durable and tenant-scoped — plugin storage gives both for free, while KV on per-request cloud executors would add an eventual-consistency question for no win. The primitive (#1248) stays for derived-data consumers where best-effort is fine.

Tests

e2e/scenarios/openapi-spec-fetch-cache.test.ts pins the contract black-box on both targets (cloud + selfhost): a real ETag-serving spec host counts downloads vs 304s while the scenario drives only the public API — preview → preview → addSpec is one download (separate HTTP requests, so on cloud this also proves the URL index is durable, not per-executor memory), an unchanged-upstream refresh revalidates to a bodyless 304, and a changed upstream busts the cache.
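The counting spec host is what makes the download ledger observable. The sketch below models it as a plain request handler so it runs without a socket. It is an assumption-laden illustration rather than the test's actual server: `makeCountingSpecHost`, `publish`, and the `"v<n>"` ETag scheme are invented for the example, and request header names are assumed to be lowercased as Node's HTTP server delivers them.

```typescript
interface HostResponse {
  status: 200 | 304 | 404;
  headers: Record<string, string>;
  body: string;
}

function makeCountingSpecHost(specPath: string, initialBody: string) {
  let body = initialBody;
  let version = 1;
  const ledger = { downloads: 0, notModified: 0 };
  const etag = (): string => `"v${version}"`; // strong ETag, bumped on publish

  return {
    ledger,

    // Simulates an upstream change: new body and a new ETag.
    publish(nextBody: string): void {
      body = nextBody;
      version += 1;
    },

    handle(path: string, requestHeaders: Record<string, string>): HostResponse {
      // Non-spec paths (such as discovery probes) 404 and stay out of the count.
      if (path !== specPath) return { status: 404, headers: {}, body: "" };

      // Matching validator: answer with a bodyless 304.
      if (requestHeaders["if-none-match"] === etag()) {
        ledger.notModified += 1;
        return { status: 304, headers: { etag: etag() }, body: "" };
      }

      // Anything else is a full download.
      ledger.downloads += 1;
      return { status: 200, headers: { etag: etag() }, body };
    },
  };
}
```

A scenario would then assert on `ledger` after driving the public API, for instance expecting `downloads` to stay at 1 across preview, preview, and addSpec.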

spec-cache.test.ts pins the same contract at the plugin's HTTP boundary against a real local ETag-serving server: request/304 counts for preview→add single-download, refresh-revalidates-on-304, changed-upstream cache-bust, and inline-blob no-network. Full plugin-openapi suite green (165), google + microsoft plugins green, repo typecheck and lint clean.

Follow-ups (deliberately out of scope)

Stack

  1. feat(sdk): add executor.cache, a host-pluggable key-value cache primitive #1248
  2. feat(openapi): cache spec fetches (URL index + conditional GET) #1247 👈 current

cloudflare-workers-and-pages Bot commented Jul 2, 2026

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

| Status | Name | Latest Commit | Preview URL | Updated (UTC) |
| --- | --- | --- | --- | --- |
| ✅ Deployment successful! (View logs) | executor-marketing | f81d13f | Commit Preview URL, Branch Preview URL | Jul 02 2026, 02:58 AM |

cloudflare-workers-and-pages Bot commented Jul 2, 2026

Deploying with Cloudflare Workers

The latest updates on your project. Learn more about integrating Git with Workers.

| Status | Name | Latest Commit | Updated (UTC) |
| --- | --- | --- | --- |
| ✅ Deployment successful! (View logs) | executor-cloud | f81d13f | Jul 02 2026, 02:59 AM |

github-actions Bot commented Jul 2, 2026 (Contributor)

Cloudflare preview

Console https://executor-preview-pr-1247.executor-e2e.workers.dev
MCP https://executor-preview-pr-1247.executor-e2e.workers.dev/mcp
Deployed commit f81d13f

Sign-in is Cloudflare Access (one-time PIN to an allowed email). The preview has its own database and encryption key; it is destroyed when this PR closes.

pkg-pr-new Bot commented Jul 2, 2026

@executor-js/cli

npm i https://pkg.pr.new/@executor-js/cli@1247

@executor-js/config

npm i https://pkg.pr.new/@executor-js/config@1247

@executor-js/execution

npm i https://pkg.pr.new/@executor-js/execution@1247

@executor-js/sdk

npm i https://pkg.pr.new/@executor-js/sdk@1247

@executor-js/codemode-core

npm i https://pkg.pr.new/@executor-js/codemode-core@1247

@executor-js/runtime-quickjs

npm i https://pkg.pr.new/@executor-js/runtime-quickjs@1247

@executor-js/plugin-file-secrets

npm i https://pkg.pr.new/@executor-js/plugin-file-secrets@1247

@executor-js/plugin-graphql

npm i https://pkg.pr.new/@executor-js/plugin-graphql@1247

@executor-js/plugin-keychain

npm i https://pkg.pr.new/@executor-js/plugin-keychain@1247

@executor-js/plugin-mcp

npm i https://pkg.pr.new/@executor-js/plugin-mcp@1247

@executor-js/plugin-onepassword

npm i https://pkg.pr.new/@executor-js/plugin-onepassword@1247

@executor-js/plugin-openapi

npm i https://pkg.pr.new/@executor-js/plugin-openapi@1247

executor

npm i https://pkg.pr.new/executor@1247

commit: f81d13f

greptile-apps Bot commented Jul 2, 2026

Greptile Summary

This PR introduces a tenant-scoped spec_source URL index over the existing content-addressed blob store, so OpenAPI spec URLs are downloaded at most once per 5-minute TTL window and refresh calls use conditional GET (ETag / Last-Modified) to avoid re-downloading an unchanged spec. The dead credentials parameter on fetchSpecText is removed and the function is refactored into fetchSpecDocument, which surfaces a typed NotModified result for 304 responses.
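A typed not-modified result can be modelled as a discriminated union. The sketch below is illustrative only: `SpecFetchResult`, `RawResponse`, and `toSpecFetchResult` are hypothetical names, and the real fetchSpecDocument signature is not shown in this PR description. It also treats a 304 that arrives when no validators were sent as an error, which is the behavior the review notes for parse.ts.

```typescript
type SpecFetchResult =
  | {
      kind: "fetched";
      text: string;
      etag: string | undefined;
      lastModified: string | undefined;
    }
  | { kind: "not-modified" };

// Minimal stand-in for an HTTP response (assumption, not the plugin's type).
interface RawResponse {
  status: number;
  body: string;
  headers: Record<string, string>; // assumed lowercased header names
}

function toSpecFetchResult(
  res: RawResponse,
  sentValidators: boolean,
): SpecFetchResult {
  if (res.status === 304) {
    // A 304 only makes sense as the answer to a conditional request.
    if (!sentValidators) {
      throw new Error("Received 304 without having sent validators");
    }
    return { kind: "not-modified" };
  }

  if (res.status < 200 || res.status >= 300) {
    throw new Error(`Spec fetch failed with HTTP ${res.status}`);
  }

  // Capture the validators so the next fetch can be conditional.
  return {
    kind: "fetched",
    text: res.body,
    etag: res.headers["etag"],
    lastModified: res.headers["last-modified"],
  };
}
```

Callers can then switch on `kind`, and the compiler will flag any branch that tries to read `text` from a not-modified result.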

  • spec-cache.ts: new fetchSpecTextCached implements the three-way cache path (TTL hit → zero network; stale with validators → conditional GET; miss → full download) with a broken-index recovery path for pruned blobs.
  • plugin.ts: previewSpec, addSpec, updateSpec, and detect all route URL inputs through the cache with the correct freshness mode (prefer-cache for the add flow, revalidate for explicit refresh); inline blobs bypass the cache entirely.
  • Tests: spec-cache.test.ts pins the HTTP-boundary contract (single download, 304 revalidation, changed-upstream bust, inline no-network); two e2e scenarios (API-level and browser UI) pin the same contract end-to-end on both cloud and selfhost targets.

Confidence Score: 5/5

Safe to merge; the caching logic is well-reasoned, all call sites are correctly wired, and the test suite covers the key paths at both unit and e2e level.

The spec-cache implementation handles the TTL, conditional GET, and broken-index recovery paths correctly with appropriate error propagation. The persisted flag correctly avoids redundant blob re-puts. No correctness or data-integrity issues were found in the changed files.

e2e/scenarios/openapi-spec-fetch-cache-ui.test.ts should be moved to e2e/cloud/ per the project convention for browser-capability tests.

Important Files Changed

| Filename | Overview |
| --- | --- |
| packages/plugins/openapi/src/sdk/spec-cache.ts | New URL-keyed index over the content-addressed blob store. Logic for prefer-cache / revalidate / broken-index recovery is sound; the broken-index fallback (304 with missing blob) refetches without validators (addressed in a previous thread). |
| packages/plugins/openapi/src/sdk/store.ts | Adds getSpecSource / putSpecSource to OpenapiStore with schema decode + URL-collision guard; no issues found. |
| packages/plugins/openapi/src/sdk/parse.ts | fetchSpecText refactored into fetchSpecDocument with conditional-GET support; removes dead credentials param; 304-without-validators falls through to an error correctly. |
| packages/plugins/openapi/src/sdk/plugin.ts | All three call sites (previewSpec, addSpec, updateSpec) wired to fetchSpecTextCached with correct freshness modes; persisted flag skips redundant blob re-puts. |
| packages/plugins/openapi/src/sdk/spec-cache.test.ts | Unit tests pin the HTTP-boundary contract for all four paths (single-download add flow, 304 revalidation, changed-upstream bust, inline blob no-network); no issues found. |
| e2e/scenarios/openapi-spec-fetch-cache.test.ts | End-to-end scenario correctly placed in scenarios/ (no browser dependency), drives preview → add → refresh through the public API, uses Effect.ensuring for cleanup. |
| e2e/scenarios/openapi-spec-fetch-cache-ui.test.ts | Browser-surface UI journey that enforces download counts; misplaced in scenarios/ instead of cloud/ per the e2e convention for browser-capability tests. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A["URL input\n(previewSpec / addSpec / updateSpec)"] --> B{freshness?}
    B -->|prefer-cache| C{entry in spec_source\nAND within TTL?}
    B -->|revalidate| E
    C -->|yes| D{blob in store?}
    D -->|yes| HIT["Return cached blob\n(zero network)"]
    D -->|no - pruned| E["Conditional GET\n(If-None-Match / If-Modified-Since)"]
    C -->|no / stale| E
    E --> F{response}
    F -->|304 Not Modified| G{blob still in store?}
    G -->|yes| REVAL["Return cached blob\nUpdate fetchedAt"]
    G -->|no - broken index| RECOVERY["Unconditional GET\n(fetchSpecText)\nWrite blob + index"]
    F -->|200 OK| FULL["Write blob + Write spec_source entry\nReturn new text"]
    A2["Inline blob input"] --> BYPASS["Bypass cache entirely\n(persisted=false)"]
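The lower half of the flowchart (what happens after the conditional GET returns) can be sketched as follows. This is a simplified, synchronous illustration under stated assumptions: `resolveAfterConditionalGet`, its parameters, and the `recovered` flag are hypothetical, and the real code would also write the blob and the spec_source entry, which is omitted here.

```typescript
type ConditionalOutcome =
  | { kind: "fetched"; text: string }
  | { kind: "not-modified" };

function resolveAfterConditionalGet(
  outcome: ConditionalOutcome,
  cachedBlob: string | undefined,
  refetchUnconditionally: () => string,
): { text: string; recovered: boolean } {
  // 200 OK: the upstream changed (or no validators matched); use the new text.
  if (outcome.kind === "fetched") {
    return { text: outcome.text, recovered: false };
  }

  // 304 and the blob is still in the store: revalidated, reuse it.
  if (cachedBlob !== undefined) {
    return { text: cachedBlob, recovered: false };
  }

  // 304 but the blob was pruned: the index is broken, so fall back to an
  // unconditional GET to obtain a body again.
  return { text: refetchUnconditionally(), recovered: true };
}
```

The recovery branch matters because a 304 carries no body; without it, a pruned blob would leave the caller with a valid index row and nothing to return.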

Reviews (4): Last reviewed commit: "test(e2e): browser journey for the spec-..."

Comment thread packages/plugins/openapi/src/sdk/spec-cache.ts
@RhysSullivan RhysSullivan changed the title feat(openapi): cache spec fetches and add an executor.cache primitive feat(openapi): cache spec fetches (URL index + conditional GET) Jul 2, 2026
@RhysSullivan RhysSullivan changed the base branch from main to executor-cache-primitive July 2, 2026 01:39
@RhysSullivan RhysSullivan force-pushed the executor-cache-primitive branch from 4049627 to 651f44a Compare July 2, 2026 01:39
Spec URLs resolve through a tenant-shared spec_source index over the existing
content-addressed blob store (url -> {specHash, etag, lastModified,
fetchedAt}): within-TTL repeats — the add flow's detect -> preview -> addSpec
sequence, debounced previews — are served from the store without touching the
network, and updateSpec revalidates with the stored ETag/Last-Modified so an
unchanged upstream costs a bodyless 304 instead of a multi-MB download.

Spec fetches remain structurally unauthenticated (fetchSpecText's dead
credentials param is removed); any future authed-spec feature must bypass
this cache rather than extend it.

A real 127.0.0.1 spec host with a strong ETag counts downloads vs 304s;
the scenario drives only the public API and asserts the ledger:
preview -> preview -> addSpec is one download (across separate HTTP
requests, so on cloud this also proves the URL index is durable, not
per-executor memory), updateSpec on an unchanged upstream revalidates to
a bodyless 304, and a changed upstream busts the cache through the
validators. Non-spec paths (OAuth discovery probes) 404 and stay out of
the count.

The API-surface twin pins the contract headlessly; this one makes it
watchable: the session video shows the add form analyzing a pasted spec
URL, the integration landing, and Edit -> re-fetch-on-save, while the
counting ETag spec server proves the network did what the UI implies -
one download across the whole add journey, and a bodyless 304 for the
refresh.