Phase 2: cached_llm_call decorator (Tier 2) by risjai · Pull Request #151 · agentoptics/rewind

risjai · 2026-04-27T12:41:11Z

Summary

Tier 2 of the Universal Replay Architecture plan. Programmer-facing per-call-site decorator that wraps a Python function so its return value is cached by Rewind. Composes cleanly with Phase 1's intercept.install() via a contextvar that prevents double-recording.

Why this PR

Phase 1's intercept.install() patches the HTTP transport layer globally — every httpx/requests/aiohttp call from the process gets evaluated against predicates. That's powerful but blunt: you opt the whole process in or out. Some operators want per-function control:

Cache the OUTER function that composes multiple inner LLM/tool calls (instead of caching the inner HTTP requests)
LLM calls that don't go through plain HTTP (Bedrock via boto3 SigV4, gRPC, self-hosted models)
Tests pinning specific functions to known recordings

The decorator gives them a single primitive: @cached_llm_call. Wraps any function (sync or async); returns the cached result on hit or calls the function + records the return on miss.

Public API

from rewind_agent import cached_llm_call

@cached_llm_call(
    extract_model=lambda call_args, ret: ret["model"],
    extract_tokens=lambda call_args, ret: (
        ret["usage"]["prompt_tokens"],
        ret["usage"]["completion_tokens"],
    ),
)
def chat(question: str) -> dict:
    return openai_client.chat.completions.create(...).model_dump()

Async variant works the same way (@cached_llm_call() detects inspect.iscoroutinefunction and emits an async wrapper). Generator / async-generator functions raise TypeError at decoration — single-return cache contract.

What's in this PR

File	LOC	What
`plans/phase-2-cached-llm-call-decorator.md`	+226	Plan doc anchor (committed earlier on this branch)
`python/rewind_agent/cached_call.py`	+617	The decorator + sync/async wrappers + cache-key derivation + `_to_json_serializable` + safe extractors
`python/rewind_agent/__init__.py`	+3	Re-export `cached_llm_call` from the top level
`python/rewind_agent/intercept/_flow.py`	+30	Contextvar check `_is_under_cached_llm_call()` to suppress double-recording when the decorator is active
`python/tests/test_cached_call.py`	+461	26 directed tests
`docs/cached-llm-call.md`	+179	Operator-facing how-to in matching style

Total: ~1500 LOC across 6 files (700 product + 460 tests + 180 docs + 226 plan).

Composition with `intercept.install()`

Both can be active in the same process:

Cache hit on the decorated function: decorator returns cached value WITHOUT calling the user's function. No HTTP calls happen, intercept never sees anything.
Cache miss: decorator sets contextvar _cached_llm_call_active, calls the user's function (which may make HTTP calls under intercept), records the return at function-level granularity. Intercept's _flow checks the contextvar — when set, skips its own recording to avoid double-recording at two granularities.

Net: ONE recording per cache miss, at the function level. Contextvar reset via try/finally so exceptions don't leak the suppression.

Cache key + return-type round-trip

Cache key default: SHA-256 of f"{fn_qualname}|{json(args, kwargs)}" with _safe_repr fallback for non-JSON-able args. Override via cache_key= parameter for unhashable args (clients, file handles).

Return type: decorator stores JSON-serializable values. On hit, you get the JSON-deserialized form back, NOT the original Python type. Common conversions handled automatically (dict, model_dump, Pydantic v1 dict, dict, repr fallback). Documented clearly with the recommended pattern: return response.model_dump() in the decorated function.

Test plan

scripts/pre-push-check.sh — all 5 stages green:
- ruff: clean
- pytest local: 455 passed, 1 skipped (was 429; +26 cached_call tests)
- pytest bare-env (CI mirror): 367 passed, 12 skipped, 0 failed
- cargo clippy: clean
- cargo test --workspace: all green
Sync + async cache hit / miss / divergence
Custom extract_model + extract_tokens + cache_key
Default cache-key stability (same args → same key, kwargs order invariant)
Strict-match RewindReplayDivergenceError propagates
Generator / async-generator raises TypeError at decoration
Contextvar set during call + reset on exception
No double-record under intercept.install() (tested via _is_under_cached_llm_call check + integration via decorator's contextvar)

Out of scope (deferred)

Decision-matrix update in docs/recording.md + docs/getting-started.md from 3-way → 4-way — requires PR #150 (docs) to merge first, then a follow-up commit on this branch will extend the matrix.
Auto-detection of return-type → token extraction — pass extract_tokens explicitly. Auto-detection is a v2.1 candidate.
Generator / async-generator support — yields don't fit single-return cache; documented as deferred with TypeError at decoration time.
Tier 3 — runner registry + dashboard "Run replay" button — separate PR.

Versions

Rust: stays at 0.13.0
Python SDK: stays at 0.15.0 (still unpublished per CLAUDE.md track-2; this PR rides with it). When 0.15.0 publishes, the next minor PR (Tier 3) bumps to 0.16.0.

Plan reference

plans/phase-2-cached-llm-call-decorator.md — full design doc, decision rationale, sequencing, acceptance criteria.

Made with Cursor

Detailed plan for Tier 2 of the Universal Replay Architecture: @cached_llm_call decorator on top of Phase 0's ExplicitClient + Phase 1's _flow orchestration. Programmer-facing per-call-site control vs Phase 1's global intercept. ## Scope - New: rewind_agent.cached_call module (~250 LOC) - _flow.py contextvar to skip double-record when used under intercept.install() - Tests: ~25 cases (sync + async × hit/miss/divergence + custom extractors + composition with intercept) - Docs: docs/cached-llm-call.md + decision matrix update ## Key API @cached_llm_call(extract_tokens=lambda args, ret: ..., extract_model=...) def my_llm_call(question: str): -> ChatCompletion Sync and async; cache key derives from fn name + args via safe_repr; returns cached value on hit. Custom cache_key / extract_tokens / extract_model overrides supported. ## Composition with intercept.install() Decorator's check happens FIRST; if hit, no HTTP call ever fires. On miss, contextvar suppresses the inner intercept recording so we don't double-record. Cleanly composable. ## Open questions - Generator / async-gen functions: punt, raise on decoration - Return-type round-trip: document JSON-serialization caveat; ChatCompletion etc. land as dicts on cache hit Made-with: Cursor

Tier 2 of the Universal Replay Architecture. Wraps a Python function; returns the cached value on hit OR calls the function and records the return on miss. Composes cleanly with Phase 1's intercept.install() via a contextvar that suppresses double-recording. ## Public API from rewind_agent import cached_llm_call @cached_llm_call( extract_model=lambda call_args, ret: ret.model, extract_tokens=lambda call_args, ret: (ret.usage.prompt_tokens, ret.usage.completion_tokens), ) def chat(question: str) -> dict: return openai_client.chat.completions.create(...).model_dump() Sync + async functions both supported (detected via inspect.iscoroutinefunction). Generator / async-generator functions raise TypeError at decoration — single-return cache contract. ## Why Tier 2 exists Phase 1's intercept.install() patches the HTTP transport globally — powerful but blunt. Tier 2 gives operators per-call-site control: - Cache the OUTER function that composes multiple inner LLM/tool calls, vs caching individual HTTP calls - LLM calls that don't go through plain HTTP (Bedrock via boto3, gRPC to self-hosted models) — the decorator caches at function- return level, transport-agnostic - Tests pinning specific functions to known recordings ## Composition with intercept.install() Both can be active in the same process. The decorator's check fires first (it wraps the user's function). On hit: returns cached, no HTTP call ever happens. On miss: contextvar ``_cached_llm_call_active`` is set during the function call, and intercept._flow checks it to skip its own recording — preventing double-record at two granularities. The contextvar is reset via try/finally so exceptions in the user function don't leak the suppression. ## Cache-key derivation Default: SHA-256 of f"{fn_qualname}|{json(args, kwargs)}" with _safe_repr fallback for non-JSON-able args. Operators with unhashable args (clients, file handles) override via cache_key= parameter: @cached_llm_call(cache_key=lambda client, q, **kw: q) def chat(client, question: str) -> dict: ... Custom cache_key failure (raises) falls back to default + warning. ## Return-type round-trip Decorator stores JSON-serializable values in the cache. On hit, you get the JSON-deserialized form back, NOT the original Python type. Documented clearly. Common conversions handled automatically: - dict / list / primitives → as-is - model_dump() (Pydantic v2, OpenAI SDK) → called, result stored - dict() (Pydantic v1) → fallback - __dict__ → fallback - pathological → repr() stored, warning logged ## Tests (26 cases, all green) - Sync + async cache hit / miss / divergence - Custom extract_model + extract_tokens reach record - Custom cache_key overrides + failure-fallback - Default cache key stability (same args → same key, kwargs order invariant) - Strict-match RewindReplayDivergenceError propagates through decorator - Generator / async-generator decoration raises TypeError - Contextvar set during call + reset on exception - _to_json_serializable: dict, Pydantic v2 model_dump, v1 dict, pathological __slots__ class - _safe_repr primitives + lists + dict-with-non-str-keys + custom type fallback - Request payload shape stability + custom-key replacement + custom-key failure fallback ## What's NOT in this PR - Decision matrix update in docs/recording.md and docs/getting- started.md from "three ways" → "four ways". Requires PR #150 (docs/intercept-quickstart.md) to merge first; follow-up commit on this branch will extend the matrix once PR #150 lands. - Auto-detection of return-type → token extraction. Manual extract_tokens only; auto-detect is a v2.1 candidate. - Generator / async-generator support. Yields don't fit the single-return cache; documented as deferred. ## Pre-push verification All 5 stages green BEFORE push (scripts/pre-push-check.sh): - ruff: clean - pytest local: 455 passed, 1 skipped (was 429; +26 cached_call tests) - pytest bare-env (CI mirror): 367 passed, 12 skipped, 0 failed - cargo clippy: clean - cargo test --workspace: all green Made-with: Cursor

vercel · 2026-04-27T12:41:13Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
rewind	Ready	Preview, Comment	Apr 27, 2026 1:08pm

risjai

Review: needs changes before merge

I reviewed PR #151 against the Phase 2 cached_llm_call plan. Python/site/version checks are green, Rust build jobs are still pending, but there are two correctness blockers in the decorator implementation itself.

Blocking Findings

@cached_llm_call() does not create or enter a recording session, so the quickstart path records nothing.
The wrapper constructs a fresh ExplicitClient() and calls record_llm_call(), but ExplicitClient.record_llm_call() immediately returns None when _session_id is unset. The docs show:
```
@cached_llm_call()
def chat(...): ...

result1 = chat("What is 2+2?")  # First call: hits OpenAI, records the return value
```
That is not true unless the caller separately wrapped execution in client.session(...), called ensure_session(...), or started a replay context. The tests patch ExplicitClient.record_llm_call directly, so they bypass the real session precondition and miss this. The decorator either needs to start/ensure a session itself (with a clear naming/threading strategy) or the API/docs/tests need to require an explicit active ExplicitClient session and verify the no-session case.
Custom cache_key does not actually control the server-side cache identity.
_build_request_payload() includes cache_key, but it also includes args_repr and kwargs_repr. Phase 0 content validation hashes the entire request body, so two calls with the same custom cache_key but different/non-stable args_repr values still produce different request_hash values and will miss/diverge. This breaks the documented use case:
```
@cached_llm_call(cache_key=lambda client, q, **_: q)
def chat(client, question): ...
```
The client object's address-bearing repr is still included in args_repr, so the server hash changes across processes even though the custom key is stable. Either the lookup/record request body must be reduced to stable identity fields only (for example _rewind_decorator, fn_name, cache_key) and display args stored elsewhere, or the server validation needs to hash only cache_key for decorator calls. Add a regression test where two different client objects share the same custom key and replay hits.

Missing / Misleading Tests

Add an integration-style test with the real ExplicitClient.record_llm_call() behavior showing what happens with no active session. Today all decorator record tests mock away the session guard.
Add custom-cache-key tests that compare the full request payload/hash inputs, not only payload["cache_key"].
Add a test for cache_key intentionally ignoring a non-stable object arg (repr contains different object IDs) and assert the decorator still replays.

Smaller Notes

The docs are good about return-type fidelity, but they should also state the session requirement if the implementation remains session-external.
The PR is Python/docs only; the Python and site checks pass, but the Rust build jobs are still pending at the time of review.

Once the session lifecycle and custom-key semantics are fixed, this should be much closer.

Reviewer caught two real correctness bugs in the Phase 2 decorator. Both fixed with directed regression tests using the REAL ExplicitClient guard (not patched away). ## Review #2 — custom cache_key didn't actually control cache identity `_build_request_payload()` included `args_repr` and `kwargs_repr` in the request body. Phase 0's content validation hashes the WHOLE body to derive the request_hash, so two calls with the same custom ``cache_key=lambda client, q: q`` but different client objects (different memory addresses → different reprs) produced different request_hashes. Cache miss every time. The custom cache_key was effectively cosmetic. Fix: identity-only payload. Request body now contains only ``_rewind_decorator``, ``fn_name``, ``cache_key``. Args / kwargs are NOT in the payload — when user passes ``cache_key=lambda client, q: q``, the unstable client repr is correctly invisible to the server hash. For dashboard display, fn_name + cache_key suffices to identify the call. Power users wanting human-readable display in the dashboard can use plain-string cache_keys (e.g. ``f"chat:{model}:{question[:50]}"``) instead of opaque SHA-256 hashes. Regression tests: - ``test_custom_cache_key_payload_is_independent_of_arg_reprs`` — two object() instances with different reprs but same custom key produce IDENTICAL request payloads. - ``test_decorator_with_custom_cache_key_records_stable_request_across_clients`` — end-to-end through the decorator: two recordings with different clients + same key produce equal request bodies. - ``test_default_cache_key_payload_independent_of_unstable_object_args`` — pins the EXPECTED non-stability of default keys (the whole reason custom cache_key exists), so a future "smart" default doesn't silently drift. ## Review #1 — no-session quickstart records nothing `ExplicitClient.record_llm_call()` returns None silently when ``_session_id`` is unset. The quickstart docs claimed "First call hits OpenAI, records the return value" but unless the user separately wrapped execution in ``client.session(...)``, ``ensure_session(...)``, or ``init()``, recording was a silent no-op. Tests bypassed this with ``patch.object(ExplicitClient, "record_llm_call")``, masking the bug. This is consistent with the rest of the SDK (init() / intercept have the same precondition), but the docs need to say so. Auto- session creation in the decorator was considered and rejected: naming, lifecycle, and thread-safety concerns; the explicit-session pattern matches what users already do for other parts of the SDK. Fix: - New "Session requirement" section at the top of docs/cached-llm-call.md with three patterns (scoped session, long-lived ensure_session, init() auto-session). - Quickstart updated to wrap the example in ``with client.session("my-quickstart"):``. - Replay path documented inline. Regression tests: - ``test_no_session_function_runs_and_returns_live_result`` — verify decorator is a silent no-op for recording when no session active; function still runs, returns live result correctly. - ``test_no_session_record_is_silent_no_op`` — calls the REAL ``ExplicitClient.record_llm_call`` (no patch) with no session and asserts it returns None without raising. The test the reviewer asked for: shows the decorator's behavior under the actual production guard. - ``test_active_session_records_via_real_client_path`` — inverse: session-active path reaches _post (verified via patch on _post, not on record_llm_call). Confirms the guard semantics in both directions. ## Test count - cached_call tests: 26 → **32** (+6 review regressions) - Total Python tests: pre-push routine reports green across all 5 stages ## Pre-push verification All 5 stages green BEFORE push (scripts/pre-push-check.sh): - ruff: clean - pytest local: passes - pytest bare-env (CI mirror): passes - cargo clippy: clean - cargo test --workspace: clean Made-with: Cursor

risjai · 2026-04-27T12:49:32Z

Review fixes — `0b3b8a8`

Both correctness blockers fixed with the regression tests you asked for. Pre-push routine green BEFORE push (5/5 stages including the bare-env CI mirror).

Fix → Finding

Finding	Fix	Regression test
#2 Custom cache_key didn't actually control cache identity. `_build_request_payload` included `args_repr` + `kwargs_repr` alongside `cache_key`; Phase 0 hashes the whole body, so two calls with the same custom key but different unstable arg reprs (object IDs) produced different request_hashes.	Identity-only payload. Request body now contains exactly `{_rewind_decorator, fn_name, cache_key}`. Args / kwargs are NOT in the payload — when user passes `cache_key=lambda client, q: q`, the client object's unstable repr is correctly invisible to the server hash. Dashboard display: power users encode human-readable identity in the cache_key itself (e.g. `f"chat:{model}:{question[:50]}"`) instead of using SHA hashes.	`test_custom_cache_key_payload_is_independent_of_arg_reprs` (two `object()` instances → identical payloads), `test_decorator_with_custom_cache_key_records_stable_request_across_clients` (end-to-end: two recordings with different clients + same key produce equal request bodies), `test_default_cache_key_payload_independent_of_unstable_object_args` (pins the EXPECTED non-stability of default keys — the whole reason custom keys exist)
#1 No-session no-recording. Docs claimed first call records; reality was silent no-op when `_session_id` was unset. Tests patched `record_llm_call` directly, bypassing the session guard.	New "Session requirement" section at the top of `docs/cached-llm-call.md` with three explicit patterns (scoped `client.session(...)`, long-lived `ensure_session`, `init()` auto-session). Quickstart updated to wrap example in `with client.session("my-quickstart"):`. Decorator behavior unchanged — silent no-op is consistent with the rest of the SDK (init / intercept have same precondition). Auto-session creation in the decorator was considered and rejected: naming + lifecycle + thread-safety concerns.	`test_no_session_function_runs_and_returns_live_result` (function works correctly, no exception, no record), `test_no_session_record_is_silent_no_op` calls the REAL `ExplicitClient.record_llm_call` with no patch and verifies it returns None — the test you asked for, `test_active_session_records_via_real_client_path` (inverse: session-active path reaches _post, verified via patch on _post not on record_llm_call)

What's NOT changed

Decorator still doesn't auto-create sessions. Reasons documented in the commit message — naming, lifecycle, thread safety. If user feedback wants this in v2.1, the decision flips with explicit semantics.
The cache_key user-supplied value is still kept verbatim in the payload (not hashed by us) — meaning users can choose human-readable plain strings if they want dashboard legibility. The server still SHA-256s the whole payload for content validation, so cache identity is still uniqueness-of-payload. Documented.

Test count

cached_call tests: 26 → 32 (+6 directed review regressions)
All 5 pre-push stages green (scripts/pre-push-check.sh)

CI should be green on first try this time. Re-requesting review.

risjai

Follow-up Review

The two original blockers are materially addressed in 0b3b8a8:

cache_key is now the actual identity used in the request payload ({_rewind_decorator, fn_name, cache_key}), so unstable args_repr / kwargs_repr no longer break custom keys.
The quickstart now uses ExplicitClient().session(...), and tests cover the real no-session guard instead of patching it away.

One remaining doc issue before merge:

docs/cached-llm-call.md lists rewind_agent.init() as a way to enter a session for the decorator. I don’t think that’s accurate. init() uses patch.py’s direct-mode _session_id, while cached_llm_call records through ExplicitClient, which checks explicit.py’s contextvar _session_id. So init() may record inner SDK calls, but it does not make the decorator’s outer record_llm_call() work.

Please remove or reword the init() option in the Session requirement section. The valid patterns are ExplicitClient.session(...), ExplicitClient.ensure_session(...), or ExplicitClient.start_replay(...) for replay.

Reviewer follow-up on PR #151 caught a precise factual error in the docs. The two _session_id symbols look identical but are different objects: - patch.py:21 → _session_id = None (plain module variable for direct-mode SDK monkey-patching) - explicit.py:44 → _session_id: contextvars.ContextVar (what ExplicitClient.record_llm_call checks) init() sets patch._session_id when it opens its direct-mode session, but never touches explicit._session_id. So the decorator's outer record_llm_call is still a silent no-op even when init() is active — init() makes the inner SDK monkey-patches record, but it does NOT satisfy the decorator's own session precondition. ## Fix Removed init() from the "Three ways to enter a session" list. The three valid patterns are now: 1. ExplicitClient.session(...) context manager 2. ExplicitClient.ensure_session(...) 3. ExplicitClient.start_replay(...) (for replay flows) Added a new "init() does NOT enable the decorator" subsection spelling out the gotcha + showing how to compose init() with the decorator (call init() AND enter an ExplicitClient session — they work together via the contextvar that suppresses double-recording on miss). ## Why this is just a docs change The decorator's behavior is correct as shipped — it records via ExplicitClient and silently no-ops without a session, consistent with the rest of the SDK. The bug was purely that the docs misled users about what counted as "having a session". No tests touched; no code touched; no API changed. ## Pre-push verification All 5 stages green (scripts/pre-push-check.sh) — same code, just doc text changed. ## Open thought (for follow-up) The same gotcha probably applies to docs/intercept-quickstart.md (PR #150). intercept.install() also records via ExplicitClient, so a user who does `init() + intercept.install()` won't get intercept recordings either. Worth a separate doc-precision pass on PR #150 once it lands. Made-with: Cursor

risjai · 2026-04-27T13:05:24Z

You're right — verified this in code:

python/rewind_agent/patch.py:21 — _session_id = None (plain module variable for direct-mode SDK monkey-patches)
python/rewind_agent/explicit.py:44 — _session_id: contextvars.ContextVar (what ExplicitClient.record_llm_call actually checks)

init() only writes patch._session_id, never the ContextVar in explicit.py. So my docs claim that init() "opens a session for the decorator" was factually wrong — init() makes the inner SDK monkey-patches record, but the decorator's outer record_llm_call is still a silent no-op.

Fixed in 8388910:

Removed init() from the "Three ways to enter a session" list. The three valid patterns are now ExplicitClient.session(...), ExplicitClient.ensure_session(...), and ExplicitClient.start_replay(...) (added per your suggestion).
Added a new "init() does NOT enable the decorator" subsection spelling out the two-_session_id-symbol gotcha + showing how to compose init() with the decorator (call init() AND enter an ExplicitClient session — they cooperate via the contextvar that suppresses double-recording on miss).

Pre-push routine green (5/5 stages). Pure docs change; no tests / code touched.

Adjacent thought (not in this PR)

The same precision issue probably applies to docs/intercept-quickstart.md (PR #150, just merged). intercept.install() also records via ExplicitClient, so a user who does init() + intercept.install() won't get intercept recordings either without a separate ExplicitClient.session(...). Worth a follow-up doc-precision pass on intercept's quickstart once this lands. Want me to bundle that into a follow-up PR or extend this one to cover it too?

User reminder on PR #151: pull BEFORE pushing. Codifying this in the pre-push script so a future session can't skip it. ## What changed scripts/pre-push-check.sh now has 6 stages instead of 5: [0/6] git fetch + ahead/behind check ← NEW [1/6] ruff check [2/6] pytest tests/ (local env) [3/6] pytest tests/ (bare env, CI mirror) [4/6] cargo clippy [5/6] cargo test --workspace Stage 0: - Fetches origin silently - If branch is BEHIND origin: prints clear error pointing at 'git pull --rebase', exits non-zero so subsequent stages don't waste time running against stale code - If detached HEAD: errors out (push from a named branch) - If no upstream branch yet: notes "first push" and continues - If up to date: prints ahead/behind counts and proceeds ## Why Last push on PR #151 got rejected because origin/feat/phase-2-... had been auto-merged with master (sibling PR #150 landed) while my local was unchanged. Pulling THEN pushing is the standard flow; codifying it in the pre-push script means I can't forget. Also saves 30+ seconds of running the rest of the suite against stale code only to have GitHub reject the push at the end. ## Verified ./scripts/pre-push-check.sh — all 6 stages green on this branch with origin and local at the same SHA. Adversarial test (manual): if I rewind HEAD by one commit and re-run, stage 0 detects "behind" and aborts with the clear 'git pull --rebase' message, before any test runs. Made-with: Cursor

risjai added 2 commits April 27, 2026 18:02

vercel Bot deployed to Preview April 27, 2026 12:41 View deployment

risjai commented Apr 27, 2026

View reviewed changes

vercel Bot deployed to Preview April 27, 2026 12:49 View deployment

risjai commented Apr 27, 2026

View reviewed changes

Merge branch 'master' into feat/phase-2-cached-llm-call-decorator

c0d0ca7

vercel Bot deployed to Preview April 27, 2026 13:01 View deployment

vercel Bot deployed to Preview April 27, 2026 13:05 View deployment

vercel Bot deployed to Preview April 27, 2026 13:08 View deployment

risjai merged commit 6a80731 into master Apr 27, 2026
7 checks passed

risjai deleted the feat/phase-2-cached-llm-call-decorator branch April 27, 2026 13:23

risjai mentioned this pull request Apr 28, 2026

Phase 3 (commits 5-13): runner registry full lifecycle — dispatcher, callbacks, dashboard, runner SDK, CLI, e2e, docs, version bump #154

Merged

7 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Phase 2: cached_llm_call decorator (Tier 2)#151

Phase 2: cached_llm_call decorator (Tier 2)#151
risjai merged 6 commits into
masterfrom
feat/phase-2-cached-llm-call-decorator

risjai commented Apr 27, 2026

Uh oh!

vercel Bot commented Apr 27, 2026 •

edited

Loading

Uh oh!

risjai left a comment

Uh oh!

risjai commented Apr 27, 2026

Uh oh!

risjai left a comment

Uh oh!

risjai commented Apr 27, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

risjai commented Apr 27, 2026

Summary

Why this PR

Public API

What's in this PR

Composition with intercept.install()

Cache key + return-type round-trip

Test plan

Out of scope (deferred)

Versions

Plan reference

Uh oh!

vercel Bot commented Apr 27, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

risjai left a comment

Choose a reason for hiding this comment

Review: needs changes before merge

Blocking Findings

Missing / Misleading Tests

Smaller Notes

Uh oh!

risjai commented Apr 27, 2026

Review fixes — 0b3b8a8

Fix → Finding

What's NOT changed

Test count

Uh oh!

risjai left a comment

Choose a reason for hiding this comment

Follow-up Review

Uh oh!

risjai commented Apr 27, 2026

Adjacent thought (not in this PR)

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Composition with `intercept.install()`

vercel Bot commented Apr 27, 2026 •

edited

Loading

Review fixes — `0b3b8a8`