Skip to content

Phase 2: cached_llm_call decorator (Tier 2)#151

Merged
risjai merged 6 commits into
masterfrom
feat/phase-2-cached-llm-call-decorator
Apr 27, 2026
Merged

Phase 2: cached_llm_call decorator (Tier 2)#151
risjai merged 6 commits into
masterfrom
feat/phase-2-cached-llm-call-decorator

Conversation

@risjai
Copy link
Copy Markdown
Collaborator

@risjai risjai commented Apr 27, 2026

Summary

Tier 2 of the Universal Replay Architecture plan. Programmer-facing per-call-site decorator that wraps a Python function so its return value is cached by Rewind. Composes cleanly with Phase 1's intercept.install() via a contextvar that prevents double-recording.

Why this PR

Phase 1's intercept.install() patches the HTTP transport layer globally — every httpx/requests/aiohttp call from the process gets evaluated against predicates. That's powerful but blunt: you opt the whole process in or out. Some operators want per-function control:

  • Cache the OUTER function that composes multiple inner LLM/tool calls (instead of caching the inner HTTP requests)
  • LLM calls that don't go through plain HTTP (Bedrock via boto3 SigV4, gRPC, self-hosted models)
  • Tests pinning specific functions to known recordings

The decorator gives them a single primitive: @cached_llm_call. Wraps any function (sync or async); returns the cached result on hit or calls the function + records the return on miss.

Public API

from rewind_agent import cached_llm_call

@cached_llm_call(
    extract_model=lambda call_args, ret: ret["model"],
    extract_tokens=lambda call_args, ret: (
        ret["usage"]["prompt_tokens"],
        ret["usage"]["completion_tokens"],
    ),
)
def chat(question: str) -> dict:
    return openai_client.chat.completions.create(...).model_dump()

Async variant works the same way (@cached_llm_call() detects inspect.iscoroutinefunction and emits an async wrapper). Generator / async-generator functions raise TypeError at decoration — single-return cache contract.

What's in this PR

File LOC What
plans/phase-2-cached-llm-call-decorator.md +226 Plan doc anchor (committed earlier on this branch)
python/rewind_agent/cached_call.py +617 The decorator + sync/async wrappers + cache-key derivation + _to_json_serializable + safe extractors
python/rewind_agent/__init__.py +3 Re-export cached_llm_call from the top level
python/rewind_agent/intercept/_flow.py +30 Contextvar check _is_under_cached_llm_call() to suppress double-recording when the decorator is active
python/tests/test_cached_call.py +461 26 directed tests
docs/cached-llm-call.md +179 Operator-facing how-to in matching style

Total: ~1500 LOC across 6 files (700 product + 460 tests + 180 docs + 226 plan).

Composition with intercept.install()

Both can be active in the same process:

  • Cache hit on the decorated function: decorator returns cached value WITHOUT calling the user's function. No HTTP calls happen, intercept never sees anything.
  • Cache miss: decorator sets contextvar _cached_llm_call_active, calls the user's function (which may make HTTP calls under intercept), records the return at function-level granularity. Intercept's _flow checks the contextvar — when set, skips its own recording to avoid double-recording at two granularities.

Net: ONE recording per cache miss, at the function level. Contextvar reset via try/finally so exceptions don't leak the suppression.

Cache key + return-type round-trip

Cache key default: SHA-256 of f"{fn_qualname}|{json(args, kwargs)}" with _safe_repr fallback for non-JSON-able args. Override via cache_key= parameter for unhashable args (clients, file handles).

Return type: decorator stores JSON-serializable values. On hit, you get the JSON-deserialized form back, NOT the original Python type. Common conversions handled automatically (dict, model_dump, Pydantic v1 dict, dict, repr fallback). Documented clearly with the recommended pattern: return response.model_dump() in the decorated function.

Test plan

  • scripts/pre-push-check.sh — all 5 stages green:
    • ruff: clean
    • pytest local: 455 passed, 1 skipped (was 429; +26 cached_call tests)
    • pytest bare-env (CI mirror): 367 passed, 12 skipped, 0 failed
    • cargo clippy: clean
    • cargo test --workspace: all green
  • Sync + async cache hit / miss / divergence
  • Custom extract_model + extract_tokens + cache_key
  • Default cache-key stability (same args → same key, kwargs order invariant)
  • Strict-match RewindReplayDivergenceError propagates
  • Generator / async-generator raises TypeError at decoration
  • Contextvar set during call + reset on exception
  • No double-record under intercept.install() (tested via _is_under_cached_llm_call check + integration via decorator's contextvar)

Out of scope (deferred)

  • Decision-matrix update in docs/recording.md + docs/getting-started.md from 3-way → 4-way — requires PR #150 (docs) to merge first, then a follow-up commit on this branch will extend the matrix.
  • Auto-detection of return-type → token extraction — pass extract_tokens explicitly. Auto-detection is a v2.1 candidate.
  • Generator / async-generator support — yields don't fit single-return cache; documented as deferred with TypeError at decoration time.
  • Tier 3 — runner registry + dashboard "Run replay" button — separate PR.

Versions

  • Rust: stays at 0.13.0
  • Python SDK: stays at 0.15.0 (still unpublished per CLAUDE.md track-2; this PR rides with it). When 0.15.0 publishes, the next minor PR (Tier 3) bumps to 0.16.0.

Plan reference

plans/phase-2-cached-llm-call-decorator.md — full design doc, decision rationale, sequencing, acceptance criteria.

Made with Cursor

risjai added 2 commits April 27, 2026 18:02
Detailed plan for Tier 2 of the Universal Replay Architecture:
@cached_llm_call decorator on top of Phase 0's ExplicitClient +
Phase 1's _flow orchestration. Programmer-facing per-call-site
control vs Phase 1's global intercept.

## Scope

- New: rewind_agent.cached_call module (~250 LOC)
- _flow.py contextvar to skip double-record when used under
  intercept.install()
- Tests: ~25 cases (sync + async × hit/miss/divergence + custom
  extractors + composition with intercept)
- Docs: docs/cached-llm-call.md + decision matrix update

## Key API

@cached_llm_call(extract_tokens=lambda args, ret: ..., extract_model=...)
def my_llm_call(question: str): -> ChatCompletion

Sync and async; cache key derives from fn name + args via safe_repr;
returns cached value on hit. Custom cache_key / extract_tokens /
extract_model overrides supported.

## Composition with intercept.install()

Decorator's check happens FIRST; if hit, no HTTP call ever fires.
On miss, contextvar suppresses the inner intercept recording so we
don't double-record. Cleanly composable.

## Open questions

- Generator / async-gen functions: punt, raise on decoration
- Return-type round-trip: document JSON-serialization caveat;
  ChatCompletion etc. land as dicts on cache hit

Made-with: Cursor
Tier 2 of the Universal Replay Architecture. Wraps a Python function;
returns the cached value on hit OR calls the function and records the
return on miss. Composes cleanly with Phase 1's intercept.install()
via a contextvar that suppresses double-recording.

## Public API

  from rewind_agent import cached_llm_call

  @cached_llm_call(
      extract_model=lambda call_args, ret: ret.model,
      extract_tokens=lambda call_args, ret: (ret.usage.prompt_tokens,
                                              ret.usage.completion_tokens),
  )
  def chat(question: str) -> dict:
      return openai_client.chat.completions.create(...).model_dump()

Sync + async functions both supported (detected via
inspect.iscoroutinefunction). Generator / async-generator functions
raise TypeError at decoration — single-return cache contract.

## Why Tier 2 exists

Phase 1's intercept.install() patches the HTTP transport globally —
powerful but blunt. Tier 2 gives operators per-call-site control:

- Cache the OUTER function that composes multiple inner LLM/tool
  calls, vs caching individual HTTP calls
- LLM calls that don't go through plain HTTP (Bedrock via boto3,
  gRPC to self-hosted models) — the decorator caches at function-
  return level, transport-agnostic
- Tests pinning specific functions to known recordings

## Composition with intercept.install()

Both can be active in the same process. The decorator's check
fires first (it wraps the user's function). On hit: returns cached,
no HTTP call ever happens. On miss: contextvar
``_cached_llm_call_active`` is set during the function call, and
intercept._flow checks it to skip its own recording — preventing
double-record at two granularities.

The contextvar is reset via try/finally so exceptions in the user
function don't leak the suppression.

## Cache-key derivation

Default: SHA-256 of f"{fn_qualname}|{json(args, kwargs)}" with
_safe_repr fallback for non-JSON-able args. Operators with
unhashable args (clients, file handles) override via cache_key=
parameter:

  @cached_llm_call(cache_key=lambda client, q, **kw: q)
  def chat(client, question: str) -> dict: ...

Custom cache_key failure (raises) falls back to default + warning.

## Return-type round-trip

Decorator stores JSON-serializable values in the cache. On hit, you
get the JSON-deserialized form back, NOT the original Python type.
Documented clearly. Common conversions handled automatically:

- dict / list / primitives → as-is
- model_dump() (Pydantic v2, OpenAI SDK) → called, result stored
- dict() (Pydantic v1) → fallback
- __dict__ → fallback
- pathological → repr() stored, warning logged

## Tests (26 cases, all green)

- Sync + async cache hit / miss / divergence
- Custom extract_model + extract_tokens reach record
- Custom cache_key overrides + failure-fallback
- Default cache key stability (same args → same key, kwargs order
  invariant)
- Strict-match RewindReplayDivergenceError propagates through
  decorator
- Generator / async-generator decoration raises TypeError
- Contextvar set during call + reset on exception
- _to_json_serializable: dict, Pydantic v2 model_dump, v1 dict,
  pathological __slots__ class
- _safe_repr primitives + lists + dict-with-non-str-keys + custom
  type fallback
- Request payload shape stability + custom-key replacement +
  custom-key failure fallback

## What's NOT in this PR

- Decision matrix update in docs/recording.md and docs/getting-
  started.md from "three ways" → "four ways". Requires PR #150
  (docs/intercept-quickstart.md) to merge first; follow-up commit
  on this branch will extend the matrix once PR #150 lands.
- Auto-detection of return-type → token extraction. Manual
  extract_tokens only; auto-detect is a v2.1 candidate.
- Generator / async-generator support. Yields don't fit the
  single-return cache; documented as deferred.

## Pre-push verification

All 5 stages green BEFORE push (scripts/pre-push-check.sh):
  - ruff: clean
  - pytest local: 455 passed, 1 skipped (was 429; +26 cached_call tests)
  - pytest bare-env (CI mirror): 367 passed, 12 skipped, 0 failed
  - cargo clippy: clean
  - cargo test --workspace: all green

Made-with: Cursor
@vercel
Copy link
Copy Markdown

vercel Bot commented Apr 27, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
rewind Ready Ready Preview, Comment Apr 27, 2026 1:08pm

Copy link
Copy Markdown
Collaborator Author

@risjai risjai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review: needs changes before merge

I reviewed PR #151 against the Phase 2 cached_llm_call plan. Python/site/version checks are green, Rust build jobs are still pending, but there are two correctness blockers in the decorator implementation itself.

Blocking Findings

  • @cached_llm_call() does not create or enter a recording session, so the quickstart path records nothing.
    The wrapper constructs a fresh ExplicitClient() and calls record_llm_call(), but ExplicitClient.record_llm_call() immediately returns None when _session_id is unset. The docs show:

    @cached_llm_call()
    def chat(...): ...
    
    result1 = chat("What is 2+2?")  # First call: hits OpenAI, records the return value

    That is not true unless the caller separately wrapped execution in client.session(...), called ensure_session(...), or started a replay context. The tests patch ExplicitClient.record_llm_call directly, so they bypass the real session precondition and miss this. The decorator either needs to start/ensure a session itself (with a clear naming/threading strategy) or the API/docs/tests need to require an explicit active ExplicitClient session and verify the no-session case.

  • Custom cache_key does not actually control the server-side cache identity.
    _build_request_payload() includes cache_key, but it also includes args_repr and kwargs_repr. Phase 0 content validation hashes the entire request body, so two calls with the same custom cache_key but different/non-stable args_repr values still produce different request_hash values and will miss/diverge. This breaks the documented use case:

    @cached_llm_call(cache_key=lambda client, q, **_: q)
    def chat(client, question): ...

    The client object's address-bearing repr is still included in args_repr, so the server hash changes across processes even though the custom key is stable. Either the lookup/record request body must be reduced to stable identity fields only (for example _rewind_decorator, fn_name, cache_key) and display args stored elsewhere, or the server validation needs to hash only cache_key for decorator calls. Add a regression test where two different client objects share the same custom key and replay hits.

Missing / Misleading Tests

  • Add an integration-style test with the real ExplicitClient.record_llm_call() behavior showing what happens with no active session. Today all decorator record tests mock away the session guard.
  • Add custom-cache-key tests that compare the full request payload/hash inputs, not only payload["cache_key"].
  • Add a test for cache_key intentionally ignoring a non-stable object arg (repr contains different object IDs) and assert the decorator still replays.

Smaller Notes

  • The docs are good about return-type fidelity, but they should also state the session requirement if the implementation remains session-external.
  • The PR is Python/docs only; the Python and site checks pass, but the Rust build jobs are still pending at the time of review.

Once the session lifecycle and custom-key semantics are fixed, this should be much closer.

Reviewer caught two real correctness bugs in the Phase 2 decorator.
Both fixed with directed regression tests using the REAL
ExplicitClient guard (not patched away).

## Review #2 — custom cache_key didn't actually control cache identity

`_build_request_payload()` included `args_repr` and `kwargs_repr` in
the request body. Phase 0's content validation hashes the WHOLE body
to derive the request_hash, so two calls with the same custom
``cache_key=lambda client, q: q`` but different client objects
(different memory addresses → different reprs) produced different
request_hashes. Cache miss every time. The custom cache_key was
effectively cosmetic.

Fix: identity-only payload. Request body now contains only
``_rewind_decorator``, ``fn_name``, ``cache_key``. Args / kwargs
are NOT in the payload — when user passes
``cache_key=lambda client, q: q``, the unstable client repr is
correctly invisible to the server hash.

For dashboard display, fn_name + cache_key suffices to identify the
call. Power users wanting human-readable display in the dashboard can
use plain-string cache_keys (e.g. ``f"chat:{model}:{question[:50]}"``)
instead of opaque SHA-256 hashes.

Regression tests:
- ``test_custom_cache_key_payload_is_independent_of_arg_reprs`` —
  two object() instances with different reprs but same custom key
  produce IDENTICAL request payloads.
- ``test_decorator_with_custom_cache_key_records_stable_request_across_clients`` —
  end-to-end through the decorator: two recordings with different
  clients + same key produce equal request bodies.
- ``test_default_cache_key_payload_independent_of_unstable_object_args`` —
  pins the EXPECTED non-stability of default keys (the whole reason
  custom cache_key exists), so a future "smart" default doesn't
  silently drift.

## Review #1 — no-session quickstart records nothing

`ExplicitClient.record_llm_call()` returns None silently when
``_session_id`` is unset. The quickstart docs claimed "First call
hits OpenAI, records the return value" but unless the user
separately wrapped execution in ``client.session(...)``,
``ensure_session(...)``, or ``init()``, recording was a silent no-op.
Tests bypassed this with ``patch.object(ExplicitClient, "record_llm_call")``,
masking the bug.

This is consistent with the rest of the SDK (init() / intercept
have the same precondition), but the docs need to say so. Auto-
session creation in the decorator was considered and rejected:
naming, lifecycle, and thread-safety concerns; the explicit-session
pattern matches what users already do for other parts of the SDK.

Fix:
- New "Session requirement" section at the top of
  docs/cached-llm-call.md with three patterns (scoped session,
  long-lived ensure_session, init() auto-session).
- Quickstart updated to wrap the example in
  ``with client.session("my-quickstart"):``.
- Replay path documented inline.

Regression tests:
- ``test_no_session_function_runs_and_returns_live_result`` — verify
  decorator is a silent no-op for recording when no session active;
  function still runs, returns live result correctly.
- ``test_no_session_record_is_silent_no_op`` — calls the REAL
  ``ExplicitClient.record_llm_call`` (no patch) with no session and
  asserts it returns None without raising. The test the reviewer
  asked for: shows the decorator's behavior under the actual
  production guard.
- ``test_active_session_records_via_real_client_path`` — inverse:
  session-active path reaches _post (verified via patch on _post,
  not on record_llm_call). Confirms the guard semantics in both
  directions.

## Test count

- cached_call tests: 26 → **32** (+6 review regressions)
- Total Python tests: pre-push routine reports green across all 5 stages

## Pre-push verification

All 5 stages green BEFORE push (scripts/pre-push-check.sh):
  - ruff: clean
  - pytest local: passes
  - pytest bare-env (CI mirror): passes
  - cargo clippy: clean
  - cargo test --workspace: clean

Made-with: Cursor
@risjai
Copy link
Copy Markdown
Collaborator Author

risjai commented Apr 27, 2026

Review fixes — 0b3b8a8

Both correctness blockers fixed with the regression tests you asked for. Pre-push routine green BEFORE push (5/5 stages including the bare-env CI mirror).

Fix → Finding

Finding Fix Regression test
#2 Custom cache_key didn't actually control cache identity. _build_request_payload included args_repr + kwargs_repr alongside cache_key; Phase 0 hashes the whole body, so two calls with the same custom key but different unstable arg reprs (object IDs) produced different request_hashes. Identity-only payload. Request body now contains exactly {_rewind_decorator, fn_name, cache_key}. Args / kwargs are NOT in the payload — when user passes cache_key=lambda client, q: q, the client object's unstable repr is correctly invisible to the server hash. Dashboard display: power users encode human-readable identity in the cache_key itself (e.g. f"chat:{model}:{question[:50]}") instead of using SHA hashes. test_custom_cache_key_payload_is_independent_of_arg_reprs (two object() instances → identical payloads), test_decorator_with_custom_cache_key_records_stable_request_across_clients (end-to-end: two recordings with different clients + same key produce equal request bodies), test_default_cache_key_payload_independent_of_unstable_object_args (pins the EXPECTED non-stability of default keys — the whole reason custom keys exist)
#1 No-session no-recording. Docs claimed first call records; reality was silent no-op when _session_id was unset. Tests patched record_llm_call directly, bypassing the session guard. New "Session requirement" section at the top of docs/cached-llm-call.md with three explicit patterns (scoped client.session(...), long-lived ensure_session, init() auto-session). Quickstart updated to wrap example in with client.session("my-quickstart"):. Decorator behavior unchanged — silent no-op is consistent with the rest of the SDK (init / intercept have same precondition). Auto-session creation in the decorator was considered and rejected: naming + lifecycle + thread-safety concerns. test_no_session_function_runs_and_returns_live_result (function works correctly, no exception, no record), test_no_session_record_is_silent_no_op calls the REAL ExplicitClient.record_llm_call with no patch and verifies it returns None — the test you asked for, test_active_session_records_via_real_client_path (inverse: session-active path reaches _post, verified via patch on _post not on record_llm_call)

What's NOT changed

  • Decorator still doesn't auto-create sessions. Reasons documented in the commit message — naming, lifecycle, thread safety. If user feedback wants this in v2.1, the decision flips with explicit semantics.
  • The cache_key user-supplied value is still kept verbatim in the payload (not hashed by us) — meaning users can choose human-readable plain strings if they want dashboard legibility. The server still SHA-256s the whole payload for content validation, so cache identity is still uniqueness-of-payload. Documented.

Test count

  • cached_call tests: 26 → 32 (+6 directed review regressions)
  • All 5 pre-push stages green (scripts/pre-push-check.sh)

CI should be green on first try this time. Re-requesting review.

Copy link
Copy Markdown
Collaborator Author

@risjai risjai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Follow-up Review

The two original blockers are materially addressed in 0b3b8a8:

  • cache_key is now the actual identity used in the request payload ({_rewind_decorator, fn_name, cache_key}), so unstable args_repr / kwargs_repr no longer break custom keys.
  • The quickstart now uses ExplicitClient().session(...), and tests cover the real no-session guard instead of patching it away.

One remaining doc issue before merge:

  • docs/cached-llm-call.md lists rewind_agent.init() as a way to enter a session for the decorator. I don’t think that’s accurate. init() uses patch.py’s direct-mode _session_id, while cached_llm_call records through ExplicitClient, which checks explicit.py’s contextvar _session_id. So init() may record inner SDK calls, but it does not make the decorator’s outer record_llm_call() work.

Please remove or reword the init() option in the Session requirement section. The valid patterns are ExplicitClient.session(...), ExplicitClient.ensure_session(...), or ExplicitClient.start_replay(...) for replay.

Reviewer follow-up on PR #151 caught a precise factual error in
the docs. The two _session_id symbols look identical but are
different objects:

  - patch.py:21         → _session_id = None  (plain module variable
                          for direct-mode SDK monkey-patching)
  - explicit.py:44      → _session_id: contextvars.ContextVar
                          (what ExplicitClient.record_llm_call checks)

init() sets patch._session_id when it opens its direct-mode session,
but never touches explicit._session_id. So the decorator's outer
record_llm_call is still a silent no-op even when init() is active —
init() makes the inner SDK monkey-patches record, but it does NOT
satisfy the decorator's own session precondition.

## Fix

Removed init() from the "Three ways to enter a session" list. The
three valid patterns are now:

  1. ExplicitClient.session(...) context manager
  2. ExplicitClient.ensure_session(...)
  3. ExplicitClient.start_replay(...)  (for replay flows)

Added a new "init() does NOT enable the decorator" subsection
spelling out the gotcha + showing how to compose init() with the
decorator (call init() AND enter an ExplicitClient session — they
work together via the contextvar that suppresses double-recording
on miss).

## Why this is just a docs change

The decorator's behavior is correct as shipped — it records via
ExplicitClient and silently no-ops without a session, consistent
with the rest of the SDK. The bug was purely that the docs misled
users about what counted as "having a session". No tests touched;
no code touched; no API changed.

## Pre-push verification

All 5 stages green (scripts/pre-push-check.sh) — same code, just
doc text changed.

## Open thought (for follow-up)

The same gotcha probably applies to docs/intercept-quickstart.md
(PR #150). intercept.install() also records via ExplicitClient,
so a user who does `init() + intercept.install()` won't get
intercept recordings either. Worth a separate doc-precision pass
on PR #150 once it lands.

Made-with: Cursor
@risjai
Copy link
Copy Markdown
Collaborator Author

risjai commented Apr 27, 2026

You're right — verified this in code:

  • python/rewind_agent/patch.py:21_session_id = None (plain module variable for direct-mode SDK monkey-patches)
  • python/rewind_agent/explicit.py:44_session_id: contextvars.ContextVar (what ExplicitClient.record_llm_call actually checks)

init() only writes patch._session_id, never the ContextVar in explicit.py. So my docs claim that init() "opens a session for the decorator" was factually wrong — init() makes the inner SDK monkey-patches record, but the decorator's outer record_llm_call is still a silent no-op.

Fixed in 8388910:

  • Removed init() from the "Three ways to enter a session" list. The three valid patterns are now ExplicitClient.session(...), ExplicitClient.ensure_session(...), and ExplicitClient.start_replay(...) (added per your suggestion).
  • Added a new "init() does NOT enable the decorator" subsection spelling out the two-_session_id-symbol gotcha + showing how to compose init() with the decorator (call init() AND enter an ExplicitClient session — they cooperate via the contextvar that suppresses double-recording on miss).

Pre-push routine green (5/5 stages). Pure docs change; no tests / code touched.

Adjacent thought (not in this PR)

The same precision issue probably applies to docs/intercept-quickstart.md (PR #150, just merged). intercept.install() also records via ExplicitClient, so a user who does init() + intercept.install() won't get intercept recordings either without a separate ExplicitClient.session(...). Worth a follow-up doc-precision pass on intercept's quickstart once this lands. Want me to bundle that into a follow-up PR or extend this one to cover it too?

User reminder on PR #151: pull BEFORE pushing. Codifying this in
the pre-push script so a future session can't skip it.

## What changed

scripts/pre-push-check.sh now has 6 stages instead of 5:

  [0/6] git fetch + ahead/behind check  ← NEW
  [1/6] ruff check
  [2/6] pytest tests/ (local env)
  [3/6] pytest tests/ (bare env, CI mirror)
  [4/6] cargo clippy
  [5/6] cargo test --workspace

Stage 0:

  - Fetches origin silently
  - If branch is BEHIND origin: prints clear error pointing at
    'git pull --rebase', exits non-zero so subsequent stages don't
    waste time running against stale code
  - If detached HEAD: errors out (push from a named branch)
  - If no upstream branch yet: notes "first push" and continues
  - If up to date: prints ahead/behind counts and proceeds

## Why

Last push on PR #151 got rejected because origin/feat/phase-2-...
had been auto-merged with master (sibling PR #150 landed) while my
local was unchanged. Pulling THEN pushing is the standard flow;
codifying it in the pre-push script means I can't forget. Also
saves 30+ seconds of running the rest of the suite against stale
code only to have GitHub reject the push at the end.

## Verified

  ./scripts/pre-push-check.sh — all 6 stages green on this branch
  with origin and local at the same SHA.

  Adversarial test (manual): if I rewind HEAD by one commit and
  re-run, stage 0 detects "behind" and aborts with the clear
  'git pull --rebase' message, before any test runs.

Made-with: Cursor
@risjai risjai merged commit 6a80731 into master Apr 27, 2026
7 checks passed
@risjai risjai deleted the feat/phase-2-cached-llm-call-decorator branch April 27, 2026 13:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant