Skip to content

Remove read-path cache writes for query-backed models (closes #74)#79

Merged
ZmeiGorynych merged 3 commits into
mainfrom
egor/lost-update
May 3, 2026
Merged

Remove read-path cache writes for query-backed models (closes #74)#79
ZmeiGorynych merged 3 commits into
mainfrom
egor/lost-update

Conversation

@ZmeiGorynych
Copy link
Copy Markdown
Member

@ZmeiGorynych ZmeiGorynych commented May 3, 2026

Summary

  • Stops engine.execute, inspect_model, get_column_types, MCP query, and REST /query from writing to storage as a side effect — eliminates the lost-update race in Lost-update race in _refresh_cache_after_resolution (read-path side effect) #74 where a parallel edit_model was clobbered by the read path's full-model rewrite.
  • Cache fields (columns, backing_query_sql, data_source) on a query-backed SlayerModel are now populated only by engine.save_model / create_model_from_query(save=True) (and the REST/MCP/CLI write paths that route through them).
  • Aggressive cleanup: deleted _refresh_cache_after_resolution and the now-vestigial refresh_cache parameter + canonical-second-render branch on _expand_query_backed_model.

Trade-off (explicit): a backing-query rewrite that bypasses engine.save_model (e.g., raw storage.save_model) won't be reflected in cached fields until the next save through the engine. Documented in docs/concepts/models.md. Full CAS / patch_model is the layer-2 path from the issue and is deferred.

Test plan

  • poetry run pytest — 1382 passed, 0 failed
  • poetry run ruff check slayer/ tests/ — clean
  • Postgres / SQLite / DuckDB integration suites verified pre-merge (38 + 89 passing); my changes touch only dialect-agnostic resolution code.
  • Inverted TestCacheRefreshOnExecute to assert read paths never call storage.save_model, with cases for populated, empty, and stale cache plus outer-query variables.
  • Updated TestInspectModelQueryBacked._setup to route through engine.save_model so the cache is warmed at save time (was relying on the removed read-path side effect).

🤖 Generated with Claude Code

Summary by CodeRabbit

  • Bug Fixes
    • Query-backed models now enforce strict separation of read and write operations. Read operations (execution, inspection) no longer update cached metadata. Cache contents are populated only when saving through the engine. Note: direct storage access bypassing the engine may result in stale cache until the next engine save.

ZmeiGorynych and others added 2 commits May 3, 2026 11:37
Cache fields (columns, backing_query_sql, data_source) on a query-backed
SlayerModel are no longer refreshed as a side effect of engine.execute,
inspect_model, or get_column_types — only engine.save_model and
create_model_from_query(save=True) populate them. Eliminates the
lost-update race where a parallel edit_model write was clobbered by the
read path's full-model rewrite.

Trade-off: a backing-query rewrite that bypasses engine.save_model
won't be reflected in cached fields until the next save through the
engine. Documented in docs/concepts/models.md.

- Delete _refresh_cache_after_resolution and all three call sites
  (one in _execute_by_name, two in _expand_query_backed_model).
- Drop the now-vestigial refresh_cache parameter and the
  canonical-second-render branch.
- Invert TestCacheRefreshOnExecute to assert read paths never write,
  with cases for populated/empty/stale cache and outer-query variables.
- Update TestInspectModelQueryBacked._setup to route through
  engine.save_model so the cache is warmed at save time.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 3, 2026

Warning

Rate limit exceeded

@ZmeiGorynych has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 4 minutes and 59 seconds before requesting another review.

To keep reviews running without waiting, you can enable usage-based add-on for your organization. This allows additional reviews beyond the hourly cap. Account admins can enable it under billing.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: a9441862-2771-4f5c-b91f-3fa9c5159c4f

📥 Commits

Reviewing files that changed from the base of the PR and between 86c1d2b and 7d9a648.

📒 Files selected for processing (1)
  • tests/test_mcp_server.py
📝 Walkthrough

Walkthrough

This PR eliminates cache-refresh logic from read operations in the query engine. Query-backed model caches (model.columns and backing_query_sql) are now populated only during engine.save_model, not during execute or model resolution. Direct storage bypass and read operations no longer update persisted cache.

Changes

Cache Refresh Removal from Read Paths

Layer / File(s) Summary
Core Engine Behavior
slayer/engine/query_engine.py
_execute_by_name() removes post-execution cache refresh; _expand_query_backed_model() becomes read-only (signature removes refresh_cache parameter and conditional persistence block); _resolve_model_inner() stops requesting cache refresh when expanding query-backed models.
Test Infrastructure
tests/test_mcp_server.py
Query-backed model test setup now saves through SlayerQueryEngine.save_model instead of direct storage write to populate cache at save time; test_json_show_sql_includes_backing_query_sql simplified to rely only on cache warming from _setup().
Test Assertions & Invariants
tests/test_query_backed_models.py
Adds _wrap_save_counter() helper to spy on storage writes; strengthens assertions that execute never calls storage.save_model; validates read operations do not populate cache for raw-saved models; verifies extension columns and ephemeral named-query stages do not persist; confirms runtime and outer-query variables do not leak into cached backing_query_sql.
Documentation
docs/concepts/models.md
Cache semantics now explicitly state cache is populated only via engine.save_model (including REST/MCP paths), read operations never modify persisted cache, and direct storage bypass can leave cache stale until the next engine save.

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly Related Issues

Possibly Related PRs

Poem

A rabbit hops through cache with care,
No reads shall write—let saves declare!
The engine knows when truth must stay,
Fresh data blooms at save's first day. 🐰✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 70.00% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly and specifically summarizes the main change: removing cache writes from read-path operations for query-backed models, and directly references the issue being closed (#74).
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch egor/lost-update

Review rate limit: 0/5 reviews remaining, refill in 4 minutes and 59 seconds.

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
docs/concepts/models.md (1)

327-334: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Document data_source as part of the persisted cache.

This section still calls out only columns and backing_query_sql, but engine.save_model() also refreshes the stored data_source from the resolved virtual model. That field is part of the same engine-managed cache contract now, and downstream read paths like get_column_types() rely on it.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/concepts/models.md` around lines 327 - 334, The docs omit that
engine.save_model also persists model.data_source into the engine-managed cache;
update the prose under "What gets cached" to list `model.data_source` alongside
`model.columns` and `model.backing_query_sql`, and state that
`engine.save_model` (and REST MCP create/edit) refreshes `model.data_source`
from the resolved virtual model so read paths like `get_column_types`,
`inspect_model`, and REST `GET /models/{name}` can rely on it; keep the note
that read operations (e.g., `engine.execute`, `inspect_model`,
`get_column_types`, MCP `query`, REST `/query`) do not write to storage and that
direct writes outside `engine.save_model` leave `model.columns`,
`model.backing_query_sql`, and now `model.data_source` stale until the next
engine save.
🧹 Nitpick comments (1)
tests/test_mcp_server.py (1)

327-331: ⚡ Quick win

Keep one MCP-level cache-warming regression test.

This helper now primes the cache by calling SlayerQueryEngine.save_model(...) directly, so it no longer exercises the MCP write path itself. I'd add a tool-level create_model(query=...) or edit_model(source_queries=...) assertion that the persisted model has backing_query_sql, so a regression in slayer/mcp/server.py can't slip past while this test still stays green.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/test_mcp_server.py` around lines 327 - 331, The test currently primes
the cache by calling SlayerQueryEngine.save_model directly
(SlayerQueryEngine.save_model and SlayerModel), so it no longer verifies the MCP
write path in slayer/mcp/server.py; update the test to also exercise the
MCP-level create/edit methods by invoking the MCP tool API (e.g., call
create_model(query=...) or edit_model(source_queries=...)) and then retrieve the
persisted model and assert that its backing_query_sql field is present and
populated; this ensures a regression in the MCP handlers
(create_model/edit_model in slayer/mcp/server.py) will be caught while keeping
the existing cache-warming behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@docs/concepts/models.md`:
- Around line 327-334: The docs omit that engine.save_model also persists
model.data_source into the engine-managed cache; update the prose under "What
gets cached" to list `model.data_source` alongside `model.columns` and
`model.backing_query_sql`, and state that `engine.save_model` (and REST MCP
create/edit) refreshes `model.data_source` from the resolved virtual model so
read paths like `get_column_types`, `inspect_model`, and REST `GET
/models/{name}` can rely on it; keep the note that read operations (e.g.,
`engine.execute`, `inspect_model`, `get_column_types`, MCP `query`, REST
`/query`) do not write to storage and that direct writes outside
`engine.save_model` leave `model.columns`, `model.backing_query_sql`, and now
`model.data_source` stale until the next engine save.

---

Nitpick comments:
In `@tests/test_mcp_server.py`:
- Around line 327-331: The test currently primes the cache by calling
SlayerQueryEngine.save_model directly (SlayerQueryEngine.save_model and
SlayerModel), so it no longer verifies the MCP write path in
slayer/mcp/server.py; update the test to also exercise the MCP-level create/edit
methods by invoking the MCP tool API (e.g., call create_model(query=...) or
edit_model(source_queries=...)) and then retrieve the persisted model and assert
that its backing_query_sql field is present and populated; this ensures a
regression in the MCP handlers (create_model/edit_model in slayer/mcp/server.py)
will be caught while keeping the existing cache-warming behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 079b8351-5170-4d12-9b03-c9d8108ca4a4

📥 Commits

Reviewing files that changed from the base of the PR and between 3545d1f and 86c1d2b.

📒 Files selected for processing (4)
  • docs/concepts/models.md
  • slayer/engine/query_engine.py
  • tests/test_mcp_server.py
  • tests/test_query_backed_models.py

Cover the MCP write paths (`create_model(query=...)` and
`edit_model(source_queries=...)`) end-to-end: assert the persisted
model has `backing_query_sql` populated. Read paths no longer warm
the cache after #74, so a regression in slayer/mcp/server.py that
bypassed `engine.save_model` would otherwise silently leave the
cache empty without any test catching it.

Addresses CodeRabbit nitpick on PR #79.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@sonarqubecloud
Copy link
Copy Markdown

sonarqubecloud Bot commented May 3, 2026

@ZmeiGorynych ZmeiGorynych merged commit 84da94a into main May 3, 2026
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant