S1: schema versioning + read-time migration chain (#51)#57
Conversation
Foundational, additive prep for the upcoming S3 schema redesign. Every persisted entity (SlayerModel, SlayerQuery, DatasourceConfig) now carries a version: int field, and a per-entity converter registry runs older dicts through a chain of pure dict->dict transforms before Pydantic validates them. The chain is empty today (current schema is v1), so this is a no-op functionally — but it puts the rails in place so S3 can ship without breaking on-disk files. The migration hook lives on the Pydantic class itself (@model_validator(mode="before")), not inside each storage backend's get_* method. As a result, every call site that does Model.model_validate(dict) — YAMLStorage, SQLiteStorage, third-party backends registered via register_storage(), the HTTP API, the MCP server, and the dbt importer — picks up migrations transparently with zero backend-side changes. - New module slayer/storage/migrations.py with the registry, the migrate() walker, and CURRENT_VERSIONS per entity. - Forward tolerance: dicts with a version higher than this SLayer knows about pass through untouched (Pydantic's default extra="ignore" handles unknown fields). - tests/test_migrations.py covers v1->v1 no-op, synthetic v0->v1 chains, forward tolerance, end-to-end migration through both YAMLStorage and SQLiteStorage, and round-trip version preservation. - docs/concepts/models.md gets a new Schema versioning section and a version row in the model fields table; CLAUDE.md gains a Key Conventions bullet pointing at it. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
📝 WalkthroughWalkthroughAdds schema versioning for persisted entities (SlayerModel, SlayerQuery, DatasourceConfig), a migrations registry module, and Pydantic pre-validation hooks that run migrations on load; documentation and tests for migration behaviors and storage round-trips are included. Changes
Sequence Diagram(s)sequenceDiagram
participant Client
participant Validator as Pydantic Validator<br/>(mode="before")
participant Migration as Migration System<br/>(slayer/storage/migrations.py)
participant Model as Pydantic Model
participant Storage
rect rgba(100,150,255,0.5)
Note over Client,Storage: Load flow (YAML/SQLite/API)
Client->>Storage: Request persisted payload
Storage-->>Validator: Raw dict input (version=N or missing)
Validator->>Migration: migrate(entity, data)
Migration->>Migration: Read data["version"] (default 1)
loop Until current version reached
Migration->>Migration: Apply converter vN → vN+1
Migration->>Migration: Set data["version"]=vN+1
end
Migration-->>Validator: Migrated dict (version=current)
Validator->>Model: Pydantic validation/create instance
Model-->>Client: Validated model instance
end
rect rgba(100,255,150,0.5)
Note over Client,Storage: Save flow
Client->>Model: Model instance to persist
Model->>Storage: Serialize with `version` included
Storage-->>Client: Persisted (YAML/SQLite contains version)
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related issues
Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Review rate limit: 4/5 reviews remaining, refill in 12 minutes. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
slayer/core/query.py (1)
306-309: Prefer keyword arguments in migration hook call.Use named args when calling
_migrate_schemahere as well, matching repository conventions.Proposed refactor
def _apply_schema_migrations(cls, data: Any) -> Any: - return _migrate_schema("SlayerQuery", data) + return _migrate_schema(entity="SlayerQuery", data=data)As per coding guidelines:
Use keyword arguments for functions with more than 1 parameter.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@slayer/core/query.py` around lines 306 - 309, The call in the model validator _apply_schema_migrations uses positional args for _migrate_schema; update the call to use keyword args to match repo conventions (pass name="SlayerQuery" and data=data) while keeping the method signature and return value the same so _apply_schema_migrations still returns the migrated data.slayer/core/models.py (1)
219-223: Use keyword arguments for_migrate_schema(...)calls.Both migration hook calls pass two arguments positionally; switch to keyword args for consistency/readability with project conventions.
Proposed refactor
def _apply_schema_migrations(cls, data: Any) -> Any: - return _migrate_schema("SlayerModel", data) + return _migrate_schema(entity="SlayerModel", data=data) @@ def _apply_schema_migrations_and_aliases(cls, data: Any) -> Any: - data = _migrate_schema("DatasourceConfig", data) + data = _migrate_schema(entity="DatasourceConfig", data=data)As per coding guidelines:
Use keyword arguments for functions with more than 1 parameter.Also applies to: 298-300
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@slayer/core/models.py` around lines 219 - 223, The _apply_schema_migrations validator is calling _migrate_schema with positional args; change that call to use keyword arguments (e.g., pass model_name="SlayerModel" and data=data) for readability and consistency with project conventions, and make the same change for the other migration hook that currently calls _migrate_schema positionally (the one around lines where the second validator is defined) so both usages of _migrate_schema use explicit keyword parameters.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@slayer/storage/migrations.py`:
- Around line 61-77: The migrate() function mutates the caller-supplied dict
(e.g., via setdefault), so make a shallow copy of the input data immediately on
entry (e.g., data = dict(data) or data.copy()) and perform all subsequent
reads/writes against that copy; keep the rest of the logic (version lookup via
CURRENT_VERSIONS, migration loop using _REGISTRY, calling fn, updating current
and data["version"]) unchanged so callers’ payloads are not mutated.
---
Nitpick comments:
In `@slayer/core/models.py`:
- Around line 219-223: The _apply_schema_migrations validator is calling
_migrate_schema with positional args; change that call to use keyword arguments
(e.g., pass model_name="SlayerModel" and data=data) for readability and
consistency with project conventions, and make the same change for the other
migration hook that currently calls _migrate_schema positionally (the one around
lines where the second validator is defined) so both usages of _migrate_schema
use explicit keyword parameters.
In `@slayer/core/query.py`:
- Around line 306-309: The call in the model validator _apply_schema_migrations
uses positional args for _migrate_schema; update the call to use keyword args to
match repo conventions (pass name="SlayerQuery" and data=data) while keeping the
method signature and return value the same so _apply_schema_migrations still
returns the migrated data.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: d233dfd5-a4e9-4784-8ea4-22cc7e57ec2b
📒 Files selected for processing (6)
CLAUDE.mddocs/concepts/models.mdslayer/core/models.pyslayer/core/query.pyslayer/storage/migrations.pytests/test_migrations.py
Avoid mutating caller-provided dicts in migrate() by copying on entry, and switch _migrate_schema() call sites to keyword arguments per repo convention. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
🧹 Nitpick comments (2)
slayer/core/query.py (1)
301-301: Bind query default version to the central version registry.Line [301] hardcodes
1, which can silently drift fromCURRENT_VERSIONS["SlayerQuery"]after a future schema bump.Proposed change
-from slayer.storage.migrations import migrate as _migrate_schema +from slayer.storage.migrations import CURRENT_VERSIONS as _CURRENT_VERSIONS +from slayer.storage.migrations import migrate as _migrate_schema @@ - version: int = 1 + version: int = _CURRENT_VERSIONS["SlayerQuery"]🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@slayer/core/query.py` at line 301, The hardcoded default version "version: int = 1" should be bound to the central registry so it won't drift; change the default to use CURRENT_VERSIONS["SlayerQuery"] (ensure CURRENT_VERSIONS is imported or referenced where defined) and keep the type as int (e.g., version: int = CURRENT_VERSIONS["SlayerQuery"]). Update any import or module-qualified reference needed so the symbol resolves in slayer/core/query.py.slayer/storage/migrations.py (1)
27-49: Fail fast on invalid migration registrations.
register_migration()should reject unknown entities and invalid source versions at registration time (instead of failing later duringmigrate()).Proposed change
def register_migration( entity: str, source_version: int ) -> Callable[[Callable[[dict], dict]], Callable[[dict], dict]]: @@ + if entity not in CURRENT_VERSIONS: + raise KeyError(f"Unknown entity '{entity}' in register_migration()") + if source_version < 1: + raise ValueError("source_version must be >= 1") + def deco(fn: Callable[[dict], dict]) -> Callable[[dict], dict]: key = (entity, source_version)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@slayer/storage/migrations.py` around lines 27 - 49, register_migration currently defers validation until migrate() — update it to fail fast: inside deco (before the duplicate check) validate that entity is a known entity (e.g. exists in your canonical list like VALID_ENTITIES or _ENTITY_SCHEMAS) and that source_version is an integer >= 0 and within the allowed range (use the module's current/latest version mapping such as _LATEST_VERSIONS.get(entity) or add a VALID_ENTITIES/LATEST map if missing) and raise ValueError for unknown entities or out-of-range/invalid source_version; keep the existing duplicate-registration check on _REGISTRY afterwards.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@slayer/core/query.py`:
- Line 301: The hardcoded default version "version: int = 1" should be bound to
the central registry so it won't drift; change the default to use
CURRENT_VERSIONS["SlayerQuery"] (ensure CURRENT_VERSIONS is imported or
referenced where defined) and keep the type as int (e.g., version: int =
CURRENT_VERSIONS["SlayerQuery"]). Update any import or module-qualified
reference needed so the symbol resolves in slayer/core/query.py.
In `@slayer/storage/migrations.py`:
- Around line 27-49: register_migration currently defers validation until
migrate() — update it to fail fast: inside deco (before the duplicate check)
validate that entity is a known entity (e.g. exists in your canonical list like
VALID_ENTITIES or _ENTITY_SCHEMAS) and that source_version is an integer >= 0
and within the allowed range (use the module's current/latest version mapping
such as _LATEST_VERSIONS.get(entity) or add a VALID_ENTITIES/LATEST map if
missing) and raise ValueError for unknown entities or out-of-range/invalid
source_version; keep the existing duplicate-registration check on _REGISTRY
afterwards.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 1d4905ea-bf70-47fa-9f77-e69422f31e37
📒 Files selected for processing (4)
slayer/core/models.pyslayer/core/query.pyslayer/storage/migrations.pytests/test_migrations.py
🚧 Files skipped from review as they are similar to previous changes (2)
- slayer/core/models.py
- tests/test_migrations.py
Summary
version: int = 1field to every persisted entity (SlayerModel,SlayerQuery,DatasourceConfig) and a converter chain inslayer/storage/migrations.pythat upgrades older dicts before Pydantic validates them.@model_validator(mode="before")), not on individual storage backends. Every call site that doesModel.model_validate(dict)—YAMLStorage,SQLiteStorage, third-party backends registered viaregister_storage(), plus the HTTP API, MCP server, and dbt importer — picks up migrations transparently with zero backend-side changes.Design notes
get_*method. That would silently miss third-party backends and other entry points; instead the chain runs at the Pydantic-validation layer, which is the single point every persisted dict flows through. This is the same pattern as the existingDatasourceConfig._accept_user_alias(user→usernamerename), generalised.CURRENT_VERSIONSso each schema can evolve independently.versionthan this SLayer knows about pass through untouched (Pydantic's defaultextra="ignore"handles unknown fields).versiondefaults toCURRENT_VERSIONS[entity], somodel_dump(mode="json", exclude_none=True)always emits the current value.Test plan
tests/test_migrations.py— 17 new tests covering: v1→v1 no-op, synthetic v0→v1 chain ordering, forward tolerance, duplicate-registration rejection, end-to-end migration through bothYAMLStorageandSQLiteStorage, round-trip version preservation, and the existinguser→usernamealias surviving the new before-validator.poetry run pytest --ignore=tests/integration→ 944 passed.poetry run ruff check slayer/ tests/→ clean.docs/concepts/models.mdSchema versioning section reads correctly and the newversionrow in the model fields reference is in the right place.🤖 Generated with Claude Code
Summary by CodeRabbit
New Features
Documentation
Tests