feat(virtual-entities): support plain BaseModel roots end-to-end (#87) #89
Merged
Three orthogonal capabilities close Issue #87: they make plain
pydantic.BaseModel subclasses first-class participants in NexusX
resolution and ER visualization, eliminating the _subset_registry hack.

1. ErManager.add_virtual_entities([...])

New method on ErManager (loader/registry.py). Registers plain BaseModel
classes as virtual entities in _registry. Validates:
- entry must be a BaseModel subclass (TypeError otherwise)
- entry must NOT be a SQLModel subclass (TypeError; SQLModel goes
  through __init__'s entities= / base=)
- duplicate registration rejected (ValueError)
- calling after first create_resolver() rejected (RuntimeError;
  registry is frozen at that point)

2. DefineSubset source widening

DefineSubset.__subset__'s source element widens from type[SQLModel] to
type[BaseModel] (subset.py). "Subset" means schema subset (selection of
model_fields); both SQLModel and BaseModel have well-defined schemas.
The difference is purely about data provisioning: SQLModel via ORM
(_orm_to_dto invoked), BaseModel via other channels (constructed
directly by user).

3. Resolver unified source-resolution

_scan_auto_load_fields and _get_loader gain a unified fallback
(resolver.py, ~15 LOC): when get_subset_source(node_type) returns None,
check whether node_type itself is registered in _registry (i.e., a
plain BaseModel virtual root). The downstream
_registry.get_relationships(source) is source-type-agnostic.

ER/Voyager:
- ErDiagram.from_er_manager(er) classmethod (er_diagram.py) includes
  both SQLModel and virtual entities. Refactored shared _build() path
  skips sa_inspect() on BaseModel classes.
- ErDiagramDotBuilder already iterates _registry uniformly; virtual
  entities appear as schema nodes automatically.

Type widening only (relationship.py):
- get_custom_relationships(entity: type[SQLModel]) → (entity: type)
- Relationship.target_entity -> type[SQLModel] → type

Tests: +35 (1060 passed, 6 skipped; 0 regressions)
- tests/test_virtual_entities.py (22 tests): API contract (9),
  capability parity (9), regression invariants (4)
- tests/test_definesubset_basemodel.py (6 tests): schema subsetting,
  DTO + registered source → auto-load fires
- tests/test_virtual_entities_er.py (7 tests): ER diagram + Voyager DOT
  rendering with mixed SQLModel + virtual entities

DefineSubset's public API shape is unchanged. ErManager.__init__'s
base=/entities= requirement is unchanged. All 1025 prior tests still
pass without modification.

Design artifacts in specs/004-non-sqlmodel-roots/:
- spec.md (17 FRs, 9 edge cases, 3 user stories)
- plan.md / research.md (R1–R8 decisions recorded)
- contracts/api.md (Contracts 1–6)
- quickstart.md (11 scenarios + Coverage Matrix)
- tasks.md (36 tasks, all [X])

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
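
The four validation rules for add_virtual_entities can be illustrated with a small stand-alone sketch. This is not the NexusX implementation: SketchErManager, the stand-in BaseModel/SQLModel classes, the _frozen flag, and the outcome() helper are all hypothetical names introduced so the example runs without pydantic or sqlmodel installed. Only the exception types and the order of the rules come from the commit message above.

```python
# Stand-in base classes so the sketch runs without pydantic/sqlmodel.
class BaseModel: ...
class SQLModel(BaseModel): ...


class SketchErManager:
    """Hypothetical stand-in for ErManager; models only the validation rules."""

    def __init__(self):
        self._registry = {}   # entity class -> relationships (unused here)
        self._frozen = False  # assumption: flipped by the first create_resolver()

    def add_virtual_entities(self, entities):
        if self._frozen:
            raise RuntimeError("registry is frozen after create_resolver()")
        for cls in entities:
            if not (isinstance(cls, type) and issubclass(cls, BaseModel)):
                raise TypeError(f"{cls!r} is not a BaseModel subclass")
            if issubclass(cls, SQLModel):
                raise TypeError(f"{cls.__name__} is a SQLModel; use entities=/base=")
            if cls in self._registry:
                raise ValueError(f"{cls.__name__} is already registered")
            self._registry[cls] = []

    def create_resolver(self):
        self._frozen = True
        return object()  # placeholder for a real resolver


def outcome(manager, entities):
    """Return 'ok', or the name of the exception raised by registration."""
    try:
        manager.add_virtual_entities(entities)
        return "ok"
    except (TypeError, ValueError, RuntimeError) as exc:
        return type(exc).__name__


class Report(BaseModel): ...
class User(SQLModel): ...
class Late(BaseModel): ...

er = SketchErManager()
results = [
    outcome(er, [Report]),  # plain BaseModel: accepted
    outcome(er, [int]),     # not a BaseModel
    outcome(er, [User]),    # SQLModel must go through __init__
    outcome(er, [Report]),  # duplicate registration
]
er.create_resolver()
results.append(outcome(er, [Late]))  # registry already frozen
```

In this sketch a batch is validated entry by entry; whether the real method rejects a whole batch atomically is not stated in the commit message.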
…on + official docs
Code:
- DOT/Voyager rendering now visually distinguishes virtual entities via
  yellow fill (#FFF9C4), «virtual» stereotype prefix, and a dashed
  cluster_virtual group (FR-009, Contract 3). The distinction is driven
  by the is_virtual flag on ErDiagram.EntityInfo and voyager.SchemaNode.
- _orm_to_dto docstring clarifies BaseModel sources bypass it (T042).
Tests:
- TestVoyagerDotBuilderVisualDistinction asserts cluster_virtual,
«virtual» stereotype, FFF9C4 fill, and absence on SQLModel nodes.
- Strengthened zero-virtual regression: functional equivalence of
from_er_manager vs from_sqlmodel + DOT has no cluster_virtual when
no virtuals are registered.
Docs:
- New bilingual docs/guide/virtual_entities{,.zh}.md covering the
add_virtual_entities API + DefineSubset BaseModel widening + the
_subset_registry → official API migration.
- Migration sections added to docs/reference/migration{,.zh}.md.
- Index rows in docs/index{,.zh}.md.
Spec alignment:
- contracts/api.md Contract 3 + data-model.md updated to reflect the
chosen unified entities + is_virtual design (rather than a separate
virtual_entities field).
- quickstart.md S6 rewritten with the real API (from_er_manager +
ErDiagramDotBuilder) and the actual visual tokens.
- tasks.md Phase 7 + Phase 8 Convergence sections tracked.
Tests: 1066 passed, 6 skipped. Ruff src/: clean.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
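
As a rough illustration of the three visual tokens named above (fill #FFF9C4, the «virtual» stereotype prefix, and the dashed cluster_virtual group), the following sketch emits DOT text from a list of (name, is_virtual) pairs. The functions node_line and render_dot and the exact attribute layout are assumptions made for this example; the real ErDiagramDotBuilder output may differ in everything except those tokens.

```python
VIRTUAL_FILL = "#FFF9C4"  # yellow fill named in the commit message


def node_line(name, is_virtual):
    """Render one DOT node; virtual nodes get the stereotype prefix and fill."""
    if is_virtual:
        return (f'"{name}" [label="«virtual» {name}", '
                f'style=filled, fillcolor="{VIRTUAL_FILL}"];')
    return f'"{name}" [label="{name}"];'


def render_dot(entities):
    """entities: list of (name, is_virtual) pairs. Returns DOT source text."""
    lines = ["digraph er {", "  node [shape=box];"]
    for name, is_virtual in entities:
        if not is_virtual:
            lines.append("  " + node_line(name, False))
    virtual = [name for name, is_virtual in entities if is_virtual]
    if virtual:  # no cluster at all when nothing virtual is registered
        lines.append("  subgraph cluster_virtual {")
        lines.append("    style=dashed;")
        for name in virtual:
            lines.append("    " + node_line(name, True))
        lines.append("  }")
    lines.append("}")
    return "\n".join(lines)


mixed = render_dot([("User", False), ("Report", True)])
plain = render_dot([("User", False)])
```

The `if virtual:` guard mirrors the zero-virtual regression described in the Tests section: a diagram with no virtual entities should carry no cluster_virtual group.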
…rce path

Six targeted cleanups after the convergence pass:

1. loader/registry.py: `get_relationships` and `get_all_entities`
   signatures widen from `type[SQLModel]` to `type[BaseModel]`. The
   registry already holds both kinds; the annotations were stale.

2. er_diagram.py: `from_sqlmodel` now raises TypeError up front when a
   non-SQLModel class is passed. Pre-feature code would have crashed
   later inside sa_inspect with NoInspectionAvailable; the silent
   "produce an empty shell" behavior introduced during #87 was neither
   safe nor documented. Error message points at `from_er_manager`.

3. resolver.py: extract `_resolve_source(node_type)` single helper.
   `_get_loader` and `_scan_auto_load_fields` previously each carried
   the same 4-line FR-017 fallback inline; both now call the helper.
   The unified principle is now enforceable in one place.

4. relationship.py: new `is_virtual_entity(cls)` helper.
   `er_diagram.py` and `voyager/er_diagram_dot.py` previously each
   computed `not issubclass(cls, SQLModel)` inline with subtle
   variations; both now call the canonical definition.

5. tests/test_virtual_entities.py: new `TestUnifiedSourceResolution`
   with 4 focused unit tests on `_resolve_source`: registered virtual
   root → self; DefineSubset DTO → source; unregistered BaseModel →
   None; consistent across both call sites. Replaces indirect coverage.

6. er_diagram.py: remove the unrequested `%% virtual non-SQLModel root`
   line from `to_mermaid()`. Contract 3 only specifies DOT visual
   distinction; Mermaid comment syntax in erDiagram is questionable;
   the line was never tested. Output regresses to minimum.

Tests: 1070 passed, 6 skipped (+4 new unit tests, 0 regressions).
Ruff src/: clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
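
The three outcomes that the `_resolve_source` unit tests pin (registered virtual root → self; DefineSubset DTO → source; unregistered BaseModel → None) can be sketched as a single function. This is a simplified stand-in, not the resolver code: the free function `resolve_source`, the `subset_sources` dict (standing in for `get_subset_source`), and the plain-dict `registry` are assumptions for illustration.

```python
def resolve_source(node_type, subset_sources, registry):
    """Return the class whose relationships apply to node_type, or None.

    subset_sources: dict mapping a DTO class to its declared source class
                    (hypothetical stand-in for get_subset_source()).
    registry:       dict keyed by registered entity classes
                    (hypothetical stand-in for _registry).
    """
    source = subset_sources.get(node_type)
    if source is not None:
        return source      # DefineSubset DTO -> its source
    if node_type in registry:
        return node_type   # registered virtual root -> itself
    return None            # unregistered class -> no auto-load


class OrderSource: ...   # a registered source entity
class OrderDTO: ...      # a DefineSubset-style DTO over OrderSource
class Order: ...         # a registered virtual root
class Stray: ...         # never registered

subset_sources = {OrderDTO: OrderSource}
registry = {OrderSource: [], Order: []}

resolved = {
    "dto": resolve_source(OrderDTO, subset_sources, registry),
    "virtual_root": resolve_source(Order, subset_sources, registry),
    "unregistered": resolve_source(Stray, subset_sources, registry),
}
```

Keeping the fallback in one function is what lets both call sites agree by construction, which is the point of cleanup 3.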
…ared DTO type
When a custom-relationship loader returned SQLModel rows (or any BaseModel
that wasn't the field's declared DTO type), the resolver used the output
as-is. The check was ``isinstance(r, BaseModel)`` — and since SQLModel
inherits from BaseModel, source rows qualified as "already converted",
silently skipping projection. The field annotation ``list[AgentDTO]``
was a lie at runtime: it held ``_Agent`` instances, schema projection
was lost, and ``model_dump()`` leaked SQLModel-only fields.
Fix: tighten the check to ``isinstance(r, dto_cls)``. The resolver now
trusts loader output only when it actually matches the declared DTO
type (or a subclass). Anything else goes through ``_orm_to_dto`` —
the same conversion the SQLModel ORM-relationship path uses.
Behavior change for downstream users:
| Loader returns                 | Before      | After          |
| ------------------------------ | ----------- | -------------- |
| dto_cls instance               | as-is       | as-is          |
| dto_cls subclass instance      | as-is       | as-is          |
| SQLModel source of dto_cls     | as-is (BUG) | _orm_to_dto    |
| Unrelated BaseModel class      | as-is (BUG) | model_validate |
| dict / ORM row / non-BaseModel | _orm_to_dto | _orm_to_dto    |
The "unrelated BaseModel class" case used to silently work via duck
typing; it now correctly raises (pydantic v2 model_validate refuses
cross-class). Callers relying on that shape must either declare the
actual return type as the field type, or have the loader return the
declared type.
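
A minimal sketch of why the old check let source rows through, using stand-in classes so it runs without pydantic or sqlmodel. The names ``convert_old``, ``convert_new``, ``orm_to_dto``, ``Agent`` and the ``fields`` tuple are hypothetical; the sketch covers only the "SQLModel source of dto_cls" and "dto_cls instance" rows of the table and does not model the ``model_validate`` branch.

```python
# Stand-ins: in the real stack SQLModel inherits from pydantic's BaseModel.
class BaseModel: ...
class SQLModel(BaseModel): ...


class Agent(SQLModel):
    """Source row; 'secret' is a field the DTO does not declare."""
    def __init__(self, id, name, secret):
        self.id, self.name, self.secret = id, name, secret


class AgentDTO(BaseModel):
    fields = ("id", "name")  # hypothetical declared schema subset
    def __init__(self, id, name):
        self.id, self.name = id, name


def orm_to_dto(row, dto_cls):
    """Stand-in projection: copy only the fields the DTO declares."""
    return dto_cls(**{f: getattr(row, f) for f in dto_cls.fields})


def convert_old(r, dto_cls):
    # Old check: any BaseModel counts as "already converted".
    return r if isinstance(r, BaseModel) else orm_to_dto(r, dto_cls)


def convert_new(r, dto_cls):
    # Tightened check: trust only the declared DTO type (or a subclass).
    return r if isinstance(r, dto_cls) else orm_to_dto(r, dto_cls)


row = Agent(1, "alice", "s3cr3t")
old = convert_old(row, AgentDTO)  # the Agent row leaks through unchanged
new = convert_new(row, AgentDTO)  # projected to AgentDTO, 'secret' dropped
dto = AgentDTO(2, "bob")
same = convert_new(dto, AgentDTO)  # already the declared type: kept as-is
```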
Tests:
- New TestCustomRelationshipAutoConversion (4 tests) pins the fix via
Given/When/Then narrative:
* field type matches target → no conversion (baseline)
* field=DTO, target=SQLModel, loader=SQLModel → field holds DTO
* DTO-only field survives model_dump()
* __subset__-excluded field is hasattr=False
- Fixed test_virtual_to_virtual_traversal: the loader used to return a
locally-scoped _Inner class (different from field type Inner) and
passed only via duck typing. Now returns the declared type directly.
Tests: 1074 passed, 6 skipped (+1 new test class, 0 regressions).
Ruff src/: clean.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… source

A DTO sourcing from a plain BaseModel had a silent failure mode: if the
user excluded the fk field from __subset__, the relationship load
returned empty without any error. The loader was never called because
``getattr(dto, fk_field, None)`` returned None.

Root cause: the existing auto-include logic in ``_resolve_subset_info``
only knew how to detect fk fields from SQLAlchemy metadata
(``Field(primary_key=True)`` / ``Field(foreign_key=...)``). Plain
BaseModel sources don't have that metadata, so detection returned []
and auto-include never fired.

Fix: add a third auto-include pass that reads fk field names directly
from ``__relationships__`` declarations on the source. The user wrote
``Relationship(fk="key", ...)``; that is the BaseModel-source
equivalent of ``Field(foreign_key=...)``, and the framework now treats
it the same way.

Behavior parity with SQLModel FK auto-include:

| User __subset__     | model_fields | model_dump | relationship load         |
| ------------------- | ------------ | ---------- | ------------------------- |
| excludes fk         | auto-adds fk | excludes   | works                     |
| lists fk explicitly | as-is        | includes   | works                     |
| omit_fields=[fk]    | absent       | n/a        | silent fail (user choice) |

The fix is symmetric for SQLModel sources too: it covers the case where
``__relationships__`` uses a non-FK column as fk (previously silent
fail; now auto-included).

Tests: new TestBaseModelSourceFkAutoInclude (5 tests) pins the four
branches + end-to-end resolve. 1079 passed, 6 skipped (+5 new, 0
regressions). Ruff src/: clean.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
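
The parity table can be read as a small pure function over field names. The sketch below is a simplification written for illustration, not the ``_resolve_subset_info`` code: ``effective_fields``, the stand-in ``Relationship`` dataclass (only ``fk`` comes from the commit message; ``name`` is an assumed extra), and the way ``omit_fields`` is passed are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """Stand-in declaration; only the fk field name matters here."""
    fk: str
    name: str  # assumed extra attribute, not taken from the source


def effective_fields(subset_fields, relationships, omit_fields=()):
    """Return (model_fields, dump_fields) after the fk auto-include pass."""
    model_fields = list(subset_fields)
    auto_added = []
    for rel in relationships:
        if rel.fk in omit_fields:
            continue  # explicit opt-out: the fk stays absent
        if rel.fk not in model_fields:
            model_fields.append(rel.fk)  # present so the loader can read it
            auto_added.append(rel.fk)    # but hidden from model_dump
    dump_fields = [f for f in model_fields if f not in auto_added]
    return model_fields, dump_fields


rels = [Relationship(fk="owner_id", name="owner")]

excluded = effective_fields(["id", "title"], rels)
explicit = effective_fields(["id", "owner_id"], rels)
omitted = effective_fields(["id", "title"], rels, omit_fields=["owner_id"])
```

The three results correspond to the three table rows: auto-added but hidden from the dump, listed explicitly and dumped, and absent by user choice.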
… (Edge B)

Spec Edge Case B requires a clear error when a plain BaseModel with
__relationships__ is resolved without add_virtual_entities. Previously
_resolve_source silently returned None, skipping auto-load.

Also fixes the _orm_to_dto docstring to describe the actual call
condition (loader output projection) rather than the misleading "only
for SQLModel sources".

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
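
One way to picture the Edge B change is a guard that fails loudly instead of returning None when an unregistered class declares relationships. The commit message does not say which exception type is raised, so UnregisteredVirtualEntityError, the function name, and the message wording below are assumptions; the dict arguments are stand-ins for the resolver's internal lookups.

```python
class UnregisteredVirtualEntityError(LookupError):
    """Hypothetical error type; the real exception class is not stated."""


def resolve_source_strict(node_type, subset_sources, registry):
    """Like a plain source lookup, but raises on the Edge B situation."""
    source = subset_sources.get(node_type)
    if source is not None:
        return source
    if node_type in registry:
        return node_type
    if getattr(node_type, "__relationships__", None):
        # Declares relationships but was never registered: fail loudly.
        raise UnregisteredVirtualEntityError(
            f"{node_type.__name__} declares __relationships__ but is not "
            "registered; call add_virtual_entities([...]) first"
        )
    return None  # nothing to auto-load, nothing to complain about


class Invoice:
    __relationships__ = ["customer"]  # placeholder declaration

class PlainNote: ...  # no relationships declared

try:
    resolve_source_strict(Invoice, {}, {})
    edge_b = "no error"
except UnregisteredVirtualEntityError as exc:
    edge_b = str(exc)

no_relationships = resolve_source_strict(PlainNote, {}, {})
registered = resolve_source_strict(Invoice, {}, {Invoice: []})
```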
…angelog

The public API surface for non-SQLModel roots was missing from the API
reference (api_core.md) and had no changelog entry. Adds the method
contract + validation table to api_core, widens the DefineSubset note
to mention BaseModel sources, and adds an Unreleased changelog section
covering add_virtual_entities / DefineSubset widening / ER visual
distinction / unified source resolution.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary

Three orthogonal capabilities close Issue #87: they make plain `pydantic.BaseModel` subclasses first-class participants in NexusX resolution and ER visualization, eliminating the `_subset_registry` hack.

1. `ErManager.add_virtual_entities([...])`: new method on `ErManager`. Registers plain BaseModel classes as virtual entities in `_registry`. Validates: must be a BaseModel (TypeError), must NOT be a SQLModel (TypeError; SQLModel goes through `__init__`), duplicates rejected (ValueError), calls after first `create_resolver()` rejected (RuntimeError; registry frozen).
2. `DefineSubset.__subset__` source widens from `type[SQLModel]` to `type[BaseModel]`. "Subset" = schema subset (selection of `model_fields`); both SQLModel and BaseModel fit. Difference is data provisioning: SQLModel via ORM (`_orm_to_dto` invoked), BaseModel constructed directly by user.
3. `_scan_auto_load_fields` and `_get_loader` gain a unified fallback (~15 LOC): when `get_subset_source(node_type)` returns `None`, check whether `node_type` itself is in `_registry` (plain BaseModel virtual root). Downstream `_registry.get_relationships(source)` is source-type-agnostic.

Plus ER/Voyager rendering: `ErDiagram.from_er_manager(er)` classmethod handles both SQLModel and virtual entities; `ErDiagramDotBuilder` already iterates `_registry` uniformly so virtual entities appear as schema nodes automatically.

Design artifacts (`specs/004-non-sqlmodel-roots/`)

- `spec.md`: 17 FRs, 9 edge cases, 3 user stories
- `plan.md` / `research.md` (R1–R8 decisions recorded)
- `contracts/api.md`: 6 public API contracts
- `quickstart.md`: 11 runnable scenarios + Coverage Matrix
- `tasks.md`: 36 tasks, all `[X]`

Test plan

tasks.md)

Test files:

- `tests/test_virtual_entities.py` (22 tests): Layer 1 API contract (9), Layer 2 capability parity (9), Layer 3 invariants (4)
- `tests/test_definesubset_basemodel.py` (6 tests): schema subsetting, DTO + registered source → auto-load fires
- `tests/test_virtual_entities_er.py` (7 tests): ER diagram + Voyager DOT rendering with mixed SQLModel + virtual entities

Sample test (Layer 1, the most representative):

Backward compatibility

- `ErManager.__init__` signature unchanged (`base=`/`entities=` still required)
- `DefineSubset.__subset__` shape unchanged (just accepts wider source type)
- `tests/test_resolver.py:22-32`); this PR adds the registration mechanism and unified source-resolution

🤖 Generated with Claude Code