Summary
Pydantic is currently installed but completely unused: all models are plain @dataclass classes. This issue tracks adopting Pydantic v2 for the key models to get proper validation, serialization, and future ontology support.
Why Pydantic?
- Validation: Type validation at runtime for API inputs and data loading
- Serialization: Clean JSON/dict serialization for future needs
- Schema generation: OpenAPI/JSON schema for tooling
- IDE support: Better autocomplete and type hints
- Future-proofing: Essential for ontology and classful object handling
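To make the first three points concrete, here is a minimal sketch of what a Pydantic v2 model provides out of the box. `NodeRef` and its fields are hypothetical and exist only for illustration; they are not taken from the GraphForge codebase.

```python
from pydantic import BaseModel, ValidationError


class NodeRef(BaseModel):
    """Hypothetical model used only for illustration."""

    id: int
    labels: list[str] = []


# Validation: input that cannot be coerced to the declared type is rejected.
error_count = 0
try:
    NodeRef(id="not-a-number")
except ValidationError as exc:
    error_count = exc.error_count()

# Serialization: plain dicts and JSON strings.
node = NodeRef(id=1, labels=["Person"])
as_dict = node.model_dump()
as_json = node.model_dump_json()

# Schema generation: a JSON Schema dict that external tooling can consume.
schema = NodeRef.model_json_schema()
```

An equivalent @dataclass would accept `id="not-a-number"` silently, which is the gap this issue is meant to close.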
Scope
Convert key models to Pydantic v2 BaseModel:
- AST nodes (src/graphforge/ast/):
  - Expression nodes with field validators
  - Clause nodes with model validators
  - Pattern nodes with constraint validation
- Planner operators (src/graphforge/planner/operators.py):
  - Add validators for operator constraints
  - Validate operator composition rules
- API inputs (src/graphforge/api.py):
  - Validate query strings
  - Validate transaction state
- Dataset metadata (src/graphforge/datasets/base.py):
  - DatasetInfo with URL validation
  - Size constraints
- Storage models:
  - Node/edge property validation
  - Label/type validation
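As a rough sketch of the validator styles listed above (field constraints, field validators, and model validators), the dataset metadata item might look something like the following. The field names (`name`, `url`, `size_mb`, `nodes`, `edges`) and the specific rules are assumptions for illustration; the actual DatasetInfo in src/graphforge/datasets/base.py may be shaped differently.

```python
from pydantic import (
    BaseModel,
    Field,
    HttpUrl,
    ValidationError,
    field_validator,
    model_validator,
)


class DatasetInfo(BaseModel):
    # All field names and limits here are assumptions, not the real schema.
    name: str
    url: HttpUrl                  # URL validation via a built-in type
    size_mb: float = Field(gt=0)  # size constraint
    nodes: int = Field(ge=0)
    edges: int = Field(ge=0)

    @field_validator("name")
    @classmethod
    def name_not_blank(cls, value: str) -> str:
        # Field validator: checks and normalizes a single field.
        if not value.strip():
            raise ValueError("name must not be blank")
        return value.strip()

    @model_validator(mode="after")
    def edges_require_nodes(self) -> "DatasetInfo":
        # Model validator: checks a rule that spans several fields.
        if self.edges > 0 and self.nodes == 0:
            raise ValueError("a dataset with edges must have nodes")
        return self


info = DatasetInfo(
    name=" demo ", url="https://example.com/demo.zip",
    size_mb=1.5, nodes=10, edges=20,
)

bad_fields = []
try:
    DatasetInfo(name="demo", url="not a url", size_mb=1.5, nodes=10, edges=20)
except ValidationError as exc:
    # Each error records the location of the offending field.
    bad_fields = [err["loc"][0] for err in exc.errors()]
```

The same pattern would presumably carry over to the AST and planner items: single-field rules as field validators, and composition rules that span fields as model validators.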
Implementation Checklist
Testing
- Unit tests for each validator (target: 100% coverage)
- Integration tests for API validation
- Error message testing
- Performance baseline (ensure no significant overhead)
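One possible shape for the validator unit tests and error message tests is sketched below. `Limit` is a hypothetical stand-in for a validated model, and the test functions use plain `assert` so they can be collected by pytest (assumed to be the project's test runner) without further setup.

```python
from pydantic import BaseModel, Field, ValidationError


class Limit(BaseModel):
    """Hypothetical stand-in for a validated model; not from the codebase."""

    count: int = Field(ge=0)


def validation_errors(**data) -> list:
    """Return pydantic's structured error list, or [] if the data is valid."""
    try:
        Limit(**data)
    except ValidationError as exc:
        return exc.errors()
    return []


def test_negative_count_is_rejected():
    errors = validation_errors(count=-1)
    assert len(errors) == 1
    # Assert on the structured fields rather than the full message text,
    # which is more likely to change between pydantic releases.
    assert errors[0]["loc"] == ("count",)
    assert errors[0]["type"] == "greater_than_equal"


def test_valid_count_is_accepted():
    assert validation_errors(count=3) == []
```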
Estimated Effort
20-30 hours
Dependencies
pydantic>=2.6 (already in pyproject.toml)
Success Criteria
Part of larger LDBC dataset integration effort (#51).