Description
🚀 Feature Summary
Replace the current jsonschema dependency with marshmallow for data validation across the Terranova project.
🤔 Problem Statement
The project currently uses jsonschema (v4.23.0+) for JSON schema validation, specifically in the manifest validation logic. While jsonschema is a solid choice for JSON Schema validation, marshmallow offers:
- Better integration with Python dataclasses: The project already uses dataclasses extensively with the `@serde` decorator for data serialization/deserialization
- Simpler validation API: Marshmallow provides a more Pythonic and intuitive interface compared to jsonschema
- Built-in field validation: Direct validation methods tied to specific fields rather than generic schema validation
- Schema composition: Better support for nested and composed schemas, reducing boilerplate
- Error messages: More structured and user-friendly validation error messages
- Dependency consolidation: Potential to leverage marshmallow alongside existing serialization patterns
💡 Proposed Solution
- Identify all jsonschema usages: Currently used in `src/terranova/resources.py` for manifest validation (imports on lines 28-29, usage on lines 186-188)
- Create marshmallow schemas: Convert existing JSON schemas to marshmallow schema definitions, maintaining the same validation rules
- Update validation logic: Replace `jsonschema.validators.validate()` calls with marshmallow's load/validate methods
- Update dependencies:
  - Remove `jsonschema>=4.23.0` from `pyproject.toml`
  - Add `marshmallow>=3.20.0` (or latest stable version)
- Update imports: Change imports from jsonschema to marshmallow in affected files
- Maintain backward compatibility: Ensure validation behavior remains identical - same validation rules, same error handling
- Add tests: Ensure existing test coverage covers all validation scenarios
🔄 Alternatives Considered
- Keep jsonschema: Maintain status quo, but misses opportunity for better integration with existing code patterns
- Use Pydantic: Another popular validation library, but would require more significant refactoring
- Use attrs: Similar to dataclasses but adds significant complexity without clear benefit
📈 Impact
- Code simplification: Replace one validation dependency with another that aligns validation with the project's dataclass usage patterns
- Improved maintainability: More consistent with project architecture (already using serde and dataclasses)
- Better error handling: More detailed and structured validation errors
- Future-proofing: Marshmallow integrates better with modern Python ecosystem tools
📝 Acceptance Criteria
- All jsonschema imports removed from codebase
- Marshmallow schemas created for all validation scenarios
- Manifest validation works identically to current implementation
- All existing tests pass without modification
- New tests added for marshmallow validation
- Dependencies updated in pyproject.toml
- Documentation updated if needed
- No breaking changes to public APIs
📝 Additional Context
Current Usage:
- File: `src/terranova/resources.py`
- Method: `ResourcesManifest.load()`
- Validation: JSON schema validation of manifest data
- Exception Handling: `ValidationError` from `jsonschema.exceptions`
Implementation Notes:
- The project uses `pyserde` for serialization/deserialization
- Dataclasses are decorated with `@serde`
- Error handling should preserve exception types or map them appropriately
- Consider potential performance implications of the migration
Related: This change should be coordinated with the existing serialization strategy using the @serde decorator and pyserde package.