feat: add model serialization with input types and constraint types by Kamilbenkirane · Pull Request #103 · withceleste/celeste-python

Kamilbenkirane · 2025-12-26T12:08:44Z

Summary

Add type computed_field to Constraint for JSON serialization
Add InputType enum (text, image, video, audio) to core
Add supported_input_types computed field to Model (per-capability)
Add optional_input_types computed field to Model (from constraints)
Add get_required_input_types() and get_constraint_input_type() helpers
Use SerializeAsAny[Constraint] for proper polymorphic serialization

This enables the API to return complete model metadata including:

What input types each capability requires (e.g., text-generation: text)
What optional inputs are accepted via parameters (e.g., reference_images)
Constraint type identifiers for frontend rendering

Test plan

All existing tests pass
CI passes (lint, format, typecheck, security, coverage)

- Add `type` computed_field to Constraint for JSON serialization - Add `InputType` enum (text, image, video, audio) to core - Add `supported_input_types` computed field to Model (per-capability) - Add `optional_input_types` computed field to Model (from constraints) - Add `get_required_input_types()` and `get_constraint_input_type()` helpers - Use `SerializeAsAny[Constraint]` for proper polymorphic serialization This enables the API to return complete model metadata including: - What input types each capability requires (e.g., text-generation: text) - What optional inputs are accepted via parameters (e.g., reference_images) - Constraint type identifiers for frontend rendering

claude · 2025-12-26T12:12:59Z

Pull Request Review: Model Serialization with Input Types and Constraint Types

Summary

This PR adds computed fields to Model and Constraint classes to enable proper JSON serialization with metadata about input types and constraint types. The implementation is well-designed and follows good software engineering practices.

Code Quality and Best Practices

Strengths

Well-designed API: The use of computed fields for supported_input_types and optional_input_types keeps the Model class clean and the data is derived automatically.
Good separation of concerns: Helper functions (get_required_input_types, get_constraint_input_type) are properly isolated in io.py.
Type safety: Excellent use of type hints throughout, including proper handling of generics and unions.
Clean introspection: The get_constraint_input_type function properly uses inspect.get_annotations with eval_str=True to handle stringified annotations.
Proper use of SerializeAsAny: Using SerializeAsAny[Constraint] for polymorphic serialization is the correct approach for Pydantic v2.

Areas for Improvement

1. Missing Tests (Priority: HIGH)

The PR adds significant new functionality but I don't see corresponding test coverage. Missing test cases include:

Test Model.supported_input_types computed field
Test Model.optional_input_types computed field
Test Constraint.type computed field for all constraint types
Test get_required_input_types() with different capabilities
Test get_constraint_input_type() with various constraint types
Test _extract_input_type() with unions, lists, and nested generics
Test serialization of Model to JSON includes new fields

Recommendation: Add a new test file tests/unit_tests/test_io.py or expand test_models.py with tests for the new computed fields and helper functions.

2. Potential Edge Cases in _extract_input_type (Priority: MEDIUM)

In io.py:95-116, the condition 'if origin is types.UnionType or origin is not None:' means ANY generic type (not just unions) will trigger recursion into args. While this works, it could lead to unexpected behavior with complex nested types.

Recommendation: Be more explicit about which types should be recursively examined.

3. Documentation (Priority: LOW)

The computed fields have docstrings, but they're quite brief. Consider adding examples showing what the return values look like.

4. Error Handling (Priority: LOW)

In get_required_input_types (io.py:75-92), if get_input_class raises a KeyError, the error will propagate. Consider whether this is the desired behavior or if it should return an empty set for unregistered capabilities.

Potential Bugs or Issues

Issue 1: KeyError for Unregistered Capabilities

Location: io.py:87, models.py:31

If a Model has a capability that hasn't had an Input class registered yet (e.g., during initialization), calling model.supported_input_types will raise a KeyError.

Scenario: A model with Capability.SEARCH (if no Input registered for SEARCH) will raise KeyError when accessing supported_input_types.

Recommendation: Add error handling in supported_input_types computed field to gracefully skip capabilities without registered Input classes, or document this requirement clearly.

Issue 2: Return Type Inconsistency

Location: io.py:119-136

The function get_constraint_input_type returns InputType | None. The function could return early on first match for better readability rather than continuing to iterate through all annotations.

Performance Considerations

Good Choices

Computed fields are cached by Pydantic, so repeated access to supported_input_types won't trigger re-computation.
Introspection happens at access time, not initialization, which is good for startup performance.

Minor Optimization

The optional_input_types property (models.py:35-42) iterates through all constraints. For models with many constraints, this is acceptable but worth noting.

Security Concerns

No Major Security Issues

The use of inspect.get_annotations(eval_str=True) is safe here since it's only evaluating type annotations from code within the package, not user input.
No external data is being processed without validation.
All new code follows the existing security patterns in the codebase.

Test Coverage

Critical Gap: Missing Test Coverage

Based on my review of the test files, there are NO tests for the new functionality.

Required tests include:

Constraint type serialization (test that each constraint type serializes with correct type field)
Model input types (test supported_input_types and optional_input_types)
Helper functions (test get_required_input_types, get_constraint_input_type, _extract_input_type)
Edge cases (empty capabilities, empty parameter_constraints)
Integration (test Model.model_dump() includes new computed fields)

Overall Assessment

This is a well-designed feature that follows good architectural principles. The code quality is high, and the implementation is clean. However, the PR should not be merged without addressing the test coverage gap.

Recommendations Before Merge:

MUST: Add comprehensive unit tests for all new functionality
SHOULD: Add defensive handling for unregistered capabilities in supported_input_types
SHOULD: Refine the type checking logic in _extract_input_type to be more explicit
NICE TO HAVE: Expand docstrings with examples

Score: 7.5/10

Code Quality: 9/10
Test Coverage: 2/10 (Critical gap)
Documentation: 7/10
Architecture: 9/10

Great work on the implementation! Once tests are added, this will be a solid contribution.

claude · 2025-12-26T12:14:32Z

Code Review - PR #103: Add Model Serialization with Input Types and Constraint Types

Summary

This PR adds computed fields to enable complete model metadata serialization, including input type requirements and constraint type identifiers. The implementation is well-structured and follows good architectural patterns.

✅ Strengths

Clean API Design: The computed fields (supported_input_types, optional_input_types) provide a clear, self-documenting API for model capabilities
Proper Separation of Concerns: Input type introspection logic is centralized in io.py rather than scattered across multiple files
Type Safety: Good use of type hints throughout, including proper handling of generic types and unions
Centralized Mapping: INPUT_TYPE_MAPPING provides a single source of truth for type-to-InputType mappings
Polymorphic Serialization: Correct use of SerializeAsAny[Constraint] to ensure constraint subclasses serialize properly

claude · 2025-12-26T12:14:44Z

🔍 Code Quality Issues

1. Missing Test Coverage ⚠️ HIGH PRIORITY

The PR adds significant new functionality but includes no tests for the new features:

Missing test coverage for:

get_required_input_types() - needs tests with various Input class configurations
get_constraint_input_type() - needs tests for all constraint types (ImageConstraint, ImagesConstraint, etc.)
_extract_input_type() - needs tests for union types, generics, nested types
Model.supported_input_types computed field - needs serialization tests
Model.optional_input_types computed field - needs tests with various constraint combinations
Constraint.type computed field - needs tests to verify it returns correct class names

Recommended tests (examples):

# tests/unit_tests/test_io.py (new file)
def test_get_required_input_types_text_generation():
    result = get_required_input_types(Capability.TEXT_GENERATION)
    assert result == {InputType.TEXT}

def test_get_constraint_input_type_image_constraint():
    constraint = ImageConstraint()
    result = get_constraint_input_type(constraint)
    assert result == InputType.IMAGE

claude · 2025-12-26T12:14:55Z

2. Potential Runtime Error in `get_required_input_types()` ⚠️ MEDIUM PRIORITY

Location: src/celeste/io.py:88-92

Issue: The function assumes all field annotations exist directly in INPUT_TYPE_MAPPING, but doesn't handle fields with Optional[...] or union types.

return {
    INPUT_TYPE_MAPPING[field.annotation]  # ❌ KeyError if annotation is Optional[str] or str | None
    for field in input_class.model_fields.values()
    if field.annotation in INPUT_TYPE_MAPPING
}

Example failure case:

class MyInput(Input):
    prompt: str | None  # This won't match INPUT_TYPE_MAPPING[str | None]

Recommended fix:

def get_required_input_types(capability: Capability) -> set[InputType]:
    input_class = get_input_class(capability)
    input_types: set[InputType] = set()
    
    for field in input_class.model_fields.values():
        input_type = _extract_input_type(field.annotation)
        if input_type is not None:
            input_types.add(input_type)
    
    return input_types

This leverages the existing _extract_input_type() helper which already handles unions and generics properly.

claude · 2025-12-26T12:15:04Z

3. Incomplete Type Handling in `_extract_input_type()` ⚠️ LOW PRIORITY

Location: src/celeste/io.py:110

Issue: The condition origin is types.UnionType or origin is not None is overly broad and may process types unexpectedly.

if origin is types.UnionType or origin is not None:  # ❌ Catches ALL generics
    for arg in get_args(param_type):
        result = _extract_input_type(arg)
        if result is not None:
            return result

Edge cases:

dict[str, ImageArtifact] → Would incorrectly return InputType.IMAGE
tuple[str, ImageArtifact] → Would incorrectly return InputType.TEXT

Recommended fix:

from typing import Union
import typing

# Handle Union types (including Optional)
if origin is types.UnionType or origin is Union:
    for arg in get_args(param_type):
        result = _extract_input_type(arg)
        if result is not None:
            return result

# Handle list/sequence types
elif origin in (list, typing.Sequence):
    args = get_args(param_type)
    if args:
        return _extract_input_type(args[0])

claude · 2025-12-26T12:15:16Z

🔒 Security Considerations

No security issues identified. The code:

Uses type introspection safely without eval() or dynamic code execution
Properly uses eval_str=True in inspect.get_annotations() for forward references (Python 3.10+)
Doesn't expose internal implementation details through serialization

⚡ Performance Considerations

Computed Fields May Be Expensive ℹ️ INFORMATIONAL

Location: src/celeste/models.py:27-42

The computed fields call get_required_input_types() for each capability on every serialization:

@computed_field
@property
def supported_input_types(self) -> dict[Capability, set[InputType]]:
    return {cap: get_required_input_types(cap) for cap in self.capabilities}  # Not cached

Impact:

For a model with 5 capabilities, this performs 5 Input class lookups and field introspections per serialization
If list_models() returns 50 models, that's 250+ introspection operations

Recommendation (future optimization):
Consider caching if performance becomes an issue using @lru_cache on get_required_input_types().

Note: This is not blocking for the current PR, but worth considering if serialization becomes a bottleneck.

claude · 2025-12-26T12:15:31Z

📝 Best Practices

Good:

✅ Version bump in pyproject.toml is appropriate for a feature addition
✅ Type ignore comments are justified (# type: ignore[prop-decorator] for computed_field)
✅ Proper use of __all__ exports
✅ Consistent naming conventions

Could Improve:

Add INPUT_TYPE_MAPPING to module docstring explaining its purpose
Consider making _extract_input_type() public if it's useful for extensions

📋 Recommendations

Before Merging (Required):

Add comprehensive test coverage for all new functions and computed fields
Fix get_required_input_types() to handle union types using _extract_input_type()
Add tests verifying JSON serialization includes the new computed fields

Future Improvements (Optional):

Consider stricter type matching in _extract_input_type() to avoid dict/tuple edge cases
Add performance benchmarks if model serialization becomes a hot path
Document the relationship between supported_input_types (from Input classes) vs optional_input_types (from constraints)

🎯 Verdict

Conditional Approval - The architecture and implementation are solid, but test coverage is critical before merging. The potential KeyError in get_required_input_types() should also be addressed.

Once tests are added and the union type handling is fixed, this will be a valuable addition to the API.

Great work on the overall design! The computed field approach provides a clean, discoverable API for consumers.

Kamilbenkirane added 2 commits December 26, 2025 13:08

chore: bump version to 0.3.8

6e84927

Kamilbenkirane merged commit 69ac0fe into main Dec 26, 2025
11 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: add model serialization with input types and constraint types#103

feat: add model serialization with input types and constraint types#103
Kamilbenkirane merged 2 commits intomainfrom
feat/model-serialization-input-types

Kamilbenkirane commented Dec 26, 2025

Uh oh!

claude Bot commented Dec 26, 2025

Uh oh!

Uh oh!

claude Bot commented Dec 26, 2025

Uh oh!

claude Bot commented Dec 26, 2025

Uh oh!

claude Bot commented Dec 26, 2025

Uh oh!

claude Bot commented Dec 26, 2025

Uh oh!

claude Bot commented Dec 26, 2025

Uh oh!

claude Bot commented Dec 26, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Kamilbenkirane commented Dec 26, 2025

Summary

Test plan

Uh oh!

claude Bot commented Dec 26, 2025

Pull Request Review: Model Serialization with Input Types and Constraint Types

Summary

Code Quality and Best Practices

Strengths

Areas for Improvement

1. Missing Tests (Priority: HIGH)

2. Potential Edge Cases in _extract_input_type (Priority: MEDIUM)

3. Documentation (Priority: LOW)

4. Error Handling (Priority: LOW)

Potential Bugs or Issues

Issue 1: KeyError for Unregistered Capabilities

Issue 2: Return Type Inconsistency

Performance Considerations

Good Choices

Minor Optimization

Security Concerns

No Major Security Issues

Test Coverage

Critical Gap: Missing Test Coverage

Overall Assessment

Recommendations Before Merge:

Score: 7.5/10

Uh oh!

Uh oh!

claude Bot commented Dec 26, 2025

Code Review - PR #103: Add Model Serialization with Input Types and Constraint Types

Summary

✅ Strengths

Uh oh!

claude Bot commented Dec 26, 2025

🔍 Code Quality Issues

1. Missing Test Coverage ⚠️ HIGH PRIORITY

Uh oh!

claude Bot commented Dec 26, 2025

2. Potential Runtime Error in get_required_input_types() ⚠️ MEDIUM PRIORITY

Uh oh!

claude Bot commented Dec 26, 2025

3. Incomplete Type Handling in _extract_input_type() ⚠️ LOW PRIORITY

Uh oh!

claude Bot commented Dec 26, 2025

🔒 Security Considerations

⚡ Performance Considerations

Computed Fields May Be Expensive ℹ️ INFORMATIONAL

Uh oh!

claude Bot commented Dec 26, 2025

📝 Best Practices

Good:

Could Improve:

📋 Recommendations

Before Merging (Required):

Future Improvements (Optional):

🎯 Verdict

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

2. Potential Runtime Error in `get_required_input_types()` ⚠️ MEDIUM PRIORITY

3. Incomplete Type Handling in `_extract_input_type()` ⚠️ LOW PRIORITY