refactor: improve error messages and metadata handling by Kamilbenkirane · Pull Request #39 · withceleste/celeste-python

Kamilbenkirane · 2025-11-16T20:50:08Z

Summary

This PR improves error message formatting and refactors metadata handling across the codebase.

Changes

Error Messages: Updated constraint error messages to include length and value details for better debugging
Test Fixes: Fixed test expectations to match improved error message format
Metadata Handling: Refactored metadata handling to filter content fields before storing raw_response
Provider Clients: Added content field filtering in provider clients (Anthropic, Cohere, Google, Mistral, OpenAI)
Code Quality: Improved error message formatting using !r for better debugging and cleaned up comments
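
As an illustration of the error-message items above, the following minimal sketch shows how `!r` and a length detail change what a constraint error reports. The `MaxLength` class and the message wording are hypothetical and are not copied from `src/celeste/constraints.py`.

```python
from dataclasses import dataclass


@dataclass
class MaxLength:
    """Hypothetical constraint, for illustration only."""

    limit: int

    def validate(self, value: str) -> str:
        if len(value) > self.limit:
            # !r applies repr(), so quotes and trailing whitespace stay visible.
            raise ValueError(
                f"Must be at most {self.limit} characters, "
                f"got {len(value)}: {value!r}"
            )
        return value


try:
    MaxLength(limit=5).validate("hello world ")
except ValueError as exc:
    message = str(exc)
    print(message)  # Must be at most 5 characters, got 12: 'hello world '
```

The repr form makes the trailing space in the rejected value visible, which a plain `{value}` would hide.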

Testing

✅ All unit tests pass (288 tests)
✅ All integration tests pass (13 tests)
✅ CI pipeline passes (linting, formatting, type checking, security scan)

Impact

Better error messages for debugging constraint violations
Cleaner metadata storage by filtering content fields
Improved code consistency and maintainability

- Update constraint error messages to include length and value details
- Fix test expectations to match improved error message format
- Refactor metadata handling to filter content fields before storing raw_response
- Add content field filtering in provider clients (Anthropic, Cohere, Google, Mistral, OpenAI)
- Improve error message formatting using !r for better debugging
- Clean up comments and improve code consistency

claude · 2025-11-16T20:51:45Z

Pull Request Review

This PR makes solid improvements to error messaging and metadata handling. The changes are well-tested and the CI passes. Here's my detailed feedback:

✅ Strengths

Improved Error Messages - Adding !r for repr formatting and including actual values/lengths makes debugging significantly easier:
- src/celeste/constraints.py:65,82,119,120,208 - Excellent use of !r to show exact values including quotes/types
Consistent Metadata Handling - The refactor to filter content fields before storing raw_response is architecturally sound:
- Prevents duplicate storage of large content data
- Applied consistently across all 5 text-gen providers and Google image-gen
Test Coverage - Tests updated to match new error format (tests/unit_tests/test_constraints.py:225,234)
Clean Code Quality - Comment cleanup removes unnecessary noise without losing clarity

🔍 Potential Issues

1. Incomplete Metadata Refactoring

Severity: Medium

The _build_metadata refactoring creates an inconsistency:

# packages/text-generation/src/celeste_text_generation/client.py:66-72
def _build_metadata(self, response_data: dict[str, Any]) -> dict[str, Any]:
    metadata = super()._build_metadata(response_data)
    metadata["raw_response"] = (
        response_data  # Complete raw response (providers filter content fields)
    )
    return metadata

Problem: The comment says "providers filter content fields" but response_data here is the unfiltered original response. The filtering happens in provider subclasses before calling super()._build_metadata(), so the filtered data is already processed by the base class.

Impact:

The raw_response stored contains full content fields (duplicating data)
Comment is misleading about what's actually stored
Same issue in packages/image-generation/src/celeste_image_generation/client.py:66-72

Recommendation:

# Option 1: Store the filtered data that comes from super()
def _build_metadata(self, response_data: dict[str, Any]) -> dict[str, Any]:
    metadata = super()._build_metadata(response_data)
    # Note: Providers should pass filtered response_data
    metadata["raw_response"] = response_data
    return metadata

# Option 2: Document that raw_response contains full response
def _build_metadata(self, response_data: dict[str, Any]) -> dict[str, Any]:
    metadata = super()._build_metadata(response_data)
    metadata["raw_response"] = response_data  # Unfiltered complete response
    return metadata

The current implementation appears to defeat the purpose of filtering in providers since the unfiltered response is stored anyway.
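
Whether `raw_response` ends up filtered depends on which dict actually reaches this method. The sketch below models one possible call chain, in which a provider subclass filters first and then delegates through `super()`. Only `_build_metadata` and `raw_response` come from the PR; the class names, the `choices` field, and the base-class body are stand-ins, so this is a way to reason about the question rather than a description of the real code.

```python
from typing import Any


class BaseClient:
    """Stand-in for the shared base client (hypothetical body)."""

    def _build_metadata(self, response_data: dict[str, Any]) -> dict[str, Any]:
        return {"model": response_data.get("model")}


class TextGenerationClient(BaseClient):
    """Mirrors the quoted method: it stores whatever dict it is handed."""

    def _build_metadata(self, response_data: dict[str, Any]) -> dict[str, Any]:
        metadata = super()._build_metadata(response_data)
        metadata["raw_response"] = response_data
        return metadata


class ProviderClient(TextGenerationClient):
    """Hypothetical provider that strips content before delegating."""

    CONTENT_FIELDS = frozenset({"choices"})  # assumed field name

    def _build_metadata(self, response_data: dict[str, Any]) -> dict[str, Any]:
        filtered = {
            key: value
            for key, value in response_data.items()
            if key not in self.CONTENT_FIELDS
        }
        return super()._build_metadata(filtered)


response = {"model": "demo", "choices": ["long generated text"], "usage": {"tokens": 7}}
metadata = ProviderClient()._build_metadata(response)
print(metadata["raw_response"])  # {'model': 'demo', 'usage': {'tokens': 7}}
```

Under this assumption the intermediate class receives the filtered dict, so `raw_response` carries no content fields. If the intermediate method were instead called directly with the unfiltered response, the content would be stored, which is the case this issue is concerned with.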

2. Removed Finish Reason Logic

Severity: Medium

In packages/image-generation/src/celeste_image_generation/client.py:66-72, the old code had conditional finish_reason parsing:

# OLD:
if "finish_reason" not in metadata:
    finish_reason = self._parse_finish_reason(response_data)
    if finish_reason is not None:
        metadata["finish_reason"] = finish_reason

This was removed entirely. Looking at the base class src/celeste/client.py:70, finish_reason is handled there:

return self._output_class()(
    finish_reason=self._parse_finish_reason(response_data),
    ...
)

Questions:

Is finish_reason handling now fully delegated to the base class?
Was the conditional logic unnecessary?
Does Google's image gen provider still override _build_metadata correctly to parse finish_reason from the full response before filtering?

Recommendation: Verify that finish_reason is still correctly parsed for image generation, especially for Google provider which has custom logic in packages/image-generation/src/celeste_image_generation/providers/google/client.py:90-98.
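
The ordering concern can be sketched as follows: if the finish reason lives inside a field that the filter removes, it has to be parsed before filtering. The `candidates` and `finishReason` field names and both helper functions are assumptions made for this example, not the provider's actual code.

```python
from __future__ import annotations

from typing import Any

# Assumed: the finish reason is nested inside the content field being filtered.
CONTENT_FIELDS = frozenset({"candidates"})


def parse_finish_reason(response_data: dict[str, Any]) -> str | None:
    candidates = response_data.get("candidates") or []
    return candidates[0].get("finishReason") if candidates else None


def build_metadata(response_data: dict[str, Any]) -> dict[str, Any]:
    # Parse first: once "candidates" is dropped, the reason cannot be recovered.
    finish_reason = parse_finish_reason(response_data)
    filtered = {k: v for k, v in response_data.items() if k not in CONTENT_FIELDS}
    metadata: dict[str, Any] = {"raw_response": filtered}
    if finish_reason is not None:
        metadata["finish_reason"] = finish_reason
    return metadata


full_response = {
    "candidates": [{"finishReason": "STOP", "image": "<bytes>"}],
    "modelVersion": "demo",
}
result = build_metadata(full_response)
print(result)
```

Parsing from `result["raw_response"]` instead would return `None`, which is the regression worth checking for.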

3. Minor: Inconsistent Precision

Severity: Low

packages/image-generation/src/celeste_image_generation/constraints.py:64

f"[{self.min_aspect_ratio:.3f}, {self.max_aspect_ratio:.3f}]"

Good catch fixing the inconsistent .1f to .3f! But this raises a question: should all range outputs use consistent precision? Check if other Range constraints would benefit from similar precision alignment.
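
To show why the precision matters, here is a small sketch with hypothetical bounds and message wording. With one decimal place, a rejected value can look as if it falls inside the printed range.

```python
min_ratio, max_ratio = 4 / 3, 16 / 9
value = 1.32  # below 4/3, so a range check would reject it

# Coarse bounds hide why the value was rejected.
coarse = f"got {value:.3f}, expected [{min_ratio:.1f}, {max_ratio:.1f}]"
# Matching precision makes the violation visible.
precise = f"got {value:.3f}, expected [{min_ratio:.3f}, {max_ratio:.3f}]"

print(coarse)   # got 1.320, expected [1.3, 1.8]
print(precise)  # got 1.320, expected [1.333, 1.778]
```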

🎯 Code Quality Observations

Type Annotation (packages/text-generation/src/celeste_text_generation/client.py:53)
```
prompt: str | None = parameters.get("prompt")
```
Nice improvement - explicit type hint clarifies intent.
Comment Cleanup - Removing redundant comments like # Register provider-specific clients and # Import after register_package is defined to avoid circular imports improves readability without losing essential information.
Docstring Simplification - Trimmed verbose explanations (e.g., packages/image-generation/src/celeste_image_generation/io.py:19,27) while keeping essential info.

🔒 Security & Performance

Security: ✅ No concerns

No external input handling changes
No new dependencies
Error messages appropriately expose validation details without leaking sensitive data

Performance: ✅ No concerns

Filtering content before metadata storage could reduce memory footprint (if working as intended)
!r repr formatting has negligible overhead
No algorithmic changes

🧪 Testing

Coverage: ✅ Excellent

✅ 288 unit tests pass
✅ 13 integration tests pass
✅ CI pipeline passes (linting, type checking, security)
✅ Test expectations updated to match new error format

Suggestions:

Add a test verifying that metadata["raw_response"] does NOT contain content fields to validate the filtering approach
Consider adding a test for the improved error messages showing actual values
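
A test along the lines of the first suggestion might look like the sketch below. `filter_content_fields` and the `content` field name are hypothetical stand-ins for the provider-side filtering step; a real test would call the provider client's `_build_metadata` instead.

```python
from typing import Any

CONTENT_FIELDS = frozenset({"content"})  # hypothetical content field name


def filter_content_fields(response_data: dict[str, Any]) -> dict[str, Any]:
    """Stand-in for the provider-side filtering step."""
    return {k: v for k, v in response_data.items() if k not in CONTENT_FIELDS}


def test_raw_response_excludes_content_fields() -> None:
    response = {
        "id": "resp_1",
        "content": [{"text": "hi"}],
        "usage": {"output_tokens": 1},
    }
    raw_response = filter_content_fields(response)
    assert "content" not in raw_response
    # Non-content fields must survive the filter untouched.
    assert raw_response == {"id": "resp_1", "usage": {"output_tokens": 1}}


test_raw_response_excludes_content_fields()
```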

📋 Recommendations Summary

Must Address:
1. Clarify metadata handling - ensure raw_response actually contains filtered data as intended, or update comments/implementation

Should Address:
2. Verify finish_reason handling still works correctly for image generation after refactoring
3. Add test for raw_response content field filtering

Nice to Have:
4. Consistent precision formatting across all constraint error messages

🎖️ Overall Assessment

Rating: Approve with Minor Changes

This is a solid refactoring that improves developer experience with better error messages and cleaner code. The architecture is sound, tests pass, and the changes are well-scoped. The main concern is ensuring the metadata filtering actually works as documented - this appears to be a logic issue where unfiltered data is stored despite the filtering effort in providers.

Great work on the comprehensive testing and attention to error message quality! 🚀

Generated by Claude Code

claude · 2025-11-16T20:59:15Z

PR Review: Improve Error Messages and Metadata Handling

Summary

This PR makes two main improvements: enhancing error messages with better debugging information and refactoring metadata handling to filter content fields from raw_response. Overall, the changes are well-implemented and improve code quality.

Strengths

Better Error Messages: The addition of repr formatting in constraint error messages significantly improves debuggability
- Including actual values in error messages will make debugging much easier for users
Consistent Metadata Handling: The refactoring to filter content fields before storing in raw_response is architecturally sound
- Prevents duplication since content is already parsed and stored separately
- Applied consistently across all provider clients
Code Cleanup: Removing redundant comments improves code readability
Type Safety: Good explicit type annotation in packages/text-generation/src/celeste_text_generation/client.py:53
Test Coverage: Tests were properly updated to match the new error message format

Issues and Concerns

1. Inconsistent Metadata Filtering Logic (Medium Priority)

The metadata filtering implementation may not work as intended. The base TextGenerationClient._build_metadata() receives response_data from the generate() method which is the unfiltered response. It then stores this unfiltered data as raw_response. The provider clients filter the data before calling super(), but this only affects what the base Client._build_metadata() sees, not what TextGenerationClient._build_metadata() stores.

Impact: The raw_response in metadata may still contain content fields, defeating the purpose of the refactoring.

Recommendation: Verify that the filtering actually works end-to-end. Consider storing the filtered data explicitly in provider clients after calling super().
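
One way to read "storing the filtered data explicitly" is for the provider to overwrite `raw_response` after `super()` returns, so the result no longer depends on what the parent stored. The sketch below assumes that reading; the class names and the `content` field are hypothetical.

```python
from typing import Any

CONTENT_FIELDS = frozenset({"content"})  # hypothetical content field name


class TextGenerationClient:
    """Stand-in for the intermediate client; stores what it receives."""

    def _build_metadata(self, response_data: dict[str, Any]) -> dict[str, Any]:
        return {"raw_response": response_data}


class ExplicitProviderClient(TextGenerationClient):
    """Hypothetical provider that sets raw_response itself."""

    def _build_metadata(self, response_data: dict[str, Any]) -> dict[str, Any]:
        metadata = super()._build_metadata(response_data)
        # Overwrite after super() so the outcome does not depend on the parent.
        metadata["raw_response"] = {
            key: value
            for key, value in response_data.items()
            if key not in CONTENT_FIELDS
        }
        return metadata


explicit = ExplicitProviderClient()._build_metadata({"id": "r1", "content": "text"})
print(explicit["raw_response"])  # {'id': 'r1'}
```

The trade-off is that every provider repeats the assignment, whereas filtering before delegating keeps the storage logic in one place.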

2. Finish Reason Handling Removed (Medium Priority)

In packages/image-generation/src/celeste_image_generation/client.py:66-72, the previous logic that parsed finish_reason has been completely removed. The finish_reason is no longer being stored in metadata. Looking at the generate() method in src/celeste/client.py:70, the finish reason is passed to the Output constructor, but it's not clear if it's also needed in metadata for some providers or use cases.

Questions: Was finish_reason in metadata redundant? Are there any consumers of metadata that expect finish_reason?

Security Considerations

No security concerns identified
The filtering of content from metadata actually improves security by reducing data exposure
No changes to authentication, API keys, or sensitive data handling

Performance Considerations

The metadata filtering adds minimal overhead (single dict comprehension per request)
Smaller metadata objects may provide minor memory benefits
No performance concerns identified

Testing

All unit tests pass (288 tests)
All integration tests pass (13 tests)
CI pipeline passes
Test expectations properly updated

Suggestion: Consider adding a test to verify that content fields are actually filtered from metadata raw_response in the provider clients.

Recommendations

High Priority: Verify metadata filtering actually works end-to-end

Medium Priority: Document or verify finish_reason removal - ensure this is not a breaking change

Low Priority: Improve or remove redundant comments

Overall Assessment

Code Quality: 4/5
Test Coverage: 5/5
Documentation: 3/5
Architecture: 4/5

Recommendation: Approve with minor revisions - The changes are solid, but please verify the metadata filtering is working as intended. The error message improvements are excellent and ready to merge.

Great work on improving the debugging experience!

refactor: remove misleading comment about raw_response

60f9f67

Remove comment that incorrectly implied raw_response contains complete unfiltered response. The response_data parameter is already filtered by providers before calling super()._build_metadata().

Kamilbenkirane merged commit 21de831 into main Nov 16, 2025
9 of 10 checks passed

Kamilbenkirane merged 2 commits into main from refactor/metadata-handling-and-error-messages
