
refactor: improve error messages and metadata handling#39

Merged
Kamilbenkirane merged 2 commits into main from
refactor/metadata-handling-and-error-messages
Nov 16, 2025
@Kamilbenkirane
Member

Summary

This PR improves error message formatting and refactors metadata handling across the codebase.

Changes

  • Error Messages: Updated constraint error messages to include length and value details for better debugging
  • Test Fixes: Fixed test expectations to match improved error message format
  • Metadata Handling: Refactored metadata handling to filter content fields before storing raw_response
  • Provider Clients: Added content field filtering in provider clients (Anthropic, Cohere, Google, Mistral, OpenAI)
  • Code Quality: Improved error message formatting using !r for better debugging and cleaned up comments

Testing

  • ✅ All unit tests pass (288 tests)
  • ✅ All integration tests pass (13 tests)
  • ✅ CI pipeline passes (linting, formatting, type checking, security scan)

Impact

  • Better error messages for debugging constraint violations
  • Cleaner metadata storage by filtering content fields
  • Improved code consistency and maintainability

- Update constraint error messages to include length and value details
- Fix test expectations to match improved error message format
- Refactor metadata handling to filter content fields before storing raw_response
- Add content field filtering in provider clients (Anthropic, Cohere, Google, Mistral, OpenAI)
- Improve error message formatting using !r for better debugging
- Clean up comments and improve code consistency
@claude

claude Bot commented Nov 16, 2025

Pull Request Review

This PR makes solid improvements to error messaging and metadata handling. The changes are well-tested and the CI passes. Here's my detailed feedback:


Strengths

  1. Improved Error Messages - Adding !r for repr formatting and including actual values/lengths makes debugging significantly easier:

    • src/celeste/constraints.py:65,82,119,120,208 - Excellent use of !r to show exact values including quotes/types
  2. Consistent Metadata Handling - The refactor to filter content fields before storing raw_response is architecturally sound:

    • Prevents duplicate storage of large content data
    • Applied consistently across all 5 text-gen providers and Google image-gen
  3. Test Coverage - Tests updated to match new error format (tests/unit_tests/test_constraints.py:225,234)

  4. Clean Code Quality - Comment cleanup removes unnecessary noise without losing clarity
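
To illustrate the !r point in item 1, here is a minimal sketch. The Choice class and the error_message helper are hypothetical and are not the code in src/celeste/constraints.py; they only show how repr formatting keeps the quotes on strings, so a string "3" and an integer 1 are distinguishable in the message.

```python
class Choice:
    """Hypothetical constraint: the value must be one of the allowed options."""

    def __init__(self, options: list[str]) -> None:
        self.options = options

    def validate(self, value: object) -> object:
        if value not in self.options:
            # !r applies repr(), so strings keep their quotes in the message
            raise ValueError(f"Must be one of {self.options!r}, got {value!r}")
        return value


def error_message(constraint: Choice, value: object) -> str:
    """Return the validation error text, or an empty string if the value is valid."""
    try:
        constraint.validate(value)
    except ValueError as exc:
        return str(exc)
    return ""
```

With this sketch, passing the integer 1 to Choice(["1", "2"]) reports "got 1", while passing the string "3" reports "got '3'", which makes a type mismatch visible at a glance.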


🔍 Potential Issues

1. Incomplete Metadata Refactoring

Severity: Medium

The _build_metadata refactoring creates an inconsistency:

# packages/text-generation/src/celeste_text_generation/client.py:66-72
def _build_metadata(self, response_data: dict[str, Any]) -> dict[str, Any]:
    metadata = super()._build_metadata(response_data)
    metadata["raw_response"] = (
        response_data  # Complete raw response (providers filter content fields)
    )
    return metadata

Problem: The comment says "providers filter content fields" but response_data here is the unfiltered original response. The filtering happens in provider subclasses before calling super()._build_metadata(), so the filtered data is already processed by the base class.

Impact:

  • The raw_response stored contains full content fields (duplicating data)
  • Comment is misleading about what's actually stored
  • Same issue in packages/image-generation/src/celeste_image_generation/client.py:66-72

Recommendation:

# Option 1: Store the filtered data that comes from super()
def _build_metadata(self, response_data: dict[str, Any]) -> dict[str, Any]:
    metadata = super()._build_metadata(response_data)
    # Note: Providers should pass filtered response_data
    metadata["raw_response"] = response_data
    return metadata

# Option 2: Document that raw_response contains full response
def _build_metadata(self, response_data: dict[str, Any]) -> dict[str, Any]:
    metadata = super()._build_metadata(response_data)
    metadata["raw_response"] = response_data  # Unfiltered complete response
    return metadata

The current implementation appears to defeat the purpose of filtering in providers since the unfiltered response is stored anyway.
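
The call chain under discussion can be sketched as follows. All class names, the CONTENT_FIELDS set and the "choices" key are hypothetical stand-ins rather than the actual celeste code. The sketch only shows that raw_response ends up holding whatever dict is passed to super()._build_metadata().

```python
from typing import Any


class BaseClient:
    def _build_metadata(self, response_data: dict[str, Any]) -> dict[str, Any]:
        return {"model": response_data.get("model")}


class TextClient(BaseClient):
    def _build_metadata(self, response_data: dict[str, Any]) -> dict[str, Any]:
        metadata = super()._build_metadata(response_data)
        # Stores exactly what the caller passed in, filtered or not
        metadata["raw_response"] = response_data
        return metadata


class FilteringProvider(TextClient):
    CONTENT_FIELDS = frozenset({"choices"})  # assumed content field name

    def _build_metadata(self, response_data: dict[str, Any]) -> dict[str, Any]:
        # Build a new dict without the content fields, then delegate upward
        filtered = {
            key: value
            for key, value in response_data.items()
            if key not in self.CONTENT_FIELDS
        }
        return super()._build_metadata(filtered)


response = {"model": "m", "choices": ["long text"], "usage": {"tokens": 3}}
metadata = FilteringProvider()._build_metadata(response)
```

Under these assumptions the stored raw_response has no "choices" key, so whether the concern applies depends on which dict the intermediate class is actually called with.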

2. Removed Finish Reason Logic

Severity: Medium

In packages/image-generation/src/celeste_image_generation/client.py:66-72, the old code had conditional finish_reason parsing:

# OLD:
if "finish_reason" not in metadata:
    finish_reason = self._parse_finish_reason(response_data)
    if finish_reason is not None:
        metadata["finish_reason"] = finish_reason

This was removed entirely. Looking at the base class src/celeste/client.py:70, finish_reason is handled there:

return self._output_class()(
    finish_reason=self._parse_finish_reason(response_data),
    ...
)

Questions:

  • Is finish_reason handling now fully delegated to the base class?
  • Was the conditional logic unnecessary?
  • Does Google's image gen provider still override _build_metadata correctly to parse finish_reason from the full response before filtering?

Recommendation: Verify that finish_reason is still correctly parsed for image generation, especially for Google provider which has custom logic in packages/image-generation/src/celeste_image_generation/providers/google/client.py:90-98.
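
The ordering concern can be made concrete with a small sketch. The function names, the "candidates" field and the "finishReason" key are assumptions for illustration only and are not taken from the Google provider client. The point is that the finish reason has to be read before the content-bearing field is filtered out.

```python
from typing import Any, Optional

CONTENT_FIELDS = frozenset({"candidates"})  # assumed name of the content-bearing field


def parse_finish_reason(response_data: dict[str, Any]) -> Optional[str]:
    """Read the finish reason from the first candidate, if present (assumed layout)."""
    candidates = response_data.get("candidates") or []
    return candidates[0].get("finishReason") if candidates else None


def build_metadata(response_data: dict[str, Any]) -> dict[str, Any]:
    # Parse first: the finish reason lives inside the field that filtering removes
    finish_reason = parse_finish_reason(response_data)
    filtered = {
        key: value
        for key, value in response_data.items()
        if key not in CONTENT_FIELDS
    }
    metadata: dict[str, Any] = {"raw_response": filtered}
    if finish_reason is not None:
        metadata["finish_reason"] = finish_reason
    return metadata
```

If the two steps were swapped in this sketch, parse_finish_reason would receive a dict without "candidates" and always return None.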

3. Minor: Inconsistent Precision

Severity: Low

packages/image-generation/src/celeste_image_generation/constraints.py:64

f"[{self.min_aspect_ratio:.3f}, {self.max_aspect_ratio:.3f}]"

Good catch fixing the inconsistent .1f to .3f! But this raises a question: should all range outputs use consistent precision? Check if other Range constraints would benefit from similar precision alignment.
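
A quick sketch of why the precision matters. The bounds and the rejected value below are made-up numbers, not values from the codebase.

```python
min_ratio, max_ratio = 0.5, 1.7778  # assumed bounds for illustration
value = 1.79  # slightly above the upper bound, so it would be rejected

# With one decimal the upper bound is rounded up and appears to include the value
one_decimal = f"[{min_ratio:.1f}, {max_ratio:.1f}]"

# With three decimals the printed range agrees with the actual check
three_decimals = f"[{min_ratio:.3f}, {max_ratio:.3f}]"
```

Here one_decimal renders as "[0.5, 1.8]", which would make a rejection of 1.79 look wrong, while three_decimals renders as "[0.500, 1.778]".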


🎯 Code Quality Observations

  1. Type Annotation (packages/text-generation/src/celeste_text_generation/client.py:53)

    prompt: str | None = parameters.get("prompt")

    Nice improvement - explicit type hint clarifies intent.

  2. Comment Cleanup - Removing redundant comments like # Register provider-specific clients and # Import after register_package is defined to avoid circular imports improves readability without losing essential information.

  3. Docstring Simplification - Trimmed verbose explanations (e.g., packages/image-generation/src/celeste_image_generation/io.py:19,27) while keeping essential info.


🔒 Security & Performance

Security: ✅ No concerns

  • No external input handling changes
  • No new dependencies
  • Error messages appropriately expose validation details without leaking sensitive data

Performance: ✅ No concerns

  • Filtering content before metadata storage could reduce memory footprint (if working as intended)
  • !r repr formatting has negligible overhead
  • No algorithmic changes

🧪 Testing

Coverage: ✅ Excellent

  • ✅ 288 unit tests pass
  • ✅ 13 integration tests pass
  • ✅ CI pipeline passes (linting, type checking, security)
  • ✅ Test expectations updated to match new error format

Suggestions:

  1. Add a test verifying that metadata["raw_response"] does NOT contain content fields to validate the filtering approach
  2. Consider adding a test for the improved error messages showing actual values
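
A test for suggestion 2 might look like the sketch below. The Range class is a hypothetical stand-in written for this example, and the message wording is assumed. The real constraint classes and their messages may differ, so the regular expression would need adjusting.

```python
import unittest


class Range:
    """Hypothetical numeric range constraint used only for this sketch."""

    def __init__(self, min_value: float, max_value: float) -> None:
        self.min_value = min_value
        self.max_value = max_value

    def validate(self, value: float) -> float:
        if not self.min_value <= value <= self.max_value:
            raise ValueError(
                f"Must be between {self.min_value!r} and {self.max_value!r}, "
                f"got {value!r}"
            )
        return value


class TestRangeErrorMessage(unittest.TestCase):
    def test_message_includes_offending_value(self) -> None:
        # assertRaisesRegex searches the exception text for the pattern
        with self.assertRaisesRegex(ValueError, r"got 11$"):
            Range(0, 10).validate(11)

    def test_valid_value_is_returned(self) -> None:
        self.assertEqual(Range(0, 10).validate(5), 5)
```

Matching only the tail of the message keeps the test from breaking on wording changes while still pinning down that the actual value is reported.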

📋 Recommendations Summary

Must Address:

  1. Clarify metadata handling - ensure raw_response actually contains filtered data as intended, or update comments/implementation

Should Address:

  2. Verify finish_reason handling still works correctly for image generation after refactoring
  3. Add test for raw_response content field filtering

Nice to Have:

  4. Consistent precision formatting across all constraint error messages


🎖️ Overall Assessment

Rating: Approve with Minor Changes

This is a solid refactoring that improves developer experience with better error messages and cleaner code. The architecture is sound, tests pass, and the changes are well-scoped. The main concern is ensuring the metadata filtering actually works as documented - this appears to be a logic issue where unfiltered data is stored despite the filtering effort in providers.

Great work on the comprehensive testing and attention to error message quality! 🚀


Generated by Claude Code

Remove comment that incorrectly implied raw_response contains complete
unfiltered response. The response_data parameter is already filtered
by providers before calling super()._build_metadata().
@Kamilbenkirane Kamilbenkirane merged commit 21de831 into main Nov 16, 2025
9 of 10 checks passed
@claude

claude Bot commented Nov 16, 2025

PR Review: Improve Error Messages and Metadata Handling

Summary

This PR makes two main improvements: enhancing error messages with better debugging information and refactoring metadata handling to filter content fields from raw_response. Overall, the changes are well-implemented and improve code quality.

Strengths

  1. Better Error Messages: The addition of repr formatting in constraint error messages significantly improves debuggability

    • Including actual values in error messages will make debugging much easier for users

  2. Consistent Metadata Handling: The refactoring to filter content fields before storing in raw_response is architecturally sound

    • Prevents duplication since content is already parsed and stored separately
    • Applied consistently across all provider clients

  3. Code Cleanup: Removing redundant comments improves code readability

  4. Type Safety: Good explicit type annotation in packages/text-generation/src/celeste_text_generation/client.py:53

  5. Test Coverage: Tests were properly updated to match the new error message format

Issues and Concerns

1. Inconsistent Metadata Filtering Logic (Medium Priority)

The metadata filtering implementation may not work as intended. The base TextGenerationClient._build_metadata() receives response_data from the generate() method which is the unfiltered response. It then stores this unfiltered data as raw_response. The provider clients filter the data before calling super(), but this only affects what the base Client._build_metadata() sees, not what TextGenerationClient._build_metadata() stores.

Impact: The raw_response in metadata may still contain content fields, defeating the purpose of the refactoring.

Recommendation: Verify that the filtering actually works end-to-end. Consider storing the filtered data explicitly in provider clients after calling super().
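
One way to read "storing the filtered data explicitly" is sketched below. The class names and the "content" key are hypothetical and not taken from the provider clients. The provider overwrites raw_response after calling super(), so the stored value no longer depends on what the parent class chose to keep.

```python
from typing import Any


class TextClient:
    def _build_metadata(self, response_data: dict[str, Any]) -> dict[str, Any]:
        # Parent stores whatever it receives
        return {"raw_response": response_data}


class ExplicitProvider(TextClient):
    CONTENT_FIELDS = frozenset({"content"})  # assumed content field name

    def _build_metadata(self, response_data: dict[str, Any]) -> dict[str, Any]:
        metadata = super()._build_metadata(response_data)
        # Overwrite explicitly so filtering holds regardless of the parent
        metadata["raw_response"] = {
            key: value
            for key, value in response_data.items()
            if key not in self.CONTENT_FIELDS
        }
        return metadata


stored = ExplicitProvider()._build_metadata({"id": "r1", "content": "long text"})
```

The trade-off in this sketch is that the parent still sees the full response, which is convenient if it needs content-adjacent fields, at the cost of repeating the overwrite in every provider.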

2. Finish Reason Handling Removed (Medium Priority)

In packages/image-generation/src/celeste_image_generation/client.py:66-72, the previous logic that parsed finish_reason has been completely removed. The finish_reason is no longer being stored in metadata. Looking at the generate() method in src/celeste/client.py:70, the finish reason is passed to the Output constructor, but it's not clear if it's also needed in metadata for some providers or use cases.

Questions: Was finish_reason in metadata redundant? Are there any consumers of metadata that expect finish_reason?

Security Considerations

  • No security concerns identified
  • The filtering of content from metadata actually improves security by reducing data exposure
  • No changes to authentication, API keys, or sensitive data handling

Performance Considerations

  • The metadata filtering adds minimal overhead (single dict comprehension per request)
  • Smaller metadata objects may provide minor memory benefits
  • No performance concerns identified

Testing

  • All unit tests pass (288 tests)
  • All integration tests pass (13 tests)
  • CI pipeline passes
  • Test expectations properly updated

Suggestion: Consider adding a test to verify that content fields are actually filtered from metadata raw_response in the provider clients.

Recommendations

High Priority: Verify metadata filtering actually works end-to-end

Medium Priority: Document or verify finish_reason removal - ensure this is not a breaking change

Low Priority: Improve or remove redundant comments

Overall Assessment

Code Quality: 4/5
Test Coverage: 5/5
Documentation: 3/5
Architecture: 4/5

Recommendation: Approve with minor revisions - The changes are solid, but please verify the metadata filtering is working as intended. The error message improvements are excellent and ready to merge.

Great work on improving the debugging experience!
