refactor: migrate Anthropic to native structured outputs API and remove raw_event metadata by Kamilbenkirane · Pull Request #40 · withceleste/celeste-python

Kamilbenkirane · 2025-11-16T21:25:23Z

This PR includes two major improvements:

1. Anthropic Structured Outputs Migration

Migrate from tool-based structured outputs to Anthropic's native structured outputs API
Use output_format parameter with json_schema type instead of tools/tool_choice
Simplify streaming implementation by removing tool_use block tracking
Remove complex tool_use parsing logic (203 lines removed from streaming.py)
Add anthropic-beta header for structured-outputs-2025-11-13 feature
Update OutputSchemaMapper to use native output_format instead of tools
Remove OUTPUT_SCHEMA constraint from model definitions (now handled via API)

2. Remove raw_event Metadata

Remove metadata={"raw_event": event} from all streaming chunk creations
Violates metadata principle: metadata should not contain content fields
Reduces memory usage by not duplicating event data
Affects: OpenAI, Anthropic, XAI, Cohere, Mistral, Google text generation providers
Also removes from image intelligence and speech generation providers
ElevenLabs: removed raw_event wrapper but kept content_length metadata

Benefits

Cleaner codebase with 313 lines removed
Better adherence to metadata design principles
Reduced memory footprint in streaming scenarios
Simpler Anthropic integration using official API

Testing

✅ All CI checks passed (linting, formatting, type checking, security, tests)
✅ 288 tests passed with 84% coverage

…ve raw_event metadata This commit includes two major improvements: 1. Anthropic Structured Outputs Migration: - Migrate from tool-based structured outputs to Anthropic's native structured outputs API - Use output_format parameter with json_schema type instead of tools/tool_choice - Simplify streaming implementation by removing tool_use block tracking - Remove complex tool_use parsing logic (203 lines removed from streaming.py) - Add anthropic-beta header for structured-outputs-2025-11-13 feature - Update OutputSchemaMapper to use native output_format instead of tools - Remove OUTPUT_SCHEMA constraint from model definitions (now handled via API) 2. Remove raw_event Metadata: - Remove metadata={"raw_event": event} from all streaming chunk creations - Violates metadata principle: metadata should not contain content fields - Reduces memory usage by not duplicating event data - Affects: OpenAI, Anthropic, XAI, Cohere, Mistral, Google text generation providers - Also removes from image intelligence and speech generation providers - ElevenLabs: removed raw_event wrapper but kept content_length metadata Benefits: - Cleaner codebase with 313 lines removed - Better adherence to metadata design principles - Reduced memory footprint in streaming scenarios - Simpler Anthropic integration using official API

claude · 2025-11-16T21:27:44Z

Pull Request Review: Migration to Anthropic Native Structured Outputs API

This PR represents a significant refactoring that modernizes the Anthropic integration and improves code quality. Overall, the changes are well-executed with strong benefits. Here's my detailed review:

✅ Code Quality and Best Practices

Excellent Improvements:

Massive Code Simplification: Removing 313 lines while adding only 70 is impressive - this represents a 77% reduction in changed areas
Cleaner Architecture: Migration from tool-based workaround to native API is the right approach
Better Separation of Concerns: The new _ensure_additional_properties method properly handles schema requirements
Proper Error Handling: Removed complex tool_use parsing logic that was error-prone
Good Documentation: Docstrings are clear and explain the new approach well

Minor Observations:

packages/text-generation/src/celeste_text_generation/providers/anthropic/parameters.py:156-182: The _ensure_additional_properties method recursively processes schemas - consider adding a depth limit or cycle detection for pathological cases (though unlikely in practice)
packages/text-generation/src/celeste_text_generation/providers/anthropic/parameters.py:94: Schema conversion happens on every request - for frequently-used schemas, consider memoization/caching

🐛 Potential Bugs or Issues

Critical Items to Verify:

Breaking API Change (packages/text-generation/src/celeste_text_generation/providers/anthropic/config.py:18):
- The beta version structured-outputs-2025-11-13 is quite new
- Question: Has this been tested with production API keys? Is there a rollback plan if Anthropic deprecates this beta version?
- Suggestion: Document version requirements and add monitoring for API deprecation warnings
Response Content Parsing (packages/text-generation/src/celeste_text_generation/providers/anthropic/client.py:68-74):
```
text_content = ""
for content_block in content:
    if content_block.get("type") == "text":
        text_content = content_block.get("text") or ""
        break
```
- With native structured outputs, the response content structure changes from tool_use blocks to direct JSON in text blocks
- Concern: The code only takes the first text block and breaks. What if there are multiple text blocks? What if structured output is split across blocks?
- Recommendation: Add validation that only one text block exists when output_schema is set, or concatenate all text blocks
parse_output Type Handling (packages/text-generation/src/celeste_text_generation/providers/anthropic/parameters.py:121-126):
```
if isinstance(content, dict):
    parsed_json = content
else:
    parsed_json = json.loads(content)
```
- With native structured outputs, content should always be a string (JSON text)
- Concern: The isinstance(content, dict) branch may be dead code now, or could mask issues if called incorrectly
- Recommendation: Add a comment explaining when dict vs string occurs, or remove dead code path

⚡ Performance Considerations

Improvements:

Reduced Memory Footprint: Removing raw_event metadata duplication is excellent - this was wasteful and violated design principles
Simpler Streaming: Removing 200 lines of tool_use tracking significantly reduces streaming overhead
Fewer Allocations: No longer building intermediate tool definition objects

Potential Optimizations:

Schema Resolution (parameters.py:184-237): The _resolve_refs method recursively processes the entire schema on every request
- For static schemas (most use cases), this is repeated work
- Suggestion: Cache resolved schemas keyed by schema hash/id
Nested Dictionary Copies (parameters.py:161): schema = schema.copy() creates shallow copies
- This is correct, but the nested dictionary updates in properties/items create new dicts
- Performance impact is minimal for typical schemas but could be noticeable for very large nested schemas

🔒 Security Concerns

Good Practices:

API Key Handling: Unchanged, continues to use SecretStr properly
No Arbitrary Code Execution: Removed tool-based approach eliminates potential tool injection vectors

Observations:

Schema Validation: The new approach relies on Anthropic's server-side validation of additionalProperties: false
- This is correct, but worth noting that malformed schemas will fail at runtime rather than client-side
- Impact: Low - API will return clear errors
Beta Feature Header (config.py:18): Using a beta API feature in production
- Recommendation: Add monitoring/alerting for unexpected API responses
- Recommendation: Document the beta status clearly for users

🧪 Test Coverage

Concerns:

No New Tests: The PR description mentions "288 tests passed with 84% coverage" but doesn't show new tests for the migration
- Critical: Were existing tests updated to verify native structured outputs work correctly?
- Question: Are there integration tests that verify:
  - Structured outputs with BaseModel
  - Structured outputs with list[BaseModel]
  - Streaming with structured outputs
  - The new beta header is sent correctly
  - Error handling when beta feature is not available
Regression Testing:
- The old tool-based approach is completely removed
- Recommendation: Ensure tests verify backward compatibility for clients using output_schema
Edge Cases:
- Empty responses
- Malformed JSON from API
- Nested complex schemas with $ref, anyOf, allOf
- Very large schemas

📋 Specific Recommendations

High Priority:

Add integration test coverage for native structured outputs (both streaming and non-streaming)
Verify the response parsing logic handles all content block scenarios with structured outputs
Document the beta API dependency and version requirements in README/docs
Add error handling for cases where the beta feature might not be available

Medium Priority:

Consider caching resolved schemas to improve performance for repeated requests
Add validation that content has expected structure when output_schema is set
Document migration path for any breaking changes in streaming chunk metadata

Low Priority:

Clean up potentially dead code path in parse_output (dict vs string handling)
Add depth limit to _ensure_additional_properties recursive calls
Consider adding telemetry to track native structured output usage vs failures

🎯 Summary

Overall Assessment: ✅ Approve with minor recommendations

This is a high-quality refactoring that:

Removes significant technical debt (313 lines removed)
Adopts the official API approach (better long-term)
Improves memory efficiency (no raw_event duplication)
Simplifies the codebase (especially streaming)

Main concerns:

Test coverage for the new implementation needs verification
Beta API dependency should be documented and monitored
Response parsing logic should be validated for edge cases

Recommendation: Merge after addressing the test coverage question and verifying production API compatibility.

Great work on this refactoring! The code is much cleaner and more maintainable. 🎉

Kamilbenkirane merged commit a55ccc3 into main Nov 16, 2025
11 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

refactor: migrate Anthropic to native structured outputs API and remove raw_event metadata#40

refactor: migrate Anthropic to native structured outputs API and remove raw_event metadata#40
Kamilbenkirane merged 1 commit intomainfrom
refactor/anthropic-structured-outputs-and-remove-raw-event-metadata

Kamilbenkirane commented Nov 16, 2025

Uh oh!

claude Bot commented Nov 16, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

Kamilbenkirane commented Nov 16, 2025

1. Anthropic Structured Outputs Migration

2. Remove raw_event Metadata

Benefits

Testing

Uh oh!

claude Bot commented Nov 16, 2025

Pull Request Review: Migration to Anthropic Native Structured Outputs API

✅ Code Quality and Best Practices

Excellent Improvements:

Minor Observations:

🐛 Potential Bugs or Issues

Critical Items to Verify:

⚡ Performance Considerations

Improvements:

Potential Optimizations:

🔒 Security Concerns

Good Practices:

Observations:

🧪 Test Coverage

Concerns:

📋 Specific Recommendations

High Priority:

Medium Priority:

Low Priority:

🎯 Summary

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant