
feat: add image-generation package with OpenAI, Google, and ByteDance providers #27

Merged
Kamilbenkirane merged 5 commits into main from feat/image-generation
Nov 10, 2025

Conversation

@Kamilbenkirane
Member

🎨 Image Generation Capability

This PR adds comprehensive image generation support to Celeste AI, enabling unified access to multiple image generation providers through a single, type-safe interface.

✨ Features

  • Unified API: Same interface across all providers
  • Type-Safe: Full IDE autocomplete and validation
  • Streaming Support: Real-time image generation with
  • Parameter Validation: Automatic validation based on model constraints
  • Multiple Providers: OpenAI, Google, and ByteDance support

🚀 Quick Start

import asyncio

from celeste import create_client, Capability, Provider


async def main() -> None:
    client = create_client(
        capability=Capability.IMAGE_GENERATION,
        provider=Provider.OPENAI,
    )

    # generate() is a coroutine, so it must be awaited inside an async function
    response = await client.generate(
        prompt="A red apple on a white background",
        aspect_ratio="1024x1024",
        quality="hd",
    )
    print(response.content.url)  # ImageArtifact with URL


asyncio.run(main())

Install:

uv add "celeste-ai[image-generation]"

📦 Supported Providers & Models

OpenAI (3 models)

  • DALL-E 2: Classic image generation
    • Aspect ratios: 256x256, 512x512, 1024x1024
  • DALL-E 3: High-quality generation
    • Aspect ratios: 1024x1024, 1792x1024, 1024x1792
    • Quality: standard, hd
  • GPT Image 1: Streaming image generation ⚡
    • Aspect ratios: 1024x1024, 1536x1024, 1024x1536, auto
    • Quality: low, medium, high, auto
    • Partial images: 0-3 (for streaming)

Google (5 models)

  • Imagen 4 (imagen-4.0-generate-001): Standard quality
  • Imagen 4 Fast (imagen-4.0-fast-generate-001): Faster generation
  • Imagen 4 Ultra (imagen-4.0-ultra-generate-001): Highest quality
    • Aspect ratios: 1:1, 3:4, 4:3, 9:16, 16:9
    • Quality: 1K, 2K
  • Imagen 3 (imagen-3.0-generate-002): Legacy support
  • Gemini 2.5 Flash Image (gemini-2.5-flash-image): Gemini API
    • Aspect ratios: 1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9

ByteDance (1 model)

  • Seedream 4.0 (seedream-4-0-250828): Advanced image generation
    • Flexible dimensions: 1280x720 to 4096x4096 pixels
    • Aspect ratios: 1/16 to 16 (any ratio)
    • Quality: 1K, 2K, 4K
    • Watermark control: true/false
    • Presets: Square 2K/4K, HD/2K/4K 16:9, Portrait HD/2K, Ultra-wide 21:9

🎛️ Parameters

All parameters are type-safe and automatically validated based on the selected model:

aspect_ratio (str | None)

Controls image dimensions. Format varies by provider:

  • OpenAI: Size strings ("1024x1024", "1792x1024")
  • Google: Ratio strings ("1:1", "16:9", "9:16")
  • ByteDance: Freeform dimensions ("2048x2048", "3840x2160") with pixel/ratio validation

quality (str | None)

Image quality/detail level:

  • DALL-E 3: "standard" | "hd"
  • GPT Image 1: "low" | "medium" | "high" | "auto"
  • Google Imagen: "1K" | "2K"
  • ByteDance: "1K" | "2K" | "4K"

partial_images (int | None)

Number of partial images during streaming (GPT Image 1 only):

  • Range: 0-3
  • Controls how many intermediate images are generated during streaming

watermark (bool | None)

Add "AI generated" watermark (ByteDance only):

  • true: Add watermark (default)
  • false: No watermark

🔄 Streaming Support

Streaming is supported for GPT Image 1:

async for chunk in client.stream(prompt="A sunset over mountains"):
    if chunk.content:
        print(f"Partial image: {chunk.content.url}")

🏗️ Architecture

Infrastructure Improvements

  • Registry Module: Extracted _load_from_entry_points() to separate module to fix circular imports
  • Performance Optimization: Early return check prevents unnecessary entry point scanning
  • Test Isolation: Added monkeypatch to prevent entry point loading interference
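
The early-return idea can be sketched as follows. This is a minimal illustration under assumptions, not the code in `registry.py`: `load_from_entry_points`, `_loaded_groups`, and the `discover` callback are hypothetical. In real use `discover` might wrap `importlib.metadata.entry_points(group=...)`, and injecting it also makes it easy to keep tests isolated from installed packages.

```python
from collections.abc import Callable, Iterable

_loaded_groups: set[str] = set()  # entry point groups that were already scanned


def load_from_entry_points(
    group: str,
    discover: Callable[[str], Iterable[Callable[[], None]]],
) -> bool:
    """Run each discovered loader once per group; return True if a scan happened."""
    if group in _loaded_groups:
        return False  # early return: skip the scan entirely
    for loader in discover(group):
        loader()
    _loaded_groups.add(group)
    return True
```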

Package Structure

  • Unified parameter interface across all providers
  • Provider-specific parameter mappers
  • Type-safe I/O types (ImageGenerationInput, ImageGenerationOutput, ImageGenerationChunk)
  • Comprehensive test coverage (unit + integration)

📊 Changes Summary

  • 44 files changed: 2,064 insertions(+), 42 deletions(-)
  • 3 logical commits:
    1. refactor: extract registry module to fix circular imports
    2. feat: add image-generation package with OpenAI, Google, and ByteDance providers
    3. chore: bump versions to 0.2.1 and optimize text-generation imports

✅ Testing

  • All unit tests passing
  • Integration tests included
  • 90%+ code coverage maintained
  • Pre-commit hooks passing

📚 Documentation

  • README with quick start guide
  • Provider-specific documentation
  • Parameter reference in code
  • Type hints throughout

Ready for review and merge! 🚀

@claude

claude Bot commented Nov 10, 2025

🔍 Pull Request Review

Overall Assessment

This is a well-architected and high-quality PR that successfully adds comprehensive image generation capability to Celeste AI. The implementation demonstrates strong engineering practices, consistent patterns with the existing codebase, and good attention to detail. Ready for merge with minor suggestions below.


Strengths

1. Architecture & Design

  • Excellent abstraction: The unified API across providers (OpenAI, Google, ByteDance) is clean and type-safe
  • Consistent patterns: Follows established patterns from text-generation package (client hierarchy, parameter mappers, constraints)
  • Smart refactoring: Extracting registry.py to fix circular imports shows good architectural awareness
  • Provider adapter pattern: Google's dual API support (Imagen/Gemini) via adapter pattern is elegant

2. Code Quality

  • Strong type safety: Full type hints with Pydantic models, typed I/O classes (ImageGenerationInput, ImageGenerationOutput, ImageGenerationChunk)
  • Clear abstractions: Abstract base classes (ImageGenerationClient, ImageGenerationStream) with well-defined extension points
  • Consistent error handling: Proper use of custom exceptions (ValidationError, ConstraintViolationError)
  • Good logging: Appropriate warning/error logging in streaming implementations

3. Parameter Handling

  • Sophisticated validation: The Dimensions constraint with pixel and aspect ratio bounds is well-implemented
  • Provider-specific mapping: Clean separation between unified API parameters and provider-specific transformations
  • Preset support: ByteDance presets ("Square 2K", "HD 16:9") provide great UX
  • Conflict resolution: ByteDance client's mutual exclusivity check for aspect_ratio and quality is excellent (client.py:111-119)

4. Testing

  • Good unit test coverage: Comprehensive finish reason parsing tests for Google provider
  • Integration tests: Cross-provider test demonstrates unified API value
  • Test isolation: Proper use of fixtures and mocking to prevent entry point interference

🔧 Issues & Recommendations

Critical Issues

None found ✅

High Priority

1. Missing Unit Tests for Core Functionality

The package has only 5 test files, with limited coverage:

  • ✅ Google finish reason parsing (excellent coverage)
  • ✅ Basic integration test
  • Missing: OpenAI client unit tests
  • Missing: ByteDance client unit tests
  • Missing: Parameter mapper tests
  • Missing: Dimensions constraint tests
  • Missing: Streaming logic tests

Recommendation: Add unit tests for:

# Priority test files to add:
packages/image-generation/tests/unit_tests/test_constraints.py
packages/image-generation/tests/unit_tests/providers/openai/test_client.py
packages/image-generation/tests/unit_tests/providers/bytedance/test_client.py
packages/image-generation/tests/unit_tests/test_parameters.py

Impact: Without comprehensive tests, edge cases may not be covered (e.g., Dimensions constraint with invalid formats, parameter mapper conflicts).


Medium Priority

2. Error Handling: Generic ValueError Usage

Multiple locations use generic ValueError instead of custom exceptions:

# openai/client.py:56, 70
raise ValueError(msg)  # Should be ValidationError or ContentError

# google/client.py:148
raise ValueError(msg)  # Should be ModelNotFoundError or ConfigurationError

# bytedance/client.py:77, 119
raise ValueError(msg)  # Should be ContentError or ValidationError

Recommendation: Create capability-specific exceptions:

class ImageContentError(Error):
    """Raised when image content is missing or invalid in response."""

class InvalidModelError(Error):
    """Raised when model ID is not recognized."""

Impact: Better error handling for library consumers and clearer debugging.


3. ByteDance Mutual Exclusivity Check Location

The aspect_ratio vs quality conflict check happens in _make_request (runtime), but should happen earlier in parameter validation.

Current (client.py:111-119):

async def _make_request(self, request_body, **parameters):
    if parameters.get("aspect_ratio") and parameters.get("quality"):
        raise ValueError(...)  # Runtime check

Recommendation: Move to parameter mapper or add a validate_parameters hook called before request building. This provides earlier feedback to users.
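
One possible shape for such a hook is sketched below. `validate_parameters` is a hypothetical function name, and the `ConstraintViolationError` class here is a local stand-in so the example runs on its own. The sketch tests for `None` rather than truthiness, which is a small deviation from the check quoted above.

```python
class ConstraintViolationError(ValueError):
    """Local stand-in for the library exception of the same name."""


def validate_parameters(parameters: dict[str, object]) -> None:
    """Reject conflicting parameters before any request body is built."""
    has_aspect_ratio = parameters.get("aspect_ratio") is not None
    has_quality = parameters.get("quality") is not None
    if has_aspect_ratio and has_quality:
        raise ConstraintViolationError(
            "aspect_ratio and quality are mutually exclusive for this provider; "
            "pass only one of them"
        )
```

Calling this before request building means the caller sees the conflict without any network activity having started.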


4. Inconsistent Content Filtering in Metadata

Different providers use different field names in _build_metadata:

  • OpenAI: filters {"data"}
  • Google: filters {"predictions", "candidates"}
  • ByteDance: filters {"images", "data"}

Recommendation: Document this pattern or extract to a class variable:

class ByteDanceImageGenerationClient(ImageGenerationClient):
    _CONTENT_FIELDS = {"images", "data"}
    
    def _build_metadata(self, response_data):
        filtered_data = {k: v for k, v in response_data.items() 
                        if k not in self._CONTENT_FIELDS}

Impact: Improves maintainability and makes the pattern explicit.


5. Google Finish Reason Edge Case

In test_finish_reason.py:164, empty string finishMessage is preserved:

assert result.message == ""  # Empty string preserved

Question: Should empty strings be normalized to None for consistency? Current behavior may confuse users checking if message:.

Recommendation: Consider normalizing:

finish_message = candidate.get("finishMessage") or None

Low Priority

6. Magic Strings for Event Types

Streaming parsers use string literals for event types:

if event_type == "image_generation.partial_succeeded":  # ByteDance
if event_type == "image_generation.partial_image":      # OpenAI

Recommendation: Define constants:

class ByteDanceEventType:
    PARTIAL_SUCCEEDED = "image_generation.partial_succeeded"
    COMPLETED = "image_generation.completed"
    PARTIAL_FAILED = "image_generation.partial_failed"

Impact: Reduces typo risk and improves refactoring safety.


7. Incomplete Docstrings

Some methods lack comprehensive docstrings:

  • ByteDanceImageGenerationClient._make_request (client.py:105)
  • OpenAIImageGenerationClient._build_metadata (client.py:79)

Recommendation: Add parameter descriptions and return types to all public methods.


8. README Consistency

The package README uses favicons from external domains:

<img src="https://www.google.com/s2/favicons?domain=openai.com&sz=64">

Minor concern: External dependencies for docs. Consider hosting logos locally or using emoji alternatives.


🔒 Security Review

Good Practices

  • Proper API key handling via SecretStr
  • Environment variable pattern (BYTEDANCE_API_KEY, etc.)
  • No hardcoded credentials
  • HTTPS for all API endpoints (ByteDance auto-upgrades HTTP→HTTPS)

⚠️ Suggestions

  1. ByteDance base URL: Uses regional endpoint (ark.ap-southeast.bytepluses.com). Consider making region configurable for compliance requirements.
  2. Request logging: Ensure HTTP client doesn't log full request bodies (potential prompt leakage). Verify in celeste.http.HTTPClient.

📊 Performance Considerations

Good Optimizations

  • Early return in registry._load_from_entry_points (registry.py:14-16) prevents redundant entry point scans
  • Efficient set-based model lookup in Google adapter (client.py:136-137)
  • Streaming support for progressive image generation (OpenAI gpt-image-1)

💡 Potential Optimizations

  1. Adapter caching: _get_adapter_for_model recomputes model ID sets on every call. Consider module-level caching:

    _IMAGEN_MODEL_IDS = {model.id for model in IMAGEN_MODELS}  # Computed once
  2. Base64 decoding: Happens synchronously in streaming. For large images, consider async decoding if performance issues arise.

🧪 Test Coverage Assessment

Component             | Coverage     | Notes
Google finish reason  | ✅ Excellent | 165 lines of thorough tests
Integration tests     | ✅ Good      | Cross-provider validation
OpenAI client         | ⚠️ Missing   | No unit tests found
ByteDance client      | ⚠️ Missing   | No unit tests found
Parameter mappers     | ⚠️ Missing   | Critical path untested
Dimensions constraint | ⚠️ Missing   | Complex validation logic untested
Streaming             | ⚠️ Partial   | Integration only, no unit tests

Target: Achieve 90%+ coverage as specified in pyproject.toml:89.


📝 Documentation

Strong Points

  • Clear README with quick start examples
  • Provider-specific documentation (ByteDance README)
  • Comprehensive PR description with parameter reference
  • Type hints serve as inline documentation

💡 Improvements

  1. Add docstring examples for complex parameters (e.g., Dimensions constraint usage)
  2. Document finish reason values per provider (what does "PROHIBITED_CONTENT" mean?)
  3. Add migration guide if users are coming from provider SDKs

🎯 Final Verdict

Score: 8.5/10

Recommendation: ✅ Approve with suggestions

This PR demonstrates excellent engineering and is ready for merge. The suggestions above are mostly enhancements rather than blockers. Priority should be:

  1. Before merge: None (code is production-ready)
  2. Follow-up PR: Add comprehensive unit tests for OpenAI/ByteDance clients
  3. Future enhancement: Custom exception types for better error semantics

🚀 Commit Quality

The 3 logical commits are well-structured:

  1. refactor: extract registry module - Clean separation of concerns
  2. feat: add image-generation package - Complete feature implementation
  3. chore: bump versions to 0.2.1 - Proper version management

Great adherence to conventional commits!


Excellent work on this feature! 🎉 The unified API and type-safe design will make Celeste AI much more powerful for users working with image generation.

…ltering

- Replace ValueError with ValidationError, ModelNotFoundError, and ConstraintViolationError
- Use Provider.GOOGLE enum instead of string 'google'
- Remove content field filtering from _build_metadata methods to match text-generation pattern
@claude

claude Bot commented Nov 10, 2025

Pull Request Review: Image Generation Capability

Summary

This PR adds comprehensive image generation support to Celeste AI with providers for OpenAI, Google, and ByteDance. The implementation is well-architected, type-safe, and follows the established patterns from the text-generation package. Overall, this is high-quality work that's ready to merge with a few minor suggestions for improvement.

✅ Code Quality & Architecture

Strengths

  • Excellent abstraction design: The unified ImageGenerationClient base class provides a clean contract that all providers implement consistently
  • Type safety: Full use of generics, Pydantic models, and TypedDict throughout - great IDE support
  • Separation of concerns: Clear separation between client logic, parameter mapping, streaming, and provider-specific adapters
  • Adapter pattern: Google's dual API support (Imagen/Gemini) using adapters is elegant and maintainable
  • Constraint system: The Dimensions constraint (lines 7-72 in constraints.py) is well-designed with clear validation logic for pixels and aspect ratios
  • Registry refactoring: Extracting the registry module to fix circular imports shows good architectural thinking

Minor Suggestions

  1. Truncated line in ByteDance parameters (packages/image-generation/src/celeste_image_generation/providers/bytedance/parameters.py:87):

    validated_value = self._validate_val

    This line appears to be truncated and should be:

    validated_value = self._validate_value(value, model)
  2. Magic numbers in constraints (packages/image-generation/src/celeste_image_generation/constraints.py:29):
    Consider extracting the dimension parsing logic into a helper method for better testability and reusability.

  3. Error message quality (packages/image-generation/src/celeste_image_generation/providers/bytedance/client.py:112-116):
    No change needed. The error message is excellent and is a strength rather than an issue: it clearly explains why the parameters conflict and what the user should do instead.

🐛 Potential Issues

Critical

  1. Truncated code in ByteDance WatermarkMapper - Line 87 in parameters.py is incomplete. This will cause a syntax error or attribute error at runtime.

Low Priority

  1. Division by zero protection (constraints.py:60): While unlikely given the validation at line 46, consider adding an explicit check before the division:

    if height == 0:
        msg = "Height cannot be zero"
        raise ConstraintViolationError(msg)
    aspect_ratio = width / height
  2. Empty string handling (providers/google/client.py:87-91): The test at lines 143-165 in test_finish_reason.py shows that empty strings are preserved as-is. Consider normalizing empty strings to None for consistency:

    finish_message = candidate.get("finishMessage") or None

⚡ Performance Considerations

Strengths

  • Early return optimization (registry.py:15-16): The set-based check prevents redundant entry point scanning - excellent!
  • Lazy loading: Providers are only imported when needed, reducing startup time
  • O(1) lookups: The adapter selection in google/client.py:132-134 uses sets for fast model type checking

Suggestions

  1. Base64 decoding: Multiple places decode base64 data (e.g., openai/client.py:64, google/client.py:69). These operations are fine for typical image sizes, but consider:

    • Adding size limits or warnings for very large images
    • Streaming decoding for multi-megabyte responses (future enhancement)
  2. Streaming chunk accumulation (streaming.py:33): The final chunk lookup is efficient. No concerns here.

🔒 Security Concerns

Good Practices

  • ✅ API keys use SecretStr from Pydantic
  • ✅ No hardcoded credentials
  • ✅ Input validation through constraints
  • ✅ Type validation prevents injection attacks

Recommendations

  1. URL validation: When returning image URLs from providers (e.g., bytedance/client.py:59, openai/client.py:68), consider validating that URLs use HTTPS and come from expected domains to prevent potential SSRF or phishing attacks.

  2. Base64 validation: The base64 decoding operations could benefit from size limits to prevent DoS through memory exhaustion:

    MAX_IMAGE_SIZE = 10 * 1024 * 1024  # 10MB
    if len(b64_json) > MAX_IMAGE_SIZE * 4 / 3:  # base64 is ~4/3 original size
        raise ValidationError("Image data exceeds maximum size")
    image_bytes = base64.b64decode(b64_json)
  3. Watermark default (bytedance/models.py:31): Good that ByteDance supports watermarking. Consider documenting the default behavior (appears to be true based on the docstring).
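
The URL validation recommendation above could look roughly like this. It is a sketch under assumptions: `is_trusted_image_url` and `ALLOWED_HOST_SUFFIXES` are hypothetical, and the allow-list uses a placeholder domain because the real provider hosts would need to be confirmed.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list with a placeholder domain, not a real provider host.
ALLOWED_HOST_SUFFIXES: tuple[str, ...] = (".example-cdn.com",)


def is_trusted_image_url(
    url: str,
    allowed_suffixes: tuple[str, ...] = ALLOWED_HOST_SUFFIXES,
) -> bool:
    """Return True only for HTTPS URLs whose host is on the allow-list."""
    try:
        parts = urlsplit(url)
    except ValueError:
        return False  # malformed URL, for example a broken IPv6 literal
    if parts.scheme != "https":
        return False
    host = (parts.hostname or "").lower()
    # Accept the bare domain or any subdomain, but not look-alike hosts.
    return any(
        host == suffix.lstrip(".") or host.endswith(suffix)
        for suffix in allowed_suffixes
    )
```

Matching on a leading-dot suffix avoids accepting hosts that merely end with the same characters as a trusted domain.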

🧪 Test Coverage

Strengths

  • ✅ Integration tests cover all three providers uniformly (test_generate.py)
  • ✅ Comprehensive unit tests for edge cases (Google finish reason parsing)
  • ✅ Parameterized tests for multiple scenarios
  • ✅ Clear test names and documentation

Gaps

  1. Missing unit tests for:

    • Dimensions constraint validation (various edge cases: zero dimensions, overflow, negative values)
    • ByteDance parameter mappers (especially the conflict resolution between aspect_ratio and quality)
    • OpenAI streaming chunk parsing
    • Error handling paths (malformed responses, missing fields)
  2. Integration test parameters: The integration tests use empty parameters dict. Consider adding tests with actual parameters:

    (Provider.OPENAI, "dall-e-3", {"quality": "hd", "aspect_ratio": "1024x1024"}),
    (Provider.GOOGLE, "imagen-4.0-ultra-generate-001", {"quality": "2K", "aspect_ratio": "16:9"}),
  3. Streaming tests: No streaming integration tests. Consider adding at least one test for GPT Image 1 streaming.

  4. Test coverage metrics: The PR mentions "90%+ code coverage maintained" - verify this includes the new package with:

    uv run pytest packages/image-generation/tests/unit_tests --cov=celeste_image_generation --cov-report=term-missing

📝 Additional Observations

Documentation

  • ✅ Excellent README with clear examples and provider documentation
  • ✅ Comprehensive docstrings throughout
  • ✅ Type hints make the code self-documenting
  • ✅ Parameter descriptions in PR body are very helpful

Consistency

  • ✅ Follows the same patterns as text-generation package
  • ✅ Consistent naming conventions across providers
  • ✅ Uniform error handling

Minor Notes

  1. pyproject.toml (packages/image-generation/pyproject.toml:1): Version is 0.2.1, matching the core package - good for consistency.

  2. Makefile update (Makefile:7): Good addition of the new package tests to the test command.

  3. Empty py.typed file (packages/image-generation/src/celeste_image_generation/py.typed): Correct marker for PEP 561 type checking support.

🎯 Recommendations

Before Merge

  1. Fix the truncated code in bytedance/parameters.py:87
  2. Add unit tests for the Dimensions constraint
  3. Consider adding streaming integration test (optional but recommended)

Future Enhancements

  1. Add more comprehensive error handling tests
  2. Consider adding retry logic for transient API failures
  3. Document rate limits for each provider
  4. Add examples for advanced features (watermarking, quality settings)

Final Verdict

APPROVE with minor fixes

This is excellent work that demonstrates strong software engineering practices. The architecture is sound, the code is clean and well-documented, and it integrates seamlessly with the existing codebase. Once the truncated line in bytedance/parameters.py is fixed, this is ready to ship!

Great job on this comprehensive feature addition! 🚀


Review generated with assistance from Claude Code

@Kamilbenkirane Kamilbenkirane merged commit f11bdb5 into main Nov 10, 2025
11 checks passed
@claude

claude Bot commented Nov 10, 2025

Pull Request Review: Image Generation Capability

Summary

This is an excellent PR that adds comprehensive image generation support to Celeste AI. The implementation is well-architected, type-safe, and follows existing patterns.

Strengths

  • Clean abstraction layer across providers (OpenAI, Google, ByteDance)
  • Excellent type safety with Pydantic and generics
  • Smart registry refactoring to fix circular imports
  • Adapter pattern for Google Imagen/Gemini APIs
  • Good test coverage with integration tests for all providers

Critical Issues

1. Security: Missing Input Validation (constraints.py:42)

  • No overflow check before width * height multiplication
  • Malicious input like "2147483647x2147483647" could crash
  • Fix: Add MAX_DIMENSION validation before calculating total pixels
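
A bounded parser along these lines is sketched below. Python integers are arbitrary precision, so the multiplication itself would not overflow; the guard is mainly useful for rejecting absurd sizes early with a clear error. `parse_dimensions` and the `MAX_DIMENSION` value of 8192 are assumptions for illustration, not values taken from constraints.py.

```python
import re

MAX_DIMENSION = 8192  # hypothetical per-side cap, chosen only for this example
_DIMENSIONS_RE = re.compile(r"([0-9]{1,5})x([0-9]{1,5})")


def parse_dimensions(value: str, max_dimension: int = MAX_DIMENSION) -> tuple[int, int]:
    """Parse 'WIDTHxHEIGHT' and bound each side before any pixel arithmetic."""
    match = _DIMENSIONS_RE.fullmatch(value)
    if match is None:
        raise ValueError(f"Expected 'WIDTHxHEIGHT' with short numeric sides, got {value!r}")
    width, height = int(match.group(1)), int(match.group(2))
    for name, side in (("width", width), ("height", height)):
        if not 1 <= side <= max_dimension:
            raise ValueError(f"{name} must be between 1 and {max_dimension}, got {side}")
    return width, height
```

Because zero is rejected here as well, a later `width / height` aspect ratio calculation cannot divide by zero.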

2. Base64 Error Handling (bytedance/client.py:62)

  • base64.b64decode not wrapped in try-except
  • Fix: Add proper error handling with ValidationError
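
A minimal sketch of the suggested error handling follows. `decode_image_b64` is a hypothetical helper, and `ValidationError` is defined locally as a stand-in for the library exception so the example is self-contained.

```python
import base64
import binascii


class ValidationError(ValueError):
    """Local stand-in for the library exception of the same name."""


def decode_image_b64(b64_json: str) -> bytes:
    """Decode base64 image data, converting decode failures to ValidationError."""
    try:
        # validate=True rejects characters outside the base64 alphabet
        return base64.b64decode(b64_json, validate=True)
    except (binascii.Error, ValueError) as exc:
        raise ValidationError("Response contained invalid base64 image data") from exc
```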

Moderate Issues

  1. Redundant None check in streaming.py:35
  2. Missing mime_type in openai/client.py:65 ImageArtifact
  3. Empty pass statement in io.py:42
  4. Potential race condition in registry.py:14-16 (not thread-safe)
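
If thread safety in the registry matters, one common approach is a lock with a double check, sketched below. `load_once` and its module-level state are hypothetical and are not the contents of registry.py.

```python
import threading
from collections.abc import Callable

_lock = threading.Lock()
_loaded = False


def load_once(loader: Callable[[], None]) -> bool:
    """Run loader at most once across threads; return True if this call ran it."""
    global _loaded
    if _loaded:  # fast path that avoids taking the lock after the first load
        return False
    with _lock:
        if _loaded:  # another thread finished while this one waited for the lock
            return False
        loader()
        _loaded = True
        return True
```

The second check inside the lock is what prevents two threads that both passed the fast path from running the loader twice.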

Test Coverage Gaps

  • No unit tests for OpenAI/ByteDance clients
  • No streaming tests
  • No parameter mapper tests
  • Limited constraint edge case tests

Recommendations

Before merge:

  1. Fix dimension overflow check
  2. Fix base64 error handling
  3. Add provider client unit tests
  4. Add constraint edge case tests

Post-merge:

  • Add streaming tests
  • Improve coverage to 90%+
  • Extract magic strings to constants

Final Verdict

APPROVE with minor changes

Excellent architecture and code quality. Critical issues are straightforward to fix. The unified API design is impressive - same interface works across all providers.

Once the two critical issues are addressed, this is ready to merge. Great work!
