feat(xai): add Grok Imagine for image and video generation by Kamilbenkirane · Pull Request #128 · withceleste/celeste-python

Kamilbenkirane · 2026-01-29T21:37:15Z

Summary

Add xAI's Grok Imagine models for image and video generation:

- `grok-imagine-image` - AI image generation and editing
- `grok-imagine-video` - AI video generation and editing

Features

Images (`grok-imagine-image`)

| Operation | Endpoint | Description |
|---|---|---|
| `generate` | `POST /v1/images/generations` | Generate images from text prompts |
| `edit` | `POST /v1/images/edits` | Edit existing images with text instructions |

Supported Parameters:

| Parameter | Type | Options | Description |
|---|---|---|---|
| `aspect_ratio` | Choice | `1:1`, `3:4`, `4:3`, `9:16`, `16:9`, `2:3`, `3:2`, `9:19.5`, `19.5:9`, `9:20`, `20:9`, `1:2`, `2:1`, `auto` | Image aspect ratio |
| `num_images` | Range | 1-10 | Number of images to generate |
| `output_format` | Choice | `url`, `b64_json` | Response format |

Videos (`grok-imagine-video`)

| Operation | Endpoint | Description |
|---|---|---|
| `generate` | `POST /v1/videos/generations` | Generate videos from text prompts |
| `edit` | `POST /v1/videos/edits` | Edit existing videos with text instructions |

Async Polling Pattern:

1. Initial request returns `request_id`
2. Poll `GET /v1/videos/{request_id}` until completion
3. HTTP 202 = processing, HTTP 200 = ready
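
The polling flow above can be sketched as follows. This is a minimal illustration, not the code from this PR: `poll_video`, `fetch_status`, and the default values are hypothetical names, and the HTTP call is abstracted as an async callable that returns `(status_code, body)`.

```python
import asyncio
from collections.abc import Awaitable, Callable
from typing import Any

# Hypothetical transport: takes an endpoint path and returns
# (HTTP status code, parsed JSON body).
FetchStatus = Callable[[str], Awaitable[tuple[int, dict[str, Any]]]]


async def poll_video(
    fetch_status: FetchStatus,
    request_id: str,
    max_polls: int = 60,
    poll_interval: float = 5.0,
) -> dict[str, Any]:
    """Poll GET /v1/videos/{request_id} until the video is ready."""
    endpoint = f"/v1/videos/{request_id}"
    for _ in range(max_polls):
        status_code, body = await fetch_status(endpoint)
        if status_code == 200:  # ready
            return body
        if status_code != 202:  # anything other than "processing" is unexpected
            raise RuntimeError(f"Unexpected status {status_code} from {endpoint}")
        await asyncio.sleep(poll_interval)
    raise TimeoutError(f"Video {request_id} not ready after {max_polls} polls")
```

With the defaults assumed here, the loop gives up after roughly 60 × 5 s = 5 minutes.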

Supported Parameters:

| Parameter | Type | Options | Description |
|---|---|---|---|
| `duration` | Range | 1-15 seconds | Video duration |
| `aspect_ratio` | Choice | `16:9`, `4:3`, `1:1`, `9:16`, `3:4`, `3:2`, `2:3` | Video aspect ratio |
| `resolution` | Choice | `720p`, `480p` | Video resolution |
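
To illustrate the constraints in the table, a client-side check might look like the sketch below. `validate_video_params` is a hypothetical helper, not part of Celeste; the allowed values are copied from the table above.

```python
# Allowed values copied from the video parameter table.
VIDEO_ASPECT_RATIOS = {"16:9", "4:3", "1:1", "9:16", "3:4", "3:2", "2:3"}
VIDEO_RESOLUTIONS = {"720p", "480p"}


def validate_video_params(duration: int, aspect_ratio: str, resolution: str) -> None:
    """Raise ValueError if a parameter falls outside the documented options."""
    if not 1 <= duration <= 15:
        raise ValueError(f"duration must be 1-15 seconds, got {duration}")
    if aspect_ratio not in VIDEO_ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect_ratio: {aspect_ratio!r}")
    if resolution not in VIDEO_RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution!r}")
```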

Usage

```python
import asyncio

import celeste


async def main() -> None:
    # Image generation
    image = await celeste.images.generate(
        prompt="A cat in a tree",
        model="grok-imagine-image",
        aspect_ratio="16:9",
        num_images=1,
    )
    print(image.content.url)

    # Image editing
    edited = await celeste.images.edit(
        image=image.content,
        prompt="Add a bird in the tree",
        model="grok-imagine-image",
    )

    # Video generation
    video = await celeste.videos.generate(
        prompt="A ball bouncing",
        model="grok-imagine-video",
        duration=5,
        aspect_ratio="16:9",
    )
    print(video.content.url)

    # Video editing
    edited = await celeste.videos.edit(
        video=video.content,
        prompt="Change the ball color to blue",
        model="grok-imagine-video",
    )


# Outside a notebook, `await` needs a running event loop.
asyncio.run(main())
```

Files Changed

Provider-level (HTTP/API handling)

`src/celeste/providers/xai/images/` - Images API client, config, parameters
`src/celeste/providers/xai/videos/` - Videos API client with async polling

Modality-level (Celeste interface)

`src/celeste/modalities/images/providers/xai/` - Images client, models, parameters
`src/celeste/modalities/videos/providers/xai/` - Videos client, models, parameters
`src/celeste/modalities/videos/io.py` - Added `video` field to `VideoInput` for edit ops

Tests

Added xAI to integration tests for images generate/edit and videos generate

Test plan

Unit tests pass (433 tests, 82% coverage)
Integration test: `pytest -m integration -k xai` for image generation
Integration test: `pytest -m integration -k xai` for image editing
Integration test: `pytest -m integration -k xai --slow` for video generation
Manual test: video editing with generated video URL

🤖 Generated with Claude Code

Add xAI Grok Imagine support for image and video generation:

Images (grok-imagine-image):
- Generate and edit operations
- Parameters: aspect_ratio, num_images, output_format
- Aspect ratios: 1:1, 3:4, 4:3, 9:16, 16:9, 2:3, 3:2, and more

Videos (grok-imagine-video):
- Generate and edit operations
- Async polling pattern (HTTP 200=ready, 202=processing)
- Parameters: duration (1-15s), aspect_ratio, resolution

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

claude · 2026-01-29T21:41:50Z

Pull Request Review: xAI Grok Imagine Integration

Overview

This PR adds xAI's Grok Imagine models for image and video generation/editing. The implementation follows established patterns in the codebase and demonstrates solid understanding of the architecture.

✅ Strengths

1. Excellent Architectural Consistency

Follows the two-layer mixin pattern consistently across all implementations
Provider-level mixins in src/celeste/providers/xai/ handle HTTP/API logic
Modality-level clients handle operations cleanly
Parameter mappers properly separated and reusable

2. Async Polling Implementation

The video polling implementation is well-designed with proper handling of immediate responses, async polling flow, configurable timeout, and clean error handling.

3. Good Error Handling

Raises appropriate exceptions (ValueError, RuntimeError, TimeoutError)
Uses descriptive error messages
Properly handles missing fields in responses

4. Clean Code Quality

Clear docstrings with operation descriptions
Type hints throughout
Consistent naming conventions
Well-organized file structure

🔍 Issues & Recommendations

CRITICAL: Potential Bug in Video Edit Operation

Location: src/celeste/modalities/videos/providers/xai/client.py:28

Issue: This assumes VideoArtifact always has a url attribute, but artifacts can also have data (bytes) or path (local file). This will fail with AttributeError for non-URL artifacts.

Recommendation: Add proper handling for non-URL artifacts or raise a clear error message.

MEDIUM: Image Edit Missing Proper Serialization

Location: src/celeste/modalities/images/providers/xai/client.py:32

Issue: Passing the ImageArtifact object directly to the request. Verify this matches xAI API specification.
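
One way to make the edit path explicit is a small serializer that turns an artifact into a string before it is placed in the request. The sketch below uses a hypothetical `Artifact` dataclass with the `url`, `data`, and `path` fields mentioned above, not Celeste's real classes. The data-URI output format is an assumption and should be checked against the xAI API documentation.

```python
import base64
from dataclasses import dataclass
from pathlib import Path
from typing import Optional


@dataclass
class Artifact:
    """Hypothetical stand-in for an image or video artifact."""

    url: Optional[str] = None
    data: Optional[bytes] = None
    path: Optional[str] = None


def artifact_to_string(artifact: Artifact, mime_type: str = "image/png") -> str:
    """Return the URL if present, otherwise a base64 data URI of the bytes."""
    if artifact.url:
        return artifact.url
    raw = artifact.data
    if raw is None and artifact.path:
        raw = Path(artifact.path).read_bytes()  # load local file contents
    if raw is None:
        raise ValueError("Artifact has no url, data, or path")
    encoded = base64.b64encode(raw).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

Failing with a `ValueError` for an empty artifact gives callers a clear message instead of an `AttributeError` deep inside request construction.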

MEDIUM: Missing Video Edit Tests

Issue: There is no automated integration test for video editing. The images modality has test_edit.py, but the videos modality does not.

LOW: Polling Configuration Could Be More Robust

Location: src/celeste/providers/xai/videos/config.py:16-17

5-minute timeout might be insufficient for longer videos. Consider increasing default timeout to 10-15 minutes.

LOW: Missing Type Validation

Add type validation in _parse_content methods to prevent silent failures.

🔒 Security: No Issues Found

API keys properly handled via AuthHeader with SecretStr
No hardcoded credentials
Uses standard auth registration pattern

⚡ Performance

The polling implementation could benefit from exponential backoff to reduce API calls.
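
As an illustration of that suggestion, the delay between polls could grow geometrically up to a cap. `backoff_delays` and its default values are hypothetical, chosen only for the example.

```python
from collections.abc import Iterator


def backoff_delays(
    base: float = 1.0,
    factor: float = 2.0,
    cap: float = 30.0,
    max_polls: int = 10,
) -> Iterator[float]:
    """Yield one sleep duration per poll: base, base*factor, ..., capped at cap."""
    delay = base
    for _ in range(max_polls):
        yield min(delay, cap)
        delay *= factor
```

Each yielded value would be passed to `asyncio.sleep` in place of the fixed interval. Early polls then catch fast completions quickly, while long-running jobs generate fewer requests.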

📋 Test Coverage

✅ Image generation integration test
✅ Image edit integration test
✅ Video generation integration test
❌ Video edit integration test (missing)
❌ Unit tests for edge cases (missing)

🎯 Action Items

Before Merge (MUST FIX)

Fix video edit to handle non-URL artifacts or raise clear error
Verify image edit serialization matches xAI API spec
Add video edit integration test

Recommended (SHOULD FIX)

Increase polling timeout or make configurable
Add type validation in _parse_content methods
Consider exponential backoff for polling

Optional (NICE TO HAVE)

Add unit tests for edge cases
Strip notebook execution metadata
Document timeout/retry behavior

🎉 Conclusion

This is a solid implementation that follows best practices and integrates cleanly with the existing codebase. The main concerns are around edge case handling for different artifact types in edit operations and missing test coverage for video editing.

Recommendation: Approve with changes requested ✅ (after fixing the critical issues)

Great work on maintaining consistency with the codebase patterns! 🚀

- Image edit: serialize ImageArtifact to URL or base64 string instead of passing object directly (xAI API expects string)
- Video edit: validate video has URL before using, raise clear error if not (xAI only supports URL, not base64/path)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Let the xAI API handle validation for video edit requests. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

claude · 2026-01-29T21:53:48Z

PR Review: xAI Grok Imagine Integration

Summary

This PR adds comprehensive support for xAI's Grok Imagine models for image and video generation. The implementation follows the existing codebase patterns well and integrates cleanly with the Celeste architecture.

✅ Strengths

Architecture & Design

Excellent Pattern Consistency: The implementation closely follows existing provider patterns (OpenAI, Google, etc.), making it easy to understand and maintain
Proper Layering: Clean separation between provider-level HTTP/API handling and modality-level Celeste interfaces
Async Polling Pattern: The video client implements async polling correctly with a bounded timeout (5s intervals, max 60 polls = 5 minutes)
Mixin Architecture: Proper use of mixins for shared functionality across modalities

Code Quality

Type Safety: Good use of type hints throughout (Unpack[ImageParameters], dict[str, Any], etc.)
Error Handling: Proper validation and error messages for missing data
Documentation: Clear docstrings explaining the purpose and behavior of classes and methods
DRY Principle: Parameter mappers properly inherit from base classes and avoid duplication

🔍 Issues & Recommendations

🐛 Critical: Video Polling Logic Bug

Location: src/celeste/providers/xai/videos/client.py:108-113

```python
# Parse response for error handling
video_obj = status_response.json()
status = video_obj.get("status", "")
if status == config.STATUS_FAILED:
    error = video_obj.get("error", "Video generation failed")
    raise RuntimeError(error)
```

Issue: This code is unreachable because:

Line 98: self._handle_error_response(status_response) is called first
Line 101: If status is 200, we return immediately
Line 105: If status is 202, we continue the loop
Lines 108-113 can only be reached if status is neither 200 nor 202, but _handle_error_response would have already raised an exception for non-2xx status codes

Recommendation: Either remove the unreachable code or restructure the logic:

```python
# Option 1: Remove unreachable code
if status_response.status_code == 200:
    return status_response.json()
if status_response.status_code == 202:
    continue
# _handle_error_response will handle any other status codes
self._handle_error_response(status_response)
```

⚠️ Potential Issues

1. Race Condition in Video Polling

Location: src/celeste/providers/xai/videos/client.py:91-92

Each loop iteration sleeps before querying the status, so even a video that is already finished costs one full poll interval. This is an unnecessary delay rather than a true race condition:

```python
# Poll for completion
poll_endpoint = f"/v1/videos/{request_id}"
for _ in range(config.MAX_POLLS):
    await asyncio.sleep(config.POLL_INTERVAL)  # Sleeps before first check
```

Recommendation: Check status immediately before sleeping:

```python
for _ in range(config.MAX_POLLS):
    status_response = await self.http_client.get(...)
    self._handle_error_response(status_response)

    if status_response.status_code == 200:
        return status_response.json()
    if status_response.status_code != 202:
        break

    await asyncio.sleep(config.POLL_INTERVAL)
```

2. Import Location Inconsistency

Location: src/celeste/modalities/images/providers/xai/client.py:339

Base64 is imported inside the method:

```python
def _parse_content(self, response_data: dict[str, Any], **parameters) -> ImageArtifact:
    b64_json = image_data.get("b64_json")
    if b64_json:
        import base64  # Imported here
```

Recommendation: Move to top-level imports for consistency with OpenAIImagesClient (line 3).

3. Missing MIME Type for Video Artifacts

Location: src/celeste/modalities/videos/providers/xai/client.py:584

Unlike OpenAI's video client which specifies VideoMimeType.MP4, xAI videos don't specify a MIME type:

```python
return VideoArtifact(url=url)  # Missing mime_type
```

Recommendation: Add MIME type if known, or document why it's not needed for URL-based artifacts.

4. Timeout Configuration Hardcoded

Location: src/celeste/providers/xai/videos/config.py:89-90

```python
MAX_POLLS = 60
POLL_INTERVAL = 5  # seconds
```

Recommendation: Consider making these configurable through environment variables or client initialization for users who need longer/shorter timeouts.
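
A minimal sketch of the environment-variable approach, assuming hypothetical variable names `XAI_VIDEO_MAX_POLLS` and `XAI_VIDEO_POLL_INTERVAL` (neither exists in Celeste). The defaults mirror the constants shown above.

```python
import os
from collections.abc import Mapping
from dataclasses import dataclass


@dataclass(frozen=True)
class PollingConfig:
    max_polls: int = 60
    poll_interval: float = 5.0  # seconds

    @property
    def timeout_seconds(self) -> float:
        """Upper bound on total time spent sleeping between polls."""
        return self.max_polls * self.poll_interval


def polling_config_from_env(env: Mapping[str, str] = os.environ) -> PollingConfig:
    """Read overrides from the environment, falling back to the defaults."""
    return PollingConfig(
        max_polls=int(env.get("XAI_VIDEO_MAX_POLLS", 60)),
        poll_interval=float(env.get("XAI_VIDEO_POLL_INTERVAL", 5.0)),
    )
```

Passing the mapping as an argument keeps the function easy to test without touching the real process environment.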

📝 Minor Improvements

1. Notebook Cell Formatting

The changes to notebooks/working-with-images.ipynb reorder JSON fields (moving execution_count before outputs). While this works, it's not directly related to the xAI feature.

Recommendation: Consider separating notebook formatting changes into a separate commit or reverting if not necessary.

2. Error Messages

Generic error messages could be more specific:

Line 349 (images): "No image URL or base64 data in response" - could indicate which format was expected based on output_format parameter
Line 1051 (videos): "No video URL in response" - could show the actual response structure for debugging

3. Video Edit Input Validation

Location: src/celeste/modalities/videos/providers/xai/client.py:540-542

```python
if inputs.video is not None:
    request["video"] = {"url": inputs.video.url}
```

Potential Issue: No validation that inputs.video.url is not None. If a VideoArtifact is created with data but no URL, this will fail.

Recommendation: Add validation:

```python
if inputs.video is not None:
    if not inputs.video.url:
        raise ValueError("Video editing requires a video URL")
    request["video"] = {"url": inputs.video.url}
```

🔒 Security Assessment

✅ No significant security issues found

Credentials properly managed through the existing XAI_API_KEY environment variable
No sensitive data logged or exposed
Base64 decoding uses built-in library (safe)
URL handling doesn't introduce injection vulnerabilities
Async sleep durations are reasonable (5s intervals, not susceptible to timing attacks)

🧪 Test Coverage

Current Coverage

✅ Integration tests added for images generate/edit
✅ Integration tests added for videos generate
❌ Video editing integration test marked as manual (unchecked in test plan)

Recommendations

Add Video Edit Test: The video edit functionality is implemented but not tested in integration tests
Add Unit Tests: Consider adding unit tests for:
- Parameter mappers
- Response parsing logic
- Error handling paths
- Polling timeout behavior
Edge Cases: Test scenarios like:
- Immediate video availability (no polling needed)
- Polling timeout
- Invalid response formats
- Missing URLs/base64 data

Test Coverage Impact

The PR mentions 82% overall coverage with 433 tests passing, which is good. However, the new xAI code paths may not be fully covered by existing tests.

📊 Performance Considerations

Video Generation Timeout

Current: 5 minutes maximum (60 polls × 5 seconds)
Assessment: Reasonable for typical video generation, but possibly too short for videos at the 15-second maximum duration
Recommendation: Consider increasing MAX_POLLS to 120 (10 minutes) or making it model-aware based on requested duration

Polling Efficiency

Current: Fixed 5-second intervals
Consideration: Could implement exponential backoff to reduce API calls
Impact: Not critical given the 5-minute timeout window

✅ Final Verdict

Overall Assessment: APPROVE with minor changes recommended

This is a solid implementation that:

✅ Follows existing patterns consistently
✅ Has good code quality and documentation
✅ Includes reasonable test coverage
✅ Has no security concerns
⚠️ Has one critical bug (unreachable error handling code)
⚠️ Has a few minor issues that should be addressed

Required Changes

Fix the unreachable error handling code in video polling logic

Recommended Changes

Fix race condition in video polling (check before first sleep)
Move base64 import to top-level
Add video edit integration test
Add input validation for video edit operations
Consider making timeout configurable

Optional Improvements

Add MIME type to video artifacts
Improve error messages with more context
Add unit tests for edge cases
Consider exponential backoff for polling

Great work on this integration! 🚀

Review performed by Claude Code PR Review

Kamilbenkirane and others added 2 commits January 29, 2026 22:48

fix(xai): remove client-side video URL validation

5af9d83

Let the xAI API handle validation for video edit requests. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

Kamilbenkirane merged commit 0ca70d0 into main Jan 29, 2026
11 checks passed

Kamilbenkirane merged 3 commits into main from xai/imagine
