feat(speech-generation): add Gradium TTS provider with WebSocket API by Kamilbenkirane · Pull Request #89 · withceleste/celeste-python

Kamilbenkirane · 2025-12-19T16:33:47Z

Summary

Add Gradium as a new speech generation provider using WebSocket-based TTS API
Implement 14 flagship voices across 5 languages (EN, FR, DE, ES, PT)
Add celeste-gradium provider package with WebSocket client implementation
Support for multiple audio output formats (wav, pcm, opus, ulaw, alaw)

Changes

New Provider Package (`packages/providers/gradium/`)

WebSocket-based TTS client (not HTTP REST like other providers)
Parameter mappers for voice, speed, output format, JSON config
Configuration with DEFAULT_VOICE_ID consistent with ElevenLabs pattern

Capability Integration (`packages/capabilities/speech-generation/`)

Gradium provider client extending both GradiumTextToSpeechClient and SpeechGenerationClient
14 flagship voices: Emma, Lucas, Camille, Antoine, Mia, Lukas, Lucia, Carlos, Sofia, Miguel (+ 4 more)
Model definitions for gradium-tts-v1 and gradium-tts-v1-turbo

Core Updates

Add Provider.GRADIUM enum value
Add Capability.SPEECH_GENERATION to Gradium
Add GRADIUM_API_KEY credential support
Add WebSocket utility module (src/celeste/websocket.py)

Test plan

All existing unit tests pass (316 tests)
Coverage at 84.21% (above 80% threshold)
MyPy type checking passes
Ruff linting/formatting passes
Bandit security scan passes
Integration test with live Gradium API (requires API key)

Add comprehensive Gradium text-to-speech integration with WebSocket streaming support.

Features:
- WebSocket-based TTS streaming with low-latency audio generation
- 14 flagship voices across 5 languages (en, fr, de, es, pt)
- Custom voice cloning with create/list/update/delete operations
- Speed control via padding_bonus parameter (-4.0 to 4.0)
- Multiple audio formats (wav, pcm, opus, ulaw_8000, alaw_8000, pcm_16000, pcm_24000)
- EU/US regional endpoints for optimized latency
- Credits monitoring and usage tracking

Implementation details:
- WebSocket protocol handling with proper message sequencing
- Parameter mappers for voice, speed, and response_format
- Pydantic models for API responses (VoiceInfo, CreditsSummary, TTSResult)
- Full voice management REST API integration
- Comprehensive test suite with 6 test functions

Dependencies:
- Added websockets>=13.0 to support WebSocket connections

Documentation:
- Complete README with usage examples and API reference
- Test scripts for validation (test_gradium_tts.py, test_gradium_minimal.py)

Add Gradium as a new speech generation provider using their WebSocket-based Text-to-Speech API. This provider enables high-quality multilingual speech synthesis with 14 flagship voices across 5 languages.

Key changes:
- Add celeste-gradium provider package with WebSocket TTS client
- Add Gradium capability provider with parameter mappers
- Add WebSocket client module to celeste core
- Add 14 flagship voices (EN, FR, DE, ES, PT)
- Add Gradium integration test and CI secrets
- Bump celeste-ai and celeste-speech-generation to 0.3.1

Provider features:
- WebSocket streaming for low-latency audio generation
- Support for wav, pcm, opus output formats
- Voice selection by name (Emma, Kent, Sydney, etc.)
- Speed control via padding_bonus translation

- Export GRADIUM_VOICES in speech-generation __init__.py
- Add DEFAULT_VOICE_ID to config.py (consistent with ElevenLabs)
- Add _make_request stub to satisfy abstract interface
- Remove unused httpx import from capability client

claude · 2025-12-19T16:36:23Z

PR Review: Add Gradium TTS Provider with WebSocket API

Summary

This PR adds Gradium as a new speech generation provider with WebSocket-based TTS. The implementation is well-structured and follows existing patterns in the codebase. Overall, this is a solid addition with good architectural decisions.

✅ Strengths

Code Quality & Architecture

Excellent consistency: The implementation follows the existing provider patterns established by ElevenLabs, Google, and OpenAI
Clean separation of concerns: Provider-specific logic in celeste-gradium package, capability integration in speech-generation
Good abstraction: The mixin pattern for GradiumTextToSpeechClient is well-designed
Proper typing: Type hints are used consistently throughout

WebSocket Implementation

Proper protocol handling: The WebSocket flow is implemented correctly with setup → ready → text → audio chunks → end_of_stream
Context manager usage: Uses async with for automatic cleanup of WebSocket connections
Error handling: Captures and raises errors from the WebSocket stream appropriately
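
To make the sequencing above concrete, here is a minimal, network-free sketch of the receive side of that flow. The message shapes (a `type` field with the values `ready`, `audio`, `end_of_stream`, `error`, and a base64 `audio` payload) are assumptions chosen for illustration; the PR text does not spell out Gradium's wire format, and `collect_audio` is a hypothetical helper, not a function from this PR.

```python
import base64
import json
from typing import Iterable


def collect_audio(messages: Iterable[str]) -> bytes:
    """Consume server messages in order and return the concatenated audio.

    Assumed (hypothetical) message shapes:
      {"type": "ready"}
      {"type": "audio", "audio": "<base64>"}
      {"type": "end_of_stream"}
      {"type": "error", "message": "..."}
    """
    ready = False
    chunks = []
    for raw in messages:
        msg = json.loads(raw)
        kind = msg.get("type")
        if kind == "error":
            raise RuntimeError(f"TTS stream error: {msg.get('message', 'unknown')}")
        if kind == "ready":
            ready = True
        elif kind == "audio":
            # Audio before the ready acknowledgement means the sequence is broken.
            if not ready:
                raise RuntimeError("audio received before ready")
            chunks.append(base64.b64decode(msg["audio"]))
        elif kind == "end_of_stream":
            return b"".join(chunks)
    # The iterable ran out without the terminating message.
    raise RuntimeError("stream closed before end_of_stream")
```

Keeping the sequencing logic separate from the socket I/O like this would also make it easy to test without a live connection.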

Documentation

Clear docstrings: Methods are well-documented with clear explanations
Inline comments: Critical sections have helpful comments explaining the flow

⚠️ Issues & Concerns

1. WebSocket Connection Management (Medium Priority)

Location: packages/providers/gradium/src/celeste_gradium/text_to_speech/client.py:64

The implementation creates a new WebSocket connection for each TTS request:

async with ws_connect(url, additional_headers=headers) as ws:

Issues:

No connection pooling or reuse
Creates overhead for multiple sequential requests
Doesn't leverage the WebSocketClient class added in src/celeste/websocket.py

Recommendation: Consider using the WebSocketClient utility for consistency, though creating new connections per request may be acceptable depending on Gradium's API design.

2. Unused WebSocket Utility Module (Low Priority)

Location: src/celeste/websocket.py

This module defines WebSocketClient and WebSocketConnection classes but they're not used anywhere in the Gradium implementation. The Gradium client directly uses websockets.asyncio.client.connect instead.

Questions:

Was this intended for future use?
Should Gradium use this abstraction?
If unused, should it be removed to avoid confusion?

Recommendation: Either use the utility module or remove it if it's not needed yet. If it's for future providers, add a comment explaining that.

3. WebSocket Cleanup Does Not Close Connections (Low Priority)

Location: src/celeste/__init__.py:127

The module exports close_all_ws_clients() but the function in websocket.py:117-119 doesn't actually close any connections - it just clears the registry:

async def close_all_ws_clients() -> None:
    """Close all WebSocket clients and clear registry."""
    _ws_clients.clear()

Issue: This doesn't match the function's documentation. Active WebSocket connections won't be closed, potentially leading to resource leaks.

Recommendation: If WebSocketConnection objects are stored in the registry, iterate through them and call .close() before clearing.

4. Error Handling: JSON Parsing (Medium Priority)

Location: packages/providers/gradium/src/celeste_gradium/text_to_speech/client.py:94-96

if isinstance(message, bytes):
    data = json.loads(message.decode("utf-8"))
else:
    data = json.loads(message)

Issue: No exception handling for:

Invalid UTF-8 in bytes
Malformed JSON
Missing expected keys in the parsed JSON

Recommendation: Wrap in try-except and provide informative error messages:

try:
    if isinstance(message, bytes):
        data = json.loads(message.decode("utf-8"))
    else:
        data = json.loads(message)
except (json.JSONDecodeError, UnicodeDecodeError) as e:
    raise ValueError(f"Invalid message from Gradium API: {e}") from e
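
The snippet above covers invalid UTF-8 and malformed JSON but not the third item, missing keys. A small sketch that handles all three follows; `parse_message` is a hypothetical helper name, and treating `type` as the required key is an assumption about the message schema.

```python
import json
from typing import Any, Dict, Union


def parse_message(message: Union[str, bytes]) -> Dict[str, Any]:
    """Decode one WebSocket frame into a dict, with uniform error reporting."""
    try:
        text = message.decode("utf-8") if isinstance(message, bytes) else message
        data = json.loads(text)
    except (UnicodeDecodeError, json.JSONDecodeError) as e:
        raise ValueError(f"Invalid message from Gradium API: {e}") from e
    # Assumption: every message is a JSON object carrying a "type" field.
    if not isinstance(data, dict) or "type" not in data:
        raise ValueError("Gradium message is missing the 'type' field")
    return data
```

Callers then only need to catch `ValueError`, whichever of the three failure modes occurred.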

5. Test Coverage: No Unit Tests for Gradium (High Priority)

Observation: Only integration tests exist (in test_generate.py:22). No unit tests for:

WebSocket flow logic
Error handling paths
Parameter mapping
MIME type mapping

Recommendation: Add unit tests with mocked WebSocket connections to cover:

Successful TTS generation
Error responses from the API
Connection failures
Malformed responses
Parameter transformation edge cases
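
As a starting point, such tests could be structured around a hand-written fake socket rather than a live connection. In the sketch below, `FakeWebSocket`, `synthesize`, and the message shapes are all hypothetical stand-ins; the real tests would target the Gradium client's own method and wire format. Plain `try`/`except` is used so the example runs without pytest, though `pytest.raises` would be the more natural choice in this repository.

```python
import asyncio
import base64
import json


class FakeWebSocket:
    """Test double: records sent frames and replays scripted replies."""

    def __init__(self, replies):
        self.sent = []
        self._replies = list(replies)

    async def send(self, frame):
        self.sent.append(frame)

    def __aiter__(self):
        return self

    async def __anext__(self):
        if not self._replies:
            raise StopAsyncIteration  # simulates the server closing the socket
        return self._replies.pop(0)


async def synthesize(ws, text):
    """Hypothetical stand-in for the client logic under test."""
    await ws.send(json.dumps({"type": "setup"}))
    chunks = []
    async for frame in ws:
        msg = json.loads(frame)
        if msg["type"] == "ready":
            await ws.send(json.dumps({"type": "text", "text": text}))
        elif msg["type"] == "audio":
            chunks.append(base64.b64decode(msg["audio"]))
        elif msg["type"] == "error":
            raise RuntimeError(msg["message"])
        elif msg["type"] == "end_of_stream":
            return b"".join(chunks)
    raise ConnectionError("connection closed before end_of_stream")


def audio_frame(payload):
    return json.dumps({"type": "audio", "audio": base64.b64encode(payload).decode()})


READY = json.dumps({"type": "ready"})
END = json.dumps({"type": "end_of_stream"})


def test_successful_generation():
    ws = FakeWebSocket([READY, audio_frame(b"he"), audio_frame(b"llo"), END])
    assert asyncio.run(synthesize(ws, "hello")) == b"hello"
    assert [json.loads(f)["type"] for f in ws.sent] == ["setup", "text"]


def test_api_error_is_raised():
    ws = FakeWebSocket([READY, json.dumps({"type": "error", "message": "bad voice"})])
    try:
        asyncio.run(synthesize(ws, "hello"))
    except RuntimeError as e:
        assert "bad voice" in str(e)
    else:
        raise AssertionError("expected RuntimeError")


def test_dropped_connection_is_reported():
    ws = FakeWebSocket([READY, audio_frame(b"he")])
    try:
        asyncio.run(synthesize(ws, "hello"))
    except ConnectionError:
        pass
    else:
        raise AssertionError("expected ConnectionError")
```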

6. Hardcoded Region in BASE_URL (Low Priority)

Location: packages/providers/gradium/src/celeste_gradium/text_to_speech/config.py:12

BASE_URL = "wss://eu.api.gradium.ai/api"

Issue: The region is hardcoded to "eu". Users in other regions may experience higher latency.

Recommendation: Consider making the region configurable via environment variable or parameter:

import os

REGION = os.getenv("GRADIUM_REGION", "eu")
BASE_URL = f"wss://{REGION}.api.gradium.ai/api"
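
A slightly more defensive variant validates the region before interpolating it into the hostname, so a typo or unexpected value fails early instead of producing a bad URL. `build_base_url` and `ALLOWED_REGIONS` are hypothetical names; only the `eu` host appears in the reviewed code, and the `us` host is an assumption based on the commit message's mention of EU/US regional endpoints.

```python
import os

# Assumption: "eu" is confirmed by config.py; "us" is inferred, not verified.
ALLOWED_REGIONS = ("eu", "us")


def build_base_url(region=None):
    """Return the WebSocket base URL for a region.

    Falls back to the GRADIUM_REGION environment variable, then to "eu".
    """
    chosen = (region or os.getenv("GRADIUM_REGION", "eu")).strip().lower()
    if chosen not in ALLOWED_REGIONS:
        raise ValueError(f"Unsupported Gradium region: {chosen!r}")
    return f"wss://{chosen}.api.gradium.ai/api"
```

An explicit argument takes precedence over the environment variable, which keeps per-client overrides possible.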

7. Model ID "default" Too Generic (Low Priority)

Location: packages/capabilities/speech-generation/src/celeste_speech_generation/providers/gradium/models.py:12

Model(
    id="default",
    provider=Provider.GRADIUM,
    display_name="Gradium Default TTS",

Issue: Using "default" as the model ID is ambiguous. If Gradium adds more models later, this could cause confusion.

Recommendation: Use a more specific ID like "gradium-tts-v1" (which is mentioned in the PR description but not used in the code).

8. Empty Audio Response Handling (Low Priority)

Location: packages/capabilities/speech-generation/src/celeste_speech_generation/providers/gradium/client.py:65-67

if not audio_bytes:
    msg = "No audio data in response"
    raise ValueError(msg)

Question: What happens if audio_chunks is an empty list? The joined result would be an empty bytes object b"", which is falsy.

Recommendation: This is correct, but consider checking if not audio_chunks earlier in the flow for clearer error messaging.
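
One way to separate the two cases, sketched with a hypothetical helper name (`join_audio_chunks`) and illustrative error messages:

```python
from typing import Sequence


def join_audio_chunks(chunks: Sequence[bytes]) -> bytes:
    """Join audio chunks, distinguishing 'nothing received' from 'all empty'."""
    if not chunks:
        raise ValueError("No audio chunks received from Gradium")
    audio_bytes = b"".join(chunks)
    if not audio_bytes:
        raise ValueError("Audio chunks were received but contained no data")
    return audio_bytes
```

Both paths still raise `ValueError`, so existing callers are unaffected; only the message becomes more specific.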

9. Speed Mapper Formula Could Be Confusing (Low Priority)

Location: packages/capabilities/speech-generation/src/celeste_speech_generation/providers/gradium/parameters.py:50

padding_bonus = (1.0 - validated_value) * 4.0

Issue: The formula is documented, but the reasoning behind the * 4.0 multiplier is not.

Recommendation: Add a comment explaining why 4.0 is the multiplier:

# Linear map from speed to padding_bonus; speed range [0.25, 4.0] yields [3.0, -12.0]
# At speed 0.25 (slowest): padding_bonus = (1.0 - 0.25) * 4.0 = 3.0
# At speed 1.0 (normal): padding_bonus = 0.0
# At speed 2.0 (2x faster): padding_bonus = -4.0
# At speed 4.0 (fastest): padding_bonus = (1.0 - 4.0) * 4.0 = -12.0
padding_bonus = (1.0 - validated_value) * 4.0
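
The arithmetic can be checked directly. The sketch below wraps the formula in a hypothetical function, `speed_to_padding_bonus`, and adds a second hypothetical helper that tests the result against the -4.0 to 4.0 range quoted for padding_bonus in the commit message. Whether the mapper should clamp or reject out-of-range values is a design decision this sketch does not make.

```python
def speed_to_padding_bonus(speed: float) -> float:
    """Same formula as the mapper: 1.0 maps to 0.0, faster speeds go negative."""
    return (1.0 - speed) * 4.0


def within_documented_range(speed: float, low: float = -4.0, high: float = 4.0) -> bool:
    """True if the mapped value falls inside the range quoted in the commit message."""
    return low <= speed_to_padding_bonus(speed) <= high


for speed in (0.25, 1.0, 2.0, 4.0):
    print(speed, speed_to_padding_bonus(speed), within_documented_range(speed))
# 0.25 3.0 True
# 1.0 0.0 True
# 2.0 -4.0 True
# 4.0 -12.0 False
```

Under this formula, any speed above 2.0 maps below -4.0, so the speed constraint and the padding_bonus range may be worth reconciling.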

🔒 Security Considerations

✅ Good Security Practices:

API key handling: Properly uses SecretStr and environment variables
No hardcoded credentials: API keys are properly externalized
Header-based auth: Uses standard x-api-key header pattern

⚠️ Minor Concerns:

WebSocket URL: Hardcoded to wss:// (correct for production)
No timeout on WebSocket: Consider adding timeouts to prevent hanging connections
No rate limiting: Consider adding client-side rate limiting if Gradium has rate limits
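
For the timeout concern, one option is to bound each receive with `asyncio.wait_for`. The helper below is a sketch with a hypothetical name (`recv_with_timeout`) and an arbitrary default of 10 seconds; it takes any zero-argument coroutine function, such as a connection's `recv` method. The websockets library's `connect()` also accepts an `open_timeout` argument that bounds the opening handshake.

```python
import asyncio


async def recv_with_timeout(recv, timeout_s: float = 10.0):
    """Await one message from recv(), failing fast if the server stalls."""
    try:
        return await asyncio.wait_for(recv(), timeout=timeout_s)
    except asyncio.TimeoutError as e:
        # Re-raise with a message that names the provider and the limit.
        raise TimeoutError(f"No message from Gradium within {timeout_s}s") from e
```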

📊 Performance Considerations

✅ Good Practices:

Streams audio chunks incrementally
Uses async/await properly throughout
Minimal memory overhead with chunk collection

⚠️ Potential Improvements:

Memory usage: All audio chunks are accumulated in memory before returning. For long audio, consider streaming to disk or returning chunks incrementally.
Connection overhead: New WebSocket per request may add latency. Consider connection pooling if making many requests.
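
A sketch of the streaming-to-disk idea: write each chunk as it arrives instead of accumulating a list. `fake_audio_stream` is a stand-in for decoded audio chunks coming off the WebSocket, and `stream_to_file` is a hypothetical helper, not part of this PR.

```python
import asyncio
from typing import AsyncIterator


async def fake_audio_stream() -> AsyncIterator[bytes]:
    """Stand-in for audio chunks arriving from the WebSocket (illustrative data)."""
    for chunk in (b"RIFF", b"....", b"WAVE"):
        await asyncio.sleep(0)  # yield control, as a real receive would
        yield chunk


async def stream_to_file(chunks: AsyncIterator[bytes], path: str) -> int:
    """Write chunks to path as they arrive; return the total bytes written."""
    total = 0
    with open(path, "wb") as f:
        async for chunk in chunks:
            # Blocking write: acceptable for a sketch, but a production version
            # might offload it to a thread to avoid stalling the event loop.
            f.write(chunk)
            total += len(chunk)
    return total
```

Peak memory then stays at roughly one chunk regardless of audio length, at the cost of returning a file path or byte count rather than the audio bytes themselves.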

🧪 Test Coverage

Current Status:

✅ Integration test added
✅ Overall coverage at 84.21%
❌ No unit tests specifically for Gradium provider

Recommendation: Add unit tests for better coverage of edge cases and error paths.

🎯 Recommended Actions

Priority Order:

High: Add unit tests for Gradium provider
Medium: Improve error handling in JSON parsing
Medium: Address WebSocket connection management
Low: Clarify the purpose of websocket.py utility module
Low: Consider making region configurable
Low: Use more specific model ID instead of "default"

📝 Minor Issues

File in .gitignore: mureka.md is added to .gitignore - appears unrelated to this PR
Voice list comment: PR description mentions "14 flagship voices" and the actual count matches ✓
pyproject.toml: Missing celeste-google dependency was added (good catch!)

✨ Overall Assessment

Rating: 8/10

This is a well-implemented feature that follows the codebase conventions and patterns. The WebSocket implementation is correct, and the integration with the existing capability system is clean. The main areas for improvement are:

Adding comprehensive unit tests
Improving error handling robustness
Clarifying the WebSocket utility module usage

Recommendation: ✅ Approve with minor revisions

The identified issues are mostly minor and don't block merging. Consider addressing the high-priority items (unit tests) in a follow-up PR if time is constrained.

Great work on maintaining consistency with existing providers! 🎉

didux123 and others added 5 commits December 19, 2025 16:30

fix(gradium): fix provider inconsistencies

4f71aac

- Export GRADIUM_VOICES in speech-generation __init__.py
- Add DEFAULT_VOICE_ID to config.py (consistent with ElevenLabs)
- Add _make_request stub to satisfy abstract interface
- Remove unused httpx import from capability client

test: improve test coverage for constraints and parameters

805249d

test: fix Int constraint test to accept booleans

5ee60ba

Kamilbenkirane temporarily deployed to integration-tests December 19, 2025 16:44 — with GitHub Actions Inactive

Kamilbenkirane had a problem deploying to integration-tests December 19, 2025 16:44 — with GitHub Actions Failure

Kamilbenkirane temporarily deployed to integration-tests December 19, 2025 16:44 — with GitHub Actions Inactive

Kamilbenkirane had a problem deploying to integration-tests December 19, 2025 16:47 — with GitHub Actions Failure

Kamilbenkirane temporarily deployed to integration-tests December 19, 2025 16:47 — with GitHub Actions Inactive

Kamilbenkirane had a problem deploying to testpypi December 19, 2025 16:50 — with GitHub Actions Failure

Kamilbenkirane had a problem deploying to testpypi December 19, 2025 16:52 — with GitHub Actions Failure

Kamilbenkirane temporarily deployed to testpypi December 19, 2025 16:53 — with GitHub Actions Inactive

Kamilbenkirane temporarily deployed to pypi December 19, 2025 16:54 — with GitHub Actions Inactive

Kamilbenkirane merged commit 9d0ee5c into main Dec 19, 2025
105 of 109 checks passed

Kamilbenkirane merged 5 commits into main from api/gradium_tts
