API Reference
Complete API documentation for the Kokoro TTS MCP Server.
The server implements the Model Context Protocol (MCP) specification.
{
  "tools": {}
}

The server provides tools that can be called by MCP clients.
Converts text to speech using the Kokoro TTS model.
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "text_to_speech",
    "arguments": {
      "text": "string (required)",
      "voice": "string (optional, default: 'af_heart')",
      "speed": "number (optional, default: 1.0)"
    }
  }
}

| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| text | string | Yes | - | The text to convert to speech |
| voice | string | No | af_heart | Voice ID to use |
| speed | number | No | 1.0 | Speech speed multiplier (0.5-2.0) |
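The request shape and parameter table above can be assembled programmatically. The following is a minimal sketch, not part of the server: `build_tts_request` is a hypothetical helper name, and leaving optional arguments out so the server applies its documented defaults is an assumption about how clients would typically call the tool.

```python
import json
from typing import Any, Optional


def build_tts_request(
    text: str,
    voice: Optional[str] = None,
    speed: Optional[float] = None,
    request_id: int = 1,
) -> dict[str, Any]:
    """Build a JSON-RPC 2.0 `tools/call` request for the text_to_speech tool."""
    if not text:
        # The parameter table marks `text` as required.
        raise ValueError("text is required")
    arguments: dict[str, Any] = {"text": text}
    # Optional fields are left out so the server falls back to its defaults
    # (voice 'af_heart', speed 1.0 according to the table).
    if voice is not None:
        arguments["voice"] = voice
    if speed is not None:
        arguments["speed"] = speed
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "text_to_speech", "arguments": arguments},
    }


if __name__ == "__main__":
    print(json.dumps(build_tts_request("Hello", speed=1.25), indent=2))
```

The returned dictionary can be serialized with `json.dumps` and sent over whichever transport the client uses.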
Response format:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "content": [
      {
        "type": "text",
        "text": "Successfully generated speech audio ({size} bytes). Audio data URI: data:audio/wav;base64,{base64data}\n\nAudio file saved to: {filepath}\n{playback_status}"
      }
    ],
    "isError": false
  }
}

Example success response:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "content": [
      {
        "type": "text",
        "text": "Successfully generated speech audio (144044 bytes). Audio data URI: data:audio/wav;base64,UklGRiQAAABXQVZFZm10...\n\nAudio file saved to: /mnt/c/Users/user/Desktop/kokoro-tts-1234567890.wav\n✅ Audio played successfully using aplay"
      }
    ],
    "isError": false
  }
}

Example error response:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "content": [
      {
        "type": "text",
        "text": "Error generating speech: Voice 'invalid_voice' not found"
      }
    ],
    "isError": true
  }
}

Available voice options:
- af_heart - Default female voice, warm and friendly
- af_bella - Female voice, clear and professional
- af_sarah - Female voice, energetic and cheerful

See the Kokoro documentation for the complete list.
Speed multiplier range:

- Minimum: 0.5 (half speed)
- Maximum: 2.0 (double speed)
- Default: 1.0 (normal speed)
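This reference does not say how the server treats a speed outside the documented range, so a client may want to normalize the value before sending it. The sketch below clamps to the documented bounds; `clamp_speed` is a hypothetical helper, and clamping (rather than rejecting) is an assumption made for illustration.

```python
from typing import Optional

# Bounds and default taken from the speed list above.
MIN_SPEED = 0.5
MAX_SPEED = 2.0
DEFAULT_SPEED = 1.0


def clamp_speed(speed: Optional[float] = None) -> float:
    """Return a speed multiplier inside the documented 0.5-2.0 range."""
    if speed is None:
        return DEFAULT_SPEED
    # Values below the minimum or above the maximum are pulled to the bound.
    return max(MIN_SPEED, min(MAX_SPEED, float(speed)))
```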
When running in SSE mode (--sse), the server exposes HTTP endpoints.
Server-Sent Events stream endpoint.
Headers:
Content-Type: text/event-stream
Cache-Control: no-cache
Connection: keep-alive
Access-Control-Allow-Origin: *
Response:
data: {"type": "connected"}
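A client reading this stream needs to turn `data:` lines into JSON objects. The following is a minimal sketch of that step using only the standard library; `iter_sse_data` is a hypothetical helper, and it assumes every event payload is JSON, as in the `connected` message above. It handles `data:` fields only and ignores other SSE fields and comment lines.

```python
import json
from typing import Any, Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[Any]:
    """Yield the decoded JSON payload of each SSE event found in `lines`."""
    buffer: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("data:"):
            value = line[len("data:"):]
            # The SSE format drops one optional space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and buffer:
            # A blank line ends the event; multiple data lines join with "\n".
            yield json.loads("\n".join(buffer))
            buffer = []


if __name__ == "__main__":
    for event in iter_sse_data(['data: {"type": "connected"}\n', "\n"]):
        print(event)
```

Any iterable of text lines works as input, for example a decoded HTTP response body read line by line.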
MCP JSON-RPC endpoint.
Content-Type: application/json
Request:
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/list"
}

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "text_to_speech",
    "arguments": {
      "text": "Hello"
    }
  }
}

Response:

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "tools": [...]
  }
}

Generated audio is in WAV format:
- Encoding: PCM
- Sample Rate: 24 kHz (default Kokoro output)
- Channels: Mono
- Bit Depth: 16-bit
Audio data is provided as a base64-encoded data URI:
data:audio/wav;base64,{base64_encoded_audio}
Audio files are saved with the naming pattern:
kokoro-tts-{timestamp}.wav
where {timestamp} is the Unix timestamp in milliseconds.
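Because the audio arrives embedded in a text message, a client has to extract the data URI before it can use the bytes. The sketch below pulls the base64 payload and the saved path out of a success message and reads the WAV header with the standard `wave` module. `extract_audio` and `describe_wav` are hypothetical helpers, and the regular expressions assume the message wording matches the response format documented in this reference.

```python
import base64
import io
import re
import wave
from typing import Optional

# Patterns follow the documented success message; the exact wording is assumed.
DATA_URI_RE = re.compile(r"data:audio/wav;base64,([A-Za-z0-9+/=]+)")
SAVED_PATH_RE = re.compile(r"Audio file saved to: (.+)")


def extract_audio(result_text: str) -> tuple[bytes, Optional[str]]:
    """Return (wav_bytes, saved_path) parsed from a success message."""
    match = DATA_URI_RE.search(result_text)
    if match is None:
        raise ValueError("no audio data URI found in result text")
    wav_bytes = base64.b64decode(match.group(1))
    path_match = SAVED_PATH_RE.search(result_text)
    saved_path = path_match.group(1).strip() if path_match else None
    return wav_bytes, saved_path


def describe_wav(wav_bytes: bytes) -> dict[str, int]:
    """Read sample rate, channel count, and bit depth from the WAV header."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return {
            "sample_rate": wav.getframerate(),
            "channels": wav.getnchannels(),
            "bit_depth": wav.getsampwidth() * 8,
        }
```

Comparing the result of `describe_wav` against the documented format (24000 Hz, 1 channel, 16 bits) is a cheap sanity check before playback.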
| Error | Description | Solution |
|---|---|---|
| Voice not found | Invalid voice ID | Use valid voice from list |
| Text parameter is required | Missing text parameter | Provide text in request |
| Generation timeout | Model initialization taking too long | Wait for first-time setup |
| Memory error | Insufficient RAM | Ensure 4GB+ available |
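A client can map an error result onto the table above to show a suggested fix. This is a rough sketch under stated assumptions: `suggest_fix` is a hypothetical helper, and the regular expressions are guesses at how each error appears in the message text, based only on the one error example shown in this reference.

```python
import re
from typing import Any, Optional

# (pattern, error name, suggested solution). Names and solutions come from
# the table above; the patterns are assumptions about the message wording.
KNOWN_ERRORS = [
    (re.compile(r"voice .*not found", re.I), "Voice not found", "Use valid voice from list"),
    (re.compile(r"text parameter is required", re.I), "Text parameter is required", "Provide text in request"),
    (re.compile(r"timeout", re.I), "Generation timeout", "Wait for first-time setup"),
    (re.compile(r"memory", re.I), "Memory error", "Ensure 4GB+ available"),
]


def suggest_fix(result: dict[str, Any]) -> Optional[tuple[str, str]]:
    """Return (error name, solution) for an error result, or None on success."""
    if not result.get("isError"):
        return None
    # Join all text items so the match does not depend on item order.
    text = " ".join(
        item.get("text", "")
        for item in result.get("content", [])
        if item.get("type") == "text"
    )
    for pattern, name, solution in KNOWN_ERRORS:
        if pattern.search(text):
            return name, solution
    # Unrecognized errors are passed through with the raw message.
    return "Unknown error", text
```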
Currently, no rate limiting is enforced. However:
- Model initialization takes ~5-10 seconds
- Each generation takes ~2-5 seconds
- Consider implementing client-side rate limiting for production
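One simple form of client-side rate limiting is to enforce a minimum interval between requests. The sketch below is one possible approach, not something the server provides: `MinIntervalLimiter` is a hypothetical class, and the interval is for the caller to choose (for example, a value near the generation times listed above). The clock and sleep functions are injectable so the class can be exercised without real waiting.

```python
import time
from typing import Callable, Optional


class MinIntervalLimiter:
    """Allow at most one call per `min_interval` seconds."""

    def __init__(
        self,
        min_interval: float,
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call: Optional[float] = None

    def wait(self) -> float:
        """Block until the next call is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last_call is not None:
            # Time still owed since the previous permitted call.
            delay = max(0.0, self._last_call + self.min_interval - now)
        if delay > 0:
            self._sleep(delay)
        self._last_call = now + delay
        return delay
```

Calling `limiter.wait()` immediately before each `tools/call` request keeps consecutive requests at least `min_interval` seconds apart.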
When the server is used over process communication, no authentication is required.
In SSE mode, there is currently no authentication. For production:
- Implement API key authentication
- Use HTTPS/TLS
- Add rate limiting
API version follows package version:
- Current: 1.0.1
- Check package.json for latest version
- Major version bumps indicate breaking changes
- Deprecated features will have 6-month notice
- Check CHANGELOG.md for migration guides
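Since major version bumps signal breaking changes, a client can check for one before upgrading. This is a minimal sketch that assumes plain `MAJOR.MINOR.PATCH` version strings such as `1.0.1`; it does not handle pre-release or build suffixes, and both function names are hypothetical.

```python
def parse_version(version: str) -> tuple[int, int, int]:
    """Split a plain MAJOR.MINOR.PATCH string into integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_breaking_upgrade(current: str, candidate: str) -> bool:
    """Return True when `candidate` has a higher major version than `current`."""
    return parse_version(candidate)[0] > parse_version(current)[0]
```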
- See Examples for usage patterns
- Check Troubleshooting for error resolution
- Review Development Guide for extending the API
Kokoro TTS MCP Server by Ross Technologies
📍 Beer Sheva, Israel | 📧 devops.ross@gmail.com
Repository: github.com/ross-sec/kokoro_mcp_server | NPM: @ross_tchnologies/kokoro-tts-mcp-server
© 2025 Ross Technologies. Licensed under the MIT License.