Edge TTS for Laravel

edge-tts is a Laravel package that lets you use Microsoft Edge's online text-to-speech service from your Laravel code. No Azure API key is required: the package uses the free Edge Read Aloud API.

Features

  • 🎤 Text to Speech: Convert text to natural-sounding speech using Microsoft's neural voices
  • 🌍 Multi-language Support: Support for 100+ languages and locales
  • Speech Customization: Adjust rate, pitch, and volume
  • 💾 Flexible Storage: Save to local file or Laravel Storage
  • 🎮 Artisan Commands: Command-line interface for quick usage
  • 🔄 Streaming Support: Stream audio data in real-time
  • 📝 Metadata Support: Word and sentence boundary metadata for subtitle generation
  • 🔧 Smart Text Processing: Automatic text splitting and special character handling
  • 🛡️ Robust Error Handling: Detailed exception types and automatic retry mechanisms

Requirements

  • PHP >= 8.1
  • Laravel 10.x or 11.x
  • ext-json

Installation

You can install the package via Composer:

composer require nosun/edge-tts

Configuration

The package works with default configuration out of the box. If you want to customize the settings, publish the config file:

php artisan vendor:publish --provider="Nosun\EdgeTts\EdgeTtsServiceProvider" --tag="config"

This will create config/edge-tts.php with the following options:

return [
    'default' => [
        'voice' => 'en-US-JennyNeural',   // Default voice
        'rate' => '+0%',                     // Speech rate (-100% to +100%)
        'pitch' => '+0Hz',                   // Pitch adjustment
        'volume' => '+0%',                   // Volume (-100% to +100%)
        'format' => 'audio-24khz-96kbitrate-mono-mp3', // Output format
    ],

    'storage' => [
        'disk' => 'public',                  // Storage disk for synthesizeToStorage
        'path' => 'edge-tts',                // Storage path
    ],

    'websocket' => [
        'timeout' => 30,                     // Connection timeout in seconds
        'wss_url' => '...',                  // WebSocket endpoint
        'voices_url' => '...',               // Voices list endpoint
    ],
];

Usage

Using Facade

use Nosun\EdgeTts\Facades\EdgeTts;

// Basic text-to-speech
$result = EdgeTts::synthesize('Hello, World!');

// Get audio data
$audioData = $result->audioData;

// Save to file
$path = EdgeTts::synthesizeToFile('Hello, World!', '/path/to/output.mp3');

// Save to Laravel Storage
$storagePath = EdgeTts::synthesizeToStorage('Hello, World!', 'greeting.mp3');

Customizing Speech Parameters

use Nosun\EdgeTts\Facades\EdgeTts;

// With custom voice and parameters
$result = EdgeTts::synthesize('你好世界!', [
    'voice' => 'zh-CN-XiaoxiaoNeural',  // Chinese female voice
    'rate' => '+20%',                     // Slightly faster
    'pitch' => '+10Hz',                   // Higher pitch
    'volume' => '+50%',                   // Louder
]);

// Save with custom options
EdgeTts::synthesizeToFile(
    'Welcome to Laravel!', 
    '/path/to/output.mp3',
    [
        'voice' => 'en-US-GuyNeural',
        'rate' => '-10%',
    ]
);

Streaming Audio with Metadata

use Nosun\EdgeTts\Facades\EdgeTts;

// Stream audio in real-time with metadata
$stream = EdgeTts::stream('This is a long text that will be streamed...', [
    'boundary' => 'WordBoundary', // or 'SentenceBoundary'
]);

foreach ($stream as $chunk) {
    if ($chunk['type'] === 'audio') {
        // Process audio chunk
        file_put_contents('output.mp3', $chunk['data'], FILE_APPEND);
    } elseif ($chunk['type'] === 'WordBoundary' || $chunk['type'] === 'SentenceBoundary') {
        // Process metadata for subtitle generation
        echo "Text: {$chunk['text']}\n";
        echo "Offset: {$chunk['offset']} ticks\n";
        echo "Duration: {$chunk['duration']} ticks\n";
    }
}
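
The boundary metadata above can be turned into subtitle cues. The following is a minimal sketch, not part of the package: the helper names ticksToSrtTimestamp and buildSrt are hypothetical, and it assumes one tick is 100 nanoseconds (10,000 ticks per millisecond), which this README does not state. Verify the unit against real output before relying on the timings.

```php
<?php

declare(strict_types=1);

// Assumption: one tick is 100 ns, so 10,000 ticks make one millisecond.
const TICKS_PER_MILLISECOND = 10_000;

/**
 * Convert a tick offset into an SRT timestamp (HH:MM:SS,mmm).
 */
function ticksToSrtTimestamp(int $ticks): string
{
    $totalMs = intdiv($ticks, TICKS_PER_MILLISECOND);
    $ms = $totalMs % 1000;
    $totalSeconds = intdiv($totalMs, 1000);
    $seconds = $totalSeconds % 60;
    $minutes = intdiv($totalSeconds, 60) % 60;
    $hours = intdiv($totalSeconds, 3600);

    return sprintf('%02d:%02d:%02d,%03d', $hours, $minutes, $seconds, $ms);
}

/**
 * Build SRT text from boundary chunks shaped like the stream metadata:
 * each item carries 'text', 'offset' and 'duration' keys.
 *
 * @param array<int, array{text: string, offset: int, duration: int}> $boundaries
 */
function buildSrt(array $boundaries): string
{
    $cues = [];
    foreach (array_values($boundaries) as $index => $boundary) {
        $start = ticksToSrtTimestamp($boundary['offset']);
        $end = ticksToSrtTimestamp($boundary['offset'] + $boundary['duration']);
        $cues[] = sprintf("%d\n%s --> %s\n%s\n", $index + 1, $start, $end, $boundary['text']);
    }

    // SRT separates cues with a blank line.
    return implode("\n", $cues);
}

// Sample data standing in for boundary chunks collected from the stream.
echo buildSrt([
    ['text' => 'Hello', 'offset' => 1_000_000, 'duration' => 4_000_000],
    ['text' => 'World', 'offset' => 6_000_000, 'duration' => 5_000_000],
]);
```

In practice you would collect the WordBoundary or SentenceBoundary chunks into an array inside the foreach loop and pass that array to buildSrt once the stream ends.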

Voice Management

use Nosun\EdgeTts\Facades\EdgeTts;

// Get all available voices
$voices = EdgeTts::voices();
foreach ($voices as $voice) {
    echo $voice->shortName . ' - ' . $voice->localeName . PHP_EOL;
}

// Filter voices by locale
$chineseVoices = EdgeTts::voicesByLocale('zh-CN');

// Filter voices by gender
$femaleVoices = EdgeTts::voicesByGender('Female');

// Get a specific voice
$voice = EdgeTts::getVoice('en-US-JennyNeural');

// Check if voice exists
if (EdgeTts::voiceExists('zh-CN-YunxiNeural')) {
    // Voice is available
}

// Get all available locales
$locales = EdgeTts::getLocales();

Using Dependency Injection

use Illuminate\Http\Request;
use Nosun\EdgeTts\EdgeTts;

class TtsController extends Controller
{
    public function synthesize(EdgeTts $edgeTts, Request $request)
    {
        $result = $edgeTts->synthesize($request->input('text'));
        
        return response($result->audioData)
            ->header('Content-Type', 'audio/mpeg');
    }
}

Artisan Commands

List Available Voices

# List all voices
php artisan edge-tts:voices

# Filter by locale
php artisan edge-tts:voices --locale=en-US
php artisan edge-tts:voices --locale=zh-CN

# Filter by gender
php artisan edge-tts:voices --gender=Female
php artisan edge-tts:voices --gender=Male

Synthesize Text to Speech

# Basic synthesis (shows info only)
php artisan edge-tts:synthesize "Hello, World!"

# Save to local file
php artisan edge-tts:synthesize "Hello, World!" --output=/path/to/output.mp3

# Save to Laravel Storage
php artisan edge-tts:synthesize "Hello, World!" --storage --output=greeting.mp3

# With custom options
php artisan edge-tts:synthesize "你好!" \
    --voice=zh-CN-XiaoxiaoNeural \
    --rate=+20% \
    --pitch=+10Hz \
    --volume=+50% \
    --output=chinese.mp3

Popular Voices

Here are some popular voices you can use:

| Voice Name | Language | Gender | Description |
|---|---|---|---|
| en-US-JennyNeural | English (US) | Female | Warm, professional |
| en-US-GuyNeural | English (US) | Male | Deep, confident |
| zh-CN-XiaoxiaoNeural | Chinese (Mandarin) | Female | Natural Chinese |
| zh-CN-YunxiNeural | Chinese (Mandarin) | Male | Clear Chinese |
| en-GB-SoniaNeural | English (UK) | Female | British accent |
| ja-JP-NanamiNeural | Japanese | Female | Japanese |
| ko-KR-SunHiNeural | Korean | Female | Korean |
| fr-FR-DeniseNeural | French | Female | French |
| de-DE-KatjaNeural | German | Female | German |
| es-ES-ElviraNeural | Spanish | Female | Spanish |

Run php artisan edge-tts:voices to see the complete list.

Speech Parameters

Rate

Controls the speaking speed.

  • Format: ±X% (e.g., +50%, -30%)
  • Range: Approximately -100% to +100%
  • Default: +0%

Pitch

Controls the voice pitch.

  • Format: ±XHz (e.g., +10Hz, -5Hz)
  • Or relative: high, low, default
  • Default: +0Hz

Volume

Controls the audio volume.

  • Format: ±X% (e.g., +50%, -20%)
  • Range: -100% to +100%
  • Default: +0%
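
Because these parameters are plain strings, it can help to check user input before passing it to the package. The sketch below is one possible way to do that. The function names isValidPercentOffset and isValidPitchOffset are hypothetical and not provided by the package, and the bounds simply mirror the formats and ranges listed above; the service itself may accept or reject other values.

```php
<?php

declare(strict_types=1);

/**
 * Check a signed percentage such as "+50%" or "-30%" (rate and volume).
 * Assumption: values outside -100%..+100% are rejected, per the ranges above.
 */
function isValidPercentOffset(string $value): bool
{
    if (preg_match('/^[+-](\d{1,3})%$/D', $value, $matches) !== 1) {
        return false;
    }

    return (int) $matches[1] <= 100;
}

/**
 * Check a pitch value: a signed Hz offset such as "+10Hz",
 * or one of the relative keywords listed above.
 */
function isValidPitchOffset(string $value): bool
{
    if (in_array($value, ['high', 'low', 'default'], true)) {
        return true;
    }

    return preg_match('/^[+-]\d+Hz$/D', $value) === 1;
}

// Quick demonstration with a few sample inputs.
foreach (['+20%', '20%', '+150%'] as $sample) {
    printf("%-6s percent offset valid: %s\n", $sample, isValidPercentOffset($sample) ? 'yes' : 'no');
}
foreach (['+10Hz', 'low', '10'] as $sample) {
    printf("%-6s pitch valid: %s\n", $sample, isValidPitchOffset($sample) ? 'yes' : 'no');
}
```

A check like this fits naturally into request validation, for example as a closure rule for the rate field in the controller example.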

Output Formats

  • audio-24khz-96kbitrate-mono-mp3 (Default)
  • audio-24khz-48kbitrate-mono-mp3
  • audio-16khz-128kbitrate-mono-mp3
  • audio-16khz-64kbitrate-mono-mp3
  • audio-16khz-32kbitrate-mono-mp3
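
The bitrate in the format name gives a rough idea of file size. The sketch below parses a format string of the shape listed above and estimates the output size for a given duration. The helper estimateMp3Bytes is hypothetical, and the estimate assumes a constant bitrate with 1 kbit = 1000 bits; actual files may differ slightly.

```php
<?php

declare(strict_types=1);

/**
 * Estimate the size in bytes of an MP3 in one of the formats listed above.
 * Assumption: constant bitrate, 1 kbit = 1000 bits.
 */
function estimateMp3Bytes(string $format, float $seconds): int
{
    if (preg_match('/^audio-\d+khz-(\d+)kbitrate-mono-mp3$/D', $format, $matches) !== 1) {
        throw new InvalidArgumentException("Unrecognised format: {$format}");
    }

    $bytesPerSecond = (int) $matches[1] * 1000 / 8;

    return (int) round($bytesPerSecond * $seconds);
}

// One minute of audio in the default format.
echo estimateMp3Bytes('audio-24khz-96kbitrate-mono-mp3', 60.0), " bytes\n";
```

At 96 kbit/s this works out to about 12 KB per second of speech, so the 48 kbit/s format roughly halves storage use at the same sample rate.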

Examples

Controller Example

<?php

namespace App\Http\Controllers;

use Illuminate\Http\Request;
use Illuminate\Support\Facades\Storage;
use Nosun\EdgeTts\Facades\EdgeTts;

class TtsController extends Controller
{
    public function synthesize(Request $request)
    {
        $request->validate([
            'text' => 'required|string|max:1000',
            'voice' => 'nullable|string',
            'rate' => 'nullable|string',
        ]);

        $options = [];
        if ($request->voice) {
            $options['voice'] = $request->voice;
        }
        if ($request->rate) {
            $options['rate'] = $request->rate;
        }

        $filename = 'tts_' . uniqid() . '.mp3';
        $path = EdgeTts::synthesizeToStorage($request->text, $filename, $options);

        return response()->json([
            'success' => true,
            'path' => $path,
            'url' => Storage::url($path),
        ]);
    }

    public function download(Request $request)
    {
        $request->validate([
            'text' => 'required|string|max:500',
        ]);

        $result = EdgeTts::synthesize($request->text);

        return response($result->audioData)
            ->header('Content-Type', 'audio/mpeg')
            ->header('Content-Disposition', 'attachment; filename="speech.mp3"');
    }

    public function voices()
    {
        $voices = EdgeTts::voices();
        
        return response()->json([
            'voices' => array_map(fn($v) => $v->toArray(), $voices),
            'locales' => EdgeTts::getLocales(),
        ]);
    }
}

Job Example

<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;
use Nosun\EdgeTts\Facades\EdgeTts;

class ProcessTextToSpeech implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    public function __construct(
        public string $text,
        public string $filename,
        public array $options = []
    ) {}

    public function handle(): void
    {
        EdgeTts::synthesizeToStorage(
            $this->text,
            $this->filename,
            $this->options
        );
    }
}

Testing

Run the package tests:

composer test

Generate code coverage report:

composer test-coverage

Important Notes

API Limitations

  • This package uses Microsoft Edge's free Read Aloud API
  • There may be rate limits or usage restrictions
  • The API could change at any time (Microsoft controls this)

Network Requirements

  • Requires HTTPS/WebSocket connectivity to Microsoft servers
  • Firewall must allow outbound connections to speech.platform.bing.com

Long Text

  • For very long text, consider splitting into smaller chunks
  • Streaming is recommended for long texts to reduce memory usage
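
If you prefer to split long text yourself, the following sketch shows one way to do it. It is an illustration only: splitTextIntoChunks is a hypothetical helper, it is not the package's own splitting logic, and it requires the mbstring extension. It groups whole sentences into chunks of at most a given number of characters and hard-splits any single sentence that is longer than the limit.

```php
<?php

declare(strict_types=1);

/**
 * Split text into chunks of at most $maxLength characters,
 * keeping sentences intact where possible.
 *
 * @return list<string>
 */
function splitTextIntoChunks(string $text, int $maxLength): array
{
    if ($maxLength < 1) {
        throw new InvalidArgumentException('maxLength must be at least 1.');
    }

    // Split after Latin sentence punctuation followed by whitespace,
    // or directly after CJK sentence punctuation.
    $sentences = preg_split('/(?<=[.!?])\s+|(?<=[。!?])/u', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    if ($sentences === false) {
        throw new RuntimeException('Text could not be split (invalid UTF-8?).');
    }

    $chunks = [];
    $current = '';
    foreach ($sentences as $sentence) {
        // Sentences are re-joined with a single space.
        $candidate = $current === '' ? $sentence : $current . ' ' . $sentence;
        if (mb_strlen($candidate) <= $maxLength) {
            $current = $candidate;
            continue;
        }
        if ($current !== '') {
            $chunks[] = $current;
        }
        // Hard-split a sentence that exceeds the limit on its own.
        while (mb_strlen($sentence) > $maxLength) {
            $chunks[] = mb_substr($sentence, 0, $maxLength);
            $sentence = mb_substr($sentence, $maxLength);
        }
        $current = $sentence;
    }
    if ($current !== '') {
        $chunks[] = $current;
    }

    return $chunks;
}

print_r(splitTextIntoChunks('One. Two. Three.', 10));
```

Each chunk could then be passed to EdgeTts::synthesize in turn, or dispatched as a separate queued job like the one in the Job Example.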

Troubleshooting

Connection Issues

NetworkException: WebSocket connection failed
  • Check your internet connection
  • Verify firewall allows WebSocket connections
  • Try increasing the timeout in config
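
For transient network failures, an application-level retry can complement the steps above. This is a generic sketch under stated assumptions: retryWithBackoff is a hypothetical helper, not part of the package, and it catches any Exception because this README does not give the namespace of NetworkException. In real code you would likely narrow the catch to that exception type.

```php
<?php

declare(strict_types=1);

/**
 * Run $operation and retry on failure, doubling the delay each time.
 * The last exception is rethrown once $maxAttempts is reached.
 */
function retryWithBackoff(callable $operation, int $maxAttempts = 3, int $baseDelayMs = 200): mixed
{
    if ($maxAttempts < 1) {
        throw new InvalidArgumentException('maxAttempts must be at least 1.');
    }

    $attempt = 0;
    while (true) {
        $attempt++;
        try {
            return $operation();
        } catch (Exception $e) {
            if ($attempt >= $maxAttempts) {
                throw $e;
            }
            // Delay sequence: base, 2x base, 4x base, ...
            usleep($baseDelayMs * 1000 * (2 ** ($attempt - 1)));
        }
    }
}

// Demonstration with a stand-in operation that fails twice, then succeeds.
$demoCalls = 0;
$demoResult = retryWithBackoff(function () use (&$demoCalls): string {
    $demoCalls++;
    if ($demoCalls < 3) {
        throw new RuntimeException('Simulated connection failure');
    }

    return 'audio-bytes';
}, 3, 10);

echo "Succeeded after {$demoCalls} attempts: {$demoResult}\n";
```

In an application the closure would wrap a call such as EdgeTts::synthesize($text). Keep the attempt count low, since the service may apply rate limits.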

Authentication Issues

403 Forbidden / Authentication error
  • This is usually caused by clock skew between your server and Microsoft's servers
  • The package includes automatic clock skew detection and retry
  • If persistent, check your server's system clock

Voice Not Found

VoiceNotFoundException: Voice not found: unknown-voice
  • Run php artisan edge-tts:voices to see available voices
  • Use the exact ShortName value

No Audio Received

NoAudioReceivedException: No audio was received
  • Verify your text is not empty
  • Check if the voice supports the language of your text
  • Review network connectivity

Unexpected Response

UnexpectedResponseException: Unexpected response from service
  • This indicates the service returned an unexpected format
  • Check if Microsoft has updated their API
  • Ensure you're using the latest version of this package

SSL/TLS Issues

Ensure your PHP installation has proper SSL certificates configured.

Changelog

Please see CHANGELOG for more information on what has changed recently.

Contributing

Please see CONTRIBUTING for details.

Security Vulnerabilities

Please review our security policy on how to report security vulnerabilities.

Credits

This package is inspired by:

License

The MIT License (MIT). Please see License File for more information.
