-
Hi, I've managed to generate proper text-to-speech audio by following the sample. However, the only models I can use are tts and tts-hd, and both have a cap of 4,096 characters per request. I am building a language teacher and would like to generate audio sessions of 20 minutes or more. Is there any way to overcome this "hard cap"? Or what should I use instead, given that TTS seems to be limited to these models? What would you suggest? Best,
Replies: 3 comments
-
Is gpt-4o-audio-preview supported already? Otherwise, the most "practical" solution is to split the text into chunks of 4,096 characters (or a bit fewer, just in case) and concatenate the output audio into a single file. For the curious: this works. (NOTE: I removed the code, as it is now in a public repo alongside more examples using the Azure OpenAI SDK, which supports gpt-4o-audio-preview.)
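For readers who want the splitting step spelled out, here is a minimal sketch in Python. It is not the code from the repo mentioned above. The function name `split_text` and the sentence-boundary heuristic are assumptions made for illustration; only the 4,096-character limit comes from this thread.

```python
import re

MAX_CHARS = 4096  # per-request character limit reported in this thread


def split_text(text: str, max_chars: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most max_chars characters.

    Hypothetical helper: prefers to break after sentence-ending
    punctuation so the audio does not cut off mid-sentence.
    """
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        if not sentence:
            continue
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # Adding this sentence would exceed the limit: start a new chunk.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk would then be sent to the TTS model in order and the returned audio appended to a single output. How cleanly the pieces join depends on the audio format, so that part is left out here.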
-
Quoted in issue: #10655 I have also managed to use the gpt-4o-voice-preview model (it seems to expire in May...). It is a bit flaky or slow (I got some timeouts), but the code works too. Thanks to @RogerBarreto for suggesting this and the Azure OpenAI SDK. Over the next few days I will look at how to adapt this working code to support the model and, if it's not too hard, provide support for it. Might need help though ;)
-
Update: since 24-02-2025 the tts & tts-hd models have a new expiration date: 01-02-2026. Bad timing, I guess... Anyhow, the limitation of 3 requests per minute seems to remain, as well as the limit of 4,096 characters per request. To overcome this I've implemented:
Note: for the latter I did one version on the file system and another, cleaner one in memory. You can find the two separate projects in the following repo: https://github.com/joslat/PlayingWithAudio
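As an illustration of how the 3-requests-per-minute limit could be respected on the client side, here is a minimal Python sketch. It is not taken from the repo above. The `RateLimiter` class and its parameters are hypothetical names; only the figure of 3 requests per minute comes from this thread.

```python
import time
from collections import deque
from typing import Callable


class RateLimiter:
    """Hypothetical helper: allow at most max_calls per rolling period (seconds)."""

    def __init__(
        self,
        max_calls: int = 3,       # limit reported in this thread
        period: float = 60.0,     # one minute
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._max_calls = max_calls
        self._period = period
        self._clock = clock
        self._sleep = sleep
        self._calls: deque[float] = deque()  # timestamps of recent calls

    def wait(self) -> None:
        """Block until another request may be sent, then record it."""
        while True:
            now = self._clock()
            # Forget calls that have left the rolling window.
            while self._calls and now - self._calls[0] >= self._period:
                self._calls.popleft()
            if len(self._calls) < self._max_calls:
                self._calls.append(now)
                return
            # Window is full: sleep until the oldest call expires.
            self._sleep(self._period - (now - self._calls[0]))
```

Calling `limiter.wait()` before each TTS request means a fourth request within the same minute pauses instead of failing. The clock and sleep functions are injectable only so the behavior can be checked without actually waiting.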