[Feature Request] - Managing Google TTS #475

Open
schmurtzm opened this issue Dec 29, 2021 · 2 comments

Comments

@schmurtzm

Google TTS is a killer feature for the ESP thanks to its multilingual support and the quality of its voices.
Following issue #395: it does not seem possible to play a Google TTS URL like a regular MP3 stream, because of a buffering problem at the end of playback.

An easy way to reproduce the problem: take the "StreamMP3FromHTTP" example and replace:
const char *URL="http://kvbstreams.dyndns.org:8000/wkvi-am";
with
const char *URL="http://translate.google.com/translate_tts?ie=UTF-8&q=bonjour&tl=fr&client=tw-ob&ttsspeed=1";

This is a recording of the sound I get.
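
The test URL above works as-is because `bonjour` is a single plain word. For longer or accented phrases the `q` parameter has to be percent-encoded before it goes into the URL. Below is a minimal sketch in plain C++, assuming RFC 3986 percent-encoding of the UTF-8 bytes is sufficient. `percentEncode` and `buildTtsUrl` are hypothetical helper names, not part of ESP8266Audio; the query parameters are simply the ones from the URL above.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Percent-encode every byte except the RFC 3986 "unreserved" characters.
// Hypothetical helper, not an ESP8266Audio function.
std::string percentEncode(const std::string &text) {
    std::string out;
    for (unsigned char c : text) {
        bool unreserved = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z') ||
                          (c >= '0' && c <= '9') || c == '-' || c == '_' ||
                          c == '.' || c == '~';
        if (unreserved) {
            out += static_cast<char>(c);
        } else {
            char buf[4];  // '%' + two hex digits + terminating NUL
            std::snprintf(buf, sizeof(buf), "%%%02X", static_cast<unsigned>(c));
            out += buf;
        }
    }
    return out;
}

// Build a URL with the same parameters as the test URL in this issue.
// Assumption: `lang` is a short language code such as "fr" that needs no encoding.
std::string buildTtsUrl(const std::string &text, const std::string &lang) {
    assert(!lang.empty());
    return "http://translate.google.com/translate_tts?ie=UTF-8&q=" +
           percentEncode(text) + "&tl=" + lang + "&client=tw-ob&ttsspeed=1";
}
```

On the ESP, the result of `buildTtsUrl("bonjour tout le monde", "fr").c_str()` could be used in place of the `URL` constant of the example sketch.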

Handling Google TTS seems more complex than I thought...
I found this I2S library which supports Google TTS: ESP32-audioI2S

As you can see here, the author wrote a fairly complex function to handle Google TTS. It could be a good source of inspiration for a new AudioFileSourceGoogleTTS.h 😅

Interesting fact: I also tried playing Google TTS with that library without its dedicated function, using just the Google Translate URL, and the result was similar to ESP8266Audio: there is a buffer problem at the end of playback. It does not hang the ESP with that library, but it suggests that Google TTS sends the MP3 file in a particular way that requires more work than a classic MP3 stream.

schmurtzm referenced this issue in schmurtzm/MrDiy-Audio-Notifier Jan 4, 2022
@FedericoBusero
Contributor

Also note the following change in ESP32-audioI2S, which might give some ideas for detecting the end of the stream:

schreibfaul1/ESP32-audioI2S@ed13136#diff-6033949d768051c96b2a380edec84a3e1cbe58b3a94e41aae408c6a8f8bbc2b2

@schmurtzm
Author

schmurtzm commented Jan 12, 2022

After some investigation, I found where the problem is:
as you can see here in ESP32-audioI2S, the author makes a special exception for TTS: "tts has one chunk only".

There are several issues about chunk management in ESP8266Audio, most of them for ICY streams (= SHOUTcast).
Looking at chunk management in ESP8266Audio, we quickly find a pull request from yoav-klein which greatly improves the result with Google TTS: it now hangs for a few seconds (about 11 seconds) instead of minutes!

From what I understand, some data in the MP3 stream indicates the size of the file. I think that is what is handled here in the ESP32-audioI2S library.
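
If that size information is the chunk-size line of HTTP/1.1 chunked transfer encoding (an assumption, not confirmed against the actual Google response), then the end of playback could be detected from the terminating zero-length chunk instead of waiting for a timeout. Below is a minimal sketch of that idea in plain C++, working on an in-memory buffer rather than a live socket. `ChunkedResult` and `decodeChunked` are hypothetical names; this is not the code of ESP8266Audio or ESP32-audioI2S.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <string>

struct ChunkedResult {
    std::string body;  // payload with the chunk framing removed
    bool complete;     // true once the zero-length terminating chunk was seen
};

// Decode a buffer framed as: <hex size>\r\n<data>\r\n ... 0\r\n\r\n
// Assumption: the server uses HTTP/1.1 chunked transfer encoding.
ChunkedResult decodeChunked(const std::string &raw) {
    ChunkedResult result{"", false};
    std::size_t pos = 0;
    while (pos < raw.size()) {
        std::size_t lineEnd = raw.find("\r\n", pos);
        if (lineEnd == std::string::npos) break;  // size line not fully received

        // strtoul stops at the first non-hex character, so any chunk
        // extension after ';' on the size line is ignored.
        std::string sizeLine = raw.substr(pos, lineEnd - pos);
        char *end = nullptr;
        unsigned long size = std::strtoul(sizeLine.c_str(), &end, 16);
        if (end == sizeLine.c_str()) break;  // malformed size line
        pos = lineEnd + 2;

        if (size == 0) {  // last chunk: the stream is finished
            result.complete = true;
            break;
        }
        if (raw.size() - pos < size) {  // chunk only partially received
            result.body.append(raw, pos, std::string::npos);
            break;
        }
        result.body.append(raw, pos, size);
        pos += size;
        assert(pos <= raw.size());

        // Each chunk's data is followed by its own CRLF.
        if (raw.compare(pos, 2, "\r\n") == 0) pos += 2; else break;
    }
    return result;
}
```

With a response that "has one chunk only", the loop would copy that single chunk and then immediately reach the `0` size line, so `complete` becomes true as soon as the last byte has arrived. A streaming source would do the same thing incrementally and report end-of-file at that point.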

@yoav-klein & @DSangyy, I saw that you have worked on chunk management; if you have some time to take a look, it would be very welcome ;)
Thank you very much 😉


2 participants