Welcome to Piper Discussions! #136
-
Guude (Hi) 👋,
-
Hello, maybe it's a good idea to add C++ examples (without Python) for whisper.cpp users. For instance, a project might use whisper.cpp for speech recognition and Piper for speech generation.
-
Hi guys,
Great project. I have managed to install Piper on a Raspberry Pi 4 and I am impressed with the quality of the synthesis. So far I have used the ‘en_US-lessac-medium.onnx’ model successfully, but a couple of others I have tried, ‘en_GB-northern_english_male-medium.onnx’ for example, failed with a JSON error:
Error: terminate called after throwing an instance of 'nlohmann::json_abi_v3_11_2::detail::parse_error'
My main reason for this post, though, is to ask: is it possible to send the output directly to a speaker rather than to a WAV file?
Thanks for your time, great job.
Ernie
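A parse_error of this kind is raised while reading JSON, and each Piper voice ships with a JSON config file next to its model (for example ‘en_US-lessac-medium.onnx.json’). One possible cause, not confirmed in this thread, is that the config for the failing voice is missing or was not downloaded as valid JSON. A minimal Python sketch for checking that, assuming the config sits beside the model under the name `<model>.json`; the helper name `check_voice_config` is made up for illustration:

```python
import json
from pathlib import Path


def check_voice_config(model_path):
    """Return (ok, message) for the JSON config expected beside a voice model.

    Assumption: the config file is named "<model path>.json",
    e.g. voice.onnx -> voice.onnx.json.
    """
    config_path = Path(str(model_path) + ".json")
    if not config_path.is_file():
        return False, f"missing config: {config_path}"
    try:
        with open(config_path, encoding="utf-8") as handle:
            json.load(handle)
    except ValueError as err:
        # json.JSONDecodeError and UnicodeDecodeError are both ValueError subclasses.
        return False, f"invalid JSON in {config_path}: {err}"
    return True, f"config OK: {config_path}"


if __name__ == "__main__":
    ok, message = check_voice_config("en_GB-northern_english_male-medium.onnx")
    print(message)
```

If the check fails, downloading the `.onnx.json` file again alongside the model would be the first thing to try.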
-
Hi Michael,
My aim is to create a voice assistant in which ‘piper’ is the response side. I can now pass a message to ‘piper’ and then pipe the output to ‘aplay’ to play it through the speaker. This all works at the command line but is slow! For each phrase to be spoken, ‘piper’ has to start up and load the appropriate voice model, and then ‘aplay’ has to be loaded to play the audio.
Does this sound sensible, or can you suggest another approach? Thanks again for your reply.
Ernie
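One way to avoid paying the start-up cost for every phrase is to start ‘piper’ and ‘aplay’ once and keep feeding text to the running process. Here is a rough Python sketch of that idea, not a tested setup. It assumes that `piper` and `aplay` are on the PATH, that `--model` and `--output-raw` behave as described in the Piper README (one line of text in on stdin, raw 16-bit audio out on stdout), and that the voice uses a 22050 Hz sample rate, which should be checked against the voice's JSON config. The function names are made up for illustration.

```python
import subprocess


def build_commands(model_path, sample_rate=22050):
    """Build the argument lists for the piper -> aplay pipeline."""
    piper_cmd = ["piper", "--model", str(model_path), "--output-raw"]
    # aplay plays raw signed 16-bit little-endian samples read from stdin ("-").
    aplay_cmd = ["aplay", "-r", str(sample_rate), "-f", "S16_LE", "-t", "raw", "-"]
    return piper_cmd, aplay_cmd


def start_pipeline(model_path, sample_rate=22050):
    """Start both processes once; the voice model is loaded a single time."""
    piper_cmd, aplay_cmd = build_commands(model_path, sample_rate)
    piper = subprocess.Popen(piper_cmd, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    aplay = subprocess.Popen(aplay_cmd, stdin=piper.stdout)
    piper.stdout.close()  # aplay now holds the read end of the pipe
    return piper, aplay


def speak(piper, text):
    """Send one phrase to the running piper process as a single line."""
    line = " ".join(text.split()) + "\n"  # collapse newlines so it stays one line
    piper.stdin.write(line.encode("utf-8"))
    piper.stdin.flush()


if __name__ == "__main__":
    piper, aplay = start_pipeline("en_US-lessac-medium.onnx")
    speak(piper, "Hello from the voice assistant.")
    speak(piper, "This phrase reuses the already loaded model.")
    piper.stdin.close()  # end of input lets piper, and then aplay, exit
    piper.wait()
    aplay.wait()
```

The voice assistant would call `speak()` whenever it has a response, and close the pipeline only on shutdown.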
-
Hi Michael,
How is the WAV file then played? Thanks again for your help.
Ernie
-
Hi Michael (@synesthesiam), I would like to request a new feature similar to the SpeakProgressEventArgs.Text and SpeakProgressEventArgs.AudioPosition properties in Microsoft's System.Speech.Synthesis namespace. For example, the sentence "Struggles and challenges are integral to growth and resilience" has 9 words in it.
Text: the actual spoken word.
Assuming the .wav produced has a total duration of 1984 ms, the .csv file's data would look something like this:
Thanks.
-
👋 Welcome!
We’re using Discussions as a place to connect with other members of our community. We hope that you:
build together 💪.
To get started, comment below with an introduction of yourself and tell us about what you do with this community.