Speech after a particular interval #68
Comments
Thanks for your proposal! Your idea is nice, but it has the problem that the audio can be split in the middle of a word. When I wanted to make each speech segment shorter, I tried reducing |
Hi @IrfanAli17899, I'm glad this package has been useful for you. So in other words, throughout a continuous speech period, you would like to have a callback that runs on a regular 5 second interval and takes as an argument the current raw audio of the speech segment? Can I ask what kind of UI updates you are referring to? I would like to understand the use case better. |
Hi @ricky0123, yes, you are right. I feed the audio I get from your package to GPT for transcription and translation and then show the results on the frontend; I am trying to build a real-time translator. The problem is that the library doesn't provide audio segments until the user stops speaking. I want a smooth audio segment at a regular interval so that, if the user speaks continuously without stopping, I can still show the transcription and translation results. Does that make sense? Let me know if you need more explanation of the use case. Thanks very much. |
Hi @IrfanAli17899, thanks for the clarification. Have you considered streaming audio from the browser to your server and doing all of the audio processing there, instead of using this package? Potentially what we could do is provide a method on the vad object that allows you to get the current audio segment. That would allow you to experiment by creating a timer that queries the current audio and sends it to your server. |
Yes, I tried the browser MediaRecorder API to stream audio to the server. Only the first chunk is playable, because it has all the necessary headers; the rest are not, so I had to merge all the chunks on the backend and then crop the last 5 seconds for transcription. That was a lengthy, cumbersome solution, which is why I tried your package. It would be very helpful to have a prop that takes a callback function, plus another prop for the interval, so that it provides chunks; I think each chunk should be playable on its own to be useful for me. Let me know what you think. Thanks, @ricky0123. |
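For anyone hitting the same "only the first chunk is playable" problem: one way around it is to work with raw PCM samples and wrap each chunk in its own WAV header before sending it. Below is a minimal sketch of that idea, not something taken from this package. It assumes mono Float32 samples in the range -1..1 and a sample rate you already know (for example 16000); `encodeWav` and `lastSeconds` are names made up for this example.

```javascript
// Wrap mono Float32 PCM samples (range -1..1) in a 16-bit PCM WAV container,
// so that every chunk carries its own header and is playable on its own.
// Assumption: the caller knows the sample rate of the samples.
function encodeWav(samples, sampleRate) {
  const bytesPerSample = 2;
  const dataSize = samples.length * bytesPerSample;
  const buffer = new ArrayBuffer(44 + dataSize);
  const view = new DataView(buffer);

  const writeAscii = (offset, text) => {
    for (let i = 0; i < text.length; i++) {
      view.setUint8(offset + i, text.charCodeAt(i));
    }
  };

  writeAscii(0, "RIFF");
  view.setUint32(4, 36 + dataSize, true); // size of everything after this field
  writeAscii(8, "WAVE");
  writeAscii(12, "fmt ");
  view.setUint32(16, 16, true); // fmt chunk size
  view.setUint16(20, 1, true); // audio format: PCM
  view.setUint16(22, 1, true); // channels: mono
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, sampleRate * bytesPerSample, true); // byte rate
  view.setUint16(32, bytesPerSample, true); // block align
  view.setUint16(34, 16, true); // bits per sample
  writeAscii(36, "data");
  view.setUint32(40, dataSize, true);

  // Clamp each sample and convert it to a signed 16-bit integer.
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i]));
    view.setInt16(44 + i * 2, s < 0 ? s * 0x8000 : s * 0x7fff, true);
  }
  return buffer;
}

// Keep only the most recent `seconds` of audio (a view, not a copy).
function lastSeconds(samples, sampleRate, seconds) {
  const count = Math.min(samples.length, Math.round(sampleRate * seconds));
  return samples.subarray(samples.length - count);
}
```

With something like this, the "merge everything, then crop the last 5 seconds" step becomes `encodeWav(lastSeconds(samples, rate, 5), rate)` on the client, and the server receives a standalone file each time.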
Hi @IrfanAli17899, what I'm saying is that we probably won't add a callback that runs on an interval, but I would be open to adding a method for you to get the raw audio at any given time, so you could do something like
myvad = await vad.MicVAD.new(...)
mytimer = createTimer(myIntervalLength, function() {
  const audio = myvad.getCurrentAudio()
  // send audio to server, etc
})
This would be easy to implement and more general. I'm not sure if the method you're describing of sending audio to your server on an interval will work, but this would at least allow you to try it out. |
Yes it will be very helpful, please implement it. |
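To make the proposal above a bit more concrete, here is a rough sketch of the timer side using the standard `setInterval`. It is only an illustration under assumptions: `getCurrentAudio` is the method proposed in this thread and is not confirmed to exist in the package, it is assumed to return a `Float32Array` (or `null` when no speech is in progress), and `pollOnce` / `startAudioPolling` are hypothetical helper names.

```javascript
// One polling step: fetch the in-progress speech audio and pass it on.
// `vadLike.getCurrentAudio` is the method proposed in this thread; it is
// hypothetical here and assumed to return a Float32Array, or null when idle.
function pollOnce(vadLike, onAudio) {
  const audio = vadLike.getCurrentAudio();
  if (!audio || audio.length === 0) return false; // nothing to send yet
  onAudio(audio);
  return true;
}

// Run pollOnce every `intervalMs` milliseconds.
// Returns a function that stops the polling.
function startAudioPolling(vadLike, intervalMs, onAudio) {
  const timer = setInterval(() => pollOnce(vadLike, onAudio), intervalMs);
  return () => clearInterval(timer);
}

// Usage with a stand-in object (not the real MicVAD):
// const fakeVad = { getCurrentAudio: () => new Float32Array(16000) };
// const stop = startAudioPolling(fakeVad, 5000, (audio) => {
//   /* send audio to server, etc */
// });
// stop(); // call when the session ends
```

Keeping the single step (`pollOnce`) separate from the timer makes it easy to try against a fake object first, before any such method exists on the real VAD instance.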
Hi @ricky0123, sorry to jump in: will that audio have been processed by the VAD model? |
@ricky0123 Hello Ricky, thanks for the great package, is the above feature implemented in the package? I am looking for the same thing to do a real time stream of audio to the server!! |
@ricky0123 Hi, is it done? |
Hi, first of all, thank you so much for this awesome package; it's working great for me. The only thing I am wondering is whether I can get the speech every 5 seconds instead of only after a pause. Right now, if the user speaks continuously, it doesn't give us any results, which shows up as latency in the UI. Please help, and thanks in advance.