Dialogflow service cannot handle audio more than 60s. Google transcribe cannot handle audio more than 305s #7
Comments
Could you turn the FreeSWITCH log level to debug, reproduce the problem, and then send me the (entire) log? Feel free to either send me a link to a private gist or email it to me directly. |
This also happened to me. Only these three lines in the log are relevant:
|
OK, just wondering about the scenario where this happens. Was there literally 5 minutes with no spoken audio? Or was it five minutes with some sort of speech input, but it just wasn't recognized as an intent? |
There was continuous speech on the call, and detection returned several transcripts correctly. |
OK, just to confirm: you are using mod_google_transcribe, correct? |
Actually, how are the two channels arranged? Are they bridged together, and do you want to transcribe each? are you calling Looking at the code, it doesn't look like I implemented support for sending two channels at once to Google (and transcribing each separately), so perhaps I should implement that. But I would like to understand how you are trying to transcribe the two channels in your app. |
Two channels are bridged, yes. One has no mic, so no sound. |
ok, then I am confused. The channel that has speech -- the one you are transcribing -- was returning transcripts correctly? Or it was for a bit, and then eventually it went 305 seconds without returning a transcript? Assuming the latter, I think all I could (should) do is to return a timeout event to your application. Do you agree? |
It did return transcripts correctly, but after 305 seconds it stopped. A bridged call has 2 channels (UUIDs), and each channel (UUID) has 2 audio streams (sent and received). As I mentioned earlier, the |
no, if you called Are you saying it returned several transcripts correctly, THEN there was a pause of 305 seconds, followed by that error? Or are you saying that 305 seconds from calling |
OK, thanks, good to know.
There was no pause. I was talking, it returned the transcriptions almost continuously. It stopped 305 seconds from |
And you got a final transcription during that time? I'd really like to see the entire freeswitch log (debug level) from the time of the call to |
Here you go. Timestamp difference is exactly 305s. |
OK, thanks, that was helpful. It looks like Google simply limits the length of a long-running recognize operation. Perhaps I just need to respond to this error under the covers by starting another one, or else send an event to the application so that it can restart it. Let me think about that. By the way, note the |
Actually, it is a Google limitation (GSR): they recommend refreshing the gRPC channel every 5 minutes, i.e. destroying and recreating it every 5 minutes. |
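One way to picture the "refresh every 5 minutes" advice is to rotate the stream slightly before the limit instead of waiting for the error. The following is only a minimal sketch in Python, not code from this module: `assign_sessions`, the 15-second safety margin, and the idea that the caller opens a fresh stream whenever the session index changes are all assumptions made for illustration.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple, TypeVar

T = TypeVar("T")

STREAM_LIMIT_S = 305.0   # limit quoted in the error message in this thread
SAFETY_MARGIN_S = 15.0   # assumption: rotate a little before the limit


def assign_sessions(
    chunks: Iterable[T],
    limit_s: float = STREAM_LIMIT_S - SAFETY_MARGIN_S,
    clock: Callable[[], float] = time.monotonic,
) -> Iterator[Tuple[int, T]]:
    """Tag each audio chunk with the index of the stream it belongs to.

    The index increases once the current stream has been open for at
    least limit_s seconds. The caller (hypothetical) is expected to close
    the old stream and open a new one whenever the index changes.
    """
    session = 0
    started = clock()
    for chunk in chunks:
        now = clock()
        if now - started >= limit_s:
            # Current stream is about to hit the limit: start a new one.
            session += 1
            started = now
        yield session, chunk
```

Rotating on a chunk boundary can still split an utterance across two streams, so an application would probably want to rotate during silence where possible.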
OK, I (finally) have a partial fix for this -- I've addressed the issue with Google speech-to-text where transcriptions end after 305 seconds. There is some work on the application side to handle this -- see the updated google_transcribe example for a working example. First of all, I have added a new freeswitch event ( The application is responsible for handling that event and restarting the transcription, as shown here. This requires a build of mod_google_transcribe with the commit referenced above. The similar dialogflow issue is still outstanding (probably needs a similar fix), but @prady77 if possible please test and let me know if this resolves your issue. |
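The application-side pattern described in that comment (receive an event, restart the transcription) could look roughly like the sketch below. It is written in Python purely for illustration and is not the linked example. The event names `transcript` and `max_duration_exceeded` are hypothetical placeholders, since the real event name is not visible in this thread, and `start_transcription` stands in for whatever call the application uses to begin transcribing.

```python
from typing import Callable, List

# Hypothetical event names; the real name added to mod_google_transcribe
# is not shown in this thread.
EVT_TRANSCRIPT = "transcript"
EVT_MAX_DURATION = "max_duration_exceeded"


class RestartingTranscriber:
    """Restart transcription whenever the duration-limit event arrives."""

    def __init__(self, start_transcription: Callable[[], None]) -> None:
        # start_transcription is a stand-in for the application's own
        # "begin transcribing this channel" call (an assumption).
        self._start = start_transcription
        self.transcripts: List[str] = []
        self.restarts = 0

    def start(self) -> None:
        self._start()

    def on_event(self, name: str, body: str) -> None:
        if name == EVT_TRANSCRIPT:
            self.transcripts.append(body)
        elif name == EVT_MAX_DURATION:
            # The provider closed the stream at its limit: open a new one.
            self.restarts += 1
            self._start()
```

Keeping the restart in the application rather than inside the module leaves the decision with the caller, for example to stop transcribing once the call has ended.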
Hello Dave
Amazing work!!!
I am hitting the 60s limit for Dialogflow (DF) and the 305s limit for speech. They ask to refresh the channel. Any ideas on how to refresh the stream every 60s or 305s?
"Reason-Code":11,"Reason-Msg":"Exceeded maximum allowed stream duration of 305 seconds." GRPC status details.
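For reference, gRPC status code 11 is OUT_OF_RANGE. A minimal Python sketch of how an application might recognize this particular error and read the limit out of the message follows. It assumes the two fields shown above arrive inside a JSON object; `stream_limit_seconds` is a hypothetical helper, and the regular expression only matches the wording quoted here.

```python
import json
import re
from typing import Optional

OUT_OF_RANGE = 11  # gRPC status code OUT_OF_RANGE

# Matches only the message wording quoted in this issue.
_LIMIT_RE = re.compile(r"maximum allowed stream duration of (\d+) seconds")


def stream_limit_seconds(event_json: str) -> Optional[int]:
    """Return the stream duration limit in seconds, or None if the event
    is not the duration-limit error quoted in this issue."""
    data = json.loads(event_json)
    if data.get("Reason-Code") != OUT_OF_RANGE:
        return None
    match = _LIMIT_RE.search(data.get("Reason-Msg", ""))
    return int(match.group(1)) if match else None
```

A non-None result is the cue to restart the stream rather than treat the failure as fatal.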