Dialogflow service cannot handle audio more than 60s. Google transcribe cannot handle audio more than 305s #7

prady77 · 2019-07-26T18:30:25Z

Hello Dave

Amazing work !!!

I am running into the stream duration limits: 60s for Dialogflow (DF) and 305s for speech. Google asks that the channel be refreshed. Any ideas on how to refresh the stream every 60s or 305s?

"Reason-Code":11,"Reason-Msg":"Exceeded maximum allowed stream duration of 305 seconds." GRPC status details.

davehorton · 2019-07-26T18:39:29Z

could you turn freeswitch log levels to debug and recreate, then send me the (entire) log? Feel free to either send me a link to a private gist or email it to me directly

viktorsperl · 2019-08-12T14:05:24Z

This also happened to me. Only these three lines in the log are relevant:

2019-08-12 15:47:59.633164 [ERR] google_glue.cpp:165 grpc_read_thread: error Exceeded maximum allowed stream duration of 305 seconds. (11)
2019-08-12 15:47:59.633164 [DEBUG] google_glue.cpp:221 grpc_read_thread: got 0 responses
2019-08-12 15:47:59.633164 [DEBUG] google_glue.cpp:231 grpc_read_thread: finish() status Exceeded maximum allowed stream duration of 305 seconds. (11)
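
For anyone scanning logs for this condition, here is a minimal sketch of pulling the limit and status code out of lines like the ones above. The function name `parse_stream_limit` is hypothetical and not part of the module; the only outside fact used is that gRPC status code 11 is OUT_OF_RANGE.

```python
import re

# gRPC status code 11 is OUT_OF_RANGE.
GRPC_OUT_OF_RANGE = 11

# Matches the message text as it appears in the log lines quoted above.
_LIMIT_RE = re.compile(
    r"Exceeded maximum allowed stream duration of (\d+) seconds\.\s*\((\d+)\)"
)


def parse_stream_limit(line):
    """Return (limit_seconds, grpc_code) if the line reports the
    stream duration limit, otherwise None. Hypothetical helper."""
    match = _LIMIT_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


sample = (
    "2019-08-12 15:47:59.633164 [ERR] google_glue.cpp:165 grpc_read_thread: "
    "error Exceeded maximum allowed stream duration of 305 seconds. (11)"
)
result = parse_stream_limit(sample)
if result is not None and result[1] == GRPC_OUT_OF_RANGE:
    print(f"stream hit the {result[0]}s limit")
```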

davehorton · 2019-08-12T14:12:49Z

OK, just wondering about the scenario where this happens. Was there literally 5 minutes with no spoken audio? Or was it five minutes with some sort of speech input, but it just wasn't recognized as an intent?

viktorsperl · 2019-08-12T16:12:01Z

There was continuous speech on the call, and detection returned several transcripts correctly.
However, speech was present on only one of the two channels; the other one was completely silent. Maybe I should set GOOGLE_SPEECH_SEPARATE_RECOGNITION_PER_CHANNEL on this call?

davehorton · 2019-08-12T20:50:40Z

OK, just to confirm: you are using mod_google_transcribe, correct?

davehorton · 2019-08-12T20:55:48Z

actually, how are the two channels arranged? Are they bridged together, and you want to transcribe each? are you calling uuid_google_transcribe on each separately?

Looking at the code, it doesn't look like I implemented support for sending two channels at once to google (and transcribing each separately) so perhaps I should implement that.

But I would like to understand how you are trying to transcribe the two channels in your app.

viktorsperl · 2019-08-13T04:03:33Z

Two channels are bridged, yes. One has no mic, so no sound.
uuid_google_transcribe is started only on the channel uuid which has incoming speech.

davehorton · 2019-08-13T11:22:01Z

ok, then I am confused. The channel that has speech -- the one you are transcribing -- was returning transcripts correctly? Or it was for a bit, and then eventually it went 305 seconds without returning a transcript?

Assuming the latter, I think all I could (should) do is to return a timeout event to your application. Do you agree?

viktorsperl · 2019-08-13T11:49:18Z

> The channel that has speech -- the one you are transcribing -- was returning transcripts correctly? Or it was for a bit, and then eventually it went 305 seconds without returning a transcript?

It did return transcripts correctly, but after 305 seconds, it stopped.

A bridged call has 2 channels (uuid-s), each channel (uuid) has 2 audio streams (sent and received). As I mentioned earlier, the uuid_google_transcribe was only started on one channel, but that still means 2 audio streams. One of these contained speech, but the other one was mute. Could this be the issue?

davehorton · 2019-08-13T11:52:27Z

no, if you called uuid_google_transcribe on a single channel, then it would be sending only the received audio on that channel to google speech.

Are you saying it returned several transcripts correctly, THEN there was a pause of 305 seconds, followed by that error? Or are you saying that 305 seconds from calling uuid_google_transcribe that error was returned, even though several transcripts were returned during that 305 seconds?

viktorsperl · 2019-08-13T11:59:20Z

> no, if you called uuid_google_transcribe on a single channel, then it would be sending only the received audio on that channel to google speech.

Ok, thanks, good to know.

> Are you saying it returned several transcripts correctly, THEN there was a pause of 305 seconds, followed by that error? Or are you saying that 305 seconds from calling uuid_google_transcribe that error was returned, even though several transcripts were returned during that 305 seconds?

There was no pause. I was talking, it returned the transcriptions almost continuously. It stopped 305 seconds from uuid_google_transcribe start.

davehorton · 2019-08-13T12:06:46Z

and you got a final transcription during that time ?

I'd really like to see the entire freeswitch log (debug level) from the time of the call to uuid_google_transcribe start to the error

viktorsperl · 2019-08-13T12:22:14Z

Here you go. Timestamp difference is exactly 305s.
call.log

davehorton · 2019-08-14T12:11:43Z

ok, thanks that was helpful. It looks like google simply limits the length of a long-running recognize operation. Perhaps I just need to respond to this error under the covers by starting another one, or else send an event to the application so that it can restart it. Let me think about that.

By the way, note the GOOGLE_SPEECH_SINGLE_UTTERANCE channel variable option -- not sure of your use case, but if you are doing a command and response type of use case, you may want to use this. If you are just wanting to transcribe a complete phone call or something, then the current way you are doing it is better.

prady77 · 2019-09-03T19:13:14Z

Actually, it is a Google limitation (GSR): they recommend refreshing the gRPC channel every 5 minutes, i.e. destroying and recreating it.
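
One way an application could act on that recommendation is to restart the stream shortly before the limit instead of waiting for the error. The sketch below only tracks the deadline; the class name `StreamRefreshTimer` and the 15 second safety margin are assumptions for illustration, and the 305 second figure comes from the error message earlier in this thread.

```python
import time


class StreamRefreshTimer:
    """Tracks how long a stream has been open and reports when it
    should be torn down and recreated. Hypothetical helper."""

    def __init__(self, limit_s=305.0, margin_s=15.0, clock=time.monotonic):
        if margin_s >= limit_s:
            raise ValueError("margin_s must be smaller than limit_s")
        self._limit_s = limit_s
        self._margin_s = margin_s  # assumed safety margin, not from Google
        self._clock = clock
        self._started_at = clock()

    def restart(self):
        """Call this right after the stream has been recreated."""
        self._started_at = self._clock()

    def seconds_until_refresh(self):
        """Seconds left before the stream should be refreshed (never negative)."""
        elapsed = self._clock() - self._started_at
        return max(0.0, self._limit_s - self._margin_s - elapsed)

    def should_refresh(self):
        return self.seconds_until_refresh() == 0.0


timer = StreamRefreshTimer()
if timer.should_refresh():
    # Stop and start the transcription here, then reset the timer.
    timer.restart()
```

The clock is injectable so the logic can be exercised without waiting five minutes.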

davehorton · 2019-09-10T00:39:49Z

OK, I (finally) have a partial fix for this -- I've addressed the issue with using google speech to text where transcriptions end after 305 seconds.

There is some work on the application side to handle this -- see the updated google_transcribe example for a working version.

First of all, I have added a new freeswitch event (google_transcribe::max_duration_exceeded) that the application will receive when google terminates a transcription due to the 305 second limit.

The application is responsible for handling that event and restarting the transcription, as shown here.
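
A minimal sketch of that handling pattern, written in Python rather than taken from the project's own example. Only the event name comes from this thread. The `send_api` callable is hypothetical, and the header names (`Event-Subclass`, `Unique-ID`) and the argument layout of the restart command are assumptions that should be checked against the module's README.

```python
MAX_DURATION_EVENT = "google_transcribe::max_duration_exceeded"


def handle_event(headers, send_api, language="en-US"):
    """Restart transcription when the max-duration event arrives.

    headers:  dict of event headers (header names are assumed)
    send_api: hypothetical callable that submits an API command string
    Returns True if a restart command was issued, False otherwise.
    """
    if headers.get("Event-Subclass") != MAX_DURATION_EVENT:
        return False
    uuid = headers.get("Unique-ID")
    if not uuid:
        return False
    # Assumed command layout: uuid_google_transcribe <uuid> start <lang>
    send_api(f"uuid_google_transcribe {uuid} start {language}")
    return True


# Usage with a stand-in transport that just records the commands.
sent = []
handle_event(
    {"Event-Subclass": MAX_DURATION_EVENT, "Unique-ID": "abc"},
    sent.append,
)
```

Passing the transport in as a callable keeps the sketch independent of any particular event socket library.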

This requires a build of mod_google_transcribe with the commit referenced above.

The similar dialogflow issue is still outstanding (probably needs a similar fix), but @prady77 if possible please test and let me know if this resolves your issue.

davehorton added a commit that referenced this issue Sep 10, 2019

bugfix: google transcribe can not handle more than 305 secs (#7)

7cc0da4

vikash-plivo mentioned this issue Oct 22, 2019

mod_google_transcribe: intermittent crash while initialising google session #13

Closed

vikash-plivo mentioned this issue Nov 5, 2019

[mod_google_transcribe]: Freeswitch crash while ending google transcription #17

Open

anchitDave mentioned this issue Aug 17, 2020

mod_google_tts and mod_google_transcribe stuck #36

Closed

jgoyette-jg mentioned this issue Apr 17, 2023

Fails to start after build on Debian 11 #111

Closed
