This repository has been archived by the owner on Feb 18, 2024. It is now read-only.

isFinal missing on streamingRecognize #65

Closed
blerest opened this issue Apr 25, 2018 · 8 comments
Labels
api: speech Issues related to the googleapis/nodejs-speech API. ml-apis priority: p2 Moderately-important priority. Fix may not be included in next release. 🚨 This issue needs some love. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns.

Comments

@blerest

blerest commented Apr 25, 2018

Sometimes the Speech API gets stuck when I say only one word using streaming recognize. The API recognizes the end of the sentence, since I correctly receive END_OF_SINGLE_UTTERANCE, but I never receive the transcription with isFinal=true.

This is a big problem for me as I use isFinal to reload the API connection. I can reproduce the issue on both API v1 and v1p1beta1.

{ config:
   { encoding: 1,
     sampleRateHertz: 8000,
     languageCode: 'fr-FR',
     maxAlternatives: 0,
     profanityFilter: true },
  singleUtterance: true,
  interimResults: true }

long sentence:

{"results":[{"alternatives":[{"words":[],"transcript":"pour","confidence":0}],"isFinal":false,"stability":0.009999999776482582}],"error":null,"speechEventType":"SPEECH_EVENT_UNSPECIFIED"}
{"results":[{"alternatives":[{"words":[],"transcript":"Bonjour","confidence":0}],"isFinal":false,"stability":0.009999999776482582}],"error":null,"speechEventType":"SPEECH_EVENT_UNSPECIFIED"}
{"results":[{"alternatives":[{"words":[],"transcript":"bonjour est-ce","confidence":0}],"isFinal":false,"stability":0.009999999776482582}],"error":null,"speechEventType":"SPEECH_EVENT_UNSPECIFIED"}
{"results":[{"alternatives":[{"words":[],"transcript":"bonjour est-ce que","confidence":0}],"isFinal":false,"stability":0.009999999776482582}],"error":null,"speechEventType":"SPEECH_EVENT_UNSPECIFIED"}
{"results":[{"alternatives":[{"words":[],"transcript":"bonjour est-ce que ça","confidence":0}],"isFinal":false,"stability":0.009999999776482582}],"error":null,"speechEventType":"SPEECH_EVENT_UNSPECIFIED"}
{"results":[{"alternatives":[{"words":[],"transcript":"bonjour est-ce que ça m'a","confidence":0}],"isFinal":false,"stability":0.009999999776482582}],"error":null,"speechEventType":"SPEECH_EVENT_UNSPECIFIED"}
{"results":[{"alternatives":[{"words":[],"transcript":"bonjour est-ce que ça marche","confidence":0}],"isFinal":false,"stability":0.009999999776482582}],"error":null,"speechEventType":"SPEECH_EVENT_UNSPECIFIED"}
{"results":[{"alternatives":[{"words":[],"transcript":"bonjour est-ce","confidence":0}],"isFinal":false,"stability":0.8999999761581421},{"alternatives":[{"words":[],"transcript":" que ça marche","confidence":0}],"isFinal":false,"stability":0.009999999776482582}],"error":null,"speechEventType":"SPEECH_EVENT_UNSPECIFIED"}
{"results":[{"alternatives":[{"words":[],"transcript":"bonjour est-ce que","confidence":0}],"isFinal":false,"stability":0.8999999761581421},{"alternatives":[{"words":[],"transcript":" ça marche bien","confidence":0}],"isFinal":false,"stability":0.009999999776482582}],"error":null,"speechEventType":"SPEECH_EVENT_UNSPECIFIED"}
{"results":[{"alternatives":[{"words":[],"transcript":"bonjour est-ce que ça","confidence":0}],"isFinal":false,"stability":0.8999999761581421},{"alternatives":[{"words":[],"transcript":" marche bien","confidence":0}],"isFinal":false,"stability":0.009999999776482582}],"error":null,"speechEventType":"SPEECH_EVENT_UNSPECIFIED"}
{"results":[{"alternatives":[{"words":[],"transcript":"bonjour est-ce que ça marche","confidence":0}],"isFinal":false,"stability":0.8999999761581421},{"alternatives":[{"words":[],"transcript":" bien","confidence":0}],"isFinal":false,"stability":0.009999999776482582}],"error":null,"speechEventType":"SPEECH_EVENT_UNSPECIFIED"}
{"results":[{"alternatives":[{"words":[],"transcript":"bonjour est-ce que ça marche bien","confidence":0}],"isFinal":false,"stability":0.8999999761581421}],"error":null,"speechEventType":"SPEECH_EVENT_UNSPECIFIED"}
{"results":[],"error":null,"speechEventType":"END_OF_SINGLE_UTTERANCE"}
{"results":[{"alternatives":[{"words":[],"transcript":"bonjour est-ce que ça marche bien","confidence":0.9081912636756897}],"isFinal":true,"stability":0}],"error":null,"speechEventType":"SPEECH_EVENT_UNSPECIFIED"}

one word sentence:

{"results":[{"alternatives":[{"words":[],"transcript":"un","confidence":0}],"isFinal":false,"stability":0.009999999776482582}],"error":null,"speechEventType":"SPEECH_EVENT_UNSPECIFIED"}
{"results":[{"alternatives":[{"words":[],"transcript":"un","confidence":0}],"isFinal":false,"stability":0.8999999761581421}],"error":null,"speechEventType":"SPEECH_EVENT_UNSPECIFIED"}
{"results":[],"error":null,"speechEventType":"END_OF_SINGLE_UTTERANCE"}
{"results":[],"error":{"details":[],"code":11,"message":"Exceeded maximum allowed stream duration of 65 seconds."},"speechEventType":"SPEECH_EVENT_UNSPECIFIED"}
{ Error: 11 OUT_OF_RANGE: Exceeded maximum allowed stream duration of 65 seconds.
    at createStatusError (node_modules/grpc/src/client.js:64:15)
    at ClientDuplexStream._emitStatusIfDone (node_modules/grpc/src/client.js:270:19)
    at ClientDuplexStream._receiveStatus (node_modules/grpc/src/client.js:248:8)
    at node_modules/grpc/src/client.js:804:12
  code: 11,
  metadata:
   Metadata {
     _internal_repr: { 'content-disposition': [Array], 'x-goog-trace-id': [Array] } },
  details: 'Exceeded maximum allowed stream duration of 65 seconds.' }
@JustinBeckwith JustinBeckwith added priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. labels Jun 1, 2018
@JustinBeckwith JustinBeckwith added the 🚨 This issue needs some love. label Jun 8, 2018
@alisher-agzamov

alisher-agzamov commented Jul 9, 2018

I have the same issue, but not only when I say one word. Sometimes it happens during long recognition. My app uses the confidence value every time I receive isFinal=true. But sometimes during long recognition I receive END_OF_SINGLE_UTTERANCE and cannot use confidence, because it is only ever returned with isFinal=true. This is really important for my app.

Can you return the confidence value even when I receive END_OF_SINGLE_UTTERANCE, or with interim results?

@JustinBeckwith JustinBeckwith added ml-apis triage me I really want to be triaged. labels Sep 21, 2018
@JustinBeckwith JustinBeckwith added priority: p2 Moderately-important priority. Fix may not be included in next release. and removed priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. 🚨 This issue needs some love. triage me I really want to be triaged. labels Oct 1, 2018
@JustinBeckwith JustinBeckwith added the 🚨 This issue needs some love. label Oct 22, 2018
@blerest
Author

blerest commented Dec 5, 2018

I still have this issue. Do you have any information?

@jkwlui jkwlui assigned jkwlui and beccasaurus and unassigned jkwlui Dec 14, 2018
@pohsiu

pohsiu commented Jan 31, 2019

How about restarting the service every minute?
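
A rough sketch of that workaround, under stated assumptions: `startRestartingStream`, `createStream`, `maxMs`, and `timers` are hypothetical names, and the 55-second default is just an arbitrary margin under the 65-second limit in the error above. It only avoids the stream-duration error; it does not address the missing isFinal itself, and an utterance in progress at restart time would be split across two streams.

```javascript
// Hypothetical helper: rotates a recognize-style stream before it hits the
// maximum stream duration. `createStream` is assumed to return a fresh stream
// with on('data'), write(), and end(); `timers` is injectable for testing.
function startRestartingStream(
  createStream,
  onData,
  maxMs = 55000,
  timers = { setTimeout, clearTimeout }
) {
  let current = null;
  let timer = null;
  let stopped = false;

  function open() {
    current = createStream();
    current.on('data', onData);
    timer = timers.setTimeout(restart, maxMs);
  }

  function restart() {
    if (stopped) return;
    // Half-close the old stream but leave its 'data' listener attached,
    // so a late final result from it is still delivered to onData.
    current.end();
    open();
  }

  open();

  return {
    // Audio always goes to whichever stream is currently open.
    write: (chunk) => current.write(chunk),
    stop: () => {
      stopped = true;
      timers.clearTimeout(timer);
      current.end();
    },
  };
}
```

With the real client, `createStream` would presumably be something like `() => client.streamingRecognize(request)`.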

@beccasaurus
Contributor

Just saw this get bumped. Following up on this, will report back.

@callmehiphop
Contributor

@beccasaurus gentle ping!

@0xTea

0xTea commented Jun 13, 2019

Has this issue ever been resolved, @blerest? I have a similar timing issue: I have noticed that the command only becomes final after a new word or sound has been registered between the utterances.

@beccasaurus beccasaurus assigned nnegrey and unassigned beccasaurus Jun 13, 2019
@kidplug

kidplug commented Aug 21, 2019

@blerest
When you get the END_OF_SINGLE_UTTERANCE event, are you then calling stream.end() ?
That's what I'm doing and it works fine... but you still have to keep listening for the "data" event from the stream itself, which should give you the final data event with isFinal:true. Then you'll also get an end event, at which time you can "unlisten" from the stream events: data, error, and end.

From docs:

END_OF_SINGLE_UTTERANCE: This event indicates that the server has detected the end of the user's speech utterance and expects no additional speech. Therefore, the server will not process additional audio (although it may subsequently return additional results). The client should stop sending additional audio data, half-close the gRPC connection, and wait for any additional results until the server closes the gRPC connection. This event is only sent if single_utterance was set to true, and is not used otherwise.
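
The flow described above (half-close on END_OF_SINGLE_UTTERANCE, keep listening for the final data event, unlisten on end) could be sketched roughly as follows. This is a minimal sketch, not code from the library: `waitForFinalResult` and `onFinal` are hypothetical names, and the stream is only assumed to emit `data` events with the response shape shown in the logs in this thread and to expose `end()`.

```javascript
// Hypothetical helper: attaches handlers to a streamingRecognize-style stream.
// Assumes 'data' events carry { results, speechEventType } as logged above.
function waitForFinalResult(recognizeStream, onFinal) {
  let halfClosed = false;

  const onData = (response) => {
    if (response.speechEventType === 'END_OF_SINGLE_UTTERANCE') {
      if (!halfClosed) {
        // Stop sending audio, but keep listening: the result with
        // isFinal=true may arrive after this event.
        halfClosed = true;
        recognizeStream.end();
      }
      return;
    }
    const result = (response.results || [])[0];
    const best = result && result.alternatives && result.alternatives[0];
    if (result && result.isFinal && best) {
      onFinal(null, { transcript: best.transcript, confidence: best.confidence });
    }
  };

  const onError = (err) => {
    cleanup();
    onFinal(err);
  };

  const onEnd = () => cleanup();

  // "Unlisten" from data, error, and end once the stream is done.
  function cleanup() {
    recognizeStream.removeListener('data', onData);
    recognizeStream.removeListener('error', onError);
    recognizeStream.removeListener('end', onEnd);
  }

  recognizeStream.on('data', onData);
  recognizeStream.on('error', onError);
  recognizeStream.on('end', onEnd);
}
```

In real use, `recognizeStream` would be the stream returned by `client.streamingRecognize(request)`. The caller also needs to stop writing (or unpipe) audio once the stream has been half-closed.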

@nnegrey
Contributor

nnegrey commented Sep 13, 2019

+1 to what @kidplug said.

Also tried this myself and got:

# First result is an interim result
results {
  alternatives {
    transcript: "check"
  }
  stability: 0.009999999776482582
  result_end_time {
    nanos: 940000000
  }
}
# Second result is an interim result
results {
  alternatives {
    transcript: "check"
  }
  stability: 0.8999999761581421
  result_end_time {
    seconds: 1
    nanos: 510000000
  }
}
# Speech api detects the end of my utterance
speech_event_type: END_OF_SINGLE_UTTERANCE
# Speech api returns is_final=true
results {
  alternatives {
    transcript: "check"
    confidence: 0.9628270864486694
  }
  is_final: true
  result_end_time {
    seconds: 3
    nanos: 670000000
  }
}

If anyone is still having an issue, please comment again here and I'll try to reproduce. But I'm going to close this for now.

@nnegrey nnegrey closed this as completed Sep 13, 2019
@google-cloud-label-sync google-cloud-label-sync bot added the api: speech Issues related to the googleapis/nodejs-speech API. label Jan 31, 2020

10 participants