InfiniteStreaming with soft handover. #8250
Comments
Hi @bburli, thank you for your submission. If I understand correctly, you are saying that the code sample in InfiniteStreamRecognize.java lacks some functionality, including:
Please note that the samples in this repo demonstrate an opinionated way of using Google APIs. The samples usually require additional work to include all the functionality necessary to use the code in production. For example, the error handling is very basic and does not demonstrate exponential backoff or other error handling practices that aren't necessarily related to demonstrating the specific Google API. If you find this functionality important and would consider contributing to the collection of samples, we welcome contributions to the code samples.
@minherz I agree with the nature of the repo, and I am aware that these samples are not meant to be used in production as is and would require work on operational concerns. That said, I do think that resetting the stream because of the 5-minute timeout is fundamentally an error handling problem. Since most consumers use the Streaming Speech API in real time, I think it's essential to provide this example, as it shows the only other approach to real-time stream switching that I could think of. I would be happy to open a PR for this against the repo and tag you for review. Please do suggest any other reviewers. Regardless of whether this goes into the repo (for any reason whatsoever), I would welcome feedback of any kind. I will keep this open until the PR is up.
- Adding a sample for Soft Handover in stream switching. Please refer to GoogleCloudPlatform#8250 for issue background.
Closing. See my PR comment for context. @bburli posted his code sample and the discussion at https://www.googlecloudcommunity.com/gc/AI-ML/Soft-Handover-in-Infinite-streaming/m-p/602877/thread-id/2153. Thanks, Badari.
Context
This feature suggestion is inspired by the "Soft Handover" that happens between mobile towers. For details: https://en.wikipedia.org/wiki/Handover#Types
Basically, given the 5-minute streaming limit for Google STT, the repo provides an example that continues to stream audio by opening a new stream and resending audio, calculating which audio packets to resend from the end time of the last final transcript. Please correct me if my understanding is incorrect.
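To make the resend calculation concrete, here is a minimal sketch of the idea as I understand it. It is not the code from InfiniteStreamRecognize.java: the class name, method name, and parameters are hypothetical, and it assumes that all buffered chunks of the previous stream have the same duration.

```java
import java.util.ArrayList;
import java.util.List;

public class ResendFromFinalSketch {

    /**
     * Hypothetical helper: returns the buffered chunks of the previous stream
     * that start at or after the chunk containing the end time of the last
     * final transcript. These are the chunks to resend on the new stream.
     * Assumes every chunk covers an equal share of streamDurationMs.
     */
    static <T> List<T> chunksToResend(
            List<T> previousChunks, long streamDurationMs, long lastFinalEndMs) {
        if (previousChunks.isEmpty() || streamDurationMs <= 0) {
            return new ArrayList<>();
        }
        // Average duration of one chunk in milliseconds.
        double chunkMs = (double) streamDurationMs / previousChunks.size();
        // Index of the chunk that contains the end of the last final result.
        int first = (int) Math.floor(lastFinalEndMs / chunkMs);
        // Clamp so that out-of-range end times do not throw.
        first = Math.max(0, Math.min(first, previousChunks.size()));
        return new ArrayList<>(previousChunks.subList(first, previousChunks.size()));
    }
}
```

Note that the result depends entirely on the end time reported for the last final transcript, so any inaccuracy in that timing shifts the resend point.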
This uses what is referred to as "break-before-make" (as mentioned in the wiki article above). This has some problems:
Alternative:
I tried a "soft handover" method where we open another stream early (say, 1 min before the limit) and send audio to both streams until the transcripts align. When they do, we switch to the new stream.
While this has its own problems, I believe this gives better control and is not tied to word timings.
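As an illustration of what "until the transcripts align" could mean, here is a minimal sketch of one possible alignment check. It is an assumption for discussion, not the logic in the attached file: the class, the method names, and the rule "the last minWords words of both interim transcripts match, ignoring case" are all hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

public class SoftHandoverSketch {

    /** Splits a transcript into lower-cased words; returns an empty list for blank input. */
    static List<String> words(String transcript) {
        String t = transcript.trim().toLowerCase(Locale.ROOT);
        return t.isEmpty() ? Collections.<String>emptyList() : Arrays.asList(t.split("\\s+"));
    }

    /**
     * Hypothetical alignment rule: the two streams are considered aligned when
     * the last minWords words of the old and new transcripts are identical.
     * The new stream started later, so only the tails are compared.
     */
    static boolean aligned(String oldTranscript, String newTranscript, int minWords) {
        List<String> a = words(oldTranscript);
        List<String> b = words(newTranscript);
        if (minWords <= 0 || a.size() < minWords || b.size() < minWords) {
            return false;
        }
        return a.subList(a.size() - minWords, a.size())
                .equals(b.subList(b.size() - minWords, b.size()));
    }
}
```

While both streams are open, the caller would run this check on each pair of latest results and switch to the new stream the first time it returns true. A check like this compares words only, so it does not rely on word timings.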
I wanted to get feedback on the same. Attaching the code file (in Java) here. For testing purposes, I have kept the STREAMING_LIMIT as 30 seconds. I am aware of some areas where I can sharpen this more, but I am looking for concrete and major concerns from experts or authors.
Thanks in advance!
InfiniteStreamRecognizeSoftHandOver.java.txt