Multi-utterance extension of speech_recognizer.recognize_once() API #397

Closed
larschristensen opened this issue Oct 11, 2019 · 7 comments

@larschristensen

I'm using the speech_recognizer.recognize_once() method for synchronous (blocking) transcription of audio files. However, this method recognizes only a single utterance, and the files I wish to transcribe can contain multiple utterances.

What is the recommended approach for synchronous/blocking multi-utterance transcription of audio files? Would it be possible to extend the speech_recognizer.recognize_once() API to also accept multi-utterance audio files?

@chlandsi
Contributor

Please have a look at the continuous recognition mode. There is a sample here. This comment shows how to accumulate all results from continuous recognition.
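A minimal sketch of how continuous recognition could be wrapped into a blocking, multi-utterance call. This is not the linked sample; the helper name transcribe_all and its timeout parameter are hypothetical. It assumes a recognizer that exposes the recognized, session_stopped and canceled signals (each with connect()) plus start_continuous_recognition() and stop_continuous_recognition(), as the Python Speech SDK's SpeechRecognizer does.

```python
import threading


def transcribe_all(recognizer, timeout=None):
    """Run continuous recognition and block until the session ends.

    Returns the list of non-empty recognized utterances, in order.
    `recognizer` is assumed to behave like the SDK's SpeechRecognizer.
    """
    done = threading.Event()
    utterances = []

    def on_recognized(evt):
        # One event per finalized utterance; skip empty results.
        if evt.result.text:
            utterances.append(evt.result.text)

    def on_stop(evt):
        # Only signal completion here; stop is called outside the callback.
        done.set()

    recognizer.recognized.connect(on_recognized)
    recognizer.session_stopped.connect(on_stop)
    recognizer.canceled.connect(on_stop)

    recognizer.start_continuous_recognition()
    done.wait(timeout)  # block the caller until the audio is exhausted
    recognizer.stop_continuous_recognition()
    return utterances
```

With the SDK installed, the recognizer would typically be built from a SpeechConfig and an AudioConfig(filename=...) and then passed in, for example text = " ".join(transcribe_all(recognizer)).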

@larschristensen
Author

Thanks for the information; the sample code seems to work fine with multiple utterances. However, in continuous recognition mode I can't seem to delete the audio file immediately after recognition completes. Is there something I have to do to close the audio file used for recognition?

@chlandsi
Contributor

You could try calling del recognizer after the recognition is finished (i.e., after you have received a session stopped event) to clean up the recognizer resources. Also have a look at the Batch API (sample); it may fit your needs.

@larschristensen
Author

I have tried deleting speech_recognizer and the other objects after recognition is completed, but it doesn't make any difference. Could it be that the SDK does not correctly release the file after use when speech_recognizer.start_continuous_recognition() is used?

@chlandsi
Contributor

Hi @larschristensen, could you share the code that shows the problem you are having? Also, could you check the discussion in #352 to see whether it might be related? Thanks!

@larschristensen
Author

@chlandsi Thanks for the input. The issue identified in #352 indeed seems to be the same one I'm having: if I run del speech_recognizer._impl, the clean-up does its job and I can delete the audio file afterwards without problems. Maybe you should update the sample code and/or SDK to avoid this problem?

@chlandsi
Contributor

@larschristensen Good to know that this is the underlying issue. You should then also be able to solve the problem by calling recognizer.canceled.disconnect_all() (and likewise for the other signals) after recognition has finished, or by moving the call to stop_continuous_recognition out of the callback. We do indeed have work items in the backlog to make this easier and clearer, but no ETA yet.
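As a rough sketch of the first option, the helper below disconnects every signal on a recognizer once recognition has finished. The function name disconnect_signals and the SIGNAL_NAMES tuple are hypothetical; the listed attribute names are assumed to match the signals on the Python SDK's SpeechRecognizer, and any that are missing on the object are skipped.

```python
# Signal attributes assumed to exist on the recognizer (hypothetical list).
SIGNAL_NAMES = (
    "recognizing",
    "recognized",
    "canceled",
    "session_started",
    "session_stopped",
    "speech_start_detected",
    "speech_end_detected",
)


def disconnect_signals(recognizer):
    """Call disconnect_all() on each known signal of `recognizer`.

    Returns the names of the signals that were found and disconnected.
    """
    disconnected = []
    for name in SIGNAL_NAMES:
        signal = getattr(recognizer, name, None)
        if signal is not None:
            signal.disconnect_all()  # drop all callbacks on this signal
            disconnected.append(name)
    return disconnected
```

A possible sequence after the session has stopped is disconnect_signals(recognizer), then del recognizer, and only then removing the audio file. Whether this releases the file on a given SDK version is untested here.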

I'll proceed to close the issue; please reopen it if you continue to have problems. Thanks!
