Support to Google Speech To Text transcription format #152

Closed
eyal13579 opened this issue May 7, 2019 · 5 comments
Labels: Enhancement, help wanted, Speech To Text Adapters

Comments


eyal13579 commented May 7, 2019

Are there any plans to support Google Speech To Text?

eyal13579 added the Enhancement label May 7, 2019
pietrop added the Speech To Text Adapters label May 7, 2019
Contributor

pietrop commented May 7, 2019

Hi @eyal13579,
At the moment we rely on open source contributors to create the adapters they need, since we only use BBC Kaldi in our own projects. We have written a guide to help with the process.

If you are interested in creating a Google STT adapter and need any help setting up a PR, let me know.

pietrop added the help wanted label May 7, 2019
Contributor

sshniro commented Jul 9, 2019

Hi @pietrop, I'm familiar with Google STT and would love to contribute to this development. I have gone through the helper docs and need one small clarification about which approach to take. Google STT provides speaker diarization as a beta feature, so is it okay if I use the beta results to create the sample JSON?

Contributor

pietrop commented Jul 9, 2019

Hi @sshniro, that's awesome, thanks!

Re: speaker info, it sounds good to use the beta feature for Google STT speaker diarization.

We have taken a flexible approach in the current module: many of these STT services treat such attributes as optional, so we didn't want to impose them as defaults.

You can see in the other modules that we check whether the attribute is available. If it is, we break the paragraphs down based on the speaker diarization; if it isn't, we break them down based on punctuation and add placeholder names to the speaker labels. The user can then change the size of the paragraphs and the speaker labels in the UI.

For example, in /packages/stt-adapters/bbc-kaldi we check whether the attribute is present in the JSON (/packages/stt-adapters/bbc-kaldi/index.js#L57) and then call either groupWordsInParagraphs (in /packages/stt-adapters/bbc-kaldi/index.js#L15) or groupWordsInParagraphsBySpeakers (in /packages/stt-adapters/bbc-kaldi/group-words-by-speakers.js).

I think it would be good to add support for both: creating paragraphs based on speaker labels, and on punctuation as a fallback when speaker labels are not provided. However, if you want to prioritise the speaker labels and leave the punctuation one for a separate PR, that's also fine.
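
The speaker-first, punctuation-fallback grouping described above can be sketched roughly as follows. This is a minimal illustration and not the actual bbc-kaldi adapter code: the word shape `{ text, start, end, speaker }`, the function names `splitOnSpeakerChange`, `splitOnPunctuation` and `toParagraphs`, and the `Speaker N` placeholder labels are all assumptions made for the example.

```javascript
// Hypothetical word shape: { text, start, end, speaker? }.
// None of these names come from the real adapters; they are illustrative only.

// Preferred path: start a new paragraph whenever the speaker changes.
function splitOnSpeakerChange(words) {
  const paragraphs = [];
  for (const word of words) {
    const last = paragraphs[paragraphs.length - 1];
    if (last && last.speaker === word.speaker) {
      last.words.push(word);
    } else {
      paragraphs.push({ speaker: word.speaker, words: [word] });
    }
  }
  return paragraphs;
}

// Fallback path: close a paragraph after a word ending in ".", "?" or "!",
// and give each paragraph a placeholder speaker label.
function splitOnPunctuation(words) {
  const paragraphs = [];
  let current = [];
  const flush = () => {
    if (current.length === 0) return;
    paragraphs.push({ speaker: `Speaker ${paragraphs.length + 1}`, words: current });
    current = [];
  };
  for (const word of words) {
    current.push(word);
    if (/[.?!]$/.test(word.text)) flush();
  }
  flush(); // keep any trailing words that had no closing punctuation
  return paragraphs;
}

// Use speaker labels only when every word carries one; otherwise fall back.
function toParagraphs(words) {
  const hasSpeakers =
    words.length > 0 && words.every((w) => w.speaker !== undefined);
  return hasSpeakers ? splitOnSpeakerChange(words) : splitOnPunctuation(words);
}
```

Because the speaker check happens once in `toParagraphs`, an adapter for an STT service without diarization can reuse the same entry point and simply omit the `speaker` field.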

Let me know if you have any questions.

Contributor

sshniro commented Jul 18, 2019

Hi @pietrop, I have created a PR for the GCP to Draft JS conversion. The PR does not cover speaker diarization; I will send another PR with the segmentation soon.
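
For readers unfamiliar with the input side of such a conversion, here is a rough sketch of flattening a Google STT response into a plain word list, without diarization. It is not the code from the PR. It assumes the REST-style JSON shape (`results[].alternatives[].words[]`, with `startTime` and `endTime` as duration strings such as `"1.300s"`); the helper names `durationToSeconds` and `flattenGcpWords` and the output shape `{ text, start, end }` are hypothetical.

```javascript
// Assumed input: REST-style Google STT JSON, where each word looks like
// { word: "hello", startTime: "1.300s", endTime: "1.700s" }.
// Helper names and the output shape are illustrative, not taken from the PR.

// Convert a duration string such as "1.300s" into a number of seconds.
function durationToSeconds(duration) {
  return parseFloat(String(duration).replace(/s$/, ''));
}

// Collect the words of the first alternative of every result, in order.
function flattenGcpWords(response) {
  const words = [];
  for (const result of response.results || []) {
    const best = (result.alternatives || [])[0];
    if (!best || !best.words) continue; // skip results without word timings
    for (const w of best.words) {
      words.push({
        text: w.word,
        start: durationToSeconds(w.startTime),
        end: durationToSeconds(w.endTime),
      });
    }
  }
  return words;
}
```

A word list in this shape could then be grouped into paragraphs on punctuation before being turned into Draft JS blocks.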

Contributor

pietrop commented Jul 19, 2019

Closing, as this was addressed by @sshniro in PR #167.

pietrop closed this as completed Jul 19, 2019