Use DeepSpeech for verification #1666

Closed
davidak opened this issue Dec 4, 2018 · 7 comments

@davidak

davidak commented Dec 4, 2018

Could DeepSpeech be used for verification? When two verifications are needed, one could come from DeepSpeech when its recognition confidence is very high.
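
A minimal sketch of how such a machine vote might work, assuming the recognizer returns a transcript plus a confidence score in the range 0 to 1. The names `machine_vote` and `normalize` and the 0.95 threshold are hypothetical, chosen only for illustration; how the transcript and confidence are obtained from DeepSpeech is left out.

```python
import re
from typing import Optional


def normalize(text: str) -> str:
    """Lowercase the text and drop punctuation so only the words are compared."""
    return " ".join(re.findall(r"[\w']+", text.lower()))


def machine_vote(
    expected: str,
    transcript: str,
    confidence: float,
    min_confidence: float = 0.95,  # assumed threshold, would need tuning per language
) -> Optional[bool]:
    """Return True to cast one positive vote, or None to abstain.

    The recognizer never casts a negative vote here; anything it is unsure
    about is left entirely to human validators.
    """
    if confidence >= min_confidence and normalize(transcript) == normalize(expected):
        return True
    return None
```

For example, `machine_vote("Hello, world!", "hello world", 0.99)` counts as one vote, while the same transcript at confidence 0.5 abstains.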

It should be able to detect when a completely different sentence was recorded and flag it. Relates to #272
It could also detect offensive words.
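
One way to detect a completely different sentence is to compare the transcript with the prompt using word error rate. This is only a sketch under assumptions: `word_error_rate`, `is_different_sentence`, and the 0.5 cut-off are hypothetical and not part of Common Voice or DeepSpeech.

```python
def word_error_rate(expected: str, transcript: str) -> float:
    """Word-level edit distance divided by the number of expected words."""
    ref = expected.lower().split()
    hyp = transcript.lower().split()
    if not ref:
        return 0.0 if not hyp else 1.0

    # Classic dynamic-programming edit distance, keeping one row at a time.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            cur.append(min(prev[j] + 1,          # word missing from transcript
                           cur[j - 1] + 1,       # extra word in transcript
                           prev[j - 1] + cost))  # match or substitution
        prev = cur
    return prev[-1] / len(ref)


def is_different_sentence(expected: str, transcript: str, max_wer: float = 0.5) -> bool:
    """Flag the clip when the transcript is far from the prompt (assumed cut-off)."""
    return word_error_rate(expected, transcript) > max_wer
```

A clip flagged this way would still go to human review, since a high error rate can also come from a recognition mistake rather than a wrong recording.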

The verification status could be saved and, by comparing it with how users verified the clip, used to further train the model.

It would be nice for recording to have a UX where I just speak sentences and the system detects that I have spoken them, without the need to press any button. Other STT training tools work this way.

@Gregoor
Contributor

Gregoor commented Jan 30, 2019

Interesting idea. @lissyx do you have a take on whether this could work?

@lissyx
Contributor

lissyx commented Jan 30, 2019

@Gregoor this might be tricky to do, especially since Common Voice data is used to train the model. cc @kdavis-mozilla

@Gregoor
Contributor

Gregoor commented Jan 30, 2019

Makes sense, thanks! I guess we'd only do it for clips that have not been released yet then.

@lissyx
Contributor

lissyx commented Jan 30, 2019

But this requires setting up and maintaining infrastructure to deal with that, which seems non-trivial to me.

@davidak
Author

davidak commented Jun 8, 2019

For reference: This idea was also posted on the forum.

https://discourse.mozilla.org/t/use-deepspeech-as-one-positive-validation/41144

We can also use other open datasets to train that DeepSpeech instance.

@MichaelKohler
Member

Closing this given that there is a Discourse post.

@davidak
Author

davidak commented Sep 23, 2022

There are now two high-quality speech recognition systems available:

We can run them BOTH over ALL unvalidated clips in languages that are supported well. When both validate the clip successfully, only one human vote is required. If one agrees and the other disagrees, a human should decide as well. If both disagree, two humans must validate.

We should monitor them closely and could publish a study. Which one is more correct? Which languages work reliably? ...
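
The rule above can be sketched as a small function. This is an illustration only: `human_votes_required` is a hypothetical name, and treating the split case as needing exactly one human vote is my reading of "a human should decide", so it is exposed as a parameter.

```python
def human_votes_required(
    first_accepts: bool,
    second_accepts: bool,
    votes_on_split: int = 1,  # assumption: one human breaks the tie
) -> int:
    """Number of human votes a clip still needs, given two recognizer verdicts."""
    if first_accepts and second_accepts:
        return 1  # both recognizers validated the clip
    if not first_accepts and not second_accepts:
        return 2  # both rejected it, so fall back to full human validation
    return votes_on_split  # the recognizers disagree with each other


# Example: plan the remaining work for a batch of (clip id, verdict, verdict).
verdicts = [("clip-1", True, True), ("clip-2", True, False), ("clip-3", False, False)]
plan = {clip: human_votes_required(a, b) for clip, a, b in verdicts}
```

Logging the recognizer verdicts next to the final human decision would also provide the data needed to compare the two systems per language.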

Then we have validated everything in no time! (for popular languages, increasing the gap)
