stateful aug30 model + VAD gating #712

@benejoseph benejoseph commented Dec 8, 2016

neural net model

  • works across most speakers on the office test set, which was actually recorded from Sense
  • produces numerous false positives on continuous speech, which the "VAD gating" suppresses

VAD gating

  • looks for a sufficiently long period of non-speech (0.75 seconds) before allowing "okay sense" to trigger the callback
  • then gives the speaker two seconds to actually say "okay sense"
  • thus, if something that triggers "okay sense" occurs in the middle of a sentence, it is ignored, but the audio features are still uploaded 😄
  • eliminates over 90% of the false positives I see from my continuous speech samples
    [screenshot, 2016-12-07 4:29:33 pm]
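
For readers following along, here is a minimal sketch of the gating idea described in the list above. It is not the code in this PR. The names `VadGate_t`, `vad_gate_init`, `vad_gate_update`, and `vad_gate_allows_keyword` are hypothetical, and so is the assumption that the two-second window opens at the first speech frame after the quiet period. Only the 0.75 s and 2 s figures come from the description.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MIN_NON_SPEECH_MS 750u   /* quiet time required before the keyword (from the PR text) */
#define KEYWORD_WINDOW_MS 2000u  /* time allowed to say the keyword (from the PR text) */

/* Hypothetical gate state; not a type from the PR. */
typedef struct {
    uint32_t non_speech_ms;   /* length of the current run of non-speech, capped at the minimum */
    uint32_t window_left_ms;  /* time left in which a keyword trigger is accepted */
} VadGate_t;

void vad_gate_init(VadGate_t *gate) {
    assert(gate != NULL);
    gate->non_speech_ms = 0;
    gate->window_left_ms = 0;
}

/* Feed one VAD decision that covers frame_ms milliseconds of audio. */
void vad_gate_update(VadGate_t *gate, bool is_speech, uint32_t frame_ms) {
    assert(gate != NULL);
    if (is_speech && gate->non_speech_ms >= MIN_NON_SPEECH_MS) {
        /* Assumption: speech onset after enough quiet opens the window. */
        gate->window_left_ms = KEYWORD_WINDOW_MS;
    } else if (gate->window_left_ms > frame_ms) {
        gate->window_left_ms -= frame_ms;  /* window keeps counting down */
    } else {
        gate->window_left_ms = 0;          /* window has expired */
    }
    if (is_speech) {
        gate->non_speech_ms = 0;
    } else if (gate->non_speech_ms < MIN_NON_SPEECH_MS) {
        gate->non_speech_ms += frame_ms;
    }
}

/* True while a keyword trigger should be allowed to reach the callback.
 * The gate only decides about the callback; feature upload is unaffected. */
bool vad_gate_allows_keyword(const VadGate_t *gate) {
    assert(gate != NULL);
    return gate->window_left_ms > 0;
}
```

In this sketch a trigger that lands mid-sentence finds the window closed, because the preceding pause never reached 0.75 s. That is the continuous-speech case the gate is meant to filter.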

@benejoseph
Contributor Author

@plasticchris

@plasticchris
Contributor

What if a user is having trouble and keeps repeating "ok sense" over and over? Won't this ensure it never works for them?

@benejoseph
Contributor Author

why don't you try it?

@plasticchris
Contributor

Ok, tried it and verified it's possible to never trigger while repeating "ok sense" over and over

@benejoseph
Contributor Author

and what happens if you wait a second?

@benejoseph
Contributor Author

benejoseph commented Dec 8, 2016

Also, repeating "okay sense" over and over doesn't work with the RNN anyway. We've always had to wait a short period of time.

@plasticchris
Contributor

plasticchris commented Dec 8, 2016 via email

@benejoseph
Contributor Author

I think we should try this on our master units and get feedback. Perhaps your perception isn't shared by everyone else.

@plasticchris
Contributor

plasticchris commented Dec 8, 2016 via email

@pims
Contributor

pims commented Dec 8, 2016

From my anecdotal testing, it feels more responsive.
We can adjust the no-speech delay if it's an issue.

Let's get some real feedback by pushing to master.

@plasticchris
Contributor

Ok, but people in master are already conditioned and/or in the training set. This is important: it will drastically affect the new user experience, as people get increasingly frustrated and then give up.

@plasticchris
Contributor

Is it possible to decouple the VAD gating from the new net?

@benejoseph
Contributor Author

programmatically or at compile time?

@plasticchris
Contributor

Compile time is fine, but it looks like the threshold for ok sense got lowered significantly. It's not clear if the VAD gating is required to make that work.

@benejoseph
Contributor Author

Barring extreme thresholds (0.05 or 0.95), the main influence on false positives is the choice of neural net. This net has an order of magnitude more false positives per hour than the previous net, but it works across many more speakers in the real world. The VAD gating is an attempt to reduce false positives from continuous speech, so I'd like the two to go hand-in-hand. But to disable the VAD gating, all you have to do is comment out the "if" statement surrounding the keyword on_end callback in keyword_net.c.

callback_item->on_end(callback_item->context,(Keyword_t)i, callback_item->max_value);

//if there hasn't been enough non-speech before the keyword "okay sense", then don't do the callback
if (!((flags & TINYFEATS_FLAGS_TRIGGER_PRIMARY_KEYWORD_INVALID) &&
Contributor Author

comment this out to ignore VAD gating
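
If commenting the check in and out gets tedious, a compile-time switch is one option. The sketch below only illustrates the idea and is not code from keyword_net.c: `VAD_GATING_ENABLED`, `DEMO_FLAG_PRIMARY_KEYWORD_INVALID`, and `keyword_callback_allowed` are hypothetical names. The demo flag stands in for `TINYFEATS_FLAGS_TRIGGER_PRIMARY_KEYWORD_INVALID`, whose real value is not shown in this thread. The second half of the real condition is truncated in the snippet above, so the `is_primary_keyword` test is an assumption too.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical build switch: compile with -DVAD_GATING_ENABLED=0 to bypass the gate. */
#ifndef VAD_GATING_ENABLED
#define VAD_GATING_ENABLED 1
#endif

/* Stand-in for TINYFEATS_FLAGS_TRIGGER_PRIMARY_KEYWORD_INVALID; the bit value is assumed. */
#define DEMO_FLAG_PRIMARY_KEYWORD_INVALID (1u << 0)

/* Returns true when the keyword on_end callback should run. */
bool keyword_callback_allowed(uint32_t flags, bool is_primary_keyword) {
#if VAD_GATING_ENABLED
    /* Suppress only the primary keyword, and only when the gate marked it invalid. */
    if (is_primary_keyword && (flags & DEMO_FLAG_PRIMARY_KEYWORD_INVALID)) {
        return false;
    }
#else
    /* Gate compiled out: every keyword reaches its callback. */
    (void)flags;
    (void)is_primary_keyword;
#endif
    return true;
}
```

With the switch left at its default of 1 the gate is active, so the net and the gate still ship together unless a build opts out.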

@benejoseph
Contributor Author

benejoseph commented Dec 8, 2016

don't merge, btw. The VAD gate seems to get stuck.

@plasticchris
Contributor

mmk

@benejoseph
Contributor Author

closing because we're addressing actual root causes! Yippee!

@benejoseph benejoseph closed this Dec 13, 2016
@plasticchris plasticchris deleted the model_aug30_lstm_med_stateful_okay_sense_stop_snooze_tiny_end0_plus_1115_ep050_with_VAD_gate branch December 13, 2016 22:03
3 participants