Browser add-on that has basic control of browser #79

Closed
kdavis-mozilla opened this issue Oct 21, 2016 · 3 comments

Comments

@kdavis-mozilla
Collaborator

commented Oct 21, 2016

No description provided.

@lissyx

Collaborator

commented Nov 29, 2016

Made some progress:

  • bootstrap addon
  • able to listen on the mic continuously (currently with no user interaction; an on/off button in the toolbar would probably be better)
  • looks at the FFT to detect spikes in sound
  • records once sound is detected, until there has been ~2 secs of silence
  • writes to disk as an Ogg/Opus file (this is what Firefox produces)

Next steps:

  • Python WebSocket server that waits for the addon to send some sound
  • converts it to the same kind of audio file as the ones we trained the network against (WAV/PCM?)
  • passes it over to TensorFlow Serving
  • passes the decoded string back to the addon, over WebSocket
  • addon interprets what the server could decode
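
The "record when sound is detected, until ~2 secs of silence" step can be sketched as a small state machine. This is only an illustration, not the addon's code: it assumes the FFT output has already been reduced to one loudness value per frame, and `segment_utterances`, `threshold`, and `frame_secs` are hypothetical names and values.

```python
from typing import Iterable, List, Tuple


def segment_utterances(
    levels: Iterable[float],
    threshold: float = 0.1,     # assumed loudness level that counts as a spike
    frame_secs: float = 0.1,    # assumed duration covered by each level
    silence_secs: float = 2.0,  # the "~2 secs" of silence from the comment
) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices, end exclusive, one per recording."""
    max_quiet = max(1, round(silence_secs / frame_secs))
    segments: List[Tuple[int, int]] = []
    start = None    # frame where the current recording began, if any
    last_loud = -1  # most recent frame at or above the threshold
    quiet = 0       # consecutive quiet frames seen while recording
    for i, level in enumerate(levels):
        if level >= threshold:
            if start is None:
                start = i  # a spike starts a new recording
            last_loud = i
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= max_quiet:
                # Enough silence: close the recording, trimming the silence.
                segments.append((start, last_loud + 1))
                start = None
                quiet = 0
    if start is not None:
        # Input ended while still recording.
        segments.append((start, last_loud + 1))
    return segments
```

Trimming the trailing silence from each segment is a choice made for this sketch; pauses shorter than `silence_secs` stay inside a single recording.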
lissyx pushed 13 commits to lissyx/DeepSpeech that referenced this issue, Nov 29 to Dec 2, 2016
@lissyx

Collaborator

commented Dec 4, 2016

Made some progress:

  • bootstrap addon
  • able to listen on the mic continuously (currently with no user interaction; an on/off button in the toolbar would probably be better)
  • looks at the FFT to detect spikes in sound
  • records once sound is detected, until there has been ~2 secs of silence
  • writes to disk as an Ogg/Opus file (this is what Firefox produces)
  • Python WebSocket server that waits for the addon to send some sound
  • converts it to the same kind of audio file as the ones we trained the network against, WAV/PCM
  • passes it over to TensorFlow Serving
  • passes the decoded string back to the addon, over WebSocket
  • addon interprets what the server could decode
  • "open website in a new tab" command working

Next steps:

  • run this with a properly trained model
  • add more commands:
    -- switch to an open tab
    -- scroll up/down
    -- zoom in/out
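
As a rough illustration of how a decoded string could be mapped to the commands listed above, here is a minimal sketch. It is not the addon's implementation (which would run in the browser): the phrasings, the action names, and `parse_command` itself are all assumptions, since the thread names the intents but not their wording.

```python
import re
from typing import Optional, Tuple

# Hypothetical phrasings and action names for the commands in the list above.
_PATTERNS = [
    (re.compile(r"^open (?P<arg>.+?)(?: in a new tab)?$"), "open_tab"),
    (re.compile(r"^switch to (?P<arg>.+)$"), "switch_tab"),
    (re.compile(r"^scroll (?P<arg>up|down)$"), "scroll"),
    (re.compile(r"^zoom (?P<arg>in|out)$"), "zoom"),
]


def parse_command(decoded: str) -> Optional[Tuple[str, str]]:
    """Map a decoded transcript to (action, argument), or None if unknown."""
    # Lowercase and collapse whitespace so matching is less brittle.
    text = " ".join(decoded.lower().split())
    for pattern, action in _PATTERNS:
        match = pattern.match(text)
        if match:
            return action, match.group("arg")
    return None
```

For example, `parse_command("scroll down")` gives `("scroll", "down")`, and anything that matches no pattern gives `None`, so the addon could simply ignore transcripts it does not understand.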
lissyx pushed 3 commits to lissyx/DeepSpeech that referenced this issue, Dec 5 to Dec 8, 2016
@lock


commented Jan 3, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@lock lock bot locked and limited conversation to collaborators Jan 3, 2019