
Exporting audio model to python? #36

Open
td0m opened this issue Nov 11, 2019 · 13 comments · Fixed by tensorflow/tfjs-models#464
Labels
duplicate: This issue or pull request already exists
feature request: New feature or request

Comments

td0m commented Nov 11, 2019

Is it possible to export the audio model to TFLite, and to include a snippet explaining its usage in Python?

@HalfdanJ (Contributor)

This is high on our wishlist. The issue we haven't solved yet is that the pre-processing of the audio data into the format the network expects hasn't been written for Python yet. As I understand it, FFT processing of audio is handled differently natively in JavaScript and Python, which makes it tricky.

The model used for audio training is https://github.com/tensorflow/tfjs-models/tree/master/speech-commands, which unfortunately does not yet have a Python counterpart.

Contributions to this are very welcome!

@HalfdanJ added the feature request label Nov 12, 2019
td0m (Author) commented Nov 12, 2019

Thanks for the reply @HalfdanJ. Does the speech-commands package work on node.js? I tried running it in a non-browser environment and it didn't seem to work.

@HalfdanJ (Contributor)

I don't believe so

caisq commented Nov 13, 2019

@d0minikt Can you say a little more about your use case? The audio model is tied to WebAudio's frequency analyzer (FFT). This means that in order to use the model in Python, you'll need to find a way to replicate the audio input parameters and the frequency analysis.
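
For concreteness, here is a rough numpy sketch of what replicating WebAudio's AnalyserNode.getFloatFrequencyData might look like. The Blackman window and dB conversion follow the WebAudio spec, but the FFT size, the normalization, and the omission of the analyser's time smoothing are assumptions, not code from Speech Commands:

```python
# Rough numpy approximation of WebAudio's AnalyserNode.getFloatFrequencyData.
# Assumptions: Blackman window and 20*log10 dB conversion per the WebAudio
# spec; fft_size, the 1/fft_size normalization, and the omitted time
# smoothing are simplifications, not Speech Commands code.
import numpy as np

def webaudio_like_fft(frame, fft_size=1024):
    """frame: 1-D array of fft_size time-domain samples in [-1, 1]."""
    windowed = frame * np.blackman(fft_size)   # WebAudio windows before the FFT
    spectrum = np.fft.rfft(windowed)
    magnitude = np.abs(spectrum) / fft_size    # assumed normalization
    return 20.0 * np.log10(np.maximum(magnitude, 1e-12))  # magnitudes in dB

# Example: one 1024-sample frame of a 440 Hz tone at 44.1 kHz.
t = np.arange(1024) / 44100.0
print(webaudio_like_fft(0.5 * np.sin(2 * np.pi * 440.0 * t)).shape)  # (513,)
```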

td0m (Author) commented Nov 14, 2019

@caisq if that's hard to do, would it be easier to port the speech-commands package so that it also runs on node.js? I'm sure I'm not the only one with a headless use case.

lc0 commented Nov 17, 2019

There are DSP ops directly in TensorFlow [1], but I guess it's hard to maintain them across platforms like TFLite and TF.js.

Also, most likely you rely on the browser's optimized FFT.

  1. https://www.tensorflow.org/api_docs/python/tf/signal
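
As a minimal sketch of those tf.signal ops (the frame, step, and FFT sizes below are illustrative, not the values the Speech Commands model expects):

```python
import tensorflow as tf

# One second of audio at 44.1 kHz (zeros as a stand-in for real samples).
waveform = tf.zeros([44100], dtype=tf.float32)

# Short-time Fourier transform via tf.signal.
stft = tf.signal.stft(waveform, frame_length=1024, frame_step=512,
                      fft_length=1024)
spectrogram = tf.abs(stft)  # linear-frequency magnitude spectrogram
print(spectrogram.shape)    # (num_frames, 513)
```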

lc0 commented Nov 17, 2019

Also, are there any plans to support exporting a SavedModel? Currently I only see export to TensorFlow.js.

@nickoala

> As I understand it, FFT processing of audio is handled differently natively in JavaScript and Python, which makes it tricky.
>
> The model used for audio training is ...... which unfortunately does not yet have a Python counterpart.

In case anyone hasn't noticed, the Coral example project Keyphrase detector seems to have the necessary pre-processing code. I'm not sure it's equivalent to the pre-processing in Speech Commands, but at least they both compute Mel spectrograms.

I'm just mentioning this in case it's helpful to someone.
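
For anyone comparing the two pipelines, a Mel spectrogram can be derived from a linear one with tf.signal roughly like this (all parameter values are illustrative; the Coral example may use different ones):

```python
import tensorflow as tf

# Linear-frequency magnitude spectrogram, e.g. from tf.signal.stft
# (1 s of silence at 16 kHz as a stand-in input).
waveform = tf.zeros([16000], dtype=tf.float32)
stft = tf.signal.stft(waveform, frame_length=512, frame_step=256,
                      fft_length=512)
linear_spectrogram = tf.abs(stft)  # shape (num_frames, 257)

# Warp the linear frequency bins onto the Mel scale.
mel_matrix = tf.signal.linear_to_mel_weight_matrix(
    num_mel_bins=40,
    num_spectrogram_bins=linear_spectrogram.shape[-1],
    sample_rate=16000,
    lower_edge_hertz=20.0,
    upper_edge_hertz=7600.0)
mel_spectrogram = tf.matmul(linear_spectrogram, mel_matrix)
print(mel_spectrogram.shape)  # (num_frames, 40)
```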

caisq commented Dec 15, 2019

@nickoala To be clear, I'm pretty sure the preprocessing steps in the Coral example don't fit TF.js Speech Commands, because Speech Commands is based on the browser's WebAudio FFT, which produces a linear-frequency spectrum, not a Mel one.

@nickoala

@caisq, but there is a SOFT_FFT option to speechCommands.create(), right? This file does seem to compute a Mel spectrogram.

caisq commented Dec 15, 2019

@nickoala My apologies: the documentation is not very clear, and some of the code is obsolete. The SOFT_FFT mode does use a Mel spectrum, but the default mode of Speech Commands (BROWSER_FFT) uses the linear spectrum from WebAudio.

@nickoala

I see. Thank you for the clarification.

caisq added a commit to tensorflow/tfjs-models that referenced this issue Jul 7, 2020
…464)

* Fixes googlecreativelab/teachablemachine-community#36

The notebook shows
* how to convert a speech-commands model from the TF.js format
  to the Python (tf.keras) and TFLite formats
* how to run the Python (tf.keras) model for inference.
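
In outline, the conversion path that notebook describes might look roughly like this in Python (the file paths are placeholders; see the notebook itself for the exact steps):

```python
import tensorflow as tf
import tensorflowjs as tfjs

# Load the TF.js Layers model (model.json + weight shards) as tf.keras.
# 'model.json' is a placeholder path to a downloaded Teachable Machine model.
model = tfjs.converters.load_keras_model('model.json')
model.save('speech_commands.h5')  # Python (tf.keras) format

# Convert the Keras model to TFLite.
converter = tf.lite.TFLiteConverter.from_keras_model(model)
with open('speech_commands.tflite', 'wb') as f:
    f.write(converter.convert())
```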
@charlielito

Hey guys, I had the same problem trying to run an audio model on a headless device with Python. I managed to make it work, but with node.js; the same trick could also work with Python. The trick was to launch a headless Chromium with puppeteer, where the JavaScript runs the model; inside the node.js script the predictions are parsed, and then you are free to do whatever you want with them.

I used it to turn my room's light on and off. If you want to check out the code and see how to do it, go to: https://github.com/charlielito/teachable-machines-audio-demo

Any feedback is welcome!
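
A minimal Python sketch of the same trick using pyppeteer (the page URL, the exposed callback name, and the in-page JavaScript it assumes are all hypothetical; the repo above has the real node.js version):

```python
import asyncio
from pyppeteer import launch

def on_prediction(label, score):
    # Called from the page whenever the in-browser model emits a prediction;
    # act on it here (e.g. toggle a light).
    print(label, score)

async def main():
    # The media-stream flag auto-grants the microphone permission prompt.
    browser = await launch(headless=True,
                           args=['--use-fake-ui-for-media-stream'])
    page = await browser.newPage()
    # Expose a Python function that the in-page JavaScript can call.
    await page.exposeFunction('onPrediction', on_prediction)
    # Hypothetical local page that loads the audio model and calls
    # window.onPrediction(label, score) for each recognition result.
    await page.goto('http://localhost:8000/listen.html')
    await asyncio.sleep(60)  # keep listening for a minute
    await browser.close()

asyncio.get_event_loop().run_until_complete(main())
```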
