YAMNet accepts dynamically sized inputs so that it can process audio of different lengths and sample rates. However, when trying to port the model to TFLite, a few problems arise, as seen here.
A workaround could be to modify the input of YAMNet to accept only a fixed number of samples (effectively limiting the input to a certain sample rate and time window). It would be convenient to have an option to set an input size before converting the model to TFLite. If that's not possible, how would one change the input layer of the model?
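One way to pin the input size before conversion is to build a concrete function with a fixed `tf.TensorSpec` and hand that to the TFLite converter. The sketch below (assuming TensorFlow 2.x) uses a trivial `DummyModel` as a stand-in for YAMNet, since the point is only the fixed-shape signature; the same pattern applies to a model restored from a SavedModel or TF-Hub:

```python
import tensorflow as tf

class DummyModel(tf.Module):
    """Stand-in for YAMNet; real code would wrap the loaded model."""
    @tf.function
    def __call__(self, waveform):
        # Placeholder computation; YAMNet would return scores/embeddings.
        return tf.reduce_mean(waveform, keepdims=True)

model = DummyModel()

# Pin the input signature to 15600 samples (0.975 s @ 16 kHz).
concrete_fn = model.__call__.get_concrete_function(
    tf.TensorSpec(shape=[15600], dtype=tf.float32))

converter = tf.lite.TFLiteConverter.from_concrete_functions(
    [concrete_fn], model)
tflite_model = converter.convert()

# The resulting .tflite model now has a static input shape.
interp = tf.lite.Interpreter(model_content=tflite_model)
print(interp.get_input_details()[0]['shape'])
```

With a fully static input shape, tools that reject 1-D or dynamic shapes (such as some embedded analyzers) have a better chance of accepting the model.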
Note: this was tried by running the STM32 X-CUBE-AI model analyzer on a .tflite version of YAMNet converted with `experimental_new_converter = True`. It returns the following message:
Neural Network Tools for STM32AI v1.5.1 (STM.ai v7.0.0-RC8)
NOT IMPLEMENTED: Shape with 1 dimensions not supported: (1,)
We have released a couple of different TF-Lite versions of YAMNet on TF-Hub:
https://tfhub.dev/google/lite-model/yamnet/tflite/1 is an unmodified TF-Lite conversion of YAMNet that accepts variable-length float32 samples as input. This uses the model exporter code from our GitHub repo.
https://tfhub.dev/google/lite-model/yamnet/classification/tflite/1 is a quantized TF-Lite conversion with a simpler signature that accepts a fixed-length 0.975s waveform as input (15600 samples @ 16 kHz). The model is slightly different (Relu6 instead of Relu, to support quantization) and should be more efficient on mobile devices.
It's not clear to me how you are converting YAMNet to TF-Lite, and I know nothing about the analyzer tool you are using, but note that the latest version of TF-Lite natively supports variable-length inputs. If you need fixed-length inputs because you want the model to be quantized, or because your tool is having problems with variable-length inputs, then perhaps you could try the quantized model I linked above.
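For the fixed-length model, audio longer than one window has to be split into consecutive 15600-sample chunks (0.975 s at 16 kHz) before each chunk is fed to the interpreter. A minimal sketch of that framing step, in plain Python (the `frames` helper is illustrative, not part of any YAMNet API):

```python
SAMPLE_RATE = 16000
WINDOW = int(round(0.975 * SAMPLE_RATE))  # 15600 samples per window

def frames(samples, window=WINDOW):
    """Yield consecutive non-overlapping fixed-length windows;
    any trailing samples shorter than one window are dropped."""
    for start in range(0, len(samples) - window + 1, window):
        yield samples[start:start + window]

audio = [0.0] * 40000  # e.g. 2.5 s of silence at 16 kHz
chunks = list(frames(audio))
print(len(chunks), len(chunks[0]))  # 2 windows of 15600 samples each
```

Each chunk can then be passed to the TFLite interpreter's input tensor in turn, and the per-window scores aggregated (e.g. averaged) if a single label for the whole clip is wanted.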
Disclaimer: I know very little about ML/AI.
I want to use the second, fixed-input-size model you linked to for classification on a Coral edge device, but with my own labels (i.e. transfer learning). As I understand it, I can't do transfer learning on .tflite files, and if I were to do transfer learning on the first, dynamic-input-size model you linked to, it wouldn't work on the Coral edge board because of the dynamic-size inputs. Could you make the TF model behind the second link available so I can do transfer learning on it? Or is there a simpler process that I'm unaware of?