
Making predictions with MFCC/stored audio file #24

Closed
PeteSahad opened this issue Dec 2, 2021 · 17 comments
Labels
enhancement (New feature or request) · help wanted (Extra attention is needed)

Comments

@PeteSahad

Hi,

I'm very new to Flutter and TensorFlow, so please bear with me if some of my questions don't make much sense :).

I'm trying to build an app that allows me to record some audio samples. Then I would like to do some classification with the recorded files.

My questions are:

  • Is it possible to make a prediction with a recorded file instead of using the audio stream? (à la model.predict(data) in Python/TensorFlow)
  • I'm using MFCCs in my trained model. I expect I would need to apply some transformation to the recorded audio files before feeding them to the model (as I do in Python; see the sketch below). To what degree is that even possible with this plugin?
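
For reference, what I do in Python looks roughly like this (a minimal sketch; the file name, the saved model and the 40-coefficient MFCC input are placeholders):

Python
import librosa
import numpy as np
import tensorflow as tf

# load the recorded file and extract MFCCs (placeholder path and parameters)
audio, sample_rate = librosa.load("recording.wav", sr=None)
mfccs = librosa.feature.mfcc(y=audio, sr=sample_rate, n_mfcc=40)

# collapse the time axis so the model receives a single 40-value vector
features = np.mean(mfccs.T, axis=0).reshape(1, 40)

model = tf.keras.models.load_model("model.h5")  # placeholder model
prediction = model.predict(features)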

I hope you understand my problem.

Thanks in advance!

@Caldarie
Owner

Caldarie commented Dec 3, 2021

Hi @PeteSahad

To answer your question:

1. Currently, the plugin has no feature to use recorded audio. However, it would be possible to add to the plugin if the audio file were first converted to PCM16 (a rough sketch of that conversion follows after this list). I had planned to implement this feature in the future, but haven't had the time to do so. If you'd like to collaborate and take a shot at it, let me know.
This is now available on package 0.2.2+1

2. Currently, the plugin does not have a feature to convert audio files to MFCC. However, it is possible to add this feature to the plugin. The problem is that it will take much longer to implement, especially if I were to work on it myself. I think you're better off adding MFCC to the model's pipeline instead of manually extracting it with this plugin. More info here
This is now available on package 0.3.0 as an experimental feature
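
For illustration, reading a recorded WAV file into raw PCM16 samples looks roughly like this (a Python sketch using the standard wave module and a placeholder file name; in the plugin itself the equivalent would have to happen on the Android/iOS side):

Python
import wave
import numpy as np

# read the recorded file and pull out its raw frames
# (assumes a mono, 16-bit WAV file)
with wave.open("recording.wav", "rb") as wav:
    sample_rate = wav.getframerate()
    pcm_bytes = wav.readframes(wav.getnframes())

# interpret the bytes as signed 16-bit integers (PCM16)
pcm16 = np.frombuffer(pcm_bytes, dtype=np.int16)

# most models expect floats in [-1, 1], so scale accordingly
audio_float = pcm16.astype(np.float32) / 32768.0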

Let me know if my response answers your questions.

@PeteSahad
Author

Hi Michael,

thanks for the very quick response!

Right now I'm more or less experimenting with how to perform the classification. The requirements are still unclear as to whether I need stored audio files or should classify on the fly, so I'm mainly looking around for potential solutions.

But if I don't find any other solution, it would absolutely make sense for me to base my further work on yours. However, as I mentioned, I'm pretty new to Flutter/TensorFlow, so it would take some time before I can make contributions. Although I will have to implement a solution somehow :).

I'm currently going through your code to understand what you did and whether I could build on your work for my purposes.

I'll have to do some fundamental research first, since I only heard about MFCC for the first time yesterday, so I can't really assess your answer to 2) right now ;).

But thank you very much so far!

@Caldarie
Owner

Caldarie commented Dec 3, 2021

Hi @PeteSahad,

No problems at all.

If you have any questions, let me know. I would be happy to assist you if I can.

@Caldarie Caldarie added the enhancement (New feature or request) label Dec 9, 2021
@Caldarie
Owner

Hey guys, just an update that I'm currently working on making predictions using stored audio. I will post an update once I get it to work.

@PeteSahad
Author

Awesome news!

I'm currently working on a different project, but I'll get back to my audio project in a few weeks. I'm definitely going to use that feature!

I was also thinking about MFCC and feature extraction. I found another project that exposes librosa functionality in Java -> JLibrosa. I'm not sure right now whether that project is still maintained, but I tried a few things and it seemed to work with only small adjustments.
Maybe it is also interesting for your project.

@Caldarie
Owner

Ah, thanks for the input. I had planned to transcribe the JLibrosa library to Swift, but hadn't had the chance to do so. I may work on this feature once I'm done with my current project.

As for the feature to load audio, it's a difficult one to implement (especially on Android). You can find my progress here. It's on a different branch from master.

@Caldarie
Owner

Caldarie commented Jan 7, 2022

Hi, I have published a new release 0.2.2+2. This should allow you to make inferences on stored audio.

If you do run into any bugs, please let me know.

@Caldarie
Owner

Hi @PeteSahad,

I hope it’s not too much to ask, but I was wondering if you could help clarify a concept, since you had some success with the JLibrosa library.

I’ve been trying to wrap my head around MFCC at a deeper level, but I could not figure out how to feed the spectrogram (extracted from the library) to the model. How did you manage it? What values did you use for your parameters, i.e. mel_bin, hop_length, etc.?

@PeteSahad
Author

Hi Caldarie,

Unfortunately, I haven't made it very far yet; I had to work on another project.

I used the following parameters:

Python
spectrogram = librosa.feature.melspectrogram(audio, sr=sample_rate, n_mels=128, n_fft=2048, hop_length=256) 
Java
float [][] melSpectrogram = jLibrosa.generateMelSpectroGram(audioFeatureValues, sampleRate, 2048, 128, 256);

However, I don't really use the spectrogram when building the model. I just used the two functions above to verify that they produce the same values, which they do. [I went step by step to figure out what's done in Python and what the equivalent in Java would be.]

I stopped at the next steps which are in python:

Python
mfccs_features = librosa.feature.mfcc(y=audio, sr=sample_rate, n_mfcc=40)
mfccs_scaled_features = np.mean(mfccs_features.T,axis=0)

In JLibrosa it should be:

Java
float[][] mfccValues = jLibrosa.generateMFCCFeatures(audioFeatureValues, sampleRate, 40);
float[] meanMFCCValues = jLibrosa.generateMeanMFCCFeatures(mfccValues, mfccValues.length, mfccValues[0].length);

but I only get garbage data for meanMFCCValues.

My next step was to check what jLibrosa.generateMeanMFCCFeatures is doing exactly.

I'm not even sure if jLibrosa.generateMeanMFCCFeatures really is the Java equivalent of Python's np.mean(mfccs_features.T, axis=0).

As I said, I'm very new to the topic, so I might be on the wrong path... but I'll start looking deeper into it over the next days and weeks.
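
For reference, the NumPy line just takes the per-coefficient mean across all frames, so the result is one 40-value vector (a minimal sketch with placeholder data):

Python
import numpy as np

# mfccs_features has shape (40, n_frames): 40 coefficients per frame
mfccs_features = np.random.randn(40, 120).astype(np.float32)

# transpose to (n_frames, 40), then average over the frames (axis=0)
mean_mfcc = np.mean(mfccs_features.T, axis=0)

print(mean_mfcc.shape)  # (40,) -- one mean value per MFCC coefficient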

@PeteSahad
Author

Looked into it and figured: if you write garbage code, you get garbage data ;).

I now get the correct data from jLibrosa.generateMeanMFCCFeatures. I will now try to build the model in Java with this data...

@Caldarie
Owner

Hi @PeteSahad

Many thanks for sharing your information. Appreciate it.

I've been trying to implement the Mel spectrogram or MFCC as an input type for this plugin. Unfortunately, the information out there isn't very straightforward about how to fit the spectrogram to the model's input shape.

If you do come across such information, do let me know.
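
For what it's worth, one approach (a rough Python sketch, assuming a model that takes the (1, 40) mean-MFCC vector discussed above; the model path and data are placeholders) is to collapse the time axis and hand the pooled vector to the TFLite interpreter:

Python
import numpy as np
import tensorflow as tf

# mfccs has shape (40, n_frames); average over frames to get 40 values
mfccs = np.random.randn(40, 120).astype(np.float32)  # placeholder data
features = np.mean(mfccs.T, axis=0).reshape(1, 40).astype(np.float32)

# run the .tflite model on the pooled features (placeholder path)
interpreter = tf.lite.Interpreter(model_path="sound_classifier.tflite")
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

interpreter.set_tensor(input_details[0]["index"], features)
interpreter.invoke()
scores = interpreter.get_tensor(output_details[0]["index"])
print(scores)  # e.g. probabilities for [cough, hiss]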

@Caldarie Caldarie changed the title from "Making predictions with a stored audio file" to "Making predictions with MFCC/stored audio file" Jan 13, 2022
@Caldarie Caldarie added the help wanted (Extra attention is needed) label Jan 13, 2022
@Caldarie
Owner

Hi @PeteSahad

I would like to do a few tests for MFCC, but I don’t have a model with that input type. Would you be willing to share a model? Any model is fine, as long as it has MFCC as the input.

@PeteSahad
Author

I only have this basic model for testing: https://drive.google.com/file/d/10ixguuoUKxsryu0MhNZcS19BqNNeOOWD/view?usp=sharing

It has two labels (cough/hiss) to distinguish between a hiss and a cough (who would have guessed...).

The input tensor is (1, 40).

@Caldarie
Owner

Caldarie commented Jan 17, 2022

Hi @PeteSahad

Thanks for the model.

Just an update: although I ran across some problems with rubbish outputs (NaN and infinite values), I solved the problem by padding those values with real numbers.
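
(For illustration, the NumPy equivalent of that padding step would be something along the lines of np.nan_to_num; the exact replacement values used in the plugin may differ:)

Python
import numpy as np

mfcc = np.array([0.5, np.nan, np.inf, -np.inf], dtype=np.float32)

# replace NaN/Inf with finite numbers before feeding them to the model
clean = np.nan_to_num(mfcc, nan=0.0, posinf=0.0, neginf=0.0)
print(clean)  # [0.5 0. 0. 0.]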

Once I figure out how to get iOS/Swift running, I will publish an update.

@Caldarie
Owner

Caldarie commented Mar 8, 2022

Hi @PeteSahad

Maybe a bit late, but if you are still interested in using MFCC, I have added the feature for both Android and iOS on this branch here.

If you have time, can you run a few tests with your own model and let me know if it works for you?

@bdytx5

bdytx5 commented Mar 10, 2022

Hey guys,

As for using MFCCs in your plugin, I would not recommend doing this level of preprocessing outside of the model. It requires the same preprocessing code in both the training step and the inference step, which is difficult to keep in sync and may require continual refactoring... My recommendation is to do the MFCC (or any other preprocessing) in the model itself, which is doable, especially with Keras and TensorFlow, and is supported by TFLite too!
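
For illustration, folding the MFCC computation into the model with TensorFlow's signal ops might look roughly like this (a sketch, not the exact pipeline used here; the frame sizes, mel parameters and label count are placeholders):

Python
import tensorflow as tf

class MFCC(tf.keras.layers.Layer):
    """Computes MFCC frames from a raw waveform inside the model graph."""

    def __init__(self, sample_rate=16000, n_mels=128, n_mfcc=40, **kwargs):
        super().__init__(**kwargs)
        self.sample_rate = sample_rate
        self.n_mels = n_mels
        self.n_mfcc = n_mfcc

    def call(self, waveform):
        stft = tf.signal.stft(waveform, frame_length=2048, frame_step=256)
        spectrogram = tf.abs(stft)
        mel_matrix = tf.signal.linear_to_mel_weight_matrix(
            num_mel_bins=self.n_mels,
            num_spectrogram_bins=spectrogram.shape[-1],
            sample_rate=self.sample_rate)
        mel = tf.tensordot(spectrogram, mel_matrix, 1)
        log_mel = tf.math.log(mel + 1e-6)
        mfcc = tf.signal.mfccs_from_log_mel_spectrograms(log_mel)
        return mfcc[..., :self.n_mfcc]

model = tf.keras.Sequential([
    tf.keras.Input(shape=(16000,)),            # one second of 16 kHz audio
    MFCC(),
    tf.keras.layers.GlobalAveragePooling1D(),  # mean over time, like np.mean above
    tf.keras.layers.Dense(2, activation="softmax"),  # e.g. cough vs. hiss
])

This way, training and inference share exactly the same preprocessing. Be aware that converting such a graph to TFLite may require the Select TF ops option, depending on the TensorFlow version.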

@Caldarie
Owner

@bdytx5 Thank you for the input.

Likewise, I concur with your recommendation.

For those who still wish to use MFCC, I have left this feature open in the new update 0.3.0. Be aware, though, that it's an experimental feature and may not produce the intended results.
