Making predictions with MFCC/stored audio file #24
Comments
Hi @PeteSahad To answer your question:

Let me know if my response answers your questions.
Hi Michael, thanks for the very quick response! As of right now I'm more or less toying around with how to perform the classification. The requirements are still unclear as to whether I need stored audio files or have to do it on the fly, so I'm looking around for potential solutions. But if I don't find any other solution, it would absolutely make sense for me to base my further work on yours. However, as I mentioned, I'm pretty new to Flutter/TensorFlow, so it would take some time before I could make contributions, although I will have to implement a solution somehow :). I'm currently going through your code to understand what you did and whether I could build on your work for my purpose. I'll have to do some fundamental research first, since I heard about MFCC for the first time yesterday, so I can't really assess your answer to 2) right now ;). But thank you very much so far!
Hi @PeteSahad, No problem at all. If you have any questions, let me know. I would be happy to assist you if I can.
Hey guys, just an update that I'm currently working on making predictions using stored audio. I will post an update once I get it to work.
Awesome news! I'm currently working on a different project, but I'll get back to my audio project in a few weeks. I'm definitely going to use that feature! I was also thinking about the MFCC and feature extraction. I found another project, JLibrosa, which exposes librosa functionality in Java. I'm not sure right now whether that project is still maintained, but I tried some things and it seemed to work with only minor adjustments.
Ah, thanks for the input. I had planned to transcribe the JLibrosa library to Swift, but haven't had the chance to do so. I may work on this feature once I'm done with my current project. As for the audio-loading feature, it's a difficult one to implement (especially on Android). You can find my progress here. It's on a different branch from master.
Hi, I have published a new release, 0.2.2+2. This should allow you to make inferences on stored audio. If you run into any bugs, please let me know.
Hi @PeteSahad, I hope it's not too much to ask, but I was wondering if you could help clarify a concept, since you had some success with the JLibrosa library. I've been trying to wrap my head around MFCC at a deeper level, but could not figure out how to feed the spectrogram (extracted with the library) to the model. How did you manage it? What values did you use for your parameters, e.g. the number of mel bins, hop_length, etc.?
Hi Caldarie, unfortunately I didn't make it very far yet; I had to work on another project. I used the following parameters:

Python:

```python
spectrogram = librosa.feature.melspectrogram(audio, sr=sample_rate, n_mels=128, n_fft=2048, hop_length=256)
```

Java:

```java
float[][] melSpectrogram = jLibrosa.generateMelSpectroGram(audioFeatureValues, sampleRate, 2048, 128, 256);
```

However, I don't really use the spectrogram when building the model. I just used the two functions above to verify that they create the same values, which they do. (I went step by step to figure out what's done in Python and what the equivalent in Java would be.) I stopped at the next steps, which in Python are:

```python
mfccs_features = librosa.feature.mfcc(y=audio, sr=sample_rate, n_mfcc=40)
mfccs_scaled_features = np.mean(mfccs_features.T, axis=0)
```

In JLibrosa it should be:

```java
float[][] mfccValues = jLibrosa.generateMFCCFeatures(audioFeatureValues, sampleRate, 40);
float[] meanMFCCValues = jLibrosa.generateMeanMFCCFeatures(mfccValues, mfccValues.length, mfccValues[0].length);
```

but I only get garbage data for meanMFCCValues. My next step is to check what exactly jLibrosa.generateMeanMFCCFeatures does. I'm not even sure it really is the Java equivalent of Python's np.mean(mfccs_features.T, axis=0). As I said, I'm very new to the topic, so I might be on the wrong path... but I'll start looking deeper into it over the next days and weeks.
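As a side note, the `np.mean(mfccs_features.T, axis=0)` step simply averages each of the 40 MFCC coefficients over all frames, so a Java equivalent just has to average each row of the `(n_mfcc, n_frames)` matrix. A minimal NumPy sketch, using a made-up toy matrix in place of real MFCC output:

```python
import numpy as np

# toy stand-in for librosa's MFCC output: 40 coefficients x 5 frames
# (real values would come from librosa.feature.mfcc)
mfccs_features = np.arange(40 * 5, dtype=np.float32).reshape(40, 5)

# librosa returns shape (n_mfcc, n_frames); transposing to (n_frames, n_mfcc)
# and averaging over axis 0 yields one mean value per coefficient
mfccs_scaled_features = np.mean(mfccs_features.T, axis=0)
print(mfccs_scaled_features.shape)  # (40,)

# identical result without the transpose: average over the frame axis
assert np.allclose(mfccs_scaled_features, mfccs_features.mean(axis=1))
```

Whatever `jLibrosa.generateMeanMFCCFeatures` does internally, its output should match this per-coefficient average.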
Looked into it and figured: if you write garbage code, you get garbage data ;). I now get the correct data from jLibrosa.generateMeanMFCCFeatures. I will now try to build the model in Java with this data...
Hi @PeteSahad Many thanks for sharing your information, I appreciate it. I've been trying to implement mel-spectrogram or MFCC as an input type for this plugin. Unfortunately, the information out there isn't very straightforward about how to fit the spectrogram to the model's input shape. If you come across such information, do let me know.
Hi @PeteSahad I would like to do a few tests for MFCC, but I don't have a model with that input type. Would you be willing to share a model? Any model is fine, as long as it has MFCC as the input.
I only have this basic model for testing: https://drive.google.com/file/d/10ixguuoUKxsryu0MhNZcS19BqNNeOOWD/view?usp=sharing It has two labels (cough/hiss) to distinguish between a hiss and a cough (who would have guessed...). The input tensor is (1, 40).
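For anyone wiring this up: with a model whose input tensor is 1x40, as above, the 40-element mean-MFCC vector only needs a batch dimension added before inference. A sketch in NumPy (the vector here is random stand-in data, not real MFCC output):

```python
import numpy as np

# stand-in for the 40 mean MFCC values computed during preprocessing
mean_mfcc_values = np.random.rand(40).astype(np.float32)

# the model expects shape (1, 40): a batch of one 40-coefficient vector
input_data = mean_mfcc_values[np.newaxis, :]
assert input_data.shape == (1, 40)

# with TensorFlow Lite in Python, this tensor would then be passed to
# the interpreter via interpreter.set_tensor(input_index, input_data);
# the Java/Swift APIs take the equivalent float[1][40] / flat buffer
```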
Hi @PeteSahad Thanks for the model. Just an update: although I ran into some problems with rubbish outputs (NaN and infinity), I solved the problem by padding those values with real numbers. Once I figure out how to get iOS/Swift running, I will publish an update.
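The NaN/infinity padding mentioned above can be done in a single call on the Python side with NumPy; the replacement values below (0.0 for NaN, large finite numbers for infinities) are illustrative assumptions, not necessarily the plugin's actual choice:

```python
import numpy as np

# example feature vector contaminated with NaN and infinite values
features = np.array([1.5, np.nan, np.inf, -np.inf, 0.25], dtype=np.float32)

# replace NaN with 0.0 and +/-inf with large finite sentinels
cleaned = np.nan_to_num(features, nan=0.0, posinf=1e6, neginf=-1e6)
assert np.isfinite(cleaned).all()
```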
Hi @PeteSahad Maybe a bit late, but if you are still interested in using MFCC, I have added the feature for both Android and iOS on this branch here. If you have time, can you run a few tests with your own model and let me know if it works for you?
Hey guys, as for using MFCCs in your plugin, I would not recommend doing this level of preprocessing outside of the model. It requires the same preprocessing code in the training step and the inference step, which is difficult to keep in sync and may require continual refactoring. My recommendation is to do the MFCC (or any other preprocessing) in the model itself, which is doable, especially with Keras and TensorFlow, and is supported by TFLite too!
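To illustrate the suggestion above, here is a hedged sketch of folding feature extraction into the model graph with `tf.signal`, so training and inference share one pipeline. The parameter values (16 kHz audio, one-second clips, 40 mel bins) and the layer choices are assumptions for illustration, not the plugin's or the thread's actual setup:

```python
import tensorflow as tf

SAMPLE_RATE = 16000  # assumed: one second of 16 kHz audio per clip

def log_mel_layer(audio):
    # audio: (batch, samples) float32 waveform
    stft = tf.signal.stft(audio, frame_length=1024, frame_step=256)
    spectrogram = tf.abs(stft)
    mel_matrix = tf.signal.linear_to_mel_weight_matrix(
        num_mel_bins=40,
        num_spectrogram_bins=1024 // 2 + 1,
        sample_rate=SAMPLE_RATE)
    mel = tf.matmul(spectrogram, mel_matrix)
    return tf.math.log(mel + 1e-6)

# raw waveform in, class probabilities out; preprocessing lives in the graph
inputs = tf.keras.Input(shape=(SAMPLE_RATE,))
features = tf.keras.layers.Lambda(log_mel_layer)(inputs)
pooled = tf.keras.layers.GlobalAveragePooling1D()(features)
outputs = tf.keras.layers.Dense(2, activation="softmax")(pooled)
model = tf.keras.Model(inputs, outputs)

# the whole pipeline, preprocessing included, converts to TFLite:
# tf.lite.TFLiteConverter.from_keras_model(model).convert()
```

Because the STFT/mel steps are part of the graph, the app only has to hand the model raw audio samples, and there is no librosa/JLibrosa parity problem to maintain.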
@bdytx5 Thank you for the input; I concur with your recommendation. For those who still wish to use MFCC, I have left this feature open in the new update 0.3.0. Be aware though that it's an experimental feature and may not produce the intended results.
Hi,
I'm very new to Flutter and TensorFlow, so please bear with me if some of my questions don't make sense :).
I'm trying to build an app that allows me to record some audio samples. Then I would like to do some classification with the recorded files.
My questions are:
I hope you understand my problem.
Thanks in advance!