Need help with the inputs #35
Comments
Hmm, am I correct to assume that you want to upload an audio file and return an array of 16-bit values? If yes, you may need to edit the package to do so. For example, on Android, the code below returns spliced arrays of 16-bit values, which are then fed into the model. If you just want the array, just edit the code
If you want the package to take care of the recognition as well, all you need to do is invoke the code below from the TfliteAudio package. This should have the same effect as
|
Hi @Caldarie, My model receives the Can you please guide me on where and what I should change in the above "::startRecognition" code? Thanks, |
Hi @SanaSizmic Am I correct to assume that you want to load the audio file and then output an array of float values? In that case, you can simply modify the code below to:
You might want to double-check the syntax. It's been a while since I've touched Java. |
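For reference, here is a minimal sketch of turning 16-bit samples into floats. It is not the plugin's code: `PcmToFloat` and `toFloat` are hypothetical names, and the scale factor assumes the model expects values in [-1.0, 1.0), which is the range `librosa.load` produces for 16-bit input as far as I know. It also assumes the audio is already mono at the sample rate the model expects, since `librosa.load(audio_file, sr=16000)` resamples and downmixes while this sketch does neither.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration; not part of the TfliteAudio plugin. */
final class PcmToFloat {

    // Scale signed 16-bit PCM samples into [-1.0, 1.0) by dividing by 32768.
    // Assumption: the model expects normalized floats in this range.
    static float[] toFloat(short[] pcm) {
        float[] out = new float[pcm.length];
        for (int i = 0; i < pcm.length; i++) {
            out[i] = pcm[i] / 32768.0f;
        }
        return out;
    }

    public static void main(String[] args) {
        short[] pcm = {0, 16384, -32768, 32767};
        System.out.println(Arrays.toString(toFloat(pcm)));
    }
}
```

If the Java floats and the Python floats still differ after this scaling, the mismatch is more likely to come from resampling or channel handling than from the conversion itself.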
Hi @Caldarie , When I tested locally using python, my model gives me Thanks, |
Hi @SanaSizmic I see what you mean. I suspect that the float values are distorted from extraction, or the audio files are not spliced correctly. If possible, can you compare the values from librosa.load and subscribe(value -> print(value));? And tell me whether they are similar to each other? |
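One way to make that comparison concrete is to dump both sets of values and measure how far apart they are. The sketch below is only an illustration: `SampleDiff` and `maxAbsDiff` are hypothetical names, and it assumes both arrays hold floats on the same scale and are aligned at the first sample.

```java
/** Hypothetical helper for illustration; not part of the TfliteAudio plugin. */
final class SampleDiff {

    // Largest absolute difference over the overlapping prefix of both arrays.
    // Assumption: a[i] and b[i] refer to the same point in time.
    static float maxAbsDiff(float[] a, float[] b) {
        int n = Math.min(a.length, b.length);
        float max = 0f;
        for (int i = 0; i < n; i++) {
            max = Math.max(max, Math.abs(a[i] - b[i]));
        }
        return max;
    }

    public static void main(String[] args) {
        float[] fromPython = {0.0f, 0.5f, -0.25f};
        float[] fromJava = {0.0f, 0.25f, -0.25f};
        System.out.println(maxAbsDiff(fromPython, fromJava));
    }
}
```

A small difference would point to rounding, while a large one would suggest a scaling, byte-order, or sample-rate mismatch.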
Hi @Caldarie , No, it's not similar to each other. Instead of splicing the audio file can I give the whole file to the model? Thanks, |
That really depends on your model. If the audio file has the correct number of samples per second, then there's no need to splice it.
Take a look at the following code. You can test it to find errors. In TfliteAudioPlugin.java, data extraction starts here:
private byte[] extractRawData(AssetFileDescriptor fileDescriptor, long startOffset, long declaredLength) {
Log.d(LOG_TAG, "Extracting byte data from audio file");
MediaDecoder decoder = new MediaDecoder(fileDescriptor, startOffset, declaredLength);
AudioProcessing audioData = new AudioProcessing();
byte[] byteData = {};
byte[] readData;
while ((readData = decoder.readByteData()) != null) {
byteData = audioData.appendByteData(readData, byteData);
Log.d(LOG_TAG, "data chunk length: " + readData.length);
}
Log.d(LOG_TAG, "byte data length: " + byteData.length);
return byteData;
}
From AudioFile.java, conversion from byte to short starts here:
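The AudioFile.java snippet itself is not reproduced in this thread. As a stand-in, here is a minimal sketch of a byte-to-short conversion. `BytesToShorts` and `toShorts` are hypothetical names, and the sketch assumes the decoder emits 16-bit little-endian PCM, which may not match what the plugin actually does.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical helper for illustration; not the code in AudioFile.java. */
final class BytesToShorts {

    // Interpret each pair of bytes as one signed 16-bit sample.
    // Assumption: little-endian byte order; a trailing odd byte is ignored.
    static short[] toShorts(byte[] bytes) {
        short[] out = new short[bytes.length / 2];
        ByteBuffer.wrap(bytes)
                .order(ByteOrder.LITTLE_ENDIAN)
                .asShortBuffer()
                .get(out);
        return out;
    }

    public static void main(String[] args) {
        byte[] raw = {0x01, 0x00, (byte) 0xFF, (byte) 0xFF};
        for (short s : toShorts(raw)) {
            System.out.println(s);
        }
    }
}
```

Reading the bytes with the wrong byte order is a common cause of values that look like noise, so this step is worth checking first when the extracted samples do not match a reference.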
For splicing, take a look at the code below. I have written some unit tests, found here, to test this algorithm. Feel free to check it yourself for any problems.
public void splice() {
isSplicing = true;
for (int i = 0; i < shortBuffer.limit(); i++) {
short dataPoint = shortBuffer.get(i);
if (!isSplicing) {
subject.onComplete();
break;
}
switch (audioData.getState(i)) {
case "append":
audioData
.append(dataPoint);
break;
case "recognise":
Log.d(LOG_TAG, "Recognising");
audioData
.append(dataPoint)
.displayInference()
.emit(data -> subject.onNext(data))
.reset();
break;
case "finalise":
Log.d(LOG_TAG, "Finalising");
audioData
.append(dataPoint)
.displayInference()
.emit(data -> subject.onNext(data));
stop();
break;
case "padAndFinalise":
Log.d(LOG_TAG, "Padding and finalising");
audioData
.append(dataPoint)
.padSilence(i)
.displayInference()
.emit(data -> subject.onNext(data));
stop();
break;
default:
throw new AssertionError("Incorrect state when preprocessing");
}
}
} |
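To reason about what the splicing should produce, it can help to compare against a much simpler stand-in. The sketch below is not the plugin's algorithm: `WindowSplitter`, `split`, and `inputSize` are hypothetical names, and it always zero-pads the final window, whereas the plugin decides between "finalise" and "padAndFinalise" using its own state logic.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified stand-in for illustration; not the plugin's splice(). */
final class WindowSplitter {

    // Cut the samples into consecutive windows of inputSize samples each.
    // The last window is padded with zeros (silence) if it is short.
    static List<short[]> split(short[] samples, int inputSize) {
        if (inputSize <= 0) {
            throw new IllegalArgumentException("inputSize must be positive");
        }
        List<short[]> windows = new ArrayList<>();
        for (int start = 0; start < samples.length; start += inputSize) {
            short[] window = new short[inputSize]; // new arrays start as zeros
            int len = Math.min(inputSize, samples.length - start);
            System.arraycopy(samples, start, window, 0, len);
            windows.add(window);
        }
        return windows;
    }

    public static void main(String[] args) {
        List<short[]> windows = split(new short[]{1, 2, 3, 4, 5}, 2);
        System.out.println(windows.size() + " windows");
    }
}
```

Each window here corresponds to one inference call, so a file whose length equals the model's input size yields a single window and needs no splicing.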
Hi @Caldarie,
Can I do this using flutter_tflite_audio? |
@SanaSizmic sorry for the late reply. Yeah, you can absolutely follow something similar by editing the code in this plugin |
Hi @Caldarie, |
As mentioned above, all you need to do is tweak the code provided below. The value returns an array of samples, which you can use to implement the code you have provided
|
Hi @tamnguyenvan, Did you manage to figure this out? |
Hi @Caldarie,
Do you mean this code in the TfliteAudioPlugin.java file?
This output.txt is my output file. Can you please check and let me know if anything is missing? |
Hmm, everything seems to be in order. The question is whether it's producing accurate results. |
Hi, I have a dumb question. My model receives outputs of
librosa.load(audio_file, sr=16000)
as inputs. How can I reproduce it with your code? Thank you.