
Need help with the inputs #35

Open
tamnguyenvan opened this issue May 22, 2022 · 14 comments

Comments

@tamnguyenvan

Hi, I have a dumb question. My model receives outputs of librosa.load(audio_file, sr=16000) as inputs. How can I reproduce it with your code?

Thank you.

@Caldarie
Owner

Caldarie commented May 22, 2022

Hmm, am I correct to assume that you want to load an audio file and get back an array of 16-bit values? If yes, you may need to edit the package to do so. For example, on Android, the code below returns spliced arrays of 16-bit values, which are then fed into the model. If you just want the array, replace `this::startRecognition` with your own function.

    public void preprocess(byte[] byteData) {
        Log.d(LOG_TAG, "Preprocessing audio file..");

        audioFile = new AudioFile(byteData, audioLength);
        audioFile.getObservable()
                .doOnComplete(() -> {
                    stopStream(); 
                    clearPreprocessing();
                })
                .subscribe(this::startRecognition);  //EDIT THIS CODE HERE
        audioFile.splice();
    }

If you want the package to take care of the recognition as well, all you need to do is invoke the code below from the TfliteAudio package (set sampleRate to match your model, e.g. 16000). This should have the same effect as librosa.load(audio_file, sr=16000):

recognitionStream = TfliteAudio.startFileRecognition(
  sampleRate: 44100,
  audioDirectory: "assets/sampleAudio.wav",
  );
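One thing to watch out for: librosa.load returns float32 samples scaled to [-1.0, 1.0), while the plugin's spliced arrays hold 16-bit integers. A minimal sketch of the conversion in Python (the function name is illustrative, not part of either library):

```python
import numpy as np

def pcm16_to_float(samples):
    """Scale 16-bit PCM samples to float32 in [-1.0, 1.0),
    the range that librosa.load produces."""
    return np.asarray(samples, dtype=np.int16).astype(np.float32) / 32768.0
```

For example, `pcm16_to_float([0, 16384, -32768])` yields `[0.0, 0.5, -1.0]`.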

@SanaSizmic

SanaSizmic commented Dec 14, 2022

Hi @Caldarie,
signal, sample_rate = librosa.load(file_path)

My model takes a signal of fixed 1-second duration at a sample rate of 22050 as input.
I tested it locally in Python and it gives the correct output, but when I try it in Flutter using flutter_tflite_audio, it gives incorrect output.

Can you please guide me on where and what I should change in the "::startRecognition" code above?

Thanks,

@Caldarie
Owner

Caldarie commented Dec 14, 2022

Hi @SanaSizmic

Am I correct to assume that you want to load the audio file and then output an array of float values?

In that case, you can simply modify the code below to:

subscribe(value -> print(value));

You might want to double-check the syntax; it's been a while since I've touched Java.

@SanaSizmic

Hi @Caldarie ,

When I tested locally using Python, my model gives me [0.07594858, 0.9240514] as the predicted output, which is the correct prediction. For the same audio.wav file, when I tested using flutter_tflite_audio, it gives [0.27258825, 0.72741175], which is incorrect. Can you please suggest what I should change in the flutter_tflite_audio package?

Thanks,

@Caldarie
Owner

Hi @SanaSizmic

I see what you mean. I suspect that the float values are distorted during extraction, or that the audio file is not spliced correctly.

If possible, can you compare the values from librosa.load with those from subscribe(value -> print(value)) and tell me whether they are similar to each other?
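For the comparison, something along these lines might help once both arrays have been dumped to Python (a sketch; the function name and tolerance are made up, not part of any library):

```python
import numpy as np

def compare_signals(a, b, atol=1e-3):
    """Compare two audio arrays over their overlapping length and
    report the largest absolute difference and whether they match."""
    a = np.asarray(a, dtype=np.float32)
    b = np.asarray(b, dtype=np.float32)
    n = min(len(a), len(b))
    max_diff = float(np.abs(a[:n] - b[:n]).max())
    return max_diff, max_diff <= atol
```

If the max difference is large right from the first samples, extraction is the likely culprit; if the arrays diverge only at chunk boundaries, splicing is.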

@SanaSizmic

SanaSizmic commented Dec 15, 2022

Hi @Caldarie ,

No, they are not similar to each other.
When I print it, I get sets of different arrays, and every array generates a different output. I also suspect that the audio files are not spliced correctly.

Instead of splicing the audio file, can I give the whole file to the model?
Can you please guide me on how to fix this?

Thanks,

@Caldarie
Owner

Caldarie commented Dec 15, 2022

Instead of splicing the audio file can I give the whole file to the model? So can you please guide me on how can I fix this

That really depends on your model. If the audio file has the correct number of samples per second, then there's no need to splice it.

No, it's not similar to each other. when I print it, it gives sets of different arrays and every array generates a different output, I also think the same that the audio files are not spliced correctly.

Take a look at the following code. You can test it to find errors.

From TfliteAudioPlugin.java, data extraction starts here:

private byte[] extractRawData(AssetFileDescriptor fileDescriptor, long startOffset, long declaredLength) {
        Log.d(LOG_TAG, "Extracting byte data from audio file");

        MediaDecoder decoder = new MediaDecoder(fileDescriptor, startOffset, declaredLength);
        AudioProcessing audioData = new AudioProcessing();

        byte[] byteData = {};
        byte[] readData;
        while ((readData = decoder.readByteData()) != null) {
            byteData = audioData.appendByteData(readData, byteData);
            Log.d(LOG_TAG, "data chunk length: " + readData.length);
        }
        Log.d(LOG_TAG, "byte data length: " + byteData.length);
        return byteData;

    }

From AudioFile.java, the conversion from byte to short starts here:

shortBuffer = ByteBuffer.wrap(byteData).order(ByteOrder.LITTLE_ENDIAN).asShortBuffer();
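The same conversion can be reproduced in Python to cross-check the values (a sketch; `"<i2"` means little-endian signed 16-bit, matching ByteOrder.LITTLE_ENDIAN):

```python
import numpy as np

def bytes_to_shorts(byte_data):
    """Reinterpret raw bytes as little-endian signed 16-bit samples,
    mirroring ByteBuffer.wrap(...).order(LITTLE_ENDIAN).asShortBuffer()."""
    return np.frombuffer(byte_data, dtype="<i2")
```

For example, `bytes_to_shorts(b"\x00\x01\xff\x7f")` yields `[256, 32767]`. If the Java side prints different shorts for the same bytes, the endianness or WAV header handling is off.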

For splicing, take a look at the code below. I have written some unit tests (found here) to verify this algorithm; feel free to check it yourself for any problems.

    public void splice() {
        isSplicing = true;

        for (int i = 0; i < shortBuffer.limit(); i++) {

            short dataPoint = shortBuffer.get(i);

            if (!isSplicing) {
                subject.onComplete();
                break;
            }

            switch (audioData.getState(i)) {
                case "append":
                    audioData
                        .append(dataPoint);
                    break;
                case "recognise":
                    Log.d(LOG_TAG, "Recognising");
                    audioData
                        .append(dataPoint)
                        .displayInference()
                        .emit(data -> subject.onNext(data))
                        .reset();
                    break;
                case "finalise":
                    Log.d(LOG_TAG, "Finalising");
                    audioData
                        .append(dataPoint)
                        .displayInference()
                        .emit(data -> subject.onNext(data));
                    stop();
                    break;
                case "padAndFinalise":
                    Log.d(LOG_TAG, "Padding and finalising");
                    audioData
                        .append(dataPoint)
                        .padSilence(i)
                        .displayInference()
                        .emit(data -> subject.onNext(data));
                    stop();
                    break;

                default:
                    throw new AssertionError("Incorrect state when preprocessing");
            }
        }
    }
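Roughly, the loop above walks the sample buffer and emits fixed-length chunks, padding the final one with silence. The same idea in a few lines of Python, useful for checking expected chunk counts and padding (a sketch for testing, not the plugin's actual code):

```python
import numpy as np

def splice(signal, audio_length):
    """Split a signal into chunks of audio_length samples,
    zero-padding the final chunk (cf. the padSilence case above)."""
    chunks = []
    for start in range(0, len(signal), audio_length):
        chunk = np.asarray(signal[start:start + audio_length], dtype=np.float32)
        if len(chunk) < audio_length:
            chunk = np.pad(chunk, (0, audio_length - len(chunk)))
        chunks.append(chunk)
    return chunks
```

Comparing the chunks this produces against what subscribe(...) emits for the same input would narrow down where the arrays diverge.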

@SanaSizmic

SanaSizmic commented Dec 27, 2022

Hi @Caldarie,

SAMPLES_TO_CONSIDER = 22050
signal, sample_rate = librosa.load(file_path)

if len(signal) >= SAMPLES_TO_CONSIDER:
    # ensure consistency of the length of the signal
    signal = signal[:SAMPLES_TO_CONSIDER]

else:
    signal = fix_length(signal, size=int(1*sample_rate), mode='edge')


# predictions = self.model.predict(signal)

Can I do this using flutter_tflite_audio: read the raw audio data, convert it to the fixed sample-rate length, and predict?
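For reference, the trim-or-pad step in the snippet above can also be written with plain numpy, which may be easier to port into the plugin (a sketch; assumes the samples have already been converted to floats):

```python
import numpy as np

def fix_to_one_second(signal, sample_rate=22050):
    """Trim or edge-pad a signal to exactly one second of samples,
    mirroring librosa.util.fix_length(signal, size=sample_rate, mode='edge')."""
    signal = np.asarray(signal, dtype=np.float32)
    if len(signal) >= sample_rate:
        return signal[:sample_rate]
    return np.pad(signal, (0, sample_rate - len(signal)), mode="edge")
```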
Thanks,

@Caldarie
Owner

Caldarie commented Jan 3, 2023

@SanaSizmic sorry for the late reply.

Yeah, you can absolutely do something similar by editing the code in this plugin.

@SanaSizmic

Hi @Caldarie,
Can you please explain to me how the plugin works now? The structure: first it takes the raw input signal array, then splices it to what length?
Or, in order to do what I shared in the code above, which files do I have to edit?
If you can guide me, that would be highly appreciated.
Thanks

@Caldarie
Owner

Caldarie commented Jan 5, 2023

As mentioned above, all you need to do is tweak the code provided below. The subscription returns an array of samples, which you can use to implement the code you provided.

subscribe(value -> print(value));

@SanaSizmic

SanaSizmic commented Jan 10, 2023

Hi, I have a dumb question. My model receives outputs of librosa.load(audio_file, sr=16000) as inputs. How can I reproduce it with your code?

Thank you.

Hi @tamnguyenvan, did you manage to figure this out?

@SanaSizmic

SanaSizmic commented Jan 10, 2023

Hi @Caldarie,

As mentioned above, all you need to do is tweak the code provided below. The subscription returns an array of samples, which you can use to implement the code you provided.

subscribe(value -> print(value));

Do you mean this code in the TfliteAudioPlugin.java file?

public void preprocess(byte[] byteData) {
        Log.d(LOG_TAG, "Preprocessing audio file..");

        audioFile = new AudioFile(byteData, audioLength);
        audioFile.getObservable()
                .doOnComplete(() -> {
                    stopStream(); 
                    clearPreprocessing();
                })
                .subscribe(this::startRecognition);
        audioFile.splice();
    } 

This output.txt is my output file. Can you please check it and let me know if anything is missing?

@Caldarie
Owner

Hmm, everything seems to be in order. The question is whether it's producing accurate results.
