Conversion results in -1 frame length #18

Open
craigryan opened this issue Apr 7, 2017 · 0 comments

Very useful project, appreciate your efforts! I'm not sure whether this is an actual issue or a request for clarification, since I may be misunderstanding what audio data conversion involves.

I was experimenting with my own version of Recognito.java, using various WAV voice recordings as my voice prints. I make sure they all have the same format, basically:

(AudioFormat:) PCM_SIGNED 16000.0 Hz, 16 bit, mono, 2 bytes/frame, big-endian and frame length: > 0

The actual sample WAV is slightly different and needs conversion from:

PCM_SIGNED 44100.0 Hz, 16 bit, mono, 2 bytes/frame, big-endian frame length: 384752

I let this file through into FileHelper, expecting the conversion to work. Maybe you can shed light on what is happening. The first issue is that this code results in a new stream with a frame length of -1:

localIs = AudioSystem.getAudioInputStream(format, is);

I can't find a clear explanation of why the frame length is not set after conversion. Is it a Java issue or a Sun implementation issue? I'm not sure.

After this, of course, the following line fails with an index out of range (-1) error:

double[] audioSample = new double[(int)localIs.getFrameLength()];

There are a couple of things I'm not clear on. Can this code be changed so that it doesn't need the frame length and calculates the size of audioSample[] another way? My fix was to add a new parameter to the method so I can pass in the original sample length (384752, as above) and use that instead. With that change the method works fine and I get a valid result from the identify() call in my test client. Maybe I'm just lucky that this length is big enough (or too big?) for the conversion to succeed?
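
To make the question concrete, here is a minimal sketch of what "not needing the frame length" could look like: read the converted stream until end-of-stream, then size the array from the number of bytes actually read. As far as I can tell, -1 is the value of AudioSystem.NOT_SPECIFIED, which getFrameLength() is documented to return when the length is unknown. The class name PcmReader and the method readAllSamples are hypothetical and not part of Recognito, and scaling each sample by 32768 is my assumption, not necessarily what FileHelper does. The sketch only handles the 16 bit, mono, big-endian format described above.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

// Hypothetical helper, not part of Recognito.
final class PcmReader {

    /** Reads 16-bit big-endian mono PCM until EOF, never calling getFrameLength(). */
    static double[] readAllSamples(AudioInputStream in) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096]; // a multiple of the 2-byte frame size
        int n;
        while ((n = in.read(buffer, 0, buffer.length)) != -1) {
            bytes.write(buffer, 0, n);
        }
        byte[] pcm = bytes.toByteArray();

        // The array size comes from the bytes actually read, not from the stream header.
        double[] samples = new double[pcm.length / 2];
        for (int i = 0; i < samples.length; i++) {
            // Combine high and low byte into one signed 16-bit value (big-endian).
            short value = (short) ((pcm[2 * i] << 8) | (pcm[2 * i + 1] & 0xFF));
            samples[i] = value / 32768.0; // assumed normalization to [-1, 1)
        }
        return samples;
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for the converted stream: two frames with an unspecified length.
        AudioFormat format = new AudioFormat(16000.0f, 16, 1, true, true);
        byte[] pcm = {0x40, 0x00, (byte) 0xC0, 0x00};
        AudioInputStream in = new AudioInputStream(
                new ByteArrayInputStream(pcm), format, AudioSystem.NOT_SPECIFIED);

        double[] samples = readAllSamples(in);
        System.out.println("frame length: " + in.getFrameLength()
                + ", samples read: " + samples.length);
    }
}
```

In the real code, the stream passed to readAllSamples would be the one returned by AudioSystem.getAudioInputStream(format, is). If this approach is sound, it would also avoid passing in the 44100 Hz frame count, which I suspect is larger than the number of frames in the converted 16000 Hz stream.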

I hope this explains the issue. I would like to understand this better before attempting forks, pull requests, etc. with modifications, which I'd be happy to help with.

In case you're interested in what I'm working on, it's to use this logic for a search function, i.e. find all WAV files in a folder in which 'Mary' is speaking ('Mary' being the sample). I'm also considering combining this with other code that can split a WAV into multiple streams based on the number of unique speakers and testing each in turn, which should make the results much more accurate.
