Issue with Windowing and Spectrum methods #240

Closed
ghost opened this issue Mar 25, 2015 · 7 comments

ghost commented Mar 25, 2015

I'm playing with Essentia and I get some weird results with Windowing and Spectrum algorithms when using AudioLoader.

First Windowing:

In [122]: monoloader = essentia.standard.MonoLoader(filename = 'myfile.wav')

In [123]: audioloader = essentia.standard.AudioLoader(filename = 'myfile.wav')

In [124]: monoloader_audio = monoloader()

In [125]: audioloader_audio, sr, c = audioloader()

In [126]: monoloader_frame = monoloader_audio[0:1024]

In [127]: audioloader_frame = audioloader_audio[0:1024,0]

In [128]: monoloader_frame
Out[128]: 
array([ 0.00143437,  0.02502518,  0.04733421, ...,  0.02630696,
        0.01281777, -0.00366222], dtype=float32)

In [129]: audioloader_frame
Out[129]: 
array([ 0.00143437,  0.02502518,  0.04733421, ...,  0.02630696,
        0.01281777, -0.00366222], dtype=float32)

In [130]: w = Windowing(type = 'hann', size=1024)

In [131]: w(monoloader_frame)
Out[131]: 
array([-0.00038114, -0.00030989, -0.00022922, ..., -0.0004094 ,
       -0.00044366, -0.00042744], dtype=float32)

In [132]: w(audioloader_frame)
Out[132]: 
array([-0.00022529,  0.        , -0.00018149, ...,  0.        ,
       -0.00028865,  0.        ], dtype=float32)

Both frames are the same, but I get weird windowed data with AudioLoader (every second value is 0).

Same thing with Spectrum: I get the expected result when using MonoLoader and a weird result with AudioLoader. I set the size parameter to 1024, which according to the docs is the audio input size.
From a spectrum function I would expect either the whole symmetric magnitude spectrum on 1024 points, or half the spectrum on 513 points. With the Spectrum algorithm I get the whole (symmetric) spectrum on 513 points. What am I missing?

In [4]: import essentia.standard

In [5]: import essentia

In [6]: loader = essentia.standard.AudioLoader(filename = 'myfile.wav')

In [7]: audio, sr, c = loader()

In [8]: spectrum = essentia.standard.Spectrum(size=1024)

In [9]: spectrum.paramValue('size')
Out[9]: 1024

In [10]: s = spectrum(audio[100000:101024, 0])

In [11]: ion()

In [12]: plot(s)
Out[12]: [<matplotlib.lines.Line2D at 0x7f64edf76f10>]

figure_1
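For reference, a minimal NumPy-only sketch of the two shapes discussed above (it does not call Essentia). It shows that a real 1024-sample frame has 513 non-redundant magnitude bins, and, under the assumption that the frame handed to the algorithm was really 512 samples interleaved with a silent second channel, that those 513 bins come out mirrored around the middle. The random test signal and all variable names are illustrative, not taken from the report.

```python
import numpy as np

n = 1024
rng = np.random.default_rng(0)

# A real-valued frame of n samples has n/2 + 1 non-redundant bins.
frame = rng.standard_normal(n).astype(np.float32)
half = np.abs(np.fft.rfft(frame))
print(half.shape)  # (513,)

# Assumed scenario: 512 real samples interleaved with zeros
# (left channel data, silent right channel).
interleaved = np.zeros(n, dtype=np.float32)
interleaved[0::2] = frame[:512]
mirrored = np.abs(np.fft.rfft(interleaved))

# Zero-stuffing repeats the 512-point spectrum, so the 513 bins
# are symmetric around bin 256.
is_symmetric = np.allclose(mirrored, mirrored[::-1], rtol=1e-4, atol=1e-3)
print(is_symmetric)
```

If that assumption holds, the "whole symmetric spectrum on 513 points" is a symptom of the input layout rather than of the FFT size.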

@ghost ghost changed the title Issue with Spectrum method Issue with Windowing and Spectrum method Mar 25, 2015
@ghost ghost changed the title Issue with Windowing and Spectrum method Issue with Windowing and Spectrum methods Mar 26, 2015
@dbogdanov
Member

Is the input file mono, and are you sure that monoloader_frame and audioloader_frame are equal value by value? AudioLoader returns a stereo signal, so index 0 corresponds to the left channel.

@dbogdanov
Member

@pabloEntropia

@palonso
Contributor

palonso commented Jan 11, 2017

I've reproduced the experiment and got results similar to @ghost's.
I can't tell exactly what the difference between monoloader_frame and audioloader_frame is, as they are equal according to numpy's array_equal and array_equiv. However, the problem is solved simply by casting to essentia.array. For instance:

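import matplotlib.pyplot as plt
import essentia
import essentia.standard as ess

# Same window as in the original report
w = ess.Windowing(type='hann', size=1024)

# Casting with essentia.array before calling the algorithm avoids the problem
plt.plot(w(essentia.array(monoloader_frame)))
plt.show()

plt.plot(w(essentia.array(audioloader_frame)))
plt.show()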

The same applies to the Spectrum algorithm.

@dbogdanov
Member

OK, this is a problem with the interleaved representation of vectors of StereoSamples.

audioloader_frame = audioloader_audio[0:1024,0]

This slice creates a new array object, but it does not allocate additional memory for the array's data. Instead, it creates a "view" that shares the original array's data buffer. Therefore, while printing audioloader_frame looks fine in Python, passing it back to a C++ algorithm produces wrong results: reading the first 1024 floats of the underlying buffer yields interleaved left/right channel values for 512 samples.

Casting to essentia.array or numpy.array solves the issue. It is not obvious to a user, however, that this is needed.
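The view behaviour described above can be inspected with NumPy alone. The sketch below uses a synthetic array as a stand-in for the AudioLoader output (an assumption: shape `(n_samples, 2)`, float32, C order) and does not call Essentia.

```python
import numpy as np

# Stand-in for AudioLoader output: 2048 stereo samples.
# Left channel counts up from 0, right channel is constant -1.
stereo = np.empty((2048, 2), dtype=np.float32)
stereo[:, 0] = np.arange(2048)
stereo[:, 1] = -1.0

# Column slice: a strided view, no data is copied.
frame = stereo[0:1024, 0]
print(frame.flags['C_CONTIGUOUS'])      # False
print(frame.strides)                    # (8,) -> skips over the other channel
print(np.shares_memory(frame, stereo))  # True

# What code that ignores strides would read from the shared buffer:
# the first 1024 floats, i.e. 512 interleaved left/right pairs.
raw = stereo.reshape(-1)[:1024]
print(raw[:4])                          # [ 0. -1.  1. -1.]

# Making an explicit contiguous copy gives the layout the slice suggests.
safe = np.ascontiguousarray(frame)
print(safe.flags['C_CONTIGUOUS'])       # True
```

NumPy itself honours the strides, which is why printing or comparing the view in Python shows the expected values.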

@dbogdanov
Member

dbogdanov commented Jan 12, 2017

The same issue seems to appear for any slicing. Passing any numpy array object created by slicing to an Essentia algorithm would result in incorrect memory access.

For example, monoloader_frame[::2] should have 512 values, taking every second value from monoloader_frame, but the resulting input vector will contain the first 512 values instead.

To sum up, the wrapper should check whether the input is a copy or a view when passing it to Essentia algorithms. If it is a view, we should create a new copy and pass that.
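A rough Python-level sketch of such a check follows. It is not the change that was actually made in the wrapper; `as_contiguous_input` is a hypothetical helper name, and it tests memory contiguity rather than view-ness, on the assumption that a contiguous view (such as a plain `a[0:1024]` on a 1-D array) is already safe to read linearly.

```python
import numpy as np

def as_contiguous_input(arr):
    """Hypothetical helper: return float32 data laid out contiguously.

    Arrays that are already C-contiguous float32 are returned as-is;
    anything else (strided views, other dtypes) is copied.
    """
    arr = np.asarray(arr)
    if arr.dtype == np.float32 and arr.flags['C_CONTIGUOUS']:
        return arr
    return np.ascontiguousarray(arr, dtype=np.float32)

# Usage: a strided view gets copied, a contiguous array is passed through.
mono = np.arange(1024, dtype=np.float32)
strided = mono[::2]
fixed = as_contiguous_input(strided)
print(fixed.shape, fixed.flags['C_CONTIGUOUS'])  # (512,) True
print(as_contiguous_input(mono) is mono)         # True
```

Copying only when needed keeps the common case (arrays straight from MonoLoader) free of extra allocations.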

@dbogdanov
Member

We should add a basic Python test for that too.
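One possible shape for such a test, sketched under assumptions: `assert_view_matches_copy` is a hypothetical helper, and `scale` is a plain NumPy stand-in for an algorithm. A real test would pass an Essentia algorithm instance (for example a configured Windowing) in its place.

```python
import numpy as np

def assert_view_matches_copy(algo, view):
    """Hypothetical check: an algorithm must give the same output for a
    strided view and for a contiguous copy holding the same values."""
    copy = np.ascontiguousarray(view)
    np.testing.assert_array_equal(algo(view), algo(copy))

def scale(x):
    # Stand-in algorithm; it respects strides, so the checks below pass.
    return np.asarray(x) * 2

# Synthetic stereo buffer, shape (n_samples, 2).
stereo = np.arange(4096, dtype=np.float32).reshape(2048, 2)

assert_view_matches_copy(scale, stereo[0:1024, 0])  # channel slice
assert_view_matches_copy(scale, stereo[::2, 1])     # strided slice
checked = True
```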

palonso pushed a commit to palonso/essentia that referenced this issue Jan 13, 2017
palonso pushed a commit to palonso/essentia that referenced this issue Jan 17, 2017
palonso pushed a commit to palonso/essentia that referenced this issue Jan 17, 2017
@dbogdanov
Member

Fixed in #555
