Zeros extracting chromagram with NNLSChroma and LogSpectrum #948

Closed
xaviliz opened this issue Jan 28, 2020 · 3 comments

Comments

@xaviliz
Contributor

xaviliz commented Jan 28, 2020

Hi all,

I am trying to compute NNLSChroma features with the Essentia library, using LogSpectrum as input. I get a chromagram, but something is wrong. In theory, NNLSChroma expects meanTuning as a VECTOR_REAL, whereas the meanTuning values returned by LogSpectrum, collected over all frames, form a MATRIX_REAL of size N x nBPS (one nBPS-length VECTOR_REAL per frame).

So I tried summing each nBPS x 1 VECTOR_REAL, as the original NNLSChroma does, but the resulting chromagram, bassChromagram and tunedLogFreqSpectrum are filled with zeros.
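To make the shapes concrete, here is a minimal sketch in plain Python with made-up numbers (no Essentia involved; `frames`, `mean_over_bins`, `mean_over_frames` and `last_frame` are hypothetical names). It only illustrates that stacking one nBPS-length vector per frame gives an N x nBPS matrix, and that reducing along each axis yields vectors of different lengths.

```python
# Made-up per-frame meanTuning vectors: N = 4 frames, nBPS = 3 values each.
nBPS = 3
frames = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
]

# Mean of each row (what np.mean(..., axis=1) computes): one value per frame.
mean_over_bins = [sum(row) / len(row) for row in frames]

# Mean of each column (what np.mean(..., axis=0) computes): one value per bin.
mean_over_frames = [sum(col) / len(col) for col in zip(*frames)]

# The vector from the last frame alone.
last_frame = frames[-1]

print(len(mean_over_bins))    # number of frames, N
print(len(mean_over_frames))  # nBPS
print(len(last_frame))        # nBPS
```

Only the last two results have length nBPS; the row-wise mean has one entry per frame.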

I was reviewing this C++ example, but I cannot get similar results in Python:
https://github.com/MTG/essentia/blob/master/src/examples/standard_nnls.cpp

Here is my code:

import numpy as np
from essentia import Pool
from essentia.standard import MonoLoader, Windowing, Spectrum, LogSpectrum, NNLSChroma, FrameGenerator

# filePath and hopSize are not defined in this snippet; set them before running.
windowType = 'hann'
sampleRate = 44100
frameSize = 16384
nBPS = 3

window = Windowing(type=windowType, size=frameSize)
spectrum = Spectrum(size=frameSize)
logSpectrum = LogSpectrum(frameSize=int(frameSize/2) + 1, binsPerSemitone=nBPS, sampleRate=sampleRate)
nnlsChroma = NNLSChroma()
pool = Pool()

audio = MonoLoader(filename=filePath, sampleRate=sampleRate)()
for frame in FrameGenerator(audio, frameSize=frameSize, hopSize=hopSize, startFromZero=True):
    # Use a separate name so the Spectrum algorithm is not overwritten.
    spec = spectrum(window(frame))
    logFreqSpectrum, meanTuning, localTuning = logSpectrum(spec)
    pool.add('features.logSpectrogram', logFreqSpectrum)
    pool.add('features.meanTuning', meanTuning)
    pool.add('features.localTuning', localTuning)

# Reduce the per-frame meanTuning matrix to a vector before calling NNLSChroma.
tunedLogfreqSpectrum, semitoneSpectrum, bassChromagram, chromagram = nnlsChroma(pool['features.logSpectrogram'], np.mean(pool['features.meanTuning'], axis=1), pool['features.localTuning'])

Could you please provide a Python example, or point out what is wrong in my code, so that I can compute these features?

Thanks in advance,
XL

@palonso
Contributor

palonso commented Jan 31, 2020

Hi @xaviliz,

LogSpectrum updates the average tuning internally, so using the value from the last frame is more correct than re-averaging. You can do that by setting the value in the pool once, after the loop, instead of adding it on every frame.
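A minimal sketch of why re-averaging is redundant, assuming the internal update behaves like a plain cumulative mean (an assumption for illustration only; the actual update rule inside LogSpectrum may differ). The helper `cumulative_means` and the input values are hypothetical.

```python
def cumulative_means(values):
    """Return the running mean after each value (hypothetical helper)."""
    means = []
    total = 0.0
    for count, value in enumerate(values, start=1):
        total += value
        means.append(total / count)
    return means


# Made-up per-frame tuning estimates for a single bin.
per_frame = [1.0, 2.0, 3.0, 6.0]
running = cumulative_means(per_frame)

# The last running value is already the mean over all frames.
last_value = running[-1]

# Averaging the running values again gives a mean of means,
# which weights early frames more heavily.
re_averaged = sum(running) / len(running)

print(running)
print(last_value, re_averaged)
```

Under that assumption, the last value equals the overall mean of the per-frame estimates, while the re-averaged value does not.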

I reproduced your example and also got zeros in the output. While checking our comments, I found that only the calculations without NNLS were properly tested and guaranteed to work, so I'd recommend sticking to that configuration until we figure out the problem.

Here is your code snippet with the required modifications:

from essentia import Pool
from essentia.standard import MonoLoader, Windowing, Spectrum, LogSpectrum, NNLSChroma, FrameGenerator

# filePath and hopSize are not defined in this snippet; set them before running.
windowType = 'hann'
sampleRate = 44100
frameSize = 16384
nBPS = 3

window = Windowing(type=windowType, size=frameSize, normalized=False)
spectrum = Spectrum(size=frameSize)
logSpectrum = LogSpectrum(frameSize=int(frameSize/2) + 1, binsPerSemitone=nBPS, sampleRate=sampleRate)
# useNNLS=False selects the configuration that is known to work.
nnlsChroma = NNLSChroma(frameSize=int(frameSize/2) + 1, useNNLS=False)
pool = Pool()

audio = MonoLoader(filename=filePath, sampleRate=sampleRate)()
for frame in FrameGenerator(audio, frameSize=frameSize, hopSize=hopSize, startFromZero=True):
    logFreqSpectrum, meanTuning, localTuning = logSpectrum(spectrum(window(frame)))
    pool.add('features.logSpectrogram', logFreqSpectrum)
    pool.add('features.localTuning', localTuning)

pool.set('features.meanTuning', meanTuning)  # use only the last value

tunedLogfreqSpectrum, semitoneSpectrum, bassChromagram, chromagram = nnlsChroma(pool['features.logSpectrogram'], pool['features.meanTuning'], pool['features.localTuning'])

@xaviliz
Contributor Author

xaviliz commented Jan 31, 2020

Hi @pabloEntropia

Thanks for the clarification about the average tuning and for your code modifications.

I have been testing NNLS Chroma over the last few days and found the same bug. When useNNLS is True, the chromagram is empty. However, when it is False, the chromagram is exactly the same as the one from the original VAMP plugin implementation.
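A quick way to detect this symptom is to check whether an output contains any non-zero value. The helper below is a hypothetical sketch in plain Python (`has_nonzero` is not part of Essentia), shown on made-up data with 12 columns assumed for illustration.

```python
def has_nonzero(matrix):
    """Return True if any entry of a 2-D nested sequence is non-zero."""
    return any(value != 0 for row in matrix for value in row)


# Made-up stand-ins for a chromagram (3 frames x 12 bins).
empty_chroma = [[0.0] * 12 for _ in range(3)]
valid_chroma = [[0.0] * 12 for _ in range(3)]
valid_chroma[1][4] = 0.5

print(has_nonzero(empty_chroma))  # False
print(has_nonzero(valid_chroma))  # True
```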

I hope this helps.

@palonso
Contributor

palonso commented Jan 31, 2020

It's nice to know that it works as expected without NNLS. We'll discuss a fix for the NNLS mode in #951.
