Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Further CQT enhancements #279

Merged
merged 9 commits into from Feb 5, 2016
Merged

Further CQT enhancements #279

merged 9 commits into from Feb 5, 2016

Conversation

bmcfee
Copy link
Member

@bmcfee bmcfee commented Nov 9, 2015

The current cqt implementation returns only magnitude. If we're going to support inverse CQT #165 , phase will be critical for good reconstruction.

This PR is the first step in this direction.

It's currently implemented as API-compatible with the existing CQT; complex-output is enabled by setting real=False. In 0.5, we should remove this parameter and make it always return complex-valued transforms.

As a side issue, this PR fixes util.sync to retain the dtype of the input data.

@bmcfee bmcfee added enhancement Does this improve existing functionality? functionality Does this add new functionality? labels Nov 9, 2015
@bmcfee bmcfee self-assigned this Nov 9, 2015
@bmcfee bmcfee added this to the 0.4.2 milestone Nov 9, 2015
@bmcfee
Copy link
Member Author

bmcfee commented Nov 9, 2015

Paging @ebattenberg for feedback. Not urgent, but I reordered some of the abs operations in a not-entirely-equivalent way. This may effect the accuracy of pseudo and hybrid cqt, though whether it's improved or not, I can't say just yet.

UPDATE:

hybrid cqt tests still pass with the new order of operations (abs(filters.dot(D))), and some informal sanity checks show a slight reduction in error compared to the old way (abs(filters).dot(abs(D))).

@ebattenberg
Copy link
Contributor

Changing the Pseudo CQT to take phase into account changes it into not a pseudo CQT but an undersampled (in time) CQT.

@bmcfee
Copy link
Member Author

bmcfee commented Nov 13, 2015

Fair enough. Is this really a meaningful distinction, or is there a reference implementation/description of pcqt that we adhere to?

@dpwe
Copy link
Contributor

dpwe commented Nov 13, 2015

It's a big deal. Imagine the "impulse response" corresponding to a
high-frequency, wide-bandwidth CQT bin. It's a little windowed (complex)
sinusoid whose window might be 1ms or shorter. Now tell me the phase of
this kernel over a 10ms window - there could be several entire instances of
the kernel within the window, each with unrelated phases. So a single phase
value doesn't really mean anything.

If you do a single calculation against the complex Fourier coefficients,
you're most likely taking the inner product against one particular
positioning of the IR, which will invite severe time-aliasing (under
sampling).

Taking the average magnitude, the total energy coming through the filters
defined by the impulse response over any length of window, does make sense.

DAn.

On Friday, November 13, 2015, Brian McFee notifications@github.com wrote:

Fair enough. Is this really a meaningful distinction, or is there a
reference implementation/description of pcqt that we adhere to?


Reply to this email directly or view it on GitHub
https://github.com/bmcfee/librosa/pull/279#issuecomment-156427937.

@bmcfee
Copy link
Member Author

bmcfee commented Nov 13, 2015

If you do a single calculation against the complex Fourier coefficients,
you're most likely taking the inner product against one particular
positioning of the IR, which will invite severe time-aliasing (under
sampling).

Ok, this makes sense. Thanks.

Backing up a bit: maybe it's just silly to offer complex mode for pseudo-cqt in the first place? I suppose that anyone that cares enough about phase (eg for later cqt inversion or what-have-you) would just use the full cqt anyway. Does that seem reasonable to everyone else?

@ebattenberg
Copy link
Contributor

Backing up a bit: maybe it's just silly to offer complex mode for pseudo-cqt in the first place? I suppose that anyone that cares enough about phase (eg for later cqt inversion or what-have-you) would just use the full cqt anyway. Does that seem reasonable to everyone else?

Yeah. The Pseudo CQT is an alternative to averaging and downsampling the CQT for higher frequency bands. It should be more efficient but is not exactly equivalent. Though I don't know how well inverting the CQT will go if you average and downsample in time to get down to the desired hop size.

@ebattenberg
Copy link
Contributor

Is this really a meaningful distinction, or is there a reference implementation/description of pcqt that we adhere to?

I just googled "pseudo CQT" to get the exact definition, but just got links to librosa. ;)

@bmcfee
Copy link
Member Author

bmcfee commented Nov 14, 2015

I just googled "pseudo CQT" to get the exact definition, but just got links to librosa

I guess the spirit of my question was more: is "pseudo CQT" a precise term, or do we have some wiggle room in its definition? At any rate, DAn's comment makes sense, and I'm happy to keep pcqt as magnitude-only.

@bmcfee
Copy link
Member Author

bmcfee commented Feb 1, 2016

I don't think there's anything left to do on this one. I have a working (but terribly slow) inverse-cqt implemented in a notebook, but I don't think perfecting inverse-cqt should hold up complex-cqt.

Does anyone object to merging?

@bmcfee
Copy link
Member Author

bmcfee commented Feb 2, 2016

In working out #302 , it occurred to me that we should probably rename the resolution parameter for cqt functions, because: A) it's not all that descriptive, and B) it conflicts with resolution in tuning, which has a much more precise interpretation. Since this PR has other cqt changes, we may as well consolidate.

Do folks on this thread have any thoughts on a better name for resolution? I'd hesitate to use something like q_factor since the resolution parameter is just one contributing factor to Q here, and I don't want to be misleading. Eric H suggested something like width. Maybe filter_scale?

(And don't worry, we have a warning shim that will provide backward compatibility now, so the old argument name will continue to work.)

@bmcfee
Copy link
Member Author

bmcfee commented Feb 4, 2016

I don't think there's anything left to do on this one. I have a working (but terribly slow) inverse-cqt implemented in a notebook, but I don't think perfecting inverse-cqt should hold up complex-cqt.

Fast inverse CQT implementation is here ; gist won't store audio buffers, so you'll have to download it and run it on your own favorite example. I think this basically demonstrates that the complex cqt is doing the right thing.

(Caveats abound: icqt sounds lousy for resolution > 1, and frequency over-sampling actually degrades quality due to using filter^H instead of a proper basis inversion. SNR is also rather low, mostly i suspect due to LPF inherent in this cqt implementation, but to my ear it sounds pretty good.)

@bmcfee bmcfee mentioned this pull request Feb 4, 2016
@bmcfee
Copy link
Member Author

bmcfee commented Feb 4, 2016

Final to-dos:

  • decide on new name for resolution
  • implement deprecation warning for real=True

bmcfee added a commit that referenced this pull request Feb 5, 2016
@bmcfee bmcfee merged commit ce00692 into master Feb 5, 2016
@bmcfee bmcfee deleted the complex-cqt branch February 5, 2016 01:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Does this improve existing functionality? functionality Does this add new functionality?
Development

Successfully merging this pull request may close these issues.

None yet

3 participants