Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] YIN #974

Closed
wants to merge 8 commits into from
Closed

[WIP] YIN #974

wants to merge 8 commits into from

Conversation

lostanlen
Copy link
Contributor

@lostanlen lostanlen commented Aug 20, 2019

Reference Issue

Fixes #527 (eventually)

What does this implement/fix? Explain your changes.

This is a comprehensive reimplementation of the YIN algorithm (de Cheveigné and Kawahara, JASA 2002). It's a complex and long-awaited new feature, so i imagine that it's going to require meticulous rounds of CR.

To write this function, i re-read the YIN paper as well as the MATLAB code:
https://github.com/lemonzi/matlab/blob/master/yin/yin2.m

Thankfully, this paper is very well written so it was easy to link it to relevant parts of the code.

Any other comments?

  • the code is in a relatively mature state, but it's still a work in progress. I haven't written the tests yet. I want to create a consensus in the API before i do.

  • i tried my best to follow existing naming conventions in the librosa API. Hence y, sr, frame_length, hop_length, fmin, fmax, and pad_mode.

  • However, threshold_1 and threshold_2 could use better names. Ideas appreciated.

  • Likewise for the booleans cumulative and interpolate. Do we even support them as kwargs? If so, i need to document them.

  • Perhaps the docstring should be longer and give some historical / methodological context around YIN? Contributions appreciated.

  • The parabolic interpolation step could potentially be faster by allocating less intermediate arrays. But maybe that's a premature optimization.

  • I did not implement part 6: "best local estimate". It would considerably slow down the code; de Cheveigné and Kawahara report a relatively small effect; and i doubt many people use it anyway. Long term, we would rather use pYIN rather then part 6.

  • This implementation works splendidly well as long as there are ten periods or more in the window. Below that number, it's quite numerically unstable. Could it be due to the boxcar window? de Cheveigné and Kawahara write that "there is [...] little reason not to use a square window", but i doubt it.

  • Likewise for padding. I am using pad_mode=reflect, but really, there is little hope of recovering the instantaneous frequency at the edges.

  • What should the tests be? Tests on synthetic tones or tests vs the MATLAB code?

@bmcfee
Copy link
Member

bmcfee commented Aug 20, 2019

😮 thanks for doing this! I'll give it some cycles later on today.

@bmcfee bmcfee added the functionality Does this add new functionality? label Aug 20, 2019
@bmcfee bmcfee added this to the 0.7.1 milestone Aug 20, 2019
@bmcfee bmcfee self-requested a review August 20, 2019 13:42
librosa/core/pitch.py Outdated Show resolved Hide resolved
librosa/core/pitch.py Outdated Show resolved Hide resolved
librosa/core/pitch.py Outdated Show resolved Hide resolved
Copy link
Member

@bmcfee bmcfee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good so far; a few minor API requests.

One big question: we'll want to support pyin in the future, if not immediately in this PR. How can we set up the internals of this function to make that easy to do?

@bmcfee
Copy link
Member

bmcfee commented Aug 20, 2019

  • the code is in a relatively mature state, but it's still a work in progress. I haven't written the tests yet. I want to create a consensus in the API before i do.

Notes in my first CR. Generally, I think it's better if we can make the API and default parameters internally consistent with the rest of the library. If we want to show how to recover a good approximation to the original paper / implementation, that can be done in a notebook or docstring example.

  • i tried my best to follow existing naming conventions in the librosa API. Hence y, sr, frame_length, hop_length, fmin, fmax, and pad_mode.

  • However, threshold_1 and threshold_2 could use better names. Ideas appreciated.

peak_threshold and periodicity_threshold? A bit wordy, but they get the point across.

  • Likewise for the booleans cumulative and interpolate. Do we even support them as kwargs? If so, i need to document them.

I don't have strong opinions on this one, but it doesn't seem to hurt to include them.

  • Perhaps the docstring should be longer and give some historical / methodological context around YIN? Contributions appreciated.

Let's hold off on that for now.

  • The parabolic interpolation step could potentially be faster by allocating less intermediate arrays. But maybe that's a premature optimization.

I'm wondering if this could be simplified to use scipy's polynomial interpolation instead? It might take a little bit of clever array shaping, but it would be preferable to having a bunch of grungy interpolation code IMO.

  • I did not implement part 6: "best local estimate". It would considerably slow down the code; de Cheveigné and Kawahara report a relatively small effect; and i doubt many people use it anyway. Long term, we would rather use pYIN rather then part 6.

Indeed. Our API plan should make it easy to extend to pyin.

  • This implementation works splendidly well as long as there are ten periods or more in the window. Below that number, it's quite numerically unstable. Could it be due to the boxcar window? de Cheveigné and Kawahara write that "there is [...] little reason not to use a square window", but i doubt it.

We should definitely support user-specified windows. Ditto for padding modes. Edge behavior is not well-defined anyway, so I'm not going to worry about it.

  • Likewise for padding. I am using pad_mode=reflect, but really, there is little hope of recovering the instantaneous frequency at the edges.

  • What should the tests be? Tests on synthetic tones or tests vs the MATLAB code?

I think synthetic tones would make sense here.

@lostanlen
Copy link
Contributor Author

In the interest of releasing v0.7.1 soon, i suggest we bump this to the v0.8 milestone?

@bmcfee
Copy link
Member

bmcfee commented Sep 3, 2019

I'm okay with that.

@bmcfee bmcfee modified the milestones: 0.7.1, 0.8.0 Sep 3, 2019
@lostanlen
Copy link
Contributor Author

Looking at this again in the wake of ISMIR. I will address the CR, add tests, and reconvene.

@bshall bshall mentioned this pull request Jan 18, 2020
@lostanlen
Copy link
Contributor Author

superseded by #1063

@lostanlen lostanlen closed this Jan 21, 2020
@bmcfee bmcfee deleted the yin branch August 27, 2021 16:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
functionality Does this add new functionality?
Development

Successfully merging this pull request may close these issues.

YIN fundamental frequency estimator
2 participants