New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WIP] YIN #974
Conversation
😮 thanks for doing this! I'll give it some cycles later on today. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks pretty good so far; a few minor API requests.
One big question: we'll want to support pyin in the future, if not immediately in this PR. How can we set up the internals of this function to make that easy to do?
Notes in my first CR. Generally, I think it's better if we can make the API and default parameters internally consistent with the rest of the library. If we want to show how to recover a good approximation to the original paper / implementation, that can be done in a notebook or docstring example.
I don't have strong opinions on this one, but it doesn't seem to hurt to include them.
Let's hold off on that for now.
I'm wondering if this could be simplified to use scipy's polynomial interpolation instead? It might take a little bit of clever array shaping, but it would be preferable to having a bunch of grungy interpolation code IMO.
Indeed. Our API plan should make it easy to extend to pyin.
We should definitely support user-specified windows. Ditto for padding modes. Edge behavior is not well-defined anyway, so I'm not going to worry about it.
I think synthetic tones would make sense here. |
In the interest of releasing v0.7.1 soon, i suggest we bump this to the v0.8 milestone? |
I'm okay with that. |
Looking at this again in the wake of ISMIR. I will address the CR, add tests, and reconvene. |
superseded by #1063 |
Reference Issue
Fixes #527 (eventually)
What does this implement/fix? Explain your changes.
This is a comprehensive reimplementation of the YIN algorithm (de Cheveigné and Kawahara, JASA 2002). It's a complex and long-awaited new feature, so i imagine that it's going to require meticulous rounds of CR.
To write this function, i re-read the YIN paper as well as the MATLAB code:
https://github.com/lemonzi/matlab/blob/master/yin/yin2.m
Thankfully, this paper is very well written so it was easy to link it to relevant parts of the code.
Any other comments?
the code is in a relatively mature state, but it's still a work in progress. I haven't written the tests yet. I want to create a consensus in the API before i do.
i tried my best to follow existing naming conventions in the librosa API. Hence
y
,sr
,frame_length
,hop_length
,fmin
,fmax
, andpad_mode
.However,
threshold_1
andthreshold_2
could use better names. Ideas appreciated.Likewise for the booleans
cumulative
andinterpolate
. Do we even support them as kwargs? If so, i need to document them.Perhaps the docstring should be longer and give some historical / methodological context around YIN? Contributions appreciated.
The parabolic interpolation step could potentially be faster by allocating less intermediate arrays. But maybe that's a premature optimization.
I did not implement part 6: "best local estimate". It would considerably slow down the code; de Cheveigné and Kawahara report a relatively small effect; and i doubt many people use it anyway. Long term, we would rather use pYIN rather then part 6.
This implementation works splendidly well as long as there are ten periods or more in the window. Below that number, it's quite numerically unstable. Could it be due to the boxcar window? de Cheveigné and Kawahara write that "there is [...] little reason not to use a square window", but i doubt it.
Likewise for padding. I am using
pad_mode=reflect
, but really, there is little hope of recovering the instantaneous frequency at the edges.What should the tests be? Tests on synthetic tones or tests vs the MATLAB code?