Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DTW #323

Merged
merged 1 commit into from
Jul 26, 2016
Merged

DTW #323

merged 1 commit into from
Jul 26, 2016

Conversation

stefan-balke
Copy link
Member

@stefan-balke stefan-balke commented Apr 18, 2016

Hey all,

I mainly took @craffel dijtw source code and merged it with mine.
As dicussed in #298, we want the following features:

  • Arbitrary step sizes
  • Additive or multiplicative local weights for the steps
  • Subsequence (so it can be used for matching)
  • Global path constraints (e.g. Sakoe-Chiba band etc.)
  • make numba optional (cf. Brian's comment)
  • test backtracking explicitly
  • plot D + wp (in the examples)
  • Gullying

After finishing the features (which are in dijtw), we need more tests and some final notebook with benchmarking the implementation against "vanilla-vanilla" dtw.


This change is Reviewable

@stefan-balke stefan-balke changed the title WIP: DTW [WIP] DTW Apr 18, 2016
@craffel
Copy link
Contributor

craffel commented Apr 18, 2016

Nice. I can take a look at this in May.

@bmcfee
Copy link
Member

bmcfee commented Apr 18, 2016

I know this is still WIP and commenting at this stage is annoying, but can we not make numba a requirement?

This comment provides a workaround for optional jit compilation, so that the same code should work with or without numba.

@bmcfee
Copy link
Member

bmcfee commented Apr 20, 2016

This PR is failing tests because of a broken comparison test in the display module.

I just fixed this upstream, so if you git rebase master, it should be fixed.

@stefan-balke
Copy link
Member Author

This comment provides a workaround for optional jit compilation, so that the same code should work with or without numba.

Done.

Should this go into 0.4.3? I think it could go but if we decide that the function interfaces are fine. Putting more features into it later is easy then.

@bmcfee bmcfee added the functionality Does this add new functionality? label Apr 28, 2016
@bmcfee
Copy link
Member

bmcfee commented Apr 28, 2016

Should this go into 0.4.3? I think it could go but if we decide that the function interfaces are fine. Putting more features into it later is easy then.

If you think we can do it quickly (ie next couple of weeks), then sure. If not, we can continue to punt it.

@bmcfee
Copy link
Member

bmcfee commented May 4, 2016

If you think we can do it quickly (ie next couple of weeks), then sure. If not, we can continue to punt it.
Hide all checks

@stefan-balke Any thoughts on this? I'm fine either way, but if you don't have time to finish it soon, let's punt.

@stefan-balke
Copy link
Member Author

Since this is planned to be joint work with @craffel I'd prefer to have it ready for the next version after the his phd thesis is handed in.
@craffel any objections?

@craffel
Copy link
Contributor

craffel commented May 4, 2016

Yeah, still interested in helping, and I will have more bandwidth to help later this month.

@bmcfee
Copy link
Member

bmcfee commented May 4, 2016

Ok, I've reassigned #298 to 0.5 then.

@stefan-balke
Copy link
Member Author

fyi: This is on my next week's list.

@stefan-balke
Copy link
Member Author

Alright, I got most of the things done.

Gullying

@craffel can we talk about the gullying?
Couldn't we also do it with the same idea as subsequence DTW?
(Initialize the first row of the accumulated cost matrix D with the first row of the C matrix instead of init with the cumsum of C's first row?)

In the case of allowing gullying at the beginning 3 frames, we could simply init D's first row with:

[C[0,0], C[0,1], C[0,2] ....proceed with cumsum(C)... ]

and for the ending frames in a similar way.

Display function

@bmcfee with the rewrite of the display module, I'm confused where to put the visualizations.

I need two things:

  • Plot the cost matrix
  • Add the warping path to the matrix

Could look like this:

fig = plt.figure()
ax = fig.add_subplot(111)
imax = ax.imshow(C, cmap=plt.get_cmap('gray_r'),
                 origin='lower', interpolation='nearest', aspect='auto')

x_val = [x[1] for x in p_adj]
y_val = [x[0] for x in p_adj]
ax.plot(x_val, y_val, marker='o', color='r')
ax.autoscale(tight=True)
plt.ylabel('$X$')
plt.xlabel('$Y$')
cbar = fig.colorbar(imax)
cbar.ax.set_ylabel('Cost')
ax.set_title('Cost Matrix $C$ with an optimal warping path')

Should I do own functions for this or should it derive from anything in the display module?

@bmcfee
Copy link
Member

bmcfee commented Jun 14, 2016

Should I do own functions for this or should it derive from anything in the display module?

I don't think we necessarily need special case display functions for this. What's wrong with the following (matching your notation)?

>>> librosa.display.specshow(C)
>>> plt.plot(p_adj[:, 0], p_adj[:, 1], label='Optimal path')

Or, if you want time axes:

>>> librosa.display.specshow(C, x_axis='time', y_axis='time')
>>> p_time = librosa.frames_to_time(p_adj)
>>> plt.plot(p_time[:, 0], p_time[:, 1], label='Optimal path')

@stefan-balke
Copy link
Member Author

I don't think we necessarily need special case display functions for this. What's wrong with the following (matching your notation)?

Nothing, specshow just sounded wrong. :)
Will put it into the examples then.

@craffel
Copy link
Contributor

craffel commented Jun 14, 2016

@craffel can we talk about the gullying?
Couldn't we also do it with the same idea as subsequence DTW?

No, I don't think so. I don't think we're on the same page. The gully simply constrains the possible end points. For example, if the distance matrix is of shape (100, 200) and the gully was .9 then the path would have to end any of (90, 199) through (99, 199) or (99, 180) through (99, 199). Does that make sense? I have definitions of it more rigorously in any of my papers which use DTW. So, it's something I usually enforce at traceback time, not in the distance matrix, although I'd expect you could also do it by filling portions of the final row and column of the matrix with infinity. I don't see how it could be accomplished by modifying the first row or column because in my understanding it is independent of the first row and column, it's only a constraint on the final point. BTW, this is something I picked up (and maybe reinterpreted/bastardized) from @dpwe, not sure where he got it from.

@craffel
Copy link
Contributor

craffel commented Jun 14, 2016

Also, thanks for all of this! I will spend some time checking it out in the near future.

@stefan-balke
Copy link
Member Author

Hey Colin,

So, it's something I usually enforce at traceback time

Yep, thought about this and got to the same conclusion :)

However, I wonder if we couldn't accomplish this by restricting the subsequence DTW somehow.

Here a small sketch:

img_2108

So for me, gullying live somewhat in between subsequence and global DTW but I'll think about it a little bit more...

@bmcfee
Copy link
Member

bmcfee commented Jun 15, 2016

Could we get some equations (or pointers/screenshots to relevant papers) to describe what "gullying" means?

If @stefan-balke's drawing (c) is accurate, it seems like you could enforce this by, as @craffel says, filling the top row (outside the red box) and right column (outside the green box) by infinity, and the rest fixes itself.

@craffel
Copy link
Contributor

craffel commented Jun 16, 2016

Firs things first, what I originally wrote

if the distance matrix is of shape (100, 200) and the gully was .9 then the path would have to end any of (90, 199) through (99, 199) or (99, 180) through (99, 199)

is wrong for subsequence DTW, I wrote it in haste. For subsequence DTW, it should be "the path would have to end any of (90, 199) through (99, 199) or (99, 90) through (99, 199)".

Here a small sketch:

Your sketch looks reasonable to me.

Could we get some equations (or pointers/screenshots to relevant papers) to describe what "gullying" means?

As I said above, I have definitions in my papers which use DTW (and they all say basically the same thing, and it's something I picked up from @dpwe, not sure where he got it from. Here's one from ISMIR 2015 (note this definition is not for subsequence DTW):

gully

For subsequence DTW it's g*min(M, N) <= p[L] <= N, g*min(M, N) <= q[L] <= M. (also in retrospect I should have defined p and q to be natural numbers, not real numbers, but oh well). @dpwe's code/description is here: http://www.ee.columbia.edu/ln/rosa/matlab/alignmidi/

@stefan-balke
Copy link
Member Author

Anyone in for a CR? We can add gullying when @craffel has more cycles..

@bmcfee
Copy link
Member

bmcfee commented Jun 20, 2016

I'm happy to CR after it passes on travis.

@stefan-balke
Copy link
Member Author

stefan-balke commented Jun 20, 2016

Oops, it was passing in 3.5. I think we need numba in the dependencies...Tried to add it but it seems it's not used in Travis.

Obtaining file:///home/travis/build/librosa/librosa
Requirement already satisfied (use --upgrade to upgrade): audioread>=2.0.0 in /home/travis/env/miniconda3.5/envs/test-environment/lib/python3.5/site-packages (from librosa==0.5.0.dev0)
Requirement already satisfied (use --upgrade to upgrade): numpy>=1.8.0 in /home/travis/env/miniconda3.5/envs/test-environment/lib/python3.5/site-packages (from librosa==0.5.0.dev0)
Requirement already satisfied (use --upgrade to upgrade): scipy>=0.13.0 in /home/travis/env/miniconda3.5/envs/test-environment/lib/python3.5/site-packages (from librosa==0.5.0.dev0)
Requirement already satisfied (use --upgrade to upgrade): scikit-learn>=0.14.0 in /home/travis/env/miniconda3.5/envs/test-environment/lib/python3.5/site-packages (from librosa==0.5.0.dev0)
Requirement already satisfied (use --upgrade to upgrade): matplotlib>=1.5 in /home/travis/env/miniconda3.5/envs/test-environment/lib/python3.5/site-packages (from librosa==0.5.0.dev0)
Requirement already satisfied (use --upgrade to upgrade): joblib>=0.7.0 in /home/travis/env/miniconda3.5/envs/test-environment/lib/python3.5/site-packages (from librosa==0.5.0.dev0)
Requirement already satisfied (use --upgrade to upgrade): decorator>=3.0.0 in /home/travis/env/miniconda3.5/envs/test-environment/lib/python3.5/site-packages (from librosa==0.5.0.dev0)
Requirement already satisfied (use --upgrade to upgrade): six>=1.3 in /home/travis/env/miniconda3.5/envs/test-environment/lib/python3.5/site-packages (from librosa==0.5.0.dev0)
Requirement already satisfied (use --upgrade to upgrade): resampy>=0.1.2 in /home/travis/env/miniconda3.5/envs/test-environment/lib/python3.5/site-packages (from librosa==0.5.0.dev0)
Requirement already satisfied (use --upgrade to upgrade): python-dateutil in /home/travis/env/miniconda3.5/envs/test-environment/lib/python3.5/site-packages (from matplotlib>=1.5->librosa==0.5.0.dev0)
Requirement already satisfied (use --upgrade to upgrade): pytz in /home/travis/env/miniconda3.5/envs/test-environment/lib/python3.5/site-packages (from matplotlib>=1.5->librosa==0.5.0.dev0)
Requirement already satisfied (use --upgrade to upgrade): cycler in /home/travis/env/miniconda3.5/envs/test-environment/lib/python3.5/site-packages (from matplotlib>=1.5->librosa==0.5.0.dev0)
Requirement already satisfied (use --upgrade to upgrade): pyparsing!=2.0.4,>=1.5.6 in /home/travis/env/miniconda3.5/envs/test-environment/lib/python3.5/site-packages (from matplotlib>=1.5->librosa==0.5.0.dev0)
Requirement already satisfied (use --upgrade to upgrade): Cython>=0.21 in /home/travis/env/miniconda3.5/envs/test-environment/lib/python3.5/site-packages (from resampy>=0.1.2->librosa==0.5.0.dev0)
Installing collected packages: librosa
  Running setup.py develop for librosa

@@ -11,7 +11,7 @@ conda_create ()
conda update -q conda
conda config --add channels pypi
conda info -a
deps='pip numpy scipy pandas requests nose coverage numpydoc matplotlib sphinx scikit-learn seaborn'
deps='pip numpy scipy pandas requests nose coverage numpydoc matplotlib sphinx scikit-learn seaborn numba'
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here I add numba.

@bmcfee
Copy link
Member

bmcfee commented Jun 20, 2016

.Tried to add it but it seems it's not used in Travis.

I just wiped the image cache and restarted the build. Hopefully that should do it.

But yes, numba needs to be installed via conda to get llvm packages; pip alone won't do it.

@stefan-balke
Copy link
Member Author

But yes, numba needs to be installed via conda to get llvm packages; pip alone won't do it.

True fact.

@bmcfee
Copy link
Member

bmcfee commented Jun 20, 2016

I don't think your travis problems are coming from conda, but instead are due to the way you parameterize the decorator.

Travis log

Offending code

You need to return that as a decorator, eg like done here.

EDIT: scratch that; the problem is that when numba can be imported, it returns the jit wrapper as a proper decorator. But when numba can't be imported, it returns a bool, which doesn't work as a decorator.

@stefan-balke
Copy link
Member Author

test_dtw.test_dtw_global ... ok
test_dtw.test_dtw_global_diagonal ... ok
test_dtw.test_dtw_subseq ... ok

Here we go....

@stefan-balke
Copy link
Member Author

stefan-balke commented Jun 20, 2016

EDIT: scratch that; the problem is that when numba can be imported, it returns the jit wrapper as a proper decorator. But when numba can't be imported, it returns a bool, which doesn't work as a decorator.

Yup, but I'm doing something similar. I'm importing the decorator. If it's available - use it - otherwise return the function and do the normal Python thingy....

Tried it locally with a numba env and a non-numba env. Both passing.

@stefan-balke
Copy link
Member Author

Ready for a first CR.

@bmcfee
Copy link
Member

bmcfee commented Jun 22, 2016

Review status: 6 of 8 files reviewed at latest revision, 5 unresolved discussions, some commit checks failed.


.travis_dependencies.sh, line 14 [r10] (raw file):

Previously, bmcfee (Brian McFee) wrote…

Really? That's weird.

Apparently so is libsamplerate O_o

I'll make a separate PR to clean that stuff out. Thanks for catching it.

I fixed this in master; please rebase

Comments from Reviewable

@stefan-balke
Copy link
Member Author

.travis_dependencies.sh, line 14 [r10] (raw file):

Previously, bmcfee (Brian McFee) wrote…

I fixed this in master; please rebase

Done.

Comments from Reviewable

@stefan-balke
Copy link
Member Author

librosa/core/dtw.py, line 13 [r10] (raw file):

Previously, bmcfee (Brian McFee) wrote…

I'm not even sure we need this function at all.

It only gets used in one place, and that's explicitly to zero out the triangles of a given matrix.

I would much rather have a function that runs in-place to do this, eg,

def band_mask(x, k, value):

    # not suer about k here, might need to be radius+1
    idx_u = np.triu_indices_from(x, k=k)
    idx_l = np.tril_indices_from(x, k=k)
    x[idx_u] = value
    x[idx_l] = value

this ought to be more memory efficient than the current numba code, and just as fast since it's using numpy builtins. The main benefit though is that it's much less code, and the functionality is obvious.

Looks good, will try it out.

Comments from Reviewable

@bmcfee
Copy link
Member

bmcfee commented Jun 23, 2016

Reviewed 1 of 1 files at r18.
Review status: 6 of 8 files reviewed at latest revision, 5 unresolved discussions, some commit checks failed.


Comments from Reviewable

@stefan-balke
Copy link
Member Author

librosa/core/dtw.py, line 13 [r10] (raw file):

Previously, stefan-balke (Stefan Balke) wrote…

Looks good, will try it out.

Finally found some time to do it. Still not convinced by the if-else condition. Any thoughts?

Comments from Reviewable

@bmcfee
Copy link
Member

bmcfee commented Jul 20, 2016

Review status: 5 of 8 files reviewed at latest revision, 6 unresolved discussions.


librosa/core/dtw.py, line 13 [r10] (raw file):

Previously, stefan-balke (Stefan Balke) wrote…

Finally found some time to do it. Still not convinced by the if-else condition. Any thoughts?

What do you mean by if-else condition?

tests/test_dtw.py, line 69 [r19] (raw file):

    assert np.array_equal(mut_x, gt_x)

    # Case 2: N!=M

This should be a separate test.


Comments from Reviewable

@stefan-balke
Copy link
Member Author

librosa/core/dtw.py, line 13 [r10] (raw file):

Previously, bmcfee (Brian McFee) wrote…

What do you mean by if-else condition?

Checking for the "longer" axis. Would love to have something like `np.roll_to_biggest_axis()` but it's okay I guess.

Comments from Reviewable

@stefan-balke
Copy link
Member Author

I squashed the commits. This one is ready to go merge.

@bmcfee
Copy link
Member

bmcfee commented Jul 21, 2016

Reviewed 1 of 1 files at r10.
Review status: 6 of 8 files reviewed at latest revision, 5 unresolved discussions.


librosa/core/dtw.py, line 190 [r10] (raw file):

Previously, stefan-balke (Stefan Balke) wrote…

see above

looks like `my_mask` is no longer used here.

Can you pylint this and track down any other unused variables / icky stuff?


tests/test_dtw.py, line 69 [r19] (raw file):

Previously, bmcfee (Brian McFee) wrote…

This should be a separate test.

There should also be a test on the transpose of this, just to make sure our corner cases are correct. Otherwise looks good.

Comments from Reviewable

@stefan-balke
Copy link
Member Author

Done.

@bmcfee
Copy link
Member

bmcfee commented Jul 21, 2016

Review status: 6 of 8 files reviewed at latest revision, 7 unresolved discussions.


librosa/core/dtw.py, line 14 [r21] (raw file):

def band_mask(x, radius, value=0):
    """Construct band-around-diagonal mask (Sakoe-Chiba band).  When

This doesn't actually create a band anymore; it fills values outside the band. Maybe we can find a more appropriate name and/or description?


librosa/core/dtw.py, line 28 [r21] (raw file):

        ``int(radius*min(mask.shape))``.
    value : int
        Replacement-value.

This is unclear. Suggest "x[i,j] = value when (i, j) lies outside the band"


Comments from Reviewable

@stefan-balke
Copy link
Member Author

I called the band now global_constraints which is a bit more general.

@bmcfee
Copy link
Member

bmcfee commented Jul 22, 2016

Review status: 6 of 8 files reviewed at latest revision, 5 unresolved discussions.


librosa/core/dtw.py, line 190 [r10] (raw file):

Previously, bmcfee (Brian McFee) wrote…

looks like my_mask is no longer used here.

Can you pylint this and track down any other unused variables / icky stuff?

It's bad luck to have a variable name (`global_constraints`) the same as a function within scope.

Given the options, I'd prefer to change the function name (again, sorry). The name "global_constraints" doesn't convey much information. How about fill_off_diagonal?


Comments from Reviewable

@stefan-balke
Copy link
Member Author

Did I miss anything? Otherwise this one is ready.

@bmcfee
Copy link
Member

bmcfee commented Jul 26, 2016

Review status: 6 of 8 files reviewed at latest revision, 4 unresolved discussions.


librosa/core/dtw.py, line 7 [r23] (raw file):

import numpy as np
from scipy.spatial.distance import cdist
from librosa.util.decorators import optional_jit

Use local module names:

from ..util.decorators import optional_jit

tests/test_dtw.py, line 69 [r19] (raw file):

Previously, bmcfee (Brian McFee) wrote…

There should also be a test on the transpose of this, just to make sure our corner cases are correct. Otherwise looks good.

There still needs to be a test for the transpose (12x8)

Comments from Reviewable

@stefan-balke
Copy link
Member Author

tests/test_dtw.py, line 69 [r19] (raw file):

Previously, bmcfee (Brian McFee) wrote…

There still needs to be a test for the transpose (12x8)

It's in line 88 or you want a separate test?

Comments from Reviewable

@bmcfee
Copy link
Member

bmcfee commented Jul 26, 2016

Reviewed 1 of 2 files at r23, 1 of 1 files at r24.
Review status: all files reviewed at latest revision, 3 unresolved discussions.


tests/test_dtw.py, line 69 [r19] (raw file):

Previously, stefan-balke (Stefan Balke) wrote…

It's in line 88 or you want a separate test?

ohhh i see. I prefer small tests, but this is fine.

Comments from Reviewable

@bmcfee bmcfee merged commit c053d9e into librosa:master Jul 26, 2016
@bmcfee bmcfee added this to the 0.5 milestone Jul 26, 2016
@bmcfee bmcfee assigned craffel, dpwe and bmcfee and unassigned dpwe Jul 26, 2016
@bmcfee
Copy link
Member

bmcfee commented Jul 26, 2016

Merged. Thanks @stefan-balke !

@stefan-balke stefan-balke mentioned this pull request Jul 26, 2016
Closed
@dpwe
Copy link
Contributor

dpwe commented Jul 26, 2016

I got all excited only to find I had been unassigned!

DAn.

On Tuesday, July 26, 2016, Brian McFee notifications@github.com wrote:

Assigned #323 #323 to @dpwe
https://github.com/dpwe.


You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
#323 (comment), or mute
the thread
https://github.com/notifications/unsubscribe-auth/AAhs0Wkqgx05rFWVmecoYUCl2YE8hLRkks5qZgGQgaJpZM4IJqB9
.

@bmcfee bmcfee mentioned this pull request Jul 27, 2016
16 tasks
@bmcfee bmcfee changed the title [CR] DTW DTW Jan 9, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
functionality Does this add new functionality?
Development

Successfully merging this pull request may close these issues.

None yet

4 participants