Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What's the meaning of d in calculating DTW distance? #5

Open
DanAlz opened this issue Sep 10, 2019 · 3 comments
Open

What's the meaning of d in calculating DTW distance? #5

DanAlz opened this issue Sep 10, 2019 · 3 comments

Comments

@DanAlz
Copy link

DanAlz commented Sep 10, 2019

I have used DTW distance as the documentation shows. I want to calculate the distance between two curves that each one has three points and when I print (d) I get back a 3*3 matrix that I don't know what it shows? What does each element of the matrix show?

@cjekel
Copy link
Owner

cjekel commented Sep 10, 2019

d is the cumulative distance matrix between the two curves. I think the Matlab explanation is good. It's sometimes useful to visualize the distance matrix and optimal alignment path.

I think the [0, 0] value is the distance between the first point of each curve, the [n-1, n-1] value is the DTW distance, and all other values represent the distance from [0,0] to the [i, j] index.

Let me know if this helps.

@DanAlz
Copy link
Author

DanAlz commented Sep 10, 2019

d is the cumulative distance matrix between the two curves. I think the Matlab explanation is good. It's sometimes useful to visualize the distance matrix and optimal alignment path.

I think the [0, 0] value is the distance between the first point of each curve, the [n-1, n-1] value is the DTW distance, and all other values represent the distance from [0,0] to the [i, j] index.

Let me know if this helps.

At first, I thought the [0,0] value is the euclidean distance between the first point of each curve and the [0,1] value is the euclidean distance between the first point of the first curve and the second point of the second curve and generally [n,m] value is the euclidean distance between n_th point of the first curve and m_th point of the second curve but I calculated euclidean distance between points of the first curve and second curve and I saw the result is different from what d matrix represent.

@cjekel
Copy link
Owner

cjekel commented Sep 10, 2019

It's the cumulative distance matrix that results in the smallest d[n-1, n-1].

At first, I thought the [0,0] value is the euclidean distance between the first point of each curve

This is correct. If it's not -- something is wrong.

Let c be the pairwise distance between each point.

d[0, 1] is d[0, 0] + c[0, 1]

d[1, 1] is min(d[0,0], d[0,1], d[1,0]) + c[1,1]

etc...

Here is some code to help visualize what's going on.

import numpy as np
import matplotlib.pyplot as plt
import similaritymeasures as sm

x = np.array([[0, 0, 0, 0]]).T
y = np.array([[1, 2, 3, 4]]).T
r, d = sm.dtw(x, y)
assert(0 + 1 + 2 + 3 + 4 == r)

plt.imshow(d, origin='lower')
plt.yticks([0, 1, 2, 3])
plt.xticks([0, 1, 2, 3])
plt.title('Cumlative distance matrix')
plt.xlabel('Index of x')
plt.ylabel('Index of y')
plt.colorbar()
plt.show()

Check out algorithm 2.1 from http://seninp.github.io/assets/pubs/senin_dtw_litreview_2008.pdf

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants