Warped series is not of the same length as the reference and query #7

JulietteVTrigt · 2020-11-10T16:37:34Z

Python port of R's Comprehensive Dynamic Time Warp algorithm package version:
Python version: 3.7.3
Operating System: OS X

Description

If the query and reference series are of size N, then the warped series is always of size N-1. This can be explained by how the warp function is set up:

jset = alignment.index2
jmax = numpy.max(jset)
ii = function(numpy.arange(jmax))
return ii

As Python starts counting from 0, jmax is always N-1. I was wondering whether this is the desired output or whether the length of the warped series should also be N.

What I Did

alignment = dtw(
+        y_pred,
+        y_true,
+        keep_internals=True,
+        step_pattern=step_pattern,
+        window_type="sakoechiba",
+        window_args={"window_size": 4},
+    )
+    warped_indices = warp(alignment, index_reference=False)
+    aligned_series = y_pred[warped_indices]

The text was updated successfully, but these errors were encountered:

tonigi · 2020-11-10T17:40:35Z

It is intended. jmax is the index to the element containing the last aligned point. It may differ from N-1 for open_end=True alignments.

davidmwatson · 2021-02-10T18:26:56Z

I realise this issue has been closed, however I've been experiencing the same problem. There is an off-by-one difference, where the warped series is always one sample short of the length of the series it is being warped to.

I can see that jmax should be the index of the last element, but I was wondering if the issue is more in the generation of the indices to be fed to the interpolation: ii = ifun(numpy.arange(jmax)) . arange will generate values up to but not including the given value, so the length will be one short of jmax. For example, if jmax = 9 (corresponding to the 10th position), then arange(jmax) will only return 9 numbers (0 through 8 inclusive), and so will miss the final index of jmax. Might it make more sense to do arange(jmax + 1) instead, which would give the full length sequence ending on the value of jmax?

I've tried making that change in my own local copy, and it doesn't seem to have broken anything. If nothing else, maybe an extra argument could be added to the function giving the user the option to choose?

tonigi · 2021-02-11T11:27:27Z

I think you rather want either the .index1 or .index2 properties in the result. jmax is just an index.
They will be of the same length and be correct for any step pattern chosen.

davidmwatson · 2021-02-11T18:11:10Z

Okay, here's an example adapted from the docs to show what I mean. First, modify the warp function in warp.py to allow incrementing jmax if needed.

def warp(d, index_reference=False, full_length=False):
    if not index_reference:
        iset = d.index1
        jset = d.index2
    else:
        iset = d.index2
        jset = d.index1

    jmax = numpy.max(jset)

    # Scipy interp1d is buggy. it does not deal with leading
    # duplicated values of x. It returns different values
    # depending on the dtypes of arguments.
    ifun = _interp(x=jset, y=iset)

    ## Change here ##
    n = jmax + 1 if full_length else jmax 
    ii = ifun(numpy.arange(n))
    ## End change ##

    # Quick fix for bug
    if numpy.isnan(ii[0]):
        ii[numpy.isnan(ii)] = iset[0]
    return ii.astype(int)

Then run the example from the docs

idx = np.linspace(0, 2 * np.pi, num=100)
query = np.sin(idx) + np.random.uniform(size=100)/10.0
reference = np.cos(idx)

print(len(reference))  # -> 100

alignment = dtw(query, reference)

# Existing behaviour
wq1 = warp(alignment)
warped_query1 = query[wq1]
print(len(wq1), len(warped_query1))  # -> 99

# Modified behaviour
wq2 = warp(alignment, full_length=True)
warped_query2 = query[wq2]
print(len(wq2), len(warped_query2))  # -> 100

The existing warp behaviour makes wq (and hence also warped_query) one short of the length of the reference. I was expecting that if I warp something to a given reference, the warped series should be the same length as that reference, not one short of it. Using jmax+1 produces the behaviour I was expecting, at least. The .index1/2 attributes are both the same length as each other, but not the same length as the reference, so wouldn't really do what I want either.

Using the R implementation, the same example produces the behaviour I was expecting (the warped query is the same length as the reference):

idx <- seq(0, 2 * pi, length.out=100)
query <- sin(idx) + runif(100)/10.0
reference <- cos(idx)

print(length(reference))  # -> 100

alignment <- dtw(query, reference)

wq <- warp(alignment)
warped_query <- query[wq]
print(c(length(wq), length(warped_query)))  # -> 100

So the python and R implementations are slightly inconsistent here. The difference arises because python and R use different indexing (numpy's arange goes up to but not including jmax, while R's colon operator goes up to and including jmax). If I understand correctly, using arange(jmax+1) in python should reproduce the behaviour of the R implementation.

tonigi · 2021-02-11T21:18:00Z

Thanks for the added details. I'll have a look. The R warp function should be correct.

tonigi · 2021-06-30T09:41:07Z

Sorry for the delay. I still haven't had the time to carefully check the fix and think about the corner cases, but in the meantime, it sounds good for people stumbling upon the bug.

steelec · 2022-09-16T00:09:23Z

I have also recently come across this issue and decided on the same fix for the code. Is there any confirmation that this change will then provide the same output as the R version?

tonigi · 2022-09-16T06:33:29Z

Python version's warp function relies on scipy interpolation, which is different from R's. This should be fixed in the future. So, even with the fix there still can be discrepancies.

tonigi · 2024-03-18T16:36:14Z

I incorporated the change in 9ad9bf9 and released 1.4.0 . Sorry for taking it so long.

tonigi closed this as completed Nov 10, 2020

tonigi reopened this Feb 15, 2021

tonigi mentioned this issue Mar 13, 2024

Length of warped sequence #90

Closed

tonigi closed this as completed Mar 18, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Warped series is not of the same length as the reference and query #7

Warped series is not of the same length as the reference and query #7

JulietteVTrigt commented Nov 10, 2020 •

edited

tonigi commented Nov 10, 2020

davidmwatson commented Feb 10, 2021

tonigi commented Feb 11, 2021 •

edited

davidmwatson commented Feb 11, 2021

tonigi commented Feb 11, 2021

tonigi commented Jun 30, 2021

steelec commented Sep 16, 2022

tonigi commented Sep 16, 2022

tonigi commented Mar 18, 2024

Warped series is not of the same length as the reference and query #7

Warped series is not of the same length as the reference and query #7

Comments

JulietteVTrigt commented Nov 10, 2020 • edited

Description

What I Did

tonigi commented Nov 10, 2020

davidmwatson commented Feb 10, 2021

tonigi commented Feb 11, 2021 • edited

davidmwatson commented Feb 11, 2021

tonigi commented Feb 11, 2021

tonigi commented Jun 30, 2021

steelec commented Sep 16, 2022

tonigi commented Sep 16, 2022

tonigi commented Mar 18, 2024

JulietteVTrigt commented Nov 10, 2020 •

edited

tonigi commented Feb 11, 2021 •

edited