inconsistency between the distance calculated in the profile D (of stumpy.match) and np.linalg.norm #499

@NimaSarajpoor

Problem:

I was trying to use stumpy.match and noticed a small discrepancy (possibly caused by floating-point rounding) that prevents stumpy from finding the match for a query in one of my cases. Below, I show the distance between a query and a pattern (of the same length) calculated in three ways:

When normalize=False:
1. numpy.linalg.norm() gives 46.54073160107859
2. sklearn.metrics.pairwise_distances() gives 46.54073160107859
3. stumpy.match gives 46.5407316010786

As you can see, the value calculated by stumpy differs in the last digits: 46.5407316010786 is slightly larger than 46.54073160107859. So, when I set max_distance to 46.54073160107859, stumpy.match returned an empty array.
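To make the comparison concrete, here is a minimal check that uses only the three values printed above (no stumpy call is involved). Read as floats, the printed stumpy value is the larger of the two, which is consistent with an inclusive `<=` test against max_distance rejecting it. The variable names are just labels chosen for this sketch.

```python
# Values copied verbatim from the report above.
d_numpy = 46.54073160107859   # numpy.linalg.norm and sklearn pairwise_distances
d_stumpy = 46.5407316010786   # value reported in the stumpy.match output

# The two literals parse to different floats, and the stumpy one is larger.
print("stumpy value is larger:", d_stumpy > d_numpy)

# The gap is on the order of the last representable digit at this magnitude.
print("difference:", d_stumpy - d_numpy)

# An inclusive threshold test with max_distance = d_numpy therefore fails.
print("passes '<= max_distance':", d_stumpy <= d_numpy)
```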

FYI: In one step of my study, I had to calculate the pairwise distances between Q (of length L) and some (not all) subsequences of the time series (each of length L). I then took the minimum distance d among those and set max_distance to d, expecting to get at least one matching subsequence. (Why at least one? Because this time I searched the whole time series, not just some of its subsequences.)
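The workflow described here can be sketched in plain Python. This is an illustration under stated assumptions, not stumpy's code: it assumes the full distance profile is obtained through an algebraically equivalent but differently ordered formula (the expansion of the squared norm into dot products), which is one plausible way for the last digits to differ from a direct norm. The data, the helper names `direct_distance` and `expanded_distance`, and the choice of every fifth subsequence as the "some" subset are all made up for the example.

```python
import math
import random


def direct_distance(q, s):
    """Euclidean distance computed from the element-wise differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q, s)))


def expanded_distance(q, s):
    """Same distance via ||q||^2 + ||s||^2 - 2*q.s (assumed formulation)."""
    squared = (
        sum(a * a for a in q)
        + sum(b * b for b in s)
        - 2.0 * sum(a * b for a, b in zip(q, s))
    )
    return math.sqrt(max(squared, 0.0))  # clamp tiny negative round-off


random.seed(0)
T = [random.uniform(-10, 10) for _ in range(200)]  # synthetic time series
Q = [random.uniform(-10, 10) for _ in range(20)]   # synthetic query
L = len(Q)
starts = range(len(T) - L + 1)

# Step 1: minimum distance d over SOME subsequences, computed directly.
subset = list(starts)[::5]
i_best = min(subset, key=lambda i: direct_distance(Q, T[i:i + L]))
d = direct_distance(Q, T[i_best:i_best + L])

# Step 2: profile over ALL subsequences, computed with the other formula.
profile = [expanded_distance(Q, T[i:i + L]) for i in starts]

# Step 3: inclusive threshold at d. Whether i_best survives depends on
# how the last bits of the two formulas happen to round.
matches = [i for i in starts if profile[i] <= d]

print("d from subset:        ", repr(d))
print("profile at same index:", repr(profile[i_best]))
print("indices with D <= d:  ", matches)
```

The two values at `i_best` agree to many digits but are not guaranteed to be bit-identical, so the subsequence that defined d can fall just outside the threshold.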

Solution (?):

I resolved it by simply adding 1e-6 to the max_distance value.
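A minimal sketch of that workaround, using the values from this report. The helper `pad_max_distance` is hypothetical (it is not part of stumpy), and the default tolerances are arbitrary choices for illustration. A relative pad is shown next to the absolute 1e-6 pad because a fixed absolute offset may be too small or too large when distances have a very different magnitude.

```python
def pad_max_distance(d, rel_tol=1e-9, abs_tol=0.0):
    """Return d enlarged by a small tolerance (hypothetical helper)."""
    return d * (1.0 + rel_tol) + abs_tol


d_numpy = 46.54073160107859   # threshold obtained from numpy.linalg.norm
d_stumpy = 46.5407316010786   # distance reported by stumpy.match

# Without padding, the inclusive comparison rejects the match.
print("unpadded:    ", d_stumpy <= d_numpy)

# Absolute pad of 1e-6, as used in the report.
print("abs pad 1e-6:", d_stumpy <= pad_max_distance(d_numpy, rel_tol=0.0, abs_tol=1e-6))

# Relative pad, which scales with the magnitude of the distance.
print("rel pad 1e-9:", d_stumpy <= pad_max_distance(d_numpy))
```

The padded value would then be passed as max_distance in place of the raw minimum.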


Also:

If I understand correctly, you are using <= when comparing the distance between two patterns with max_distance.
However, the docstring of the stumpy.match function describes the return value out as follows:

out : numpy.ndarray
The first column consists of distances of subsequences of T whose distances
to Q are smaller than max_distance ...

The term "smaller than" should be changed to "less than or equal to" (the same wording as at the beginning of the same docstring).

Labels: enhancement, question