distance between Q and itself is not zero

```
import numpy as np

seed = 13
np.random.seed(seed)
T = np.random.uniform(-1000, 1000, [64]).astype(np.float64)

# Q is a length-4 subsequence of T
Q = T[44 : 48]
```

The distance between a sequence and itself should be zero. However, this is not true for the `Q` shown above if we use the Pearson approach. Let's see:


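```
m = Q.shape[0]  # subsequence length

# Compare Q with itself, so the "T" statistics are those of Q as well
QT = np.dot(Q, Q)
μ_Q = np.mean(Q)
M_T = np.mean(Q)

σ_Q = np.std(Q)
Σ_T = np.std(Q)

# Pearson correlation, then conversion to z-normalized Euclidean distance
denom = m * σ_Q * Σ_T
ρ = (QT - m * μ_Q * M_T) / denom
D_squared = np.abs(2 * m * (1.0 - ρ))
D = np.sqrt(D_squared)
```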
And, we have:
```
>>> ρ
0.9999999999999999

>>> D_squared
8.881784197001252e-16

>>> D
2.9802322387695312e-08
```

Note that `D` should have been 0. Although `npt.assert_almost_equal(D, 0)` does not raise an error, `D` is greater than 0. This can become an issue when testing the snippets module (explained below).
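
To see where this value of `D` comes from, here is a minimal sketch that does not depend on the random seed. It assumes `m = 4` (the length of `Q = T[44 : 48]`) and sets the correlation to the largest `float64` below 1.0, which prints as `0.9999999999999999`. The name `rho` is just an ASCII stand-in for `ρ`.

```python
import numpy as np
import numpy.testing as npt

m = 4  # assumed: length of Q = T[44 : 48]

# Largest float64 strictly below 1.0, i.e. 1 - 2**-53
rho = np.nextafter(1.0, 0.0)

# Same distance conversion as in the snippet above
D_squared = np.abs(2 * m * (1.0 - rho))  # 8 * 2**-53 == 2**-50
D = np.sqrt(D_squared)                   # 2**-25

# Passes with the default tolerance (decimal=7), even though D > 0
npt.assert_almost_equal(D, 0)
print(rho, D_squared, D)
```

So a one-bit rounding error in the correlation is enough to produce a self-distance of `2**-25`, roughly `2.98e-08`.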

---

The performant snippets implementation uses the performant `_mpdist_vect`, which computes the MPdist profile using `_mass`. The naive snippets implementation uses the naive `mpdist_vect`, which computes distances using the naive `stump`. So, the distance between `Q` and itself is `2.9802322387695312e-08` in the naive approach, whereas in the performant approach it is 0. This small difference can result in a considerable change in the boolean array `mask`:

https://github.com/TDAmeritrade/stumpy/blob/2d003ff2c2e0212eba32b22211a743306bafa992/stumpy/snippets.py#L285

which, in turn, results in a considerable change in `np.sum(mask)` and, consequently `snippet_fractions`.

https://github.com/TDAmeritrade/stumpy/blob/2d003ff2c2e0212eba32b22211a743306bafa992/stumpy/snippets.py#L286

Note that `np.sum(mask)` is an integer. So, we are not talking about a loss of precision of approx. `1e-8` here! That small loss of precision in `D` can change `np.sum(mask)` by at least `1`.
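
As a rough illustration of the effect (not the actual stumpy code), the sketch below assumes that `mask` is built by an elementwise `<=` comparison against a reference profile. All arrays and names here are hypothetical and chosen only to show how a `~3e-8` difference flips entries of a boolean mask.

```python
import numpy as np

tiny = 2.9802322387695312e-08  # the non-zero self-distance reported above

# Hypothetical distance profiles: identical except where the exact value is 0
d_performant = np.array([0.0, 0.5, 0.0, 1.2])
d_naive = np.array([tiny, 0.5, tiny, 1.2])

# Hypothetical reference profile that the distances are compared against
reference = np.array([0.0, 0.5, 0.0, 1.0])

mask_performant = d_performant <= reference  # [True, True, True, False]
mask_naive = d_naive <= reference            # [False, True, False, False]

# The counts are integers, so they differ by whole units, not by ~1e-8
count_performant = int(np.sum(mask_performant))
count_naive = int(np.sum(mask_naive))

frac_performant = count_performant / reference.shape[0]
frac_naive = count_naive / reference.shape[0]
print(count_performant, count_naive, frac_performant, frac_naive)
```

In this toy case the counts are 3 versus 1, and the resulting fractions are 0.75 versus 0.25, which no floating-point tolerance in a test would absorb.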

If we look at the errors in https://github.com/TDAmeritrade/stumpy/actions/runs/4636087916/jobs/8203665926?pr=823 we can see that all errors are related to `snippets_fractions`, and the errors are not negligible due to the reason explained earlier.


