In this notebook, we provide some calculations that may help clarify the numerical precision in the z-normalized case.

# Approach I:

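The z-normalized distance $d$ between two subsequences of length $m$ can be written in terms of their Pearson correlation $\rho$:

$d = \sqrt{2m (1 - \rho)}$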
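The identity above can be checked numerically. The following is a minimal sketch using only the Python standard library; the helper names `z_normalize` and `pearson` are hypothetical, the data are random, and the population standard deviation is assumed (with it, each z-normalized subsequence has squared norm $m$, which is what makes the identity hold).

```python
import math
import random
import statistics


def z_normalize(values):
    """Subtract the mean and divide by the population standard deviation."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return [(v - mu) / sigma for v in values]


def pearson(a, b):
    """Pearson correlation as the mean product of the z-normalized values."""
    za, zb = z_normalize(a), z_normalize(b)
    return sum(x * y for x, y in zip(za, zb)) / len(a)


random.seed(0)
m = 50  # subsequence length
a = [random.gauss(0.0, 1.0) for _ in range(m)]
b = [random.gauss(0.0, 1.0) for _ in range(m)]

# Distance computed directly from the z-normalized subsequences
d_direct = math.dist(z_normalize(a), z_normalize(b))

# Distance computed from the Pearson correlation
rho = pearson(a, b)
d_from_rho = math.sqrt(2 * m * (1 - rho))

print(d_direct, d_from_rho)
```

The two printed values should agree up to floating-point rounding.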

Now, we take the derivative of the function $f(\rho) = \sqrt{2m (1 - \rho)}$ with respect to the variable $\rho$:

$\frac{df}{d\rho} = \sqrt{2m}\frac{-1}{2\sqrt{1 - \rho}}$

Now, let us assume $d$ is the precise z-normalized distance between two subsequences, and $\hat{d}$ is the z-normalized distance with imprecision. Similarly, $\rho$ and $\hat{\rho}$ are the Pearson correlation values without and with imprecision, respectively.

$d  - \hat{d} = \Delta{d}$

$\rho  - \hat{\rho} = \Delta{\rho}$

So, if we assume that $\Delta{d}$ and $\Delta{\rho}$ are both small, their ratio is approximated by the derivative above:


$$
\begin{align}
\frac{\Delta{d}}{\Delta{\rho}} \approx{}&
\frac{-\sqrt{2m}}{2\sqrt{1 - \rho}}
\end{align}
$$


Hence:


$$
\begin{align}
\Delta{d} \approx{}&
\frac{-\sqrt{2m}\Delta{\rho}}{2\sqrt{1 - \rho}}
\end{align}
$$
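To see how good this first-order approximation is, the sketch below compares it against the exact difference $d - \hat{d}$. The values of `m`, `rho`, and `delta_rho` are arbitrary assumptions chosen for illustration, and the function names are hypothetical.

```python
import math


def znorm_dist(rho, m):
    """Z-normalized distance as a function of the Pearson correlation."""
    return math.sqrt(2 * m * (1 - rho))


def delta_d_first_order(rho, delta_rho, m):
    """First-order estimate of d - d_hat, given delta_rho = rho - rho_hat."""
    return -math.sqrt(2 * m) * delta_rho / (2 * math.sqrt(1 - rho))


m = 50
rho = 0.3
delta_rho = 1e-6          # rho - rho_hat (assumed small)
rho_hat = rho - delta_rho

exact = znorm_dist(rho, m) - znorm_dist(rho_hat, m)
approx = delta_d_first_order(rho, delta_rho, m)

print(exact, approx)
```

Note that the factor $\frac{1}{\sqrt{1 - \rho}}$ grows as $\rho$ approaches 1, so the same $\Delta{\rho}$ produces a larger $\Delta{d}$ for highly correlated subsequences.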


Let's say we want a small error in the z-normalized distance, i.e. $|\Delta{d}| \le \epsilon_{d}$. (In STUMPY, $\epsilon_{d}$ is set by the config variable `STUMPY_TEST_PRECISION`, which defaults to 1e-5.)


$$
\begin{align}
|\Delta{d}| \le \epsilon_{d}
\end{align}
$$



$$
\begin{align}
\left|\frac{-\sqrt{2m}\Delta{\rho}}{2\sqrt{1 - \rho}}\right| \le \epsilon_{d}
\end{align}
$$


Also, note that:
$\Delta{\rho} = \rho - \hat{\rho} = \frac{cov}{s} - \frac{\hat{cov}}{s} = \frac{\Delta{cov}}{s}$, **where `s` is the product of the standard deviations of the two subsequences** for which the covariance is being calculated, and $\Delta{cov} = cov - \hat{cov}$.

Note that we assumed the standard deviation has no imprecision.


$$
\begin{align}
\left|\frac{\sqrt{2m}\Delta{cov}}{2s\sqrt{1 - \rho}}\right| \le \epsilon_{d}
\end{align}
$$


Hence:


$$
\begin{align}
\left|\frac{2s\sqrt{1 - \rho}}{\sqrt{2m}\Delta{cov}}\right| \ge \frac{1}{\epsilon_{d}}
\end{align}
$$


$$
\begin{align}
s \ge \frac{\sqrt{2m}|\Delta{cov}|}{2\epsilon_{d}\sqrt{1 - \rho}}
\end{align}
$$


So, if we have no imprecision in `cov`, i.e. $\Delta{cov}=0$, then the condition becomes $s \ge 0$, which is always satisfied. But if we have imprecision in `cov`, i.e. $\Delta{cov} \ne 0$, then `s` must satisfy the inequality above. Otherwise, the imprecision in the distance, $|\Delta{d}|$, will exceed $\epsilon_{d}$ (and hence cause an error in the STUMPY test suite).
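The bound on `s` can be evaluated directly. The sketch below is only an illustration of the inequality derived above: the function name `min_std_product` and the example values are hypothetical, and the default `eps_d=1e-5` mirrors the precision mentioned earlier in this notebook.

```python
import math


def min_std_product(m, delta_cov, rho, eps_d=1e-5):
    """Smallest s (product of the two standard deviations) for which the
    first-order estimate of |delta_d| stays within eps_d."""
    if rho >= 1:
        # The bound is undefined at rho = 1 (division by zero)
        raise ValueError("rho must be strictly less than 1")
    return math.sqrt(2 * m) * abs(delta_cov) / (2 * eps_d * math.sqrt(1 - rho))


# Hypothetical example: m = 50, covariance error of 1e-12, rho = 0.75
s_min = min_std_product(m=50, delta_cov=1e-12, rho=0.75)
print(s_min)  # approximately 1e-6

# With an exact covariance, the bound is 0 and any s satisfies it
print(min_std_product(m=50, delta_cov=0.0, rho=0.75))
```

In this example, a pair of subsequences whose standard deviations multiply to less than roughly 1e-6 would be expected to exceed the distance tolerance.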

# Approach II:

Let us assume $d$ is the precise z-normalized distance between two subsequences, and $\hat{d}$ is the z-normalized distance with imprecision.


$$
\begin{align}
d - \hat{d} ={}& \Delta{d}
\\
\sqrt{2m(1-\rho)} - \sqrt{2m(1 - \hat{\rho})} ={}& \Delta{d}
\\
\sqrt{1-\rho} - \sqrt{1 - \hat{\rho}} ={}& \frac{\Delta{d}}{\sqrt{2m}}
\end{align}
$$

Also, recall that $\rho = \frac{cov}{stddev}$, where `cov` is the covariance between the two subsequences and `stddev` is the product of the standard deviations of the two subsequences. Hereafter, this product is denoted as `s`. We also assume `s` is exact and has no imprecision.

$\rho = \frac{cov}{s}$

$\hat{\rho} = \frac{\hat{cov}}{s}$


Hence:

$\rho - \hat{\rho} = \frac{\Delta{cov}}{s}$

Now, let us solve the following equations:


$$
\begin{align}
   \begin{cases}
      \sqrt{1-\rho} - \sqrt{1 - \hat{\rho}} ={}& \frac{\Delta{d}}{\sqrt{2m}}
      \\
      \rho - \hat{\rho} ={}& \frac{\Delta{cov}}{s}
    \end{cases}
\end{align}
$$

We define new variables:

$\rho' \triangleq 1 - \rho$; note that $\rho'$ is in range `[0, 2]`

$\hat{\rho}' \triangleq 1 - \hat{\rho}$; note that $\hat{\rho}'$ is in range `[0, 2]`

Hence, we now have:


$$
\begin{align}
   \begin{cases}
      \sqrt{\rho'} - \sqrt{\hat{\rho}'} ={}& \frac{\Delta{d}}{\sqrt{2m}}
      \\
      \hat{\rho}' - \rho' ={}& \frac{\Delta{cov}}{s}
    \end{cases}
\end{align}
$$

And, we define new variables:

$x \triangleq \sqrt{\rho'}$, note that x is in range $[0, \sqrt{2}]$

$y \triangleq \sqrt{\hat{\rho}'}$, note that y is in range $[0, \sqrt{2}]$

Hence:


$$
\begin{align}
   \begin{cases}
      x - y ={}& \frac{\Delta{d}}{\sqrt{2m}}
      \\
      y^{2} - x^{2} ={}& \frac{\Delta{cov}}{s}
    \end{cases}
\end{align}
$$


Thus,


$$
\begin{align}
   \begin{cases}
      x - y ={}& \frac{\Delta{d}}{\sqrt{2m}}
      \\
      (y - x)(y + x) ={}& \frac{\Delta{cov}}{s}
    \end{cases}
\end{align}
$$


By dividing the second equation by the first equation (assuming $\Delta{d} \ne 0$), we obtain the equation for `y + x`. Hence:


$$
\begin{align}
   \begin{cases}
      x - y ={}& \frac{\Delta{d}}{\sqrt{2m}}
      \\
      y + x ={}& -\frac{\Delta{cov}\sqrt{2m}}{s\Delta{d}}
    \end{cases}
\end{align}
$$


Now, we can solve this system of equations for `x` and `y`. Hence:


$$
\begin{align}
   \begin{cases}
      x  ={}& \frac{1}{2}\left(
      \frac{\Delta{d}}{\sqrt{2m}}
      -
      \frac{\Delta{cov}\sqrt{2m}}{s\Delta{d}}
      \right)
      \\
      y  ={}& -\frac{1}{2}\left(
      \frac{\Delta{cov}\sqrt{2m}}{s\Delta{d}}
      +
      \frac{\Delta{d}}{\sqrt{2m}}
      \right)
    \end{cases}
\end{align}
$$
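As a sanity check on the algebra, the closed forms for `x` and `y` can be compared against their definitions $x = \sqrt{1 - \rho}$ and $y = \sqrt{1 - \hat{\rho}}$. This is a minimal sketch with hypothetical values for `m`, `s`, `cov`, and `cov_hat`; the perturbation is deliberately exaggerated, since these expressions are exact rather than first-order approximations.

```python
import math

m = 50
s = 2.0               # product of the two standard deviations (assumed exact)
cov = 0.6             # precise covariance (hypothetical)
cov_hat = 0.6 - 1e-3  # imprecise covariance (hypothetical, exaggerated error)

rho = cov / s
rho_hat = cov_hat / s
delta_cov = cov - cov_hat

d = math.sqrt(2 * m * (1 - rho))
d_hat = math.sqrt(2 * m * (1 - rho_hat))
delta_d = d - d_hat

# Values from the definitions of x and y
x_direct = math.sqrt(1 - rho)
y_direct = math.sqrt(1 - rho_hat)

# Values from the solved system of equations
root = math.sqrt(2 * m)
ratio = delta_cov * root / (s * delta_d)
x_formula = 0.5 * (delta_d / root - ratio)
y_formula = -0.5 * (ratio + delta_d / root)

print(x_direct, x_formula)
print(y_direct, y_formula)
```

Each printed pair should match up to floating-point rounding.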


**What now? :)**