In this notebook, we will explore the similarity measures between time series. We will use the following similarity measures:
- Euclidean distance 
   The Euclidean distance between two time series is the square root of the sum of the squared differences between the two time series. It is formulated as follows:
    $$d_{euclidean}(x,y) = \sqrt{\sum_{i=1}^{n}(x_i - y_i)^2}$$

    We use the following function to calculate the Euclidean distance between two time series:
    ```python
    def euclidean_distance(x, y):
        return np.sqrt(np.sum((x - y)**2))
    ```
    The Euclidean distance is simple and fast to compute, but it requires both time series to have the same length and it is sensitive to the magnitude of the time series. For example, if both time series are multiplied by a factor of 10, the Euclidean distance also increases by a factor of 10, even though the shapes are unchanged. To reduce this effect, we can divide the squared differences by the sum of squares of the first series. The normalized Euclidean distance is formulated as follows:
    $$d_{norm}(x,y) = \sqrt{\sum_{i=1}^{n}\frac{(x_i - y_i)^2}{\sum_{j=1}^{n}x_j^2}}$$

    Note that the normalization uses only $x$, so $d_{norm}(x,y) \neq d_{norm}(y,x)$ in general. We use the following function to calculate the normalized Euclidean distance between two time series:
    ```python
    def normalized_euclidean_distance(x, y):
        return np.sqrt(np.sum(((x - y)**2) / np.sum(x**2)))
    ```

In [1]:
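import numpy as np
from typing import Callable, Tuple  # used in the type hints below
from scipy.stats import rankdata  # used by the Spearman correlation below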

In [2]:
def euclidean_distance(x, y):
        return np.sqrt(np.sum((x - y)**2))


def normalized_euclidean_distance(x, y):
        return np.sqrt(np.sum(((x - y)**2) / np.sum(x**2)))
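
As a quick check of the scaling behavior described above, the following sketch multiplies two series by 10 and compares both distances. The arrays `x` and `y` are made-up illustrative values, not data from this notebook, and the two functions are repeated only so that the cell runs on its own.

```python
import numpy as np


def euclidean_distance(x, y):
    return np.sqrt(np.sum((x - y) ** 2))


def normalized_euclidean_distance(x, y):
    return np.sqrt(np.sum(((x - y) ** 2) / np.sum(x ** 2)))


# Illustrative series (assumed values).
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.5, 2.5, 2.0, 4.0])

d = euclidean_distance(x, y)
d_scaled = euclidean_distance(10 * x, 10 * y)
nd = normalized_euclidean_distance(x, y)
nd_scaled = normalized_euclidean_distance(10 * x, 10 * y)

print(d_scaled / d)    # ratio of the plain distances
print(nd_scaled / nd)  # ratio of the normalized distances
```

The plain distance grows with the scale factor, while the normalized distance stays the same.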
    

# Dynamic Time Warping (DTW) 

Dynamic Time Warping (DTW) is a measure of similarity between two time series that may vary in time or speed, and that may have different lengths $n$ and $m$. DTW is formulated as follows:
     $$d_{dtw}(x,y) = \min_{\pi} \sum_{(i,j) \in \pi} d(x_i, y_j)$$
    
where $\pi$ ranges over all warping paths, that is, sequences of index pairs $(i,j)$ that start at $(1,1)$, end at $(n,m)$, and advance by one step in $i$, in $j$, or in both. $d_{dtw}$ is the total cost along the optimal (minimizing) warping path. We calculate the DTW as follows: 
1. Calculate the distance between each point in the first time series and each point in the second time series. This is called the cost matrix.
2. Calculate the cumulative cost matrix. Each entry $(i,j)$ holds the cost of the cheapest warping path from the top left corner to $(i,j)$.
3. Find the optimal warping path by backtracking from the bottom right corner to the top left corner, always moving to the neighboring cell with the smallest cumulative cost.

The distance between two points in the time series is calculated using the Euclidean distance. We use the following function to calculate the DTW distance between two time series:

```python
def dtw(
    x: np.ndarray,
    y: np.ndarray,
    dist: Callable[[np.ndarray, np.ndarray], float] = euclidean_distance,
) -> Tuple[float, np.ndarray]:
    """Calculate the Dynamic Time Warping (DTW) of two time series.

    Args:
        x (np.ndarray): The first time series.
        y (np.ndarray): The second time series.
        dist (Callable[[np.ndarray, np.ndarray], float], optional): The distance function. Defaults to euclidean_distance.

    Returns:
        Tuple[float, np.ndarray]: The DTW distance and the optimal warping path.
    """
    n, m = len(x), len(y)

    # Calculate the cost matrix.
    cost_matrix = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            cost_matrix[i, j] = dist(x[i], y[j])

    # Calculate the cumulative cost matrix.
    cumulative_cost_matrix = np.zeros((n, m))
    cumulative_cost_matrix[0, 0] = cost_matrix[0, 0]
    for i in range(1, n):
        cumulative_cost_matrix[i, 0] = cumulative_cost_matrix[i - 1, 0] + cost_matrix[i, 0]
    for j in range(1, m):
        cumulative_cost_matrix[0, j] = cumulative_cost_matrix[0, j - 1] + cost_matrix[0, j]
    for i in range(1, n):
        for j in range(1, m):
            cumulative_cost_matrix[i, j] = cost_matrix[i, j] + min(
                cumulative_cost_matrix[i - 1, j],
                cumulative_cost_matrix[i, j - 1],
                cumulative_cost_matrix[i - 1, j - 1],
            )

    # Find the optimal warping path by backtracking from (n - 1, m - 1) to (0, 0).
    i, j = n - 1, m - 1
    optimal_warping_path = [(i, j)]
    while i > 0 or j > 0:
        if i == 0:
            j -= 1
        elif j == 0:
            i -= 1
        else:
            # Step to the cheapest of the three predecessor cells.
            candidates = [
                (cumulative_cost_matrix[i - 1, j - 1], i - 1, j - 1),
                (cumulative_cost_matrix[i - 1, j], i - 1, j),
                (cumulative_cost_matrix[i, j - 1], i, j - 1),
            ]
            _, i, j = min(candidates)
        optimal_warping_path.append((i, j))

    optimal_warping_path.reverse()
    return cumulative_cost_matrix[-1, -1], np.array(optimal_warping_path)
```
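
To illustrate why warping helps, the sketch below compares the Euclidean distance and a DTW cost for two series that have the same shape but are shifted by one step. It uses a compact, hypothetical helper `dtw_cost` with the absolute difference as point distance; it is an independent minimal version written for this example, and the series are made-up values.

```python
import numpy as np


def dtw_cost(x, y):
    """Cumulative DTW cost with |x_i - y_j| as the point distance."""
    n, m = len(x), len(y)
    # Pad with infinity so the borders need no special handling.
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            acc[i, j] = cost + min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
    return acc[n, m]


# The same bump, shifted by one time step (assumed values).
x = np.array([0.0, 0.0, 1.0, 2.0, 1.0, 0.0])
y = np.array([0.0, 1.0, 2.0, 1.0, 0.0, 0.0])

euclid = np.sqrt(np.sum((x - y) ** 2))
warped = dtw_cost(x, y)

print(euclid)  # point-by-point comparison penalizes the shift
print(warped)  # warping aligns the two bumps
```

The Euclidean distance compares points at the same index and therefore reports a large difference, whereas DTW can align the shifted bump at no cost.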
         
If only the distance is needed, we can discard the warping path. We use the following function to calculate the DTW distance between two time series:

```python
def dtw_distance(x, y):
    return dtw(x, y, dist=euclidean_distance)[0]
```

Like the Euclidean distance, the DTW distance is sensitive to the magnitude of the time series. For example, if both time series are multiplied by a factor of 10, the DTW distance also increases by a factor of 10. To reduce this effect, we can divide the DTW distance by the Euclidean norm of the first series, in analogy to the normalized Euclidean distance. The normalized DTW distance is formulated as follows:
     $$d_{ndtw}(x,y) = \frac{d_{dtw}(x,y)}{\sqrt{\sum_{i=1}^{n}x_i^2}}$$
    
We use the following function to calculate the normalized DTW distance between two time series:

```python
def normalized_dtw_distance(x, y):
    # Dividing once by the norm of x avoids a per-point division by x[i],
    # which fails whenever x contains a zero.
    return dtw(x, y, dist=euclidean_distance)[0] / np.sqrt(np.sum(x**2))
```

# Longest Common Subsequence (LCSS) 
Longest Common Subsequence (LCSS) is a measure of similarity between two time series that may vary in time or speed. It counts the largest number of points that can be matched in order, where two points match if their distance is at most a threshold $\epsilon$. Let $L(i,j)$ be the length of the longest common subsequence of the first $i$ points of $x$ and the first $j$ points of $y$:
    $$L(i,j) = \begin{cases} 0 & \text{if } i = 0 \text{ or } j = 0 \\ 1 + L(i-1,j-1) & \text{if } d(x_i,y_j) \leq \epsilon \\ \max\{L(i-1,j),\, L(i,j-1)\} & \text{otherwise} \end{cases}$$

The LCSS distance is formulated as follows:
    $$d_{lcss}(x,y) = 1 - \frac{L(n,m)}{\max\{n,m\}}$$
    
where $n$ and $m$ are the lengths of $x$ and $y$, and $\epsilon$ is the threshold distance. We calculate the LCSS as follows: 
1. Calculate the distance between each point in the first time series and each point in the second time series. This is called the cost matrix.
2. Calculate the length matrix $L$ with the recursion above.
3. Find the matched pairs of points by backtracking from the bottom right corner of the length matrix.

The distance between two points in the time series is calculated using the Euclidean distance. We use the following function to calculate the LCSS distance between two time series:

```python
def lcss(
    x: np.ndarray,
    y: np.ndarray,
    dist: Callable[[np.ndarray, np.ndarray], float] = euclidean_distance,
    epsilon: float = 0.1,
) -> Tuple[float, np.ndarray]:
    """Calculate the Longest Common Subsequence (LCSS) of two time series.

    Args:
        x (np.ndarray): The first time series.
        y (np.ndarray): The second time series.
        dist (Callable[[np.ndarray, np.ndarray], float], optional): The distance function. Defaults to euclidean_distance.
        epsilon (float, optional): The threshold distance. Defaults to 0.1.

    Returns:
        Tuple[float, np.ndarray]: The LCSS distance and the matched index pairs.
    """
    n, m = len(x), len(y)

    # Calculate the cost matrix.
    cost_matrix = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            cost_matrix[i, j] = dist(x[i], y[j])

    # Calculate the length matrix; row 0 and column 0 stand for empty prefixes.
    length_matrix = np.zeros((n + 1, m + 1), dtype=int)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if cost_matrix[i - 1, j - 1] <= epsilon:
                length_matrix[i, j] = length_matrix[i - 1, j - 1] + 1
            else:
                length_matrix[i, j] = max(length_matrix[i - 1, j], length_matrix[i, j - 1])

    # Backtrack to recover the matched index pairs.
    i, j = n, m
    matched_pairs = []
    while i > 0 and j > 0:
        if cost_matrix[i - 1, j - 1] <= epsilon:
            matched_pairs.append((i - 1, j - 1))
            i -= 1
            j -= 1
        elif length_matrix[i - 1, j] >= length_matrix[i, j - 1]:
            i -= 1
        else:
            j -= 1
    matched_pairs.reverse()

    # Convert the match count into a distance in [0, 1].
    distance = 1.0 - length_matrix[n, m] / max(n, m)
    return distance, np.array(matched_pairs)
```

If only the distance is needed, we can discard the second return value. We use the following function to calculate the LCSS distance between two time series:

```python
def lcss_distance(x, y, epsilon=0.1):
    return lcss(x, y, dist=euclidean_distance, epsilon=epsilon)[0]
```
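
The effect of the threshold can be seen on a small example. The sketch below uses a hypothetical helper `lcss_length` (a minimal dynamic program with the absolute difference as point distance, written only for this illustration) and made-up series in which `y` equals `x` with one outlier inserted.

```python
import numpy as np


def lcss_length(x, y, epsilon):
    """Length of the longest common subsequence under threshold epsilon."""
    n, m = len(x), len(y)
    table = np.zeros((n + 1, m + 1), dtype=int)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if abs(x[i - 1] - y[j - 1]) <= epsilon:
                # The two points match, so extend the common subsequence.
                table[i, j] = table[i - 1, j - 1] + 1
            else:
                # Otherwise skip a point in one of the two series.
                table[i, j] = max(table[i - 1, j], table[i, j - 1])
    return int(table[n, m])


x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 2.0, 9.0, 3.0, 4.0])  # x with one outlier inserted

length = lcss_length(x, y, epsilon=0.1)
similarity = length / max(len(x), len(y))

print(length)      # number of matched points
print(similarity)  # match count divided by the longer length
```

The outlier is simply skipped, so all four points of `x` are still matched. This tolerance to outliers is the main practical difference from DTW, where every point must be aligned to some point of the other series.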



# Fréchet distance 
The Fréchet distance is a measure of similarity between two time series that may vary in time or speed. Whereas DTW sums the distances along the warping path, the Fréchet distance takes the largest distance along the path. For sampled time series we use the discrete Fréchet distance, which is formulated as follows:
    $$d_{frechet}(x,y) = \min_{\pi} \max_{(i,j) \in \pi} d(x_i, y_j)$$
    
where $\pi$ ranges over all warping paths from $(1,1)$ to $(n,m)$. We calculate the Fréchet distance as follows:
1. Calculate the distance between each point in the first time series and each point in the second time series. This is called the cost matrix.
2. Calculate the cumulative cost matrix. Each entry $(i,j)$ holds the smallest achievable maximum cost over all warping paths from the top left corner to $(i,j)$.
3. Find the optimal warping path by backtracking from the bottom right corner to the top left corner, always moving to the neighboring cell with the smallest cumulative cost.

The distance between two points in the time series is calculated using the Euclidean distance. We use the following function to calculate the Fréchet distance between two time series:

```python
def frechet(
    x: np.ndarray,
    y: np.ndarray,
    dist: Callable[[np.ndarray, np.ndarray], float] = euclidean_distance,
) -> Tuple[float, np.ndarray]:
    """Calculate the discrete Fréchet distance of two time series.

    Args:
        x (np.ndarray): The first time series.
        y (np.ndarray): The second time series.
        dist (Callable[[np.ndarray, np.ndarray], float], optional): The distance function. Defaults to euclidean_distance.

    Returns:
        Tuple[float, np.ndarray]: The Fréchet distance and the optimal warping path.
    """
    n, m = len(x), len(y)

    # Calculate the cost matrix.
    cost_matrix = np.zeros((n, m))
    for i in range(n):
        for j in range(m):
            cost_matrix[i, j] = dist(x[i], y[j])

    # Calculate the cumulative cost matrix: the smallest achievable
    # maximum cost over all warping paths ending at each cell.
    cumulative_cost_matrix = np.zeros((n, m))
    cumulative_cost_matrix[0, 0] = cost_matrix[0, 0]
    for i in range(1, n):
        cumulative_cost_matrix[i, 0] = max(cumulative_cost_matrix[i - 1, 0], cost_matrix[i, 0])
    for j in range(1, m):
        cumulative_cost_matrix[0, j] = max(cumulative_cost_matrix[0, j - 1], cost_matrix[0, j])
    for i in range(1, n):
        for j in range(1, m):
            cumulative_cost_matrix[i, j] = max(
                cost_matrix[i, j],
                min(
                    cumulative_cost_matrix[i - 1, j],
                    cumulative_cost_matrix[i, j - 1],
                    cumulative_cost_matrix[i - 1, j - 1],
                ),
            )

    # Find the optimal warping path by backtracking from (n - 1, m - 1) to (0, 0).
    i, j = n - 1, m - 1
    optimal_warping_path = [(i, j)]
    while i > 0 or j > 0:
        if i == 0:
            j -= 1
        elif j == 0:
            i -= 1
        else:
            # Step to the cheapest of the three predecessor cells.
            candidates = [
                (cumulative_cost_matrix[i - 1, j - 1], i - 1, j - 1),
                (cumulative_cost_matrix[i - 1, j], i - 1, j),
                (cumulative_cost_matrix[i, j - 1], i, j - 1),
            ]
            _, i, j = min(candidates)
        optimal_warping_path.append((i, j))

    optimal_warping_path.reverse()
    return cumulative_cost_matrix[-1, -1], np.array(optimal_warping_path)
```
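
The min-max structure of the formula can be checked on two small cases. The sketch below uses a hypothetical helper `discrete_frechet` (a compact dynamic program with the absolute difference as point distance, written only for this example) and made-up series.

```python
import numpy as np


def discrete_frechet(x, y):
    """Discrete Fréchet distance with |x_i - y_j| as the point distance."""
    n, m = len(x), len(y)
    # Pad with infinity so the borders need no special handling.
    acc = np.full((n + 1, m + 1), np.inf)
    acc[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(x[i - 1] - y[j - 1])
            best_previous = min(acc[i - 1, j], acc[i, j - 1], acc[i - 1, j - 1])
            # A path is only as good as its worst step.
            acc[i, j] = max(cost, best_previous)
    return acc[n, m]


# A constant offset of 1: the worst step along the best path is 1.
offset = discrete_frechet(np.array([0.0, 1.0, 2.0, 3.0]), np.array([1.0, 2.0, 3.0, 4.0]))

# A single outlier at the end dominates the result.
outlier = discrete_frechet(np.array([0.0, 1.0, 2.0]), np.array([0.0, 1.0, 5.0]))

print(offset)
print(outlier)
```

Because only the largest step counts, a single outlier determines the whole distance, which makes the Fréchet distance more sensitive to isolated spikes than DTW.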




# Hausdorff distance 
The Hausdorff distance is a measure of similarity between two time series viewed as sets of points. Unlike DTW and the Fréchet distance, it ignores the order of the points: it is the largest distance from any point in one series to its nearest point in the other series. The Hausdorff distance is formulated as follows:
    $$d_{hausdorff}(x,y) = \max\left\{\max_{i}\min_{j} d(x_i,y_j),\ \max_{j}\min_{i} d(x_i,y_j)\right\}$$
    
We calculate the Hausdorff distance as follows:
1. Calculate the distance between each point in the first time series and each point in the second time series. This is called the cost matrix.
2. For each point in the first time series, find the distance to its nearest point in the second time series, and take the maximum of these distances. This is the directed distance from $x$ to $y$.
3. Repeat step 2 with the roles of the two time series swapped, and take the larger of the two directed distances.

The distance between two points in the time series is calculated using the Euclidean distance. We use the following function to calculate the Hausdorff distance between two time series:

```python
def hausdorff(
    x: np.ndarray,
    y: np.ndarray,
    dist: Callable[[np.ndarray, np.ndarray], float] = euclidean_distance,
) -> float:
    """Calculate the Hausdorff distance of two time series.

    Args:
        x (np.ndarray): The first time series.
        y (np.ndarray): The second time series.
        dist (Callable[[np.ndarray, np.ndarray], float], optional): The distance function. Defaults to euclidean_distance.

    Returns:
        float: The Hausdorff distance.
    """
    # Calculate the cost matrix.
    cost_matrix = np.zeros((len(x), len(y)))
    for i in range(len(x)):
        for j in range(len(y)):
            cost_matrix[i, j] = dist(x[i], y[j])

    # Directed distances: the farthest nearest neighbor in each direction.
    x_to_y = cost_matrix.min(axis=1).max()
    y_to_x = cost_matrix.min(axis=0).max()
    return float(max(x_to_y, y_to_x))
```
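
For one-dimensional series the same quantity can be computed without loops. The following is a minimal sketch that assumes scalar samples and the absolute difference as point distance; the helper name `hausdorff_1d` and the example series are made up for illustration.

```python
import numpy as np


def hausdorff_1d(x, y):
    """Hausdorff distance between two 1-D series, treated as point sets."""
    # Pairwise |x_i - y_j| via broadcasting: shape (len(x), len(y)).
    pairwise = np.abs(x[:, None] - y[None, :])
    x_to_y = pairwise.min(axis=1).max()  # farthest point of x from y
    y_to_x = pairwise.min(axis=0).max()  # farthest point of y from x
    return float(max(x_to_y, y_to_x))


# Reversing a series does not change its set of values.
reversed_case = hausdorff_1d(np.array([0.0, 1.0, 2.0]), np.array([2.0, 1.0, 0.0]))

# The value 5 has no close counterpart in the first series.
outlier_case = hausdorff_1d(np.array([0.0, 1.0]), np.array([0.0, 5.0]))

print(reversed_case)
print(outlier_case)
```

The first case shows that temporal order plays no role: a series and its reversal are at distance zero, which is a reason to prefer DTW or the Fréchet distance when the ordering matters.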


# Correlation distance
## Pearson correlation coefficient
The Pearson correlation coefficient is a measure of the linear correlation between two time series of equal length. It ranges from $-1$ to $1$ and is formulated as follows:
    $$r^{pearson}_{x,y} = \frac{\sum_{i=1}^{n}(x_{i} - \bar{x})(y_{i} - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_{i} - \bar{x})^{2}}\sqrt{\sum_{i=1}^{n}(y_{i} - \bar{y})^{2}}}$$
    
where $\bar{x}$ and $\bar{y}$ are the means of the time series $x$ and $y$, respectively. We calculate the Pearson correlation coefficient as follows:
    
```python
    def pearson_correlation_coefficient(
        x: np.ndarray, 
        y: np.ndarray
    ) -> float:
        """Calculate the Pearson correlation coefficient of two time series.
        Args:
            x (np.ndarray): The first time series.
            y (np.ndarray): The second time series.

        Returns:
            float: The Pearson correlation coefficient.
        """
        # Calculate the mean of the time series.
        x_mean = np.mean(x)
        y_mean = np.mean(y)

        # Calculate the numerator of the Pearson correlation coefficient.
        numerator = np.sum((x - x_mean) * (y - y_mean))

        # Calculate the denominator of the Pearson correlation coefficient.
        denominator = np.sqrt(np.sum((x - x_mean) ** 2)) * np.sqrt(np.sum((y - y_mean) ** 2))

        # Calculate the Pearson correlation coefficient.
        return numerator / denominator
```
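
As a sanity check, the formula can be compared with NumPy's built-in `np.corrcoef`. The sketch below restates the formula in a short helper named `pearson` (a name chosen for this example) and uses made-up series.

```python
import numpy as np


def pearson(x, y):
    """Pearson correlation coefficient, written directly from the formula."""
    xc = x - np.mean(x)
    yc = y - np.mean(y)
    return np.sum(xc * yc) / (np.sqrt(np.sum(xc ** 2)) * np.sqrt(np.sum(yc ** 2)))


x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 1.0, 4.0, 3.0, 5.0])

r_manual = pearson(x, y)
r_numpy = np.corrcoef(x, y)[0, 1]  # off-diagonal entry of the 2x2 matrix
r_linear = pearson(x, 2 * x + 1)   # an exact linear relation

print(r_manual, r_numpy)
print(r_linear)
```

Both computations agree, and an exact increasing linear relation gives a coefficient of 1 regardless of slope and offset.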

## Spearman correlation coefficient
The Spearman correlation coefficient is a measure of the monotonic correlation between two time series. It is the Pearson correlation coefficient of the ranks and is formulated as follows:
    $$r^{spearman}_{x,y} = \frac{\sum_{i=1}^{n}(r_{x}(x_{i}) - \bar{r_{x}})(r_{y}(y_{i}) - \bar{r_{y}})}{\sqrt{\sum_{i=1}^{n}(r_{x}(x_{i}) - \bar{r_{x}})^{2}}\sqrt{\sum_{i=1}^{n}(r_{y}(y_{i}) - \bar{r_{y}})^{2}}}$$
    
where $r_{x}(x_{i})$ is the rank of $x_{i}$ within the time series $x$, and $\bar{r_{x}}$ and $\bar{r_{y}}$ are the mean ranks of $x$ and $y$, respectively. We calculate the Spearman correlation coefficient as follows:

```python
    def spearman_correlation_coefficient(
        x: np.ndarray, 
        y: np.ndarray
    ) -> float:
        """Calculate the Spearman correlation coefficient of two time series.
        Args:
            x (np.ndarray): The first time series.
            y (np.ndarray): The second time series.

        Returns:
            float: The Spearman correlation coefficient.
        """
        # Calculate the rank of the time series.
        x_rank = rankdata(x)
        y_rank = rankdata(y)

        # Calculate the mean of the time series.
        x_mean = np.mean(x_rank)
        y_mean = np.mean(y_rank)

        # Calculate the numerator of the Spearman correlation coefficient.
        numerator = np.sum((x_rank - x_mean) * (y_rank - y_mean))

        # Calculate the denominator of the Spearman correlation coefficient.
        denominator = np.sqrt(np.sum((x_rank - x_mean) ** 2)) * np.sqrt(np.sum((y_rank - y_mean) ** 2))

        # Calculate the Spearman correlation coefficient.
        return numerator / denominator
```
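
The difference between linear and monotonic correlation shows up on a nonlinear but strictly increasing relation. The sketch below is a simplified version that assumes the series contain no ties, so ranks can be obtained with a double `argsort`; the helper names and the example series are made up for illustration.

```python
import numpy as np


def ranks_without_ties(values):
    """Ranks 1..n of the values; valid only if all values are distinct."""
    return np.argsort(np.argsort(values)) + 1


def spearman_without_ties(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    return np.corrcoef(ranks_without_ties(x), ranks_without_ties(y))[0, 1]


x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = x ** 3  # strictly increasing, but not linear

pearson_r = np.corrcoef(x, y)[0, 1]
spearman_r = spearman_without_ties(x, y)

print(pearson_r)   # below 1 because the relation is not linear
print(spearman_r)  # 1 because the ordering is preserved exactly
```

For data with ties, a rank function that assigns average ranks is needed instead of the double `argsort` shortcut.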

## Kendall correlation coefficient
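The Kendall correlation coefficient (Kendall's $\tau$) measures how often pairs of points are ordered the same way in both time series. Without a correction for ties, it is formulated as follows:
    $$r^{kendall}_{x,y} = \frac{2\sum_{i=1}^{n}\sum_{j=i+1}^{n}\operatorname{sgn}(x_{i} - x_{j})\operatorname{sgn}(y_{i} - y_{j})}{n(n - 1)}$$
    
where $\operatorname{sgn}$ is the sign function. We calculate the Kendall correlation coefficient as follows: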

```python
    def kendall_correlation_coefficient(
        x: np.ndarray, 
        y: np.ndarray
    ) -> float:
        """Calculate the Kendall correlation coefficient of two time series.
        Args:
            x (np.ndarray): The first time series.
            y (np.ndarray): The second time series.

        Returns:
            float: The Kendall correlation coefficient.
        """
        # Calculate the sign of the difference between each pair of points in the time series.
        x_sign = np.sign(np.subtract.outer(x, x))
        y_sign = np.sign(np.subtract.outer(y, y))

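        # Calculate the numerator of the Kendall correlation coefficient.
        # The outer differences contain every pair twice, as (i, j) and (j, i),
        # so the full sum already equals 2 * (sum over i < j).
        numerator = np.sum(x_sign * y_sign)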

        # Calculate the denominator of the Kendall correlation coefficient.
        denominator = len(x) * (len(x) - 1)

        # Calculate the Kendall correlation coefficient.
        return numerator / denominator
```
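
The formula can be verified by counting pairs explicitly. The sketch below is a brute-force version that loops over all pairs with $i < j$; the helper name `kendall_by_pairs` and the example series are made up for illustration.

```python
from itertools import combinations

import numpy as np


def kendall_by_pairs(x, y):
    """Kendall's tau (no tie correction) by explicit pair counting."""
    n = len(x)
    total = 0.0
    for i, j in combinations(range(n), 2):  # every pair with i < j, once
        total += np.sign(x[i] - x[j]) * np.sign(y[i] - y[j])
    return 2 * total / (n * (n - 1))


x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, 3.0, 2.0, 4.0])  # one adjacent pair swapped

tau = kendall_by_pairs(x, y)
print(tau)
```

Of the 6 pairs, 5 are ordered the same way in both series and 1 is reversed, so $\tau = (5 - 1)/6 = 2/3$.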

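### Kendall tau distance
We define the Kendall tau distance as one minus the Kendall correlation coefficient, so that it ranges from 0 (identical ordering) to 2 (reversed ordering). It is formulated as follows:
    $$d^{kendall}_{x,y} = 1 - \frac{2\sum_{i=1}^{n}\sum_{j=i+1}^{n}\operatorname{sgn}(x_{i} - x_{j})\operatorname{sgn}(y_{i} - y_{j})}{n(n - 1)}$$
    
where $\operatorname{sgn}$ is the sign function. We calculate the Kendall tau distance as follows: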

```python
    def kendall_tau_distance(
        x: np.ndarray, 
        y: np.ndarray
    ) -> float:
        """Calculate the Kendall tau distance of two time series.
        Args:
            x (np.ndarray): The first time series.
            y (np.ndarray): The second time series.

        Returns:
            float: The Kendall tau distance.
        """
        return 1 - kendall_correlation_coefficient(x, y)
```


## Kullback-Leibler divergence
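The Kullback-Leibler divergence compares two probability distributions, so $x$ and $y$ must be non-negative and must each sum to 1 (for example, non-negative time series divided by their sums). It is formulated as follows:
    $$d^{kl}_{x,y} = \sum_{i=1}^{n}x_{i}\log\frac{x_{i}}{y_{i}}$$
    
where $x_{i}$ and $y_{i}$ are the $i$-th elements of the time series $x$ and $y$, respectively. Terms with $x_{i} = 0$ contribute 0 by convention, and the divergence is infinite if $y_{i} = 0$ for some $i$ with $x_{i} > 0$. Note that the divergence is not symmetric: $d^{kl}_{x,y} \neq d^{kl}_{y,x}$ in general. We calculate the Kullback-Leibler divergence as follows: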

```python
    def kullback_leibler_divergence(
        x: np.ndarray, 
        y: np.ndarray
    ) -> float:
        """Calculate the Kullback-Leibler divergence of two time series.
        Args:
            x (np.ndarray): The first time series.
            y (np.ndarray): The second time series.

        Returns:
            float: The Kullback-Leibler divergence.
        """
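        # Calculate the Kullback-Leibler divergence.
        # Terms with x_i == 0 contribute 0 by convention instead of nan.
        with np.errstate(divide="ignore", invalid="ignore"):
            terms = np.where(x > 0, x * np.log(x / y), 0.0)
        return np.sum(terms)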
```
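
Since the divergence is defined for probability distributions, a time series has to be normalized first. The sketch below assumes non-negative series and divides each by its sum; the helpers `to_distribution` and `kl` as well as the series are hypothetical and serve only to illustrate the asymmetry.

```python
import numpy as np


def to_distribution(series):
    """Scale a non-negative series so that it sums to 1 (an assumed preprocessing step)."""
    return series / np.sum(series)


def kl(p, q):
    """Kullback-Leibler divergence; terms with p_i == 0 are skipped."""
    mask = p > 0
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))


p = to_distribution(np.array([1.0, 1.0]))  # becomes [0.5, 0.5]
q = to_distribution(np.array([9.0, 1.0]))  # becomes [0.9, 0.1]

forward = kl(p, q)
backward = kl(q, p)

print(forward, backward)  # the two directions differ
print(kl(p, p))           # a distribution has zero divergence from itself
```

The two directions give different values, which is why the divergence is not a distance in the strict sense.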
## Jensen-Shannon divergence
The Jensen-Shannon divergence is formulated as follows:
    $$d^{js}_{x,y} = \frac{1}{2}d^{kl}_{x,\frac{x + y}{2}} + \frac{1}{2}d^{kl}_{y,\frac{x + y}{2}}$$
    
where $d^{kl}_{x,y}$ is the Kullback-Leibler divergence and $\frac{x + y}{2}$ is the pointwise average of the two distributions. The Jensen-Shannon divergence is always greater than or equal to zero and is zero if and only if the two distributions are identical. It differs from the Kullback-Leibler divergence in that it is symmetric and always finite. The divergence itself is not a metric, but its square root is.

We calculate the Jensen-Shannon divergence as follows:

```python
    def jensen_shannon_divergence(
        x: np.ndarray, 
        y: np.ndarray
    ) -> float:
        """Calculate the Jensen-Shannon divergence of two time series.
        Args:
            x (np.ndarray): The first time series.
            y (np.ndarray): The second time series.

        Returns:
            float: The Jensen-Shannon divergence.
        """
        # Calculate the pointwise mixture of the two distributions.
        mixture = (x + y) / 2

        # Calculate the Jensen-Shannon divergence.
        return 0.5 * kullback_leibler_divergence(x, mixture) + 0.5 * kullback_leibler_divergence(y, mixture)
```
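
The symmetry and boundedness can be checked on the extreme case of two distributions with no overlap. The sketch below uses compact helpers `kl` and `js` written only for this example, with natural logarithms, in which case the largest possible value is $\log 2$.

```python
import numpy as np


def kl(p, q):
    """Kullback-Leibler divergence; terms with p_i == 0 are skipped."""
    mask = p > 0
    return np.sum(p[mask] * np.log(p[mask] / q[mask]))


def js(p, q):
    """Jensen-Shannon divergence with the pointwise mixture (p + q) / 2."""
    mixture = (p + q) / 2
    return 0.5 * kl(p, mixture) + 0.5 * kl(q, mixture)


# Two distributions that never put mass on the same element.
p = np.array([1.0, 0.0])
q = np.array([0.0, 1.0])

disjoint = js(p, q)
print(disjoint, np.log(2))  # the maximum is reached
print(js(q, p))             # swapping the arguments changes nothing
print(js(p, p))             # identical distributions give zero
```

The mixture is positive wherever either input is positive, so the zeros that make the Kullback-Leibler divergence infinite cannot occur here.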


- Bhattacharyya distance 
- Hellinger distance 
- Cosine similarity 
- Chebyshev distance 
- Minkowski distance 
- Bray-Curtis distance 
- Canberra distance 


- Jaccard distance 
- Weighted Euclidean distance 
- Weighted DTW 
- Weighted LCSS 
- Weighted Frechet distance 
- Weighted Hausdorff distance 
- Weighted Pearson correlation 
- Weighted Spearman correlation 
- Weighted Kendall correlation 
- Weighted Jensen-Shannon divergence 
- Weighted Kullback-Leibler divergence 
- Weighted Bhattacharyya distance 
- Weighted Hellinger distance 
- Weighted Cosine similarity 
- Weighted Manhattan distance 
- Weighted Chebyshev distance 
- Weighted Minkowski distance 
- Weighted Bray-Curtis distance 
- Weighted Canberra distance 
- Weighted Jaccard distance
