# DAMP: Discord-Aware Matrix Profile

The authors of [DAMP](https://www.cs.ucr.edu/~eamonn/DAMP_long_version.pdf) present a scalable anomaly detection method that can be used in both offline and online modes.

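A subsequence is an anomaly (a discord) if the distance to its (left) nearest neighbor is larger than the distance from any other subsequence to its own (left) nearest neighbor. The paper considers only left nearest neighbors so that it does not miss twin-freak cases, i.e. two anomalies that are similar to each other. If neighbors on both sides were allowed, each twin would find the other as a close match and neither would stand out. Looking only to the left, the first twin has no similar neighbor and is still flagged.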
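The twin-freak effect can be reproduced with a small brute-force sketch. It uses plain NumPy rather than STUMPY, and the helper names (`znorm`, `nn_distances`), the synthetic signal `T_demo`, and the planted ramp are all assumptions made for illustration. Two identical anomalies are planted in a periodic signal; with neighbors on both sides every subsequence has a near-perfect match, whereas the left-only distances expose the first twin.

```python
import numpy as np

def znorm(x):
    """Z-normalize a 1-D array (assumes the array is not constant)."""
    return (x - x.mean()) / x.std()

def nn_distances(T, m, left_only):
    """Brute-force nearest-neighbor distance of every length-m subsequence."""
    l = len(T) - m + 1
    subs = np.array([znorm(T[i : i + m]) for i in range(l)])
    # Pairwise z-normalized Euclidean distances, shape (l, l)
    D = np.linalg.norm(subs[:, None, :] - subs[None, :, :], axis=2)
    i, j = np.indices((l, l))
    D[np.abs(i - j) < m] = np.inf  # ignore overlapping (trivial) matches
    if left_only:
        D[j > i] = np.inf  # keep only neighbors that start before i
    return D.min(axis=1)

m = 20
T_demo = np.sin(2 * np.pi * np.arange(300) / m)  # perfectly periodic signal
ramp = np.linspace(-1, 1, m)
T_demo[100 : 100 + m] = ramp  # first anomaly
T_demo[200 : 200 + m] = ramp  # its identical twin

full = nn_distances(T_demo, m, left_only=False)
left = nn_distances(T_demo, m, left_only=True)

# Subsequences without any left neighbor have distance inf; exclude them
left_finite = np.where(np.isfinite(left), left, -np.inf)
print("largest full NN distance:", full.max())
print("left NN distance at 100 :", left[100])
print("left NN distance at 200 :", left[200])
print("top-1 discord (left)    :", int(np.argmax(left_finite)))
```

In this toy setting the full nearest-neighbor distances are all close to zero, so neither twin would be reported, while the left-only distance is large around index 100 and close to zero around index 200.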

For now, let us consider a time series `T` and say we are interested in finding anomalies in an offline setting. We assume that the first few points of `T` serve only as training data, and we look for anomalies among the remaining points. Hence, for a given `split_index`, the training part is `T[:split_index]` and the anomalies should come from `T[split_index:]`.

# Import libraries

In [20]:
import numpy as np
import matplotlib.pyplot as plt
import stumpy

from stumpy import core

# Core idea

**How can we find the top-1 discord of `T` in offline mode?**<br>
To find the discord, one can compute the (left) matrix profile and locate the index where it attains its maximum finite value. That index is the start of the anomaly. `DAMP` instead computes an approximate left matrix profile whose maximum is exact, so it still returns the correct top-1 discord.
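As a minimal illustration of "maximum finite value", the sketch below picks the top-1 discord from a left matrix profile array. The helper `top1_discord` and both arrays are hypothetical, and the numbers are made up. The second array mimics an approximate profile: entries for non-discords may be inexact, as long as they do not exceed the true maximum.

```python
import numpy as np

def top1_discord(PL):
    """Return (start index, score) of the largest finite entry of PL."""
    finite = np.isfinite(PL)
    if not finite.any():
        return -1, -np.inf  # no subsequence has a left neighbor
    idx = int(np.argmax(np.where(finite, PL, -np.inf)))
    return idx, float(PL[idx])

# Made-up values; inf marks subsequences without a left neighbor
exact_PL = np.array([np.inf, np.inf, 0.4, 1.7, 0.9, 3.2, 0.3])
# Inexact entries at indices 4 and 6, but the maximum is untouched
approx_PL = np.array([np.inf, np.inf, 0.4, 1.7, 1.5, 3.2, 2.1])

print(top1_discord(exact_PL))
print(top1_discord(approx_PL))
```

Both calls report the same index and score, which is all that is needed to identify the top-1 discord.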

**How does DAMP work?**<br>
Suppose we scan the time series `T` from left to right and are currently at index `i`, and that the best-so-far discord score (i.e. the distance between the best discord candidate and its nearest neighbor) is `discord_score`. A naive approach would examine all left neighbors of the subsequence `S_i = T[i : i + m]` to find its nearest neighbor. DAMP, however, uses early abandoning: as soon as it finds a (left) neighbor whose distance to `S_i` is less than `discord_score`, it stops exploring the remaining left neighbors, because the distance between `S_i` and its nearest neighbor must then also be less than `discord_score`, so `S_i` cannot be the top-1 discord. This is called `BackwardProcessing`. To take advantage of rolling-based computation, `BackwardProcessing` does not examine one neighbor at a time. Instead, it examines them chunk by chunk using the MASS algorithm, doubling the chunk size whenever the search has to continue.
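The early-abandoning idea can be sketched without STUMPY. The code below is a simplified stand-in, not the paper's implementation: it uses a plain loop instead of MASS, and the function name `backward_search`, the initial chunk size, and the synthetic signal `T_demo` are assumptions chosen for illustration. It also reports how many left neighbors were examined, to make the saving visible.

```python
import numpy as np

def znorm(x):
    """Z-normalize a 1-D array (assumes the array is not constant)."""
    return (x - x.mean()) / x.std()

def backward_search(T, m, i, best_so_far):
    """Scan the left neighbors of T[i : i + m] in chunks of doubling size.

    Returns the smallest distance seen and the number of neighbors examined.
    """
    q = znorm(T[i : i + m])
    hi = i - m + 1  # neighbors start at j < hi, so they never overlap the query
    size = m        # size of the first chunk (an arbitrary choice for this sketch)
    nn_dist, n_checked = np.inf, 0
    while hi > 0:
        lo = max(0, hi - size)
        for j in range(lo, hi):
            nn_dist = min(nn_dist, np.linalg.norm(q - znorm(T[j : j + m])))
        n_checked += hi - lo
        if nn_dist < best_so_far:
            break  # early abandon: S_i cannot beat the best-so-far discord
        hi, size = lo, 2 * size  # move further left with a larger chunk
    return nn_dist, n_checked

m = 20
T_demo = np.sin(2 * np.pi * np.arange(400) / m)
T_demo[300 : 300 + m] = np.linspace(-1, 1, m)  # a single planted anomaly

print(backward_search(T_demo, m, 200, best_so_far=1.0))  # ordinary subsequence
print(backward_search(T_demo, m, 300, best_so_far=1.0))  # the anomaly
```

For the ordinary subsequence the search stops after the first chunk, because a close neighbor is found immediately. For the planted anomaly every left neighbor has to be examined, and the returned distance is then exact.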

We can then move to the next index, `i + 1`, and repeat the process. This means that every subsequent subsequence requires at least one chunk of `BackwardProcessing` with MASS. But what if we could prune some of the forthcoming subsequences, so that `BackwardProcessing` is only needed for those that remain eligible as discords? To do so, we can run MASS with the query `S_i` against the subsequences to its right. Note that `S_i` is one of the left neighbors of each of those subsequences. So, if the distance between `S_i` and any of them is less than `discord_score`, that subsequence can be pruned, and when `DAMP` later reaches it, it is simply skipped.
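The pruning step can be sketched in the same simplified style. Again this is an illustration under stated assumptions rather than the paper's code: a plain loop replaces MASS, and `forward_prune`, its `lookahead` argument, and the synthetic signal `T_demo` are hypothetical names and values.

```python
import numpy as np

def znorm(x):
    """Z-normalize a 1-D array (assumes the array is not constant)."""
    return (x - x.mean()) / x.std()

def forward_prune(T, m, i, best_so_far, lookahead, pruned):
    """Mark subsequences to the right of S_i that S_i rules out as discords."""
    q = znorm(T[i : i + m])
    l = len(T) - m + 1
    # Only non-overlapping subsequences within the lookahead window are tested
    for j in range(i + m, min(i + m + lookahead, l)):
        if np.linalg.norm(q - znorm(T[j : j + m])) < best_so_far:
            pruned[j] = True  # S_j has a left neighbor closer than best_so_far
    return pruned

m = 20
T_demo = np.sin(2 * np.pi * np.arange(200) / m)
pruned = np.zeros(len(T_demo) - m + 1, dtype=bool)
pruned = forward_prune(T_demo, m, i=40, best_so_far=1.0, lookahead=60, pruned=pruned)
print(np.flatnonzero(pruned))
```

In this periodic toy signal, only the subsequences that are in phase with `S_i` are close enough to be pruned. The boolean array is modified in place, so a scan over `T` can consult it before deciding whether to run the backward search for an index.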

## Naive approach

In [28]:
def naive_DAMP(T, m, split_index):
    """
    Compute the top-1 discord in `T`, where the subsequence discord resides in T[split_index:]
    
    Parameters
    ----------
    T : np.ndarray
        A time series
        
    m : int
        Window size
    
    split_index : int
        The split index between train and test.
    
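    Returns
    -------
    discord : float
        The top-1 discord score, i.e. the largest left matrix profile value
        in `T[split_index:]` (`-np.inf` if there is none)

    discord_index : int
        The start index of the top-1 discord (-1 if there is none)

    discord_index_nn : int
        The start index of the discord's left nearest neighbor (-1 if there
        is none)
    """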
    # Use an exclusion zone of `m`; note that this changes STUMPY's global config
    stumpy.config.STUMPY_EXCL_ZONE_DENOM = 1
    
    mp = stumpy.stump(T, m)
    IL = mp[:, 2].astype(np.int64)
    IL[:split_index] = -1
    
    PL = np.full_like(IL, np.inf, dtype=np.float64)
    for i, nn_i in enumerate(IL):
        if nn_i >= 0:
            PL[i] = np.linalg.norm(core.z_norm(T[i : i + m]) - core.z_norm(T[nn_i : nn_i + m]))
    
    # Subsequences without a left neighbor must never win the argmax
    PL = np.where(PL==np.inf, -np.inf, PL)
    idx = np.argmax(PL)
    if PL[idx] == -np.inf:
        discord = -np.inf
        discord_index = -1
        discord_index_nn = -1
    else: 
        discord = PL[idx]
        discord_index = idx
        discord_index_nn = IL[idx]
        
    return discord, discord_index, discord_index_nn

In [29]:
seed = 100
np.random.seed(seed)

T = np.random.rand(10000)
m = 50
split_index = 200

discord, discord_index, discord_index_nn = naive_DAMP(T, m, split_index)

print('discord: ', discord)
print('discord_index: ', discord_index)
print('discord_index_nn: ', discord_index_nn)

discord:  8.500883427933504
discord_index:  209
discord_index_nn:  121


## DAMP approach

In [32]:
def _foreward_processing(T, m, M_T, Σ_T, excl_zone, query_idx, discord_score, is_subseq_pruned):
    """
    Prune forthcoming subsequences so that they become ineligible as discords
    
    Parameters
    ----------
    T : np.ndarray
        The time series 
    
    m : int
        Window size
        
    M_T : np.ndarray
        The sliding mean
        
    Σ_T : np.ndarray
        The sliding standard deviation
        
    excl_zone : int
        exclusion zone
    
    query_idx : int
        The start index of the subsequence of interest
    
    discord_score : float
        The best-so-far discord score
        
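    is_subseq_pruned : np.ndarray
        A boolean numpy array that indicates whether a subsequence has been
        pruned, i.e. ruled out as a discord. It is updated in place.

    Returns
    -------
    is_subseq_pruned : np.ndarray
        The updated boolean array
    """
    # NOTE: `lookahead` is smaller than `2 * m`, so the distance profile below
    # has at most `m` entries. With `excl_zone == m`, all of them fall inside
    # the exclusion zone and nothing is pruned; a larger `lookahead` is needed
    # for pruning to take effect.
    lookahead = np.power(2, int(np.ceil(np.log(m) / np.log(2))))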
    
    start = query_idx
    stop = min(start + lookahead, len(T))
    if stop - start >= m:
        dist_profile = core.mass(
            T[query_idx : query_idx + m],
            T[start : stop],
            M_T[start : stop - m + 1],
            Σ_T[start : stop - m + 1],
            )
        
        core.apply_exclusion_zone(dist_profile, 0, excl_zone, np.inf)

        IDX = np.flatnonzero(dist_profile < discord_score) + start
        is_subseq_pruned[IDX] = True

    return is_subseq_pruned


def _backward_processing(T, m, M_T, Σ_T, excl_zone, query_idx, discord_score):
    """
    Compute the approx. left matrix profile value for subsequence `T[query_idx:query_idx+m]`
    and update discord_score
    
    Parameters
    ----------
    T : np.ndarray
        The time series 
    
    m : int
        Window size
        
    M_T : np.ndarray
        The sliding mean
        
    Σ_T : np.ndarray
        The sliding standard deviation
    
    excl_zone : int
        exclusion zone

    query_idx : int
        The start index of the subsequence of interest
    
    discord_score : float
        The best-so-far discord score
        
    Returns
    -------
    left_nn_distance : float
        Left matrix profile value for subsequence i
    
    discord_score : float
        The best-so-far distance computed as discord
    """
    left_nn_distance = np.inf
    prefix = np.power(2, int(np.ceil(np.log(m) / np.log(2))))
    
    while left_nn_distance >= discord_score:
        start_idx = max(0, query_idx + m - prefix)
        # Distance from the query to every subsequence in the current chunk
        dist_profile = core.mass(
            T[query_idx : query_idx + m],
            T[start_idx : query_idx + m],
            M_T=M_T[start_idx : query_idx + 1],
            Σ_T=Σ_T[start_idx : query_idx + 1],
        )
        # The query is the last subsequence in the chunk; mask its trivial matches
        core.apply_exclusion_zone(dist_profile, len(dist_profile) - 1, excl_zone, np.inf)
        left_nn_distance = np.min(dist_profile)

        if start_idx == 0:
            # All left neighbors were examined, so the distance is exact.
            # Skip the update when no valid neighbor exists (distance is inf).
            if np.isfinite(left_nn_distance):
                discord_score = max(left_nn_distance, discord_score)
            break

        # Otherwise double the chunk; the loop condition abandons the search
        # as soon as a neighbor closer than `discord_score` has been found.
        prefix = 2 * prefix
    
    return left_nn_distance, discord_score
        
                
def DAMP(T, m, split_index):
    """
    Find the top-1 discord of `T` using an approx. left matrix profile
    
    Parameters
    ----------
    T : np.ndarray
        A time series
    
    m : int
        Window size
    
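    split_index : int
        Location of the split point between train and test. The data
        `T[:split_index]` is considered as train and the remainder,
        i.e. `T[split_index:]`, as test.
    
    Returns
    -------
    discord : float
        The top-1 discord score, i.e. the maximum of the approx. left matrix
        profile (`-np.inf` if there is none)

    discord_idx : int
        The start index of the top-1 discord (-1 if there is none)
    """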
    stumpy.config.STUMPY_EXCL_ZONE_DENOM = 1  # according to paper, excl_zone is `m`
    excl_zone = int(np.ceil(m / stumpy.config.STUMPY_EXCL_ZONE_DENOM))
    
    T, M_T, Σ_T, T_subseq_isconstant = core.preprocess(T, m)
    
    l = len(T) - m + 1
    PL = np.full(l, np.inf, dtype=np.float64) 
    is_subseq_pruned = np.full(l, 0, dtype=bool)
    
    discord_score = -np.inf
    for i in range(split_index, l):
        if is_subseq_pruned[i]:
            PL[i] = PL[i-1]
        else:
            PL[i], discord_score = _backward_processing(T, m, M_T, Σ_T, excl_zone, i, discord_score)
            is_subseq_pruned = _foreward_processing(T, m, M_T, Σ_T, excl_zone, i, discord_score, is_subseq_pruned)
        
    PL = np.where(PL==np.inf, -np.inf, PL)
    discord_idx = np.argmax(PL)
    discord = PL[discord_idx]
    
    if discord == -np.inf:
        discord_idx = -1
        
    return discord, discord_idx

In [33]:
# DAMP already returns the top-1 discord score and its start index
discord, discord_idx = DAMP(T, m, split_index)

print('discord value: ', discord)
print('discord at index: ', discord_idx)

discord value:  8.500883427933498
discord at index:  209
