# Finding discords of any length in a time series

This tutorial explains the MERLIN algorithm, proposed in [MERLIN](https://www.cs.ucr.edu/~eamonn/MERLIN_Long_version_for_website.pdf). The support webpage can be found here: [MERLIN: SUPPORT](https://sites.google.com/view/merlin-find-anomalies).

The algorithm discovers discords of arbitrary length in a time series. Here, "arbitrary" means that the user defines a range for the discord length, i.e. a minimum length (`minL`) and a maximum length (`maxL`), and the algorithm finds discords whose lengths fall in `[minL, maxL]`.

## What is a discord?
A subsequence of length `L` in a time series `T` is a discord iff, among all subsequences of length `L`, it has the largest distance (hereafter referred to as `discord_dist`) to its first nearest neighbor (`NN`). The neighbors of a subsequence of length `L` starting at index `i` are all the subsequences of length `L` whose starting index is not in `[i-excl_zone+1 : i+excl_zone]` (NumPy slice notation).
 
**NOTE (1):** <br>
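`excl_zone` refers to the exclusion zone, i.e. the range of starting indices around `i` that is ignored in order to avoid trivial matches (subsequences that overlap heavily with the subsequence itself).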

**NOTE (2):** <br>
It is important to note that for the subsequence `S = T[i:i+L]`, some of its neighbors are located to the left of `S` (i.e. the ones with starting index less than or equal to `i-excl_zone`) and some are located to the right of `S` (i.e. the ones with starting index greater than or equal to `i+excl_zone`). Near either end of `T`, one of these two sets may be empty. To find the `NN` of a subsequence `S`, one needs to calculate the distance between `S` and all of its left and right neighbors.
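The definition above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a reference implementation: the function name `nearest_neighbor`, the toy series, and the use of plain (non-normalized) Euclidean distance are assumptions made here for brevity.

```python
import numpy as np

def nearest_neighbor(T, i, L, excl_zone):
    """Return (distance, start index) of the nearest neighbor of T[i:i+L].

    Assumption: plain Euclidean distance between raw subsequences.
    """
    T = np.asarray(T, dtype=float)
    n_subs = len(T) - L + 1          # number of subsequences of length L
    S = T[i:i + L]
    best_dist, best_idx = np.inf, -1
    for j in range(n_subs):
        # Skip trivial matches: start indices in [i-excl_zone+1 : i+excl_zone]
        if i - excl_zone < j < i + excl_zone:
            continue
        d = np.linalg.norm(S - T[j:j + L])
        if d < best_dist:
            best_dist, best_idx = d, j
    return best_dist, best_idx

# Toy series (hypothetical data): a repeating 0/1 pattern with one spike
T = np.array([0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 5.0, 1.0, 0.0, 1.0])
dist_a, idx_a = nearest_neighbor(T, i=0, L=3, excl_zone=2)  # exact repeat exists
dist_b, idx_b = nearest_neighbor(T, i=5, L=3, excl_zone=2)  # contains the spike
print(dist_a, idx_a, dist_b, idx_b)
```

The subsequence that contains the spike has no close neighbor on either side, which is exactly what makes it a discord candidate.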

## MatrixProfile approach

How can we discover a discord of length `L` using the MatrixProfile (`P`)? The solution is straightforward. `P` stores the distance of each subsequence to its `NN`. Therefore, the subsequence with the greatest distance to its `NN` is the discord.
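As a rough sketch of this approach, the snippet below builds `P` by brute force and takes its argmax. The helper `naive_matrix_profile`, the toy series, and the plain Euclidean distance are assumptions for illustration only; a dedicated matrix profile library would normally be used to compute `P` efficiently.

```python
import numpy as np

def naive_matrix_profile(T, L, excl_zone):
    """Brute-force P: distance of every length-L subsequence to its NN."""
    T = np.asarray(T, dtype=float)
    # Each row of `subs` is one subsequence T[i:i+L]
    subs = np.lib.stride_tricks.sliding_window_view(T, L)
    n_subs = len(subs)
    P = np.empty(n_subs)
    for i in range(n_subs):
        # Distances from subsequence i to all subsequences (all pairs needed)
        d = np.linalg.norm(subs - subs[i], axis=1)
        # Ignore trivial matches inside the exclusion zone
        lo, hi = max(0, i - excl_zone + 1), min(n_subs, i + excl_zone)
        d[lo:hi] = np.inf
        P[i] = d.min()
    return P

# Toy series (hypothetical data): a sine wave with a short level shift
T = np.sin(np.linspace(0, 8 * np.pi, 200))
T[100:110] += 2.0

P = naive_matrix_profile(T, L=10, excl_zone=5)
discord_idx = int(np.argmax(P))        # subsequence farthest from its NN
discord_dist = float(P[discord_idx])
print(discord_idx, discord_dist)
```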

There are some advantages/disadvantages in using `P` for discovering discords:

* Advantage:
Once we have the `P`, finding the discord is easy. Also, one can obtain the `top-k` discords very quickly by getting the first `k` largest distances in `P`.

* Disadvantage:
`P` needs to be recalculated for each new length `L` in `[minL, maxL]`. Furthermore, all pairwise distance calculations are required to obtain `P`.

As will be shown later, `MERLIN` can skip some of the pairwise distance calculations. Also, it can use the `discord_dist` of length `L` to narrow down the search space for discovering the discord of length `L+1`.

## MERLIN

There are two main ideas at the core of the `MERLIN` algorithm. Below, we briefly explain each idea. Then, we will show its implementation and discuss its performance.

### Idea (1): Elimination Approach
The idea can be explained as follows: Suppose we are told that the discord distance (`discord_dist`) of length `L` is at least `min_dist` (**NOTE:** In Idea (2), we will explain how to set the `min_dist` value). That means the distance between the discord and each one of its neighbors is at least `min_dist`. We start scanning the subsequences. If, for a subsequence `S`, we find a neighbor whose distance to `S` is smaller than `min_dist`, then `S` cannot be the discord. We just eliminated one candidate!

The main idea is to eliminate all subsequences for which there exists at least one neighbor with pairwise distance less than `min_dist`. Therefore, the remaining subsequences (i.e. candidates) are the ones whose distance to every neighbor is greater than or equal to `min_dist`. Now, we can find the `NN` of each candidate and choose the discord as the candidate that has the greatest distance to its `NN`.
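The two phases described above (eliminate, then refine the survivors) could be sketched as follows. This is a simplified illustration and not the actual `MERLIN` implementation: the function name `discord_by_elimination`, the toy series, and the plain Euclidean distance are assumptions. The point to notice is that the scan of a subsequence stops as soon as one neighbor closer than `min_dist` is found, so many pairwise distances are never computed.

```python
import numpy as np

def discord_by_elimination(T, L, min_dist, excl_zone):
    """Return (candidates, discord index, discord distance) for length L."""
    T = np.asarray(T, dtype=float)
    subs = np.lib.stride_tricks.sliding_window_view(T, L)
    n_subs = len(subs)

    def neighbors(i):
        # Start indices outside the exclusion zone of subsequence i
        return (j for j in range(n_subs) if abs(i - j) >= excl_zone)

    # Phase 1: drop every subsequence with some neighbor closer than min_dist.
    # any() short-circuits, so the scan stops at the first such neighbor.
    candidates = []
    for i in range(n_subs):
        eliminated = any(
            np.linalg.norm(subs[i] - subs[j]) < min_dist for j in neighbors(i)
        )
        if not eliminated:
            candidates.append(i)

    # Phase 2: find the NN distance of each survivor and keep the largest.
    best_idx, best_dist = None, -np.inf
    for i in candidates:
        nn_dist = min(
            (np.linalg.norm(subs[i] - subs[j]) for j in neighbors(i)),
            default=np.inf,
        )
        if np.isfinite(nn_dist) and nn_dist > best_dist:
            best_idx, best_dist = i, nn_dist
    return candidates, best_idx, best_dist

# Toy series (hypothetical data): a sine wave with a short level shift
T = np.sin(np.linspace(0, 8 * np.pi, 200))
T[100:110] += 2.0

candidates, idx, dist = discord_by_elimination(T, L=10, min_dist=1.0, excl_zone=5)
print(len(candidates), idx, dist)
```

If `min_dist` is larger than the true `discord_dist`, every subsequence is eliminated and the function returns an empty candidate list with `None` as the discord index.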

-------------------------------------

We would like to have a small number of candidates after the elimination process. This is where choosing a good value for `min_dist` becomes important. For instance, let us consider two extreme scenarios:

Scenario (I): Choosing a very small value `min_dist = 1e-100`. In this case, we will most likely end up with almost all subsequences as candidates, so nothing is gained from the elimination step.

Scenario (II): Choosing a very large value `min_dist = 1e+100`. In this case, we will most likely end up with no candidates at all, so no discord is found.
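To make the two scenarios concrete, the sketch below counts how many subsequences survive elimination for a given `min_dist`. It computes all pairwise distances at once, so it only illustrates the effect of `min_dist` and does not save any work. The helper `count_candidates`, the toy series, and the plain Euclidean distance are assumptions for illustration.

```python
import numpy as np

def count_candidates(T, L, min_dist, excl_zone):
    """Count subsequences whose distance to every neighbor is >= min_dist."""
    T = np.asarray(T, dtype=float)
    subs = np.lib.stride_tricks.sliding_window_view(T, L)
    idx = np.arange(len(subs))
    # Full pairwise distance matrix between all subsequences
    D = np.linalg.norm(subs[:, None, :] - subs[None, :, :], axis=2)
    # Mask trivial matches inside the exclusion zone
    D[np.abs(idx[:, None] - idx[None, :]) < excl_zone] = np.inf
    nn_dist = D.min(axis=1)
    return int(np.count_nonzero(nn_dist >= min_dist))

# Toy series (hypothetical data): a sine wave with a short level shift
T = np.sin(np.linspace(0, 8 * np.pi, 200))
T[100:110] += 2.0

L = 10
n_subs = len(T) - L + 1
counts = {
    r: count_candidates(T, L=L, min_dist=r, excl_zone=5)
    for r in (1e-100, 1.0, 1e100)
}
print(n_subs, counts)
```

On this toy series, the tiny threshold keeps every subsequence, the huge threshold keeps none, and a moderate threshold keeps only a handful of subsequences around the level shift.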

In the second idea below, we explain how MERLIN chooses the value for `min_dist`.

### Idea (2): Choosing `min_dist`
Let us assume we have already discovered the discord `d` of length `L`, whose distance to its NN (`d_NN`) is `discord_dist`. Now, to find the discord of length `L+1`, we can set `min_dist = discord_dist`. Because we are increasing the length of the subsequences by one, we expect their distances to their neighbors to be no smaller than in the case where the length was `L`. So, this `min_dist` can be considered a safe choice for discovering the discord of length `L+1`.
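Putting the two ideas together, a loop over all lengths in `[minL, maxL]` might look like the sketch below. Several details here are assumptions and are not taken from the text above: the function names, the caller-supplied starting value `init_min_dist`, the fixed `excl_zone`, the plain Euclidean distance, and the fallback that shrinks `min_dist` by a factor `shrink` whenever no candidate survives.

```python
import numpy as np

def find_discord(T, L, min_dist, excl_zone):
    """Elimination search; returns (index, distance) or (None, None)."""
    subs = np.lib.stride_tricks.sliding_window_view(T, L)
    n_subs = len(subs)
    best_idx, best_dist = None, None
    for i in range(n_subs):
        nn_dist = np.inf
        for j in range(n_subs):
            if abs(i - j) < excl_zone:
                continue                      # trivial match, not a neighbor
            d = np.linalg.norm(subs[i] - subs[j])
            if d < min_dist:
                break                         # i is eliminated
            nn_dist = min(nn_dist, d)
        else:
            # Scan finished without elimination: i is a candidate
            if np.isfinite(nn_dist) and (best_dist is None or nn_dist > best_dist):
                best_idx, best_dist = i, nn_dist
    return best_idx, best_dist

def discords_over_lengths(T, minL, maxL, excl_zone, init_min_dist, shrink=0.5):
    """Map each L in [minL, maxL] to (discord index, discord_dist)."""
    T = np.asarray(T, dtype=float)
    results = {}
    min_dist = init_min_dist
    for L in range(minL, maxL + 1):
        idx, dist = find_discord(T, L, min_dist, excl_zone)
        # Assumed fallback: relax min_dist until some candidate survives
        while idx is None and min_dist > 0:
            min_dist = min_dist * shrink if min_dist > 1e-12 else 0.0
            idx, dist = find_discord(T, L, min_dist, excl_zone)
        results[L] = (idx, dist)
        if idx is not None:
            min_dist = dist                   # Idea (2): reuse for length L+1
    return results

# Toy series (hypothetical data): a sine wave with a short level shift
T = np.sin(np.linspace(0, 8 * np.pi, 200))
T[100:110] += 2.0

results = discords_over_lengths(T, minL=8, maxL=12, excl_zone=4, init_min_dist=10.0)
for L, (idx, dist) in results.items():
    print(L, idx, dist)
```

With this setup, only the first length pays for a poorly chosen starting value; each later length starts from the previous `discord_dist`, so most subsequences are eliminated after a few distance computations.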