<a href="https://colab.research.google.com/github/keriheuer/rqa/blob/main/RQA_Tutorial.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Setup

First, let's install some necessary packages and set up the notebook (this may take a minute or two).
For optimal display of the widgets in this notebook, make sure your browser/window is full screen. And if you're running this in Google Colab, also make sure to close any tabs opened from the left sidebar (e.g. Files).

In [None]:
%%capture
# only need to run this cell if you're in Colab (%%capture must stay on the first line of the cell)
!sudo apt-get install -y texlive-latex-extra texlive-fonts-recommended dvipng cm-super   # install LaTeX
!pip install -q git+https://github.com/keriheuer/rqa   # install RQA

In [None]:
from rqa import *
%matplotlib widget

try:
  from google.colab import output
  output.enable_custom_widget_manager() # need to enable Jupyter widgets if in Colab
except ImportError:  # not running in Colab, so nothing to enable
  pass

# Intro to Recurrence Analysis

Recurrence analysis is a non-linear method of time series analysis from chaos theory that was originally developed to study phenomena not easily characterized by other techniques. Unlike linear techniques, recurrence analysis quantifies recurrences in a system's phase space trajectory, rather than in the original time series. This approach enables recurrence analysis to describe the underlying dynamics of a system without any loss of dynamical information from the observed signal.

Recurrences in phase space trajectories can easily be visualized with a recurrence plot (RP). To construct the RP, every point $\vec{x}(i)$ of the phase space trajectory is compared to every other point $\vec{x}(j)$, and a dot is plotted at $(i, j)$ if the distance between them is less than some threshold $\varepsilon$. The resulting 2D pattern is a graphical representation of the recurrence matrix, defined as

<p class="fw" align="center">
<br>
   <img src="https://raw.githubusercontent.com/keriheuer/rqa/2b69ad0812e6de45cd383d3ca040ca460ed88f06/images/recurrence_matrix.svg" width="30%">
</p>


where $\Theta : \mathbb{R} \rightarrow \{0,1\}$ is the Heaviside function and $\varepsilon$ is the pre-defined tolerance. A dot in the RP (or 1 in the recurrence matrix) thus corresponds to a recurrence, which can be understood as a return of the trajectory to a previously visited region of phase space.
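To make the definition concrete, here is a minimal NumPy sketch of a recurrence matrix. It assumes the definition has the standard form $R_{i,j}(\varepsilon) = \Theta(\varepsilon - \lVert \vec{x}(i) - \vec{x}(j) \rVert)$ with a Euclidean norm and the convention $\Theta(0) = 1$. The function name `recurrence_matrix` is made up for this illustration and is not part of the `rqa` package.

```python
import numpy as np

def recurrence_matrix(states, eps):
    """Return R[i, j] = 1 where ||x(i) - x(j)|| <= eps, else 0 (illustrative helper)."""
    states = np.asarray(states, dtype=float)
    if states.ndim == 1:
        # treat a plain 1D series as N one-dimensional state vectors
        states = states[:, None]
    # pairwise differences between all state vectors, shape (N, N, m)
    diffs = states[:, None, :] - states[None, :, :]
    # Euclidean distance for every pair (i, j)
    dists = np.linalg.norm(diffs, axis=-1)
    # Heaviside step applied to (eps - distance)
    return (dists <= eps).astype(int)

time_axis = np.linspace(0, 4 * np.pi, 200)
R = recurrence_matrix(np.sin(time_axis), eps=0.1)
```

Every point recurs with itself, so the main diagonal of `R` is all ones, and `R` is symmetric because the distance from $\vec{x}(i)$ to $\vec{x}(j)$ equals the distance from $\vec{x}(j)$ to $\vec{x}(i)$.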

As such, RPs mostly consist of single dots and lines which either run parallel to the main diagonal (line of identity, $LOI$) or vertically/horizontally. These structures are referred to as diagonal lines and vertical lines, respectively. Because an RP is symmetric about the $LOI$, horizontal and vertical lines correspond to each other, so only vertical lines are considered when describing RPs.

Because a dot in the RP represents a single recurrence, lines made of many dots correspond to a typical behaviour of the phase space trajectory. Diagonal lines represent segments of the phase space trajectory which run parallel for some time, while vertical lines represent segments which remain in the same region of phase space for some time.

RPs are powerful tools for visualizing recurrences in phase space, but their small-scale structures can also be quantified. Collectively known as Recurrence Quantification Analysis (RQA), methods of measuring the number and distribution of RP features are used to distinguish between different types of dynamical systems and identify transitions in dynamical behavior. The main advantage of RQA is that it can be successfully applied even to short and non-stationary data, where other methods cannot.
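As a rough illustration of what "measuring the number and distribution of RP features" can mean, the sketch below computes two common RQA measures from a binary recurrence matrix: the recurrence rate and the determinism (the fraction of recurrence points that sit on diagonal lines of at least `l_min` points). The helper names are hypothetical, and the conventions used here (counting only the diagonals above the line of identity, `l_min=2`) are assumptions that may differ from what the `rqa` package does.

```python
import numpy as np

def recurrence_rate(R):
    """Fraction of entries in the recurrence matrix that are recurrences."""
    return R.sum() / R.size

def diagonal_line_lengths(R):
    """Lengths of all runs of consecutive 1s on diagonals above the main diagonal."""
    lengths = []
    for offset in range(1, R.shape[0]):
        run = 0
        for value in np.diagonal(R, offset=offset):
            if value:
                run += 1
            else:
                if run:
                    lengths.append(run)
                run = 0
        if run:  # a line that reaches the edge of the plot
            lengths.append(run)
    return np.array(lengths, dtype=int)

def determinism(R, l_min=2):
    """Share of off-diagonal recurrence points lying on diagonal lines >= l_min."""
    lengths = diagonal_line_lengths(R)
    if lengths.sum() == 0:
        return 0.0
    return lengths[lengths >= l_min].sum() / lengths.sum()

# toy recurrence matrix: line of identity plus one neighbouring diagonal on each side
R_demo = (np.eye(5) + np.eye(5, k=1) + np.eye(5, k=-1)).astype(int)
rr_demo = recurrence_rate(R_demo)
det_demo = determinism(R_demo)
```

In this toy matrix every off-diagonal recurrence belongs to a single diagonal line of length 4, so the determinism is 1. A noisy signal would instead produce many isolated dots and a much lower value.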

---
## Recurrence Plots (RPs) and Recurrence Quantification Analysis (RQA)

Listed below are RP features commonly associated with certain dynamics and several important RQA metrics which can be extracted from RPs. Try to spot these features in the RPs of some characteristic systems and use the dropdown menu to see whether their RQA metrics reflect these differences. Click on any of the RPs to zoom in and show it larger (to close the pop-up, just click anywhere inside the whole figure on the left).

<p align="center" class="fw">
<br>
   <img src="https://github.com/keriheuer/rqa/blob/main/src/rqa/data/RQA_metrics.png?raw=true" width="100%">
</p>



In [None]:
c = CharacteristicRQA()

If we look at the colors of different features in the thresholded RP (left), we can see how recurrences and non-recurrences correspond to small and large distances in phase space, respectively. Block-like patterns in the thresholded RP are plotted in darker colors on the right, meaning points are very close in phase space. Large white patches, on the other hand, are plotted in brighter colors, representing regions of phase space which are far away from each other.

So in physical terms, we can understand recurrences (small distances in phase space) as repetitive features of the light curves and non-recurrences as states that are distinct and/or far from normal.

---
# Recurrence Plots

## Thresholded vs. Unthresholded RPs

The RPs above were thresholded by a recurrence rate of 5%, meaning that black dots in the RP represent pairs of points separated by a phase space distance $d \leq \varepsilon$, where $\varepsilon$ is chosen such that the RQA metric $RR$ is 5%. In other words, we pick a value for $\varepsilon$ so that only 5% of all pairs of points in the time series are recurrences.
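One simple way to find such an $\varepsilon$ is to take a quantile of the pairwise distances. The sketch below is an assumption about how this could be done, not a description of what the `rqa` package does internally. In particular, it leaves the main diagonal out of the count, and the name `threshold_for_rr` is invented for this example.

```python
import numpy as np

def threshold_for_rr(dists, target_rr=0.05):
    """Pick eps so that roughly target_rr of all off-diagonal pairs are recurrences."""
    n = dists.shape[0]
    # ignore the main diagonal, where every point trivially recurs with itself
    off_diagonal = dists[~np.eye(n, dtype=bool)]
    return np.quantile(off_diagonal, target_rr)

rng = np.random.default_rng(0)
x = rng.normal(size=300)
dists = np.abs(x[:, None] - x[None, :])   # distances for an unembedded 1D series

eps = threshold_for_rr(dists, target_rr=0.05)
R = dists <= eps
# recurrence rate measured over off-diagonal pairs only
rr = (R.sum() - len(x)) / (R.size - len(x))
```

The resulting `rr` lands very close to the requested 5%. Small deviations come from ties and from interpolation inside the quantile.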

What happens when we look at distances in phase space over a range of recurrence rates? If we skip the threshold and keep the distances $d$ themselves, the RP is unthresholded and represents the distance matrix rather than the recurrence matrix. A dot is now colored by the distance between a pair of points in phase space, which allows us to visualize the RP as we increase the threshold $\varepsilon$ from 0 to the maximum distance between any pair of points.

In the example below, dark areas (purple) represent points in the phase space trajectory that are close to each other, while light areas (yellow) represent pairs of points which would only be considered a recurrence for the very largest values of $\varepsilon$. Plotting the distance matrix is therefore a powerful way to visualize how similar or distinct different portions of a time series are.

<style> .markdown {max-width: 100% !important; width: 100% !important}
</style>

<p float="left" class="fw">
   <img src="https://raw.githubusercontent.com/keriheuer/rqa/2b69ad0812e6de45cd383d3ca040ca460ed88f06/images/rp_thresholding.svg" width="100%">
</p>


---
# Constructing RPs from 1D time series

So how exactly do we go from time series data to RPs? For a multi-dimensional time series, each observation is already a vector, so the distance between any two points can be calculated directly. For a 1-dimensional time series, Takens' embedding theorem says we can reconstruct phase space by comparing the original signal to delayed versions of itself. This process creates a higher-dimensional embedding of the signal while preserving the essential properties of the underlying system dynamics.

<p align="center">
  <img src="https://raw.githubusercontent.com/keriheuer/rqa/53b713c6fa69f38883a6d1906d5e978052e58c18/images/takens_theorem.svg" width="50%">
</p>

Reconstructing phase space involves two important parameters: a time delay $\tau$ and an embedding dimension $m$. The time delay corresponds to the time difference between the original and delayed time series at a given $t=i$, while the embedding dimension is equal to the number of dimensions the reconstructed state vector $\vec{x}(t)$ has. In the example below, we see that for $m=3$, we can compare the original signal $x=f(t)$ to two delayed signals, each separated by a time difference of $\tau$, to construct a three-dimensional vector $\vec{x}(t) = (x,y,z)$.

<p align="center">
<img src="https://raw.githubusercontent.com/keriheuer/rqa/53b713c6fa69f38883a6d1906d5e978052e58c18/images/3d_state_vector.svg" width="300px" height="300px">

<img src="https://raw.githubusercontent.com/keriheuer/rqa/ef01d5ebfa7c538456312bf8078580c6b522d3f6/src/rqa/data/state_space_legend.svg" width="300px">
</p>
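The reconstruction described above can be sketched in a few lines of NumPy. This is an illustrative helper (the name `delay_embed` is not from the `rqa` package), and it assumes forward delays, i.e. $\vec{x}(t) = (x(t), x(t+\tau), \dots, x(t+(m-1)\tau))$. Some references use backward delays instead, which yields the same set of vectors with the components reversed.

```python
import numpy as np

def delay_embed(x, m=3, tau=1):
    """Build time-delay state vectors (x(t), x(t + tau), ..., x(t + (m - 1) * tau))."""
    x = np.asarray(x, dtype=float)
    n_vectors = len(x) - (m - 1) * tau
    if n_vectors <= 0:
        raise ValueError("time series is too short for this choice of m and tau")
    # column k holds the signal shifted by k * tau samples
    return np.column_stack([x[k * tau : k * tau + n_vectors] for k in range(m)])

x = np.arange(10.0)                  # stand-in signal: 0, 1, ..., 9
states = delay_embed(x, m=3, tau=2)  # each row is one 3D state vector
```

Note that the embedded trajectory is shorter than the original series by $(m-1)\tau$ points, so large values of $m$ or $\tau$ shrink the RP.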

---
# Exploring RP Features

Use the dropdown menu and sliders to generate different time series and observe how the RP changes visually.

Do the number and thickness of diagonal or vertical lines change with different parameters of the system (e.g. frequency, amplitude)?

Are certain patterns/features of the RP the same for a system even when the time series varies?

To see what the RP looks like when we threshold it by a recurrence rate of 15%, toggle the button above the plot.

Remember, the threshold $\varepsilon$ is the maximum phase space distance at which a pair of points is still considered a recurrence. So thresholding by $RR$ = 15% means we pick $\varepsilon$ such that only 15% of all pairs of points in the time series are recurrences.

In [None]:
rps = GenerateRPs()

---
# Varying RQA Parameters

Now let's look at how changing different parameters used in recurrence analysis (e.g., embedding parameters) affects the RP you constructed above and its RQA metrics.

Use the sliders and toggle buttons to vary parameters. A few things to note:
- By definition, the distance matrix will be the same regardless of the RR value and whether it's local vs. global as the RR parameter only affects the thresholded RP.
- To see the Theiler window, which is excluded from RQA calculations, toggle its button and make sure the 'Threshold RP' button is also toggled on. The Theiler window is only used for RQA and not for calculating phase space distances, so it is not overplotted on the distance matrix. By default, the Theiler window is set to 1 in order to exclude the main diagonal.
- Changing the distance metric (i.e. what distance is calculated in phase space) will only be noticeable in the RP visually and in the RQA metrics when the time series is embedded (time delay and dimension $\neq$ 1).
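
The last note can be checked numerically. With no embedding, each state is a single number, so common metrics all reduce to $|x(i) - x(j)|$. Once the series is embedded in two or more dimensions, they start to differ. The sketch below is a minimal illustration, and the metric names are assumptions that may not match the options offered by the widget.

```python
import numpy as np

def pairwise_distances(states, metric="euclidean"):
    """Distance between every pair of state vectors under the chosen metric."""
    diffs = np.abs(states[:, None, :] - states[None, :, :])
    if metric == "euclidean":
        return np.sqrt((diffs ** 2).sum(axis=-1))
    if metric == "manhattan":
        return diffs.sum(axis=-1)
    if metric == "chebyshev":
        return diffs.max(axis=-1)
    raise ValueError(f"unknown metric: {metric}")

x = np.sin(np.linspace(0, 6 * np.pi, 100))
unembedded = x[:, None]                       # m = 1: one coordinate per state
embedded = np.column_stack([x[:-5], x[5:]])   # m = 2, tau = 5

same_without_embedding = np.allclose(
    pairwise_distances(unembedded, "euclidean"),
    pairwise_distances(unembedded, "manhattan"),
)
same_with_embedding = np.allclose(
    pairwise_distances(embedded, "euclidean"),
    pairwise_distances(embedded, "manhattan"),
)
```

Here `same_without_embedding` is `True` and `same_with_embedding` is `False`, which is why switching metrics has no visible effect until the time series is embedded.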

In [None]:
rps.varyParams()

# Dynamical transitions revealed by RPs

What happens when the dominant dynamics of a system change over time, such as from stochastic to periodic or from periodic to quasi-periodic? When the dynamics producing the observed signal change, so do the features characterizing that signal. Different features of the signal at different times can be understood as corresponding to distinct states of the system and would easily be seen in the RP. Indeed, recurrence analysis has been successfully used to detect dynamical transitions within the same system.

In the astrophysics literature, there is a class of accreting supermassive black holes which appear to have been observed in different states. These rare "changing-look" objects exhibit extreme variability over a wide range of timescales. One might then expect these changes to be reflected in their RPs.

Now that you have seen a few RPs from different types of stationary systems, let's see what the RP of a non-stationary system could look like. We'll simulate some dynamical changes by splicing together multiple time series and examine how that RP looks in comparison to the ones we looked at above.
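If you want to try the splicing idea outside the widget, a bare-bones version might look like the sketch below. The three segments (periodic, white noise, quasi-periodic), their lengths, and the per-segment standardization are all choices made for this illustration and are not necessarily what the widget does.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200                      # samples per segment
steps = np.arange(n)

periodic = np.sin(2 * np.pi * steps / 25)
noise = rng.normal(size=n)
# two sinusoids with an irrational period ratio give a quasi-periodic signal
quasi_periodic = periodic + np.sin(2 * np.pi * steps / (25 * np.sqrt(2)))

def standardize(segment):
    """Rescale a segment to zero mean and unit variance so amplitudes are comparable."""
    return (segment - segment.mean()) / segment.std()

segments = [standardize(s) for s in (periodic, noise, quasi_periodic)]
combined = np.concatenate(segments)

# unthresholded RP (distance matrix) of the spliced, unembedded series
dists = np.abs(combined[:, None] - combined[None, :])

# index ranges of the three sub-RPs along the main diagonal
windows = [(k * n, (k + 1) * n) for k in range(len(segments))]
```

Each `(start, stop)` pair in `windows` marks a square block `dists[start:stop, start:stop]` on the main diagonal, which is the sub-RP of that segment alone. The off-diagonal blocks compare one segment against another.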

Use the dropdown menus and sliders on the left to adjust the different segments of the combined time series. To outline the 3 "windows" or sub-RPs corresponding to each segment in the full RP, toggle the "Outline RP windows" button. You can leave this button toggled to keep the outlines visible while changing any parameter.

To vary RQA parameters, use the sliders above the RP and the dropdown menu at the bottom to change the displayed RQA metric. The RQA metrics shown for each segment are computed by using the same RQA parameters as the full RP ($RR$, $\tau$ and $m$) but only for its own segment of the time series (as denoted by color).

In [None]:
t = CombineTimeseries()