# Proof and Demo of ApEn and SampEn of a Markov Chain
The following notebook provides a demonstration of the mc_entropy module and the functions for calculating the true entropy rate, Approximate Entropy, and Sample Entropy of a Markov Process with known transition probabilities. Accompanying this demonstration is a proof of the formula for the true Sample Entropy of a Markov Chain.



*Note: In this demonstration, all discussion of indexing or elements at index i assumes a series or sequence indexed at 1 for the first element, as opposed to the Python indexing convention which starts at i=0. This means that some of the indexing used in the Python code blocks will not align completely with that used in the text. Therefore when the range $1 \leq i \leq N-m$ is given in the text below, this is equivalent to the range $0 \leq i \leq N-m-1$ in Python indexing.*

In [7]:
import numpy as np
from mc_measures import mc_entropy as mce
from mc_measures import gen_mc_transition as GMTP

## Overview of mc_entropy.py
The module mc_entropy.py is part of the mc_measures package and all three functions in mc_entropy.py are designed to accept an instance of the GenMarkovTransitionProb (GMTP) class from the gen_mc_transition.py module.

There are three functions in mc_entropy.py: entropy_rate(), markov_apen(), and markov_sampen().


### mc_entropy.entropy_rate()
The entropy rate of a stationary, ergodic Markov chain $X$ with state space $\Omega$, steady-state probability vector $\pi$, and transition matrix $P$ is:
$$
\mathcal{H}(X) = -\sum_{j\in\Omega}\sum_{i\in\Omega}\pi_j P_{ij}\log(P_{ij})
$$
where $\pi_j = Pr(X=j)$ is the stationary probability of state $j$. Here $P$ is a left stochastic matrix (each column sums to 1), so $P_{ij} = Pr(X_{n+1}=i \mid X_n=j)$ is the probability that the next state is $i$ given that the present state is $j$. Terms with $P_{ij}=0$ are taken to contribute zero (see [1](#1), Theorem 4.2.4).  
  
The function entropy_rate() is identical to the entropy_rate class method of the GMTP class. It requires a square transition matrix as it uses eigendecomposition to obtain the steady state probability vector from the transition matrix.
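The formula above can be sketched directly in NumPy. This is only an illustration under stated assumptions, not the implementation in mc_entropy.py: the name `entropy_rate_sketch` is hypothetical, the matrix is assumed to be column-stochastic as described above, and the result is in nats (natural logarithm).

```python
import numpy as np

def entropy_rate_sketch(P):
    """Entropy rate in nats for a column-stochastic transition matrix P.

    Hypothetical helper for illustration; not part of mc_measures.
    """
    P = np.asarray(P, dtype=float)
    if P.ndim != 2 or P.shape[0] != P.shape[1]:
        raise ValueError("P must be a square matrix")
    # The eigenvector for the eigenvalue closest to 1 is proportional
    # to the steady-state vector pi (since pi = P pi).
    eigvalues, eigvectors = np.linalg.eig(P)
    k = np.argmin(np.abs(eigvalues - 1.0))
    pi = eigvectors[:, k].real
    pi = pi / pi.sum()
    # Apply the convention 0 * log(0) = 0 by taking log(1) where P is zero.
    logP = np.log(np.where(P > 0, P, 1.0))
    # H = -sum_j pi_j * sum_i P_ij * log(P_ij); pi scales each column j.
    return float(-np.sum(P * logP * pi[np.newaxis, :]))

# Three-state example matrix used later in this notebook.
P = np.array([[0, 0, 2], [3, 0, 0], [0, 3, 1]]) / 3
h = entropy_rate_sketch(P)
print(round(h, 4))
```

A deterministic cycle such as `[[0, 1], [1, 0]]` gives an entropy rate of zero, which is a quick sanity check on the zero-probability handling.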

### mc_entropy.markov_apen()
Pincus [2](#2) established that the Approximate Entropy of a Markov chain equals the entropy rate given above whenever the tolerance $r$ is smaller than the minimum distance between distinct state values, $r < \min_{i \neq j} |\Omega_i - \Omega_j|$, where $\Omega_i$ and $\Omega_j$ are values in the state space. The same condition applies when Approximate Entropy is computed on any discrete-valued state space. Pincus predicts this to be true almost surely for any value of *m*. Because of this, markov_apen() is a wrapper for entropy_rate() and returns the same value.

- Example. Take the Markov chain with transition probabilities $P_{1,3}=2/3$, $P_{2,1}=1$, $P_{3,2}=1$, $P_{3,3}=1/3$, and all other entries zero. *Note that each column sums to 1.*
- Solving $\pi = P\pi$ with $\sum_j \pi_j = 1$ gives the stationary probabilities $\pi_1=2/7$, $\pi_2=2/7$, and $\pi_3=3/7$.
- Substituting these probabilities into the entropy rate equation above gives:
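$$
\begin{align}
\mathcal{H}(X) = ApEn(X) &= -\Big(\tfrac{2}{7}\cdot 0\log(0) + \tfrac{2}{7}\cdot 1\log(1) + \tfrac{2}{7}\cdot 0\log(0) + \tfrac{2}{7}\cdot 0\log(0) + \tfrac{2}{7}\cdot 0\log(0) + \tfrac{2}{7}\cdot 1\log(1) \\
&\qquad + \tfrac{3}{7}\cdot\tfrac{2}{3}\log(\tfrac{2}{3}) + \tfrac{3}{7}\cdot 0\log(0) + \tfrac{3}{7}\cdot\tfrac{1}{3}\log(\tfrac{1}{3})\Big) \\
&= -\Big(\tfrac{2}{7}\log(\tfrac{2}{3}) + \tfrac{1}{7}\log(\tfrac{1}{3})\Big) \\
&\approx 0.118 \quad \text{(base-10 logarithm; approximately 0.273 with the natural logarithm)}
\end{align}
$$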
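The hand computation above can be checked with the standard library alone. The sketch below verifies exactly, using `fractions.Fraction`, that the stated $\pi$ satisfies $\pi = P\pi$, and then evaluates the entropy rate in three logarithm bases. The variable names are illustrative, and the base used by mc_entropy.py itself is not assumed here.

```python
import math
from fractions import Fraction as F

# Column-stochastic matrix from the example: P[i][j] = Pr(next = i+1 | current = j+1).
P = [[F(0), F(0), F(2, 3)],
     [F(1), F(0), F(0)],
     [F(0), F(1), F(1, 3)]]
pi = [F(2, 7), F(2, 7), F(3, 7)]

# Exact stationarity check: (P pi)_i = sum_j P[i][j] * pi[j] must equal pi_i.
P_pi = [sum(P[i][j] * pi[j] for j in range(3)) for i in range(3)]
is_stationary = (P_pi == pi) and (sum(pi) == 1)

# Entropy rate in nats; zero-probability transitions are skipped (0 log 0 = 0).
h_nats = -sum(
    float(pi[j]) * float(P[i][j]) * math.log(float(P[i][j]))
    for j in range(3) for i in range(3) if P[i][j] > 0
)
h_bits = h_nats / math.log(2)     # same quantity with log base 2
h_log10 = h_nats / math.log(10)   # same quantity with log base 10

print(is_stationary, round(h_nats, 3), round(h_bits, 3), round(h_log10, 3))
# prints: True 0.273 0.394 0.118
```

The base-10 value matches the 0.118 quoted in the worked example, so results in other units differ only by a constant factor.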

In [None]:
# Example series over the alphabet {1, 2, 3}
#data = np.array([6, 1, 6, 8, 7, 2, 2, 7, 5, 2, 5, 5, 4, 5, 5, 6, 6, 1, 1, 1])
X = np.array([1, 1, 1, 3, 1, 2, 2, 3, 1, 2])
m = 2       # template (pattern) length
r = 0.2     # tolerance
N = len(X)  # series length
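The condition on $r$ quoted from Pincus can be checked for a given series before computing Approximate Entropy. The following is a minimal sketch, assuming the state space is taken to be the set of distinct values observed in the series (states that never appear are not seen by this check). The helper name `min_state_gap` is hypothetical and not part of mc_measures.

```python
import numpy as np

def min_state_gap(series):
    """Smallest absolute difference between two distinct values in the series."""
    states = np.unique(series)  # sorted distinct state values
    if states.size < 2:
        raise ValueError("need at least two distinct state values")
    # For sorted values, the minimum pairwise gap is between neighbours.
    return float(np.min(np.diff(states)))

X = np.array([1, 1, 1, 3, 1, 2, 2, 3, 1, 2])
r = 0.2

gap = min_state_gap(X)
r_is_small_enough = r < gap
print(gap, r_is_small_enough)
```

For this series the distinct states are 1, 2, and 3, so any tolerance below 1 satisfies the condition.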

In [2]:
# Example Markov chain used by Pincus [2].
# alphabet = {1, 2, 3}
# Transition matrix (column-stochastic: each column sums to 1)
P = np.array([[0, 0, 2], [3, 0, 0], [0, 3, 1]])/3
# Get the steady-state vector via eigendecomposition of P
eigvalues, eigvectors = np.linalg.eig(P)
# Get the index of the eigenvalue equal to 1
eig_index = np.where(eigvalues.real.round(1) >= 1.0)
# Get the column vector at the index corresponding to the eigenvalue of 1
pibasis = eigvectors[:, eig_index]
# Normalize it to get the steady-state probability vector of P
pi = pibasis/pibasis.sum(axis=0)

In [10]:
states_temp = [chr(ord('a')+i) for i in range(3)]
order_i = 1
root_dir = None
GMTP.gen_model(root_dir, order_i, states_temp)

LinAlgError: Array must not contain infs or NaNs

In [3]:
print(P)
pi

[[0.         0.         0.66666667]
 [1.         0.         0.        ]
 [0.         1.         0.33333333]]


array([[[0.28571429+0.j]],

       [[0.28571429+0.j]],

       [[0.42857143+0.j]]])
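The output above is a nested, complex-typed array because `np.where` returns a tuple of index arrays and the other eigenvalues of $P$ are complex. One alternative, sketched here under the assumption that the chain has a unique stationary distribution, is to solve the linear system $(P - I)\pi = 0$ together with $\sum_j \pi_j = 1$ by least squares, which yields a flat, real vector. This swaps in a different technique from the eigendecomposition used above, and the function name `stationary_vector_lstsq` is hypothetical.

```python
import numpy as np

def stationary_vector_lstsq(P):
    """Stationary vector of a column-stochastic matrix via least squares."""
    P = np.asarray(P, dtype=float)
    n = P.shape[0]
    # Stack the equations (P - I) pi = 0 with the normalisation row sum(pi) = 1.
    A = np.vstack([P - np.eye(n), np.ones((1, n))])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

P = np.array([[0, 0, 2], [3, 0, 0], [0, 3, 1]]) / 3
pi_flat = stationary_vector_lstsq(P)
print(pi_flat.shape, pi_flat)
```

The result has shape `(3,)` and agrees with the values $2/7$, $2/7$, and $3/7$ from the worked example.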

## References
<a id='1'></a>
<div class="csl-entry">[1] Cover, T. M., &amp; Thomas, J. A. <i>Elements of Information Theory</i>. Wiley.</div>
<a id='2'></a>
<div class="csl-entry">[2] Pincus, S. M. (1991). Approximate entropy as a measure of system complexity. <i>Proceedings of the National Academy of Sciences of the United States of America</i>, <i>88</i>(6), 2297–2301. https://doi.org/10.1073/pnas.88.6.2297</div>