# Waiting Times and Pathways of Markov Models

For a more detailed introduction to this topic, please refer to the following article:
> **MSMPathfinder: Identification of Pathways in Markov State Models**  
> Daniel Nagel, Anna Weber, and Gerhard Stock  
> *J. Chem. Theory Comput.* 2020 16 (12), 7874-7882  
> doi: [10.1021/acs.jctc.0c00774](https://pubs.acs.org/doi/10.1021/acs.jctc.0c00774)

## Introduction
In the context of biochemical systems, understanding the kinetics of protein dynamics is critical for gaining insight into the underlying mechanisms of biological function. The Markov state model provides a powerful framework for characterizing the kinetics of a system. One can use it to estimate the waiting times $t^\text{wt}$ between transitions, the transition path times $t^\text{tt}$, and the pathways that the system takes between states.

It is important to note that waiting times $t^\text{wt}$, transition path times $t^\text{tt}$, and mean first passage times $t^\text{mfpt}$ are related but distinct quantities. Waiting times are the times between individual transitions. Transition path times, on the other hand, are the times required for the system to traverse a particular path between two states. Finally, mean first passage times are the average times required for the system to reach a particular state for the first time. In general, it holds that
$$
t^\text{wt} \ge t^\text{mfpt} \ge t^\text{tt}\;.
$$
We are mainly interested in comparisons with experiments. For example, biochemical experiments can measure the time required for a system (an ensemble) to transition from one state to another. Since this directly corresponds to the waiting time distribution, we mainly focus on it and try to relate it to the corresponding pathways.
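To make the notion of a mean first passage time more concrete, the following minimal sketch computes it from a transition matrix by treating the target state as absorbing and solving the standard linear system $(I - Q)\,m = 1$, where $Q$ is the transition matrix restricted to the non-target states. The matrix is the 4-state toy model listed in the Model Systems section; the helper name `mfpt_to_target` is hypothetical and not part of msmhelper.

```python
import numpy as np

# Transition matrix of the 4-state toy model (each row sums to one).
tmat = np.array([
    [0.92, 0.04, 0.04, 0.00],
    [0.10, 0.70, 0.10, 0.10],
    [0.10, 0.10, 0.70, 0.10],
    [0.00, 0.04, 0.04, 0.92],
])


def mfpt_to_target(tmat, target):
    """Return mean first passage times (in frames) from every state to `target`.

    Hypothetical helper, not part of msmhelper. `target` is a zero-based
    index; the entry for the target state itself is set to 0.
    """
    n_states = len(tmat)
    others = [i for i in range(n_states) if i != target]
    # Restrict the dynamics to the non-target states.
    q = tmat[np.ix_(others, others)]
    # Solve (I - Q) m = 1 for the expected number of steps until the
    # target is reached for the first time.
    m = np.linalg.solve(np.eye(len(others)) - q, np.ones(len(others)))
    mfpt = np.zeros(n_states)
    mfpt[others] = m
    return mfpt


mfpt = mfpt_to_target(tmat, target=3)  # state 4 has index 3
print(f'MFPT 1 -> 4: {mfpt[0]:.1f} frames')
```

For this matrix the sketch yields 35 frames for $1\to4$ and 22.5 frames when starting in state 2 or 3. The values are in units of frames, that is, in multiples of the lag time at which the matrix was estimated.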

In this section, we will focus on the waiting times and transition path times and show how to estimate them either from the Markov state model or directly from the state trajectory. We will also discuss how to visualize the pathways and interpret the results in the context of the toy system being studied.

## Model Systems
In the following, we will use two simple toy models introduced and discussed by Nagel et al. (2020), namely the following 4-state and 6-state models:

In [1]:
import msmhelper as mh
from msmhelper.utils import datasets

print(
    f'4 state model:\nT =\n{datasets.nagel20_4state.tmat}\n\n'
    f'6 state model:\nT =\n{datasets.nagel20_6state.tmat}'
)

# generate random trajectories from transition matrices
n_steps = int(1e4)
traj_4state = datasets.nagel20_4state(n_steps)
traj_6state = datasets.nagel20_6state(n_steps)

4 state model:
T =
[[0.92 0.04 0.04 0.  ]
 [0.1  0.7  0.1  0.1 ]
 [0.1  0.1  0.7  0.1 ]
 [0.   0.04 0.04 0.92]]

6 state model:
T =
[[0.9  0.03 0.07 0.   0.   0.  ]
 [0.05 0.63 0.15 0.07 0.1  0.  ]
 [0.08 0.04 0.6  0.14 0.14 0.  ]
 [0.   0.12 0.15 0.5  0.15 0.08]
 [0.   0.04 0.16 0.1  0.6  0.1 ]
 [0.   0.   0.   0.08 0.02 0.9 ]]
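
Conceptually, generating a random trajectory from a transition matrix amounts to repeatedly drawing the next state from the row of the current state. The following is a minimal NumPy sketch of that idea; the helper `sample_trajectory`, its `start` and `seed` parameters, and the choice of starting in state 1 are assumptions for illustration and do not describe how the `datasets` functions are implemented.

```python
import numpy as np


def sample_trajectory(tmat, n_steps, start=1, seed=None):
    """Sample a state trajectory (states labeled 1..N) from a row-stochastic matrix.

    Hypothetical helper for illustration, not part of msmhelper.
    """
    rng = np.random.default_rng(seed)
    tmat = np.asarray(tmat, dtype=float)
    n_states = len(tmat)
    traj = np.empty(n_steps, dtype=int)
    state = start - 1  # zero-based index used internally
    for t in range(n_steps):
        traj[t] = state + 1
        # Draw the next state according to the row of the current state.
        state = rng.choice(n_states, p=tmat[state])
    return traj


tmat_4state = np.array([
    [0.92, 0.04, 0.04, 0.00],
    [0.10, 0.70, 0.10, 0.10],
    [0.10, 0.10, 0.70, 0.10],
    [0.00, 0.04, 0.04, 0.92],
])
traj = sample_trajectory(tmat_4state, n_steps=1000, seed=42)
print(traj[:20])
```

Because the matrix elements $T_{14}$ and $T_{41}$ are zero, a trajectory sampled this way never jumps directly between states 1 and 4.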


## Estimating Timescales from MD

Before relying on Markov state models to analyse the expected times and pathways of a given process, we first want to show how to analyse the raw data. In the following, this is referred to as MD, because it is the truly simulated dynamics of the MD simulation.

Let us start by analysing the $1\to4$ process of the 4-state model. Within this model, this is the most interesting process because these two states are not directly connected.

In [2]:
start, final = 1, 4
wts = mh.md.estimate_wt(traj_4state, start, final)

print(f'Identified waiting times [frames]:\n{wts}')

Identified waiting times [frames]:
[ 43  19  70   7  48  17  11  64   6  35  36  16  25  15 125  18  22  50
  10  56  15  17 114  22  16  71  99  24   8  55   8  36  60  21  17  19
  10  25  36  80   9  35  33  65  68  38  14  49  29  39  72  26  18  54
  23  29   6  68  24   5  17  12  21  13  71  35  23  61  23  25  29   2
  15  60  15   8  11  65  45  27  16  53  16  31  60  16   8  11  59  17
  30  26 104  38  45  41   9  33   4  20  70   2  17  34  20  28   5  19
  19  96  11  10  25  50  22  14   4  50  20  10  35  17   9  12   6  21
  21  73  61  38  19  60 124  49  16  44  44  50  46   8  59  29  77  73]
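
To illustrate how such waiting times can be read off a state trajectory, here is a minimal pure-Python sketch. It assumes that an event starts when the start state is first entered and ends when the final state is first reached afterwards; this is a stated assumption for illustration, and the hypothetical helper `waiting_times` is not claimed to reproduce `mh.md.estimate_wt` in every detail.

```python
def waiting_times(traj, start, final):
    """Collect times (in frames) from first entering `start` until reaching `final`.

    Hypothetical helper for illustration, not part of msmhelper.
    """
    wts = []
    t_enter = None  # frame at which `start` was first entered, if any
    for t, state in enumerate(traj):
        if state == start and t_enter is None:
            # Begin a new event; later revisits of `start` do not reset it.
            t_enter = t
        elif state == final and t_enter is not None:
            # The event ends once the final state is reached.
            wts.append(t - t_enter)
            t_enter = None
    return wts


toy_traj = [1, 2, 1, 3, 4, 4, 2, 1, 2, 4]
print(waiting_times(toy_traj, start=1, final=4))  # [4, 2]
```

In the toy trajectory, the first event runs from frame 0 to frame 4 and the second from frame 7 to frame 9, which gives waiting times of 4 and 2 frames.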


In [3]:
paths = mh.md.estimate_paths(traj_4state, start, final)

# Let's format the output
print('Identified pathways with time of events given in [frames]:')
for path, pathtimes in paths.items():
    print(f'{path}: {pathtimes}')

Identified pathways with time of events given in [frames]:
(1, 3, 4): [43, 6, 35, 125, 18, 10, 56, 17, 71, 24, 60, 17, 80, 9, 35, 14, 39, 72, 18, 29, 17, 12, 13, 25, 45, 53, 16, 30, 9, 33, 4, 2, 17, 20, 5, 11, 4, 35, 12, 6, 16, 46]
(1, 3, 2, 4): [19, 11, 36, 16, 25, 50, 15, 114, 22, 16, 10, 68, 6, 68, 21, 71, 15, 60, 11, 16, 8, 17, 38, 28, 10, 50, 10, 21, 19, 44, 29]
(1, 2, 4): [70, 17, 64, 15, 22, 55, 8, 36, 21, 25, 36, 33, 65, 38, 49, 29, 26, 54, 23, 24, 5, 35, 23, 23, 2, 15, 8, 65, 27, 31, 60, 11, 59, 26, 45, 41, 20, 34, 19, 19, 96, 25, 50, 20, 17, 9, 73, 61, 38, 60, 124, 49, 44, 8, 59, 73]
(1, 2, 3, 4): [7, 48, 99, 8, 19, 61, 29, 16, 104, 70, 22, 14, 21, 50, 77]
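
A dictionary of this form can be condensed into one line per pathway, for example the fraction of events and the mean event time. The sketch below shows one way to do so; the helper `summarize_paths` is hypothetical, and `paths_example` is only a truncated excerpt of the output above, so its numbers are not the statistics of the full data set.

```python
def summarize_paths(paths):
    """Return (path, fraction of events, mean time) tuples, most frequent first.

    Hypothetical helper for illustration, not part of msmhelper.
    """
    n_events = sum(len(times) for times in paths.values())
    summary = [
        (path, len(times) / n_events, sum(times) / len(times))
        for path, times in paths.items()
    ]
    return sorted(summary, key=lambda item: item[1], reverse=True)


# Truncated excerpt of the output above, for illustration only.
paths_example = {
    (1, 3, 4): [43, 6, 35],
    (1, 3, 2, 4): [19, 11],
    (1, 2, 4): [70, 17, 64, 15],
    (1, 2, 3, 4): [7],
}

for path, fraction, mean_time in summarize_paths(paths_example):
    print(f'{path}: {fraction:.0%} of events, mean time {mean_time:.1f} frames')
```

Applied to the full output of the run shown above, the direct pathways (1, 2, 4) and (1, 3, 4) account for 56 and 42 of the 144 events, respectively. Since the trajectory is random, these counts vary from run to run.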
