This repository has been archived by the owner on Sep 11, 2023. It is now read-only.

Documentation on hidden_state_probabilities #1506

Closed
orthonalmatrix opened this issue Aug 11, 2021 · 5 comments

Comments

@orthonalmatrix

Hello,
I'd like to plot the probability of being in each hidden state as a function of time. In tutorial 7, the hidden_state_probabilities property is used. I can't find any documentation about the property outside of that tutorial. My main question is about the 1st index. I have 13 trajectories that I used for the features but the first index is of size 130. The second and third index seem to be [frame, hmm state]. Since some of my trajectories are of different length I can see that the hidden_state_probabilities[0:9] are all the same trajectory, the hidden_state_probabilities[10-19] are the same, etc. What is the difference between the hidden_state_probabilities[0] and hidden_state_probabilities[1] then?

@thempel
Member

thempel commented Aug 12, 2021

Many properties that are internally computed by the BHMM package come in a time-lagged form produced by bhmm.lag_observations(). The documentation of this function says:

Given a trajectory (s0, s1, s2, s3, s4, ...) and lag 3, this function will generate 3 trajectories (s0, s3, s6, ...), (s1, s4, s7, ...) and (s2, s5, s8, ...).

So that's what you get for the different arrays in hidden_state_probabilities that belong to the same trajectory. In the tutorial example, we used lag time 1, so it doesn't make a difference. You could either re-assemble your trajectories from your output again or, better, compute the hidden_state_probabilities directly from the HMM output probabilities. Hope that helps.
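For intuition, the lag-splitting described in that docstring can be sketched in a few lines of NumPy (a toy re-implementation for illustration only, not the actual bhmm code):

```python
import numpy as np

def lag_observations(trajs, lag):
    # For each trajectory, emit `lag` strided sub-trajectories,
    # one per shift, mimicking what bhmm.lag_observations() describes.
    lagged = []
    for traj in trajs:
        for shift in range(lag):
            lagged.append(traj[shift::lag])
    return lagged

traj = np.arange(10)  # stands in for (s0, s1, ..., s9)
print([t.tolist() for t in lag_observations([traj], lag=3)])
# → [[0, 3, 6, 9], [1, 4, 7], [2, 5, 8]]
```

With 13 trajectories and a lag of 10, this kind of splitting would yield 130 sub-trajectories, which would explain the first index of size 130 observed in the question above.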

@orthonalmatrix
Author

Thank you! There doesn't seem to be any documentation for the BHMM package, or am I missing it?

I appreciate your suggestion, but I'm not sure how to get the HMM output probabilities. I don't see a property with the same information that would let me create a plot similar to tutorial 7 (in order to check my sampling, as tutorial 8 suggests).

@clonker
Member

clonker commented Aug 16, 2021

Hi there! There is no hosted documentation for the BHMM package right now. What you can do, though, is look at the HMM implementation of deeptime.

It contains the same algorithms as the BHMM package but is more performant and stable. The hidden_state_probabilities are simply called state_probabilities there and are organized in the same fashion. They correspond to the gammas of the Baum-Welch algorithm, i.e., the probability that a frame of the training data belongs to a specific hidden state, given the observation sequence, under the chosen lagtime. If you want to evaluate this on other / hold-out data, you can compute Viterbi paths; note, however, that these give crisp assignments rather than probabilities.
Regarding plotting the gammas as a function of time: since we are dealing with a Markov process, you can only sensibly plot the sequence of probabilities with respect to the lagtime you chose. For example, with a lagtime of 3 you could plot [gamma(1), gamma(4), gamma(7), ...], [gamma(2), gamma(5), gamma(8), ...], and so on.
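For illustration, the Viterbi decoding mentioned above can be sketched in plain NumPy (a toy decoder, not deeptime's implementation; T, E and pi below are a made-up transition matrix, emission matrix and initial distribution):

```python
import numpy as np

def viterbi(obs, T, E, pi):
    # Most likely hidden-state path for a discrete observation sequence,
    # given transition matrix T, emission matrix E and initial
    # distribution pi. Works in log-space for numerical stability.
    logT, logE, logpi = np.log(T), np.log(E), np.log(pi)
    n, n_states = len(obs), T.shape[0]
    delta = np.empty((n, n_states))
    ptr = np.zeros((n, n_states), dtype=int)
    delta[0] = logpi + logE[:, obs[0]]
    for t in range(1, n):
        scores = delta[t - 1][:, None] + logT  # scores[i, j]: i -> j
        ptr[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + logE[:, obs[t]]
    path = np.empty(n, dtype=int)
    path[-1] = delta[-1].argmax()
    for t in range(n - 2, -1, -1):
        path[t] = ptr[t + 1, path[t + 1]]
    return path

# two metastable states with mostly faithful emissions
T = np.array([[0.9, 0.1], [0.1, 0.9]])
E = np.array([[0.9, 0.1], [0.1, 0.9]])
pi = np.array([0.5, 0.5])
print(viterbi([0, 0, 1, 1], T, E, pi))  # → [0 0 1 1]
```

Unlike the gammas, this returns a single integer state per frame, which is what "crisp assignments" refers to.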

@clonker
Member

clonker commented Aug 16, 2021

This is a toy example (Prinz potential): two trajectories with a lagtime of 3, yielding 6 state-probability sequences (one per trajectory per lagtime shift):

import numpy as np
import matplotlib.pyplot as plt
from deeptime.data import prinz_potential
from deeptime.clustering import BoxDiscretization
import deeptime.markov.hmm as hmm

# two trajectories of the one-dimensional Prinz potential
system = prinz_potential()
trajs = system.trajectory(np.zeros((2, 1)), length=20000)

# discretize onto 1000 regularly spaced boxes
clustering = BoxDiscretization(dim=1, n_boxes=1000).fit_fetch(np.concatenate(trajs))
dtrajs = [clustering.transform(trajs[i]) for i in range(len(trajs))]

# initialize and fit a 4-state HMM at lagtime 3
hmm_init = hmm.init.discrete.metastable_from_data(dtrajs=dtrajs, n_hidden_states=4, lagtime=3)
mlhmm = hmm.MaximumLikelihoodHMM(hmm_init, lagtime=3).fit_fetch(dtrajs)
print(f"Got {len(mlhmm.state_probabilities)} state probability sequences "
      f"from {len(trajs)} trajectories and a lagtime of {mlhmm.lagtime}.")

# one subplot per lag-shifted sequence, one line per hidden state
f, axes = plt.subplots(nrows=3, ncols=2, figsize=(12, 8))
for ix, ax in enumerate(axes.flatten()):
    for hidden_state in range(4):
        xs = np.arange(50) * mlhmm.lagtime
        ax.plot(xs, mlhmm.state_probabilities[ix][:50, hidden_state])

To run it, it should suffice to install deeptime (conda install -c conda-forge deeptime) and matplotlib.
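If one wants the probabilities back on the original frame axis (as asked in the opening post), the shifted sequences belonging to a single trajectory can be interleaved again. A minimal sketch, assuming the sub-sequences are stored contiguously in shift order (shift 0, 1, ..., lag-1) as described for bhmm.lag_observations():

```python
import numpy as np

def interleave(lagged_probs, lag):
    # Merge the `lag` shifted probability sequences belonging to ONE
    # original trajectory back onto the original frame axis.
    n_frames = sum(len(p) for p in lagged_probs)
    full = np.empty((n_frames, lagged_probs[0].shape[1]))
    for shift, probs in enumerate(lagged_probs):
        full[shift::lag] = probs
    return full

# toy check: 5 frames, lag 2, 3 hidden states
a = np.arange(9, dtype=float).reshape(3, 3)        # frames 0, 2, 4
b = np.arange(10, 16, dtype=float).reshape(2, 3)   # frames 1, 3
print(interleave([a, b], lag=2)[:, 0].tolist())
# → [0.0, 10.0, 3.0, 13.0, 6.0]
```

In the example above, interleave(mlhmm.state_probabilities[0:3], lag=3) would then give the first trajectory's probabilities frame by frame, assuming that ordering holds.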

@clonker
Member

clonker commented Dec 22, 2021

If anything, we should upload documentation for bhmm at some point, or at least point to the deeptime implementation, given that bhmm is largely unmaintained at this point. Closing this issue.

@clonker clonker closed this as completed Dec 22, 2021