Mike Strosaker edited this page Mar 22, 2014 · 3 revisions

Overview

Hidden Markov models (HMMs) allow the state transitions of a model to be deduced based solely on a series of observations. They are useful in, for example, studying genomes: a given nucleotide (A, C, G, or T) may be part of one of the various components of a gene, or part of the region between genes. One can use an HMM to annotate a genome, treated as a series of observations (a string of nucleotides), by calculating which nucleotides are most likely to lie within a gene and which are most likely to lie between genes.

The hmm package provides two classes used to construct HMMs:

  • state: represents a single state in an HMM
  • hmm: represents an HMM in its entirety; it is constructed with a list of state objects

The use of this module is demonstrated in this blog post. To summarize that post, this model:

[Figure: diagram of the HMM]

can be represented with the following code:

import hmm

s1 = hmm.state(
        'S1',            # name of the state
        0.5,             # probability of being the initial state
        { '1': 0.5,      # probability of emitting a '1' at each visit
          '2': 0.5 },    # probability of emitting a '2' at each visit
        { 'S1': 0.9,     # probability of transitioning to itself
          'S2': 0.1 })   # probability of transitioning to state 'S2'
s2 = hmm.state('S2', 0.5,
        { '1': 0.25, '2': 0.75 },
        { 'S1': 0.8, 'S2': 0.2 })
model = hmm.hmm(['1', '2'],  # all symbols that can be emitted
                [s1, s2])    # all of the states in this HMM
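
For a model like this to be well formed, the initial probabilities across all states should sum to 1, as should each state's emission probabilities and each state's transition probabilities. Whether the hmm package checks this itself is not stated here, so the following is only a standalone sketch that validates the same numbers using plain dictionaries. The `states` layout and the `check_sums_to_one` and `validate` helpers are hypothetical and are not part of the package.

```python
# Hypothetical plain-dict copy of the example parameters (not the hmm package API).
states = {
    'S1': {'initial': 0.5,
           'emit': {'1': 0.5, '2': 0.5},
           'trans': {'S1': 0.9, 'S2': 0.1}},
    'S2': {'initial': 0.5,
           'emit': {'1': 0.25, '2': 0.75},
           'trans': {'S1': 0.8, 'S2': 0.2}},
}

def check_sums_to_one(probs, tol=1e-9):
    """Return True if the probabilities in `probs` sum to 1 within `tol`."""
    return abs(sum(probs) - 1.0) <= tol

def validate(states):
    """Return a list of problems found; the list is empty if the model is consistent."""
    problems = []
    if not check_sums_to_one(s['initial'] for s in states.values()):
        problems.append('initial probabilities do not sum to 1')
    for name, s in states.items():
        if not check_sums_to_one(s['emit'].values()):
            problems.append('emissions of %s do not sum to 1' % name)
        if not check_sums_to_one(s['trans'].values()):
            problems.append('transitions of %s do not sum to 1' % name)
        # Every transition target must itself be a known state.
        unknown = set(s['trans']) - set(states)
        if unknown:
            problems.append('%s transitions to unknown states %s'
                            % (name, sorted(unknown)))
    return problems

print(validate(states))
```

For the example parameters the returned list is empty, meaning no inconsistencies were found.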

Once the model is created (in the object called model, in this case), the sequence of states that most likely generated an arbitrary sequence of symbols can be calculated with:

path, prob = model.viterbi_path('222')   # can also use ['2', '2', '2']
print(path)   # parenthesized form works in both Python 2 and Python 3
print(prob)

which, in this case, would display:

['S2', 'S1', 'S1']
-1.17069622717
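
To see where this result comes from, the Viterbi calculation can be sketched independently of the package. The code below is a minimal illustration in plain Python, not the package's actual implementation: the `states` dictionary and the `viterbi` function are hypothetical names, and the sketch assumes that every probability involved is nonzero so that its logarithm is defined.

```python
import math

# Hypothetical plain-dict copy of the example parameters (not the hmm package API).
states = {
    'S1': {'initial': 0.5,
           'emit': {'1': 0.5, '2': 0.5},
           'trans': {'S1': 0.9, 'S2': 0.1}},
    'S2': {'initial': 0.5,
           'emit': {'1': 0.25, '2': 0.75},
           'trans': {'S1': 0.8, 'S2': 0.2}},
}

def viterbi(obs, states):
    """Return (path, log10_prob) for the most likely state sequence given obs."""
    names = sorted(states)
    # best[s]: log10 probability of the best path so far that ends in state s
    best = {s: math.log10(states[s]['initial'])
               + math.log10(states[s]['emit'][obs[0]])
            for s in names}
    back = []  # one dict per later step, mapping each state to its best predecessor
    for symbol in obs[1:]:
        step, pointers = {}, {}
        for s in names:
            # Pick the predecessor that maximizes (score so far + transition into s).
            prev = max(names,
                       key=lambda p: best[p] + math.log10(states[p]['trans'][s]))
            step[s] = (best[prev]
                       + math.log10(states[prev]['trans'][s])
                       + math.log10(states[s]['emit'][symbol]))
            pointers[s] = prev
        best = step
        back.append(pointers)
    # Trace back from the best final state to recover the full path.
    last = max(names, key=lambda s: best[s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[last]

path, prob = viterbi('222', states)
print(path)
print(prob)
```

For the input '222' this sketch selects the same path, S2 followed by S1 twice. The corresponding raw probability is 0.5 × 0.75 × 0.8 × 0.5 × 0.9 × 0.5 = 0.0675, whose base-10 logarithm agrees with the value displayed above up to floating-point rounding.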

Note that probabilities are always reported as log (base 10) values, since the raw probabilities can be very small numbers.
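
The benefit of working with logarithms can be illustrated with a short, package-independent sketch. The numbers are hypothetical: a path of 400 steps, each contributing a factor of 0.01. Multiplying the raw factors underflows to zero in floating point, while summing their logarithms stays finite. The last line converts the log value from the example above back to a raw probability.

```python
import math

# Hypothetical path: 400 steps, each contributing a probability factor of 0.01.
factors = [0.01] * 400

product = 1.0
for f in factors:
    product *= f   # the running product underflows to 0.0 before the loop ends

# Summing log10 values keeps the result representable.
log_sum = sum(math.log10(f) for f in factors)

print(product)    # raw product, lost to underflow
print(log_sum)    # log10 of the same quantity, about -800

# Converting a log10 value back to a raw probability:
print(10 ** -1.17069622717)   # about 0.0675
```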
