## Initialization

In [1]:
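import numpy
from pomegranate import *
dists = [NormalDistribution(5, 1), NormalDistribution(1, 7), NormalDistribution(8, 2)]
# Row i holds the transition probabilities out of state i; the probability of ending is given separately in `ends`.
trans_mat = numpy.array([[0.7, 0.3, 0.0],
                         [0.0, 0.8, 0.2],
                         [0.0, 0.0, 0.9]])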
starts = numpy.array([1.0, 0.0, 0.0])
ends = numpy.array([0.0, 0.0, 0.1])
model = HiddenMarkovModel.from_matrix(trans_mat, dists, starts, ends)

In [2]:
dists

[{
     "class" :"Distribution",
     "name" :"NormalDistribution",
     "parameters" :[
         5.0,
         1.0
     ],
     "frozen" :false
 }, {
     "class" :"Distribution",
     "name" :"NormalDistribution",
     "parameters" :[
         1.0,
         7.0
     ],
     "frozen" :false
 }, {
     "class" :"Distribution",
     "name" :"NormalDistribution",
     "parameters" :[
         8.0,
         2.0
     ],
     "frozen" :false
 }]
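
The matrix form above is a compact way of writing a set of weighted edges. As a rough illustration in plain Python (no pomegranate calls), the sketch below lists every nonzero entry as a `(source, target, probability)` triple. The helper `matrix_to_edges` and the state names `s1`, `s2`, `s3`, `start`, and `end` are hypothetical labels chosen for this example, not part of the library.

```python
# Values copied from the from_matrix cell above.
trans_mat = [[0.7, 0.3, 0.0],
             [0.0, 0.8, 0.2],
             [0.0, 0.0, 0.9]]
starts = [1.0, 0.0, 0.0]
ends = [0.0, 0.0, 0.1]


def matrix_to_edges(trans_mat, starts, ends):
    """List (source, target, probability) for every nonzero entry.

    Hypothetical helper for illustration only.
    """
    names = ["s%d" % (i + 1) for i in range(len(trans_mat))]
    # Edges out of the start state.
    edges = [("start", names[j], p) for j, p in enumerate(starts) if p > 0]
    for i, row in enumerate(trans_mat):
        # Edges between emitting states.
        edges += [(names[i], names[j], p) for j, p in enumerate(row) if p > 0]
        # Edge into the end state, if any.
        if ends[i] > 0:
            edges.append((names[i], "end", ends[i]))
    return edges


edges = matrix_to_edges(trans_mat, starts, ends)
for edge in edges:
    print(edge)
```

Each printed triple corresponds to one transition that would otherwise have to be added to the model individually.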

In [3]:
from pomegranate import *
s1 = State(NormalDistribution(5, 1))
s2 = State(NormalDistribution(1, 7))
s3 = State(NormalDistribution(8, 2))
model = HiddenMarkovModel()
model.add_states(s1, s2, s3)
model.add_transition(model.start, s1, 1.0)
model.add_transition(s1, s1, 0.7)
model.add_transition(s1, s2, 0.3)
model.add_transition(s2, s2, 0.8)
model.add_transition(s2, s3, 0.2)
model.add_transition(s3, s3, 0.9)
model.add_transition(s3, model.end, 0.1)
model.bake()


Models built line by line in this manner must be explicitly “baked” at the end. Baking finalizes the model topology and creates the internal sparse matrix that represents the model. It also normalizes all transitions so that each state's outgoing probabilities sum to 1.0, stores information about tied distributions, edges, and pseudocounts, and merges unnecessary silent states for computational efficiency. As a result, the bake step can take some time. If you want to reduce this overhead and are sure you have specified the model correctly, you can pass merge="None" to the bake step to skip model checking.
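
To make the normalization part of baking concrete, here is a minimal sketch in plain Python. It only illustrates the idea of rescaling each state's outgoing weights so they sum to 1.0; it is not pomegranate's implementation, which also builds the sparse matrix and handles silent states. The function `normalize_transitions` and the example weights are assumptions made for this illustration.

```python
def normalize_transitions(edges):
    """Rescale outgoing edge weights so each source state's weights sum to 1.0.

    `edges` maps (source, target) -> non-negative weight.
    Hypothetical helper; not a pomegranate function.
    """
    totals = {}
    for (source, _target), weight in edges.items():
        totals[source] = totals.get(source, 0.0) + weight
    # Sources whose weights sum to zero are dropped rather than divided by zero.
    return {
        (source, target): weight / totals[source]
        for (source, target), weight in edges.items()
        if totals[source] > 0
    }


# Unnormalized example weights (illustrative values only).
raw = {
    ("s1", "s1"): 7.0,
    ("s1", "s2"): 3.0,
    ("s2", "s2"): 4.0,
    ("s2", "s3"): 1.0,
}
normalized = normalize_transitions(raw)
for edge, probability in normalized.items():
    print(edge, probability)
```

Here the weights out of s1 are divided by 10 and those out of s2 by 5, so each state's outgoing probabilities sum to 1.0.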

The second way to initialize models, instead of specifying the structure explicitly as above, is to use the from_samples class method. The call mirrors the one used to initialize a mixture model.

In [4]:
from pomegranate import *
# X must be defined before this call: it holds the training data as a list of sequences.
model = HiddenMarkovModel.from_samples(NormalDistribution, n_components=5, X=X)

As with a mixture model, all arguments accepted by the fit step can also be passed to this method. Also as with a mixture model, the model is initialized by running k-means on the concatenation of all the data, ignoring the fact that the symbols are part of a structured sequence. The resulting clusters are used to initialize all parameters of the distributions, for example both the means and the covariances of multivariate Gaussian distributions. The transition matrix is initialized with uniform random probabilities. Once the components (the distributions on the nodes) are initialized, the given training algorithm is used to refine the distribution parameters and learn the appropriate transition probabilities.
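
The initialization described above can be sketched in plain Python for the one-dimensional case. This is a simplified illustration, not pomegranate's code: it uses a basic Lloyd's k-means on scalars, estimates a mean and standard deviation per cluster, and draws each transition row from uniform random numbers that are then normalized (one reading of "uniform random probabilities"). The names `kmeans_1d` and `init_hmm_parameters`, the seed, and the toy data are all assumptions made for this example.

```python
import math
import random


def kmeans_1d(points, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means on scalars; returns a cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign every point to its nearest center.
        labels = [min(range(k), key=lambda c: abs(x - centers[c])) for x in points]
        # Move each center to the mean of its members.
        for c in range(k):
            members = [x for x, label in zip(points, labels) if label == c]
            if members:  # keep the old center if a cluster empties out
                centers[c] = sum(members) / len(members)
    return labels


def init_hmm_parameters(sequences, k, seed=0):
    """Return ([(mean, std), ...], transition_matrix) as starting values."""
    # Concatenate all sequences, ignoring their ordering and boundaries.
    points = [x for seq in sequences for x in seq]
    labels = kmeans_1d(points, k, seed=seed)

    params = []
    for c in range(k):
        # Assumes every cluster kept at least one point.
        members = [x for x, label in zip(points, labels) if label == c]
        mean = sum(members) / len(members)
        var = sum((x - mean) ** 2 for x in members) / len(members)
        params.append((mean, math.sqrt(var)))

    # Random transition rows, normalized so each row sums to 1.0.
    rng = random.Random(seed)
    trans = []
    for _ in range(k):
        row = [rng.random() for _ in range(k)]
        total = sum(row)
        trans.append([p / total for p in row])
    return params, trans


# Toy data (illustrative only): two sequences mixing values near 1 and near 5.
sequences = [[1.0, 1.2, 0.8, 5.0, 5.1], [4.9, 1.1, 5.2]]
params, trans = init_hmm_parameters(sequences, k=2)
print(params)
print(trans)
```

These values would only serve as a starting point; the training algorithm then refines the distribution parameters and the transition probabilities using the sequence structure that k-means ignores.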