Model.predict() gives only minus ones #1003

mhavu · 2022-11-06T14:27:29Z

I am new to pomegranate, and I want to build an HMM with two hidden states. I have labels and a sequence of observation series, most of which come from a multinomial distribution (k = 4, n = 60). (One series comes from a wrapped normal distribution, see issue #1002, but it can be ignored for now.) Every call to predict() returns just a list of minus ones, sometimes with the warning "Sequence is impossible":

import numpy as np
from pomegranate import HiddenMarkovModel, DirichletDistribution

# label = np.array([0, 0, 0, ..., 1, 1, 1])
# data = np.array([[60, 0, 0, 0], [58, 2, 0, 0], ...])
dists = [DirichletDistribution.from_samples(data[np.where(label == 0)]),
         DirichletDistribution.from_samples(data[np.where(label == 1)])]
trans_mat = np.array([[0.999, 0.001],
                      [0.002, 0.998]])
starts = np.array([0.5, 0.5])
model = HiddenMarkovModel.from_matrix(trans_mat, dists, starts)
model.predict(data)
# [-1, -1, -1, ...]

What am I missing here? (If I create the model with HiddenMarkovModel.from_samples() instead, predict(), fit(), etc. result in a segfault, but I haven't filed a bug report yet, since I think I am just doing something wrong.)


jmschrei · 2022-11-06T16:19:27Z

The issue is that my implementation of dirichlet distributions is wrong and bad. I am currently rewriting pomegranate using torch and fixing this issue, among several others. For now, please avoid the DirichletDistribution implementation.

mhavu · 2022-11-07T11:59:02Z

What would you recommend for a multinomial distribution instead?

jmschrei · 2022-11-07T15:59:47Z

Is your data categories or is it probability vectors over classes? If the data are categories, i.e., integers specifying class, you should use DiscreteDistribution.

mhavu · 2022-11-07T17:55:03Z

It's probability vectors over classes.

mhavu · 2022-12-11T14:12:42Z

I didn't get very far with this. I used argmax to convert the probability vectors over classes to categories. However, there are way more state transitions than should be likely, given the probabilities in the transition matrix, and changing the transition matrix doesn't seem to have any effect. Incorporating the observations that come from a wrapped normal distribution would probably help, but I'm not quite sure how to build a mixed model where each hidden state corresponds to both a DiscreteDistribution and a VonMisesDistribution (from #1002).
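A minimal sketch of the argmax conversion described above, using NumPy only. The first two rows are the ones shown in the original report; the last two rows are hypothetical, added only to make the example non-trivial. It also shows why the conversion is lossy: only the position of the largest entry in each row survives.

```python
import numpy as np

# Count vectors with k = 4 categories and n = 60 trials per row.
# Rows 0-1 come from the report above; rows 2-3 are made up for illustration.
data = np.array([[60, 0, 0, 0],
                 [58, 2, 0, 0],
                 [10, 30, 15, 5],
                 [5, 5, 20, 30]])

# Normalize each row into a probability vector over the 4 classes.
probs = data / data.sum(axis=1, keepdims=True)

# Collapse each row to the index of its largest entry. Everything else
# about the row (how dominant that class is, the runner-up) is discarded.
categories = probs.argmax(axis=1)
print(categories)
```

Rows 0 and 1 map to the same category even though their counts differ, which is the kind of information a categorical emission can no longer use.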

mhavu · 2022-12-18T19:39:03Z

According to #458, it is not possible to mix DiscreteDistribution with any other type of distribution in IndependentComponentsDistribution. Is there a type of distribution in pomegranate I could use for a multinomial distribution (probability vectors over classes)?
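For comparison, here is a hand-rolled sketch of a two-state HMM with multinomial emissions that does not use pomegranate at all, only NumPy. It is not the library's API and not a fix proposed in this thread. The helper names `fit_multinomial` and `viterbi`, the pseudocount, and the synthetic data are all assumptions made for illustration; the transition matrix and start probabilities are the ones from the original report.

```python
import numpy as np

def fit_multinomial(counts, pseudocount=1.0):
    """Estimate category probabilities from rows of counts (hypothetical helper).

    The pseudocount keeps every probability strictly positive, so no
    observation ends up with a log-likelihood of minus infinity.
    """
    totals = counts.sum(axis=0) + pseudocount
    return totals / totals.sum()

def viterbi(counts, emission_probs, trans_mat, starts):
    """Most likely hidden-state path for rows of multinomial counts."""
    # Log-likelihood of each row under each state, up to the multinomial
    # coefficient, which is identical across states and so cannot change
    # which path is best.
    log_emit = counts @ np.log(emission_probs).T   # shape (n_obs, n_states)
    log_trans = np.log(trans_mat)
    n_obs, n_states = log_emit.shape

    score = np.log(starts) + log_emit[0]
    back = np.zeros((n_obs, n_states), dtype=int)
    for t in range(1, n_obs):
        cand = score[:, None] + log_trans          # cand[i, j]: state i -> j
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + log_emit[t]

    # Trace the best path backwards from the best final state.
    path = np.empty(n_obs, dtype=int)
    path[-1] = score.argmax()
    for t in range(n_obs - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path

# Synthetic data (an assumption): 50 rows per state, n = 60 trials, k = 4.
rng = np.random.default_rng(0)
label = np.repeat([0, 1], 50)
data = np.vstack([rng.multinomial(60, [0.9, 0.05, 0.03, 0.02], size=50),
                  rng.multinomial(60, [0.1, 0.2, 0.3, 0.4], size=50)])

# One multinomial per labelled state, estimated from the labelled rows.
emission_probs = np.array([fit_multinomial(data[label == 0]),
                           fit_multinomial(data[label == 1])])
trans_mat = np.array([[0.999, 0.001],
                      [0.002, 0.998]])
starts = np.array([0.5, 0.5])

path = viterbi(data, emission_probs, trans_mat, starts)
print((path == label).mean())
```

Working in log space with smoothed probabilities is a deliberate choice here: a zero emission probability for any observed category would make the whole sequence impossible under that state.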

jmschrei · 2023-04-16T06:07:51Z

Thank you for opening an issue. pomegranate has recently been rewritten from the ground up to use PyTorch instead of Cython (v1.0.0), and so all issues are being closed as they are likely out of date. Please re-open or start a new issue if a related issue is still present in the new codebase.

jmschrei closed this as completed Apr 16, 2023
