
plates on observations for a CategoricalMarkovChain #36

Open
ckemere opened this issue Dec 4, 2015 · 5 comments


ckemere commented Dec 4, 2015

Thanks again for your rapid replies. I've been trying to trace this issue through the code. In the case where the observation associated with a latent node is Gaussian, we can form an HMM as follows:

# Gaussian HMM
initial_state = Dirichlet(1e-3*np.ones(K))
transmat = Dirichlet(1e-3*np.ones((K,K)))

mu_est = Gaussian(np.zeros(D), 1e-5*np.identity(D),plates=(K,))
Lambda_est = Wishart(D, 1e-5*np.identity(D),plates=(K,))
Y = []
Z = []
for i in range(len(TrainingData[0])):
    [sequence_length, ndims]=TrainingData[0][i].shape
    Z.append(CategoricalMarkovChain(initial_state, transmat, states=sequence_length))
    Y.append(Mixture(Z[i], Gaussian, mu_est, Lambda_est))
    Y[i].observe(TrainingData[0][i])

nodes = Y + Z + [mu_est, Lambda_est, transmat, initial_state]

Q = VB(*nodes)

This works because the Gaussian is implicitly a vector observation distribution. In the Poisson case, if I want to attach independent observations (with different rates) to the node, it seems that I should be able to do it like this:

rate_prior = Gamma(alpha, beta, plates=(D, K))

Y = []
Z = []
for i in range(len(TrainingData[0])):
    [sequence_length, D]=TrainingData[0][i].shape
    Z.append(CategoricalMarkovChain(initial_state, transmat, states=sequence_length))
    Y.append(Mixture(Z[i], Poisson, rate_prior))
    Y[i].observe(TrainingData[0][i])

nodes2 = Y + [rate_parameter] + Z + [transmat, initial_state]
Q2 = VB(*nodes2)

However, this gives me a plate mismatch error: "ValueError: Shapes ([100], (40,)) do not broadcast." (D=100, sequence_length=40). The code works if the "CategoricalMarkovChain" is replaced with a "Categorical".

    Z.append(Categorical(initial_state, plates=(sequence_length, 1)))

I have a feeling that this is easily fixable, but I've been struggling to understand why the plates in the Categorical case are (sequence_length,1), but in the CategoricalMarkovChain case are (sequence_length,).

thanks!

jluttine (Member) commented Dec 4, 2015

Just a very quick comment, can't look into this at the moment. I'll get back to you next week. But try:

rate_prior = Gamma(alpha, beta, plates=(D, 1, K))

The second plate axis (the singleton) matches the sequence length. That way you could also have a different set of rates for each "time" instance, which is sometimes useful.
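To see why the singleton axis matters: BayesPy broadcasts plates following the same rules NumPy uses for array shapes. A plain NumPy sketch of the shapes involved (using D=100 and sequence_length=40 from this thread; this is an illustration, not BayesPy code):

```python
import numpy as np

D, sequence_length = 100, 40

# plates=(D,) vs the chain's plates (sequence_length,): the rightmost
# axes are 100 vs 40, so the shapes do not broadcast.
try:
    np.broadcast_shapes((D,), (sequence_length,))
except ValueError as e:
    print("mismatch:", e)

# plates=(D, 1): the trailing singleton axis broadcasts against the
# sequence axis, giving one rate vector per (dimension, time) pair.
print(np.broadcast_shapes((D, 1), (sequence_length,)))  # (100, 40)
```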

ckemere (Author) commented Dec 4, 2015

Thanks!

Further info -

Trying your suggestion I get

RuntimeError: A bug found by _message_to_child for Gamma: The plate axes of the moments (1, 100, 25) are not a subset of the plate axes (100, 1, 25) defined by the node .

And if I try (1,D,K), matching the reported plate moments, I get

ValueError: Shapes ([1, 100], (40,)) do not broadcast

jluttine added a commit to bayespy/bayespy-notebooks that referenced this issue Dec 5, 2015
jluttine (Member) commented Dec 5, 2015

Note: this message is also a Jupyter Notebook. You can download it or run it interactively.

I wasn't able to reproduce your RuntimeError. If you could give me complete code to reproduce that error, I could take a look. There seems to be a bug somewhere. If you're doing something wrong, BayesPy should give a more meaningful error instead of that.

I wrote here a complete example which should do what you want (I hope). I noticed two mistakes/typos in your Poisson-case code. First, you add rate_parameter to your nodes2, although I think you mean rate_prior. Second, the shape of TrainingData[0][i] is incorrect: you need to transpose it to get shape (D, sequence_length). I added some lines to make this a complete working example:

from bayespy.nodes import Dirichlet
K = 3
initial_state = Dirichlet(1e-3*np.ones(K))
transmat = Dirichlet(1e-3*np.ones((K,K)))

from bayespy.nodes import Gamma
D = 5
rate_prior = Gamma(1e-3, 1e-3, plates=(D,1,K))

from bayespy.nodes import Mixture, CategoricalMarkovChain, Poisson
Y = []
Z = []
TrainingData = [
    [np.random.poisson(lam=5, size=(D, np.random.poisson(lam=30)))]
]
for i in range(len(TrainingData[0])):
    [D, sequence_length] = TrainingData[0][i].shape
    Z.append(CategoricalMarkovChain(initial_state, transmat, states=sequence_length))
    Y.append(Mixture(Z[i], Poisson, rate_prior))
    Y[i].observe(TrainingData[0][i])

# Would like to do this:
#nodes = Y + [rate_prior] + Z + [transmat, initial_state]
#for z in Z:
#    z.initialize_from_random()
# But can't until issue number 30 has been fixed.

# Thus, use this:
nodes = Y + Z + [rate_prior, transmat, initial_state]
rate_prior.initialize_from_value(Gamma(10, 10, plates=(D,1,K)).random())

from bayespy.inference import VB
Q = VB(*nodes)

Q.update(repeat=100)

Does this help?

ckemere (Author) commented Dec 7, 2015

Thanks much. It turns out that the error I was seeing resulted from initializing the rate_prior with a (D,K) array instead of a (D,1,K) array.
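The subtle part is that a (D, K) array does broadcast against plates (D, 1, K), just to the wrong shape, which is consistent with the (1, 100, 25) vs (100, 1, 25) axes in the RuntimeError above. A NumPy sketch of the shapes (my illustration, not BayesPy internals):

```python
import numpy as np

D, K = 100, 25

# A (D, K) value is padded on the left to (1, D, K). Against plates
# (D, 1, K) that broadcasts to (D, D, K): legal, but not what was intended.
print(np.broadcast_shapes((D, K), (D, 1, K)))  # (100, 100, 25)

# A (D, 1, K) value matches the node's plates exactly.
print(np.broadcast_shapes((D, 1, K), (D, 1, K)))  # (100, 1, 25)
```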

BTW - is there a preferred way to cite this package if used in a paper?

jluttine (Member) commented Dec 8, 2015

Ok, great that you got it working. I'll examine that bug some day.

And thanks for asking about citing. The MIT license does not require citing the package, but it is of course highly appreciated if you do. If you do cite BayesPy in a published paper, I'd be happy to hear about it; I could maybe make a list of papers where BayesPy has been used or cited.

You can cite the arXiv version:

Jaakko Luttinen, BayesPy: Variational Bayesian Inference in Python. arXiv:1410.0870 [stat.ML], 2015.

Or, you can cite JMLR. The arXiv paper has been accepted to JMLR, so it should appear at some point (it is the same paper).

Jaakko Luttinen, BayesPy: Variational Bayesian Inference in Python. Journal of Machine Learning Research, to appear.
