Issue/838 hmm #840
base: master
Conversation
I took a look through this and while I'm in favor of minimal examples, this one's a bit too minimal. I would really like to see the model laid out in more detail than the conditional distributions. Here are some concrete suggestions:
array[3] simplex[3] gamma_arr;
matrix[3, 3] gamma;
for (n in 1:3) gamma[n] = gamma_arr[n]';  // transpose: a simplex is a column vector, but a matrix row is a row_vector
for (n in 1:N) {
for (k in 1:3) {
log_omega[k, n] = normal_lpdf(y[n] | mu[k], sigma);
}
}
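To make the suggestion concrete, here is a sketch of how those two pieces might sit together in a transformed parameters block (assuming, as in the surrounding example, `N` observations `y`, per-state means `mu`, and a shared scale `sigma`; names are illustrative):

```stan
transformed parameters {
  matrix[3, 3] gamma;
  matrix[3, N] log_omega;
  // Pack the array of simplexes into a (right-stochastic) transition matrix.
  for (n in 1:3) gamma[n] = gamma_arr[n]';
  // log_omega[k, n] is the log density of observation n under latent state k.
  for (n in 1:N) {
    for (k in 1:3) {
      log_omega[k, n] = normal_lpdf(y[n] | mu[k], sigma);
    }
  }
}
```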
I'm rewriting the code to make the transition matrix less constrained (per comment 4) and I wanted to check: we don't have a stochastic matrix type, right?
We have row- and column-stochastic matrix types, but not doubly stochastic yet: https://mc-stan.org/docs/reference-manual/types.html#stochastic-matrices
@bob-carpenter I implemented your feedback. Some questions:
For (5), I think the idea's that there are three sources of error: measurement error, modeling error, and sampling error. For example, sampling error arises when you subsample a population and use that for estimation. You get modeling error if you use a linear regression for a relationship that's not linear or use normal errors when the errors are skewed, and so on. If you're weighing things with a scale and you know the scale's biased to the high side, you can correct that measurement error. You can explicitly add a measurement error model if you know your measurement model (e.g., gravitational lensing is part of the measurement error model; your work with Bruno et al. on deconvolving galactic dust is part of the measurement model for the CMB, etc.).
There's a ton of little things to fix, but nothing major.
@@ -294,6 +294,21 @@ pagetitle: Alphabetical Index
- <div class='index-container'>[distribution statement](unbounded_discrete_distributions.qmd#index-entry-0c7465aa1beceb6e7e303af36b60e2b847fc562a) <span class='detail'>(unbounded_discrete_distributions.html)</span></div>
<a id='beta_neg_binomial_cdf' href='#beta_neg_binomial_cdf' class='anchored unlink'>**beta_neg_binomial_cdf**:</a>
Why is this PR touching negative binomial? Are you up to date with the main branch?
latent state $k$ is parameterized by a $V$-simplex $\phi_k$. The
observed output $y_t$ at time $t$ is generated based on the hidden
state indicator $z_t$ at time $t$,
When $z_{1:N}$ is continuous, the user can explicitly encode these distributions
Isn't $z_{1:N}$ a subset of $\{1, \ldots, K\}$, and hence discrete? Did you perhaps mean to say that the output $y_{1:N}$ is continuous?
Next, we introduce the $K \times K$ transition matrix, $\Gamma$, with
$$
\Gamma_{ij} = p(z_n = j \mid z_{n - 1} = i, \phi).
$$
I think you need to mention that this is a right-stochastic matrix, where we now have that data type built in.
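In recent Stan releases (2.36 and later), that built-in type lets the transition matrix be declared directly, rather than assembled from an array of simplexes; a minimal sketch:

```stan
parameters {
  // Right-stochastic: each row is a simplex, i.e., the outgoing
  // transition distribution from one latent state.
  row_stochastic_matrix[K, K] Gamma;
}
```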
Finally, we define the initial state $K$-vector $\rho$, with
$$
\rho_k = p(z_0 = k \mid \phi).
$$
I would mention at this point that it is common to take
In the situation where the hidden states are known, the following
naive model can be used to fit the parameters $\theta$ and $\phi$.
As an example, consider a three-state model, with $K = 1, 2, 3$.
The observations are normally distributed with
$K$ is an integer, so I think that should just be $K = 3$ (with states $k = 1, 2, 3$).
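For the supervised case the quoted text describes, where the state sequence `z` is observed data, the naive model might be sketched as follows (variable names are illustrative and assume the transition matrix `gamma` from earlier):

```stan
model {
  // Transitions: categorical() takes a (column) simplex,
  // so the matrix row must be transposed.
  for (n in 2:N) {
    z[n] ~ categorical(gamma[z[n - 1]]');
  }
  // Emissions: normal observation model per latent state.
  for (n in 1:N) {
    y[n] ~ normal(mu[z[n]], sigma);
  }
}
```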
The model for the supervised data does not change; the unsupervised
data are handled with the following Stan implementation of the forward
algorithm.
The last function `hmm_marginal` takes in all the ingredients of the HMM,
Eliminate the comma---English doesn't use commas between conjunctions unless there are more than two. So it's just "A and B", but it's either "A, B, and C" (Oxford style) or "A, B and C" (defective American style).
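For reference, the built-in call being discussed is a one-liner; a sketch, assuming `log_omega` is the $K \times N$ matrix of emission log densities, `Gamma` the $K \times K$ transition matrix, and `rho` the initial-state $K$-simplex:

```stan
model {
  // Increments the target with log p(y | phi), summing over all
  // hidden-state sequences via the forward algorithm.
  target += hmm_marginal(log_omega, Gamma, rho);
}
```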
different from sample to sample in the posterior.
To obtain samples from the posterior distribution of $z$,
we use the generated quantities block and draw, for each sample $\phi$,
a sample from $p(z \mid y, \phi)$.
You don't typically need draws from $p(z \mid y, \phi)$. You can also use the forward-backward algorithm to work out the marginal state probabilities.
and with the draw in generated quantities, we obtain draws from
$p(\phi \mid y) p(z \mid y, \phi) = p(z, \phi \mid y)$.
It is also possible to compute the posterior probability of each hidden state,
that is $\text{Pr}(z_n = k \mid \phi, y)$.
Typically this $\phi$ is marginalized out and we look at $\text{Pr}(z_n = k \mid y)$.
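Stan exposes these per-state posterior probabilities through the built-in `hmm_hidden_state_prob`; a sketch of its use, assuming the same `log_omega`, `Gamma`, and `rho` as above:

```stan
generated quantities {
  // hidden_probs[k, n] = Pr(z_n = k | phi, y) for one posterior draw of phi;
  // averaging over draws approximates Pr(z_n = k | y).
  matrix[K, N] hidden_probs = hmm_hidden_state_prob(log_omega, Gamma, rho);
}
```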
This function cannot be used to compute the joint probability
$\text{Pr}(z \mid \phi, y)$, because such calculation requires accounting
for the posterior correlation between the different components of $z$.
Therefore, `hidden_probs` should NOT be used to obtain posterior samples.
NOT -> not (that'll make it italics, which is the standard way to add emphasis to typeset text).
This point is a bit more subtle, though. You can use this to obtain posterior draws of the marginals
Instead, users should rely on `hmm_latent_rng`.
Although not necessary, it's nice to have examples of how to do that. Like:
generated quantities {
array[N] int<lower=1, upper=K> z = hmm_latent_rng(...fill-in params here to match example...);
}
I know this feels obvious, but it can be hard for the users to put the arguments and result types together.
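A concrete version of that sketch, assuming the `log_omega`, `Gamma`, and `rho` objects defined earlier in the chapter:

```stan
generated quantities {
  // One joint draw of the full hidden path from p(z | y, phi),
  // accounting for posterior correlation between the components of z.
  array[N] int<lower=1, upper=K> z = hmm_latent_rng(log_omega, Gamma, rho);
}
```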
Submission Checklist
Summary
Addresses issue #838 and updates user-doc on HMMs.
Copyright and Licensing
Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company): Charles Margossian, Simons Foundation
By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses: