next manual, 2.12.0++ #2051
From @wds15 on mailing list, moved from stan-dev/math#44:
|
Section 11.2, Meta-Analysis: in the transformed data block the loop runs over j, but the right-hand side of the calculation of sigma[j] is indexed by i.
|
From Gary Schulz on stan-users: Reference manual for v2.11.0 in section 45.3 says that the exponent of
|
From @davharris in #2065. The mod operator
|
It decomposes the Dirichlet into mean theta and prior count (minus K) kappa. You can just skip the Dirichlet prior on theta, in which case it defaults to uniform over simplexes.
|
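A minimal Python sketch of the decomposition described above, assuming it takes the form alpha = kappa * theta with theta a simplex and kappa > 0. The function names `dirichlet_alpha` and `dirichlet_mean` are hypothetical and only illustrate the arithmetic; this is not the manual's Stan code.

```python
def dirichlet_alpha(theta, kappa):
    """Build the Dirichlet parameter alpha = kappa * theta (assumed form)."""
    if min(theta) < 0 or abs(sum(theta) - 1.0) > 1e-9:
        raise ValueError("theta must be a simplex")
    if kappa <= 0:
        raise ValueError("kappa must be positive")
    return [kappa * t for t in theta]


def dirichlet_mean(alpha):
    """Mean of a Dirichlet(alpha) distribution: alpha / sum(alpha)."""
    total = sum(alpha)
    return [a / total for a in alpha]


theta = [0.2, 0.3, 0.5]   # mean (a simplex)
kappa = 10.0              # concentration
alpha = dirichlet_alpha(theta, kappa)
recovered_mean = dirichlet_mean(alpha)
```

Under this assumption the entries of `alpha` sum to `kappa`, and the Dirichlet mean recovers `theta`.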
Martin Stjernman on stan-users wrote in that:
and then it is stated that the following assignment is legal:
To me this is a one-dimensional array that has 3 elements (length 3), where each element is a 7-element vector.
To me this is a two-dimensional array of 7x2 matrices; it has 15*12 elements arranged in 15 rows and 12 columns, where each element is a 7 by 2 matrix.
|
From Luiz Max Carvalho on stan-users:
Imagine I have N (discrete) observations from n individuals. Each observation X_i is a vector {X_1i, X_2i, ..., X_k}, where sum(X_i) = n. The problem I have is: N is in the millions, and since we can only have n(n+1)/2 different values for X, this means we'll have many "repeated" observations, i.e., each X_j (j = 1, 2, ..., n(n+1)/2) will appear f_j times, with sum(f_j) = N. So far, I've been downsampling the data proportional to f_j, but I'd like to use all of the data.
How can I include the frequencies in the multinomial likelihood in Stan? If I were doing this outside of Stan I'd just sum the log of each frequency f_j to the (multinomial) log-likelihood of X_j. Is there a way of incrementing l_p to achieve this?
@bgoodri replied: In this case, you can do
Since the duplicative observations are actually observed, this is okay, as opposed to the situation where the f_j are estimates of how many people in the population are in stratum j.
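As a rough illustration of the arithmetic behind frequency weighting (a Python sketch, not the Stan code from the reply, which is not reproduced here): multiplying the log-likelihood of each distinct observation X_j by its count f_j gives the same total as summing over all N raw observations. The helper names `multinomial_lpmf` and `weighted_log_lik` and the toy data are assumptions made for this example.

```python
import math


def multinomial_lpmf(x, theta):
    """Log multinomial probability of the count vector x given probabilities theta."""
    n = sum(x)
    log_coef = math.lgamma(n + 1) - sum(math.lgamma(xi + 1) for xi in x)
    # Skip zero counts so that 0 * log(theta_k) never has to be evaluated.
    return log_coef + sum(xi * math.log(t) for xi, t in zip(x, theta) if xi > 0)


def weighted_log_lik(distinct_xs, freqs, theta):
    """Sum over distinct observations of f_j * log p(X_j | theta)."""
    return sum(f * multinomial_lpmf(x, theta) for x, f in zip(distinct_xs, freqs))


theta = [0.5, 0.3, 0.2]
distinct_xs = [[2, 0, 0], [1, 1, 0], [0, 1, 1]]  # toy distinct outcomes, each with n = 2
freqs = [4, 3, 2]                                # toy counts f_j, so N = 9

# Expand the data back to one row per raw observation for comparison.
expanded = [x for x, f in zip(distinct_xs, freqs) for _ in range(f)]
full = sum(multinomial_lpmf(x, theta) for x in expanded)
weighted = weighted_log_lik(distinct_xs, freqs, theta)
```

Here `full` and `weighted` agree up to floating-point error, which is why the compressed data set loses nothing when the repeated observations were all genuinely observed.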
|
Portia Brat reported on stan-users that there's a bug on pp. 74--75, in the "Optimization through Vectorization" section, which asks users to replace
with
|
From Sean Matthews on stan-users list:
|
Thanks to @skanskan via stan-dev/rstan#348: Typo on page 36:
Or I suppose it could conceivably be
|
Include a simple example for generating data from a univariate regression at the point where we talk about fake data simulation ("One of the best ways to make sure your model is doing the right thing computationally is to generate simulated"). Also reference the "Sampling without Parameters" section in the HMC chapter.
You can generate fake data for a regression given the parameters (and sizes):
If you had priors on alpha, beta and sigma, you would specify the prior parameters as data and then generate alpha, beta and sigma in the generated quantities block; in fact, without such priors, you're not generating from the model itself.
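The idea of simulating fake data given fixed parameters can be sketched in Python as follows. This is not the Stan program proposed for the manual; it only shows the generative step y_n ~ normal(alpha + beta * x_n, sigma). The function name `simulate_regression`, the predictor grid, and the parameter values are all assumptions chosen for illustration.

```python
import random


def simulate_regression(x, alpha, beta, sigma, seed=None):
    """Draw y[n] ~ normal(alpha + beta * x[n], sigma) for each predictor value."""
    rng = random.Random(seed)  # local generator so the draws are reproducible
    return [rng.gauss(alpha + beta * xn, sigma) for xn in x]


N = 100
x = [n / N for n in range(N)]  # hypothetical predictor values in [0, 1)
y = simulate_regression(x, alpha=1.0, beta=2.0, sigma=0.5, seed=1234)
```

Fitting the model to `(x, y)` and checking whether the known values of alpha, beta and sigma are recovered is the kind of check the quoted sentence describes.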
|
Documentation for sampling statements mentions functions that aren't defined anywhere, such as
Shouldn't those be
|
@feuerbach Correct, those should be
|
From Stephen Martin on stan-users:
Plot histograms of
This lets you generalize beyond intercepts, too, if you have other predictors.
|
Just curious, what sort of inferences aren't sensitive to the labels?
You can do prediction, i.e., likelihood for a new observation x', where
You can do similarity, i.e., Pr[items n and n' belong to same group | x, y].
Say you're wanting to fit a latent group model from data. You want to know what probability each person has of belonging to each of K groups, then the parameters of said K groups. Exactly because of label switching this isn't a valid inference. Is there a way of obtaining such parameters in a way that doesn't have a label switching problem?
No. You can't evaluate what the probability is that item n belongs to mixture component k. You can get the marginal for a parameter, such as mu, given item i. It'll be multimodal.
The conversation went on, and Stephen suggested adding constraints. In some cases, that can lead to an identifiable model.
Also, if the modes truly are symmetric (as in a classic mixture, not as in an LDA model with substantive alternative modes) and the problem is just label switching, you can sometimes post-process and do Bayes-like inferences on the identities of the clusters, like the probability of an item belonging to a cluster.
Also, to make things more precise, talk about which inferences are invariant under label switching.
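The two label-invariant quantities mentioned above, the predictive density for a new observation and the probability that two items share a component, can be checked numerically. This is a Python sketch for a single set of parameter values in a two-component normal mixture; a full analysis would average these quantities over posterior draws. The function names and the parameter values are hypothetical.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of normal(mu, sigma) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def mixture_density(x_new, weights, mus, sigmas):
    """Mixture density at x_new: sum_k weights[k] * normal(x_new | mus[k], sigmas[k])."""
    return sum(w * normal_pdf(x_new, m, s) for w, m, s in zip(weights, mus, sigmas))


def responsibilities(x, weights, mus, sigmas):
    """Pr[item belongs to component k | x, parameters]; this DOES depend on the labels."""
    parts = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, mus, sigmas)]
    total = sum(parts)
    return [p / total for p in parts]


def same_component_prob(x_a, x_b, weights, mus, sigmas):
    """Pr[two items share a component | parameters]; does not depend on the labels."""
    r_a = responsibilities(x_a, weights, mus, sigmas)
    r_b = responsibilities(x_b, weights, mus, sigmas)
    return sum(a * b for a, b in zip(r_a, r_b))


weights, mus, sigmas = [0.3, 0.7], [-1.0, 2.0], [1.0, 0.5]
# Swap the two component labels.
weights_sw, mus_sw, sigmas_sw = weights[::-1], mus[::-1], sigmas[::-1]

density = mixture_density(0.4, weights, mus, sigmas)
density_swapped = mixture_density(0.4, weights_sw, mus_sw, sigmas_sw)
same = same_component_prob(0.4, 1.9, weights, mus, sigmas)
same_swapped = same_component_prob(0.4, 1.9, weights_sw, mus_sw, sigmas_sw)
```

Swapping the labels leaves `density` and `same` unchanged, while the individual entries of `responsibilities` trade places, which is the distinction the discussion draws.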
|
Summary:
This is where updates for the 2.12 manual should go.
v2.12.0