Shape of sample_ppc #1529
If you use @AustinRochford can you lend some insight as to the need for a
Thanks. I omitted the size parameter and set samples = 100. Now I believe I have 100 datasets, each of 100 samples, from the posterior predictive distribution. I think this issue is related to the shape and size discussion on PR #862. The things I was able to infer from the discussion for my example
@fonnesbeck I am not familiar at all with the innards of the PPC sampling code, so I don't have much to add offhand, unfortunately.
Sorry, thought you had authored this.
No problem.
@taku-y did
Here's another question: why not have the PPC sampler return an object like the Trace object? Right now, the trace is an object that supports slicing, but the output of sample_ppc is a dictionary that requires more gymnastics.
Yeah you're totally right.
…On Nov 26, 2016 6:54 PM, "Kyle Beauchamp" ***@***.***> wrote:
Here's another question: why not have the PPC sampler return an object
like the Trace object? Right now, the trace is an object that supports
slicing, but the output of sample_ppc is a dictionary that requires more
gymnastics.
@kyleabeauchamp Want to do a PR?
Is there an obvious constructor for the MultiTrace object from a dictionary of traces? I'm not seeing a clear API on what it takes to construct the trace object outside of the usual chain.
Should we also document the fact that the PPC trace is actually reshuffled with respect to the original trace? E.g. does everyone find it obvious that this thing first draws shuffled points from the original trace (https://github.com/pymc-devs/pymc3/blob/481a231dd2ef31d5f1581e26320cf387edeed343/pymc3/sampling.py#L385) rather than iterating over the trace in order?
E.g. I want to prevent someone from assuming order and doing the following:
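The snippet originally referenced here did not survive extraction. As a hypothetical numpy sketch (the names `trace_p` and `ppc_p` are invented, and the index-drawing step stands in for the linked sampling.py line), the order-assuming pairing being warned against might look like this:

```python
import numpy as np

rng = np.random.RandomState(0)

# Stand-in for 1000 posterior draws of a parameter p.
trace_p = np.linspace(0.0, 1.0, 1000)

# The PPC sampler picks *random* indices into the trace, so PPC draw i
# does not come from trace point i.
idx = rng.randint(0, len(trace_p), size=len(trace_p))
ppc_p = trace_p[idx]

# The order-assuming anti-pattern: pairing trace and PPC by position.
pairs = list(zip(trace_p, ppc_p))  # these pairs are NOT matched draws

print(np.array_equal(trace_p, ppc_p))
```

Because the indices are shuffled, positional pairing silently mismatches posterior draws with PPC draws.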
That's indeed a bit counterintuitive. Could we not just change the order of that loop?
It's not clear why you would use the trace and the posterior predictive check draws together anywhere. It's doing what I would expect:
Let me know if I am reading that wrong.
I did not write the code for sample_ppc, only sample_vp. Sorry for the late reply.
OK, here's my simple model for why one might want to. Suppose we have a coin that lands on heads either 75% of the time or 25% of the time, and suppose we have observed 4 coin flips. The key parameter in the model is then the heads probability. Now we go and sample the model, and also the PP. The PP draws are essentially the resampled coin flips. I imagine I would want to plot a 2D histogram of the parameter against the PP draws. If we use the current PPC without zipping the MCMC trace for
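The coin example above can be sketched in plain numpy (not pymc3; the two-point prior, the observed count of 3 heads, and all variable names are assumptions introduced here for illustration):

```python
import numpy as np

rng = np.random.RandomState(42)

# Hypothetical two-point model: the coin's heads probability is 0.25 or
# 0.75 with equal prior probability; we observed 4 flips, say 3 heads.
p_values = np.array([0.25, 0.75])
heads, flips = 3, 4

# Posterior over p via Bayes' rule (binomial likelihood, uniform prior).
lik = p_values**heads * (1 - p_values)**(flips - heads)
post = lik / lik.sum()

# Draw posterior samples of p, then replicate flips conditioned on *each*
# draw, keeping the (p, replicated heads count) pairs aligned.
p_draws = rng.choice(p_values, size=5000, p=post)
ppc_heads = rng.binomial(flips, p_draws)

print(post.round(3))  # -> [0.1 0.9]
```

A 2D histogram of `(p_draws, ppc_heads)` is meaningful here only because each PPC draw is conditioned on the matching posterior draw, which is exactly the zipping being discussed.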
If you are doing posterior predictive checks (which is implied in the name of the function), all we want is a set of random draws from the conditional posterior for each data point we have observed. The only thing we are conditioning on are the predictors associated with the observed outcomes, not any particular draw of the intermediate variables.
I agree, I guess I'm saying there may sometimes be value in having things conditioned on the intermediates. AFAIK, in pymc2 my use case could be achieved using either hand-introduced auxiliary nodes or by using the masked array / missing values formalism.
If I'm doing posterior predictive checks, not only do I want the original MCMC sample shuffled, I want it to be sampled with replacement. Because, in principle, you could ask for more PPC samples than original samples, so there should be a means for providing that. I'm happy to have a function that is more flexible, but am suggesting it probably shouldn't be
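The with-replacement point can be illustrated with a small numpy sketch (all names here are hypothetical): drawing trace indices with replacement is what makes it possible to request more PPC draws than there are posterior draws.

```python
import numpy as np

rng = np.random.RandomState(1)

n_trace = 500   # posterior draws available in the trace
n_ppc = 2000    # more PPC draws requested than posterior draws

# Drawing indices with replacement makes n_ppc > n_trace possible;
# sampling without replacement would cap n_ppc at n_trace.
idx = rng.randint(0, n_trace, size=n_ppc)

print(idx.shape, idx.min() >= 0, idx.max() < n_trace)
```

By the pigeonhole principle, some trace points are necessarily reused when `n_ppc > n_trace`.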
That's a good proposal.
agreed |
My question about how the predictor values are used to calculate the posterior predictive samples is still not answered. I want to calculate the variance of the posterior predictive samples y_new after subtracting (aX + b) [here X is the predictor variable], i.e. Var(y_new - (aX + b)), where y_new is the list of posterior predictive samples returned. Can someone explain how to do this?
@shkr you can just manipulate the posterior samples from
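One way the Var(y_new - (aX + b)) manipulation could look, as a numpy sketch (the shapes, the posterior draws of a and b, and all names are assumptions introduced here for illustration):

```python
import numpy as np

rng = np.random.RandomState(0)

# Hypothetical shapes: 500 posterior draws of a and b, and matching PPC
# samples y_new for 100 observed predictor values X.
X = np.linspace(0.0, 1.0, 100)
a_draws = rng.normal(2.0, 0.1, size=(500, 1))
b_draws = rng.normal(1.0, 0.1, size=(500, 1))
y_new = a_draws * X + b_draws + rng.normal(0.0, 0.5, size=(500, 100))

# Residual variance per posterior draw: Var(y_new - (aX + b)).
resid = y_new - (a_draws * X + b_draws)
var_per_draw = resid.var(axis=1)

print(var_per_draw.shape)  # -> (500,)
```

Note the caveat from earlier in this thread: subtracting per-draw (aX + b) from per-draw y_new only makes sense if the rows of the PPC output are aligned with the posterior draws of a and b, which the shuffled-index behavior of sample_ppc does not guarantee.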
As per the sample_ppc documentation found at https://pymc-devs.github.io/pymc3/api.html#pymc3.sampling.sample_ppc, I am expecting a keyed dictionary, in this case a dict with the key y_obs, since I have only one observed variable. I expect ppc['y_obs'] to be a matrix of 500 datasets of posterior predictive samples, each of 100 samples, drawn from different values of the posterior distribution. However, the shape of ppc['y_obs'] is not (500, 100) as expected but (500, 100, 100). Can someone explain what this matrix is and why its shape is not (500, 100)?
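One way the extra dimension can arise, as a hedged numpy sketch (not the actual sample_ppc internals; samples=500, size=100, and 100 observed points are assumed): if each posterior draw simulates `size` replicate datasets of the 100 observed points, stacking the results yields a 3-D array.

```python
import numpy as np

rng = np.random.RandomState(0)

samples, size, n_obs = 500, 100, 100
mu = np.zeros(n_obs)  # stand-in for one posterior draw's predicted mean

# Each of the `samples` posterior draws simulates `size` replicate
# datasets of the n_obs observed points, so the stacked result is 3-D.
ppc = np.stack([rng.normal(mu, 1.0, size=(size, n_obs))
                for _ in range(samples)])

print(ppc.shape)  # -> (500, 100, 100)
```

With only one replicate dataset per posterior draw (i.e. no size dimension), the result would collapse to the expected (500, 100), which matches the earlier comment that omitting the size parameter resolved the shape confusion.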