DIC with multiple chains #648

Closed
jseabold opened this issue Nov 13, 2014 · 8 comments

@jseabold
Contributor

The way the deviance information criterion (DIC) code is written in 2.x right now, IIUC, only the last chain is used because the default argument for nchains in the Trace objects is -1. Is this intentional? I guess in practice it may not end up mattering much if you're reasonably happy with the sampling from the posterior in the last chain, but if I'm running multiple chains from overdispersed starting values, wouldn't I want to compute the DIC from the traces of all the chains?

@fonnesbeck
Member

It was intentional, but I'm happy to rethink it. I figured one chain would be enough to get a reasonable DIC estimate, and if it varied a lot from chain to chain, then you would not be happy with the set of non-converged samples anyhow, so you wouldn't need DIC.

At present it is a property, but we could make it a function that took a model and a chain index as an argument:

dic_all_chains = pymc.dic(my_model, chain=None)
dic_current_chain = pymc.dic(my_model, chain=-1)
dic_other_chain = pymc.dic(my_model, chain=2)

Would that be a better implementation?
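To make the proposed `chain` semantics concrete, here is a minimal pure-Python sketch. It is not PyMC code: `dic`, `normal_deviance`, and the toy data are hypothetical names introduced only for illustration, and the model (normal likelihood with known sigma and a single parameter `mu`) is an assumption. It uses the usual definition DIC = mean deviance + pD, with pD = mean deviance minus the deviance at the posterior mean, computed over whichever samples `chain` selects.

```python
import math
from statistics import fmean


def normal_deviance(mu, data, sigma=1.0):
    """Deviance (-2 * log-likelihood) for i.i.d. Normal(mu, sigma) data."""
    n = len(data)
    squared_error = sum((y - mu) ** 2 for y in data)
    return n * math.log(2 * math.pi * sigma ** 2) + squared_error / sigma ** 2


def dic(chains, deviance_fn, chain=None):
    """Hypothetical DIC helper; `chains` is a list of per-chain sample lists.

    chain=None pools every chain; an integer picks one chain (-1 = last).
    """
    if chain is None:
        samples = [s for trace in chains for s in trace]
    else:
        samples = list(chains[chain])
    mean_deviance = fmean([deviance_fn(s) for s in samples])
    deviance_at_mean = deviance_fn(fmean(samples))
    p_d = mean_deviance - deviance_at_mean  # effective number of parameters
    return mean_deviance + p_d


# Toy data and two short, made-up "chains" of samples for mu.
data = [0.0, 1.0, 2.0]
chains = [[0.5, 1.5], [1.0, 1.0]]


def deviance_of_mu(mu):
    return normal_deviance(mu, data)


dic_all_chains = dic(chains, deviance_of_mu, chain=None)
dic_current_chain = dic(chains, deviance_of_mu, chain=-1)
dic_first_chain = dic(chains, deviance_of_mu, chain=0)
```

In this toy case the three values differ because the two chains have different spread, which is the situation where the choice between pooled and single-chain DIC would matter.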

@jseabold
Contributor Author

Thanks. I agree that the current implementation is pretty reasonable given what you say. I don't have a sense of whether there would be any real difference in the computed value, given that you're happy with convergence. I've sometimes seen preference given in the literature to one model over another based on (subjectively) pretty small differences in DIC, but that's a methodological issue.

Maybe a more important sub-question: what about also adding a start/burn keyword? Maybe my workflow needs some changing, but I've been running some pretty time-consuming models and then deciding on burn-in. For the most part this is accommodated, but the DIC is an exception to this.

@fonnesbeck
Member

PyMC 2 is very much geared towards a burn-at-sampling workflow, given the burn and thin arguments for sample (PyMC 3, however, follows the workflow you suggest). I usually throw away 80-90% of my samples conservatively at sampling time, so I don't have to worry about it later. Because it is implemented as a property, it obviously can't take arguments, but we can implement a function or method if that helps.
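For the decide-on-burn-in-later workflow, a rough sketch of applying burn and thin after sampling to stored deviance traces might look like the following. This is plain Python, not the PyMC API: `trim`, `mean_deviance`, and the numbers are hypothetical and chosen only to show how early, unconverged samples inflate the mean deviance that feeds into DIC.

```python
from statistics import fmean


def trim(trace, burn=0, thin=1):
    """Drop the first `burn` samples, then keep every `thin`-th one."""
    return list(trace[burn::thin])


def mean_deviance(deviance_chains, burn=0, thin=1):
    """Mean deviance pooled over all chains after post-hoc burn/thin."""
    kept = [d for trace in deviance_chains for d in trim(trace, burn, thin)]
    return fmean(kept)


# Made-up deviance traces: each chain starts far from the mode.
deviance_chains = [
    [30.0, 20.0, 10.0, 10.0],
    [50.0, 12.0, 10.0, 12.0],
]

raw = mean_deviance(deviance_chains)             # uses every sample
burned = mean_deviance(deviance_chains, burn=2)  # drops 2 samples per chain
```

Here `raw` is pulled upward by the first samples of each chain, while `burned` reflects only the later part of each trace.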

@jseabold
Contributor Author

I had that impression but surprisingly found that for the most part I could get by with post-sampling adjustments. I just rolled my own as a solution. Up to you whether you want to add the convenience function. Feel free to close this as you see fit.

I only briefly tried to install PyMC 3. I will probably switch to it after I wrap up this project and continue to do more statistics by simulation.

@twiecki twiecki added the v.2 label Feb 24, 2015
@twiecki
Member

twiecki commented Feb 24, 2015

Should this issue be moved to pymc?

@jseabold
Contributor Author

Splitting up the repos? 👍

@fonnesbeck
Member

Moving this over to #2

@sammosummo

@jseabold What was your solution?
