-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use DataFrames for ChainSummaries #79
Conversation
…ains.jl into dfchainsummary # Conflicts: # src/MCMCChains.jl # src/dfchainsummary.jl
Fix issue betwee Julia 1.1 and Julia 1.2
1. Renamed dfchainsummary.jl to summarize.jl. 2. Renamed and added several more test cases to summarize_tests.jl. 3. Updated MCMCChains.jl accordingly.
…ains.jl into dfchainsummary
I've added an |
Looks like our tests are passing. This is now ready for a formal review. |
I think this looks great Cameron! |
I'm happy to review the PR by tomorrow. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I said earlier, this looks great. Once merged I will update all models in CmdStan and StatisticalRethinking to use summarize
. But right now all ~60 models run.
Cameron, just as a thought, would it be useful to have |
I actually wasn't sure what to do about that, so I'm happy to take suggestions there. My thinking as to why it doesn't actually return anything is that I had thought it was intended as primarily an I/O function to show you information about the chain --- not to return the dataframes. I had it set to do both but if you ran Perhaps a better option would be to have |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the work, look good. However, I think this PR is a bit too large and it's therefore hard for me to carefully review the PR. Maybe next time make sure that the PRs stay smaller and doesn't try to solve so many issues.
Look good to me. I would not add any further functionalities as this PR is already quite large. Better merge the PR and open a new one for the |
Yes, running it in the REPL would require |
Merging for now, since I've only modified comments and previous tests were passing. Thanks for the review, guys! |
This PR switches MCMCChains off the strange I/O driven
ChainSummary
andChainSummaries
structs, and uses a much simplerChainDataFrame
struct that brings a lot of the versatility ofDataFrames
for printing, organization, and indexing. It's now a lot easier to get values out ofChainSummary
structs (see #71).Currently, the bulk of the work is done. There's still a couple of remaining steps, namely tidying and making sure all the tests are doing what they're supposed to:
ChainSummary
andChainSummaries
entirelyThere is a new function called
summarize
which maps a function parameter-wise and returns a dataframe. Hopefully this'll make it quicker to get some non-traditional summary stats, and it's used a lot on the backend. Example:The indexing on these is overloaded to put a higher priority on rows:
The
show
function forChains
now uses this as well:All the stats functions include keyword arguments for
sections
andappend_chains
. Ifappend_chains=true
, then all the chains are smushed together into one chain. Otherwise, you'll get a vector ofChainDataFrames
, one for each chain.sections
just allows you to subset your chain by sections.