Added iterative sampling. #433
if progressbar:
    progress.update(i)
trace.record(point)
yield trace
except KeyboardInterrupt:
Looks like you can remove this try statement
Definitely
Seems fine, but I don't think I get the purpose. Why does it help with convergence checking/animated plotting?
well, a pattern could be that you loop through iter_sample() and inside the for block check if some criterion is reached. Or you could make a call to update the plot with the most recent samples in the for-block (as I do).
Is it that you want to make different versions of How come you can just add some statements in the for loop? I'm not opposed to this, I just don't quite get the benefit. It does look pretty.
Certainly that's possible with this. Not sure we can come up with a method that works for everyone to include in pymc but you could imagine sampling until you got 1000 samples with a geweke score < X. My specific use case was real-time plotting during sampling.
Ooh, okay, more for external users if they want to do their own intersampling logic. I get it now.
Right, exactly.
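For readers following along, the "loop and check a criterion" pattern described above could look roughly like the sketch below. Everything in it is an assumption for illustration: `iter_sample` here is a toy random-walk Metropolis generator on a standard normal (not the pymc function from this PR), `geweke_z` is a simplified Geweke-style z-score that ignores autocorrelation, and `sample_until_stable`, `min_draws`, and `threshold` are hypothetical names.

```python
import math
import random
import statistics


def iter_sample(draws, start=0.0, scale=1.0, seed=0):
    """Toy stand-in for an iterative sampler (NOT the pymc implementation).

    Random-walk Metropolis on a standard normal; yields the growing
    trace (a plain list) after every draw.
    """
    rng = random.Random(seed)
    x = start
    trace = []
    for _ in range(draws):
        proposal = x + rng.gauss(0.0, scale)
        # Log acceptance ratio for a standard normal target.
        log_ratio = 0.5 * (x * x - proposal * proposal)
        # 1 - random() lies in (0, 1], so the log is always defined.
        if math.log(1.0 - rng.random()) < log_ratio:
            x = proposal
        trace.append(x)
        yield trace


def geweke_z(trace, first=0.1, last=0.5):
    """Simplified Geweke-style score: compare early and late segment means.

    Assumption: plain sample variances are used, so autocorrelation is
    ignored. A real diagnostic would use spectral density estimates.
    """
    n = len(trace)
    a = trace[: int(first * n)]
    b = trace[int((1.0 - last) * n):]
    denom = math.sqrt(statistics.variance(a) / len(a)
                      + statistics.variance(b) / len(b))
    if denom == 0.0:
        return math.inf
    return (statistics.mean(a) - statistics.mean(b)) / denom


def sample_until_stable(max_draws=5000, min_draws=200, threshold=1.0):
    """Stop early once the score drops below the threshold."""
    trace = []
    for trace in iter_sample(max_draws):
        n = len(trace)
        # Check only every 100 draws to keep the diagnostic cheap.
        if n >= min_draws and n % 100 == 0 and abs(geweke_z(trace)) < threshold:
            break
    return trace


if __name__ == "__main__":
    print(len(sample_until_stable()), "draws kept")
```

The same loop body is where a plotting call would go for the real-time plotting use case.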
Would definitely be cool to see real-time plotting too at some point. (Let's merge this once it passes and check to make sure it's not slowing things greatly.)
@jsalvatier yeah, I have a notebook that's pretty sweet. Will upload a blog post soon.
OK, doesn't seem like there's a major performance regression.
Great!
Here are some visualizations:
Very cool. Regarding using this for convergence, it is probably more robust to use Gelman-Rubin with multiple chains than Geweke. In general, we should be working towards running multiple chains by default, since every modern machine will be multicore. This would facilitate (on-the-fly) R-hat calculation.
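As a rough illustration of the R-hat calculation mentioned above, here is a minimal sketch of the classic (non-split) Gelman-Rubin statistic for one scalar parameter. The function name `gelman_rubin` and the list-of-lists input are assumptions made for this example and are not part of pymc; the sketch also assumes equal-length chains with nonzero within-chain variance.

```python
import math
import statistics


def gelman_rubin(chains):
    """Classic Gelman-Rubin R-hat for a single scalar parameter.

    `chains` is a list of equal-length lists of draws (hypothetical
    input format chosen for this sketch).
    """
    n = len(chains[0])  # draws per chain
    chain_means = [statistics.mean(c) for c in chains]
    # W: average of the within-chain sample variances.
    within = statistics.mean([statistics.variance(c) for c in chains])
    # B: n times the sample variance of the chain means.
    between = n * statistics.variance(chain_means)
    # Pooled estimate of the posterior variance.
    var_hat = (n - 1) / n * within + between / n
    return math.sqrt(var_hat / within)


if __name__ == "__main__":
    overlapping = [[0.1, -0.3, 0.2, 0.0], [0.0, 0.2, -0.1, -0.2]]
    separated = [[0.0, 1.0, 0.0, 1.0], [10.0, 11.0, 10.0, 11.0]]
    print(gelman_rubin(overlapping))
    print(gelman_rubin(separated))
```

Values near 1 suggest the chains agree; values well above 1 suggest they have not mixed. Calling this inside an iterative sampling loop on the draws collected so far would give the on-the-fly variant.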
Agreed. Doing parallel sampling iteratively would require a bit more work but shouldn't be too hard. Will probably need to fix the parallel pickling issues we're having now (do those still exist @jsalvatier?).
I don't think we're getting errors when doing things in parallel now, but
On Thu, Jan 2, 2014 at 10:59 AM, Thomas Wiecki notifications@github.com wrote:
Before, `iter_sample` returned a single-chain trace object, not a `MultiTrace` instance like `sample`. This is an issue for functions that rely on a `MultiTrace`-like interface. See #632.
I moved most of `sample()` into `iter_sample()` (open to name suggestions). `iter_sample()` can be used in a for-loop. This is useful for convergence checking and animated plotting during sampling. `sample()` now just loops over `iter_sample()`. I'll wait for the tests to see if that impaired performance somehow.
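The relationship between the two functions can be sketched as a generator plus a thin wrapper that exhausts it. This is only a schematic under stated assumptions: the signatures below are simplified and hypothetical, and the toy generator draws independent normal values instead of running a real step method.

```python
import random


def iter_sample(draws, seed=0):
    """Toy generator (hypothetical signature): yields the trace after each draw."""
    rng = random.Random(seed)
    trace = []
    for _ in range(draws):
        # Stand-in for "take one sampler step and record the point".
        trace.append(rng.gauss(0.0, 1.0))
        yield trace


def sample(draws, seed=0):
    """Thin wrapper: run the generator to completion, return the final trace."""
    trace = []  # returned unchanged if draws == 0
    for trace in iter_sample(draws, seed):
        pass
    return trace


if __name__ == "__main__":
    print(len(sample(1000)))
```

With this split, callers who only want the final result keep using `sample()`, while callers who need per-iteration logic drive `iter_sample()` themselves.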