Added iterative sampling. #433

Merged
merged 4 commits into master from iter_sample
Jan 1, 2014
twiecki
Member

@twiecki twiecki commented Dec 31, 2013

I moved most of sample() into iter_sample() (open to name suggestions). iter_sample() can be used in a for-loop. This is useful for convergence checking and animated plotting during sampling.

sample() now just loops over iter_sample(). I'll wait for the tests to see if that impaired performance somehow.
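The refactoring described above can be illustrated with a toy example. This is only a sketch of the generator pattern, not the pymc implementation: `toy_iter_sample`, `toy_sample`, and `std_normal_logp` are hypothetical names, and the sampler is a plain random-walk Metropolis step on a single float.

```python
import math
import random


def toy_iter_sample(draws, logp, start=0.0, step_size=1.0, seed=0):
    """Yield the growing trace after every random-walk Metropolis step."""
    rng = random.Random(seed)
    point = start
    trace = []
    for _ in range(draws):
        proposal = point + rng.gauss(0.0, step_size)
        # Accept with probability min(1, p(proposal) / p(point)).
        log_ratio = logp(proposal) - logp(point)
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            point = proposal
        trace.append(point)
        yield trace  # the caller regains control after each draw


def toy_sample(draws, logp, **kwargs):
    """Non-iterative wrapper: exhaust the generator, return the final trace."""
    trace = []
    for trace in toy_iter_sample(draws, logp, **kwargs):
        pass
    return trace


def std_normal_logp(x):
    """Log-density of a standard normal, up to an additive constant."""
    return -0.5 * x * x


final_trace = toy_sample(1000, std_normal_logp)
```

Because the wrapper only drains the generator, the two entry points share one sampling loop, which is the point of the change.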

        if progressbar:
            progress.update(i)
        trace.record(point)
        yield trace
except KeyboardInterrupt:
Member

Looks like you can remove this try statement

Member Author

Definitely

@jsalvatier
Member

Seems fine, but I don't think I get the purpose. Why does it help with convergence checking/animated plotting?

@twiecki
Member Author

twiecki commented Dec 31, 2013

Well, one pattern is to loop through iter_sample() and check inside the for block whether some criterion has been reached. Or you could update a plot with the most recent samples inside the for block (as I do).
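A minimal sketch of that loop pattern follows. Everything here is an assumption made for illustration: `stand_in_iter_sample` is a hypothetical generator that merely yields a growing list of Gaussian draws (it stands in for a real sampler), and `sample_until` with its `should_stop` and `on_update` parameters is not part of pymc.

```python
import random


def stand_in_iter_sample(draws, seed=0):
    """Hypothetical stand-in for an iterative sampler: yields a growing trace."""
    rng = random.Random(seed)
    trace = []
    for _ in range(draws):
        trace.append(rng.gauss(0.0, 1.0))
        yield trace


def sample_until(trace_iter, should_stop, on_update=None):
    """Consume an iterative sampler until a user-supplied criterion is met."""
    trace = []
    for trace in trace_iter:
        if on_update is not None:
            on_update(trace)  # e.g. redraw a plot with the latest samples
        if should_stop(trace):
            break  # stop early; no further draws are requested
    return trace


# Stop as soon as 10 draws are available, recording the trace length each step.
seen_lengths = []
trace = sample_until(
    stand_in_iter_sample(100),
    should_stop=lambda t: len(t) >= 10,
    on_update=lambda t: seen_lengths.append(len(t)),
)
```

Since the generator is lazy, breaking out of the loop means the remaining draws are never computed.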

@jsalvatier
Member

Is it that you want to make different versions of sample (some that do continuous updating, some that check a criterion) but you want to abstract out the shared part?

How come you can't just add some statements in the for loop?

I'm not opposed to this, I just don't quite get the benefit. It does look pretty.

@twiecki
Member Author

twiecki commented Dec 31, 2013

Certainly that's possible with this. I'm not sure we can come up with a method that works for everyone to include in pymc, but you could imagine sampling until you got 1000 samples with a Geweke score < X.

My specific use case was real-time plotting during sampling.
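To make the Geweke idea concrete, here is a rough sketch of a Geweke-style z-score that compares the mean of an early segment of a chain with the mean of a late segment. It is a simplification and an assumption of this example: the published diagnostic estimates the variances from the spectral density to account for autocorrelation, whereas this version uses plain sample variances. The name `simple_geweke_z` and its parameters are hypothetical.

```python
import random
import statistics


def simple_geweke_z(chain, first=0.1, last=0.5):
    """Z-score for the difference in means of an early and a late segment.

    Simplified: uses plain sample variances, so autocorrelation is ignored.
    """
    n = len(chain)
    early = chain[: int(first * n)]
    late = chain[n - int(last * n):]
    if len(early) < 2 or len(late) < 2:
        raise ValueError("chain is too short for the requested segments")
    std_err = (
        statistics.variance(early) / len(early)
        + statistics.variance(late) / len(late)
    ) ** 0.5
    if std_err == 0:
        raise ValueError("segments have zero variance")
    return (statistics.mean(early) - statistics.mean(late)) / std_err


rng = random.Random(1)
stationary = [rng.gauss(0.0, 1.0) for _ in range(2000)]
drifting = [0.01 * i + rng.gauss(0.0, 1.0) for i in range(2000)]

z_stationary = simple_geweke_z(stationary)  # expected to be near zero
z_drifting = simple_geweke_z(drifting)      # expected to be far from zero
```

A stopping rule along the lines suggested above would call such a function inside the sampling loop and stop once enough draws exist and the absolute score falls below the chosen threshold.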

@jsalvatier
Member

Ooh, okay, more for external users if they want to do their own intersampling logic. I get it now.

@twiecki
Member Author

twiecki commented Dec 31, 2013

Right, exactly.

@jsalvatier
Member

Would definitely be cool to see real-time plotting too at some point.

(let's merge this once it passes and check to make sure it's not slowing things greatly)

@twiecki
Member Author

twiecki commented Dec 31, 2013

@jsalvatier yeah, I have a notebook that's pretty sweet. Will upload a blog post soon.

@twiecki
Member Author

twiecki commented Jan 1, 2014

OK, doesn't seem like there's a major performance regression.

jsalvatier added a commit that referenced this pull request Jan 1, 2014
@jsalvatier jsalvatier merged commit 5a4bdfa into master Jan 1, 2014
@jsalvatier
Member

Great!

@twiecki
Member Author

twiecki commented Jan 2, 2014

Here are some visualizations:
http://twiecki.github.io/blog/2014/01/02/visualizing-mcmc/

@twiecki twiecki deleted the iter_sample branch January 2, 2014 15:19
@fonnesbeck
Member

Very cool.

Regarding using this for convergence, it is probably more robust to use Gelman-Rubin with multiple chains than Geweke. In general, we should be working towards running multiple chains by default, since every modern machine will be multicore. This would facilitate (on-the-fly) R-hat calculation.
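For reference, the classic Gelman-Rubin statistic can be sketched in a few lines. This is an illustrative, pure-Python version of the basic (non-split) R-hat for equal-length chains of a scalar parameter, not code from pymc; the function name `gelman_rubin` is chosen for this example.

```python
import random
import statistics


def gelman_rubin(chains):
    """Basic (non-split) R-hat for m equal-length chains of one scalar."""
    m = len(chains)
    n = len(chains[0])
    if m < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least 2 chains of equal length >= 2")
    chain_means = [statistics.mean(c) for c in chains]
    # W: average within-chain variance.
    within = statistics.mean([statistics.variance(c) for c in chains])
    # B: between-chain variance, scaled by the chain length.
    between = n * statistics.variance(chain_means)
    # Pooled estimate of the posterior variance.
    var_hat = (n - 1) / n * within + between / n
    return (var_hat / within) ** 0.5


rng = random.Random(0)
# Four chains drawn from the same distribution: R-hat should be close to 1.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(500)] for _ in range(4)]
# Chains stuck around different values: R-hat should be well above 1.
stuck = [[rng.gauss(mu, 1.0) for _ in range(500)] for mu in (0.0, 0.0, 5.0, 5.0)]

rhat_mixed = gelman_rubin(mixed)
rhat_stuck = gelman_rubin(stuck)
```

Computing this on the fly would mean re-evaluating it on the partial traces of all chains as they grow, which is why it pairs naturally with multi-chain sampling.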

@twiecki
Member Author

twiecki commented Jan 2, 2014

Agreed. Doing parallel sampling iteratively would require a bit more work but shouldn't be too hard. Will probably need to fix the parallel pickling issues we're having now (do those still exist @jsalvatier?).

@jsalvatier
Member

I don't think we're getting errors when doing things in parallel now, but you have to be careful in what you do.

kyleam referenced this pull request Oct 23, 2014
Before, `iter_sample` returned a single-chain trace object, not a
`MultiTrace` instance like `sample`. This is an issue for functions that
rely on a `MultiTrace`-like interface.

See #632.