Thinning sample and increasing burnin after run? #23

Closed
mrkusrk opened this Issue Apr 21, 2015 · 4 comments

mrkusrk commented Apr 21, 2015

After a recent sampling run, I used pymc.Matplot.plot to display the trace here. It could clearly benefit from a longer burn-in period as well as thinning. It's clear how to set these parameters before I start the run, but I'm not sure how to do it afterwards in a way that produces a pymc.database.pickle.Trace object, so that I can still use the plot function and the .stats() method. The documentation (section 3.5.1) suggests that this is possible, but I cannot find the functions that implement either thinning or revision of the burn-in. Do they exist? If so, how can I access them?

Member

fonnesbeck commented Apr 21, 2015

You can burn and thin after sampling by using NumPy's array indexing, since the traces are NumPy arrays. So, if you wanted to discard an extra burn-in of 1000 iterations for a parameter called theta, you could do the following:

theta.trace()[1000:]

If you wanted to thin the same trace by a factor of 10:

theta.trace()[::10]
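
The two slices can also be combined into one. The sketch below is only an illustration: it uses a synthetic NumPy array as a stand-in for the array returned by theta.trace(), and the names trace, burn, thin and kept are assumptions made for the example, not part of PyMC.

```python
import numpy as np

# Stand-in for theta.trace(): a 1-D array with one entry per iteration.
# (Synthetic data; a real trace would come from the sampler.)
rng = np.random.default_rng(0)
trace = rng.normal(size=5000)

burn, thin = 1000, 10

# Drop the first `burn` iterations, then keep every `thin`-th sample.
kept = trace[burn::thin]

# The single slice is equivalent to applying the two slices in sequence.
assert np.array_equal(kept, trace[burn:][::thin])

print(kept.shape)  # (400,)
```

Basic slicing returns a view of the original array rather than a copy, so this is cheap even for long traces.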

Note that thinning is essentially a waste of time, and I have considered removing it as an argument altogether. All that happens in an un-thinned trace (that is, one with autocorrelation) is that the effective sample size is reduced. When you thin, you do exactly the same thing: you reduce the sample size, but in a direct way. So thinning does not buy you anything, and may actually cost you, relative to just using the trace as-is.
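
A rough way to see this is a small simulation. The sketch below is not PyMC code: it assumes synthetic AR(1) chains as a stand-in for autocorrelated MCMC output, and the chain length, autocorrelation and thinning factor are arbitrary choices made for illustration. It compares how much the estimated mean varies across chains when each chain is used in full versus thinned.

```python
import numpy as np

rng = np.random.default_rng(42)
n_chains, n_iter, rho, thin = 500, 2000, 0.5, 10

# Simulate independent, stationary AR(1) chains with unit variance:
# x[t] = rho * x[t-1] + noise[t]
chains = np.empty((n_chains, n_iter))
chains[:, 0] = rng.normal(size=n_chains)
noise = rng.normal(scale=np.sqrt(1 - rho**2), size=(n_chains, n_iter))
for t in range(1, n_iter):
    chains[:, t] = rho * chains[:, t - 1] + noise[:, t]

# Estimate the mean from each chain, with and without thinning.
full_means = chains.mean(axis=1)
thinned_means = chains[:, ::thin].mean(axis=1)

# Spread of those estimates across chains (Monte Carlo standard error).
se_full = full_means.std()
se_thinned = thinned_means.std()
print(f"full: {se_full:.4f}, thinned: {se_thinned:.4f}")
```

Under these assumptions the thinned estimate should be the noisier of the two, since thinning discards samples that still carry some information.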

mrkusrk commented Apr 21, 2015

I understand that I can slice the trace using array operations, but as far as I can tell the arrays are not valid arguments to pymc.Matplot.plot. Or is there some workaround?

Member

fonnesbeck commented Apr 21, 2015

Ah, I see what you are after. You can pass raw output to plot along with a name string:

Matplot.plot(theta.trace()[1000:], 'theta')
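
Summary statistics can be computed from the sliced array in the same spirit. The sketch below is a plain NumPy approximation, not PyMC's .stats() method: it assumes a synthetic array in place of theta.trace()[1000:], and reports an equal-tailed 95% interval rather than an HPD interval.

```python
import numpy as np

# Stand-in for theta.trace()[1000:]; synthetic draws, for illustration only.
rng = np.random.default_rng(1)
samples = rng.normal(loc=2.0, scale=0.5, size=4000)

summary = {
    "n": samples.size,
    "mean": samples.mean(),
    "standard deviation": samples.std(ddof=1),
    # Equal-tailed interval from the 2.5th and 97.5th percentiles.
    "95% interval": np.percentile(samples, [2.5, 97.5]),
}
for key, value in summary.items():
    print(key, value)
```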

mrkusrk commented Apr 21, 2015

Excellent, thank you!

fonnesbeck closed this Apr 22, 2015
