Skip to content

Autocorrelation time: is taking the mean the right thing to do? #209

@fardal

Description

@fardal

I've been running some tests of some MCMC variations with emcee, and have been using the autocorrelation time to estimate the speedup. I was surprised how many steps it took to get a consistent autocorrelation time from EnsembleSampler.acor. So I tried a different script I had for the autocorrelation time, and got much more stable results. I am pretty sure the chief reason for the difference is that my routine calculates the autocorrelation function for each walker, then takes the mean. In contrast emcee first takes the mean of the walkers, and then computes the autocorrelation of this mean chain. It seems like the latter method is washing out the very fluctuations that the autocorrelation is supposed to measure.

Here is an example that compares the two approaches, using an autoregressive model where the exact autocorrelation time is known. Typical results look like this:

True autocorr length:  20.0084033613
Mean, std, min, max of two methods:
Corr-first approach:  19.7929288976 0.358159093104 18.9992952537 20.8373795818
Mean-first approach:  19.685956288 5.69231566791 11.0983690515 48.3939875026

My own routine for the autocorrelation time has its own drawbacks (not fast, questionable window choice) and I'm not suggesting it as a direct replacement, but it does seem to win easily in terms of precision. I think theoretically the advantage should be a factor sqrt(Nwalkers), though in practice it seems slightly larger.

I don't think the autocorrelation or the number of walkers or steps in this example is too unreasonable, compared to what people often run. The behavior is similar with chains actually generated from EnsembleSampler from the Gaussian posterior function in the intro docs, though in that case we don't know the true value for comparison.

I saw a note somewhere in the emcee repository that the reason for taking the mean is explained in Goodman and Weare, but I didn't follow the reasoning. Whatever the motivation is, I wonder if it justifies the much greater variance in the autocorrelation time. While acor is not the primary thing anyone's interested in, estimating it badly can lead to running the simulation much longer than necessary, or stopping it too soon and thus undersampling the posterior.

autocorr

"""Tests of autocorrelation length.
"""
import numpy as np
import matplotlib.pyplot as plt
from emcee import ensemble


def autoregressive():
    m_walkers = 64

    n_steps = 10000
    R1 = 0.9048    # length = 20

    # n_steps = 10000
    # R1 = 0.8    # length = 9

    # n_steps = 30000
    # R1 = 0.9355  # length = 30

    sig_tot = 1.0
    sig1 = sig_tot * np.sqrt(1. -R1**2)
    truelength = (R1**np.abs(np.arange(-1000,1000))).sum()
    # print 'true length: ', truelength
    chain = np.zeros((m_walkers, n_steps))
    chain[:,0] = sig_tot * np.random.normal(size=m_walkers)
    for i in xrange(1,n_steps):
        chain[:,i] = R1 * chain[:,i-1] + sig1 * np.random.normal(size=m_walkers)
    # print 'mean, expected: ', chain.mean(), 0.
    # print 'standard deviation, expected: ', chain.std(), sig_tot
    chain.shape = (m_walkers, n_steps, 1)  # make last index param dimension (1-d here)
    return chain, truelength


def findauto(params):
    ns = params.shape[0]
    mc = params.shape[1]
    nparams = params.shape[2]
    assert nparams == 1
    ncorr = min([400, (ns-10)])
    for k in xrange(nparams):
        chain = params[:,:,k]
        xsig = chain.std()
        chaincorr = np.zeros((ncorr,mc))
        for j in xrange(mc):
            for i in xrange(ncorr):
                nsample = len(chain[i:,j])
                chaincorr[i,j] = (chain[:nsample,j] * chain[i:,j]).mean()
                chainmean_a = chain[:nsample,j].mean()
                chainmean_b = chain[i:,j].mean()
                chaincorr[i,j] -= chainmean_a * chainmean_b
        corr = chaincorr.mean(axis=1)
        corr /= corr[0]
        index = np.arange(ncorr)
        nclust_cum = 2.*corr.cumsum() - 1.
        end = index > 3.*nclust_cum
        if end.any():
            imax = np.where(end)[0][0]
            nclust = nclust_cum[imax-1]
        else:
            imax = -1
            nclust = nclust_cum[-1]
            print 'WARNING: autocorr not converged.'
            
        return nclust  # hack for my test-1-param usage here

    
def findautofromemcee(chain):
    m_walkers, n_steps, dim = chain.shape
    def foofn():
        pass
    sampler = ensemble.EnsembleSampler(m_walkers, dim, foofn, a=2.0)
    sampler._chain = chain
    return sampler.acor[0]

    
def do():
    ntrial = 100
    # ntrial = 5
    corrfirst = np.zeros(ntrial)
    meanfirst = np.zeros(ntrial)
    for i in xrange(ntrial):
        if i % 10 == 0: print i
        chain, truelength = autoregressive()
        biechain = chain[:,:,0].T[:,:,np.newaxis]
        corrfirst[i] = findauto(biechain)
        meanfirst[i] = findautofromemcee(chain)
    print 'True autocorr length: ', truelength
    print 'Mean, std, min, max of two methods:'
    print 'Corr-first approach: ', corrfirst.mean(), corrfirst.std(), corrfirst.min(), corrfirst.max()
    print 'Mean-first approach: ', meanfirst.mean(), meanfirst.std(), meanfirst.min(), meanfirst.max()

    a,b,c = plt.hist(meanfirst, histtype='step', color='b', bins=20, label='Mean-first approach')
    a,b,c = plt.hist(corrfirst, histtype='step', color='r', bins=20, label='Corr-first approach')
    ylim = np.array(plt.ylim()); plt.plot(np.ones(2)+truelength, ylim, 'g--')
    plt.legend()
    plt.show()

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions