# MCMC3.5: Jackknife Resampling

To estimate the errors of observables accurately, jackknife resampling is recommended.

## Binning

Previously, this program produced large error bars at low temperature:

In [1]:
using ResumableFunctions
using SparseArrays
using LinearAlgebra
const Jx = 1 / 3 # opposite sign to Motome's
const Jy = 1 / 3
const Jz = 1 / 3
# Accept with probability min(1, exp(βF - βFnew)), evaluated in log space.
function Metropolis(βF::Float64, βFnew::Float64)::Bool
    βF - βFnew > log(rand())
end
function openhoneycomb(Lx::Int64, Ly::Int64)::Tuple
    N = 2Lx * Ly
    nnx = zip(1 : 2 : (N - 1), 2 : 2 : N)
    nny = Iterators.flatten(zip((1 + 2i) : 2Lx : (2Lx * (Ly - 1)  + 1 + 2i), 2i : 2Lx : (2Lx * (Ly - 1)  + 2i)) for i in 1 : (Lx - 1))
    nnz = zip(1 : 2 : (N - 1), Iterators.flatten(((2Lx + 2) : 2 : N, 2 : 2 : 2Lx)))
    plaquette = Iterators.flatten(zip((Lx * (i - 1) + 1) : (Lx * (i - 1) + Lx - 1), (Lx * (i - 1) + 2) : (Lx * (i - 1) + Lx)) for i in 1 : Ly)
    N, nnx, nny, nnz, plaquette
end
@resumable function measurementflux(method::Function, lattice::Function, β::Float64, Lx::Int64, Ly::Int64)::Float64
    N, nnx, nny, nnz, plaquette = lattice(Lx, Ly)
    iter = Iterators.flatten((Iterators.product(J, nn) for (J, nn) in [(Jx, nnx), (Jy, nny), (Jz, nnz)]))
    h = spzeros(Complex{Float64}, N, N)
    for (J, nn) in iter
        h[nn[1], nn[2]] = 0.5im * J
        h[nn[2], nn[1]] = -0.5im * J
    end
    NNz = collect(nnz)
    Nz = length(NNz)
    η = ones(Int64, Nz)
    βF = 0.0
    hdense = Array(h)
    plaq = collect(plaquette)
    Np = length(plaq)
    while true
        for i in 1 : Nz
            j = rand(1 : Nz)
            hdense[NNz[j][1], NNz[j][2]] = -hdense[NNz[j][1], NNz[j][2]]
            hdense[NNz[j][2], NNz[j][1]] = -hdense[NNz[j][2], NNz[j][1]]
            ev = eigvals(Hermitian(hdense))
            positiveev = Iterators.drop(ev, N >> 1)
            βFnew = -sum((log(2.0 * cosh(β * ϵ / 2.0)) for ϵ in positiveev))
            if method(βF, βFnew)
                η[j] = -η[j]
                βF = βFnew
            else
                hdense[NNz[j][1], NNz[j][2]] = -hdense[NNz[j][1], NNz[j][2]]
                hdense[NNz[j][2], NNz[j][1]] = -hdense[NNz[j][2], NNz[j][1]]
            end
        end
        @yield sum((η[k] * η[l] for (k, l) in plaq)) / Np
    end
end

measurementflux (generic function with 1 method)

One reason is that the returned value, the flux, is discrete.

In [2]:
mcstep = Iterators.drop(measurementflux(Metropolis, openhoneycomb, 100.0, 4, 4), 10000)
foreach(println, Iterators.take(mcstep, 10))

0.16666666666666666
0.16666666666666666
0.5
0.6666666666666666
0.6666666666666666
0.8333333333333334
0.5
0.3333333333333333
0.3333333333333333
0.0

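The discreteness can be made explicit with a short sketch. It assumes Np = 12 plaquette terms, which is what `openhoneycomb(4, 4)` produces, each term being ±1. The names `Np` and `possible` are used only for this illustration.

```julia
# Illustration only: Np = 12 matches the 4 × 4 lattice used above.
Np = 12
# k counts the plaquette terms equal to -1; the remaining Np - k equal +1,
# so the measured average is (Np - 2k) / Np.
possible = [(Np - 2k) / Np for k in 0:Np]
println(possible)
```

Only 13 values between -1 and 1 are possible, spaced by 1/6, which is the spacing of the output above.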

We have to smooth out these quantized values by binning.

In [3]:
using Statistics
Nsample = 10000
Nbin = 100
Nbinsize = Nsample ÷ Nbin
iter = Iterators.partition(Iterators.take(mcstep, Nsample), Nbinsize)
bin = collect(map(mean, iter))

100-element Array{Float64,1}:
 0.295              
 0.3                
 0.30333333333333334
 0.30499999999999994
 0.33166666666666655
 0.23499999999999996
 0.2433333333333333 
 0.32333333333333336
 0.29166666666666663
 0.29               
 0.26999999999999996
 0.34               
 0.27               
 ⋮                  
 0.24833333333333332
 0.27499999999999997
 0.30666666666666664
 0.31499999999999995
 0.36166666666666664
 0.28               
 0.24666666666666665
 0.3433333333333334 
 0.3066666666666667 
 0.34               
 0.32166666666666666
 0.29833333333333334

Now it works!

In [5]:
m = mean(bin)
s = stdm(bin, m) / sqrt(length(bin))
println("$m ± $s")

0.30245 ± 0.0037873063877276957


Nbinsize has to be determined from the autocorrelation: each bin should be longer than the autocorrelation time. To reduce Nbinsize, and thereby get more bins and a more accurate result from the same number of samples, we need to implement a global update algorithm.
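One common way to pick the bin size is to compute the binned error for several bin sizes and look for a plateau. The sketch below does this on a synthetic AR(1) series rather than on the flux data. The helper names `ar1` and `binnederror` and the correlation coefficient 0.9 are assumptions made for the illustration.

```julia
using Statistics
using Random

# Synthetic correlated series (AR(1) process). `ρ` is an assumed correlation
# coefficient, not a property of the flux data above.
function ar1(rng::AbstractRNG, n::Int, ρ::Float64)
    x = zeros(n)
    for t in 2:n
        x[t] = ρ * x[t - 1] + randn(rng)
    end
    x
end

# Standard error of the mean estimated from bins of length `binsize`.
function binnederror(x::AbstractVector, binsize::Int)
    bins = map(mean, Iterators.partition(x, binsize))
    std(bins) / sqrt(length(bins))
end

rng = MersenneTwister(1234)
x = ar1(rng, 2^16, 0.9)
for binsize in (1, 4, 16, 64, 256)
    println("binsize = $binsize: error ≈ $(binnederror(x, binsize))")
end
```

For correlated data the estimated error grows with the bin size and then levels off once the bins are longer than the autocorrelation time. The smallest bin size on the plateau is a reasonable choice.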

## Delete-1 jackknife resampling

A simple implementation of delete-1 jackknife resampling is available at https://github.com/ararslan/Jackknife.jl. However, its functionality is limited, so I define my own functions for jackknife resampling here.

In [6]:
"""
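This function mimics `mapreduce`, but `op` can be any function that
reduces a vector to a single value. For each index `i`, `f` is mapped over
`v` with element `i` removed, and `op` is applied to the result.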
"""
function leaveoneout(f::Function, op::Function, v::AbstractVector)
    ind = eachindex(v)
    map(i -> op(map(f, view(v, filter(!isequal(i), ind)))), ind)
end
meanJ(f::Function, op::Function, v::AbstractVector) = mean(leaveoneout(f, op, v))
stdmJ(f::Function, op::Function, v::AbstractVector, m) = stdm(leaveoneout(f, op, v), m, corrected = false) * sqrt(length(v) - 1)
stdJ(f::Function, op::Function, v::AbstractVector) = stdmJ(f, op, v, meanJ(f, op, v))

stdJ (generic function with 1 method)
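A tiny worked example may help show what these functions compute. The definitions are repeated so the snippet runs on its own, and the toy vector `v` is chosen only for illustration. With `corrected = false`, the factor `sqrt(length(v) - 1)` turns the spread of the leave-one-out values into the usual jackknife error, sqrt((n - 1) / n * Σ (θᵢ - θ̄)²).

```julia
using Statistics

# Same definitions as in the cell above, repeated so the snippet is self-contained.
function leaveoneout(f::Function, op::Function, v::AbstractVector)
    ind = eachindex(v)
    map(i -> op(map(f, view(v, filter(!isequal(i), ind)))), ind)
end
meanJ(f::Function, op::Function, v::AbstractVector) = mean(leaveoneout(f, op, v))
stdmJ(f::Function, op::Function, v::AbstractVector, m) = stdm(leaveoneout(f, op, v), m, corrected = false) * sqrt(length(v) - 1)

v = [1.0, 2.0, 3.0, 4.0]               # toy data, for illustration only
loo = leaveoneout(identity, mean, v)   # means with one element removed
println(loo)
println(meanJ(identity, mean, v))                # jackknife mean
println(stdmJ(identity, mean, v, mean(loo)))     # jackknife error
println(std(v) / sqrt(length(v)))                # standard error of the mean
```

The leave-one-out means are 3, 8/3, 7/3 and 2. For `op = mean` the jackknife error coincides with `std(v) / sqrt(length(v))`.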

These functions are based on Statistics.jl and Jackknife.jl, so please see their documentation for how they work. I again use measurementEf for the demonstration.

In [7]:
@resumable function measurementEf(method::Function, lattice::Function, β::Float64, Lx::Int64, Ly::Int64)::Vector{Float64}
    N, nnx, nny, nnz, plaquette = lattice(Lx, Ly)
    iter = Iterators.flatten((Iterators.product(J, nn) for (J, nn) in [(Jx, nnx), (Jy, nny), (Jz, nnz)]))
    h = spzeros(Complex{Float64}, N, N)
    for (J, nn) in iter
        h[nn[1], nn[2]] = 0.5im * J
        h[nn[2], nn[1]] = -0.5im * J
    end
    NNz = collect(nnz)
    Nz = length(NNz)
    η = ones(Int64, Nz)
    βF = 0.0
    β₂ = β * 0.5
    hdense = Array(h)
    plaq = collect(plaquette)
    Np = length(plaq)
    ev = zeros(Float64, N)
    while true
        for i in 1 : Nz
            j = rand(1 : Nz)
            hdense[NNz[j][1], NNz[j][2]] = -hdense[NNz[j][1], NNz[j][2]]
            hdense[NNz[j][2], NNz[j][1]] = -hdense[NNz[j][2], NNz[j][1]]
            evnew = eigvals(Hermitian(hdense))
            βFnew = -sum(@. log(exp(β₂ * evnew[(N >> 1 + 1) : end]) + exp(-β₂ * evnew[(N >> 1 + 1) : end])))
            if method(βF, βFnew)
                η[j] = -η[j]
                βF = βFnew
                ev .= evnew
            else
                hdense[NNz[j][1], NNz[j][2]] = -hdense[NNz[j][1], NNz[j][2]]
                hdense[NNz[j][2], NNz[j][1]] = -hdense[NNz[j][2], NNz[j][1]]
            end
        end
        Ef = -sum(@. ev[(N >> 1 + 1) : end] * tanh(β₂ * ev[(N >> 1 + 1) : end] )) * 0.5
        ∂Ef∂β = -sum(@. (ev[(N >> 1 + 1) : end] ^ 2) * (sech(β₂ * ev[(N >> 1 + 1) : end]) ^ 2)) * 0.25
        @yield [Ef, ∂Ef∂β]
    end
end

measurementEf (generic function with 1 method)

It is OK to assume a bin size of 1 for this problem, so I omit Iterators.partition here.

In [8]:
mcstep2 = Iterators.drop(measurementEf(Metropolis, openhoneycomb, 10.0, 4, 4), 10000)
iter2 = Iterators.take(mcstep2, Nsample)
data = collect(iter2)

10000-element Array{Array{Float64,1},1}:
 [-1.6762, -0.0454922] 
 [-1.67994, -0.0460358]
 [-1.69272, -0.0482142]
 [-1.68707, -0.0470462]
 [-1.68707, -0.0470462]
 [-1.68042, -0.0454945]
 [-1.6847, -0.0468573] 
 [-1.68091, -0.0457212]
 [-1.6723, -0.0448412] 
 [-1.69037, -0.0474662]
 [-1.6793, -0.045817]  
 [-1.66923, -0.0439858]
 [-1.68973, -0.0471662]
 ⋮                     
 [-1.67659, -0.0456538]
 [-1.68438, -0.0468955]
 [-1.69643, -0.0487761]
 [-1.68723, -0.0471268]
 [-1.68658, -0.0466584]
 [-1.69136, -0.0477004]
 [-1.69295, -0.0483297]
 [-1.67074, -0.0444242]
 [-1.68534, -0.0471164]
 [-1.67935, -0.0458393]
 [-1.67246, -0.0441295]
 [-1.67686, -0.0450137]

By setting f = identity (i.e. leaving the data unchanged) and op = mean, meanJ reproduces mean, and stdmJ reproduces the standard error of the mean, std / sqrt(n).

In [9]:
m2 = meanJ(identity, mean, data)
s2 = stdmJ(identity, mean, data, m2)
println("Ef = $(m2[1]) ± $(s2[1]), ∂Ef∂β = $(m2[2]) ± $(s2[2])")

Ef = -1.6833559836793284 ± 7.569447076947603e-5, ∂Ef∂β = -0.04645930869265934 ± 1.3418194240851154e-5


This agrees with the standard estimation method for the error bars.

In [10]:
std(data) / sqrt(length(data))

2-element Array{Float64,1}:
 7.569447076955521e-5 
 1.3418194240831252e-5

For such mean values (i.e. op = mean), jackknife resampling is clearly overkill. However, it is very effective for estimating the error of quantities like the specific heat, which are nonlinear functions of sample averages.
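As a rough sketch of the nonlinear case, the snippet below applies the same functions to a fluctuation-type estimator, β²(⟨E²⟩ - ⟨E⟩²), on synthetic Gaussian "energies". This is not the specific-heat formula of the model above. The estimator `fluct`, the value of `β`, and the synthetic data `E` are all assumptions made for illustration.

```julia
using Statistics
using Random

# Definitions repeated from the jackknife cell so the snippet runs on its own.
function leaveoneout(f::Function, op::Function, v::AbstractVector)
    ind = eachindex(v)
    map(i -> op(map(f, view(v, filter(!isequal(i), ind)))), ind)
end
meanJ(f::Function, op::Function, v::AbstractVector) = mean(leaveoneout(f, op, v))
stdmJ(f::Function, op::Function, v::AbstractVector, m) = stdm(leaveoneout(f, op, v), m, corrected = false) * sqrt(length(v) - 1)

rng = MersenneTwister(42)
β = 10.0                                # assumed inverse temperature
E = -1.68 .+ 0.01 .* randn(rng, 2000)   # synthetic data, not simulation output

# Fluctuation-type estimator: nonlinear in the sample averages, so the naive
# std / sqrt(n) formula does not apply to it directly.
fluct(x) = β^2 * (mean(abs2, x) - mean(x)^2)

c = meanJ(identity, fluct, E)
δc = stdmJ(identity, fluct, E, c)
println("C ≈ $c ± $δc")
```

Here `op` is the whole nonlinear estimator, so each leave-one-out value is a complete re-estimate of the quantity, and the error bar follows from their spread.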

The jackknife takes care of the bias of nonlinear estimators, and with the bin size assumed above there is no need to treat the autocorrelation separately in this case. However, autocorrelation prevents the error from decreasing rapidly, so it is still important to estimate the autocorrelation length.
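A minimal sketch of one way to estimate the autocorrelation length is to sum the normalized autocorrelation function into an integrated autocorrelation time. It is shown here on a synthetic AR(1) series. The helper names `autocor_at` and `tau_int`, the coefficient `ρ₀`, and the cutoff rule (stop at the first non-positive value) are assumptions for this illustration, not part of the program above.

```julia
using Statistics
using Random

# Normalized autocorrelation at a given lag (biased estimator).
function autocor_at(x::AbstractVector, lag::Int)
    m = mean(x)
    n = length(x)
    c0 = sum(abs2, x .- m) / n
    ct = sum((x[t] - m) * (x[t + lag] - m) for t in 1:(n - lag)) / n
    ct / c0
end

# Integrated autocorrelation time, truncated at the first non-positive value.
# This cutoff is one simple convention among several.
function tau_int(x::AbstractVector, maxlag::Int)
    τ = 0.5
    for lag in 1:maxlag
        ρ = autocor_at(x, lag)
        ρ <= 0 && break
        τ += ρ
    end
    τ
end

rng = MersenneTwister(2024)
ρ₀ = 0.9                    # assumed correlation coefficient of the test series
x = zeros(2^15)
for t in 2:length(x)
    x[t] = ρ₀ * x[t - 1] + randn(rng)
end
println("τ_int ≈ $(tau_int(x, 200))")
```

For this test series the exact value is 0.5 + ρ₀ / (1 - ρ₀) = 9.5. With this definition, the error of a mean over correlated samples is larger than the naive estimate by a factor of about sqrt(2 τ_int).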

## Autocorrelation

The simplest way to consider autocorrelation is
