# Capse.jl reloaded: using Chebyshev polynomials

In this notebook we use the trained Capse.jl emulators, which take advantage of the Chebyshev polynomial decomposition. The core idea is that, rather than using a neural network that outputs the $C_\ell$'s directly

$$
\theta\rightarrow \mathrm{NN}(\theta)\rightarrow C_\ell(\theta)
$$

we decompose, at a fixed cosmology, the $C_\ell$'s on the Chebyshev basis

$$
C_\ell(\theta)\approx\sum_{i=0}^N a_i(\theta)T_i(\ell)
$$

where $T_i$ is the Chebyshev polynomial of degree $i$, evaluated on the multipole grid.
In this case, the cosmological dependence is encoded in the Chebyshev expansion coefficients $a_i(\theta)$, which are the emulation target.

$$
\theta\rightarrow\mathrm{NN}(\theta)\rightarrow a_i(\theta)\rightarrow C_\ell(\theta)
$$
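
As a minimal illustration of the decomposition above (a sketch, not Capse.jl's internals), the snippet below builds the matrix of Chebyshev polynomials on a multipole grid with the three-term recurrence, fits the coefficients $a_i$ of a toy smooth curve by least squares, and reconstructs the curve with a single matrix-vector product. The helper `chebyshev_grid`, the toy curve and the least-squares fit are assumptions made for this example only.

```julia
using LinearAlgebra

# Hypothetical helper: evaluate T_0, ..., T_{ncoef-1} on the grid `ℓ`,
# after mapping the grid linearly onto [-1, 1].
function chebyshev_grid(ℓ::AbstractVector, ncoef::Integer)
    lo, hi = extrema(ℓ)
    x = @. 2 * (ℓ - lo) / (hi - lo) - 1
    P = ones(length(ℓ), ncoef)          # first column is T_0 = 1
    ncoef > 1 && (P[:, 2] .= x)         # T_1 = x
    for k in 3:ncoef                    # T_k = 2x T_{k-1} - T_{k-2}
        P[:, k] .= 2 .* x .* view(P, :, k - 1) .- view(P, :, k - 2)
    end
    return P
end

ℓ = collect(2.0:2508.0)
ncoef = 48                              # degree 47, as in this notebook
P = chebyshev_grid(ℓ, ncoef)

# Toy smooth "spectrum" (made-up numbers, not a real C_ℓ)
toy = @. exp(-(ℓ / 1000)^2) * (1 + 0.3 * cos(ℓ / 150))

a = P \ toy                             # least-squares Chebyshev coefficients
toy_rec = P * a                         # reconstruction: one matrix-vector product
max_err = maximum(abs, toy_rec .- toy)
```

Once `P` is stored, only the 48 numbers in `a` change with the input, which is the property the emulator exploits.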

In the first part of the notebook we show how the Chebyshev expansion can be used to approximate the CMB $C_\ell$'s.

In the second part we show how to compute some Planck chains using the emulator.

Let us start by activating the static Julia environment and importing the relevant packages.

In [1]:
using Pkg
Pkg.activate(".")
Pkg.instantiate()
Pkg.resolve()

[32m[1m  Activating[22m[39m project at `~/Desktop/papers/capse_paper/chebyshev_emulator`
[32m[1m  No Changes[22m[39m to `~/Desktop/papers/capse_paper/chebyshev_emulator/Project.toml`
[32m[1m  No Changes[22m[39m to `~/Desktop/papers/capse_paper/chebyshev_emulator/Manifest.toml`


In [2]:
using FastChebInterp
using BenchmarkTools
using LoopVectorization
using SimpleChains
using Turing
using Optim
using LinearAlgebra
using StatsPlots
using Pathfinder
using Capse
using NPZ
import MCMCChains: compute_duration
using MCMCDiagnosticTools
using MicroCanonicalHMC
using Transducers
using DataFrames
using PlanckLite
include("utils.jl");

Since we focus on the Planck analysis, here we only use the multipoles $\ell\in[2,2508]$.
In this example we use a Chebyshev expansion of degree $47$, i.e. `grad_cheb = 48` coefficients.

In [3]:
min_idx = 3
max_idx = 2509

grad_cheb = 48
weights_folder = "../weights/weights_cheb_cosmopowerspace_10000/"
l = Float64.(npzread(weights_folder*"l.npy")[min_idx:max_idx]);

# Checking the emulator: using the validation dataset

Here we show how to use the emulators and check the emulation error on the validation dataset.
We first define the multilayer perceptron architecture. It has to match the architecture the weights were trained with, so please do not modify this cell!

In [4]:
mlpd = SimpleChain(
  static(6),
  TurboDense(tanh, 64),
  TurboDense(tanh, 64),
  TurboDense(tanh, 64),
  TurboDense(tanh, 64),
  TurboDense(tanh, 64),
  TurboDense(identity, grad_cheb)
);

Let us load the emulators.

In [5]:
weights_TT = npzread(weights_folder*"weights_TT_lcdm.npy")
trained_emu_TT = Capse.SimpleChainsEmulator(Architecture= mlpd, Weights = weights_TT)
CℓTT_emu = Capse.CℓEmulator(TrainedEmulator = trained_emu_TT, ℓgrid = l,
                             InMinMax = npzread(weights_folder*"inMinMax_lcdm.npy"),
                             OutMinMax = npzread(weights_folder*"outMinMaxCℓTT_lcdm.npy"),
                             PolyGrid= zeros(50,50), ChebDegree = grad_cheb);

In [6]:
weights_EE = npzread(weights_folder*"weights_EE_lcdm.npy")
trained_emu_EE = Capse.SimpleChainsEmulator(Architecture= mlpd, Weights = weights_EE)
CℓEE_emu = Capse.CℓEmulator(TrainedEmulator = trained_emu_EE, ℓgrid = l,
                             InMinMax = npzread(weights_folder*"inMinMax_lcdm.npy"),
                             OutMinMax = npzread(weights_folder*"outMinMaxCℓEE_lcdm.npy"),
                             PolyGrid= zeros(50,50), ChebDegree = grad_cheb);

In [7]:
weights_TE = npzread(weights_folder*"weights_TE_lcdm.npy")
trained_emu_TE = Capse.SimpleChainsEmulator(Architecture= mlpd, Weights = weights_TE)
CℓTE_emu = Capse.CℓEmulator(TrainedEmulator = trained_emu_TE, ℓgrid = l,
                             InMinMax = npzread(weights_folder*"inMinMax_lcdm.npy"),
                             OutMinMax = npzread(weights_folder*"outMinMaxCℓTE_lcdm.npy"),
                             PolyGrid= zeros(50,50), ChebDegree = grad_cheb);

In [8]:
weights_PP = npzread(weights_folder*"weights_PP_lcdm.npy")
trained_emu_PP = Capse.SimpleChainsEmulator(Architecture= mlpd, Weights = weights_PP)
CℓPP_emu = Capse.CℓEmulator(TrainedEmulator = trained_emu_PP, ℓgrid = l,
                             InMinMax = npzread(weights_folder*"inMinMax_lcdm.npy"),
                             OutMinMax = npzread(weights_folder*"outMinMaxCℓPP_lcdm.npy"),
                             PolyGrid= zeros(50,50), ChebDegree = grad_cheb);

The first thing to do is to evaluate the PolyGrid: the Chebyshev polynomials on the $\ell$ grid used in the training.
After the evaluation, the result is stored and doesn't need to be computed again.

In [9]:
Capse.eval_polygrid!(CℓEE_emu)
Capse.eval_polygrid!(CℓTT_emu)
Capse.eval_polygrid!(CℓTE_emu)
Capse.eval_polygrid!(CℓPP_emu)

Let us now benchmark the computation of the $C_\ell$'s.

In [10]:
input_test = rand(6)
@benchmark Capse.get_Cℓ($input_test, $CℓTE_emu)

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  11.272 μs …  2.775 ms  ┊ GC (min … max): 0.00% … 97.86%
 Time  (median):     15.810 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   17.086 μs ± 38.943 μs  ┊ GC (mean ± σ):  3.11% ±  1.37%

And also the emulation of the Chebyshev coefficients alone.

In [11]:
@benchmark Capse.get_chebcoefs($input_test, $CℓTE_emu)

BenchmarkTools.Trial: 10000 samples with 9 evaluations.
 Range (min … max):  3.030 μs … 810.948 μs  ┊ GC (min … max): 0.00% … 98.88%
 Time  (median):     3.742 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   3.849 μs ±   8.090 μs  ┊ GC (mean ± σ):  2.08% ±  0.99%

Here we want to emphasize an important point:

- emulating the Chebyshev coefficients requires around $4\,\mu s$
- emulating the Chebyshev coefficients *and* computing the $C_\ell$'s takes around $16\,\mu s$

This observation suggests an optimization: if we can write the likelihood in such a way that we do NOT compute the $C_\ell$'s, we can improve the overall computational performance.
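
One way to make this concrete, assuming that both the binning and the whitening of the likelihood are linear operations (a sketch of the reasoning; $P$, $B$ and $G$ are notation introduced here): let $P_{\ell i}=T_i(\ell)$ be the matrix of Chebyshev polynomials on the multipole grid, $B$ the binning matrix and $\Gamma$ a square root of the data covariance. The whitened, binned theory vector is then

$$
\Gamma^{-1}B\,C(\theta)=\Gamma^{-1}B\,P\,a(\theta)\equiv G\,a(\theta),\qquad G=\Gamma^{-1}BP
$$

The matrix $G$ does not depend on $\theta$, so it can be computed once. Each likelihood evaluation then costs one emulator call plus a small matrix-vector product per spectrum (48 coefficients each in this notebook), instead of a full $C_\ell$ reconstruction followed by binning.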

# PlanckLite & Chebyshev

In [12]:
lsTT = 2:2508
lsTE = 2:1996
facTT=lsTT.*(lsTT.+1)./(2*π)
facTE=lsTE.*(lsTE.+1)./(2*π)

function call_emu_plancklite(θ, Emu_TT, Emu_TE, Emu_EE, facTT, facTE)
    return PlanckLite.bin_Cℓ(Capse.get_Cℓ(θ, Emu_TT)[1:2507]./facTT,
                            Capse.get_Cℓ(θ, Emu_TE)[1:1995]./facTE,
                            Capse.get_Cℓ(θ, Emu_EE)[1:1995]./facTE)
end

call_emu_plancklite (generic function with 1 method)

In [13]:
Γ = sqrt(PlanckLite.cov)
iΓ = inv(Γ)
D = iΓ * PlanckLite.data;
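
The cell above whitens the data vector. Assuming `PlanckLite.cov` is the symmetric positive-definite covariance $\Sigma$ of the binned data, taking $\Gamma=\Sigma^{1/2}$ makes the Gaussian $\chi^2$ equal to the squared norm of the whitened residual, so a unit-covariance likelihood can be used on whitened vectors. The following self-contained check uses a made-up $5\times 5$ covariance; all numbers are illustrative and none come from PlanckLite.

```julia
using LinearAlgebra

n = 5
# Made-up symmetric positive-definite covariance (illustrative only)
Σ = Symmetric([i == j ? 2.0 : 0.5^abs(i - j) for i in 1:n, j in 1:n])
d = collect(1.0:n)                    # toy data vector
t = d .+ 0.1 .* sin.(1:n)             # toy theory vector

Γ  = sqrt(Σ)                          # matrix square root, Γ * Γ ≈ Σ
iΓ = inv(Γ)

chi2_direct   = dot(d - t, Σ \ (d - t))        # (d - t)' Σ⁻¹ (d - t)
chi2_whitened = sum(abs2, iΓ * d - iΓ * t)     # ‖Γ⁻¹ d - Γ⁻¹ t‖²
```

The two numbers agree up to round-off, which is the identity the whitened likelihood relies on.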

In [14]:
theory_plancklite(θ) = call_emu_plancklite(θ, CℓTT_emu, CℓTE_emu, CℓEE_emu, facTT, facTE)

theory_plancklite (generic function with 1 method)

In [15]:
@benchmark theory_plancklite(ones(6))

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  241.318 μs …   6.483 ms  ┊ GC (min … max): 0.00% … 89.79%
 Time  (median):     287.275 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   320.320 μs ± 374.620 μs  ┊ GC (mean ± σ):  7.49% ±  6.09%

In [16]:
# Bin each Chebyshev polynomial (one column of PolyGrid) once:
# the result has 613 bins × 48 coefficients.
function bin_grid(Emu_TT, Emu_TE, Emu_EE, facTT, facTE)
    result = zeros(613,48)
    for i in 1:48
        result[:, i] = PlanckLite.bin_Cℓ(Emu_TT.PolyGrid[1:2507,i]./facTT,
                                         Emu_TE.PolyGrid[1:1995,i]./facTE,
                                         Emu_EE.PolyGrid[1:1995,i]./facTE)
    end
    return result
end

binned_grid_std = bin_grid(CℓTT_emu, CℓTE_emu, CℓEE_emu, facTT, facTE);

In [17]:
function fast_computation(θ, CℓTT_emu, CℓTE_emu, CℓEE_emu, binned_grid)
    coeff_TT = Capse.get_chebcoefs(θ, CℓTT_emu)
    coeff_TE = Capse.get_chebcoefs(θ, CℓTE_emu)
    coeff_EE = Capse.get_chebcoefs(θ, CℓEE_emu)
    TT = binned_grid[1:215,:]   * coeff_TT
    TE = binned_grid[216:414,:] * coeff_TE
    EE = binned_grid[415:613,:] * coeff_EE

    return vcat(TT, TE, EE)
end

# Whitened version of `fast_computation`: each binned grid is expected to be
# zero-padded to the full data-vector length and pre-multiplied by iΓ, so the
# three contributions can be summed directly.
function super_fast_computation(θ, CℓTT_emu, CℓTE_emu, CℓEE_emu, binned_grid_TT, binned_grid_TE, binned_grid_EE)
    return binned_grid_TT * Capse.get_chebcoefs(θ, CℓTT_emu) +
           binned_grid_TE * Capse.get_chebcoefs(θ, CℓTE_emu) +
           binned_grid_EE * Capse.get_chebcoefs(θ, CℓEE_emu)
end

binned_grid_TT_std = binned_grid_std[1:215,:]
binned_grid_TE_std = binned_grid_std[216:414,:]

binned_grid_TT = zeros(613,48)
binned_grid_TE = zeros(613,48)
binned_grid_EE = zeros(613,48)

binned_grid_TT[1:215,:]   = binned_grid_std[1:215,:]
binned_grid_TE[216:414,:] = binned_grid_std[216:414,:]
binned_grid_EE[415:613,:] = binned_grid_std[415:613,:]

binned_grid_TT = iΓ * binned_grid_TT
binned_grid_TE = iΓ * binned_grid_TE
binned_grid_EE = iΓ * binned_grid_EE


theory_fast_std(θ) = fast_computation(θ, CℓTT_emu, CℓTE_emu, CℓEE_emu, binned_grid_std)
theory_fast(θ) = super_fast_computation(θ, CℓTT_emu, CℓTE_emu, CℓEE_emu, binned_grid_TT, binned_grid_TE, binned_grid_EE)

theory_fast (generic function with 1 method)

In [18]:
iΓ * theory_fast_std(ones(6)) .- theory_fast(ones(6))

613-element Vector{Float64}:
 -1.6653345369377348e-15
  6.661338147750939e-16
 -6.661338147750939e-16
  1.9984014443252818e-15
 -1.5543122344752192e-15
 -6.661338147750939e-16
  2.4424906541753444e-15
 -6.661338147750939e-16
 -6.661338147750939e-16
  0.0
 -1.5543122344752192e-15
 -2.220446049250313e-15
  2.220446049250313e-16
  ⋮
 -3.7730235602495554e-17
  1.1102230246251565e-16
 -1.8735013540549517e-16
  8.326672684688674e-17
 -1.1102230246251565e-16
 -9.71445146547012e-17
  7.28583859910259e-17
 -2.7755575615628914e-17
 -1.1796119636642288e-16
 -8.326672684688674e-17
  6.938893903907228e-18
  1.1102230246251565e-16

In [19]:
@benchmark theory_fast(ones(6))

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  23.858 μs …  5.837 ms  ┊ GC (min … max): 0.00% … 84.04%
 Time  (median):     32.002 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   32.891 μs ± 81.632 μs  ┊ GC (mean ± σ):  2.91% ±  1.18%

In [20]:
@model function CMB_planck_ultra_fast(D)
    #prior on model parameters
    ln10As ~ Uniform(0.25, 0.35)
    ns     ~ Uniform(0.88, 1.06)
    h0     ~ Uniform(0.60, 0.80)
    ωb     ~ Uniform(0.1985, 0.25)
    ωc     ~ Uniform(0.08, 0.20)
    τ      ~ Normal(0.0506, 0.0086)
    yₚ     ~ Normal(1.0, 0.0025)

    θ = [10*ln10As, ns, 100*h0, ωb/10, ωc, τ]
    #compute theoretical prediction
    pred = theory_fast(θ) ./(yₚ^2)
    #compute likelihood
    D ~ MvNormal(pred, I)

    return nothing
end

CMB_model_planck_ultra_fast = CMB_planck_ultra_fast(D)

DynamicPPL.Model{typeof(CMB_planck_ultra_fast), (:D,), (), (), Tuple{Vector{Float64}}, Tuple{}, DynamicPPL.DefaultContext}(CMB_planck_ultra_fast, (D = [9.643012835639828, 10.803277764202981, 12.837021443612652, 12.159764105419248, 11.744821016084751, 13.713281787419634, 13.957299336082897, 15.005683408044218, 15.585981279991291, 13.764821092399968  …  -0.4208097528485547, 0.9948910443777017, 3.0065727299752423, 0.38248077476446013, -1.567094536865492, 1.6989024589667159, 1.0732606672031262, -1.6425448532692997, 2.311295810549454, 0.6863715328875395],), NamedTuple(), DynamicPPL.DefaultContext())

In [27]:
bestfit_Planck = optimize(CMB_model_planck_ultra_fast, MAP(), Optim.Options(iterations=100000, allow_f_increases=true))

ModeResult with maximized lp of -836.24
[0.305187288403472, 0.9634362838080824, 0.6713595762487374, 0.22341839776457806, 0.12063380739590773, 0.05713245520556747, 1.0005309971887268]

In [28]:
@benchmark optimize(CMB_model_planck_ultra_fast, MAP(), Optim.Options(iterations=100000, allow_f_increases=true))

BenchmarkTools.Trial: 81 samples with 1 evaluation.
 Range (min … max):  34.039 ms … 153.964 ms  ┊ GC (min … max): 0.00% … 5.17%
 Time  (median):     53.444 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   63.049 ms ±  25.805 ms  ┊ GC (mean ± σ):  5.18% ± 6.07%

In [21]:
result_multi = multipathfinder(CMB_model_planck_ultra_fast, 5000; nruns=6, executor = Transducers.PreferParallel())
@time result_multi = multipathfinder(CMB_model_planck_ultra_fast, 5000; nruns=6, executor = Transducers.PreferParallel())

[33m[1m└ [22m[39m[90m@ Pathfinder ~/.julia/packages/Pathfinder/1B4yO/src/singlepath.jl:212[39m
[33m[1m└ [22m[39m[90m@ Pathfinder ~/.julia/packages/Pathfinder/1B4yO/src/singlepath.jl:212[39m
[33m[1m└ [22m[39m[90m@ Pathfinder ~/.julia/packages/Pathfinder/1B4yO/src/singlepath.jl:212[39m
[33m[1m└ [22m[39m[90m@ Pathfinder ~/.julia/packages/Pathfinder/1B4yO/src/singlepath.jl:212[39m
[33m[1m└ [22m[39m[90m@ Pathfinder ~/.julia/packages/Pathfinder/1B4yO/src/singlepath.jl:212[39m
[33m[1m└ [22m[39m[90m@ Pathfinder ~/.julia/packages/Pathfinder/1B4yO/src/singlepath.jl:212[39m
[33m[1m└ [22m[39m[90m@ Pathfinder ~/.julia/packages/Pathfinder/1B4yO/src/singlepath.jl:212[39m
[33m[1m└ [22m[39m[90m@ Pathfinder ~/.julia/packages/Pathfinder/1B4yO/src/singlepath.jl:212[39m
[33m[1m└ [22m[39m[90m@ Pathfinder ~/.julia/packages/Pathfinder/1B4yO/src/singlepath.jl:212[39m
[33m[1m└ [22m[39m[90m@ Pathfinder ~/.julia/packages/Pathfinder/1B4yO/src/singlepath.jl

  2.251304 seconds (2.75 M allocations: 1.873 GiB, 6.08% gc time)


Multi-path Pathfinder result
  runs: 6
  draws: 5000
  Pareto shape diagnostic: 0.68 (ok)

In [22]:
result_multi.draws_transformed

Chains MCMC chain (5000×7×1 Array{Float64, 3}):

Iterations        = 1:1:5000
Number of chains  = 1
Samples per chain = 5000
parameters        = ln10As, ns, h0, ωb, ωc, τ, yₚ

Summary Statistics
  parameters      mean       std   naive_se      mcse         ess      rhat
      Symbol    Float64   Float64    Float64   Float64     Float64   Float64

      ln10As    0.3051    0.0017     0.0000    0.0000   5054.3383    0.9999
          ns    0.9633    0.0043     0.0001    0.0001   4874.5400    1.0000
          h0    0.6714    0.0060     0.0001    0.0001   4851.8876    0.9999
          ωb    0.2235    0.0015     0.0000    0.0000   4860.5026    1.0001
          ωc    0.1206    0.0014     0.0000    0.0000   4901.2249    1.0000
           τ    0.0566    0.0084     0.0001    0.0001   5152.8918    0.9998
          yₚ    1.0005    0.0026     0.0000    0.0000   5097.8528    1.0013

In [23]:
nsteps = 5000
nadapts = 500
nchains = 6

init_params = collect.(eachrow(result_multi.draws_transformed.value[1:nchains, :, 1]));

In [24]:
chains_planck_std_NUTS = sample(CMB_model_planck_ultra_fast, NUTS(nadapts, 0.65), MCMCThreads(), nsteps, nchains; init_params = init_params)

[36m[1m┌ [22m[39m[36m[1mInfo: [22m[39mFound initial step size
[36m[1m└ [22m[39m  ϵ = 0.000732421875
[36m[1m┌ [22m[39m[36m[1mInfo: [22m[39mFound initial step size
[36m[1m└ [22m[39m  ϵ = 0.00078125
[36m[1m┌ [22m[39m[36m[1mInfo: [22m[39mFound initial step size
[36m[1m└ [22m[39m  ϵ = 0.000390625
[36m[1m┌ [22m[39m[36m[1mInfo: [22m[39mFound initial step size
[36m[1m└ [22m[39m  ϵ = 0.00078125
[36m[1m┌ [22m[39m[36m[1mInfo: [22m[39mFound initial step size
[36m[1m└ [22m[39m  ϵ = 0.000390625
[36m[1m┌ [22m[39m[36m[1mInfo: [22m[39mFound initial step size
[36m[1m└ [22m[39m  ϵ = 0.000390625
[32mSampling (6 threads): 100%|█████████████████████████████| Time: 0:01:56[39m


Chains MCMC chain (5000×19×6 Array{Float64, 3}):

Iterations        = 501:1:5500
Number of chains  = 6
Samples per chain = 5000
Wall duration     = 233.86 seconds
Compute duration  = 1031.24 seconds
parameters        = ln10As, ns, h0, ωb, ωc, τ, yₚ
internals         = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size

Summary Statistics
  parameters      mean       std   naive_se      mcse          ess      rhat   ⋯
      Symbol    Float64   Float64    Float64   Float64      Float64   Float64   ⋯

      ln10As    0.3052    0.0017     0.0000    0.0000   11536.2204    1.0004   ⋯
          ns    0.9635    0.0044     0.0000    0.0000    9843.4582    1.0003   ⋯
          h0    0.6714    0.0060     0.0000    0.0001    7818.1383

In [25]:
CPU_s_Planck_NUTS = compute_duration(chains_planck_std_NUTS)
Planck_NUTS_ESS = mean(MCMCDiagnosticTools.ess_rhat(chains_planck_std_NUTS)[[:ln10As, :ns, :h0, :ωb,:ωc, :τ, :yₚ],:ess])
Planck_NUTS_ESS_s = Planck_NUTS_ESS/CPU_s_Planck_NUTS

13.625230630885428

## MCHMC Stuff

In [29]:
d = 7
target = TuringTarget(CMB_model_planck_ultra_fast)
nadapts = 20_000
nsteps = 200000

spl = MCHMC(nadapts, 0.001; init_eps=0.05, L=sqrt(d),# sigma=ones(d),  #try higher init_eps
            adaptive=true)
start_mchmc = time()
@time planck_mchmc = Sample(spl, target, nsteps;
                    progress=true,
                    dialog=true, file_name="chain_1",
                    initial_x=bestfit_Planck.values.array)
end_mchmc = time()
end_mchmc - start_mchmc

[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mTuning sigma ⏳
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mTuning eps ⏳
[32mMCHMC (tuning):  19%|██████▌                            |  ETA: 0:00:08[39m

Burn in step: 4000


[32mMCHMC (tuning):  20%|███████                            |  ETA: 0:00:08[39m

eps --->0.001071261550603264


[32mMCHMC (tuning):  39%|█████████████▊                     |  ETA: 0:00:05[39m

Burn in step: 8000
eps --->0.09003837253635981


[32mMCHMC (tuning):  59%|████████████████████▊              |  ETA: 0:00:04[39m

Burn in step: 12000
eps --->0.09950108084928186


[32mMCHMC (tuning):  79%|███████████████████████████▊       |  ETA: 0:00:02[39m

Burn in step: 16000
eps --->0.08612716213341816


[A2mMCHMC (tuning): 100%|███████████████████████████████████| Time: 0:00:08[39m

Burn in step: 20000
eps --->0.10749541188850717


[36m[1m[ [22m[39m[36m[1mInfo: [22m[39meps: 0.10749541188850717
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mL: 2.6457513110645907
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mnu: 0.10996861200892329
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39msigma: [0.07711308411879499, 0.20260245978857944, 0.21503762824055003, 0.12840579611625044, 0.08114520897726812, 0.009293887991398367, 0.0023192761480757096]
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39madaptive: true
[A2mMCHMC: 100%|████████████████████████████████████████████| Time: 0:01:05[39m

 78.544728 seconds (71.28 M allocations: 73.659 GiB, 7.93% gc time, 4.41% compilation time: <1% of which was recompilation)


78.57601404190063

In [31]:
n_parallel_mchmc = 8
chains = Vector{Any}(undef, n_parallel_mchmc)
vec_ess = zeros(n_parallel_mchmc)

start_mchmc = time()
@time for i in 1:n_parallel_mchmc
    chains[i] = Sample(MCHMC(nadapts, 0.001; init_eps=0.05, L=sqrt(d), adaptive=true), target, nsteps;
                       progress=true,
                       dialog=true, file_name="chain_1",
                       initial_x=bestfit_Planck.values.array)
    vec_ess[i] = mean(Summarize(chains[i])[1][1:7])
end

end_mchmc = time()
time_mchmc_parallel_Planck = end_mchmc - start_mchmc

[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mTuning sigma ⏳
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mTuning eps ⏳
[32mMCHMC (tuning):  20%|██████▉                            |  ETA: 0:00:06[39m

Burn in step: 4000
eps --->0.0008712916688737017


[32mMCHMC (tuning):  40%|██████████████                     |  ETA: 0:00:04[39m

Burn in step: 8000
eps --->0.014923325257076434


[32mMCHMC (tuning):  59%|████████████████████▊              |  ETA: 0:00:03[39m

Burn in step: 12000
eps --->0.017751291101307364


[32mMCHMC (tuning):  79%|███████████████████████████▋       |  ETA: 0:00:02[39m

Burn in step: 16000
eps --->0.02291769305407718


[A2mMCHMC (tuning): 100%|███████████████████████████████████| Time: 0:00:07[39m

Burn in step: 20000
eps --->0.013700462742265019


[36m[1m[ [22m[39m[36m[1mInfo: [22m[39meps: 0.013700462742265019
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mL: 2.6457513110645907
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mnu: 0.03856421604989937
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39msigma: [0.5379006559205911, 0.6154570877709639, 0.7515065999224838, 0.4564835692695301, 0.28386569311500365, 0.020019522921321836, 0.026068370100390946]
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39madaptive: true
[A2mMCHMC: 100%|████████████████████████████████████████████| Time: 0:01:03[39m[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mTuning sigma ⏳
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mTuning eps ⏳
[32mMCHMC (tuning):  19%|██████▌                            |  ETA: 0:00:05[39m

Burn in step: 4000
eps --->0.0017088986204636804


[32mMCHMC (tuning):  39%|█████████████▋                     |  ETA: 0:00:04[39m

Burn in step: 8000
eps --->0.005931500676072603


[32mMCHMC (tuning):  59%|████████████████████▋              |  ETA: 0:00:03[39m

Burn in step: 12000
eps --->0.008997865368823688


[32mMCHMC (tuning):  79%|███████████████████████████▊       |  ETA: 0:00:02[39m

Burn in step: 16000
eps --->0.00891709657347562


[A2mMCHMC (tuning): 100%|███████████████████████████████████| Time: 0:00:07[39m

Burn in step: 20000
eps --->0.009690149908083463


[36m[1m[ [22m[39m[36m[1mInfo: [22m[39meps: 0.009690149908083463
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mL: 2.6457513110645907
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mnu: 0.032408020877642114
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39msigma: [1.1269817274221217, 0.30930376800083703, 0.47086813163425284, 1.0692721142423747, 0.1493989735698836, 0.04821889029455214, 0.016151785614939216]
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39madaptive: true
[A2mMCHMC: 100%|████████████████████████████████████████████| Time: 0:01:06[39m[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mTuning sigma ⏳
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mTuning eps ⏳
[32mMCHMC (tuning):  20%|███████                            |  ETA: 0:00:06[39m

Burn in step: 4000
eps --->0.001180472742811993


[32mMCHMC (tuning):  39%|█████████████▋                     |  ETA: 0:00:05[39m

Burn in step: 8000
eps --->0.013504408206032674


[32mMCHMC (tuning):  60%|████████████████████▉              |  ETA: 0:00:03[39m

Burn in step: 12000
eps --->0.01565893143720719


[32mMCHMC (tuning):  79%|███████████████████████████▋       |  ETA: 0:00:02[39m

Burn in step: 16000
eps --->0.019892067530506232


[A2mMCHMC (tuning): 100%|███████████████████████████████████| Time: 0:00:08[39m

Burn in step: 20000
eps --->0.023097628796563272


[36m[1m[ [22m[39m[36m[1mInfo: [22m[39meps: 0.023097628796563272
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mL: 2.6457513110645907
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mnu: 0.05016186075795197
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39msigma: [0.38696770569233513, 0.1863104762568004, 0.4189243323402523, 0.1516399053779516, 0.16729112577611902, 0.016652575999558803, 0.02218479093275954]
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39madaptive: true
[A2mMCHMC: 100%|████████████████████████████████████████████| Time: 0:01:06[39m[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mTuning sigma ⏳
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mTuning eps ⏳
[32mMCHMC (tuning):  20%|███████                            |  ETA: 0:00:05[39m

Burn in step: 4000
eps --->0.000818679703252221


[32mMCHMC (tuning):  40%|██████████████                     |  ETA: 0:00:04[39m

Burn in step: 8000
eps --->0.0064710630155018715


[32mMCHMC (tuning):  59%|████████████████████▊              |  ETA: 0:00:03[39m

Burn in step: 12000
eps --->0.008661245618626266


[32mMCHMC (tuning):  79%|███████████████████████████▊       |  ETA: 0:00:02[39m

Burn in step: 16000
eps --->0.013079331135260034


[A2mMCHMC (tuning): 100%|███████████████████████████████████| Time: 0:00:08[39m

Burn in step: 20000
eps --->0.017687775130696862


[36m[1m[ [22m[39m[36m[1mInfo: [22m[39meps: 0.017687775130696862
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mL: 2.6457513110645907
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mnu: 0.043851191163672226
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39msigma: [0.9294305286306811, 0.26715243354898266, 0.34712646983740586, 0.37857623243829164, 0.1615264414550656, 0.023686401411764924, 0.039233862165349284]
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39madaptive: true
[A2mMCHMC: 100%|████████████████████████████████████████████| Time: 0:01:05[39m[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mTuning sigma ⏳
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mTuning eps ⏳
[32mMCHMC (tuning):  20%|██████▉                            |  ETA: 0:00:05[39m

Burn in step: 4000
eps --->0.001121682589609043


[32mMCHMC (tuning):  39%|█████████████▊                     |  ETA: 0:00:04[39m

Burn in step: 8000
eps --->0.013068466945803894


[32mMCHMC (tuning):  60%|█████████████████████              |  ETA: 0:00:03[39m

Burn in step: 12000
eps --->0.015969583366740397


[32mMCHMC (tuning):  80%|████████████████████████████       |  ETA: 0:00:02[39m

Burn in step: 16000
eps --->0.026451172160349794


[A2mMCHMC (tuning): 100%|███████████████████████████████████| Time: 0:00:08[39m

Burn in step: 20000
eps --->0.025665414400416243


[36m[1m[ [22m[39m[36m[1mInfo: [22m[39meps: 0.025665414400416243
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mL: 2.6457513110645907
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mnu: 0.052902412270285186
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39msigma: [0.42895983547440664, 0.2936403097263702, 0.6733522807576505, 0.5567256236322892, 0.14812816685373653, 0.025943761936916897, 0.011296043516274567]
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39madaptive: true
[A2mMCHMC: 100%|████████████████████████████████████████████| Time: 0:01:05[39m[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mTuning sigma ⏳
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mTuning eps ⏳
[32mMCHMC (tuning):  19%|██████▌                            |  ETA: 0:00:06[39m

Burn in step: 4000
eps --->0.0011111875350427742


[32mMCHMC (tuning):  40%|██████████████                     |  ETA: 0:00:04[39m

Burn in step: 8000
eps --->0.05386379032917042


[32mMCHMC (tuning):  59%|████████████████████▋              |  ETA: 0:00:03[39m

Burn in step: 12000
eps --->0.06755398474148226


[32mMCHMC (tuning):  79%|███████████████████████████▊       |  ETA: 0:00:02[39m

Burn in step: 16000
eps --->0.10322874771956835


[A2mMCHMC (tuning): 100%|███████████████████████████████████| Time: 0:00:08[39m

Burn in step: 20000
eps --->0.09182133577952538


[36m[1m[ [22m[39m[36m[1mInfo: [22m[39meps: 0.09182133577952538
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mL: 2.6457513110645907
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mnu: 0.10133115639164428
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39msigma: [0.062416327166186336, 0.5127141567876453, 0.19351785832374635, 0.15950917523227773, 0.13464471032236594, 0.008242060268859559, 0.002396356918780667]
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39madaptive: true
[A2mMCHMC: 100%|████████████████████████████████████████████| Time: 0:01:04[39m[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mTuning sigma ⏳
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mTuning eps ⏳
[32mMCHMC (tuning):  19%|██████▊                            |  ETA: 0:00:06[39m

Burn in step: 4000
eps --->0.0011898772924798736


[32mMCHMC (tuning):  39%|█████████████▋                     |  ETA: 0:00:04[39m

Burn in step: 8000
eps --->0.02579652563436573


[32mMCHMC (tuning):  60%|████████████████████▉              |  ETA: 0:00:03[39m

Burn in step: 12000
eps --->0.016052392944034728


[32mMCHMC (tuning):  79%|███████████████████████████▋       |  ETA: 0:00:02[39m

Burn in step: 16000
eps --->0.020930126381991893


[A2mMCHMC (tuning): 100%|███████████████████████████████████| Time: 0:00:08[39m

Burn in step: 20000
eps --->0.021857324965269917


[36m[1m[ [22m[39m[36m[1mInfo: [22m[39meps: 0.021857324965269917
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mL: 2.6457513110645907
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mnu: 0.04878500561020701
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39msigma: [0.41029027284913544, 0.26060847992556374, 0.1826518136286482, 0.18350161708391902, 0.16111531273888893, 0.01766734409535758, 0.02526668724851994]
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39madaptive: true
[A2mMCHMC: 100%|████████████████████████████████████████████| Time: 0:01:06[39m[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mTuning sigma ⏳
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mTuning eps ⏳
[32mMCHMC (tuning):  19%|██████▊                            |  ETA: 0:00:06[39m

Burn in step: 4000
eps --->0.0009537521967685694


[32mMCHMC (tuning):  40%|██████████████                     |  ETA: 0:00:04[39m

Burn in step: 8000
eps --->0.006130977648340512


[32mMCHMC (tuning):  59%|████████████████████▋              |  ETA: 0:00:03[39m

Burn in step: 12000
eps --->0.010201540274976923


[32mMCHMC (tuning):  80%|███████████████████████████▉       |  ETA: 0:00:02[39m

Burn in step: 16000
eps --->0.01168944087272976


[A2mMCHMC (tuning): 100%|███████████████████████████████████| Time: 0:00:08[39m

Burn in step: 20000
eps --->0.011839433964718591


[36m[1m[ [22m[39m[36m[1mInfo: [22m[39meps: 0.011839433964718591
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mL: 2.6457513110645907
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mnu: 0.03583681224321547
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39msigma: [0.7480362620435304, 0.21964586727268903, 0.29230392367052555, 0.18142550497808402, 0.1607324730938147, 0.034908067729428185, 0.014483530292841433]
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39madaptive: true
[A2mMCHMC: 100%|████████████████████████████████████████████| Time: 0:01:04[39m

605.279083 seconds (521.36 M allocations: 633.994 GiB, 8.96% gc time)


605.3349211215973

In [33]:
Planck_MCHMC_parallel_ESS_s = sum(vec_ess)/time_mchmc_parallel_Planck

32.70205001929367

In [36]:
x = [mapreduce(permutedims, vcat, chains[i]) for i in 1:n_parallel_mchmc]

planck_mchmc_multi_chains = zeros(nsteps*n_parallel_mchmc, 7)
for i in 1:7
    planck_mchmc_multi_chains[:,i] = extract_single(x, i, n_parallel_mchmc)
end

In [37]:
npzwrite("chains_Planck_cheb_MOPED_MCHMC_multi.npy", planck_mchmc_multi_chains)