# Capse.jl reloaded: using Chebyshev polynomials

In this notebook we use the trained Capse.jl emulators that take advantage of the Chebyshev polynomial decomposition. The core idea is that, rather than using a neural network that outputs the $C_\ell$'s directly

$$
\theta\rightarrow \mathrm{NN}(\theta)\rightarrow C_\ell(\theta)
$$

we decompose, at a fixed cosmology, the $C_\ell$'s on the Chebyshev basis

$$
C_\ell(\theta)\approx\sum_{i=0}^N a_i(\theta)T_i
$$

where $T_i$ is the Chebyshev polynomial of degree $i$.
In this case, the cosmological dependence is encoded in the expansion coefficients $a_i(\theta)$, which become the emulation target:

$$
\theta\rightarrow\mathrm{NN}(\theta)\rightarrow a_i(\theta)\rightarrow C_\ell(\theta)
$$
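To make the decomposition concrete, here is a minimal sketch that evaluates a truncated Chebyshev series with the three-term recurrence $T_{k+1}(x)=2xT_k(x)-T_{k-1}(x)$. It assumes the multipoles are mapped affinely to $x\in[-1,1]$; the helper names `rescale` and `chebsum` and the toy coefficients are illustrative and are not part of Capse.jl.

```julia
# Map a multipole ℓ ∈ [ℓmin, ℓmax] to x ∈ [-1, 1] (assumed affine rescaling).
rescale(ℓ, ℓmin, ℓmax) = (2 * ℓ - ℓmin - ℓmax) / (ℓmax - ℓmin)

# Evaluate Σ_{i=0}^{N} a[i+1] * T_i(x) with the three-term recurrence.
function chebsum(a::AbstractVector, x::Real)
    Tprev, Tcur = one(x), x              # T_0(x) and T_1(x)
    s = a[1] * Tprev
    length(a) == 1 && return s
    s += a[2] * Tcur
    for i in 3:length(a)
        # advance the recurrence: (T_{k-1}, T_k) -> (T_k, T_{k+1})
        Tprev, Tcur = Tcur, 2 * x * Tcur - Tprev
        s += a[i] * Tcur
    end
    return s
end

# Toy coefficients (not emulator output): 1*T_0 + 2*T_1 + 3*T_2.
a = [1.0, 2.0, 3.0]
x = rescale(1255.0, 2.0, 2508.0)         # the midpoint of the ℓ range maps to x = 0
value = chebsum(a, x)
```

In the emulator setting, the vector `a` would be the network output for a given cosmology, and the sum would be evaluated on the whole $\ell$ grid at once.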

In the first part of the notebook we show how the Chebyshev expansion can be used to approximate the CMB $C_\ell$'s.

In the second part we show how to compute some Planck chains using the emulator.

Let us start by activating the static Julia environment and importing the relevant packages.

In [1]:
using Pkg
Pkg.activate(".")
Pkg.instantiate()
Pkg.resolve()

[32m[1m  Activating[22m[39m project at `~/Desktop/papers/capse_paper/chebyshev_emulator`
[32m[1m  No Changes[22m[39m to `~/Desktop/papers/capse_paper/chebyshev_emulator/Project.toml`
[32m[1m  No Changes[22m[39m to `~/Desktop/papers/capse_paper/chebyshev_emulator/Manifest.toml`


In [2]:
using FastChebInterp
using BenchmarkTools
using LoopVectorization
using SimpleChains
using Turing
using Optim
using LinearAlgebra
using StatsPlots
using Pathfinder
using Capse
using NPZ
import MCMCChains: compute_duration
using MCMCDiagnosticTools
using MicroCanonicalHMC
using Transducers
using DataFrames
using PlanckLite
include("utils.jl");

Since we focus on the Planck analysis, we only use the multipoles $\ell\in[2,2508]$.
In this example we use a Chebyshev expansion of degree $47$, i.e. `grad_cheb = 48` coefficients.

In [3]:
min_idx = 3
max_idx = 2509

grad_cheb = 48
weights_folder = "../data/weights/weights_cheb_cosmopowerspace_10000/"
l = Float64.(npzread(weights_folder*"l.npy")[min_idx:max_idx]);

# Checking the emulator: using the validation dataset

Here we show how to use the emulators and how to assess the emulation error on the validation dataset.
The next cell defines the multilayer perceptron architecture. It has to match the trained weights loaded below, so please do not modify this cell!

In [4]:
mlpd = SimpleChain(
  static(6),
  TurboDense(tanh, 64),
  TurboDense(tanh, 64),
  TurboDense(tanh, 64),
  TurboDense(tanh, 64),
  TurboDense(tanh, 64),
  TurboDense(identity, grad_cheb)
);

Let us load the emulators.

In [5]:
weights_TT = npzread(weights_folder*"weights_TT_lcdm.npy")
trained_emu_TT = Capse.SimpleChainsEmulator(Architecture= mlpd, Weights = weights_TT)
CℓTT_emu = Capse.CℓEmulator(TrainedEmulator = trained_emu_TT, ℓgrid = l,
                             InMinMax = npzread(weights_folder*"inMinMax_lcdm.npy"),
                             OutMinMax = npzread(weights_folder*"outMinMaxCℓTT_lcdm.npy"),
                             PolyGrid= zeros(50,50), ChebDegree = grad_cheb);

In [6]:
weights_EE = npzread(weights_folder*"weights_EE_lcdm.npy")
trained_emu_EE = Capse.SimpleChainsEmulator(Architecture= mlpd, Weights = weights_EE)
CℓEE_emu = Capse.CℓEmulator(TrainedEmulator = trained_emu_EE, ℓgrid = l,
                             InMinMax = npzread(weights_folder*"inMinMax_lcdm.npy"),
                             OutMinMax = npzread(weights_folder*"outMinMaxCℓEE_lcdm.npy"),
                             PolyGrid= zeros(50,50), ChebDegree = grad_cheb);

In [7]:
weights_TE = npzread(weights_folder*"weights_TE_lcdm.npy")
trained_emu_TE = Capse.SimpleChainsEmulator(Architecture= mlpd, Weights = weights_TE)
CℓTE_emu = Capse.CℓEmulator(TrainedEmulator = trained_emu_TE, ℓgrid = l,
                             InMinMax = npzread(weights_folder*"inMinMax_lcdm.npy"),
                             OutMinMax = npzread(weights_folder*"outMinMaxCℓTE_lcdm.npy"),
                             PolyGrid= zeros(50,50), ChebDegree = grad_cheb);

In [8]:
weights_PP = npzread(weights_folder*"weights_PP_lcdm.npy")
trained_emu_PP = Capse.SimpleChainsEmulator(Architecture= mlpd, Weights = weights_PP)
CℓPP_emu = Capse.CℓEmulator(TrainedEmulator = trained_emu_PP, ℓgrid = l,
                             InMinMax = npzread(weights_folder*"inMinMax_lcdm.npy"),
                             OutMinMax = npzread(weights_folder*"outMinMaxCℓPP_lcdm.npy"),
                             PolyGrid= zeros(50,50), ChebDegree = grad_cheb);

The first thing to do is to evaluate the PolyGrid, i.e. the Chebyshev polynomials on the $\ell$ grid used in the training.
After the evaluation, the result is stored in the emulator and does not need to be computed again.

In [9]:
Capse.eval_polygrid!(CℓEE_emu)
Capse.eval_polygrid!(CℓTT_emu)
Capse.eval_polygrid!(CℓTE_emu)
Capse.eval_polygrid!(CℓPP_emu)
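As a rough illustration of what such a polynomial grid contains, the sketch below builds a matrix whose column $i$ holds $T_{i-1}$ evaluated on the rescaled multipoles, so that a spectrum is obtained with a single matrix-vector product. This is not the Capse.jl implementation: the function `chebyshev_grid`, the affine rescaling and the random coefficients are assumptions made for the example.

```julia
using LinearAlgebra

# Build the matrix P with P[j, i] = T_{i-1}(x_j) on a grid x ⊂ [-1, 1].
function chebyshev_grid(x::AbstractVector, ncoef::Integer)
    P = ones(length(x), ncoef)           # first column is T_0 = 1
    ncoef > 1 && (P[:, 2] .= x)          # second column is T_1 = x
    for i in 3:ncoef
        # three-term recurrence applied column by column
        P[:, i] .= 2 .* x .* P[:, i-1] .- P[:, i-2]
    end
    return P
end

ℓ = collect(2.0:2508.0)
x = (2 .* ℓ .- ℓ[1] .- ℓ[end]) ./ (ℓ[end] - ℓ[1])   # assumed affine map to [-1, 1]
P = chebyshev_grid(x, 48)                # computed once, reused for every cosmology

a = randn(48)                            # stand-in for the emulated coefficients
Cℓ = P * a                               # one matrix-vector product per evaluation
```

Since the grid depends only on $\ell$ and not on the cosmological parameters, it makes sense to compute it once and cache it.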

Let us now benchmark the computation of the $C_\ell$'s.

In [10]:
input_test = rand(6)
@benchmark Capse.get_Cℓ($input_test, $CℓTE_emu)

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  10.413 μs …  2.758 ms  ┊ GC (min … max): 0.00% … 98.16%
 Time  (median):     14.708 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   15.763 μs ± 38.017 μs  ┊ GC (mean ± σ):  3.34% ±  1.39%

We also benchmark the emulation of the Chebyshev coefficients alone.

In [11]:
@benchmark Capse.get_chebcoefs($input_test, $CℓTE_emu)

BenchmarkTools.Trial: 10000 samples with 9 evaluations.
 Range (min … max):  2.852 μs … 790.853 μs  ┊ GC (min … max): 0.00% … 98.89%
 Time  (median):     3.643 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   3.729 μs ±   7.882 μs  ┊ GC (mean ± σ):  2.10% ±  0.99%

Here we want to emphasize an important point:

- emulating the Chebyshev coefficients requires around $4\,\mu s$
- emulating the Chebyshev coefficients *and* computing the $C_\ell$'s takes around $15\,\mu s$

This observation suggests an optimization: if we can write the likelihood in such a way that the $C_\ell$'s are never computed explicitly, we can improve the overall computational performance.
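The reason this can work is linearity: any linear operation applied to the $C_\ell$'s, such as binning, can instead be applied once to the polynomials themselves. The following sketch checks this on toy data. The averaging matrix `B` is a hypothetical stand-in for the real binning, and `P` and `a` are random placeholders for the polynomial grid and the emulated coefficients.

```julia
using LinearAlgebra

nℓ, ncoef, nbins = 2507, 48, 215

P = randn(nℓ, ncoef)        # stand-in for the Chebyshev polynomials on the ℓ grid
a = randn(ncoef)            # stand-in for the emulated coefficients

# Toy binning operator: each row averages a contiguous block of multipoles.
B = zeros(nbins, nℓ)
edges = round.(Int, range(0, nℓ; length = nbins + 1))
for b in 1:nbins
    cols = (edges[b] + 1):edges[b + 1]
    B[b, cols] .= 1 / length(cols)
end

slow = B * (P * a)          # compute every C_ℓ, then bin
BP = B * P                  # precomputed once: binned polynomials (nbins × ncoef)
fast = BP * a               # per evaluation: a single small matrix-vector product
```

Both routes give the same binned vector up to floating-point rounding, but the second one only touches a `nbins × ncoef` matrix at each likelihood evaluation.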

# PlanckLite & Chebyshev

In [12]:
lsTT = 2:2508
lsTE = 2:1996
facTT=lsTT.*(lsTT.+1)./(2*π)
facTE=lsTE.*(lsTE.+1)./(2*π)

function call_emu_plancklite(θ, Emu_TT, Emu_TE, Emu_EE, facTT, facTE)
    return PlanckLite.bin_Cℓ(Capse.get_Cℓ(θ, Emu_TT)[1:2507]./facTT,
                            Capse.get_Cℓ(θ, Emu_TE)[1:1995]./facTE,
                            Capse.get_Cℓ(θ, Emu_EE)[1:1995]./facTE)
end

call_emu_plancklite (generic function with 1 method)

In [13]:
Γ = sqrt(PlanckLite.cov)
iΓ = inv(Γ)
D = iΓ * PlanckLite.data;
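The cell above whitens the data with the inverse square root of the covariance. A minimal sketch of why this is convenient, using a small random covariance rather than the Planck one (all variable names here are illustrative): after whitening both data and theory, the Gaussian $\chi^2$ reduces to a plain sum of squares, which is what allows an identity covariance in the likelihood.

```julia
using LinearAlgebra

n = 5
A = randn(n, n)
C = Symmetric(A * A' + n * I)     # toy symmetric positive-definite covariance
d = randn(n)                      # toy data vector
t = randn(n)                      # toy theory vector

Γ = sqrt(C)                       # symmetric matrix square root, Γ * Γ ≈ C
iΓ = inv(Γ)

χ2_full = dot(d - t, C \ (d - t))         # (d - t)ᵀ C⁻¹ (d - t)
χ2_white = sum(abs2, iΓ * d - iΓ * t)     # same number with identity covariance
```

Because `iΓ` is a fixed matrix, it can also be folded into any precomputed linear operator acting on the theory vector.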

In [14]:
theory_plancklite(θ) = call_emu_plancklite(θ, CℓTT_emu, CℓTE_emu, CℓEE_emu, facTT, facTE)

theory_plancklite (generic function with 1 method)

In [15]:
@benchmark theory_plancklite(ones(6))

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  247.032 μs …   7.347 ms  ┊ GC (min … max): 0.00% … 86.29%
 Time  (median):     303.349 μs               ┊ GC (median):    0.00%
 Time  (mean ± σ):   366.383 μs ± 378.963 μs  ┊ GC (mean ± σ):  6.23% ±  5.95%

In [16]:
function bin_grid(Emu_TT, Emu_TE, Emu_EE, facTT, facTE)
    # bin each Chebyshev polynomial (one column of PolyGrid) as if it were a C_ℓ vector
    result = zeros(613,48)
    for i in 1:48
        result[:, i] = PlanckLite.bin_Cℓ(Emu_TT.PolyGrid[1:2507,i]./facTT,
                                         Emu_TE.PolyGrid[1:1995,i]./facTE,
                                         Emu_EE.PolyGrid[1:1995,i]./facTE)
    end
    return result
end

binned_grid_std = bin_grid(CℓTT_emu, CℓTE_emu, CℓEE_emu, facTT, facTE);

In [17]:
function fast_computation(θ, CℓTT_emu, CℓTE_emu, CℓEE_emu, binned_grid)
    coeff_TT = Capse.get_chebcoefs(θ, CℓTT_emu)
    coeff_TE = Capse.get_chebcoefs(θ, CℓTE_emu)
    coeff_EE = Capse.get_chebcoefs(θ, CℓEE_emu)
    TT = binned_grid[1:215,:]   * coeff_TT
    TE = binned_grid[216:414,:] * coeff_TE
    EE = binned_grid[415:613,:] * coeff_EE

    return vcat(TT, TE, EE)
end

function super_fast_computation(θ, CℓTT_emu, CℓTE_emu, CℓEE_emu, binned_grid_TT, binned_grid_TE, binned_grid_EE)
    # each grid is zero-padded to 613 rows and already multiplied by iΓ,
    # so the three contributions can simply be summed
    return binned_grid_TT * Capse.get_chebcoefs(θ, CℓTT_emu) +
           binned_grid_TE * Capse.get_chebcoefs(θ, CℓTE_emu) +
           binned_grid_EE * Capse.get_chebcoefs(θ, CℓEE_emu)
end

binned_grid_TT_std = binned_grid_std[1:215,:]
binned_grid_TE_std = binned_grid_std[216:414,:]

binned_grid_TT = zeros(613,48)
binned_grid_TE = zeros(613,48)
binned_grid_EE = zeros(613,48)

binned_grid_TT[1:215,:]   = binned_grid_std[1:215,:]
binned_grid_TE[216:414,:] = binned_grid_std[216:414,:]
binned_grid_EE[415:613,:] = binned_grid_std[415:613,:]

binned_grid_TT = iΓ * binned_grid_TT
binned_grid_TE = iΓ * binned_grid_TE
binned_grid_EE = iΓ * binned_grid_EE


theory_fast_std(θ) = fast_computation(θ, CℓTT_emu, CℓTE_emu, CℓEE_emu, binned_grid_std)
theory_fast(θ) = super_fast_computation(θ, CℓTT_emu, CℓTE_emu, CℓEE_emu, binned_grid_TT, binned_grid_TE, binned_grid_EE)

theory_fast (generic function with 1 method)

In [18]:
iΓ * theory_fast_std(ones(6)) .- theory_fast(ones(6))

613-element Vector{Float64}:
 -1.6653345369377348e-15
  6.661338147750939e-16
 -6.661338147750939e-16
  1.9984014443252818e-15
 -1.5543122344752192e-15
 -6.661338147750939e-16
  2.4424906541753444e-15
 -6.661338147750939e-16
 -6.661338147750939e-16
  0.0
 -1.5543122344752192e-15
 -2.220446049250313e-15
  2.220446049250313e-16
  ⋮
 -3.7730235602495554e-17
  1.1102230246251565e-16
 -1.8735013540549517e-16
  8.326672684688674e-17
 -1.1102230246251565e-16
 -9.71445146547012e-17
  7.28583859910259e-17
 -2.7755575615628914e-17
 -1.1796119636642288e-16
 -8.326672684688674e-17
  6.938893903907228e-18
  1.1102230246251565e-16

In [19]:
@benchmark theory_fast(ones(6))

BenchmarkTools.Trial: 10000 samples with 1 evaluation.
 Range (min … max):  23.114 μs …  5.523 ms  ┊ GC (min … max): 0.00% … 84.80%
 Time  (median):     30.806 μs              ┊ GC (median):    0.00%
 Time  (mean ± σ):   31.179 μs ± 56.831 μs  ┊ GC (mean ± σ):  1.50% ±  0.85%

In [20]:
@model function CMB_planck_ultra_fast(D)
    # priors on the (rescaled) model parameters
    ln10As ~ Uniform(0.25, 0.35)
    ns     ~ Uniform(0.88, 1.06)
    h0     ~ Uniform(0.60, 0.80)
    ωb     ~ Uniform(0.1985, 0.25)
    ωc     ~ Uniform(0.08, 0.20)
    τ      ~ Normal(0.0506, 0.0086)
    yₚ     ~ Normal(1.0, 0.0025)

    # undo the rescaling to obtain the emulator inputs
    θ = [10*ln10As, ns, 100*h0, ωb/10, ωc, τ]
    # whitened theory prediction (theory_fast already includes iΓ), divided by yₚ^2
    pred = theory_fast(θ) ./(yₚ^2)
    # D is whitened as well, so the likelihood has identity covariance
    D ~ MvNormal(pred, I)

    return nothing
end

CMB_model_planck_ultra_fast = CMB_planck_ultra_fast(D)

DynamicPPL.Model{typeof(CMB_planck_ultra_fast), (:D,), (), (), Tuple{Vector{Float64}}, Tuple{}, DynamicPPL.DefaultContext}(CMB_planck_ultra_fast, (D = [9.643012835639828, 10.803277764202981, 12.837021443612652, 12.159764105419248, 11.744821016084751, 13.713281787419634, 13.957299336082897, 15.005683408044218, 15.585981279991291, 13.764821092399968  …  -0.4208097528485547, 0.9948910443777017, 3.0065727299752423, 0.38248077476446013, -1.567094536865492, 1.6989024589667159, 1.0732606672031262, -1.6425448532692997, 2.311295810549454, 0.6863715328875395],), NamedTuple(), DynamicPPL.DefaultContext())

In [21]:
bestfit_Planck = optimize(CMB_model_planck_ultra_fast, MAP(), Optim.Options(iterations=100000, allow_f_increases=true))

ModeResult with maximized lp of -836.24
[0.305187288403472, 0.9634362838080817, 0.6713595762487367, 0.223418397764578, 0.12063380739590789, 0.057132455205567215, 1.0005309971887266]

In [22]:
@benchmark optimize(CMB_model_planck_ultra_fast, MAP(), Optim.Options(iterations=100000, allow_f_increases=true))

BenchmarkTools.Trial: 89 samples with 1 evaluation.
 Range (min … max):  33.255 ms … 114.834 ms  ┊ GC (min … max): 0.00% … 4.30%
 Time  (median):     54.234 ms               ┊ GC (median):    0.00%
 Time  (mean ± σ):   56.819 ms ±  15.992 ms  ┊ GC (mean ± σ):  3.68% ± 4.54%

In [23]:
# the call is issued twice: the first run includes compilation, the second one is timed
result_multi = multipathfinder(CMB_model_planck_ultra_fast, 5000; nruns=6, executor = Transducers.PreferParallel())
@time result_multi = multipathfinder(CMB_model_planck_ultra_fast, 5000; nruns=6, executor = Transducers.PreferParallel())

[33m[1m└ [22m[39m[90m@ Pathfinder ~/.julia/packages/Pathfinder/1B4yO/src/singlepath.jl:212[39m
[33m[1m└ [22m[39m[90m@ Pathfinder ~/.julia/packages/Pathfinder/1B4yO/src/singlepath.jl:212[39m
[33m[1m└ [22m[39m[90m@ Pathfinder ~/.julia/packages/Pathfinder/1B4yO/src/singlepath.jl:212[39m
[33m[1m└ [22m[39m[90m@ Pathfinder ~/.julia/packages/Pathfinder/1B4yO/src/singlepath.jl:212[39m
[33m[1m└ [22m[39m[90m@ Pathfinder ~/.julia/packages/Pathfinder/1B4yO/src/singlepath.jl:212[39m
[33m[1m└ [22m[39m[90m@ Pathfinder ~/.julia/packages/Pathfinder/1B4yO/src/singlepath.jl:212[39m
[33m[1m└ [22m[39m[90m@ Pathfinder ~/.julia/packages/Pathfinder/1B4yO/src/singlepath.jl:212[39m
[33m[1m└ [22m[39m[90m@ Pathfinder ~/.julia/packages/Pathfinder/1B4yO/src/singlepath.jl:212[39m
[33m[1m└ [22m[39m[90m@ Pathfinder ~/.julia/packages/Pathfinder/1B4yO/src/singlepath.jl:212[39m
[33m[1m└ [22m[39m[90m@ Pathfinder ~/.julia/packages/Pathfinder/1B4yO/src/singlepath.jl

  2.372071 seconds (3.07 M allocations: 2.156 GiB, 7.45% gc time)


Multi-path Pathfinder result
  runs: 6
  draws: 5000
  Pareto shape diagnostic: 0.86 (bad)

In [24]:
result_multi.draws_transformed

Chains MCMC chain (5000×7×1 Array{Float64, 3}):

Iterations        = 1:1:5000
Number of chains  = 1
Samples per chain = 5000
parameters        = ln10As, ns, h0, ωb, ωc, τ, yₚ

Summary Statistics
  parameters      mean       std   naive_se      mcse         ess      rhat
      Symbol   Float64   Float64    Float64   Float64     Float64   Float64

      ln10As    0.3052    0.0015     0.0000    0.0000   4956.0946    0.9999
          ns    0.9641    0.0044     0.0001    0.0001   4761.1231    1.0000
          h0    0.6724    0.0066     0.0001    0.0001   4678.8754    0.9999
          ωb    0.2234    0.0014     0.0000    0.0000   4498.5708    1.0002
          ωc    0.1204    0.0015     0.0000    0.0000   4847.6505    0.9999
           τ    0.0575    0.0074     0.0001    0.0001   4965.5844    0.9998
          yₚ    1.0004    0.0024     0.0000    0.0000   4752.8772    0.9998

In [25]:
nsteps = 5000
nadapts = 500
nchains = 6

init_params = collect.(eachrow(result_multi.draws_transformed.value[1:nchains, :, 1]));

In [26]:
chains_planck_std_NUTS = sample(CMB_model_planck_ultra_fast, NUTS(nadapts, 0.65), MCMCThreads(), nsteps, nchains; init_params = init_params)

[36m[1m┌ [22m[39m[36m[1mInfo: [22m[39mFound initial step size
[36m[1m└ [22m[39m  ϵ = 0.00078125
[36m[1m┌ [22m[39m[36m[1mInfo: [22m[39mFound initial step size
[36m[1m└ [22m[39m  ϵ = 0.00078125
[36m[1m┌ [22m[39m[36m[1mInfo: [22m[39mFound initial step size
[36m[1m└ [22m[39m  ϵ = 0.00078125
[36m[1m┌ [22m[39m[36m[1mInfo: [22m[39mFound initial step size
[36m[1m└ [22m[39m  ϵ = 0.00078125
[36m[1m┌ [22m[39m[36m[1mInfo: [22m[39mFound initial step size
[36m[1m└ [22m[39m  ϵ = 0.00078125
[36m[1m┌ [22m[39m[36m[1mInfo: [22m[39mFound initial step size
[36m[1m└ [22m[39m  ϵ = 0.00078125
[32mSampling (6 threads): 100%|█████████████████████████████| Time: 0:01:59[39m


Chains MCMC chain (5000×19×6 Array{Float64, 3}):

Iterations        = 501:1:5500
Number of chains  = 6
Samples per chain = 5000
Wall duration     = 276.41 seconds
Compute duration  = 1118.97 seconds
parameters        = ln10As, ns, h0, ωb, ωc, τ, yₚ
internals         = lp, n_steps, is_accept, acceptance_rate, log_density, hamiltonian_energy, hamiltonian_energy_error, max_hamiltonian_energy_error, tree_depth, numerical_error, step_size, nom_step_size

Summary Statistics
  parameters      mean       std   naive_se      mcse          ess      rhat   ⋯
      Symbol   Float64   Float64    Float64   Float64      Float64   Float64   ⋯

      ln10As    0.3052    0.0017     0.0000    0.0000   11639.8212    1.0004   ⋯
          ns    0.9635    0.0044     0.0000    0.0000   10188.7458    1.0002   ⋯
          h0    0.6714    0.0061     0.0000    0.0001    8512.0156

In [27]:
CPU_s_Planck_NUTS = compute_duration(chains_planck_std_NUTS)
Planck_NUTS_ESS = mean(MCMCDiagnosticTools.ess_rhat(chains_planck_std_NUTS)[[:ln10As, :ns, :h0, :ωb,:ωc, :τ, :yₚ],:ess])
Planck_NUTS_ESS_s = Planck_NUTS_ESS/CPU_s_Planck_NUTS

13.113964795355797

## Sampling with MCHMC

In [28]:
d = 7
target = TuringTarget(CMB_model_planck_ultra_fast)
nadapts = 20_000
nsteps = 200000

spl = MCHMC(nadapts, 0.001; init_eps=0.05, L=sqrt(d),# sigma=ones(d),  #try higher init_eps
            adaptive=true)
start_mchmc = time()
@time planck_mchmc = Sample(spl, target, nsteps;
                    progress=true,
                    dialog=true, file_name="chain_1",
                    initial_x=bestfit_Planck.values.array)
end_mchmc = time()
end_mchmc - start_mchmc

[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mTuning sigma ⏳
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mTuning eps ⏳
[32mMCHMC (tuning):  19%|██████▌                            |  ETA: 0:00:08[39m

Burn in step: 4000
eps --->0.0008829448115867843


[32mMCHMC (tuning):  39%|█████████████▊                     |  ETA: 0:00:05[39m

Burn in step: 8000
eps --->0.02966841877080876


[32mMCHMC (tuning):  60%|████████████████████▉              |  ETA: 0:00:03[39m

Burn in step: 12000
eps --->0.029843598044139125


[32mMCHMC (tuning):  79%|███████████████████████████▊       |  ETA: 0:00:02[39m

Burn in step: 16000
eps --->0.02533791589915911


[A2mMCHMC (tuning): 100%|███████████████████████████████████| Time: 0:00:08[39m

Burn in step: 20000
eps --->0.040625519544314585


[36m[1m[ [22m[39m[36m[1mInfo: [22m[39meps: 0.040625519544314585
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mL: 2.6457513110645907
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mnu: 0.06674730901284426
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39msigma: [0.1634958803877195, 0.8212187234497526, 0.262846033343063, 0.21954311540080532, 0.3475448868158938, 0.016689597632101504, 0.0026615390685791906]
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39madaptive: true
[A2mMCHMC: 100%|████████████████████████████████████████████| Time: 0:01:07[39m

 79.793967 seconds (71.28 M allocations: 73.658 GiB, 8.91% gc time, 4.28% compilation time: <1% of which was recompilation)


79.8123459815979

In [29]:
n_parallel_mchmc = 8
chains = Vector{Any}(undef, n_parallel_mchmc)
vec_ess = zeros(n_parallel_mchmc)

start_mchmc = time()
@time for i in 1:n_parallel_mchmc
    chains[i] = Sample(MCHMC(nadapts, 0.001; init_eps=0.05, L=sqrt(d), adaptive=true), target, nsteps;
                       progress=true,
                       dialog=true, file_name="chain_1",
                       initial_x=bestfit_Planck.values.array)
    vec_ess[i] = mean(Summarize(chains[i])[1][1:7])
end

end_mchmc = time()
time_mchmc_parallel_Planck = end_mchmc - start_mchmc

[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mTuning sigma ⏳
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mTuning eps ⏳
[32mMCHMC (tuning):  19%|██████▊                            |  ETA: 0:00:06[39m

Burn in step: 4000
eps --->0.0013449731916332305


[32mMCHMC (tuning):  40%|█████████████▉                     |  ETA: 0:00:04[39m

Burn in step: 8000
eps --->0.019980299044730494


[32mMCHMC (tuning):  59%|████████████████████▊              |  ETA: 0:00:03[39m

Burn in step: 12000
eps --->0.01862988518083222


[32mMCHMC (tuning):  79%|███████████████████████████▊       |  ETA: 0:00:02[39m

Burn in step: 16000
eps --->0.02592944204619462


[A2mMCHMC (tuning): 100%|███████████████████████████████████| Time: 0:00:08[39m

Burn in step: 20000
eps --->0.043868544136649125


[36m[1m[ [22m[39m[36m[1mInfo: [22m[39meps: 0.043868544136649125
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mL: 2.6457513110645907
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39mnu: 0.0694030398099291
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39msigma: [0.25680592153641824, 0.20055287900132526, 0.6062114562216662, 0.25136564281333457, 0.20150830626346186, 0.025137689252103006, 0.006237074846953144]
[36m[1m[ [22m[39m[36m[1mInfo: [22m[39madaptive: true
MCHMC: 100% Time: 0:01:10

[ Info: Tuning sigma ⏳
[ Info: Tuning eps ⏳

Burn in step: 4000
eps --->0.0014753473541276416

Burn in step: 8000
eps --->0.016610930737759155

Burn in step: 12000
eps --->0.016756065462298655

Burn in step: 16000
eps --->0.014290416392221185

MCHMC (tuning): 100% Time: 0:00:08

Burn in step: 20000
eps --->0.018107129668612502

[ Info: eps: 0.018107129668612502
[ Info: L: 2.6457513110645907
[ Info: nu: 0.0443714981412406
[ Info: sigma: [0.17091482953569923, 0.7818296153287377, 0.6538226504773079, 0.26698359051927145, 0.7631834297024751, 0.009693936537677875, 0.002718700488156737]
[ Info: adaptive: true
MCHMC: 100% Time: 0:01:06

[ Info: Tuning sigma ⏳
[ Info: Tuning eps ⏳

Burn in step: 4000
eps --->0.001246149401440086

Burn in step: 8000
eps --->0.02387924121423621

Burn in step: 12000
eps --->0.023677761401292394

Burn in step: 16000
eps --->0.03350080508713381

MCHMC (tuning): 100% Time: 0:00:08

Burn in step: 20000
eps --->0.03937270867456345

[ Info: eps: 0.03937270867456345
[ Info: L: 2.6457513110645907
[ Info: nu: 0.065694438159419
[ Info: sigma: [0.22072638675224698, 0.5921380931522368, 0.6187270846312162, 0.3376309611234896, 0.25698216724641293, 0.014704115110827753, 0.006824927429159037]
[ Info: adaptive: true
MCHMC: 100% Time: 0:01:08

[ Info: Tuning sigma ⏳
[ Info: Tuning eps ⏳

Burn in step: 4000
eps --->0.0013863830772652846

Burn in step: 8000
eps --->0.03250846424110592

Burn in step: 12000
eps --->0.04162347309975169

Burn in step: 16000
eps --->0.043869662150601604

MCHMC (tuning): 100% Time: 0:00:08

Burn in step: 20000
eps --->0.04888621056961595

[ Info: eps: 0.04888621056961595
[ Info: L: 2.6457513110645907
[ Info: nu: 0.07333466201691945
[ Info: sigma: [0.11878455947644172, 0.21643818548171934, 0.48912201794295357, 0.6135499160757449, 0.13132072651847282, 0.01247726899861755, 0.0030821101748852503]
[ Info: adaptive: true
MCHMC: 100% Time: 0:01:08

[ Info: Tuning sigma ⏳
[ Info: Tuning eps ⏳

Burn in step: 4000
eps --->0.0008270512809983288

Burn in step: 8000
eps --->0.010955583097204818

Burn in step: 12000
eps --->0.01781813974501131

Burn in step: 16000
eps --->0.024006352348185194

MCHMC (tuning): 100% Time: 0:00:08

Burn in step: 20000
eps --->0.018836996088219548

[ Info: eps: 0.018836996088219548
[ Info: L: 2.6457513110645907
[ Info: nu: 0.045263189411676924
[ Info: sigma: [0.44844543169193624, 0.1934457071460337, 0.2730860751129681, 0.2919774947104515, 0.09743948753798433, 0.02994146265024506, 0.00973025796851606]
[ Info: adaptive: true
MCHMC: 100% Time: 0:01:08

[ Info: Tuning sigma ⏳
[ Info: Tuning eps ⏳

Burn in step: 4000
eps --->0.0011362815815864314

Burn in step: 8000
eps --->0.048015623252826795

Burn in step: 12000
eps --->0.07049827677333437

Burn in step: 16000
eps --->0.09059055053697662

MCHMC (tuning): 100% Time: 0:00:08

Burn in step: 20000
eps --->0.09580528197047496

[ Info: eps: 0.09580528197047496
[ Info: L: 2.6457513110645907
[ Info: nu: 0.10358497483436328
[ Info: sigma: [0.08598148346307614, 0.6283480746616772, 0.26121270562213605, 0.43046267218443574, 0.06982834946904665, 0.010709674458709505, 0.0026408372504911886]
[ Info: adaptive: true
MCHMC: 100% Time: 0:01:08

[ Info: Tuning sigma ⏳
[ Info: Tuning eps ⏳

Burn in step: 4000
eps --->0.001150150243592848

Burn in step: 8000
eps --->0.01710626356420542

Burn in step: 12000
eps --->0.028862366462219294

Burn in step: 16000
eps --->0.03576190370083832

MCHMC (tuning): 100% Time: 0:00:08

Burn in step: 20000
eps --->0.04405990928638161

[ Info: eps: 0.04405990928638161
[ Info: L: 2.6457513110645907
[ Info: nu: 0.06955678084841024
[ Info: sigma: [0.11487761082062531, 0.12769266875677945, 0.31057229368615696, 0.5044361841927906, 0.334772696554643, 0.014288163075227917, 0.0032421706827797723]
[ Info: adaptive: true
MCHMC: 100% Time: 0:01:09

[ Info: Tuning sigma ⏳
[ Info: Tuning eps ⏳

Burn in step: 4000
eps --->0.002178755492108674

Burn in step: 8000
eps --->0.0047645998005947455

Burn in step: 12000
eps --->0.0035478036171247987

Burn in step: 16000
eps --->0.00498175424546503

MCHMC (tuning): 100% Time: 0:00:08

Burn in step: 20000
eps --->0.004367432679407797

[ Info: eps: 0.004367432679407797
[ Info: L: 2.6457513110645907
[ Info: nu: 0.02173517436475978
[ Info: sigma: [1.8539839306045331, 1.046719649926175, 0.7433681891444796, 0.24270596474824502, 0.3953075484982785, 0.031656218640451664, 0.04896061482102929]
[ Info: adaptive: true
MCHMC: 100% Time: 0:01:11

637.483093 seconds (521.41 M allocations: 633.998 GiB, 9.16% gc time)


637.5245652198792

In [30]:
Planck_MCHMC_parallel_ESS_s = sum(vec_ess)/time_mchmc_parallel_Planck

36.53307303904201

In [31]:
x = [mapreduce(permutedims, vcat, chains[i]) for i in 1:n_parallel_mchmc]

planck_mchmc_multi_chains = zeros(nsteps*n_parallel_mchmc, 7)
for i in 1:7
    planck_mchmc_multi_chains[:,i] = extract_single(x, i, n_parallel_mchmc)
end

In [33]:
npzwrite("chains_Planck_cheb_PlanckLite_MCHMC_multi.npy", planck_mchmc_multi_chains)