
## Calibrate-Emulate-Sample Demo: Application to a Cloud Microphysics Toy Model

This notebook demonstrates how parameters of cloud droplet size distributions can be learned from data generated by Cloudy, a cloud microphysics toy model that can be downloaded here: https://github.com/climate-machine/Cloudy.jl. 

Cloudy takes the following input:
* parameters defining a **cloud droplet size distribution** (e.g., parameters defining a gamma distribution or an exponential distribution)
* a **"kernel"** defining the efficiency with which cloud droplets collide and coalesce (i.e., stick together and form bigger droplets).

Cloudy then simulates how the cloud droplet size distribution evolves over time as a result of the droplet interactions defined by the given kernel. It does so by computing a (user-defined) number of moments of the distribution and propagating them forward in time. The simulation stops after a user-defined amount of time $T$. 
Treating the kernel as given, Cloudy can be viewed as a function $G: \mathbb{R}^p \rightarrow \mathbb{R}^d$ that maps the vector of parameters describing the initial droplet distribution, $u = [u_1, u_2, \ldots, u_p]$, to the vector of the first $d$ moments evaluated at the final time $T$, $G(u) = [M_0(T), M_1(T), \ldots, M_{d-1}(T)]$.

We assume that we have observations $y$ of the moments coming from some observing system, and that these observations are corrupted by additive noise $\eta$ such that
\begin{equation}
    y = G(u) + \eta,
\end{equation}

where the noise $\eta$ is drawn from a $d$-dimensional Gaussian distribution $N(0, \Gamma_y)$.
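
As a minimal sketch of this observation model, the snippet below draws one noisy observation $y = G(u) + \eta$ with $\eta \sim N(0, \Gamma_y)$. The forward model `G_toy`, the parameter values, and the covariance are hypothetical stand-ins chosen for illustration; they are not Cloudy and not the values used later in this notebook.

```julia
using LinearAlgebra, Random

# Hypothetical stand-in for the forward model G: R^2 -> R^3 (not Cloudy)
G_toy(u) = [u[1], u[1] * u[2], u[1] * u[2]^2]

# Draw one observation y = G(u) + η with η ~ N(0, Γ)
function noisy_observation(rng, G, u, Γ)
    L = cholesky(Symmetric(Γ)).L       # Γ = L * L'
    η = L * randn(rng, size(Γ, 1))     # η has covariance Γ
    return G(u) + η
end

rng = MersenneTwister(42)
u_true = [2.0, 0.5]                            # illustrative "true" parameters
Γ = Matrix(Diagonal([0.1, 0.05, 0.02]))        # illustrative noise covariance
y = noisy_observation(rng, G_toy, u_true, Γ)
println(y)
```

Averaging many such draws recovers $G(u)$, which is why the demo summarizes its samples by their mean and by the noise covariance.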

In this demo, we test the calibrate-emulate-sample framework in a **"perfect-model" setting**, which means that the observations $y$ do not come from some external observing system but are generated by running Cloudy with the parameters set to their "true" values.

**Given knowledge of the artificial observations $y$, the forward model $G: \mathbb{R}^p \rightarrow \mathbb{R}^d$, and some information about the noise level such as its size or distribution (but not its value), the inverse problem we want to solve is to find the unknown parameters $u$.**

A comprehensive treatment of the calibrate-emulate-sample approach to Bayesian inverse problems can be found in Cleary et al., 2020: https://arxiv.org/pdf/2001.03689.pdf

In a one-sentence summary, the **calibrate** step of the algorithm consists of an Ensemble Kalman Inversion that is used to find good training points for a Gaussian process regression, which in turn is used as a replacement (**emulator**) of the original forward model $G$ in the subsequent Markov chain Monte Carlo **sampling** of the posterior distributions of the unknown parameters.

**Final remarks**: Calibrate-emulate-sample was developed to solve inverse problems in situations when running the forward model $G$ is computationally expensive. Cloudy is very cheap to run and it would be computationally feasible to skip the "calibrate" and "emulate" steps, but this being a demo, we will walk the user through the full pipeline. Applying the full pipeline may also improve the result even when it is not necessary for computational reasons, e.g., because the Gaussian process regression smooths the cost function.

<img src="cloudy_model.jpg" alt="Cloudy" style="width:600px">



##### Load modules

In [1]:
# Import Cloudy modules
using Cloudy.ParticleDistributions
using Cloudy.KernelTensors
using Cloudy.Sources

┌ Info: Precompiling Cloudy [9e3b23bb-e7cc-4b94-886c-65de2234ba87]
└ @ Base loading.jl:1273


In [2]:
# Import modules
using Distributions  # probability distributions and associated functions 
using StatsBase
using LinearAlgebra
using StatsPlots
using GaussianProcesses

┌ Info: Precompiling Distributions [31c24e10-a181-5473-b8eb-7969acd0382f]
└ @ Base loading.jl:1273
┌ Info: Precompiling StatsPlots [f3b207a7-027a-5e70-b257-86293d7955fd]
└ @ Base loading.jl:1273
┌ Info: Precompiling GaussianProcesses [891a1506-143c-57d2-908e-e1f8e92e6de9]
└ @ Base loading.jl:1273


In [3]:
# Import Calibrate-Emulate-Sample modules (adjust paths 
# if necessary)
include("CalibrateEmulateSample.jl/src/Priors.jl")
include("CalibrateEmulateSample.jl/src/EKI.jl")
include("CalibrateEmulateSample.jl/src/Observations.jl")
include("CalibrateEmulateSample.jl/src/GPEmulator.jl")
include("CalibrateEmulateSample.jl/src/MCMC.jl")
include("CalibrateEmulateSample.jl/src/GModel.jl")
include("CalibrateEmulateSample.jl/src/Utilities.jl")

┌ Info: Precompiling Sundials [c3572dad-4567-51f8-b174-8c6c989267f4]
└ @ Base loading.jl:1273
┌ Info: Precompiling DifferentialEquations [0c46a032-eb83-5123-abaf-570d42b7fbaa]
└ @ Base loading.jl:1273


[32m[1m  Building[22m[39m MbedTLS ─────────→ `~/.julia/packages/MbedTLS/a1JFn/deps/build.log`
[32m[1m  Building[22m[39m SpecialFunctions → `~/.julia/packages/SpecialFunctions/ne2iw/deps/build.log`
[32m[1m  Building[22m[39m NNlib ───────────→ `~/.julia/packages/NNlib/TOv7C/deps/build.log`
[32m[1m  Building[22m[39m FFTW ────────────→ `~/.julia/packages/FFTW/qqcBj/deps/build.log`
[32m[1m  Building[22m[39m FastTransforms ──→ `~/.julia/packages/FastTransforms/MjTJy/deps/build.log`
[32m[1m  Building[22m[39m CodecZlib ───────→ `~/.julia/packages/CodecZlib/5t9zO/deps/build.log`
[32m[1m  Building[22m[39m GR ──────────────→ `~/.julia/packages/GR/NSt7D/deps/build.log`
[32m[1m  Building[22m[39m Plots ───────────→ `~/.julia/packages/Plots/WwFyB/deps/build.log`
[32m[1m  Building[22m[39m Sass ────────────→ `~/.julia/packages/Sass/EZMlY/deps/build.log`
[32m[1m  Building[22m[39m Sundials ────────→ `~/.julia/packages/Sundials/SKP8f/deps/build.log`
[32m[1m  Build

Main.Utilities

### Set up experiment

In [4]:
# Define the parameters that we want to learn
# We assume that the true particle mass distribution is a 
# Gamma distribution with parameters N0_true, θ_true, k_true
param_names = ["N0", "θ", "k"]
n_param = length(param_names)

# Define the data from which we want to learn these parameters
data_names = ["M0", "M1", "M2"]
moments = [0.0, 1.0, 2.0]
n_moments = length(moments)

3

### Define priors

In [5]:
# Assume lognormal priors for all three parameters
# Note: For the model G (=Cloudy) to run, N0 needs to be 
# nonnegative, and θ and k need to be positive. 
# The EKI update can result in violations of these constraints - 
# therefore, we perform CES in log space, i.e., we try to find 
# the logarithms of the true parameters (and of course, the actual
# parameters could then simply be obtained by exponentiating the 
# final results). 

function logmean_and_logstd(μ, σ)
    σ_log = sqrt(log(1.0 + σ^2/μ^2))
    μ_log = log(μ / (sqrt(1.0 + σ^2/μ^2)))
    return μ_log, σ_log
end

N0_true = 300.0 
θ_true = 1.5597  
k_true = 0.0817                  
params_true = [N0_true, θ_true, k_true]
# Note that dist_true has to be a Cloudy distribution, not a 
# "regular" Julia distribution
dist_true = ParticleDistributions.Gamma(N0_true, θ_true, k_true)

logmean_N0, logstd_N0 = logmean_and_logstd(260., 40.)
logmean_θ, logstd_θ = logmean_and_logstd(3.0, 1.5)
logmean_k, logstd_k = logmean_and_logstd(0.5, 0.5)

priors = [Priors.Prior(Normal(logmean_N0, logstd_N0), "N0"), # prior on N0
          Priors.Prior(Normal(logmean_θ, logstd_θ), "θ"),    # prior on θ
          Priors.Prior(Normal(logmean_k, logstd_k), "k")]    # prior on k

3-element Array{Main.Priors.Prior,1}:
 Main.Priors.Prior(Normal{Float64}(μ=5.548985191228175, σ=0.15294730979884993), "N0")
 Main.Priors.Prior(Normal{Float64}(μ=0.9870405130110048, σ=0.47238072707743883), "θ")
 Main.Priors.Prior(Normal{Float64}(μ=-1.039720770839918, σ=0.8325546111576977), "k") 
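
The conversion in `logmean_and_logstd` can be checked by mapping back: if $X \sim N(\mu_{\log}, \sigma_{\log}^2)$, then $\exp(X)$ has mean $\exp(\mu_{\log} + \sigma_{\log}^2/2)$ and standard deviation $\sqrt{\exp(\sigma_{\log}^2) - 1}\,\exp(\mu_{\log} + \sigma_{\log}^2/2)$. The sketch below repeats the function from the cell above so that it is self-contained; the helper names `lognormal_mean` and `lognormal_std` are introduced here for illustration only.

```julia
# Same conversion as logmean_and_logstd in the cell above
function logmean_and_logstd(μ, σ)
    σ_log = sqrt(log(1.0 + σ^2 / μ^2))
    μ_log = log(μ / sqrt(1.0 + σ^2 / μ^2))
    return μ_log, σ_log
end

# Mean and standard deviation of exp(X) for X ~ Normal(μ_log, σ_log)
# (illustrative helpers, not part of CalibrateEmulateSample.jl)
lognormal_mean(μ_log, σ_log) = exp(μ_log + σ_log^2 / 2)
lognormal_std(μ_log, σ_log) = sqrt(exp(σ_log^2) - 1) * exp(μ_log + σ_log^2 / 2)

# Round trip for the N0 prior: should give back the mean 260 and std 40
μ_log, σ_log = logmean_and_logstd(260.0, 40.0)
println((lognormal_mean(μ_log, σ_log), lognormal_std(μ_log, σ_log)))
```

The round trip returns the mean and standard deviation that were passed in, so the first two arguments of `logmean_and_logstd` can be read as the prior mean and spread in physical (non-log) space.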

### Cloudy settings

In [6]:
# Collision-coalescence kernel to be used in Cloudy
coalescence_coeff = 1/3.14/4
kernel_func = x -> coalescence_coeff
kernel = CoalescenceTensor(kernel_func, 0, 100.0)

# Time period over which to run Cloudy
tspan = (0., 0.5)  

(0.0, 0.5)
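
For a constant kernel $K$, the coalescence moment equations close and can be solved by hand: $dM_0/dt = -\tfrac{1}{2} K M_0^2$, $dM_1/dt = 0$, and $dM_2/dt = K M_1^2$. The sketch below evaluates that closed-form solution as an independent sanity check. It is not how Cloudy integrates the moments, and it assumes that in `Gamma(N0, θ, k)` the parameter `θ` is the scale and `k` the shape, so that the initial moments are $N_0$, $N_0 k \theta$, and $N_0 k (k+1) \theta^2$. The function names are hypothetical.

```julia
# Initial moments M0, M1, M2 of a gamma size distribution
# (assumption: θ is the scale parameter and k the shape parameter)
gamma_moments(N0, θ, k) = (N0, N0 * k * θ, N0 * k * (k + 1) * θ^2)

# Closed-form moments at time T for a constant coalescence kernel K
function constant_kernel_moments(N0, θ, k, K, T)
    M0, M1, M2 = gamma_moments(N0, θ, k)
    M0_T = M0 / (1 + 0.5 * K * M0 * T)   # droplet number decreases
    M1_T = M1                            # total mass is conserved
    M2_T = M2 + K * M1^2 * T             # second moment grows linearly in time
    return [M0_T, M1_T, M2_T]
end

K = 1 / 3.14 / 4     # same coefficient as coalescence_coeff above
M_T = constant_kernel_moments(300.0, 1.5597, 0.0817, K, 0.5)
println(M_T)
```

With the parameter values that this notebook uses as the truth, the result lands close to the moments that appear in the truth samples (about 43.03, 38.23, and 122.67), which supports the assumed parameter convention.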

### Generate truth

In [7]:
# Generate (artificial) truth samples
g_settings_true = GModel.GSettings(kernel, dist_true, moments, tspan)
yt = GModel.run_G(params_true, g_settings_true, 
                  ParticleDistributions.update_params, 
                  ParticleDistributions.moment,
                  Sources.get_int_coalescence)
n_samples = 100
samples = zeros(n_samples, length(yt))
noise_level = 0.05
Γy = noise_level^2 * convert(Array, Diagonal(yt))
μ = zeros(length(yt))

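# Add noise
# Note: each draw from N(0, Γy) is scaled by an additional factor of
# noise_level^2 here, so the samples scatter far less than Γy implies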
for i in 1:n_samples
    samples[i, :] = yt + noise_level^2 * rand(MvNormal(μ, Γy))
end

truth = Observations.Obs(samples, Γy, data_names)

Main.Observations.Obs{Float64}(Array{Float64,1}[[43.03246277249641, 38.22712734274022, 122.67085842805292], [43.03226933672069, 38.227191982460674, 122.67238636170657], [43.034260797337595, 38.22864858423109, 122.67280137989115], [43.03239857934877, 38.22764868134113, 122.67168223714746], [43.033362249169826, 38.22865802646954, 122.67134035525882], [43.03354614184931, 38.22922918113856, 122.6715572649863], [43.033008935657136, 38.229920240270594, 122.6721888942935], [43.03405681441463, 38.22826355739958, 122.67419915123452], [43.03349300451546, 38.22804080415025, 122.67223471891718], [43.0330550580651, 38.22822790403888, 122.67510698335646]  …  [43.034413125970104, 38.22614747438573, 122.67209550696579], [43.033684574832314, 38.2273268633264, 122.67280486443065], [43.03337828325457, 38.226781797541584, 122.67234804364057], [43.03394366936321, 38.229053072055194, 122.67082341311118], [43.03261390610507, 38.227929153887025, 122.66974046062998], [43.033295628540365, 38.22811635916539, 122

# Calibrate: Ensemble Kalman Inversion


In [8]:
log_transform(a::AbstractArray) = log.(a)
exp_transform(a::AbstractArray) = exp.(a)

exp_transform (generic function with 1 method)

In [9]:
N_ens = 50 # number of ensemble members
N_iter = 5 # number of EKI iterations
# initial parameters: N_ens x N_params
initial_params = EKI.construct_initial_ensemble(N_ens, priors; rng_seed=6)
ekiobj = EKI.EKIObj(initial_params, 
                    param_names, truth.mean, truth.cov)

Main.EKI.EKIObj{Float64,Int64}(Array{Float64,2}[[5.348011865060303 1.4941585750686186 -0.7841009076201841; 5.569333064179009 0.5565288294373119 0.05929838684811961; … ; 5.373152409506014 0.9596137079057969 0.3042290696461112; 5.346617347698648 0.37529332407633287 -3.0302645039719103]], ["N0", "θ", "k"], [43.03331131462083, 38.22815761007536, 122.67243910381221], [0.10758339830758915 0.0 0.0; 0.0 0.09557061750000001 0.0; 0.0 0.0 0.30668158241115073], 50, Array{Float64,2}[], Float64[])
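
The initial ensemble lives in log space, so exponentiating any member yields parameters that satisfy Cloudy's positivity constraints. A minimal sketch of drawing such an ensemble from independent normal priors follows. The helper `draw_initial_ensemble` is hypothetical and is not the implementation of `EKI.construct_initial_ensemble`; the prior means and standard deviations are rounded from the prior output shown earlier, and the seed will not reproduce the ensemble above.

```julia
using Random

# Hypothetical helper: draw an N_ens x p ensemble from independent normal
# priors given by their means and standard deviations (all in log space)
function draw_initial_ensemble(rng, N_ens, means, stds)
    p = length(means)
    return means' .+ stds' .* randn(rng, N_ens, p)
end

rng = MersenneTwister(6)
log_means = [5.549, 0.987, -1.040]   # rounded from the prior output
log_stds = [0.153, 0.472, 0.833]
ens_log = draw_initial_ensemble(rng, 50, log_means, log_stds)

# Back in physical space every member is strictly positive
ens = exp.(ens_log)
println(size(ens), " ", all(ens .> 0))
```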

In [10]:
# Initialize a ParticleDistribution with dummy parameters
# The parameters will then be set in run_cloudy_ensemble
dummy = 1.0
dist_type = ParticleDistributions.Gamma(dummy, dummy, dummy)
g_settings = GModel.GSettings(kernel, dist_type, moments, tspan)

Main.GModel.GSettings{Float64,CoalescenceTensor{Float64},Cloudy.ParticleDistributions.Gamma{Float64}}(CoalescenceTensor{Float64}(0, [0.07961783439490445]), Cloudy.ParticleDistributions.Gamma{Float64}(1.0, 1.0, 1.0), [0.0, 1.0, 2.0], (0.0, 0.5))

In [11]:
# EKI iterations
for i in 1:N_iter
    # Note that the parameters are exp-transformed for use as input
    # to Cloudy
    params_i = deepcopy(exp_transform(ekiobj.u[end]))
    g_ens = GModel.run_G_ensemble(params_i, 
                                  g_settings,
                                  ParticleDistributions.update_params,
                                  ParticleDistributions.moment,
                                  Sources.get_int_coalescence)
    EKI.update_ensemble!(ekiobj, g_ens) 
end
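
To make the loop above more concrete, here is a minimal sketch of one common form of the EKI update (the variant with perturbed observations), applied to a hypothetical linear forward model. It illustrates the idea only and is not the code in `EKI.jl`, whose internals are not shown in this notebook. Each ensemble member $u_j$ is moved by $C^{ug}(C^{gg} + \Gamma_y)^{-1}(y + \xi_j - g_j)$, where $C^{ug}$ and $C^{gg}$ are ensemble covariances and $\xi_j \sim N(0, \Gamma_y)$.

```julia
using LinearAlgebra, Random

# One EKI update with perturbed observations.
# u: N_ens x p parameters, g: N_ens x d model outputs, y: data, Γ: noise covariance
function eki_update(rng, u, g, y, Γ)
    N_ens = size(u, 1)
    du = u .- sum(u, dims=1) / N_ens
    dg = g .- sum(g, dims=1) / N_ens
    C_ug = du' * dg / N_ens                      # p x d cross-covariance
    C_gg = dg' * dg / N_ens                      # d x d output covariance
    L = cholesky(Symmetric(Γ)).L
    noise = randn(rng, N_ens, length(y)) * L'    # each row ~ N(0, Γ)
    residual = (y' .+ noise) .- g                # N_ens x d
    # Row-wise form of u_j + C_ug * inv(C_gg + Γ) * residual_j
    return u + residual * ((C_gg + Γ) \ C_ug')
end

# Hypothetical linear test problem G(u) = A * u with noise-free data
function run_eki_demo(rng; N_ens=50, N_iter=10)
    A = [1.0 0.5; 0.0 2.0; 1.0 1.0]
    u_true = [1.0, -0.5]
    Γ = Matrix(0.01I, 3, 3)
    y = A * u_true
    u = randn(rng, N_ens, 2)                     # initial ensemble
    for _ in 1:N_iter
        g = u * A'                               # forward model for every member
        u = eki_update(rng, u, g, y, Γ)
    end
    return vec(sum(u, dims=1) / N_ens), u_true
end

u_mean, u_true = run_eki_demo(MersenneTwister(1))
println(u_mean, " vs ", u_true)
```

No derivatives of the forward model are needed: the update uses only the ensemble of inputs and the corresponding outputs, which is what makes the approach attractive for expensive black-box models.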

In [12]:
# EKI results: Has the ensemble collapsed toward the truth?
println("True parameters: ")
println(params_true)

println("\nEKI results:")
println(mean(deepcopy(exp_transform(ekiobj.u[end])), dims=1))

True parameters: 
[300.0, 1.5597, 0.0817]

EKI results:
[297.6468889534519 1.5385238200212998 0.08399722771783089]


# Emulate: Gaussian Process Regression

In [21]:
gppackage = GPEmulator.GPJL()
pred_type = GPEmulator.YType()
# Construct kernel:
# Sum kernel consisting of a Matern 5/2 ARD kernel, a Squared
# Exponential Iso kernel, and white noise.
# Note that the kernels take the length scales and the signal standard
# deviations on a log scale as input.
len1 = 1.0
kern1 = SE(len1, 1.0)
len2 = zeros(3)
kern2 = Mat52Ard(len2, 0.0)
# regularize with white noise
white = Noise(log(2.0))
# construct kernel
GPkernel =  kern1 + kern2 + white
    
u_tp, g_tp = Utilities.extract_GP_tp(ekiobj, N_iter-1)
normalized = true
gpobj = GPEmulator.GPObj(u_tp, g_tp, gppackage; GPkernel=GPkernel, 
                         normalized=normalized, prediction_type=pred_type)

Main.GPEmulator.GPObj{Float64,Main.GPEmulator.GPJL}([-0.5385909660594311 2.2887924648414617 0.2336029405822238; 0.11950313437923568 -1.2285449765115737 2.543848330617082; … ; 0.8010610943647124 -0.3052126442492973 -0.5744089470337274; -0.16399322560109544 -0.23555319451699894 -0.4900001711974435], [42.72032423061568 67.62801111823997 394.6553301355645; 42.85747772900061 52.22872419479635 170.8751656382441; … ; 43.18853125132217 38.04934242233415 122.06801655356936; 42.916552538130816 37.856236577293366 122.27058075183629], [5.687475127365449 0.49425718523618484 -2.417734297922255], [22.53518786292536 0.5237634426243115 0.9344233889502005; 0.5237634426243116 4.209575897398398 1.1719723389858225; 0.9344233889502015 1.1719723389858225 4.556307096987945], Any[GP Exact object:
  Dim = 3
  Number of observations = 200
  Mean function:
    Type: MeanZero, Params: Float64[]
  Kernel:
    Type: SumKernel{SumKernel{SEIso{Float64},Mat52Ard{Float64}},Noise{Float64}}
      Type: SumKernel{SEIso{Flo

In [22]:
# Check how well the Gaussian Process regression predicts on the
# true parameters
y_mean, y_var = GPEmulator.predict(gpobj, reshape(log.(params_true), 1, :))

println("GP prediction on true parameters: ")
println(vec(y_mean))
println("true data: ")
println(truth.mean)

GP prediction on true parameters: 
[43.02261226197652, 38.23221770222881, 122.66695728898048]
true data: 
[43.03331131462083, 38.22815761007536, 122.67243910381221]
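
What the emulator does at prediction time can be sketched in a few lines of plain Julia. The example below is a zero-mean Gaussian process with a squared-exponential kernel and fixed hyperparameters, fitted to a hypothetical one-dimensional function. It is a simplified illustration, not the `GPEmulator`/GaussianProcesses.jl setup used above, which uses a sum kernel and normalized inputs.

```julia
using LinearAlgebra

# Squared-exponential kernel between the rows of X1 (n x p) and X2 (m x p)
function se_kernel(X1, X2; ℓ=1.0, σ=1.0)
    K = zeros(size(X1, 1), size(X2, 1))
    for i in axes(X1, 1), j in axes(X2, 1)
        K[i, j] = σ^2 * exp(-sum(abs2, X1[i, :] - X2[j, :]) / (2 * ℓ^2))
    end
    return K
end

# Posterior mean and variance of a zero-mean GP at the rows of X_new
function gp_predict(X, y, X_new; ℓ=1.0, σ=1.0, σ_noise=1e-3)
    K = se_kernel(X, X; ℓ=ℓ, σ=σ) + σ_noise^2 * I   # noise term regularizes K
    K_s = se_kernel(X_new, X; ℓ=ℓ, σ=σ)
    F = cholesky(Symmetric(K))
    μ = K_s * (F \ y)
    v = σ^2 .- vec(sum(K_s .* (F \ Matrix(K_s'))', dims=2))
    return μ, v
end

# Hypothetical training data: samples of sin on a coarse grid
X = reshape(collect(0.0:0.5:3.0), :, 1)
y = sin.(vec(X))
μ, v = gp_predict(X, y, reshape([1.25], 1, 1))
println("prediction: ", μ[1], ", variance: ", v[1], ", exact: ", sin(1.25))
```

The predictive variance is what distinguishes an emulator from a plain interpolant: it indicates where the training points are too sparse to trust the prediction.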


# Sample: Markov chain Monte Carlo

In [15]:
# initial values
u0 = vec(mean(u_tp, dims=1))
println("initial parameters: ", u0)

# MCMC parameters    
mcmc_alg = "rwm" # random walk Metropolis

# First let's run a short chain to determine a good step size
burnin = 0
step = 0.1 # first guess
max_iter = 5000
yt_sample = truth.mean
mcmc_test = MCMC.MCMCObj(yt_sample, truth.cov, 
                         priors, step, u0, 
                         max_iter, mcmc_alg, burnin)
new_step = MCMC.find_mcmc_step!(mcmc_test, gpobj)

initial parameters: [5.657921251303976, 0.5966180575131845, -2.1512319841668597]
Begin step size search
iteration 0; current parameters [5.657921251303976 0.5966180575131845 -2.1512319841668597]
iteration 2000; acceptance rate = 0.544727636181909, current parameters [5.709747088157291 0.23819034583864007 -2.16745808050693]
new step size: 0.2
iteration 2000; acceptance rate = 0.2768615692153923, current parameters [5.646460628430417 0.563965192461627 -2.4027489012399963]


0.2
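
In the output above, the search moved from a step size of 0.1 (acceptance rate about 0.54) to 0.2 (acceptance rate about 0.28). The dependence of the acceptance rate on the step size is easy to explore with a minimal random walk Metropolis sampler. The sketch below targets a hypothetical two-dimensional Gaussian; it is not the implementation in `MCMC.jl` and does not include the step size search.

```julia
using Random

# Random walk Metropolis with an isotropic Gaussian proposal.
# log_post is the unnormalized log posterior; returns the chain
# (n_iter x p) and the overall acceptance rate.
function rwm(rng, log_post, u0, step, n_iter)
    p = length(u0)
    samples = zeros(n_iter, p)
    u = copy(u0)
    lp = log_post(u)
    n_accept = 0
    for i in 1:n_iter
        u_prop = u + step * randn(rng, p)
        lp_prop = log_post(u_prop)
        # Accept with probability min(1, exp(lp_prop - lp))
        if log(rand(rng)) < lp_prop - lp
            u, lp = u_prop, lp_prop
            n_accept += 1
        end
        samples[i, :] = u          # a rejected proposal repeats the current state
    end
    return samples, n_accept / n_iter
end

# Hypothetical target: unit-variance Gaussian centred at m
m = [1.0, -2.0]
log_post(u) = -0.5 * sum(abs2, u - m)

samples, acc = rwm(MersenneTwister(0), log_post, zeros(2), 1.0, 50_000)
println("acceptance rate: ", acc)
```

Larger steps explore faster but are rejected more often, so rerunning the sketch with different values of `step` shows the same trade-off that the step size search resolves.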

In [16]:
# Now begin the actual MCMC
println("Begin MCMC - with step size ", new_step)
u0 = vec(mean(u_tp, dims=1))

# reset parameters 
burnin = 1000
max_iter = 500000

mcmc = MCMC.MCMCObj(yt_sample, truth.cov, priors, 
                    new_step, u0, max_iter, mcmc_alg, burnin)
MCMC.sample_posterior!(mcmc, gpobj, max_iter)

Begin MCMC - with step size 0.2
iteration 0; current parameters [5.657921251303976 0.5966180575131845 -2.1512319841668597]
iteration 1000 of 500000; acceptance rate = 0.3036963036963037, current parameters [5.648942837489793 0.47227404684834995 -1.8662676046342903]
iteration 2000 of 500000; acceptance rate = 0.351824087956022, current parameters [5.669715155443763 1.0713464721327675 -3.3104759836390962]
iteration 3000 of 500000; acceptance rate = 0.36387870709763415, current parameters [5.779359690025154 -0.025635426926531868 -1.4690562221169619]
iteration 4000 of 500000; acceptance rate = 0.39665083729067735, current parameters [5.501185626289903 0.4357960037169867 -1.872867634365765]
iteration 5000 of 500000; acceptance rate = 0.3935212957408518, current parameters [5.555621939117068 0.08840322894781431 -1.9720049479600075]
iteration 6000 of 500000; acceptance rate = 0.3951008165305782, current parameters [5.5886175824983395 0.26505627709883994 -2.2097781608761546]
iteration 7000 of 

In [17]:
posterior = MCMC.get_posterior(mcmc)      

post_mean = mean(posterior, dims=1)
post_cov = cov(posterior, dims=1)
println("post_mean")
println(post_mean)
println("post_cov")
println(post_cov)
println("D util")
println(det(inv(post_cov)))
println(" ")

post_mean
[5.637548159648833 0.36308186291775646 -2.3301997526371587]
post_cov
[0.007777715383270081 -0.0020596108891693713 -0.004345831950812004; -0.0020596108891693713 0.07351704260447198 -0.08249539862997235; -0.004345831950812004 -0.08249539862997235 0.16009319357642068]
D util
28518.55104645924
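
The posterior statistics above are in log space. A rough way to read them in physical space, assuming the posterior is approximately Gaussian in log space (an assumption made here for illustration, not something the notebook verifies), is to treat each parameter as lognormal: its median is $\exp(\mu)$ and its mean is $\exp(\mu + \sigma^2/2)$. The sketch also shows that the "D util" value is simply the reciprocal of the determinant of the posterior covariance. The numbers are copied from the output above.

```julia
using LinearAlgebra

# Posterior mean and covariance in log space, copied from the output above
post_mean = [5.637548159648833, 0.36308186291775646, -2.3301997526371587]
post_cov = [ 0.007777715383270081  -0.0020596108891693713 -0.004345831950812004;
            -0.0020596108891693713  0.07351704260447198   -0.08249539862997235;
            -0.004345831950812004  -0.08249539862997235    0.16009319357642068]

# Lognormal summaries in physical space (order: N0, θ, k)
phys_median = exp.(post_mean)
phys_mean = exp.(post_mean .+ diag(post_cov) ./ 2)
println("median: ", phys_median)
println("mean:   ", phys_mean)

# det(inv(C)) equals 1 / det(C), so no matrix inverse is needed
d_util = 1 / det(post_cov)
println("D util: ", d_util)
```

Working directly with the samples, e.g. `exp.(posterior)`, avoids the Gaussian assumption altogether and is preferable when the log-space posterior is visibly skewed.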
 


In [20]:
# Plot the posteriors together with the priors and the true 
# parameter values
using StatsPlots; 

true_values = [log(N0_true) log(θ_true) log(k_true)]
n_params = length(true_values)

for idx in 1:n_params
    if idx == 1
        param = "N0"
        xs = collect(4.5:0.01:6.5)
    elseif idx == 2
        param = "Theta"
        xs = collect(-1.0:0.01:2.5)
    elseif idx == 3
        param = "k"
        xs = collect(-4.0:0.01:1.0)
    else
        error("not implemented")
    end

    label = "true " * param
    histogram(posterior[:, idx], bins=100, normed=true, 
              fill=:slategray, lab="posterior")
    plot!(xs, mcmc.prior[idx].dist, w=2.6, color=:blue, lab="prior")
    plot!([true_values[idx]], seriestype="vline", w=2.6, lab=label)

    title!(param)
    StatsPlots.savefig("posterior_"*param*".png")
end