# Importance weights

In this notebook we compute the importance weights using the `PyBird` likelihoods calculated in `pybird_importance_likes.ipynb`.

In [1]:
import matryoshka.emulator as MatEmu
import numpy as np
import matplotlib.pyplot as plt
from tqdm import tqdm
import time
from scipy.optimize import minimize
from scipy.stats import norm
import zeus
from scipy.interpolate import interp1d
import corner

As with most of the other notebooks in this repo we start by specifying the repo location.

In [2]:
path_to_repo = "/Users/jamie/Desktop/GitHubProjects/matryoshka_II_paper/"

The `PyBird` likelihoods have been calculated for samples from an `EFTEMU` posterior with $V_s^{1/3}=3700\ \mathrm{Mpc}\ h^{-1}$, but using a covariance with $V_s^{1/3}=5000\ \mathrm{Mpc}\ h^{-1}$. We do this because the posterior obtained with the smaller $V_s$ is broader, which reduces the risk of losing the tails of the `PyBird` posterior.
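To illustrate why a broader sampling distribution helps, here is a minimal toy sketch. It uses 1D Gaussians as stand-ins (an assumption, not the notebook's actual posteriors) and compares the effective sample fraction when the sampling distribution is narrower or wider than the target. The function names `log_gauss` and `ess_fraction` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def log_gauss(x, mu, sigma):
    # Log-density of a 1D Gaussian.
    return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))

def ess_fraction(proposal_sigma, target_mu=0.5, target_sigma=1.0, n=20000):
    # Draw from the proposal N(0, proposal_sigma) and reweight to the target.
    x = rng.normal(0.0, proposal_sigma, size=n)
    log_w = log_gauss(x, target_mu, target_sigma) - log_gauss(x, 0.0, proposal_sigma)
    w = np.exp(log_w - log_w.max())  # shift for numerical stability
    return w.sum() ** 2 / (w ** 2).sum() / n

narrow = ess_fraction(0.5)  # proposal narrower than the target
wide = ess_fraction(1.5)    # proposal wider than the target
print(narrow, wide)
```

In this toy setup the wider proposal keeps a much larger fraction of effective samples, because the target's tails are already populated by the draws.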

To calculate the importance weights we need to calculate the `EFTEMU` likelihood for the same samples. So we start by loading the samples, along with the `PyBird` likelihoods.

In [3]:
chain = []
pybird_likes = []
for i in range(6):
    chain.append(np.load(path_to_repo+"results/chain--EFTEMU_z-0.61_V-3700_kmin-def_kmax-def_{i}.npy".format(i=i)))
    pybird_likes.append(np.load(path_to_repo+"results/pybird_likes--EFTEMU_z-0.61_Vc-3700_Vi-5000_kmin-def_kmax-def_{i}.npy".format(i=i)))
chain = np.vstack(chain)
pybird_likes = np.concatenate(pybird_likes)

We then load the mocks and covariance with $V_s^{1/3}=3700\ \mathrm{Mpc}\ h^{-1}$.

In [4]:
P0_true = np.load(path_to_repo+"data/P18/z0.61/poles/P0_P18--z-0.61_optiresum-False.npy")[1]
P2_true = np.load(path_to_repo+"data/P18/z0.61/poles/P2_P18--z-0.61_optiresum-False.npy")[1]
klin = np.load(path_to_repo+"data/P18/z0.61/poles/P2_P18--z-0.61_optiresum-False.npy")[0]
cov = np.load(path_to_repo+"data/P18/z0.61/covs/cov_P18--z-0.61_Vs-3700.npy")
icov = np.linalg.inv(cov)

Next we define the truths,

In [5]:
cosmo_true = np.array([0.11933, 0.02242, 0.6766, 3.047, 0.9665])
bs_CMASS = np.array([2.22, 1.2, 0.1, 0.0, 0.4, -7.7, 0., 0., 0., -3.7])
fb_true = cosmo_true[1]/(cosmo_true[0]+cosmo_true[1])
ng = 3e-4

and initialise the emulators.

In [6]:
P0_emu = MatEmu.EFT(multipole=0, version='EFTv2', redshift=0.61)
P2_emu = MatEmu.EFT(multipole=2, version='EFTv2', redshift=0.61)


In the following cell we define the `log_like()` and `fix_params()` functions. For more details about these functions see `mcmc_z0.61_v3700_EFTEMU.ipynb`.

In [7]:
def log_like(theta, kobs, obs, icov):
    # Oc, Ob, h, As, ns
    cosmo = theta[:,:5]
    # b1, c2, b3, c4, cct, cr1, cr2
    # Copy so that the in-place c2/c4 rotation below does not modify theta.
    bias = np.copy(theta[:,5:12])
    # ce1, cmono, cquad
    stoch = theta[:,12:]
    
    
    c2 = np.copy(bias[:,1])
    c4 = np.copy(bias[:,3])
    
    bias[:,1] = (c2+c4)/np.sqrt(2)
    bias[:,3] = (c2-c4)/np.sqrt(2)
            
    P0_pred = P0_emu.emu_predict(cosmo, bias, stochastic=stoch, ng=ng)
    P2_pred = P2_emu.emu_predict(cosmo, bias, stochastic=stoch, ng=ng)
        
    preds = np.hstack([interp1d(MatEmu.kbird[:39], P0_pred)(kobs), 
                           interp1d(MatEmu.kbird[:39], P2_pred)(kobs)])
    
    res = preds-obs
    
    return -0.5*np.einsum("nj,ij,in->n", res, icov, res.T)

def fix_params(theta, fix_val, fb):
    
    # Define indices of parameters that vary.
    var_id = np.array([0,2,3,5,6,7,9,10,12,14])
    
    # Define indices of fixed params.
    fix_id = np.array([4,8,11,13])
    
    fix_theta = np.zeros((theta.shape[0], 15))
    fix_theta[:,var_id] = theta
    fix_theta[:,fix_id] = np.vstack(theta.shape[0]*[fix_val])
    
    # Compute w_b from baryon fraction and w_c
    fix_theta[:,1] = -fb*theta[:,0]/(fb-1)
    
    return fix_theta
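The `einsum` call in `log_like()` evaluates $-\frac{1}{2}r^T C^{-1} r$ for every sample at once. As a quick sanity check, the sketch below compares it against an explicit loop, using small random arrays as hypothetical stand-ins for the residuals and inverse covariance.

```python
import numpy as np

rng = np.random.default_rng(1)
n_samples, n_bins = 4, 6

# Hypothetical residuals (one row per sample) and a positive-definite covariance.
res = rng.normal(size=(n_samples, n_bins))
A = rng.normal(size=(n_bins, n_bins))
cov = A @ A.T + n_bins * np.eye(n_bins)
icov = np.linalg.inv(cov)

# Batched form, as used in log_like().
batched = -0.5 * np.einsum("nj,ij,in->n", res, icov, res.T)

# Equivalent explicit loop over samples.
looped = np.array([-0.5 * r @ icov @ r for r in res])

print(np.allclose(batched, looped))
```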

We fix all the relevant parameters,

In [8]:
fixed_params = np.array([cosmo_true[-1], 0., 0., 0.])
fixed_chain = fix_params(chain, fixed_params, fb_true)
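The expression used for $\omega_b$ inside `fix_params()` follows from inverting the baryon fraction $f_b = \omega_b/(\omega_c+\omega_b)$, which gives $\omega_b = f_b\,\omega_c/(1-f_b)$. A minimal check with the true values defined earlier in this notebook:

```python
import numpy as np

# True values of w_c and w_b, copied from cosmo_true.
w_c, w_b = 0.11933, 0.02242

# Baryon fraction, as computed for fb_true.
fb = w_b / (w_c + w_b)

# Same expression as in fix_params(): recover w_b from fb and w_c.
w_b_recovered = -fb * w_c / (fb - 1)

print(np.isclose(w_b_recovered, w_b))
```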

and calculate the `EFTEMU` likelihoods.

In [9]:
EFTEMU_likes = log_like(fixed_chain, klin, np.concatenate([P0_true, P2_true]), icov)

The importance weights are given by,

$$I_i = \frac{P(\theta_i)\mathcal{L}_\texttt{PyBird}(P'|\theta_i)}{P(\theta_i)\mathcal{L}_\texttt{EFTEMU}(P'|\theta_i)}\ ,$$

with $\mathcal{L}_\texttt{PyBird}(P'|\theta_i)$ and $\mathcal{L}_\texttt{EFTEMU}(P'|\theta_i)$ being the likelihoods for `PyBird` and the `EFTEMU` respectively. Remember that $V_s^{1/3}=5000\ \mathrm{Mpc}\ h^{-1}$ in the `PyBird` likelihood and $V_s^{1/3}=3700\ \mathrm{Mpc}\ h^{-1}$ in the `EFTEMU` likelihood. $P(\theta_i)$ is the prior, but as it is identical for both `PyBird` and the `EFTEMU`, the importance weight $I_i$ reduces to a ratio of the likelihoods.

Because we have been computing log-likelihoods, we first need to exponentiate. To improve numerical stability we subtract the mean `EFTEMU` log-likelihood from both the `PyBird` and `EFTEMU` log-likelihoods before exponentiating; this common shift cancels in the ratio.

In [10]:
ratio = np.exp(pybird_likes-EFTEMU_likes.mean())/np.exp(EFTEMU_likes-EFTEMU_likes.mean())
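An alternative that stays in log space throughout is to difference the log-likelihoods directly and normalise with `scipy.special.logsumexp`. The sketch below uses hypothetical random log-likelihoods (not the notebook's arrays) to show that this gives the same normalised weights as the shifted ratio above.

```python
import numpy as np
from scipy.special import logsumexp

rng = np.random.default_rng(2)

# Hypothetical stand-ins for pybird_likes and EFTEMU_likes.
log_l_target = rng.normal(-500.0, 3.0, size=1000)
log_l_proposal = log_l_target + rng.normal(0.0, 0.5, size=1000)

# Shifted ratio, mirroring the cell above.
shift = log_l_proposal.mean()
ratio = np.exp(log_l_target - shift) / np.exp(log_l_proposal - shift)

# Log-space version: difference first, then normalise with logsumexp.
log_w = log_l_target - log_l_proposal
norm_w = np.exp(log_w - logsumexp(log_w))

print(np.allclose(ratio / ratio.sum(), norm_w))
```

Differencing before exponentiating avoids forming two potentially very large or very small intermediate numbers, which may matter if the log-likelihoods span a wide range.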

We can compute the _effective sample size_ (ESS) from these weights,

$$\mathrm{ESS} = \frac{\left(\sum_i I_i\right)^2}{\sum_i I_i^2}\ .$$

In [11]:
ESS = np.sum(ratio)**2/np.sum(ratio**2)
ESS, ESS/chain.shape[0]

(817.4200934398144, 0.06396088368073666)
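To build intuition for the ESS formula, the sketch below evaluates it for two limiting cases with made-up weights: equal weights, where every sample counts, and a single dominant weight, where effectively one sample survives. The helper name `effective_sample_size` is hypothetical.

```python
import numpy as np

def effective_sample_size(weights):
    # ESS = (sum of weights)^2 / (sum of squared weights).
    weights = np.asarray(weights, dtype=float)
    return weights.sum() ** 2 / (weights ** 2).sum()

# Equal weights: ESS equals the number of samples.
uniform = effective_sample_size(np.ones(100))

# One non-zero weight: ESS collapses to 1.
degenerate = effective_sample_size([1.0] + [0.0] * 99)

# The ESS does not depend on the overall normalisation of the weights.
w = np.array([0.2, 1.0, 3.0, 0.5])
scaled = effective_sample_size(10.0 * w)

print(uniform, degenerate, np.isclose(scaled, effective_sample_size(w)))
```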

In [12]:
np.save(path_to_repo+"results/weights--EFTEMU_z-0.61_Vc-3700_Vi-5000_kmin-def_kmax-def_all.npy", ratio/ratio.sum())
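The saved weights are normalised to sum to one and can then be used to reweight any summary of the chain. As a minimal sketch, assuming a hypothetical 2D Gaussian chain and analytic toy weights rather than the arrays from this notebook, weighted means and standard deviations can be computed with `np.average`:

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical chain drawn from N(0, 1) in each of two parameters.
samples = rng.normal(0.0, 1.0, size=(5000, 2))

# Toy weights that shift the first parameter to a target N(0.3, 1):
# log(target/proposal) = 0.3*x - 0.045, and the constant drops out on normalising.
weights = np.exp(0.3 * samples[:, 0])
weights /= weights.sum()

# Weighted posterior mean and standard deviation per parameter.
mean = np.average(samples, axis=0, weights=weights)
std = np.sqrt(np.average((samples - mean) ** 2, axis=0, weights=weights))

print(mean, std)
```

The same weight array can also be passed to plotting routines that accept per-sample weights, such as the `weights` argument of `corner.corner`.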