# Building an Interpolator with CAMB

#### Introduction
In this notebook, we are going to learn about numerical interpolation and why it is useful. By the end, we'll have built a tool that we'll need for fitting our CMB and Large-Scale Structure data.

#### The Problem
Now that we've got to grips with CAMB, we should be aware that it's pretty quick at computing all the observables that we're interested in. We've also looked at how to fit a straight line using the Metropolis-Hastings algorithm. If we wanted to fit some CMB data and extract some cosmological parameters, the obvious thing to do then would be to put CAMB inside our likelihood and run the Metropolis-Hastings algorithm for the cosmological parameters we are interested in; we would loop over our parameters of interest and for each set of parameters call CAMB to get our model predictions. We would then compare this to the data to compute the likelihood, and hence get the posterior samples for our cosmological parameters.

However, there is one problem with this. Even though CAMB is fast, it's not fast enough for fitting data in this way. For example, say a typical CAMB run takes 1 second to compute everything. Given the number of samples we generated in the previous notebook, and one CAMB call per sample, it would take over 10,000 seconds (nearly three hours) to fit our data, which is longer than we might want to wait for results. And this only gets worse if we realise we need more samples, or want to fit other data simultaneously.

#### The Solution
A solution to this is to use numerical interpolation. We can pre-compute a set of observables using CAMB for the cosmological parameters we are interested in, and then every time we change the parameters and want to compute the likelihood, rather than running CAMB again, we can interpolate between the observables we already computed.

Numerical interpolation is an extremely useful technique in all fields of astronomy and in data analysis in general. It can be used to avoid solving complicated equations multiple times, or for solving equations that can't be inverted analytically. As an example of both, we'll start by building an interpolator for the comoving distance given a redshift. This equation normally involves a numerical integral to get the comoving distance given a redshift,
\begin{equation}
\chi(z) = c\int^{z}_{0} \frac{dz'}{H(z')},
\end{equation}
and it also can't be inverted easily to compute the redshift if we already know the distance. Numerical interpolation is faster than numerical integration and we can swap what we treat as the input and the output to invert the equation. In the exercises in this notebook we'll try to build and test our own linear interpolator.
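
To make both points concrete, here is a minimal sketch that does not use CAMB. It assumes a flat $\Lambda$CDM expansion rate with radiation neglected and an illustrative value of $\Omega_{m}$ (both are assumptions made for this example only), tabulates $\chi(z)$ once by numerical integration, and then uses NumPy's built-in `np.interp` to go from redshift to distance and back again by swapping the two arrays.

```python
import numpy as np
from scipy import integrate

# Illustrative flat LambdaCDM parameters (assumed for this sketch, not CAMB output)
H0 = 67.66            # km/s/Mpc
Omega_m = 0.31        # total matter density today (assumed value)
c_km_s = 299792.458   # speed of light in km/s

def hubble(z):
    """H(z) in km/s/Mpc for flat LambdaCDM, neglecting radiation."""
    return H0 * np.sqrt(Omega_m * (1.0 + z)**3 + (1.0 - Omega_m))

def comoving_distance(z):
    """chi(z) in Mpc from direct numerical integration of c/H(z')."""
    integral, _ = integrate.quad(lambda zp: 1.0 / hubble(zp), 0.0, z)
    return c_km_s * integral

# Pre-compute a table once...
z_table = np.linspace(0.0, 1.0, 50)
chi_table = np.array([comoving_distance(z) for z in z_table])

# ...then interpolate forwards (z -> chi) and, by swapping the arrays, backwards (chi -> z).
chi_at_02 = np.interp(0.2, z_table, chi_table)
z_back = np.interp(chi_at_02, chi_table, z_table)
print("chi(z=0.2) ~ %.1f Mpc, recovered z = %.4f" % (chi_at_02, z_back))
```

The swap only works because $\chi(z)$ increases monotonically with redshift, so the table of distances is itself sorted.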

#### Linear Interpolation
Linear interpolation works by assuming that the function $y(x)$ between two known $(x,y)$ pairs is a straight line.
Assuming $y=mx+c$, we can write
\begin{equation}
\frac{y-y_{0}}{x-x_{0}} = \frac{y_{1}-y_{0}}{x_{1}-x_{0}},
\end{equation}
where $(x_{0},y_{0})$ and $(x_{1},y_{1})$ are the values we are interpolating between. In this case the $x$ values are redshifts and the $y$ values are the comoving distances computed using CAMB. The above equation can be verified by substituting $y=mx+c$.

If we rearrange the above equation we find
\begin{equation}
y = y_{0}\biggl(1-\frac{x-x_{0}}{x_{1}-x_{0}}\biggr) + y_{1}\biggl(\frac{x-x_{0}}{x_{1}-x_{0}}\biggr).
\end{equation}
This is what we need to code up to create our interpolator. An example algorithm is then:
1. Compute and store $y_{i}(x_{i})$ for a suitably large range of $x_{i}$.
2. For every value $y(x)$ we want to interpolate:
    1. Find the values of $x_{0}$ and $x_{1}$ in our list of $x_{i}$ that bound $x$, such that $x_{0} \le x \le x_{1}$.
    2. Retrieve the corresponding $y_{0}$ and $y_{1}$.
    3. Use these two pairs of values to compute $y$.
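
The steps above can be sketched on a toy table before we touch any CAMB output. This is only one possible approach: it assumes the tabulated $x_{i}$ are sorted in ascending order, uses `np.searchsorted` to find the bounding pair, and the function name `interpolate_one` and the data are made up for illustration.

```python
import numpy as np

# Toy tabulated function: y = x**2 sampled at a few points (illustrative data only)
xi = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
yi = xi**2

def interpolate_one(x, xi, yi):
    """Linearly interpolate a single value x; xi must be sorted in ascending order."""
    # Index of the first grid point strictly greater than x, clipped so that
    # x equal to the last grid point still uses the final interval.
    i1 = np.clip(np.searchsorted(xi, x, side="right"), 1, len(xi) - 1)
    i0 = i1 - 1
    x0, x1 = xi[i0], xi[i1]
    y0, y1 = yi[i0], yi[i1]
    t = (x - x0) / (x1 - x0)        # fractional position within the interval
    return y0 * (1.0 - t) + y1 * t

y_mid = interpolate_one(2.5, xi, yi)
print(y_mid)                         # straight line between (2, 4) and (3, 9)
print(np.interp(2.5, xi, yi))        # NumPy's built-in linear interpolation, for comparison
```

Note that the interpolated value at $x=2.5$ differs from the true $2.5^{2}=6.25$, because the real function curves between the tabulated points. Values of $x$ outside the table are linearly extrapolated by this sketch, which is rarely what we want, so it is safest to keep $x$ within the tabulated range.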
    
So, let's try coding this up for the comoving distance. We'll use CAMB to get the input comoving distances for our interpolator. I've already coded up part of this, but you'll need to use what you learnt last week to get the comoving distances given the input redshifts, and complete the interpolator.

In [1]:
import numpy as np
from scipy import interpolate
import camb

# The default cosmology we will use for the comoving distance calculation
Omega_bh2 = 0.02242    # The physical baryon density, Omega_b * h^2, where h = H0/100
Omega_cdmh2 = 0.11933   # The physical cold dark matter density, Omega_cdm * h^2
Omega_k   = 0.0        # The intrinsic curvature of the Universe at the present day
H0 = 67.66             # The expansion rate of the Universe at the present day; the Hubble constant (km/s/Mpc)
DE_EoS = -1.0               # The equation of state of dark energy
scalar_amplitude = 2.105e-9   # The amplitude of the fluctuations in the Universe after inflation

# Generate a list of redshifts and the associated comoving distances
zmin, zmax, nz = 0.01, 1.0, 20
redshifts = np.logspace(np.log10(zmin), np.log10(zmax), nz)

# Now use these parameters to set the cosmology of your Universe. 
# **Add your code here**
my_cosmology = 

# Run CAMB. 
# **Add your code here**
first_run =

# Get the comoving distance to the redshifts. 
# **Add your code here**
d = 

# Now we have a list of redshifts and the corresponding distances, let's make a function
# that takes these and computes d(z) for any value of z.
def linear_interpolator(x, xi, yi):
    
    # Let's loop over all the x values we passed in
    y = np.empty(len(x))
    for i, xval in enumerate(x): 
        
        # First find the values of x_{0} and x_{1}. 
        # **Add your code here**


        # Now find the corresponding distances to those redshifts. 
        # **Add your code here**


        # Now compute y using the pairs (x0,y0) and (x1,y1). 
        # **Add your code here**
        y[i] = 
        
    return y
    
# Now use our interpolator to compute the comoving distance to z=0.2. Compare to the CAMB value.
print(str("True comoving distance to z=0.2: %4.6lf Mpc" % # **Add your code here** ))
print(str("Interpolated comoving distance to z=0.2: %4.6lf Mpc" % # **Add your code here** ))

# We can also invert the redshift-distance relationship by swapping the x and y arguments of the interpolator.
# Modify this line so that it estimates the cosmological redshift corresponding to a distance of 2 Gpc, 
# something we cannot solve for analytically.
print(str("Redshift corresponding to 2Gpc: %2.6lf" % # **Add your code here**))

# Now we can go from redshifts to distances and vice-versa. Let's test this by going from z=0.8 to distance 
# and back again. Code this up using nested calls to our interpolation function.
print(str("Input Redshift: %2.6lf" % 0.8))
print(str("Output Redshift: %2.6lf" % # **Add your code here**))

If you got everything working, the code should:
1. Print out two calculations of the comoving distance to $z=0.2$.
2. Print out the redshift corresponding to a distance of $2$ Gpc.
3. Print out an input redshift and an output redshift that equal each other.

#### Accuracy of the interpolator
You might notice that the interpolated value for the distance to $z=0.2$ and the CAMB value don't quite agree. This is because the true $d(z)$ curves between our tabulated redshifts, whereas we assumed a straight line connecting each pair of points. So now let's perform a quick exercise to test the accuracy of our interpolator. We can improve on linear interpolation by using higher-order interpolation schemes. Often, we get better accuracy using a so-called cubic spline, which fits a cubic polynomial between every pair of values such that the pieces join smoothly. Rather than code this ourselves, we can use the SciPy package to build our interpolator.

The command we need is
`spline_interpolator = interpolate.interp1d(redshifts, d, kind="cubic")`, 
which builds a function for us that can be given a value of $z$ and will return the comoving distance using cubic-spline interpolation.
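
As a rough illustration of the difference, the sketch below builds both kinds of interpolator with `interpolate.interp1d` for a smooth toy function, $\sin(x)$, which stands in for $d(z)$ here and is not CAMB output. The number of nodes and the test range are arbitrary choices made for this example.

```python
import numpy as np
from scipy import interpolate

# Coarse samples of a smooth toy function (a stand-in for d(z); not CAMB output)
x_nodes = np.linspace(0.0, np.pi, 8)
y_nodes = np.sin(x_nodes)

linear = interpolate.interp1d(x_nodes, y_nodes, kind="linear")
cubic = interpolate.interp1d(x_nodes, y_nodes, kind="cubic")

# Evaluate both on a fine grid strictly inside the tabulated range, avoiding the
# end points where sin(x) = 0 would make a percentage error undefined.
x_test = np.linspace(0.1, np.pi - 0.1, 500)
truth = np.sin(x_test)
err_linear = 100.0 * np.abs(linear(x_test) - truth) / truth
err_cubic = 100.0 * np.abs(cubic(x_test) - truth) / truth
print("max % error: linear = {:.3f}, cubic = {:.3f}".format(err_linear.max(), err_cubic.max()))
```

By default, both interpolators raise an error if asked for a value outside the range of the input nodes, which is a useful safeguard against accidental extrapolation.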

As an exercise, plot the percentage error of the comoving distance interpolator as a function of redshift for both linear and cubic spline interpolation. Use the same redshifts to build the interpolators as above.

In [2]:
# You can code your exercise here. I'll start you off by loading the necessary packages
import numpy as np
from scipy import interpolate
import matplotlib.pyplot as plt
%matplotlib inline
plt.style.use('ggplot')
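# The seaborn styles were renamed to 'seaborn-v0_8-*' in Matplotlib 3.6, so use whichever name exists
plt.style.use('seaborn-v0_8-deep' if 'seaborn-v0_8-deep' in plt.style.available else 'seaborn-deep')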
plt.rc('xtick',labelsize=16)
plt.rc('ytick',labelsize=16)
font = {'size': 18}
plt.rc('font', **font)

# Generate the cubic spline interpolator. 
# **Add your code here**

# Generate a bunch of test redshifts and use CAMB to get the true distance
ntest = 10000
test_redshifts = np.linspace(1.001*zmin, 0.999*zmax, ntest)
true_distance = # **Add your code here**

# Now plot the percentage error of our linear and cubic spline interpolators for the test redshifts
# against the true distance. 
# **Add your code here**


#### 2D Linear interpolation
In the exercises above, we looked at coding up a linear interpolator in 1D. We can also generate interpolators in multiple dimensions. For instance, if we have a function $z(x,y)$, we can interpolate in both $x$ and $y$ to return $z$ without needing to evaluate the function every time. This is exactly what we want for our model-fitting purposes: we want to interpolate our observables ($z$) as a function of our two cosmological parameters ($x$ and $y$).

The formula for bilinear interpolation is quite a bit more complicated than the 1D version above, but the principle is the same and the algorithm is quite similar:
1. Compute and store $z_{ij}(x_{i},y_{j})$ for a suitably large range of $x_{i}$ and $y_{j}$.
2. For every value $z(x,y)$ we want to interpolate:
    1. Find the values of $x_{0}$ and $x_{1}$ in our list of $x_{i}$ that bound $x$, such that $x_{0} \le x \le x_{1}$.
    2. Find the values of $y_{0}$ and $y_{1}$ in our list of $y_{j}$ that bound $y$, such that $y_{0} \le y \le y_{1}$.
    3. Retrieve the corresponding $z_{00}, z_{01}, z_{10}$ and $z_{11}$.
    4. Use all of these values to compute $z(x,y)$.
    
The formula we'll need is
\begin{equation}
z(x,y) = \biggl(1-\frac{y-y_{0}}{y_{1}-y_{0}}\biggr)\biggl[\biggl(1-\frac{x-x_{0}}{x_{1}-x_{0}}\biggr)z_{00} + \frac{x-x_{0}}{x_{1}-x_{0}}z_{10}\biggr] + \frac{y-y_{0}}{y_{1}-y_{0}}\biggl[\biggl(1-\frac{x-x_{0}}{x_{1}-x_{0}}\biggr)z_{01} + \frac{x-x_{0}}{x_{1}-x_{0}}z_{11}\biggr].
\end{equation}

This may look daunting, but if we look closely, we'll see that this is really just a weighted sum of our computed quantities $z$ at four locations where the weights can be computed in the same way as for the 1D interpolation.
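
To see the weighted sum explicitly, here is a small sketch for a single grid cell. The function `bilinear_cell` and the toy function `f` are made up for illustration, and $z_{ab}$ is taken to mean the tabulated value at $(x_{a}, y_{b})$, matching the formula above. The step of finding which cell a point falls in is deliberately left out.

```python
import numpy as np

def bilinear_cell(x, y, x0, x1, y0, y1, z00, z10, z01, z11):
    """Bilinear interpolation inside one grid cell.

    zab is the tabulated value at (x_a, y_b), matching the formula above.
    """
    tx = (x - x0) / (x1 - x0)   # fractional position along x
    ty = (y - y0) / (y1 - y0)   # fractional position along y
    # Four weights, one per corner; they are non-negative inside the cell and sum to one.
    w00 = (1.0 - tx) * (1.0 - ty)
    w10 = tx * (1.0 - ty)
    w01 = (1.0 - tx) * ty
    w11 = tx * ty
    return w00 * z00 + w10 * z10 + w01 * z01 + w11 * z11

# Toy function of two variables (illustrative only)
def f(x, y):
    return 1.0 + 2.0 * x + 3.0 * y + 4.0 * x * y

x0, x1, y0, y1 = 0.0, 1.0, 2.0, 4.0
z_est = bilinear_cell(0.25, 3.5, x0, x1, y0, y1,
                      f(x0, y0), f(x1, y0), f(x0, y1), f(x1, y1))
print(z_est, f(0.25, 3.5))
```

The toy function was chosen to be linear in $x$ at fixed $y$ and linear in $y$ at fixed $x$, which is exactly the shape bilinear interpolation assumes, so the two printed numbers agree. For a real observable they will differ slightly, just as in the 1D case.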

So, let's try coding this up. Let's turn the example above on its head and look at interpolating the comoving distance for the two cosmological parameters we are interested in but for a single fixed redshift. In this case $x$ and $y$ are our cosmological parameters, whilst $z$ is our comoving distance. I've coded up some of this below. The code currently runs CAMB for $100$ $(10\times 10)$ different combinations of $\Omega_{b}$ and $w$ and stores these in a grid. I need you to use this grid of values to create a 2D interpolator, where we can put in any values of $\Omega_{b}$ and $w$ and get the comoving distance.

The code then tries to make a couple of plots using your interpolator, where it fixes either $\Omega_{b}$ or $w$, interpolates along the other cosmological parameter, and compares the interpolated results to a few random points calculated from CAMB. Have a go at completing the following code and seeing what the plots look like.

In [3]:
import numpy as np
from scipy import interpolate
import matplotlib.pyplot as plt
import camb
%matplotlib inline
plt.style.use('ggplot')
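# The seaborn styles were renamed to 'seaborn-v0_8-*' in Matplotlib 3.6, so use whichever name exists
plt.style.use('seaborn-v0_8-deep' if 'seaborn-v0_8-deep' in plt.style.available else 'seaborn-deep')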
plt.rc('xtick',labelsize=16)
plt.rc('ytick',labelsize=16)
font = {'size': 18}
plt.rc('font', **font)

# Firstly, it would be really useful if we could create a function that returned the CAMB comoving distance for our
# cosmological parameters. If only we had some code from a previous workshop we could modify to do this ;)
# The test code further down calls this function as run_camb(Omega_bh2=...) and run_camb(DE_EoS=...).
# **Add your code here**


# Now generate a list of cosmological parameter values for input. These will form our x and y arrays.
# I'd recommend 10x10 values for your cosmological parameters, as even then it will take a few minutes to run.
Omega_b_min, Omega_b_max, nOmega_b = 0.03, 0.06, 10
# Note that w_vals runs from -0.9 down to -1.1, i.e. in descending order
w_min, w_max, nw = -0.9, -1.1, 10
Omega_b_vals = np.linspace(Omega_b_min, Omega_b_max, nOmega_b)
w_vals = np.linspace(w_min, w_max, nw)

# Now loop over our cosmological parameter combinations and save the results in a 2D array
dist_vals = np.empty((nOmega_b,nw))
for i, Omega_b in enumerate(Omega_b_vals):
    for j, w in enumerate(w_vals):
        dist_vals[i,j] = # **Add your code here**
        
# Now build our 2D interpolator. Try and follow the method we used for the 
# 1D interpolator and the equations given above.
def bilinear_interpolator(x, y, xi, yj, zij):
    
    # Let's loop over all the x and y values we passed in
    z = np.empty((len(x),len(y)))
    for i, xval in enumerate(x): 
        for j, yval in enumerate(y): 

            # First find the values of x_{0} and x_{1}. **Add your code here**

            # Now find the values of y_{0} and y_{1}. **Add your code here**
            
            # Now find the corresponding distances at the four points we need. **Add your code here**

            # Now compute z using the formula from above. **Add your code here**
            z[i,j] = 

    return z


# Let's test the interpolator by making plots of the interpolator vs. a few true CAMB values (not the ones we used
# as input!) when one axis is fixed
ntest = 10
interpolator_Omega_b = np.linspace(1.001*Omega_b_min, 0.999*Omega_b_max, 1000)
interpolator_w = np.linspace(1.001*w_min, 0.999*w_max, 1000)
CAMB_Omega_b = (0.999*Omega_b_max-1.001*Omega_b_min)*np.random.rand(ntest) + 1.001*Omega_b_min
CAMB_w = (0.999*w_max-1.001*w_min)*np.random.rand(ntest) + 1.001*w_min

test_vals = np.empty(ntest)
for i, Omega_b in enumerate(CAMB_Omega_b):
    test_vals[i] = run_camb(Omega_bh2=Omega_b*0.6766**2)

fig = plt.figure()
ax = fig.add_axes([0, 0, 1, 1])
ax.plot(interpolator_Omega_b, bilinear_interpolator(interpolator_Omega_b, [-1.0], Omega_b_vals, w_vals, dist_vals), color='k', linestyle='-', linewidth=1.3, zorder=1)
ax.plot(CAMB_Omega_b, test_vals, color='r', marker='o', linestyle='None', zorder=5)
ax.set_title(r'$Fixed\,w$')
ax.set_xlabel(r'$\Omega_{b}$')
ax.set_ylabel(r'$\mathrm{Comoving\,Distance\,to\,z=0.2\,(h^{-1}\,\mathrm{Mpc})}$')
plt.show()

test_vals = np.empty(ntest)
for i, w in enumerate(CAMB_w):
    test_vals[i] = run_camb(DE_EoS=w)

fig = plt.figure()
ax = fig.add_axes([0, 0, 1, 1])
ax.plot(interpolator_w, bilinear_interpolator([0.02242/0.6766**2], interpolator_w, Omega_b_vals, w_vals, dist_vals).T, color='k', linestyle='-', linewidth=1.3, zorder=1)
ax.plot(CAMB_w, test_vals, color='r', marker='o', linestyle='None', zorder=5)
ax.set_title(r'$Fixed\,\Omega_{b}$')
ax.set_xlabel(r'$w$')
ax.set_ylabel(r'$\mathrm{Comoving\,Distance\,to\,z=0.2\,(h^{-1}\,\mathrm{Mpc})}$')
plt.show()

Do the plots above seem correct?

#### Building the interpolator for all the observables
So we've managed to build a function that will return the comoving distance for us without us having to call CAMB each time, which makes the calculation much faster. We can use the same procedure to build interpolators for all the observables we are interested in.

The observables we will be using in the next notebook are: the CMB TT, TE and EE power spectra at different $\ell$ values, the value of the BAO parameter $r_{s}/D_{V}$ at four different redshifts, and the power spectrum values at a bunch of $k$-values *and* four different redshifts.

In preparation, have a think about how you could build such a set of interpolators *for your cosmological parameters*. Do you know how to modify the above 2D example for the parameters you are interested in? What would be the best way to deal with the fact you will need to interpolate for a whole range of $\ell$ and $k$ values?
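
As a hint for the last question, here is one possible approach, sketched on made-up data: store a whole array of values at each grid point and apply the same four bilinear weights to entire arrays at once. The grid, the toy "spectrum" and the function name `interpolate_spectrum` are all hypothetical, and the parameter grids are assumed to be sorted in ascending order.

```python
import numpy as np

# Hypothetical grid: 3 x 4 parameter values, with a 5-element "spectrum" stored at each node
x_grid = np.array([0.0, 1.0, 2.0])
y_grid = np.array([10.0, 20.0, 30.0, 40.0])
ell = np.arange(2, 7)
# Toy observable, linear in x and y at every ell, so bilinear interpolation is exact
table = (x_grid[:, None, None] + 0.1 * y_grid[None, :, None]) * ell[None, None, :]

def interpolate_spectrum(x, y):
    """Bilinearly interpolate the whole last axis of `table` at one (x, y) point."""
    i = np.clip(np.searchsorted(x_grid, x, side="right"), 1, len(x_grid) - 1)
    j = np.clip(np.searchsorted(y_grid, y, side="right"), 1, len(y_grid) - 1)
    tx = (x - x_grid[i - 1]) / (x_grid[i] - x_grid[i - 1])
    ty = (y - y_grid[j - 1]) / (y_grid[j] - y_grid[j - 1])
    # The same four corner weights multiply entire arrays rather than single numbers
    return ((1 - tx) * (1 - ty) * table[i - 1, j - 1] + tx * (1 - ty) * table[i, j - 1]
            + (1 - tx) * ty * table[i - 1, j] + tx * ty * table[i, j])

spectrum = interpolate_spectrum(0.5, 25.0)
print(spectrum.shape)
```

The weights depend only on the cosmological parameters, not on $\ell$ or $k$, so a single pass through the bracketing step is enough to interpolate every element of the observable.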