# Cross Correlations with CMB lensing -- Example: BOSS x Planck

In [None]:
# Load the usual packages:
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

## Healpix ##

While there are several ways of pixelizing the sphere (i.e. putting scalar or tensor data onto the sphere in a discrete way), the cosmology community almost uniformly uses the [Healpix](https://healpix.jpl.nasa.gov/) scheme.  Healpix is built around 12 base pixels (each about 1 steradian in area) which are then subdivided into 4, each of those into 4 again, and so on.  You can see pictures in the documentation linked above.  There are two advantages of Healpix for our purposes:

1. Each pixel is approximately equal area, so integrals over the sphere become sums over pixels in a simple way.

2. Pixels lie in iso-latitude rings, i.e. groups have the same $\theta$ in spherical coordinates.  This means in the spherical harmonic transform, the $\phi$ integral becomes an FFT (at fixed $\theta$), and the $\theta$ integral can be done using recurrence relations among the $P_\ell^m$.
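
As a rough illustration of the equal-area property, the pixel count and pixel size follow directly from `Nside`.  The sketch below uses only NumPy; the helper name `healpix_geometry` is made up for this example (Healpy provides `hp.nside2npix` and `hp.nside2resol` for the same job).

```python
import numpy as np

def healpix_geometry(nside):
    """Return (npix, pixel area in steradians, approximate pixel size in arcmin)."""
    npix = 12 * nside**2                  # 12 base pixels, each split into nside**2
    area = 4.0 * np.pi / npix             # all pixels have the same area
    size_arcmin = np.degrees(np.sqrt(area)) * 60.0
    return npix, area, size_arcmin

for nside in (1, 4, 1024):
    npix, area, size = healpix_geometry(nside)
    print("Nside={:5d}  Npix={:9d}  area={:.3e} sr  size={:.2f} arcmin".format(
        nside, npix, area, size))
```

Because every pixel has the same area, an integral over the sphere is just `area * np.sum(map_values)`.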

We'll use the `Healpy` package for manipulating Healpix maps (more on other packages below).  You can install `Healpy` using pip, e.g. from within a notebook
```
!pip install healpy
```
or from the terminal with `pip install healpy`.
Beware that the Conda version uses obsolete libraries and this can cause "issues".

In [None]:
# There are numerous, public libraries to perform analysis of
# sky maps, including power spectra, correlation functions, etc.
# Here we'll use a very basic version, the HealPy package which
# handles Healpix maps [https://healpix.jpl.nasa.gov/].
# (See later for alternative packages).
#
# !pip install healpy
import healpy as hp
#
# Let's look at the 12 "base" pixels using a full-sky
# Mollweide projection (as provided in the Healpy package).
# We'll use the default, "ring" scheme.
hp.mollview(np.arange(12))

In [None]:
# Now let's go to a higher "Nside".  Nside must be a power of 2.
Nside = 4
hp.mollview(np.arange(12*Nside**2))

You can play with multiple options to mollview to rotate the map, overlay maps, change coordinates, put on various cuts, scales etc.  If you prefer other projections of the sky the package also contains a few other options.

In [None]:
# The maps I've made below have N_side=1024, so set this here.
Nside = 1024
Npix  = 12*Nside**2 # Healpix map has 12 * N_side**2 pixels
Lmax  = 3*Nside-1   # Maximum ell supported.

## Pseudo-Cl analysis ##

We will be computing the cross-correlation within the "pseudo-Cl" formalism, which is close to optimal at large ell.  Almost all CMB/LSS angular power spectrum analyses use a version of this method, sometimes in combination with a more optimal method at very low $\ell$.

The basic idea of the pseudo-$C_\ell$ analysis is to pretend we had a full-sky map with no missing data or other oddities.  In this case computing the power spectrum or cross-spectrum would be trivial: take the $Y_{\ell m}$ transform, square the coefficients and average the $2\ell+1$ $m$-modes into a single $\ell$-bin.  Step 1 of the pseudo-$C_\ell$ analysis is to do this, calling our result $\tilde{C}_\ell$:
$$
  \tilde{C}_\ell = \frac{1}{2\ell+1}\sum_{m} a_{\ell m} a^\star_{\ell m}
$$
But what we have is not a full sky map but rather the map multiplied by a "mask".  So what we've computed here is not $C_\ell$ but $\tilde{C}_\ell$, which is $C_\ell$ convolved with a "mode mixing matrix", $M_{\ell\ell'}$.  It turns out this mode mixing matrix is purely geometrical, and it can be straightforwardly computed given the mask.  Suppose our mask, $W$, is $1$ where we have data and $0$ elsewhere.  Then $W^2=W$.  If we define
$$
  W_{\ell\ell' mm'} = \int d\hat{n}\ W(\hat{n})Y_{\ell m}^\star(\hat{n})Y_{\ell'm'}(\hat{n})
  = \int_{\rm obs} d\hat{n}\ Y_{\ell m}^\star(\hat{n})Y_{\ell'm'}(\hat{n})
$$
then what we've computed is
$$
  \tilde{a}_{\ell m} = \sum_{\ell' m'} a_{\ell' m'}^{\rm true} W_{\ell\ell' mm'}
  \quad\Rightarrow\quad
  \left\langle\tilde{C}_\ell\right\rangle = \frac{1}{2\ell+1}\sum_{\ell'} C_{\ell'}
  \sum_{mm'} \left|W_{\ell\ell' mm'}\right|^2 = \sum_{\ell'} M_{\ell\ell'} C_{\ell'}
$$
If we deconvolve this mask, we recover $C_\ell$.
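
For reference, the sums over $m$ and $m'$ can be done analytically, which is what makes the mixing matrix "purely geometrical".  The closed form below is the standard result from the MASTER paper listed under further reading, quoted here rather than derived.  The symbols $w_{\ell m}$ (harmonic coefficients of the mask) and $\mathcal{W}_\ell$ (the mask power spectrum) are notation introduced for this note:
$$
  M_{\ell\ell'} = \frac{2\ell'+1}{4\pi}\sum_{\ell''} (2\ell''+1)\,\mathcal{W}_{\ell''}
  \begin{pmatrix} \ell & \ell' & \ell'' \\ 0 & 0 & 0 \end{pmatrix}^2
  \quad , \quad
  \mathcal{W}_\ell = \frac{1}{2\ell+1}\sum_m \left| w_{\ell m} \right|^2
$$
As a consistency check, the orthogonality of the Wigner $3j$ symbols gives $\sum_{\ell'} M_{\ell\ell'} = \int d\hat{n}\ W^2(\hat{n})/(4\pi)$, which is $f_{\rm sky}$ for a binary mask.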

Imagine $C_\ell$ is very slowly varying on the scale where $W_{\ell\ell' mm'}$ is non-trivial, so that $C_{\ell'}$ can be replaced by $C_\ell$ and pulled out of the sum.  Then
\begin{eqnarray}
  \left\langle \tilde{C}_\ell\right\rangle &=& \frac{1}{2\ell+1}\sum_{\ell'} C_{\ell'}
  \sum_{mm'} \left| W_{\ell\ell' mm'} \right|^2 \\
  &\approx& \frac{C_\ell}{2\ell+1}\sum_{\ell'mm'} \int d\hat{n}\ W(\hat{n})Y_{\ell m}^\star(\hat{n})Y_{\ell'm'}(\hat{n})
  \int d\hat{n}'\ W(\hat{n}')Y_{\ell m}(\hat{n}')Y_{\ell'm'}^\star(\hat{n}') \\
  &=& C_\ell \int d\hat{n}\ W^2(\hat{n})\ \frac{1}{2\ell+1}\sum_{m} Y_{\ell m}^\star(\hat{n})Y_{\ell m}(\hat{n}) \\
  &=& C_\ell \int\frac{d\hat{n}}{4\pi}\ W(\hat{n}) \\
  &=& C_\ell f_{\rm sky}
\end{eqnarray}
The third line uses completeness, $\sum_{\ell'm'} Y_{\ell'm'}(\hat{n})Y_{\ell'm'}^\star(\hat{n}')=\delta(\hat{n}-\hat{n}')$, and the fourth uses the addition theorem, $\sum_m |Y_{\ell m}(\hat{n})|^2=(2\ell+1)/4\pi$, together with $W^2=W$.
So the "quick" way to deconvolve the mask is simply to divide by $f_{\rm sky}$!

The lowest order effect of having incomplete sky coverage is that the number of modes per $\ell$ is no longer $2\ell+1$ but rather $(2\ell+1)f_{\rm sky}$, so that for an auto-spectrum
$$
  {\rm Cov}\left[ C_\ell , C_{\ell'} \right] = \frac{2}{(2\ell+1)f_{\rm sky}}
  \left( C_\ell^{\rm sig}+C_\ell^{\rm noise} \right)^2 \delta_{\ell\ell'}
$$
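
For a cross-spectrum, such as the $\kappa g$ measurement made later in this notebook, the corresponding Gaussian expression (quoted as the standard result, not derived here) is
$$
  {\rm Var}\left[ C_\ell^{\kappa g} \right] = \frac{1}{(2\ell+1)f_{\rm sky}}
  \left[ C_\ell^{\kappa\kappa}C_\ell^{gg} + \left(C_\ell^{\kappa g}\right)^2 \right]
$$
where $C_\ell^{\kappa\kappa}$ and $C_\ell^{gg}$ are the total auto-spectra, including reconstruction noise and shot noise respectively.  This is the per-$\ell$ variance that the cell under "Computing theory errors" inverts and sums over each bin.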

## Loading maps

### Planck lensing map ###

To save you some time I downloaded the Planck 2018 lensing map from the Planck Legacy Archive.  It comes packaged as a set of a_lm, but I converted it into a map using:

```
pl_lensing_alm = hp.read_alm('dat_klm.fits')       # Load alm file of kappa map
pl_lensing = hp.alm2map(pl_lensing_alm, Nside)     # Make map for viewing.
```

The result is "P18_lens_kap.fits".  I also downgraded the Planck mask to Nside=1024 using the hp.ud_grade command.

If you want to duplicate this, go to
`https://pla.esac.esa.int/`
then cosmology products, lensing products and download
`COM_Lensing_4096_R3.00.tgz`
Unpack it.  We need:
* COM_Lensing_4096_R3.00/MV/dat_klm.fits
* COM_Lensing_4096_R3.00/mask.fits.gz

and you can delete the rest.

In [None]:
pl_lensing = hp.read_map("P18_lens_kap.fits")
pl_mask = hp.read_map('P18_lens_msk.fits')

Let's make a plot of this just to see what we've got.  Here it is at low resolution and with no filtering...

In [None]:
# Use a raw string for the title so that "\k" is not read as an escape sequence.
hp.mollview(hp.ud_grade(pl_lensing*pl_mask,256),min=-15,max=15,title=r'Planck $\kappa$ map')

We can make a much nicer plot than this.  One thing we could do is to Wiener filter the map.  This amounts to multiplying all of the $a_{\ell m}$ by $S/(S+N)$.  Along with the lensing data the Planck team provides a noise power spectrum.  The lensing power spectrum can be computed either from a Boltzmann code or from the data itself, so the filtering kernel is pretty straightforward.  We'll do something easier, since this is just for visualization purposes.  To avoid ringing due to a sharp $\ell$-cut let's filter with
$$
  F_\ell = \frac{1}{2}\left[ 1 - \tanh\left(\frac{\ell-\ell_{\rm max}}{\sigma}\right) \right]
$$
where e.g. $\sigma=20$ and we can choose $\ell_{\rm max}$ as we like.

In [None]:
# Set up our filtering kernel... the numbers below are
# fairly arbitrary but make an okay looking figure.
filt_ells = np.arange(Lmax+1)   # ell = 0..Lmax inclusive, matching map2alm's default lmax
filt_sigm = 20.
filt_lmax = 400.
filt_vals = 0.5*(1-np.tanh((filt_ells-filt_lmax)/filt_sigm))
#
# Now filter the map ...
unfl_alm = hp.map2alm(pl_lensing)
filt_alm = hp.almxfl(unfl_alm,filt_vals)
filt_map = hp.alm2map(filt_alm,256)
# and plot it.  We'll use a "masked array" so that we
# get the nice greyed out regions where we have masked
# the sky. Note Healpix/Healpy use the 'inverse' of the
# NumPy masked array convention, hence the "1-" below...
m = hp.ma(filt_map)
m.mask = 1-hp.ud_grade(pl_mask,256)
hp.mollview(m.filled(),title=r'Planck $\kappa$ map')
hp.graticule()

## Now read the galaxy data. ##

To save you some time, I downloaded the BOSS DR12 galaxy catalog from

`
wget https://data.sdss.org/sas/dr12/boss/lss/galaxy_DR12v5_CMASSLOWZ_North.fits.gz
`

and then used the following to make a Healpix map of the number of galaxies per pixel:

```
from astropy.table import Table
t = Table.read('galaxy_DR12v5_CMASSLOWZ_North.fits.gz')['RA','DEC']
print(t[:5])
# Make a histogram of the counts, converted from RA/DEC to galactic
# coordinates.  This will be n_gal(\hat{n}).
import astropy.units as u
from astropy.coordinates import SkyCoord
g = SkyCoord(ra=t['RA']*u.degree,dec=t['DEC']*u.degree,frame='fk5')
# transform_to returns a new object (it does not modify g in place), so keep the result.
gal = g.transform_to('galactic')
ipix = hp.ang2pix(Nside,gal.l.deg,gal.b.deg,lonlat=True)
boss,b = np.histogram(ipix,bins=np.arange(12*Nside**2+1)-0.5)
```

This map is "BOSS_DR12.fits".  If you'd prefer to avoid the astropy routines you can also change coordinates using the Healpix "rotator" methods.  In our case we would rotate coordinates with `coord=['C','G']`.

In [None]:
boss = hp.read_map('BOSS_DR12.fits')
#
# We want to make a mask for these data -- what we should REALLY
# do is take their observing strategy and/or random catalog and
# weights and make a mask from that.  To be really crude let's just
# downgrade the map and pick non-zero pixels.
boss_low = hp.ud_grade(boss,64)
boss_mask = np.zeros_like(boss_low)
boss_mask[boss_low>0]=1.0
boss_mask = hp.ud_grade(boss_mask,Nside)
print("Sky fraction {:f}".format(np.sum(boss_mask)/len(boss_mask)))
hp.mollview(hp.ud_grade(boss_mask,128))

## Apply the mask

In [None]:
# The mask is the product of the Planck lensing and BOSS masks
mask = boss_mask * pl_mask
fsky = np.sum(mask) * 1. / len(mask)

# Apply the mask. Same for all, for now
masked_pl = pl_lensing * mask
masked_boss = boss * mask

# Do some tidying up.
del(boss_mask)
del(pl_mask)
del(pl_lensing)
del(boss)

## Convert BOSS  n(x) -> delta n / n

In [None]:
mean_boss = np.sum(masked_boss) / np.sum(mask)
masked_boss_dn = masked_boss / mean_boss - 1.
masked_boss_dn = mask * masked_boss_dn 
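
The reason for multiplying by the mask a second time is that unobserved pixels have zero counts, so without it they would carry $\delta=-1$ rather than $0$.  A toy check of the same three steps, using made-up Poisson counts and a made-up mask (all `toy_*` names are hypothetical and unrelated to the BOSS data):

```python
import numpy as np

rng = np.random.default_rng(0)
npix = 1000
toy_mask = np.zeros(npix)
toy_mask[:400] = 1.0                                   # pretend 40% of pixels are observed
toy_counts = rng.poisson(5.0, size=npix) * toy_mask    # counts are zero outside the mask

toy_mean = np.sum(toy_counts) / np.sum(toy_mask)       # mean count per observed pixel
toy_delta = toy_mask * (toy_counts / toy_mean - 1.0)   # overdensity, zeroed outside the mask

print("mean delta over observed pixels:", np.sum(toy_delta) / np.sum(toy_mask))
print("delta outside the mask is zero :", np.all(toy_delta[400:] == 0))
```

By construction the overdensity averages to zero over the observed region, which is a quick sanity check worth running on the real map too.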

## Cross correlation

In [None]:
Cls = hp.anafast(masked_pl, map2 = masked_boss_dn, lmax = 800)
ls = np.arange(len(Cls))
pixwinf = hp.pixwin(Nside)[0:len(Cls)]
# Now remove the pixel window function and deconvolve the mode mixing
# matrix.  If the sky coverage is large, and we are using wide bins in
# ell, then inverting the mixing matrix just reduces to dividing by
# f_sky, the sky fraction.
Cls = Cls / (pixwinf **2)                # Remove pixel window function
Cls = Cls / fsky                         # Correct for f_sky
print('Done with cross-correlation...')

## Auto correlations -- needed for the error

In [None]:
# Same as before!
Clkk = hp.anafast(masked_pl, lmax = 800)
Clgg = hp.anafast(masked_boss_dn, lmax = 800)
Clkk = Clkk / (pixwinf **2)
Clgg = Clgg / (pixwinf **2)
Clkk = Clkk / fsky
Clgg = Clgg / fsky
print('Done with auto-correlations...')

## Binning

In [None]:
# Number of bins and range
Nbins = 8
lmin = 30
lmax = 800
#
bins = np.round(np.linspace(lmin, lmax, Nbins+1))   # Bin edges
bins = bins.astype(int)
lcenterbin = np.zeros(len(bins)-1)
binnedCl = np.zeros(len(bins)-1)
binnedkk = np.zeros(len(bins)-1)
binnedgg = np.zeros(len(bins)-1)
#
for k in range(0, len(bins)-1):  
    lmaxvec = np.arange(bins[k], bins[k+1], 1)
    lcenterbin[k] = np.round(0.5 * (bins[k] + bins[k+1]))   # bin center
    for l in lmaxvec:
        binnedCl[k] += Cls[l]
    binnedCl[k] = binnedCl[k] / len(lmaxvec)
    #print("ell={:.1f}, Clkg={:12.4e}".format(lcenterbin[k],binnedCl[k]))
#
for k in range(0, len(bins)-1): 
    lmaxvec = np.arange(bins[k], bins[k+1], 1)
    for l in lmaxvec:
        binnedkk[k] += Clkk[l]
    binnedkk[k] = binnedkk[k] / len(lmaxvec)
#
for k in range(0, len(bins)-1): 
    lmaxvec = np.arange(bins[k], bins[k+1], 1)
    for l in lmaxvec:
        binnedgg[k] += Clgg[l]
    binnedgg[k] = binnedgg[k] / len(lmaxvec)
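
The three loops above all compute the same thing: an unweighted mean of $C_\ell$ over half-open bins $[\ell_k,\ell_{k+1})$.  A more compact sketch, assuming integer, strictly increasing bin edges, uses `np.add.reduceat`.  The helper `bin_spectrum` and the `toy_*` arrays are hypothetical names for this illustration:

```python
import numpy as np

def bin_spectrum(cl, edges):
    """Unweighted mean of cl[l] over each half-open bin [edges[k], edges[k+1])."""
    edges = np.asarray(edges, dtype=int)
    # reduceat sums cl[edges[k]:edges[k+1]]; truncating at edges[-1] closes the last bin.
    sums = np.add.reduceat(cl[:edges[-1]], edges[:-1])
    return sums / np.diff(edges)

# Toy input: a made-up smooth spectrum on the same bin edges as above.
toy_edges = np.round(np.linspace(30, 800, 9)).astype(int)
toy_cl = 1.0 / (np.arange(801) + 1.0)
toy_binned = bin_spectrum(toy_cl, toy_edges)
print(toy_edges)
print(toy_binned)
```

With the arrays from this notebook the calls would be `bin_spectrum(Cls, bins)`, `bin_spectrum(Clkk, bins)` and `bin_spectrum(Clgg, bins)`.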

## Computing theory errors

__NOTE:__ The $C_\ell$ here are the "empirical" power spectra, i.e. they include noise (shot noise, reconstruction noise etc).  For convenience here we use the measured ones. It would be better to have a "smooth" fiducial model that fits the data.  Rather than give you yet another file we'll just use the data ...

In [None]:
sigmavecth = np.zeros(len(bins)-1)
for k in range(0, len(bins)-1):
    lmaxvec = np.arange(bins[k], bins[k+1], 1)
    for l in lmaxvec:
        sigmavecth[k] += fsky * (2. * l + 1.) / (Clkk[l] * Clgg[l] + Cls[l]**2)
    sigmavecth[k] = 1. / sigmavecth[k]
sigmavecth = np.sqrt(sigmavecth)
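
The loop above adds up the inverse variances of the individual multipoles in each bin and then inverts the sum.  A vectorized sketch of the same calculation follows.  The function name `binned_cross_sigma` and the constant toy spectra are made up for this illustration:

```python
import numpy as np

def binned_cross_sigma(clkg, clkk, clgg, edges, fsky):
    """Gaussian error on the binned cross-spectrum, summing inverse variances per bin."""
    ell = np.arange(len(clkg))
    inv_var = fsky * (2.0 * ell + 1.0) / (clkk * clgg + clkg**2)
    edges = np.asarray(edges, dtype=int)
    inv_var_bin = np.add.reduceat(inv_var[:edges[-1]], edges[:-1])
    return 1.0 / np.sqrt(inv_var_bin)

# Toy spectra (constants chosen arbitrarily, not measured values).
toy_edges = np.round(np.linspace(30, 800, 9)).astype(int)
toy_kk = np.full(801, 1e-7)
toy_gg = np.full(801, 1e-5)
toy_kg = np.full(801, 1e-6)
toy_sigma = binned_cross_sigma(toy_kg, toy_kk, toy_gg, toy_edges, fsky=0.2)
print(toy_sigma)
```

For constant spectra the result can be checked by hand, since $\sum_{\ell=a}^{b-1}(2\ell+1)=b^2-a^2$.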

## Fitting

Here we would try to fit the bias assuming some theory:
$$
(C_\ell^{\kappa g})^{\rm measured} = b \ (C_\ell^{\kappa g})^{\rm theory}_{b = 1}
$$
We won't do this for now.
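
For completeness, here is a minimal sketch of what such a fit could look like if the bins are treated as independent (diagonal covariance), in which case the best-fit amplitude has a closed form.  The function `fit_amplitude` and the toy template are hypothetical; in particular the template below is *not* a real theory prediction, which would have to come from a Boltzmann or halo-model code.

```python
import numpy as np

def fit_amplitude(data, sigma, template):
    """Amplitude b minimizing sum(((data - b*template)/sigma)**2), and its 1-sigma error."""
    w = 1.0 / sigma**2
    b = np.sum(w * data * template) / np.sum(w * template**2)
    b_err = 1.0 / np.sqrt(np.sum(w * template**2))
    return b, b_err

# Toy check: noiseless data generated with b=2 from a made-up template.
toy_ell = np.array([78., 174., 270., 367., 463., 560., 656., 752.])
toy_template = 1e-5 / toy_ell          # placeholder shape only
toy_sigma = 0.1 * toy_template         # 10% error on each bin
toy_data = 2.0 * toy_template
b, b_err = fit_amplitude(toy_data, toy_sigma, toy_template)
print("b = {:.3f} +/- {:.3f}".format(b, b_err))
```

With the quantities from this notebook the call would be `fit_amplitude(binnedCl, sigmavecth, binned_theory)`, where `binned_theory` is a $b=1$ prediction binned in the same way as the data.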

## Plotting

In [None]:
# We should get something pretty flat, around
# the level of 10^{-5}ish.
fig,ax = plt.subplots(1,1,figsize=(8,4))
ax.errorbar(lcenterbin, lcenterbin * binnedCl, yerr = lcenterbin * sigmavecth, fmt = 'o', label = 'Data')
ax.set_xlabel(r'$\ell$', fontsize = 18)
ax.set_ylabel(r'$\ell C_{\ell}^{\kappa g}$', fontsize = 18)
ax.set_xlim([lmin, lmax])

## Improvements ##

Though we got most of the answer, this was *not* a state-of-the-art analysis!  Some things we left out:

-  Use weights on galaxy catalog (completeness, stars, seeing, fiber collisions, ...) and a proper galaxy mask.
-  We didn't apodize our masks at all, to reduce "ringing" in Fourier/harmonic space.  This can be especially important.
-  We could deconvolve the mask (i.e. $M^{-1}$) (a nice place to look for more info is https://arxiv.org/abs/0705.3980, https://arxiv.org/pdf/0801.0644.pdf)
-  Bias modeling: constant bias is probably okay on very large scales, but really not good enough for high precision work.
-  If we were fitting, we could use the covariance matrix, including off-diagonal elements.  We could derive this from mocks:
    -  Planck mock CMB lensing maps: https://wiki.cosmos.esa.int/planckpla2015/index.php/Simulation_data#Lensing_Simulations
-  Convolve any theory curve with "bin" window function for any fits
-  Include cosmological dependence of theory curve and jointly fit cosmological parameters
-  Foreground biases to cross correlation (CIB, tSZ, kSZ, radio point sources)
-  ...



## Further reading ##

Some places to find more information are:

-  MASTER algorithm (https://arxiv.org/abs/astro-ph/0105302), implemented in https://wwwmpa.mpa-garching.mpg.de/~komatsu/crl/list-of-routines.html and http://www2.iap.fr/users/hivon/software/PolSpice/
-  POKER algorithm (https://arxiv.org/abs/1111.0766), implemented in http://www.ipag.osug.fr/~ponthien/Poker/Poker.html
-  Optimal quadratic estimator (matrix inversion! It deals with inhomogeneous noise etc. as well, https://arxiv.org/pdf/astro-ph/9611174.pdf).  For example https://github.com/dhanson/quicklens/ and https://arxiv.org/abs/0705.3980

Packages come and packages go, but recently the community seems to be moving towards [the NaMaster package](https://arxiv.org/abs/1809.09603) for doing power spectrum work.  This takes slightly more work to install than HealPy, which is why I didn't use it here, but once you have it the package takes care of computing the binning matrices, window function corrections, angular power spectra and (Gaussian) covariances for you!