## Validating galaxy-galaxy lensing signal against SDSS measurements

In this notebook we show the galaxy-galaxy lensing signal ($\Delta \Sigma$) measured from CosmoDC2, with a lens selection chosen to match that of Mandelbaum et al. (2016; M16; arXiv:1509.06762) as closely as possible. This analysis tests how far we can trust the simulations in terms of:
* scales useful for lensing measurements
* galaxy-halo connection as a function of stellar mass and colors

The lens selections are described as follows:
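* 7 stellar mass bins: $\log(M_*/M_{\odot})=$ [10-10.4, 10.4-10.7, 10.7-11.0, 11.0-11.2, 11.2-11.4, 11.4-11.6, 11.6-15.0]
* each bin is split into red ($g-r>0.7$) and blue ($g-r<0.7$) subsamples, using the `Mag_true_g_sdss_z0` and `Mag_true_r_sdss_z0` magnitudes
* $0.03 < z < 0.2$
* $r < 17.7$ (using `mag_true_r_sdss`)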

I used the cosmoDC2_image catalog, which covers roughly 436.4 deg$^{2}$, to extract the lens and source galaxy samples. The code used to extract these samples is given in the Appendix at the end of the notebook.

This gives the number of lens galaxies:
* Red: [2642., 4061., 4868., 3199., 1844.,  695.,  270.]
* Blue: [1718., 1404., 1024.,  325.,   87.,   18.,    0.]

and the number densities (per arcmin$^2$):
* Red: [0.00168169, 0.00258491, 0.00309858, 0.00203623, 0.00117374, 0.00044238, 0.00017186]
* Blue: [1.09354313e-03, 8.93675527e-04, 6.51797535e-04, 2.06869335e-04, 5.53773297e-05, 1.14573786e-05, 0.00000000e+00]

For comparison, M16's number densities (per arcmin$^2$; survey area 7748 deg$^2$) are:
* Red: [0.00117042, 0.00232929, 0.00341154, 0.00206347, 0.00127105, 0.00051329, 0.00012587]
* Blue: [2.31802652e-03, 2.30361121e-03, 1.85440742e-03, 5.01014009e-04, 1.08213977e-04, 1.27472163e-05, 3.58515459e-06]
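
The CosmoDC2 number densities listed above follow from the lens counts and the quoted area. A minimal sketch of that arithmetic is given here; the function name `number_density` and the constant names are illustrative and not part of the original notebook.

```python
import numpy as np

# Lens counts per stellar-mass bin, copied from the lists above.
N_red = np.array([2642., 4061., 4868., 3199., 1844., 695., 270.])
N_blue = np.array([1718., 1404., 1024., 325., 87., 18., 0.])

AREA_DEG2 = 436.4            # approximate cosmoDC2_image area quoted above
ARCMIN2_PER_DEG2 = 60 * 60   # 1 deg^2 = 3600 arcmin^2


def number_density(counts, area_deg2):
    """Return counts per arcmin^2 for a survey area given in deg^2."""
    return counts / (area_deg2 * ARCMIN2_PER_DEG2)


n_red = number_density(N_red, AREA_DEG2)
n_blue = number_density(N_blue, AREA_DEG2)
print(n_red)
print(n_blue)
```

The same division by 3600 appears as `/60/60` in the commented M16 calculation in the next cell.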


In [None]:
# Numbers read off from M16
# np.array([4244.,17542., 44724., 37987., 28008., 12599., 3195.])/np.array([0.13, 0.27, 0.47, 0.66, 0.79, 0.88, 0.91])/7748/60/60
# np.array([20690.,30842.,33621.,11040.,2626.,320.,96.])/np.array([0.32,0.48,0.65,0.79,0.87,0.90,0.96])/7748/60/60

### First we plot the number density:

In [None]:
import numpy as np
import pylab as mplot
%pylab inline
plt.rc('text', usetex=False)
plt.rc('font', family='serif')

SM_min = np.array([10,10.4,10.7,11.0,11.2,11.4,11.6])
SM_max = np.array([10.4,10.7,11.0,11.2,11.4,11.6,15.0])

In [None]:
n_r_dc2 = np.array([0.00168169, 0.00258491, 0.00309858, 0.00203623, 0.00117374, 0.00044238, 0.00017186])
n_r_data = np.array([0.00117042, 0.00232929, 0.00341154, 0.00206347, 0.00127105, 0.00051329, 0.00012587])
n_b_dc2 = np.array([1.09354313e-03, 8.93675527e-04, 6.51797535e-04, 2.06869335e-04, 5.53773297e-05, 1.14573786e-05, 0.00000000e+00])
n_b_data = np.array([2.31802652e-03, 2.30361121e-03, 1.85440742e-03, 5.01014009e-04, 1.08213977e-04, 1.27472163e-05, 3.58515459e-06])

mplot.figure(figsize=(10,4))
mplot.subplot(121)
mplot.scatter(np.arange(7), n_r_dc2, marker='s', label='CosmoDC2', color='r')
mplot.scatter(np.arange(7), n_r_data, marker='x', lw=2, label='M16', color='k')
mplot.title('Red', fontsize=15)
mplot.xlabel('Bin #', fontsize=15)
mplot.ylabel('Number of lenses / arcmin$^2$', fontsize=15)
mplot.ylim(-0.0001,0.004)
mplot.legend(fontsize=13)

mplot.subplot(122)
mplot.scatter(np.arange(7), n_b_dc2, marker='s', label='CosmoDC2', color='b')
mplot.scatter(np.arange(7), n_b_data, marker='x', lw=2, label='M16', color='k')
mplot.title('Blue', fontsize=15)
mplot.xlabel('Bin #', fontsize=15)
mplot.ylim(-0.0001,0.004)
mplot.legend(fontsize=13)

mplot.tight_layout()

#### Comments:

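* The trend in the number counts of red galaxies as a function of stellar mass traces the data well: the CosmoDC2 densities are within about 15% of M16 for $10.7 < \log(M_*/M_{\odot}) < 11.6$ and within about 45% in the other bins.
* The blue galaxies are less well matched: for $\log(M_*/M_{\odot}) < 11.4$ the simulations contain roughly 2-3 times fewer blue galaxies than the data, and none in the highest-mass bin.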
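
The level of agreement described in these comments can be checked by taking the ratio of the CosmoDC2 and M16 densities bin by bin. This is only a quick numerical cross-check using the arrays from the plotting cell; the names `ratio_red` and `ratio_blue` are introduced here for illustration.

```python
import numpy as np

# Number densities (per arcmin^2) copied from the plotting cell above.
n_r_dc2 = np.array([0.00168169, 0.00258491, 0.00309858, 0.00203623, 0.00117374, 0.00044238, 0.00017186])
n_r_data = np.array([0.00117042, 0.00232929, 0.00341154, 0.00206347, 0.00127105, 0.00051329, 0.00012587])
n_b_dc2 = np.array([1.09354313e-03, 8.93675527e-04, 6.51797535e-04, 2.06869335e-04, 5.53773297e-05, 1.14573786e-05, 0.00000000e+00])
n_b_data = np.array([2.31802652e-03, 2.30361121e-03, 1.85440742e-03, 5.01014009e-04, 1.08213977e-04, 1.27472163e-05, 3.58515459e-06])

# Ratio CosmoDC2 / M16: 1 means perfect agreement, 0.5 means half as many lenses.
ratio_red = n_r_dc2 / n_r_data
ratio_blue = n_b_dc2 / n_b_data

for i, (rr, rb) in enumerate(zip(ratio_red, ratio_blue)):
    print(f"bin {i}: red ratio = {rr:.2f}, blue ratio = {rb:.2f}")
```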


### Next we plot the $\Delta \Sigma$ measurements. 

The measurement code itself is hard to put in a notebook, so I only show the results; it is standard treecorr measurement code. The columns used in the calculation are the positions (ra_true, dec_true) and the shears (shear_1, shear_2). The code for extracting these columns is shown in the Appendix.
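
Since the measurement code is not shown, here is a toy illustration of the quantity that is being averaged: the tangential component of the source shear around a lens, which is related to $\Delta \Sigma$ through $\Delta \Sigma = \Sigma_{\rm crit}\,\gamma_t$. This is a flat-sky sketch in plain NumPy, not the treecorr code used for the results. The function name `tangential_shear` and the sign convention $\gamma_t = -(\gamma_1\cos 2\phi + \gamma_2\sin 2\phi)$ are assumptions, and the convention appropriate for a given catalog should be checked separately.

```python
import numpy as np


def tangential_shear(dx, dy, g1, g2):
    """Tangential shear of sources at flat-sky offsets (dx, dy) from a lens.

    Assumed convention: gamma_t = -(g1*cos(2*phi) + g2*sin(2*phi)),
    where phi is the position angle of the source relative to the lens.
    """
    phi = np.arctan2(dy, dx)
    return -(g1 * np.cos(2 * phi) + g2 * np.sin(2 * phi))


# Toy check: a purely tangential shear pattern of amplitude 0.01
# around a lens at the origin (synthetic data, not CosmoDC2).
rng = np.random.default_rng(0)
dx, dy = rng.normal(size=(2, 1000))
phi = np.arctan2(dy, dx)
g1 = -0.01 * np.cos(2 * phi)
g2 = -0.01 * np.sin(2 * phi)

gt = tangential_shear(dx, dy, g1, g2)
print(gt.mean())
```

In the real measurement the pairs are binned in projected radius and weighted by $\Sigma_{\rm crit}$, which depends on the lens and source redshifts; that step is omitted here.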

In [None]:
h = 0.7

# The SDSS (M16) tables contain all seven bins, so load them once outside the loop.
data_r = np.loadtxt('gglens_data/SDSS_main_red_DS.dat')
data_b = np.loadtxt('gglens_data/SDSS_main_blue_DS.dat')

for i in range(7):
    infile_r = np.load('gglens_data/DeltaSigma_cosmoDC2_SM'+str(i+1)+'_r.npz')
    infile_b = np.load('gglens_data/DeltaSigma_cosmoDC2_SM'+str(i+1)+'_b.npz')
    # Raw strings keep the LaTeX backslashes from being read as escape sequences.
    label_m16 = str(SM_min[i])+r'$< \log(M*/M_{\odot}) <$'+str(SM_max[i])+'; M16'

    mplot.figure(figsize=(10,3.5))

    mplot.subplot(121)
    # R*h and /h/10**12 put the CosmoDC2 curve on the same axes as M16 (Mpc/h, h Msun/pc^2).
    mplot.loglog(infile_r['R']*h,np.sum(infile_r['gt'], axis=0)/np.sum(infile_r['npairs'], axis=0)/h/10**12, color='r', label='CosmoDC2')
    mplot.errorbar(data_r[:,0], data_r[:,2*i+1], yerr=data_r[:,2*i+2], color='k', label=label_m16)

    mplot.ylabel(r'$\Delta \Sigma$ ($h M_{\odot}$/pc$^{2}$)', fontsize=15)
    mplot.ylim(0.2,1000)
    mplot.grid()
    mplot.legend(loc=1,fontsize=12)
    if i==0:
        mplot.title('Red')
    if i==6:
        mplot.xlabel('$R$ ($Mpc/h$)', fontsize=15)

    mplot.subplot(122)
    mplot.loglog(infile_b['R']*h,np.sum(infile_b['gt'], axis=0)/np.sum(infile_b['npairs'], axis=0)/h/10**12, color='b', label='CosmoDC2')
    mplot.errorbar(data_b[:,0], data_b[:,2*i+1], yerr=data_b[:,2*i+2], color='k', label=label_m16)

    mplot.ylabel(r'$\Delta \Sigma$ ($h M_{\odot}$/pc$^{2}$)', fontsize=15)
    mplot.ylim(0.2,1000)
    mplot.grid()
    mplot.legend(fontsize=12)
    if i==0:
        mplot.title('Blue')
    if i==6:
        mplot.xlabel('$R$ ($Mpc/h$)', fontsize=15)

    mplot.tight_layout()

#### Comments:

* Similar to the number counts, the trend in the $\Delta \Sigma$ of red galaxies as a function of stellar mass traces the data pretty well, especially for the high-mass end, less so for the blue galaxies.
* Below 0.2-0.3 Mpc/h, the simulations show a suppressed signal. This is related to the resolution of the simulations, suggesting that we should not use the simulations at those scales for other analyses.
* Both the number density and the $\Delta \Sigma$ comparison provide a strong test of the galaxy-halo connection in CosmoDC2.
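
The plotting cell multiplies $R$ by $h$ and divides $\Delta \Sigma$ by $h \times 10^{12}$, which suggests the stored values are in Mpc and $M_{\odot}$/Mpc$^{2}$; that reading is an inference from the code, not something stated in the files. Under that assumption, the sketch below wraps the conversion in a helper and applies a minimum-scale cut motivated by the resolution comment above. The names `to_h_units` and `apply_scale_cut`, and the default cut of 0.3 Mpc/h, are illustrative choices.

```python
import numpy as np


def to_h_units(R_mpc, ds_msun_per_mpc2, h=0.7):
    """Convert R [Mpc] -> [Mpc/h] and Delta Sigma [Msun/Mpc^2] -> [h Msun/pc^2].

    The input units are an assumption inferred from the plotting cell.
    """
    R_h = R_mpc * h
    ds_h = ds_msun_per_mpc2 / h / 1e12   # 1 Mpc^2 = 1e12 pc^2
    return R_h, ds_h


def apply_scale_cut(R_h, ds_h, R_min=0.3):
    """Keep only bins with R >= R_min (in Mpc/h)."""
    keep = R_h >= R_min
    return R_h[keep], ds_h[keep]


# Synthetic profile for demonstration only (not a CosmoDC2 measurement).
R_mpc = np.logspace(-1.5, 1.0, 12)
ds = 1e13 / R_mpc

R_h, ds_h = to_h_units(R_mpc, ds)
R_cut, ds_cut = apply_scale_cut(R_h, ds_h)
print(len(R_h), len(R_cut))
```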

### Appendix: Code to extract samples from CosmoDC2 catalog 

(The code here is for demonstration purposes and will not run in place.)

#### First use GCR to query relevant columns from CosmoDC2 and store into separate files

In [None]:
from GCR import GCRQuery
import GCRCatalogs
import fitsio
import astropy.io.fits as pf
import os

Pix = [8786, 8787, 8788, 8789, 8790, 8791, 8792, 8793, 8794, 8913, 8914, 8915, 8916, 8917, 8918, 8919, 8920, 8921, 
       9042, 9043, 9044, 9045, 9046, 9047, 9048, 9049, 9050, 9169, 9170, 9171, 9172, 9173, 9174, 9175, 9176, 9177, 
       9178, 9298, 9299, 9300, 9301, 9302, 9303, 9304, 9305, 9306, 9425, 9426, 9427, 9428, 9429, 9430, 9431, 9432, 
       9433, 9434, 9554, 9555, 9556, 9557, 9558, 9559, 9560, 9561, 9562, 9681, 9682, 9683, 9684, 9685, 9686, 9687, 
       9688, 9689, 9690, 9810, 9811, 9812, 9813, 9814, 9815, 9816, 9817, 9818, 9937, 9938, 9939, 9940, 9941, 9942, 
       9943, 9944, 9945, 9946, 10066, 10067, 10068, 10069, 10070, 10071, 10072, 10073, 10074, 10193, 10194, 10195, 
       10196, 10197, 10198, 10199, 10200, 10201, 10202, 10321, 10322, 10323, 10324, 10325, 10326, 10327, 10328, 
       10329, 10444, 10445, 10446, 10447, 10448, 10449, 10450, 10451, 10452]

columns = ['ra_true', 'dec_true', 'redshift_true','shear_1', 'shear_2', 'mag_true_i_sdss', 'mag_true_z_sdss',
           'mag_true_g_sdss', 'mag_true_r_sdss', 'Mag_true_g_lsst_z0', 'Mag_true_r_lsst_z0', 'stellar_mass_bulge', 
           'stellar_mass_disk','Mag_true_g_sdss_z0','Mag_true_r_sdss_z0', 'mag_i_lsst','size_true']

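# `catalog` is not defined in the original notebook. It is assumed here to come from
# GCRCatalogs.load_catalog; the catalog name is an assumption (the text only says "cosmoDC2_image").
catalog = GCRCatalogs.load_catalog('cosmoDC2_v1.1.4_image')

for i in range(len(Pix)):
    print(i)
    # Read one healpix pixel at a time, keeping only galaxies with mag_i_lsst < 28.
    d = catalog.get_quantities(columns, native_filters=[(lambda x: x==Pix[i], 'healpix_pixel')], filters=['mag_i_lsst < 28'])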

    CC = []
    
    for j in range(len(columns)):
        print(j)
        cc = pf.Column(name=columns[j], format='E', array=d[columns[j]])
        CC.append(cc)
    hdu = pf.BinTableHDU.from_columns(CC, nrows=len(d['mag_i_lsst']))
    hdu.writeto('CosmoDC2_'+str(Pix[i])+'.fits', overwrite=True)

#### Next trim the catalogs to form separate files for lenses and sources
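
The lens cuts used in the cell that follows can also be written in vectorized form, assigning every galaxy to its stellar-mass bin in one pass. The sketch below runs on small synthetic arrays; the function name `count_lenses` and the use of `np.digitize` are illustrative choices and not the original implementation. It differs from the original masks only for values lying exactly on an interior bin edge, which the strict inequalities in the original exclude.

```python
import numpy as np

# Stellar-mass bin edges in log10(M*/Msun), taken from SM_min and SM_max.
SM_edges = np.array([10.0, 10.4, 10.7, 11.0, 11.2, 11.4, 11.6, 15.0])


def count_lenses(z, mag_r, gr, log_sm):
    """Count red and blue lenses per stellar-mass bin.

    Cuts: 0.03 < z < 0.2, r < 17.7, red if g-r > 0.7, blue if g-r < 0.7.
    """
    base = ((z > 0.03) & (z < 0.2) & (mag_r < 17.7)
            & (log_sm > SM_edges[0]) & (log_sm < SM_edges[-1]))
    # digitize returns 1 for the first bin, so shift to a 0-based index.
    bin_index = np.digitize(log_sm, SM_edges) - 1
    n_bins = len(SM_edges) - 1
    N_r = np.bincount(bin_index[base & (gr > 0.7)], minlength=n_bins)
    N_b = np.bincount(bin_index[base & (gr < 0.7)], minlength=n_bins)
    return N_r, N_b


# Synthetic galaxies (not CosmoDC2 data): the third fails the magnitude cut
# and the fourth fails the redshift cut.
z = np.array([0.1, 0.1, 0.1, 0.5, 0.1])
mag_r = np.array([17.0, 17.0, 18.0, 17.0, 17.0])
gr = np.array([0.8, 0.5, 0.8, 0.8, 0.9])
log_sm = np.array([10.2, 10.5, 10.2, 10.2, 11.7])

N_r, N_b = count_lenses(z, mag_r, gr, log_sm)
print(N_r, N_b)
```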

In [None]:
SM_min = np.array([10,10.4,10.7,11.0,11.2,11.4,11.6])
SM_max = np.array([10.4,10.7,11.0,11.2,11.4,11.6,15.0])

# lenses
N_b = np.zeros(7)
N_r = np.zeros(7)

for i in range(7):
    for j in range(len(Pix)):

        res = pf.open('CosmoDC2_'+str(Pix[j])+'.fits')[1].data
        ra = res['ra_true']
        dec = res['dec_true']
        z = res['redshift_true']
        mag_r = res['mag_true_r_sdss']
        
        gr = res['Mag_true_g_sdss_z0'] - res['Mag_true_r_sdss_z0'] # larger number means redder
        sm = res['stellar_mass_bulge'] + res['stellar_mass_disk']
        
        # lens cuts: 0.03 < z < 0.2, r < 17.7, stellar-mass bin i, colour split at g-r = 0.7
        mask_lens_b = (z>0.03)*(z<0.2)*(mag_r<17.7)*(np.log10(sm)>SM_min[i])*(np.log10(sm)<SM_max[i])*(gr<0.7)
        mask_lens_r = (z>0.03)*(z<0.2)*(mag_r<17.7)*(np.log10(sm)>SM_min[i])*(np.log10(sm)<SM_max[i])*(gr>0.7)

        N_b[i] += len(ra[mask_lens_b])
        N_r[i] += len(ra[mask_lens_r])

        print(i, j, len(ra[mask_lens_b]), len(ra[mask_lens_r]))

    print(i,N_b, N_r)


# sdss main
# N_b = np.array([1718., 1404., 1024.,  325.,   87.,   18.,    0.])
# N_r = np.array([2642., 4061., 4868., 3199., 1844.,  695.,  270.])

for i in range(7):
    outfilename_r = 'cosmoDC2_SM'+str(i+1)+'_r.fits'
    outfilename_b = 'cosmoDC2_SM'+str(i+1)+'_b.fits'
    RA_r = np.zeros(int(N_r[i]))
    DEC_r = np.zeros(int(N_r[i]))
    Z_r = np.zeros(int(N_r[i]))
    RA_b = np.zeros(int(N_b[i]))
    DEC_b = np.zeros(int(N_b[i]))
    Z_b = np.zeros(int(N_b[i]))
    n_r = 0
    n_b = 0

    for j in range(len(Pix)):

        res = pf.open('CosmoDC2_'+str(Pix[j])+'.fits')[1].data
        ra = res['ra_true']
        dec = res['dec_true']
        z = res['redshift_true']
        mag_r = res['mag_true_r_sdss']

        gr = res['Mag_true_g_sdss_z0'] - res['Mag_true_r_sdss_z0'] # larger number means redder
        sm = res['stellar_mass_bulge'] + res['stellar_mass_disk']

        mask_lens_b = (z>0.03)*(z<0.2)*(mag_r<17.7)*(np.log10(sm)>SM_min[i])*(np.log10(sm)<SM_max[i])*(gr<0.7)
        mask_lens_r = (z>0.03)*(z<0.2)*(mag_r<17.7)*(np.log10(sm)>SM_min[i])*(np.log10(sm)<SM_max[i])*(gr>0.7)

        nn_r = len(ra[mask_lens_r])
        nn_b = len(ra[mask_lens_b])

        print(n_r, n_b, nn_r, nn_b)
        RA_r[n_r:n_r+nn_r] = ra[mask_lens_r]
        RA_b[n_b:n_b+nn_b] = ra[mask_lens_b]
        DEC_r[n_r:n_r+nn_r] = dec[mask_lens_r]
        DEC_b[n_b:n_b+nn_b] = dec[mask_lens_b]
        Z_r[n_r:n_r+nn_r] = z[mask_lens_r]
        Z_b[n_b:n_b+nn_b] = z[mask_lens_b]


        n_r += nn_r
        n_b += nn_b

    c1 = pf.Column(name='RA', format='E', array=RA_r)
    c2 = pf.Column(name='DEC', format='E', array=DEC_r)
    c3 = pf.Column(name='Z', format='E', array=Z_r)

    CC = [c1, c2, c3]
    hdu = pf.BinTableHDU.from_columns(CC, nrows=len(RA_r))
    hdu.writeto(outfilename_r, overwrite=True)

    c1 = pf.Column(name='RA', format='E', array=RA_b)
    c2 = pf.Column(name='DEC', format='E', array=DEC_b)
    c3 = pf.Column(name='Z', format='E', array=Z_b)

    CC = [c1, c2, c3]
    hdu = pf.BinTableHDU.from_columns(CC, nrows=len(RA_b))
    hdu.writeto(outfilename_b, overwrite=True)

# sources
N_s = 2740801  # hard-coded array size; presumably the total number of galaxies passing the source cuts below
n_s = 0
nn_s = 0

RA = np.zeros(N_s)
DEC = np.zeros(N_s)
Z = np.zeros(N_s)
E1 = np.zeros(N_s)
E2 = np.zeros(N_s)

for j in range(len(Pix)):

    print(j)
    res = pf.open('CosmoDC2_'+str(Pix[j])+'.fits')[1].data
    ra = res['ra_true']
    dec = res['dec_true']
    z = res['redshift_true']
    mag_r = res['mag_true_r_sdss']
    e1 = res['shear_1']
    e2 = res['shear_2']
     
    mask_source = (z>0.2)*(z<1.0)*(mag_r<22)

    nn_s = len(ra[mask_source])

    RA[n_s:n_s+nn_s] = ra[mask_source]
    DEC[n_s:n_s+nn_s] = dec[mask_source]
    Z[n_s:n_s+nn_s] = z[mask_source]
    E1[n_s:n_s+nn_s] = e1[mask_source]
    E2[n_s:n_s+nn_s] = e2[mask_source]

    n_s += nn_s

c1 = pf.Column(name='RA', format='E', array=RA)
c2 = pf.Column(name='DEC', format='E', array=DEC)
c3 = pf.Column(name='Z', format='E', array=Z)
c4 = pf.Column(name='E1', format='E', array=E1)
c5 = pf.Column(name='E2', format='E', array=E2)

CC = [c1, c2, c3, c4, c5]
hdu = pf.BinTableHDU.from_columns(CC, nrows=len(RA))
hdu.writeto('cosmoDC2_source.fits', overwrite=True)
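
The source loop above preallocates arrays with a hard-coded total. One possible alternative, sketched here on synthetic data, is to collect the selected rows per file and concatenate them at the end, so the total does not need to be known in advance. The names `select_sources` and `stack_sources`, and the dictionary layout of each chunk, are hypothetical stand-ins for the per-pixel FITS tables.

```python
import numpy as np


def select_sources(z, mag_r):
    """Source cuts used above: 0.2 < z < 1.0 and r < 22."""
    return (z > 0.2) & (z < 1.0) & (mag_r < 22)


def stack_sources(chunks, keys=('ra', 'dec', 'z', 'e1', 'e2')):
    """Apply the source cuts to each chunk and concatenate the results.

    `chunks` is an iterable of dicts of equal-length arrays, one per file.
    """
    selected = {k: [] for k in keys}
    for chunk in chunks:
        mask = select_sources(chunk['z'], chunk['mag_r'])
        for k in keys:
            selected[k].append(chunk[k][mask])
    return {k: np.concatenate(v) for k, v in selected.items()}


# Two synthetic "files" with random values (not CosmoDC2 data).
rng = np.random.default_rng(1)
chunks = []
for n in (5, 8):
    chunks.append({
        'ra': rng.uniform(50, 70, n),
        'dec': rng.uniform(-45, -25, n),
        'z': rng.uniform(0.0, 1.5, n),
        'mag_r': rng.uniform(18, 25, n),
        'e1': rng.normal(0, 0.01, n),
        'e2': rng.normal(0, 0.01, n),
    })

sources = stack_sources(chunks)
expected = sum(int(select_sources(c['z'], c['mag_r']).sum()) for c in chunks)
print(len(sources['ra']), expected)
```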

