# API demonstration for paper of v1.0

_the LSST-DESC CLMM team_


Here we demonstrate how to use `clmm` to estimate a weak-lensing (WL) halo mass from observations of a galaxy cluster when the source galaxies follow a given redshift distribution (the Chang et al. (2013) distribution implemented in `clmm`). The example uses several functionalities of the support `mock_data` module to produce a mock dataset. The steps are:

- Setting things up, with the proper imports.
- Computing the binned reduced tangential shear profile of the mock dataset, using logarithmic binning.
- Setting up a model accounting for the redshift distribution.
- Performing a simple fit using `scipy.optimize.curve_fit` and visualizing the results.

## Setup

First, we import some standard packages.

In [None]:
import matplotlib.pyplot as plt
import numpy as np
plt.rcParams['font.family'] = ['gothambook','gotham','gotham-book','serif']

Next, we import `clmm`'s core modules.

In [None]:
import clmm
import clmm.dataops as da
import clmm.galaxycluster as gc
import clmm.theory as theory
import clmm.support.mock_data as mock
from clmm import Cosmology

## Measuring shear profiles 

`clmm` has support code to generate a mock catalog given an input cosmology and cluster parameters. We use it to generate the data sample for this example:

In [None]:
np.random.seed(14) # For reproducibility

# Set cosmology of mock data
cosmo = Cosmology(H0=70.0, Omega_dm0=0.27-0.045, Omega_b0=0.045, Omega_k0=0.0)

# Cluster info
cluster_m = 1.e15 # Cluster mass - ($M200_m$) [Msun]
concentration = 4  # Cluster concentration
cluster_z = 0.4 # Cluster redshift
cluster_ra = 0. # Cluster Ra in deg
cluster_dec = 0. # Cluster Dec in deg

# Make mock galaxies
mock_galaxies = mock.generate_galaxy_catalog(
    cluster_m=cluster_m, cluster_z=cluster_z, cluster_c=concentration, # Cluster data
    cosmo=cosmo, # Cosmology object
    zsrc='chang13', # Galaxy redshift distribution
    zsrc_min=0.5, # Minimum redshift of the galaxies
    shapenoise=0.05, # Gaussian shape noise added to the galaxy shapes
    photoz_sigma_unscaled=0.05, # Photo-z error added to the source redshifts
    ngals=10000 # Number of galaxies to be generated
)['ra', 'dec', 'e1', 'e2', 'z', 'ztrue', 'pzbins', 'pzpdf', 'id']
print(f'This results in a table with the columns: {", ".join(mock_galaxies.colnames)}')

We can extract the columns of this mock catalog to show explicitly how the quantities can be passed to `clmm` functions and how to add them to a `GalaxyCluster` object:

In [None]:
# Put galaxy values on arrays
gal_ra = mock_galaxies['ra'] # Galaxies Ra in deg
gal_dec = mock_galaxies['dec'] # Galaxies Dec in deg
gal_e1 = mock_galaxies['e1'] # Galaxies ellipticity 1
gal_e2 = mock_galaxies['e2'] # Galaxies ellipticity 2
gal_z = mock_galaxies['z'] # Galaxies observed redshift
gal_ztrue = mock_galaxies['ztrue'] # Galaxies true redshift
gal_pzbins = mock_galaxies['pzbins'] # Galaxies P(z) bins  
gal_pzpdf = mock_galaxies['pzpdf'] # Galaxies P(z)
gal_id = mock_galaxies['id'] # Galaxies ID

From the source galaxy quantities, we can compute the tangential and cross ellipticity components and the corresponding radial profile using `clmm.dataops` functions:

In [None]:
# Convert ellipticities into shears
gal_ang_dist, gal_gt, gal_gx = da.compute_tangential_and_cross_components(cluster_ra, cluster_dec,
                                                                          gal_ra, gal_dec,
                                                                          gal_e1, gal_e2,
                                                                          geometry="flat")

# Measure profile
profile = da.make_radial_profile([gal_gt, gal_gx, gal_z],
                                 gal_ang_dist, "radians", "Mpc",
                                 bins=da.make_bins(0.01, 3.7, 50),
                                 cosmo=cosmo,
                                 z_lens=cluster_z,
                                 include_empty_bins=False)
print(f'Profile table has columns: {", ".join(profile.colnames)},')
print('where p_(0, 1, 2) = (gt, gx, z)')
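
To make the first step above more concrete, here is a minimal sketch of how ellipticity components are rotated into tangential and cross components on a flat sky. It is not `clmm`'s internal code: the helper name `tangential_and_cross`, the use of plain offsets `x`, `y` from the cluster centre, and the sign conventions are assumptions of this illustration, following the usual weak-lensing definitions.

```python
import numpy as np

def tangential_and_cross(x, y, e1, e2):
    """Rotate (e1, e2) into tangential and cross components.

    x, y are flat-sky offsets of each source from the cluster centre.
    Hypothetical helper: the sign conventions are an assumption of this sketch.
    """
    x, y, e1, e2 = map(np.asarray, (x, y, e1, e2))
    phi = np.arctan2(y, x)  # position angle of each source
    e_t = -(e1 * np.cos(2 * phi) + e2 * np.sin(2 * phi))
    e_x = e1 * np.sin(2 * phi) - e2 * np.cos(2 * phi)
    return np.hypot(x, y), e_t, e_x

# A source on the x-axis with e1 < 0 and e2 = 0 is purely tangential
sep, e_t, e_x = tangential_and_cross([1.0], [0.0], [-0.1], [0.0])
```

With these conventions the example source has a positive tangential component and a vanishing cross component, which is the pattern expected around a lens.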

The other possibility is to use the `GalaxyCluster` object. For that you just have to provide the following information of the cluster:

* Ra, Dec [deg]
* Mass - ($M200_m$) [Msun]
* Concentration
* Redshift


and the source galaxies:

* Ra, Dec [deg]
* Two ellipticity components
* Redshift



In [None]:
# Create a GCData with the galaxies
galaxies = clmm.GCData()
galaxies['ra'] = gal_ra
galaxies['dec'] = gal_dec
galaxies['e1'] = gal_e1
galaxies['e2'] = gal_e2
galaxies['z'] = gal_z
galaxies['ztrue'] = gal_ztrue
galaxies['pzbins'] = gal_pzbins
galaxies['pzpdf'] = gal_pzpdf
galaxies['id'] = gal_id

# Create a GalaxyCluster
cluster = clmm.GalaxyCluster("Name of cluster", cluster_ra, cluster_dec,
                                   cluster_z, galaxies)

# Convert ellipticities into shears for the members
cluster.compute_tangential_and_cross_components(geometry="flat")
print(cluster.galcat.colnames)

# Measure profile and add profile table to the cluster
cluster.make_radial_profile(bins=da.make_bins(0.01, 3.7, 50, method='evenlog10width'),
                            bin_units="Mpc",
                            cosmo=cosmo,
                            include_empty_bins=False,
                            gal_ids_in_bins=True,
                           )
print(cluster.profile.colnames)

This results in an attribute `profile` added to the `cluster` object.

In [None]:
from paper_formating import prep_plot
prep_plot(figsize=(9, 9))

plt.errorbar(cluster.profile['radius'], cluster.profile['gt'],
             cluster.profile['gt_err'], c='k', linestyle='', marker='o',
            markersize=1, elinewidth=.5, capthick=.5)
plt.xlabel('r [Mpc]', fontsize = 10)
plt.ylabel(r'$g_t$', fontsize = 10)
#plt.xlim(min(cl_noisy.profile['radius']), max(cl_noisy.profile['radius']))
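
The profile above was binned with edges evenly spaced in $\log_{10} r$. As a rough sketch of what such binning amounts to, the edges can be built with plain `numpy`. The helper `log_bin_edges` is hypothetical, and treating it as equivalent to `da.make_bins(..., method='evenlog10width')` is an assumption of this illustration.

```python
import numpy as np

def log_bin_edges(rmin, rmax, nbins):
    """Return nbins + 1 edges evenly spaced in log10(r). Hypothetical helper."""
    return np.logspace(np.log10(rmin), np.log10(rmax), nbins + 1)

edges = log_bin_edges(0.01, 3.7, 50)
ratios = edges[1:] / edges[:-1]  # constant ratio between consecutive edges
```

A constant ratio between consecutive edges means the inner bins are much narrower than the outer ones, which suits a signal that falls steeply with radius.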

## Theoretical prediction

Here we use `clmm.compute_reduced_tangential_shear` to make a prediction that accounts for the redshift distribution of the galaxies in each radial bin:

In [None]:
def predict_reduced_tangential_shear(profile, logm):
    return np.array([np.mean(
        clmm.compute_reduced_tangential_shear(
            r_proj=radial_bin['radius'], # Radial component of the profile
            mdelta=10**logm, # Mass of the cluster [M_sun]
            cdelta=4, # Concentration of the cluster
            z_cluster=cluster_z, # Redshift of the cluster
            z_source=cluster.galcat[radial_bin['gal_id']]['z'], # Redshift value of each source galaxy inside the radial bin
            cosmo=cosmo,
            delta_mdef=200,
            halo_profile_model='nfw'
        )) for radial_bin in profile])
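
The model above averages the prediction over the individual source redshifts in each bin rather than evaluating it once at the mean redshift. The toy example below illustrates why the two differ when the signal depends nonlinearly on source redshift. The function `toy_signal` and the redshift values are made up for illustration; they are not part of `clmm` and are not a physical lensing model.

```python
import numpy as np

def toy_signal(z_source, z_lens=0.4):
    """Hypothetical stand-in for a lensing signal (not a clmm function):
    zero for sources in front of the lens, rising and flattening behind it."""
    z_source = np.asarray(z_source, dtype=float)
    return np.clip(1.0 - z_lens / z_source, 0.0, None)

z_in_bin = np.array([0.5, 0.8, 1.5, 3.0])  # made-up source redshifts in one bin

# Average of the per-galaxy signals (the approach used in the model above)
averaged_prediction = np.mean(toy_signal(z_in_bin))

# Signal evaluated once at the mean redshift of the bin
naive_prediction = float(toy_signal(np.mean(z_in_bin)))
```

For this toy function the two numbers differ, so a fit that uses the single-redshift value would be comparing the data to a different curve.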

## Mass fitting

We estimate the best-fit mass using `scipy.optimize.curve_fit`, fitting the binned profile with the model above, which accounts for the redshift distribution of the sources. Fitting $\log_{10} M$ instead of $M$ reduces the pre-defined fitting bounds from several orders of magnitude in mass to a range of a few units in $\log_{10} M$. From the associated error $\Delta (\log_{10}M)$ we calculate the error on the mass as $\Delta M = M_{fit}\ln(10)\Delta (\log_{10}M)$.

In [None]:
from clmm.support.sampler import fitters
popt, pcov = fitters['curve_fit'](predict_reduced_tangential_shear,
    cluster.profile, 
    cluster.profile['gt'], 
    cluster.profile['gt_err'], bounds=[10.,17.])

logm_est, logm_err_est = popt[0], np.sqrt(pcov[0][0])

m_est = 10.**logm_est
m_est_err =  m_est*logm_err_est*np.log(10)
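
The conversion from the error on $\log_{10} M$ to the error on $M$ is first-order error propagation: since $M = 10^{\log_{10} M}$, the derivative is $dM/d(\log_{10} M) = M \ln(10)$. The short sketch below checks this numerically. The helper `mass_error_from_logm` and the input numbers are illustrative only and are not taken from the fit.

```python
import numpy as np

def mass_error_from_logm(logm, logm_err):
    """First-order propagation: dM = M * ln(10) * d(log10 M)."""
    mass = 10.0 ** logm
    return mass, mass * np.log(10.0) * logm_err

mass, mass_err = mass_error_from_logm(15.0, 0.02)  # illustrative numbers only

# Finite-difference estimate of dM/d(log10 M) at the same point
eps = 1e-6
numerical = (10.0 ** (15.0 + eps) - 10.0 ** (15.0 - eps)) / (2 * eps)
```

The finite-difference slope agrees with `mass * np.log(10)`, which is the factor used in the cell above.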

In [None]:
print(f'The input mass = {cluster_m:.2e} Msun\n')

print("Accounting for the redshift distribution in the model\n")
print(f'Best fit mass for noisy data = {m_est:.2e} +/- {m_est_err:.2e} Msun')

As expected, the reconstructed mass is biased when the redshift distribution is not accounted for in the model.

## Visualization of the results

For visualization purposes, we calculate the reduced tangential shear predicted by the model at the best-fit mass, and at $\pm 3\sigma$ around the best-fit $\log_{10} M$.

In [None]:
gt_est = predict_reduced_tangential_shear(cluster.profile, logm_est)
gt_est_err = [predict_reduced_tangential_shear(cluster.profile, logm_est+i*logm_err_est)
                      for i in (-3, 3)]

We compare the measured profile, generated with the input mass, to the reduced tangential shear predicted by the model that accounts for the redshift distribution, evaluated at the best-fit mass.

In [None]:
prep_plot(figsize=(9 , 9))
plt.errorbar(cluster.profile['radius'], cluster.profile['gt'], cluster.profile['gt_err'],
             c='k', linestyle='', marker='o', label=rf'$M_{{input}} = {cluster_m*1e-15}\times10^{{{15}}} M_\odot$',
            markersize=1, elinewidth=.5, capthick=.5)
pow10 = 15
plt.loglog(cluster.profile['radius'], gt_est,'-b', 
           label=fr'$M_{{fit}} = {m_est/10**pow10:.2f} \pm {m_est_err/10**pow10:.2f}\times 10^{{{pow10}}} M_\odot$',
          lw=.5)
plt.fill_between(cluster.profile['radius'], *gt_est_err, lw=0, color='b', alpha=.3)

plt.xlabel('$r$ [Mpc]', fontsize = 10)
plt.ylabel(r'$g_t$', fontsize = 10)
plt.legend(fontsize = 8)
plt.subplots_adjust(left=.2, bottom=.15)
plt.savefig('r_gt.png')