# Tests of coordinate system effects on shear profiles

Authors: Marina Ricci, Tomomi Sunayama

Tested, modified, and documented by: Camille Avestruz, Caio Lima de Oliveira

In this notebook we illustrate the importance of setting the correct coordinate system for shear catalogs. We start by showing these effects on generated mock data, then on real data: example source galaxies from M. Oguri's lectures and HSC Y3 data.

Throughout this notebook we also show how to correctly set the coordinate system and how to update it and convert data.

### Shear coordinate system definition and conversion

* Celestial coordinate system: the declination $\delta$ and the right ascension $\alpha$ take the role of the spherical angles $\theta$ and $\varphi$.

* Euclidean coordinate system: a Cartesian coordinate system defined on the plane tangent to the celestial sphere at the point of observation. Here, the $y$-axis is parallel to the declination $\delta$ and the $x$-axis is antiparallel to the right ascension $\alpha$.

In the small-angle, planar approximation of the Celestial coordinates, the two coordinate systems are related by a parity transformation of the $x$-axis. The conversion between the Euclidean ellipticity $\epsilon^E = \epsilon_1^E + i \epsilon_2^E$ and the Celestial ellipticity $\epsilon^C = \epsilon_1^C + i \epsilon_2^C$ is then given by the transformation $\varphi \rightarrow \pi - \varphi$:

$$\epsilon^C_1 + i \epsilon_2^C = |\epsilon| e^{2 i \varphi^\prime} = |\epsilon| e^{2 i (\pi  - \varphi)} = |\epsilon| e^{- 2 i \varphi} = \epsilon^E_1 - i \epsilon_2^E$$
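
The conversion above amounts to a complex conjugation of the ellipticity. As a minimal numerical check, using only the Python standard library, the sketch below builds an ellipticity from a modulus and a position angle, applies $\varphi \rightarrow \pi - \varphi$, and compares the result with a sign flip of $\epsilon_2$. The helper name `euclidean_to_celestial` and the numerical values are illustrative assumptions; they are not part of CLMM.

```python
import cmath
import math


def euclidean_to_celestial(e1, e2):
    """Flip the sign of e2, i.e. take the complex conjugate of e1 + i*e2."""
    return e1, -e2


# Illustrative ellipticity: modulus and position angle in the Euclidean frame.
modulus = 0.3
phi = math.radians(25.0)
eps_euclidean = modulus * cmath.exp(2j * phi)

# Apply the transformation phi -> pi - phi explicitly.
eps_celestial = modulus * cmath.exp(2j * (math.pi - phi))

# The same result is obtained by flipping the sign of e2.
e1_c, e2_c = euclidean_to_celestial(eps_euclidean.real, eps_euclidean.imag)
print(math.isclose(eps_celestial.real, e1_c, abs_tol=1e-12))
print(math.isclose(eps_celestial.imag, e2_c, abs_tol=1e-12))
```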

### Here we generate mock source catalogs with different coordinate systems and explore how the coordinate system must be accounted for to measure the correct shear profiles.

In [None]:
import clmm

clmm.__version__

In [None]:
import matplotlib.pyplot as plt
import numpy as np
from astropy.io import fits
from scipy import spatial

%matplotlib inline

In [None]:
from clmm.support import mock_data as mock
from clmm import Cosmology

## Generate the mock catalog with different source galaxy options
In this example, the mock data includes shape noise, galaxies drawn from a redshift distribution, and photo-z errors.

In [None]:
mock_cosmo = Cosmology(H0=70.0, Omega_dm0=0.27 - 0.045, Omega_b0=0.045, Omega_k0=0.0)

In [None]:
cosmo = mock_cosmo
cluster_id = "Awesome_cluster"

cluster_m = 1.0e15  # M200,m
cluster_z = 0.3
# Cluster centre coordinates
cluster_ra = 50.0
cluster_dec = 87.0
concentration = 4

In [None]:
# let's put all these quantities in a single dictionary to facilitate clarity
cluster_kwargs = {
    "cluster_m": cluster_m,
    "cluster_z": cluster_z,
    "cluster_ra": cluster_ra,
    "cluster_dec": cluster_dec,
    "cluster_c": concentration,
    "cosmo": cosmo,
}

In [None]:
# likewise, let's gather the source galaxy options in a single dictionary
source_kwargs = {
    "zsrc": "chang13",
    "zsrc_min": cluster_z + 0.1,
    "photoz_sigma_unscaled": 0.05,
    "ngals": 1000,
    "pz_bins": np.linspace(0, 10, 1001),
    "shapenoise": 0.05,
}

We must supply the coordinate system information when generating a mock galaxy catalog. If we don't, a warning is issued and problems may arise!

In [None]:
np.random.seed(679)

mock_sources_euclidean_coord = mock.generate_galaxy_catalog(
    **cluster_kwargs, **source_kwargs, coordinate_system="euclidean"
)

In [None]:
np.random.seed(679)

mock_sources_celestial_coord = mock.generate_galaxy_catalog(
    **cluster_kwargs, **source_kwargs, coordinate_system="celestial"
)

In [None]:
np.random.seed(679)

# In this case, we are going to generate a mock catalog without setting
# the coordinate system, which will raise a warning and default to "euclidean".
mock_sources_default_coord = mock.generate_galaxy_catalog(
    **cluster_kwargs,
    **source_kwargs,
)

Note that $e_1$ remains the same for all catalogs, while $e_2$ changes its sign between coordinate systems. Also, notice the `coordinate_system` metadata!

In [None]:
mock_sources_euclidean_coord[:5]

In [None]:
mock_sources_celestial_coord[:5]

In [None]:
mock_sources_default_coord[:5]

Now, we can easily convert between the different coordinate systems with the method `update_coordinate_system()`. However, note that we must specify which columns we want to update!

In [None]:
mock_sources_default_coord.update_coordinate_system("celestial", ("e2",))

mock_sources_default_coord[:5]

We can also update the coordinate system but not convert the data. This is useful if the original coordinate system was incorrectly set but may also lead to errors (as we intentionally do now).

In [None]:
mock_sources_default_coord.update_coordinate_system("euclidean")

mock_sources_wrong_coord = mock_sources_default_coord

mock_sources_wrong_coord[:5]
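
The difference between converting the data and only relabelling the coordinate system can be illustrated with a toy catalog. The class below is a hypothetical stand-in written for this illustration (it is not the CLMM `GCData` class, and the `convert_e2` flag is an assumption of this sketch); it only mimics the two behaviours used above.

```python
from dataclasses import dataclass


@dataclass
class ToyCatalog:
    """Toy shear catalog (hypothetical; not the CLMM GCData class)."""

    e1: list
    e2: list
    coordinate_system: str = "euclidean"

    def update_coordinate_system(self, new_system, convert_e2=False):
        """Relabel the catalog and, only if requested, flip the sign of e2."""
        if new_system not in ("euclidean", "celestial"):
            raise ValueError(f"unknown coordinate system: {new_system}")
        if convert_e2 and new_system != self.coordinate_system:
            self.e2 = [-value for value in self.e2]
        self.coordinate_system = new_system


cat = ToyCatalog(e1=[0.1], e2=[0.2])

# Label and data are updated together: the catalog stays self-consistent.
cat.update_coordinate_system("celestial", convert_e2=True)
print(cat.coordinate_system, cat.e2)

# Only the label is updated: the data are still celestial, the label is not.
cat.update_coordinate_system("euclidean")
print(cat.coordinate_system, cat.e2)
```

After the second call, the toy catalog is in the same inconsistent state as `mock_sources_wrong_coord`: the label says `euclidean`, while `e2` still carries the celestial sign.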

## Generate cluster objects from mock data

In [None]:
import clmm.dataops
from clmm.galaxycluster import GalaxyCluster

In [None]:
cl_euclidean = GalaxyCluster(
    cluster_id,
    cluster_ra,
    cluster_dec,
    cluster_z,
    mock_sources_euclidean_coord,
)

In [None]:
cl_celestial = GalaxyCluster(
    cluster_id,
    cluster_ra,
    cluster_dec,
    cluster_z,
    mock_sources_celestial_coord,
)

In [None]:
cl_wrong = GalaxyCluster(
    cluster_id,
    cluster_ra,
    cluster_dec,
    cluster_z,
    mock_sources_wrong_coord,
)

In [None]:
fig, ax1 = plt.subplots(1, 1)

ax1.scatter(cl_euclidean.galcat["e1"], cl_euclidean.galcat["e2"], s=2, alpha=0.5, label="euclidean")
ax1.scatter(
    cl_celestial.galcat["e1"],
    cl_celestial.galcat["e2"],
    s=2,
    alpha=0.5,
    color="red",
    label="celestial",
)

ax1.set_xlabel("$\\epsilon_1$")
ax1.set_ylabel("$\\epsilon_2$")
ax1.set_aspect("equal", "datalim")
ax1.set_xlim(-0.125, 0.125)
ax1.set_ylim(-0.125, 0.125)
ax1.axvline(0, linestyle="dotted", color="black")
ax1.axhline(0, linestyle="dotted", color="black")

plt.legend()
plt.show()

## Compute and plot shear profiles

We will now compute the tangential and cross components of the ellipticity for all three clusters. For `cl_euclidean` and `cl_celestial` we should get the same $e_t$ and an $e_x$ with flipped sign, but a completely different dataset for `cl_wrong`.
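
To see why, recall the usual definitions $e_t = -(e_1 \cos 2\varphi + e_2 \sin 2\varphi)$ and $e_x = e_1 \sin 2\varphi - e_2 \cos 2\varphi$, where $\varphi$ is the position angle of the source relative to the cluster centre. The sketch below evaluates them for a single illustrative galaxy in three cases: Euclidean data, consistently converted Celestial data ($e_2 \rightarrow -e_2$ and $\varphi \rightarrow \pi - \varphi$), and mislabelled data ($e_2$ flipped, angle left untouched). The function name and numerical values are assumptions made for this example, and the details of the CLMM implementation may differ.

```python
import math


def tangential_and_cross(e1, e2, phi):
    """Tangential and cross components for a source at position angle phi (radians)."""
    e_t = -(e1 * math.cos(2 * phi) + e2 * math.sin(2 * phi))
    e_x = e1 * math.sin(2 * phi) - e2 * math.cos(2 * phi)
    return e_t, e_x


# Illustrative ellipticity components and position angle.
e1, e2, phi = 0.03, -0.05, math.radians(40.0)

# Euclidean data with the Euclidean angle.
et_euc, ex_euc = tangential_and_cross(e1, e2, phi)

# Celestial data handled consistently: e2 flipped and phi -> pi - phi.
et_cel, ex_cel = tangential_and_cross(e1, -e2, math.pi - phi)

# Mislabelled data: e2 flipped, but the angle is not mirrored.
et_bad, ex_bad = tangential_and_cross(e1, -e2, phi)

print(et_euc, et_cel, et_bad)
print(ex_euc, ex_cel, ex_bad)
```

The first two tangential values agree and the cross values differ only in sign, while the mislabelled case gives a different tangential component.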

In [None]:
cl_euclidean.compute_tangential_and_cross_components(add=True)
cl_euclidean.galcat["et", "ex"].pprint(max_width=-1)

In [None]:
cl_celestial.compute_tangential_and_cross_components(add=True)
cl_celestial.galcat["et", "ex"].pprint(max_width=-1)

In [None]:
cl_wrong.compute_tangential_and_cross_components(add=True)
cl_wrong.galcat["et", "ex"].pprint(max_width=-1)

In [None]:
f, ax = plt.subplots(1, 2, figsize=(10, 4))

ax[0].hist(cl_euclidean.galcat["et"], bins=50, color="tab:blue", alpha=0.5, label="euclidean")
ax[0].hist(cl_celestial.galcat["et"], bins=50, color="tab:red", histtype="step", label="celestial")
ax[0].hist(
    cl_wrong.galcat["et"],
    bins=50,
    color="tab:orange",
    alpha=0.5,
    histtype="stepfilled",
    label="incorrect coordinate system",
)
ax[0].set_xlabel("$\\epsilon_t$", fontsize="xx-large")

ax[1].hist(cl_euclidean.galcat["ex"], bins=50, color="tab:blue", alpha=0.5, label="euclidean")
ax[1].hist(cl_celestial.galcat["ex"], bins=50, color="tab:red", histtype="step", label="celestial")
ax[1].hist(
    cl_wrong.galcat["ex"],
    bins=50,
    color="tab:orange",
    alpha=0.5,
    histtype="stepfilled",
    label="incorrect coordinate system",
)
ax[1].set_xlabel("$\\epsilon_x$", fontsize="xx-large")
ax[1].set_yscale("log")

plt.legend()
plt.show()

In [None]:
cl_euclidean.make_radial_profile("kpc", cosmo=cosmo)
cl_euclidean.profile.show_in_notebook()

In [None]:
cl_celestial.make_radial_profile("kpc", cosmo=cosmo)
cl_celestial.profile.show_in_notebook()

In [None]:
cl_wrong.make_radial_profile("kpc", cosmo=cosmo)
cl_wrong.profile.show_in_notebook()
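
Conceptually, the radial profiles above are averages of the tangential and cross components in radial bins. The sketch below shows a plain, unweighted version of this idea using only the standard library. The function name, the bin edges, and the toy values are assumptions for illustration; the binning, weighting, and error estimates used by `make_radial_profile` are not reproduced here.

```python
import math


def binned_mean_profile(radii, values, bin_edges):
    """Return (mean radius, mean value, standard error, count) for each radial bin."""
    profile = []
    for low, high in zip(bin_edges[:-1], bin_edges[1:]):
        in_bin = [(r, v) for r, v in zip(radii, values) if low <= r < high]
        count = len(in_bin)
        if count == 0:
            # Empty bin: nothing to average.
            profile.append((None, None, None, 0))
            continue
        mean_r = sum(r for r, _ in in_bin) / count
        mean_v = sum(v for _, v in in_bin) / count
        if count > 1:
            variance = sum((v - mean_v) ** 2 for _, v in in_bin) / (count - 1)
            error = math.sqrt(variance / count)
        else:
            # A single galaxy does not constrain the scatter.
            error = float("nan")
        profile.append((mean_r, mean_v, error, count))
    return profile


# Toy radii (kpc) and tangential ellipticities.
radii = [100.0, 150.0, 400.0, 450.0, 900.0]
e_t = [0.10, 0.08, 0.04, 0.02, 0.01]

for row in binned_mean_profile(radii, e_t, [0.0, 200.0, 500.0, 1000.0]):
    print(row)
```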

### => When the correct coordinate system is specified, the profiles coming from the two catalogs are identical.

In [None]:
fig, ax = plt.subplots(2, 1, height_ratios=[3, 1], sharex=True)

ax[0].errorbar(
    cl_euclidean.profile["radius"],
    cl_euclidean.profile["gt"],
    yerr=cl_euclidean.profile["gt_err"],
    alpha=0.5,
    marker=".",
    color="tab:red",
    label="euclidean",
)
ax[1].errorbar(
    cl_euclidean.profile["radius"],
    cl_euclidean.profile["gx"],
    yerr=cl_euclidean.profile["gx_err"],
    marker=".",
    alpha=0.5,
    color="tab:red",
)

ax[0].errorbar(
    cl_celestial.profile["radius"] * 1.02,
    cl_celestial.profile["gt"],
    yerr=cl_celestial.profile["gt_err"],
    alpha=0.3,
    marker=".",
    color="tab:blue",
    label="celestial",
)
ax[1].errorbar(
    cl_celestial.profile["radius"] * 1.02,
    cl_celestial.profile["gx"],
    yerr=cl_celestial.profile["gx_err"],
    alpha=0.3,
    marker=".",
    color="tab:blue",
)

ax[0].legend()
ax[0].set_xscale("log")
ax[1].set_xlabel("R [kpc]")
ax[0].set_ylabel("$g_t$")
ax[1].set_ylabel("$g_x$")

plt.subplots_adjust(hspace=0)
plt.show()

### => However, when the coordinate system is not correctly specified, the profiles are incorrect.

In [None]:
fig, ax = plt.subplots(2, 1, height_ratios=[3, 1], sharex=True)

ax[0].errorbar(
    cl_wrong.profile["radius"],
    cl_wrong.profile["gt"],
    yerr=cl_wrong.profile["gt_err"],
    alpha=0.5,
    marker=".",
    color="tab:red",
    label="incorrect coordinate system",
)
ax[1].errorbar(
    cl_wrong.profile["radius"],
    cl_wrong.profile["gx"],
    yerr=cl_wrong.profile["gx_err"],
    marker=".",
    alpha=0.5,
    color="tab:red",
)

ax[0].legend()
ax[0].set_xscale("log")
ax[1].set_xlabel("R [kpc]")
ax[0].set_ylabel("$g_t$")
ax[1].set_ylabel("$g_x$")

plt.subplots_adjust(hspace=0)
plt.show()

### To recover the correct profile we need to update the coordinate system, compute the tangential and cross components again and recalculate the profile.

In [None]:
cl_wrong.update_coordinate_system("celestial")
cl_wrong.compute_tangential_and_cross_components(add=True)

cl_wrong.make_radial_profile("kpc", cosmo=cosmo)
cl_wrong.profile.show_in_notebook()

In [None]:
fig, ax = plt.subplots(2, 1, height_ratios=[3, 1], sharex=True)

ax[0].errorbar(
    cl_wrong.profile["radius"],
    cl_wrong.profile["gt"],
    yerr=cl_wrong.profile["gt_err"],
    alpha=0.5,
    marker=".",
    color="tab:red",
    label="correct coordinate system",
)
ax[1].errorbar(
    cl_wrong.profile["radius"],
    cl_wrong.profile["gx"],
    yerr=cl_wrong.profile["gx_err"],
    marker=".",
    alpha=0.5,
    color="tab:red",
)

ax[0].legend()
ax[0].set_xscale("log")
ax[1].set_xlabel("R [kpc]")
ax[0].set_ylabel("$g_t$")
ax[1].set_ylabel("$g_x$")

plt.subplots_adjust(hspace=0)
plt.show()

## Let's now do the same test on real data

Here we present two datasets, each in a different coordinate system:
1. Example source galaxies for galaxy clusters from a [Summer School](https://github.com/oguri/wlcluster_tutorial) taught by Masamune Oguri (data is in `euclidean` coordinates);
2. HSC Y3 source galaxies with shears post-processed by Tomomi Sunayama (data is in `celestial` coordinates).

### Instructions to download text data

First, create a directory where you want to put the example data, e.g. for a given `data_coords_dir`:

```
mkdir -p <YOUR PATH TO DATA COORDS DIR>
cd <YOUR PATH TO DATA COORDS DIR>
```

Download all files from this [dropbox link](https://www.dropbox.com/scl/fo/dwsccslr5iwb7lnkf8jvx/AJkjgFeemUEHpHaZaHHqpAg?rlkey=efbtsr15mdrs3y6xsm7l48o0r&st=xb58ap0g&dl=0). This will be a zip file, `data_CLMM.zip`, of size 242 MB. `scp` or `mv` it to `data_coords_dir`. From that directory, you should be able to unzip:

```
unzip data_CLMM.zip -d .
```

You now have the necessary data files to run this notebook. **Make sure to change the `data_coords_dir` variable in the cell below to the appropriate location where you unzipped these files.**
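
As an optional sanity check, the sketch below lists which of the files read later in this notebook are missing from a given directory. The relative paths are the ones used in the loading cells of this notebook (for the default cluster `a1703`); the helper name `missing_data_files` is an assumption introduced for this example.

```python
from pathlib import Path

# Relative paths read later in this notebook (for the default cluster "a1703").
EXPECTED_FILES = (
    "data/shear_a1703.dat",
    "GAMA15H/redmapper_dr8_GAMA15H.txt",
    "GAMA15H/GAMA15H_tutorial.fits",
)


def missing_data_files(data_dir, expected=EXPECTED_FILES):
    """Return the expected relative paths that are not regular files under data_dir."""
    root = Path(data_dir)
    return [relative for relative in expected if not (root / relative).is_file()]
```

Once `data_coords_dir` is set, `missing_data_files(data_coords_dir)` should return an empty list if the archive was unzipped in the right place.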


In [None]:
#  CHANGE <YOUR PATH TO DATA COORDS DIR> TO YOUR LOCATION
data_coords_dir = "<YOUR PATH TO DATA COORDS DIR>"

### Example source galaxies from M. Oguri

This dataset is a curated selection of cluster and source catalogs from Summer School lectures delivered by Masamune Oguri.  There are eight galaxy clusters in this selection.  

More details on the corresponding tutorial can be found at this [GitHub link](https://github.com/oguri/wlcluster_tutorial). These are also in the `euclidean` coordinate system.

In [None]:
clusters = [
    "a1703",
    "gho1320",
    "sdss0851",
    "sdss1050",
    "sdss1138",
    "sdss1226",
    "sdss1329",
    "sdss1531",
]

zl_all = {
    "a1703": 0.277,
    "gho1320": 0.308,
    "sdss0851": 0.370,
    "sdss1050": 0.60,
    "sdss1138": 0.451,
    "sdss1226": 0.435,
    "sdss1329": 0.443,
    "sdss1531": 0.335,
}

ra_cl_all = {
    "a1703": 198.771833,
    "gho1320": 200.703208,
    "sdss0851": 132.911917,
    "sdss1050": 162.666250,
    "sdss1138": 174.537292,
    "sdss1226": 186.712958,
    "sdss1329": 202.393708,
    "sdss1531": 232.794167,
}

dec_cl_all = {
    "a1703": 51.817389,
    "gho1320": 31.654944,
    "sdss0851": 33.518361,
    "sdss1050": 0.285306,
    "sdss1138": 27.908528,
    "sdss1226": 21.831194,
    "sdss1329": 22.721167,
    "sdss1531": 34.240278,
}

In [None]:
cname = "a1703"

# cluster redshift
zl = zl_all.get(cname)

# coordinates of the cluster center
ra_cl = ra_cl_all.get(cname)
dec_cl = dec_cl_all.get(cname)

# fix source redshift to 1.0
zs = 1.0

We inspect the first galaxy cluster, Abell 1703.

In [None]:
rfile = data_coords_dir + "/data/shear_" + cname + ".dat"
data = np.loadtxt(rfile, comments="#")

ra = data[:, 0]
dec = data[:, 1]
e1 = data[:, 2]
e2 = data[:, 3]
wei = data[:, 4]
ids = np.arange(np.shape(data)[0])
redshifts = np.ones(np.shape(data)[0])

In [None]:
oguri_galaxies_euclidean = clmm.GCData(
    [ra, dec, e1, e2, redshifts, ids],
    names=["ra", "dec", "e1", "e2", "z", "id"],
    meta={"coordinate_system": "euclidean"},
)

oguri_galaxies_celestial = clmm.GCData(
    [ra, dec, e1, e2, redshifts, ids],
    names=["ra", "dec", "e1", "e2", "z", "id"],
    meta={"coordinate_system": "celestial"},
)

In [None]:
oguri_cluster_euclidean = clmm.GalaxyCluster(cname, ra_cl, dec_cl, zl, oguri_galaxies_euclidean)

oguri_cluster_celestial = clmm.GalaxyCluster(cname, ra_cl, dec_cl, zl, oguri_galaxies_celestial)

# Compute the tangential and cross ellipticity components of the source galaxies.
oguri_cluster_euclidean.compute_tangential_and_cross_components(add=True)
oguri_cluster_celestial.compute_tangential_and_cross_components(add=True)
print(oguri_cluster_euclidean.galcat.colnames)
print(oguri_cluster_celestial.galcat.colnames)

# Calculate the radial profile of the cluster.
oguri_cluster_euclidean.make_radial_profile("kpc", cosmo=cosmo)
oguri_cluster_celestial.make_radial_profile("kpc", cosmo=cosmo)
print(oguri_cluster_euclidean.profile.colnames)
print(oguri_cluster_celestial.profile.colnames)

In [None]:
fig, ax = plt.subplots(2, 1, height_ratios=[3, 1], sharex=True)

ax[0].errorbar(
    oguri_cluster_euclidean.profile["radius"],
    oguri_cluster_euclidean.profile["gt"],
    yerr=oguri_cluster_euclidean.profile["gt_err"],
    alpha=0.5,
    marker=".",
    color="tab:red",
    label="euclidean",
)
ax[1].errorbar(
    oguri_cluster_euclidean.profile["radius"],
    oguri_cluster_euclidean.profile["gx"],
    yerr=oguri_cluster_euclidean.profile["gx_err"],
    marker=".",
    alpha=0.5,
    color="tab:red",
)

ax[0].errorbar(
    oguri_cluster_celestial.profile["radius"] * 1.02,
    oguri_cluster_celestial.profile["gt"],
    yerr=oguri_cluster_celestial.profile["gt_err"],
    alpha=0.3,
    marker=".",
    color="tab:blue",
    label="celestial",
)
ax[1].errorbar(
    oguri_cluster_celestial.profile["radius"] * 1.02,
    oguri_cluster_celestial.profile["gx"],
    yerr=oguri_cluster_celestial.profile["gx_err"],
    alpha=0.3,
    marker=".",
    color="tab:blue",
)

ax[0].legend()
ax[0].set_xscale("log")
ax[1].set_xlabel("R [kpc]")
ax[0].set_ylabel("$g_t$")
ax[1].set_ylabel("$g_x$")

plt.subplots_adjust(hspace=0)
plt.show()

### => As expected, we only see a clear signal in the `euclidean` cluster

### Example source galaxies from HSC Y3

This dataset is a simplified version of HSC Y3 data (GAMA15H), post-processed by Tomomi Sunayama for testing purposes.  The pre-processed data is already public. These catalogs assume a **celestial** coordinate system.

In [None]:
hsc_cluster_cat = np.genfromtxt(
    data_coords_dir + "/GAMA15H/redmapper_dr8_GAMA15H.txt",
    dtype=np.dtype(
        [("ra", np.float64), ("dec", np.float64), ("z", np.float64), ("richness", np.float64)]
    ),
)

hsc_source_cat = fits.getdata(data_coords_dir + "/GAMA15H/GAMA15H_tutorial.fits")

cl = hsc_cluster_cat[0]

Here, we use a KDTree implementation in scipy to extract the background source galaxies for the first galaxy cluster in the dataset.

In [None]:
source1 = hsc_source_cat[hsc_source_cat["photoz"] > (cl["z"] + 0.3)]
tree = spatial.cKDTree(np.array((source1["ra"], source1["dec"])).T)
sel = tree.query_ball_point([cl["ra"], cl["dec"]], 3)
bg = source1[sel]
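
The ball query above treats (RA, Dec) as flat coordinates in degrees and keeps every source within 3 degrees of the cluster position. A brute-force equivalent makes this explicit. The sketch below is for illustration only: the function name and the toy coordinates are assumptions, and, like the tree-based selection, it ignores the $\cos\delta$ factor and RA wrap-around, which is a reasonable approximation only for fields near the celestial equator and away from RA = 0.

```python
import math


def indices_within_radius(ra, dec, ra_center, dec_center, radius_deg):
    """Indices of points whose flat (ra, dec) distance to the centre is <= radius_deg."""
    return [
        index
        for index, (r, d) in enumerate(zip(ra, dec))
        if math.hypot(r - ra_center, d - dec_center) <= radius_deg
    ]


# Toy source positions in degrees.
ra = [215.0, 216.5, 219.5, 215.2]
dec = [0.0, 1.0, 0.0, -2.9]

selected = indices_within_radius(ra, dec, 215.0, 0.0, 3.0)
print(selected)
```

A KDTree returns the same set of indices (possibly in a different order) but scales much better for catalogs with many sources.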

In [None]:
hsc_galaxies_euclidean = clmm.GCData(
    [bg["RA"], bg["Dec"], bg["e1"], bg["e2"], bg["photoz"], bg["weight"]],
    names=["ra", "dec", "e1", "e2", "z", "w_ls"],
    meta={"coordinate_system": "euclidean"},
)

hsc_galaxies_celestial = clmm.GCData(
    [bg["RA"], bg["Dec"], bg["e1"], bg["e2"], bg["photoz"], bg["weight"]],
    names=["ra", "dec", "e1", "e2", "z", "w_ls"],
    meta={"coordinate_system": "celestial"},
)

hsc_cluster_euclidean = clmm.GalaxyCluster(
    "Euclidean HSC cluster",
    cl["ra"],
    cl["dec"],
    cl["z"],
    hsc_galaxies_euclidean,
)

hsc_cluster_celestial = clmm.GalaxyCluster(
    "Celestial HSC cluster",
    cl["ra"],
    cl["dec"],
    cl["z"],
    hsc_galaxies_celestial,
)

# Compute the tangential and cross ellipticity components of the source galaxies.
hsc_cluster_euclidean.compute_tangential_and_cross_components(add=True)
hsc_cluster_celestial.compute_tangential_and_cross_components(add=True)
print(hsc_cluster_euclidean.galcat.colnames)
print(hsc_cluster_celestial.galcat.colnames)

# Calculate the radial profile of the cluster.
hsc_cluster_euclidean.make_radial_profile("kpc", cosmo=cosmo)
hsc_cluster_celestial.make_radial_profile("kpc", cosmo=cosmo)
print(hsc_cluster_euclidean.profile.colnames)
print(hsc_cluster_celestial.profile.colnames)

In [None]:
fig, ax = plt.subplots(2, 1, height_ratios=[3, 1], sharex=True)

ax[0].errorbar(
    hsc_cluster_euclidean.profile["radius"],
    hsc_cluster_euclidean.profile["gt"],
    yerr=hsc_cluster_euclidean.profile["gt_err"],
    alpha=0.5,
    marker=".",
    color="tab:red",
    label="euclidean",
)
ax[1].errorbar(
    hsc_cluster_euclidean.profile["radius"],
    hsc_cluster_euclidean.profile["gx"],
    yerr=hsc_cluster_euclidean.profile["gx_err"],
    marker=".",
    alpha=0.5,
    color="tab:red",
)

ax[0].errorbar(
    hsc_cluster_celestial.profile["radius"] * 1.02,
    hsc_cluster_celestial.profile["gt"],
    yerr=hsc_cluster_celestial.profile["gt_err"],
    alpha=0.3,
    marker=".",
    color="tab:blue",
    label="celestial",
)
ax[1].errorbar(
    hsc_cluster_celestial.profile["radius"] * 1.02,
    hsc_cluster_celestial.profile["gx"],
    yerr=hsc_cluster_celestial.profile["gx_err"],
    alpha=0.3,
    marker=".",
    color="tab:blue",
)

ax[0].legend()
ax[0].set_xscale("log")
ax[1].set_xlabel("R [kpc]")
ax[0].set_ylabel("$g_t$")
ax[1].set_ylabel("$g_x$")

plt.subplots_adjust(hspace=0)
plt.show()

### => As expected, we only see a clear signal in the `celestial` cluster