# Measuring peak coordinates for LoVoCCS clusters

One of the first stages of our analysis of the LoVoCCS sample was to examine the MCXC coordinates (MCXC being the catalog from which LoVoCCS was selected) in the context of more modern telescopes, and to replace any coordinate that was significantly offset from the center of the ICM emission with a better 'starting coordinate'. This also included adding new LoVoCCS sample entries where the original MCXC cluster actually has multiple subcomponents.

This notebook uses these 'starting coordinates', our processed XMM observations, and XGA functionality to measure the X-ray peak coordinate of each of our galaxy clusters. The peak is one of the most commonly used definitions of the central position of X-ray emission from a galaxy cluster (in another notebook we measure the ICM centroid, another popular central coordinate measure).

## Import Statements

In [5]:
import pandas as pd
pd.set_option('display.max_columns', 500)
import numpy as np
from astropy.units import Quantity, UnitConversionError
from astropy.cosmology import LambdaCDM
from shutil import rmtree
import os
from matplotlib import pyplot as plt

import xga
temp_dir = xga.OUTPUT
actual_dir = temp_dir.split('notebooks/')[0]+'notebooks/xga_output/'
xga.OUTPUT = actual_dir
xga.utils.OUTPUT = actual_dir
# XGA currently sets up an xga_output directory in the current working directory, so I remove it to keep things clean
if os.path.exists('xga_output'):
    rmtree('xga_output')
from xga.samples import ClusterSample

# This is a bit cheeky, but suppresses the warnings that XGA spits out (they are 
#  useful, but not when I'm trying to present this notebook on GitHub)
import warnings
warnings.filterwarnings('ignore')

# Set up a variable that controls how long individual XSPEC fits are allowed to run
timeout = Quantity(6, 'hr')

%matplotlib inline

## Setting up necessary directories

Here we ensure that the directories we need to store the outputs in have been created:

In [None]:
# exist_ok=True makes each call a no-op if the directory already exists
for out_dir in ["../../outputs/figures/positions/",
                "../../outputs/coordinates/",
                "../../outputs/cluster_visualisations/peak_coord_meas/"]:
    os.makedirs(out_dir, exist_ok=True)

## Reading in the sample

We read in the LoVoCCS sample relevant to the current work:

In [6]:
samp = pd.read_csv("../../sample_files/X-LoVoCCSI.csv")
samp['LoVoCCS_name'] = samp['LoVoCCSID'].apply(lambda x: "LoVoCCS-" + str(x))
samp

| Unnamed: 0 | LoVoCCSID | Name | start_ra | start_dec | MCXC_Redshift | MCXC_R500 | MCXC_RA | MCXC_DEC | manual_xray_ra | manual_xray_dec | LoVoCCS_name |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | A2029 | 227.734300 | 5.745471 | 0.0766 | 1.3344 | 227.73000 | 5.720000 | 227.734300 | 5.745471 | LoVoCCS-1 |
| 1 | 2 | A401 | 44.740000 | 13.580000 | 0.0739 | 1.2421 | 44.74000 | 13.580000 | | | LoVoCCS-2 |
| 2 | 4 | A85 | 10.458750 | -9.301944 | 0.0555 | 1.2103 | 10.45875 | -9.301944 | | | LoVoCCS-4 |
| 3 | 5 | A3667 | 303.157313 | -56.845978 | 0.0556 | 1.1990 | 303.13000 | -56.830000 | 303.157313 | -56.845978 | LoVoCCS-5 |
| 4 | 7 | A3827 | 330.480000 | -59.950000 | 0.0980 | 1.1367 | 330.48000 | -59.950000 | | | LoVoCCS-7 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 61 | 121 | A3128 | 52.466189 | -52.580728 | 0.0624 | 0.8831 | 52.50000 | -52.600000 | 52.466189 | -52.580728 | LoVoCCS-121 |
| 62 | 122 | A1023 | 157.000000 | -6.800000 | 0.1176 | 0.8553 | 157.00000 | -6.800000 | | | LoVoCCS-122 |
| 63 | 123 | A3528 | 193.670000 | -29.220000 | 0.0544 | 0.8855 | 193.67000 | -29.220000 | | | LoVoCCS-123 |
| 64 | 131 | A761 | 137.651250 | -10.581111 | 0.0916 | 0.8627 | 137.65125 | -10.581111 | | | LoVoCCS-131 |


## Defining an XGA ClusterSample

We are going to use the [XGA hierarchical clustering peak finder](https://xga.readthedocs.io/en/latest/notebooks/techniques/hierarchical_clustering_peak_finding.html), which is built into [the RateMap](https://xga.readthedocs.io/en/latest/xga.products.html#xga.products.phot.RateMap.clustering_peak) class, the XGA Python class that provides a convenient interface to count-rate maps. However, we won't call the count-rate map method directly; instead we'll declare a cluster sample and use the hierarchical peak finder that way. This approach has two advantages:

**(1)** declaring the cluster sample will assemble a mask to remove contaminating sources from each cluster's observations, which may improve the performance of the peak finder; and **(2)** the GalaxyCluster call to the peak finder is iterative, and will try to converge on the best value.
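
The two advantages above can be pictured with a toy, NumPy-only sketch. It is not XGA's implementation: the function name `iterative_peak`, the aperture radius, and the iteration limit are all hypothetical, and a bare brightest-pixel search stands in for the hierarchical clustering step. The sketch zeroes pixels flagged by a contaminant mask, then repeatedly looks for the brightest pixel within an aperture around the current position until that position stops moving.

```python
import numpy as np


def iterative_peak(rate_map, start_xy, mask, radius=10, max_iter=20):
    """Toy iterative peak search (hypothetical helper, not part of XGA).

    rate_map : 2D array of count rates, indexed as [y, x]
    start_xy : (x, y) starting pixel
    mask     : 2D array, 1 where data are kept and 0 where a contaminant was removed
    """
    masked = rate_map * mask
    yy, xx = np.indices(masked.shape)
    cur_x, cur_y = start_xy
    for _ in range(max_iter):
        # Only consider pixels inside a circular aperture around the current position
        in_aperture = (xx - cur_x) ** 2 + (yy - cur_y) ** 2 <= radius ** 2
        searchable = np.where(in_aperture, masked, -np.inf)
        new_y, new_x = np.unravel_index(np.argmax(searchable), masked.shape)
        if (new_x, new_y) == (cur_x, cur_y):
            # The brightest pixel is where we already are, so the search has converged
            break
        cur_x, cur_y = new_x, new_y
    return int(cur_x), int(cur_y)


# Synthetic image: a smooth 'cluster' centred at (x=60, y=40), plus a bright
#  point source at (x=50, y=52) that lies close to the starting position
yy, xx = np.indices((100, 100))
image = np.exp(-((xx - 60) ** 2 + (yy - 40) ** 2) / (2 * 8.0 ** 2))
image[52, 50] = 5.0

# Mask that removes a small box around the point source
mask = np.ones_like(image)
mask[50:55, 48:53] = 0

peak_masked = iterative_peak(image, (45, 50), mask)
peak_unmasked = iterative_peak(image, (45, 50), np.ones_like(image))
print(peak_masked, peak_unmasked)
```

In this synthetic case the masked search walks from the starting position to the cluster centre over two iterations, whereas the unmasked search locks onto the point source and stays there.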

We use the 'start_ra' and 'start_dec' columns of the sample, which act as good starting positions for our analyses of these clusters (they will contain the MCXC position column values if we didn't manually adjust the position, and the manual_xray_* column values if we did).
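
The relationship between the coordinate columns described above can be expressed as a simple fallback. This is a sketch of logic consistent with that description, not necessarily how the sample file was actually built; the `toy` DataFrame is a hypothetical three-row stand-in using values copied from the table shown earlier.

```python
import numpy as np
import pandas as pd

# Three rows copied from the sample table (A2029, A401, A3667)
toy = pd.DataFrame({
    "Name": ["A2029", "A401", "A3667"],
    "MCXC_RA": [227.73, 44.74, 303.13],
    "MCXC_DEC": [5.72, 13.58, -56.83],
    "manual_xray_ra": [227.7343, np.nan, 303.157313],
    "manual_xray_dec": [5.745471, np.nan, -56.845978],
})

# Prefer the manual X-ray position; fall back to the MCXC position where it is missing
toy["start_ra"] = toy["manual_xray_ra"].fillna(toy["MCXC_RA"])
toy["start_dec"] = toy["manual_xray_dec"].fillna(toy["MCXC_DEC"])
print(toy[["Name", "start_ra", "start_dec"]])
```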

Because we define the ClusterSample with `use_peak=True` and `peak_find_method='hierarchical'`, the peaks are calculated automatically during declaration. We have also set `clean_obs=False`, which means that **no** checks will be performed on observations to make sure that a minimum fraction of $R_{500}$ falls on them - this shouldn't be necessary for this analysis.

In [None]:
srcs = ClusterSample(samp['start_ra'].values, samp['start_dec'].values, samp['MCXC_Redshift'].values, 
                     samp['LoVoCCS_name'].values, r500=Quantity(samp['MCXC_R500'].values, 'Mpc'), use_peak=True,
                     peak_find_method='hierarchical', clean_obs=False)

In [None]:
srcs.info()

## Visualise the offset distribution

We can use a built-in method of the ClusterSample to produce a histogram showing the measured offsets between the user-defined and peak coordinates; in fact we'll make two versions, the first in angular distance:

In [None]:
srcs.view_offset_dist('arcmin', 
                      save_path="../../outputs/figures/positions/startcoord_peakcoord_offsetdist_arcmin.pdf")

The second in proper distance:

In [None]:
srcs.view_offset_dist('kpc', 
                      save_path="../../outputs/figures/positions/startcoord_peakcoord_offsetdist_kpc.pdf")
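
To make the two units concrete, the sketch below computes a single offset in both arcminutes and proper kpc using only NumPy. It is an illustration rather than what `view_offset_dist` does internally: the helper names are hypothetical, and the flat LambdaCDM cosmology ($H_0 = 70$ km s$^{-1}$ Mpc$^{-1}$, $\Omega_{\rm m} = 0.3$) is an assumption. As input we use the MCXC and start coordinates of A2029 from the sample table, purely because they are a real pair of positions for which a redshift is available.

```python
import numpy as np

C_KM_S = 299792.458  # speed of light in km/s


def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = np.radians([ra1, dec1, ra2, dec2])
    hav = (np.sin((dec2 - dec1) / 2) ** 2
           + np.cos(dec1) * np.cos(dec2) * np.sin((ra2 - ra1) / 2) ** 2)
    return np.degrees(2 * np.arcsin(np.sqrt(hav)))


def ang_diam_dist_mpc(z, h0=70.0, om=0.3, n_step=10001):
    """Angular diameter distance in Mpc for an ASSUMED flat LambdaCDM cosmology."""
    zs = np.linspace(0.0, z, n_step)
    inv_e = 1.0 / np.sqrt(om * (1 + zs) ** 3 + (1 - om))
    # Trapezoidal integration of 1/E(z) gives the comoving distance in units of c/H0
    comoving = (C_KM_S / h0) * np.sum((inv_e[1:] + inv_e[:-1]) / 2 * np.diff(zs))
    return comoving / (1 + z)


# A2029: MCXC position, start position, and MCXC redshift from the sample table
sep_deg = angular_sep_deg(227.73, 5.72, 227.7343, 5.745471)
sep_arcmin = sep_deg * 60
# Small-angle conversion: proper transverse distance = angle [rad] * D_A
sep_kpc = np.radians(sep_deg) * ang_diam_dist_mpc(0.0766) * 1000
print(f"{sep_arcmin:.2f} arcmin, {sep_kpc:.0f} kpc")
```

The same angular offset corresponds to a larger proper distance for a more distant cluster, which is why the two histograms can have quite different shapes.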

## Examine the peak coordinates

We can use the view methods of the XGA RateMap class to visualise the start coordinates and peak coordinates, as well as the 'point clusters' that the hierarchical peak finding method identified and used. This gives some visual context to the peak coordinates, and helps us to make sure that they all look sensible - we also save the visualisations to disk:

In [None]:
for src in srcs:
    # Fetch the combined ratemap that was used for peak identification
    cur_rt = src.get_combined_ratemaps()
    cur_chos_pnt_clst = src.point_clusters[0]
    cur_oth_pnt_clst = src.point_clusters[1]
    cur_int_mask = src.get_interloper_mask()

    fig, ax_arr = plt.subplots(ncols=2, figsize=(14, 7))
    for ax in ax_arr:
        # Hide the tick marks on each panel (plt.tick_params would only act on the current axes)
        ax.tick_params(left=False, right=False, top=False, bottom=False)

    cur_rt.get_view(ax_arr[0], Quantity([src.peak, src.ra_dec]), cur_int_mask, zoom_in=True)

    cur_rt.get_view(ax_arr[1], chosen_points=cur_chos_pnt_clst, other_points=cur_oth_pnt_clst, zoom_in=True)
    plt.tight_layout()
    
    file_name = "{n}_xmm_peak_search_vis.pdf".format(n=src.name)
    plt.savefig("../../outputs/cluster_visualisations/peak_coord_meas/"+file_name)
    plt.show()
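
The 'point clusters' shown in the right-hand panels can be pictured with the following sketch of the general idea behind hierarchical clustering peak finding. It is not XGA's code: the function name `clustering_peak_sketch`, the fraction of pixels kept (`top_frac`), and the linkage distance (`max_dist`) are assumptions chosen for illustration. The brightest pixels are grouped with SciPy's `fclusterdata`, the most populous group is taken to be the cluster emission, and the peak is the brightest pixel within that group.

```python
import numpy as np
from scipy.cluster.hierarchy import fclusterdata


def clustering_peak_sketch(rate_map, top_frac=0.05, max_dist=5.0):
    """Illustrative hierarchical-clustering peak finder (hypothetical, not XGA's)."""
    # Pixel coordinates of the brightest top_frac of the map
    order = np.argsort(rate_map, axis=None)[::-1]
    n_keep = max(int(top_frac * rate_map.size), 2)
    ys, xs = np.unravel_index(order[:n_keep], rate_map.shape)
    points = np.column_stack([xs, ys])

    # Single-linkage clustering: points closer than max_dist end up in the same group
    labels = fclusterdata(points, max_dist, criterion="distance")
    uniq, counts = np.unique(labels, return_counts=True)
    chosen_label = uniq[np.argmax(counts)]

    chosen = points[labels == chosen_label]
    others = [points[labels == lab] for lab in uniq if lab != chosen_label]

    # The peak is the brightest pixel in the most populous point cluster
    vals = rate_map[chosen[:, 1], chosen[:, 0]]
    peak = chosen[np.argmax(vals)]
    return (int(peak[0]), int(peak[1])), chosen, others


# Synthetic image: extended emission centred at (x=60, y=40) plus two hot pixels
yy, xx = np.indices((100, 100))
image = np.exp(-((xx - 60) ** 2 + (yy - 40) ** 2) / (2 * 8.0 ** 2))
image[10, 10] = 5.0
image[85, 90] = 3.0

peak, chosen, others = clustering_peak_sketch(image)
naive_y, naive_x = np.unravel_index(np.argmax(image), image.shape)
print(peak, (int(naive_x), int(naive_y)), len(chosen), len(others))
```

A plain brightest-pixel search picks the hot pixel in this example, while the clustering approach ignores the two isolated bright pixels because each forms a point cluster of its own.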

## Saving the peak coordinates

We save the peak coordinates to a truncated sample file, containing just the cluster names and the measured peak positions:

In [None]:
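# srcs.peaks is assumed to be an (N, 2) astropy Quantity of RA and Dec; taking .value strips the
#  units so that the coordinates are stored as plain floats alongside the cluster names
peak_vals = srcs.peaks.value
out_df = pd.DataFrame({'LoVoCCS_name': srcs.names, 'peak_ra': peak_vals[:, 0], 'peak_dec': peak_vals[:, 1]})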
out_df.to_csv("../../outputs/coordinates/xmm_peak_coords.csv", index=False)
out_df