# Galaxy Cluster Catalogs
The main object for galaxy cluster catalogs is `ClCatalog`. It has the same properties as `astropy` tables, with additional functionality.

* [ClCatalog](#cat)
* [Creating a catalog](#creating)
  * [Create a catalog from fits files](#creating_fits)
  * [Important inputs of ClCatalog](#clcat_input)
  * [Reserved keyword arguments](#clcat_input_special)
* [Saving catalogs](#saving)
* [Accessing catalog data](#data)
* [Inbuilt function of catalogs](#funcs)
* [Adding members to cluster catalogs](#memcat)
  * [Read members from fits files](#memcat_fits)
  * [Important inputs of members catalog](#memcat_input)
  * [Reserved keyword arguments](#memcat_input_special)
  * [Saving members](#memcat_saving)

In [None]:
%load_ext autoreload
%autoreload 2


## ClCatalog<a id='cat'/>

The `ClCatalog` has the following internal attributes:
- `name`: ClCatalog name
- `data`: Table with main catalog data (ex: id, ra, dec, z) and matching data (mt_self, mt_other, mt_cross, mt_multi_self, mt_multi_other)
- `mt_input`: Table containing the necessary inputs for the match (added by Match objects)
- `size`: Number of objects in the catalog
- `id_dict`: Dictionary of indices given the object id
- `labels`: Labels of data columns for plots
- `members`: Members of clusters (optional)
- `leftover_members`: Galaxies in the input members not hosted by the cluster catalog (optional)

## Creating a catalog<a id='creating'/>
To create a catalog, you have to pass the name as the initial argument and the data columns for the table as keyword arguments:

In [None]:
from clevar import ClCatalog
cat = ClCatalog('cluster', id=['c1', 'c2'], mass=[1e13, 1e14])
cat['mass'].info.format = '.2e' # Format for nice display

`ClCatalog` will always have the matching columns (with prefix `mt_`) added:

In [None]:
display(cat)

All catalogs have an `id` column. If it is not included in the input, one will be created:

In [None]:
cat = ClCatalog('cluster', mass=[1e13, 1e14])
cat['mass'].info.format = '.2e' # Format for nice display
cat

Almost all keyword arguments will become columns of the catalog (see exceptions in [Reserved keyword arguments](#clcat_input_special)):

In [None]:
cat = ClCatalog('test name', test_column=[1, 2],
                other=[True, False], third=[None, []])
cat

The catalogs have a `labels` attribute that is used for plots. If it is not provided as an argument, a default value is assigned:

In [None]:
cat = ClCatalog('cluster', id=['c1', 'c2'], mass=[1e13, 1e14])
cat.labels

In [None]:
cat = ClCatalog('cluster', id=['c1', 'c2'], mass=[1e13, 1e14],
                labels={'id':'cluster ID', 'mass':'cluster M_200'})
cat.labels

### Create a catalog from `fits` files<a id='creating_fits'/>
Catalog objects can also be read directly from a file, by passing the fits file as the first argument, the catalog name as the second, and the names of the columns in the fits file as keyword arguments:

In [None]:
cat = ClCatalog.read('../demo/cat1.fits', 'my cluster',
                     id='ID', mass='MASS')

### Important inputs of `ClCatalog`<a id='clcat_input'/>

As shown above, `ClCatalog` can have any column in its main data table. However, there are a few key columns these catalogs must have to be used for matching:

- `id` - necessary in membership matching (must correspond to `id_cluster` in the cluster member catalog).
- `ra` (in degrees) - necessary for proximity matching.
- `dec` (in degrees) - necessary for proximity matching.
- `z` - necessary for proximity matching if used as a matching criterion (or for angular to physical conversion).
- `mass` (or mass proxy) - necessary for proximity matching if `shared_member_fraction` is used as the preference criterion for unique matches (default use in membership matching).
- `radius` - necessary for proximity matching if used as a matching criterion (also requires `radius_unit` to be passed).

### Reserved keyword arguments<a id='clcat_input_special'/>

There are some keyword arguments that have a fixed meaning and do not become columns in the cluster data table:

- `radius_unit`: can be an angular unit (`radians`, `degrees`, `arcmin`, `arcsec`), a physical unit (`Mpc`, `kpc`, `pc`), or even a mass overdensity unit (`m200b`, `m500c`). Units are case insensitive. In proximity matching, the radius is converted to an angular distance (degrees).
- `labels`: Dictionary with labels of data columns to be used in plots.
- `members`: Members of clusters, see [cluster members](#memcat) section for details.
- `members_warning`: Warn if the members catalog contains galaxies not hosted by the cluster catalog.
- `mt_input`: Table containing the necessary inputs for the match. This attribute is usually added during the matching process, but it can be passed in the `ClCatalog` construction.

Here are some examples of information being added to `mt_input` after the catalog creation:

In [None]:
from clevar.match import ProximityMatch
from clevar.cosmology import AstroPyCosmology
mt = ProximityMatch()
cosmo = AstroPyCosmology()

In [None]:
cat = ClCatalog('Cat', radius=[0.01, 0.02], radius_unit='radians')
mt.prep_cat_for_match(cat, delta_z=None, match_radius='cat')
cat.mt_input['ang']

In [None]:
cat = ClCatalog('Cat', radius=[0.01, 0.02], radius_unit='degrees')
mt.prep_cat_for_match(cat, delta_z=None, match_radius='cat')
cat.mt_input['ang']

In [None]:
cat = ClCatalog('Cat', radius=[1, 1.5], z=[.4, .5], radius_unit='mpc')
mt.prep_cat_for_match(cat, delta_z=None, match_radius='cat', cosmo=cosmo)
cat.mt_input['ang']

In [None]:
cat = ClCatalog('Cat', radius=[1e13, 1e14], z=[.4, .5], radius_unit='m200c')
mt.prep_cat_for_match(cat, delta_z=None, match_radius='cat', cosmo=cosmo)
cat.mt_input['ang']
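
For the angular units, the conversion to degrees is a fixed scale factor. The snippet below is a minimal standalone sketch of that step in plain Python, not ClEvaR's implementation: the table `TO_DEGREES` and the helper `angular_radius_to_degrees` are hypothetical names introduced only for illustration. Physical and mass overdensity units additionally need the redshift and a cosmology, which is why `cosmo` is passed in the last two cells above.

```python
import math

# Hypothetical lookup table (not part of ClEvaR): multiplicative factor
# that takes each angular unit to degrees.
TO_DEGREES = {
    "radians": 180.0 / math.pi,
    "degrees": 1.0,
    "arcmin": 1.0 / 60.0,
    "arcsec": 1.0 / 3600.0,
}


def angular_radius_to_degrees(radius, unit):
    """Convert a list of angular radii to degrees (unit is case insensitive)."""
    factor = TO_DEGREES[unit.lower()]
    return [r * factor for r in radius]


# Same radii as the `radians` cell above, and a mixed-case unit name.
print(angular_radius_to_degrees([0.01, 0.02], "radians"))
print(angular_radius_to_degrees([30.0, 90.0], "ArcMin"))
```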

## Saving catalogs<a id='saving'/>

The `ClCatalog` object has an inbuilt `write` function to save it to a `.fits` file.
This function also takes the argument `add_header`, which adds the name and labels information to the file.
If the file was saved with this argument, it can be read without a `name` argument:

In [None]:
cat = ClCatalog('cluster', id=['c1', 'c2'], mass=[1e13, 1e14],
                labels={'id':'cluster ID', 'mass':'cluster M_200'})
cat.write('cat1_with_info.fits', overwrite=True)

In [None]:
cat_temp = cat.read_full('cat1_with_info.fits')
cat_temp['mass'].info.format = '.2e' # Format for nice display
cat_temp

## Accessing catalog data<a id='data'/>

The main data table of the catalog can be accessed with `[]` operations in the same way as `astropy` tables. The output is a new `ClCatalog` object, except when only 1 row or column is requested:

In [None]:
cat['id']

In [None]:
cat['id', 'mass']

In [None]:
cat[[1, 0]]

In [None]:
cat[:1]

In [None]:
cat[0]

## Inbuilt function of catalogs<a id='funcs'/>
The `ClCatalog` object has some inbuilt functionality to facilitate the matching. `ids2inds` returns the indices of objects given an id list. Other functions are related to footprint computations, see <a href='footprint.ipynb'>footprint.ipynb</a> for information on those.

In [None]:
cat = ClCatalog('cluster', id=['c1', 'c2'], mass=[1e13, 1e14])
cat['mass'].info.format = '.2e' # Format for nice display
inds = cat.ids2inds(['c2', 'c1'])
display(cat)
display(cat[inds])
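
The idea behind looking up rows by id can be illustrated without ClEvaR. The snippet below is a plain-Python sketch, assuming the lookup works like a dictionary from object id to row index (in the spirit of the `id_dict` attribute); the names `id_to_index` and `ids_to_indices` are hypothetical and not part of the library.

```python
# Standalone sketch (plain Python, no ClEvaR): an id -> row index lookup.
ids = ["c1", "c2", "c3"]

# Hypothetical stand-in for an id dictionary: object id -> row index.
id_to_index = {obj_id: i for i, obj_id in enumerate(ids)}


def ids_to_indices(query_ids):
    """Return the row index of each requested id, in the requested order."""
    return [id_to_index[obj_id] for obj_id in query_ids]


inds = ids_to_indices(["c2", "c1"])
print(inds)  # [1, 0]
```

The returned indices follow the order of the query, so they can be used to reorder rows, as done with `cat[inds]` above.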

## Adding members to cluster catalogs<a id='memcat'/>

The members are stored as an internal table-like object of `ClCatalog`, accessed by `.members`.
This object has the following attributes:
- `name`: ClCatalog name
- `data`: Table with main catalog data (ex: id, id_cluster, ra, dec, z)
- `size`: Number of objects in the catalog
- `id_dict`: Dictionary of indices given the object id
- `labels`: Labels of data columns for plots
- `id_dict_list`: Dictionary of indices given the object id, returning lists to account for members with repeated `id`.
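
To see why a list-valued dictionary is useful, consider a member `id` that appears in more than one row (for instance, a galaxy listed under two clusters). The following plain-Python sketch builds such a mapping; it only illustrates the concept, and the variable names are hypothetical rather than ClEvaR internals.

```python
from collections import defaultdict

# Member ids as they might appear in a members table; "m1" appears twice.
member_ids = ["m1", "m2", "m1", "m3"]

# Map each id to the list of all rows where it occurs.
index_lists = defaultdict(list)
for row, member_id in enumerate(member_ids):
    index_lists[member_id].append(row)

print(dict(index_lists))  # {'m1': [0, 2], 'm2': [1], 'm3': [3]}
```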

The members can be added to the cluster object using the `add_members` function.
It has a format similar to instantiating a `ClCatalog` object, where the columns are added by keyword arguments (the key `id_cluster` is always necessary and must correspond to `id` in the main cluster catalog):

In [None]:
cat = ClCatalog('cluster', id=['c1', 'c2'], mass=[1e13, 1e14])
cat['mass'].info.format = '.2e' # Format for nice display
cat.add_members(id=['m1', 'm2', 'm3'], id_cluster=['c1', 'c2', 'c1'])
display(cat)
display(cat.members)

### Read members from `fits` files<a id='memcat_fits'/>
The members can also be read directly from a file with `read_members`, by passing the fits file as the first argument and the names of the columns in the fits file as keyword arguments:

In [None]:
cat = ClCatalog.read('../demo/cat1.fits', 'my cluster',
                     id='ID', mass='MASS')
cat.read_members('../demo/cat1_mem.fits',
                 id='ID', id_cluster='ID_CLUSTER')
cat['mass'].info.format = '.2e' # Format for nice display
display(cat)
display(cat.members)

### Important inputs of members catalog<a id='memcat_input'/>

There are a few key columns these catalogs must have to be used for matching:

- `id` - necessary in membership matching of members.
- `id_cluster` - always necessary and must correspond to `id` in the main cluster catalog.
- `ra` (in degrees) - necessary for proximity matching of members.
- `dec` (in degrees) - necessary for proximity matching of members.
- `pmem` - Probability of the galaxy being a member, must be in the range [0, 1]. If not provided, a value of 1 is assigned to all members.
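
The `pmem` rule can be sketched in plain Python as follows. This is an illustration under stated assumptions, not ClEvaR code: the helper `fill_pmem` is hypothetical, columns are represented as a dictionary of lists, and raising an error for out-of-range values is a choice made for this sketch only.

```python
def fill_pmem(members):
    """Return a copy of the member columns with a valid `pmem` column."""
    out = dict(members)
    n_members = len(out["id"])
    if "pmem" not in out:
        # No membership probability given: every galaxy is a full member.
        out["pmem"] = [1.0] * n_members
    if any(not 0.0 <= p <= 1.0 for p in out["pmem"]):
        # Sketch-only behavior for invalid probabilities.
        raise ValueError("pmem values must lie in [0, 1]")
    return out


filled = fill_pmem({"id": ["m1", "m2"], "id_cluster": ["c1", "c1"]})
print(filled["pmem"])  # [1.0, 1.0]
```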

### Reserved keyword arguments<a id='memcat_input_special'/>

There are three keyword arguments with specific uses:

- `members_consistency`: Require that all input members belong to this cluster catalog.
- `members_warning`: Raise a warning if members do not belong to this cluster catalog, and save them in the `leftover_members` attribute.
- `members_catalog`: Members catalog if available, mostly for internal use.

When `members_consistency=True`, only galaxies hosted by the cluster catalog are kept. If `members_warning=True`, a warning is raised and the galaxies not hosted are stored in `leftover_members`:

In [None]:
cat = ClCatalog('cluster', id=['c1'], mass=[1e13])
cat['mass'].info.format = '.2e' # Format for nice display
cat.add_members(id=['m1', 'm2', 'm3'], id_cluster=['c1', 'c2', 'c1'])
display(cat)
display(cat.members)
display(cat.leftover_members)
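
The split between hosted and leftover galaxies amounts to checking each member's `id_cluster` against the cluster ids. The snippet below is a plain-Python sketch of that check using the same ids as the cell above; it illustrates the logic only and does not reproduce ClEvaR's implementation.

```python
# Cluster ids present in the cluster catalog.
cluster_ids = {"c1"}

# Members as (id, id_cluster) pairs.
members = [("m1", "c1"), ("m2", "c2"), ("m3", "c1")]

# Galaxies whose host cluster is in the catalog are kept as members,
# the rest are set aside as leftovers.
hosted = [m for m in members if m[1] in cluster_ids]
leftover = [m for m in members if m[1] not in cluster_ids]

print(hosted)    # [('m1', 'c1'), ('m3', 'c1')]
print(leftover)  # [('m2', 'c2')]
```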

### Saving members<a id='memcat_saving'/>

The `members` object has an inbuilt `write` function to save it to a `.fits` file.
This function also takes the argument `add_header`, which adds the name and labels information to the file.
If the file was saved with this argument, it can be read without a `name` argument:

In [None]:
cat.members.write('mem1_with_info.fits', overwrite=True)

### Memory consumption<a id='memcat_memory'/>

IMPORTANT! Member catalogs are usually hundreds of times larger than cluster catalogs. It is therefore advised not to add them unless you need them for a specific goal (ex: membership matching). They can also lead to memory overload and make other functions slower.

To remove the members from the cluster catalog, use the `remove_members` function:

In [None]:
cat.remove_members()
print(cat.members, cat.leftover_members)