# Matching catalogs based on square boxes (simple)
Match two catalogs using square boxes, with the whole process driven by a configuration dictionary.

In [None]:
%load_ext autoreload
%autoreload 2

## ClCatalogs
Start with some input data. Each cluster is given a box with a half-width of 0.01 degrees in RA and DEC:

In [None]:
import numpy as np
from astropy.table import Table

input1 = Table(
    {
        "ID": [f"CL{i}" for i in range(5)],
        "RA": [0.0, 0.0001, 0.00011, 25, 20],
        "DEC": [0.0, 0.0, 0.0, 0.0, 0.0],
        "Z": [0.2, 0.3, 0.25, 0.4, 0.35],
        "MASS": [10**13.5, 10**13.5, 10**13.3, 10**13.8, 10**14],
    }
)
input2 = Table(
    {
        "ID": ["CL0", "CL1", "CL2", "CL3"],
        "RA": [0.0, 0.0001, 0.00011, 25],
        "DEC": [0.0, 0, 0, 0],
        "Z": [0.3, 0.2, 0.25, 0.4],
        "MASS": [10**13.3, 10**13.4, 10**13.5, 10**13.8],
    }
)
for col in ("RA", "DEC"):
    input1[f"{col}_MAX"] = input1[col] + 0.01
    input1[f"{col}_MIN"] = input1[col] - 0.01
    input2[f"{col}_MAX"] = input2[col] + 0.01
    input2[f"{col}_MIN"] = input2[col] - 0.01
display(input1)
display(input2)

Create two `ClCatalog` objects. They have the same properties as `astropy` tables, with additional functionality. You can tag the main properties of the catalog, or have columns with those names (see `catalogs.ipynb` for details). For box matching, the main tags/columns to include are:
- `id` - if not included, one will be assigned
- `ra_min`, `ra_max` (in degrees) - necessary
- `dec_min`, `dec_max` (in degrees) - necessary
- `z` - necessary if used as matching criteria

In [None]:
from clevar.catalog import ClCatalog

tags = {"id": "ID", "ra": "RA", "dec": "DEC", "z": "Z", "mass": "MASS"}
c1 = ClCatalog("Cat1", data=input1, tags=tags)
c2 = ClCatalog("Cat2", data=input2, tags=tags)

# Format for nice display
for c in ("ra", "dec", "z", "ra_min", "dec_min", "ra_max", "dec_max"):
    c1[c].info.format = ".2f"
    c2[c].info.format = ".2f"
for c in ("mass",):
    c1[c].info.format = ".2e"
    c2[c].info.format = ".2e"
display(c1)
display(c2)

The `ClCatalog` object can also be read directly from a file.
For details, see <a href='catalogs.ipynb'>catalogs.ipynb</a>.

## Matching
Import `BoxMatch` and create an object for matching:

In [None]:
from clevar.match import BoxMatch

mt = BoxMatch()

Prepare the configuration. The main values are:

- `type`: Type of matching to be considered. Can be a simple match of ClCatalog1->ClCatalog2 (`cat1`), ClCatalog2->ClCatalog1 (`cat2`) or cross matching.
- `metric`: Metric to be used for matching. Can be `GIoU` (generalized Intersection over Union) or `IoA*` (Intersection over Area, with area choice in [`min`, `max`, `self`, `other`]).
- `metric_cut`: Minimum value of metric for match.
- `rel_area`: Minimum relative size of area for match.
- `preference`: In cases where there are multiple matches, how the best candidate will be chosen. Options are: `'more_massive'`, `'angular_proximity'`, `'redshift_proximity'`, `'shared_member_fraction'`, `'GIoU'` (generalized Intersection over Union), `'IoA*'` (Intersection over Area, with area choice in `min`, `max`, `self`, `other`).
- `verbose`: Print result for individual matches (default=`True`).
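To make the metrics above concrete, here is a minimal sketch of how GIoU and the Intersection over Area variants can be computed for two axis-aligned boxes. It is not `ClEvaR`'s implementation: the function name `box_metrics`, the dictionary keys, and the flat-sky treatment of the boxes (no spherical correction) are all assumptions made for illustration.

```python
def box_metrics(box_a, box_b):
    """Overlap metrics for two axis-aligned boxes.

    Each box is (ra_min, ra_max, dec_min, dec_max) in degrees and is treated
    as a flat rectangle (assumption of this sketch, not ClEvaR's code).
    """

    def area(box):
        return (box[1] - box[0]) * (box[3] - box[2])

    # Overlap along each axis, clipped at zero when the boxes are disjoint
    d_ra = max(0.0, min(box_a[1], box_b[1]) - max(box_a[0], box_b[0]))
    d_dec = max(0.0, min(box_a[3], box_b[3]) - max(box_a[2], box_b[2]))
    inter = d_ra * d_dec
    area_a, area_b = area(box_a), area(box_b)
    union = area_a + area_b - inter

    # Area of the smallest box that encloses both inputs
    hull_ra = max(box_a[1], box_b[1]) - min(box_a[0], box_b[0])
    hull_dec = max(box_a[3], box_b[3]) - min(box_a[2], box_b[2])
    hull = hull_ra * hull_dec

    iou = inter / union
    return {
        "IoU": iou,
        # GIoU penalizes the empty space inside the enclosing box
        "GIoU": iou - (hull - union) / hull,
        # Keys below are labels chosen for this sketch
        "IoA_min": inter / min(area_a, area_b),
        "IoA_max": inter / max(area_a, area_b),
        "IoA_self": inter / area_a,
        "IoA_other": inter / area_b,
    }


# Boxes of half-width 0.01 deg centred at RA = 0, 0.0001 and 25 deg
box0 = (-0.01, 0.01, -0.01, 0.01)
near = box_metrics(box0, (-0.0099, 0.0101, -0.01, 0.01))
far = box_metrics(box0, (24.99, 25.01, -0.01, 0.01))
print(near["GIoU"], far["GIoU"])
```

Nearly coincident boxes give a GIoU close to 1, while disjoint boxes give a negative GIoU even though their plain IoU is zero, which is why GIoU can still rank candidates that do not overlap.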

We also need to provide some specific configuration for each catalog with:

- `delta_z`: Defines redshift window for matching. The possible values are:
  - `'cat'`: uses redshift properties of the catalog
  - `'spline.filename'`: interpolates data in `'filename'` assuming (z, zmin, zmax) format
  - `float`: uses `delta_z*(1+z)`
  - `None`: does not use z
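As an illustration of the `float` option, the sketch below builds a redshift window of half-width `delta_z*(1+z)` around each cluster and tests whether two windows overlap. The symmetric window, the overlap criterion, the helper names, and the value `delta_z=0.05` are assumptions of this sketch rather than a description of `ClEvaR`'s internals.

```python
def redshift_window(z, delta_z):
    """Return (zmin, zmax) for a window of half-width delta_z * (1 + z).

    Assumes the window is symmetric about z (sketch only).
    """
    half_width = delta_z * (1.0 + z)
    return z - half_width, z + half_width


def windows_overlap(window_1, window_2):
    """True if the two (zmin, zmax) intervals share at least one point."""
    return window_1[0] <= window_2[1] and window_2[0] <= window_1[1]


delta_z = 0.05  # illustrative value
w_a = redshift_window(0.2, delta_z)  # half-width 0.06
w_b = redshift_window(0.3, delta_z)  # half-width 0.065
w_c = redshift_window(0.4, delta_z)  # half-width 0.07

print(windows_overlap(w_a, w_b))  # windows around z=0.2 and z=0.3
print(windows_overlap(w_a, w_c))  # windows around z=0.2 and z=0.4
```

With these numbers the windows around z=0.2 and z=0.3 overlap, while those around z=0.2 and z=0.4 do not. The window grows with redshift because of the `(1+z)` factor.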

In [None]:
match_config = {
    "type": "cross",  # options are cross, cat1, cat2
    "preference": "GIoU",
    "catalog1": {"delta_z": None},
    "catalog2": {"delta_z": None},
}

Once the configuration is prepared, the whole process can be done with one call:

In [None]:
%%time
mt.match_from_config(c1, c2, match_config)

This will fill the matching columns in the catalogs:
- `mt_multi_self`: Multiple matches found
- `mt_multi_other`: Multiple matches found by the other catalog
- `mt_self`: Best candidate found
- `mt_other`: Best candidate found by the other catalog
- `mt_cross`: Best candidate found in both directions

If `preference` is one of (`GIoU`, `IoA*`), it also adds the values of `mt_self_preference` and `mt_other_preference`.
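The relation between the best-candidate columns and the cross match can be sketched with plain dictionaries: a pair counts as a cross match only when each object is the other's best candidate. The function `cross_matches` and the IDs below are hypothetical and are not the output of the matching above.

```python
def cross_matches(best_1to2, best_2to1):
    """Keep only pairs where each object is the other's best candidate."""
    return {
        id1: id2
        for id1, id2 in best_1to2.items()
        if id2 is not None and best_2to1.get(id2) == id1
    }


# Hypothetical best candidates in each direction (None means no match)
best_1to2 = {"A0": "B0", "A1": "B0", "A2": None}
best_2to1 = {"B0": "A0", "B1": None}

print(cross_matches(best_1to2, best_2to1))
```

Here both `A0` and `A1` pick `B0`, but `B0` picks `A0`, so only the pair (`A0`, `B0`) survives as a cross match.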

In [None]:
display(c1)
display(c2)

The steps of matching are stored in the catalogs and can be checked:

In [None]:
c1.show_mt_hist()

## Save and Load
The results of the matching can easily be saved and loaded using `ClEvaR` tools:

In [None]:
mt.save_matches(c1, c2, out_dir="temp", overwrite=True)

In [None]:
mt.load_matches(c1, c2, out_dir="temp")
display(c1)
display(c2)

## Getting Matched Pairs

`clevar` has built-in functionality to plot some results of the matching, such as:
- Recovery rates
- Distances (angular and redshift) of cluster centers
- Scaling relations (mass, redshift, ...)

For those cases, check the <a href='match_metrics.ipynb'>match_metrics.ipynb</a> and <a href='match_metrics_advanced.ipynb'>match_metrics_advanced.ipynb</a> notebooks.

If those do not cover your needs, you can get the matched pairs of clusters directly:

In [None]:
from clevar.match import get_matched_pairs

mt1, mt2 = get_matched_pairs(c1, c2, "cross")

These will be catalogs with the corresponding matched pairs:

In [None]:
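import matplotlib.pyplot as plt

# Compare the masses of each matched pair
plt.scatter(mt1["mass"], mt2["mass"])
plt.xlabel("Cat1 mass")
plt.ylabel("Cat2 mass")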

## Outputting matched catalogs

To save the current catalogs, you can use the built-in `write` function:

In [None]:
c1.write("c1_temp.fits", overwrite=True)

This saves the catalog with its current labels and matching information.

### Outputting matching information to original catalogs

If your input data came from initial files,
`clevar` also provides functions to create output files
that combine all the information in them with the matching results.

To add the matching information to an input catalog, use:

```
from clevar.match import output_catalog_with_matching
output_catalog_with_matching('input_catalog.fits', 'output_catalog.fits', c1)
```

- note: `input_catalog.fits` must have the same number of rows as `c1`.


To create a matched catalog containing all columns of both input catalogs, use:

```
from clevar.match import output_matched_catalog
output_matched_catalog('input_catalog1.fits', 'input_catalog2.fits',
    'output_catalog.fits', c1, c2, matching_type='cross')
```

where `matching_type` must be `cross`, `cat1` or `cat2`.

- note: `input_catalog1.fits` must have the same number of rows as `c1` (and likewise `input_catalog2.fits` and `c2`).