# Example filtering enriched metadata from KNTraP
KNTraP candidate light curves and positional information are used for:
1. Positional cross-matching with catalogues: provides information on the source itself, if catalogued, or on its environment (e.g. a nearby galaxy).
2. Light-curve processing: yields magnitude increase/decrease rates and colour.


These "metadata" can then be used to discard known stars and select the most promising transients.


This notebook allows you to create your own selection.

In [4]:
import pandas as pd
import numpy as np

Beware: as of March 9th 2022, the enriched metadata is saved as a pickle file, which allows numpy arrays to be saved properly.

In [5]:
df = pd.read_pickle('./Fink_outputs/353A_tmpl.pickle')
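To illustrate why the pickle format matters for array-valued columns such as `dmag_i`, here is a minimal sketch on a hypothetical toy table (the values and the temporary file names are made up for illustration, they are not KNTraP data). It compares a pickle round trip with a CSV round trip.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical toy table with one array-valued column (not real KNTraP data).
toy = pd.DataFrame(
    {"id": [1, 2], "dmag_i": [np.array([0.1, 0.4]), np.array([0.2])]}
)

with tempfile.TemporaryDirectory() as tmpdir:
    pickle_path = os.path.join(tmpdir, "toy.pickle")
    csv_path = os.path.join(tmpdir, "toy.csv")

    # Pickle round trip: each cell comes back as a numpy array.
    toy.to_pickle(pickle_path)
    from_pickle = pd.read_pickle(pickle_path)

    # CSV round trip: arrays are written as text and come back as strings.
    toy.to_csv(csv_path, index=False)
    from_csv = pd.read_csv(csv_path)

print(type(from_pickle.dmag_i.iloc[0]))
print(type(from_csv.dmag_i.iloc[0]))
```

After a CSV round trip the array columns would need to be parsed back from text before any numerical selection can be applied to them.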

What are the available metadata?

| key | Content | Type |
| --- | --- | --- |
|id| KNTraP id, beware this is run dependent| int |
|ra| Candidate ra forced phot | float |
|dec | Candidate dec forced phot | float |
|max_mag_i| Maximum magnitude in forced photometry i band|float |
|max_mag_g| Maximum magnitude in forced photometry g band|float |
|min_mag_i| Minimum magnitude in forced photometry i band|float |
|min_mag_g| Minimum magnitude in forced photometry g band|float |
|mean_mag_i| Mean magnitude in forced photometry i band|float |
|mean_mag_g| Mean magnitude in forced photometry g band|float |
|std_mag_i| Dispersion magnitude in forced photometry i band|float |
|std_mag_g| Dispersion magnitude in forced photometry g band|float |
|dmag_i| Delta Magnitude in i band | np.array |
|dmag_g| Delta Magnitude in g band | np.array |
|dmag_rate_i| Delta Magnitude rate (per day) in i band | np.array |
|dmag_rate_g| Delta Magnitude rate (per day) in g band | np.array |
|color| i-g color if measurements on same night (how red it is) | np.array |
|color_avg| Average color complete light-curve forced phot | float |
|ndet| Numbers of detections in forced phot | int |
|two_mags_gt_225| Are two measured magnitudes above 22.5? (useful for shallow fields)| Boolean |
|two_mags_gt_235| Are two measured magnitudes above 23.5? (useful for deep fields) | Boolean |
|ra_unforced| Candidate ra unforced phot | float | 
|dec_unforced| Candidate dec unforced phot | float | 
|max_mag_i_unforced| Same as `max_mag_i` but for unforced photometry (likewise for the `_unforced` keys below) | float |
|max_mag_g_unforced| | |
|min_mag_i_unforced| | |
|min_mag_g_unforced| | |
|mean_mag_i_unforced| | |
|mean_mag_g_unforced| | |
|std_mag_i_unforced| | |
|std_mag_g_unforced| | |
|dmag_i_unforced| | |
|dmag_g_unforced| | |
|dmag_rate_i_unforced| | |
|dmag_rate_g_unforced| | |
|color_unforced| | |
|color_avg_unforced| | |
|ndet_unforced| | |
|two_mags_gt_225_unforced| | |
|two_mags_gt_235_unforced| | |
|simbad_type| Simbad positional cross match type 5''| string |
|simbad_ctlg| Simbad positional cross match catalogue 5'' | string |
|simbad_sptype| Simbad positional cross match sptype 5'' | string |
|simbad_redshift| Simbad positional cross match redshift | float |
|gaia_DR2_source| Gaia DR2 positional cross match 2'' source | string |
|gaia_DR2_ra| Gaia DR2 positional 2'' cross match ra | float |
|gaia_DR2_dec| Gaia DR2 positional 2'' cross match dec | float |
|gaia_DR2_parallax| Gaia DR2 positional 2'' cross match parallax | float |
|gaia_DR2_parallaxerr| Gaia DR2 positional 2'' cross match parallax error | float |
|gaia_DR2_gmag| Gaia DR2 positional 2'' cross match magnitude in g | float |
|gaia_DR2_angdist| Gaia DR2 positional 2'' cross match angular distance | float |
|gaia_eDR3_source| Same as the `gaia_DR2_*` keys, but for Gaia Early Data Release 3 (eDR3) | |
|gaia_eDR3_ra| | |
|gaia_eDR3_dec| | |
|gaia_eDR3_parallax| | |
|gaia_eDR3_parallaxerr| | |
|gaia_eDR3_gmag| | |
|gaia_eDR3_angdist| | |
|USNO_source| USNO (stars) positional cross match 2'' source | string |
|USNO_angdist| USNO (stars) positional cross match 2'' angular distance | float |

# Define your filtering
Now you can define which candidates you are interested in inspecting.

A basic example is to query for candidates with at least 3 detections in forced photometry and two measured magnitudes greater than 23.5.

In [6]:
my_query_string = ("ndet>=3 & two_mags_gt_235==True")
df_out = df.query(my_query_string)

In [7]:
print(f"Our original list of {len(df)} candidates, has now been reduced to {len(df_out)}")

Our original list of 171 candidates, has now been reduced to 89
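For readers less familiar with `DataFrame.query`, the sketch below shows that the query string corresponds to an ordinary boolean mask. It runs on a hypothetical toy table (made-up values) that only reuses the `ndet` and `two_mags_gt_235` column names.

```python
import pandas as pd

# Hypothetical toy table reusing two metadata column names (made-up values).
toy = pd.DataFrame(
    {
        "id": [1, 2, 3, 4],
        "ndet": [2, 3, 5, 4],
        "two_mags_gt_235": [True, True, False, True],
    }
)

# Selection written as a query string, as in the cell above.
by_query = toy.query("ndet>=3 & two_mags_gt_235==True")

# The same selection written as an explicit boolean mask.
by_mask = toy[(toy.ndet >= 3) & toy.two_mags_gt_235]

print(by_query.id.tolist())  # → [2, 4]
```

The query-string form is convenient here because the selection can be stored, printed, and extended as plain text.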


Now you can save this list for visualization (uncomment to do so)

In [8]:
# df_out.to_csv('my_filtering',sep=";",index=False)

# More advanced queries

#### Adding SIMBAD cross-matches filtering

The SIMBAD type is "Unknown" if no matching source was found, and "Fail" if the service had an issue during the cross-match.

For extragalactic objects, the closest SIMBAD source may be the host galaxy. We suggest keeping these sources when filtering.

In [9]:
list_simbad_galaxies = [
    "galaxy",
    "Galaxy",
    "EmG",
    "Seyfert",
    "Seyfert_1",
    "Seyfert_2",
    "BlueCompG",
    "StarburstG",
    "LSB_G",
    "HII_G",
    "High_z_G",
    "GinPair",
    "GinGroup",
    "BClG",
    "GinCl",
    "PartofG",
]
keep_cds = ["Unknown", "Transient", "Fail"] + list_simbad_galaxies

In [25]:
my_query_string = ("ndet>=3 & two_mags_gt_235==True & simbad_type == @keep_cds")
df_out = df.query(my_query_string)
print(f"Our original list of {len(df)} candidates, has now been reduced to {len(df_out)}")

Our original list of 171 candidates, has now been reduced to 85
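Inside a query string, comparing a column to a list with `==` behaves like a membership test. The sketch below checks this against `Series.isin` on a hypothetical toy table (made-up values, with a shortened `keep_cds` list), and also counts which SIMBAD types were discarded, which can help to spot a type that should have been kept.

```python
import pandas as pd

# Hypothetical toy table (made-up values) with a SIMBAD type per candidate.
toy = pd.DataFrame(
    {
        "id": [1, 2, 3, 4],
        "simbad_type": ["Unknown", "Star", "Galaxy", "Fail"],
    }
)

# Shortened version of the list defined above, for illustration only.
keep_cds = ["Unknown", "Transient", "Fail", "Galaxy"]

# `column == @list` in a query string selects rows whose value is in the list.
by_query = toy.query("simbad_type == @keep_cds")
by_isin = toy[toy.simbad_type.isin(keep_cds)]

# Count the SIMBAD types that the selection discards.
dropped = toy.loc[~toy.simbad_type.isin(keep_cds), "simbad_type"].value_counts()

print(by_query.id.tolist())  # → [1, 3, 4]
print(dropped.to_dict())  # → {'Star': 1}
```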


#### Adding GAIA cross-matches filtering

You probably want to avoid inspecting catalogued stars (unless you are looking for flares!). Gaia and USNO are good catalogues for filtering out these stars.

In [30]:
my_query_string = ("ndet>=3 & two_mags_gt_235==True & simbad_type == @keep_cds & gaia_DR2_source=='Unknown' & gaia_eDR3_source=='Unknown' & USNO_source=='Unknown'")
df_out = df.query(my_query_string)
print(f"Our original list of {len(df)} candidates, has now been reduced to {len(df_out)}")

Our original list of 171 candidates, has now been reduced to 9
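When a combined selection removes most candidates, it can be useful to see how many survive each individual cut. The following is a minimal sketch of such a cut-flow on a hypothetical toy table (made-up values, including the catalogue identifiers). It assumes, as in the query above, that "Unknown" marks the absence of a catalogue match.

```python
import pandas as pd

# Hypothetical toy table (made-up values); "Unknown" means no catalogue match.
toy = pd.DataFrame(
    {
        "ndet": [3, 4, 5, 2, 6],
        "two_mags_gt_235": [True, True, True, True, False],
        "gaia_DR2_source": ["Unknown", "123", "Unknown", "Unknown", "Unknown"],
        "gaia_eDR3_source": ["Unknown", "123", "Unknown", "Unknown", "Unknown"],
        "USNO_source": ["Unknown", "Unknown", "0900-1", "Unknown", "Unknown"],
    }
)

# Individual cuts, applied one after the other.
cuts = [
    "ndet>=3",
    "two_mags_gt_235==True",
    "gaia_DR2_source=='Unknown'",
    "gaia_eDR3_source=='Unknown'",
    "USNO_source=='Unknown'",
]

remaining = toy
cutflow = []
for cut in cuts:
    remaining = remaining.query(cut)
    cutflow.append((cut, len(remaining)))

for cut, n_left in cutflow:
    print(f"{n_left:3d} candidates left after: {cut}")

# Joining the cuts with '&' gives the same final selection as the loop.
combined = toy.query(" & ".join(cuts))
```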


#### Adding a requirement for delta mag array

You may want to check whether at least one delta-magnitude measurement in the i band is greater than 0.3. To do so, create a Boolean column that is True when this requirement is met, then add it to the query:

In [33]:
df = df.assign(mymagrequirement=df.dmag_i.apply(lambda x: np.any(x>0.3)))

In [34]:
my_query_string = ("ndet>=3 & two_mags_gt_235==True & simbad_type == @keep_cds & gaia_DR2_source=='Unknown' & gaia_eDR3_source=='Unknown' & USNO_source=='Unknown' & mymagrequirement==True")

In [36]:
df_out = df.query(my_query_string)
print(f"Our original list of {len(df)} candidates, has now been reduced to {len(df_out)}")

Our original list of 171 candidates, has now been reduced to 1
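The same pattern can be wrapped in a small helper so that the threshold is easy to change. The sketch below is one possible way to do this on a hypothetical toy table (made-up values). The helper `any_above` and the column `mymagrequirement_abs` are illustrative names, not part of the KNTraP metadata. Whether a positive delta magnitude means fading or brightening depends on how `dmag_i` was computed, which is not stated here, so the sketch also shows a variant using the absolute value.

```python
import numpy as np
import pandas as pd

# Hypothetical toy table (made-up values) with an array-valued column.
toy = pd.DataFrame(
    {
        "id": [1, 2, 3],
        "dmag_i": [np.array([0.1, 0.5]), np.array([-0.4, 0.2]), np.array([])],
    }
)


def any_above(values, threshold=0.3, use_abs=False):
    """Return True if at least one element of `values` exceeds `threshold`."""
    arr = np.asarray(values, dtype=float)
    if use_abs:
        arr = np.abs(arr)
    # np.any on an empty array is False, so candidates without deltas fail.
    return bool(np.any(arr > threshold))


toy = toy.assign(
    mymagrequirement=toy.dmag_i.apply(any_above),
    mymagrequirement_abs=toy.dmag_i.apply(lambda x: any_above(x, use_abs=True)),
)

selected = toy.query("mymagrequirement==True")
print(selected.id.tolist())  # → [1]
```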
