# Example use-case: cross-match ZTF BTS and NGC

Here we demonstrate how to cross-match [Zwicky Transient Facility](https://ztf.caltech.edu) (ZTF) [Bright Transient Survey](https://sites.astro.caltech.edu/ztf/bts) (BTS) and [New General Catalogue](https://en.wikipedia.org/wiki/New_General_Catalogue) (NGC) using LSDB.

In [None]:
# Install astroquery; comment out this line if you already have it
!pip install --quiet astroquery

In [None]:
import lsdb
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from astropy.coordinates import SkyCoord
from astropy.table import Table
from astroquery.vizier import Vizier
from dask.distributed import Client

### Download ZTF BTS and convert coordinates to degrees

In [None]:
%%time

df_ztf_bts = pd.read_csv(
    "http://sites.astro.caltech.edu/ztf/bts/explorer.php?format=csv",
    na_values="-",
)
coord = SkyCoord(df_ztf_bts["RA"], df_ztf_bts["Dec"], unit=("hourangle", "deg"))
df_ztf_bts["ra_deg"], df_ztf_bts["dec_deg"] = coord.ra.deg, coord.dec.deg
df_ztf_bts.head()
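
For reference, the conversion that `SkyCoord(..., unit=("hourangle", "deg"))` performs can be sketched in plain Python. The helper names `hms_to_deg` and `dms_to_deg` are hypothetical, and the sketch assumes colon-separated sexagesimal strings. In the notebook itself `SkyCoord` remains the better choice, since it accepts more input formats.

```python
def hms_to_deg(hms):
    """Convert an 'hh:mm:ss.s' right ascension string to degrees."""
    hours, minutes, seconds = (float(part) for part in hms.split(":"))
    # One hour of right ascension corresponds to 15 degrees
    return 15.0 * (hours + minutes / 60.0 + seconds / 3600.0)


def dms_to_deg(dms):
    """Convert a '[+-]dd:mm:ss.s' declination string to degrees."""
    text = dms.strip()
    # Read the sign from the string so that values like '-00:30:00' stay negative
    sign = -1.0 if text.startswith("-") else 1.0
    degrees, minutes, seconds = (float(part) for part in text.lstrip("+-").split(":"))
    return sign * (degrees + minutes / 60.0 + seconds / 3600.0)


# Made-up sample position, not taken from either catalog
ra_deg = hms_to_deg("11:30:00.0")
dec_deg = dms_to_deg("-00:30:00")
print(ra_deg, dec_deg)
```

Taking the sign from the string rather than from the parsed degrees matters for declinations between -1 and 0 degrees, where the degrees field alone parses as zero.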

### Download NGC with `astroquery`

Please install astroquery first with `pip install astroquery` or `conda install -c conda-forge astroquery`.

In [None]:
%%time

vizier = Vizier(row_limit=50_000)
tables = vizier.get_catalogs("VII/118/ngc2000")
df_ngc = tables[0].to_pandas()
coord = SkyCoord(df_ngc["RAB2000"], df_ngc["DEB2000"], unit=("hourangle", "deg"))
df_ngc["ra_deg"], df_ngc["dec_deg"] = coord.ra.deg, coord.dec.deg
df_ngc.head()

### Load both catalogs into LSDB and plan the cross-match

ZTF probes much deeper than NGC, a catalogue of galaxies compiled in the 19th century, so we keep only low-redshift ZTF transients (`redshift < 0.01`).
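
To get a feeling for what a redshift cut of 0.01 means in distance, here is a rough estimate using the low-redshift Hubble law. The value of the Hubble constant (70 km/s/Mpc) and the function name `hubble_distance_mpc` are assumptions made for this illustration and do not come from the notebook.

```python
# Low-redshift Hubble law: distance ~ c * z / H0
SPEED_OF_LIGHT_KM_S = 299_792.458
H0_KM_S_MPC = 70.0  # assumed value of the Hubble constant


def hubble_distance_mpc(redshift, h0=H0_KM_S_MPC):
    """Approximate distance in Mpc, valid only for redshift much smaller than 1."""
    return SPEED_OF_LIGHT_KM_S * redshift / h0


z_cut = 0.01  # the redshift threshold used for ZTF BTS in this notebook
print(f"z < {z_cut} corresponds to roughly {hubble_distance_mpc(z_cut):.0f} Mpc")
```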

LSDB is built on [Dask](https://dask.org) and can run on a Dask distributed cluster. The cell below only plans the computation and does not run it.

In [None]:
%%time

ztf_bts = lsdb.from_dataframe(df_ztf_bts, ra_column="ra_deg", dec_column="dec_deg")
ngc = lsdb.from_dataframe(df_ngc, ra_column="ra_deg", dec_column="dec_deg", margin_threshold=3600)

ztf_bts = ztf_bts.query("redshift < 0.01")

matched = ztf_bts.crossmatch(ngc, radius_arcsec=1200, suffixes=("_ztf", "_ngc"))
matched
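
The `radius_arcsec` argument is an angular separation on the sky. As a rough illustration of that quantity, the sketch below computes the great-circle distance between two positions with the haversine formula, using only the standard library. The function name `angular_separation_arcsec` and the sample coordinates are made up for this example, and LSDB's internal implementation may differ.

```python
import math


def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcseconds between two positions given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, numerically stable for small separations
    hav = (
        math.sin((dec2 - dec1) / 2) ** 2
        + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2
    )
    # Clamp to 1 to guard against rounding errors before taking the arcsine
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(hav)))) * 3600


# Two made-up positions 0.2 degrees apart along the same meridian
sep = angular_separation_arcsec(150.0, 2.0, 150.0, 2.2)
print(f"{sep:.1f} arcsec, within 1200 arcsec: {sep <= 1200}")
```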

### Run LSDB pipeline

In [None]:
%%time

# Create default local cluster
with Client():
    matched_df = matched.compute()

# Let's output transient name, NGC name and angular distance between them
matched_df = matched_df[["IAUID_ztf", "Name_ngc", "_dist_arcsec", "RA_ztf", "Dec_ztf"]].sort_values(
    by=["_dist_arcsec"]
)
matched_df

Some of these matches may be false, because NGC is too shallow for this task. However, with the table sorted by cross-match distance, the first row is a supernova ([SN2022xxf](https://www.wis-tns.org/object/2022xxf)) in the nearby galaxy NGC 3705.

### Plot the host galaxy

This part is not related to LSDB and is adapted from [this PanSTARRS image tutorial](https://spacetelescope.github.io/mast_notebooks/notebooks/PanSTARRS/PS1_image/PS1_image.html).

Now let's download the host galaxy image from the PanSTARRS survey and plot it, with the SN location in the middle marked with a "+".

In [None]:
def getimages(ra, dec, size=240, filters="grizy"):
    """Query ps1filenames.py service to get a list of images

    ra, dec = position in degrees
    size = image size in pixels (0.25 arcsec/pixel)
    filters = string with filters to include
    Returns a table with the results
    """

    service = "https://ps1images.stsci.edu/cgi-bin/ps1filenames.py"
    url = f"{service}?ra={ra}&dec={dec}&size={size}&format=fits&filters={filters}"
    table = Table.read(url, format="ascii")
    return table


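def get_ps1_image(url, size=240):
    """Download a PS1 cutout and return it as a PIL image, or None on failure

    url = full fitscut.cgi request URL
    size = image size in pixels (0.25 arcsec/pixel); unused here,
           the cutout size is set in the URL itself
    """
    from io import BytesIO

    import requests
    from PIL import Image

    try:
        r = requests.get(url, timeout=60)
        # Fail on HTTP errors instead of trying to decode an error page
        r.raise_for_status()
        im = Image.open(BytesIO(r.content))
    except (requests.RequestException, OSError) as err:
        # Network problems and undecodable image data both end up here
        print(f"Can't get ps1 image: {err}")
        im = None
    return im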
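
To see what request `getimages` sends without touching the network, the query URL can be assembled with the standard library. This is only a sketch: the helper name `build_filenames_url` and the sample position are hypothetical, while the service URL and the 0.25 arcsec/pixel scale are the ones quoted in the functions above.

```python
from urllib.parse import urlencode

PS1_FILENAMES_SERVICE = "https://ps1images.stsci.edu/cgi-bin/ps1filenames.py"
PIXEL_SCALE_ARCSEC = 0.25  # PS1 cutout pixel scale quoted in the docstrings above


def build_filenames_url(ra, dec, size=240, filters="grizy"):
    """Build the ps1filenames.py query URL for a position in degrees."""
    query = urlencode({"ra": ra, "dec": dec, "size": size, "format": "fits", "filters": filters})
    return f"{PS1_FILENAMES_SERVICE}?{query}"


# Made-up position; size=1200 matches the cutout requested in the next cell
url = build_filenames_url(172.5, -0.5, size=1200)
cutout_arcmin = 1200 * PIXEL_SCALE_ARCSEC / 60
print(url)
print(f"A 1200-pixel cutout spans {cutout_arcmin:g} arcmin")
```

Using `urlencode` keeps the parameters in one dictionary and escapes any characters that are not URL-safe, which plain string formatting does not do.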

In [None]:
# Position and name of the closest match (first row after sorting by distance)
c = SkyCoord(matched_df["RA_ztf"].values[0], matched_df["Dec_ztf"].values[0], unit=("hourangle", "deg"))
ra = c.ra.degree
dec = c.dec.degree
oid = matched_df["IAUID_ztf"].values[0]
# List the PS1 images available at this position
table = getimages(ra, dec, size=1200, filters="grizy")
# Request a color JPEG cutout built from the first three listed images
url = (
    "https://ps1images.stsci.edu/cgi-bin/fitscut.cgi?"
    "ra={}&dec={}&size=1200&format=jpg&red={}&green={}&blue={}"
).format(ra, dec, table["filename"][0], table["filename"][1], table["filename"][2])
im = get_ps1_image(url)
fig, ax = plt.subplots(figsize=(7, 3))
if im is not None:
    ax.imshow(im)
    ax.set_xticks([])
    ax.set_yticks([])
    ax.scatter(np.average(plt.xlim()), np.average(plt.ylim()), marker="+", color="yellow")
    ax.set_title(oid)
    plt.show()