# DELIGHT test notebook

Francisco Förster et al. 2022

The **Deep Learning Identification of Galaxy Hosts in Transients (DELIGHT, Förster et al. 2022)** is a library created by the [ALeRCE broker](http://alerce.science/) to automatically identify host galaxies of transient candidates using multi-resolution images and a convolutional neural network.

The library provides a `Delight` class whose methods take the coordinates of a transient and return the most likely coordinates of its host galaxy.

To do this, the `Delight` object needs a list of object identifiers and coordinates (`oid`, `ra`, `dec`). With this information, it downloads PanSTARRS images centered on the transient positions (2 arcmin x 2 arcmin), reads their WCS solutions, builds the multi-resolution images, applies some additional preprocessing, and finally predicts the host positions by feeding the multi-resolution images to a convolutional neural network. If requested, it can also estimate each host's semi-major axis from the multi-resolution images.
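
As a rough guide to the image sizes involved, the arithmetic can be sketched as follows. The sketch assumes a pixel scale of 0.25 arcsec per pixel for the PanSTARRS cutouts and a pyramid in which every level keeps the same number of pixels while doubling the pixel scale; both are illustrative assumptions, not values read from the library.

```python
# Assumed pixel scale of the PanSTARRS cutouts (arcsec per pixel).
PIXEL_SCALE = 0.25
CUTOUT_ARCMIN = 2   # cutout side quoted in the text
NLEVELS = 5         # example number of resolution levels

# Side of the full cutout in pixels.
cutout_pixels = int(CUTOUT_ARCMIN * 60 / PIXEL_SCALE)

# Assumed pyramid: each level has the same pixel count, so the coarsest
# level must cover the whole cutout after NLEVELS - 1 factor-2 binnings.
level_pixels = cutout_pixels // 2 ** (NLEVELS - 1)

for level in range(NLEVELS):
    scale = PIXEL_SCALE * 2 ** level   # arcsec per pixel at this level
    fov = level_pixels * scale         # field of view in arcsec
    print(f"level {level}: {level_pixels} px per side, "
          f"{scale:g} arcsec/px, {fov:g} arcsec field of view")
```

Under these assumptions the finest level resolves the immediate surroundings of the transient, while the coarsest level covers the full cutout.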

Note that DELIGHT's prediction time is currently dominated by the time to download PanSTARRS images using the [panstamps service](https://readthedocs.org/projects/panstamps/). In the future, we expect that there will be services that directly provide multi-resolution images, which should be more lightweight with no significant loss of information.

**Dependencies**:

* pandas, numpy, matplotlib
* xarray (python -m pip install xarray)
* astropy (pip install astropy)
* sep (pip install sep)
* tensorflow (https://www.tensorflow.org/install/pip)
* panstamps (pip install panstamps)

In [None]:
# Add delight
#! pip install astro-delight

Load libraries

In [None]:
import os

import pandas as pd

from delight.delight import Delight

In [None]:
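try:
    import google.colab  # only importable when running inside Google Colab
    IN_COLAB = True
    # panstamps colab fix: create its config directory, then initialize it
    !mkdir -p /root/.config/panstamps
    !panstamps init
except ImportError:
    IN_COLAB = False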

In [None]:
%load_ext autoreload
%autoreload 2

# Load reference data

The reference file contains the object identifiers and coordinates (`oid`, `ra`, `dec`).

In [None]:
# data directory and file with names and coordinates
datadir = '../data'
if not IN_COLAB:
    df = pd.read_csv(os.path.join(datadir, 'testcoords.csv'))
else:
    df = pd.read_csv("https://raw.githubusercontent.com/fforster/DELIGHT/main/data/testcoords.csv")
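
Before starting the client it can help to sanity-check the input table. The following is a minimal sketch using a small in-memory table with made-up rows (the identifiers and coordinates are hypothetical, not taken from `testcoords.csv`). The declination cut assumes that the PanSTARRS 3π survey covers declinations north of about -30 degrees.

```python
import io

import pandas as pd

# Hypothetical example rows with the same columns as the reference file.
csv_text = """oid,ra,dec
example_north,150.0,2.2
example_south,30.0,-45.0
"""
example = pd.read_csv(io.StringIO(csv_text))

# The client needs these three columns.
missing = {"oid", "ra", "dec"} - set(example.columns)
if missing:
    raise ValueError(f"missing columns: {sorted(missing)}")

# Coordinates are expected in decimal degrees.
valid = example.ra.between(0, 360) & example.dec.between(-90, 90)

# Assumed survey limit: objects south of about -30 deg have no PanSTARRS images.
in_footprint = example.dec > -30

print(example[valid & in_footprint])
```

Rows that fail either check can be dropped before building the client, which avoids waiting on downloads that cannot succeed.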

# Start DELIGHT

This requires a data directory and arrays of object identifiers, right ascensions, and declinations.

In [None]:
# start Delight client
dclient = Delight(datadir, df.oid.values, df.ra.values, df.dec.values)

Download data and get pixel coordinates

In [None]:
# download missing data (will check for existing files first)
dclient.download()

In [None]:
# Check downloaded files
os.listdir(os.path.join(datadir, "fits"))

In [None]:
# check the shape of the dataframe
dclient.df.shape

Read WCS solutions to move between pixel and celestial coordinates

In [None]:
# get coordinates using WCS solution (we turn warnings off temporarily)
dclient.get_pix_coords()
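
For intuition about what a WCS solution does near the image center, a flat-sky approximation relates pixel offsets to sky offsets. The sketch below is not the library's method: `pixel_offset_to_sky` is a hypothetical helper that assumes a pixel scale of 0.25 arcsec per pixel and a north-up, east-left orientation, and it ignores projection effects that a real WCS solution handles.

```python
import math

PIXEL_SCALE = 0.25  # arcsec per pixel (assumed)


def pixel_offset_to_sky(ra0, dec0, dx, dy, pixel_scale=PIXEL_SCALE):
    """Convert a pixel offset from the image center to (ra, dec) in degrees.

    Flat-sky approximation, assuming north is up (+y) and east is left (-x).
    """
    ddec = dy * pixel_scale / 3600.0
    # RA offsets are stretched by 1 / cos(dec) away from the equator.
    dra = -dx * pixel_scale / 3600.0 / math.cos(math.radians(dec0))
    return ra0 + dra, dec0 + ddec


# A source 40 pixels to the left and 20 pixels above the image center.
ra, dec = pixel_offset_to_sky(150.0, 60.0, dx=-40, dy=20)
print(f"ra = {ra:.6f} deg, dec = {dec:.6f} deg")
```

The approximation is only reasonable for offsets of a few arcminutes away from the celestial poles; the WCS solutions read by the client should be preferred for real measurements.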

# Create multi-resolution images

This requires setting the number of levels and choosing how to mask the images: either by the median absolute deviation (`domask=True`, `doobject=False`) or with SExtractor (`domask=False`, `doobject=True`).
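
To illustrate what masking by the median absolute deviation (MAD) can look like, here is a minimal sketch on a synthetic image. It is not the library's implementation: `mad_mask` is a hypothetical helper and the threshold `nsigma=3` is an assumed value.

```python
import numpy as np


def mad_mask(image, nsigma=3.0):
    """Zero out pixels below median + nsigma * robust sigma.

    Hypothetical helper for illustration; nsigma is an assumed threshold.
    """
    median = np.median(image)
    mad = np.median(np.abs(image - median))
    sigma = 1.4826 * mad  # converts the MAD to a Gaussian standard deviation
    return np.where(image > median + nsigma * sigma, image, 0.0)


rng = np.random.default_rng(42)
image = rng.normal(0.0, 1.0, size=(64, 64))  # synthetic background noise
image[30:34, 30:34] += 50.0                  # synthetic bright source

masked = mad_mask(image)
print(f"{np.count_nonzero(masked)} of {masked.size} pixels kept")
```

The MAD is used here instead of the standard deviation because bright sources barely shift it, so the noise estimate stays close to the true background level.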

In [None]:
nlevels = 5
domask = False
doobject = True
doplot = False

dclient.compute_multiresolution(nlevels, domask, doobject, doplot)
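
The idea behind a multi-resolution image can be sketched with plain NumPy. This is an illustration under stated assumptions, not the code run by `compute_multiresolution`: `multiresolution_stack` is a hypothetical helper, the input is assumed to be a square 480 x 480 pixel cutout, and each level is assumed to be a centered crop that is block-averaged down to a common size.

```python
import numpy as np


def multiresolution_stack(image, nlevels):
    """Build nlevels centered views of a square image, all with the same shape.

    Level 0 is the central crop at full resolution; each further level
    doubles the cropped area per side and block-averages it back down.
    """
    size = image.shape[0]
    out_size = size // 2 ** (nlevels - 1)  # pixels per side at every level
    center = size // 2
    levels = []
    for level in range(nlevels):
        factor = 2 ** level
        half = out_size * factor // 2
        crop = image[center - half:center + half, center - half:center + half]
        # Average over factor x factor blocks to reduce the resolution.
        binned = crop.reshape(out_size, factor, out_size, factor).mean(axis=(1, 3))
        levels.append(binned)
    return np.stack(levels)


image = np.random.default_rng(0).normal(size=(480, 480))  # synthetic cutout
stack = multiresolution_stack(image, nlevels=5)
print(stack.shape)
```

With these assumptions the stack holds far fewer pixels than the original cutout while keeping full resolution near the transient and a wide field of view at the coarsest level.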

# Apply classification model

Load the TensorFlow model

In [None]:
dclient.load_model()

## Preprocess the multi-resolution data

In [None]:
dclient.preprocess()

## Predict host galaxies using the model

In [None]:
dclient.predict()

# Optional: get host sizes

In [None]:
for oid in dclient.df.index:
    dclient.get_hostsize(oid, doplot=False)

# See final dataframe

In [None]:
dclient.df

# Save and load the data

Save the data

In [None]:
dclient.save()

Load the data

In [None]:
dclient.load()

# Visualize the outputs

## See the contents of the dataframe for a given transient

In [None]:
dclient.df.loc["SN2004aq"]

## See the host and predicted position

In most cases the model works very well, but if two nearby sources could both be identified as the host, the model may return a predicted position between them.

In [None]:
dclient.plot_host("SN2004aq")

## See the host semi-major axis estimation

In [None]:
coordsdata = dclient.get_hostsize("SN2004aq", doplot=True)

## Visualize all transient candidates

In [None]:
for oid in dclient.df.index:
    dclient.plot_host(oid)
    dclient.get_hostsize(oid, doplot=True)

In [None]:
dclient.df