# ALeRCE client starter

Francisco Förster

Last modification: 20221104

Simple examples of how to interact with the ALeRCE client (https://alerce.readthedocs.io/en/latest/index.html).

See https://alerce.readthedocs.io/en/latest/apis.html for the API documentation.

We recommend that you run this notebook from the following [link](https://colab.research.google.com/github/alercebroker/usecases/blob/master/notebooks/ADASS_XXXII_Nov2022_tutorial/ALeRCE_Client_starter.ipynb).

Load libraries

In [None]:
import matplotlib.pyplot as plt
import pandas as pd
import matplotlib.colors

#!pip install alerce

In [None]:
from alerce.core import Alerce

# Initialize alerce api object

Let's start the ALeRCE client

In [None]:
client = Alerce()

We will explore different methods from the client:

* Query global properties of an individual object
* Query properties per band of an individual object
* Query detections of an individual object
* Query image stamps
* Crossmatch with objects in the vicinity
* Query non detections of an individual object
* Query features of an individual object
* Query probabilities of an individual object
* Query global properties of a group of objects

# Query properties for an individual object

We will now query the global properties of one object based on a given object id, ZTF20aaelulu in this case. You can see this object on the website https://alerce.online/object/ZTF20aaelulu.


In [None]:
oid = "ZTF20aaelulu"
query_results = client.query_objects(
        oid=oid,
        format='pandas')
properties = query_results # save for later
query_results

The column names are the following

In [None]:
", ".join(query_results)

The columns are described here: https://alerce.readthedocs.io/en/latest/models.html

# Query properties per band for an individual object

In [None]:
query_results = client.query_magstats(
        oid=oid,
        format='pandas')
query_results

You can see that the result contains different statistics in two rows, one per band. The columns are described here: https://alerce.readthedocs.io/en/latest/models.html
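
The band of each row is given by the `fid` column. As a minimal sketch, assuming the usual ZTF filter convention (1 = g, 2 = r, 3 = i), a readable band label can be added to any per-band table. The helper name `add_band_names` and the toy table are illustrative and not part of the ALeRCE client.

```python
import pandas as pd

# ZTF filter ids; assumed here to be what the `fid` column holds.
ZTF_BANDS = {1: "g", 2: "r", 3: "i"}

def add_band_names(table, fid_column="fid"):
    """Return a copy of `table` with an extra `band` column."""
    labelled = table.copy()
    labelled["band"] = labelled[fid_column].map(ZTF_BANDS)
    return labelled

# Toy stand-in for a per-band result (values are made up).
toy_magstats = pd.DataFrame({"fid": [1, 2], "ndet": [12, 15]})
print(add_band_names(toy_magstats))
```

With a real query result this would be called as `add_band_names(query_results)`; the input table is left unchanged.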

In [None]:
", ".join(query_results)

# Query detections and non-detections for an individual object

Now we obtain a list of all the detections and non-detections (limiting magnitude) in two bands. 

Note that both dataframes include a unique identifier for the telescope (`tid`), which hints at the multi-stream nature of ALeRCE (soon to be available).

In [None]:
detections = client.query_detections(
        oid=oid,
        format='pandas')
nondetections = client.query_non_detections(
        oid=oid,
        format='pandas')
display(detections)
display(nondetections)

The most important columns for the detections are the time (`mjd`), the unique detection identifier (`candid`), the band (`fid`), the difference magnitude (`magpsf`) and its error (`sigmapsf`). 

The columns for the non-detections are the time (`mjd`), the band (`fid`), and the limiting magnitude (`diffmaglim`).

All the fields are described here: https://alerce.readthedocs.io/en/latest/models.html.
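
The columns above are enough to locate the brightest detection in each band. The following is a sketch under stated assumptions: it relies only on the `fid`, `mjd`, `magpsf` and `sigmapsf` columns named above, the helper names `mjd_to_datetime` and `peak_per_band` are hypothetical, and the toy table holds made-up values. Remember that a smaller magnitude means a brighter source.

```python
from datetime import datetime, timedelta, timezone

import pandas as pd

# MJD 0 corresponds to 1858-11-17 00:00 UTC.
MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)

def mjd_to_datetime(mjd):
    """Convert a Modified Julian Date to a timezone-aware UTC datetime."""
    return MJD_EPOCH + timedelta(days=float(mjd))

def peak_per_band(detections):
    """Return the brightest detection (smallest magpsf) in each band."""
    idx = detections.groupby("fid")["magpsf"].idxmin()
    peaks = detections.loc[idx, ["fid", "mjd", "magpsf", "sigmapsf"]].copy()
    peaks["date_utc"] = peaks["mjd"].apply(mjd_to_datetime)
    return peaks.set_index("fid")

# Toy detections table (made-up values) standing in for a real query result.
toy_detections = pd.DataFrame({
    "fid": [1, 1, 2, 2],
    "mjd": [59000.0, 59001.0, 59000.5, 59002.0],
    "magpsf": [19.0, 18.5, 18.9, 19.3],
    "sigmapsf": [0.1, 0.1, 0.1, 0.1],
})
print(peak_per_band(toy_detections))
```

With the real data this would be `peak_per_band(detections)`.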

# Query image stamps

We can also query the image stamps associated with a specific object and candid.

We will use the first candid from the previously queried detections.

In [None]:
stamps = client.get_stamps(oid, detections.iloc[0].candid)
print(stamps)

The image stamps are a triplet of science, reference and difference images. Let's look at the three images:

In [None]:
fig, ax = plt.subplots(ncols=3)
for i in range(3):
    ax[i].imshow(stamps[i].data)
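
A few very bright or very negative pixels can dominate the default color scale of `imshow`. One common remedy, sketched here as an optional step and not as something the ALeRCE client does, is to clip the pixel values to a percentile range before plotting. The helper name `clip_for_display`, its default percentiles and the toy image are assumptions made for illustration.

```python
import numpy as np

def clip_for_display(image, lower=1.0, upper=99.0):
    """Clip pixel values to the [lower, upper] percentile range, ignoring NaNs."""
    image = np.asarray(image, dtype=float)
    vmin, vmax = np.nanpercentile(image, [lower, upper])
    return np.clip(image, vmin, vmax)

# Toy 100x100 image with one extreme pixel (made-up values).
toy_image = np.arange(10000, dtype=float).reshape(100, 100)
toy_image[0, 0] = 1e6
clipped = clip_for_display(toy_image)
print(toy_image.max(), clipped.max())
```

In the plotting loop this would read `ax[i].imshow(clip_for_display(stamps[i].data))`.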

We can also plot the image stamps directly using the `plot_stamps` method.

In [None]:
client.plot_stamps(oid, detections.iloc[0].candid)

If `candid` is not given the last available value will be used.

In [None]:
client.plot_stamps(oid)
print(detections.candid.max())

# Crossmatch with objects in the vicinity

We can also crossmatch the object's position with objects from the catsHTM collection of catalogs (https://github.com/maayane/catsHTM). The catalogs included in catsHTM are:

* 2MASS (input name: TMASS)
* 2MASSxsc (input name: TMASSxsc) - 2MASS extended source catalog
* AKARI (input name: AKARI)
* APASS (input name: APASS) - AAVSO All Sky Photometric Sky Survey (~5.5x10^7 sources)
* Cosmos (input name: Cosmos) - Sources in the Cosmos field
* DECaLS (input name: DECaLS) - DECaLS DR5 release
* FIRST (input name: FIRST) - (~9.5x10^5 sources)
* GAIA/DR1 (input name: GAIADR1) - (~1.1x10^9 sources).
* GAIA/DR2 (input name: GAIADR2) - NEW! (~1.6x10^9 sources)
* GAIA/EDR3 (input name: GAIAEDR3) - NEW! (~1.8x10^9 sources)
* GALEX (input name: GALEX) - GALEX/GR6Plus7 (~1.7x10^8 sources).
* HSC/v2 (input name: HSCv2)- Hubble source catalog
* IPHAS/DR2 (input name: IPHAS)
* NED redshifts (input name: NEDz)
* NVSS (input name: NVSS) - (~1.8x10^6 sources)
* HYPERLEDA (input name: PGC)
* PS1 (input name: PS1) - Pan-STARRS (~2.6x10^9 sources; A cleaned version of the PS1 stack catalog; some missing tiles below declination of zero [being corrected])
* The PTF photometric catalog (input name: PTFpc)
* ROSATfsc (input name: ROSATfsc) - ROSAT faint source catalog
* SDSS/DR10 (input name: SDSSDR10)- Primary sources from SDSS/DR10 (last photometric release)
* Skymapper DR1 (input name: Skymapper)
* SpecSDSS/DR14 (input name: SpecSDSS) - SDSS spectroscopic catalog
* Spitzer/SAGE (input name: SAGE)
* Spitzer/IRAC (input name: IRACgc) - Spitzer IRAC galactic center survey
* UCAC4 (input name: UCAC4) - (~1.1x10^8 sources)
* UKIDSS/DR10 (input name: UKIDSS)
* USNOB1 (not yet available)
* VISTA/Viking/DR3 (not yet available)
* VST/ATLAS/DR3 (input name: VSTatlas)
* VST/KiDS/DR3 (input name: VSTkids)
* WISE (input name: WISE) - ~5.6x10^8 sources
* XMM (input name: XMM)- 7.3x10^5 sources 3XMM-DR7 (Rosen et al. 2016; A&A 26, 590)
* ZTF-DR1 stellar variability catalog (input name: ztfSrcLCDR1)
* ZTF-DR1 variable star candidates (input name: ztfSrcLCDR1)

In [None]:
ra = properties.meanra.iloc[0]   # scalar coordinates of the single queried object
dec = properties.meandec.iloc[0]
radius = 30 # arcsec
#catalog_name = "GAIA/DR1"
cone_objects = client.catshtm_conesearch(ra, dec, radius, format="pandas")
cone_objects

We obtained many crossmatches from the [catsHTM](https://arxiv.org/abs/1805.02666) collection. With `format="pandas"` the result is indexed by catalog name, with one table of matches per catalog.
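
To see which catalogs returned matches and how many, the tables can be counted one by one. This is a minimal sketch, assuming the cone search result behaves like a dictionary that maps catalog names to dataframes, which is how the plotting code in this notebook uses it. The helper name `count_matches` and the toy input are hypothetical.

```python
import pandas as pd

def count_matches(cone_objects):
    """Number of matched sources per catalog, largest first."""
    counts = {name: len(table) for name, table in cone_objects.items()}
    return pd.Series(counts, name="n_matches", dtype=int).sort_values(ascending=False)

# Toy stand-in for a cone search result (made-up coordinates and redshift).
toy_cone = {
    "SpecSDSS": pd.DataFrame({"RA": [1.0], "Dec": [2.0], "z": [0.05]}),
    "GAIADR2": pd.DataFrame({"RA": [1.0, 1.1], "Dec": [2.0, 2.1]}),
}
print(count_matches(toy_cone))
```

With the real result this would be `count_matches(cone_objects)`.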

Let's plot the positions of all the crossmatches, labelling objects from the SpecSDSS table with their redshift, if there are any.

In [None]:
fig, ax = plt.subplots(figsize=(10, 10))
cmap = plt.cm.tab20
norm = matplotlib.colors.Normalize(vmin=0, vmax=len(cone_objects.keys()))
for idx, i in enumerate(cone_objects.keys()):
    try:
        aux = cone_objects[i].rename({"RA": "ra", "Dec": "dec"}, axis=1)
        ax.scatter(aux.ra, aux.dec, color=cmap(norm(idx)), label=i, s=50)
        if i == "SpecSDSS": # the table with spectroscopic redshifts
            for idxrow, row in aux.iterrows():
                ax.text(row.ra, row.dec, "z=%.5f" % row.z, fontsize=20)
    except Exception as error:
        # report catalogs that could not be plotted and continue with the rest
        print(i, error)
        
ax.scatter(ra, dec, c='k', marker='o', s=100, label=oid)
ax.set_xlim(ax.get_xlim()[::-1])
plt.legend()
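
The plot shows where the crossmatches fall, but not how far each one is from the object. The sketch below computes the angular separation with the haversine formula. It assumes that all coordinates are given in degrees, so catalogs that report radians would need converting first, and the function name `angular_separation_arcsec` is hypothetical.

```python
import math

def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcsec between two positions given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula, numerically stable for small separations.
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0

# Two made-up positions offset by 10 arcsec in declination.
print(angular_separation_arcsec(150.0, 2.0, 150.0, 2.0 + 10 / 3600))
```

For a single match this could be called as `angular_separation_arcsec(ra, dec, row.ra, row.dec)` with scalar `ra` and `dec`.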

# Query features for an individual object

Now we will query the features used by our light curve classifier. These are hand-crafted statistics or contextual information based on the object's light curve.

In [None]:
query_results = client.query_features(
        oid=oid,
        format='pandas')
query_results

You can see that there are 178 rows, where each row has a feature name (`name`), a value (`value`), a band id (`fid`), and a feature version (`version`).

A detailed explanation of all the features can be found in http://alerce.science/features/.

We can pivot this dataframe to make the features appear as columns. To do this we add an auxiliary column that contains the feature name and filter id in a single string, and we add the object identifier as a column.

In [None]:
query_results["oid"] = oid
query_results['feature'] = [f"{name}_{fid}" for name, fid in zip(query_results.name, query_results.fid)]

In [None]:
query_results.pivot(index='oid', columns='feature', values='value')
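
The same two steps can be wrapped in a small helper that leaves the input table untouched, which is convenient when repeating the operation for several objects. This is only a sketch: the name `features_to_wide` is hypothetical, and the toy table uses made-up feature names and values with the `name`, `value` and `fid` columns described above.

```python
import pandas as pd

def features_to_wide(features, oid):
    """Pivot a long features table into a single row indexed by `oid`."""
    wide = features.copy()
    wide["oid"] = oid
    wide["feature"] = wide["name"].astype(str) + "_" + wide["fid"].astype(str)
    # `pivot` raises ValueError if the same (oid, feature) pair appears twice.
    return wide.pivot(index="oid", columns="feature", values="value")

# Toy long-format table (made-up names and values).
toy_features = pd.DataFrame({
    "name": ["feature_a", "feature_a", "feature_b"],
    "value": [0.3, 0.4, 1.7],
    "fid": [1, 2, 1],
})
print(features_to_wide(toy_features, "toy_object"))
```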

# Query probabilities for an individual object

Similarly, we can query the classification probabilities for a given object. 

Note that an object can be classified by different classifiers with different versions, which is shown in the columns `classifier_name` and `classifier_version`, respectively.

In [None]:
query_results = client.query_probabilities(
        oid=oid,
        format='pandas')
query_results

We see that we get many rows for a single object. This shows all the probabilities associated with all the ALeRCE classifiers and the classes in their associated taxonomies. The columns indicate the name of the classifier (`classifier_name`), its version (`classifier_version`), the class (`class_name`), the probability (`probability`) and the ranking (`ranking`), which increases from the most to the least likely class.

Let's check the unique classifier versions associated to each classifier:

In [None]:
for clf in query_results.classifier_name.unique():
    mask = query_results.classifier_name == clf
    print(clf, query_results.loc[mask].classifier_version.unique())

And now the unique classes associated to each classifier:

In [None]:
for clf in query_results.classifier_name.unique():
    mask = query_results.classifier_name == clf
    print(clf, query_results.loc[mask].class_name.unique())
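
Since `ranking` equal to 1 marks the most likely class, the preferred class of every classifier can be extracted with a single filter. The sketch below uses only the columns described above. The helper name `top_class_per_classifier` and the toy probabilities are made up for illustration.

```python
import pandas as pd

def top_class_per_classifier(probabilities):
    """Keep the most likely class (ranking == 1) of each classifier."""
    top = probabilities[probabilities["ranking"] == 1]
    columns = ["classifier_name", "classifier_version", "class_name", "probability"]
    return top[columns].sort_values("classifier_name").reset_index(drop=True)

# Toy probabilities table (made-up versions and values).
toy_probabilities = pd.DataFrame({
    "classifier_name": ["stamp_classifier", "stamp_classifier", "lc_classifier", "lc_classifier"],
    "classifier_version": ["v1", "v1", "v1", "v1"],
    "class_name": ["SN", "AGN", "SNIa", "SNII"],
    "probability": [0.8, 0.2, 0.6, 0.4],
    "ranking": [1, 2, 1, 2],
})
print(top_class_per_classifier(toy_probabilities))
```

With the real data this would be `top_class_per_classifier(query_results)`.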

There are two classifiers, the `lc_classifier` and the `stamp_classifier`. The `lc_classifier` is a hierarchical classifier composed of four independent classifiers: `lc_classifier_top`, `lc_classifier_periodic`, `lc_classifier_stochastic`, and `lc_classifier_transient`. You can find more details about these classifiers in [Sánchez-Sáez+2021](https://ui.adsabs.harvard.edu/abs/2021AJ....161..141S/abstract) and [Carrasco-Davis+2021](https://ui.adsabs.harvard.edu/abs/2020arXiv200803309C/abstract). The classes of each classifier are listed below.

`lc_classifier`:
* AGN
* Blazar
* CEP
* CV/Nova
* DSCT
* E
* LPV
* Periodic-Other
* QSO
* RRL
* SLSN
* SNIa
* SNIbc
* SNII
* YSO

`stamp_classifier`:
* AGN
* asteroid
* bogus
* SN
* VS


`lc_classifier_top`:
* Periodic
* Stochastic
* Transient

`lc_classifier_periodic`:
* CEP
* DSCT
* E
* LPV
* Periodic-Other
* RRL

`lc_classifier_stochastic`:
* AGN
* Blazar
* CV/Nova
* QSO
* YSO

`lc_classifier_transient`:
* SLSN
* SNIa
* SNIbc
* SNII

# Query global properties of a set of objects

## Query objects based on the most likely class

We will query the top 200 objects classified as SNIa by the light curve classifier, in pandas format. By default this query asks for objects with classification `ranking=1`. We ask for the results to be ordered by probability in descending order (`DESC`).

In [None]:
query_results = client.query_objects(
        classifier="lc_classifier",
        class_name="SNIa",
        page_size=200,
        order_by='probability',
        order_mode='DESC',
        format='pandas')

In [None]:
query_results

Note that now the columns `class`, `classifier`, and `probability` are included.
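
A single call returns at most `page_size` objects. To collect more, one option is to request consecutive pages and concatenate them. The sketch below assumes that `query_objects` accepts a 1-based `page` argument, which should be checked against the API documentation before relying on it. The helper name `query_pages` is hypothetical, and no request is made until the function is called with a real client.

```python
import pandas as pd

def query_pages(client, n_pages, **query):
    """Concatenate the first `n_pages` pages of a `query_objects` call.

    `client` is any object with a `query_objects` method; `page` is assumed
    to be 1-based. Do not pass `page` or `format` in `query`.
    """
    pages = []
    for page in range(1, n_pages + 1):
        result = client.query_objects(page=page, format="pandas", **query)
        if len(result) == 0:  # no more objects to retrieve
            break
        pages.append(result)
    if not pages:
        return pd.DataFrame()
    return pd.concat(pages, ignore_index=True)

# Example call (not executed here):
# top_snia = query_pages(client, 3, classifier="lc_classifier", class_name="SNIa",
#                        page_size=200, order_by="probability", order_mode="DESC")
```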

## Query objects by classification ranking

When an object is classified by our classifiers, a `ranking` column is available to quickly extract the most likely class (`ranking=1`). We can also extract objects that were classified as SNIa as the 2nd or 3rd most likely classes. Note that not asking for a ranking is equivalent to asking for `ranking=1` (the most likely class). 

In [None]:
fig, ax = plt.subplots()

for ranking in [1, 2, 3]:
    query_results = client.query_objects(
        classifier="lc_classifier",
        class_name="SNIa",
        ranking=ranking,
        page_size=200,
        order_by='probability',
        order_mode='DESC',
        format='pandas')
    query_results.probability.plot.hist(bins=20, ax=ax, lw=2, log=True,
                                        alpha=0.5, histtype='step', label="ranking=%i" % ranking)
query_results = client.query_objects(
        classifier="lc_classifier",
        class_name="SNIa",
        page_size=200,
        order_by='probability',
        order_mode='DESC',
        format='pandas')
query_results.probability.plot.hist(bins=20, ax=ax, lw=2, log=True, histtype='step', label="no ranking")

ax.set_xlabel("probability")
plt.legend()

You can see that the histograms for no ranking and for `ranking=1` overlap, because `ranking=1` is the default value. Also, the probabilities typically decrease with ranking: those for `ranking=1` are higher than those for `ranking=2`, which are in turn higher than those for `ranking=3`.

# Let's now generate a link to look at all the previous objects

Here we restrict the number of objects to 200, which is the maximum number that the ALeRCE explorer can accept. 

In [None]:
suffix = "&count=true&page=1&perPage=1000&sortDesc=true&selectedClassifier=lc_classifier"
url = "https://alerce.online/?" + "&".join("oid=%s" % i for i in query_results.oid.iloc[:200]) + suffix
print(url)

Open the link in your browser to explore the objects.
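
The same link can be built with the standard library, which takes care of joining and escaping the parameters. This is a sketch that reproduces the query string used above. The function name `explorer_url` and its arguments are hypothetical, and the parameter values are copied from the cell above rather than taken from any documentation.

```python
from urllib.parse import urlencode

def explorer_url(oids, max_objects=200, classifier="lc_classifier"):
    """Build an ALeRCE explorer link for at most `max_objects` object ids."""
    params = [("oid", oid) for oid in list(oids)[:max_objects]]
    params += [
        ("count", "true"),
        ("page", 1),
        ("perPage", 1000),
        ("sortDesc", "true"),
        ("selectedClassifier", classifier),
    ]
    return "https://alerce.online/?" + urlencode(params)

print(explorer_url(["ZTF20aaelulu"]))
# https://alerce.online/?oid=ZTF20aaelulu&count=true&page=1&perPage=1000&sortDesc=true&selectedClassifier=lc_classifier
```

With the query above this would be `explorer_url(query_results.oid)`.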