# Querying the ZTF ALeRCE Python client
```Author: Eden Girma, Last updated 20210422```

# Table of contents:
* [Setting up ALeRCE Python client](#setup)
* [Querying objects from the day before](#query)
* [Understanding retrieved object data](#data)
* [Exporting data to .xml](#export)

**Goal:**
 
1) To query the ALeRCE database for objects with the following attributes:
* last detected 24 to 48 hours before the current time
* classified by the stamp classifier (version 1.0.4)

2) To return a table consisting of ALeRCE alert objects that includes, per row:
* aggregated detection properties per object (e.g. mean RA/Dec, number of detections)
* probability of the highest ranking class assigned by the stamp classifier (v1.0.4)

In [2]:
import sys

# Packages for direct database access
# %pip install psycopg2
import psycopg2
import json

# Packages for data and number handling
import numpy as np
import pandas as pd
import math

# Packages for calculating current time and extracting ZTF data to VOTable
from astropy.time import Time
from astropy.table import Table, unique, vstack
from astropy.io.votable import from_table, writeto
from datetime import datetime

# Packages for display and data plotting, if desired
from IPython.display import HTML
from IPython.display import display
import matplotlib.pyplot as plt
%matplotlib inline

## Setting up ALeRCE Python client <a class="anchor" id="setup"></a>

In [3]:
# Set up ALeRCE python client
from alerce.core import Alerce
client = Alerce()

## Querying objects last detected 24 to 48 hours ago <a class="anchor" id="query"></a>

We retrieve these objects one class at a time. First, we build a function that uses the ALeRCE client to query objects by their stamp classifier prediction.

Note that according to the ZTF API source (```ztf-api/api/sql/astro_object/astro_object.py```), ```query_objects``` defaults to ranking 1, i.e. the top-ranked class for each object, when no ranking is specified and a classifier, class, or classifier version filter is given:

 ``` 
[...]

def _parse_filters(self, args):
    (
        classifier,
        classifier_version,
        class_,
        ndet,
        firstmjd,
        lastmjd,
        probability,
        ranking,
        oids,
    ) = (True, True, True, True, True, True, True, True, True)
    if args["classifier"]:
        classifier = models.Probability.classifier_name == args["classifier"]
    if args["class"]:
        class_ = models.Probability.class_name == args["class"]
    
    [...]
    
    if args["ranking"]:
        ranking = models.Probability.ranking == args["ranking"]
    elif not args["ranking"] and (
        args["classifier"] or args["class"] or args["classifier_version"]
    ):
        # Default ranking 1
        ranking = models.Probability.ranking == 1

```
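
In plain terms, the excerpt above falls back to ranking 1 only when no ranking is supplied and at least one classifier-related filter is present. A minimal sketch of that decision in standalone Python follows; `effective_ranking` is a hypothetical helper written for illustration and is not part of the API code, and `None` stands in for "no ranking filter applied".

```python
def effective_ranking(args):
    """Return the ranking value implied by the request arguments.

    Mirrors the branch logic of the excerpt: an explicit ranking wins;
    otherwise ranking defaults to 1 when any classifier-related filter
    is present; otherwise no ranking filter is applied (None).
    """
    if args.get("ranking"):
        return args["ranking"]
    if args.get("classifier") or args.get("class") or args.get("classifier_version"):
        return 1
    return None


# A query like the one in this notebook: classifier given, ranking omitted
print(effective_ranking({"classifier": "stamp_classifier", "class": "SN"}))
# An explicit ranking is passed through unchanged
print(effective_ranking({"classifier": "stamp_classifier", "ranking": 2}))
# No classifier-related filter: no ranking filter at all
print(effective_ranking({}))
```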

In [5]:
# Define function that queries objects according to class
def query_class_objects(cn, min_lastmjd, max_lastmjd):
    objects = client.query_objects(classifier = 'stamp_classifier',
                                   classifier_version = 'stamp_classifier_1.0.4',
                                   class_name = cn,
                                   lastmjd = [min_lastmjd, max_lastmjd],
                                   page_size = int(1e6),
                                   format='votable')
    return objects
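
To check which arguments the function sends without contacting the ALeRCE servers, one option is to pass in a stand-in client. The sketch below rests on assumptions: `RecordingClient` is a hypothetical stub, and `query_class_objects_with` is a variant of the function that takes the client as a parameter, which the original does not.

```python
class RecordingClient:
    """Hypothetical stand-in for the ALeRCE client that records query arguments."""

    def __init__(self):
        self.calls = []

    def query_objects(self, **kwargs):
        self.calls.append(kwargs)
        return []  # a real client would return a table of objects


def query_class_objects_with(client, cn, min_lastmjd, max_lastmjd):
    # Same arguments as query_class_objects, but with the client injected
    return client.query_objects(
        classifier="stamp_classifier",
        classifier_version="stamp_classifier_1.0.4",
        class_name=cn,
        lastmjd=[min_lastmjd, max_lastmjd],
        page_size=int(1e6),
        format="votable",
    )


stub = RecordingClient()
query_class_objects_with(stub, "SN", 59324.0, 59325.0)
print(stub.calls[0]["class_name"], stub.calls[0]["lastmjd"])
```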

In [13]:
# Query the ALeRCE client for objects last detected 24 - 48 hours before the current time, over a range of classes

# Time.now() returns the current time in UTC; datetime.today() is local time and would shift the window
now_mjd = Time.now().mjd
min_lastmjd = now_mjd - 2
max_lastmjd = now_mjd - 1
classes = ["AGN", "SN", "VS", "asteroid", "bogus"]
objects = Table()

for class_name in classes:
    class_objects = query_class_objects(class_name, min_lastmjd, max_lastmjd)
    if class_name == classes[0]:
        objects = class_objects
    else:
        objects = vstack([objects, class_objects])
    
    print('Class queried: %s' % (class_name))
    
    if class_name == classes[-1]:
        print('Done.')

Class queried: AGN
Class queried: SN
Class queried: VS
Class queried: asteroid
Class queried: bogus
Done.
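
For reference, MJD counts days since 1858-11-17 00:00 UTC, so the query window is simply two offsets from the current MJD. Below is a rough standard-library sketch of the same calculation; it ignores the leap-second subtleties that `astropy.time.Time` takes care of, and `to_mjd` and `lastmjd_window` are hypothetical helper names.

```python
from datetime import datetime, timedelta, timezone

# Zero point of the Modified Julian Date scale
MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)


def to_mjd(moment):
    """Convert a timezone-aware datetime to MJD (fractional days)."""
    return (moment - MJD_EPOCH) / timedelta(days=1)


def lastmjd_window(now, start_hours_ago=48, end_hours_ago=24):
    """Return (min_lastmjd, max_lastmjd) for a window ending before `now`."""
    now_mjd = to_mjd(now)
    return now_mjd - start_hours_ago / 24, now_mjd - end_hours_ago / 24


# Window for a fixed reference time, so the result is reproducible
reference = datetime(2021, 4, 22, tzinfo=timezone.utc)
print(lastmjd_window(reference))  # (59324.0, 59325.0)
```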


Note that when we specify the ```votable``` format, ```query_objects``` returns an ```astropy.table.table.Table``` object:

In [14]:
print(type(objects))

<class 'astropy.table.table.Table'>


We can also re-order and sort the resulting table, depending on how we want the final exported information to be organized.

In [18]:
# Re-order the table so that the 'oid' column comes first
nc = ['oid'] + [c for c in objects.columns if c != 'oid']
objects = objects[nc]

# Sort objects by lastmjd, firstmjd, then oid (ascending), then reverse the rows so that all keys are in descending order
objects.sort(['lastmjd','firstmjd', 'oid'])
objects = objects[::-1]
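
Sorting ascending on several keys and then reversing the rows gives the same order as a descending sort on those keys, provided no two rows share all key values. Here is a small sketch with plain Python dictionaries standing in for table rows (the values are made up for illustration):

```python
rows = [
    {"oid": "ZTF_c", "firstmjd": 59310.0, "lastmjd": 59324.9},
    {"oid": "ZTF_a", "firstmjd": 59320.0, "lastmjd": 59325.5},
    {"oid": "ZTF_b", "firstmjd": 59320.0, "lastmjd": 59325.5},
]


def sort_key(row):
    # Same key order as the table sort: lastmjd, firstmjd, oid
    return (row["lastmjd"], row["firstmjd"], row["oid"])


# Approach used in the cell above: ascending sort, then reverse
ascending_then_reversed = sorted(rows, key=sort_key)[::-1]
# Direct descending sort for comparison
descending = sorted(rows, key=sort_key, reverse=True)

print([r["oid"] for r in ascending_then_reversed])
print(ascending_then_reversed == descending)
```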

Below we'll see the total number of entries and names of each column in the data table.

In [15]:
objects.info

<Table length=174548>
     name      dtype 
------------- -------
        class    str8
   classifier   str16
    corrected    bool
      deltajd float64
     firstmjd float64
      g_r_max  object
 g_r_max_corr  object
     g_r_mean  object
g_r_mean_corr  object
      lastmjd float64
      meandec float64
       meanra float64
   mjdendhist float64
 mjdstarthist float64
     ncovhist   int64
         ndet   int64
     ndethist    str4
          oid   str12
  probability float64
     sigmadec  object
      sigmara  object
      stellar    bool
 step_id_corr   str16

## Understanding retrieved object data <a class="anchor" id="data"></a>

To quickly get a sense of what we're looking at, let's print out the first 5 rows of our table:

In [30]:
objects[0:5]

| oid | class | classifier | corrected | deltajd | firstmjd | g_r_max | g_r_max_corr | g_r_mean | g_r_mean_corr | lastmjd | meandec | meanra | mjdendhist | mjdstarthist | ncovhist | ndet | ndethist | probability | sigmadec | sigmara | stellar | step_id_corr |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| str12 | str8 | str16 | bool | float64 | float64 | object | object | object | object | float64 | float64 | float64 | float64 | float64 | int64 | int64 | str4 | float64 | object | object | bool | str16 |
| ZTF21aaxbzdg | bogus | stamp_classifier | False | 0.0 | 59325.50875000004 | | | | | 59325.50875000004 | 41.9033927 | 351.7398069 | 59325.50875000004 | 58507.158391200006 | 1213 | 1 | 8 | 0.6109096 | | | False | correction_0.0.1 |
| ZTF21aaxbzdb | bogus | stamp_classifier | False | 0.0 | 59325.50875000004 | | | | | 59325.50875000004 | 40.7263525 | 352.3253321 | 59325.50875000004 | 59325.50875000004 | 1368 | 1 | 4 | 0.7413783 | | | False | correction_0.0.1 |
| ZTF21aaxbzda | bogus | stamp_classifier | False | 0.0 | 59325.50875000004 | | | | | 59325.50875000004 | 40.7264833 | 352.3256025 | 59325.50875000004 | 58767.27603009995 | 1368 | 1 | 5 | 0.66656053 | | | False | correction_0.0.1 |
| ZTF21aaxbzcz | bogus | stamp_classifier | False | 0.0 | 59325.50875000004 | | | | | 59325.50875000004 | 40.7264797 | 352.3255966 | 59325.50875000004 | 58767.27603009995 | 1368 | 1 | 5 | 0.6658804 | | | False | correction_0.0.1 |
| ZTF21aaxbzcl | bogus | stamp_classifier | False | 0.0 | 59325.50875000004 | | | | | 59325.50875000004 | 37.5465427 | 353.4647511 | 59325.50875000004 | 58854.13714119978 | 1378 | 1 | 2 | 0.7857003 | | | False | correction_0.0.1 |
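
The time columns (```firstmjd```, ```lastmjd```, ```mjdstarthist```, ```mjdendhist```) are Modified Julian Dates. To read them as calendar dates, a rough standard-library sketch is shown below; `mjd_to_datetime` is a hypothetical helper name, and the conversion ignores leap seconds, so `astropy.time.Time` is the better choice when precision matters.

```python
from datetime import datetime, timedelta, timezone

# Zero point of the Modified Julian Date scale
MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)


def mjd_to_datetime(mjd):
    """Convert an MJD value (fractional days) to a UTC datetime."""
    return MJD_EPOCH + timedelta(days=mjd)


# lastmjd and mjdstarthist of the first row shown above
print(mjd_to_datetime(59325.50875000004).isoformat())
print(mjd_to_datetime(58507.158391200006).date())
```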


The following prints out the number of OIDs that correspond to each class name:

In [26]:
obj_classes = objects.group_by('class')
for key, group in zip(obj_classes.groups.keys, obj_classes.groups):
    l = len(group)
    print('%s : %i' % (key['class'], l))

AGN : 18564
SN : 1001
VS : 106670
asteroid : 26262
bogus : 22051
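
As a quick consistency check, the per-class counts printed above should add up to the table length reported by ```objects.info``` (174548). The sketch below copies the printed numbers into a dictionary, sums them, and reports each class's share of the total:

```python
# Counts copied from the output above
class_counts = {
    "AGN": 18564,
    "SN": 1001,
    "VS": 106670,
    "asteroid": 26262,
    "bogus": 22051,
}

total = sum(class_counts.values())
print("Total objects: %i" % total)

# Share of each class, as a percentage of all retrieved objects
for name, count in class_counts.items():
    print("%s : %.1f%%" % (name, 100 * count / total))
```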


We can double-check whether any rows share an OID. Because each query returns only the top-ranked class per object, there shouldn't be any.

In [16]:
# Identify duplicate OID entries - rows with same OID but different classes and probabilities
obsgroup = objects.group_by(['oid'])
duplicates = []
for key, group in zip(obsgroup.groups.keys, obsgroup.groups):
    if len(group) > 1:
        oid = group['oid'][0]
        duplicates.append(oid)

print('Number of duplicate OIDs: %i' % (len(duplicates)))

# Display example rows with duplicate OIDs
if len(duplicates) > 0:
    oid = duplicates[0]
    mask = (objects['oid'] == oid)
    # A bare expression inside an if block is not rendered, so call display() explicitly
    display(objects[mask])

Number of duplicate OIDs: 0
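
The same check can be written more compactly by counting OIDs directly. Here is a minimal sketch using `collections.Counter` on a plain list of identifiers. The list is invented for illustration, and `find_duplicate_oids` is a hypothetical helper; with the real table, one could presumably pass `list(objects['oid'])`.

```python
from collections import Counter


def find_duplicate_oids(oids):
    """Return the sorted OIDs that appear more than once."""
    counts = Counter(oids)
    return sorted(oid for oid, n in counts.items() if n > 1)


# Made-up identifiers; the first one is repeated on purpose
sample_oids = ["ZTF21aaaaaaa", "ZTF21aaaaaab", "ZTF21aaaaaaa", "ZTF21aaaaaac"]
print(find_duplicate_oids(sample_oids))
```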


## Exporting data to .xml <a class="anchor" id="export"></a>

```astropy``` provides a simple way to export a Table object to a VOTable .xml file:

In [31]:
# Export our queried objects table to an .xml file
writeto(objects, "ztf_API_output.xml")
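
The exported file is a VOTable, an XML format in which each column is described by a `FIELD` element. To peek at the column names without astropy, one could parse the XML with the standard library. The sketch below runs on a small hand-written sample rather than the real output, which will contain more fields and attributes; `field_names` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Hand-written miniature VOTable, for illustration only
SAMPLE_VOTABLE = """<VOTABLE version="1.4" xmlns="http://www.ivoa.net/xml/VOTable/v1.3">
 <RESOURCE type="results">
  <TABLE>
   <FIELD ID="oid" name="oid" datatype="char" arraysize="*"/>
   <FIELD ID="probability" name="probability" datatype="double"/>
   <DATA>
    <TABLEDATA>
     <TR><TD>ZTF21aaxbzdg</TD><TD>0.6109096</TD></TR>
    </TABLEDATA>
   </DATA>
  </TABLE>
 </RESOURCE>
</VOTABLE>
"""


def field_names(xml_text):
    """Return the column names declared by FIELD elements, in document order."""
    root = ET.fromstring(xml_text)
    # Tags carry the XML namespace as a '{uri}' prefix, so compare the local name
    return [el.get("name") for el in root.iter() if el.tag.rsplit("}", 1)[-1] == "FIELD"]


print(field_names(SAMPLE_VOTABLE))
```

To load the exported file back into an astropy table, `Table.read("ztf_API_output.xml", format="votable")` should work.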