## ZTF - Data Processing

In this notebook, we query the local UW/DIRAC database for ZTF alerts and process them into a format that can be used by THOR. 

The resulting processed data files can be downloaded [here](https://dirac.astro.washington.edu/~moeyensj/projects/thor/paper1/data/ztf).

In [1]:
import os
import glob
import numpy as np
import pandas as pd
import sqlite3 as sql

import mysql.connector as mariadb
from astropy.time import Time

from thor import __version__
print("THOR Version: {}".format(__version__))

THOR Version: 1.1.dev177+g1be98bd.d20210224


In [2]:
os.nice(1)

1

## Data Processing

Here we connect to the alert database and query it for roughly two weeks of observations, from night ID 610 up to and including night ID 624 (15 nights).

A description of the format of the alerts can be found here: https://zwickytransientfacility.github.io/ztf-avro-alert/schema.html

In [3]:
# Connect to database
con = mariadb.connect(user='ztf', database='ztf')

In [4]:
# Dates of the fixes applied to solar system object alerts (only alerts after the photometry fix are used)
sso_alert_fix_date1 = Time('2018-05-16T23:30:00', format='isot', scale='utc') # first attribution fix
sso_alert_fix_date2 = Time('2018-06-08T23:30:00', format='isot', scale='utc') # second attribution fix
sso_alert_phot_fix_date = Time('2018-06-18T23:30:00', format='isot', scale='utc') # photometry fix date
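
Cell 5 compares the `jd` column against the Julian date of the photometry fix. As a cross-check that does not need astropy, the same number can be sketched with the standard library. This assumes the usual relation JD = Unix time / 86400 + 2440587.5 and ignores leap-second handling; the helper name `utc_to_jd` is hypothetical and not part of THOR or astropy.

```python
from datetime import datetime, timezone

UNIX_EPOCH_JD = 2440587.5  # Julian date of 1970-01-01T00:00:00 UTC


def utc_to_jd(isot):
    """Convert an ISO 8601 UTC timestamp to a Julian date (no leap-second handling)."""
    dt = datetime.strptime(isot, "%Y-%m-%dT%H:%M:%S").replace(tzinfo=timezone.utc)
    return dt.timestamp() / 86400.0 + UNIX_EPOCH_JD


# Same timestamp as sso_alert_phot_fix_date
jd_phot_fix = utc_to_jd("2018-06-18T23:30:00")
print(jd_phot_fix)
```

The result should agree with `sso_alert_phot_fix_date.jd` to well within a second, which is more than enough for selecting alerts by night.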

In [5]:
# Count the distinct nights that have alerts after the photometry fix
jd_good = sso_alert_phot_fix_date.jd
#ssdistnr >= 0 
df = pd.read_sql_query('select distinct nid from alerts where jd > {}'.format(jd_good), con)
print(len(df))

497
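
The query above interpolates the Julian date into the SQL string with `str.format`. A possible alternative is to pass the value as a bound parameter, which avoids formatting and quoting issues. The sketch below uses an in-memory `sqlite3` database as a stand-in for the MariaDB server, with made-up rows and arbitrary night IDs, so it only illustrates the pattern and not the real table.

```python
import sqlite3

# In-memory stand-in for the alerts table; rows and night IDs are made up.
con_demo = sqlite3.connect(":memory:")
con_demo.execute("CREATE TABLE alerts (jd REAL, nid INTEGER)")
con_demo.executemany(
    "INSERT INTO alerts VALUES (?, ?)",
    [
        (2458280.70, 1),  # before the photometry fix
        (2458290.70, 2),  # after the fix
        (2458290.80, 2),  # same night, second alert
        (2458291.70, 3),  # after the fix
    ],
)

jd_cut = 2458288.4791666665  # JD of 2018-06-18T23:30:00 UTC

# The value is bound to the "?" placeholder instead of being formatted into the string.
rows = con_demo.execute(
    "SELECT DISTINCT nid FROM alerts WHERE jd > ? ORDER BY nid", (jd_cut,)
).fetchall()
nights = [nid for (nid,) in rows]
print(len(nights), nights)
con_demo.close()
```

With `mysql.connector` the placeholder would be `%s` rather than `?`, and `pd.read_sql_query` accepts the bound values through its `params` argument.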


In [6]:
# Set the night range (chosen as a two-week period with average alert volume)
night_range = [610, 624]
df = pd.read_sql_query('select * from alerts where nid >= {} and nid <= {}'.format(*night_range), con)
print(len(df))

4966353


In [7]:
df.sort_values(by=["jd"], inplace=True)
df.reset_index(inplace=True)

Keep only alerts with a real-bogus score (`rb`) of at least 0.5 and with at most 4 detections at the same position (`ndethist` <= 4), which removes static sources.

In [8]:
df = df[(df["rb"] >= 0.5) & (df["ndethist"] <= 4)]
len(df)

827546
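
The cut keeps 827,546 of the 4,966,353 alerts (about 17%). To make the boundary behaviour of the two conditions explicit, here is a minimal sketch on a toy DataFrame; the values are made up and chosen only to sit on either side of each threshold.

```python
import pandas as pd

# Toy alerts; values are made up to exercise the boundaries of the cut.
toy = pd.DataFrame({
    "rb":       [0.49, 0.50, 0.90, 0.90],
    "ndethist": [1,    4,    5,    2],
})

# Both comparisons are inclusive: rb == 0.5 and ndethist == 4 are kept.
mask = (toy["rb"] >= 0.5) & (toy["ndethist"] <= 4)
kept = toy[mask]
print(kept)
```

Row 0 fails the `rb` condition and row 2 fails the `ndethist` condition, so only rows 1 and 3 survive.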

In [9]:
df.to_csv("ztf_observations_610_624.csv", index=False, sep=" ")
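
Because the file is written with a single space as the delimiter, it has to be read back with the same separator. The round trip below is a sketch on a toy DataFrame written to a temporary directory; the file name `toy_observations.csv` and the values are hypothetical.

```python
import os
import tempfile

import pandas as pd

# Toy stand-in for the processed observations; values are made up.
toy = pd.DataFrame({"jd": [2458732.7, 2458732.8], "rb": [0.65, 0.91]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "toy_observations.csv")
    # Same options as the notebook: no index column, space-delimited
    toy.to_csv(path, index=False, sep=" ")
    # The separator must match the one used when writing
    restored = pd.read_csv(path, sep=" ")

print(restored.columns.tolist(), len(restored))
```

The same `sep=" "` argument applies when loading `ztf_observations_610_624.csv`.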