## Make multiwavelength light curves using archival data

### Summary:
 - model plots after van Velzen et al. 2021, https://arxiv.org/pdf/2111.09391.pdf
 
### Input:
 - a catalog of CLAGN from the literature

### Output:
 - an archival optical + IR + neutrino light curve
 
### Technical Goals:
 - should be able to run from a clean checkout from GitHub
 - should be able to automatically download all catalogs & images used
 - need to have all photometry in the same physical unit
 - need to have a data structure that is easy to use but holds light curve information (time and units) and is extendable to ML applications
 - need to have a curated list of catalogs to search for photometry that is generalizable to other input catalogs
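
For the common-unit goal, one option is to convert every magnitude to a flux density in mJy. The sketch below assumes AB magnitudes (zero point 3631 Jy); magnitudes on another system (e.g. Vega) would first need a band-specific zero point. The function name `abmag_to_mjy` is illustrative and not part of this notebook.

```python
import math

AB_ZEROPOINT_JY = 3631.0  # flux density of a 0 mag source in the AB system

def abmag_to_mjy(mag, mag_err=0.0):
    """Convert an AB magnitude (and its error) to a flux density in mJy."""
    flux_mjy = AB_ZEROPOINT_JY * 1e3 * 10 ** (-0.4 * mag)
    # first-order error propagation: dF = F * ln(10) / 2.5 * dm
    flux_err_mjy = flux_mjy * math.log(10) / 2.5 * mag_err
    return flux_mjy, flux_err_mjy

flux, flux_err = abmag_to_mjy(20.0, 0.1)
```

Catalogs that already report flux densities (for example in Jy) would only need a constant scale factor rather than this conversion.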
 
### Authors:
IPAC SP team

### Acknowledgements:
Suvi Gezari, Antara Basu-Zych,
MAST, HEASARC, & IRSA Fornax teams

In [59]:
import numpy as np
import time
import pandas as pd
import axs
import os
import sys
import re
import matplotlib.pyplot as plt
import json
import requests

from astroquery.ipac.ned import Ned
from astroquery.heasarc import Heasarc
from astroquery.gaia import Gaia

from astropy.coordinates import SkyCoord
import astropy.units as u
from astropy.table import Table, vstack, hstack, unique
from astropy.io import ascii


try: # Python 3.x
    from urllib.parse import quote as urlencode
    from urllib.request import urlretrieve
except ImportError:  # Python 2.x
    from urllib import pathname2url as urlencode
    from urllib import urlretrieve

try: # Python 3.x
    import http.client as httplib 
except ImportError:  # Python 2.x
    import httplib   

!pip install lightkurve --upgrade
import lightkurve as lk

!pip install ztfquery
from ztfquery import lightcurve


## 1. Define the Sample

In [2]:
# use the following paper to make a sample of CLAGN: https://iopscience.iop.org/article/10.3847/1538-4357/aaca3a 

# This sample can later be switched out for a different/larger sample of "interesting" targets

#use ADS to find the refcode for this paper
CLAGN = Ned.query_refcode('2018ApJ...862..109Y')



### What is the best data structure for this work?
 - list of requirements is being kept here: https://github.com/fornax-navo/fornax-demo-notebooks/issues/69 
 - some things to keep an eye on as other people are actively working on this field
     - astropy has a light curve class
         - would require development work to make this work for multiwavelength applications
     - LINCC people are interested in this and might have some suggestions on a 6-month timescale
     - xarray
     - pint-pandas has units support but also has a warning that it doesn't yet work perfectly
     - lightkurve is not suitable for this application
     - sunpy is also not suitable for this application

### Since there is nothing perfectly ready now, we need to go with something practical for the time being
 - instead of one large dataframe with the multiwavelength information, we could keep them as separate astropy light curves for each band, do the feature extraction on each light curve, and keep the features in one large dataframe. Open question: how would we link targets between bands?
 - ZTF keeps the light curve info as multidimensional arrays in pandas columns - this works out of the box but doesn't have unit support so we just need to do that manually.
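
One way to keep per-band light curves separate while still linking them to a target is to key each curve by `(objectid, band)` and carry the units as plain strings. This is only a sketch of that idea: the class names `BandLightCurve` and `MultibandCollection` are hypothetical, and the numbers are placeholder values, not real photometry.

```python
from dataclasses import dataclass, field

@dataclass
class BandLightCurve:
    """One target in one band; units are stored manually as strings."""
    objectid: int
    band: str
    time: list
    flux: list
    flux_err: list
    time_unit: str = "MJD"
    flux_unit: str = "mJy"

@dataclass
class MultibandCollection:
    """Light curves keyed by (objectid, band) so bands stay linked per target."""
    curves: dict = field(default_factory=dict)

    def add(self, lc):
        self.curves[(lc.objectid, lc.band)] = lc

    def bands_for(self, objectid):
        """Return the sorted list of bands available for one target."""
        return sorted(band for (oid, band) in self.curves if oid == objectid)

# placeholder values for illustration only
collection = MultibandCollection()
collection.add(BandLightCurve(0, "zg", [58000.1, 58003.2], [0.11, 0.12], [0.01, 0.01]))
collection.add(BandLightCurve(0, "W1", [57000.0], [0.80], [0.05]))
collection.add(BandLightCurve(1, "zr", [58001.5], [0.20], [0.02]))
```

Features extracted later could be stored in a single dataframe with `objectid` as the index, which keeps the link back to the individual band light curves.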

In [3]:
type(CLAGN)

astropy.table.table.Table

In [4]:
#### Build a list of skycoords from target ra and dec #####
coords_list = [
    SkyCoord(ra, dec, frame='icrs', unit='deg')
    for ra, dec in zip(CLAGN['RA'], CLAGN['DEC'])
]


In [95]:
coords_list

[<SkyCoord (ICRS): (ra, dec) in deg
     (0.28136, -0.09789)>,
 <SkyCoord (ICRS): (ra, dec) in deg
     (21.70037, -8.66335)>,
 <SkyCoord (ICRS): (ra, dec) in deg
     (29.99, 0.55301)>,
 <SkyCoord (ICRS): (ra, dec) in deg
     (29.99015, 0.55288)>,
 <SkyCoord (ICRS): (ra, dec) in deg
     (120.94815, 42.97747)>,
 <SkyCoord (ICRS): (ra, dec) in deg
     (127.88438, 36.77146)>,
 <SkyCoord (ICRS): (ra, dec) in deg
     (132.49077, 27.79139)>,
 <SkyCoord (ICRS): (ra, dec) in deg
     (137.38346, 47.79186)>,
 <SkyCoord (ICRS): (ra, dec) in deg
     (144.37635, 26.04226)>,
 <SkyCoord (ICRS): (ra, dec) in deg
     (144.39777, 32.5472)>,
 <SkyCoord (ICRS): (ra, dec) in deg
     (150.84777, 35.41774)>,
 <SkyCoord (ICRS): (ra, dec) in deg
     (152.97077, 54.7018)>,
 <SkyCoord (ICRS): (ra, dec) in deg
     (166.09674, 63.71816)>,
 <SkyCoord (ICRS): (ra, dec) in deg
     (166.22989, 1.31573)>,
 <SkyCoord (ICRS): (ra, dec) in deg
     (167.60602, -0.05948)>,
 <SkyCoord (ICRS): (ra, dec) in deg
  

## 2. Find light curves for these targets in NASA catalogs
- look at NAVO use cases to get help with tools to do this - although they mostly use pyvo
- deciding up front to use astroquery instead of pyvo
    - astroquery is apparently more user friendly
- data access concerns:
    - can't ask the archives to search their entire holdings
        - metadata is not good enough
        - not clear that the data is all vetted and good enough to include for science
        - all catalogs have differently named columns so how would we know which columns to keep
    - instead work with a curated list of catalogs for each archive
        - focus on general surveys
        - try to ensure that this list is also appropriate for a generalization of this use case to other input catalogs
        - could astroquery.NED be useful in finding a generalized curated list?
- How do we know we have a match that is good enough to include in our light curve?
     - look at nway for the high energy catalogs
     - probably need to generate a table of search radii for each catalog based on bandpass
         - need domain knowledge for that
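
A table of search radii per catalog could start as a simple dictionary plus an angular-separation check. In this sketch the radii are the ones used elsewhere in this notebook (about 1 arcsec for ZTF, Pan-STARRS and Gaia, 2 arcsec for NEOWISE) and should be treated as placeholders until someone with domain knowledge reviews them. The names `MATCH_RADIUS_ARCSEC`, `separation_arcsec` and `is_match` are illustrative.

```python
import math

# placeholder match radii in arcsec; real values need domain knowledge per catalog
MATCH_RADIUS_ARCSEC = {"ZTF": 1.0, "Pan-STARRS": 1.0, "NEOWISE": 2.0, "Gaia": 1.0}

def separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation (haversine formula) of two positions given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a))) * 3600.0

def is_match(catalog, ra1, dec1, ra2, dec2):
    """True if the two positions lie within the match radius for this catalog."""
    return separation_arcsec(ra1, dec1, ra2, dec2) <= MATCH_RADIUS_ARCSEC[catalog]

# two entries of the CLAGN sample lie very close to each other on the sky
close_pair_sep = separation_arcsec(29.99, 0.55301, 29.99015, 0.55288)
```

The close pair in the sample is a reminder that a fixed radius can return the same catalog source for two different input targets, so the matching step may need to flag such cases.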
     


## 2.1 HEASARC: Fermi & BeppoSAX


In [41]:
mission_list = ['FERMIGTRIG', 'SAXGRBMGRB']
radius = 0.1*u.degree
heasarc = Heasarc()  #instantiate the query class imported above

for coord in coords_list:
    #use astroquery to search that position for either a Fermi or BeppoSAX trigger
    for mission in mission_list:
        try:
            results = heasarc.query_region(coord, mission=mission, radius=radius)#, sortvar = 'SEARCH_OFFSET_')
            print("got a live one")
            #need to figure out what this result would look like and how to add that to the saved data structure
        except AttributeError:
            print("no results at that location for ", mission)


#**** These HEASARC searches are returning an attribute error because of an astroquery bug
# bug submitted to astroquery Oct 18, waiting for a fix.
# if that gets fixed, can probably change this cell 

no results at that location for  FERMIGTRIG
no results at that location for  SAXGRBMGRB
no results at that location for  FERMIGTRIG
no results at that location for  SAXGRBMGRB
no results at that location for  FERMIGTRIG
no results at that location for  SAXGRBMGRB
no results at that location for  FERMIGTRIG
no results at that location for  SAXGRBMGRB
no results at that location for  FERMIGTRIG
no results at that location for  SAXGRBMGRB
no results at that location for  FERMIGTRIG
no results at that location for  SAXGRBMGRB
no results at that location for  FERMIGTRIG
no results at that location for  SAXGRBMGRB
no results at that location for  FERMIGTRIG
no results at that location for  SAXGRBMGRB
no results at that location for  FERMIGTRIG
no results at that location for  SAXGRBMGRB
no results at that location for  FERMIGTRIG
no results at that location for  SAXGRBMGRB
no results at that location for  FERMIGTRIG
no results at that location for  SAXGRBMGRB
no results at that location for 

## 2.2 IRSA: ZTF

In [92]:
#python package ztfquery is not a good solution for this because it requires IRSA password
#Instead will construct the URL for an API query
#https://irsa.ipac.caltech.edu/docs/program_interface/ztf_lightcurve_api.html
ztf_radius = 0.000278   #degrees (~1 arcsec), as suggested by Dave Shupe

for count, coord in enumerate(coords_list):
    #doesn't take SkyCoord
    ra = CLAGN['RA'][count]
    dec = CLAGN['DEC'][count]
    
    #make the string for the URL query
    #ask for all three bands (g, r, i)
    #don't want data that is flagged as unusable by the ZTF pipeline
    urlstr = 'https://irsa.ipac.caltech.edu/cgi-bin/ZTF/nph_light_curves?POS=CIRCLE %f %f %f&BANDNAME=g,r,i&FORMAT=ipac_table&BAD_CATFLAGS_MASK=32768'%(ra, dec,ztf_radius)

    response = requests.get(urlstr)
    if response.ok:
        ztf_lc = ascii.read(response.text, format='ipac')
        print(count, len(ztf_lc))  #number of epochs returned, summed over all bands
        #this could be up to 3 light curves because there are 3 filters
        #need to sort by filtercode 'zg','zr','zi'
        #and store the light curves
    else:
        print(count, " There is no ZTF light curve at this position")

0 404
1 429
2 612
3 612
4 387
5 1109
6 1202
7 162
8 791
9 525
10 888
11 2346
12 931
13 183
14 347
15 423
16 1029
17 397
18 1094
19 1088
20 1300
21 1359
22 407
23 33
24 1901
25 644
26 1010
27 2295
28 883
29 2024
30 438
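
The "sort by filtercode and store" step noted in the cell above could look roughly like the following. This is a sketch under the assumption that each returned row exposes `filtercode`, `mjd`, `mag` and `magerr` fields; the rows here are placeholder dictionaries standing in for the table rows, and `split_by_filter` is a hypothetical helper name.

```python
from collections import defaultdict

def split_by_filter(rows):
    """Group light-curve rows by filter code and sort each group by time."""
    by_filter = defaultdict(list)
    for row in rows:
        by_filter[row["filtercode"]].append((row["mjd"], row["mag"], row["magerr"]))
    return {band: sorted(points) for band, points in by_filter.items()}

# placeholder rows for illustration only, not real ZTF photometry
rows = [
    {"filtercode": "zr", "mjd": 58210.3, "mag": 18.9, "magerr": 0.05},
    {"filtercode": "zg", "mjd": 58205.2, "mag": 19.3, "magerr": 0.06},
    {"filtercode": "zr", "mjd": 58202.4, "mag": 18.8, "magerr": 0.05},
]
ztf_by_band = split_by_filter(rows)
```

Because astropy table rows can be indexed by column name, the same function should also accept the rows of `ztf_lc` directly.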


## 2.3 IRSA: WISE

- Dave Shupe has made a catalog of NEOWISE light curves covering half the sky, stored as a parquet file

- Pandas is not a good option for working with this catalog because it is so large (2 billion rows?)

- Instead we can use AXS to cross-match the CLAGN sample with the NEOWISE catalog and find the NEOWISE rows that correspond to the CLAGN sample. AXS (Astronomy eXtensions for Spark) is built on Apache Spark.


In [12]:
#%%time
#could load the neowise light curves into pandas, but would need to severely
# filter the catalog to get it to fit into memory.  Since these targets are all over the sky
# it is not obvious how to filter the catalog

#Here is one way it could work in Pandas if we had a way to filter significantly before matching
#subset = pd.read_parquet('/stage/irsa-data-download10/parquet-work/NEOWISE-R/neowise_lc_half.parquet',
#                    engine='pyarrow', 
#                    filters=[ ('ra', '<', 121) , ('ra', '>', 120) , 
#                            ('dec', '<', 68) , ('dec', '>', -9),
#                            ('cw_w1mpro', '>', 15.0) ])
#
#len(subset)

CPU times: user 1h 53min 48s, sys: 4h 16min 37s, total: 6h 10min 25s
Wall time: 23min 46s


In [7]:
#start up SPARK
os.environ['SPARK_CONF_DIR'] = '/home/jkrick/axs_store/conf_alt'

def spark_start(work_dir, database_dir, warehouse_dir):
    from pyspark.sql import SparkSession
    import os
    
    spark = (
            SparkSession.builder
            .appName("spark trial")
            .config("spark.sql.warehouse.dir", warehouse_dir)
            .config('spark.master', "local[20]")
            .config('spark.driver.memory', '64G') # 128
            .config('spark.executor.memory', '30G')
            .config('spark.local.dir', work_dir)
            .config('spark.memory.offHeap.enabled', 'true')
            .config('spark.memory.offHeap.size', '128G') # 256
            .config("spark.sql.execution.arrow.enabled", "true")
            .config("spark.driver.maxResultSize", "60G")
            .config("spark.driver.extraJavaOptions", 
                    f"-Dderby.system.home={database_dir}")
            .config("spark.sql.hive.metastore.sharedPrefixes",
                    "org.apache.derby")
            .enableHiveSupport()
            .getOrCreate()
                    )   

    return spark

spark_session = spark_start(
    "/stage/irsa-staff-jkrick/spark_work",
    "/home/jkrick/axs_store",
    "/stage/irsa-staff-jkrick/sp_axs_warehouse/warehouse")

In [8]:
#if the one we want is not yet available, add it to the list
catalog = axs.AxsCatalog(spark_session)
catlist = catalog.list_table_names()

if 'neowise_lc_half' not in catlist:
    catalog.import_existing_table('neowise_lc_half', 
        '/stage/irsa-data-download10/parquet-work/NEOWISE-R/neowise_lc_half.parquet',
        import_into_spark=True)

In [9]:
#lazy load in the catalog
neowise_lc_half = catalog.load('neowise_lc_half')

In [10]:
#now figure out how to get the CLAGN catalog into AXS
#can't go direct from astropy table into AXS, so first to pandas

if 'axs_clagn' not in catlist:

    pd_CLAGN = CLAGN.to_pandas()

    #then pandas to spark dataframe
    sp_CLAGN = spark_session.createDataFrame(pd_CLAGN)

    #ok, saving below can't handle capital "RA" and "DEC", so need to change that
    #also can't handle column names with spaces in them so need to rename those as well.
    rename_map = {
        "RA": "ra",
        "DEC": "dec",
        "Object Name": "Object_name",
        "Redshift Flag": "redshift_flag",
        "Magnitude and Filter": "magnitude_and_filter",
        "Photometry Points": "photometry_points",
        "Redshift Points": "redshift_points",
        "Diameter Points": "diameter_points",
    }
    sp_CLAGN2 = sp_CLAGN
    for old_name, new_name in rename_map.items():
        sp_CLAGN2 = sp_CLAGN2.withColumnRenamed(old_name, new_name)

    #now save spark to AXS
    catalog.save_axs_table(sp_CLAGN2, 'AXS_CLAGN', calculate_zone=True)

In [11]:
#just confirm that worked:
catalog = axs.AxsCatalog(spark_session)
catlist = catalog.list_table_names()
catlist

['gaia_edr3', 'catwise_corrected', 'neowise_lc_half', 'axs_clagn']

In [12]:
#lazy load in the catalog
axs_clagn = catalog.load('axs_clagn')

In [13]:
#ready to try the crossmatch

neowise_CLAGN = neowise_lc_half.crossmatch(axs_clagn, 2*axs.Constants.ONE_ASEC, return_min = True, include_dist_col = True)



In [14]:
%%time
#lazy evaluation means the cross match won't happen until this cell gets executed
neowise_CLAGN.count()

CPU times: user 22.4 ms, sys: 12 ms, total: 34.4 ms
Wall time: 2min 25s


29

In [20]:
type(neowise_CLAGN)

axs.axsframe.AxsFrame

In [93]:
%%time
#now get it into a format that I can handle
#this is taking a long time 45min? for 29 rows?

#neowise_CLAGN.toPandas()

#instead maybe try parquet?  csv doesn't work since there are arrays in the columns
#neowise_CLAGN.write.parquet("neowise_CLAGN.parquet")


CPU times: user 6 µs, sys: 2 µs, total: 8 µs
Wall time: 15.7 µs


In [17]:
%%time
#instead try pulling the data into a pandas dataframe
#is this faster? no 1h 58 min.
#pd_neowise_CLAGN = pd.DataFrame.from_records(neowise_CLAGN.collect(), columns=neowise_CLAGN.columns)

CPU times: user 1.64 s, sys: 529 ms, total: 2.17 s
Wall time: 1h 58min 12s


## 2.4 MAST: Pan-STARRS

In [None]:
def ps1cone(ra,dec,radius,table="mean",release="dr1",format="csv",columns=None,
           baseurl="https://catalogs.mast.stsci.edu/api/v0.1/panstarrs", verbose=False,
           **kw):
    """Do a cone search of the PS1 catalog
    
    Parameters
    ----------
    ra (float): (degrees) J2000 Right Ascension
    dec (float): (degrees) J2000 Declination
    radius (float): (degrees) Search radius (<= 0.5 degrees)
    table (string): mean, stack, or detection
    release (string): dr1 or dr2
    format: csv, votable, json
    columns: list of column names to include (None means use defaults)
    baseurl: base URL for the request
    verbose: print info about request
    **kw: other parameters (e.g., 'nDetections.min':2)
    """
    
    data = kw.copy()
    data['ra'] = ra
    data['dec'] = dec
    data['radius'] = radius
    return ps1search(table=table,release=release,format=format,columns=columns,
                    baseurl=baseurl, verbose=verbose, **data)


def ps1search(table="mean",release="dr1",format="csv",columns=None,
           baseurl="https://catalogs.mast.stsci.edu/api/v0.1/panstarrs", verbose=False,
           **kw):
    """Do a general search of the PS1 catalog (possibly without ra/dec/radius)
    
    Parameters
    ----------
    table (string): mean, stack, or detection
    release (string): dr1 or dr2
    format: csv, votable, json
    columns: list of column names to include (None means use defaults)
    baseurl: base URL for the request
    verbose: print info about request
    **kw: other parameters (e.g., 'nDetections.min':2).  Note this is required!
    """
    
    data = kw.copy()
    if not data:
        raise ValueError("You must specify some parameters for search")
    checklegal(table,release)
    if format not in ("csv","votable","json"):
        raise ValueError("Bad value for format")
    url = f"{baseurl}/{release}/{table}.{format}"
    if columns:
        # check that column values are legal
        # create a dictionary to speed this up
        dcols = {}
        for col in ps1metadata(table,release)['name']:
            dcols[col.lower()] = 1
        badcols = []
        for col in columns:
            if col.lower().strip() not in dcols:
                badcols.append(col)
        if badcols:
            raise ValueError('Some columns not found in table: {}'.format(', '.join(badcols)))
        # two different ways to specify a list of column values in the API
        # data['columns'] = columns
        data['columns'] = '[{}]'.format(','.join(columns))

# either get or post works
#    r = requests.post(url, data=data)
    r = requests.get(url, params=data)

    if verbose:
        print(r.url)
    r.raise_for_status()
    if format == "json":
        return r.json()
    else:
        return r.text


def checklegal(table,release):
    """Checks if this combination of table and release is acceptable
    
    Raises a ValueError exception if there is a problem
    """
    
    releaselist = ("dr1", "dr2")
    if release not in releaselist:
        raise ValueError("Bad value for release (must be one of {})".format(', '.join(releaselist)))
    if release=="dr1":
        tablelist = ("mean", "stack")
    else:
        tablelist = ("mean", "stack", "detection")
    if table not in tablelist:
        raise ValueError("Bad value for table (for {} must be one of {})".format(release, ", ".join(tablelist)))


def ps1metadata(table="mean",release="dr1",
           baseurl="https://catalogs.mast.stsci.edu/api/v0.1/panstarrs"):
    """Return metadata for the specified catalog and table
    
    Parameters
    ----------
    table (string): mean, stack, or detection
    release (string): dr1 or dr2
    baseurl: base URL for the request
    
    Returns an astropy table with columns name, type, description
    """
    
    checklegal(table,release)
    url = f"{baseurl}/{release}/{table}/metadata"
    r = requests.get(url)
    r.raise_for_status()
    v = r.json()
    # convert to astropy table
    tab = Table(rows=[(x['name'],x['type'],x['description']) for x in v],
               names=('name','type','description'))
    return tab


def addfilter(dtab):
    """Add filter name as column in detection table by translating filterID
    
    This modifies the table in place.  If the 'filter' column already exists,
    the table is returned unchanged.
    """
    if 'filter' not in dtab.colnames:
        # the filterID value goes from 1 to 5 for grizy
        id2filter = np.array(list('grizy'))
        dtab['filter'] = id2filter[(dtab['filterID']-1).data]
    return dtab


In [None]:
#try for panstarrs
radius = 1.0/3600.0 # radius = 1 arcsec
plt.rcParams.update({'font.size': 14})
plt.figure(1,(10,10))

        
#for all objects
for count, coord in enumerate(coords_list):
    #doesn't take SkyCoord
    ra = CLAGN['RA'][count]
    dec = CLAGN['DEC'][count]

    #see if there is an object in panSTARRS at this location
    results = ps1cone(ra,dec,radius,release='dr2')
    tab = ascii.read(results)
    
    # improve the format
    for filter in 'grizy':
        col = filter+'MeanPSFMag'
        tab[col].format = ".4f"
        tab[col][tab[col] == -999.0] = np.nan
        
    #in case there is more than one object within 1 arcsec, sort them by match distance
    tab.sort('distance')
    
    #if there is an object at that location
    if len(tab) > 0:   
        #got a live one
        #print( 'for object', count, 'there is ',len(tab), 'match in panSTARRS', tab['objID'])

        #take the closest match as the best match
        objid = tab['objID'][0]
        
        #setup to pull light curve info
        dconstraints = {'objID': objid}
        dcolumns = ("""objID,detectID,filterID,obsTime,ra,dec,psfFlux,psfFluxErr,psfMajorFWHM,psfMinorFWHM,
                    psfQfPerfect,apFlux,apFluxErr,infoFlag,infoFlag2,infoFlag3""").split(',')
        # strip blanks and weed out blank and commented-out values
        dcolumns = [x.strip() for x in dcolumns]
        dcolumns = [x for x in dcolumns if x and not x.startswith('#')]


        #get the actual detections and light curve info for this target
        dresults = ps1search(table='detection',release='dr2',columns=dcolumns,**dconstraints)
        
        #sometimes there isn't actually a light curve for the target???
        try:
            ascii.read(dresults)
        except FileNotFoundError:
            print("There is no light curve")
            #no need to store PanSTARRS data for this one
        else:
            #There is a light curve for this target
            
            #fix the column names to include filter names
            dtab = addfilter(ascii.read(dresults))
            dtab.sort('obsTime')

            #not yet ready to store these, but here is the light curve
            #mixed from all 5 bands
            t = dtab['obsTime']
            flux = dtab['psfFlux']

            #plot light curves on same plot just to know they are there?
            #not currently working
            #xlim = np.array([t.min(),t.max()])
            #xlim = xlim + np.array([-1,1])*0.02*(xlim[1]-xlim[0])
            #for i, filter in enumerate("grizy"):
            #    plt.subplot(511+i)
            #    w = np.where(dtab['filter']==filter)
            #    plt.plot(t[w],flux[w],'-o')
            #    plt.ylabel(filter+' [Jy]')
            #    plt.xlim(xlim)
            #    #plt.gca().invert_yaxis()
            #    if i==0:
            #        plt.title(objid)
            #plt.xlabel('Time [MJD]')
            #plt.tight_layout()

## 2.5 MAST: ATLAS all-sky stellar reference catalog (g, r, i) < 19mag
 - MAST has this catalog, but it is not clear that it includes the individual epoch photometry, and it is only accessible with CasJobs, not through Python notebooks.

 https://archive.stsci.edu/hlsp/atlas-refcat2#section-a737bc3e-2d56-4827-9ab4-838fbf8d67c1
 
 - if we really want to pursue this, we can put in a MAST helpdesk ticket to see if a) they do have the light curves, and b) they could make the catalog searchable from Python. There are some ways of accessing CasJobs through Python (<https://github.com/spacetelescope/notebooks/blob/master/notebooks/MAST/HSC/HCV_CASJOBS/HCV_casjobs_demo.ipynb>), but apparently not for this particular catalog.
 

## 2.6 MAST: TESS, Kepler and K2
 - use lightkurve to search all 3

In [5]:
radius = 1.0  #arcseconds

#for all objects
for count, coord in enumerate(coords_list):
    print("working on object", count, coord)
    
    #use lightkurve to search TESS, Kepler and K2
    search_result = lk.search_lightcurve(coord, radius = radius)
    
    #figure out what to do with the results
    if len(search_result) < 1:
        #there is no data in these missions at this location
        print(len(search_result))
    else:
        #don't know what this looks like because none of these targets has a light curve
        #https://docs.lightkurve.org/tutorials/1-getting-started/searching-for-data-products.html
        #has a tutorial on how to do this
        #might look something like this:
        #lc_collection = search_result.download_all()
        pass


working on object 0 <SkyCoord (ICRS): (ra, dec) in deg
    (0.28136, -0.09789)>


No data found for target "<SkyCoord (ICRS): (ra, dec) in deg
    (0.28136, -0.09789)>".


0
working on object 1 <SkyCoord (ICRS): (ra, dec) in deg
    (21.70037, -8.66335)>


No data found for target "<SkyCoord (ICRS): (ra, dec) in deg
    (21.70037, -8.66335)>".


0
working on object 2 <SkyCoord (ICRS): (ra, dec) in deg
    (29.99, 0.55301)>


No data found for target "<SkyCoord (ICRS): (ra, dec) in deg
    (29.99, 0.55301)>".


0
working on object 3 <SkyCoord (ICRS): (ra, dec) in deg
    (29.99015, 0.55288)>


No data found for target "<SkyCoord (ICRS): (ra, dec) in deg
    (29.99015, 0.55288)>".


0
working on object 4 <SkyCoord (ICRS): (ra, dec) in deg
    (120.94815, 42.97747)>


No data found for target "<SkyCoord (ICRS): (ra, dec) in deg
    (120.94815, 42.97747)>".


0
working on object 5 <SkyCoord (ICRS): (ra, dec) in deg
    (127.88438, 36.77146)>


No data found for target "<SkyCoord (ICRS): (ra, dec) in deg
    (127.88438, 36.77146)>".


0
working on object 6 <SkyCoord (ICRS): (ra, dec) in deg
    (132.49077, 27.79139)>


No data found for target "<SkyCoord (ICRS): (ra, dec) in deg
    (132.49077, 27.79139)>".


0
working on object 7 <SkyCoord (ICRS): (ra, dec) in deg
    (137.38346, 47.79186)>


No data found for target "<SkyCoord (ICRS): (ra, dec) in deg
    (137.38346, 47.79186)>".


0
working on object 8 <SkyCoord (ICRS): (ra, dec) in deg
    (144.37635, 26.04226)>


No data found for target "<SkyCoord (ICRS): (ra, dec) in deg
    (144.37635, 26.04226)>".


0
working on object 9 <SkyCoord (ICRS): (ra, dec) in deg
    (144.39777, 32.5472)>


No data found for target "<SkyCoord (ICRS): (ra, dec) in deg
    (144.39777, 32.5472)>".


0
working on object 10 <SkyCoord (ICRS): (ra, dec) in deg
    (150.84777, 35.41774)>


No data found for target "<SkyCoord (ICRS): (ra, dec) in deg
    (150.84777, 35.41774)>".


0
working on object 11 <SkyCoord (ICRS): (ra, dec) in deg
    (152.97077, 54.7018)>


No data found for target "<SkyCoord (ICRS): (ra, dec) in deg
    (152.97077, 54.7018)>".


0
working on object 12 <SkyCoord (ICRS): (ra, dec) in deg
    (166.09674, 63.71816)>


No data found for target "<SkyCoord (ICRS): (ra, dec) in deg
    (166.09674, 63.71816)>".


0
working on object 13 <SkyCoord (ICRS): (ra, dec) in deg
    (166.22989, 1.31573)>


No data found for target "<SkyCoord (ICRS): (ra, dec) in deg
    (166.22989, 1.31573)>".


0
working on object 14 <SkyCoord (ICRS): (ra, dec) in deg
    (167.60602, -0.05948)>


No data found for target "<SkyCoord (ICRS): (ra, dec) in deg
    (167.60602, -0.05948)>".


0
working on object 15 <SkyCoord (ICRS): (ra, dec) in deg
    (168.90238, 5.74715)>


No data found for target "<SkyCoord (ICRS): (ra, dec) in deg
    (168.90238, 5.74715)>".


0
working on object 16 <SkyCoord (ICRS): (ra, dec) in deg
    (169.62351, 32.06666)>


No data found for target "<SkyCoord (ICRS): (ra, dec) in deg
    (169.62351, 32.06666)>".


0
working on object 17 <SkyCoord (ICRS): (ra, dec) in deg
    (173.12142, 3.95808)>


No data found for target "<SkyCoord (ICRS): (ra, dec) in deg
    (173.12142, 3.95808)>".


0
working on object 18 <SkyCoord (ICRS): (ra, dec) in deg
    (177.66385, 36.54956)>


No data found for target "<SkyCoord (ICRS): (ra, dec) in deg
    (177.66385, 36.54956)>".


0
working on object 19 <SkyCoord (ICRS): (ra, dec) in deg
    (178.11465, 32.16646)>


No data found for target "<SkyCoord (ICRS): (ra, dec) in deg
    (178.11465, 32.16646)>".


0
working on object 20 <SkyCoord (ICRS): (ra, dec) in deg
    (194.81978, 55.25199)>


No data found for target "<SkyCoord (ICRS): (ra, dec) in deg
    (194.81978, 55.25199)>".


0
working on object 21 <SkyCoord (ICRS): (ra, dec) in deg
    (199.87808, 67.89872)>


No data found for target "<SkyCoord (ICRS): (ra, dec) in deg
    (199.87808, 67.89872)>".


0
working on object 22 <SkyCoord (ICRS): (ra, dec) in deg
    (209.07707, -1.2539)>


No data found for target "<SkyCoord (ICRS): (ra, dec) in deg
    (209.07707, -1.2539)>".


0
working on object 23 <SkyCoord (ICRS): (ra, dec) in deg
    (209.73263, 49.5706)>


No data found for target "<SkyCoord (ICRS): (ra, dec) in deg
    (209.73263, 49.5706)>".


0
working on object 24 <SkyCoord (ICRS): (ra, dec) in deg
    (221.97599, 28.55669)>


No data found for target "<SkyCoord (ICRS): (ra, dec) in deg
    (221.97599, 28.55669)>".


0
working on object 25 <SkyCoord (ICRS): (ra, dec) in deg
    (233.48331, 1.17494)>


No data found for target "<SkyCoord (ICRS): (ra, dec) in deg
    (233.48331, 1.17494)>".


0
working on object 26 <SkyCoord (ICRS): (ra, dec) in deg
    (236.3735, 25.19107)>


No data found for target "<SkyCoord (ICRS): (ra, dec) in deg
    (236.3735, 25.19107)>".


0
working on object 27 <SkyCoord (ICRS): (ra, dec) in deg
    (237.57179, 41.65064)>


No data found for target "<SkyCoord (ICRS): (ra, dec) in deg
    (237.57179, 41.65064)>".


0
working on object 28 <SkyCoord (ICRS): (ra, dec) in deg
    (238.24292, 27.62456)>


No data found for target "<SkyCoord (ICRS): (ra, dec) in deg
    (238.24292, 27.62456)>".


0
working on object 29 <SkyCoord (ICRS): (ra, dec) in deg
    (238.66774, 36.4978)>


No data found for target "<SkyCoord (ICRS): (ra, dec) in deg
    (238.66774, 36.4978)>".


0
working on object 30 <SkyCoord (ICRS): (ra, dec) in deg
    (354.01242, 0.29132)>


No data found for target "<SkyCoord (ICRS): (ra, dec) in deg
    (354.01242, 0.29132)>".


0


## 2.7 MAST: HCV
 - Hubble Catalog of Variables (https://archive.stsci.edu/hlsp/hcv)
 - follow the notebook here to see how to search and download light curves: https://archive.stsci.edu/hst/hsc/help/HCV/HCV_API_demo.html

## 3. Find light curves for these targets in relevant, non-NASA catalogs


### Gaia (Faisst)
- astroquery.gaia will presumably work out of the box for this

In [12]:
############ EXTRACT GAIA DATA FOR OBJECTS ##########

## Select Gaia table (EDR3)
Gaia.MAIN_GAIA_TABLE = "gaiaedr3.gaia_source"

## Define search radius
radius = u.Quantity(20, u.arcsec)

## Search and Cross match.
# This can be done in a smarter way by matching catalogs on the Gaia server, or grouping the
# sources and search a larger area.

# get catalog
gaia_table = Table()
t1 = time.time()
for cc,coord in enumerate(coords_list):
    print(len(coords_list)-cc , end=" ")

    gaia_search = Gaia.cone_search_async(coordinate=coord, radius=radius , background=True)
    gaia_search.get_data()["dist"].unit = "deg"
    gaia_search.get_data()["dist"] = gaia_search.get_data()["dist"].to(u.arcsec) # Change distance unit from degrees to arcseconds
    
    
    # match
    if len(gaia_search.get_data()["dist"]) > 0:
        gaia_search.get_data()["input_object_name"] = CLAGN["Object Name"][cc] # add input object name to catalog
        sel_min = np.where( (gaia_search.get_data()["dist"] < 1*u.arcsec) & (gaia_search.get_data()["dist"] == np.nanmin(gaia_search.get_data()["dist"]) ) )[0]
    else:
        sel_min = []
        
    #print("Number of sources matched: {}".format(len(sel_min)) )
    
    # only stack rows when there is a match within 1 arcsec
    if len(sel_min) > 0:
        gaia_table = vstack( [gaia_table , gaia_search.get_data()[sel_min]] )

print("\nSearch completed in {:.2f} seconds".format((time.time()-t1) ) )
print("Number of objects matched: {} out of {}.".format(len(gaia_table),len(CLAGN) ) )

31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 
Search completed in 79.36 seconds
Number of objects matched: 28 out of 31.


In [13]:
########## EXTRACT PHOTOMETRY #########
# Note that the fluxes are in e-/s, which is not very useful. There are also magnitudes (Vega system), but without errors.
# We derive the magnitude errors from the flux errors: sigma_mag = 2.5/ln(10) * sigma_flux/flux.

## Define keys (columns) that will be used later. Also add wavelength in angstroms for each filter
mag_keys = ["phot_bp_mean_mag" , "phot_g_mean_mag" , "phot_rp_mean_mag"]
magerr_keys = ["phot_bp_mean_mag_error" , "phot_g_mean_mag_error" , "phot_rp_mean_mag_error"]
flux_keys = ["phot_bp_mean_flux" , "phot_g_mean_flux" , "phot_rp_mean_flux"]
fluxerr_keys = ["phot_bp_mean_flux_error" , "phot_g_mean_flux_error" , "phot_rp_mean_flux_error"]
mag_lambda = ["5319.90" , "6735.42" , "7992.90"]

## Get photometry. Note that this includes only objects that are 
# matched to the catalog. We have to add the missing ones later.
_phot = gaia_table[mag_keys]
_err = hstack( [ 2.5/np.log(10) * gaia_table[e]/gaia_table[f] for e,f in zip(fluxerr_keys,flux_keys) ] )
gaia_phot2 = hstack( [_phot , _err] )

## Clean up (change units and column names)
_ = [gaia_phot2.rename_column(f,m) for m,f in zip(magerr_keys,fluxerr_keys)]
for key in magerr_keys:
    gaia_phot2[key].unit = "mag"
gaia_phot2["input_object_name"] = gaia_table["input_object_name"].copy()

## Also add object for which we don't have photometry.
# Add Nan for now, need to think about proper format. Also, there are probably smarter ways to do this.
# We do this by matching the object names from the original catalog to the photometry catalog. Then add
# an entry [np.nan, ...] if it does not exist. To make life easier, we add a dummy entry as the first
# row so we can copy all the column names and dtypes.
gaia_phot = Table( names=gaia_phot2.keys() , dtype=gaia_phot2.dtype )
for ii in range(len(CLAGN)):
    sel = np.where( CLAGN["Object Name"][ii] == gaia_phot2["input_object_name"] )[0]
    if len(sel) > 0:
        gaia_phot = vstack([gaia_phot , gaia_phot2[sel] ])
    else:
        tmp = Table( np.repeat(np.nan , len(gaia_phot2.keys())) , names=gaia_phot2.keys() , dtype=gaia_phot2.dtype )
        gaia_phot = vstack([gaia_phot , tmp ])

In [16]:
gaia_phot.pprint_all()

phot_bp_mean_mag phot_g_mean_mag phot_rp_mean_mag phot_bp_mean_mag_error phot_g_mean_mag_error phot_rp_mean_mag_error     input_object_name    
      mag              mag             mag                 mag                    mag                   mag                                    
---------------- --------------- ---------------- ---------------------- --------------------- ---------------------- -------------------------
             nan             nan              nan                    nan                   nan                    nan                       nan
       19.334736       20.655428        18.006554     0.0489463475382072   0.01195253816419854    0.03560993262228968 WISEA J012648.10-083948.0
       19.742887       20.841955         18.55511    0.09464535378467852   0.02097086422081428    0.04248351283430384   2MASS J01595763+0033105
       19.742887       20.841955         18.55511    0.09464535378467852   0.02097086422081428    0.04248351283430384 WISEA J015957.63+0

### ASAS-SN (All-Sky Automated Survey for Supernovae) has a website that can be manually searched (Faisst)
- see if astroquery.vizier can find it



### IceCube has a 2008-2018 catalog which we can download and is small (Faisst)
- https://icecube.wisc.edu/data-releases/2021/01/all-sky-point-source-icecube-data-years-2008-2018/
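
Once the catalog is downloaded, one simple way to associate neutrino events with a target is to keep the events whose angular error circle contains the target position. The sketch below assumes each event record carries a position and an angular uncertainty in degrees; the field names `ra`, `dec`, `ang_err` and `mjd`, the helper names, and the event values are all hypothetical placeholders, not the actual file format.

```python
import math

def separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation (haversine formula), inputs and output in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a)))

def events_near_target(events, target_ra, target_dec):
    """Return the events whose angular error circle contains the target."""
    return [ev for ev in events
            if separation_deg(ev["ra"], ev["dec"], target_ra, target_dec) <= ev["ang_err"]]

# placeholder events for illustration only
events = [
    {"mjd": 55000.0, "ra": 10.0, "dec": 0.0, "ang_err": 0.5},
    {"mjd": 56000.0, "ra": 50.0, "dec": 20.0, "ang_err": 1.0},
]
matched = events_near_target(events, 10.2, 0.1)
```

The arrival times of the matched events would then form the neutrino part of the light curve, although with error circles this large many matches are likely to be chance coincidences.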

## 4. Make plots of luminosity as a function of time
- time could be days since peak, or days since first observation, or??
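
The two time axes mentioned above can both be computed from the observation times and fluxes of a single band. This is a minimal sketch with hypothetical function names and placeholder values; it takes the peak to be the single brightest epoch, which is only one possible definition.

```python
def days_since_first(times_mjd):
    """Time of each epoch relative to the first observation."""
    t0 = min(times_mjd)
    return [t - t0 for t in times_mjd]

def days_since_peak(times_mjd, fluxes):
    """Time of each epoch relative to the epoch of maximum flux (negative before peak)."""
    peak_index = max(range(len(fluxes)), key=fluxes.__getitem__)
    t_peak = times_mjd[peak_index]
    return [t - t_peak for t in times_mjd]

# placeholder values for illustration only
times = [58000.0, 58010.0, 58025.0]
fluxes = [1.0, 3.0, 2.0]
since_first = days_since_first(times)
since_peak = days_since_peak(times, fluxes)
```

For noisy light curves the brightest single epoch may be an outlier, so a smoothed or binned light curve might give a more stable peak time.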

## Image extension: look for archival images of these targets
- NASA NAVO use cases should help us to learn how to do this
- can use the cutout service now in astropy from the first fornax use case

## ML Extension 
Consider training a ML model to do light curve classification based on this sample of CLAGN
 - once we figure out which bands these are likely to be observed in, could then have an optical + IR light curve classifier
 - what would the features of the light curve be?
 - what models are reasonable to test as light curve classifiers?
 - could we also make a sample of TDEs, SNe, flaring AGN, then train the model to distinguish between these things?
 - need a sample of non-flaring light curves
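
As a starting point for the feature question above, a few simple per-band summary statistics can be computed directly from the fluxes and their errors. The feature set below (mean, amplitude, standard deviation, and an excess-variance-based fractional variability) is an illustrative assumption, not a vetted choice, and `lightcurve_features` is a hypothetical name.

```python
import math
import statistics

def lightcurve_features(fluxes, flux_errs):
    """Simple summary features for one band; needs at least two epochs."""
    mean = statistics.fmean(fluxes)
    variance = statistics.variance(fluxes)  # sample variance
    mean_sq_err = statistics.fmean(e * e for e in flux_errs)
    # variance in excess of what the measurement errors alone would produce
    excess = variance - mean_sq_err
    frac_var = math.sqrt(excess) / mean if excess > 0 else 0.0
    return {
        "mean": mean,
        "amplitude": max(fluxes) - min(fluxes),
        "std": math.sqrt(variance),
        "frac_var": frac_var,
    }

# placeholder values for illustration only
features = lightcurve_features([1.0, 2.0, 3.0], [0.1, 0.1, 0.1])
```

Computing the same dictionary for every band of every target would give one feature row per target, which is the shape most classifiers expect.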
 
After training the model:
 - would then need a sample of optical + IR light curves for "all" galaxies = big data to run the model on.

Some resources to consider:
- https://github.com/dirac-institute/ZTF_Boyajian
- https://ui.adsabs.harvard.edu/abs/2022AJ....164...68S/abstract
- https://ui.adsabs.harvard.edu/abs/2019ApJ...881L...9F/abstract

