# HEASARC data access on SciServer

Here we show several methods for getting the lists of the files you're interested in.  

In [1]:
import sys,os
import pyvo as vo
import astropy.coordinates as coord
import requests
import glob
import numpy as np

### Get the HEASARC TAP service

We can use the HEASARC's Virtual Observatory interfaces to find the data we're interested in.  Specifically, we want to look at the observation tables, so we first get a list of all the tables the HEASARC serves and then look for the ones related to RXTE.

We start with the Registry of all VO services.  The HEASARC table service is using the same backend as our [Xamin web interface](https://heasarc.gsfc.nasa.gov/xamin/), the same database that [Browse](https://heasarc.gsfc.nasa.gov/cgi-bin/W3Browse/w3browse.pl) also uses.  

In [2]:
tap_services=vo.regsearch(servicetype='table',keywords=['heasarc'])



We then ask the service for all of the tables that are available at the HEASARC:

In [3]:
heasarc_tables=tap_services[0].service.tables

And then we look for the ones related to XTE:

In [4]:
for tablename in heasarc_tables.keys():
    if "xte" in tablename:  
        print(" {:20s} {}".format(tablename,heasarc_tables[tablename].description))


 xteao                XTE Proposal Info & Abstracts
 xteasmlong           XTE All-Sky Monitor Long-Term Observed Sources
 xteassagn            XTE All-Sky Slew Survey AGN Catalog
 xteasscat            XTE All-Sky Slew Survey Catalog
 xteindex             XTE Target Index Catalog
 xtemaster            XTE Master Catalog
 xtemlcat             XTE Mission-Long Source Catalog
 xteslew              XTE Archived Public Slew Data
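
The filter above is a plain substring match, so it can be pulled into a small helper and reused for other missions. The sketch below assumes a simple mapping of table name to description string; `find_tables` is a hypothetical name, not part of PyVO, and the sample dictionary reuses two rows from the listing above plus one made-up entry for contrast.

```python
def find_tables(descriptions, keyword):
    """Return {name: description} for table names containing keyword (case-insensitive)."""
    key = keyword.lower()
    return {name: desc for name, desc in descriptions.items() if key in name.lower()}

# Two entries copied from the listing above; "othertable" is made up for contrast.
sample = {
    "xteao": "XTE Proposal Info & Abstracts",
    "xtemaster": "XTE Master Catalog",
    "othertable": "Hypothetical non-XTE table",
}
for name, desc in sorted(find_tables(sample, "xte").items()):
    print(" {:20s} {}".format(name, desc))
```

With the live service, the mapping could be built from the objects already used above, for example `{name: heasarc_tables[name].description for name in heasarc_tables.keys()}`.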


The "xtemaster" catalog is the one that we're interested in.  

Let's see what this table has in it.  Alternatively, we can google it and find the same information here:

https://heasarc.gsfc.nasa.gov/W3Browse/all/xtemaster.html


In [5]:
for c in heasarc_tables['xtemaster'].columns:
    print("{:20s} {}".format(c.name,c.description))

subject_category     Subject Category of Proposal
exposure             Total Good Time of the Observation (s)
"__x_ra_dec"         System unit vector column
lii                  Galactic Longitude
scheduled_date       Scheduled Start Date of Observation
hexte_modeb          HEXTE Mode B
obsid                Observation ID of Scheduled Observation
pi_lname             Principal Investigator Last Name
hexte_angleb         HEXTE Angle B
dec                  Declination
processed_date       Date Observation Data Was Processed
"__z_ra_dec"         System unit vector column
priority             Target Priority
hexte_energyb        HEXTE Energy B
prnb                 Proposal Number
time_awarded         Total Observation Time (s)
pca_config5          PCA Configuration 5
pca_config2          PCA Configuration 2
hexte_dwellb         HEXTE Dwell B
pi_no                Principal Investigator Unique Number
observed_date        Start Date of Observation
hexte_energya        HEXTE Energy A
pi_fname 

We're interested in Eta Carinae, and we want the RXTE cycle, proposal number, observation ID, and so on for every observation RXTE made of this source.  We search by position rather than by name, in case the name has been entered differently, which can happen.  The cell below constructs a query in the ADQL language that selects the columns (target_name, cycle, prnb, obsid, time, exposure, ra, dec) where the point defined by the observation's RA and DEC lies inside a circle of radius 0.1 degrees centered on our chosen source position, keeping only observations with nonzero exposure.  The results will be sorted by time.  See the [NAVO website](https://heasarc.gsfc.nasa.gov/vo/summary/python.html) for more information on how to use these services with python and how to construct ADQL queries for catalog searches.

In [6]:
# Get the coordinate for Eta Car
pos=coord.SkyCoord.from_name("eta car")
query="""SELECT target_name, cycle, prnb, obsid, time, exposure, ra, dec 
    FROM public.xtemaster as cat 
    where 
    contains(point('ICRS',cat.ra,cat.dec),circle('ICRS',{},{},0.1))=1 
    and 
    cat.exposure > 0 order by cat.time
    """.format(pos.ra.deg, pos.dec.deg)

In [7]:
results=tap_services[0].search(query).to_table()
results

target_name,cycle,prnb,obsid,time,exposure,ra,dec
Unnamed: 0_level_1,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,mjd,s,deg,deg
object,int16,int32,object,float64,float64,float64,float64
ETA_CAR,1,10004,10004-01-40-00,50122.64263,1091,161.2583,-59.6800
ETA_CAR,1,10004,10004-01-39-00,50129.42992,945,161.2583,-59.6800
ETA_CAR,1,10004,10004-01-38-00,50134.57053,1018,161.2583,-59.6800
ETA_CAR,1,10004,10004-01-41-00,50142.85058,958,161.2583,-59.6800
ETA_CAR,1,10004,10004-01-37-00,50147.83134,1778,161.2583,-59.6800
ETA_CAR,1,10004,10004-01-36-00,50150.58692,696,161.2583,-59.6800
ETA_CAR,1,10004,10004-01-36-01,50154.99084,1098,161.2583,-59.6800
ETA_CAR,1,10004,10004-01-35-00,50162.75743,1109,161.2583,-59.6800
ETA_CAR,1,10004,10004-01-32-00,50180.19184,921,161.2583,-59.6800
ETA_CAR,1,10004,10004-01-31-00,50185.68444,1003,161.2583,-59.6800
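
If you expect to repeat this search for other positions or radii, the query string can be generated by a small function. This is a minimal sketch, not part of PyVO: `cone_query` and its parameter names are hypothetical, the column list and table name are copied from the query above, and the radius is in degrees as in the original.

```python
def cone_query(ra_deg, dec_deg, radius_deg=0.1, table="public.xtemaster"):
    """Build the ADQL cone search used above for a position in decimal degrees."""
    columns = "target_name, cycle, prnb, obsid, time, exposure, ra, dec"
    return (
        "SELECT {cols} FROM {table} AS cat "
        "WHERE CONTAINS(POINT('ICRS', cat.ra, cat.dec), "
        "CIRCLE('ICRS', {ra}, {dec}, {radius})) = 1 "
        "AND cat.exposure > 0 ORDER BY cat.time"
    ).format(cols=columns, table=table, ra=ra_deg, dec=dec_deg, radius=radius_deg)

# Position taken from the ra/dec columns of the result table above.
query = cone_query(161.2583, -59.68)
print(query)
```

The resulting string can be passed to `tap_services[0].search(...)` exactly as in the cell above.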


###  Xamin's servlet API 

An alternative, if for some reason you don't want to use PyVO, is to use the Xamin API directly.

The base URL for the Xamin query servlet is

 https://heasarc.gsfc.nasa.gov/xamin/QueryServlet?

It then takes options such as
 * table:  e.g., "table=xtemaster"
 * constraint:  e.g., "obsid=10004-01-40-00"
 * object:  "object=andromeda" or "object=10.68,41.27"

So we can do:

In [8]:
url="https://heasarc.gsfc.nasa.gov/xamin/QueryServlet?products&"
result=requests.get(url,params={"table":"xtemaster",
                                "object":"eta car",
                                "resultmax":"10"
                               })
result.text.split('\n')[0:2]

['obsid         |prnb |status  |pi_lname |pi_fname|target_name |ra        |dec      |time               |duration|exposure|__dp|__p_xtemaster_xte_tar_tar_no|__p_xtemaster_xte_tar_root                               |__p_xtemaster_xte_tar_obsid|__p_xtemaster_xte_tar_prnb|__p_xtemaster_point_bib_id|__p_xtemaster_point_bib_table|__w_xtemaster_point_bib|__p_xtemaster_xte_target_merged_root       |__p_xtemaster_xte_abstract_root|__p_xtemaster_xte_abstract_propno|__p_xtemaster_xte_obs_root                        |__p_xtemaster_xte_obs_obsid',
 '30138-01-01-22|30138|archived|MCCONNELL|MARK    |GRO_J0332-87|08 44 14.2|-87 26 49|1998-02-04T15:06:29|    997.|    null|   1|                           1|https://heasarc.gsfc.nasa.gov/cgi-bin/W3Browse/xteTar.pl?|30138-01-01-22             |                     30138|30138-01-01-22            |xtemaster                    |false                  |/FTP/xte/data/archive/AO3//P30138/30138-01/|/FTP/xte/abstracts/abstracts/  |                            301

You can then construct a file list from the second-to-last field in each row, the one whose name ends in `obs_root`.
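
As a rough sketch of that step, the pipe-delimited text can be split into one dictionary per row and the `obs_root` column picked out by name. This assumes the first line containing a `|` is the header, that field values never contain a `|`, and that rows with a different number of fields can be skipped; `parse_pipe_table` is a hypothetical helper, and the sample text is a cut-down stand-in with a placeholder path rather than real service output.

```python
def parse_pipe_table(text):
    """Parse pipe-delimited text (header row first) into a list of dicts."""
    lines = [ln for ln in text.split("\n") if "|" in ln]
    if not lines:
        return []
    header = [h.strip() for h in lines[0].split("|")]
    rows = []
    for ln in lines[1:]:
        values = [v.strip() for v in ln.split("|")]
        # Skip anything that does not line up with the header.
        if len(values) == len(header):
            rows.append(dict(zip(header, values)))
    return rows

# Cut-down stand-in for result.text; the path is a placeholder, not real output.
sample = (
    "obsid         |__p_xtemaster_xte_obs_root\n"
    "30138-01-01-22|/placeholder/obs/root/\n"
)
rows = parse_pipe_table(sample)
roots = [value for row in rows
         for name, value in row.items() if name.endswith("obs_root")]
print(roots)
```

With the real response you would call `parse_pipe_table(result.text)` instead of using `sample`.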

###  Know the archive structure

You're still going to have to know how to find the files you're interested in for the given mission.  Then you can take the list of observations from XTE above and find the specific files of the type you want for each of those observations.  

Let's collect all the standard product light curves for RXTE.  (These are described on the [RXTE analysis pages](https://heasarc.gsfc.nasa.gov/docs/xte/recipes/cook_book.html).)

In [9]:
## The cycle number is needed as well: after AO9 it is
##  no longer the first digit of the proposal number
ids=np.unique( results['cycle','prnb','obsid','time'])
ids.sort(order='time')
ids

array([( 1, 10004, '10004-01-40-00', 50122.64263),
       ( 1, 10004, '10004-01-39-00', 50129.42992),
       ( 1, 10004, '10004-01-38-00', 50134.57053), ...,
       (15, 96002, '96002-01-50-00', 55909.92591),
       (15, 96002, '96002-01-51-00', 55916.80008),
       (15, 96002, '96002-01-52-00', 55923.7098 )],
      dtype=[('cycle', '<i2'), ('prnb', '<i4'), ('obsid', 'O'), ('time', '<f8')])

In [10]:
## Construct a file list.
## Through the Jupyter Lab container, either root directory works:
#rootdir="/home/idies/workspace/headata/FTP"
## This one is a link
rootdir="/FTP"
rxtedata="rxte/data/archive"
filenames=[]
for cycle, prnb, obsid in zip(ids['cycle'], ids['prnb'], ids['obsid']):
    fname="{}/{}/AO{}/P{}/{}/stdprod/xp{}_n2a.lc.gz".format(
        rootdir,
        rxtedata,
        cycle,
        prnb,
        obsid,
        obsid.replace('-',''))
    #print(fname)
    f=glob.glob(fname)
    if (len(f) > 0):
        filenames.append(f[0])
print("Found {} out of {} files".format(len(filenames),len(ids)))

Found 1364 out of 1368 files
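
Since a few of the expected files were not found, it can be useful to see which ones. The sketch below separates the path construction from the existence check so the missing paths can be listed. Both function names are hypothetical; the path pattern is the one used in the cell above, and `os.path.exists` stands in for `glob.glob`, which is equivalent here because the pattern contains no wildcards.

```python
import os

def stdprod_lightcurve_path(cycle, prnb, obsid,
                            rootdir="/FTP", archive="rxte/data/archive"):
    """Return the expected path of the standard-product light curve for one observation."""
    return "{}/{}/AO{}/P{}/{}/stdprod/xp{}_n2a.lc.gz".format(
        rootdir, archive, cycle, prnb, obsid, obsid.replace("-", ""))

def split_found_missing(paths, exists=os.path.exists):
    """Split paths into (found, missing) using the given existence check."""
    found = [p for p in paths if exists(p)]
    missing = [p for p in paths if not exists(p)]
    return found, missing

# First observation from the ids array above.
example = stdprod_lightcurve_path(1, 10004, "10004-01-40-00")
print(example)
```

Applied to the full list, `split_found_missing([stdprod_lightcurve_path(c, p, o) for c, p, o in zip(ids['cycle'], ids['prnb'], ids['obsid'])])` would return the found paths together with the ones that are absent from the archive.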
