This notebook is a tutorial on accessing HEASARC data with the Virtual Observatory (VO) Python client `pyvo`.  
It searches the Swift master catalog, `swiftmastr`, using pyvo. Specifically, we use the cone search service, which is the VO service for searching around a position on the sky.

## 1. Module Imports

In [1]:
import os

# pyvo for accessing VO services
import pyvo

# Use SkyCoord to obtain the coordinates of the source
from astropy.coordinates import SkyCoord

## 2. Finding and Downloading the Data

This part assumes we know the ID of the VO service. Generally these are of the form: `ivo://nasa.heasarc/{table_name}`.
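
As a small illustration of that naming pattern, the sketch below builds and parses identifiers of this form. The helper names `heasarc_ivoid` and `table_from_ivoid` are hypothetical and not part of `pyvo`; they only encode the `ivo://nasa.heasarc/{table_name}` convention stated above.

```python
# Hypothetical helpers (not part of pyvo) for the identifier pattern above.
HEASARC_AUTHORITY = "ivo://nasa.heasarc"


def heasarc_ivoid(table_name: str) -> str:
    """Build the registry identifier for a HEASARC table name."""
    return f"{HEASARC_AUTHORITY}/{table_name.strip()}"


def table_from_ivoid(ivoid: str) -> str:
    """Recover the table name from an identifier of the form above."""
    prefix = HEASARC_AUTHORITY + "/"
    if not ivoid.startswith(prefix):
        raise ValueError(f"not a HEASARC identifier: {ivoid!r}")
    return ivoid[len(prefix):]


ivoid = heasarc_ivoid("swiftmastr")
print(ivoid)
print(table_from_ivoid(ivoid))
```

The resulting string is what would be passed as the `ivoid` argument when looking up the service in the registry.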

### 2.1 The Search Service  
First, we create a cone search service:

In [2]:
# Create a cone-search service
st_services = pyvo.regsearch(ivoid='ivo://nasa.heasarc/swiftmastr')[0]
cs_service = st_services.get_service('conesearch')

### 2.2 Find the Data  
Next, we will use the search function in `cs_service` to search for observations around our source, OJ 287.  
The `search` function takes the sky position as input, either as a list of `[RA, DEC]` or as an astropy sky coordinate object, `SkyCoord`.  
The search result is then printed as an astropy Table for a clean display.

In [3]:
# Find the coordinates of the source
pos = SkyCoord.from_name('OJ 287')

search_result = cs_service.search(pos)

# display the result as an astropy table
search_result.to_table()

__row,name,obsid,ra,dec,start_time,processing_date,xrt_exposure,uvot_exposure,bat_exposure,archive_date,Search_Offset
,,,deg,deg,d,d,s,s,s,d,
object,object,object,float64,float64,float64,float64,float64,float64,float64,int32,float64
268142,OJ287,00035011087,133.75616,19.95937,60323.8631,60333.0,1238.65900,0.00000,1244.00000,60334,9.4252
268149,OJ287,00035011086,133.67751,19.96148,60315.5472,60325.0,1103.34900,495.90900,1117.00000,60326,8.9438
268224,saa-cold-158-7,00067473001,133.00427,19.97991,53528.9007,56979.0,90.98200,0.00000,201.00000,53539,40.1690
268252,saa-cold-164-08,00075185005,132.99087,19.98304,59146.3824,59156.0,0.00000,0.00000,315.00000,59157,40.8747
264506,saa-cold-165-7,00069837001,133.01315,19.98785,53900.8868,57088.0,423.14100,0.00000,272.00000,53911,39.5871
264521,saa-cold-144-07,00075185002,133.01826,19.98886,56801.9236,58165.0,85.21500,0.00000,168.00000,56812,39.2924
264569,saa-cold-163-24,00069823001,133.03090,19.99212,53898.8771,57084.0,339.01000,0.00000,293.00000,53909,38.5558
264597,saa-cold-147-07,00076527002,132.99248,19.99381,58630.8851,58640.0,185.39400,0.00000,426.00000,58641,40.6703
264614,saa-cold-127-07,00076527001,133.03883,19.99480,58610.906,58620.0,35.57700,0.00000,710.00000,58621,38.0869
...,...,...,...,...,...,...,...,...,...,...,...
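
The `Search_Offset` column carries no unit in the output. Its values look consistent with the angular separation, in arcminutes, between the search position and each pointing. The sketch below checks this for the first row using the haversine formula and only the standard library. The OJ 287 coordinates are approximate values assumed here for illustration; in the notebook they come from `SkyCoord.from_name`.

```python
import math


def angular_separation_arcmin(ra1, dec1, ra2, dec2):
    """Great-circle separation via the haversine formula.

    Inputs are in degrees; the result is in arcminutes.
    """
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h))) * 60


# Assumed approximate position of OJ 287 in degrees (not taken from the notebook).
OJ287_RA, OJ287_DEC = 133.7036, 20.1085

# Pointing of the first row in the table above (obsid 00035011087).
offset = angular_separation_arcmin(OJ287_RA, OJ287_DEC, 133.75616, 19.95937)

# Compare with the Search_Offset of 9.4252 listed for that row.
print(f"{offset:.2f} arcmin")
```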


### 2.3 Filter the Results  
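The cone search returned every observation whose pointing falls within the search radius of the source position, including pointings at other targets (for example, the `saa-cold-*` entries above).
We can now filter the results by looping through the entries in the table.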

In [9]:
# obs_to_explore = [res for res in search_result if 58932 <= res['start_time'] <= 58993]
# obs_to_explore = [res for res in search_result if res['xrt_exposure'] > 10000]
# obs = obs_to_explore[0]

obs_to_explore = [res for res in search_result if res['obsid'] == "00035905082"]
obs_to_explore

[('266376', 'OJ287', '00035905082', '133.66639', '20.10676', '59005.3678', '59015.0', '1038.021', '1011.322', '1044.0', '59016', '2.101638965376633')]
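
The commented-out filters above select on `start_time` and `xrt_exposure`. The sketch below shows the same idea on a few rows copied from the outputs above, stored as plain dictionaries so that it runs without a network connection. It assumes that `start_time` is a Modified Julian Date, which is consistent with the `d` unit and with the `2020_06` archive directory of observation `00035905082`. The function `select_observations` is a hypothetical helper. Values are passed through `float()` because the printed record above shows them as strings.

```python
from datetime import datetime, timedelta

MJD_EPOCH = datetime(1858, 11, 17)  # MJD 0 corresponds to 1858-11-17 00:00


def mjd_to_datetime(mjd):
    """Convert a Modified Julian Date to a datetime."""
    return MJD_EPOCH + timedelta(days=float(mjd))


def datetime_to_mjd(when):
    """Convert a datetime to a Modified Julian Date."""
    return (when - MJD_EPOCH) / timedelta(days=1)


# A few rows copied from the outputs above (subset of columns).
sample_rows = [
    {"obsid": "00035011087", "start_time": "60323.8631", "xrt_exposure": "1238.659"},
    {"obsid": "00035011086", "start_time": "60315.5472", "xrt_exposure": "1103.349"},
    {"obsid": "00067473001", "start_time": "53528.9007", "xrt_exposure": "90.982"},
    {"obsid": "00035905082", "start_time": "59005.3678", "xrt_exposure": "1038.021"},
]


def select_observations(rows, start=None, stop=None, min_xrt_exposure=0.0):
    """Keep rows inside [start, stop] with XRT exposure above a threshold."""
    selected = []
    for row in rows:
        t = float(row["start_time"])
        if start is not None and t < datetime_to_mjd(start):
            continue
        if stop is not None and t > datetime_to_mjd(stop):
            continue
        if float(row["xrt_exposure"]) > min_xrt_exposure:
            selected.append(row)
    return selected


in_2020 = select_observations(sample_rows,
                              start=datetime(2020, 1, 1),
                              stop=datetime(2021, 1, 1))
long_xrt = select_observations(sample_rows, min_xrt_exposure=1000.0)

print([row["obsid"] for row in in_2020])
print([row["obsid"] for row in long_xrt])
```

The same conditions can be applied directly to `search_result` in a list comprehension, as in the cell above.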

### 2.4 Find Links for the Data
To see what data products are available for the selected observation, we use the VO's datalinks. A datalink is a way to query data products related to some search result. The results of a datalink call will depend on the specific observation.

In [10]:
obs = obs_to_explore[0]
dlink = obs.getdatalink()

# only 3 summary columns are printed
dlink.to_table()[['ID', 'access_url', 'content_type']]

ID,access_url,content_type
object,object,object
ivo://nasa.heasarc/swiftmastr?00035905082,https://heasarc.gsfc.nasa.gov/xamin/query?table=swiftbalog&constraint=obsid='00035905082',text/html
ivo://nasa.heasarc/swiftmastr?00035905082,https://heasarc.gsfc.nasa.gov/xamin/query?table=swiftuvlog&constraint=obsid='00035905082',text/html
ivo://nasa.heasarc/swiftmastr?00035905082,https://heasarc.gsfc.nasa.gov/xamin/bib?table=swiftmastr&id=35905,text/html
ivo://nasa.heasarc/swiftmastr?00035905082,https://heasarc.gsfc.nasa.gov/xamin/query?table=swiftxrlog&constraint=obsid='00035905082',text/html
ivo://nasa.heasarc/swiftmastr?00035905082,https://heasarc.gsfc.nasa.gov/xamin/vo/datalink?datalink_key&id=ivo://nasa.heasarc/swiftmastr?00035905082/swift.obs,application/x-votable+xml;content=datalink
ivo://nasa.heasarc/swiftmastr?00035905082,https://heasarc.gsfc.nasa.gov/FTP/swift/data/obs/2020_06//00035905082/,directory


### 2.5 Filter the Links
From the `content_type` column, we see that one link is a `directory` containing the observation files. Its `access_url` gives the direct URL to the data. The other links are queries on the related tables `swiftbalog`, `swiftuvlog` and `swiftxrlog`, another datalink service for housekeeping data, and a page listing publications related to the selected observation.

Note that an empty datalink product indicates that no public data is available for that observation, likely because it is in proprietary mode.

In [11]:
dlink_to_dir = [dl for dl in dlink if dl['content_type'] == 'directory']
link = dlink_to_dir[0]['access_url']
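
Indexing with `[0]` raises an `IndexError` when no `directory` link exists, which is the situation described in the note above. A minimal sketch of a more defensive selection follows. It uses plain dictionaries copied from the table above in place of the datalink records, and `first_directory_url` is a hypothetical helper.

```python
def first_directory_url(links):
    """Return the access_url of the first 'directory' link, or None."""
    for entry in links:
        if entry["content_type"] == "directory":
            return entry["access_url"]
    return None


# Two of the links listed above, as plain dictionaries.
sample_links = [
    {"access_url": "https://heasarc.gsfc.nasa.gov/xamin/bib?table=swiftmastr&id=35905",
     "content_type": "text/html"},
    {"access_url": "https://heasarc.gsfc.nasa.gov/FTP/swift/data/obs/2020_06//00035905082/",
     "content_type": "directory"},
]

url = first_directory_url(sample_links)
if url is None:
    print("No public data directory found for this observation.")
else:
    print(url)
```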

### 2.6 Download the Data
On Sciserver, all the data is available locally under `/FTP/`, so all we need to do is take the part of the link after `FTP` and copy that directory to the current directory.
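
One way to derive the local path from the link is sketched below with the standard library, as an alternative to splitting the string by hand. The function name `local_archive_path` is hypothetical, and the `/FTP/` mount point is taken from the description above. Normalising the path also removes the doubled slash that appears in the link.

```python
import posixpath
from urllib.parse import urlparse


def local_archive_path(url, marker="/FTP/"):
    """Map an archive URL to the corresponding path under the local mount."""
    path = urlparse(url).path
    if marker not in path:
        raise ValueError(f"URL does not contain {marker!r}: {url}")
    relative = path.split(marker, 1)[1]
    # normpath collapses '//' and drops the trailing slash
    return posixpath.normpath(posixpath.join(marker, relative))


link = "https://heasarc.gsfc.nasa.gov/FTP/swift/data/obs/2020_06//00035905082/"
src = local_archive_path(link)
print(src)

# On Sciserver the directory could then be copied with, for example:
# import shutil
# shutil.copytree(src, posixpath.basename(src))
```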

If this is run outside Sciserver, we can download the data directories using `wget` (or `curl`). [Learn more](https://swift.gsfc.nasa.gov/archive/archive_start.html)
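
For the `wget` route, the sketch below builds the command as an argument list, which can be handed to `subprocess.run` without going through a shell. The flags are the ones used in this notebook, and the helper `build_wget_args` is hypothetical. The download itself is left commented out.

```python
import shlex


def build_wget_args(url, cut_dirs=5):
    """Return the wget command as a list of arguments."""
    return [
        "wget",
        "-nH",                     # do not create a directory named after the host
        "--no-check-certificate",  # skip TLS certificate validation
        f"--cut-dirs={cut_dirs}",  # drop leading remote directory components
        "-r", "-l0",               # recursive, with no depth limit
        "-w1",                     # wait one second between retrievals
        "-c",                      # continue partially downloaded files
        "-N",                      # only fetch files newer than local copies
        "-np",                     # never ascend to the parent directory
        "-R", "index*",            # reject directory index files
        "-erobots=off",            # ignore robots.txt
        "--retr-symlinks",         # retrieve the files that symlinks point to
        url,
    ]


link = "https://heasarc.gsfc.nasa.gov/FTP/swift/data/obs/2020_06//00035905082/"
args = build_wget_args(link)
print(shlex.join(args))

# To run the download:
# import subprocess
# subprocess.run(args, check=True)
```

Passing a list avoids shell quoting issues with the `index*` pattern, and `check=True` raises an exception if `wget` exits with an error.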

In [15]:
# copy data locally on sciserver
os.system(f"cp -r /FTP/{link.split('FTP')[1]} .")

0

In [16]:
# use wget to download the data
wget_cmd = ("wget -nH --no-check-certificate --cut-dirs=5 -r -w1 -l0 -c -N -np -R 'index*' -erobots=off --retr-symlinks {}")
# os.system(wget_cmd.format(link))
wget_cmd.format(link)

"wget -nH --no-check-certificate --cut-dirs=5 -r -w1 -l0 -c -N -np -R 'index*' -erobots=off --retr-symlinks https://heasarc.gsfc.nasa.gov/FTP/swift/data/obs/2020_06//00035905082/"