# Find Eclipsing Binary Star Candidates from a Radial Velocity Catalog on Vizier

***

## Learning Goals

By the end of this tutorial, you will:

- Use `astroquery` to load the APOGEE Binary Radial Velocity catalog from `Vizier`.
- Find the binary candidates most likely to be observed as eclipsing binaries by using a set of parameters from the APOGEE catalog. 
- Determine if a candidate system in the APOGEE catalog has a light curve in either (a) the TESS Data for Asteroseismology Lightcurves archive from TASOC (the TESS Asteroseismic Science Operations Center) or (b) the TESS Lightcurves From The MIT Quick-Look Pipeline ("QLP") archive using `astroquery` and the MAST archive.

This is Part 1 of a two-part tutorial. In a separate notebook for Part 2, `plot_analyze_with_lightkurve`, you will also:
- Download and plot a light curve file using `astroquery.mast`.
- Download and plot a collection of the light curves using `lightkurve`.
- Create a periodogram of a collection of light curves to find the possible eclipsing binary period. 

## Introduction
Just like extrasolar planets, binary star systems can be discovered by multiple methods. The two stars orbit their mutual center of mass, and each of the three methods below relies on detecting a signature of this motion:

1. Astrometry. For binaries whose plane of orbit is perpendicular (or nearly perpendicular) to our line of sight, we may be able to actually see stars (or just the primary star) moving back and forth relative to the more distant background stars. 
2. Radial velocity. For binaries whose plane of orbit is more aligned with our line of sight, their spectroscopic emission and absorption lines will be periodically blueshifted and redshifted as the star moves towards and away from us, respectively. In the catalog we will use, it is the radial velocity of only the primary (brighter) star that is detected. 
3. Eclipses. When the dimmer star passes in front of the brighter star, a notable drop in the brightness of the stellar system can be detected. This dip in brightness is analogous to the transit of an exoplanet in front of a star. Depending on the difference in brightness of the two stars, a secondary eclipse may also be detectable when the dimmer star passes behind the brighter star. 
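
To give a sense of scale for the radial velocity method, the non-relativistic Doppler relation Δλ/λ = v/c converts a line-of-sight velocity into a wavelength shift. The sketch below is illustrative only: the 30 km/s velocity and the 1600 nm rest wavelength (within the near-infrared H band) are assumed values, not taken from the catalog, and `doppler_shift_nm` is a hypothetical helper.

```python
# Non-relativistic Doppler shift: delta_lambda / lambda_rest = v_r / c
C_KM_S = 299_792.458  # speed of light in km/s

def doppler_shift_nm(rest_wavelength_nm, radial_velocity_km_s):
    """Return the wavelength shift in nm (positive = redshift, star receding)."""
    return rest_wavelength_nm * radial_velocity_km_s / C_KM_S

# Assumed, illustrative values (not from the APOGEE catalog)
rest_nm = 1600.0
for v in (+30.0, -30.0):
    shift = doppler_shift_nm(rest_nm, v)
    kind = "redshift" if shift > 0 else "blueshift"
    print(f"v = {v:+.1f} km/s -> shift = {shift:+.4f} nm ({kind})")
```

Shifts of this size are a small fraction of a nanometer, which is why the method needs high-resolution spectra taken at several epochs.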

The [Apache Point Observatory Galactic Evolution Experiment 2 (APOGEE-2)](https://www.sdss.org/surveys/apogee-2/) observed near-infrared spectra of hundreds of thousands of stars in our Milky Way. This survey was used to search for stars with spectral lines showing the tell-tale radial velocity shifts associated with stellar binaries. Near the end of the APOGEE-2 survey, the [Transiting Exoplanet Survey Satellite (TESS)](https://tess.mit.edu/) was launched to search for exoplanets using the transit method; eclipses of primary stars by companion stars would also be observed by TESS. Finding a stellar system that has been observed by both surveys would provide confirmation of the properties of the system. 

In this tutorial, we will load a catalog of binary stellar systems discovered through the radial velocity method and investigate whether those systems also have light curves from the TESS mission in the MAST archive. Two sources of already-processed light curves are the TESS Data for Asteroseismology Lightcurves archive from TASOC (the TESS Asteroseismic Science Operations Center) and the TESS Lightcurves From The MIT Quick-Look Pipeline ("QLP") archive. Though we'll focus on a narrow set of eclipsing binaries with previously measured properties, the radial velocity catalog we'll be working with has nearly 5000 stellar systems that are yet to be fully characterized. 

In Part 2 (`plot_analyze_with_lightkurve`), we'll actually view and analyze the light curves. 

The workflow for this notebook consists of:
- Loading the APOGEE Binary Radial Velocity catalog from VizieR
- Narrowing down the list of stellar systems in the catalog's Table 4
- Searching the TASOC and QLP archives for a light curve by coordinate
- Exercises

## Imports

- *numpy* to handle array functions
- *astropy.io fits* for accessing FITS files
- *astropy.io ascii* for writing an astropy table to a .csv file (for an Exercise)
- *astropy.table Table* for creating tidy tables of the data
- *astropy.coordinates SkyCoord* for creating sky coordinate objects
- *astropy.units* for coordinate units
- *matplotlib.pyplot* for plotting data
- *astroquery.mast Observations* for querying MAST for observations
- *astroquery.vizier Vizier* for querying Vizier for published tables

In [1]:
%matplotlib inline
import numpy as np
from astropy.io import fits
from astropy.io import ascii
from astropy.table import Table
from astropy.coordinates import SkyCoord
from astropy import units as u
import matplotlib.pyplot as plt
from astroquery.mast import Observations
from astroquery.vizier import Vizier

***

## Loading the APOGEE Binary Radial Velocity catalog from VizieR

The catalogs and data tables produced for astrophysical publications in refereed journals are made accessible in digital form through the CDS VizieR catalog service. First, we'll use astroquery to find the tables associated with [Price-Whelan et al. 2018 (AJ, 156, 1, 18)](https://ui.adsabs.harvard.edu/abs/2018AJ....156...18P/abstract), a catalog of ~5000 binary companions of evolved stars in APOGEE DR14. 

If we didn't know the specific 'key' (the unique string that identifies the dataset we want), we could query VizieR by author name. In the first query below, ```catalog_list1``` is an ordered dictionary. Alternatively, we could search for any catalogs matching the keywords 'APOGEE' and 'binary', as we do for ```catalog_list2```. More information about how to use VizieR can be found in [Astroquery:docs, VizieR Queries](https://astroquery.readthedocs.io/en/latest/vizier/vizier.html). 

In [3]:
catalog_list1 = Vizier.find_catalogs('Price-Whelan') 
print(str(len(catalog_list1.items()))+' results for "Price-Whelan" query:')
print({k:v.description for k,v in catalog_list1.items()})
catalog_list2 = Vizier.find_catalogs('APOGEE binary') 
print(str(len(catalog_list2.items()))+' results for "APOGEE binary" query:')
print({k:v.description for k,v in catalog_list2.items()})



17 results for "Price-Whelan" query:
{'J/ApJ/760/12': 'LIGO/Virgo gravitational-wave (GW) bursts with GRBs (Abadie+, 2012)', 'J/ApJ/785/119': 'Gravitational waves from known pulsars (Aasi+, 2014)', 'J/ApJ/809/59': 'Ophiuchus stellar stream with PS1 data (Sesar+, 2015)', 'J/ApJ/813/39': 'LIGO gravitational-wave (GW) searches from SNRs (Aasi+, 2015)', 'J/ApJ/816/L4': 'Candidate BHB stars in Ophiuchus stream (Sesar+, 2016)', 'J/ApJ/838/107': 'Distances to RRab stars from WISE and Gaia (Sesar+, 2017)', 'J/ApJ/854/47': 'RRab stars of Monoceros Ring & A13 overdensities (Sheffield+, 2018)', 'J/ApJ/859/L8': 'Properties of TriAnd stars (Hayes+, 2018)', 'J/ApJ/866/133': 'Continuum-H{beta} light curves of 5 Seyfert 1 (De Rosa+, 2018)', 'J/ApJ/887/19': 'DECam phot. of Gaia stars in Price-Whelan 1 (Price-Whelan+, 2019)', 'J/ApJ/887/115': 'Spectra of 28 stars in Price-Whelan 1 association (Nidever+, 2019)', 'J/ApJ/889/63': 'Properties of Sgr Stars (Hayes+, 2020)', 'J/AJ/153/257': 'Comoving stars in 



Using either of the results above, we can find that 'J/AJ/156/18' is the key that matches the paper we were looking for. 

It's also possible to skip the searching step. When working with data from a publication, look for the "Data Products" or "Related Materials" sections of its entry in [ADS](https://ui.adsabs.harvard.edu/) for direct links to online supplemental material. In our case, "Catalog: 2019yCat..51560018P" is listed under related materials; this is the ADS entry for the catalog itself. "CDS(1)" is listed under "Data Products"; this is the link to the VizieR entry, where we can see that the catalog key is J/AJ/156/18. With this key, we can directly load all the tables in this catalog. 

In [4]:
catalogs=Vizier.get_catalogs('J/AJ/156/18')
print(catalogs)

TableList with 8 tables:
	'0:J/AJ/156/18/table2' with 5 column(s) and 50 row(s) 
	'1:J/AJ/156/18/table3' with 1 column(s) and 50 row(s) 
	'2:J/AJ/156/18/table4a' with 13 column(s) and 50 row(s) 
	'3:J/AJ/156/18/table4b' with 10 column(s) and 50 row(s) 
	'4:J/AJ/156/18/table4c' with 12 column(s) and 50 row(s) 
	'5:J/AJ/156/18/table5a' with 12 column(s) and 50 row(s) 
	'6:J/AJ/156/18/table5b' with 6 column(s) and 50 row(s) 
	'7:J/AJ/156/18/table5c' with 6 column(s) and 50 row(s) 




There are actually multiple tables available within this catalog entry, so we'll need to investigate more to find which ones contain the list of binary star systems. Furthermore, the title of the paper says there should be about 5000 sources in the catalog, so the "50 row(s)" shown above must mean our tables are being truncated. Using the command below, we'll remove the row limit and retrieve the catalog again. 

In [5]:
Vizier.ROW_LIMIT = -1
catalogs=Vizier.get_catalogs('J/AJ/156/18')
print(catalogs)

TableList with 8 tables:
	'0:J/AJ/156/18/table2' with 5 column(s) and 96231 row(s) 
	'1:J/AJ/156/18/table3' with 1 column(s) and 4898 row(s) 
	'2:J/AJ/156/18/table4a' with 13 column(s) and 320 row(s) 
	'3:J/AJ/156/18/table4b' with 10 column(s) and 320 row(s) 
	'4:J/AJ/156/18/table4c' with 12 column(s) and 320 row(s) 
	'5:J/AJ/156/18/table5a' with 12 column(s) and 212 row(s) 
	'6:J/AJ/156/18/table5b' with 6 column(s) and 212 row(s) 
	'7:J/AJ/156/18/table5c' with 6 column(s) and 212 row(s) 




The tables in the TableList can be referenced by their integer number (```catalogs[0]```) or named key (```catalogs['J/AJ/156/18/table2']```). By the length alone, ```catalogs[1]``` appears to be the list of the nearly 5000 binary candidates. Reading the source paper itself lets us know exactly what these tables show:

- Table 2 contains the 96,231 stars that were the parent sample for this work.
- Table 3 contains the 4898 stars that likely have a companion, but orbital properties could not be constrained.
- Tables 4a, 4b, and 4c contain the 320 systems with uniquely determined companion orbits.
- Tables 5a, 5b, and 5c contain the 106 systems with two distinct companion orbit possibilities each.

While any of these stars may have visible eclipses observed by TESS, we'll start our search with the most likely candidates. The larger the mass of the evolved star's companion, the brighter it will be, and the more pronounced the change in the observed luminosity of the stellar system during an eclipse. Furthermore, a system with a shorter orbital period is more likely to have an eclipse (or even better, multiple eclipses) successfully observed by TESS. Let's focus on one of the stellar systems with already determined orbits (Table 4) to demonstrate how to find its corresponding light curve from TESS. 
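
The period argument can be made concrete with a rough count of how many orbital cycles fit inside one stretch of TESS monitoring. The sketch below assumes a continuous baseline of about 27 days, the approximate length of a single TESS sector, and ignores data gaps and orbital geometry. `cycles_in_baseline` is a hypothetical helper, and the periods are illustrative values rather than catalog entries.

```python
SECTOR_DAYS = 27.0  # assumed: approximate duration of one TESS sector

def cycles_in_baseline(period_days, baseline_days=SECTOR_DAYS):
    """Number of orbital cycles (possibly fractional) within the baseline."""
    return baseline_days / period_days

# Illustrative periods in days (not taken from the catalog)
for period in (2.5, 25.0, 50.0, 90.0):
    n = cycles_in_baseline(period)
    print(f"P = {period:5.1f} d -> {n:5.2f} orbital cycles per sector")
```

A value below 1 means a single sector may miss the primary eclipse entirely, whereas a system with a period of a few days would show several eclipses, provided its orbit is suitably inclined.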

## Narrowing down the list of stellar systems in the catalog's Table 4

We need to see which columns of data are in which table, since Table 4 is separated into 3 parts. 

In [6]:
catids=[2,3,4]
for i in catids:
    print(catalogs[i].columns)

<TableColumns names=('APOGEE','M0','omega','K','Teff','logg','__Fe_H_','logM','Jmag','Hmag','Ksmag','RAJ2000','DEJ2000')>
<TableColumns names=('APOGEE','TeffA','loggA','Vmicro','Vmacro','vsini','__Z_H_','__a_M_','chi2A','TClass')>
<TableColumns names=('APOGEE','__C_Fe_','__CI_Fe_','__N_Fe_','__O_Fe_','__Na_Fe_','__Mg_Fe_','_4.5magW','_4.5targ','EK','pmRA','pmDE')>


We wanted to make our selections based on the period and the relative masses of the two stars in the system, but those columns are not included here! Not all columns are selected by default with VizieR. Visiting [the VizieR site itself for Table 4a](https://vizier.cfa.harvard.edu/viz-bin/VizieR-3?-source=J/AJ/156/18/table4a) shows us which columns are included by default as well as the names of the other columns we are missing. Let's redo our call to VizieR to specifically choose the columns we want. We can do this by creating a new instance of the ```VizierClass``` that will only include the columns we choose. From [Table 4a's online version in VizieR](https://vizier.cfa.harvard.edu/viz-bin/VizieR-3?-source=J/AJ/156/18/table4a), we can find the keys for the following columns:

- APOGEE, the identifier used by APOGEE.
- Per, the period in days.
- M1, the primary mass estimate.
- M2min, the minimum companion mass.
- qmin, the minimum mass ratio.
- RAJ2000, Right Ascension in decimal degrees (J2000)
- DEJ2000, Declination in decimal degrees (J2000)
- Conv, binary flag indicating whether the sampling converged.

Additionally, we can also filter the rows to only include
- those stellar systems that have a period of less than 100 days,
- that also have a measured value of qmin, the minimum mass ratio of the companion to the primary star,
- and that Conv=1 (True), which means the sampler did converge on a single period mode.

For details on the derivation of these parameters, please refer to the [Price-Whelan et al. 2018](https://ui.adsabs.harvard.edu/abs/2018AJ....156...18P/abstract) paper. 

Finally, we'll rename Table 4a as ```tbl``` and sort it in descending order by the minimum mass ratio, qmin, so that the systems with the largest qmin come first.

In [7]:
v=Vizier(columns=['APOGEE','Per','M1','M2min','qmin','RAJ2000', 'DEJ2000'],
           column_filters={"Per":"<100","Conv":"=1"})
v.ROW_LIMIT = -1
catalogs=v.get_catalogs('J/AJ/156/18')
tbl=catalogs[2]
tbl.sort('qmin')
tbl.reverse()
tbl=tbl[~tbl['qmin'].mask]
print("Number of systems meeting our selection criteria:\n",len(tbl))
print(tbl)

Number of systems meeting our selection criteria:
 17
      APOGEE          Per        M1     M2min     qmin    RAJ2000    DEJ2000  
                       d      solMass  solMass              deg        deg    
------------------ ---------- -------- -------- -------- ---------- ----------
2M20183197+1953430   51.30696  1.96604  1.36037  0.69193 304.633241  19.895283
2M00085727+7341257   83.66193  1.77608  1.12293  0.63225   2.238636  73.690475
2M00092789+0145417    2.61133  0.78317  0.40848  0.52157   2.366216   1.761607
2M07282763+2225408   80.29508  1.91156  0.97309  0.50906 112.115125  22.428013
2M01210284+8431304   54.25957  2.25258  0.96650  0.42906  20.261862  84.525131
2M04411627+5855354   56.05161  1.46516  0.61969  0.42295  70.317793  58.926521
2M00104203+0152065   23.47142  0.92806  0.36483  0.39311   2.675149   1.868474
2M07103169+0712585   62.37094  1.65363  0.58727  0.35514 107.632065   7.216252
2M19364967+3813244   24.72355  0.88201  0.31293  0.35479 294.206988  38.22345

Our table has been narrowed down from 320 stellar systems to only 17 that meet our criteria.
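
The same selection logic can also be reproduced on data already in memory. The following is a minimal sketch that uses plain Python dictionaries as a stand-in for table rows. The first three rows copy values from the output above, while the last two are hypothetical rows added to exercise the period cut and the missing-qmin case. `select_candidates` is a hypothetical helper, not part of `astroquery`, and the `Conv` flag is left out for brevity.

```python
# Stand-in rows: APOGEE ID, period in days, minimum mass ratio (None = not measured)
rows = [
    {"APOGEE": "2M00092789+0145417", "Per": 2.61133, "qmin": 0.52157},
    {"APOGEE": "2M20183197+1953430", "Per": 51.30696, "qmin": 0.69193},
    {"APOGEE": "2M00104203+0152065", "Per": 23.47142, "qmin": 0.39311},
    {"APOGEE": "hypothetical-long-period", "Per": 250.0, "qmin": 0.80},
    {"APOGEE": "hypothetical-no-qmin", "Per": 12.0, "qmin": None},
]

def select_candidates(rows, max_period=100.0):
    """Keep rows with Per < max_period and a measured qmin, highest qmin first."""
    kept = [r for r in rows if r["Per"] < max_period and r["qmin"] is not None]
    return sorted(kept, key=lambda r: r["qmin"], reverse=True)

for r in select_candidates(rows):
    print(f'{r["APOGEE"]}  Per={r["Per"]:.2f} d  qmin={r["qmin"]:.3f}')
```

Filtering on the server, as in the cell above, remains preferable for large tables because less data has to be transferred.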

## Searching the TASOC and QLP archives for a light curve by coordinate

It's possible to search the MAST archive by object name, but the APOGEE names are not resolvable into a sky position. Instead, we will search by the RA and Dec coordinates. The default radius for a coordinate search is 0.2 degrees, or 720 arcseconds. For this tutorial, we will limit the search radius to 0; this may exclude some observations that are in fact of the same stellar system.
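
To see what a small, nonzero search radius would mean in practice, the angular separation between two sky positions can be computed directly. The sketch below uses the haversine formula in plain Python. The first position is the first row of the table above; the second is a hypothetical position offset by one arcsecond in declination. `angular_separation_arcsec` is an illustrative helper; inside the notebook, `SkyCoord.separation` from `astropy` does the same job.

```python
import math

def angular_separation_arcsec(ra1_deg, dec1_deg, ra2_deg, dec2_deg):
    """Great-circle separation between two sky positions, in arcseconds."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1_deg, dec1_deg, ra2_deg, dec2_deg))
    # Haversine formula, numerically stable for small separations
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(h))) * 3600.0

# Default query radius quoted above, converted to arcseconds
print(f"default radius = {0.2 * 3600.0:.0f} arcsec")

# Catalog position vs. a hypothetical position offset by 1 arcsec in Dec
ra, dec = 304.633241, 19.895283
sep = angular_separation_arcsec(ra, dec, ra, dec + 1.0 / 3600.0)
print(f"separation = {sep:.3f} arcsec")
```

With a radius of 0, even an offset this small between the catalog position and an archived target position would be treated as a non-match.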

Let's search the MAST archive for the first object in our table (```ind=0```) and examine the results.

In [8]:
ind=0
coord=SkyCoord(ra=tbl['RAJ2000'].data[ind]*u.degree, dec=tbl['DEJ2000'].data[ind]*u.degree, frame='icrs')
print('Searching for coordinate: ',coord.to_string('decimal'))
obs_tbl = Observations.query_region(coord, radius=0)
print("Number of observations in MAST:\n",len(obs_tbl))
print(obs_tbl)

Searching for coordinate:  304.633 19.8953
Number of observations in MAST:
 20
intentType obs_collection provenance_name instrument_name ... mtFlag srcDen  obsid   distance
---------- -------------- --------------- --------------- ... ------ ------ -------- --------
   science           TESS            SPOC      Photometer ...  False    nan 27463637      0.0
   science           TESS            SPOC      Photometer ...  False    nan 62870782      0.0
   science           TESS            SPOC      Photometer ...  False    nan 92616895      0.0
   science           TESS            SPOC      Photometer ...  False    nan 95133352      0.0
   science            PS1             3PI            GPC1 ...     -- 5885.0  2316686      0.0
   science            PS1             3PI            GPC1 ...     -- 5885.0  2316687      0.0
   science            PS1             3PI            GPC1 ...     -- 5885.0  2316688      0.0
   science            PS1             3PI            GPC1 ...     -- 5885.0

The MAST archive has multiple observations of this stellar system! However, we don't want to access the raw observations; we are interested in observations that have already been processed and transformed into a light curve; the light curve itself is what is called a "High Level Science Product" or HLSP. The products we want will have "HLSP" in the ```obs_collection``` column and "TASOC" or "QLP" in the ```provenance_name``` column.
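
That selection rule can be written as a simple row filter. The sketch below applies it to a few stand-in rows held as plain dictionaries. The rows are hypothetical examples shaped like the output above, keeping only the two columns needed, and `is_wanted_light_curve` is an illustrative helper rather than part of `astroquery`.

```python
WANTED_PROVENANCES = {"TASOC", "QLP"}

def is_wanted_light_curve(row):
    """True for HLSP products whose provenance is TASOC or QLP."""
    return (row["obs_collection"] == "HLSP"
            and row["provenance_name"] in WANTED_PROVENANCES)

# Hypothetical stand-in rows with the two columns used for the selection
observations = [
    {"obs_collection": "TESS", "provenance_name": "SPOC"},
    {"obs_collection": "PS1", "provenance_name": "3PI"},
    {"obs_collection": "HLSP", "provenance_name": "QLP"},
    {"obs_collection": "HLSP", "provenance_name": "TASOC"},
]

matches = [row for row in observations if is_wanted_light_curve(row)]
print(f"{len(matches)} of {len(observations)} rows are TASOC or QLP light curves")
```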

Let's instead write a short loop that will search the MAST archive for our whole table of coordinates and tell us which ones have TASOC or QLP light curves. Before the loop, we'll add a boolean column to ```tbl``` that will switch from False to True if a TASOC or QLP light curve is found. 

As you'll see, searching in a loop is very inefficient. It can take a while for our 17 rows, so trying to do this for all ~5000 possible binary systems would be impractical. Instead, cross-referencing catalogs of that size would require, for example, ```CasJobs``` or the online MAST Portal (see the second Exercise, below). 

In [9]:
n=len(tbl)
col_TASOC = Table.Column(name='TASOC',data=np.full(n,False))
col_QLP = Table.Column(name='QLP',data=np.full(n,False))
if 'TASOC' not in tbl.colnames: 
    tbl.add_column(col_TASOC) # Can only add the column once.
else:
    tbl['TASOC']=col_TASOC
if 'QLP' not in tbl.colnames: 
    tbl.add_column(col_QLP)
else: 
    tbl['QLP']=col_QLP

In [10]:
for i in range(0,n):
    coord=SkyCoord(ra=tbl['RAJ2000'].data[i]*u.degree, dec=tbl['DEJ2000'].data[i]*u.degree, frame='icrs')
    obs_tbl = Observations.query_region(coord, radius=0)
    
    print('Searching for index '+str(i)+', coordinate: '+coord.to_string('decimal'))
    if np.any(obs_tbl['provenance_name']=='TASOC'):
        tbl['TASOC'][i]=True
        print('Found TASOC light curve for index '+str(i)+', coordinate: '+coord.to_string('decimal'))
    if np.any(obs_tbl['provenance_name']=='QLP'):
        tbl['QLP'][i]=True
        print('Found QLP light curve for index '+str(i)+', coordinate: '+coord.to_string('decimal'))
print('Done.')

Searching for index 0, coordinate: 304.633 19.8953
Searching for index 1, coordinate: 2.23864 73.6905
Found QLP light curve for index 1, coordinate: 2.23864 73.6905
Searching for index 2, coordinate: 2.36622 1.76161
Searching for index 3, coordinate: 112.115 22.428
Searching for index 4, coordinate: 20.2619 84.5251
Searching for index 5, coordinate: 70.3178 58.9265
Searching for index 6, coordinate: 2.67515 1.86847
Searching for index 7, coordinate: 107.632 7.21625
Searching for index 8, coordinate: 294.207 38.2235
Searching for index 9, coordinate: 299.228 22.1054
Searching for index 10, coordinate: 94.5524 31.8039
Searching for index 11, coordinate: 122.85 32.8616
Found QLP light curve for index 11, coordinate: 122.85 32.8616
Searching for index 12, coordinate: 112.088 22.5467
Searching for index 13, coordinate: 323.17 12.4008
Searching for index 14, coordinate: 292.561 26.2298
Searching for index 15, coordinate: 273.55 -0.00303
Searching for index 16, coordinate: 80.6826 43.0118
Done.

In [11]:
print('Total number of systems with TASOC light curves found: '+str(np.sum(tbl['TASOC'])))
print('Total number of systems with QLP light curves found: '+str(np.sum(tbl['QLP'])))

Total number of systems with TASOC light curves found: 0
Total number of systems with QLP light curves found: 2


Two of our 17 likely candidates have QLP light curves with exact coordinate matches (as of the writing of this tutorial). Let's view the observations from MAST for the star at index 1. We will narrow down the ```obs_tbl``` results to only those that are QLP light curves.

In [12]:
ind=1
coord=SkyCoord(ra=tbl['RAJ2000'].data[ind]*u.degree, dec=tbl['DEJ2000'].data[ind]*u.degree, frame='icrs')
obs_tbl = Observations.query_region(coord, radius=0)
obs_tbl=obs_tbl[obs_tbl['provenance_name']=='QLP']
print(obs_tbl)

intentType obs_collection provenance_name instrument_name ... mtFlag srcDen  obsid   distance
---------- -------------- --------------- --------------- ... ------ ------ -------- --------
   science           HLSP             QLP      Photometer ...  False    nan 39201475      0.0
   science           HLSP             QLP      Photometer ...  False    nan 38106544      0.0
   science           HLSP             QLP      Photometer ...  False    nan 34983240      0.0
   science           HLSP             QLP      Photometer ...  False    nan 34099561      0.0


There are multiple light curves for this stellar system! In Part 2 of this tutorial, we will examine the light curves individually. 

## Exercises

1. An alternative method to narrowing down candidates from Table 4a: Instead of filtering by ```Per``` and sorting by qmin, filter by some qmin value (such as > 0.3) and then sort by period from lowest to highest. Still require Conv=1. How do your resulting tables vary? Which one do you think would be most helpful in trying to find eclipses, and why? 
2. An alternative method to searching astroquery in a loop: Export the APOGEE name, RA, and Dec from our filtered version of Table 4a (`tbl`) above to a .csv file, and use the Upload Target List feature on the online MAST Portal to search for QLP and TASOC observations in bulk. Hint: After cross-referencing your uploaded list to all MAST observations, filter (in the left hand column) by Mission: HLSP and Product Type: time series. Reference the [Writing Tables](https://docs.astropy.org/en/stable/io/ascii/write.html) page of the astropy documentation and [Search a List of Targets](https://outerspace.stsci.edu/display/MASTDOCS/Search+a+List+of+Targets) from MAST for help. Export the data (list of all HLSP light curves) to a local file and load it into this notebook. Compared to our loop above, which method is faster? Which is easier to find what you need?

In [13]:
# Place for code for Exercise 1

In [14]:
# Place for code for Exercise 2

## Citations

If you use `astropy`, `astroquery`, or `VizieR` for published research, please cite the
authors. Follow these links for more information about citing these tools:

* [Citing `astropy`](https://www.astropy.org/acknowledging.html)
* If you use astroquery, please cite the paper [Ginsburg, Sipőcz, Brasseur et al 2019](https://ui.adsabs.harvard.edu/abs/2019AJ....157...98G/abstract).
* [Citing `VizieR`](https://cds.unistra.fr/vizier-org/licences_vizier.html?#copyrightvizier)

## About this Notebook

**Author:** Julia Kamenetzky, ScienceBetter Consultant  
**Last Updated:** Sep 2022  
**Next Review:** Mar 2023

For support, please contact the Archive HelpDesk at archive@stsci.edu.

***
