# Cross-Match KOA and Megafile

This code cross-matches the file that contains all the information for the HGCA stars, or the 'megafile', and the file which contains all the observations that have been reduced in KOA.

Because the megafile stars are by definition accelerating, it is important that the RA and Dec of these files is converted to be correct for the date of the Keck observations. Because it would take to long to convert every RA to every line in the megafile we only convert by each date in the megafile and then check the stars en masse. 

For the observation time in the megafile we use the average epoch of the Gaia RA and Dec. 

In [18]:
# Necessary Modules 
import pandas as pd
from astropy.coordinates import SkyCoord, Distance
from astropy import units as u
from astropy.time import Time
import numpy as np
import time

In [2]:
# Reading in the HGCA megafile
megafile = pd.read_csv('/Users/Jess/HGCA_survey_paper/megafile.csv')
print('There are ' + str(len(megafile)) + ' HCGA stars.')

There are 115663 HCGA stars.


In [3]:
# Reading in KOA file
koafile = pd.read_csv('/Users/Jess/Downloads/new_keck_stars_edited.csv')
print('There are ' + str(len(koafile)) + ' files in the KOA database.')

There are 335474 files in the KOA database.


In [4]:
# Create a list of unique observation dates
myset = set(koafile['date'])
myset = list(myset)
print('The KOA list of files has ' + str(len(koafile['date'])) + ' files, which have ' + str(len(myset)) + ' unique observation dates.')

The KOA list of files has 335474 files, which have 1784 unique observation dates.


In [5]:
# Determine the average epoch time
ave_time = (megafile['epoch_ra_gaia'] + megafile['epoch_dec_gaia']) / 2

In [14]:
# Set up catalog of all the stars in the megafile
c = SkyCoord(ra = list(megafile['ra'])*u.deg, dec = list(megafile['dec'])*u.deg, 
distance = Distance(parallax = list(megafile['gaia_parallax']) * u.mas), 
pm_ra_cosdec = list(megafile['pmra_gaia']) * u.mas/u.yr, 
pm_dec = list(megafile['pmdec_gaia']) * u.mas/u.yr, 
obstime = Time(ave_time, format='decimalyear'))
print(len(c))

115663


In [17]:
# Set up a catalog of...
catalog = SkyCoord(ra=koafile['ra']*u.degree, dec=koafile['dec']*u.degree)
clean = (~np.isnan(catalog.ra) & ~np.isnan(catalog.dec))
print('The initial file has ' + str(len(catalog)) + ' lines,') 
print('After being cleaned it has ' + str(clean.sum()) + ' lines,')
print('Meaning that ' + str(len(catalog)-clean.sum()) + ' files have been removed.')
catalog_clean = catalog[clean]

The initial file has 335474 lines,
After being cleaned it has 335141 lines,
Meaning that 333 files have been removed.


In [19]:
t0 = time.time()

for j in range(len(myset)):
    print('Running for KOA date No# ' + str(j) + ': ' + str(myset[j])) 
    sc = c.apply_space_motion(new_obstime=Time(myset[j]))
    idxsc, idxcatalog, d2d, d3d = catalog_clean.search_around_sky(sc, 20*u.arcsecond)
    
t1 = time.time()
total = t1-t0
print('The time to run this code segment is ' + str(total/60) + 'minutes.')

Running for KOA date No# 0: 2003-12-16
Running for KOA date No# 1: 2013-08-27
Running for KOA date No# 2: 2018-04-21
Running for KOA date No# 3: 2014-12-12
Running for KOA date No# 4: 2012-07-30
Running for KOA date No# 5: 2019-05-28
Running for KOA date No# 6: 2011-11-13
Running for KOA date No# 7: 2018-05-01
Running for KOA date No# 8: 2013-08-25
Running for KOA date No# 9: 2008-03-29
Running for KOA date No# 10: 2011-05-15
Running for KOA date No# 11: 2016-08-20
Running for KOA date No# 12: 2012-10-27
Running for KOA date No# 13: 2016-07-21
Running for KOA date No# 14: 2016-03-20
Running for KOA date No# 15: 2011-08-14
Running for KOA date No# 16: 2017-01-26
Running for KOA date No# 17: 2013-02-04
Running for KOA date No# 18: 2013-06-22
Running for KOA date No# 19: 2009-10-05
Running for KOA date No# 20: 2005-07-31
Running for KOA date No# 21: 2012-10-23
Running for KOA date No# 22: 2006-12-11
Running for KOA date No# 23: 2015-08-26
Running for KOA date No# 24: 2018-08-19
Running fo

Running for KOA date No# 203: 2010-07-05
Running for KOA date No# 204: 2012-07-20
Running for KOA date No# 205: 2017-12-08
Running for KOA date No# 206: 2012-05-15
Running for KOA date No# 207: 2015-02-02
Running for KOA date No# 208: 2010-07-08
Running for KOA date No# 209: 2011-12-16
Running for KOA date No# 210: 2014-03-11
Running for KOA date No# 211: 2017-05-04
Running for KOA date No# 212: 2017-01-18
Running for KOA date No# 213: 2017-11-23
Running for KOA date No# 214: 2018-01-07
Running for KOA date No# 215: 2017-05-25
Running for KOA date No# 216: 2013-10-17
Running for KOA date No# 217: 2003-03-14
Running for KOA date No# 218: 2017-10-10
Running for KOA date No# 219: 2017-06-26
Running for KOA date No# 220: 2009-04-15
Running for KOA date No# 221: 2015-07-27
Running for KOA date No# 222: 2014-06-10
Running for KOA date No# 223: 2014-08-03
Running for KOA date No# 224: 2013-08-01
Running for KOA date No# 225: 2003-11-14
Running for KOA date No# 226: 2008-03-25
Running for KOA 

Running for KOA date No# 403: 2011-04-26
Running for KOA date No# 404: 2013-08-13
Running for KOA date No# 405: 2006-07-16
Running for KOA date No# 406: 2014-12-09
Running for KOA date No# 407: 2013-09-23
Running for KOA date No# 408: 2014-10-17
Running for KOA date No# 409: 2010-06-29
Running for KOA date No# 410: 2005-11-09
Running for KOA date No# 411: 2012-01-12
Running for KOA date No# 412: 2012-03-27
Running for KOA date No# 413: 2013-08-23
Running for KOA date No# 414: 2010-12-02
Running for KOA date No# 415: 2005-04-16
Running for KOA date No# 416: 2012-03-06
Running for KOA date No# 417: 2014-11-08
Running for KOA date No# 418: 2012-08-26
Running for KOA date No# 419: 2006-06-24
Running for KOA date No# 420: 2013-02-03
Running for KOA date No# 421: 2014-01-09
Running for KOA date No# 422: 2006-11-25
Running for KOA date No# 423: 2013-09-19
Running for KOA date No# 424: 2013-03-25
Running for KOA date No# 425: 2004-10-23
Running for KOA date No# 426: 2009-12-12
Running for KOA 

Running for KOA date No# 603: 2018-05-08
Running for KOA date No# 604: 2018-05-21
Running for KOA date No# 605: 2013-10-25
Running for KOA date No# 606: 2010-06-23
Running for KOA date No# 607: 2005-02-25
Running for KOA date No# 608: 2009-07-31
Running for KOA date No# 609: 2014-08-06
Running for KOA date No# 610: 2013-02-18
Running for KOA date No# 611: 2011-01-06
Running for KOA date No# 612: 2008-01-13
Running for KOA date No# 613: 2007-07-31
Running for KOA date No# 614: 2002-05-29
Running for KOA date No# 615: 2013-11-15
Running for KOA date No# 616: 2016-10-14
Running for KOA date No# 617: 2018-01-26
Running for KOA date No# 618: 2016-02-04
Running for KOA date No# 619: 2003-05-15
Running for KOA date No# 620: 2015-06-21
Running for KOA date No# 621: 2008-11-03
Running for KOA date No# 622: 2009-09-07
Running for KOA date No# 623: 2012-05-22
Running for KOA date No# 624: 2014-07-30
Running for KOA date No# 625: 2018-08-06
Running for KOA date No# 626: 2013-11-16
Running for KOA 

Running for KOA date No# 803: 2019-12-21
Running for KOA date No# 804: 2015-04-10
Running for KOA date No# 805: 2007-12-12
Running for KOA date No# 806: 2004-05-27
Running for KOA date No# 807: 2010-05-02
Running for KOA date No# 808: 2008-10-22
Running for KOA date No# 809: 2002-05-26
Running for KOA date No# 810: 2012-12-04
Running for KOA date No# 811: 2002-05-28
Running for KOA date No# 812: 2017-09-11
Running for KOA date No# 813: 2015-11-29
Running for KOA date No# 814: 2012-12-29
Running for KOA date No# 815: 2002-04-26
Running for KOA date No# 816: 2017-06-28
Running for KOA date No# 817: 2006-11-29
Running for KOA date No# 818: 2011-08-29
Running for KOA date No# 819: 2019-06-29
Running for KOA date No# 820: 2008-06-30
Running for KOA date No# 821: 2016-12-14
Running for KOA date No# 822: 2010-09-30
Running for KOA date No# 823: 2015-01-16
Running for KOA date No# 824: 2017-10-06
Running for KOA date No# 825: 2017-06-12
Running for KOA date No# 826: 2014-05-09
Running for KOA 

Running for KOA date No# 1003: 2009-01-23
Running for KOA date No# 1004: 2004-08-31
Running for KOA date No# 1005: 2014-01-13
Running for KOA date No# 1006: 2018-01-04
Running for KOA date No# 1007: 2013-05-28
Running for KOA date No# 1008: 2011-07-16
Running for KOA date No# 1009: 2018-12-12
Running for KOA date No# 1010: 2017-05-21
Running for KOA date No# 1011: 2018-08-04
Running for KOA date No# 1012: 2009-08-03
Running for KOA date No# 1013: 2018-08-03
Running for KOA date No# 1014: 2016-09-21
Running for KOA date No# 1015: 2016-05-24
Running for KOA date No# 1016: 2008-04-23
Running for KOA date No# 1017: 2008-06-17
Running for KOA date No# 1018: 2012-03-03
Running for KOA date No# 1019: 2018-02-12
Running for KOA date No# 1020: 2008-07-11
Running for KOA date No# 1021: 2014-04-18
Running for KOA date No# 1022: 2019-06-26
Running for KOA date No# 1023: 2004-02-05
Running for KOA date No# 1024: 2009-05-08
Running for KOA date No# 1025: 2017-12-04
Running for KOA date No# 1026: 201

Running for KOA date No# 1199: 2017-05-15
Running for KOA date No# 1200: 2004-07-11
Running for KOA date No# 1201: 2017-04-05
Running for KOA date No# 1202: 2014-01-12
Running for KOA date No# 1203: 2011-11-20
Running for KOA date No# 1204: 2006-07-17
Running for KOA date No# 1205: 2015-01-11
Running for KOA date No# 1206: 2015-08-28
Running for KOA date No# 1207: 2005-11-26
Running for KOA date No# 1208: 2012-10-26
Running for KOA date No# 1209: 2014-05-11
Running for KOA date No# 1210: 2015-04-03
Running for KOA date No# 1211: 2006-02-06
Running for KOA date No# 1212: 2010-07-28
Running for KOA date No# 1213: 2017-08-19
Running for KOA date No# 1214: 2010-07-14
Running for KOA date No# 1215: 2015-12-20
Running for KOA date No# 1216: 2005-04-30
Running for KOA date No# 1217: 2006-12-13
Running for KOA date No# 1218: 2017-10-31
Running for KOA date No# 1219: 2018-01-25
Running for KOA date No# 1220: 2016-05-16
Running for KOA date No# 1221: 2013-08-17
Running for KOA date No# 1222: 201

Running for KOA date No# 1395: 2012-05-21
Running for KOA date No# 1396: 2010-03-24
Running for KOA date No# 1397: 2008-11-10
Running for KOA date No# 1398: 2010-03-30
Running for KOA date No# 1399: 2016-11-05
Running for KOA date No# 1400: 2007-09-29
Running for KOA date No# 1401: 2017-07-07
Running for KOA date No# 1402: 2018-02-22
Running for KOA date No# 1403: 2018-05-27
Running for KOA date No# 1404: 2014-08-19
Running for KOA date No# 1405: 2015-09-25
Running for KOA date No# 1406: 2018-02-13
Running for KOA date No# 1407: 2012-02-04
Running for KOA date No# 1408: 2003-07-12
Running for KOA date No# 1409: 2016-06-11
Running for KOA date No# 1410: 2002-03-08
Running for KOA date No# 1411: 2012-08-11
Running for KOA date No# 1412: 2008-02-22
Running for KOA date No# 1413: 2016-02-22
Running for KOA date No# 1414: 2009-06-11
Running for KOA date No# 1415: 2015-02-04
Running for KOA date No# 1416: 2017-03-14
Running for KOA date No# 1417: 2015-09-05
Running for KOA date No# 1418: 200

Running for KOA date No# 1591: 2008-07-27
Running for KOA date No# 1592: 2018-04-24
Running for KOA date No# 1593: 2017-09-02
Running for KOA date No# 1594: 2004-02-03
Running for KOA date No# 1595: 2013-06-15
Running for KOA date No# 1596: 2007-06-22
Running for KOA date No# 1597: 2010-08-30
Running for KOA date No# 1598: 2002-02-28
Running for KOA date No# 1599: 2017-06-05
Running for KOA date No# 1600: 2007-08-08
Running for KOA date No# 1601: 2004-02-07
Running for KOA date No# 1602: 2015-08-21
Running for KOA date No# 1603: 2009-08-15
Running for KOA date No# 1604: 2007-09-19
Running for KOA date No# 1605: 2008-08-24
Running for KOA date No# 1606: 2008-07-26
Running for KOA date No# 1607: 2013-04-26
Running for KOA date No# 1608: 2009-06-15
Running for KOA date No# 1609: 2013-03-20
Running for KOA date No# 1610: 2009-07-20
Running for KOA date No# 1611: 2011-12-27
Running for KOA date No# 1612: 2012-05-26
Running for KOA date No# 1613: 2015-12-28
Running for KOA date No# 1614: 200

In [20]:
idx_set = set(idxsc)
print('There are ' + str(len(idx_set)) + ' unique stars in this list.')

There are 2433 unique stars in this list.


In [22]:
matches = c[idx_set]
idx_set = list(idx_set)
cross_matched_stars = megafile.iloc[idx_set]
cross_matched_stars.to_csv('/Users/Jess/HGCA_survey_paper/completed_cross_match.csv')

IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices

In [None]:
print("There are " + str(len(cross_matched_stars)) + ' matches.')