## I - ``clean_bat``


For a file ``filename.bat`` containing the names of the light curves (LCs, yet to be downloaded) of systems with periods above the chosen limit, this code selects the LCs to fetch from the Kepler database, *provided* the SNR of the first reported transit is greater than 7.1 and the system has only one planet. The file ``filename.bat`` comes straight from the Exoplanet Archive Database and contains the light curve IDs (both long cadence and short cadence). See `readme.txt` for more information.

In [12]:
import numpy as np
import re
import pandas as pd

In [13]:
path_file = '/Users/mbadenas/Documents/Master UAB/Tesis UAB/TFM2018/clean_bat_files/LC_p13point5up/'
filename = 'all_targets_P13point5up.bat' 

"""
path_file = '/Users/mbadenas/Documents/Master UAB/Tesis UAB/TFM2018/clean_bat_files/LC_p15to15point5/'
filename = 'targets_15to15point.bat'
"""
#We only want short-cadence (slc) files; the dot before "fits" is escaped so it matches literally
pattern_slc = re.compile(r'^wget -O \'kplr(?P<k_id>[0-9]{9})-([0-9]{13})_slc\.fits.+',
                         re.MULTILINE)
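
As a quick sanity check of the pattern, the sketch below applies it to two made-up `wget` lines. Both lines and the `example.invalid` URL are hypothetical and only mimic the naming scheme the regex expects; the real entries in the `.bat` file may differ in their URL part. The pattern is the one from the cell above, with the dot before `fits` escaped.

```python
import re

# Same pattern as in the notebook cell (dot before "fits" escaped)
pattern_slc = re.compile(r"^wget -O 'kplr(?P<k_id>[0-9]{9})-([0-9]{13})_slc\.fits.+",
                         re.MULTILINE)

# Hypothetical lines: one short-cadence (slc) and one long-cadence (llc) file
slc_line = ("wget -O 'kplr010666592-2009131105131_slc.fits' "
            "'http://example.invalid/kplr010666592-2009131105131_slc.fits'\n")
llc_line = ("wget -O 'kplr010666592-2009131105131_llc.fits' "
            "'http://example.invalid/kplr010666592-2009131105131_llc.fits'\n")

m_slc = pattern_slc.match(slc_line)   # match object: this is a short-cadence file
m_llc = pattern_slc.match(llc_line)   # None: long-cadence files are ignored

# The ID is captured as a zero-padded 9-digit string; int() drops the padding
kepler_id = int(m_slc.group('k_id'))
```

The captured `k_id` is a string, so it has to be converted with `int()` before it can be compared with the integer `kepid` values read from the CSV table.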

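After identifying the name pattern of Kepler short-cadence light curves, we look for single-planet targets for which the first transit in the LC has an SNR > 7.1 (TESS constraint). The SNR of the first transit is estimated as the model SNR divided by the square root of the number of transits.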

In [14]:
props = path_file + 'all_targets_P13point5up.csv'
#props = path_file + 'all_targets_P15to15point5.csv'
dataset = pd.read_csv(props, sep=',', comment='#', na_values='\\N')

original_IDs = []

for row in dataset.itertuples(index=True, name='Pandas'):
    kepid = row.kepid                # Kepler ID of the target
    snr = row.koi_model_snr          # SNR of the transit model
    N = row.koi_num_transits         # number of transits
    num_planets = row.koi_count      # number of planets in the system
    #SNR of a single transit, assuming the total SNR grows as sqrt(N)
    snr_first_transit = snr / np.sqrt(N)

    #Keep single-planet systems whose first transit reaches the threshold
    if (snr_first_transit >= 7.1) and (num_planets == 1):
        original_IDs.append(kepid)

goodSN_IDs = np.array(original_IDs, dtype=int)
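
To make the cut concrete, here is a minimal standard-library sketch of the same criterion applied to a few made-up rows. The helper names `first_transit_snr` and `passes_cut` and the toy values are hypothetical; the only assumption carried over from the cell above is that the total SNR scales as the square root of the number of transits.

```python
import math

def first_transit_snr(model_snr, num_transits):
    """Single-transit SNR, assuming the total SNR grows as sqrt(num_transits)."""
    return model_snr / math.sqrt(num_transits)

def passes_cut(model_snr, num_transits, num_planets, threshold=7.1):
    """True for single-planet systems whose first transit reaches the threshold."""
    return first_transit_snr(model_snr, num_transits) >= threshold and num_planets == 1

# Made-up rows: (kepid, koi_model_snr, koi_num_transits, koi_count)
toy_rows = [
    (1001, 30.0, 4, 1),    # 30 / sqrt(4)  = 15, one planet   -> kept
    (1002, 10.0, 4, 1),    # 10 / sqrt(4)  = 5,  below 7.1    -> rejected
    (1003, 50.0, 25, 2),   # 50 / sqrt(25) = 10, two planets  -> rejected
]

toy_good_ids = [kepid for kepid, snr, n, count in toy_rows
                if passes_cut(snr, n, count)]
```

Only the first toy system survives both conditions, which mirrors how the catalogue shrinks once the SNR and single-planet requirements are applied together.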

In [15]:
"""
print('The number of targets with P>[15,15.5] days is **{}** but \
only **{}** have a SN>7.1 AND only 1 planet'.format(len(dataset.index), len(goodSN_IDs)))
"""
print('The number of targets with P>13.5 days is **{}** but \
only **{}** have a SN>7.1 AND only 1 planet'.format(len(dataset.index), len(goodSN_IDs)))

The number of targets with P>13.5 days is **984** but only **131** have a SN>7.1 AND only 1 planet


We will now proceed to download the SHORT-CADENCE photometric data for single-planet targets whose first transit has a SN > 7.1. Note that some systems only have long-cadence light curves, so the final number of targets may be smaller than the one shown in the above cell.

There are two ways to download the short-cadence light curves: we can either download 1 light curve for each target (in which case the code just fetches the first LC listed for that target), or download them all (in which case the download will be much slower, since one target can have many LCs accumulated over the Kepler Mission). The choice is controlled by the boolean `not_all_LC`: `not_all_LC=True` keeps only 1 LC per target, while `not_all_LC=False` keeps all of them.

The outputs of the code below are: a `*.bat` file with the LC ID(s) of the appropriate systems, and a `kepler_id.txt` file with the IDs of those systems.
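
The effect of `not_all_LC` can be illustrated on an in-memory list of lines before touching any file. This is a sketch under stated assumptions: the function `select_slc_lines`, its `first_only` argument (playing the role of `not_all_LC`) and the `example.invalid` lines are hypothetical, not part of the notebook.

```python
import re

pattern_slc = re.compile(r"^wget -O 'kplr(?P<k_id>[0-9]{9})-([0-9]{13})_slc\.fits.+")

def select_slc_lines(lines, good_ids, first_only):
    """Return the short-cadence lines to keep and the sorted IDs they belong to."""
    kept, seen = [], set()
    for line in lines:
        match = pattern_slc.match(line)
        if not match:
            continue                      # not a short-cadence wget line
        kepid = int(match.group('k_id'))
        if kepid not in good_ids:
            continue                      # target did not pass the SNR/planet cut
        if not first_only or kepid not in seen:
            kept.append(line)             # every line, or only the first per target
        seen.add(kepid)
    return kept, sorted(seen)

# Hypothetical input: two slc files and one llc file for target 11, one slc file for target 22
toy_lines = [
    "wget -O 'kplr000000011-2009131105131_slc.fits' 'http://example.invalid/a'\n",
    "wget -O 'kplr000000011-2009166043257_slc.fits' 'http://example.invalid/b'\n",
    "wget -O 'kplr000000011-2009131105131_llc.fits' 'http://example.invalid/c'\n",
    "wget -O 'kplr000000022-2009131105131_slc.fits' 'http://example.invalid/d'\n",
]

one_per_target, ids_one = select_slc_lines(toy_lines, {11}, first_only=True)
all_lcs, ids_all = select_slc_lines(toy_lines, {11}, first_only=False)
```

With `first_only=True` a single line is kept for target 11; with `first_only=False` both of its short-cadence lines are kept. In either case the list of IDs is the same, so the choice only changes the size of the download script.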

In [16]:
#Create the .bat file where we'll store the LC IDs
filename_out = path_file + 'good_list_ids.bat'
ids_from_bat = set()

#How many LCs do you want? False to download *all* the LCs for a given system, True to only get 1.
not_all_LC = False

with open(path_file + filename, 'r') as f_kepler, open(filename_out, 'w') as f_out:
    f_out.write('#!/bin/sh\n\n')
    for line in f_kepler:
        is_slc = pattern_slc.match(line)
        if is_slc and int(is_slc.group('k_id')) in goodSN_IDs:
            k_id = is_slc.group('k_id')
            #Write every matching line, or only the first one per system if not_all_LC is True
            if (not not_all_LC) or (k_id not in ids_from_bat):
                f_out.write(line)
            ids_from_bat.add(k_id)

ids_final = np.array(list(ids_from_bat),dtype=int)
print(len(ids_final))

np.savetxt(path_file+'kepler_id.txt',ids_final,fmt='%d', newline='\n')


print("There are a total of {} systems with *SHORT-CADENCE* LCs and with a SN (for the first transit) \
greater than 7.1 and only 1 planet".format(len(ids_final)))

55
There are a total of 55 systems with *SHORT-CADENCE* LCs and with a SN (for the first transit) greater than 7.1 and only 1 planet
