In [1]:
import pandas as pd
import numpy as np
import glob
import re

from datetime import date
from astropy.io import fits

The goal of this notebook is to have data ready for the exploratory data analysis.  
We will extract the required data from our data sources and store them in a SQLite database.

Our data sources are:

1. The cumulative KOIs Activity Table, a CSV file downloaded from the Exoplanet Archive website:  
https://exoplanetarchive.ipac.caltech.edu/cgi-bin/TblView/nph-tblView?app=ExoTbls&config=cumulative

2. The Data Validation FITS files, available on the Mikulski Archive for Space Telescopes (MAST) website:  
https://archive.stsci.edu/missions/kepler/dv_files/tar_files_dr25/

The role of the cumulative KOIs activity table is to provide the list of KOIs, the classification target variable and the features justifying the classification.  
The data validation FITS files provide all the features: the stellar properties and transit parameters.

Notes:

* The features are extracted from more than 140 GB of FITS files, which are not provided with this project because of their size
* This is a technical notebook and it does not explain the data sources; the FITS files in particular are explained in detail in the `Annex_Data_Validation_FITS_Files.ipynb` notebook

# 1. The Cumulative Kepler Objects of Interest Activity Table

## 1.1 Load the data source

In [2]:
# Load the source Cumulative KOIs Activity Table (skip the first 36 rows containing the column descriptions)
filename = '../resources/source_data/cumulative_kois_activity_table.csv'
kois = pd.read_csv(filename, skiprows=36)

# Format the kepler ids (length 9, padded with 0)
kois['kepid'] = kois['kepid'].astype(str)
kois['kepid'] = kois['kepid'].str.zfill(9)

# Add a column to store the FITS filename
kois['filename']  = ""

# Show data
print('KOIs Activity Table Shape:', kois.shape)
kois.head(5)

KOIs Activity Table Shape: (8054, 29)


Unnamed: 0,kepid,kepoi_name,kepler_name,koi_disposition,koi_vet_stat,koi_vet_date,koi_pdisposition,koi_score,koi_fpflag_nt,koi_fpflag_ss,...,koi_tce_plnt_num,koi_tce_delivname,koi_quarters,koi_bin_oedp_sig,koi_trans_mod,koi_model_dof,koi_model_chisq,koi_datalink_dvr,koi_datalink_dvs,filename
0,10797460,K00752.01,Kepler-227 b,CONFIRMED,Done,2018-08-16,CANDIDATE,1.0,0,0,...,1,q1_q17_dr25_tce,11111111111111111000000000000000,0.6864,Mandel and Agol (2002 ApJ 580 171),,,010/010797/010797460/dv/kplr010797460-20160209...,010/010797/010797460/dv/kplr010797460-001-2016...,
1,10797460,K00752.02,Kepler-227 c,CONFIRMED,Done,2018-08-16,CANDIDATE,0.969,0,0,...,2,q1_q17_dr25_tce,11111111111111111000000000000000,0.0023,Mandel and Agol (2002 ApJ 580 171),,,010/010797/010797460/dv/kplr010797460-20160209...,010/010797/010797460/dv/kplr010797460-002-2016...,
2,10811496,K00753.01,,CANDIDATE,Done,2018-08-16,CANDIDATE,0.0,0,0,...,1,q1_q17_dr25_tce,11111101110111011000000000000000,0.6624,Mandel and Agol (2002 ApJ 580 171),,,010/010811/010811496/dv/kplr010811496-20160209...,010/010811/010811496/dv/kplr010811496-001-2016...,
3,10848459,K00754.01,,FALSE POSITIVE,Done,2018-08-16,FALSE POSITIVE,0.0,0,1,...,1,q1_q17_dr25_tce,11111110111011101000000000000000,0.0,Mandel and Agol (2002 ApJ 580 171),,,010/010848/010848459/dv/kplr010848459-20160209...,010/010848/010848459/dv/kplr010848459-001-2016...,
4,10854555,K00755.01,Kepler-664 b,CONFIRMED,Done,2018-08-16,CANDIDATE,1.0,0,0,...,1,q1_q17_dr25_tce,01111111111111111000000000000000,0.309,Mandel and Agol (2002 ApJ 580 171),,,010/010854/010854555/dv/kplr010854555-20160209...,010/010854/010854555/dv/kplr010854555-001-2016...,
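
The `skiprows` value in the loading cell is hard-coded, so it has to match the number of description lines in the downloaded file. The sketch below counts those lines instead. It assumes that every description line starts with `#`, and the sample content is a hypothetical miniature of the CSV, not the real file.

```python
import io

def count_leading_comment_lines(lines, prefix="#"):
    """Count the consecutive lines at the top of a file that start with `prefix`."""
    count = 0
    for line in lines:
        if not line.startswith(prefix):
            break
        count += 1
    return count

# Hypothetical miniature of the downloaded CSV: description lines, then the header row
sample = io.StringIO(
    "# COLUMN kepid: KepID\n"
    "# COLUMN kepoi_name: KOI Name\n"
    "#\n"
    "kepid,kepoi_name\n"
    "10797460,K00752.01\n"
)
n_skip = count_leading_comment_lines(sample)
print(n_skip)  # 3
```

The result could then be passed as `skiprows=n_skip` to `pd.read_csv`. Another option is `pd.read_csv(filename, comment='#')`, provided no data value contains a `#` character.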


## 1.2  Add the FITS filename 

Each observation in the KOIs Activity Table is a registered TCE, identified by its full KOI name (`kepoi_name` column). Its host star is identified by the first part of the KOI name and also by its KIC number (`kepid` column). All the TCE features are stored in FITS files whose filenames contain the KIC number.

In this section, we map the corresponding FITS filename to each observation for which we have a FITS file:

In [3]:
# Path at which the FITS files are stored
FITS_PATH = '/Users/grillety/Exoplanets/Fits'

In [4]:
# Get the list of files with a given extension from a base path
def get_list_of_path(base_path, extension, recursive=True):
    return glob.glob(base_path + '/**/*.' + extension, recursive=recursive)

# Get the list of all .FITS files stored on my disk drive
fits_files = get_list_of_path(FITS_PATH, 'fits')
fits_files[:5]

['/Users/grillety/Exoplanets/Fits/kplr005513897-20160128150956_dvt.fits',
 '/Users/grillety/Exoplanets/Fits/kplr007898352-20160128150956_dvt.fits',
 '/Users/grillety/Exoplanets/Fits/kplr012416597-20160128150956_dvt.fits',
 '/Users/grillety/Exoplanets/Fits/kplr010907307-20160128150956_dvt.fits',
 '/Users/grillety/Exoplanets/Fits/kplr010864656-20160128150956_dvt.fits']

In [5]:
# Add the corresponding FITS filename to the KOI observations
for file in fits_files:
    
    # Extract the filename
    filename = file.split('/')[-1]
    
    # Extract the KIC number from the filename
    kic = re.search(r"(?<=kplr).*(?=-)", filename).group(0)
    
    # Update the dataframe with the filename corresponding to the kepler id
    kois.loc[kois['kepid'] == kic, 'filename'] = filename
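
The loop above updates the dataframe once per file. As an alternative sketch (not the method used in this notebook), the mapping can be built first as a dictionary and applied in a single lookup. The sketch assumes the naming convention `kplr` + 9-digit KIC number + `-` seen in the file list; `build_kic_to_filename` and the `/data/fits` paths are hypothetical.

```python
import os
import re

# Assumed naming convention: 'kplr' + 9-digit KIC number + '-' + timestamp + '_dvt.fits'
KIC_PATTERN = re.compile(r"^kplr(\d{9})-")

def build_kic_to_filename(paths):
    """Map each 9-digit KIC number to the basename of its FITS file."""
    mapping = {}
    for path in paths:
        filename = os.path.basename(path)
        match = KIC_PATTERN.match(filename)
        if match is None:
            continue  # skip files that do not follow the convention
        mapping[match.group(1)] = filename
    return mapping

# Hypothetical paths, for illustration only
paths = [
    "/data/fits/kplr010797460-20160128150956_dvt.fits",
    "/data/fits/kplr005513897-20160128150956_dvt.fits",
    "/data/fits/readme.fits",
]
kic_to_filename = build_kic_to_filename(paths)
print(kic_to_filename["010797460"])
```

With such a dictionary, `kois['kepid'].map(kic_to_filename).fillna("")` would fill the `filename` column in one pass.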

## 1.3 Basic Cleaning

### 1.3.1 Drop observations with missing FITS files and integrity issues

We must drop TCEs that have no FITS file or that have integrity issues (no TCE registered, 0 transits, etc.).  
The rules for this cleaning were discovered iteratively during the first attempts to build the datasets.

In [6]:
# Drop observations for which we have no fits file
filter = (kois['filename'] == "") 
kois.drop(index=kois[filter].index, inplace=True)
print(kois.shape)

(8054, 29)


In [7]:
# Drop observations with missing values in the TCE Number and Score columns
kois.dropna(subset=['koi_tce_plnt_num', 'koi_score'], axis=0, inplace=True)
print(kois.shape)

(8054, 29)


In [8]:
# Drop observations containing NOFITS in their comment (a missing comment counts as no match)
filter = kois['koi_comment'].str.contains('NOFITS', na=False)
kois.drop(index=kois[filter].index, inplace=True)
print(kois.shape)

(7745, 29)


In [9]:
# Drop observations with 0 transits
filter = (kois['koi_num_transits'] == 0)
kois.drop(index=kois[filter].index, inplace=True)
print(kois.shape)

(7737, 29)


In [10]:
# Drop the observations that have a FITS file but no transit model inside it
# No column was found that would allow excluding them automatically
missing_model_kepois = ['K03158.01', 'K03156.02', 'K03175.01', 'K07344.01', 'K06460.01']
filter = kois['kepoi_name'].isin(missing_model_kepois)
kois.drop(index=kois[filter].index, inplace=True)
print(kois.shape)

(7732, 29)


In [11]:
# Reset the dataframe index
kois.reset_index(inplace=True, drop=True)

### 1.3.2 Drop the empty variables

In [12]:
# Get the empty variables
vars_novalue = list(kois.columns[kois.isnull().all()])

# Drop the empty variables
kois.dropna(axis=1, how='all', inplace=True)

print('Empty variables dropped:', vars_novalue)
print('\nDataFrame Shape:', kois.shape)

Empty variables dropped: ['koi_model_dof', 'koi_model_chisq']

DataFrame Shape: (7732, 27)


### 1.3.3 Drop the variables with only one unique value

#### Object variables

In [13]:
# Get the object variables containing only one unique value
s_unique = kois.describe(include=object).T.loc[:, 'unique']
vars_unique = s_unique[s_unique == 1]

# Drop object variables with only one unique value
kois.drop(axis=1, columns=vars_unique.index, inplace=True)

print('Object variables with only one unique value dropped:', list(vars_unique.index))
print('\nDataFrame Shape:', kois.shape)

Object variables with only one unique value dropped: ['koi_vet_stat', 'koi_vet_date', 'koi_disp_prov', 'koi_tce_delivname', 'koi_trans_mod']

DataFrame Shape: (7732, 22)


#### Numeric variables

In [14]:
# Get statistics about the numeric variables
df_stats = kois.describe(include=np.number).T

# Variables with only one unique value (min value = max value)
vars_unique = df_stats.loc[(df_stats['min'] == df_stats['max']), :].index

# Drop numeric variables with only one unique value
kois.drop(axis=1, columns=vars_unique, inplace=True)

print('Numerical variables with only one unique value dropped:', list(vars_unique))
print('\nDataFrame Shape:', kois.shape)

Numerical variables with only one unique value dropped: []

DataFrame Shape: (7732, 22)


## 1.4 Save the target and justification variables in a database table

We keep only the classification target variable and the features justifying it, which are available only in the KOIs activity table:
* the classification target variable (`koi_disposition`)
* the classification using Kepler data (`koi_pdisposition`)
* the score
* the false positive flags
* the comment

The other features were only needed to link the target variable to the FITS files from which the features will be extracted.  
They will reappear during the data extraction from the TCE extension headers.

In [15]:
import sqlite3

# Connect to the database
DB_PATH = '../resources/capstone_db.sqlite'
db = sqlite3.connect(DB_PATH)

# Get a cursor
cursor = db.cursor()

In [16]:
# 1. CREATE A DATAFRAME WITH ONLY THE TARGET AND JUSTIFICATION VARIABLES
# ----------------------------------------------------------------------

# Target and justification variables coming from the KOIs activity table
cols = [
    'kepoi_name',
    'kepler_name',
    'koi_disposition',
    'koi_pdisposition',
    'koi_score',
    'koi_fpflag_nt',
    'koi_fpflag_ss',
    'koi_fpflag_co',
    'koi_fpflag_ec',
    'koi_comment',
    'filename'
]

# Create a dataframe with only the target and justification variables
df = kois.reindex(columns=cols)



# 2. CREATE THE DATABASE TABLE
# ----------------------------

# Drop the table if it already exists (easy to iterate in development)
cursor.execute('DROP TABLE IF EXISTS kois_activity_table;')

# Create the kois_activity_table table
query = '''
CREATE TABLE kois_activity_table
(
    kepoi_name TEXT PRIMARY KEY,
    kepler_name TEXT,
    koi_disposition TEXT NOT NULL,
    koi_pdisposition TEXT NOT NULL,
    koi_score REAL NOT NULL,
    koi_fpflag_nt INTEGER NOT NULL,
    koi_fpflag_ss INTEGER NOT NULL,
    koi_fpflag_co  INTEGER NOT NULL,
    koi_fpflag_ec INTEGER NOT NULL,
    koi_comment TEXT,
    filename TEXT
);
'''
cursor.execute(query)


# 3. LOAD THE DATA IN THE TABLE
# -----------------------------

# Insert the source data in the kois_activity_table database table
df.to_sql(name='kois_activity_table', con=db, if_exists='append', index=False)

# Check the loading
query = '''
SELECT *
FROM
    kois_activity_table;
'''
df = pd.read_sql_query(query, db)
print('Database Table Shape: ', df.shape)
df.head()

Database Table Shape:  (7732, 11)


Unnamed: 0,kepoi_name,kepler_name,koi_disposition,koi_pdisposition,koi_score,koi_fpflag_nt,koi_fpflag_ss,koi_fpflag_co,koi_fpflag_ec,koi_comment,filename
0,K00752.01,Kepler-227 b,CONFIRMED,CANDIDATE,1.0,0,0,0,0,NO_COMMENT,kplr010797460-20160128150956_dvt.fits
1,K00752.02,Kepler-227 c,CONFIRMED,CANDIDATE,0.969,0,0,0,0,NO_COMMENT,kplr010797460-20160128150956_dvt.fits
2,K00753.01,,CANDIDATE,CANDIDATE,0.0,0,0,0,0,DEEP_V_SHAPED,kplr010811496-20160128150956_dvt.fits
3,K00754.01,,FALSE POSITIVE,FALSE POSITIVE,0.0,0,1,0,0,MOD_ODDEVEN_DV---MOD_ODDEVEN_ALT---DEEP_V_SHAPED,kplr010848459-20160128150956_dvt.fits
4,K00755.01,Kepler-664 b,CONFIRMED,CANDIDATE,1.0,0,0,0,0,NO_COMMENT,kplr010854555-20160128150956_dvt.fits
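
The `PRIMARY KEY` and `NOT NULL` constraints in the table definition act as integrity checks at load time. The sketch below illustrates this on a reduced, hypothetical table (`kois_demo`, three columns) in an in-memory database, using the first rows shown above. It is an illustration, not the loading code of this notebook.

```python
import sqlite3

# First rows of the table above, reduced to three columns
rows = [
    ("K00752.01", "CONFIRMED", 1.0),
    ("K00752.02", "CONFIRMED", 0.969),
    ("K00753.01", "CANDIDATE", 0.0),
]

db = sqlite3.connect(":memory:")  # in-memory database, for illustration only
db.execute(
    """
    CREATE TABLE kois_demo (
        kepoi_name TEXT PRIMARY KEY,
        koi_disposition TEXT NOT NULL,
        koi_score REAL NOT NULL
    );
    """
)

# Used as a context manager, the connection commits on success and rolls back on error
with db:
    db.executemany("INSERT INTO kois_demo VALUES (?, ?, ?);", rows)

n_rows = db.execute("SELECT COUNT(*) FROM kois_demo;").fetchone()[0]
print(n_rows)  # 3

# PRIMARY KEY: a second row with the same kepoi_name is rejected
try:
    with db:
        db.execute("INSERT INTO kois_demo VALUES (?, ?, ?);", rows[0])
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

db.close()
```

Comparing the row count read back from the table with the dataframe shape, as done above, is a simple way to confirm that nothing was lost during the load.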


# 2. The Data Validation FITS Files

The `Annex_Data_Validation_FITS_Files.ipynb` notebook explains in detail the FITS file format.

We have to:

1. Extract the stellar properties from the FITS files primary header
2. Extract the transit parameters from the FITS files TCE extension headers

There is one FITS file per host star.  
A star can host several TCEs, so a FITS file can have several TCE extension headers, one for each registered TCE.

## 2.1 Get the list of keys to extract from the primary and TCE extension headers

In [17]:
# FITS file used to get the list of features/keys to extract from the primary and tce extension headers
filepath = '../resources/source_data/kplr010797460-20160128150956_dvt.fits'

# Get the primary header keys/features to extract
header_keys = []
with fits.open(filepath, mode='readonly') as hdulist:
    # Get all the keys stored in the primary header
    header_keys = list(hdulist[0].header.keys())
    
    # Keep only keys relative to the KOI star and drop keys describing the FITS file structure
    header_keys = header_keys[7:-1]
    
# Get the extension header keys/features to extract
extension_keys = []
with fits.open(filepath, mode='readonly') as hdulist:
    
    # Get all the keys stored in the extension header
    extension_keys = list(hdulist[1].header.keys())
    
    # Keep only keys relative to the TCE and drop keys describing the FITS file structure and keys duplicated from the primary header
    extension_keys = extension_keys[58:-1]
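
The positional slices `[7:-1]` and `[58:-1]` depend on the header layout of the reference file. As an alternative sketch (not the method used in this notebook), the keys can be filtered by name. The keyword list is a non-exhaustive selection of structural keywords from the FITS standard and common conventions, `select_feature_keys` is a hypothetical helper, and the sample keys are illustrative rather than read from an actual file.

```python
import re

# Structural keywords from the FITS standard and common conventions (non-exhaustive)
STRUCTURAL_EXACT = {
    "SIMPLE", "XTENSION", "BITPIX", "NAXIS", "EXTEND", "NEXTEND",
    "PCOUNT", "GCOUNT", "TFIELDS", "EXTNAME", "EXTVER",
    "CHECKSUM", "DATASUM", "COMMENT", "HISTORY",
}
# Indexed keywords such as NAXIS1, TTYPE3, TFORM3, TUNIT3, TDISP3
STRUCTURAL_INDEXED = re.compile(r"^(NAXIS|TTYPE|TFORM|TUNIT|TDISP|TNULL)\d+$")

def select_feature_keys(keys, already_extracted=()):
    """Keep the keys that are neither structural nor already extracted elsewhere."""
    skip = set(already_extracted)
    return [
        key for key in keys
        if key not in STRUCTURAL_EXACT
        and not STRUCTURAL_INDEXED.match(key)
        and key not in skip
    ]

# Illustrative key lists (not read from an actual file)
primary_keys = ["SIMPLE", "BITPIX", "NAXIS", "EXTEND", "OBJECT", "KEPLERID", "TEFF", "CHECKSUM"]
tce_keys = ["XTENSION", "BITPIX", "NAXIS", "NAXIS1", "NAXIS2", "PCOUNT", "GCOUNT",
            "TFIELDS", "TTYPE1", "TFORM1", "OBJECT", "KEPLERID", "TPERIOD", "TDEPTH", "CHECKSUM"]

header_features = select_feature_keys(primary_keys)
extension_features = select_feature_keys(tce_keys, already_extracted=header_features)
print(header_features)
print(extension_features)
```

A name-based filter of this kind would keep working if a file had a different number of structural cards, at the cost of maintaining the keyword list.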

## 2.2 Extract the data from the primary and TCE extension headers

In [18]:
# Extract the data from the FITS files and store them in a temporary list
data = []
for i in range(kois.shape[0]):
    
    # Get filename and TCE number
    filename = kois.filename[i]
    tce_number = int(kois.koi_tce_plnt_num[i])
    
    # Print progress status
    print('{}: {} - {}'.format(i, filename, tce_number))
    
    # FITS file path
    filepath = FITS_PATH + '/' + filename
    
    with fits.open(filepath, mode='readonly') as hdulist:
        
        fits_values = [kois.kepoi_name[i], kois.koi_disposition[i], kois.kepid[i], kois.filename[i]]
        
        # Get primary header features including the stellar properties
        for key in header_keys:
            fits_values.append(hdulist[0].header[key])
        
        # Get extension header features including the transit parameters
        for key in extension_keys:
            fits_values.append(hdulist[tce_number].header[key])
            
        data.append(fits_values)

0: kplr010797460-20160128150956_dvt.fits - 1
1: kplr010797460-20160128150956_dvt.fits - 2
2: kplr010811496-20160128150956_dvt.fits - 1
3: kplr010848459-20160128150956_dvt.fits - 1
4: kplr010854555-20160128150956_dvt.fits - 1
5: kplr010872983-20160128150956_dvt.fits - 1
6: kplr010872983-20160128150956_dvt.fits - 2
7: kplr010872983-20160128150956_dvt.fits - 3
8: kplr006721123-20160128150956_dvt.fits - 1
9: kplr010910878-20160128150956_dvt.fits - 1
10: kplr011446443-20160128150956_dvt.fits - 1
11: kplr010666592-20160128150956_dvt.fits - 1
12: kplr006922244-20160128150956_dvt.fits - 1
13: kplr010984090-20160128150956_dvt.fits - 2
14: kplr010419211-20160128150956_dvt.fits - 1
15: kplr010464078-20160128150956_dvt.fits - 1
16: kplr010480982-20160128150956_dvt.fits - 1
17: kplr010485250-20160128150956_dvt.fits - 1
18: kplr010526549-20160128150956_dvt.fits - 1
19: kplr010583066-20160128150956_dvt.fits - 1
20: kplr010583180-20160128150956_dvt.fits - 1
...

3662: kplr009091755-20160128150956_dvt.fits - 1
3663: kplr003852476-20160128150956_dvt.fits - 1
3664: kplr009700449-20160128150956_dvt.fits - 1
3665: kplr006349881-20160128150956_dvt.fits - 1
3666: kplr009851970-20160128150956_dvt.fits - 1
3667: kplr006140059-20160128150956_dvt.fits - 1
3668: kplr001718958-20160128150956_dvt.fits - 1
3669: kplr006428794-20160128150956_dvt.fits - 1
3670: kplr003228825-20160128150956_dvt.fits - 1
3671: kplr004673628-20160128150956_dvt.fits - 1
3672: kplr011401767-20160128150956_dvt.fits - 3
3673: kplr003852872-20160128150956_dvt.fits - 1
3674: kplr009099950-20160128150956_dvt.fits - 1
3675: kplr009718066-20160128150956_dvt.fits - 2
3676: kplr009272070-20160128150956_dvt.fits - 1
3677: kplr005088400-20160128150956_dvt.fits - 1
3678: kplr004669402-20160128150956_dvt.fits - 1
3679: kplr003757407-20160128150956_dvt.fits - 1
3680: kplr004769931-20160128150956_dvt.fits - 1
3681: kplr009220612-20160128150956_dvt.fits - 1
3682: kplr009086154-20160128150956_dvt.f

3834: kplr007604328-20160128150956_dvt.fits - 1
3835: kplr009514372-20160128150956_dvt.fits - 1
3836: kplr009777757-20160128150956_dvt.fits - 1
3837: kplr007103919-20160128150956_dvt.fits - 1
3838: kplr003762750-20160128150956_dvt.fits - 1
3839: kplr008645191-20160128150956_dvt.fits - 1
3840: kplr010722485-20160128150956_dvt.fits - 1
3841: kplr008128067-20160128150956_dvt.fits - 1
3842: kplr007703910-20160128150956_dvt.fits - 1
3843: kplr007777818-20160128150956_dvt.fits - 1
3844: kplr009851360-20160128150956_dvt.fits - 1
3845: kplr009886361-20160128150956_dvt.fits - 4
3846: kplr006937529-20160128150956_dvt.fits - 1
3847: kplr009490653-20160128150956_dvt.fits - 1
3848: kplr003328027-20160128150956_dvt.fits - 1
3849: kplr003644631-20160128150956_dvt.fits - 1
3850: kplr009640931-20160128150956_dvt.fits - 1
3851: kplr009640962-20160128150956_dvt.fits - 1
3852: kplr005097454-20160128150956_dvt.fits - 1
3853: kplr012737015-20160128150956_dvt.fits - 1
3854: kplr003663141-20160128150956_dvt.f

4009: kplr002831251-20160128150956_dvt.fits - 1
4010: kplr007694615-20160128150956_dvt.fits - 1
4011: kplr004556468-20160128150956_dvt.fits - 1
4012: kplr009222516-20160128150956_dvt.fits - 1
4013: kplr005431027-20160128150956_dvt.fits - 1
4014: kplr007285673-20160128150956_dvt.fits - 1
4015: kplr007747457-20160128150956_dvt.fits - 1
4016: kplr008652360-20160128150956_dvt.fits - 2
4017: kplr009602421-20160128150956_dvt.fits - 1
4018: kplr006070337-20160128150956_dvt.fits - 1
4019: kplr005954001-20160128150956_dvt.fits - 1
4020: kplr004373195-20160128150956_dvt.fits - 1
4021: kplr008196226-20160128150956_dvt.fits - 1
4022: kplr009002538-20160128150956_dvt.fits - 1
4023: kplr005642688-20160128150956_dvt.fits - 1
4024: kplr010026502-20160128150956_dvt.fits - 1
4025: kplr003953173-20160128150956_dvt.fits - 1
4026: kplr005552562-20160128150956_dvt.fits - 1
4027: kplr006186182-20160128150956_dvt.fits - 1
4028: kplr004073017-20160128150956_dvt.fits - 1
4029: kplr009002278-20160128150956_dvt.f

4184: kplr008430053-20160128150956_dvt.fits - 1
4185: kplr007115661-20160128150956_dvt.fits - 1
4186: kplr012458605-20160128150956_dvt.fits - 1
4187: kplr004845862-20160128150956_dvt.fits - 1
4188: kplr010130057-20160128150956_dvt.fits - 1
4189: kplr009838608-20160128150956_dvt.fits - 1
4190: kplr004247811-20160128150956_dvt.fits - 1
4191: kplr006543645-20160128150956_dvt.fits - 1
4192: kplr005352640-20160128150956_dvt.fits - 1
4193: kplr008738244-20160128150956_dvt.fits - 1
4194: kplr004157052-20160128150956_dvt.fits - 1
4195: kplr010006641-20160128150956_dvt.fits - 1
4196: kplr010815916-20160128150956_dvt.fits - 1
4197: kplr003547760-20160128150956_dvt.fits - 1
4198: kplr007281484-20160128150956_dvt.fits - 1
4199: kplr004848115-20160128150956_dvt.fits - 1
4200: kplr004048898-20160128150956_dvt.fits - 1
4201: kplr005991765-20160128150956_dvt.fits - 1
4202: kplr003962872-20160128150956_dvt.fits - 1
4203: kplr011912911-20160128150956_dvt.fits - 1
4204: kplr008573168-20160128150956_dvt.f

4360: kplr009083564-20160128150956_dvt.fits - 1
4361: kplr007335713-20160128150956_dvt.fits - 1
4362: kplr008331612-20160128150956_dvt.fits - 1
4363: kplr008874090-20160128150956_dvt.fits - 2
4364: kplr008331612-20160128150956_dvt.fits - 2
4365: kplr005185765-20160128150956_dvt.fits - 1
4366: kplr005965819-20160128150956_dvt.fits - 2
4367: kplr005965819-20160128150956_dvt.fits - 1
4368: kplr010155029-20160128150956_dvt.fits - 1
4369: kplr011618569-20160128150956_dvt.fits - 1
4370: kplr005471566-20160128150956_dvt.fits - 1
4371: kplr005814013-20160128150956_dvt.fits - 1
4372: kplr007117355-20160128150956_dvt.fits - 1
4373: kplr002707985-20160128150956_dvt.fits - 1
4374: kplr009899141-20160128150956_dvt.fits - 1
4375: kplr008183288-20160128150956_dvt.fits - 1
4376: kplr005965819-20160128150956_dvt.fits - 3
4377: kplr006690171-20160128150956_dvt.fits - 1
4378: kplr006859801-20160128150956_dvt.fits - 1
4379: kplr007767733-20160128150956_dvt.fits - 1
4380: kplr005193659-20160128150956_dvt.f

4531: kplr005471158-20160128150956_dvt.fits - 1
4532: kplr004847843-20160128150956_dvt.fits - 1
4533: kplr005120225-20160128150956_dvt.fits - 1
4534: kplr009656543-20160128150956_dvt.fits - 1
4535: kplr010751515-20160128150956_dvt.fits - 1
4536: kplr006182846-20160128150956_dvt.fits - 1
4537: kplr004939346-20160128150956_dvt.fits - 2
4538: kplr006442602-20160128150956_dvt.fits - 1
4539: kplr009851126-20160128150956_dvt.fits - 1
4540: kplr006268648-20160128150956_dvt.fits - 2
4541: kplr011455428-20160128150956_dvt.fits - 1
4542: kplr010484409-20160128150956_dvt.fits - 1
4543: kplr010935993-20160128150956_dvt.fits - 1
4544: kplr005257082-20160128150956_dvt.fits - 1
4545: kplr005796185-20160128150956_dvt.fits - 1
4546: kplr008628665-20160128150956_dvt.fits - 1
4547: kplr003945802-20160128150956_dvt.fits - 1
4548: kplr006265665-20160128150956_dvt.fits - 1
4549: kplr005881307-20160128150956_dvt.fits - 1
4550: kplr004645174-20160128150956_dvt.fits - 1
4551: kplr004645174-20160128150956_dvt.f

4704: kplr006756202-20160128150956_dvt.fits - 1
4705: kplr010215869-20160128150956_dvt.fits - 1
4706: kplr005456319-20160128150956_dvt.fits - 1
4707: kplr007025540-20160128150956_dvt.fits - 1
4708: kplr007620844-20160128150956_dvt.fits - 1
4709: kplr006677267-20160128150956_dvt.fits - 1
4710: kplr007732791-20160128150956_dvt.fits - 1
4711: kplr005736801-20160128150956_dvt.fits - 1
4712: kplr008177798-20160128150956_dvt.fits - 1
4713: kplr012643582-20160128150956_dvt.fits - 1
4714: kplr009328641-20160128150956_dvt.fits - 1
4715: kplr009966219-20160128150956_dvt.fits - 1
4716: kplr005991070-20160128150956_dvt.fits - 1
4717: kplr005531953-20160128150956_dvt.fits - 2
4718: kplr003757778-20160128150956_dvt.fits - 1
4719: kplr011754430-20160128150956_dvt.fits - 1
4720: kplr009995771-20160128150956_dvt.fits - 1
4721: kplr011875511-20160128150956_dvt.fits - 1
4722: kplr006231721-20160128150956_dvt.fits - 1
4723: kplr009536108-20160128150956_dvt.fits - 1
4724: kplr008905246-20160128150956_dvt.f

4882: kplr005560831-20160128150956_dvt.fits - 1
4883: kplr005305404-20160128150956_dvt.fits - 1
4884: kplr003728432-20160128150956_dvt.fits - 1
4885: kplr006696580-20160128150956_dvt.fits - 3
4886: kplr005473535-20160128150956_dvt.fits - 1
4887: kplr012106934-20160128150956_dvt.fits - 1
4888: kplr006222898-20160128150956_dvt.fits - 1
4889: kplr011200767-20160128150956_dvt.fits - 1
4890: kplr009892816-20160128150956_dvt.fits - 3
4891: kplr008589731-20160128150956_dvt.fits - 1
4892: kplr004356766-20160128150956_dvt.fits - 1
4893: kplr002167890-20160128150956_dvt.fits - 1
4894: kplr005384079-20160128150956_dvt.fits - 2
4895: kplr004935914-20160128150956_dvt.fits - 1
4896: kplr011303811-20160128150956_dvt.fits - 1
4897: kplr002437783-20160128150956_dvt.fits - 1
4898: kplr012062660-20160128150956_dvt.fits - 1
4899: kplr009777087-20160128150956_dvt.fits - 1
4900: kplr008883727-20160128150956_dvt.fits - 1
4901: kplr004820550-20160128150956_dvt.fits - 1
4902: kplr011709124-20160128150956_dvt.f

5060: kplr009489524-20160128150956_dvt.fits - 4
5061: kplr007031340-20160128150956_dvt.fits - 1
5062: kplr004073730-20160128150956_dvt.fits - 1
5063: kplr007813039-20160128150956_dvt.fits - 1
5064: kplr003644601-20160128150956_dvt.fits - 3
5065: kplr008459354-20160128150956_dvt.fits - 1
5066: kplr006531617-20160128150956_dvt.fits - 1
5067: kplr004164922-20160128150956_dvt.fits - 1
5068: kplr012061969-20160128150956_dvt.fits - 2
5069: kplr004164922-20160128150956_dvt.fits - 2
5070: kplr010328393-20160128150956_dvt.fits - 3
5071: kplr010259029-20160128150956_dvt.fits - 1
5072: kplr008299947-20160128150956_dvt.fits - 1
5073: kplr003335813-20160128150956_dvt.fits - 1
5074: kplr011858741-20160128150956_dvt.fits - 1
5075: kplr006363494-20160128150956_dvt.fits - 1
5076: kplr002576107-20160128150956_dvt.fits - 1
5077: kplr009532421-20160128150956_dvt.fits - 1
5078: kplr006936966-20160128150956_dvt.fits - 1
5079: kplr005653849-20160128150956_dvt.fits - 1
5080: kplr002437060-20160128150956_dvt.f

5236: kplr008179325-20160128150956_dvt.fits - 1
5237: kplr005211199-20160128150956_dvt.fits - 2
5238: kplr005475494-20160128150956_dvt.fits - 1
5239: kplr009412445-20160128150956_dvt.fits - 1
5240: kplr006057684-20160128150956_dvt.fits - 1
5241: kplr005271608-20160128150956_dvt.fits - 1
5242: kplr005357470-20160128150956_dvt.fits - 1
5243: kplr005443775-20160128150956_dvt.fits - 1
5244: kplr005527172-20160128150956_dvt.fits - 1
5245: kplr007386391-20160128150956_dvt.fits - 1
5246: kplr005529643-20160128150956_dvt.fits - 1
5247: kplr009366989-20160128150956_dvt.fits - 1
5248: kplr009468717-20160128150956_dvt.fits - 1
5249: kplr009528420-20160128150956_dvt.fits - 1
5250: kplr009529744-20160128150956_dvt.fits - 1
5251: kplr007834712-20160128150956_dvt.fits - 1
5252: kplr009574614-20160128150956_dvt.fits - 1
5253: kplr009674608-20160128150956_dvt.fits - 1
5254: kplr008495415-20160128150956_dvt.fits - 1
5255: kplr010341905-20160128150956_dvt.fits - 1
5256: kplr012115188-20160128150956_dvt.f

5419: kplr008934103-20160128150956_dvt.fits - 1
5420: kplr007618364-20160128150956_dvt.fits - 1
5421: kplr004150390-20160128150956_dvt.fits - 1
5422: kplr004851464-20160128150956_dvt.fits - 1
5423: kplr004913000-20160128150956_dvt.fits - 1
5424: kplr004945877-20160128150956_dvt.fits - 1
5425: kplr005110423-20160128150956_dvt.fits - 1
5426: kplr002581191-20160128150956_dvt.fits - 1
5427: kplr007379385-20160128150956_dvt.fits - 1
5428: kplr006199702-20160128150956_dvt.fits - 1
5429: kplr006307083-20160128150956_dvt.fits - 1
5430: kplr006780158-20160128150956_dvt.fits - 1
5431: kplr007199756-20160128150956_dvt.fits - 1
5432: kplr004947556-20160128150956_dvt.fits - 2
5433: kplr007362696-20160128150956_dvt.fits - 1
5434: kplr003865595-20160128150956_dvt.fits - 1
5435: kplr009593528-20160128150956_dvt.fits - 1
5436: kplr005195945-20160128150956_dvt.fits - 1
5437: kplr007300182-20160128150956_dvt.fits - 1
5438: kplr005288577-20160128150956_dvt.fits - 1
5439: kplr011969988-20160128150956_dvt.f

5597: kplr006364143-20160128150956_dvt.fits - 1
5598: kplr006364307-20160128150956_dvt.fits - 1
5599: kplr006366559-20160128150956_dvt.fits - 1
5600: kplr006367663-20160128150956_dvt.fits - 1
5601: kplr006368905-20160128150956_dvt.fits - 1
5602: kplr006372194-20160128150956_dvt.fits - 1
5603: kplr006387557-20160128150956_dvt.fits - 1
5604: kplr003958301-20160128150956_dvt.fits - 2
5605: kplr004932689-20160128150956_dvt.fits - 1
5606: kplr005294945-20160128150956_dvt.fits - 1
5607: kplr007742408-20160128150956_dvt.fits - 1
5608: kplr003973549-20160128150956_dvt.fits - 1
5609: kplr004058169-20160128150956_dvt.fits - 1
5610: kplr005391911-20160128150956_dvt.fits - 1
5611: kplr006443093-20160128150956_dvt.fits - 1
5612: kplr006509282-20160128150956_dvt.fits - 1
5613: kplr008231667-20160128150956_dvt.fits - 1
5614: kplr006697817-20160128150956_dvt.fits - 1
5615: kplr003439096-20160128150956_dvt.fits - 1
5616: kplr003324644-20160128150956_dvt.fits - 1
5617: kplr005437945-20160128150956_dvt.f

5777: kplr011081512-20160128150956_dvt.fits - 1
5778: kplr003867593-20160128150956_dvt.fits - 1
5779: kplr003953106-20160128150956_dvt.fits - 1
5780: kplr003962728-20160128150956_dvt.fits - 1
5781: kplr009282853-20160128150956_dvt.fits - 1
5782: kplr010189542-20160128150956_dvt.fits - 1
5783: kplr010191056-20160128150956_dvt.fits - 1
5784: kplr007599004-20160128150956_dvt.fits - 1
5785: kplr008507073-20160128150956_dvt.fits - 1
5786: kplr008509781-20160128150956_dvt.fits - 1
5787: kplr009304976-20160128150956_dvt.fits - 1
5788: kplr010227881-20160128150956_dvt.fits - 1
5789: kplr010320341-20160128150956_dvt.fits - 1
5790: kplr007658229-20160128150956_dvt.fits - 1
5791: kplr007662502-20160128150956_dvt.fits - 1
5792: kplr007672215-20160128150956_dvt.fits - 1
5793: kplr007674050-20160128150956_dvt.fits - 1
5794: kplr007700561-20160128150956_dvt.fits - 1
5795: kplr008552607-20160128150956_dvt.fits - 1
5796: kplr009390655-20160128150956_dvt.fits - 1
5797: kplr009391817-20160128150956_dvt.f

5958: kplr011036168-20160128150956_dvt.fits - 1
5959: kplr011037818-20160128150956_dvt.fits - 1
5960: kplr011046870-20160128150956_dvt.fits - 1
5961: kplr012554212-20160128150956_dvt.fits - 1
5962: kplr012645761-20160128150956_dvt.fits - 1
5963: kplr012736658-20160128150956_dvt.fits - 1
5964: kplr011090561-20160128150956_dvt.fits - 1
5965: kplr011152511-20160128150956_dvt.fits - 1
5966: kplr011176166-20160128150956_dvt.fits - 1
5967: kplr011341314-20160128150956_dvt.fits - 1
5968: kplr011407847-20160128150956_dvt.fits - 1
5969: kplr008374077-20160128150956_dvt.fits - 1
5970: kplr008517303-20160128150956_dvt.fits - 1
5971: kplr008590149-20160128150956_dvt.fits - 1
5972: kplr010055126-20160128150956_dvt.fits - 3
5973: kplr009285265-20160128150956_dvt.fits - 2
5974: kplr003559860-20160128150956_dvt.fits - 2
5975: kplr005384713-20160128150956_dvt.fits - 5
5976: kplr006525209-20160128150956_dvt.fits - 4
5977: kplr009947653-20160128150956_dvt.fits - 2
5978: kplr006527078-20160128150956_dvt.f

6150: kplr007749318-20160128150956_dvt.fits - 1
6151: kplr007777397-20160128150956_dvt.fits - 1
6152: kplr005817566-20160128150956_dvt.fits - 1
6153: kplr005952403-20160128150956_dvt.fits - 1
6154: kplr004150539-20160128150956_dvt.fits - 3
6155: kplr009839821-20160128150956_dvt.fits - 2
6156: kplr011252617-20160128150956_dvt.fits - 1
6157: kplr008245108-20160128150956_dvt.fits - 1
6158: kplr011252617-20160128150956_dvt.fits - 3
6159: kplr007830321-20160128150956_dvt.fits - 1
6160: kplr007841986-20160128150956_dvt.fits - 1
6161: kplr007877820-20160128150956_dvt.fits - 1
6162: kplr007914906-20160128150956_dvt.fits - 1
6163: kplr005792202-20160128150956_dvt.fits - 5
6164: kplr008019043-20160128150956_dvt.fits - 1
6165: kplr008023317-20160128150956_dvt.fits - 1
6166: kplr008126531-20160128150956_dvt.fits - 1
6167: kplr008121067-20160128150956_dvt.fits - 1
6168: kplr006197038-20160128150956_dvt.fits - 1
6169: kplr006221385-20160128150956_dvt.fits - 1
6170: kplr006221385-20160128150956_dvt.f

6323: kplr009301564-20160128150956_dvt.fits - 2
6324: kplr010091257-20160128150956_dvt.fits - 1
6325: kplr010093664-20160128150956_dvt.fits - 1
6326: kplr010095512-20160128150956_dvt.fits - 1
6327: kplr010129482-20160128150956_dvt.fits - 1
6328: kplr010155080-20160128150956_dvt.fits - 1
6329: kplr010156064-20160128150956_dvt.fits - 1
6330: kplr008555967-20160128150956_dvt.fits - 1
6331: kplr011198723-20160128150956_dvt.fits - 1
6332: kplr011199725-20160128150956_dvt.fits - 1
6333: kplr011200773-20160128150956_dvt.fits - 1
6334: kplr011232745-20160128150956_dvt.fits - 1
6335: kplr011233911-20160128150956_dvt.fits - 1
6336: kplr011234677-20160128150956_dvt.fits - 1
6337: kplr003765917-20160128150956_dvt.fits - 2
6338: kplr008652360-20160128150956_dvt.fits - 1
6339: kplr005476671-20160128150956_dvt.fits - 1
6340: kplr008216763-20160128150956_dvt.fits - 3
6341: kplr008094120-20160128150956_dvt.fits - 1
6342: kplr003854496-20160128150956_dvt.fits - 1
6343: kplr003938073-20160128150956_dvt.f

6503: kplr011409698-20160128150956_dvt.fits - 1
6504: kplr011411639-20160128150956_dvt.fits - 1
6505: kplr011455795-20160128150956_dvt.fits - 1
6506: kplr011457191-20160128150956_dvt.fits - 1
6507: kplr002301068-20160128150956_dvt.fits - 1
6508: kplr002308957-20160128150956_dvt.fits - 1
6509: kplr002422820-20160128150956_dvt.fits - 1
6510: kplr002437452-20160128150956_dvt.fits - 1
6511: kplr002437804-20160128150956_dvt.fits - 1
6512: kplr002438061-20160128150956_dvt.fits - 1
6513: kplr004180396-20160128150956_dvt.fits - 1
6514: kplr004252226-20160128150956_dvt.fits - 1
6515: kplr004261960-20160128150956_dvt.fits - 1
6516: kplr004269252-20160128150956_dvt.fits - 1
6517: kplr004274071-20160128150956_dvt.fits - 1
6518: kplr004281895-20160128150956_dvt.fits - 1
6519: kplr004282390-20160128150956_dvt.fits - 1
6520: kplr005095269-20160128150956_dvt.fits - 1
6521: kplr005111801-20160128150956_dvt.fits - 1
6522: kplr005112198-20160128150956_dvt.fits - 1
6523: kplr005113809-20160128150956_dvt.f

6678: kplr009602538-20160128150956_dvt.fits - 1
6679: kplr009630640-20160128150956_dvt.fits - 1
6680: kplr009635529-20160128150956_dvt.fits - 1
6681: kplr009640123-20160128150956_dvt.fits - 1
6682: kplr010490960-20160128150956_dvt.fits - 1
6683: kplr010491031-20160128150956_dvt.fits - 1
6684: kplr010535807-20160128150956_dvt.fits - 1
6685: kplr010535858-20160128150956_dvt.fits - 1
6686: kplr010546063-20160128150956_dvt.fits - 1
6687: kplr010547685-20160128150956_dvt.fits - 1
6688: kplr010549576-20160128150956_dvt.fits - 1
6689: kplr010552700-20160128150956_dvt.fits - 1
6690: kplr011619964-20160128150956_dvt.fits - 1
6691: kplr011651105-20160128150956_dvt.fits - 1
6692: kplr011662440-20160128150956_dvt.fits - 1
6693: kplr011672354-20160128150956_dvt.fits - 1
6694: kplr002708445-20160128150956_dvt.fits - 1
6695: kplr002708614-20160128150956_dvt.fits - 1
6696: kplr002711114-20160128150956_dvt.fits - 1
6697: kplr002720354-20160128150956_dvt.fits - 1
6698: kplr002834463-20160128150956_dvt.f

6854: kplr006386784-20160128150956_dvt.fits - 1
6855: kplr006390236-20160128150956_dvt.fits - 1
6856: kplr006421188-20160128150956_dvt.fits - 1
6857: kplr007116236-20160128150956_dvt.fits - 1
6858: kplr007117513-20160128150956_dvt.fits - 1
6859: kplr007117541-20160128150956_dvt.fits - 1
6860: kplr007118545-20160128150956_dvt.fits - 1
6861: kplr007125636-20160128150956_dvt.fits - 1
6862: kplr007128918-20160128150956_dvt.fits - 1
6863: kplr007132015-20160128150956_dvt.fits - 1
6864: kplr008088354-20160128150956_dvt.fits - 1
6865: kplr008091197-20160128150956_dvt.fits - 1
6866: kplr008094140-20160128150956_dvt.fits - 1
6867: kplr008094221-20160128150956_dvt.fits - 1
6868: kplr008097825-20160128150956_dvt.fits - 1
6869: kplr008098515-20160128150956_dvt.fits - 1
6870: kplr008098960-20160128150956_dvt.fits - 1
6871: kplr008104436-20160128150956_dvt.fits - 1
6872: kplr008822585-20160128150956_dvt.fits - 1
6873: kplr008823397-20160128150956_dvt.fits - 1
6874: kplr008845221-20160128150956_dvt.f

7029: kplr005978559-20160128150956_dvt.fits - 2
7030: kplr008838950-20160128150956_dvt.fits - 2
7031: kplr006867555-20160128150956_dvt.fits - 2
7032: kplr012208631-20160128150956_dvt.fits - 2
7033: kplr003346543-20160128150956_dvt.fits - 1
7034: kplr003352751-20160128150956_dvt.fits - 1
7035: kplr003356193-20160128150956_dvt.fits - 1
7036: kplr003427776-20160128150956_dvt.fits - 1
7037: kplr003437739-20160128150956_dvt.fits - 1
7038: kplr003441340-20160128150956_dvt.fits - 1
7039: kplr003448323-20160128150956_dvt.fits - 1
7040: kplr004863369-20160128150956_dvt.fits - 1
7041: kplr004863753-20160128150956_dvt.fits - 1
7042: kplr004908495-20160128150956_dvt.fits - 1
7043: kplr004912589-20160128150956_dvt.fits - 1
7044: kplr004914709-20160128150956_dvt.fits - 1
7045: kplr004914737-20160128150956_dvt.fits - 1
7046: kplr005471415-20160128150956_dvt.fits - 1
7047: kplr005471480-20160128150956_dvt.fits - 1
7048: kplr005479973-20160128150956_dvt.fits - 1
7049: kplr005513833-20160128150956_dvt.f

7204: kplr008283875-20160128150956_dvt.fits - 1
7205: kplr009119405-20160128150956_dvt.fits - 1
7206: kplr009163674-20160128150956_dvt.fits - 1
7207: kplr009172506-20160128150956_dvt.fits - 1
7208: kplr009179531-20160128150956_dvt.fits - 1
7209: kplr009965691-20160128150956_dvt.fits - 1
7210: kplr009971475-20160128150956_dvt.fits - 1
7211: kplr010000490-20160128150956_dvt.fits - 1
7212: kplr010002049-20160128150956_dvt.fits - 1
7213: kplr010014702-20160128150956_dvt.fits - 1
7214: kplr010019854-20160128150956_dvt.fits - 1
7215: kplr010020423-20160128150956_dvt.fits - 1
7216: kplr011027722-20160128150956_dvt.fits - 1
7217: kplr011029877-20160128150956_dvt.fits - 1
7218: kplr011068630-20160128150956_dvt.fits - 1
7219: kplr011076279-20160128150956_dvt.fits - 1
7220: kplr011092783-20160128150956_dvt.fits - 1
7221: kplr011124509-20160128150956_dvt.fits - 1
7222: kplr012506342-20160128150956_dvt.fits - 1
7223: kplr012555616-20160128150956_dvt.fits - 1
7224: kplr012557713-20160128150956_dvt.f

7394: kplr010461045-20160128150956_dvt.fits - 1
7395: kplr010493253-20160128150956_dvt.fits - 1
7396: kplr010515564-20160128150956_dvt.fits - 1
7397: kplr010321927-20160128150956_dvt.fits - 1
7398: kplr010321972-20160128150956_dvt.fits - 1
7399: kplr010322797-20160128150956_dvt.fits - 1
7400: kplr010331279-20160128150956_dvt.fits - 1
7401: kplr010338529-20160128150956_dvt.fits - 1
7402: kplr010340212-20160128150956_dvt.fits - 1
7403: kplr010383729-20160128150956_dvt.fits - 1
7404: kplr005709226-20160128150956_dvt.fits - 1
7405: kplr005775090-20160128150956_dvt.fits - 1
7406: kplr005791705-20160128150956_dvt.fits - 1
7407: kplr005802205-20160128150956_dvt.fits - 1
7408: kplr005812648-20160128150956_dvt.fits - 6
7409: kplr005858919-20160128150956_dvt.fits - 1
7410: kplr005872139-20160128150956_dvt.fits - 1
7411: kplr010519701-20160128150956_dvt.fits - 1
7412: kplr010664150-20160128150956_dvt.fits - 1
7413: kplr010735575-20160128150956_dvt.fits - 1
7414: kplr010753394-20160128150956_dvt.f

7586: kplr011612241-20160128150956_dvt.fits - 3
7587: kplr011661803-20160128150956_dvt.fits - 1
7588: kplr011661803-20160128150956_dvt.fits - 2
7589: kplr008608544-20160128150956_dvt.fits - 1
7590: kplr008608544-20160128150956_dvt.fits - 2
7591: kplr008618147-20160128150956_dvt.fits - 1
7592: kplr008630840-20160128150956_dvt.fits - 1
7593: kplr008683144-20160128150956_dvt.fits - 1
7594: kplr008708961-20160128150956_dvt.fits - 1
7595: kplr008740744-20160128150956_dvt.fits - 1
7596: kplr007451315-20160128150956_dvt.fits - 1
7597: kplr007751294-20160128150956_dvt.fits - 1
7598: kplr007832787-20160128150956_dvt.fits - 1
7599: kplr008008913-20160128150956_dvt.fits - 1
7600: kplr008022520-20160128150956_dvt.fits - 1
7601: kplr008108450-20160128150956_dvt.fits - 2
7602: kplr008293631-20160128150956_dvt.fits - 1
7603: kplr009604563-20160128150956_dvt.fits - 1
7604: kplr009639021-20160128150956_dvt.fits - 1
7605: kplr009640649-20160128150956_dvt.fits - 1
7606: kplr009643210-20160128150956_dvt.f

In [19]:
# Create a dataframe from the list of records
df_fits = pd.DataFrame.from_records(data)

# Rename the columns in lower case and reorder them
header_keys = [key.lower() for key in header_keys]
extension_keys = [key.lower() for key in extension_keys]
df_fits.columns = ['kepoi_name', 'koi_disposition', 'kepid', 'filename'] + header_keys + extension_keys

print('Dataframe shape:', df_fits.shape)
df_fits.head()

Dataframe shape: (7732, 99)


Unnamed: 0,kepoi_name,koi_disposition,kepid,filename,origin,date,creator,procver,filever,timversn,...,impact,inclin,drratio,radratio,pradius,maxmes,maxses,ntrans,convrge,meddetr
0,K00752.01,CONFIRMED,10797460,kplr010797460-20160128150956_dvt.fits,NASA/Ames,2016-03-01,1165179 DvTimeSeriesExporter2PipelineModule,svn+ssh://murzim/repo/soc/tags/release/9.3.43 ...,2.0,OGIP/93-003,...,0.934177,84.169681,9.196221,0.028956,2.929099,28.470819,5.135849,142,True,17.639999
1,K00752.02,CONFIRMED,10797460,kplr010797460-20160128150956_dvt.fits,NASA/Ames,2016-03-01,1165179 DvTimeSeriesExporter2PipelineModule,svn+ssh://murzim/repo/soc/tags/release/9.3.43 ...,2.0,OGIP/93-003,...,0.71163,89.395301,67.428761,0.029346,2.968514,20.109507,7.027669,25,True,23.52
2,K00753.01,CANDIDATE,10811496,kplr010811496-20160128150956_dvt.fits,NASA/Ames,2016-03-01,1165192 DvTimeSeriesExporter2PipelineModule,svn+ssh://murzim/repo/soc/tags/release/9.3.43 ...,2.0,OGIP/93-003,...,0.929919,89.029114,54.880784,0.129501,12.266182,187.449097,37.159767,56,True,9.8
3,K00754.01,FALSE POSITIVE,10848459,kplr010848459-20160128150956_dvt.fits,NASA/Ames,2016-03-01,1165166 DvTimeSeriesExporter2PipelineModule,svn+ssh://murzim/repo/soc/tags/release/9.3.43 ...,2.0,OGIP/93-003,...,0.93347,74.673384,3.531577,0.108508,9.365962,541.895081,39.066551,621,True,11.76
4,K00755.01,CONFIRMED,10854555,kplr010854555-20160128150956_dvt.fits,NASA/Ames,2016-03-01,1165185 DvTimeSeriesExporter2PipelineModule,svn+ssh://murzim/repo/soc/tags/release/9.3.43 ...,2.0,OGIP/93-003,...,0.690533,85.604302,9.009601,0.023557,2.688826,33.191898,4.749945,515,True,8.82


The dataframe contains features other than the stellar properties and transit parameters that we will use to train models.  
These supplementary features could be useful during the EDA and data cleaning processes.

## 2.3 Basic Cleaning

### 2.3.1 Drop the empty variables

In [20]:
# Get the variables with no value
vars_novalue = list(df_fits.columns[df_fits.isnull().all()])

# Drop variables with no value
df_fits.dropna(axis=1, how='all', inplace=True)

print('Empty variables dropped:', vars_novalue)
print('\nDataFrame Shape:', df_fits.shape)

Empty variables dropped: ['tierabso']

DataFrame Shape: (7732, 98)


### 2.3.2 Drop the variables with only one unique value

#### Object variables

In [21]:
# Get the object variables containing only one unique value
# (np.object was removed in NumPy 1.24; the built-in object selects the same columns)
s_unique = df_fits.describe(include=object).T.loc[:, 'unique']
vars_unique = s_unique[s_unique == 1]

# Drop the object variables with only one unique value
df_fits.drop(columns=vars_unique.index, inplace=True)

print('Object variables with only one unique value dropped:', list(vars_unique.index))
print('\nDataFrame Shape:', df_fits.shape)

Object variables with only one unique value dropped: ['origin', 'date', 'procver', 'filever', 'timversn', 'telescop', 'instrume', 'obsmode', 'mission', 'radesys', 'xmlstr', 'dvversn', 'timeref', 'tassign', 'timesys', 'timeunit', 'date-obs', 'date-end']

DataFrame Shape: (7732, 80)


#### Numeric variables

In [22]:
# Get statistics about the numeric variables
df_stats = df_fits.describe(include=np.number).T

# Variables with only one unique value (min value = max value)
vars_unique = df_stats.loc[(df_stats['min'] == df_stats['max']), :].index

# Drop the numeric variables with only one unique value
df_fits.drop(columns=vars_unique, inplace=True)

print('Numerical variables with only one unique value dropped:', list(vars_unique))
print('\nDataFrame Shape:', df_fits.shape)

Numerical variables with only one unique value dropped: ['data_rel', 'equinox', 'bjdrefi', 'bjdreff', 'lc_start', 'lc_end', 'deadc', 'timepixr', 'tierrela', 'int_time', 'readtime', 'frametim', 'num_frm', 'timedel', 'nreadout']

DataFrame Shape: (7732, 65)


#### Boolean variables

In [23]:
# Get the boolean variables containing only one unique value
# (np.bool was removed in NumPy 1.24; the built-in bool selects the same columns)
s_unique = df_fits.describe(include=bool).T.loc[:, 'unique']
vars_unique = s_unique[s_unique == 1]

# Drop the boolean variables with only one unique value
df_fits.drop(columns=vars_unique.index, inplace=True)

print('Boolean variables with only one unique value dropped:', list(vars_unique.index))
print('\nDataFrame Shape:', df_fits.shape)

Boolean variables with only one unique value dropped: ['backapp', 'deadapp', 'vignapp']

DataFrame Shape: (7732, 62)
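
The three cells above treat object, numeric and boolean columns separately. As a minimal sketch (not the code used in this notebook), the same check can be written once with `DataFrame.nunique`. The toy dataframe and the helper name `constant_columns` are illustrative assumptions.

```python
import pandas as pd


def constant_columns(df):
    """Return the names of the columns holding exactly one distinct non-null value."""
    # nunique() ignores NaN by default, so a column with a single value
    # plus missing entries is still reported as constant
    counts = df.nunique()
    return list(counts[counts == 1].index)


# Toy data: three constant columns of different dtypes and one varying column
toy = pd.DataFrame({
    'origin': ['NASA/Ames', 'NASA/Ames', 'NASA/Ames'],
    'equinox': [2000.0, 2000.0, 2000.0],
    'backapp': [True, True, True],
    'teff': [5850.0, 5853.0, 5795.0],
})

dropped = constant_columns(toy)
toy_clean = toy.drop(columns=dropped)

print('Constant columns:', dropped)
print('Remaining columns:', list(toy_clean.columns))
```

A single pass is shorter, but the per-dtype cells above make it easier to see which kind of variable was removed at each step.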


### 2.3.3 Variables with more than 90% of missing values

In [24]:
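# Variables whose number of missing values exceeds 90% of the rows
threshold = 0.9 * df_fits.shape[0]
vars_missing = df_fits.columns[df_fits.isnull().sum() > threshold]

# Drop exactly those variables; dropna(thresh=threshold) would instead
# drop every variable with fewer than 90% non-null values
df_fits.drop(columns=vars_missing, inplace=True)

print('Variables dropped:', list(vars_missing))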
print('\nDataFrame Shape:', df_fits.shape)

Variables dropped: ['parallax', 'scpid']

DataFrame Shape: (7732, 60)
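
The `thresh` argument of `dropna` counts non-null values, which is easy to confuse with a threshold on missing values. The sketch below contrasts the two rules on a toy dataframe; the column names `full`, `half` and `sparse` are illustrative assumptions.

```python
import numpy as np
import pandas as pd

# Toy data with 20 rows: 0%, 50% and 95% missing values
toy = pd.DataFrame({
    'full': np.arange(20, dtype=float),
    'half': [1.0, np.nan] * 10,
    'sparse': [1.0] + [np.nan] * 19,
})

# Rule 1: drop the columns with MORE than 90% missing values
missing_share = toy.isnull().mean()
to_drop = list(missing_share[missing_share > 0.9].index)
kept_by_share = toy.drop(columns=to_drop)

# Rule 2: dropna(thresh=k) keeps the columns with AT LEAST k non-null values,
# so k = 90% of the rows also removes the half-empty column
k = int(round(0.9 * len(toy)))
kept_by_thresh = toy.dropna(axis=1, thresh=k)

print('Dropped by missing share:', to_drop)
print('Kept by missing share:', list(kept_by_share.columns))
print('Kept by thresh:', list(kept_by_thresh.columns))
```

Both rules give the same result only when no column falls between 10% and 90% missing values.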


## 2.4 Save the data in the database

The dataframe contains features related to different themes:

* stellar properties
* transit parameters
* magnitudes at different spectral bands
* location and motion
* pipeline statistics

We don't plan to use all of these features, but we keep them and store them in separate database tables to simplify our select queries during the EDA. Features related to the host stars are stored in tables with the KIC number (`kepid` variable) as their primary key. Features related to the TCEs are stored in tables with the KOI name (`kepoi_name` variable) as their primary key.

We are not building a data warehouse, so we allow ourselves a denormalized schema and skip some database best practices.
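
To illustrate how the two kinds of keys work together, here is a minimal sketch using an in-memory SQLite database. The table names `stars_demo` and `tces_demo` are hypothetical and are not the tables created in this notebook; the values are taken from the first rows of the dataframe shown above.

```python
import sqlite3

import pandas as pd

# Throwaway in-memory database, independent of the project database
con = sqlite3.connect(':memory:')
cur = con.cursor()

# Star-level table keyed by the KIC number, TCE-level table keyed by the KOI name
cur.execute('CREATE TABLE stars_demo (kepid TEXT PRIMARY KEY, teff REAL);')
cur.execute('CREATE TABLE tces_demo (kepoi_name TEXT PRIMARY KEY, kepid TEXT, pradius REAL);')

cur.executemany(
    'INSERT INTO stars_demo VALUES (?, ?);',
    [('10797460', 5850.0), ('10811496', 5853.0)],
)
cur.executemany(
    'INSERT INTO tces_demo VALUES (?, ?, ?);',
    [
        ('K00752.01', '10797460', 2.929099),
        ('K00752.02', '10797460', 2.968514),
        ('K00753.01', '10811496', 12.266182),
    ],
)
con.commit()

# Attach the stellar temperature to each TCE through the shared kepid
query = '''
SELECT t.kepoi_name, t.pradius, s.teff
FROM tces_demo AS t
JOIN stars_demo AS s ON s.kepid = t.kepid
ORDER BY t.kepoi_name;
'''
joined = pd.read_sql_query(query, con)
con.close()

print(joined)
```

Keeping `kepid` in the TCE-level table is what makes this join possible during the EDA.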

### 2.4.1 Save the stellar properties in a database table

In [25]:
# 1. CREATE A DATAFRAME WITH ONLY THE STELLAR PROPERTIES VARIABLES
# ----------------------------------------------------------------

# Stellar properties variables coming from the primary header
cols = [
    'kepid',
    'teff',
    'logg',
    'feh',
    'radius',
    'ebminusv',
    'av',
    'numtces',
    'quarters'
]

# Create a dataframe with only the stellar properties variables 
df = df_fits.reindex(columns=cols)

# Drop the duplicate records (because there are stars hosting multiple TCEs)
df.drop_duplicates(subset=['kepid'], keep='first', inplace=True)



# 2. CREATE THE DATABASE TABLE
# ----------------------------

# Drop the table if it already exists (easy to iterate in development)
cursor.execute('DROP TABLE IF EXISTS stellar_properties;')

# Create the stellar_properties table
query = '''
CREATE TABLE stellar_properties
(
    kepid TEXT PRIMARY KEY,
    teff REAL,
    logg REAL,
    feh REAL,
    radius REAL,
    ebminusv REAL,
    av REAL,
    numtces  INTEGER,
    quarters TEXT
);
'''
cursor.execute(query)


# 3. LOAD THE DATA IN THE TABLE
# -----------------------------

# Insert the source data in the stellar_properties database table
df.to_sql(name='stellar_properties', con=db, if_exists='append', index=False)

# Check the loading
query = '''
SELECT *
FROM
    stellar_properties;
'''
df = pd.read_sql_query(query, db)
print('Stellar Properties Database Table Shape: ', df.shape)
df.head()

Stellar Properties Database Table Shape:  (6606, 9)


Unnamed: 0,kepid,teff,logg,feh,radius,ebminusv,av,numtces,quarters
0,10797460,5850.0,4.426,0.14,1.04,0.142,0.441,2,11111111111111111000000000000000
1,10811496,5853.0,4.544,-0.18,0.868,0.12,0.373,1,11111101110111011000000000000000
2,10848459,5795.0,4.546,-0.52,0.803,0.122,0.378,1,11111110111011101000000000000000
3,10854555,6031.0,4.438,0.07,1.046,0.142,0.44,1,01111111111111111000000000000000
4,10872983,6046.0,4.486,-0.08,0.972,0.168,0.522,3,01111101110111011000000000000000


We can see that we have 7,732 TCEs around 6,606 stars.

### 2.4.2 Save the magnitude features in a database table

In [26]:
# 1. CREATE A DATAFRAME WITH ONLY THE MAGNITUDE VARIABLES
# -------------------------------------------------------

# Magnitude variables coming from the primary header
cols = [
    'kepid',
    'kepmag',
    'd51mag',
    'gmag',
    'hmag',
    'imag',
    'jmag',
    'kmag',
    'rmag',
    'zmag',
    'grcolor',
    'jkcolor',
    'gkcolor', 
]

# Create a dataframe with only the magnitude variables 
df = df_fits.reindex(columns=cols)

# Drop the duplicate records (because there are stars hosting multiple TCEs)
df.drop_duplicates(subset=['kepid'], keep='first', inplace=True)



# 2. CREATE THE DATABASE TABLE
# ----------------------------

# Drop the table if it already exists (easy to iterate in development)
cursor.execute('DROP TABLE IF EXISTS magnitudes;')

# Create the magnitudes table
query = '''
CREATE TABLE magnitudes
(
    kepid TEXT PRIMARY KEY,
    kepmag REAL,
    d51mag REAL,
    gmag REAL,
    hmag REAL,
    imag REAL,
    jmag REAL,
    kmag REAL,
    rmag REAL,
    zmag REAL,
    grcolor REAL,
    jkcolor REAL,
    gkcolor REAL  
);
'''
cursor.execute(query)


# 3. LOAD THE DATA IN THE TABLE
# -----------------------------

# Insert the source data in the magnitudes database table
df.to_sql(name='magnitudes', con=db, if_exists='append', index=False)

# Check the loading
query = '''
SELECT *
FROM
    magnitudes;
'''
df = pd.read_sql_query(query, db)
print('Magnitudes Database Table Shape: ', df.shape)
df.head()

Magnitudes Database Table Shape:  (6606, 13)


Unnamed: 0,kepid,kepmag,d51mag,gmag,hmag,imag,jmag,kmag,rmag,zmag,grcolor,jkcolor,gkcolor
0,10797460,15.347,15.636,15.89,13.75,15.114,14.082,13.647,15.27,15.006,0.62,0.435,2.243
1,10811496,15.436,15.733,15.943,13.9,15.22,14.253,13.826,15.39,15.166,0.553,0.427,2.117
2,10848459,15.597,15.895,16.1,13.91,15.382,14.326,13.808,15.554,15.266,0.546,0.518,2.292
3,10854555,15.509,15.799,16.015,14.064,15.292,14.365,13.951,15.468,15.241,0.547,0.414,2.064
4,10872983,15.714,16.0,16.234,14.112,15.492,14.528,14.131,15.677,15.441,0.557,0.397,2.103


### 2.4.3 Save the location and motion features in a database table

In [27]:
# 1. CREATE A DATAFRAME WITH ONLY THE LOCATION AND MOTION VARIABLES
# -----------------------------------------------------------------

# Location and motion variables coming from the primary header
cols = [
    'kepid',
    'skygroup',
    'ra_obj',
    'dec_obj',
    'glon',
    'glat',
    'pmra',
    'pmdec',
    'pmtotal'
]

# Create a dataframe with only the location and motion variables 
df = df_fits.reindex(columns=cols)

# Drop the duplicate records (because there are stars hosting multiple TCEs)
df.drop_duplicates(subset=['kepid'], keep='first', inplace=True)



# 2. CREATE THE DATABASE TABLE
# ----------------------------

# Drop the table if it already exists (easy to iterate in development)
cursor.execute('DROP TABLE IF EXISTS location_motion;')

# Create the location_motion table
query = '''
CREATE TABLE location_motion
(
    kepid TEXT PRIMARY KEY,
    skygroup INTEGER, 
    ra_obj REAL,
    dec_obj REAL,
    glon REAL,
    glat REAL,
    pmra REAL,
    pmdec REAL,
    pmtotal REAL
);
'''
cursor.execute(query)


# 3. LOAD THE DATA IN THE TABLE
# -----------------------------

# Insert the source data in the location_motion database table
df.to_sql(name='location_motion', con=db, if_exists='append', index=False)

# Check the loading
query = '''
SELECT *
FROM
    location_motion;
'''
df = pd.read_sql_query(query, db)
print('Location and Motion Database Table Shape: ', df.shape)
df.head()

Location and Motion Database Table Shape:  (6606, 9)


Unnamed: 0,kepid,skygroup,ra_obj,dec_obj,glon,glat,pmra,pmdec,pmtotal
0,10797460,18,291.93423,48.14165,80.102268,14.237782,0.0,0.0,0.0
1,10811496,34,297.00482,48.13413,81.651714,11.21084,0.0,0.0,0.0
2,10848459,6,285.534611,48.28521,78.520972,18.224196,0.0,0.0,0.0
3,10854555,24,288.75489,48.2262,79.293537,16.210389,0.0,0.0,0.0
4,10872983,34,296.28614,48.22467,81.50266,11.676136,0.002,-0.006,0.0063


The following primary header features can be omitted without risk because they are catalog IDs or descriptions:
* keplerid
* object
* tmindex
* creator

### 2.4.4 Save the transit parameters in a database table

In [28]:
# 1. CREATE A DATAFRAME WITH ONLY THE TRANSIT PARAMETER VARIABLES
# ----------------------------------------------------------------

# Transit parameter variables coming from the TCE extension headers
cols = [
    'kepoi_name',
    'tperiod',
    'tdepth',
    'tdur',
    'indur',
    'impact',
    'inclin',
    'drratio',
    'radratio',
    'pradius',
]

# Create a dataframe with only the transit parameter variables 
df = df_fits.reindex(columns=cols)


# 2. CREATE THE DATABASE TABLE
# ----------------------------

# Drop the table if it already exists (easy to iterate in development)
cursor.execute('DROP TABLE IF EXISTS transit_parameters;')

# Create the transit_parameters table
query = '''
CREATE TABLE transit_parameters
(
    kepoi_name TEXT PRIMARY KEY,
    tperiod REAL,
    tdepth REAL,
    tdur REAL,
    indur REAL,
    impact REAL,
    inclin REAL,
    drratio REAL,
    radratio REAL,
    pradius REAL
);
'''
cursor.execute(query)


# 3. LOAD THE DATA IN THE TABLE
# -----------------------------

# Insert the source data in the transit_parameters database table
df.to_sql(name='transit_parameters', con=db, if_exists='append', index=False)

# Check the loading
query = '''
SELECT *
FROM
    transit_parameters;
'''
df = pd.read_sql_query(query, db)
print('Transit Parameters Database Table Shape: ', df.shape)
df.head()

Transit Parameters Database Table Shape:  (7732, 10)


Unnamed: 0,kepoi_name,tperiod,tdepth,tdur,indur,impact,inclin,drratio,radratio,pradius
0,K00752.01,9.488028,630.38163,3.418783,0.659365,0.934177,84.169681,9.196221,0.028956,2.929099
1,K00752.02,54.418374,893.123959,4.585753,0.257789,0.71163,89.395301,67.428761,0.029346,2.968514
2,K00753.01,19.899143,10630.409762,1.776158,0.888079,0.929919,89.029114,54.880784,0.129501,12.266182
3,K00754.01,1.736955,7874.849223,2.34129,1.170645,0.93347,74.673384,3.531577,0.108508,9.365962
4,K00755.01,2.525622,578.297482,1.624692,0.070214,0.690533,85.604302,9.009601,0.023557,2.688826


### 2.4.5 Save the TCE pipeline statistics in a database table

In [29]:
# 1. CREATE A DATAFRAME WITH ONLY THE PIPELINE STATISTICS VARIABLES
# -----------------------------------------------------------------

# Pipeline statistics variables coming from the TCE extension headers
cols = [
    'kepoi_name',
    'tsnr',
    'maxmes',
    'maxses',
    'meddetr',
    'cdpp3_0',
    'cdpp6_0',
    'cdpp12_0',
    'ntrans',
    'convrge'
]

# Create a dataframe with only the pipeline statistics variables 
df = df_fits.reindex(columns=cols)


# 2. CREATE THE DATABASE TABLE
# ----------------------------

# Drop the table if it already exists (easy to iterate in development)
cursor.execute('DROP TABLE IF EXISTS pipeline_statistics;')

# Create the pipeline_statistics table
query = '''
CREATE TABLE pipeline_statistics
(
    kepoi_name TEXT PRIMARY KEY,
    tsnr REAL,
    maxmes REAL,
    maxses REAL,
    meddetr REAL,
    cdpp3_0 REAL,
    cdpp6_0 REAL,
    cdpp12_0 REAL,
    ntrans INTEGER,
    convrge BOOLEAN
);
'''
cursor.execute(query)


# 3. LOAD THE DATA IN THE TABLE
# -----------------------------

# Insert the source data in the pipeline_statistics database table
df.to_sql(name='pipeline_statistics', con=db, if_exists='append', index=False)

# Check the loading
query = '''
SELECT *
FROM
    pipeline_statistics;
'''
df = pd.read_sql_query(query, db)
print('Pipeline Statistics Database Table Shape: ', df.shape)
df.head()

Pipeline Statistics Database Table Shape:  (7732, 10)


Unnamed: 0,kepoi_name,tsnr,maxmes,maxses,meddetr,cdpp3_0,cdpp6_0,cdpp12_0,ntrans,convrge
0,K00752.01,31.178432,28.470819,5.135849,17.639999,224.206421,184.823334,163.164536,142,1
1,K00752.02,20.690699,20.109507,7.027669,23.52,224.206421,184.823334,163.164536,25,1
2,K00753.01,181.560364,187.449097,37.159767,9.8,285.547546,257.448669,271.064423,56,1
3,K00754.01,530.372681,541.895081,39.066551,11.76,242.146011,183.0672,140.485062,621,1
4,K00755.01,37.310295,33.191898,4.749945,8.82,272.847504,222.750381,192.141464,515,1


The following TCE extension features can be omitted without risk because they are descriptive features of the experiment:

* tepoch: transit epoch in BKJD
* tstart: observation start time in BJD-BJDREF
* tstop: observation stop time in BJD-BJDREF
* telapse: TSTOP - TSTART
* livetime: TELAPSE multiplied by DEADC
* exposure: time on source

# 3. TCE status database table 

During the EDA, we will drop observations. Keeping track of the dropped observations and of the reason for dropping them is good practice, and it makes it easier to explain or justify model results, whether expected or not. So we add a technical status table holding one record per TCE, indicating whether it was dropped during the EDA and why.
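As a rough sketch of how such a status table might be used later, a TCE could be flagged with a parameterized `UPDATE`. The helper name `exclude_tce`, the exclusion reason, and the in-memory database are hypothetical and only serve the illustration. The toy table mirrors the `tce_status` columns defined in the next cell, without the foreign keys.

```python
import sqlite3

def exclude_tce(con, kepoi_name, reason):
    """Hypothetical helper: flag one TCE as excluded and record why."""
    cur = con.execute(
        'UPDATE tce_status SET excluded = 1, exclusion_reason = ? WHERE kepoi_name = ?;',
        (reason, kepoi_name),
    )
    con.commit()
    return cur.rowcount  # number of rows updated (0 if the KOI name is unknown)

# Toy database standing in for the project database
con = sqlite3.connect(':memory:')
con.execute("""
CREATE TABLE tce_status
(
    kepoi_name TEXT PRIMARY KEY,
    kepid TEXT NOT NULL,
    excluded BOOLEAN DEFAULT 0,
    exclusion_reason TEXT DEFAULT ''
);
""")
con.executemany(
    'INSERT INTO tce_status (kepoi_name, kepid) VALUES (?, ?);',
    [('K00752.01', '010797460'), ('K00752.02', '010797460')],
)

# The reason below is made up for the example
updated = exclude_tce(con, 'K00752.02', 'missing transit depth')

# TCEs still available for the analysis
remaining = con.execute(
    'SELECT kepoi_name FROM tce_status WHERE excluded = 0;'
).fetchall()
print(updated, remaining)
```

Passing the values as query parameters, rather than formatting them into the SQL string, avoids quoting problems with free-text reasons.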

In [30]:
# Drop the table if it already exists (easy to iterate in development)
cursor.execute('DROP TABLE IF EXISTS tce_status;')

query = '''
CREATE TABLE tce_status
(
    kepoi_name TEXT PRIMARY KEY,
    kepid TEXT NOT NULL,
    excluded BOOLEAN DEFAULT 0,
    exclusion_reason TEXT DEFAULT '',
    FOREIGN KEY (kepid) REFERENCES stellar_properties (kepid)
        ON DELETE CASCADE ON UPDATE NO ACTION,
    FOREIGN KEY (kepid) REFERENCES magnitudes (kepid)
        ON DELETE CASCADE ON UPDATE NO ACTION,
    FOREIGN KEY (kepid) REFERENCES location_motion (kepid)
        ON DELETE CASCADE ON UPDATE NO ACTION        
);
'''
cursor.execute(query)

<sqlite3.Cursor at 0x11186c490>

In [32]:
# Load the table with the list of KOIs
kois.loc[:, ['kepoi_name', 'kepid']].to_sql(name='tce_status', con=db, if_exists='append', index=False)

# Check the loading
query = '''
SELECT *
FROM
    tce_status;
'''
df = pd.read_sql_query(query, db)
print('TCE Status Database Table Shape: ', df.shape)
df.head()

TCE Status Database Table Shape:  (7732, 4)


Unnamed: 0,kepoi_name,kepid,excluded,exclusion_reason
0,K00752.01,10797460,0,
1,K00752.02,10797460,0,
2,K00753.01,10811496,0,
3,K00754.01,10848459,0,
4,K00755.01,10854555,0,
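One caveat about the foreign keys declared on `tce_status`: SQLite parses them, but it only enforces them on connections where `PRAGMA foreign_keys = ON` has been issued. Whether the `db` connection of this notebook does so is not shown in this section. The sketch below uses a separate in-memory database and a made-up KOI name (`K99999.01`) to show the difference enforcement makes.

```python
import sqlite3

con = sqlite3.connect(':memory:')
# The pragma has no effect inside an open transaction,
# so it is issued right after connecting
con.execute('PRAGMA foreign_keys = ON;')

con.execute('CREATE TABLE stellar_properties (kepid TEXT PRIMARY KEY);')
con.execute("""
CREATE TABLE tce_status
(
    kepoi_name TEXT PRIMARY KEY,
    kepid TEXT NOT NULL,
    FOREIGN KEY (kepid) REFERENCES stellar_properties (kepid)
        ON DELETE CASCADE ON UPDATE NO ACTION
);
""")
con.execute("INSERT INTO stellar_properties VALUES ('010797460');")
con.execute("INSERT INTO tce_status VALUES ('K00752.01', '010797460');")

# A TCE pointing at an unknown star is rejected once enforcement is on
try:
    con.execute("INSERT INTO tce_status VALUES ('K99999.01', '999999999');")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Deleting the star cascades to its TCE status rows
con.execute("DELETE FROM stellar_properties WHERE kepid = '010797460';")
rows_left = con.execute('SELECT COUNT(*) FROM tce_status;').fetchone()[0]
print(orphan_rejected, rows_left)
```

Without the pragma, the orphan insert would be accepted and the delete would leave the status row in place.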


In [33]:
# Commit and close the database
db.commit()
db.close()
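As a preview of how the split tables might be recombined during the EDA, `tce_status` can act as the bridge between the TCE-keyed tables (`kepoi_name`) and the star-keyed tables (`kepid`). The sketch below rebuilds a tiny in-memory copy of three tables, reusing a few values shown in the outputs above. The reduced column lists and the query itself are assumptions, not part of the notebook.

```python
import sqlite3

con = sqlite3.connect(':memory:')
con.executescript("""
CREATE TABLE stellar_properties (kepid TEXT PRIMARY KEY, teff REAL);
CREATE TABLE transit_parameters (kepoi_name TEXT PRIMARY KEY, tperiod REAL);
CREATE TABLE tce_status
(
    kepoi_name TEXT PRIMARY KEY,
    kepid TEXT NOT NULL,
    excluded BOOLEAN DEFAULT 0,
    exclusion_reason TEXT DEFAULT ''
);

INSERT INTO stellar_properties VALUES
    ('010797460', 5850.0),
    ('010811496', 5853.0);
INSERT INTO transit_parameters VALUES
    ('K00752.01', 9.488028),
    ('K00752.02', 54.418374),
    ('K00753.01', 19.899143);
INSERT INTO tce_status (kepoi_name, kepid) VALUES
    ('K00752.01', '010797460'),
    ('K00752.02', '010797460'),
    ('K00753.01', '010811496');
""")

# One row per non-excluded TCE, with its period and its host star temperature
query = """
SELECT s.kepoi_name, t.tperiod, p.teff
FROM tce_status AS s
JOIN transit_parameters AS t ON t.kepoi_name = s.kepoi_name
JOIN stellar_properties AS p ON p.kepid = s.kepid
WHERE s.excluded = 0
ORDER BY s.kepoi_name;
"""
rows = con.execute(query).fetchall()
for row in rows:
    print(row)
```

The same query string could be passed to `pd.read_sql_query` to get a DataFrame instead of a list of tuples.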