<!--BOOK_INFORMATION-->
<img align="left" style="padding-right:10px;" src="figures/k2_pix_small.png">
*This notebook contains excerpted instructional material from [gully](https://twitter.com/gully_) and the [K2 Guest Observer Office](https://keplerscience.arc.nasa.gov/); the content is available [on GitHub](https://github.com/gully/k2-metadata).*


<!--NAVIGATION-->
< [K2 Guest Observer Proposal Information](01.01-Concatenate-Guest-Observer-proposals.ipynb) | [Contents](Index.ipynb) | [EPIC Catalog read in and save to feather](01.03-Read-EPIC-catalog-faster.ipynb) >

# EPIC Catalog Column Descriptions

The **Ecliptic Plane Input Catalog** (EPIC) includes 67 columns of metadata for virtually any conceivable stellar/galactic target for K2.  [Huber et al. 2016](http://adsabs.harvard.edu/abs/2016ApJS..224....2H) defined the catalog in detail.  In this and the following Jupyter notebooks we show some techniques for working with the large catalog.

Things to say:
- Relevance of the "k2_observed" flag
- Need to specify datatypes to save memory while reading in
- Reading in from the ApJ instead of from the crummy README file?
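
As a rough illustration of the datatype point above, the sketch below reads the same small table twice, once with inferred dtypes and once with explicit ones, and compares memory use. The data are synthetic, and the pipe delimiter and the column names `id`, `objtype`, and `ra` are assumptions made for this example, not taken from the real catalog file.

```python
import io
import numpy as np
import pandas as pd

# Synthetic pipe-delimited stand-in for a catalog file (not real EPIC data).
rows = ["{}|STAR|{:.6f}".format(201000000 + i, 0.001 * i) for i in range(1000)]
raw = "\n".join(rows)
names = ["id", "objtype", "ra"]  # hypothetical column names

# Default read: pandas infers the dtypes on its own.
default = pd.read_csv(io.StringIO(raw), sep="|", names=names)

# Explicit dtypes: a narrower integer and a categorical for the repeated strings.
# The coordinate column stays float64 to keep its six decimal places.
compact = pd.read_csv(
    io.StringIO(raw), sep="|", names=names,
    dtype={"id": np.int32, "objtype": "category", "ra": np.float64},
)

default_bytes = default.memory_usage(deep=True).sum()
compact_bytes = compact.memory_usage(deep=True).sum()
print(default_bytes, compact_bytes)
```

The saving comes mostly from the repeated string column; how much it matters for the full catalog depends on how many such columns there are.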

In [1]:
import pandas as pd
pd.set_option('display.max_rows',70)
import numpy as np

## Catalog metadata and column descriptions

In [2]:
epic_readme = '../metadata/EPIC_catalog/README_epic.txt'

This file was not intended to be read as a conventional csv file.  Nevertheless, we can coax pandas into reading it in two passes: one for the first three whitespace-delimited columns, and one for the free-text description.

In [3]:
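# Pass 1: split on any whitespace and keep the column number, name, and format code.
first3cols = pd.read_csv(epic_readme, skiprows=26, delimiter=r"\s+", usecols=[0,1,2], names=list('abc'))
# Pass 2: split on runs of 3+ whitespace characters so the description stays in one field.
lastcol = pd.read_csv(epic_readme, skiprows=26, delimiter=r"\s\s\s+", usecols=[2], names=list('bcd'), engine='python')
# Flag comment-only and separator rows so they can be dropped.
bad_rows = (first3cols.a == '###') | (first3cols.a.str.contains('---'))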
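
To see why two different delimiters are used, here is a minimal sketch that applies the same two regular expressions to a single line with Python's `re` module. The line is a hypothetical stand-in for a README row; the real file's spacing may differ.

```python
import re

# Hypothetical README-style line; the real file's spacing may differ.
line = "#7    Objtype     A10     ... Object Type [STAR,EXTENDED]"

# Pass 1: split on any whitespace and keep the first three tokens.
num, name, fmt = re.split(r"\s+", line)[:3]

# Pass 2: split only on runs of three or more whitespace characters,
# so the single spaces inside the description survive.
description = re.split(r"\s{3,}", line)[-1]

print(num, name, fmt)
print(description)
```

Splitting on every space would break the description into separate words, which is why the second pass uses the wider delimiter.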

In [4]:
df_comb = pd.concat([first3cols, lastcol], axis=1)[~bad_rows].reset_index(drop=True)

In [5]:
df_comb['a']=df_comb.a.str.strip('#').astype(int)

In [6]:
df_comb = df_comb.rename(columns={'a': 'col_num', 'b':'col_name', 
                        'c': 'data_format', 'd':'description'})

In [7]:
# DataFrame.set_value was deprecated in pandas 0.21 and removed in 1.0; .at sets a single cell by label.
df_comb.at[8, 'description'] = '... Stellar Properties Flag [not used]'
df_comb.at[66, 'data_format'] = 'bool'
df_comb.at[66, 'description'] = '1=target was observed, 0=target not observed'

In [8]:
df_comb.data_format.str[0].unique()

array(['I', 'A', 'D', 'E', 'b'], dtype=object)
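
One possible use of these leading letters is to translate each format code into a pandas dtype. The sketch below shows the idea on a four-row stand-in table. The mapping is an assumption made for this example, not something defined by the README: in particular, `I` is mapped to `float64` because NumPy-backed integer columns cannot hold missing values. The name `k2_observed` is also assumed here.

```python
import pandas as pd

# Assumed mapping from the leading letter of a format code to a pandas dtype.
# 'I' goes to float64 so that missing entries can be stored as NaN.
LETTER_TO_DTYPE = {
    "I": "float64",
    "A": "object",
    "D": "float64",
    "E": "float64",
    "b": "bool",
}

# Small stand-in for the column-description table built above.
formats = pd.DataFrame({
    "name": ["id", "tyc", "ra", "k2_observed"],  # last name is hypothetical
    "data_format": ["I10", "A15", "D10.6", "bool"],
})

formats["dtype"] = formats["data_format"].str[0].map(LETTER_TO_DTYPE)
dtype_by_name = dict(zip(formats["name"], formats["dtype"]))
print(dtype_by_name)
```

A dictionary of this shape can be passed to the `dtype` argument of `pd.read_csv`.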

> The lack of a NaN representation in integer columns is a pandas "gotcha": by default, an integer column with missing values is read in as float.
- https://stackoverflow.com/questions/21287624/convert-pandas-column-containing-nans-to-dtype-int
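
A small sketch of the gotcha, using a synthetic two-column table with one missing value. The column names are made up for the example. The nullable `Int64` extension dtype shown in the second read requires a newer pandas (0.24 or later) than the one this notebook was written against, so treat it as one option rather than the approach used here.

```python
import io
import pandas as pd

# Synthetic two-column table; the second row has no value in the 'hip' column.
raw = "id|hip\n1|101\n2|\n3|303\n"

# Default inference: the gap forces the integer column to float64.
inferred = pd.read_csv(io.StringIO(raw), sep="|")

# Requesting the nullable extension dtype keeps integers and marks the gap as <NA>.
nullable = pd.read_csv(io.StringIO(raw), sep="|", dtype={"hip": "Int64"})

print(inferred["hip"].dtype)
print(nullable["hip"].dtype)
```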

In [10]:
df_comb

| | col_num | col_name | data_format | description |
|---|---|---|---|---|
| 0 | 1 | ID | I10 | ... K2 Input Catalog Identifier |
| 1 | 2 | HIP | I8 | ... Hipparcos Identifier |
| 2 | 3 | TYC | A15 | ... Tycho2 Identifier |
| 3 | 4 | UCAC | A15 | ... UCAC4 Identifier |
| 4 | 5 | 2MASS | A20 | ... 2MASS Identifier |
| 5 | 6 | SDSS | A20 | ... SDSS DR9 Identifier |
| 6 | 7 | Objtype | A10 | ... Object Type [STAR,EXTENDED] |
| 7 | 8 | Kepflag | A5 | ... Kepler Magnitude Flag [gri,BV,JHK,J] |
| 8 | 9 | StpropFlag | A5 | ... Stellar Properties Flag [not used] |
| 9 | 10 | RA | D10.6 | ... Right Ascension JD2000 (Deg) |


In [9]:
df_comb['name'] = df_comb.col_name.str.lower()

In [10]:
df_comb.to_csv('../metadata/EPIC_catalog/column_dtypes.csv', index=False)

The end
