<!--BOOK_INFORMATION-->
<img align="left" style="padding-right:10px;" src="figures/k2_pix_small.png">
*This notebook contains an excerpt of instructional material from [gully](https://twitter.com/gully_) and the [K2 Guest Observer Office](https://keplerscience.arc.nasa.gov/); the content is available [on GitHub](https://github.com/gully/k2-metadata).*


<!--NAVIGATION-->
< [Munge metadata into tidy dataframes](01.00-Munge-metadata-into-tidy-dataframes.ipynb) | [Contents](Index.ipynb) | [EPIC Catalog Column Descriptions](01.02-EPIC_catalog_column_descriptions.ipynb) >

# K2 Guest Observer Proposal Information

Things to say:

- Where and how to download all of the files
- Dissect the GO proposal IDs
- Read in the files
- Note perceived redundancy of RA and DEC information with EPIC catalog
- Note that some targets were, or will be, re-observed in multiple campaigns, so we must include the campaign alongside the EPIC ID.
- Save the combined file

In [1]:
import pandas as pd

In [2]:
import glob

In [3]:
#! pip install natsort
import natsort

We want all of the csv files in the `proposed_targets` directory.

In [4]:
fns = glob.glob('../metadata/proposed_targets/*.csv')

In [6]:
fns = natsort.natsorted(fns)
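
Natural sorting matters here because a plain lexicographic sort would place Campaign 10 before Campaign 2. The sketch below illustrates the difference using only the standard library; it is not how `natsort` is implemented, and the file names are hypothetical examples that follow the `K2Campaign<N>targets.csv` pattern implied by the parsing code in this notebook.

```python
import re

def natural_key(name):
    """Split a string into text and integer chunks so that 10 sorts after 2."""
    return [int(tok) if tok.isdigit() else tok for tok in re.split(r'(\d+)', name)]

# Hypothetical file names, assumed to follow the K2Campaign<N>targets.csv pattern
names = ['K2Campaign10targets.csv', 'K2Campaign2targets.csv',
         'K2Campaign9atargets.csv', 'K2Campaign1targets.csv']

plain_order = sorted(names)                      # lexicographic: 10 comes first
natural_order = sorted(names, key=natural_key)   # numeric-aware: 1, 2, 9a, 10

for name in natural_order:
    print(name)
```

With the natural key, the campaigns come out as 1, 2, 9a, 10, which is the order a reader would expect.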

In [6]:
frames = []  # one DataFrame per campaign file

In [7]:
for fn in fns:
    # The campaign label sits between 'K2Campaign' and 'targets.csv' in the file name
    id_min, id_max = fn.rfind('K2Campaign')+10, fn.find('targets.csv')
    campaign = fn[id_min:id_max]
    df = pd.read_csv(fn)
    # We need to clean the columns due to errant Campaign 1 whitespace
    df.rename(columns={col:col.strip(' ') for col in df.columns}, inplace=True)
    df['campaign'] = campaign
    frames.append(df)
    print(campaign, end=' ')

# DataFrame.append was removed in pandas 2.0, so concatenate once at the end
df_all = pd.concat(frames, ignore_index=True)

0 1 2 3 4 5 6 7 8 9a 9b 10 11 12 13 14 15 
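
The index arithmetic in the loop works, but it silently returns a wrong slice if a file name does not match the expected pattern. As an alternative sketch, a regular expression can extract the campaign label and report a non-match explicitly. The helper name `campaign_from_filename` and the exact pattern are assumptions based on the labels printed above (digits with an optional lowercase suffix such as `9a`).

```python
import re

# Assumed pattern: 'K2Campaign', digits, an optional lowercase letter, then 'targets.csv'
CAMPAIGN_RE = re.compile(r'K2Campaign(\d+[a-z]?)targets\.csv$')

def campaign_from_filename(fn):
    """Return the campaign label embedded in a file name, or None if absent."""
    match = CAMPAIGN_RE.search(fn)
    return match.group(1) if match else None

print(campaign_from_filename('../metadata/proposed_targets/K2Campaign9atargets.csv'))
print(campaign_from_filename('../metadata/proposed_targets/K2Campaign10targets.csv'))
print(campaign_from_filename('../metadata/proposed_targets/README.csv'))
```

Returning `None` for an unexpected name lets the caller decide whether to skip the file or raise an error.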

In [8]:
df_all.columns

Index(['EPIC ID', 'RA (J2000) [deg]', 'Dec (J2000) [deg]', 'magnitude',
       'Investigation IDs', 'campaign'],
      dtype='object')

In [9]:
df_all.iloc[70000:70005]

| | EPIC ID | RA (J2000) [deg] | Dec (J2000) [deg] | magnitude | Investigation IDs | campaign |
|---|---|---|---|---|---|---|
| 70000 | 210647774 | 54.196178 | 17.63787 | 11.286 | GO4060_LC\|GO4033_LC\|GO4007_LC | 4 |
| 70001 | 210647804 | 51.69544 | 17.638214 | 13.587 | GO4020_LC\|GO4060_LC\|GO4011_LC | 4 |
| 70002 | 210647813 | 52.355005 | 17.638277 | 12.824 | GO4029_LC\|GO4033_LC\|GO4007_LC | 4 |
| 70003 | 210647818 | 56.497824 | 17.638409 | 17.511 | GO4011_LC | 4 |
| 70004 | 210648137 | 58.730242 | 17.642622 | 13.075 | GO4020_LC\|GO4060_LC\|GO4007_LC | 4 |
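
The `Investigation IDs` column packs several proposal IDs into one pipe-delimited string. One way to dissect it is to split on `|` and give each ID its own row. The sketch below uses three rows copied from the preview above; the new column names `investigation`, `program`, and `mode` are hypothetical, and reading the `_LC` suffix as an observing mode is an assumption rather than something this notebook states.

```python
import pandas as pd

# Three rows copied from the preview above
df = pd.DataFrame({
    'EPIC ID': [210647774, 210647818, 210648137],
    'Investigation IDs': ['GO4060_LC|GO4033_LC|GO4007_LC',
                          'GO4011_LC',
                          'GO4020_LC|GO4060_LC|GO4007_LC'],
    'campaign': ['4', '4', '4'],
})

# Split the pipe-delimited string into a list, then emit one row per list entry
long_df = (
    df.assign(investigation=df['Investigation IDs'].str.split('|'))
      .explode('investigation')
      .drop(columns='Investigation IDs')
      .reset_index(drop=True)
)

# Separate the proposal ID from its suffix at the first underscore
long_df[['program', 'mode']] = (
    long_df['investigation'].str.strip().str.split('_', n=1, expand=True)
)

print(long_df)
```

In this long format, questions such as "which targets did a given proposal request?" reduce to a simple filter on `program`.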


In [10]:
df_all.shape

(426130, 6)
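
Because some targets appear in more than one campaign, `EPIC ID` alone does not identify a row in the combined table. The sketch below shows one way to check this and to list re-observed targets. The data are hypothetical toy rows, not values from the K2 files.

```python
import pandas as pd

# Hypothetical toy rows: EPIC 1001 is proposed in two different campaigns
toy = pd.DataFrame({
    'EPIC ID': [1001, 1002, 1001, 1003],
    'campaign': ['1', '1', '2', '2'],
})

# EPIC ID on its own repeats, but the (EPIC ID, campaign) pair does not
epic_is_unique = toy['EPIC ID'].is_unique
pair_is_unique = not toy.duplicated(subset=['EPIC ID', 'campaign']).any()

# Count distinct campaigns per target and keep those seen more than once
campaigns_per_target = toy.groupby('EPIC ID')['campaign'].nunique()
reobserved = campaigns_per_target[campaigns_per_target > 1].index.tolist()

print(epic_is_unique, pair_is_unique, reobserved)
```

The same checks could be run on `df_all` to confirm that the pair of columns behaves as a key before merging with other tables.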

In [12]:
df_all.to_csv('../metadata/tidy/GO_proposal_metadata.csv', index=False)

In [13]:
! du -hs ../metadata/tidy/GO_proposal_metadata.csv

 24M	../metadata/tidy/GO_proposal_metadata.csv


All done: the combined file is 24 MB.
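
One caveat when reading the combined file back: the `campaign` column holds labels such as `9a` and `9b` alongside purely numeric ones, and pandas infers types from the text it sees. The sketch below, which uses a small in-memory CSV with a trimmed set of columns, shows that a subset containing only numeric campaigns is inferred as integers unless the column is pinned to `str`.

```python
import io
import pandas as pd

# A small stand-in for the saved CSV, with only numeric campaign labels
csv_text = (
    "EPIC ID,campaign\n"
    "210647774,4\n"
    "210647818,10\n"
)

# Default inference turns the labels into integers
inferred = pd.read_csv(io.StringIO(csv_text))

# Pinning the dtype keeps them as strings, consistent with labels like '9a'
as_text = pd.read_csv(io.StringIO(csv_text), dtype={'campaign': str})

print(inferred['campaign'].dtype)
print(as_text['campaign'].tolist())
```

Passing `dtype={'campaign': str}` when loading `GO_proposal_metadata.csv` keeps comparisons such as `campaign == '4'` consistent across all campaigns.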
