# Explore NASA Exoplanet Data 

> J. Colliander  
> 2022-06-08 (at [3rd Jack Eddy Symposium](https://cpaess.ucar.edu/meetings/eddy-symposium-2022)).

The NASA Exoplanet Science Institute (NExScI) at Caltech hosts the [NASA Exoplanet Archive](https://exoplanetarchive.ipac.caltech.edu/). The Archive provides data on known exoplanets and makes that data accessible through an [application programming interface (API)](https://exoplanetarchive.ipac.caltech.edu/docs/program_interfaces.html). The [API Queries](https://exoplanetarchive.ipac.caltech.edu/docs/API_queries.html) page provides example calls.

The goals for this short notebook: 

+ ingest NASA Exoplanet Archive data via the API
+ transform the data into a pandas DataFrame
+ start exploring the data
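
Every query URL in this notebook follows the same pattern: the Archive's `TAP/sync` endpoint followed by a `query` and a `format` parameter. As a minimal sketch, the helper below assembles such a URL with the standard library so that queries can be written as plain strings. The name `tap_url` is hypothetical and not part of the Archive's API.

```python
from urllib.parse import urlencode

# Base endpoint, taken from the URLs used in this notebook.
TAP_SYNC = "https://exoplanetarchive.ipac.caltech.edu/TAP/sync"


def tap_url(query, fmt="csv"):
    """Return a TAP sync URL for `query` (hypothetical helper, for illustration)."""
    # urlencode percent-encodes special characters and turns spaces into '+'.
    return f"{TAP_SYNC}?{urlencode({'query': query, 'format': fmt})}"


url = tap_url("select * from ps")
print(url)
```

The resulting string can be passed to `pd.read_csv`, or to `pd.read_json` when `fmt="json"` is used.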

In [1]:
# gather some tools
import pandas as pd

In [2]:
# Read the Planetary Systems (ps) table from the Archive's TAP API, served as CSV, into a pandas DataFrame.
# The current version of this call emits a warning message that I choose to ignore for now...
df = pd.read_csv('https://exoplanetarchive.ipac.caltech.edu/TAP/sync?query=select+*+from+ps&format=csv')
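
The text of the message is not reproduced here, so its cause is an assumption. One common candidate when reading a wide CSV is pandas' `DtypeWarning`, which `read_csv` emits when chunked parsing infers mixed types within a column. Under that assumption, a minimal sketch of one way to avoid it is to pass `low_memory=False`. The helper `load_table` and the inline sample are hypothetical stand-ins, so the snippet runs without network access.

```python
import io

import pandas as pd


def load_table(source):
    """Read a CSV source without chunked type inference (hypothetical helper)."""
    # low_memory=False makes pandas infer each column's type from the whole
    # column rather than chunk by chunk, which avoids mixed-type inference.
    return pd.read_csv(source, low_memory=False)


# Tiny made-up sample standing in for the Archive's CSV response.
sample = io.StringIO("pl_name,disc_year\nPlanet A b,2014\nPlanet B c,2016\n")
table = load_table(sample)
print(table.dtypes)
```

With the real data, `source` would be the TAP URL used in the cell above. The trade-off is higher memory use while parsing.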



In [3]:
# Dump the data into a table
df

| | pl_name | pl_letter | hostname | hd_name | hip_name | tic_id | gaia_id | default_flag | pl_refname | sy_refname | ... | sy_jmagerr1 | sy_jmagerr2 | sy_jmagstr | sy_hmag | sy_hmagerr1 | sy_hmagerr2 | sy_hmagstr | sy_kmag | sy_kmagerr1 | sy_kmagerr2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Kepler-11 c | c | Kepler-11 | | | TIC 169175503 | Gaia DR2 2076960598545789824 | 0 | `<a refstr=LISSAUER_ET_AL__2011 href=https://ui...` | `<a refstr=STASSUN_ET_AL__2019 href=https://ui....` | ... | 0.024 | -0.024 | `12.548&plusmn;0.024` | 12.237 | 0.024 | -0.024 | `12.237&plusmn;0.024` | 12.180 | 0.020 | -0.020 |
| 1 | Kepler-11 f | f | Kepler-11 | | | TIC 169175503 | Gaia DR2 2076960598545789824 | 0 | `<a refstr=LISSAUER_ET_AL__2011 href=https://ui...` | `<a refstr=STASSUN_ET_AL__2019 href=https://ui....` | ... | 0.024 | -0.024 | `12.548&plusmn;0.024` | 12.237 | 0.024 | -0.024 | `12.237&plusmn;0.024` | 12.180 | 0.020 | -0.020 |
| 2 | OGLE-TR-10 b | b | OGLE-TR-10 | | | TIC 130150682 | Gaia DR2 4056443366649948160 | 1 | `<a refstr=TORRES_ET_AL__2008 href=https://ui.a...` | `<a refstr=STASSUN_ET_AL__2019 href=https://ui....` | ... | | | 13.692 | 13.314 | 0.121 | -0.121 | `13.314&plusmn;0.121` | 12.856 | | |
| 3 | HD 210702 b | b | HD 210702 | HD 210702 | HIP 109577 | TIC 456826468 | Gaia DR2 1775004778213735168 | 0 | `<a refstr=BOWLER_ET_AL__2010 href=https://ui.a...` | `<a refstr=STASSUN_ET_AL__2019 href=https://ui....` | ... | 0.320 | -0.320 | `4.508&plusmn;0.320` | 3.995 | 0.226 | -0.226 | `3.995&plusmn;0.226` | 3.984 | 0.294 | -0.294 |
| 4 | BD-08 2823 b | b | BD-08 2823 | | HIP 49067 | TIC 33355302 | Gaia DR2 3770419611540574080 | 0 | `<a refstr=HEBRARD_ET_AL__2010 href=https://ui....` | `<a refstr=STASSUN_ET_AL__2019 href=https://ui....` | ... | 0.020 | -0.020 | `7.96&plusmn;0.02` | 7.498 | 0.047 | -0.047 | `7.498&plusmn;0.047` | 7.323 | 0.021 | -0.021 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 32537 | Kepler-381 c | c | Kepler-381 | | | TIC 164884235 | Gaia DR2 2105835281411929728 | 0 | `<a refstr=Q1_Q17_DR25_KOI_TABLE href=https://e...` | `<a refstr=STASSUN_ET_AL__2019 href=https://ui....` | ... | 0.020 | -0.020 | `9.721&plusmn;0.020` | 9.547 | 0.018 | -0.018 | `9.547&plusmn;0.018` | 9.504 | 0.014 | -0.014 |
| 32538 | Kepler-1851 b | b | Kepler-1851 | | | TIC 352013607 | Gaia DR2 2106494580370491392 | 1 | `<a refstr=VALIZADEGAN_ET_AL__2021 href=https:/...` | `<a refstr=STASSUN_ET_AL__2019 href=https://ui....` | ... | 0.039 | -0.039 | `14.524&plusmn;0.039` | 14.141 | 0.043 | -0.043 | `14.141&plusmn;0.043` | 13.962 | 0.052 | -0.052 |
| 32539 | KMT-2017-BLG-2509L b | b | KMT-2017-BLG-2509L | | | | | 1 | `<a refstr=HAN_ET_AL__2021 href=https://ui.adsa...` | `<a refstr=HAN_ET_AL__2021 href=https://ui.adsa...` | ... | | | | | | | | | | |
| 32540 | OGLE-2017-BLG-1099L b | b | OGLE-2017-BLG-1099L | | | | | 1 | `<a refstr=HAN_ET_AL__2021 href=https://ui.adsa...` | `<a refstr=HAN_ET_AL__2021 href=https://ui.adsa...` | ... | | | | | | | | | | |
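
The `ps` table holds one row per planet per published parameter set, so the same planet can appear several times. The `default_flag` column marks the row the Archive treats as the default for each planet (the Kepler-11 rows above carry `default_flag` 0, for example). A minimal sketch of keeping only the default rows, on made-up data that mimics three of the columns:

```python
import pandas as pd

# Made-up rows mimicking three columns of the ps table.
toy = pd.DataFrame({
    "pl_name": ["Planet A b", "Planet A b", "Planet B c"],
    "default_flag": [0, 1, 1],
    "pl_orbper": [10.2, 10.3, 45.0],
})

# Keep only the rows flagged as the default parameter set.
defaults = toy[toy["default_flag"] == 1].reset_index(drop=True)
print(defaults)
```

With the real table, the same mask would be applied to `df`.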


In [4]:
# Summary statistics for the numeric columns
df.describe()

| | default_flag | disc_year | ra | dec | glon | glat | elon | elat | pl_orbper | pl_orbpererr1 | ... | sy_vmagerr2 | sy_jmag | sy_jmagerr1 | sy_jmagerr2 | sy_hmag | sy_hmagerr1 | sy_hmagerr2 | sy_kmag | sy_kmagerr1 | sy_kmagerr2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 32542.0 | 32542.0 | 32542.0 | 32542.0 | 32542.0 | 32542.0 | 32542.0 | 32542.0 | 29734.0 | 28493.0 | ... | 32150.0 | 32121.0 | 32042.0 | 32042.0 | 32150.0 | 32033.0 | 32033.0 | 32147.0 | 31939.0 | 31939.0 |
| mean | 0.154723 | 2015.023539 | 270.195183 | 35.02763 | 97.241498 | 10.924461 | 284.021055 | 51.667717 | 14224.33 | 16686.66 | ... | -0.123051 | 12.264786 | 0.029814 | -0.029576 | 11.884567 | 0.031071 | -0.030926 | 11.799573 | 0.035792 | -0.035218 |
| std | 0.361646 | 3.488738 | 61.797283 | 23.854925 | 63.911938 | 18.547882 | 65.967952 | 30.58565 | 2332177.0 | 2784470.0 | ... | 0.188271 | 2.183902 | 0.132986 | 0.125557 | 2.19671 | 0.090701 | 0.093511 | 2.211296 | 0.189299 | 0.179215 |
| min | 0.0 | 1989.0 | 0.185606 | -88.121111 | 0.03925 | -88.32478 | 0.44407 | -87.16372 | 0.09070629 | 0.0 | ... | -12.27 | -2.095 | 0.017 | -8.888 | -2.775 | 0.014 | -8.99 | -3.044 | 0.011 | -11.14 |
| 25% | 0.0 | 2014.0 | 284.828058 | 39.249173 | 73.44989 | 9.76833 | 296.60281 | 59.6699 | 4.662719 | 9.34e-06 | ... | -0.137 | 11.59 | 0.022 | -0.027 | 11.222 | 0.021 | -0.031 | 11.155 | 0.02 | -0.035 |
| 50% | 0.0 | 2016.0 | 290.248102 | 43.050038 | 77.00796 | 12.89296 | 305.42853 | 63.52628 | 10.9947 | 4.125e-05 | ... | -0.092 | 12.871 | 0.024 | -0.024 | 12.496 | 0.025 | -0.025 | 12.407 | 0.025 | -0.025 |
| 75% | 0.0 | 2016.0 | 294.614791 | 46.726681 | 80.84382 | 16.41182 | 313.24374 | 66.94812 | 27.50868 | 0.0002028 | ... | -0.057 | 13.719 | 0.027 | -0.022 | 13.3 | 0.031 | -0.021 | 13.234 | 0.035 | -0.02 |
| max | 1.0 | 2022.0 | 359.974984 | 85.736533 | 359.99627 | 86.47046 | 359.90117 | 87.18291 | 402000000.0 | 470000000.0 | ... | -0.001 | 25.34 | 8.888 | -0.017 | 33.83 | 6.99 | -0.014 | 35.33 | 9.995 | -0.011 |
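
In the summary above, `pl_orbper` (orbital period, in days) has a mean of about 14,224 but a median (the 50% row) of about 11, because a few very large values, up to 4.02e8, dominate the mean. One way to get a more readable summary of such a skewed column is to describe its base-10 logarithm instead. A minimal sketch on made-up values; `log_summary` is a hypothetical helper:

```python
import numpy as np
import pandas as pd


def log_summary(series):
    """Describe log10 of the positive, non-missing values (hypothetical helper)."""
    values = series.dropna()
    values = values[values > 0]
    return np.log10(values).describe()


# Made-up periods spanning many orders of magnitude, with one missing value.
periods = pd.Series([1.0, 10.0, 100.0, 1.0e8, None])
summary = log_summary(periods)
print(summary)
```

With the real table, the call would be `log_summary(df["pl_orbper"])`.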


In [5]:
# Query the distinct host star names from the ps table, sorted alphabetically, as JSON
HostStars = pd.read_json('https://exoplanetarchive.ipac.caltech.edu/TAP/sync?query=select+distinct+hostname+from+ps+order+by+hostname+asc&format=json')

In [6]:
HostStars

| | hostname |
|---|---|
| 0 | 11 Com |
| 1 | 11 UMi |
| 2 | 14 And |
| 3 | 14 Her |
| 4 | 16 Cyg B |
| ... | ... |
| 3770 | tau Cet |
| 3771 | tau Gem |
| 3772 | ups And |
| 3773 | ups Leo |
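
A similar list of host stars can also be derived locally from a table that is already loaded, which avoids a second request. A minimal sketch on made-up rows, mirroring the `select distinct ... order by` query with pandas operations:

```python
import pandas as pd

# Made-up rows standing in for the hostname column of the ps table.
toy = pd.DataFrame({"hostname": ["Star B", "Star A", "Star B", "Star C"]})

# Drop repeated names, sort them, and renumber the index from zero.
hosts = (
    toy[["hostname"]]
    .drop_duplicates()
    .sort_values("hostname")
    .reset_index(drop=True)
)
print(hosts)
```

With the real table, `toy` would be replaced by `df`.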


In [7]:
# Load the Planetary Systems Composite Parameters (pscomppars) table, which has one row per planet
SinglePlanetarySolutions = pd.read_json('https://exoplanetarchive.ipac.caltech.edu/TAP/sync?query=select+*+from+pscomppars&format=json')

In [8]:
SinglePlanetarySolutions.columns

Index(['pl_name', 'pl_letter', 'hostname', 'hd_name', 'hip_name', 'tic_id',
       'disc_pubdate', 'disc_year', 'discoverymethod', 'disc_locale',
       ...
       'sy_pmstr', 'sy_pm_reflink', 'sy_pmra', 'sy_pmraerr1', 'sy_pmraerr2',
       'sy_pmrastr', 'x', 'y', 'z', 'htm20'],
      dtype='object', length=373)
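
With 373 columns, it can help to group the names by prefix. The names shown above suggest a convention (`pl_` for planet, `sy_` for system, `disc_` for discovery), but the grouping below assumes nothing beyond the text before the first underscore. A minimal sketch using a handful of names copied from the output above:

```python
from collections import defaultdict

# A few column names copied from the Index output above.
columns = ["pl_name", "pl_letter", "hostname", "disc_year",
           "discoverymethod", "sy_pmra", "sy_pmraerr1", "x"]

groups = defaultdict(list)
for name in columns:
    # Names without an underscore go into a separate "(none)" bucket.
    prefix = name.split("_", 1)[0] if "_" in name else "(none)"
    groups[prefix].append(name)

for prefix, names in groups.items():
    print(prefix, names)
```

With the real table, `columns` would be `SinglePlanetarySolutions.columns`.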

In [9]:
SinglePlanetarySolutions.describe()

| | disc_year | ra | dec | glon | glat | elon | elat | pl_orbper | pl_orbpererr1 | pl_orbpererr2 | ... | sy_pm | sy_pmerr1 | sy_pmerr2 | sy_pmra | sy_pmraerr1 | sy_pmraerr2 | x | y | z | htm20 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| count | 5035.0 | 5035.0 | 5035.0 | 5035.0 | 5035.0 | 5035.0 | 5035.0 | 4867.0 | 4377.0 | 4377.0 | ... | 4887.0 | 4884.0 | 4870.0 | 4887.0 | 4884.0 | 4870.0 | 5035.0 | 5035.0 | 5035.0 | 5035.0 |
| mean | 2015.442701 | 240.958811 | 21.706373 | 127.938011 | 7.009003 | 251.494434 | 33.09131 | 86451.06 | 108588.7 | -24020.73 | ... | 100.789773 | 0.173303 | -0.171665 | 10.943209 | 0.175484 | -0.173328 | 0.12017 | -0.362206 | 0.341133 | 47102520.0 |
| std | 4.202233 | 88.144619 | 34.059262 | 92.047449 | 27.863722 | 92.571857 | 42.099059 | 5764398.0 | 7104319.0 | 1512570.0 | ... | 358.830489 | 0.644031 | 0.643132 | 289.224983 | 0.646237 | 0.643208 | 0.443836 | 0.517565 | 0.522758 | 1267325000.0 |
| min | 1989.0 | 0.185606 | -88.121111 | 0.03925 | -88.32478 | 0.44407 | -87.16372 | 0.09070629 | 0.0 | -100000000.0 | ... | 0.158639 | 0.017492 | -8.0 | -3781.31 | 0.014895 | -8.0 | -0.99939 | -0.999976 | -0.999462 | -2145840000.0 |
| 25% | 2014.0 | 192.580917 | -2.105242 | 73.92797 | 3.81952 | 201.51199 | -2.56827 | 4.467694 | 1.6e-05 | -0.001428 | ... | 7.760034 | 0.039905 | -0.073743 | -8.869035 | 0.03905 | -0.078768 | 0.022936 | -0.692006 | -0.036735 | -1035475000.0 |
| 50% | 2016.0 | 286.999476 | 40.548388 | 79.06848 | 12.11636 | 300.52004 | 60.34394 | 11.55103 | 0.0001 | -0.0001 | ... | 16.588369 | 0.051459 | -0.051407 | -1.13782 | 0.052132 | -0.051896 | 0.236552 | -0.628725 | 0.65009 | 4186393.0 |
| 75% | 2018.0 | 293.822755 | 45.614334 | 189.098825 | 17.29204 | 311.61896 | 65.68793 | 39.46718 | 0.001431 | -1.6e-05 | ... | 51.172173 | 0.07423 | -0.03986 | 6.143395 | 0.079129 | -0.038939 | 0.310958 | -0.096003 | 0.714648 | 1201603000.0 |
| max | 2022.0 | 359.974984 | 85.736533 | 359.99627 | 86.47046 | 359.90117 | 87.18291 | 402000000.0 | 470000000.0 | -0.0 | ... | 8644.904613 | 8.0 | -0.017492 | 6767.26 | 8.0 | -0.014895 | 0.997391 | 0.998079 | 0.997233 | 2146307000.0 |
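
A natural next step in the exploration is to count planets by discovery year or by discovery method, using the `disc_year` and `discoverymethod` columns that appear in the column listing. A minimal sketch on made-up rows; the method labels are illustrative and not taken from the table:

```python
import pandas as pd

# Made-up rows mimicking two columns of the pscomppars table.
toy = pd.DataFrame({
    "disc_year": [2014, 2014, 2016, 2016, 2016],
    "discoverymethod": ["Transit", "Radial Velocity", "Transit", "Transit", "Imaging"],
})

# Number of planets per discovery year, in chronological order.
per_year = toy.groupby("disc_year").size()

# Number of planets per discovery method, most frequent first.
per_method = toy["discoverymethod"].value_counts()

print(per_year)
print(per_method)
```

With the real table, `toy` would be replaced by `SinglePlanetarySolutions`.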
