# `Starskøpe`


## $\phi$  **Exoplanet Hunting with Deep Boltzmann Machines**

<i class="fa fa-flask "></i> __doctree__ <i class="fa fa-tree "></i><br>

# <i class="fa fa-rocket "></i>`STARSKØPE`<br>
`-------------------------------------------------------`<br>
`|:`<i class="fa fa-lightbulb-o"></i> dev<br>
`|:` <i class="fa fa-android"></i> machine_learning <br>
`|:` <i class="fa fa-bullseye"></i> model <br>
`|:`<i class="fa fa-bomb "></i> tensorflow<br>
`|:`<i class="fa fa-television"> k2 </i><br>
`|:`<i class="fa fa-github"></i> winterdelta <br>


>Capstone Project 
* Student name: Ru Keïn
* Student pace: full time
* Project review: 3/20/2020
* Instructor name: James Irving, PhD
* Blog post:
* Presentation:


<i class="fa fa-tree "></i> __doctree__ <i class="fa fa-tree "></i><br>
`---------`

<i class="fa fa-rocket ">`STARSKØPE`</i><br>
`|::::`<i class="fa fa-television"> k2 </i><br>
`|:::::`<i class="fa fa-github"></i> winterdelta <br>


# Import

In [29]:
# Import libraries
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from matplotlib.colors import LogNorm

# default style/formatting
import matplotlib as mpl
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns
sns.set_style('whitegrid')
plt.style.use('seaborn-bright')

font_dict={'family':'TitilliumWeb',
          'size':16}
mpl.rc('font',**font_dict)

In [None]:
# set path for retrieving and storing files
import os
import sys
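curdir = os.getcwd()  # assumption: the original assignment was left blank; the current working directory is used here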

In [31]:
%ls


 kepler3-exodata.ipynb
Analyzing-UVES-Spectroscopy-Astropy.ipynb
FITS-cubes.ipynb
FITS-image-import-astropy.ipynb
Icon?
MAST-Astroquery.ipynb
TensorflowLearn.ipynb
UVES.ipynb
astroTime.ipynb
fast-fourier-transform.ipynb
hubble-hst-aws-api.ipynb
starskøpe-1.0.ipynb


In [20]:
# astropy for extremely useful astronomical tools
# uncomment items below to install if needed:

# !pip install astropy astroquery
# !pip install astropy spectral-cube
# !pip install astropy reproject

import astropy.units as u

In [25]:
# import our initial dataset from Kaggle 
# kaggle datasets download -d keplersmachines/kepler-labelled-time-series-data

In [26]:
exo_kepler

'https://www.kaggle.com/keplersmachines/kepler-labelled-time-series-data/download/ybmp4TmQjzLj0MHVtWwB%2Fversions%2FbGfiep8n3X7lV4JvuNeG%2Ffiles%2FexoTest.csv?datasetVersionNumber=3'

In [22]:

train_df = pd.read_csv('exoTrain.csv')
test_df = pd.read_csv('exoTest.csv')

FileNotFoundError: [Errno 2] File b'exoTrain.csv' does not exist: b'exoTrain.csv'
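
The read above fails because the CSV files are not in the working directory. A minimal sketch of a guarded load follows, assuming the Kaggle files are downloaded and unzipped by hand; `load_if_present` is a hypothetical helper name used for illustration and is not part of pandas.

```python
from pathlib import Path
from typing import Optional

import pandas as pd


def load_if_present(path) -> Optional[pd.DataFrame]:
    """Return the CSV at `path` as a DataFrame, or None if the file is missing.

    Hypothetical helper for illustration; not part of pandas.
    """
    path = Path(path)
    if not path.is_file():
        # Report where we looked instead of raising FileNotFoundError
        print(f"{path} not found (working directory: {Path.cwd()})")
        return None
    return pd.read_csv(path)


# Usage: returns None until exoTrain.csv has been placed next to the notebook
train_df_checked = load_if_present('exoTrain.csv')
```

Returning `None` keeps the notebook running so the missing download can be handled in a later cell; raising a custom error would be an equally reasonable choice.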

# Scrub

In [None]:
train_df.shape
test_df.shape

In [None]:
# checking for imbalanced classes
train_df['LABEL'].value_counts()
test_df['LABEL'].value_counts()
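
To put a number on the imbalance, the counts can be turned into fractions and a majority-to-minority ratio. The sketch below uses a small toy frame in place of `train_df` so that it runs without the CSV files. It assumes the dataset's convention that `LABEL` 2 marks an exoplanet star and 1 a non-exoplanet star; the toy values are made up.

```python
import pandas as pd

# Toy stand-in for train_df (made-up values, not real Kepler data).
# Assumption: LABEL 2 = exoplanet star, LABEL 1 = non-exoplanet star.
toy_df = pd.DataFrame({'LABEL': [2, 1, 1, 1, 1, 1, 1, 1, 1, 1]})

counts = toy_df['LABEL'].value_counts()                     # rows per class
fractions = toy_df['LABEL'].value_counts(normalize=True)    # share of each class
imbalance_ratio = counts.max() / counts.min()               # majority : minority

print(counts)
print(fractions)
print(f"majority:minority ratio = {imbalance_ratio:.1f}")
```

Replacing `toy_df` with `train_df` or `test_df` gives the same summary for the real data once the files are loaded.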

In [17]:
# look at the first Star in the dataset
star0 = train_df.iloc[0, :]
star0.head()

In [7]:
# Scatter Plot For First Star
plt.figure(figsize=(15, 5))
plt.scatter(pd.Series([i for i in range(1, len(star0))]), star0[1:])
plt.ylabel('Flux')
plt.xlabel('Time')
plt.title('Flux for Star 0 - scatterplot')
plt.show()

In [None]:
# Line Plot For First Star
plt.figure(figsize=(15, 5))
plt.plot(pd.Series([i for i in range(1, len(star0))]), star0[1:])
plt.ylabel('Flux')
plt.xlabel('Time')
plt.title('Flux for Star 0 - line plot')
plt.show()

In [5]:


from astropy.utils.data import download_file
from astropy.io import fits  # We use fits to open the actual data file
from astropy.utils import data
data.conf.remote_timeout = 60
from spectral_cube import SpectralCube
from astroquery.esasky import ESASky
from astroquery.utils import TableList
from astropy.wcs import WCS
from reproject import reproject_interp


# Time Series (at varying $\phi$) 📓

In [None]:
# Converting times
# astropy.time (http://docs.astropy.org/en/stable/time/index.html) provides
# methods to convert times and dates between different systems and formats.
# Since the ESO FITS headers already contain the time of the observation in
# different systems, we could just read the keyword in the time system we
# like, but we will use astropy.time to make this conversion here.
# astropy.time.Time will parse many common input formats (strings, floats),
# but unless the format is unambiguous it needs to be specified (e.g. a
# number could mean JD or MJD or year). The time system also needs to be
# given (e.g. UTC). Below are several examples, initialized from different
# header keywords.



from astropy.time import Time
t1 = Time(header['MJD-Obs'], format = 'mjd', scale = 'utc')
t2 = Time(header['Date-Obs'], scale = 'utc')

#Times can be expressed in different formats:

t1
t1.isot
t2

# Times can also be converted to a different time system.
t1.tt

#<Time object: scale='tt' format='mjd' value=55784.97567650852>

#Times can also be initialized from arrays and we can calculate time differences.

obs_times = Time(date, scale = 'utc')
delta_t = obs_times - Time(date[0], scale = 'utc')


# Now we want to express the time difference between the individual spectra
# of MN Lup in rotational periods. While the unit of delta_t is days,
# unfortunately astropy.time.Time and astropy.units.Quantity objects don't
# work together yet, so we'll have to convert from one to the other
# explicitly.


delta_p = delta_t.value * u.day / period
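
The last step divides elapsed time by the rotation period. The sketch below repeats that arithmetic with plain NumPy floats instead of astropy `Time` and `Quantity` objects. The observation times and the period are illustrative values, not taken from the MN Lup data.

```python
import numpy as np

# Illustrative observation times in days (MJD-like) and an illustrative period
obs_mjd = np.array([55784.0, 55784.1, 55784.25, 55784.6])
period_days = 0.5

delta_days = obs_mjd - obs_mjd[0]        # elapsed time since the first spectrum
n_rotations = delta_days / period_days   # elapsed time in units of the period
phase = np.mod(n_rotations, 1.0)         # fractional rotational phase in [0, 1)

print(n_rotations)
print(phase)
```

`n_rotations` plays the role of `delta_p` above; taking it modulo 1 gives the phase that is usually plotted on the x-axis of a folded time series.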

In [None]:
# Normalize the flux to the local continuum
# In this example we want to look at the time evolution of a single specific
# emission line in the spectrum. In order to estimate the equivalent width or
# make reasonable plots we need to normalize the flux to the local continuum.
# In this specific case the emission line is bright and the continuum can be
# described reasonably by a second-order polynomial.

# So, we define two regions left and right of the emission line, where we fit
# the polynomial. Looking at the figure, [3925*u.AA, 3930*u.AA] and
# [3938*u.AA, 3945*u.AA] seem right for that. Then, we normalize the flux by
# this polynomial.

# The following function will do that:



def region_around_line(w, flux, cont):
    '''Cut out and normalize flux around a line.

    Parameters
    ----------
    w : 1 dim astropy Quantity
        array of wavelengths (with units)
    flux : np.ndarray of shape (N, len(w))
        array of flux values for different spectra in the series
    cont : list of lists
        wavelengths for continuum normalization [[low1, up1], [low2, up2]]
        that describe two areas on both sides of the line
    '''
    # index is true in the region where we fit the polynomial
    indcont = ((w > cont[0][0]) & (w < cont[0][1])) | ((w > cont[1][0]) & (w < cont[1][1]))
    # index of the region we want to return
    indrange = (w > cont[0][0]) & (w < cont[1][1])
    # make a flux array of shape
    # (number of spectra, number of points in indrange)
    f = np.zeros((flux.shape[0], indrange.sum()))
    for i in range(flux.shape[0]):
        # fit polynomial of second order to the continuum region;
        # w carries units, so pass plain values to NumPy in both calls
        linecoeff = np.polyfit(w[indcont].value, flux[i, indcont], 2)
        # divide the flux by the polynomial and put the result in our
        # new flux array
        f[i, :] = flux[i, indrange] / np.polyval(linecoeff, w[indrange].value)
    return w[indrange], f

wcaII, fcaII = region_around_line(wavelength, flux,
    [[3925*u.AA, 3930*u.AA],[3938*u.AA, 3945*u.AA]])
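
Because `wavelength` and `flux` are not defined in this notebook, the call above cannot run as-is. The sketch below applies the same continuum normalization to a synthetic spectrum with plain floats instead of astropy units. The continuum shape and the emission-line parameters are made up for illustration.

```python
import numpy as np

# Synthetic spectrum: quadratic continuum plus a Gaussian emission line
w = np.linspace(3920.0, 3950.0, 601)   # wavelength grid [Angstrom]
continuum = 2.0 + 0.01 * (w - 3935.0) + 1e-4 * (w - 3935.0) ** 2
line = 1.5 * np.exp(-0.5 * ((w - 3934.0) / 0.8) ** 2)
flux = continuum + line

# Same two continuum windows as in the text, as plain floats
cont = [[3925.0, 3930.0], [3938.0, 3945.0]]
in_cont = ((w > cont[0][0]) & (w < cont[0][1])) | ((w > cont[1][0]) & (w < cont[1][1]))
in_range = (w > cont[0][0]) & (w < cont[1][1])

# Centre the wavelengths before fitting to keep the polynomial fit well conditioned
w0 = w[in_range].mean()
coeff = np.polyfit(w[in_cont] - w0, flux[in_cont], 2)
normalized = flux[in_range] / np.polyval(coeff, w[in_range] - w0)

print(normalized.min(), normalized.max())
```

After normalization the continuum windows sit close to 1 and only the emission line rises above it. Centring the wavelengths is a choice made in this sketch; `region_around_line` fits the raw wavelength values.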

In [13]:


# import lightkurve as lk  # the PyPI package is "lightkurve" (pip install lightkurve); "lk" is only the import alias
# Finding periodic signals
# The lightkurve.periodogram module provides classes to help find periodic signals in light curves.

# Periodogram(frequency, power[, nyquist, …])
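
Until lightkurve is installed, the idea behind a periodogram can be sketched with NumPy alone. The example below is not lightkurve's implementation: it takes a plain FFT power spectrum of an evenly sampled synthetic light curve and reads off the strongest period. The cadence, period, and noise level are made-up values.

```python
import numpy as np

n_points = 1000
cadence_days = 0.02        # assumed even sampling interval [days]
true_period_days = 2.5     # period injected into the synthetic signal

time = np.arange(n_points) * cadence_days
rng = np.random.default_rng(0)
flux = np.sin(2 * np.pi * time / true_period_days) + 0.1 * rng.standard_normal(n_points)

# Power spectrum of the mean-subtracted flux
power = np.abs(np.fft.rfft(flux - flux.mean())) ** 2
frequency = np.fft.rfftfreq(n_points, d=cadence_days)   # cycles per day

best = np.argmax(power[1:]) + 1      # skip the zero-frequency bin
best_period_days = 1.0 / frequency[best]

print(f"strongest period: {best_period_days:.3f} days")
```

An FFT needs evenly spaced samples without gaps. Real light curves often have gaps, which is one reason methods for uneven sampling such as Lomb-Scargle are commonly used instead.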

Collecting lk
  ERROR: Could not find a version that satisfies the requirement lk (from versions: none)
ERROR: No matching distribution found for lk


In [None]:
http://localhost:8889/notebooks/STScI%20notebooks/notebooks/MAST/K2/K2_Lightcurve/k2_lightcurve.ipynb