# Exoplanet classification with a Convolutional Neural Network

## Background
In this exercise we will use transit light curves from NASA's Kepler mission to identify stars that host exoplanets. The dataset contains time series of the flux density of stars observed by the Kepler telescope. While a star without exoplanets shows a comparatively stable flux density, the flux density of a star with an exoplanet shows characteristic dips in brightness each time the exoplanet passes in front of the star. The animation below illustrates this effect.

<div style="text-align: center">
    <img src="../data/supplementary_data/Exoplanet_Animation_Transit_Light_Curve_appletv.gif">
    <p>Source: <a href="https://science.nasa.gov/resource/transit-light-curve/">NASA</a></p>
</div>
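
To make the effect concrete, the following minimal sketch simulates an idealized light curve with NumPy. It is not drawn from the Kepler dataset: the function name `simulate_transit` and all parameter values (period, transit duration, depth, noise level) are hypothetical and chosen only for illustration, and the dip is modeled as a simple box rather than a physically accurate transit shape.

```python
import numpy as np


def simulate_transit(n_points=1000, period=250, duration=20, depth=0.01,
                     noise=0.001, seed=0):
    """Return a normalized flux series with periodic box-shaped dips.

    All defaults are illustrative assumptions, not Kepler values.
    """
    rng = np.random.default_rng(seed)
    time = np.arange(n_points)
    # Baseline: constant normalized flux of 1 plus Gaussian noise.
    flux = 1.0 + rng.normal(0.0, noise, size=n_points)
    # The planet is in front of the star during the first `duration`
    # time steps of every period.
    in_transit = (time % period) < duration
    flux[in_transit] -= depth
    return time, flux, in_transit


time, flux, in_transit = simulate_transit()
print(f"mean flux out of transit: {flux[~in_transit].mean():.4f}")
print(f"mean flux in transit:     {flux[in_transit].mean():.4f}")
```

The mean flux inside the transit windows is lower than outside by roughly the chosen `depth`, which is the kind of signature a classifier has to pick up from noisy data.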

## Aim of the exercise
The aim of this exercise is to classify transit light curves of stars observed by the Kepler telescope into two classes: stars with exoplanets and stars without exoplanets. We will use labelled light curves from both classes to train a Convolutional Neural Network (CNN) as a binary classifier.
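
The core operation of a CNN is a convolution that slides a small kernel along the input. The sketch below uses plain NumPy to show how a single hand-picked kernel responds to a dip in a one-dimensional flux series. In a trained CNN the kernel weights are learned from the labelled data rather than chosen by hand; the toy signal, the kernel, and its width are assumptions made for illustration only.

```python
import numpy as np

# Toy flux series: constant brightness with one dip of depth 0.02
# (hypothetical values, not taken from the Kepler data).
flux = np.ones(100)
flux[40:50] -= 0.02

# Hand-picked kernel that averages 10 neighboring points.
# np.convolve flips the kernel, which makes no difference here
# because the kernel is symmetric.
kernel = np.ones(10) / 10

# Subtract the flux from the baseline so that flat regions map to zero,
# then slide the kernel along the series.
response = np.convolve(1.0 - flux, kernel, mode="valid")

# The response is largest where the kernel window fully overlaps the dip.
peak = int(np.argmax(response))
print(f"strongest response for the window starting at index {peak}")
```

A CNN stacks many such kernels and combines their responses, so it can learn which local shapes in the light curve separate the two classes.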

## Data Analysis
First, we create a class that we will use to load, explore, and preprocess the data:

In [None]:
from dataclasses import dataclass
import pandas as pd


@dataclass
class KeplerCurves:
    """
    Data class that holds the Kepler transit light curves.
    """
    
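    train_data: pd.DataFrame
    test_data: pd.DataFrame | None = None
    
    @classmethod
    def from_csv(cls, train_path: str, test_path: str | None = None) -> "KeplerCurves":
        """
        Load the data from csv files.
        
        :param train_path: Path to the training data.
        :param test_path: Optional path to the test data.
        :return: A new KeplerCurves instance holding the loaded data.
        """
        train_data = pd.read_csv(train_path)
        # Only read the test file if a path was given; pd.read_csv(None) would raise.
        test_data = pd.read_csv(test_path) if test_path is not None else None
        return cls(train_data=train_data, test_data=test_data)
    
    def generate_test_data(self, n_samples: int) -> None:
        """
        Split off test data from the training data.
        
        :param n_samples: Number of samples to move from the training set to the test set.
        """
        # Draw the test rows at random, then drop them from the training set
        # so that the two sets do not overlap.
        self.test_data = self.train_data.sample(n_samples)
        self.train_data = self.train_data.drop(self.test_data.index)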
    
    