Over nine years in deep space, the NASA Kepler space telescope carried out a planet-hunting mission to discover hidden planets outside our solar system.
Create machine learning models capable of classifying candidate exoplanets from the raw dataset.
- Preprocessed the dataset prior to fitting the model.
- Performed feature selection and removed unnecessary features.
- Used `MinMaxScaler` to scale the numerical data.
- Separated the data into training and testing data.
- Used `GridSearch` to tune model parameters.
- Tuned and compared different classifiers: Logistic Regression, K-Nearest Neighbours, Support Vector Machine, and Random Forest Classifier.
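
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook: it assumes scikit-learn, uses a synthetic dataset from `make_classification` as a stand-in for the Kepler table, picks `SelectFromModel` as one possible feature-selection method, and uses scikit-learn's `GridSearchCV` class for the grid search with a small, hypothetical parameter grid. It also splits the data before scaling so that the scaler sees only training data, which may differ from the order used in the project.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectFromModel
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for the exoplanet table (assumption: real data not loaded here).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=6, n_classes=3, random_state=42
)

# Separate the data into training and testing sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Scale numerical features to [0, 1]; fit on the training set only.
scaler = MinMaxScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Feature selection: keep features whose importance is at or above the median
# (one possible approach, chosen for illustration).
selector = SelectFromModel(
    RandomForestClassifier(n_estimators=50, random_state=42), threshold="median"
).fit(X_train_scaled, y_train)
X_train_sel = selector.transform(X_train_scaled)
X_test_sel = selector.transform(X_test_scaled)

# Tune model parameters with a grid search (hypothetical grid values).
param_grid = {"n_estimators": [50, 100], "max_depth": [5, None]}
grid = GridSearchCV(RandomForestClassifier(random_state=42), param_grid, cv=3)
grid.fit(X_train_sel, y_train)

train_score = grid.score(X_train_sel, y_train)
test_score = grid.score(X_test_sel, y_test)
print(f"Best parameters: {grid.best_params_}")
print(f"Training Data Score: {train_score}")
print(f"Testing Data Score: {test_score}")
```

With the real dataset, `X` and `y` would come from the preprocessed Kepler table instead of `make_classification`, and the grid would list whichever parameters are worth tuning for the chosen classifier.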
After comparing all the models, the Random Forest Classifier gives the best accuracy score.
- Training Data Score: 0.8411214953271028
- Testing Data Score: 0.8409610983981693
- Training Data Score: 0.8725920274651917
- Testing Data Score: 0.8249427917620137
- Training Data Score: 0.8725920274651917
- Testing Data Score: 0.8249427917620137
- Training Data Score: 0.8439824527942018
- Testing Data Score: 0.8415331807780321
- Training Data Score: 0.8901392332633988
- Testing Data Score: 0.8861556064073226
- Training Data Score: 0.996185390043868
- Testing Data Score: 0.8729977116704806
- Training Data Score: 1.0
- Testing Data Score: 0.8907322654462243
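
Score pairs like the ones listed above can be produced by fitting each classifier on the same split and calling `score` on both sets. The sketch below is an assumption-laden illustration: it uses scikit-learn with default hyperparameters (apart from `max_iter` and `random_state`) and a synthetic dataset, so its numbers will not match the scores reported here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Synthetic stand-in for the exoplanet table (assumption: real data not loaded here).
X, y = make_classification(
    n_samples=600, n_features=20, n_informative=6, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale with statistics learned from the training set only.
scaler = MinMaxScaler().fit(X_train)
X_train_scaled = scaler.transform(X_train)
X_test_scaled = scaler.transform(X_test)

# The four classifier families named in this README.
models = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "K-Nearest Neighbours": KNeighborsClassifier(),
    "Support Vector Machine": SVC(),
    "Random Forest Classifier": RandomForestClassifier(random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train_scaled, y_train)
    scores[name] = (
        model.score(X_train_scaled, y_train),  # accuracy on training data
        model.score(X_test_scaled, y_test),    # accuracy on held-out data
    )
    print(name)
    print(f"- Training Data Score: {scores[name][0]}")
    print(f"- Testing Data Score: {scores[name][1]}")

# Rank by testing score, since that reflects performance on unseen data.
best_model = max(scores, key=lambda name: scores[name][1])
print(f"Best testing score: {best_model}")
```

When comparing models, the testing score is the more informative number: a training score of 1.0 alongside a testing score near 0.89 indicates that the model fits the training set perfectly but generalizes less well.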