A machine learning framework for predicting physical parameters and morphological classification of eclipsing binary stars from photometric light curves.
- Python 3.7+
- 8GB+ RAM recommended
- Optional: CUDA-compatible GPU for faster training and predictions (CuPy)
pip install -r requirements.txt

For GPU acceleration (optional):

pip install cupy-cuda11x  # Replace 11x with your CUDA version

Execute the combined script to run data preparation, feature extraction, and training sequentially. models/held_out_data.pkl is included in this repository. When present, 3_train_models.py loads it directly to ensure the exact same 845/150 train/test split used in the paper. If deleted, a fresh stratified split will be created automatically.

python 123_extract_and_train.py

This script automatically runs:

1_prepare_training_data.py - Preprocesses light curves
2_extract_training_features.py - Extracts 51 features per light curve
3_train_models.py - Trains RF and XGBoost models with 5-fold CV
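If models/held_out_data.pkl has been deleted, 3_train_models.py recreates the held-out set with a stratified split. A minimal sketch of such a split, assuming scikit-learn; the feature matrix and class labels below are hypothetical stand-ins, not the project's real data:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-ins for the real feature matrix and morphology labels
rng = np.random.default_rng(0)
X = rng.random((995, 51))          # 995 systems x 51 features
y = rng.integers(0, 3, size=995)   # placeholder class labels

# Hold out 150 systems, preserving class proportions (845 remain for training)
X_train, X_held, y_train, y_held = train_test_split(
    X, y, test_size=150, stratify=y, random_state=42
)
print(X_train.shape, X_held.shape)  # (845, 51) (150, 51)
```

Fixing random_state is what makes a regenerated split reproducible across runs, though it will not match the pre-defined split shipped with the repository.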
If you need more control or want to modify individual steps:
# Step 1: Prepare training data (PCHIP interpolation to 1000 points)
python 1_prepare_training_data.py
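Step 1 resamples each light curve onto a uniform 1000-point phase grid with PCHIP interpolation. A rough sketch of that operation, assuming SciPy's PchipInterpolator; the function name here is illustrative, not the script's actual API:

```python
import numpy as np
from scipy.interpolate import PchipInterpolator

def resample_light_curve(phase, flux, n_points=1000):
    """Resample an unevenly sampled light curve onto a uniform phase grid."""
    order = np.argsort(phase)  # PCHIP requires increasing x values
    phase, flux = np.asarray(phase)[order], np.asarray(flux)[order]
    grid = np.linspace(phase.min(), phase.max(), n_points)
    return grid, PchipInterpolator(phase, flux)(grid)

grid, resampled = resample_light_curve([0.0, 0.4, 0.1, 0.8], [1.0, 0.6, 0.9, 0.95])
```

PCHIP (monotone cubic) interpolation avoids the overshoot that plain cubic splines can introduce near sharp eclipse minima.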
# Step 2: Extract features (51 features per light curve)
python 2_extract_training_features.py
# Step 3: Train models (RF and XGBoost, 5-fold CV on 845 systems, 150 held out)
python 3_train_models.py

After training, predict parameters for OGLE, Kepler, or custom light curves. Prediction scripts use GPU acceleration (CuPy) if available, falling back to CPU otherwise. Refer to the download.txt files in ogle_data and kepler_data to access the necessary datasets.
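The GPU/CPU fallback used by the prediction scripts follows a common import pattern; this is a minimal sketch, not the scripts' exact code:

```python
# Try CuPy for GPU arrays; fall back to NumPy on CPU if unavailable
try:
    import cupy as xp
    xp.cuda.runtime.getDeviceCount()  # raises if no usable GPU is present
except Exception:
    import numpy as xp

# Downstream code uses xp transparently on either backend
flux = xp.linspace(0.6, 1.0, 5)
print(float(flux.mean()))  # 0.8
```

Because CuPy mirrors the NumPy API, the same array code runs on either backend through the `xp` alias.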
# Predict OGLE catalog
python 4a_ogle_prediction.py
# Predict Kepler catalog
python 4b_kepler_prediction.py
# Predict custom light curves
python 4c_custom_prediction.py

For custom predictions, place your CSV files in the custom_data/ folder. Each file must have two columns: phase and flux (with header row).
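A quick way to produce a correctly formatted input file; the filename and phase/flux values below are placeholders for your own data:

```python
import csv
import os

# Placeholder phase/flux pairs for a hypothetical system
rows = [(0.00, 1.00), (0.25, 0.62), (0.50, 0.97), (0.75, 0.60)]

os.makedirs('custom_data', exist_ok=True)
with open(os.path.join('custom_data', 'my_system.csv'), 'w', newline='') as f:
    writer = csv.writer(f)
    writer.writerow(['phase', 'flux'])  # required header row
    writer.writerows(rows)
```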
After predictions, you can assess their reliability by computing the Mahalanobis distance. This measures how far each system's features lie from the training distribution; systems with a high distance are out-of-distribution, and their predictions may be less reliable.
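The distance computation can be sketched as follows; this is a simplified NumPy version, and 6_compute_mahalanobis.py may differ in detail:

```python
import numpy as np

def mahalanobis_distances(X_train, X_new, eps=1e-6):
    """Distance of each row of X_new from the training feature distribution."""
    mu = X_train.mean(axis=0)
    cov = np.cov(X_train, rowvar=False)
    cov += eps * np.eye(cov.shape[0])  # regularize for stable inversion
    inv_cov = np.linalg.inv(cov)
    diff = X_new - mu
    # sqrt(d^T C^-1 d) for every row, computed in one einsum
    return np.sqrt(np.einsum('ij,jk,ik->i', diff, inv_cov, diff))
```

Systems whose distance greatly exceeds typical training-set values can be flagged as out-of-distribution before trusting their predicted parameters.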
# Step 1: Extract features for distance computation
python 5a_extract_ogle_features.py
python 5b_extract_kepler_features.py
# Step 2: Compute distances and merge with predictions
python 6_compute_mahalanobis.py

After training, you can evaluate model performance on the 150-system held-out test set. This script loads the trained RF and XGBoost models, makes predictions on the held-out data, and reports R² scores and classification metrics.
python 5_held_out_evaluation.py

processed_data/training_data.pkl - Preprocessed light curves
processed_data/training_features.pkl - Extracted features
models/models_rf/ - Random Forest models (5 folds x 6 tasks)
models/models_xgb/ - XGBoost models (5 folds x 6 tasks)
models/models_rf/rf_cv_summary.csv - Cross-validation results (RF)
models/models_xgb/xgb_cv_summary.csv - Cross-validation results (XGB)
models/held_out_data.pkl - Held-out test set (150 systems, pre-defined for reproducibility)
models/held_out_evaluation_results.csv - Held-out R² and classification metrics
predictions/ogle_predictions/ogle_predictions.csv
predictions/kepler_predictions/kepler_predictions.csv
predictions/custom_predictions/custom_predictions.csv
ogle_features.pkl - OGLE feature vectors
kepler_features.pkl - Kepler feature vectors
ogle_predictions_with_distance.csv - OGLE predictions with Mahalanobis distance
kepler_predictions_with_distance.csv - Kepler predictions with Mahalanobis distance
mahalanobis_summary.txt - Distance summary statistics
MIT License - See licence.txt for details.
For questions or issues, please open a GitHub issue or contact burak.ulas@comu.edu.tr.