Kaggle Competition Leaderboard First Score (0.41791)
This project develops a robust classification pipeline for high-dimensional binary data. The dataset contains 64 binary features and a categorical target; class imbalance was handled with SMOTE, producing a balanced training set of 792 samples.
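A minimal sketch of the SMOTE step is shown below. The data here is synthetic and the variable names (X, y, X_train, etc.) are illustrative, not the competition data; only the general pattern (stratified split, then oversampling the training set) reflects the pipeline described above.

```python
import numpy as np
from imblearn.over_sampling import SMOTE
from sklearn.model_selection import train_test_split

# Toy stand-in for the real data: 64 binary features, imbalanced categorical target.
rng = np.random.default_rng(42)
X = rng.integers(0, 2, size=(600, 64))
y = rng.choice([0, 1, 2], size=600, p=[0.6, 0.3, 0.1])

# Stratified split so class proportions are preserved before resampling.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.2, random_state=42
)

# Oversample minority classes in the training set only (never the test set).
smote = SMOTE(random_state=42)
X_res, y_res = smote.fit_resample(X_train, y_train)
print(np.bincount(y_res))  # balanced class counts after resampling
```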
Features were scaled with StandardScaler to keep training and test distributions consistent. Hyperparameters were optimized with Optuna using stratified k-fold cross-validation (k=3 for the ensemble, k=5 for individual models).
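The Optuna loop follows the standard objective-function pattern sketched below, reusing X_res and y_res from the previous snippet. The search space shown (a random forest's n_estimators and max_depth) is a placeholder, not the project's actual space; the k=5 stratified CV matches the setup for individual models.

```python
import optuna
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score

def objective(trial):
    # Illustrative search space; the real pipeline tuned each model's own parameters.
    params = {
        "n_estimators": trial.suggest_int("n_estimators", 100, 500),
        "max_depth": trial.suggest_int("max_depth", 3, 20),
    }
    model = RandomForestClassifier(**params, random_state=42)
    # k=5 stratified folds for individual models (k=3 was used for the ensemble).
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
    return cross_val_score(model, X_res, y_res, cv=cv, scoring="accuracy").mean()

study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=50)
print(study.best_params)
```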
The final model is a soft-voting ensemble combining multiple Bernoulli Naive Bayes classifiers, Logistic Regression, XGBoost, a Balanced Random Forest, and three standard Random Forest classifiers. Ensemble hyperparameters were tuned specifically to maximize cross-validated accuracy (~0.437).
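The ensemble assembly looks roughly like the sketch below. All hyperparameter values here are placeholders rather than the tuned ones; only the member models and the soft-voting scheme come from the description above.

```python
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import BernoulliNB
from imblearn.ensemble import BalancedRandomForestClassifier
from xgboost import XGBClassifier

ensemble = VotingClassifier(
    estimators=[
        ("bnb1", BernoulliNB(alpha=0.5)),   # multiple BernoulliNB variants
        ("bnb2", BernoulliNB(alpha=1.0)),
        ("lr", LogisticRegression(max_iter=1000)),
        ("xgb", XGBClassifier(n_estimators=300, random_state=42)),
        ("brf", BalancedRandomForestClassifier(n_estimators=300, random_state=42)),
        ("rf1", RandomForestClassifier(n_estimators=200, random_state=1)),
        ("rf2", RandomForestClassifier(n_estimators=300, random_state=2)),
        ("rf3", RandomForestClassifier(n_estimators=400, random_state=3)),
    ],
    voting="soft",  # average predicted class probabilities across members
)
ensemble.fit(X_res, y_res)
```

Soft voting averages each member's predicted probabilities, so a confident minority of models can outvote an uncertain majority, which tends to smooth out individual models' weaknesses.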
Exploratory methods such as CTGAN-based data synthesis, MLP-based feature extraction, and Random Forest-based feature selection were investigated but not included in the final pipeline.
Critical attention was given to preprocessing consistency: the test set was scaled with the scaler fitted on the training data, preventing a distribution mismatch between train and test.
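The pattern in code, continuing the variable names from the sketches above: fit the scaler once on the training data, then reuse its learned statistics on the test set, never refitting.

```python
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_res)  # mean/std learned from training data only
X_test_scaled = scaler.transform(X_test)      # same statistics reused; no refit on test
```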
The ensemble approach balances diverse algorithmic strengths, improves robustness, and handles class imbalance effectively. Future directions include exploring advanced feature selection, stacking ensembles, alternative imbalance techniques, and probability calibration.
Key takeaways: multi-strategy ensembles are effective for high-dimensional binary data, SMOTE improves minority class performance, and consistent preprocessing is crucial for valid inference.