This notebook builds a classification pipeline in Python using scikit-learn and serves as a reusable template for tackling classification problems. The workflow covers the following key steps:
- Data preprocessing and feature engineering.
- Addressing the data leakage problem.
- Handling imbalanced data in classification.
- Building a reusable pipeline to train several models.
- Optimization techniques such as grid search, cross-validation, precision-recall trade-off analysis, and feature importance.
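
As a rough preview of how these steps can fit together, here is a minimal sketch rather than the notebook's actual implementation. It makes several assumptions: synthetic data from `make_classification` stands in for a real dataset, `class_weight="balanced"` is just one of several ways to handle imbalance, and the candidate models, parameter grids, and step names (`scale`, `model`) are illustrative choices. Placing the scaler inside the `Pipeline` means it is refit on the training folds only during cross-validation, which is one common way to avoid leakage from preprocessing.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import average_precision_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic, imbalanced data (about 10% positives) replaces a real dataset.
X, y = make_classification(
    n_samples=600, n_features=8, n_informative=4,
    weights=[0.9, 0.1], random_state=0,
)

# Hold out a test set before any fitting; stratify to preserve the class ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Illustrative candidates: (estimator, parameter grid keyed by pipeline step name).
candidates = {
    "logreg": (
        LogisticRegression(class_weight="balanced", max_iter=1000),
        {"model__C": [0.1, 1.0, 10.0]},
    ),
    "forest": (
        RandomForestClassifier(
            class_weight="balanced", n_estimators=50, random_state=0
        ),
        {"model__max_depth": [3, None]},
    ),
}

cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
results = {}

for name, (estimator, grid) in candidates.items():
    # The scaler lives inside the pipeline, so each CV split fits it on
    # that split's training portion only.
    pipe = Pipeline([("scale", StandardScaler()), ("model", estimator)])

    # Average precision summarizes the precision-recall curve, which is
    # usually more informative than accuracy on imbalanced data.
    search = GridSearchCV(pipe, grid, scoring="average_precision", cv=cv)
    search.fit(X_train, y_train)

    test_scores = search.predict_proba(X_test)[:, 1]
    results[name] = {
        "best_params": search.best_params_,
        "cv_ap": search.best_score_,
        "test_ap": average_precision_score(y_test, test_scores),
    }

for name, summary in results.items():
    print(name, summary)
```

Because every candidate shares the same preprocessing steps and the same cross-validation splitter, the scores are directly comparable, and adding another model only requires a new entry in `candidates`.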