so24def/top10_Kaggle_datathon_e-commerce_customer_classification

Predicting Customer Segments Using E-commerce Data

Problem type: Multiclass Classification

Includes the solution and jury presentation for the BTK Akademi Datathon 2023. I competed solo and was selected by the jury for the top 10, out of 359 competitors and 255 teams.

Solution

  • A detailed EDA phase, followed by multiple pivot tables
  • Feature engineering: extracting new numerical features, trying the experimental "Cluster feature" method, and deriving statistical features per cluster group
  • Feature selection with Sequential Feature Selection, RFECV, and SHAP (not included in this repo)
  • Model selection and model re-evaluation
  • Detection and analysis of the samples misclassified by each of the Random Forest, XGBoost, CatBoost, and LightGBM models
  • Hyperparameter tuning with Optuna
  • Creating the final submission with the chosen final feature set and model architecture

  • I also included every helper function that I use throughout the different sections of the solution
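
The misclassification analysis step can be illustrated with a minimal sketch. It is not the code from this repo. It assumes that "misclassified by each of the models" means samples that all four models get wrong, and it works on plain lists of labels so it has no dependencies. The function name `misclassified_by_all`, the toy labels, and the model keys are hypothetical.

```python
from collections import Counter


def misclassified_by_all(y_true, predictions):
    """Return sorted indices of samples that every model predicts incorrectly.

    y_true: list of true labels.
    predictions: dict mapping a model name to its list of predicted labels.
    """
    if not predictions:
        raise ValueError("at least one model's predictions are required")

    # Start from all indices and keep only those each model gets wrong.
    shared = set(range(len(y_true)))
    for name, y_pred in predictions.items():
        if len(y_pred) != len(y_true):
            raise ValueError(
                f"{name}: expected {len(y_true)} predictions, got {len(y_pred)}"
            )
        wrong = {i for i, (t, p) in enumerate(zip(y_true, y_pred)) if t != p}
        shared &= wrong
    return sorted(shared)


# Toy data (made up for illustration, not taken from the competition).
y_true = ["A", "B", "C", "A", "B", "C"]
predictions = {
    "random_forest": ["A", "C", "C", "B", "B", "A"],
    "xgboost":       ["A", "B", "C", "B", "B", "A"],
    "catboost":      ["A", "C", "C", "C", "B", "A"],
    "lightgbm":      ["B", "B", "C", "B", "B", "A"],
}

hard = misclassified_by_all(y_true, predictions)

# Count the true classes of the shared errors to see which segments are hardest.
true_class_counts = Counter(y_true[i] for i in hard)

print(hard)                     # [3, 5]
print(dict(true_class_counts))  # {'A': 1, 'C': 1}
```

In practice the prediction lists would come from out-of-fold predictions of the fitted models, and the returned indices can be used to pull the corresponding rows for closer inspection.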

External Data/Sources


https://www.kaggle.com/so24def
