so24def/top9percent_14th_Kaggle_datathon_employee_churn_prediction

Predicting Employee Churn using LinkedIn Tech-Talent Data

Problem type: Binary Classification

This repository contains my detailed solution for the Garanti BBVA Data Camp competition. I competed solo and ranked 14th (top 9%) out of 210 competitors and 174 teams.

Solution

  • Cleaning the education, skill, language and work experience datasets, which contained a high rate of erroneous data, and fixing a few problems in the train and test sets.
  • Feature engineering, followed by imputation and encoding.
  • Training Random Forest models with cross-validation and stacking, based on the model selection done during the competition, on the best dataset as determined by several feature selection techniques.
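As a rough illustration of the imputation, encoding and cross-validated Random Forest steps listed above, here is a minimal sketch with scikit-learn. It is not the competition code: the data is synthetic, the column names (`years_experience`, `n_skills`, `industry`) are hypothetical, and accuracy is only an assumed metric. Stacking is left out to keep the example short.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in data; every column name here is hypothetical.
rng = np.random.default_rng(0)
n = 300
df = pd.DataFrame({
    "years_experience": rng.normal(5, 2, n),
    "n_skills": rng.integers(1, 30, n).astype(float),
    "industry": rng.choice(["tech", "finance", "telecom"], n),
})
# Binary target loosely tied to one feature so the model has a signal to learn.
y = (df["n_skills"] + rng.normal(0, 5, n) > 15).astype(int)
# Inject missing values into a numeric column to mimic noisy source data.
df.loc[rng.choice(n, 30, replace=False), "years_experience"] = np.nan

numeric = ["years_experience", "n_skills"]
categorical = ["industry"]

# Impute numeric columns with the median, one-hot encode categorical columns.
preprocess = ColumnTransformer([
    ("num", SimpleImputer(strategy="median"), numeric),
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
])
model = Pipeline([
    ("preprocess", preprocess),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
])

# Stratified folds keep the class balance similar in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(model, df, y, cv=cv, scoring="accuracy")
print("fold accuracy:", scores.round(3), "mean:", round(scores.mean(), 3))
```

Putting the preprocessing inside the `Pipeline` means the imputer and encoder are refit on each training fold, so no statistics leak from the validation fold into training.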

The main solution is available in both English and Turkish, together with the bonus and unused work done during the competition.


Bonus

I also added:

  • The fi_forward_feature_selector function referred to in the solution, which I wrote and used to create the dataset that earned the second-best private score among my three final-day submissions.
  • The code for HalvingGridSearch, TuneGridSearch and Optuna, from when I was looking for ways to tune hyperparameters faster than standard GridSearch. GridSearchCV uses no optimization algorithm and tries every combination in the given parameter grid, whereas HalvingGridSearch uses an algorithm called successive halving that makes it approximately 4x faster. TuneGridSearch is another faster alternative.
  • The code for plotting training curves with the Yellowbrick library, to see how hyperparameter values affect the model and thereby narrow the hyperparameter ranges given to HalvingGridSearch for even faster results.
  • The code I used to scrape external data, which unfortunately proved unimportant after modelling and so remained unused during the competition.
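To show why successive halving can be cheaper than an exhaustive grid search, here is a small pure-Python sketch of the idea. It is not the code from this repository and not scikit-learn's implementation: `successive_halving`, `toy_score`, the parameter grid and the budget values are all hypothetical, and the scoring function is a noisy stand-in for a real cross-validated score.

```python
import itertools
import random

def successive_halving(candidates, evaluate, min_budget, max_budget, factor=3):
    """Return (best candidate, total budget spent).

    Each round scores all surviving candidates on a small budget, keeps the
    top 1/factor of them, and multiplies the budget by factor.
    """
    survivors = list(candidates)
    budget = min_budget
    total_cost = 0
    while True:
        scored = [(evaluate(c, budget), c) for c in survivors]
        total_cost += budget * len(survivors)
        scored.sort(key=lambda pair: pair[0], reverse=True)
        if len(survivors) == 1 or budget >= max_budget:
            return scored[0][1], total_cost
        keep = max(1, len(survivors) // factor)
        survivors = [c for _, c in scored[:keep]]
        budget = min(budget * factor, max_budget)

# Hypothetical Random Forest grid: 3 * 3 * 3 = 27 combinations.
grid = {
    "n_estimators": [50, 100, 200],
    "max_depth": [4, 8, 16],
    "min_samples_leaf": [1, 5, 10],
}
candidates = [dict(zip(grid, values)) for values in itertools.product(*grid.values())]

rng = random.Random(0)

def toy_score(params, budget):
    # Stand-in for a validation score: best at max_depth=8 with small leaves.
    # The noise shrinks as the budget (e.g. number of training samples) grows.
    quality = -abs(params["max_depth"] - 8) - 0.1 * params["min_samples_leaf"]
    return quality + rng.gauss(0, 1.0 / budget ** 0.5)

best, halving_cost = successive_halving(candidates, toy_score, min_budget=100, max_budget=900)
full_grid_cost = len(candidates) * 900  # every candidate at the full budget

print(best)
print(halving_cost, "vs", full_grid_cost)  # prints "8100 vs 24300"
```

In this toy setup the search spends a third of the exhaustive budget; the actual speed-up depends on the grid size, the `factor` and the budget range. scikit-learn provides this strategy as `HalvingGridSearchCV`, which at the time of writing must be enabled with `from sklearn.experimental import enable_halving_search_cv` before it can be imported from `sklearn.model_selection`.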

External Data/Sources


https://www.kaggle.com/so24def
