
TalkingData Kaggle competition (link)

This repository includes the data preprocessing, feature engineering, and machine learning techniques used to reach a top 13% result on the Kaggle private leaderboard.

Preprocess

Modeling

  • Random Forest: hyperparameter tuning, evaluation of the Random Forest's feature importance, and visualization of redundant features using a dendrogram.
  • XGBoost: hyperparameter tuning (tree depth not tuned) and feature importance visualization. This notebook produces the final submission file.
  • Deep Neural Network with categorical embeddings: built with PyTorch and the fast.ai library wrapper. Uses a cyclical learning rate to speed up training.
  • Blending: simple average blending and a short tutorial on using NumPy memory maps to save memory while processing data.
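
The blending step can be illustrated with a minimal sketch. It is not the code from the notebook: the function name `blend_average`, the file names, the `float32` raw-binary layout, and the chunk size are all assumptions made for illustration. The sketch opens each model's predictions with `numpy.memmap` and averages them chunk by chunk, so only one chunk per model is held in memory at a time.

```python
import os
import tempfile

import numpy as np


def blend_average(pred_paths, n_rows, out_path, chunk_size=1_000_000, dtype=np.float32):
    """Average prediction arrays stored on disk without loading them fully.

    Hypothetical helper: assumes each file in `pred_paths` is a raw binary
    dump of `n_rows` values of type `dtype` (e.g. written with ndarray.tofile).
    """
    # Read-only memory maps: data is paged in from disk only when sliced.
    preds = [np.memmap(p, dtype=dtype, mode="r", shape=(n_rows,)) for p in pred_paths]
    # The blended output is disk-backed as well.
    out = np.memmap(out_path, dtype=dtype, mode="w+", shape=(n_rows,))

    for start in range(0, n_rows, chunk_size):
        stop = min(start + chunk_size, n_rows)
        # Accumulate in float64 to limit rounding error, then take the mean.
        acc = np.zeros(stop - start, dtype=np.float64)
        for p in preds:
            acc += p[start:stop]
        out[start:stop] = acc / len(preds)

    out.flush()  # make sure the result is written to disk
    return out


if __name__ == "__main__":
    # Tiny demo with made-up predictions from two hypothetical models.
    workdir = tempfile.mkdtemp()
    paths = []
    for name, values in [("model_a.dat", [0.0, 0.5, 1.0]), ("model_b.dat", [1.0, 0.5, 0.0])]:
        path = os.path.join(workdir, name)
        np.asarray(values, dtype=np.float32).tofile(path)
        paths.append(path)

    blended = blend_average(paths, n_rows=3, out_path=os.path.join(workdir, "blend.dat"), chunk_size=2)
    print(np.asarray(blended))
```

Processing in chunks keeps peak memory proportional to `chunk_size` rather than to the full prediction length, which matters when the test set has tens of millions of rows.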