Hook12aaa/HF_kaggle


Kaggle Playground Series S6E2 — Predicting Heart Disease

This is my first attempt at blending XGBoost and CatBoost. I wanted to give it a go, see whether the blend could score higher on this exercise, and find out which features work best. It was difficult but fun!

Competition

Results

Leaderboard: #817 — Public LB Score: 0.95355

| Model | OOF AUC |
| --- | --- |
| XGBoost (7-seed avg) | 0.95542 |
| CatBoost (7-seed avg) | 0.95550 |
| 50/50 Blend | 0.95550 |
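
For readers new to blending, here is a minimal sketch of what a 50/50 blend and an OOF AUC check look like. It is not the code from `submission.py`: the function names `roc_auc` and `blend` and the toy arrays are made up for illustration, and the AUC is computed by hand with the rank-sum formula so the example has no dependencies (a real pipeline would more likely call a library metric).

```python
def roc_auc(labels, scores):
    """AUC via the rank-sum (Mann-Whitney U) formula; tied scores share an average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of rows tied with the score at position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tie group
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUC needs at least one positive and one negative label")
    pos_rank_sum = sum(r for r, y in zip(ranks, labels) if y == 1)
    return (pos_rank_sum - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)


def blend(preds_a, preds_b, weight_a=0.5):
    """Weighted average of two prediction lists; weight_a=0.5 is a 50/50 blend."""
    return [weight_a * a + (1 - weight_a) * b for a, b in zip(preds_a, preds_b)]


# Toy out-of-fold predictions (illustrative values, not competition data).
y_true = [0, 0, 1, 1, 0, 1]
xgb_oof = [0.10, 0.40, 0.35, 0.80, 0.20, 0.70]
cat_oof = [0.20, 0.30, 0.60, 0.90, 0.50, 0.40]
blended = blend(xgb_oof, cat_oof)

for name, preds in [("xgb", xgb_oof), ("cat", cat_oof), ("blend", blended)]:
    print(name, round(roc_auc(y_true, preds), 5))
```

Because AUC depends only on how the rows are ranked, a blend helps only when the two models order some rows differently. That may be why the blend in the table above lands on the same score as CatBoost alone.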

Tuning Journey

| Run | Learning Rate | Seeds | OOF AUC | Public LB | Rank |
| --- | --- | --- | --- | --- | --- |
| v1 | 0.05 | 3 | 0.95547 | 0.95353 | #850 |
| v2 | 0.02 | 7 | 0.95550 | 0.95355 | #817 |

Project Structure

├── data.py          # Data loading and target encoding
├── features.py      # Feature column definitions and transformations
├── model.py         # Stratified K-fold CV with multi-seed averaging
├── submission.py    # Blend predictions and generate submission CSV
├── requirements.txt
└── data/
    ├── train.csv
    ├── test.csv
    └── sample_submission.csv
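
The "stratified K-fold CV with multi-seed averaging" step in `model.py` can be sketched roughly as follows. This is a dependency-free illustration, not the repository's code: `stratified_folds`, `multi_seed_oof`, and the `fit_predict` callback are hypothetical names, and the toy logistic scorer stands in for an XGBoost or CatBoost model.

```python
import math
import random


def stratified_folds(labels, n_splits, seed):
    """Assign each row a fold id so every fold keeps roughly the overall class ratio."""
    rng = random.Random(seed)
    fold_of = [0] * len(labels)
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal the shuffled rows of this class round-robin across the folds.
        for pos, i in enumerate(idx):
            fold_of[i] = pos % n_splits
    return fold_of


def multi_seed_oof(labels, fit_predict, n_splits=5, seeds=(0, 1, 2)):
    """Average out-of-fold predictions over several seeds.

    fit_predict(train_idx, valid_idx, seed) is a hypothetical callback that
    trains on train_idx and returns one score per row in valid_idx.
    """
    n = len(labels)
    oof = [0.0] * n
    for seed in seeds:
        fold_of = stratified_folds(labels, n_splits, seed)
        for fold in range(n_splits):
            valid_idx = [i for i in range(n) if fold_of[i] == fold]
            train_idx = [i for i in range(n) if fold_of[i] != fold]
            preds = fit_predict(train_idx, valid_idx, seed)
            # Each row is validated once per seed, so dividing by the number
            # of seeds accumulates the per-row mean.
            for i, p in zip(valid_idx, preds):
                oof[i] += p / len(seeds)
    return oof


# Toy data: one feature, binary target (illustrative values only).
x = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
y = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]


def toy_fit_predict(train_idx, valid_idx, seed):
    # Stand-in model: a logistic curve centred on the training mean of x.
    # A real model would also take the seed, e.g. as its random state.
    mean_x = sum(x[i] for i in train_idx) / len(train_idx)
    return [1 / (1 + math.exp(-10 * (x[i] - mean_x))) for i in valid_idx]


oof = multi_seed_oof(y, toy_fit_predict, n_splits=5, seeds=(0, 1, 2))
print([round(p, 3) for p in oof])
```

Each seed reshuffles the fold assignment, so averaging over seeds reduces how much the OOF score depends on one particular split. The cost is `n_splits * len(seeds)` model fits.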

Usage

pip install -r requirements.txt
python submission.py
