Drivendata.org Warm Up Challenge
This dataset is from a study of heart disease that has been open to the public for many years. The study collects various measurements on patient health and cardiovascular statistics, and of course makes patient identities anonymous.
The goal is to predict the binary class `heart_disease_present`, which represents whether or not a patient has heart disease.
Approach:
- The dataset is small and tidy, so it did not require any cleaning
- Used `RandomForestClassifier` for model building
- Used `GridSearchCV` to find the best hyperparameters
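
The approach above can be sketched roughly as follows. This is a minimal illustration, not the exact code behind the submission: the data is a synthetic stand-in generated with `make_classification`, and the hyperparameter grid, the 5-fold setting, and the hold-out split are assumptions chosen only to keep the example small.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import log_loss
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the challenge data (assumption: the real features
# and heart_disease_present labels would be loaded from the competition files).
X, y = make_classification(n_samples=180, n_features=13, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hypothetical search space; the grid actually used is not documented here.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [3, 5, None],
    "min_samples_leaf": [1, 3],
}

# Score candidates with negative log loss so the search optimizes
# the same metric the challenge is evaluated on.
search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    scoring="neg_log_loss",
    cv=5,
)
search.fit(X_train, y_train)

# Predicted probability of the positive class on the held-out rows.
proba = search.predict_proba(X_test)[:, 1]
holdout_loss = log_loss(y_test, proba)

print("best params:", search.best_params_)
print("hold-out log loss:", holdout_loss)
```

Scoring with `neg_log_loss` matters here because the default classifier score is accuracy, which can prefer a different set of hyperparameters than the probability-based metric used on the leaderboard.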
Evaluation metric: logarithmic loss. Score: 0.37442 (top 15%)
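
For reference, binary logarithmic loss can be computed as in the sketch below. The function name `binary_log_loss` and the clipping constant `eps` are illustrative assumptions; the exact clipping the competition applies is not stated here. Lower is better, and a model that always predicts 0.5 scores ln 2, about 0.693.

```python
import math

def binary_log_loss(y_true, y_prob, eps=1e-15):
    """Mean negative log-likelihood of the true labels under y_prob."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        # Clip to avoid log(0); eps is an assumed value, not the official one.
        p = min(max(p, eps), 1 - eps)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total / len(y_true)

# Toy labels and probabilities, made up for illustration only.
confident = binary_log_loss([1, 0, 1], [0.9, 0.1, 0.8])
uncertain = binary_log_loss([1, 0, 1], [0.5, 0.5, 0.5])

print(confident, uncertain)
```

Because the penalty grows without bound as a confident prediction turns out wrong, well-calibrated probabilities matter more under this metric than the hard class labels do.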