My approach in Python (with Jupyter Notebooks) for a hackathon / workshop conducted by Analytics Vidhya. This repository contains the train and test data along with my approach and submission files.
Jupyter Notebook
Latest commit efcf812 Sep 1, 2016 @anirudhjayaraman committed on GitHub Commit 004
Final set of files
Permalink
Failed to load latest commit information.
Experiments with Data.ipynb
README.md
Solution Checker.mp4
leaderboard.png
leaderboard_31Aug2016.pdf
sample_submission.csv
submission_01.csv
submission_02.csv
submission_03.csv
submission_04.csv
submission_05.csv
submission_06.csv
submission_07.csv
submission_08.csv
test.csv
train.csv

README.md

Hackathon-Experiments-With-Data-Analytics-Vidhya

My approach in Python (with Jupyter Notebooks) for a hackathon / workshop conducted by Analytics Vidhya. This repository contains the train and test data along with my approach and submission files.

All missing data in this dataset were for categorical variables. I did a simple mode-imputation to deal with such missing values.

I approached the problem by first attempting some feature engineering (not including missing value treatment) on the data, after which I ran a basic logistic classifier and a random forest classifier. However, the models performed better without feature engineering, so maybe the dataset was already quite clean and informative to start with, which might have been the case for this competition.

I later attempted gradient boosting with parameter tuning and maximizing scores.

Here's a link to the leaderboard. I'm anirudhjay :)