Data exploration epic #10

Open
jvmncs opened this issue Jul 24, 2018 · 0 comments
Labels
eda exploratory data analysis epic epic issues


jvmncs commented Jul 24, 2018

The goal here is to gather the insights we need to make informed choices about data processing and downstream modeling tasks. Although it's not super exciting work, everything else depends on it being completed. It's also highly parallelizable, so we should be able to get it done fairly quickly if enough people volunteer.

There's an issue for performing EDA on each table. All of the issues below except for #3 follow the same basic workflow, while #3 consists of gathering research into existing kernels on Kaggle. If you're going to pick up an issue, please assign it to yourself so we don't end up repeating work!

#3 application_{train/test}.csv
#4 bureau.csv
#5 bureau_balance.csv
#6 previous_application.csv
#7 POS_CASH_balance.csv
#8 credit_card_balance.csv
#9 installments_payments.csv
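
The shared workflow isn't spelled out in this issue, so the following is only a rough sketch of what a first pass over one table might look like, assuming it means per-column missing-value counts, distinct-value counts, and basic numeric statistics. The helper `summarize_table` and the sample columns (`id`, `amount`, `status`) are hypothetical and not part of this repository or the dataset.

```python
import csv
import io
from statistics import mean


def summarize_table(handle):
    """Return per-column summary stats for a CSV file-like object.

    Hypothetical helper for illustration; not part of this repository.
    """
    reader = csv.DictReader(handle)
    columns = {name: [] for name in (reader.fieldnames or [])}
    for row in reader:
        for name in columns:
            # Short rows yield None for missing fields; treat them as empty.
            columns[name].append(row.get(name) or "")

    summary = {}
    for name, values in columns.items():
        present = [v for v in values if v.strip() != ""]
        stats = {
            "rows": len(values),
            "missing": len(values) - len(present),
            "distinct": len(set(present)),
        }
        # Treat the column as numeric only if every non-empty value parses.
        try:
            numbers = [float(v) for v in present]
        except ValueError:
            numbers = []
        if numbers:
            stats.update(min=min(numbers), max=max(numbers), mean=mean(numbers))
        summary[name] = stats
    return summary


# Tiny made-up sample; a real run would pass open("bureau.csv", newline="").
sample = io.StringIO(
    "id,amount,status\n"
    "1,100.0,a\n"
    "2,,b\n"
    "3,300.0,a\n"
)
for column, stats in summarize_table(sample).items():
    print(column, stats)
```

This sticks to the standard library so it runs anywhere; for the larger tables, a dataframe library would likely be a better fit.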

@jvmncs jvmncs added eda exploratory data analysis epic epic issues labels Jul 24, 2018