A simple blueprint for data preprocessing
-GOAL:
Pre-modeling accounts for about 80% of the work, modeling for about 20%
Show the importance of data preprocessing, feature exploration, and feature engineering on model performance
Go over a few effective pre-modeling steps
This is only a small subset of pre-modeling
-Format:
-Tutorial style:
Walk through concepts and code (and point out libraries)
Use an edited version of the 'adult' dataset (to predict income) with the objective of building a binary classification model
-Python libraries:
NumPy, pandas, scikit-learn, Matplotlib
Almost the entire workflow is covered by these four libraries
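As a rough illustration of how the four libraries divide the work, here is a minimal sketch. It does not use the tutorial's actual dataset: the rows are synthetic, and the column names (age, hours_per_week, sex, income) and the logistic regression model are assumptions chosen only to resemble an 'adult'-style binary classification task.

```python
import numpy as np
import pandas as pd
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# NumPy: generate hypothetical rows standing in for the edited 'adult' data.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "age": rng.integers(18, 70, size=n),
    "hours_per_week": rng.integers(10, 60, size=n),
    "sex": rng.choice(["Male", "Female"], size=n),
})
# Synthetic binary target (assumption): 1 = higher income, 0 = lower income.
noise = rng.normal(0, 10, size=n)
df["income"] = (df["age"] + df["hours_per_week"] + noise > 80).astype(int)

# pandas: separate the target and one-hot encode the categorical column.
X = pd.get_dummies(df.drop(columns="income"), columns=["sex"], dtype=float)
y = df["income"]

# scikit-learn: hold out a test set, scale features, fit a simple classifier.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)
scaler = StandardScaler().fit(X_train)  # fit on training data only
model = LogisticRegression().fit(scaler.transform(X_train), y_train)
accuracy = model.score(scaler.transform(X_test), y_test)

# Matplotlib: a quick look at one feature's distribution.
fig, ax = plt.subplots()
ax.hist(df["age"], bins=10)
ax.set_xlabel("age")
ax.set_ylabel("count")
plt.close(fig)

print(f"Held-out accuracy on synthetic data: {accuracy:.2f}")
```

The scaler is fit on the training split only, so no information from the held-out rows leaks into preprocessing. The accuracy printed here says nothing about the real dataset, because the target was generated from the features.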