My journey in learning exploratory data analysis (EDA). I discuss basic EDA concepts and demonstrate them on a dataset used by one of the AI Saturdays members. I chose it because it is a simple dataset that contains most of what I needed to demonstrate these concepts. I was inspired to do this because, in machine learning, it seemed to me that far more attention goes to the models than to the data.
A process to uncover underlying insights about the data.
Saves time
Helps in extracting and engineering features
Helps you understand why your model fails or succeeds
Helps you understand, validate, or dispel assumptions
Gives a better understanding of the domain, so one can more easily ask the right questions
- Visualizing data
- Statistical summaries and inferences
- Cleaning
- Feature selection, engineering and extraction
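
As a rough illustration of two of these steps (cleaning and statistical summaries), here is a minimal sketch using only Python's standard library. The numbers are made up for the example and the variable names are hypothetical; the notebooks themselves may take a different approach.

```python
import statistics

# Made-up visitor counts for illustration only; None marks a missing record.
raw_counts = [120, 135, None, 150, 980, 140, None, 125]

# Cleaning: drop the missing entries (one of several possible strategies).
counts = [c for c in raw_counts if c is not None]

# Statistical summary of the cleaned values.
summary = {
    "n": len(counts),
    "mean": statistics.mean(counts),
    "median": statistics.median(counts),
    "stdev": statistics.stdev(counts),
}
print(summary)

# A mean far above the median hints at a right-skewed distribution
# or an outlier (here, the value 980).
print(summary["mean"] > summary["median"])
```

Comparing the mean with the median is a quick first check before plotting: a large gap suggests the data deserves a closer look.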
The data tracks travel into and out of Busia.
Travel_Route : Whether it is an arrival or departure
Visitors_in_Transit : Number of visitors passing through Busia
Visitors_on_Holiday : Number of visitors on holiday in Busia
Visitors_on_Business : Number of visitors on business in Busia
Other_Visitors : Visitors whose purpose has not been specified
Year : The date of travel
Year_text : Extracted year
Results_Status : Meaning unclear
OBJECTID : Index of rows
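
To make the column layout concrete, the sketch below reads a few rows with these column names and totals visitors per travel route while counting empty cells. Everything in it is an assumption for illustration: the rows are invented (not taken from the real dataset), the `Arrival`/`Departure` labels are guessed values for `Travel_Route`, and the `Year` and `Results_Status` columns are left out because their formats are not described here.

```python
import csv
import io
from collections import defaultdict

# Made-up rows that only mimic the column layout described above;
# they are NOT taken from the real Busia dataset.
SAMPLE_CSV = """OBJECTID,Travel_Route,Visitors_in_Transit,Visitors_on_Holiday,Visitors_on_Business,Other_Visitors,Year_text
1,Arrival,100,40,60,10,2010
2,Departure,90,35,55,5,2010
3,Arrival,120,50,70,,2011
4,Departure,110,45,65,15,2011
"""

VISITOR_COLUMNS = [
    "Visitors_in_Transit",
    "Visitors_on_Holiday",
    "Visitors_on_Business",
    "Other_Visitors",
]


def to_int(value):
    """Treat an empty cell as 0 (an assumption; other handling may suit the real data better)."""
    return int(value) if value.strip() else 0


totals = defaultdict(int)   # total visitors per travel route
missing = defaultdict(int)  # number of empty cells per visitor column

for row in csv.DictReader(io.StringIO(SAMPLE_CSV)):
    for col in VISITOR_COLUMNS:
        if not row[col].strip():
            missing[col] += 1
        totals[row["Travel_Route"]] += to_int(row[col])

print(dict(totals))
print(dict(missing))
```

Counting empty cells before aggregating shows how much of each total rests on the fill-with-zero assumption.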
This notebook discusses ways to summarize data and types of data distributions
This notebook discusses ways to prepare data before passing it to a model
Images used to illustrate concepts
This notebook discusses various types of plots