
Abhishree7/Basic-EDA

 
 


EDA (Exploratory Data Analysis)

My journey in learning exploratory data analysis. I discuss basic EDA concepts and demonstrate them using a dataset used by one of the AI Saturdays members. I chose it because it is a simple dataset with a lot of what I needed to demonstrate these concepts. I was inspired to do this because it seemed to me that, when it comes to machine learning, a lot more focus is placed on the models than on the data.

What is EDA

A process to uncover underlying insights about the data.

Why EDA is important

Saves time

Helps in extracting and engineering features

Helps you understand why your model fails or succeeds

Helps validate or dispel assumptions

Gives a better understanding of the domain, so one can more easily ask the right questions

EDA Concepts

  1. Visualizing data
  2. Statistical summaries and inferences
  3. Cleaning
  4. Feature selection, engineering and extraction

The dataset tracks travel into and out of Busia.

Travel_Route : Whether it is an arrival or departure

Visitors_in_Transit : Number of visitors passing through Busia

Visitors_on_Holiday : Number of visitors on holiday in Busia

Visitors_on_Business : Number of visitors on business in Busia

Other_Visitors : Visitors whose purpose has not been specified

Year : The date of travel

Year_text : Extracted year

Results_Status : No idea what this is

OBJECTID : Index of rows
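As a rough sketch of how columns like these might be summarized, the snippet below builds a few made-up rows with the column names above and computes basic statistics using only Python's standard library. The values, the `"Arrival"` and `"Departure"` labels, and the `summarize` and `totals_by_route` helpers are illustrative assumptions, not taken from the actual dataset or notebooks.

```python
import statistics
from collections import defaultdict

# Made-up rows for illustration only; the real dataset's values differ.
rows = [
    {"Travel_Route": "Arrival", "Visitors_in_Transit": 120,
     "Visitors_on_Holiday": 40, "Visitors_on_Business": 65, "Other_Visitors": 10},
    {"Travel_Route": "Departure", "Visitors_in_Transit": 100,
     "Visitors_on_Holiday": 35, "Visitors_on_Business": 70, "Other_Visitors": 5},
    {"Travel_Route": "Arrival", "Visitors_in_Transit": 150,
     "Visitors_on_Holiday": 55, "Visitors_on_Business": 60, "Other_Visitors": 15},
    {"Travel_Route": "Departure", "Visitors_in_Transit": 90,
     "Visitors_on_Holiday": 45, "Visitors_on_Business": 80, "Other_Visitors": 10},
]

VISITOR_COLUMNS = [
    "Visitors_in_Transit",
    "Visitors_on_Holiday",
    "Visitors_on_Business",
    "Other_Visitors",
]


def summarize(rows, column):
    """Return basic summary statistics for one numeric column."""
    values = [row[column] for row in rows]
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
    }


def totals_by_route(rows, column):
    """Sum one numeric column separately for each Travel_Route value."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["Travel_Route"]] += row[column]
    return dict(totals)


for column in VISITOR_COLUMNS:
    print(column, summarize(rows, column))
print(totals_by_route(rows, "Visitors_in_Transit"))
```

In a notebook the same steps are commonly done with a dataframe library; the plain-Python version is used here only to keep the sketch dependency-free.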

This notebook discusses ways to summarize data and types of data distributions

This notebook discusses ways to prepare data before passing it to a model

Images used to illustrate concepts

This notebook discusses various types of plots
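To make the idea of preparing data before modelling more concrete, here is a minimal sketch of three common steps: filling a missing count, encoding `Travel_Route` as a number, and extracting a year from a date. It may or may not match what the notebooks actually do. The sample rows, the `YYYY-MM-DD` date format, the median fill strategy, and the `prepare` helper and `is_arrival` field are all assumptions made for illustration.

```python
from datetime import datetime
from statistics import median

# Made-up raw rows; an empty string stands in for a missing count.
raw_rows = [
    {"Travel_Route": "Arrival", "Visitors_on_Holiday": "40", "Year": "2010-01-01"},
    {"Travel_Route": "Departure", "Visitors_on_Holiday": "", "Year": "2011-01-01"},
    {"Travel_Route": "Arrival", "Visitors_on_Holiday": "60", "Year": "2012-01-01"},
]


def prepare(raw_rows):
    """Clean raw string rows into numeric, model-ready records."""
    # Parse counts, using None where the value is missing.
    counts = [
        int(row["Visitors_on_Holiday"]) if row["Visitors_on_Holiday"] else None
        for row in raw_rows
    ]
    # Assumed strategy: replace missing counts with the median of known ones.
    fill_value = median(c for c in counts if c is not None)

    prepared = []
    for row, count in zip(raw_rows, counts):
        prepared.append({
            # Encode the two-valued route as 1 (arrival) or 0 (anything else).
            "is_arrival": 1 if row["Travel_Route"] == "Arrival" else 0,
            "Visitors_on_Holiday": count if count is not None else fill_value,
            # Extract the year from the date (assumed YYYY-MM-DD format).
            "Year_text": datetime.strptime(row["Year"], "%Y-%m-%d").year,
        })
    return prepared


for record in prepare(raw_rows):
    print(record)
```

Filling with the median is only one option; dropping incomplete rows or using a different fill value may suit other datasets better.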

About

Project on Basic exploratory data analysis
