This repository presents a collection of exploratory data analysis (EDA) projects that highlight my ability to transform raw datasets into meaningful insights. Each project focuses on understanding data structure, detecting anomalies, identifying trends, and preparing datasets for advanced analytics or modeling.
The typical workflow followed across projects includes:
- Data Collection & Cleaning – Handling missing values, correcting inconsistencies, and preparing structured data.
- Exploratory Analysis – Summarizing data using descriptive statistics and uncovering relationships between variables.
- Outlier Detection & Treatment – Identifying unusual values using statistical methods (IQR, Z-score, etc.) and applying capping or removal.
- Visualization – Communicating insights through clear and intuitive visualizations.
- Interpretation & Reporting – Highlighting key findings and providing actionable insights.
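
The cleaning and outlier steps above can be sketched roughly as follows. This is a minimal illustration, not code taken from any project in this repository: the `amount` column, the toy values, and the helper names (`iqr_bounds`, `cap_outliers_iqr`, `zscore_outlier_mask`) are all hypothetical, and median imputation is just one assumed choice for handling missing values.

```python
import numpy as np
import pandas as pd


def iqr_bounds(values: pd.Series, k: float = 1.5) -> tuple[float, float]:
    """Return the (lower, upper) fences Q1 - k*IQR and Q3 + k*IQR."""
    q1, q3 = values.quantile(0.25), values.quantile(0.75)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def cap_outliers_iqr(values: pd.Series, k: float = 1.5) -> pd.Series:
    """Cap (winsorize) values that fall outside the IQR fences."""
    lower, upper = iqr_bounds(values, k)
    return values.clip(lower=lower, upper=upper)


def zscore_outlier_mask(values: pd.Series, threshold: float = 3.0) -> pd.Series:
    """Flag values whose absolute Z-score exceeds the threshold."""
    z = (values - values.mean()) / values.std(ddof=0)
    return z.abs() > threshold


# Hypothetical toy data: one missing value and one obvious outlier.
df = pd.DataFrame({"amount": [10.0, 12.0, 11.0, np.nan, 13.0, 12.0, 95.0]})

# Cleaning: fill the missing value with the column median (an assumed choice).
df["amount"] = df["amount"].fillna(df["amount"].median())

# Treatment option 1: cap values at the IQR fences.
capped = cap_outliers_iqr(df["amount"])

# Treatment option 2: flag by Z-score. A lower threshold is used here because
# in a sample of n points the largest possible |z| is sqrt(n - 1).
mask = zscore_outlier_mask(df["amount"], threshold=2.0)
cleaned = df.loc[~mask, "amount"]
```

Capping keeps every row and limits the influence of extreme values, while removal drops the flagged rows entirely. Which one is appropriate depends on whether the extreme values are errors or genuine observations.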

Skills demonstrated across the projects:
- Data wrangling and preprocessing
- Statistical analysis and hypothesis testing
- Outlier detection and handling strategies
- Correlation and regression analysis
- Data visualization and storytelling

Tools and technologies used:
- Python: pandas, NumPy, SciPy, matplotlib, seaborn, statsmodels
- Jupyter Notebook for analysis and documentation
- Git & GitHub for version control and project showcasing
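
As a small illustration of the correlation and regression analysis mentioned above, the sketch below uses only NumPy. The `hours` and `score` arrays are made-up values for demonstration, not data from any project here, and a simple least-squares line is assumed as the model.

```python
import numpy as np

# Hypothetical paired observations (illustrative values only).
hours = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
score = np.array([52.0, 55.0, 61.0, 64.0, 68.0])

# Pearson correlation coefficient between the two variables.
r = np.corrcoef(hours, score)[0, 1]

# Ordinary least-squares fit of a straight line: score ~ slope * hours + intercept.
slope, intercept = np.polyfit(hours, score, deg=1)

# Coefficient of determination computed from the residuals.
predicted = slope * hours + intercept
ss_res = np.sum((score - predicted) ** 2)
ss_tot = np.sum((score - score.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

print(f"r = {r:.3f}, slope = {slope:.2f}, intercept = {intercept:.2f}, R^2 = {r_squared:.3f}")
```

For a simple linear regression with one predictor, `r_squared` equals the square of the Pearson coefficient, which makes a quick consistency check. For p-values and confidence intervals, SciPy or statsmodels from the toolset above would be the more natural choice.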