StevenTapscott/Python-Notebooks-EDA-Analysis

Exploratory Data Analysis (EDA) – Global Superstore

Overview

This project performs an in-depth Exploratory Data Analysis (EDA) on the Global Superstore dataset to uncover patterns in sales, profit, and customer behaviour. The analysis focuses on identifying key business insights that can support data-driven decision-making in a retail context.

Objectives

- Understand overall sales and profit performance
- Identify top-performing products and categories
- Analyse regional and segment-level performance
- Detect trends, patterns, and potential inefficiencies

Tech Stack

- Python
- Pandas & NumPy – data manipulation
- Matplotlib & Seaborn – data visualisation

Dataset

Global Superstore dataset

Includes:

- Orders and sales data
- Product categories and sub-categories
- Customer segments
- Regional and geographical information

Project Workflow

  1. Data Exploration
     - Reviewed dataset structure, columns, and data types
     - Generated summary statistics for key variables
  2. Data Cleaning
     - Checked for missing values and inconsistencies
     - Prepared dataset for analysis
  3. Sales & Profit Analysis
     - Analysed total sales and profit performance
     - Compared profitability across categories and sub-categories
  4. Regional Analysis
     - Evaluated sales performance across regions
     - Identified high-performing and underperforming areas
  5. Customer Segment Analysis
     - Compared sales and profit across customer segments
     - Identified the most valuable customer groups
  6. Visualisation
     - Created charts to highlight trends and patterns
     - Used visual storytelling to communicate insights
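
Steps 1 to 5 of the workflow can be sketched in a few lines of Pandas. This is a minimal illustration, not the notebook's actual code: the sample rows are invented, the column names (`Category`, `Sub-Category`, `Region`, `Segment`, `Sales`, `Profit`) are assumptions about the dataset's schema, and `summarise` is a hypothetical helper.

```python
import pandas as pd

# Hypothetical sample rows; the column names are assumptions about the dataset's schema.
orders = pd.DataFrame(
    {
        "Category": ["Technology", "Furniture", "Office Supplies", "Furniture", "Technology"],
        "Sub-Category": ["Phones", "Tables", "Binders", "Chairs", "Copiers"],
        "Region": ["Central", "South", "Central", "North", "South"],
        "Segment": ["Consumer", "Corporate", "Consumer", "Home Office", "Corporate"],
        "Sales": [1200.0, 800.0, 150.0, 600.0, None],
        "Profit": [300.0, -120.0, 45.0, 90.0, 400.0],
    }
)

# 1. Data exploration: data types and summary statistics for the key measures.
print(orders.dtypes)
print(orders[["Sales", "Profit"]].describe())

# 2. Data cleaning: count missing values, then drop rows lacking a key measure.
missing_per_column = orders.isna().sum()
clean = orders.dropna(subset=["Sales", "Profit"])


def summarise(frame, by):
    """Total sales and profit per group, largest sales first (hypothetical helper)."""
    totals = frame.groupby(by)[["Sales", "Profit"]].sum()
    return totals.sort_values("Sales", ascending=False)


# 3-5. Sales and profit by category, region, and customer segment.
by_category = summarise(clean, "Category")
by_region = summarise(clean, "Region")
by_segment = summarise(clean, "Segment")
print(by_category)
```

Dropping incomplete rows is only one possible cleaning choice; imputing or investigating the missing values may be more appropriate depending on how many rows are affected.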

Key Insights

- Certain product categories generate high sales but lower profit, indicating potential cost inefficiencies
- Regional performance varies significantly, highlighting opportunities for targeted strategies
- Specific customer segments contribute disproportionately to overall revenue
- Some sub-categories consistently underperform, suggesting potential for optimisation or removal
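
One way to surface the "high sales but lower profit" pattern is to compare each category's profit margin (profit divided by sales) against its sales volume. The sketch below uses made-up totals, and the median-based thresholds are an assumption chosen for illustration, not the rule used in the analysis.

```python
import pandas as pd

# Hypothetical category totals, invented purely for illustration.
totals = pd.DataFrame(
    {"Sales": [5000.0, 4200.0, 1500.0], "Profit": [900.0, 126.0, 375.0]},
    index=["Technology", "Furniture", "Office Supplies"],
)

# Profit margin: profit earned per unit of sales.
totals["Margin"] = totals["Profit"] / totals["Sales"]

# Assumed rule: at-or-above-median sales combined with below-median margin.
high_sales = totals["Sales"] >= totals["Sales"].median()
low_margin = totals["Margin"] < totals["Margin"].median()
flagged = totals[high_sales & low_margin]
print(flagged)
```

With these sample figures only Furniture is flagged: it sells well but keeps roughly 3% of sales as profit.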

Limitations

- Analysis is based on historical data and may not reflect future trends
- Limited coverage of external factors (e.g., economic conditions, competition)
- Dataset scope may not capture all business variables

Future Improvements

- Build predictive models for sales forecasting
- Integrate additional datasets for deeper insights
- Develop interactive dashboards (Power BI / Tableau)
- Perform customer segmentation using clustering techniques

Portfolio Value

This project demonstrates:

- Strong EDA and data analysis skills
- Ability to extract business insights from data
- Proficiency in data visualisation and storytelling
- Understanding of retail and sales analytics

About

Python exploratory data analysis project using Pandas, NumPy, Matplotlib, and Seaborn to perform data cleaning, statistical analysis, feature exploration, and visual storytelling across structured datasets within Jupyter Notebook workflows.
