Roses29/retail-sales-analysis-python

🌍 Read in French | 📓 View the notebook: Open Notebook

Retail Store Sales Analysis (Python Project)

Project Overview

This project analyzes a retail store sales dataset using Python. The dataset comes from Kaggle and is intentionally "dirty": it contains missing and inconsistent values to simulate real-world data challenges.

The project covers the full data analysis workflow:

  • Data loading
  • Data exploration
  • Data cleaning
  • Exploratory Data Analysis (EDA)
  • Insights generation

📂 Dataset Description

The dataset represents transactional sales data from a retail store, including:

  • Transaction ID
  • Customer ID
  • Category
  • Item
  • Price per Unit
  • Quantity
  • Total Spent
  • Payment Method
  • Location (Online / In-store)

It contains:

  • 8 product categories
  • 25 items per category
  • Multiple customers and transactions
  • Missing and inconsistent values
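As a rough illustration of the loading and exploration steps, the sketch below reads a few rows into pandas and summarizes shape, missing values, and unique counts. The column names come from the list above; the sample rows, the IDs, and the `summarize` helper are hypothetical, since this README does not give the Kaggle file name or any actual values.

```python
import io

import pandas as pd

# Hypothetical sample rows that mimic the columns described above.
# The real dataset would be loaded from the Kaggle CSV instead.
SAMPLE_CSV = """Transaction ID,Customer ID,Category,Item,Price per Unit,Quantity,Total Spent,Payment Method,Location
TXN_1,CUST_01,Food,Item_1_FOOD,5.0,2,10.0,Cash,Online
TXN_2,CUST_02,Beverages,Item_2_BEV,6.5,,,Credit Card,In-store
TXN_3,CUST_01,Food,,,3,15.0,Digital Wallet,Online
"""


def summarize(df):
    """Return basic structure information for a DataFrame (illustrative helper)."""
    return {
        "shape": df.shape,                       # (rows, columns)
        "missing": df.isna().sum().to_dict(),    # missing values per column
        "unique": df.nunique().to_dict(),        # distinct non-null values per column
    }


df = pd.read_csv(io.StringIO(SAMPLE_CSV))
summary = summarize(df)
print(summary["shape"])
print(summary["missing"])
```

Running the same helper on the full dataset is one way to check the counts listed above, such as the number of distinct categories.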

🎯 Objectives

The main goals of this project are:

  • Clean a messy real-world dataset
  • Understand customer purchasing behavior
  • Analyze sales performance across categories
  • Identify trends and patterns in transactions
  • Practice end-to-end data analysis in Python

🧹 Data Cleaning

Several data quality issues were addressed:

  • Handling missing values in key columns (Price per Unit, Quantity, Total Spent)
  • Ensuring consistency between related variables
  • Removing or handling duplicates
  • Validating calculated fields (e.g., Total Spent = Price per Unit × Quantity)
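One possible way to carry out these steps is sketched below. It assumes that a missing value can be recovered whenever the other two of Price per Unit, Quantity, and Total Spent are present. The `clean_sales` function, the `total_ok` flag column, and the sample rows are hypothetical and are not taken from the notebook.

```python
import numpy as np
import pandas as pd


def clean_sales(df):
    """Drop exact duplicates, recover missing values, and flag inconsistent rows."""
    price, qty, total = "Price per Unit", "Quantity", "Total Spent"
    out = df.drop_duplicates().copy()

    # Recover each field from the other two where both are available.
    # Rows with a zero quantity or price would need separate handling.
    out[total] = out[total].fillna(out[price] * out[qty])
    out[price] = out[price].fillna(out[total] / out[qty])
    out[qty] = out[qty].fillna(out[total] / out[price])

    # Validate the calculated field; rows that still have gaps are flagged False.
    expected = (out[price] * out[qty]).to_numpy()
    out["total_ok"] = np.isclose(out[total].to_numpy(), expected)
    return out


# Hypothetical rows: one complete, three with a single gap,
# one unrecoverable, and one exact duplicate of the first.
raw = pd.DataFrame({
    "Transaction ID": ["T1", "T2", "T3", "T4", "T5", "T1"],
    "Price per Unit": [5.0, 5.0, np.nan, 4.0, np.nan, 5.0],
    "Quantity": [2.0, 2.0, 3.0, np.nan, np.nan, 2.0],
    "Total Spent": [10.0, np.nan, 15.0, 8.0, 12.0, 10.0],
})

cleaned = clean_sales(raw)
print(cleaned)
```

Rows that cannot be recovered, such as the one missing both price and quantity, are kept and flagged rather than dropped, so the decision to remove or impute them stays explicit.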

🔍 Exploratory Data Analysis (EDA)

The analysis includes:

  • Dataset structure exploration (shape, types, unique values)
  • Customer and product analysis
  • Category-level performance
  • Payment method distribution
  • Online vs in-store behavior
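The category, payment method, and channel views listed above can each be expressed as a short pandas aggregation. The sketch below uses a tiny hypothetical table; the category and payment method names are placeholders, since this README does not enumerate them.

```python
import pandas as pd

# Hypothetical cleaned transactions using the column names from the dataset description.
sales = pd.DataFrame({
    "Category": ["Food", "Food", "Beverages", "Furniture"],
    "Payment Method": ["Cash", "Credit Card", "Cash", "Digital Wallet"],
    "Location": ["Online", "In-store", "Online", "In-store"],
    "Total Spent": [10.0, 20.0, 15.0, 40.0],
})

# Category-level performance: total revenue per category, highest first.
category_sales = (
    sales.groupby("Category")["Total Spent"].sum().sort_values(ascending=False)
)

# Payment method distribution as a share of all transactions.
payment_share = sales["Payment Method"].value_counts(normalize=True)

# Online vs in-store behavior: transaction count and average basket value.
channel_stats = sales.groupby("Location")["Total Spent"].agg(["count", "mean"])

print(category_sales)
print(payment_share)
print(channel_stats)
```

Each result is a regular Series or DataFrame, so it can be passed directly to Matplotlib or Seaborn for plotting.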

📊 Tools & Technologies

  • Python
  • Pandas (data manipulation)
  • Matplotlib & Seaborn (visualization)
  • Google Colab

💡 Key Insights

  • Identification of the number of unique customers and products
  • Clear distinction between online and in-store transactions
  • Detection of missing-value patterns in key financial variables
  • Understanding of category distribution and customer behavior

👤 Author: Robin Rubangura

About

End-to-end retail sales analysis using Python, including data cleaning, EDA, and insights from a real-world messy dataset.
