Skip to content

evansphillips/retail-customer-segmentation-forecasting

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

12 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Exploring Customer Segmentation and Customer Lifetime Value for Sales Forecasting


Background

Welcome to the data exploration journey of understanding customer behavior and enhancing sales forecasting for a UK-based company specializing in unique all-occasion gifts. Our goal is to unlock valuable insights from customer data and historical sales, laying the foundation for effective customer segmentation and improved sales predictions.

Objectives

  • Understand the Data:

  • Exploratory Data Analysis (EDA):

    • Perform comprehensive exploratory data analysis to uncover hidden patterns, trends, and anomalies within the dataset.
  • Data Preparation:

    • Preprocess and prepare the data for subsequent analyses, ensuring its suitability for modeling.
  • Customer Segmentation:

    • Utilize advanced segmentation techniques to categorize customers based on their behavior, preferences, and historical interactions.
  • Forecasting Models:

    • Develop and implement tailored forecast models for each customer segment, aiming for accurate sales predictions.
  • Results Presentation:

    • Present the findings, insights, and actionable recommendations in a clear and concise manner.

Data Description

The heart of our exploration lies in the Online Retail II dataset, offering a real-world snapshot of online retail transactions. The primary data elements include:

online_retail_II.xlsx
This comprehensive table captures records for all created orders, boasting 1,067,371 rows and 8 columns. With a size of 44.55MB, it serves as a rich source of information for our analysis.

Data Element Type Description
Invoice object Invoice number, uniquely assigned to each transaction. If starting with 'c', it signifies a cancellation.
StockCode object Unique product (item) code assigned to each distinct product.
Description object Descriptive name of the product (item).
Quantity int64 Quantities of each product (item) per transaction.
InvoiceDate datetime Date and time of the invoice generation.
Price float64 Unit price of the product in pounds (£).
Customer ID int64 Unique customer identifier with a 5-digit integral number.
Country object Country name where the customer resides.

File Tree

.
├── data
│   ├── 2009-2010.csv
│   └── 2010-2011.csv
├── models
│   └── t2v
├── notebooks
│   ├── ds4a_retail_challenge.ipynb
│   ├── gensim_lda.py
│   └── utils.py
├── README.md
└── requirements.txt

References

  1. https://www.machinelearningplus.com/nlp/topic-modeling-gensim-python/
  2. https://github.com/nicodv/kmodes
  3. https://towardsdatascience.com/understanding-topic-coherence-measures-4aa41339634c
  4. https://medium.com/@thomas.shawcarveth/market-segmentation-and-predicting-marketing-success-with-data-science-f48c99e3b4e1
  5. https://www.geeksforgeeks.org/rfm-analysis-analysis-using-python/
  6. https://www.machinelearningplus.com/time-series/arima-model-time-series-forecasting-python/

About

A retail revenue forecasting challenge from DS4A program.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published