# Phase 3: Advanced Analytics - RFM Customer Segmentation

**Team:** Code Serpents

**Team Member:** G.A.Dilsara Thiranjaya

---

## Executive Summary

This notebook implements an RFM (Recency, Frequency, Monetary) analysis that segments the customers of "Unique Gifts Ltd." by purchasing behavior. The resulting segments are used to identify high-value customers, flag at-risk customers, and find opportunities for targeted marketing.

## Objectives

1. **Calculate RFM Metrics**: Compute Recency, Frequency, and Monetary values for each customer
2. **Assign RFM Scores**: Use quintile-based scoring (1-5) for each RFM dimension
3. **Customer Segmentation**: Map RFM scores to descriptive business segments
4. **Business Insights**: Generate actionable recommendations for each segment
5. **Wholesaler Analysis**: Investigate the hypothesis of two distinct customer groups
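
As a rough illustration of objectives 1 and 2, the sketch below computes the three metrics per customer and converts them to 1-5 quintile scores. The column names (`Customer ID`, `Invoice`, `InvoiceDate`, `TotalPrice`) come from the dataset loaded in this notebook. The function names `compute_rfm` and `score_rfm`, the snapshot date (one day after the last invoice), and the rank-based tie-breaking are assumptions made for this example; they are not necessarily what the project's `rfm_segmentation` module does.

```python
import pandas as pd


def compute_rfm(df: pd.DataFrame, snapshot_date=None) -> pd.DataFrame:
    """Return one row per customer with Recency, Frequency and Monetary columns."""
    if snapshot_date is None:
        # Assumption: measure recency from the day after the last recorded invoice.
        snapshot_date = df["InvoiceDate"].max() + pd.Timedelta(days=1)

    return df.groupby("Customer ID").agg(
        # Days since the customer's most recent purchase
        Recency=("InvoiceDate", lambda s: (snapshot_date - s.max()).days),
        # Number of distinct invoices
        Frequency=("Invoice", "nunique"),
        # Total spend
        Monetary=("TotalPrice", "sum"),
    )


def score_rfm(rfm: pd.DataFrame, n_bins: int = 5) -> pd.DataFrame:
    """Add R, F and M scores (1 = worst, n_bins = best) using quantile bins."""
    ascending = list(range(1, n_bins + 1))
    descending = ascending[::-1]
    scored = rfm.copy()

    # Low recency is good, so Recency gets reversed labels.
    plan = [
        ("Recency", "R", descending),
        ("Frequency", "F", ascending),
        ("Monetary", "M", ascending),
    ]
    for column, score_name, labels in plan:
        # Ranking with method="first" breaks ties, so the bin edges stay unique.
        ranks = scored[column].rank(method="first")
        scored[score_name] = pd.qcut(ranks, q=n_bins, labels=labels).astype(int)
    return scored
```

With the notebook's DataFrame this would be called as `score_rfm(compute_rfm(data))`. Quantile scoring only becomes meaningful when there are clearly more customers than bins.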

---

## 1. Setup and Data Loading

In [2]:
# Import required libraries
import sys
import os

# Add the src directory to Python path
sys.path.append('../src')

# Import the data loader from our custom RFM module
from rfm_segmentation import load_and_validate_data

# Standard libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime
import warnings
warnings.filterwarnings('ignore')

# Set display options
pd.set_option('display.max_columns', None) # Display all columns of the DataFrame without truncation
pd.set_option('display.max_rows', 100) # Display a maximum of 100 rows in the DataFrame

print("📦 All libraries imported successfully!")
print(f"📅 Analysis Date: {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")

# Load the cleaned data
DATA_PATH = "../data/online_retail_clean.csv"

data = load_and_validate_data(DATA_PATH)

# Display basic information about the dataset
print("\n📊 Dataset Overview:")
print(f"Shape: {data.shape}")
print(f"Date Range: {data['InvoiceDate'].min()} to {data['InvoiceDate'].max()}")
print(f"Unique Customers: {data['Customer ID'].nunique():,}")
print(f"Total Transactions: {data['Invoice'].nunique():,}")
print(f"Total Revenue: £{data['TotalPrice'].sum():,.2f}")

# Display first few rows
print("\n🔍 Sample Data:")
display(data.head())

📦 All libraries imported successfully!
📅 Analysis Date: 2025-08-18 08:36:38
✅ Data loaded successfully: 13 rows, 13 columns

📊 Dataset Overview:
Shape: (13, 13)
Date Range: 2009-12-01 07:45:00 to 2009-12-01 09:06:00
Unique Customers: 2
Total Transactions: 3
Total Revenue: £710.60

🔍 Sample Data:


| | Invoice | StockCode | Description | Quantity | InvoiceDate | Price | Customer ID | Country | TotalPrice | Year | Month | DayOfWeek | HourOfDay |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 489434 | 85048 | 15CM CHRISTMAS GLASS BALL 20 LIGHTS | 12 | 2009-12-01 07:45:00 | 6.95 | 13085 | United Kingdom | 83.4 | 2009 | 12 | 1 | 7 |
| 1 | 489434 | 79323P | PINK CHERRY LIGHTS | 12 | 2009-12-01 07:45:00 | 6.75 | 13085 | United Kingdom | 81.0 | 2009 | 12 | 1 | 7 |
| 2 | 489434 | 79323W | WHITE CHERRY LIGHTS | 12 | 2009-12-01 07:45:00 | 6.75 | 13085 | United Kingdom | 81.0 | 2009 | 12 | 1 | 7 |
| 3 | 489434 | 22041 | RECORD FRAME 7" SINGLE SIZE | 48 | 2009-12-01 07:45:00 | 2.1 | 13085 | United Kingdom | 100.8 | 2009 | 12 | 1 | 7 |
| 4 | 489434 | 21232 | STRAWBERRY CERAMIC TRINKET BOX | 24 | 2009-12-01 07:45:00 | 1.25 | 13085 | United Kingdom | 30.0 | 2009 | 12 | 1 | 7 |
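
The source of `load_and_validate_data` is not shown in this notebook. For readers without access to the `src` directory, the following is a minimal sketch of what a loader with basic validation could look like. The names `REQUIRED_COLUMNS`, `validate_retail_frame`, and `load_retail_csv` are hypothetical, and the specific checks (required columns present, dates parsed, rows without a customer ID dropped) are assumptions rather than the project's actual behavior.

```python
import pandas as pd

# Assumption: the columns the RFM steps in this notebook rely on.
REQUIRED_COLUMNS = ["Invoice", "InvoiceDate", "Customer ID", "TotalPrice"]


def validate_retail_frame(df: pd.DataFrame) -> pd.DataFrame:
    """Check required columns, parse dates, and drop rows without a customer."""
    missing = [col for col in REQUIRED_COLUMNS if col not in df.columns]
    if missing:
        raise ValueError(f"Missing required columns: {missing}")

    out = df.copy()
    # Ensure InvoiceDate is a datetime so min/max and recency arithmetic work.
    out["InvoiceDate"] = pd.to_datetime(out["InvoiceDate"])
    # RFM is computed per customer, so anonymous rows cannot be used.
    out = out.dropna(subset=["Customer ID"])
    return out


def load_retail_csv(path: str) -> pd.DataFrame:
    """Hypothetical stand-in for the project's loader: read the CSV, then validate."""
    df = validate_retail_frame(pd.read_csv(path))
    print(f"Data loaded: {df.shape[0]} rows, {df.shape[1]} columns")
    return df
```

Separating the validation from the file read keeps the checks easy to test against small in-memory DataFrames.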


## 2. RFM Analysis Implementation

### 2.1 Initialize RFM Analyzer

In [None]:
# Initialize RFM analyzer
# rfm_analyzer = RFMAnalyzer(data)

# print("🚀 RFM Analyzer initialized successfully!")
# print("Ready to perform comprehensive customer segmentation analysis.")
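
The `RFMAnalyzer` interface is not shown here, so the next sketch is independent of it. It illustrates one way that 1-5 Recency and Frequency scores could be mapped to descriptive segments, as the objectives describe. The segment names, the thresholds, the score column names `R` and `F`, and the helper names `label_segment` and `add_segments` are all illustrative assumptions, not the team's final scheme.

```python
import pandas as pd


def label_segment(r: int, f: int) -> str:
    """Map Recency (r) and Frequency (f) scores, each 1-5, to a segment name.

    Rules are checked top to bottom; the first match wins.
    """
    if r >= 4 and f >= 4:
        return "Champions"          # bought recently and often
    if r >= 3 and f >= 3:
        return "Loyal Customers"    # solid on both dimensions
    if r >= 4:
        return "New or Promising"   # recent but infrequent
    if r <= 2 and f >= 3:
        return "At Risk"            # used to buy often, not recently
    return "Low Engagement"         # everything else


def add_segments(scored: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of a scored RFM table with a Segment column added."""
    out = scored.copy()
    out["Segment"] = [label_segment(r, f) for r, f in zip(out["R"], out["F"])]
    return out
```

A rule list like this is easy to review with business stakeholders and to adjust when segment definitions change.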