## Mod 5 Lecture 6 Code-Along: Segmentation and RFM

### Why this activity?
This hands-on exercise applies today’s lecture on segmentation and RFM analysis: you will build an RFM table from raw transactions, score each customer, and group customers into segments.  

### Goals (you will be able to…)
✅ Calculate Recency, Frequency, and Monetary (RFM) values  
✅ Score customers from 1 (worst) to 4 (best) on each RFM dimension  
✅ Combine RFM scores to segment customers  
✅ Prepare for interview questions about segmentation & behavior-based targeting

### Interview practice (be ready to answer)
- **Q1: You've just completed an RFM analysis. What are two different business actions you could recommend based on your findings?**

- **Q2: You work for a music streaming service that doesn't have purchases. How would you adapt RFM to user engagement?**


## 🔹 Step 1 — Create a sample transaction dataset

In [None]:
import pandas as pd
from datetime import datetime

# Sample transaction data
transactions = {
    'customer_id': [1, 2, 1, 3, 2, 1, 4, 3, 2],
    'transaction_date': [
        '2023-10-01', '2023-10-05', '2023-10-12',
        '2023-08-20', '2023-10-15', '2023-10-25',
        '2023-06-10', '2023-09-01', '2023-10-26'
    ],
    'amount': [50, 75, 60, 120, 80, 55, 200, 150, 90]
}

# Convert to DataFrame and parse dates
df = pd.DataFrame(transactions)
df['transaction_date'] = pd.to_datetime(df['transaction_date'])
df.head()




## 🔹 Step 2 — Set snapshot date and calculate Recency, Frequency, and Monetary (RFM)

- **Recency**: Days between the customer’s most recent purchase and the snapshot date  
- **Frequency**: Number of purchases  
- **Monetary**: Total amount spent  

We’ll measure recency from a fixed snapshot date (rather than today’s date) so the results are reproducible.
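As a quick sanity check on these definitions, here is a minimal sketch that computes the three values by hand for customer 3. It uses only that customer's two rows from the Step 1 sample data and the 2023-10-27 snapshot date from the next cell; the variable names are illustrative.

```python
from datetime import datetime

# Customer 3's transactions, copied from the Step 1 sample data
dates = [datetime(2023, 8, 20), datetime(2023, 9, 1)]
amounts = [120, 150]

# Snapshot date used throughout this notebook
snapshot_date = datetime(2023, 10, 27)

recency = (snapshot_date - max(dates)).days  # days since the latest purchase
frequency = len(dates)                       # number of purchases
monetary = sum(amounts)                      # total amount spent

print(recency, frequency, monetary)  # 56 2 270
```

The grouped calculation in the next cell should produce the same row for customer 3.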


In [None]:
# Set a fixed date to calculate recency from
snapshot_date = datetime(2023, 10, 27)

# Group by customer and calculate RFM
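rfm = df.groupby('customer_id').agg(
    Recency=('transaction_date', lambda d: (snapshot_date - d.max()).days),  # days since last purchase
    Frequency=('transaction_date', 'count'),                                 # number of purchases
    Monetary=('amount', 'sum'),                                              # total amount spent
)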

rfm

## 🔹 Step 3 — Assign scores from 1 (worst) to 4 (best)


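We’ll use quartiles (`pd.qcut`) to split customers into four groups per metric:
- For **Recency**, fewer days is better, so the lowest quartile gets score 4  
- For **Frequency & Monetary**, higher is better, so the highest quartile gets score 4  

Frequency often contains ties, which can make quartile edges coincide and cause `pd.qcut` to fail, so we rank the values first (`rank(method='first')`) to break ties before binning.

These scores help us group customers.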
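To see why the Frequency line in the next cell ranks before binning, the sketch below reproduces the tie from the sample data (two customers with 3 purchases each). It is a standalone illustration: the `freq` Series is typed in by hand rather than taken from `rfm`, and `tie_error` is just a flag for this demo.

```python
import pandas as pd

# Frequency values with a tie, as in the sample data (customers 1 and 2 both have 3)
freq = pd.Series([3, 3, 2, 1], index=[1, 2, 3, 4])

# Direct quartile binning fails: the 75th and 100th percentiles are both 3,
# so two bin edges coincide and qcut raises ValueError
try:
    pd.qcut(freq, 4, labels=[1, 2, 3, 4])
    tie_error = False
except ValueError:
    tie_error = True

# Ranking first breaks ties by order of appearance, giving four distinct values
f_score = pd.qcut(freq.rank(method='first'), 4, labels=[1, 2, 3, 4])

print(tie_error)         # True
print(f_score.tolist())  # [3, 4, 2, 1]
```

Note the trade-off: customers 1 and 2 have the same frequency but receive different scores, purely because of row order. On a dataset this small that is unavoidable with four bins; on real data you may prefer fewer bins or hand-picked thresholds.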


In [None]:
# Convert raw R, F, M to scores using quartiles
rfm['R_score'] = pd.qcut(rfm['Recency'], 4, labels=[4, 3, 2, 1])
rfm['F_score'] = pd.qcut(rfm['Frequency'].rank(method='first'), 4, labels=[1, 2, 3, 4])
rfm['M_score'] = pd.qcut(rfm['Monetary'], 4, labels=[1, 2, 3, 4])

rfm[['Recency', 'Frequency', 'Monetary', 'R_score', 'F_score', 'M_score']]


## 🔹 Step 4 — Combine R, F, M scores into one RFM_Score


We concatenate the three scores as digits, in R, F, M order. For example:  
- **444** = Very recent, very frequent, high spender → Champion  
- **111** = Not recent, rare buyer, low spender → Lost


In [None]:
# Combine scores into a single RFM segment
rfm['RFM_Score'] = rfm['R_score'].astype(str) + rfm['F_score'].astype(str) + rfm['M_score'].astype(str)

# View full table
rfm


## 🔹 Step 5 — Interpreting segments

Segmenting customers by RFM helps identify:
- **Champions** (444): Reward and retain  
- **At-Risk** (e.g. 144): Previously frequent, high-spending customers who have not purchased recently; re-engage with offers  
- **Lost** (111): Survey or win-back attempt  

Use these segments to prioritize your customer strategies.
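One way to turn these interpretations into a column is a small rule-based mapping. The sketch below is illustrative only: the function name `label_segment`, the thresholds, the 'Other' bucket, and the customer IDs are all assumptions rather than lecture material. In particular, it treats At-Risk as a low recency score combined with a high frequency score. The scores are hard-coded examples, not output from the cells above.

```python
import pandas as pd

def label_segment(r: int, f: int, m: int) -> str:
    """Map R, F, M scores (1-4 each) to a coarse segment name (hypothetical rules)."""
    if (r, f, m) == (4, 4, 4):
        return 'Champion'
    if (r, f, m) == (1, 1, 1):
        return 'Lost'
    if r <= 2 and f >= 3:
        return 'At-Risk'  # used to buy often, but not recently
    return 'Other'

# Hard-coded example scores for four made-up customers
scores = pd.DataFrame(
    {'R_score': [4, 1, 1, 3], 'F_score': [4, 1, 4, 2], 'M_score': [4, 1, 4, 2]},
    index=pd.Index([101, 102, 103, 104], name='customer_id'),
)

scores['Segment'] = [
    label_segment(r, f, m)
    for r, f, m in zip(scores['R_score'], scores['F_score'], scores['M_score'])
]

print(scores['Segment'].tolist())  # ['Champion', 'Lost', 'At-Risk', 'Other']
```

To apply the same idea to the notebook's `rfm` table, convert the categorical score columns with `.astype(int)` before passing them to the function. In practice the rules would be tuned with the business team rather than fixed in advance.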
