### What is RFM Analysis?

RFM analysis is a marketing technique that quantitatively ranks and groups customers based on their transaction history. The name comes from the three metrics it uses:

> - Recency: How recently a customer made a purchase.
> - Frequency: How often a customer makes a purchase.
> - Monetary: How much money a customer spends on purchases.

### Why use RFM Analysis?

RFM helps businesses understand customer behaviour and segment customers into groups for targeted marketing. It is mainly used for customer segmentation (identifying high-value customers), personalized marketing, and customer retention.

RFM analysis works best when you have transactional data and want to segment customers based on their purchasing behaviour. It is particularly useful for businesses with a large customer base and frequent transactions.

In [1]:
import pandas as pd

# Sample data
data = {
    'CustomerID': [1, 2, 1, 3, 2],
    'TransactionDate': ['2023-07-01', '2023-06-15', '2023-07-10', '2023-05-20', '2023-07-05'],
    'Amount': [100, 200, 150, 300, 250]
}
df = pd.DataFrame(data)
df

| | CustomerID | TransactionDate | Amount |
|---|---|---|---|
| 0 | 1 | 2023-07-01 | 100 |
| 1 | 2 | 2023-06-15 | 200 |
| 2 | 1 | 2023-07-10 | 150 |
| 3 | 3 | 2023-05-20 | 300 |
| 4 | 2 | 2023-07-05 | 250 |


In [3]:
df['TransactionDate'] = pd.to_datetime(df['TransactionDate'])

In [14]:
# Calculate the RFM metrics per customer

# Reference ("today") date against which recency is measured
reference_date = pd.to_datetime('2023-07-15')

rfm = df.groupby('CustomerID').agg(
    # Recency: days between the reference date and the latest purchase
    Recency=('TransactionDate', lambda x: (reference_date - x.max()).days),
    # Frequency: number of transactions
    Frequency=('TransactionDate', 'count'),
    # Monetary: total amount spent
    Monetary=('Amount', 'sum'),
).reset_index()

rfm

| | CustomerID | Recency | Frequency | Monetary |
|---|---|---|---|---|
| 0 | 1 | 5 | 2 | 250 |
| 1 | 2 | 10 | 2 | 450 |
| 2 | 3 | 56 | 1 | 300 |
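
As a quick check on the Recency column, the value for CustomerID 3 can be reproduced by hand: their only purchase was on 2023-05-20, and the reference date is 2023-07-15. A minimal sketch of that arithmetic (the variable names here are illustrative and not part of the notebook):

```python
import pandas as pd

reference_date = pd.Timestamp('2023-07-15')
last_purchase = pd.Timestamp('2023-05-20')  # CustomerID 3's only transaction

# Subtracting two timestamps gives a Timedelta; .days is its whole-day part
recency_days = (reference_date - last_purchase).days
print(recency_days)  # 56
```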


#### Assigning scores

In [24]:
rfm['R_Score'] = pd.qcut(rfm['Recency'], 4, labels=[4, 3, 2, 1])

>- pd.qcut: This function splits the values into quantile-based bins, so each bin holds roughly the same number of customers.
>- rfm['Recency']: The column containing the recency values.
>- 4: The number of quantiles (quartiles) to divide the data into.
>- [4, 3, 2, 1]: The scores assigned to the quartiles, in ascending order of recency. The lowest recency (most recent transactions) gets the highest score (4), and the highest recency (least recent transactions) gets the lowest score (1). With only three customers in the sample, one quartile stays empty, which is why no customer receives an R_Score of 2.
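
To see where the quartile boundaries fall, pd.qcut can also return the bin edges it computed via retbins=True. The sketch below rebuilds the three recency values as a standalone Series (an assumption made so the snippet runs on its own) and prints both the edges and the resulting scores:

```python
import pandas as pd

# Recency values from the table above, indexed by CustomerID
recency = pd.Series([5, 10, 56], index=[1, 2, 3], name='Recency')

# retbins=True also returns the bin edges that qcut computed
r_score, edges = pd.qcut(recency, 4, labels=[4, 3, 2, 1], retbins=True)

print(edges.tolist())    # [5.0, 7.5, 10.0, 33.0, 56.0]
print(r_score.tolist())  # [4, 3, 1]
```

No recency value falls between 10 and 33, so the third quartile is empty and the score 2 goes unused.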

In [25]:
rfm['F_Score'] = pd.qcut(rfm['Frequency'].rank(method='first'), 4, labels=[1, 2, 3, 4])

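> - rfm['Frequency'].rank(method='first'): This ranks the frequency values, with ties ranked in the order they appear. Ranking first gives every customer a distinct value, which avoids the duplicate bin edges that make pd.qcut raise an error on heavily tied data. The side effect is that customers with the same frequency can receive different scores.
> - 4: The number of quantiles (quartiles) to divide the data into.
> - [1, 2, 3, 4]: The scores assigned to each quartile. The lowest frequency gets the lowest score (1), and the highest frequency gets the highest score (4).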
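
The effect of ranking before binning can be checked directly. The following sketch, which rebuilds the frequency values as a standalone Series for illustration, shows that pd.qcut fails on the raw counts because of tied quartile edges, and that ranking with method='first' resolves this at the cost of splitting the tie:

```python
import pandas as pd

# Frequency values from the RFM table, indexed by CustomerID
frequency = pd.Series([2, 2, 1], index=[1, 2, 3], name='Frequency')

# On the raw counts, several quartile edges coincide, so qcut raises
try:
    pd.qcut(frequency, 4, labels=[1, 2, 3, 4])
    raw_qcut_failed = False
except ValueError:
    raw_qcut_failed = True

# Ranking with method='first' makes every value distinct
ranks = frequency.rank(method='first')
f_score = pd.qcut(ranks, 4, labels=[1, 2, 3, 4])

print(raw_qcut_failed)   # True
print(ranks.tolist())    # [2.0, 3.0, 1.0]
print(f_score.tolist())  # [2, 4, 1]
```

CustomerIDs 1 and 2 both made two purchases, yet they end up with scores 2 and 4 purely because of row order. On a larger dataset with fewer ties per quartile this matters less, but it is worth keeping in mind when reading the scores for this small sample.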

In [26]:
rfm['M_Score'] = pd.qcut(rfm['Monetary'], 4, labels=[1, 2, 3, 4])

>- rfm['Monetary']: The column containing the monetary values.
>- 4: The number of quantiles (quartiles) to divide the data into.
>- [1, 2, 3, 4]: The scores assigned to each quartile. The lowest monetary value gets the lowest score (1), and the highest monetary value gets the highest score (4).

### Calculating the combined RFM score

In [28]:
# Concatenate the three scores as strings, e.g. '4' + '2' + '1' gives '421'
rfm['RFM_Score'] = rfm['R_Score'].astype(str) + rfm['F_Score'].astype(str) + rfm['M_Score'].astype(str)

In [30]:
rfm

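| | CustomerID | Recency | Frequency | Monetary | R_Score | F_Score | M_Score | RFM_Score |
|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 5 | 2 | 250 | 4 | 2 | 1 | 421 |
| 1 | 2 | 10 | 2 | 450 | 3 | 4 | 4 | 344 |
| 2 | 3 | 56 | 1 | 300 | 1 | 1 | 2 | 112 |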


## RFM Analysis

### Customer 0 (CustomerID 1)
- **Recency (R_Score: 4)**: This customer purchased most recently, 5 days before the reference date.
- **Frequency (F_Score: 2)**: They made two purchases, the same count as Customer 1. The lower score comes only from how rank(method='first') breaks the tie.
- **Monetary (M_Score: 1)**: Their total spending (250) is the lowest of the three customers.
- **RFM Score: 421**: This customer is the most recent buyer but also the lowest spender.

### Customer 1 (CustomerID 2)
- **Recency (R_Score: 3)**: This customer purchased somewhat recently, 10 days before the reference date.
- **Frequency (F_Score: 4)**: They made two purchases, the highest count in this sample (tied with Customer 0).
- **Monetary (M_Score: 4)**: Their total spending (450) is the highest of the three customers.
- **RFM Score: 344**: This customer is frequent and high-spending, but not the most recent.

### Customer 2 (CustomerID 3)
- **Recency (R_Score: 1)**: This customer has not purchased in a long time; their last purchase was 56 days before the reference date.
- **Frequency (F_Score: 1)**: They made a single purchase.
- **Monetary (M_Score: 2)**: Their total spending (300) sits in the middle of the three customers.
- **RFM Score: 112**: This customer is infrequent, not recent, and has moderate spending.

## Insights

- **Customer 0**: This customer might be a new or occasional buyer. They have made a recent purchase, so it could be beneficial to engage them with follow-up offers or incentives to increase their frequency and spending.
- **Customer 1**: This is a valuable customer who buys frequently and spends a lot. Maintaining a strong relationship with them through loyalty programs, personalized offers, and excellent customer service could be key to retaining them.
- **Customer 2**: This customer is at risk of churning. They haven't purchased in a while and don't buy often. Re-engagement strategies such as special discounts, reminders, or personalized communication might help bring them back.
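
The reasoning behind these insights can be turned into simple rules. The sketch below is one possible way to do that: the label_segment function, its thresholds, and the segment names are all hypothetical choices made for illustration, not a standard RFM scheme. It assumes the scores are stored as plain integers; the categorical columns produced by pd.qcut would first need converting with astype(int).

```python
import pandas as pd

# Scores from the final RFM table, stored as plain integers
rfm_scores = pd.DataFrame({
    'CustomerID': [1, 2, 3],
    'R_Score': [4, 3, 1],
    'F_Score': [2, 4, 1],
    'M_Score': [1, 4, 2],
})

def label_segment(row):
    """Hypothetical rules; the thresholds are illustrative only."""
    if row['R_Score'] <= 2 and row['F_Score'] <= 2:
        return 'At risk'            # not recent and not frequent
    if row['F_Score'] >= 3 and row['M_Score'] >= 3:
        return 'High value'         # frequent and high-spending
    if row['R_Score'] >= 3:
        return 'Recent, low spend'  # recent but not yet valuable
    return 'Other'

rfm_scores['Segment'] = rfm_scores.apply(label_segment, axis=1)
print(rfm_scores[['CustomerID', 'Segment']])
```

With these rules, CustomerID 1 is labelled "Recent, low spend", CustomerID 2 "High value", and CustomerID 3 "At risk", which matches the three insights above.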
