<a href="https://colab.research.google.com/github/dk-wei/customer-driven-ds/blob/main/RFM_Analysis_(Customer_Segmentation).ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [64]:
!pip install ipython-autotime
%load_ext autotime

import matplotlib as mpl
import matplotlib.pyplot as plt
%matplotlib inline

import numpy as np
import sklearn
import pandas as pd
import os
import sys
import time
import tensorflow as tf
import datetime as dt

from tensorflow import keras
%load_ext tensorboard

print(tf.__version__)
print(sys.version_info)
for module in mpl, np, pd, sklearn, tf, keras:
    print(module.__name__, module.__version__)

The autotime extension is already loaded. To reload it, use:
  %reload_ext autotime
The tensorboard extension is already loaded. To reload it, use:
  %reload_ext tensorboard
2.5.0
sys.version_info(major=3, minor=7, micro=11, releaselevel='final', serial=0)
matplotlib 3.2.2
numpy 1.19.5
pandas 1.1.5
sklearn 0.22.2.post1
tensorflow 2.5.0
tensorflow.keras 2.5.0
time: 2.87 s (started: 2021-07-26 04:06:21 +00:00)


In [63]:
# df_purchase = pd.read_csv('purchase data.csv')

from google.colab import drive
drive.mount('/content/drive')

# Load the data from the E-Commerce Data csv file.
GD_PATH = '/content/drive/MyDrive/扬FAANG起航/单项准备/customer analytics/'
retail_df = pd.read_csv(GD_PATH+'E-Commerce Data.csv',encoding="ISO-8859-1",dtype={'CustomerID': str,'InvoiceNo': str})

#load data
#df_purchase = pd.read_csv(GD_PATH+'E-Commerce Data.csv')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).
time: 795 ms (started: 2021-07-26 04:06:19 +00:00)


In [65]:
#load the dataset
#retail_df = pd.read_csv('https://raw.githubusercontent.com/iyappan24/E-commerce-Data-Analysis-/master/Ecommerce%20Customers',encoding="ISO-8859-1",dtype={'CustomerID': str,'InvoiceID': str})

retail_df.head()


Unnamed: 0,InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country
0,536365,85123A,WHITE HANGING HEART T-LIGHT HOLDER,6,12/1/2010 8:26,2.55,17850,United Kingdom
1,536365,71053,WHITE METAL LANTERN,6,12/1/2010 8:26,3.39,17850,United Kingdom
2,536365,84406B,CREAM CUPID HEARTS COAT HANGER,8,12/1/2010 8:26,2.75,17850,United Kingdom
3,536365,84029G,KNITTED UNION FLAG HOT WATER BOTTLE,6,12/1/2010 8:26,3.39,17850,United Kingdom
4,536365,84029E,RED WOOLLY HOTTIE WHITE HEART.,6,12/1/2010 8:26,3.39,17850,United Kingdom


time: 27 ms (started: 2021-07-26 04:06:24 +00:00)


# Prepare the Data

As customer clusters may vary by geography, I'll restrict the data to United Kingdom customers only, who account for most of our customers' historical data.

In [66]:
retail_uk = retail_df[retail_df['Country']=='United Kingdom']
#check the shape
retail_uk.shape

(495478, 8)

time: 85.2 ms (started: 2021-07-26 04:06:24 +00:00)


In [67]:
#remove canceled orders
retail_uk = retail_uk[retail_uk['Quantity']>0]
retail_uk.shape

(486286, 8)

time: 61.6 ms (started: 2021-07-26 04:06:24 +00:00)


In [68]:
#remove rows where CustomerID is NA
retail_uk.dropna(subset=['CustomerID'],how='all',inplace=True)
retail_uk.shape

(354345, 8)

time: 67.4 ms (started: 2021-07-26 04:06:24 +00:00)


In [69]:
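#restrict the data to one full year because it's better to use a metric per month or year in RFM
#NOTE: InvoiceDate is still a string ('M/D/YYYY H:MM'), so this comparison is lexicographic, not chronological:
#it keeps only strings starting with '3' to '9' (March to September). Parse with pd.to_datetime first for a true date cutoff.
retail_uk = retail_uk[retail_uk['InvoiceDate']>= "2010-12-09"]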
retail_uk.shape

(176137, 8)

time: 47.2 ms (started: 2021-07-26 04:06:24 +00:00)
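
Because `InvoiceDate` is read as text, a chronological cutoff needs a parsed datetime column. The following is a minimal sketch on a toy frame, not the notebook's actual cell: the rows and the `invoice_ts` column name are made up for illustration, and the format string assumes the `M/D/YYYY H:MM` layout shown in `retail_df.head()`.

```python
import pandas as pd

# Toy rows in the same text layout as InvoiceDate (values are illustrative only).
toy = pd.DataFrame({
    "InvoiceDate": ["12/1/2010 8:26", "12/9/2010 10:00", "3/1/2011 8:30", "11/15/2011 9:00"],
})

# Lexicographic comparison: keeps only strings that sort after "2010-12-09".
string_mask = toy["InvoiceDate"] >= "2010-12-09"

# Chronological comparison: parse first, then compare against a Timestamp.
# `invoice_ts` is a hypothetical column name introduced for this sketch.
toy["invoice_ts"] = pd.to_datetime(toy["InvoiceDate"], format="%m/%d/%Y %H:%M")
date_mask = toy["invoice_ts"] >= pd.Timestamp("2010-12-09")

print(string_mask.tolist())  # string comparison result
print(date_mask.tolist())    # date comparison result
```

The two masks disagree on three of the four toy rows (all but the first), which is why the row count after this filter should be read with care.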


In [70]:
print("Summary..")
#exploring the unique values of each attribute
print("Number of transactions: ", retail_uk['InvoiceNo'].nunique())
print("Number of products bought: ",retail_uk['StockCode'].nunique())
print("Number of customers:", retail_uk['CustomerID'].nunique() )
print("Percentage of customers NA: ", round(retail_uk['CustomerID'].isnull().sum() * 100 / len(retail_uk),2),"%" )

Summary..
Number of transactions:  8789
Number of products bought:  3294
Number of customers: 2864
Percentage of customers NA:  0.0 %
time: 45.2 ms (started: 2021-07-26 04:06:24 +00:00)


# RFM Analysis

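In some marketing scenarios, treating every customer with the same approach or strategy is not appropriate (similar to price discrimination). We therefore use customer data to analyze behavior and spending tendencies, build user profiles, and attach labels that can be applied to different vendors. By dividing customers into groups with distinct characteristics and using **limited company resources to serve the most important customers first**, we increase customer stickiness and build loyal relationships on both sides.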

RFM (Recency, Frequency, Monetary) analysis is a customer segmentation technique that uses past purchase behavior to divide customers into groups. RFM helps divide customers into various categories or clusters to identify customers who are more likely to respond to promotions and also for future personalization services.

We mainly look at one year of data.

- `RECENCY (R)`: Days since last purchase (weekly variant: how many weeks ago the last purchase was)
- `FREQUENCY (F)`: Total number of purchases (weekly variant: number of weeks with a purchase during the year, out of 52)
- `MONETARY VALUE (M)`: Total money this customer spent last year

We will create these 3 attributes for each customer.

![](http://image.woshipm.com/wp-files/2020/04/qvYutFYaLtunTWhBOHdQ.jpg)



## Recency

To calculate recency, we need to choose a date point from which we evaluate how many days ago was the customer's last purchase.

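We generally assume that **the closer a customer's most recent purchase is to today, the more active and valuable that customer currently is, the more likely they are to buy again, and the easier it is to maintain the relationship.**

Because the dataset corresponds to a B2B (`TO_B`) business, we define a customer as active in a given week if they **made a purchase during that week and the sales amount exceeded a minimum threshold**.

Customers can be divided into four groups, which perform as follows:    
![](http://image.woshipm.com/wp-files/2020/04/MY6vbyRd0dUgT4hiUSZ7.jpg)

**We should also compare customer share against sales contribution for each group.**

Some insights:
1. More than 85% of customers purchased at least once in the last six months.

2. Customers whose most recent purchase was 1 to 4 weeks ago make up 10% of customers but account for nearly 50% of sales.

3. Customers whose most recent purchase was more than 9 months ago generate almost no sales.

This suggests that customers in the Recent_C group are likely about to leave, or have already left. They can still be reactivated, but it may not be worth spending much on them, **because the return on investment (ROI) for this group is probably relatively low**. In other words, given the same resources invested in two groups, winning back the higher-ROI group will most likely produce the larger return.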
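
The customer-share versus sales-share comparison can be sketched as follows. This is only an illustration under stated assumptions: the toy table, the bucket edges in days, and the `Recent_S`/`Recent_A`/`Recent_B` labels are hypothetical (only `Recent_C` is named in the text).

```python
import pandas as pd

# Toy per-customer table (made-up numbers): Recency in days, Monetary in currency units.
toy_rfm = pd.DataFrame({
    "Recency":  [5, 20, 40, 100, 200, 300],
    "Monetary": [500.0, 300.0, 100.0, 60.0, 30.0, 10.0],
})

# Hypothetical bucket edges (days) and labels; only Recent_C appears in the text.
bins = [0, 28, 90, 270, float("inf")]
labels = ["Recent_S", "Recent_A", "Recent_B", "Recent_C"]
toy_rfm["R_group"] = pd.cut(toy_rfm["Recency"], bins=bins, labels=labels, include_lowest=True)

# Count customers and sum sales per recency bucket.
summary = toy_rfm.groupby("R_group", observed=False).agg(
    customers=("Recency", "size"),
    sales=("Monetary", "sum"),
)
# Convert both columns to shares of the total so they can be compared side by side.
summary["customer_share"] = summary["customers"] / summary["customers"].sum()
summary["sales_share"] = summary["sales"] / summary["sales"].sum()
print(summary)
```

A bucket whose `sales_share` is far above its `customer_share` is the kind of group the insights above single out.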



In [71]:
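#last date available in our dataset (InvoiceDate is still a string, so this is the lexicographic maximum, not necessarily the latest date)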
retail_uk['InvoiceDate'].max()

'9/9/2011 9:52'

time: 24.3 ms (started: 2021-07-26 04:06:24 +00:00)


In [72]:
#create a new column called date which contains the date of invoice only
retail_uk['date'] = pd.DatetimeIndex(retail_uk['InvoiceDate']).date

time: 16.8 s (started: 2021-07-26 04:06:24 +00:00)


In [73]:
retail_uk.head()

Unnamed: 0,InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country,date
105335,545220,21955,DOORMAT UNION JACK GUNS AND ROSES,2,3/1/2011 8:30,7.95,14620,United Kingdom,2011-03-01
105336,545220,48194,DOORMAT HEARTS,2,3/1/2011 8:30,7.95,14620,United Kingdom,2011-03-01
105337,545220,22556,PLASTERS IN TIN CIRCUS PARADE,12,3/1/2011 8:30,1.65,14620,United Kingdom,2011-03-01
105338,545220,22139,RETROSPOT TEA SET CERAMIC 11 PC,3,3/1/2011 8:30,4.95,14620,United Kingdom,2011-03-01
105339,545220,84029G,KNITTED UNION FLAG HOT WATER BOTTLE,4,3/1/2011 8:30,3.75,14620,United Kingdom,2011-03-01


time: 30 ms (started: 2021-07-26 04:06:41 +00:00)


In [74]:
#group by customers and check last date of purchase
recency_df = retail_uk.groupby(by='CustomerID', as_index=False)['date'].max()
recency_df.columns = ['CustomerID','LastPurshaceDate']
recency_df.head()

Unnamed: 0,CustomerID,LastPurshaceDate
0,12747,2011-08-22
1,12748,2011-09-30
2,12749,2011-08-01
3,12820,2011-09-26
4,12821,2011-05-09


time: 531 ms (started: 2021-07-26 04:06:41 +00:00)


In [75]:
now = dt.date(2011,12,9)
print(now)


2011-12-09
time: 1.04 ms (started: 2021-07-26 04:06:42 +00:00)


In [76]:
#calculate recency
recency_df['Recency'] = recency_df['LastPurshaceDate'].apply(lambda x: (now - x).days)

time: 7.57 ms (started: 2021-07-26 04:06:42 +00:00)


In [77]:
recency_df.head()

Unnamed: 0,CustomerID,LastPurshaceDate,Recency
0,12747,2011-08-22,109
1,12748,2011-09-30,70
2,12749,2011-08-01,130
3,12820,2011-09-26,74
4,12821,2011-05-09,214


time: 18.6 ms (started: 2021-07-26 04:06:42 +00:00)


In [78]:
#drop LastPurchaseDate as we don't need it anymore
recency_df.drop('LastPurshaceDate',axis=1,inplace=True)

time: 3.97 ms (started: 2021-07-26 04:06:42 +00:00)
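
As an alternative to the row-wise `apply`, recency can be computed with vectorized datetime arithmetic. A minimal sketch, assuming the last-purchase dates are available as parseable strings; the three rows reuse values from the `recency_df.head()` output above, and the `LastPurchase` column name is introduced only for this example.

```python
import datetime as dt
import pandas as pd

# Three customers with last-purchase dates taken from the output above.
toy = pd.DataFrame({
    "CustomerID": ["12747", "12748", "12820"],
    "LastPurchase": ["2011-08-22", "2011-09-30", "2011-09-26"],
})

# Same reference point as `now` in the notebook, as a pandas Timestamp.
reference_date = pd.Timestamp(dt.date(2011, 12, 9))

# Timestamp minus a datetime64 Series yields timedeltas; .dt.days extracts whole days.
toy["Recency"] = (reference_date - pd.to_datetime(toy["LastPurchase"])).dt.days
print(toy)
```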


## Frequency

Frequency helps us to know how many times a customer purchased from us. To do that we need to check how many invoices are registered by the same customer.

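Frequency is how often a customer purchases over a period of time. If a customer makes any purchase whose order amount exceeds a base threshold, they are marked as active for that week. We assume that the more frequently a customer purchases, the higher their activity and transaction value.

![](http://image.woshipm.com/wp-files/2020/04/uje7ZTkbdh3RurY82U1X.jpg)

Clearly the Fre_S group is the most valuable: these customers purchase very frequently, and although they make up only 10% of customers they contribute 45% of sales.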

In [79]:
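# drop duplicates
# NOTE: plain assignment does not copy; retail_uk_copy and retail_uk are the same object, so the
# in-place drop below also removes rows from retail_uk. Use retail_uk.copy() to keep retail_uk intact.
retail_uk_copy = retail_uk
retail_uk_copy.drop_duplicates(subset=['InvoiceNo', 'CustomerID'], keep="first", inplace=True)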
#calculate frequency of purchases
frequency_df = retail_uk_copy.groupby(by=['CustomerID'], as_index=False)['InvoiceNo'].count()
frequency_df.columns = ['CustomerID','Frequency']
frequency_df.head()

Unnamed: 0,CustomerID,Frequency
0,12747,5
1,12748,96
2,12749,3
3,12820,1
4,12821,1


time: 56.7 ms (started: 2021-07-26 04:06:42 +00:00)
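
Counting distinct invoices per customer does not require de-duplicating the frame at all. The sketch below uses a made-up toy frame to show both the aliasing pitfall of plain assignment and a `nunique`-based frequency; it is an alternative formulation, not the cell that produced the output above.

```python
import pandas as pd

# Toy line items (made-up): invoice A1 has two lines, so it must count once.
lines = pd.DataFrame({
    "CustomerID": ["c1", "c1", "c1", "c2"],
    "InvoiceNo":  ["A1", "A1", "A2", "B1"],
    "Quantity":   [2, 1, 5, 3],
    "UnitPrice":  [1.0, 4.0, 2.0, 10.0],
})

# Plain assignment only creates a second name for the same object...
alias = lines
assert alias is lines

# ...whereas .copy() gives an independent frame that is safe to modify in place.
independent = lines.copy()
independent.drop_duplicates(subset=["InvoiceNo", "CustomerID"], inplace=True)

# Distinct invoices per customer, computed without touching `lines`.
frequency = (
    lines.groupby("CustomerID")["InvoiceNo"].nunique()
    .rename("Frequency")
    .reset_index()
)
print(len(lines), len(independent))
print(frequency)
```

Keeping every line item in the original frame matters for any later step that sums per-line amounts.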


## Monetary

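The monetary attribute answers the question: how much money did the customer spend over time?

To do that, we first create a new column, TotalCost, holding the total price of each invoice line (quantity × unit price).

Monetary value is the total amount a customer spent over a period of time. Spend has always been a core business metric. Depending on the need, sales revenue, actual gross margin, or a similar measure can be used here.

![](http://image.woshipm.com/wp-files/2020/04/Wa8VqAa29nCCL79VArfm.jpg)

Some insights:
1. The top 10% of customers contribute almost 70% of sales.
2. The bottom 40% of customers are numerous but generate almost no revenue.

The Money_S group makes up only 10% of customers yet contributes 70% of total sales. The threshold for entering the S group is actually not that high: annual sales above ￥xxxxx are enough to join it.

One application scenario here is membership upgrades: your manager asks you to estimate how much customer upgrades would lift overall sales.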

In [80]:
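#create column total cost (note: after the in-place drop_duplicates above, retail_uk holds one row per invoice, so only that row's amount is counted)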
retail_uk['TotalCost'] = retail_uk['Quantity'] * retail_uk['UnitPrice']

time: 5.07 ms (started: 2021-07-26 04:06:42 +00:00)


In [81]:
monetary_df = retail_uk.groupby(by='CustomerID',as_index=False).agg({'TotalCost': 'sum'})
monetary_df.columns = ['CustomerID','Monetary']
monetary_df.head()

Unnamed: 0,CustomerID,Monetary
0,12747,191.85
1,12748,1054.43
2,12749,67.0
3,12820,15.0
4,12821,19.92


time: 22.6 ms (started: 2021-07-26 04:06:42 +00:00)


# Create RFM Table


In [82]:
#merge recency dataframe with frequency dataframe
temp_df = recency_df.merge(frequency_df,on='CustomerID')
temp_df.head()

Unnamed: 0,CustomerID,Recency,Frequency
0,12747,109,5
1,12748,70,96
2,12749,130,3
3,12820,74,1
4,12821,214,1


time: 23 ms (started: 2021-07-26 04:06:42 +00:00)


In [83]:
#merge with monetary dataframe to get a table with the 3 columns
rfm_df = temp_df.merge(monetary_df,on='CustomerID')
#use CustomerID as index
rfm_df.set_index('CustomerID',inplace=True)
#check the head
rfm_df.head()

CustomerID,Recency,Frequency,Monetary
12747,109,5,191.85
12748,70,96,1054.43
12749,130,3,67.0
12820,74,1,15.0
12821,214,1,19.92


time: 22.7 ms (started: 2021-07-26 04:06:42 +00:00)
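
The three per-customer tables and two merges can also be expressed as one `groupby` with named aggregation. This is a sketch on a made-up toy frame, assuming a parsed datetime column (called `invoice_ts` here, a hypothetical name) and line items that have not been de-duplicated.

```python
import pandas as pd

# Toy line items (made-up values); `invoice_ts` is assumed to be already parsed.
lines = pd.DataFrame({
    "CustomerID": ["c1", "c1", "c1", "c2"],
    "InvoiceNo":  ["A1", "A1", "A2", "B1"],
    "invoice_ts": pd.to_datetime(["2011-03-01", "2011-03-01", "2011-06-15", "2011-09-26"]),
    "Quantity":   [2, 1, 5, 3],
    "UnitPrice":  [1.0, 4.0, 2.0, 10.0],
})
lines["TotalCost"] = lines["Quantity"] * lines["UnitPrice"]
reference_date = pd.Timestamp("2011-12-09")

# One pass over the data: last purchase, distinct invoices, total spend.
rfm = lines.groupby("CustomerID").agg(
    LastPurchase=("invoice_ts", "max"),
    Frequency=("InvoiceNo", "nunique"),
    Monetary=("TotalCost", "sum"),
)
# Recency in whole days relative to the reference date.
rfm["Recency"] = (reference_date - rfm["LastPurchase"]).dt.days
rfm = rfm[["Recency", "Frequency", "Monetary"]]
print(rfm)
```

Because all three columns come from the same grouped frame, there is no risk of the tables drifting out of sync between merges.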


## RFM Table Correctness verification

In [84]:
retail_uk[retail_uk['CustomerID']=='12820']

Unnamed: 0,InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,CustomerID,Country,date,TotalCost
360567,568236,23328,SET 6 SCHOOL MILK BOTTLES IN CRATE,4,9/26/2011 11:49,3.75,12820,United Kingdom,2011-09-26,15.0


time: 33.4 ms (started: 2021-07-26 04:06:42 +00:00)


In [85]:
(now - dt.date(2011,9,26)).days == 74

True

time: 3.94 ms (started: 2021-07-26 04:06:42 +00:00)


# Customer segments with RFM Model


The simplest way to create customer segments from the RFM model is to use quartiles. We assign a score from 1 to 4 to each of Recency, Frequency and Monetary, where 4 is the best/highest value and 1 is the lowest/worst. The final RFM score is formed by concatenating the three individual scores.

Note: Quintiles (scores from 1 to 5) offer finer granularity if the business needs it, but segments become harder to define because there are 5 × 5 × 5 = 125 possible combinations, compared with 4 × 4 × 4 = 64 for quartiles. So, we will use quartiles.

Once we have the R, F and M values, the next step is **binning**: each dimension can be split into scores 1 to 4, or into four groups S, A, B, C.

In [86]:
quantiles = rfm_df.quantile(q=[0.25,0.5,0.75])
quantiles

Unnamed: 0,Recency,Frequency,Monetary
0.25,85.0,1.0,16.35
0.5,119.0,2.0,35.4
0.75,183.0,3.0,92.42


time: 24.6 ms (started: 2021-07-26 04:06:42 +00:00)


In [87]:
quantiles.to_dict()

{'Frequency': {0.25: 1.0, 0.5: 2.0, 0.75: 3.0},
 'Monetary': {0.25: 16.35, 0.5: 35.400000000000006, 0.75: 92.42000000000002},
 'Recency': {0.25: 85.0, 0.5: 119.0, 0.75: 183.0}}

time: 17.1 ms (started: 2021-07-26 04:06:42 +00:00)


## Creation of RFM Segments

We will create two scoring functions, since high recency is bad while high frequency and monetary value are good.

## RFM Quartiles / Binning

In [88]:
# Arguments (x = value, p = column name: 'Recency', 'Frequency' or 'Monetary', d = quartiles dict)
def RScore(x,p,d):
    if x <= d[p][0.25]:
        return 4
    elif x <= d[p][0.50]:
        return 3
    elif x <= d[p][0.75]: 
        return 2
    else:
        return 1
# Arguments (x = value, p = column name: 'Recency', 'Frequency' or 'Monetary', d = quartiles dict)
def FMScore(x,p,d):
    if x <= d[p][0.25]:
        return 1
    elif x <= d[p][0.50]:
        return 2
    elif x <= d[p][0.75]: 
        return 3
    else:
        return 4

time: 8.94 ms (started: 2021-07-26 04:06:42 +00:00)
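
The same binning rule can be written in vectorized form with `np.searchsorted`. The helper name `quartile_scores` and its `higher_is_better` flag are hypothetical, introduced only for this sketch; the sample values and quartile edges are copied from the outputs above.

```python
import numpy as np
import pandas as pd

def quartile_scores(values, edges, higher_is_better=True):
    """Score 1-4 from three ascending quartile edges (q25, q50, q75)."""
    # side="left" counts the edges strictly below each value, so a value equal
    # to an edge falls in the lower bin, matching the `x <= edge` tests above.
    idx = np.searchsorted(
        np.asarray(edges, dtype=float),
        np.asarray(values, dtype=float),
        side="left",
    )
    # Frequency/Monetary: higher bin is better. Recency: lower bin is better.
    return idx + 1 if higher_is_better else 4 - idx

recency = pd.Series([109, 70, 130, 74, 214])
frequency = pd.Series([5, 96, 3, 1, 1])

r_scores = quartile_scores(recency, [85.0, 119.0, 183.0], higher_is_better=False)
f_scores = quartile_scores(frequency, [1.0, 2.0, 3.0])
print(r_scores.tolist(), f_scores.tolist())
```

Unlike interval-based binning, this formulation does not require the quartile edges to be unique, which can matter for heavily tied columns such as Frequency.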


With the scoring functions defined, we can assign each customer a quartile score (R_Quartile, F_Quartile, M_Quartile) and then combine the three scores to represent our customer segmentation.

In [89]:
#create rfm segmentation table
rfm_segmentation = rfm_df.copy()  # copy so the score columns are not added to rfm_df as well
rfm_segmentation['R_Quartile'] = rfm_segmentation['Recency'].apply(RScore, args=('Recency',quantiles,))
rfm_segmentation['F_Quartile'] = rfm_segmentation['Frequency'].apply(FMScore, args=('Frequency',quantiles,))
rfm_segmentation['M_Quartile'] = rfm_segmentation['Monetary'].apply(FMScore, args=('Monetary',quantiles,))

time: 180 ms (started: 2021-07-26 04:06:42 +00:00)


In [90]:
rfm_segmentation.head()

CustomerID,Recency,Frequency,Monetary,R_Quartile,F_Quartile,M_Quartile
12747,109,5,191.85,3,4,4
12748,70,96,1054.43,4,4,4
12749,130,3,67.0,2,3,3
12820,74,1,15.0,4,1,1
12821,214,1,19.92,1,1,2


time: 17.3 ms (started: 2021-07-26 04:06:42 +00:00)


Best Recency score = 4: purchased most recently. Best Frequency score = 4: purchased most often. Best Monetary score = 4: spent the most.

Let's see who our Champions (best customers) are.



In [91]:
rfm_segmentation['RFMScore'] = rfm_segmentation.R_Quartile.map(str) \
                            + rfm_segmentation.F_Quartile.map(str) \
                            + rfm_segmentation.M_Quartile.map(str)
rfm_segmentation.head()

CustomerID,Recency,Frequency,Monetary,R_Quartile,F_Quartile,M_Quartile,RFMScore
12747,109,5,191.85,3,4,4,344
12748,70,96,1054.43,4,4,4,444
12749,130,3,67.0,2,3,3,233
12820,74,1,15.0,4,1,1,411
12821,214,1,19.92,1,1,2,112


time: 30.9 ms (started: 2021-07-26 04:06:42 +00:00)


Select the group with an RFM score of `444`: these are our SVIP customers.

In [92]:
rfm_segmentation[rfm_segmentation['RFMScore']=='444'].sort_values('Monetary', ascending=False).head(10)

CustomerID,Recency,Frequency,Monetary,R_Quartile,F_Quartile,M_Quartile,RFMScore
18102,72,34,26632.62,4,4,4,444
17949,70,32,22504.73,4,4,4,444
17450,70,28,18009.06,4,4,4,444
16029,80,39,15119.49,4,4,4,444
16013,70,24,10402.34,4,4,4,444
12901,81,20,5915.66,4,4,4,444
13798,72,34,4648.8,4,4,4,444
17857,72,12,4644.68,4,4,4,444
13694,71,32,4472.68,4,4,4,444
15061,73,23,3417.7,4,4,4,444


time: 32.4 ms (started: 2021-07-26 04:06:42 +00:00)


A suggestion of key segments can be found here, and from it we can decide which segments to consider for further study.

Note: the suggested link uses the opposite valuation: 1 is the highest/best score and 4 is the lowest.

How many customers do we have in each segment?



In [93]:
print("Best Customers: ",len(rfm_segmentation[rfm_segmentation['RFMScore']=='444']))
print('Loyal Customers: ',len(rfm_segmentation[rfm_segmentation['F_Quartile']==4]))
print("Big Spenders: ",len(rfm_segmentation[rfm_segmentation['M_Quartile']==4]))
print('Almost Lost: ', len(rfm_segmentation[rfm_segmentation['RFMScore']=='244']))
print('Lost Customers: ',len(rfm_segmentation[rfm_segmentation['RFMScore']=='144']))
print('Lost Cheap Customers: ',len(rfm_segmentation[rfm_segmentation['RFMScore']=='111']))

Best Customers:  218
Loyal Customers:  687
Big Spenders:  716
Almost Lost:  52
Lost Customers:  5
Lost Cheap Customers:  278
time: 23.6 ms (started: 2021-07-26 04:06:42 +00:00)
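
The segment rules in the print statements above can be collected into one dictionary of boolean masks, which keeps the rules in a single place and makes the counts reusable. A minimal sketch on a made-up scored table; the column names match `rfm_segmentation`, while the `segment_masks` and `segment_counts` names are introduced here.

```python
import pandas as pd

# Toy scored table using the same column names as rfm_segmentation (rows are made up).
toy = pd.DataFrame({
    "R_Quartile": [4, 2, 1, 1, 4],
    "F_Quartile": [4, 4, 4, 1, 1],
    "M_Quartile": [4, 4, 4, 1, 1],
})
toy["RFMScore"] = (
    toy["R_Quartile"].astype(str)
    + toy["F_Quartile"].astype(str)
    + toy["M_Quartile"].astype(str)
)

# Segment rules copied from the print statements above; segments may overlap.
segment_masks = {
    "Best Customers": toy["RFMScore"] == "444",
    "Loyal Customers": toy["F_Quartile"] == 4,
    "Big Spenders": toy["M_Quartile"] == 4,
    "Almost Lost": toy["RFMScore"] == "244",
    "Lost Customers": toy["RFMScore"] == "144",
    "Lost Cheap Customers": toy["RFMScore"] == "111",
}
segment_counts = pd.Series({name: int(mask.sum()) for name, mask in segment_masks.items()})
print(segment_counts)
```

Note that the segments are not mutually exclusive: a `444` customer is also counted under Loyal Customers and Big Spenders.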


Now that we know our customer segments, we can choose how to target or deal with each segment.

For example:

- Best Customers - Champions: Reward them. They can be early adopters of new products. Suggest "Refer a friend" to them.

- At Risk: Send them personalized emails to encourage them to shop.

More ideas about which actions to perform can be found at Ometria.


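From the analysis above we get three simple labels: most recent purchase, purchase frequency, and purchase amount.

Many analysts like to assign weights to the three right away and combine them into a single score, and there are scenarios where that works. But a label is not a one-off analysis done for a single report or campaign. Its main value is as a data asset: kept in a customer label library, it is available on demand for any business or analysis scenario.

Here we combine the three:    
![](http://image.woshipm.com/wp-files/2020/04/MBYAiWWnVedJloBeNHzU.jpg)

We can see how customers under each label perform and then analyze them against the specific business goal.
![](http://image.woshipm.com/wp-files/2020/04/fQk1Q2v2zS9E1b8EO8kZ.jpg)


We pick three important label groups to demonstrate:    
![](http://image.woshipm.com/wp-files/2020/04/EcFweh7ZPkiwC7dqNeol.jpg)

Likewise, we can assign a score to each group within a label, give each label a weight, and compute an overall score.

Alternatively, each RFM dimension can be split into just two groups, above or below the mean. Depending on whether each of R, F and M is better than its average, this yields 2 × 2 × 2 = 8 customer groups. The ideal customer has a short interval since the last purchase (R), a high purchase frequency (F) and a large purchase amount (M); such customers can be considered the most valuable. With 1 meaning better than the mean and 0 meaning worse, the 8 customer types are:
![](https://www.truemetrics.cn/articles/wp-content/uploads/2020/03/%E6%AD%A3%E6%96%87%E5%9B%BE2.png)

The share of customers in each group: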
![](https://www.truemetrics.cn/articles/wp-content/uploads/2020/03/%E6%AD%A3%E6%96%87%E5%9B%BE10.png)
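
The mean-based 2 × 2 × 2 split and the resulting group shares can be sketched as below. The toy values and the `RFM_flags` column name are made up for illustration; the flag order R, F, M follows the text.

```python
import pandas as pd

# Toy RFM table (made-up values).
toy = pd.DataFrame({
    "Recency":   [10, 200, 30, 150],
    "Frequency": [12, 1, 2, 9],
    "Monetary":  [900.0, 50.0, 40.0, 700.0],
})

# 1 = better than the mean, 0 = not better. Lower recency is better,
# so its comparison is flipped relative to frequency and monetary value.
r_flag = (toy["Recency"] < toy["Recency"].mean()).astype(int)
f_flag = (toy["Frequency"] > toy["Frequency"].mean()).astype(int)
m_flag = (toy["Monetary"] > toy["Monetary"].mean()).astype(int)

# Concatenate the three flags into a group code such as "111" or "100".
toy["RFM_flags"] = r_flag.astype(str) + f_flag.astype(str) + m_flag.astype(str)

# Share of customers in each of the (up to 8) groups.
group_share = toy["RFM_flags"].value_counts(normalize=True)
print(toy)
print(group_share)
```

On skewed data the median may be a more robust cut point than the mean; which one to use is a business choice rather than something the text prescribes.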


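## Building the Final Dashboard

At a more advanced level, we can connect Power BI / Tableau (or another interactive data visualization tool) directly to the business database to pull in more source data, join customer attributes such as age, gender and region, and use interactive reports to quickly view metrics such as the gender distribution of loyal customers or the regional distribution of churned customers, segmenting customers along multiple dimensions.

Going further, we can add slicers for the RFM parameters and time parameters, so that the date range, score bands and reference date can be adjusted directly on the report page, automating the RFM model across different time windows.
![](https://www.truemetrics.cn/articles/wp-content/uploads/2020/03/%E6%AD%A3%E6%96%87%E5%9B%BE11-1024x572.png)

# Application Scenario: Membership Upgrades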

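One application scenario is membership upgrades: your manager asks you to estimate how much customer upgrades would lift overall sales.

Upgrade estimate based on the Monetary dimension

![](http://image.woshipm.com/wp-files/2020/04/SNfcMnW7bg8LkUdHWOmf.jpg)

In practice, a customer's tier is not easy to change. On the other hand, successfully moving customers up a tier through effective strategies can bring substantial business growth.

My estimate rests on the following logic: the top customers of each group have the best chance of moving up to the tail of the next group. For example, the top 25% of group C by spend have a chance of joining the tail of group B, and the top 25% of group B have a chance of joining the tail of group A.

![](http://image.woshipm.com/wp-files/2020/04/tQEL71bPag2q0HyW8bIE.jpg)

For the upgrade from the head of group B to the tail of group A, a customer only needs to spend ￥2,000 more per year, a goal that a few simple tactics can readily achieve. I therefore assign this tier a conversion rate of 80%, which yields ￥23M.

For the upgrade from the head of group A to the tail of group S, a customer needs to spend ￥11,000 more, which is harder. I therefore assign this tier a conversion rate of 50%, which yields ￥43.5M.

These two relatively feasible moves alone could add roughly ￥66.6M in sales (the rounded figures above sum to ￥66.5M), and the cost is probably little more than loyalty points, titles and the like. Raising sales further may require sharing part of the gain with customers.

Naturally there is more: mid-tier A customers can be upgraded to head customers, bottom-tier S customers to mid-tier, and so on. The activity dimensions R and F also leave plenty of room for improvement strategies. Customers are always a rich source of value.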
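
One way to structure such an estimate is sketched below. The formula (customers × extra spend × conversion rate) is an assumption about how the figures could be derived, not a formula stated in the source, and the customer counts are hypothetical placeholders; only the extra-spend amounts and conversion rates come from the text.

```python
def expected_uplift(n_customers: int, extra_spend: float, conversion_rate: float) -> float:
    """Expected extra annual sales if a share of customers raise their spend."""
    return n_customers * extra_spend * conversion_rate

# The customer counts below are hypothetical placeholders, not figures from the source.
tiers = [
    {"move": "B head -> A tail", "n_customers": 1000, "extra_spend": 2000.0, "conversion_rate": 0.8},
    {"move": "A head -> S tail", "n_customers": 400, "extra_spend": 11000.0, "conversion_rate": 0.5},
]

# Uplift per tier move, then the combined total.
uplifts = {
    t["move"]: expected_uplift(t["n_customers"], t["extra_spend"], t["conversion_rate"])
    for t in tiers
}
total_uplift = sum(uplifts.values())
print(uplifts)
print(total_uplift)
```

Keeping the conversion rate as an explicit parameter makes it easy to test how sensitive the total is to that assumption.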

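# More Application Scenarios
1. Suppose we run a promotion aimed at existing customers, and we give R a weight of 30%, F a weight of 30% and M a weight of 40%.

   Multiply the score of the group each customer belongs to by the label weight and sum the results to get the customer's score, then prioritize customers in the appropriate score band.

2. If we run a **win-back campaign for churned customers**, we can **directly select customers in the R label's Recent_C group, or those inactive for even longer, who also have a high M score. This gives us the targets that are easier to win back**.

3. The RFM model can help raise customer lifetime value: based on actual transaction history, the RFM model is a quick way to understand current customer value. With a better understanding of each segment, we can improve the business so that customers move from churned → regular → loyal.

4. The RFM model minimizes marketing cost and improves return: targeting campaigns at customers who are more likely to respond, rather than at everyone, lowers marketing cost and also raises the conversion rate.

5. The RFM model improves customer loyalty: according to the Pareto principle (the 80/20 rule), 20% of customers bring 80% of the contribution. For high-value customers we can strengthen the relationship and the sense of belonging through thank-you letters, exclusive holiday benefits and similar gestures.

6. The RFM model reduces churn: dormant customers and those on the verge of churning deserve special attention. We need to identify them and bring them back in time through measures such as personalized emails, text messages, coupons or discounted products.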
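
The weighting in item 1 can be sketched as a row-wise weighted sum of the quartile scores. The five rows reuse the scores from the `rfm_segmentation.head()` output above; the `WeightedScore` column name is introduced only for this example.

```python
import pandas as pd

# Quartile scores for five customers, taken from the rfm_segmentation.head() output.
scores = pd.DataFrame(
    {
        "R_Quartile": [3, 4, 2, 4, 1],
        "F_Quartile": [4, 4, 3, 1, 1],
        "M_Quartile": [4, 4, 3, 1, 2],
    },
    index=["12747", "12748", "12749", "12820", "12821"],
)

# Weights from item 1: R 30%, F 30%, M 40%.
weights = pd.Series({"R_Quartile": 0.3, "F_Quartile": 0.3, "M_Quartile": 0.4})

# Row-wise weighted sum; DataFrame.dot aligns the columns with the weight index.
scores["WeightedScore"] = scores[weights.index].dot(weights)

# Highest-scoring customers first, ready for filtering by score band.
ranked = scores.sort_values("WeightedScore", ascending=False)
print(ranked)
```

For a different campaign, only the `weights` series needs to change; the scores themselves stay in the label table.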

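These labels have many applications across different business settings, and they can also be combined with additional labels and data in joint analyses to create more business value.

Do not overlook differences across time and region. Customers in different regions have different spending levels, so they can be discussed separately for each business scenario.

This report is based entirely on the author's own work experience and aims to be as detailed as possible. Questions about unclear points and suggestions for other directions to explore are welcome. As the opening figure shows, more customer-profiling reports and practical application methods will follow.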



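# Applying the RFM Model Flexibly

The RFM model originated in the traditional sales industry as a simple framework for quantifying customer behavior, and it is now also widely used in industries such as internet services and telecommunications. When using the model to segment customers, we need not limit ourselves to purchase time, purchase count and purchase amount; the model can be extended into variants.

For a news website:
![](https://www.truemetrics.cn/articles/wp-content/uploads/2020/03/%E6%AD%A3%E6%96%87%E5%9B%BE12.png)

For a BBS forum site:
![](https://www.truemetrics.cn/articles/wp-content/uploads/2020/03/%E6%AD%A3%E6%96%87%E5%9B%BE13.png)

Different platforms serve different kinds of users, so R, F and M take on different concrete meanings. Based on the author's business analysis experience, the recommendation is to choose key metrics that fit your own business, help operations colleagues understand customer behavior more deeply, provide a basis for their marketing strategies, and make operations data-driven.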