Let's build RFM user segmentation to qualitatively evaluate the audience.
In clustering, we will choose the following metrics: 
- R - time from the user's last purchase to the current date,
- F - the total number of purchases from the user for the entire time,
- M - the amount of purchases for the entire time.
For each RFM segment, we will compute the bounds of the recency, frequency and monetary metrics so that the clusters can be interpreted.
___
An example of such a description: RFM segment 132 (recency=1,
frequency=3, monetary=2) has recency between 130 and 500 days,
frequency between 2 and 5 orders per week, and monetary between 1780 and 3560 rubles per week.
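
A minimal sketch of how such segment bounds could be computed once every customer has a segment label. The toy table and its column names (`RFMClass`, `recency`, `frequency`, `monetary`) are assumptions for illustration, not the notebook's actual data.

```python
import pandas as pd

# Toy RFM table; values and column names are illustrative assumptions.
rfm = pd.DataFrame({
    "RFMClass":  ["132", "132", "511", "511", "511"],
    "recency":   [130, 500, 5, 20, 38],
    "frequency": [2, 5, 1, 1, 1],
    "monetary":  [1780.0, 3560.0, 10.0, 25.0, 39.0],
})

# The min and max of each metric inside a segment are that segment's bounds.
bounds = rfm.groupby("RFMClass")[["recency", "frequency", "monetary"]].agg(["min", "max"])
print(bounds)
```

The result has one row per segment and a (metric, min/max) column pair for each metric, which maps directly onto a description like the one above.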

In [1]:
import pandas as pd
from datetime import timedelta


In [2]:
customers = pd.read_csv('./olist_customers_dataset.csv',encoding='Windows-1251')

In [3]:
customers.describe() # statistics

Unnamed: 0,customer_zip_code_prefix
count,99441.0
mean,35137.474583
std,29797.938996
min,1003.0
25%,11347.0
50%,24416.0
75%,58900.0
max,99990.0


In [4]:
print('{:,} rows; {:,} columns'.format(customers.shape[0], customers.shape[1])) # print the table dimensions

99,441 rows; 5 columns


In [5]:
customers.dtypes # column types

customer_id                 object
customer_unique_id          object
customer_zip_code_prefix     int64
customer_city               object
customer_state              object
dtype: object

In [6]:
customers.isna().sum() # show the sum of empty fields in columns

customer_id                 0
customer_unique_id          0
customer_zip_code_prefix    0
customer_city               0
customer_state              0
dtype: int64

In [7]:
customers.head() # show first 5 rows

Unnamed: 0,customer_id,customer_unique_id,customer_zip_code_prefix,customer_city,customer_state
0,06b8999e2fba1a1fbc88172c00ba8bc7,861eff4711a542e4b93843c6dd7febb0,14409,franca,SP
1,18955e83d337fd6b2def6b18a428ac77,290c77bc529b7ac935b93aa66c333dc3,9790,sao bernardo do campo,SP
2,4e7b3e00288586ebd08712fdd0374a03,060e732b5b29e8181a18229c7b0b2b5e,1151,sao paulo,SP
3,b2b6027bc5c5109e529d4dc6358b12c3,259dac757896d24d7702b9acbbff3f3c,8775,mogi das cruzes,SP
4,4f2d8ab171c80ec8364f7c12e35b23ad,345ecd01c38d18a9036ed96c73b8d066,13056,campinas,SP


In [8]:
customers.customer_unique_id.nunique() # the number of unique customer_unique_id

96096

It seems that most of the customers made only one order.

In [9]:
customers.groupby('customer_unique_id').nunique().count() #Count number of distinct elements - customer_unique_id

customer_id                 96096
customer_unique_id          96096
customer_zip_code_prefix    96096
customer_city               96096
customer_state              96096
dtype: int64

In [10]:
customers.nunique() 

customer_id                 99441
customer_unique_id          96096
customer_zip_code_prefix    14994
customer_city                4119
customer_state                 27
dtype: int64

In [11]:
customers.groupby(['customer_unique_id']).agg({'customer_id':'count'})\
.query('customer_id > 10') # show the customer_unique_id who has > 10 orders

Unnamed: 0_level_0,customer_id
customer_unique_id,Unnamed: 1_level_1
8d50f5eadf50201ccdcedfb9e2ac8455,17
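
The claim that most customers placed only one order can be checked directly. With 99,441 `customer_id` values and 96,096 distinct `customer_unique_id` values, at most 3,345 customers can have more than one order, so at least roughly 96% ordered exactly once. The sketch below shows one way to compute the exact share; it runs on a toy frame (`customers_demo` is a hypothetical stand-in for `customers`).

```python
import pandas as pd

# Toy stand-in for the customers table: one row per order-level customer_id.
customers_demo = pd.DataFrame({
    "customer_id":        ["a1", "a2", "b1", "c1", "d1"],
    "customer_unique_id": ["A", "A", "B", "C", "D"],
})

# Number of orders (customer_id rows) per real customer.
orders_per_customer = customers_demo.groupby("customer_unique_id")["customer_id"].count()

# Share of customers with exactly one order.
share_single = (orders_per_customer == 1).mean()

print(orders_per_customer.value_counts().sort_index())
print(f"{share_single:.0%} of customers have exactly one order")
```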


In [4]:
orders = pd.read_csv('./olist_orders_dataset.csv',encoding='Windows-1251', parse_dates=['order_purchase_timestamp', 'order_approved_at', 'order_delivered_carrier_date', 'order_delivered_customer_date', 'order_estimated_delivery_date'])

In [5]:
print('{:,} rows; {:,} columns'.format(orders.shape[0], orders.shape[1])) # print the table dimensions

99,441 rows; 8 columns


In [14]:
orders.nunique()

order_id                         99441
customer_id                      99441
order_status                         8
order_purchase_timestamp         98875
order_approved_at                90733
order_delivered_carrier_date     81018
order_delivered_customer_date    95664
order_estimated_delivery_date      459
dtype: int64

In [15]:
orders.dtypes

order_id                                 object
customer_id                              object
order_status                             object
order_purchase_timestamp         datetime64[ns]
order_approved_at                datetime64[ns]
order_delivered_carrier_date     datetime64[ns]
order_delivered_customer_date    datetime64[ns]
order_estimated_delivery_date    datetime64[ns]
dtype: object

In [16]:
orders.head()

Unnamed: 0,order_id,customer_id,order_status,order_purchase_timestamp,order_approved_at,order_delivered_carrier_date,order_delivered_customer_date,order_estimated_delivery_date
0,e481f51cbdc54678b7cc49136f2d6af7,9ef432eb6251297304e76186b10a928d,delivered,2017-10-02 10:56:33,2017-10-02 11:07:15,2017-10-04 19:55:00,2017-10-10 21:25:13,2017-10-18
1,53cdb2fc8bc7dce0b6741e2150273451,b0830fb4747a6c6d20dea0b8c802d7ef,delivered,2018-07-24 20:41:37,2018-07-26 03:24:27,2018-07-26 14:31:00,2018-08-07 15:27:45,2018-08-13
2,47770eb9100c2d0c44946d9cf07ec65d,41ce2a54c0b03bf3443c3d931a367089,delivered,2018-08-08 08:38:49,2018-08-08 08:55:23,2018-08-08 13:50:00,2018-08-17 18:06:29,2018-09-04
3,949d5b44dbf5de918fe9c16f97b45f8a,f88197465ea7920adcdbec7375364d82,delivered,2017-11-18 19:28:06,2017-11-18 19:45:59,2017-11-22 13:39:59,2017-12-02 00:28:42,2017-12-15
4,ad21c59c0840e6cb83a9ceb5573f8159,8ab97904e6daea8866dbdbc4fb7aad2c,delivered,2018-02-13 21:18:39,2018-02-13 22:20:29,2018-02-14 19:46:34,2018-02-16 18:17:02,2018-02-26


In [6]:
items = pd.read_csv('./olist_order_items_dataset.csv',encoding='Windows-1251', parse_dates=['shipping_limit_date'])

In [20]:
items.dtypes

order_id                       object
order_item_id                   int64
product_id                     object
seller_id                      object
shipping_limit_date    datetime64[ns]
price                         float64
freight_value                 float64
dtype: object

In [21]:
items.shape

(112650, 7)

In [22]:
items.nunique()

order_id               98666
order_item_id             21
product_id             32951
seller_id               3095
shipping_limit_date    93318
price                   5968
freight_value           6999
dtype: int64

In [23]:
items.head()

Unnamed: 0,order_id,order_item_id,product_id,seller_id,shipping_limit_date,price,freight_value
0,00010242fe8c5a6d1ba2dd792cb16214,1,4244733e06e7ecb4970a6e2683c13e61,48436dade18ac8b2bce089ec2a041202,2017-09-19 09:45:35,58.9,13.29
1,00018f77f2f0320c557190d7a144bdd3,1,e5f2d52b802189ee658865ca93d83a8f,dd7ddc04e1b6c2c614352b383efe2d36,2017-05-03 11:05:13,239.9,19.93
2,000229ec398224ef6ca0657da4fc703e,1,c777355d18b72b67abbeef9df44fd0fd,5b51032eddd242adc84c38acab88f23d,2018-01-18 14:48:30,199.0,17.87
3,00024acbcdf0a6daa1e931b038114c75,1,7634da152a4610f1595efa32f14722fc,9d7a1d34a5052409006425275ba1c2b4,2018-08-15 10:10:18,12.99,12.79
4,00042b26cf59d7ce69dfabb4e55b4fd9,1,ac6c3623068f30de03045865e4e10089,df560393f3a51e74553ab94004ba5c87,2017-02-13 13:57:51,199.9,18.14


In [24]:
items.dropna() # drop rows with missing values (returns a new DataFrame; items itself is not modified)

Unnamed: 0,order_id,order_item_id,product_id,seller_id,shipping_limit_date,price,freight_value
0,00010242fe8c5a6d1ba2dd792cb16214,1,4244733e06e7ecb4970a6e2683c13e61,48436dade18ac8b2bce089ec2a041202,2017-09-19 09:45:35,58.90,13.29
1,00018f77f2f0320c557190d7a144bdd3,1,e5f2d52b802189ee658865ca93d83a8f,dd7ddc04e1b6c2c614352b383efe2d36,2017-05-03 11:05:13,239.90,19.93
2,000229ec398224ef6ca0657da4fc703e,1,c777355d18b72b67abbeef9df44fd0fd,5b51032eddd242adc84c38acab88f23d,2018-01-18 14:48:30,199.00,17.87
3,00024acbcdf0a6daa1e931b038114c75,1,7634da152a4610f1595efa32f14722fc,9d7a1d34a5052409006425275ba1c2b4,2018-08-15 10:10:18,12.99,12.79
4,00042b26cf59d7ce69dfabb4e55b4fd9,1,ac6c3623068f30de03045865e4e10089,df560393f3a51e74553ab94004ba5c87,2017-02-13 13:57:51,199.90,18.14
...,...,...,...,...,...,...,...
112645,fffc94f6ce00a00581880bf54a75a037,1,4aa6014eceb682077f9dc4bffebc05b0,b8bc237ba3788b23da09c0f1f3a3288c,2018-05-02 04:11:01,299.99,43.41
112646,fffcd46ef2263f404302a634eb57f7eb,1,32e07fd915822b0765e448c4dd74c828,f3c38ab652836d21de61fb8314b69182,2018-07-20 04:31:48,350.00,36.53
112647,fffce4705a9662cd70adb13d4a31832d,1,72a30483855e2eafc67aee5dc2560482,c3cfdc648177fdbbbb35635a37472c53,2017-10-30 17:14:25,99.90,16.95
112648,fffe18544ffabc95dfada21779c9644f,1,9c422a519119dcad7575db5af1ba540e,2b3e4a2a3ea8e01938cabda2a3e5cc79,2017-08-21 00:04:32,55.99,8.72


In [25]:
items.shape

(112650, 7)
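
The shape is the same before and after `dropna()`. That is expected in either of two cases: the table has no missing values, or the result of `dropna()` was simply not assigned back. A small sketch of the difference, on a hypothetical `items_demo` frame:

```python
import numpy as np
import pandas as pd

items_demo = pd.DataFrame({"order_id": ["o1", "o2", "o3"],
                           "price": [10.0, np.nan, 30.0]})

items_demo.dropna()                # result is discarded: items_demo is unchanged
assert items_demo.shape == (3, 2)

cleaned = items_demo.dropna()      # assign the result to actually keep it
print(items_demo.isna().sum())     # missing values per column
print(cleaned.shape)
```

Running `items.isna().sum()` on the real table would show which of the two cases applies here.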

Let's merge customers with orders on customer_id.

In [26]:
customers_orders = customers.merge(orders, on = 'customer_id', how = 'left')

In [27]:
customers_orders.shape

(99441, 12)
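
The merged table has the same 99,441 rows as each input, which suggests a one-to-one match on `customer_id`. As a hedged sketch, pandas can assert this at merge time through the `validate` and `indicator` parameters of `merge`; the toy frames below are assumptions for illustration.

```python
import pandas as pd

customers_demo = pd.DataFrame({"customer_id": ["c1", "c2", "c3"],
                               "customer_unique_id": ["A", "A", "B"]})
orders_demo = pd.DataFrame({"customer_id": ["c1", "c2", "c3"],
                            "order_id": ["o1", "o2", "o3"]})

# validate="one_to_one" raises MergeError if customer_id repeats on either side;
# indicator=True adds a _merge column telling where each row came from.
merged = customers_demo.merge(orders_demo, on="customer_id", how="left",
                              validate="one_to_one", indicator=True)
print(merged["_merge"].value_counts())
```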

In [28]:
customers_orders.head()

Unnamed: 0,customer_id,customer_unique_id,customer_zip_code_prefix,customer_city,customer_state,order_id,order_status,order_purchase_timestamp,order_approved_at,order_delivered_carrier_date,order_delivered_customer_date,order_estimated_delivery_date
0,06b8999e2fba1a1fbc88172c00ba8bc7,861eff4711a542e4b93843c6dd7febb0,14409,franca,SP,00e7ee1b050b8499577073aeb2a297a1,delivered,2017-05-16 15:05:35,2017-05-16 15:22:12,2017-05-23 10:47:57,2017-05-25 10:35:35,2017-06-05
1,18955e83d337fd6b2def6b18a428ac77,290c77bc529b7ac935b93aa66c333dc3,9790,sao bernardo do campo,SP,29150127e6685892b6eab3eec79f59c7,delivered,2018-01-12 20:48:24,2018-01-12 20:58:32,2018-01-15 17:14:59,2018-01-29 12:41:19,2018-02-06
2,4e7b3e00288586ebd08712fdd0374a03,060e732b5b29e8181a18229c7b0b2b5e,1151,sao paulo,SP,b2059ed67ce144a36e2aa97d2c9e9ad2,delivered,2018-05-19 16:07:45,2018-05-20 16:19:10,2018-06-11 14:31:00,2018-06-14 17:58:51,2018-06-13
3,b2b6027bc5c5109e529d4dc6358b12c3,259dac757896d24d7702b9acbbff3f3c,8775,mogi das cruzes,SP,951670f92359f4fe4a63112aa7306eba,delivered,2018-03-13 16:06:38,2018-03-13 17:29:19,2018-03-27 23:22:42,2018-03-28 16:04:25,2018-04-10
4,4f2d8ab171c80ec8364f7c12e35b23ad,345ecd01c38d18a9036ed96c73b8d066,13056,campinas,SP,6b7d50bd145f6fc7f33cebabd7e49d0f,delivered,2018-07-29 09:51:30,2018-07-29 10:10:09,2018-07-30 15:16:00,2018-08-09 20:55:48,2018-08-15


In [43]:
# merge customers_orders with items on order_id

In [44]:
all = customers_orders.merge(items, on = 'order_id', how = 'left')

In [45]:
all.shape

(113425, 18)
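
The row count grows because an order can contain several items, and a left merge also keeps orders that have no item rows at all (their item columns become NaN). Assuming every `order_id` in `items` also exists in `orders`, the counts reported earlier add up: 112,650 item rows plus 99,441 - 98,666 = 775 orders without items gives 113,425. A toy sketch of the same effect:

```python
import pandas as pd

orders_demo = pd.DataFrame({"order_id": ["o1", "o2", "o3"]})
items_demo = pd.DataFrame({"order_id": ["o1", "o1", "o2"],
                           "price": [10.0, 5.0, 20.0]})

# o1 appears twice (two items); o3 has no items and gets price = NaN.
merged = orders_demo.merge(items_demo, on="order_id", how="left")
orders_without_items = int(merged["price"].isna().sum())
print(len(merged), orders_without_items)

# Same arithmetic with the counts reported in this notebook.
expected_rows = 112650 + (99441 - 98666)
print(expected_rows)
```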

In [46]:
all.head()

Unnamed: 0,customer_id,customer_unique_id,customer_zip_code_prefix,customer_city,customer_state,order_id,order_status,order_purchase_timestamp,order_approved_at,order_delivered_carrier_date,order_delivered_customer_date,order_estimated_delivery_date,order_item_id,product_id,seller_id,shipping_limit_date,price,freight_value
0,06b8999e2fba1a1fbc88172c00ba8bc7,861eff4711a542e4b93843c6dd7febb0,14409,franca,SP,00e7ee1b050b8499577073aeb2a297a1,delivered,2017-05-16 15:05:35,2017-05-16 15:22:12,2017-05-23 10:47:57,2017-05-25 10:35:35,2017-06-05,1.0,a9516a079e37a9c9c36b9b78b10169e8,7c67e1448b00f6e969d365cea6b010ab,2017-05-22 15:22:12,124.99,21.88
1,18955e83d337fd6b2def6b18a428ac77,290c77bc529b7ac935b93aa66c333dc3,9790,sao bernardo do campo,SP,29150127e6685892b6eab3eec79f59c7,delivered,2018-01-12 20:48:24,2018-01-12 20:58:32,2018-01-15 17:14:59,2018-01-29 12:41:19,2018-02-06,1.0,4aa6014eceb682077f9dc4bffebc05b0,b8bc237ba3788b23da09c0f1f3a3288c,2018-01-18 20:58:32,289.0,46.48
2,4e7b3e00288586ebd08712fdd0374a03,060e732b5b29e8181a18229c7b0b2b5e,1151,sao paulo,SP,b2059ed67ce144a36e2aa97d2c9e9ad2,delivered,2018-05-19 16:07:45,2018-05-20 16:19:10,2018-06-11 14:31:00,2018-06-14 17:58:51,2018-06-13,1.0,bd07b66896d6f1494f5b86251848ced7,7c67e1448b00f6e969d365cea6b010ab,2018-06-05 16:19:10,139.94,17.79
3,b2b6027bc5c5109e529d4dc6358b12c3,259dac757896d24d7702b9acbbff3f3c,8775,mogi das cruzes,SP,951670f92359f4fe4a63112aa7306eba,delivered,2018-03-13 16:06:38,2018-03-13 17:29:19,2018-03-27 23:22:42,2018-03-28 16:04:25,2018-04-10,1.0,a5647c44af977b148e0a3a4751a09e2e,7c67e1448b00f6e969d365cea6b010ab,2018-03-27 16:31:16,149.94,23.36
4,4f2d8ab171c80ec8364f7c12e35b23ad,345ecd01c38d18a9036ed96c73b8d066,13056,campinas,SP,6b7d50bd145f6fc7f33cebabd7e49d0f,delivered,2018-07-29 09:51:30,2018-07-29 10:10:09,2018-07-30 15:16:00,2018-08-09 20:55:48,2018-08-15,1.0,9391a573abe00141c56e38d84d7d5b3b,4a3ca9315b744ce9f8e9374361493884,2018-07-31 10:10:09,230.0,22.25


In [47]:
all.shape

(113425, 18)

It is better to keep only the columns we will need in the next steps.

In [48]:
df = all[['customer_unique_id', 'order_id', 'order_purchase_timestamp', 'order_approved_at', 'price']].copy()

In [49]:
df.head()

Unnamed: 0,customer_unique_id,order_id,order_purchase_timestamp,order_approved_at,price
0,861eff4711a542e4b93843c6dd7febb0,00e7ee1b050b8499577073aeb2a297a1,2017-05-16 15:05:35,2017-05-16 15:22:12,124.99
1,290c77bc529b7ac935b93aa66c333dc3,29150127e6685892b6eab3eec79f59c7,2018-01-12 20:48:24,2018-01-12 20:58:32,289.0
2,060e732b5b29e8181a18229c7b0b2b5e,b2059ed67ce144a36e2aa97d2c9e9ad2,2018-05-19 16:07:45,2018-05-20 16:19:10,139.94
3,259dac757896d24d7702b9acbbff3f3c,951670f92359f4fe4a63112aa7306eba,2018-03-13 16:06:38,2018-03-13 17:29:19,149.94
4,345ecd01c38d18a9036ed96c73b8d066,6b7d50bd145f6fc7f33cebabd7e49d0f,2018-07-29 09:51:30,2018-07-29 10:10:09,230.0


In [50]:
df.dtypes

customer_unique_id                  object
order_id                            object
order_purchase_timestamp    datetime64[ns]
order_approved_at           datetime64[ns]
price                              float64
dtype: object

To estimate purchase amounts, we need the total of each order. For this, we group the item rows by customer_unique_id, order_id and order_purchase_timestamp and sum the price column.

In [51]:
sum_orders = df.groupby(['customer_unique_id', 'order_id', 'order_purchase_timestamp'], as_index = False)\
               .agg({"price":"sum"})
#calculate amount of each order

In [52]:
sum_orders.head(4)

Unnamed: 0,customer_unique_id,order_id,order_purchase_timestamp,price
0,0000366f3b9a7992bf8c76cfdf3221e2,e22acc9c116caa3f2b7121bbb380d08e,2018-05-10 10:56:27,129.9
1,0000b849f77a49e4a4ce2b2a4ca5be3f,3594e05a005ac4d06a72673270ef9ec9,2018-05-07 11:11:27,18.9
2,0000f46a3911fa3c0805444483337064,b33ec3b699337181488304f362a6b734,2017-03-10 21:05:03,69.0
3,0000f6ccb0745a6a4b88665a16c9f078,41272756ecddd9a9ed0180413cc22fb6,2017-10-12 20:29:41,25.99
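
One detail worth keeping in mind: a groupby `sum` over a group whose prices are all NaN returns 0.0 by default, so orders without item rows would end up with a total of 0 rather than a missing value. The sketch below illustrates this on a hypothetical `df_demo` frame, together with the `min_count` parameter that keeps such totals as NaN.

```python
import numpy as np
import pandas as pd

df_demo = pd.DataFrame({"order_id": ["o1", "o1", "o2"],
                        "price": [10.0, 5.0, np.nan]})

# Default behaviour: an all-NaN group sums to 0.0.
totals_default = df_demo.groupby("order_id")["price"].sum()

# min_count=1 requires at least one non-missing value, otherwise the total is NaN.
totals_strict = df_demo.groupby("order_id")["price"].sum(min_count=1)

print(totals_default)
print(totals_strict)
```

Whether zero-valued orders should count towards frequency and monetary is a modelling choice; the notebook keeps them.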


In [53]:
print('Orders from {} to {}'.format(sum_orders['order_purchase_timestamp'].min(),
                                    sum_orders['order_purchase_timestamp'].max())) # check which period the purchase data covers

Orders from 2016-09-04 21:15:19 to 2018-10-17 17:30:18


I will keep only the data for the one complete calendar year, 2017.

In [54]:
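# Half-open interval [2017-01-01, 2018-01-01); .copy() avoids SettingWithCopyWarning when columns are added later
sum_orders = sum_orders.query('order_purchase_timestamp >= "2017-01-01" and order_purchase_timestamp < "2018-01-01"').copy()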
sum_orders

Unnamed: 0,customer_unique_id,order_id,order_purchase_timestamp,price
2,0000f46a3911fa3c0805444483337064,b33ec3b699337181488304f362a6b734,2017-03-10 21:05:03,69.00
3,0000f6ccb0745a6a4b88665a16c9f078,41272756ecddd9a9ed0180413cc22fb6,2017-10-12 20:29:41,25.99
4,0004aac84e0df4da2b147fca70cf8255,d957021f1127559cd947b62533f484f7,2017-11-14 19:45:42,180.00
8,0005e1862207bf6ccc02e4228effd9a0,ae76bef74b97bcb0b3e355e60d9a6f9c,2017-03-04 23:32:12,135.00
10,0006fdc98a402fceb4eb0ee528f6a8d4,6681163e3dab91c549952b2845b20281,2017-07-18 09:23:10,13.90
...,...,...,...,...
99434,fffbf87b7a1a6fa8b03f081c5f51a201,64397307c6954ae1ad2ad8e791ad8a31,2017-12-27 22:36:41,149.00
99436,fffcf5a5ff07b0908bd4e2dbc735a684,725cf8e9c24e679a8a5a32cb92c9ce1e,2017-06-08 21:00:36,1570.00
99437,fffea47cd6d3cc0a88bd621562a9d061,c71b9252fd7b3b263aaa4cb09319a323,2017-12-10 20:07:56,64.89
99438,ffff371b4d645b6ecea244b27531430a,fdc45e6c7555e6cb3cc0daca2557dbe1,2017-02-07 15:49:16,89.90


We treat order_purchase_timestamp as the date the order was created: the buyer has already chosen something but may not have paid yet.
Still, it shows that the buyer is interested.

In [55]:
now = sum_orders['order_purchase_timestamp'].max() + timedelta(days=1)

In [56]:
now

Timestamp('2018-01-01 23:29:31')
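
The reference date is the latest purchase timestamp plus one day, so even the most recent order gets a recency of 1 day rather than 0. A minimal sketch of the day-difference calculation on two timestamps taken from this notebook (the Series name `purchases` is an assumption):

```python
import pandas as pd

purchases = pd.Series(pd.to_datetime(["2017-03-10 21:05:03",
                                      "2017-12-31 23:29:31"]))

# Reference date: one day after the latest purchase.
now = purchases.max() + pd.Timedelta(days=1)

# Vectorised subtraction gives timedeltas; .dt.days keeps the whole days only.
days_since = (now - purchases).dt.days
print(now)
print(days_since.tolist())
```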

#### Calculate the Recency, Frequency and Monetary Value of each customer
___
Let's group the data by customer_unique_id and calculate 
- R (recency) - the time from the last purchase of the user to the current date, 
- F (frequency) - the total number of purchases from the user for the entire time,
- M (monetary) - the amount of purchases for the entire time.
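
The three definitions above can be sketched with pandas named aggregation. This is an illustration on a toy frame (`orders_demo` is hypothetical), not a replacement for the cells that follow.

```python
import pandas as pd

orders_demo = pd.DataFrame({
    "customer_unique_id": ["A", "A", "B"],
    "order_id": ["o1", "o2", "o3"],
    "Days_Since_Order": [10, 40, 200],
    "price": [50.0, 25.0, 100.0],
})

rfm_demo = orders_demo.groupby("customer_unique_id", as_index=False).agg(
    recency=("Days_Since_Order", "min"),   # days since the most recent order
    frequency=("order_id", "nunique"),     # number of distinct orders
    monetary=("price", "sum"),             # total amount spent
)
print(rfm_demo)
```

Named aggregation produces the final column names in one step, so no separate `rename` is needed.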

In [57]:
sum_orders['Days_Since_Order'] = (now - sum_orders['order_purchase_timestamp']).dt.days
# Number of whole days between the reference date `now` and the order date

In [58]:
sum_orders.head(7)

Unnamed: 0,customer_unique_id,order_id,order_purchase_timestamp,price,Days_Since_Order
2,0000f46a3911fa3c0805444483337064,b33ec3b699337181488304f362a6b734,2017-03-10 21:05:03,69.0,297
3,0000f6ccb0745a6a4b88665a16c9f078,41272756ecddd9a9ed0180413cc22fb6,2017-10-12 20:29:41,25.99,81
4,0004aac84e0df4da2b147fca70cf8255,d957021f1127559cd947b62533f484f7,2017-11-14 19:45:42,180.0,48
8,0005e1862207bf6ccc02e4228effd9a0,ae76bef74b97bcb0b3e355e60d9a6f9c,2017-03-04 23:32:12,135.0,302
10,0006fdc98a402fceb4eb0ee528f6a8d4,6681163e3dab91c549952b2845b20281,2017-07-18 09:23:10,13.9,167
11,00082cbe03e478190aadbea78542e933,67503374d1fbcbe5e3a40324f703ffc8,2017-11-19 15:22:02,79.0,43
14,000a5ad9c4601d2bbdd9ed765d5213b3,f7fa5cf8386e51037856df1add3e1228,2017-08-11 13:45:15,76.99,143


In [59]:
rfmTable = sum_orders.groupby('customer_unique_id', as_index = False).agg({'Days_Since_Order': 'min', # Recency: days since the most recent order
                                                         'order_id': 'count', # Frequency: number of orders
                                                         'price': 'sum'}) # Monetary: total amount spent

In [60]:
rfmTable.rename(columns={'Days_Since_Order': 'recency', 
                         'order_id': 'frequency', 
                         'price': 'monetary'}, inplace=True)

To divide customers into segments, we use the quintile method, i.e. we split the values of each metric into 5 parts and score them from 1 to 5:
- Recency: 5 - the last order was placed recently, 1 - the last order was long ago
- Frequency: 5 - buys often, 1 - buys rarely
- Monetary: 5 - high total spend, 1 - low total spend
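
As an alternative sketch of the same scoring idea, `pd.qcut` can assign quintile scores directly. This is not the approach used in the cells below, which rely on explicit threshold functions. The toy values are assumptions; note that the labels are reversed for recency, because fewer days is better.

```python
import pandas as pd

recency = pd.Series([5, 20, 50, 90, 120, 160, 200, 240, 300, 350])
monetary = pd.Series([10, 20, 40, 60, 80, 100, 150, 200, 500, 1500])

# Lowest recency quintile gets the best score (5), highest gets 1.
r_score = pd.qcut(recency, q=5, labels=[5, 4, 3, 2, 1]).astype(int)

# Highest monetary quintile gets the best score (5).
m_score = pd.qcut(monetary, q=5, labels=[1, 2, 3, 4, 5]).astype(int)

print(r_score.tolist())
print(m_score.tolist())
```

`pd.qcut` raises a `ValueError` when several quantile edges coincide, which is the case for a metric where almost every customer has the same value, so such a metric needs manually chosen thresholds.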

In [61]:
#Calculate the quantiles for each value and put them in a dictionary
quantiles = rfmTable[['recency', 'frequency', 'monetary']].quantile([.2, .4, .6, .8]).to_dict()
quantiles


{'recency': {0.2: 38.0, 0.4: 92.0, 0.6: 157.0, 0.8: 233.0},
 'frequency': {0.2: 1.0, 0.4: 1.0, 0.6: 1.0, 0.8: 1.0},
 'monetary': {0.2: 39.0, 0.4: 65.99, 0.6: 105.9, 0.8: 179.9}}

In [62]:
# All frequency quintiles equal 1, so set the thresholds manually (the maximum number of orders is 10)
quantiles['frequency'] = {0.2: 1.0, 0.4: 3.0, 0.6: 5.0, 0.8: 9.0}
quantiles

{'recency': {0.2: 38.0, 0.4: 92.0, 0.6: 157.0, 0.8: 233.0},
 'frequency': {0.2: 1.0, 0.4: 3.0, 0.6: 5.0, 0.8: 9.0},
 'monetary': {0.2: 39.0, 0.4: 65.99, 0.6: 105.9, 0.8: 179.9}}
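
With manual thresholds, the frequency score can also be expressed with `pd.cut`. The sketch below uses the thresholds 1, 3, 5 and 9 from the dictionary above; the sample values and the names `edges` and `f_score` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

frequency = pd.Series([1, 2, 3, 4, 5, 9, 10])

# Right-closed bins: (-inf, 1], (1, 3], (3, 5], (5, 9], (9, inf)
edges = [-np.inf, 1, 3, 5, 9, np.inf]
f_score = pd.cut(frequency, bins=edges, labels=[1, 2, 3, 4, 5]).astype(int)

print(f_score.tolist())
```

Under these thresholds, customers with 4 or 5 orders receive a frequency score of 3, and a score of 5 requires 10 or more orders.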

In [65]:
# Let's write functions that assign segment scores based on the quantiles.
# For recency, fewer days since the last purchase is better, so low values get high scores:

def r(x):
    if x <= quantiles['recency'][.2]:
        return 5
    elif x <= quantiles['recency'][.4]:
        return 4
    elif x <= quantiles['recency'][.6]:
        return 3
    elif x <= quantiles['recency'][.8]:
        return 2
    else:
        return 1


In [66]:
# for frequency & monetary - the higher value is better:
def fm(x, c):
    if x <= quantiles[c][.2]:
        return 1
    elif x <= quantiles[c][.4]:
        return 2
    elif x <= quantiles[c][.6]:
        return 3
    elif x <= quantiles[c][.8]:
        return 4
    else:
        return 5 
   

In [67]:
rfmSeg = rfmTable.copy() # work on a copy so that rfmTable keeps only the raw metrics
rfmSeg['R_quantile'] = rfmSeg['recency'].apply(r)
rfmSeg['F_quantile'] = rfmSeg['frequency'].apply(lambda x: fm(x, 'frequency'))
rfmSeg['M_quantile'] = rfmSeg['monetary'].apply(lambda x: fm(x, 'monetary'))

In [68]:
rfmSeg

Unnamed: 0,customer_unique_id,recency,frequency,monetary,R_quantile,F_quantile,M_quantile
0,0000f46a3911fa3c0805444483337064,297,1,69.00,1,1,3
1,0000f6ccb0745a6a4b88665a16c9f078,81,1,25.99,4,1,1
2,0004aac84e0df4da2b147fca70cf8255,48,1,180.00,4,1,5
3,0005e1862207bf6ccc02e4228effd9a0,302,1,135.00,1,1,4
4,0006fdc98a402fceb4eb0ee528f6a8d4,167,1,13.90,2,1,1
...,...,...,...,...,...,...,...
43708,fffbf87b7a1a6fa8b03f081c5f51a201,5,1,149.00,5,1,4
43709,fffcf5a5ff07b0908bd4e2dbc735a684,207,1,1570.00,2,1,5
43710,fffea47cd6d3cc0a88bd621562a9d061,22,1,64.89,5,1,2
43711,ffff371b4d645b6ecea244b27531430a,328,1,89.90,1,1,3


Next, combine the R_quantile, F_quantile and M_quantile scores into a single label (RFMClass).
To do this, the scores are first converted to strings and then concatenated.

In [71]:
rfmSeg['RFMClass'] = rfmSeg.R_quantile.map(str) \
                            + rfmSeg.F_quantile.map(str) \
                            + rfmSeg.M_quantile.map(str)

In [72]:
rfmSeg.sort_values('RFMClass', ascending=False)

Unnamed: 0,customer_unique_id,recency,frequency,monetary,R_quantile,F_quantile,M_quantile,RFMClass
40419,ec7f1811826ab04a27a92197bc40c888,13,4,487.89,5,3,5,535
37746,dc813062e0fc23409cd255f7f53c7074,37,5,612.86,5,3,5,535
12304,47c1a3033b8b77b3ab6e109eb4d5fdf3,35,5,784.60,5,3,5,535
17207,6469f99c1f9dfae7733b25662e7f1782,7,5,514.50,5,3,5,535
4697,1b6c7548a2a1f9037c1fd3ddfed95f33,12,5,487.22,5,3,5,535
...,...,...,...,...,...,...,...,...
27599,a1c61f8566347ec44ea37d22854634a1,288,1,19.90,1,1,1,111
27629,a1f24259f50aa458eac45c8db044550f,252,1,21.00,1,1,1,111
12840,4af923ab1facb0d2b0511746cffa138d,314,1,29.90,1,1,1,111
42041,f65c8b0c2fef64995e11d3620fec61fa,240,1,29.90,1,1,1,111


The best customer would have RFMClass 555, or at least 535: someone who made a purchase no more than 38 days ago, who may not visit us very often (no customer has a frequency score of 4 or 5), and who has spent more than 179.9 in total.
How many of them do we have?

In [291]:
rfmSeg.query('RFMClass == "535"')

Unnamed: 0,customer_unique_id,recency,frequency,monetary,R_quantile,F_quantile,M_quantile,RFMClass
4697,1b6c7548a2a1f9037c1fd3ddfed95f33,12,5,487.22,5,3,5,535
10660,3e43e6105506432c953e165fb2acf44c,31,4,551.97,5,3,5,535
12304,47c1a3033b8b77b3ab6e109eb4d5fdf3,35,5,784.6,5,3,5,535
17207,6469f99c1f9dfae7733b25662e7f1782,7,5,514.5,5,3,5,535
37746,dc813062e0fc23409cd255f7f53c7074,37,5,612.86,5,3,5,535
40419,ec7f1811826ab04a27a92197bc40c888,13,4,487.89,5,3,5,535
