In [1]:
import pandas as pd

In [2]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [3]:
# Load cleaned datasets
data_3m = pd.read_csv('/content/drive/MyDrive/DSB/UseCase_1/Data/3m_cdata.csv')
data_6m = pd.read_csv('/content/drive/MyDrive/DSB/UseCase_1/Data/6m_cdata.csv')

In [4]:
# Top cities by customer count in 3-month data
top_cities_3m = data_3m['City'].value_counts().head(5).index.tolist()

# Filtering data for top cities and grouping by 'City', 'Reached_3w', 'Reached_3m'
grouped_city_3m = data_3m[data_3m['City'].isin(top_cities_3m)].groupby(['City', 'Reached_3w', 'Reached_3m'])[['FUA_Balance', 'Number_Of_Services']].mean().reset_index()

# Top cities by customer count in 6-month data
top_cities_6m = data_6m['City'].value_counts().head(5).index.tolist()

# Filtering data for top cities and grouping by 'City', 'Reached_3w', 'Reached_3m'
grouped_city_6m = data_6m[data_6m['City'].isin(top_cities_6m)].groupby(['City', 'Reached_3w', 'Reached_3m'])[['FUA_Balance', 'Number_Of_Services']].mean().reset_index()

grouped_city_3m, grouped_city_6m

(               City  Reached_3w  Reached_3m    FUA_Balance  Number_Of_Services
 0           BURNABY         0.0         0.0  233367.126634            1.869663
 1           BURNABY         0.0         1.0  201711.032833            1.885246
 2           BURNABY         1.0         0.0  317919.310941            2.163265
 3           BURNABY         1.0         1.0  265488.263072            1.919192
 4         COQUITLAM         0.0         0.0  225402.903519            1.977679
 5         COQUITLAM         0.0         1.0  165207.735973            1.933333
 6         COQUITLAM         1.0         0.0  221169.666500            1.825397
 7         COQUITLAM         1.0         1.0  194132.697639            1.981481
 8   NORTH VANCOUVER         0.0         0.0  281136.558732            1.937500
 9   NORTH VANCOUVER         0.0         1.0   75849.863448            1.965517
 10  NORTH VANCOUVER         1.0         0.0  366857.204831            1.942308
 11  NORTH VANCOUVER         1.0        

**3-month Data:**

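In most cities, customers reached at both 3 weeks and 3 months have a higher average FUA_Balance than those who weren't reached (for example, $265,488.26 vs. $233,367.13 in 'BURNABY'). The number of services used is comparable between the two groups.

However, there are exceptions. In 'COQUITLAM', customers reached at both intervals have a lower average FUA_Balance than those not reached ($194,132.70 vs. $225,402.90). In 'NORTH VANCOUVER', customers reached at the 3-week mark but not at the 3-month mark have a notably higher FUA_Balance ($366,857.20) than those not reached ($281,136.56).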

**6-month Data:**

Similar to the 3-month data, in most cities, customers reached at both the 3-week and 3-month marks have a higher average FUA_Balance.

Interestingly, in cities like 'BURNABY' and 'NORTH VANCOUVER', customers who were reached at both intervals have a lower average number of services used compared to other groups.

From this segmentation, we observe that:

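Customers reached by the phone call campaigns generally have a higher average FUA_Balance in most cities; this is an association in group averages, not evidence that the calls caused it.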
The impact on the number of services used is mixed and varies by city.
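
The city comparison above can also be computed directly instead of being read off the grouped table. The following is a minimal sketch on made-up toy data (the city names `A` and `B` and all values are hypothetical; only the column names come from the notebook). It pivots the segment means into one column per `(Reached_3w, Reached_3m)` combination and subtracts the not-reached baseline `(0, 0)` from each.

```python
import pandas as pd

# Toy data (made-up values) using the notebook's column names.
toy = pd.DataFrame({
    "City": ["A", "A", "A", "A", "B", "B", "B", "B"],
    "Reached_3w": [0, 0, 1, 1, 0, 0, 1, 1],
    "Reached_3m": [0, 1, 0, 1, 0, 1, 0, 1],
    "FUA_Balance": [100.0, 80.0, 150.0, 120.0, 200.0, 210.0, 190.0, 180.0],
})

# Mean balance per city, with one column per (Reached_3w, Reached_3m) segment.
segment_means = toy.pivot_table(
    index="City",
    columns=["Reached_3w", "Reached_3m"],
    values="FUA_Balance",
    aggfunc="mean",
)

# Subtract the not-reached baseline (0, 0) from every segment, city by city.
baseline = segment_means[(0, 0)]
diff_vs_not_reached = segment_means.sub(baseline, axis=0)
print(diff_vs_not_reached)
```

A positive entry means the segment's average balance is above the not-reached average for that city. On the real data, `toy` would presumably be replaced by the filtered `data_3m` or `data_6m` frame.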

In [5]:
# Define age bins and labels
bins = [18, 30, 45, 60, 100]
labels = ['18-30', '31-45', '46-60', '61+']

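# Create age groups for both datasets.
# Note: with right=False the intervals are [18, 30), [30, 45), [45, 60), [60, 100),
# so ages 30, 45, and 60 fall under the next label up, and ages of 100 or more become NaN.
data_3m['Age_Group'] = pd.cut(data_3m['Age'], bins=bins, labels=labels, right=False)
data_6m['Age_Group'] = pd.cut(data_6m['Age'], bins=bins, labels=labels, right=False)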

# Group by 'Age_Group', 'Reached_3w', 'Reached_3m' and calculate the mean of 'FUA_Balance' and 'Number_Of_Services'
grouped_age_3m = data_3m.groupby(['Age_Group', 'Reached_3w', 'Reached_3m'])[['FUA_Balance', 'Number_Of_Services']].mean().reset_index()
grouped_age_6m = data_6m.groupby(['Age_Group', 'Reached_3w', 'Reached_3m'])[['FUA_Balance', 'Number_Of_Services']].mean().reset_index()

grouped_age_3m, grouped_age_6m

(   Age_Group  Reached_3w  Reached_3m    FUA_Balance  Number_Of_Services
 0      18-30         0.0         0.0  226663.626681            1.931034
 1      18-30         0.0         1.0  191963.868000            2.300000
 2      18-30         1.0         0.0  272024.913033            2.133333
 3      18-30         1.0         1.0  354807.032737            2.312500
 4      31-45         0.0         0.0  233538.731932            1.920327
 5      31-45         0.0         1.0  205795.942759            1.867769
 6      31-45         1.0         0.0  333485.443797            2.054054
 7      31-45         1.0         1.0  271501.286986            1.915929
 8      46-60         0.0         0.0  248355.183952            1.899948
 9      46-60         0.0         1.0  192249.357270            2.042918
 10     46-60         1.0         0.0  302593.135266            1.971861
 11     46-60         1.0         1.0  274173.600857            1.966981
 12       61+         0.0         0.0  241817.26087

**3-month Data:**

For the age group 18-30, those who were reached at both 3 weeks and 3 months have a higher average FUA_Balance and number of services used compared to those not reached.

For the age group 31-45, those who were reached at the 3-week mark (regardless of being reached at 3 months or not) have a higher FUA_Balance compared to those not reached. However, the number of services used is comparable.

For the age group 46-60, the patterns are similar to the 31-45 age group.

For the age group 61+, those reached at the 3-week mark (regardless of being reached at 3 months or not) have a notably higher FUA_Balance. The number of services used is slightly higher for those reached at both intervals.


**6-month Data:**

For the age group 18-30, those who were reached at both intervals have a significantly higher average FUA_Balance. However, the number of services used is lower for this group compared to others.

For the age group 31-45, those reached at both intervals have the highest FUA_Balance. The number of services used is lower for this group.

The patterns for the age groups 46-60 and 61+ are consistent with the 3-month data.

From this segmentation, we observe:

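Across the age groups, customers reached at the 3-week mark generally have a higher average FUA_Balance than those not reached, whereas in the 3-month data those reached only at the 3-month mark have a lower one (for example, $191,963.87 vs. $226,663.63 for ages 18-30).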
The impact on the number of services used is mixed and varies by age group.
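
The age groups in this section depend on how `pd.cut` treats bin edges. As a minimal sketch, the snippet below reuses the `bins` and `labels` from the cell above on a handful of made-up ages (the ages are illustrative, not taken from the datasets) to show where boundary values land when `right=False`.

```python
import pandas as pd

bins = [18, 30, 45, 60, 100]
labels = ['18-30', '31-45', '46-60', '61+']

# Made-up ages chosen to sit on or next to the bin edges.
ages = pd.Series([18, 29, 30, 45, 60, 99, 100])

# right=False makes each interval closed on the left and open on the right.
groups = pd.cut(ages, bins=bins, labels=labels, right=False)
print(pd.DataFrame({"Age": ages, "Age_Group": groups}))
```

With these settings, age 30 is labeled `31-45`, age 45 is labeled `46-60`, age 60 is labeled `61+`, and age 100 is left unassigned (NaN), so it would drop out of any `groupby` on `Age_Group`.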

In [6]:
# Filtering data for top cities and customers reached only at the 3-week mark
grouped_city_3w_3m = data_3m[(data_3m['Reached_3w'] == 1) & (data_3m['Reached_3m'] == 0) & data_3m['City'].isin(top_cities_3m)]
grouped_city_3w_3m = grouped_city_3w_3m.groupby('City')[['FUA_Balance', 'Number_Of_Services']].mean().reset_index()

grouped_city_3w_6m = data_6m[(data_6m['Reached_3w'] == 1) & (data_6m['Reached_3m'] == 0) & data_6m['City'].isin(top_cities_6m)]
grouped_city_3w_6m = grouped_city_3w_6m.groupby('City')[['FUA_Balance', 'Number_Of_Services']].mean().reset_index()

# Filtering data for top cities and customers reached only at the 3-month mark
grouped_city_3m_only_3m = data_3m[(data_3m['Reached_3w'] == 0) & (data_3m['Reached_3m'] == 1) & data_3m['City'].isin(top_cities_3m)]
grouped_city_3m_only_3m = grouped_city_3m_only_3m.groupby('City')[['FUA_Balance', 'Number_Of_Services']].mean().reset_index()

grouped_city_3m_only_6m = data_6m[(data_6m['Reached_3w'] == 0) & (data_6m['Reached_3m'] == 1) & data_6m['City'].isin(top_cities_6m)]
grouped_city_3m_only_6m = grouped_city_3m_only_6m.groupby('City')[['FUA_Balance', 'Number_Of_Services']].mean().reset_index()

grouped_city_3w_3m, grouped_city_3w_6m, grouped_city_3m_only_3m, grouped_city_3m_only_6m

(              City    FUA_Balance  Number_Of_Services
 0          BURNABY  317919.310941            2.163265
 1        COQUITLAM  221169.666500            1.825397
 2  NORTH VANCOUVER  366857.204831            1.942308
 3           SURREY  371203.726985            1.994083
 4        VANCOUVER  297549.904766            2.013453,
               City    FUA_Balance  Number_Of_Services
 0          BURNABY  380022.690475            2.347518
 1        COQUITLAM  317545.338118            2.223529
 2  NORTH VANCOUVER  446106.508704            2.073171
 3           SURREY  411571.463656            2.051948
 4        VANCOUVER  339268.590342            2.267123,
               City    FUA_Balance  Number_Of_Services
 0          BURNABY  201711.032833            1.885246
 1        COQUITLAM  165207.735973            1.933333
 2  NORTH VANCOUVER   75849.863448            1.965517
 3           SURREY  144600.293788            2.030303
 4        VANCOUVER  202153.155659            1.992593,
       

Here's the segmented analysis based on 'City' for customers reached only at the 3-week mark and only at the 3-month mark:

**3-month Data:**

**Customers reached only at the 3-week mark:**

BURNABY: Average FUA = $317,919.31, Number of Services = 2.16

COQUITLAM: Average FUA = $221,169.67, Number of Services = 1.83

NORTH VANCOUVER: Average FUA = $366,857.20, Number of Services = 1.94

SURREY: Average FUA = $371,203.73, Number of Services = 1.99

VANCOUVER: Average FUA = $297,549.90, Number of Services = 2.01

**Customers reached only at the 3-month mark:**

BURNABY: Average FUA = $201,711.03, Number of Services = 1.89

COQUITLAM: Average FUA = $165,207.74, Number of Services = 1.93

NORTH VANCOUVER: Average FUA = $75,849.86, Number of Services = 1.97

SURREY: Average FUA = $144,600.29, Number of Services = 2.03

VANCOUVER: Average FUA = $202,153.16, Number of Services = 1.99


**6-month Data:**

**Customers reached only at the 3-week mark:**

BURNABY: Average FUA = $380,022.69, Number of Services = 2.35

COQUITLAM: Average FUA = $317,545.34, Number of Services = 2.22

NORTH VANCOUVER: Average FUA = $446,106.51, Number of Services = 2.07

SURREY: Average FUA = $411,571.46, Number of Services = 2.05

VANCOUVER: Average FUA = $339,268.59, Number of Services = 2.27

**Customers reached only at the 3-month mark:**

BURNABY: Average FUA = $255,350.59, Number of Services = 2.34

COQUITLAM: Average FUA = $221,937.44, Number of Services = 2.21

NORTH VANCOUVER: Average FUA = $189,167.90, Number of Services = 2.15

SURREY: Average FUA = $219,083.83, Number of Services = 2.39

VANCOUVER: Average FUA = $283,486.79, Number of Services = 2.36


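From this city-segmented analysis, we observe that customers reached only at the 3-week mark have a higher average FUA than those reached only at the 3-month mark in every top city, in both datasets. The number of services shows no consistent difference: in the 3-month data it is higher for the 3-week-only group only in BURNABY and VANCOUVER, and in the 6-month data only in BURNABY and COQUITLAM, by about 0.01.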
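
The four near-identical filters in the cell above could be folded into one helper. This is a sketch only: `mean_by_segment` is a hypothetical name introduced here, and the toy frame holds made-up values under the notebook's column names. Merging the two "reached only" results puts the 3-week-only and 3-month-only averages side by side per city.

```python
import pandas as pd

def mean_by_segment(df, reached_3w, reached_3m, by,
                    cols=("FUA_Balance", "Number_Of_Services")):
    """Average `cols` per `by` group for one (Reached_3w, Reached_3m) combination."""
    mask = (df["Reached_3w"] == reached_3w) & (df["Reached_3m"] == reached_3m)
    return df.loc[mask].groupby(by)[list(cols)].mean().reset_index()

# Toy data (made-up values) with the notebook's column names.
toy = pd.DataFrame({
    "City": ["A", "A", "A", "B", "B", "B"],
    "Reached_3w": [1, 1, 0, 1, 0, 1],
    "Reached_3m": [0, 0, 1, 0, 1, 1],
    "FUA_Balance": [100.0, 200.0, 50.0, 300.0, 400.0, 999.0],
    "Number_Of_Services": [2, 3, 1, 2, 4, 9],
})

only_3w = mean_by_segment(toy, 1, 0, by="City")  # reached at 3 weeks only
only_3m = mean_by_segment(toy, 0, 1, by="City")  # reached at 3 months only

# One row per city, with suffixed columns for the two segments.
comparison = only_3w.merge(only_3m, on="City", suffixes=("_3w_only", "_3m_only"))
print(comparison)
```

The same helper would work for the age-based cut by passing `by="Age_Group"`.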

In [7]:
# Group by 'Age_Group' for customers reached only at the 3-week mark and calculate the mean of 'FUA_Balance' and 'Number_Of_Services'
grouped_age_3w_3m = data_3m[(data_3m['Reached_3w'] == 1) & (data_3m['Reached_3m'] == 0)].groupby('Age_Group')[['FUA_Balance', 'Number_Of_Services']].mean().reset_index()
grouped_age_3w_6m = data_6m[(data_6m['Reached_3w'] == 1) & (data_6m['Reached_3m'] == 0)].groupby('Age_Group')[['FUA_Balance', 'Number_Of_Services']].mean().reset_index()

# Group by 'Age_Group' for customers reached only at the 3-month mark and calculate the mean of 'FUA_Balance' and 'Number_Of_Services'
grouped_age_3m_only_3m = data_3m[(data_3m['Reached_3w'] == 0) & (data_3m['Reached_3m'] == 1)].groupby('Age_Group')[['FUA_Balance', 'Number_Of_Services']].mean().reset_index()
grouped_age_3m_only_6m = data_6m[(data_6m['Reached_3w'] == 0) & (data_6m['Reached_3m'] == 1)].groupby('Age_Group')[['FUA_Balance', 'Number_Of_Services']].mean().reset_index()

grouped_age_3w_3m, grouped_age_3w_6m, grouped_age_3m_only_3m, grouped_age_3m_only_6m

(  Age_Group    FUA_Balance  Number_Of_Services
 0     18-30  272024.913033            2.133333
 1     31-45  333485.443797            2.054054
 2     46-60  302593.135266            1.971861
 3       61+  330132.211634            2.020588,
   Age_Group    FUA_Balance  Number_Of_Services
 0     18-30  415757.403525            2.285714
 1     31-45  380448.187299            2.238754
 2     46-60  365156.477409            2.222047
 3       61+  375161.044157            2.220044,
   Age_Group    FUA_Balance  Number_Of_Services
 0     18-30  191963.868000            2.300000
 1     31-45  205795.942759            1.867769
 2     46-60  192249.357270            2.042918
 3       61+  162118.352007            2.105556,
   Age_Group    FUA_Balance  Number_Of_Services
 0     18-30  103520.908775            2.000000
 1     31-45  278354.693576            2.237037
 2     46-60  246241.565855            2.463035
 3       61+  255276.498501            2.328571)

Here's the segmented analysis based on 'Age Group' for customers reached only at the 3-week mark and only at the 3-month mark:

**3-month Data:**

**Customers reached only at the 3-week mark:**

18-30: Average FUA = $272,024.91, Number of Services = 2.13

31-45: Average FUA = $333,485.44, Number of Services = 2.05

46-60: Average FUA = $302,593.14, Number of Services = 1.97

61+: Average FUA = $330,132.21, Number of Services = 2.02

**Customers reached only at the 3-month mark:**

18-30: Average FUA = $191,963.87, Number of Services = 2.30

31-45: Average FUA = $205,795.94, Number of Services = 1.87

46-60: Average FUA = $192,249.36, Number of Services = 2.04

61+: Average FUA = $162,118.35, Number of Services = 2.11

**6-month Data:**

**Customers reached only at the 3-week mark:**

18-30: Average FUA = $415,757.40, Number of Services = 2.29

31-45: Average FUA = $380,448.19, Number of Services = 2.24

46-60: Average FUA = $365,156.48, Number of Services = 2.22

61+: Average FUA = $375,161.04, Number of Services = 2.22

**Customers reached only at the 3-month mark:**

18-30: Average FUA = $103,520.91, Number of Services = 2.00

31-45: Average FUA = $278,354.69, Number of Services = 2.24

46-60: Average FUA = $246,241.57, Number of Services = 2.46

61+: Average FUA = $255,276.50, Number of Services = 2.33

From this age-segmented analysis, we can make several observations:

Customers reached only at the 3-week mark have a higher average FUA than those reached only at the 3-month mark in every age group, in both datasets. The gap is largest for ages 18-30 in the 6-month data ($415,757.40 vs. $103,520.91).
For the 3-month data, younger customers (18-30) reached only at the 3-month mark have a higher number of services (2.30) than those reached only at the 3-week mark (2.13). This trend doesn't persist in the 6-month data (2.00 vs. 2.29).
For customers reached only at the 3-month mark, the average FUA is higher in the 6-month data than in the 3-month data for ages 31-45, 46-60, and 61+ (for 46-60, $246,241.57 vs. $192,249.36, about 28% higher), but lower for ages 18-30 ($103,520.91 vs. $191,963.87).
This segmented analysis can help the bank tailor its onboarding strategies based on specific customer demographics.
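
The tables above report only means, so they do not show how many customers each segment contains; a mean over a small segment can move a lot with a few large balances. As a hedged sketch on made-up toy data (values are illustrative; column names follow the notebook), the grouped summary can carry the median and the group size next to the mean.

```python
import pandas as pd

# Toy data (made-up values) with the notebook's column names.
toy = pd.DataFrame({
    "Age_Group": ["18-30", "18-30", "18-30", "31-45", "31-45"],
    "Reached_3w": [0, 0, 0, 0, 0],
    "Reached_3m": [1, 1, 1, 1, 1],
    "FUA_Balance": [10.0, 20.0, 30.0, 100.0, 300.0],
})

# Mean, median, and number of customers per segment.
summary = (
    toy.groupby(["Age_Group", "Reached_3w", "Reached_3m"])["FUA_Balance"]
    .agg(["mean", "median", "count"])
    .reset_index()
)
print(summary)
```

Reading the `count` column alongside `mean` makes it easier to judge which segment differences rest on enough customers to be worth acting on.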