# Customer Churn Feature Integration

In this notebook, we combined multiple datasets to create a **comprehensive, customer-level feature set** suitable for churn modeling. All datasets were merged on the unique identifier `customer_id`.

## Datasets Merged

1. **Transaction History (`transactions_new`)**
   - Features include total spending, average spending, spending frequency, account age, and days since last purchase.

2. **Customer Service (`customer_service_new`)**
   - Features include the number of complaints, feedback entries, and inquiries; the number of resolved and unresolved interactions; complaint, feedback, and unresolved-interaction rates; and recency of the last interaction.

3. **Online Activity (`online_activity_new`)**
   - Features include recency of last login and platform engagement.
   - Mediums with low predictive value (e.g., website) were dropped.

## Merging Process

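- All datasets were merged on `customer_id` using a **left join**, keeping all customers in the transaction dataset.
- A left join keeps every customer but does not guarantee complete features: customers with no matching customer-service records keep their row with `NaN` in those columns (customer 5 in the output, for example). The churn label (`churn_status`) is merged last in the same way.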
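
The left-join behavior described above can be illustrated on toy data. This is a minimal sketch, not the notebook's own code: the frames `transactions` and `service` and their values are hypothetical stand-ins, and the `validate` and `indicator` arguments are optional additions that the original merge does not use.

```python
import pandas as pd

# Toy stand-ins for the real tables (hypothetical, illustrative values only)
transactions = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "total_amount_spent": [416.50, 1547.42, 1702.98],
})
service = pd.DataFrame({
    "customer_id": [1, 3],
    "no_complaints": [0, 2],
})

# validate="one_to_one" raises MergeError if customer_id repeats in either table;
# indicator=True adds a _merge column recording where each row found a match.
merged = transactions.merge(
    service, on="customer_id", how="left", validate="one_to_one", indicator=True
)

# Customers present only in the left table keep their row, with NaN service features
unmatched = merged.loc[merged["_merge"] == "left_only", "customer_id"].tolist()
print(merged)
print("Customers without service records:", unmatched)
```

The `_merge` column makes it easy to count how many customers lack records in a joined table before deciding how to treat the resulting missing values.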

```python
# Merge datasets on customer_id (the cells below use the names t_df, c_df, o_df, and df_new)
customer_df = transactions_new.merge(customer_service_new, on='customer_id', how='left')
customer_df = customer_df.merge(online_activity_new, on='customer_id', how='left')

# Display the first rows of the merged DataFrame
customer_df.head()
```

In [2]:
import pandas as pd
import numpy as np

In [3]:
t_df = pd.read_csv('/Users/mac/PycharmProjects/Customer_Churn/transactions_history')
c_df = pd.read_csv('/Users/mac/PycharmProjects/Customer_Churn/customer_service')

In [6]:
# Left join: customers without service records keep NaN in the service columns
df_new = t_df.merge(c_df, how='left', on='customer_id')
df_new

Unnamed: 0,customer_id,total_amount_spent,number_of_transactions,no_electronics,no_furniture,no_clothing,no_groceries,no_books,days_since_last_purchase,average_spent,...,no_complaints,no_feedbacks,no_inquiry,no_resolved,no_unresolved,no_interactions,recency,complaint_rate,unresolution_rate,feedback_rate
0,1,416.50,1,1,0,0,0,0,1278,416.500000,...,0.0,0.0,1.0,1.0,0.0,1.0,1269.0,0.0,0.0,0.0
1,2,1547.42,7,3,1,2,1,0,1041,221.060000,...,0.0,0.0,1.0,1.0,0.0,1.0,1283.0,0.0,0.0,0.0
2,3,1702.98,6,0,2,1,2,1,1083,283.830000,...,0.0,0.0,1.0,1.0,0.0,1.0,1123.0,0.0,0.0,0.0
3,4,917.29,5,2,1,1,1,0,1003,183.458000,...,0.0,0.0,2.0,1.0,1.0,2.0,1037.0,0.0,0.5,0.0
4,5,2001.49,8,3,2,0,3,0,1009,250.186250,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
995,996,227.25,1,0,0,0,0,1,1159,227.250000,...,,,,,,,,,,
996,997,419.82,2,1,1,0,0,0,1066,209.910000,...,,,,,,,,,,
997,998,252.15,1,0,0,0,0,1,1103,252.150000,...,,,,,,,,,,
998,999,2393.26,9,2,4,0,2,1,1023,265.917778,...,,,,,,,,,,


In [7]:
o_df = pd.read_csv('/Users/mac/PycharmProjects/Customer_Churn/online_activity')
o_df

Unnamed: 0,customer_id,login_frequency,service_usage,days_since_last_login
0,1,34,Mobile App,705
1,2,5,Website,660
2,3,3,Website,680
3,4,2,Website,762
4,5,41,Website,699
...,...,...,...,...
995,996,38,Mobile App,970
996,997,5,Mobile App,908
997,998,47,Website,808
998,999,23,Website,991


In [8]:
df_new = df_new.merge(o_df, how='left', on='customer_id')
df_new

Unnamed: 0,customer_id,total_amount_spent,number_of_transactions,no_electronics,no_furniture,no_clothing,no_groceries,no_books,days_since_last_purchase,average_spent,...,no_resolved,no_unresolved,no_interactions,recency,complaint_rate,unresolution_rate,feedback_rate,login_frequency,service_usage,days_since_last_login
0,1,416.50,1,1,0,0,0,0,1278,416.500000,...,1.0,0.0,1.0,1269.0,0.0,0.0,0.0,34,Mobile App,705
1,2,1547.42,7,3,1,2,1,0,1041,221.060000,...,1.0,0.0,1.0,1283.0,0.0,0.0,0.0,5,Website,660
2,3,1702.98,6,0,2,1,2,1,1083,283.830000,...,1.0,0.0,1.0,1123.0,0.0,0.0,0.0,3,Website,680
3,4,917.29,5,2,1,1,1,0,1003,183.458000,...,1.0,1.0,2.0,1037.0,0.0,0.5,0.0,2,Website,762
4,5,2001.49,8,3,2,0,3,0,1009,250.186250,...,,,,,,,,41,Website,699
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
995,996,227.25,1,0,0,0,0,1,1159,227.250000,...,,,,,,,,38,Mobile App,970
996,997,419.82,2,1,1,0,0,0,1066,209.910000,...,,,,,,,,5,Mobile App,908
997,998,252.15,1,0,0,0,0,1,1103,252.150000,...,,,,,,,,47,Website,808
998,999,2393.26,9,2,4,0,2,1,1023,265.917778,...,,,,,,,,23,Website,991


In [10]:
ch_df = pd.read_csv('/Users/mac/PycharmProjects/Customer_Churn/churn_status')
ch_df

Unnamed: 0,customer_id,churn_status
0,1,0
1,2,1
2,3,0
3,4,0
4,5,0
...,...,...
995,996,0
996,997,0
997,998,0
998,999,0


In [11]:
# Attach the churn label, which serves as the modeling target
df_new = df_new.merge(ch_df, how='left', on='customer_id')
df_new

Unnamed: 0,customer_id,total_amount_spent,number_of_transactions,no_electronics,no_furniture,no_clothing,no_groceries,no_books,days_since_last_purchase,average_spent,...,no_unresolved,no_interactions,recency,complaint_rate,unresolution_rate,feedback_rate,login_frequency,service_usage,days_since_last_login,churn_status
0,1,416.50,1,1,0,0,0,0,1278,416.500000,...,0.0,1.0,1269.0,0.0,0.0,0.0,34,Mobile App,705,0
1,2,1547.42,7,3,1,2,1,0,1041,221.060000,...,0.0,1.0,1283.0,0.0,0.0,0.0,5,Website,660,1
2,3,1702.98,6,0,2,1,2,1,1083,283.830000,...,0.0,1.0,1123.0,0.0,0.0,0.0,3,Website,680,0
3,4,917.29,5,2,1,1,1,0,1003,183.458000,...,1.0,2.0,1037.0,0.0,0.5,0.0,2,Website,762,0
4,5,2001.49,8,3,2,0,3,0,1009,250.186250,...,,,,,,,41,Website,699,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
995,996,227.25,1,0,0,0,0,1,1159,227.250000,...,,,,,,,38,Mobile App,970,0
996,997,419.82,2,1,1,0,0,0,1066,209.910000,...,,,,,,,5,Mobile App,908,0
997,998,252.15,1,0,0,0,0,1,1103,252.150000,...,,,,,,,47,Website,808,0
998,999,2393.26,9,2,4,0,2,1,1023,265.917778,...,,,,,,,23,Website,991,0


In [12]:
# Save the merged feature set without writing the DataFrame index
df_new.to_csv('Merged_Dataset.csv', index=False)
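
Because the left joins leave `NaN` in the customer-service columns for customers with no service records, it may help to audit missing values before modeling. The following is a minimal sketch on a hypothetical toy frame that mimics a few merged columns. Filling count columns with zero is an assumption about what a missing record means, not something the notebook does.

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame mimicking a few columns of the merged output
merged = pd.DataFrame({
    "customer_id": [1, 4, 5],
    "no_complaints": [0.0, 0.0, np.nan],
    "no_interactions": [1.0, 2.0, np.nan],
    "recency": [1269.0, 1037.0, np.nan],
})

# Share of missing values per column
missing_share = merged.isna().mean()

# Flag customers with no service history before changing any NaN values
merged["has_service_history"] = merged["no_interactions"].notna()

# Assumption: no service record means zero interactions, so count columns get 0.
# recency stays NaN because 0 would wrongly read as "contacted today".
count_cols = ["no_complaints", "no_interactions"]
merged[count_cols] = merged[count_cols].fillna(0)

print(missing_share)
print(merged)
```

Keeping a separate flag column preserves the distinction between "no interactions recorded" and "zero complaints among recorded interactions" after the fill.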