# Project: Retail Analytics - VIP Customer Identification

## Task Title: Identify VIP Customers Based on Purchase Behavior & Membership


### Objective:

1. Find customers whose total purchases exceed $250.
2. Among those, return only Gold members.

### Data Sources:

- purchases.csv
- members.csv

### Final Output:
- customer_id
- total_spent
- membership_level

### Methodology:
1. Inner-join purchases and members on customer_id.
2. Aggregate the purchase amount per customer using groupby.
3. Keep customers with:
     - total_spent > 250
     - membership_level == 'Gold'
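
The three steps above can be sketched as one function. This is a minimal sketch, not the notebook's exact cell sequence: the function name `find_vip_customers` and the parameters `min_total` and `level` are hypothetical, and it assumes each `customer_id` appears at most once in the members table.

```python
import pandas as pd


def find_vip_customers(purchases, members, min_total=250, level="Gold"):
    """Return customers at `level` whose summed purchases exceed `min_total`."""
    # Step 1: inner join keeps only customers present in both tables.
    merged = purchases.merge(members, on="customer_id", how="inner")
    # Step 2: one row per customer with the summed amount.
    totals = merged.groupby(
        ["customer_id", "membership_level"], as_index=False
    ).agg(total_spent=("amount", "sum"))
    # Step 3: apply both filters at once.
    mask = (totals["total_spent"] > min_total) & (totals["membership_level"] == level)
    columns = ["customer_id", "total_spent", "membership_level"]
    return totals.loc[mask, columns].reset_index(drop=True)


# Same sample values as the simulated data in this notebook.
purchases = pd.DataFrame({
    "transaction_id": ["T1001", "T1002", "T1003", "T1004"],
    "customer_id": ["C001", "C002", "C001", "C003"],
    "amount": [120.50, 200.00, 300.00, 50.00],
})
members = pd.DataFrame({
    "customer_id": ["C001", "C002", "C004"],
    "membership_level": ["Gold", "Silver", "Gold"],
})

vip = find_vip_customers(purchases, members)
print(vip)
```

Filtering on membership level before or after the aggregation gives the same customers under the stated assumption, because the level is a per-customer attribute.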

### Tools Used:
- Python with Pandas
- SQL, as an alternative when the data lives in a relational database
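
The notebook itself only shows the pandas route. As a rough illustration of the SQL alternative, the sketch below loads the same sample values into an in-memory SQLite database from Python and runs an equivalent query. The table names `purchases` and `members` and the choice of SQLite are assumptions made for this example.

```python
import sqlite3

import pandas as pd

purchases = pd.DataFrame({
    "transaction_id": ["T1001", "T1002", "T1003", "T1004"],
    "customer_id": ["C001", "C002", "C001", "C003"],
    "amount": [120.50, 200.00, 300.00, 50.00],
})
members = pd.DataFrame({
    "customer_id": ["C001", "C002", "C004"],
    "membership_level": ["Gold", "Silver", "Gold"],
})

# In-memory database used only for this illustration.
conn = sqlite3.connect(":memory:")
purchases.to_sql("purchases", conn, index=False)
members.to_sql("members", conn, index=False)

query = """
SELECT p.customer_id,
       SUM(p.amount) AS total_spent,
       m.membership_level
FROM purchases AS p
INNER JOIN members AS m ON m.customer_id = p.customer_id
WHERE m.membership_level = 'Gold'
GROUP BY p.customer_id, m.membership_level
HAVING SUM(p.amount) > 250
"""
vip_sql = pd.read_sql_query(query, conn)
conn.close()
print(vip_sql)
```

`WHERE` filters rows before grouping, while `HAVING` filters on the aggregated total, which mirrors the two filters in the methodology.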

### Import the pandas library to work with CSV files and DataFrames

In [2]:
import pandas as pd

### STEP 1: Simulate the purchases data

In [13]:
purchases_data = {
    "transaction_id": ['T1001', 'T1002', 'T1003', 'T1004'],
    "customer_id": ['C001', 'C002', 'C001', 'C003'],
    "amount": [120.50, 200.00, 300.00, 50.00],
    "transaction_date": ['2024-03-15', '2024-03-15', '2024-03-16', '2024-03-17']
}
purchases = pd.DataFrame(purchases_data)
purchases

  transaction_id customer_id  amount transaction_date
0          T1001        C001   120.5       2024-03-15
1          T1002        C002   200.0       2024-03-15
2          T1003        C001   300.0       2024-03-16
3          T1004        C003    50.0       2024-03-17


### STEP 2: Simulate the members data

In [12]:
members_data = {
    "customer_id": ['C001', 'C002', 'C004'],
    "member_since": ['2023-05-01', '2023-11-15', '2024-01-10'],
    "membership_level": ['Gold', 'Silver', 'Gold']
}
members = pd.DataFrame(members_data)
members

  customer_id member_since membership_level
0        C001   2023-05-01             Gold
1        C002   2023-11-15           Silver
2        C004   2024-01-10             Gold
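
Steps 1 and 2 simulate the data in memory. If the real `purchases.csv` and `members.csv` listed under Data Sources are available, loading them might look like the sketch below. It assumes the files use the same column names as the simulated data, and it uses `io.StringIO` objects as stand-ins for the files so the example runs on its own.

```python
import io

import pandas as pd

# Stand-ins for the CSV files; replace with real paths such as "purchases.csv".
purchases_csv = io.StringIO(
    "transaction_id,customer_id,amount,transaction_date\n"
    "T1001,C001,120.50,2024-03-15\n"
    "T1002,C002,200.00,2024-03-15\n"
    "T1003,C001,300.00,2024-03-16\n"
    "T1004,C003,50.00,2024-03-17\n"
)
members_csv = io.StringIO(
    "customer_id,member_since,membership_level\n"
    "C001,2023-05-01,Gold\n"
    "C002,2023-11-15,Silver\n"
    "C004,2024-01-10,Gold\n"
)

# parse_dates converts the date columns from strings to datetime64 values.
purchases = pd.read_csv(purchases_csv, parse_dates=["transaction_date"])
members = pd.read_csv(members_csv, parse_dates=["member_since"])

print(purchases.dtypes)
print(members.dtypes)
```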


### STEP 3: Merge both DataFrames

In [10]:
merged_df = pd.merge(purchases, members, on="customer_id", how="inner")
merged_df

  transaction_id customer_id  amount transaction_date member_since membership_level
0          T1001        C001   120.5       2024-03-15   2023-05-01             Gold
1          T1003        C001   300.0       2024-03-16   2023-05-01             Gold
2          T1002        C002   200.0       2024-03-15   2023-11-15           Silver
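
The inner join silently drops C003 (a purchase without a membership record) and C004 (a member without purchases). One way to see which rows fall out is an outer merge with `indicator=True`. This is an optional check added for illustration and is not part of the original pipeline; the names `audit` and `unmatched` are hypothetical.

```python
import pandas as pd

purchases = pd.DataFrame({
    "transaction_id": ["T1001", "T1002", "T1003", "T1004"],
    "customer_id": ["C001", "C002", "C001", "C003"],
    "amount": [120.50, 200.00, 300.00, 50.00],
})
members = pd.DataFrame({
    "customer_id": ["C001", "C002", "C004"],
    "membership_level": ["Gold", "Silver", "Gold"],
})

# An outer merge keeps every row; the _merge column records where each came from.
audit = pd.merge(purchases, members, on="customer_id", how="outer", indicator=True)

# Rows that an inner join would have discarded.
unmatched = audit.loc[audit["_merge"] != "both", ["customer_id", "_merge"]]
print(unmatched)
```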


### STEP 4: Filter only Gold members

In [14]:
gold_members = merged_df[merged_df["membership_level"] == "Gold"]
gold_members

  transaction_id customer_id  amount transaction_date member_since membership_level
0          T1001        C001   120.5       2024-03-15   2023-05-01             Gold
1          T1003        C001   300.0       2024-03-16   2023-05-01             Gold


### STEP 5: Group by customer_id and calculate total_spent

In [17]:
vip_summary = gold_members.groupby(["customer_id", "membership_level"])["amount"].sum().reset_index()
vip_summary

  customer_id membership_level  amount
0        C001             Gold   420.5


### STEP 6: Filter those with total_spent > 250

In [18]:
vip_customers = vip_summary[vip_summary["amount"] > 250]
vip_customers

  customer_id membership_level  amount
0        C001             Gold   420.5


### STEP 7: Rename 'amount' to 'total_spent'

In [19]:
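# Assign the renamed frame instead of using inplace=True: vip_customers is a
# filtered selection of vip_summary, and modifying it in place can trigger
# pandas' SettingWithCopyWarning.
vip_customers = vip_customers.rename(columns={"amount": "total_spent"})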
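
As an alternative to renaming afterwards, the aggregation in Step 5 could name the column directly through pandas named aggregation. The sketch below rebuilds a small `gold_members` frame with the same values so that it runs on its own. It is one possible variant, not the notebook's original approach.

```python
import pandas as pd

# Same Gold-member rows as produced in Step 4 (only the columns needed here).
gold_members = pd.DataFrame({
    "customer_id": ["C001", "C001"],
    "membership_level": ["Gold", "Gold"],
    "amount": [120.50, 300.00],
})

# Named aggregation: the keyword becomes the output column name.
vip_summary = gold_members.groupby(
    ["customer_id", "membership_level"], as_index=False
).agg(total_spent=("amount", "sum"))

vip_customers = vip_summary[vip_summary["total_spent"] > 250]
print(vip_customers)
```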

### STEP 8: Show result

In [20]:
print(vip_customers)

  customer_id membership_level  total_spent
0        C001             Gold        420.5
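
The printed result lists `membership_level` before `total_spent`, while the Final Output section lists `customer_id`, `total_spent`, `membership_level`. If that order matters, selecting the columns explicitly is one way to match it. A minimal sketch, with the result frame rebuilt from the values above and `final_output` as a hypothetical name:

```python
import pandas as pd

# Result of Step 7, rebuilt here so the snippet is self-contained.
vip_customers = pd.DataFrame({
    "customer_id": ["C001"],
    "membership_level": ["Gold"],
    "total_spent": [420.5],
})

# Select columns in the order given under "Final Output".
final_output = vip_customers[["customer_id", "total_spent", "membership_level"]]
print(final_output.to_string(index=False))
```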


- Author: Manish Devdi
- Date: April 2025