# E-Commerce Customer Behavior Analysis

**Name:** Melvin Abraham 
**Course:** Data Science, AI & Machine Learning  
**Focus:** Understanding customer purchase behavior using Python  

This notebook documents my step-by-step approach to analyzing customer transaction data for an e-commerce platform. The goal is to explore customer behavior patterns and generate insights that can support marketing and retention strategies.

## Dataset Overview

The dataset contains customer transaction records, with purchases recorded in every calendar month from January through December.  
Each row represents a single purchase made by a customer.

### Columns in the dataset:
- Customer_ID
- Customer_Name
- Email
- Signup_Date
- Transaction_Date
- Product_Category
- Purchase_Amount
- Payment_Method
- Device

Since customers can make multiple purchases, the dataset is transaction-level and requires aggregation to analyze customer-level behavior.


In [40]:
import csv
from datetime import datetime

## Loading and Cleaning Transaction Data

The first step is to load the transaction data from the CSV file.  
Basic cleaning is applied to ensure:
- Purchase amounts are numeric
- Date fields are converted into datetime objects
- Incomplete or invalid rows are skipped

This ensures that all subsequent analysis is based on clean and consistent data.
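The cleaning rules above can be exercised on a small in-memory CSV. The sketch below is illustrative only: `parse_row` and the sample rows are hypothetical, and it checks just three of the nine columns, assuming the `%Y-%m-%d` date format used by the loader in this notebook.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample data: one valid row, one with a missing amount,
# one with a non-numeric amount, and one with a malformed date.
SAMPLE_CSV = """Customer_ID,Transaction_Date,Purchase_Amount
CUST-0001,2024-01-15,49.99
CUST-0002,2024-01-16,
CUST-0003,2024-01-17,abc
CUST-0004,17/01/2024,20.00
"""

def parse_row(row):
    """Return a cleaned dict, or None if the row is incomplete or invalid."""
    if not row["Customer_ID"] or not row["Purchase_Amount"]:
        return None
    try:
        return {
            "customer_id": row["Customer_ID"].strip(),
            "transaction_date": datetime.strptime(row["Transaction_Date"], "%Y-%m-%d"),
            "amount": float(row["Purchase_Amount"]),
        }
    except ValueError:
        # Raised by strptime for a bad date or by float for a bad amount
        return None

rows = csv.DictReader(io.StringIO(SAMPLE_CSV))
cleaned = [parsed for parsed in map(parse_row, rows) if parsed is not None]
print(len(cleaned))  # only the first row survives
```

Returning `None` for a rejected row keeps the skip logic in one place, which also makes it easy to count how many rows were dropped if that becomes useful.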


In [41]:
def load_transaction_data(filename):
    transactions = []

    with open(filename, mode="r", encoding="utf-8") as file:
        reader = csv.DictReader(file)

        for row in reader:
            try:
                if not row["Customer_ID"] or not row["Purchase_Amount"]:
                    continue

                transaction = {
                    "customer_id": row["Customer_ID"].strip(),
                    "name": row["Customer_Name"].strip(),
                    "email": row["Email"].strip(),
                    "signup_date": datetime.strptime(row["Signup_Date"], "%Y-%m-%d"),
                    "transaction_date": datetime.strptime(row["Transaction_Date"], "%Y-%m-%d"),
                    "category": row["Product_Category"].strip(),
                    "amount": float(row["Purchase_Amount"]),
                    "payment_method": row["Payment_Method"].strip(),
                    "device": row["Device"].strip()
                }

                transactions.append(transaction)

            except (ValueError, KeyError):
                continue

    return transactions

In [42]:
transactions = load_transaction_data("customer_transactions.csv")
len(transactions)

6226

## Building Customer Profiles

To analyze customer behavior, transactions are grouped by Customer_ID.  
For each customer, a profile is created that tracks:
- Total number of transactions
- Total amount spent
- Individual transaction history

This structure makes it easier to perform customer-level analysis later.

In [43]:
print("Jupyter is working 🎉")

Jupyter is working 🎉


In [44]:
def build_customer_profiles(transactions):
    profiles = {}

    for txn in transactions:
        cid = txn["customer_id"]

        if cid not in profiles:
            profiles[cid] = {
                "customer_id": cid,
                "name": txn["name"],
                "email": txn["email"],
                "signup_date": txn["signup_date"],
                "total_transactions": 0,
                "total_spent": 0.0,
                "transactions": []
            }

        profiles[cid]["transactions"].append(txn)
        profiles[cid]["total_transactions"] += 1
        profiles[cid]["total_spent"] += txn["amount"]

    return profiles


In [45]:
profiles = build_customer_profiles(transactions)
len(profiles)

500

## Customer Database Summary

After building customer profiles, summary statistics are calculated to understand the size of the customer base, average engagement, and top customers by activity and spending.


In [46]:
def display_customer_summary(customer_profiles):
    total_customers = len(customer_profiles)

    total_transactions = sum(p["total_transactions"] for p in customer_profiles.values())
    total_revenue = sum(p["total_spent"] for p in customer_profiles.values())

    avg_transactions = total_transactions / total_customers
    avg_spend = total_revenue / total_customers

    most_active = max(customer_profiles.values(), key=lambda p: p["total_transactions"])
    highest_spender = max(customer_profiles.values(), key=lambda p: p["total_spent"])

    print("CUSTOMER DATABASE SUMMARY")
    print("=" * 30)
    print(f"Total Customers: {total_customers}")
    print(f"Average Transactions per Customer: {avg_transactions:.2f}")
    print(f"Average Customer Lifetime Value: ${avg_spend:.2f}")
    print(
        f"Most Active Customer: {most_active['customer_id']} "
        f"({most_active['total_transactions']} transactions)"
    )
    print(
        f"Highest Spending Customer: {highest_spender['customer_id']} "
        f"(${highest_spender['total_spent']:.2f})"
    )


display_customer_summary(profiles)

CUSTOMER DATABASE SUMMARY
Total Customers: 500
Average Transactions per Customer: 12.45
Average Customer Lifetime Value: $5303.10
Most Active Customer: CUST-0171 (15 transactions)
Highest Spending Customer: CUST-0407 ($11917.21)


## Purchase Behavior Analysis

After building customer profiles, the next step was to understand how customers
behave when making purchases. This section focuses on purchase frequency,
customer value, and product category preferences.


## Quick Explanation

In this section, we analyze how frequently customers make purchases and how much value they contribute over time.
Understanding purchase frequency helps the business identify casual buyers versus loyal customers, while customer lifetime value (CLV) highlights the most valuable customers.


In [47]:
def analyze_purchase_frequency(customer_profiles):
    """
    Categorize customers based on how often they purchase.
    """
    categories = {
        "One-time": [],
        "Occasional": [],
        "Regular": [],
        "Frequent": []
    }

    for customer_id, profile in customer_profiles.items():
        count = profile["total_transactions"]
        total_spent = profile["total_spent"]

        if count == 1:
            categories["One-time"].append(total_spent)
        elif 2 <= count <= 3:
            categories["Occasional"].append(total_spent)
        elif 4 <= count <= 6:
            categories["Regular"].append(total_spent)
        else:
            categories["Frequent"].append(total_spent)

    return categories

In [48]:
frequency_data = analyze_purchase_frequency(profiles)

total_customers = len(profiles)

print("PURCHASE FREQUENCY DISTRIBUTION")
print("=" * 35)

for category, spends in frequency_data.items():
    count = len(spends)
    percentage = (count / total_customers) * 100
    avg_spend = sum(spends) / count if count > 0 else 0

    print(
        f"{category:<12}: {count:>4} "
        f"({percentage:>5.1f}%) | Avg Spend: ${avg_spend:,.2f}"
    )

PURCHASE FREQUENCY DISTRIBUTION
One-time    :    0 (  0.0%) | Avg Spend: $0.00
Occasional  :    0 (  0.0%) | Avg Spend: $0.00
Regular     :    0 (  0.0%) | Avg Spend: $0.00
Frequent    :  500 (100.0%) | Avg Spend: $5,303.10


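### Insights from Purchase Frequency

- All 500 customers land in the "Frequent" bucket (7 or more transactions), so the One-time, Occasional, and Regular buckets are empty for this dataset.
- With an average of 12.45 transactions per customer, the current thresholds are too low to separate casual buyers from loyal ones. They would need to be rescaled, for example relative to the observed distribution, before the buckets can guide onboarding or retention campaigns.
- The "Frequent" average spend of $5,303.10 is simply the overall average customer lifetime value, so it cannot yet be compared against other groups.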
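The output above places all 500 customers in a single bucket, which suggests the fixed cut-offs do not fit this dataset. One alternative is to derive the cut-offs from the data itself. The following is a minimal sketch, not the method used in this notebook: the function name `frequency_buckets_by_quantile`, the Low/Mid/High labels, and the sample counts are all assumptions for illustration.

```python
from statistics import quantiles

def frequency_buckets_by_quantile(transaction_counts):
    """Split customers into Low/Mid/High frequency using tercile cut-offs.

    transaction_counts: dict mapping customer_id -> number of transactions.
    """
    # quantiles(..., n=3) returns the two cut points that split the data into thirds
    low_cut, high_cut = quantiles(transaction_counts.values(), n=3)

    buckets = {"Low": [], "Mid": [], "High": []}
    for customer_id, count in transaction_counts.items():
        if count <= low_cut:
            buckets["Low"].append(customer_id)
        elif count <= high_cut:
            buckets["Mid"].append(customer_id)
        else:
            buckets["High"].append(customer_id)
    return buckets

# Hypothetical transaction counts
sample_counts = {"A": 9, "B": 10, "C": 11, "D": 12, "E": 13, "F": 14, "G": 15}
buckets = frequency_buckets_by_quantile(sample_counts)

for label, members in buckets.items():
    print(f"{label}: {members}")
```

Relative cut-offs guarantee that no bucket is empty on reasonably varied data, at the cost of the labels no longer meaning a fixed number of purchases.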


In [49]:
def calculate_customer_lifetime_value(transactions):
    """
    Calculate total spending per customer and sort by highest value.
    """
    clv = {}

    for tx in transactions:
        customer_id = tx["customer_id"]
        amount = tx["amount"]

        clv[customer_id] = clv.get(customer_id, 0) + amount

    return dict(sorted(clv.items(), key=lambda x: x[1], reverse=True))

In [50]:
clv_data = calculate_customer_lifetime_value(transactions)

print("TOP 5 CUSTOMERS BY LIFETIME VALUE")
print("=" * 40)

for i, (customer_id, value) in enumerate(list(clv_data.items())[:5], start=1):
    print(f"{i}. {customer_id}: ${value:,.2f}")

TOP 5 CUSTOMERS BY LIFETIME VALUE
1. CUST-0407: $11,917.21
2. CUST-0194: $11,581.28
3. CUST-0081: $11,418.96
4. CUST-0446: $11,341.09
5. CUST-0183: $11,143.36


### Customer Lifetime Value Insights

Customers with the highest lifetime value are critical to business growth.
These customers should be prioritized for loyalty programs, exclusive offers, and personalized marketing strategies to maximize retention and long-term revenue.


In [51]:
def segment_customers_by_value(customer_profiles):
    segments = {
        "VIP": [],
        "High Value": [],
        "Medium Value": [],
        "Low Value": []
    }

    for customer_id, profile in customer_profiles.items():
        spent = profile["total_spent"]

        if spent > 1000:
            segments["VIP"].append(profile)
        elif spent >= 500:
            segments["High Value"].append(profile)
        elif spent >= 200:
            segments["Medium Value"].append(profile)
        else:
            segments["Low Value"].append(profile)

    return segments

In [52]:
segments = segment_customers_by_value(profiles)

for segment, customers in segments.items():
    print(f"{segment}: {len(customers)} customers")

VIP: 499 customers
High Value: 1 customers
Medium Value: 0 customers
Low Value: 0 customers


In [53]:
from datetime import datetime, timedelta

def identify_at_risk_customers(transactions, days_threshold=90):
    last_purchase = {}
    clv = {}

    for tx in transactions:
        cid = tx["customer_id"]
        date = tx["transaction_date"]
        amount = tx["amount"]

        if cid not in last_purchase or date > last_purchase[cid]:
            last_purchase[cid] = date

        clv[cid] = clv.get(cid, 0) + amount

    cutoff_date = datetime.now() - timedelta(days=days_threshold)

    at_risk = []
    for cid, last_date in last_purchase.items():
        if last_date < cutoff_date:
            at_risk.append({
                "customer_id": cid,
                "days_since_last_purchase": (datetime.now() - last_date).days,
                "historic_clv": clv[cid]
            })

    return sorted(at_risk, key=lambda x: x["historic_clv"], reverse=True)

In [54]:
at_risk_customers = identify_at_risk_customers(transactions)

print(f"Total at-risk customers: {len(at_risk_customers)}")

at_risk_customers[:5]

Total at-risk customers: 500


[{'customer_id': 'CUST-0407',
  'days_since_last_purchase': 411,
  'historic_clv': 11917.209999999997},
 {'customer_id': 'CUST-0194',
  'days_since_last_purchase': 426,
  'historic_clv': 11581.279999999999},
 {'customer_id': 'CUST-0081',
  'days_since_last_purchase': 420,
  'historic_clv': 11418.960000000001},
 {'customer_id': 'CUST-0446',
  'days_since_last_purchase': 408,
  'historic_clv': 11341.09},
 {'customer_id': 'CUST-0183',
  'days_since_last_purchase': 407,
  'historic_clv': 11143.359999999999}]

In [55]:
def find_loyal_customers(customer_profiles, min_transactions=5):
    loyal = []

    for profile in customer_profiles.values():
        if profile["total_transactions"] >= min_transactions:
            avg_order = profile["total_spent"] / profile["total_transactions"]
            loyal.append({
                "customer_id": profile["customer_id"],
                "transactions": profile["total_transactions"],
                "avg_order_value": avg_order
            })

    return sorted(loyal, key=lambda x: x["avg_order_value"], reverse=True)

In [56]:
loyal_customers = find_loyal_customers(profiles)

print(f"Loyal customers: {len(loyal_customers)}")

loyal_customers[:5]

Loyal customers: 500


[{'customer_id': 'CUST-0407',
  'transactions': 11,
  'avg_order_value': 1083.382727272727},
 {'customer_id': 'CUST-0081',
  'transactions': 13,
  'avg_order_value': 878.3815384615385},
 {'customer_id': 'CUST-0269',
  'transactions': 12,
  'avg_order_value': 832.0408333333334},
 {'customer_id': 'CUST-0194',
  'transactions': 14,
  'avg_order_value': 827.2342857142856},
 {'customer_id': 'CUST-0203',
  'transactions': 11,
  'avg_order_value': 816.6790909090909}]

## Part 3: Customer Segmentation
Objective:

The goal of this section is to move beyond raw transaction metrics and group customers into meaningful segments based on their value and behavior.
This helps the business identify:

- High-value customers worth retaining
- Customers at risk of churn
- Loyal customers who contribute consistently to revenue

## Customer Segmentation by Total Spend

To understand customer value, I segmented customers based on their total lifetime spend on the platform. Each customer was placed into one of four value-based groups:

- VIP Customers: total spend greater than $1000
- High Value Customers: total spend from $500 to $1000 (inclusive)
- Medium Value Customers: total spend from $200 up to, but not including, $500
- Low Value Customers: total spend less than $200

This segmentation allows the business to prioritize personalized marketing strategies and allocate resources efficiently.

Approach:

- Iterated through the customer profiles built in Part 1
- Evaluated each customer's total spending
- Assigned them to the appropriate value segment
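The boundary behavior of these thresholds is easy to get wrong, so a small check helps. This is a sketch that mirrors the thresholds listed above; the helper name `value_segment` and the test amounts are hypothetical.

```python
def value_segment(total_spent):
    """Map a customer's total spend to a value segment (thresholds from this notebook)."""
    if total_spent > 1000:
        return "VIP"
    if total_spent >= 500:
        return "High Value"
    if total_spent >= 200:
        return "Medium Value"
    return "Low Value"

# Boundary checks: exactly $1000 is High Value, exactly $200 is Medium Value
for amount in (1000.01, 1000, 500, 499.99, 200, 199.99):
    print(f"${amount:>8.2f} -> {value_segment(amount)}")
```

Keeping the rule in a single function would also let the segmentation and export steps share one definition instead of repeating the comparisons.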

## Identifying At-Risk Customers

Customer retention is critical for long-term growth. In this section, I identified at-risk customers: those who have not made a purchase within the last 90 days.

For each customer, I:

- Tracked the most recent transaction date
- Compared it with a reference date (the current date in the first version, and the latest transaction date in the dataset in the revised version at the end of this notebook)
- Calculated the number of days since the last purchase
- Retrieved the customer's historic lifetime value (CLV)

Customers exceeding the inactivity threshold were flagged as at-risk and sorted by their historic CLV.
This helps the marketing team focus retention efforts on customers with the highest potential revenue loss.
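One caveat when measuring inactivity against the current date: for a historical dataset, every customer eventually exceeds the threshold, which is why the first run in this notebook flags all 500 customers. A minimal sketch of the alternative, measuring from the latest purchase date in the data, is shown below. The helper name `days_inactive`, its `as_of` parameter, and the sample dates are assumptions for illustration.

```python
from datetime import datetime

def days_inactive(last_purchase_by_customer, as_of=None):
    """Return days since each customer's last purchase, measured from `as_of`.

    If `as_of` is omitted, the latest purchase date in the data is used,
    which keeps the result stable no matter when the notebook is run.
    """
    if as_of is None:
        as_of = max(last_purchase_by_customer.values())
    return {
        customer_id: (as_of - last_date).days
        for customer_id, last_date in last_purchase_by_customer.items()
    }

# Hypothetical last-purchase dates
last_purchase = {
    "CUST-A": datetime(2024, 12, 30),
    "CUST-B": datetime(2024, 9, 1),
    "CUST-C": datetime(2024, 6, 15),
}

inactive = days_inactive(last_purchase)
at_risk = [cid for cid, days in inactive.items() if days >= 90]
print(at_risk)  # ['CUST-B', 'CUST-C']
```

Passing an explicit `as_of` date is still possible, for example when the analysis should be anchored to a reporting date rather than to the data.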

## Identifying Loyal Customers

Loyal customers are a strong indicator of product-market fit and long-term stability.
In this section, I identified customers who have completed five or more transactions.

For these customers, I calculated:

- Total number of transactions
- Average order value (total spend ÷ total transactions)

The customers were then ranked by their average order value to highlight those who consistently make higher-value purchases.

This analysis helps the business recognize and reward loyalty, as well as design targeted upsell and referral programs.
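As a quick worked example of the average order value formula, the figures reported for CUST-0407 in this notebook's outputs can be plugged in directly. The variable names are only for illustration.

```python
# Figures for CUST-0407 as reported in this notebook's outputs
total_spent = 11917.21
total_transactions = 11

avg_order_value = total_spent / total_transactions
print(f"${avg_order_value:,.2f}")  # $1,083.38
```

This matches the `avg_order_value` shown for the top-ranked loyal customer.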

## Summary of Insights from Part 3

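- With the current thresholds, 499 of 500 customers qualify as VIP (total spend above $1000), so the value segments do not yet differentiate the customer base
- Lifetime value still varies widely: the average is $5,303.10, while the top customer has spent $11,917.21
- Measured against the current date, all 500 customers exceed the 90-day inactivity threshold, which reflects the age of the dataset rather than genuine churn
- All 500 customers meet the five-transaction loyalty threshold, so the ranking by average order value is what distinguishes them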

## Part 4: Payment & Device Analytics
Objective:

In this section, the goal is to understand how customers prefer to pay and which devices they use when making purchases.
These insights are valuable for improving checkout experience, optimizing mobile performance, and tailoring marketing strategies.

## Payment Method Analysis

Different payment methods can indicate different customer behaviors.
Here, I analyzed transactions to:

- Count how many transactions were made using each payment method
- Calculate the average purchase value per payment method
- Identify which payment methods are most popular and which drive higher order values

Approach:

- Iterated through all transactions
- Grouped transactions by payment method
- Tracked total transactions and total revenue per method
- Computed average purchase value

This helps the business understand which payment options should be prioritized or promoted.

## Device Usage Analysis

Customer device preference can strongly influence purchasing behavior.
In this analysis, I examined how customers shop across different devices:

- Mobile
- Desktop
- Tablet

For each device type, I calculated:

- Total number of transactions
- Percentage of total transactions
- Average order value

This provides insight into:

- Which devices customers prefer
- Which devices generate higher-value purchases
- Opportunities for improving user experience across platforms

## Summary of Insights from Part 4

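- Payment methods are used almost evenly, from 1,017 transactions (Apple Pay) to 1,062 (Bank Transfer)
- Average purchase value differs more: Debit Card is highest at $449.51 and Apple Pay lowest at $401.26
- Device usage is also evenly split (Mobile 34.0%, Tablet 33.5%, Desktop 32.5%), with Mobile showing the highest average order value at $438.02 versus $418.08 on Desktop
- These insights can guide checkout optimization and targeted marketing efforts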

In [57]:
from collections import defaultdict

def analyze_payment_methods(transactions):
    payment_stats = defaultdict(lambda: {"count": 0, "total_spent": 0.0})
    
    for tx in transactions:
        method = tx["payment_method"]
        amount = tx["amount"]
        
        payment_stats[method]["count"] += 1
        payment_stats[method]["total_spent"] += amount
    
    results = {}
    for method, stats in payment_stats.items():
        avg_value = stats["total_spent"] / stats["count"]
        results[method] = {
            "transactions": stats["count"],
            "average_purchase": avg_value
        }
    
    return results


payment_analysis = analyze_payment_methods(transactions)

print("PAYMENT METHOD ANALYSIS")
print("=" * 30)
for method, data in payment_analysis.items():
    print(
        f"{method}: {data['transactions']} transactions | "
        f"Avg Purchase: ${data['average_purchase']:.2f}"
    )

PAYMENT METHOD ANALYSIS
Bank Transfer: 1062 transactions | Avg Purchase: $437.30
Apple Pay: 1017 transactions | Avg Purchase: $401.26
Debit Card: 1028 transactions | Avg Purchase: $449.51
Google Pay: 1039 transactions | Avg Purchase: $420.06
PayPal: 1021 transactions | Avg Purchase: $415.62
Credit Card: 1059 transactions | Avg Purchase: $430.76


In [58]:
def analyze_device_usage(transactions):
    device_stats = defaultdict(lambda: {"count": 0, "total_spent": 0.0})
    total_transactions = len(transactions)
    
    for tx in transactions:
        device = tx["device"]
        amount = tx["amount"]
        
        device_stats[device]["count"] += 1
        device_stats[device]["total_spent"] += amount
    
    results = {}
    for device, stats in device_stats.items():
        avg_order = stats["total_spent"] / stats["count"]
        percentage = (stats["count"] / total_transactions) * 100
        
        results[device] = {
            "transactions": stats["count"],
            "percentage": percentage,
            "average_order": avg_order
        }
    
    return results


device_analysis = analyze_device_usage(transactions)

print("\nDEVICE USAGE PATTERNS")
print("=" * 30)
for device, data in device_analysis.items():
    print(
        f"{device}: {data['transactions']} transactions "
        f"({data['percentage']:.1f}%) | "
        f"Avg Order: ${data['average_order']:.2f}"
    )


DEVICE USAGE PATTERNS
Tablet: 2083 transactions (33.5%) | Avg Order: $421.14
Desktop: 2026 transactions (32.5%) | Avg Order: $418.08
Mobile: 2117 transactions (34.0%) | Avg Order: $438.02


## Part 5: Time-Based Purchase Patterns
Objective:

The objective of this section is to analyze how customer purchasing behavior changes over time.
By examining transactions across different months and tracking customer return behavior, we can gain insights into:

- Seasonal trends in purchasing
- Growth or decline in transaction volume
- Customer retention over time

These insights help businesses plan marketing campaigns, promotions, and retention strategies more effectively.

## Monthly Purchase Trends

To understand purchase activity over time, I grouped transactions by month using the transaction date.
For each month, I calculated:

- Total number of transactions
- Total revenue generated
- Month-over-month growth in transactions

Approach:

- Extracted the month from each transaction's date
- Aggregated transactions and revenue by month
- Sorted months chronologically
- Calculated growth compared to the previous month

This analysis highlights peak shopping periods and overall business momentum.
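Grouping by month name alone would merge the same month from different years and depends on the order in which rows are read. The following is a hedged sketch of the steps above that keys on `(year, month)` instead. The function name `monthly_trends`, the output field names, and the sample transactions are assumptions, not part of the original analysis.

```python
from collections import defaultdict
from datetime import datetime

def monthly_trends(transactions):
    """Aggregate transactions by (year, month) and compute month-over-month growth."""
    stats = defaultdict(lambda: {"transactions": 0, "revenue": 0.0})
    for tx in transactions:
        key = (tx["transaction_date"].year, tx["transaction_date"].month)
        stats[key]["transactions"] += 1
        stats[key]["revenue"] += tx["amount"]

    rows = []
    previous = None
    for key in sorted(stats):  # (year, month) tuples sort chronologically
        count = stats[key]["transactions"]
        # Growth is undefined for the first month in the data
        growth = (count - previous) / previous * 100 if previous else None
        rows.append({"month": key, "growth_pct": growth, **stats[key]})
        previous = count
    return rows

# Hypothetical transactions, deliberately out of order and spanning two years
sample = [
    {"transaction_date": datetime(2025, 1, 5), "amount": 30.0},
    {"transaction_date": datetime(2024, 12, 1), "amount": 10.0},
    {"transaction_date": datetime(2024, 12, 20), "amount": 20.0},
    {"transaction_date": datetime(2025, 1, 9), "amount": 5.0},
    {"transaction_date": datetime(2025, 1, 25), "amount": 15.0},
]
trends = monthly_trends(sample)

for row in trends:
    print(row)
```

In this sample, December 2024 comes first with no growth figure, and January 2025 follows with 50% growth in transaction count.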

## Customer Retention Analysis

Customer retention is a key indicator of long-term business success.
In this section, I analyzed how many customers returned to make at least a second purchase.

Additionally, I examined retention by signup cohort, grouping customers based on their signup month and tracking whether they returned.

Approach:

- Counted customers with more than one transaction
- Calculated overall retention rate
- Grouped customers by signup month
- Computed retention rates for each cohort

This provides insight into how well the platform retains customers over time.
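The cohort step described above could look roughly like the sketch below. It assumes each transaction dict carries `customer_id` and a `signup_date` datetime, as produced by the loader in this notebook. The function name `retention_by_signup_cohort` and the sample rows are hypothetical. Because the data is transaction-level, customers who signed up but never purchased do not appear and are therefore not counted.

```python
from collections import defaultdict
from datetime import datetime

def retention_by_signup_cohort(transactions):
    """Return {signup 'YYYY-MM': retention %}, where retained means 2+ purchases."""
    purchase_counts = defaultdict(int)
    signup_month = {}
    for tx in transactions:
        cid = tx["customer_id"]
        purchase_counts[cid] += 1
        signup_month[cid] = tx["signup_date"].strftime("%Y-%m")

    cohort_sizes = defaultdict(int)
    cohort_retained = defaultdict(int)
    for cid, month in signup_month.items():
        cohort_sizes[month] += 1
        if purchase_counts[cid] > 1:
            cohort_retained[month] += 1

    return {
        month: cohort_retained[month] / cohort_sizes[month] * 100
        for month in sorted(cohort_sizes)
    }

# Hypothetical data: A and C purchased twice, B only once
sample = [
    {"customer_id": "A", "signup_date": datetime(2024, 1, 3)},
    {"customer_id": "A", "signup_date": datetime(2024, 1, 3)},
    {"customer_id": "B", "signup_date": datetime(2024, 1, 20)},
    {"customer_id": "C", "signup_date": datetime(2024, 2, 7)},
    {"customer_id": "C", "signup_date": datetime(2024, 2, 7)},
]
cohorts = retention_by_signup_cohort(sample)
print(cohorts)  # {'2024-01': 50.0, '2024-02': 100.0}
```

Using the `YYYY-MM` string as the cohort key keeps cohorts from different years separate and lets a plain `sorted` call order them chronologically.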

## Summary of Insights from Part 5

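- Transaction volume rises through the year, from 362 transactions in January to 701 in December
- Month-over-month growth is positive in every month except November (-2.4%), with the largest increases in December (17.2%) and February (14.4%)
- All 500 customers made more than one purchase, giving an overall retention rate of 100%
- Retention by signup cohort is not reported in this notebook's outputs, so cohort-level differences remain to be verified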

In [59]:
from collections import defaultdict

def analyze_purchase_timing(transactions):
    monthly_stats = defaultdict(lambda: {"transactions": 0, "revenue": 0.0})

    for tx in transactions:
        tx_date = tx["transaction_date"]  # already a datetime object
        month = tx_date.strftime("%B")

        monthly_stats[month]["transactions"] += 1
        monthly_stats[month]["revenue"] += tx["amount"]

    return monthly_stats



monthly_analysis = analyze_purchase_timing(transactions)

print("MONTHLY PURCHASE PATTERNS")
print("=" * 30)

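previous_count = None
# Sort by calendar month so growth is always measured against the previous month,
# regardless of the order in which rows appear in the CSV
month_order = sorted(monthly_analysis, key=lambda m: datetime.strptime(m, "%B").month)
for month in month_order:
    stats = monthly_analysis[month]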
    if previous_count:
        growth = ((stats["transactions"] - previous_count) / previous_count) * 100
        print(
            f"{month}: {stats['transactions']} transactions | "
            f"Revenue: ${stats['revenue']:.2f} ({growth:.1f}% growth)"
        )
    else:
        print(
            f"{month}: {stats['transactions']} transactions | "
            f"Revenue: ${stats['revenue']:.2f}"
        )

    previous_count = stats["transactions"]

MONTHLY PURCHASE PATTERNS
January: 362 transactions | Revenue: $157663.78
February: 414 transactions | Revenue: $166333.42 (14.4% growth)
March: 447 transactions | Revenue: $188512.23 (8.0% growth)
April: 456 transactions | Revenue: $195471.48 (2.0% growth)
May: 494 transactions | Revenue: $219337.76 (8.3% growth)
June: 495 transactions | Revenue: $232745.46 (0.2% growth)
July: 528 transactions | Revenue: $229671.00 (6.7% growth)
August: 558 transactions | Revenue: $244384.28 (5.7% growth)
September: 560 transactions | Revenue: $241890.00 (0.4% growth)
October: 613 transactions | Revenue: $270434.68 (9.5% growth)
November: 598 transactions | Revenue: $217455.87 (-2.4% growth)
December: 701 transactions | Revenue: $287648.40 (17.2% growth)


In [60]:
def calculate_customer_retention(transactions):
    customer_transactions = defaultdict(list)
    
    for tx in transactions:
        customer_transactions[tx["customer_id"]].append(
            tx["transaction_date"]
        )
    
    total_customers = len(customer_transactions)
    retained_customers = sum(
        1 for dates in customer_transactions.values() if len(dates) > 1
    )
    
    retention_rate = (retained_customers / total_customers) * 100
    
    return retention_rate, retained_customers, total_customers


retention_rate, retained, total = calculate_customer_retention(transactions)

print("\nCUSTOMER RETENTION ANALYSIS")
print("=" * 30)
print(f"Overall Retention Rate: {retention_rate:.1f}%")
print(f"Customers with repeat purchases: {retained} of {total}")


CUSTOMER RETENTION ANALYSIS
Overall Retention Rate: 100.0%
Customers with repeat purchases: 500 of 500


Customer retention was calculated by grouping transactions per customer and identifying those with more than one purchase. Since transaction dates were already stored as datetime objects during data loading, they were reused directly to avoid unnecessary re-parsing.

In [61]:
def generate_marketing_report(transactions, customer_profiles, output_file):
    total_revenue = sum(tx["amount"] for tx in transactions)
    total_customers = len(customer_profiles)
    avg_clv = total_revenue / total_customers if total_customers else 0

    with open(output_file, "w", encoding="utf-8") as file:
        file.write("MARKETING INSIGHTS REPORT\n")
        file.write("=" * 40 + "\n\n")

        file.write(f"Total Customers: {total_customers}\n")
        file.write(f"Total Revenue: ${total_revenue:,.2f}\n")
        file.write(f"Average Customer Lifetime Value: ${avg_clv:,.2f}\n\n")

        # The insights below are hardcoded from the analysis outputs in this notebook
        file.write("KEY INSIGHTS:\n")
        file.write("- High-value customers contribute a disproportionate share of revenue.\n")
        file.write("- Every customer in the dataset made repeat purchases.\n")
        file.write("- Mobile users tend to spend slightly more per transaction than desktop users.\n\n")

        file.write("RECOMMENDATIONS:\n")
        file.write("- Focus retention campaigns on high-value and at-risk customers.\n")
        file.write("- Incentivize repeat purchases with loyalty rewards.\n")
        file.write("- Optimize the mobile checkout experience to maximize conversions.\n")

In [62]:
generate_marketing_report(
    transactions,
    profiles,
    "marketing_insights.txt"
)

In [63]:
import csv

def export_customer_segments(customer_profiles, output_file):
    with open(output_file, "w", newline="", encoding="utf-8") as file:
        writer = csv.writer(file)
        writer.writerow([
            "Customer_ID",
            "Name",
            "Email",
            "Segment",
            "Total_Spent",
            "Transaction_Count"
        ])

        for customer_id, data in customer_profiles.items():
            writer.writerow([
                customer_id,
                data["name"],
                data["email"],
                data["segment"],
                f"{data['total_spent']:.2f}",
                data["total_transactions"]
            ])


In [64]:
def apply_customer_segments(customer_profiles):
    for customer_id, data in customer_profiles.items():
        spend = data["total_spent"]

        if spend > 1000:
            segment = "VIP"
        elif 500 <= spend <= 1000:
            segment = "High Value"
        elif 200 <= spend < 500:
            segment = "Medium Value"
        else:
            segment = "Low Value"

        data["segment"] = segment


In [65]:
apply_customer_segments(profiles)

In [66]:
export_customer_segments(profiles, "customer_segments.csv")

In [67]:
from datetime import datetime, timedelta

def identify_at_risk_customers(transactions, profiles, days_threshold=90):
    latest_purchase = {}
    clv = {}

    for tx in transactions:
        cid = tx["customer_id"]
        date = tx["transaction_date"]
        amount = tx["amount"]

        if cid not in latest_purchase or date > latest_purchase[cid]:
            latest_purchase[cid] = date

        clv[cid] = clv.get(cid, 0) + amount

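    # Use the latest transaction date in the dataset as the reference point,
    # so the result does not depend on when the notebook is run
    today = max(latest_purchase.values())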
    at_risk = []

    for cid, last_date in latest_purchase.items():
        days_inactive = (today - last_date).days

        if days_inactive >= days_threshold:
            profile = profiles[cid]

            at_risk.append({
                "customer_id": cid,
                "name": profile["name"],
                "email": profile["email"],
                "days_since_last_purchase": days_inactive,
                "historic_clv": clv[cid]
            })

    return at_risk

In [68]:
at_risk_customers = identify_at_risk_customers(
    transactions=transactions,
    profiles=profiles,
    days_threshold=90
)

In [69]:
at_risk_customers[0]

{'customer_id': 'CUST-0344',
 'name': 'Daniel Obaje',
 'email': 'daniel.martinez767@email.com',
 'days_since_last_purchase': 163,
 'historic_clv': 7545.640000000001}

In [70]:
def export_at_risk_customers(at_risk_customers, output_file):
    import csv

    with open(output_file, "w", newline="", encoding="utf-8") as file:
        writer = csv.writer(file)
        writer.writerow([
            "Customer_ID",
            "Name",
            "Email",
            "Days_Since_Last_Purchase",
            "Historic_CLV"
        ])

        for customer in at_risk_customers:
            writer.writerow([
                customer["customer_id"],
                customer["name"],
                customer["email"],
                customer["days_since_last_purchase"],
                f"{customer['historic_clv']:.2f}"
            ])


In [71]:
export_at_risk_customers(at_risk_customers,"retention_campaign_targets.csv")

To improve clarity and maintain clean function responsibilities, the process of identifying at-risk customers was separated from exporting them. This makes the analysis logic reusable and ensures file-writing operations remain isolated from data computation.