


## Batch Processing: highest-value next step

### Why this matters
- Business intelligence: analyze all customers at once
- Pattern detection: identify common gaps (e.g., "80% missing SPF")
- Prioritization: find highest-value opportunities across the base
- Strategic insights: inform marketing, product, and sales decisions
- Uses existing architecture: loop through customers and aggregate results

### What it would do
1. Process multiple customers (all or a list)
2. Generate aggregate insights:
   - Most common routine gaps
   - Highest-value opportunities by segment
   - Revenue potential across customer base
   - Customer segments by opportunity type
3. Create executive summary report:
   - Total potential revenue
   - Top opportunities by category
   - Customer prioritization (who to target first)
   - Product development insights

### Example output
```
📊 Batch Analysis Report
- Processed: 10 customers
- Total Potential Revenue: $847.50
- Most Common Gap: SPF (70% of customers)
- Highest-Value Segment: Gold tier customers ($312 avg opportunity)
- Top Priority Customers: C001, C004, C007 (high CLV + multiple gaps)
```

### Implementation approach
- Create `run_batch_analysis.py` script
- Add batch processing utilities (aggregate, summarize)
- Generate executive summary report
- Reuse existing workflow (just loop)

This turns the orchestrator from a single-customer analysis tool into a source of business intelligence across the whole customer base.
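
The "just loop" approach can be sketched in a few lines. This is a minimal illustration, not the project's implementation: `fake_workflow` is a hypothetical stand-in for the real `run_workflow`, and the gap lists and revenue figures are made-up sample data. It assumes a non-empty list of customer IDs.

```python
from collections import Counter
from typing import Any, Dict, List


def fake_workflow(customer_id: str) -> Dict[str, Any]:
    """Hypothetical stand-in for run_workflow(); returns made-up results."""
    sample = {
        "C001": {"routine_gaps": ["spf", "toner"], "revenue": 40.0},
        "C002": {"routine_gaps": ["spf"], "revenue": 25.5},
        "C003": {"routine_gaps": ["serum"], "revenue": 19.99},
    }
    return sample[customer_id]


def summarize(customer_ids: List[str]) -> Dict[str, Any]:
    """Run the single-customer workflow in a loop and aggregate the results."""
    gap_counter: Counter = Counter()
    total_revenue = 0.0
    for customer_id in customer_ids:
        result = fake_workflow(customer_id)
        # Count each gap once per customer so percentages stay at or below 100
        gap_counter.update(set(result["routine_gaps"]))
        total_revenue += result["revenue"]

    top_gap, top_count = gap_counter.most_common(1)[0]
    return {
        "processed": len(customer_ids),
        "total_potential_revenue": round(total_revenue, 2),
        "most_common_gap": top_gap,
        "most_common_gap_pct": round(top_count / len(customer_ids) * 100, 1),
    }


summary = summarize(["C001", "C002", "C003"])
print(summary)
```

The aggregation only needs a `Counter` and a running total, which is why the existing single-customer workflow can be reused unchanged.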



In [None]:
"""Batch processing utilities for Cross-Sell & Upsell Orchestrator"""

from typing import Dict, List, Any, Optional
from collections import Counter, defaultdict
from .data_utils import load_all_customers, get_essential_categories
from .workflow import run_workflow


def process_batch(
    customer_ids: Optional[List[str]] = None,
    verbose: bool = False,
    skip_errors: bool = True
) -> Dict[str, Any]:
    """
    Process multiple customers and aggregate results

    Args:
        customer_ids: List of customer IDs to process (None = all customers)
        verbose: Whether to print progress for each customer
        skip_errors: Whether to continue processing if a customer fails

    Returns:
        Dictionary containing:
        - customer_results: List of individual customer results
        - aggregate_insights: Aggregated insights across all customers
        - summary_metrics: Summary statistics
        - errors: List of errors encountered
    """
    # Load all customers if not specified
    if customer_ids is None:
        all_customers = load_all_customers()
        customer_ids = [c.get("customer_id") for c in all_customers if c.get("customer_id")]

    customer_results = []
    errors = []

    # Process each customer
    for i, customer_id in enumerate(customer_ids, 1):
        if verbose:
            print(f"\n[{i}/{len(customer_ids)}] Processing customer {customer_id}...")

        try:
            result = run_workflow(customer_id, verbose=False)
            customer_results.append({
                "customer_id": customer_id,
                "result": result,
                "success": len(result.get("errors", [])) == 0
            })
        except Exception as e:
            error_msg = f"Failed to process {customer_id}: {str(e)}"
            errors.append(error_msg)
            if verbose:
                print(f"  ❌ {error_msg}")
            if not skip_errors:
                raise

    # Aggregate insights
    aggregate_insights = calculate_aggregate_insights(customer_results)

    # Calculate summary metrics
    summary_metrics = calculate_summary_metrics(customer_results, aggregate_insights)

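    return {
        "customer_results": customer_results,
        "aggregate_insights": aggregate_insights,
        "summary_metrics": summary_metrics,
        "errors": errors,
        "total_customers_processed": len(customer_results),
        # Failures: raised exceptions plus results the workflow flagged with errors
        "total_customers_failed": len(errors) + sum(
            1 for r in customer_results if not r["success"]
        )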
    }


def calculate_aggregate_insights(customer_results: List[Dict[str, Any]]) -> Dict[str, Any]:
    """
    Calculate aggregate insights across all processed customers

    Args:
        customer_results: List of customer processing results

    Returns:
        Dictionary of aggregate insights
    """
    successful_results = [r for r in customer_results if r.get("success", False)]

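    if not successful_results:
        # Return the same keys as the populated case so callers see one shape
        return {
            "most_common_gaps": [],
            "opportunity_distribution": {},
            "total_potential_revenue": 0.0,
            "revenue_by_segment": {},
            "top_opportunities": [],
            "customer_segments": {}
        }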

    # Collect all routine gaps
    all_gaps = []
    gap_counter = Counter()

    # Collect opportunity types
    opportunity_types = Counter()
    total_revenue = 0.0

    # Segment analysis
    revenue_by_tier = defaultdict(float)
    revenue_by_sensitivity = defaultdict(float)
    customers_by_tier = defaultdict(int)
    customers_by_sensitivity = defaultdict(int)

    # Top opportunities
    all_opportunities = []

    for result_data in successful_results:
        result = result_data.get("result", {})
        customer_id = result_data.get("customer_id", "unknown")

        # Routine gaps
        routine_gaps = result.get("routine_gaps", [])
        all_gaps.extend(routine_gaps)
        gap_counter.update(routine_gaps)

        # Opportunity summary
        summary = result.get("opportunity_summary", {})
        total_revenue += summary.get("total_potential_revenue", 0.0)

        # Count opportunity types
        cross_sell_count = summary.get("total_cross_sell_opportunities", 0)
        upsell_count = summary.get("total_upsell_opportunities", 0)
        bundle_count = summary.get("total_bundle_opportunities", 0)

        if cross_sell_count > 0:
            opportunity_types["cross_sell"] += cross_sell_count
        if upsell_count > 0:
            opportunity_types["upsell"] += upsell_count
        if bundle_count > 0:
            opportunity_types["bundle"] += bundle_count

        # Customer data for segmentation
        customer_data = result.get("customer_data", {})
        loyalty_tier = customer_data.get("loyalty_tier", "unknown")
        price_sensitivity = customer_data.get("price_sensitivity", "unknown")

        revenue_by_tier[loyalty_tier] += summary.get("total_potential_revenue", 0.0)
        revenue_by_sensitivity[price_sensitivity] += summary.get("total_potential_revenue", 0.0)
        customers_by_tier[loyalty_tier] += 1
        customers_by_sensitivity[price_sensitivity] += 1

        # Collect top opportunities for each customer
        ranked_opportunities = result.get("ranked_opportunities", [])
        for opp in ranked_opportunities[:3]:  # Top 3 per customer
            all_opportunities.append({
                "customer_id": customer_id,
                "opportunity": opp
            })

    # Calculate most common gaps (with percentages)
    total_customers = len(successful_results)
    most_common_gaps = []
    for gap, count in gap_counter.most_common():
        percentage = (count / total_customers * 100) if total_customers > 0 else 0
        most_common_gaps.append({
            "category": gap,
            "count": count,
            "percentage": round(percentage, 1)
        })

    # Calculate average revenue by segment
    avg_revenue_by_tier = {
        tier: revenue_by_tier[tier] / customers_by_tier[tier] if customers_by_tier[tier] > 0 else 0
        for tier in revenue_by_tier
    }

    avg_revenue_by_sensitivity = {
        sensitivity: revenue_by_sensitivity[sensitivity] / customers_by_sensitivity[sensitivity]
        if customers_by_sensitivity[sensitivity] > 0 else 0
        for sensitivity in revenue_by_sensitivity
    }

    # Sort top opportunities by score
    all_opportunities.sort(key=lambda x: x["opportunity"].get("score", 0), reverse=True)

    return {
        "most_common_gaps": most_common_gaps,
        "opportunity_distribution": dict(opportunity_types),
        "total_potential_revenue": round(total_revenue, 2),
        "revenue_by_segment": {
            "by_loyalty_tier": {
                tier: {
                    "total_revenue": round(revenue_by_tier[tier], 2),
                    "avg_revenue": round(avg_revenue_by_tier[tier], 2),
                    "customer_count": customers_by_tier[tier]
                }
                for tier in revenue_by_tier
            },
            "by_price_sensitivity": {
                sensitivity: {
                    "total_revenue": round(revenue_by_sensitivity[sensitivity], 2),
                    "avg_revenue": round(avg_revenue_by_sensitivity[sensitivity], 2),
                    "customer_count": customers_by_sensitivity[sensitivity]
                }
                for sensitivity in revenue_by_sensitivity
            }
        },
        "top_opportunities": all_opportunities[:10],  # Top 10 across all customers
        "customer_segments": {
            "by_tier": dict(customers_by_tier),
            "by_sensitivity": dict(customers_by_sensitivity)
        }
    }


def calculate_summary_metrics(
    customer_results: List[Dict[str, Any]],
    aggregate_insights: Dict[str, Any]
) -> Dict[str, Any]:
    """
    Calculate summary metrics for batch processing

    Args:
        customer_results: List of customer processing results
        aggregate_insights: Aggregate insights dictionary

    Returns:
        Dictionary of summary metrics
    """
    successful_results = [r for r in customer_results if r.get("success", False)]

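    if not successful_results:
        # Every attempted customer failed (or none were given): report that
        # honestly instead of zeroing out the totals
        return {
            "total_customers": len(customer_results),
            "successful_customers": 0,
            "failed_customers": len(customer_results),
            "avg_opportunities_per_customer": 0,
            "avg_revenue_per_customer": 0,
            "total_potential_revenue": 0.0,
            "total_opportunities": 0
        }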

    total_opportunities = 0
    total_revenue = 0.0

    for result_data in successful_results:
        result = result_data.get("result", {})
        summary = result.get("opportunity_summary", {})
        total_opportunities += summary.get("total_opportunities", 0)
        total_revenue += summary.get("total_potential_revenue", 0.0)

    return {
        "total_customers": len(customer_results),
        "successful_customers": len(successful_results),
        "failed_customers": len(customer_results) - len(successful_results),
        "avg_opportunities_per_customer": round(
            total_opportunities / len(successful_results), 2
        ) if successful_results else 0,
        "avg_revenue_per_customer": round(
            total_revenue / len(successful_results), 2
        ) if successful_results else 0,
        "total_potential_revenue": round(total_revenue, 2),
        "total_opportunities": total_opportunities
    }


def identify_priority_customers(
    customer_results: List[Dict[str, Any]],
    top_n: int = 5
) -> List[Dict[str, Any]]:
    """
    Identify priority customers based on opportunity value and customer profile

    Args:
        customer_results: List of customer processing results
        top_n: Number of priority customers to return

    Returns:
        List of priority customer dictionaries with ranking rationale
    """
    successful_results = [r for r in customer_results if r.get("success", False)]

    priority_list = []

    for result_data in successful_results:
        result = result_data.get("result", {})
        customer_id = result_data.get("customer_id", "unknown")
        customer_data = result.get("customer_data", {})
        summary = result.get("opportunity_summary", {})

        # Calculate priority score
        revenue = summary.get("total_potential_revenue", 0.0)
        clv = customer_data.get("lifetime_value", 0.0)
        opportunity_count = summary.get("total_opportunities", 0)
        high_value_count = summary.get("high_value_opportunities", 0)

        # Priority score: revenue + 10% of CLV + 5 per opportunity + 10 per high-value opportunity
        priority_score = revenue + (clv * 0.1) + (opportunity_count * 5) + (high_value_count * 10)

        priority_list.append({
            "customer_id": customer_id,
            "customer_name": customer_data.get("name", "Unknown"),
            "loyalty_tier": customer_data.get("loyalty_tier", "unknown"),
            "lifetime_value": clv,
            "total_potential_revenue": revenue,
            "opportunity_count": opportunity_count,
            "high_value_opportunities": high_value_count,
            "priority_score": round(priority_score, 2),
            "rationale": f"{opportunity_count} opportunities, ${revenue:.2f} potential revenue"
        })

    # Sort by priority score
    priority_list.sort(key=lambda x: x["priority_score"], reverse=True)

    return priority_list[:top_n]
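
The `skip_errors` flag in `process_batch` decides whether one failing customer aborts the whole run. Below is a minimal sketch of that pattern, assuming a hypothetical `flaky_workflow` that fails for one made-up customer ID; neither it nor `run_all` is part of the project.

```python
from typing import Dict, List, Tuple


def flaky_workflow(customer_id: str) -> Dict[str, str]:
    """Hypothetical workflow stand-in that fails for one customer."""
    if customer_id == "C999":
        raise ValueError("customer not found")
    return {"customer_id": customer_id}


def run_all(customer_ids: List[str], skip_errors: bool = True) -> Tuple[list, list]:
    """Collect results and error messages, optionally aborting on failure."""
    results, errors = [], []
    for customer_id in customer_ids:
        try:
            results.append(flaky_workflow(customer_id))
        except Exception as exc:
            errors.append(f"Failed to process {customer_id}: {exc}")
            if not skip_errors:
                raise  # abort the batch on the first failure
    return results, errors


results, errors = run_all(["C001", "C999", "C002"])
print(len(results), errors)
```

With `skip_errors=True` the two valid customers are still processed and the failure is recorded as a message; with `skip_errors=False` the original exception propagates to the caller.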



# test run batch analysis

In [None]:
"""Batch analysis script for Cross-Sell & Upsell Orchestrator

Processes multiple customers and generates aggregate insights.
"""

import sys
import argparse
from pathlib import Path
from datetime import datetime

# Add src to path
sys.path.insert(0, str(Path(__file__).parent))

from src.cross_sell_upsell.batch_utils import (
    process_batch,
    identify_priority_customers
)
from src.cross_sell_upsell.data_utils import load_all_customers
from config import CrossSellUpsellConfig


def generate_batch_report(
    batch_results: dict,
    output_file: str = None
) -> str:
    """
    Generate markdown report for batch analysis

    Args:
        batch_results: Results from process_batch()
        output_file: Optional file path to save report

    Returns:
        Markdown report string
    """
    config = CrossSellUpsellConfig()

    # Extract data
    summary_metrics = batch_results.get("summary_metrics", {})
    aggregate_insights = batch_results.get("aggregate_insights", {})
    priority_customers = identify_priority_customers(
        batch_results.get("customer_results", []),
        top_n=10
    )

    # Generate report
    report_lines = []
    report_lines.append("# Cross-Sell & Upsell Batch Analysis Report")
    report_lines.append("")
    report_lines.append(f"**Generated:** {datetime.now().strftime('%Y-%m-%d %H:%M:%S')}")
    report_lines.append("")

    # Executive Summary
    report_lines.append("## Executive Summary")
    report_lines.append("")
    report_lines.append(f"- **Total Customers Processed:** {summary_metrics.get('total_customers', 0)}")
    report_lines.append(f"- **Successful:** {summary_metrics.get('successful_customers', 0)}")
    report_lines.append(f"- **Failed:** {summary_metrics.get('failed_customers', 0)}")
    report_lines.append(f"- **Total Potential Revenue:** ${summary_metrics.get('total_potential_revenue', 0):,.2f}")
    report_lines.append(f"- **Average Revenue per Customer:** ${summary_metrics.get('avg_revenue_per_customer', 0):,.2f}")
    report_lines.append(f"- **Average Opportunities per Customer:** {summary_metrics.get('avg_opportunities_per_customer', 0):.1f}")
    report_lines.append("")

    # Most Common Gaps
    report_lines.append("## Most Common Routine Gaps")
    report_lines.append("")
    most_common_gaps = aggregate_insights.get("most_common_gaps", [])
    if most_common_gaps:
        report_lines.append("| Category | Customers Missing | Percentage |")
        report_lines.append("|----------|-------------------|------------|")
        for gap in most_common_gaps:
            report_lines.append(
                f"| {gap['category'].title()} | {gap['count']} | {gap['percentage']}% |"
            )
    else:
        report_lines.append("No common gaps identified.")
    report_lines.append("")

    # Opportunity Distribution
    report_lines.append("## Opportunity Distribution")
    report_lines.append("")
    opp_dist = aggregate_insights.get("opportunity_distribution", {})
    if opp_dist:
        total_opps = sum(opp_dist.values())
        for opp_type, count in opp_dist.items():
            percentage = (count / total_opps * 100) if total_opps > 0 else 0
            report_lines.append(f"- **{opp_type.replace('_', ' ').title()}:** {count} ({percentage:.1f}%)")
    else:
        report_lines.append("No opportunities found.")
    report_lines.append("")

    # Revenue by Segment
    report_lines.append("## Revenue by Customer Segment")
    report_lines.append("")

    revenue_by_segment = aggregate_insights.get("revenue_by_segment", {})

    # By Loyalty Tier
    by_tier = revenue_by_segment.get("by_loyalty_tier", {})
    if by_tier:
        report_lines.append("### By Loyalty Tier")
        report_lines.append("")
        report_lines.append("| Tier | Total Revenue | Avg per Customer | Customers |")
        report_lines.append("|------|---------------|------------------|-----------|")
        for tier, data in sorted(by_tier.items(), key=lambda x: x[1]["total_revenue"], reverse=True):
            report_lines.append(
                f"| {tier} | ${data['total_revenue']:,.2f} | ${data['avg_revenue']:,.2f} | {data['customer_count']} |"
            )
        report_lines.append("")

    # By Price Sensitivity
    by_sensitivity = revenue_by_segment.get("by_price_sensitivity", {})
    if by_sensitivity:
        report_lines.append("### By Price Sensitivity")
        report_lines.append("")
        report_lines.append("| Sensitivity | Total Revenue | Avg per Customer | Customers |")
        report_lines.append("|-------------|---------------|------------------|-----------|")
        for sensitivity, data in sorted(by_sensitivity.items(), key=lambda x: x[1]["total_revenue"], reverse=True):
            report_lines.append(
                f"| {sensitivity} | ${data['total_revenue']:,.2f} | ${data['avg_revenue']:,.2f} | {data['customer_count']} |"
            )
        report_lines.append("")

    # Priority Customers
    report_lines.append("## Top Priority Customers")
    report_lines.append("")
    report_lines.append("Customers with highest opportunity value and customer profile scores.")
    report_lines.append("")
    if priority_customers:
        report_lines.append("| Rank | Customer | Tier | CLV | Potential Revenue | Opportunities | Rationale |")
        report_lines.append("|------|----------|------|-----|-------------------|---------------|------------|")
        for i, customer in enumerate(priority_customers, 1):
            report_lines.append(
                f"| {i} | {customer['customer_name']} ({customer['customer_id']}) | "
                f"{customer['loyalty_tier']} | ${customer['lifetime_value']:,.2f} | "
                f"${customer['total_potential_revenue']:,.2f} | {customer['opportunity_count']} | "
                f"{customer['rationale']} |"
            )
    else:
        report_lines.append("No priority customers identified.")
    report_lines.append("")

    # Top Opportunities Across All Customers
    report_lines.append("## Top Opportunities Across All Customers")
    report_lines.append("")
    top_opportunities = aggregate_insights.get("top_opportunities", [])
    if top_opportunities:
        report_lines.append("| Customer | Product | Type | Price | Score |")
        report_lines.append("|----------|---------|------|-------|-------|")
        for item in top_opportunities:  # already capped at 10 by calculate_aggregate_insights()
            opp = item.get("opportunity", {})
            customer_id = item.get("customer_id", "unknown")
            product_name = opp.get("product_name", "Unknown")
            opp_type = opp.get("type", "unknown")
            price = opp.get("price", 0.0)
            score = opp.get("score", 0.0)
            report_lines.append(
                f"| {customer_id} | {product_name} | {opp_type} | ${price:.2f} | {score:.2f} |"
            )
    else:
        report_lines.append("No top opportunities found.")
    report_lines.append("")

    # Errors (if any)
    errors = batch_results.get("errors", [])
    if errors:
        report_lines.append("## Errors Encountered")
        report_lines.append("")
        for error in errors:
            report_lines.append(f"- ⚠️ {error}")
        report_lines.append("")

    report = "\n".join(report_lines)

    # Save to file if specified
    if output_file:
        output_path = Path(output_file)
        output_path.parent.mkdir(parents=True, exist_ok=True)
        # Write as UTF-8 so the emoji in the report survive on any platform
        with open(output_path, 'w', encoding='utf-8') as f:
            f.write(report)
        print(f"\n✅ Batch report saved to: {output_path}")

    return report


def main():
    """Main entry point for batch analysis"""
    parser = argparse.ArgumentParser(
        description="Batch analysis for Cross-Sell & Upsell Orchestrator"
    )
    parser.add_argument(
        "--customers",
        nargs="+",
        help="List of customer IDs to process (default: all customers)"
    )
    parser.add_argument(
        "--output",
        type=str,
        help="Output file path for batch report (default: auto-generated)"
    )
    parser.add_argument(
        "--verbose",
        action="store_true",
        help="Print progress for each customer"
    )

    args = parser.parse_args()

    # Determine customer IDs
    customer_ids = args.customers
    if customer_ids is None:
        all_customers = load_all_customers()
        customer_ids = [c.get("customer_id") for c in all_customers if c.get("customer_id")]
        print(f"\n📊 Processing all {len(customer_ids)} customers...")
    else:
        print(f"\n📊 Processing {len(customer_ids)} specified customers...")

    # Process batch
    print("="*60)
    batch_results = process_batch(
        customer_ids=customer_ids,
        verbose=args.verbose,
        skip_errors=True
    )

    # Generate output file path if not specified
    config = CrossSellUpsellConfig()
    if args.output is None:
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        output_dir = Path(config.reports_dir).parent / "batch_reports"
        output_file = output_dir / f"batch_analysis_{timestamp}.md"
    else:
        output_file = args.output

    # Generate and save report
    report = generate_batch_report(batch_results, output_file=str(output_file))

    # Print summary
    summary = batch_results.get("summary_metrics", {})
    print("\n" + "="*60)
    print("📊 BATCH ANALYSIS COMPLETE")
    print("="*60)
    print(f"✅ Processed: {summary.get('successful_customers', 0)} customers")
    print(f"❌ Failed: {summary.get('failed_customers', 0)} customers")
    print(f"💰 Total Potential Revenue: ${summary.get('total_potential_revenue', 0):,.2f}")
    print(f"📈 Avg Revenue per Customer: ${summary.get('avg_revenue_per_customer', 0):,.2f}")
    print(f"📄 Report saved to: {output_file}")
    print("="*60 + "\n")

    # Print top insights
    insights = batch_results.get("aggregate_insights", {})
    most_common_gaps = insights.get("most_common_gaps", [])
    if most_common_gaps:
        print("🔍 Top Routine Gaps:")
        for gap in most_common_gaps[:3]:
            print(f"   - {gap['category'].title()}: {gap['percentage']}% of customers")
        print()


if __name__ == "__main__":
    main()
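
The command-line flags defined in `main()` can be tried out without running the workflow. The sketch below rebuilds the same three arguments and parses a sample argument list; the customer IDs passed in are example values only.

```python
import argparse

parser = argparse.ArgumentParser(
    description="Batch analysis for Cross-Sell & Upsell Orchestrator"
)
parser.add_argument("--customers", nargs="+")
parser.add_argument("--output", type=str)
parser.add_argument("--verbose", action="store_true")

# Equivalent to: python3 run_batch_analysis.py --customers C001 C004 --verbose
args = parser.parse_args(["--customers", "C001", "C004", "--verbose"])
print(args.customers)  # ['C001', 'C004']
print(args.output)     # None, so main() would auto-generate a timestamped path
print(args.verbose)    # True
```

Omitting `--customers` leaves `args.customers` as `None`, which is the case where `main()` loads every customer.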



In [None]:
(.venv) micahshull@Micahs-iMac LG_Cursor_029_CrossSell_Upsell_Orchestrator % python3 run_batch_analysis.py

📊 Processing all 10 customers...
============================================================
/Users/micahshull/Documents/AI_LangGraph/LG_Cursor_029_CrossSell_Upsell_Orchestrator/.venv/lib/python3.13/site-packages/pydantic/v1/main.py:1054: UserWarning: LangSmith now uses UUID v7 for run and trace identifiers. This warning appears when passing custom IDs. Please use: from langsmith import uuid7
            id = uuid7()
Future versions will require UUID v7.
  input_data = validator(cls_, input_data)

✅ Batch report saved to: output/batch_reports/batch_analysis_20251120_182732.md

============================================================
📊 BATCH ANALYSIS COMPLETE
============================================================
✅ Processed: 10 customers
❌ Failed: 0 customers
💰 Total Potential Revenue: $1,336.62
📈 Avg Revenue per Customer: $133.66
📄 Report saved to: output/batch_reports/batch_analysis_20251120_182732.md
============================================================

🔍 Top Routine Gaps:
   - Spf: 80.0% of customers
   - Toner: 60.0% of customers
   - Serum: 60.0% of customers

(.venv) micahshull@Micahs-iMac LG_Cursor_029_CrossSell_Upsell_Orchestrator %

# Cross-Sell & Upsell Batch Analysis Report

In [None]:
# Cross-Sell & Upsell Batch Analysis Report

**Generated:** 2025-11-20 18:27:32

## Executive Summary

- **Total Customers Processed:** 10
- **Successful:** 10
- **Failed:** 0
- **Total Potential Revenue:** $1,336.62
- **Average Revenue per Customer:** $133.66
- **Average Opportunities per Customer:** 6.7

## Most Common Routine Gaps

| Category | Customers Missing | Percentage |
|----------|-------------------|------------|
| Spf | 8 | 80.0% |
| Toner | 6 | 60.0% |
| Serum | 6 | 60.0% |
| Cleanser | 5 | 50.0% |
| Moisturizer | 5 | 50.0% |

## Opportunity Distribution

- **Cross Sell:** 36 (53.7%)
- **Upsell:** 23 (34.3%)
- **Bundle:** 8 (11.9%)

## Revenue by Customer Segment

### By Loyalty Tier

| Tier | Total Revenue | Avg per Customer | Customers |
|------|---------------|------------------|-----------|
| bronze | $449.42 | $149.81 | 3 |
| silver | $447.43 | $149.14 | 3 |
| gold | $439.77 | $109.94 | 4 |

### By Price Sensitivity

| Sensitivity | Total Revenue | Avg per Customer | Customers |
|-------------|---------------|------------------|-----------|
| medium | $752.78 | $150.56 | 5 |
| high | $419.94 | $139.98 | 3 |
| low | $163.90 | $81.95 | 2 |

## Top Priority Customers

Customers with highest opportunity value and customer profile scores.

| Rank | Customer | Tier | CLV | Potential Revenue | Opportunities | Rationale |
|------|----------|------|-----|-------------------|---------------|------------|
| 1 | Alicia Gomez (C005) | silver | $175.20 | $169.30 | 8 | 8 opportunities, $169.30 potential revenue |
| 2 | Lisa Wang (C009) | gold | $280.50 | $152.30 | 8 | 8 opportunities, $152.30 potential revenue |
| 3 | Emily Chen (C003) | bronze | $142.10 | $155.60 | 8 | 8 opportunities, $155.60 potential revenue |
| 4 | Mark Johnson (C002) | silver | $89.50 | $154.56 | 7 | 7 opportunities, $154.56 potential revenue |
| 5 | Jason Patel (C006) | bronze | $95.30 | $152.01 | 7 | 7 opportunities, $152.01 potential revenue |
| 6 | James Anderson (C010) | bronze | $110.25 | $141.81 | 7 | 7 opportunities, $141.81 potential revenue |
| 7 | Sarah Lee (C001) | gold | $210.40 | $123.57 | 6 | 6 opportunities, $123.57 potential revenue |
| 8 | Michael Torres (C008) | silver | $125.80 | $123.57 | 6 | 6 opportunities, $123.57 potential revenue |
| 9 | Rachel Kim (C007) | gold | $450.60 | $81.95 | 5 | 5 opportunities, $81.95 potential revenue |
| 10 | David Brooks (C004) | gold | $310.75 | $81.95 | 5 | 5 opportunities, $81.95 potential revenue |

## Top Opportunities Across All Customers

| Customer | Product | Type | Price | Score |
|----------|---------|------|-------|-------|
| C002 | Bundle | bundle | $58.62 | 37.14 |
| C006 | Bundle | bundle | $56.07 | 35.43 |
| C005 | Bundle | bundle | $54.37 | 35.21 |
| C003 | Bundle | bundle | $52.67 | 33.64 |
| C010 | Bundle | bundle | $45.87 | 29.68 |
| C001 | Bundle | bundle | $41.62 | 28.97 |
| C008 | Bundle | bundle | $41.62 | 28.06 |
| C009 | Bundle | bundle | $37.37 | 26.80 |
| C004 | Hydrating Hyaluronic Serum | replenishment | $19.99 | 18.77 |
| C001 | Hydrating Hyaluronic Serum | routine_gap | $19.99 | 18.72 |


# Cross-Sell & Upsell Recommendations Report

**Customer:** Lisa Wang (C009)

**Generated:** 2025-11-20 18:27:32

## Customer Overview

- **Loyalty Tier:** Gold
- **Lifetime Value:** $280.50
- **Churn Risk:** 15.0%
- **Price Sensitivity:** Medium
- **Current Products:** 3 products
- **Routine Completeness:** 40.0%

## Routine Analysis

**Missing Essential Categories:** cleanser, toner, spf

⚠️  **3 products past replenishment date**

## Opportunities Summary

- **Total Cross-Sell Opportunities:** 4
- **Total Upsell Opportunities:** 3
- **Bundle Opportunities:** 1 ⭐
- **Total Potential Revenue:** $152.30
- **High-Value Opportunities:** 3

## ⭐ Bundle Opportunities

### Complete Your Routine Bundle
**Products:** Gentle Foaming Cleanser, Balancing Facial Toner, SPF 30 Everyday Sunscreen
**Original Price:** $43.97
**Bundle Price:** $37.37
**Savings:** $6.60 (15% off)
**Rationale:** Complete your routine bundle - 3 essential products, save 15%
**Score:** 26.80

## Top Individual Recommendations

### 1. Hydrating Hyaluronic Serum
**Category:** Serum
**Price:** $19.99
**Type:** Replenishment
**Rationale:** Time to replenish Hydrating Hyaluronic Serum - 658 days since purchase
**Score:** 17.76
  - Business Value: 29.98
  - Customer Fit: 12.54
  - Routine Completeness: 5.00
  - Replenishment Urgency: 10.00

### 2. Gentle Eye Cream
**Category:** Eye
**Price:** $18.99
**Type:** Replenishment
**Rationale:** Time to replenish Gentle Eye Cream - 651 days since purchase
**Score:** 17.16
  - Business Value: 28.48
  - Customer Fit: 12.54
  - Routine Completeness: 5.00
  - Replenishment Urgency: 10.00

### 3. SPF 30 Everyday Sunscreen
**Category:** Spf
**Price:** $15.99
**Type:** Routine Gap
**Rationale:** Customer missing essential spf step in routine
**Score:** 13.16
  - Business Value: 15.99
  - Customer Fit: 12.54
  - Routine Completeness: 15.00
  - Replenishment Urgency: 0.00

### 4. Daily Lightweight Moisturizer
**Category:** Moisturizer
**Price:** $17.99
**Type:** Replenishment
**Rationale:** Time to replenish Daily Lightweight Moisturizer - 658 days since purchase
**Score:** 12.96
  - Business Value: 17.99
  - Customer Fit: 12.54
  - Routine Completeness: 5.00
  - Replenishment Urgency: 10.00

### 5. Gentle Foaming Cleanser
**Category:** Cleanser
**Price:** $14.99
**Type:** Routine Gap
**Rationale:** Customer missing essential cleanser step in routine
**Score:** 12.76
  - Business Value: 14.99
  - Customer Fit: 12.54
  - Routine Completeness: 15.00
  - Replenishment Urgency: 0.00

## All Opportunities

*Showing top 5 above. Total of 8 opportunities found.*


# Cross-Sell & Upsell Recommendations Report

**Customer:** James Anderson (C010)

**Generated:** 2025-11-20 18:27:32

## Customer Overview

- **Loyalty Tier:** Bronze
- **Lifetime Value:** $110.25
- **Churn Risk:** 18.0%
- **Price Sensitivity:** High
- **Current Products:** 2 products
- **Routine Completeness:** 40.0%

## Routine Analysis

**Missing Essential Categories:** serum, moisturizer, spf

⚠️  **2 products past replenishment date**

## Opportunities Summary

- **Total Cross-Sell Opportunities:** 4
- **Total Upsell Opportunities:** 2
- **Bundle Opportunities:** 1 ⭐
- **Total Potential Revenue:** $141.81
- **High-Value Opportunities:** 2

## ⭐ Bundle Opportunities

### Complete Your Routine Bundle
**Products:** Hydrating Hyaluronic Serum, Daily Lightweight Moisturizer, SPF 30 Everyday Sunscreen
**Original Price:** $53.97
**Bundle Price:** $45.87
**Savings:** $8.10 (15% off)
**Rationale:** Complete your routine bundle - 3 essential products, save 15%
**Score:** 29.68

## Top Individual Recommendations

### 1. Hydrating Hyaluronic Serum
**Category:** Serum
**Price:** $19.99
**Type:** Routine Gap
**Rationale:** Customer missing essential serum step in routine
**Score:** 16.41
  - Business Value: 29.98
  - Customer Fit: 4.72
  - Routine Completeness: 15.00
  - Replenishment Urgency: 0.00

### 2. Daily Lightweight Moisturizer
**Category:** Moisturizer
**Price:** $17.99
**Type:** Routine Gap
**Rationale:** Customer missing essential moisturizer step in routine
**Score:** 12.22
  - Business Value: 17.99
  - Customer Fit: 6.75
  - Routine Completeness: 15.00
  - Replenishment Urgency: 0.00

### 3. SPF 30 Everyday Sunscreen
**Category:** Spf
**Price:** $15.99
**Type:** Routine Gap
**Rationale:** Customer missing essential spf step in routine
**Score:** 11.42
  - Business Value: 15.99
  - Customer Fit: 6.75
  - Routine Completeness: 15.00
  - Replenishment Urgency: 0.00

### 4. Gentle Foaming Cleanser
**Category:** Cleanser
**Price:** $14.99
**Type:** Product Cross Sell
**Rationale:** Recommended complement to Balancing Facial Toner
**Score:** 9.62
  - Business Value: 14.99
  - Customer Fit: 6.75
  - Routine Completeness: 8.00
  - Replenishment Urgency: 0.00

### 5. Calming Chamomile Cleanser
**Category:** Cleanser
**Price:** $13.99
**Type:** Replenishment
**Rationale:** Time to replenish Calming Chamomile Cleanser - 662 days since purchase
**Score:** 9.62
  - Business Value: 13.99
  - Customer Fit: 6.75
  - Routine Completeness: 5.00
  - Replenishment Urgency: 10.00

## All Opportunities

*Showing top 5 above. Total of 7 opportunities found.*
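
The bundle figures in the report above are consistent with a flat 15% discount on the summed list prices of the three missing essentials. A minimal sketch of that arithmetic follows. `bundle_price` is a hypothetical helper written for illustration, not the orchestrator's own function, and half-up rounding to the cent is an assumption.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def bundle_price(prices, discount_pct):
    """Return (original, bundle, savings) for a flat percentage discount.

    Hypothetical helper for illustration; the rounding mode is an assumption.
    """
    original = sum(prices, Decimal("0"))
    multiplier = (Decimal(100) - Decimal(discount_pct)) / Decimal(100)
    bundle = (original * multiplier).quantize(CENT, rounding=ROUND_HALF_UP)
    return original, bundle, original - bundle


# List prices of the serum, moisturizer, and SPF from the report above.
prices = [Decimal("19.99"), Decimal("17.99"), Decimal("15.99")]
original, bundle, savings = bundle_price(prices, 15)

print(f"Original ${original}, bundle ${bundle}, savings ${savings}")
```

Using `Decimal` rather than `float` keeps the cent values exact, so the result can be compared directly with the prices printed in the report.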




## Insights from the full batch analysis

### Executive summary
- 10 customers processed successfully
- \$1,336.62 total potential revenue
- \$133.66 average revenue per customer
- 6.7 average opportunities per customer

### Business insights

1. Most common gaps:
   - SPF: 80% of customers missing (highest priority)
   - Toner: 60%
   - Serum: 60%

2. Opportunity distribution:
   - Cross-sell: 53.7% (36 opportunities)
   - Upsell: 34.3% (23 opportunities)
   - Bundle: 11.9% (8 opportunities)

3. Revenue by segment:
   - Bronze tier: highest avg revenue per customer (\$149.81)
   - Medium price sensitivity: highest total revenue (\$752.78)
   - Gold tier: lower avg revenue (\$109.94), which may indicate more complete routines

4. Top priority customers:
   - #1: Alicia Gomez (C005), \$169.30 potential, 8 opportunities
   - #2: Lisa Wang (C009), \$152.30 potential, 8 opportunities
   - #3: Emily Chen (C003), \$155.60 potential, 8 opportunities

5. Top opportunities:
   - Bundles are scoring highest (as expected)
   - C002's bundle leads with score 37.14
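
The opportunity distribution and per-customer average reported above can be reproduced from the raw counts. This is a small sketch of the arithmetic only; the variable names are illustrative and not taken from the batch utilities.

```python
from collections import Counter

# Opportunity counts taken from the batch summary above.
counts = Counter({"cross_sell": 36, "upsell": 23, "bundle": 8})

total = sum(counts.values())

# Share of each opportunity type, as a percentage rounded to one decimal.
shares = {kind: round(100 * n / total, 1) for kind, n in counts.items()}

customers_processed = 10
avg_per_customer = round(total / customers_processed, 1)

print(total, shares, avg_per_customer)
```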

## Strategic recommendations

1. SPF campaign: 80% of customers are missing SPF, a high-impact marketing opportunity
2. Target priority customers: Focus on C005, C009, C003 first (highest opportunity value)
3. Bundle strategy: 8 customers qualified for bundles, so promote bundle offers
4. Bronze tier focus: Highest avg revenue per customer, so consider targeted campaigns

Batch processing is working and providing actionable business intelligence. The orchestrator can now:
- Analyze individual customers (single-customer reports)
- Analyze the entire customer base (batch reports)
- Generate personalized recommendations (LLM-enhanced rationales)
- Identify strategic opportunities (priority customers, common gaps)



## Adding more data: pros and cons

### Pros
1. Tests scalability: how the architecture handles larger datasets
2. Reveals edge cases: data quality issues, missing fields, unexpected patterns
3. More realistic: closer to production scenarios
4. Performance insights: bottlenecks, optimization needs
5. Business logic complexity: more nuanced rules, exceptions

### Cons
1. May obscure learning: debugging can overshadow orchestrator patterns
2. Diminishing returns: more rows may not add new learning
3. Time sink: data cleanup can take focus away from architecture

## Recommendation: strategic complexity

Instead of just more rows, add meaningful complexity that tests the orchestrator architecture:

### Option 1: Multi-tier product catalog (recommended)
- Tier 1: Essential products (current)
- Tier 2: Advanced products (active ingredients, treatments)
- Tier 3: Professional/clinical products
- Challenge: Multi-tier recommendation logic, upgrade paths, tier-based scoring

### Option 2: Temporal data complexity
- Seasonal products
- Limited-time offers
- Product lifecycle (new, mature, discontinued)
- Challenge: Time-aware recommendations, inventory constraints

### Option 3: Customer relationship complexity
- Purchase history with returns/exchanges
- Subscription customers vs one-time buyers
- Multi-channel behavior (online + in-store)
- Challenge: More complex customer segmentation, behavior patterns

### Option 4: Cross-category orchestration
- Skincare + makeup + wellness products
- Cross-category bundles
- Challenge: Multi-domain orchestration, category-specific rules

## My recommendation

Start with Option 1 (multi-tier products) because:
- Builds on current foundation
- Tests orchestration complexity (tier-based logic)
- Adds business value (upsell paths)
- Maintains focus on orchestrator patterns
- Scales naturally (can add more tiers later)

## Alternative: reinforce patterns first

Before adding complexity, consider:
1. Building a different orchestrator type (e.g., customer service, inventory, marketing)
2. Adding conditional workflows (branches, loops, parallel processing)
3. Multi-agent orchestration (this agent + another agent)

## Bottom line

Yes, add complexity, but make it strategic:
- Add multi-tier products (Tier 2, Tier 3)
- Keep customer count similar (10–15 customers)
- Focus on complexity in business logic, not just data volume

This tests:
- How well your architecture scales
- Whether patterns hold with more complexity
- Edge case handling
- Performance considerations

And it does so without getting lost in data cleanup.



# Multi-Tier Product Catalog Data Generation Plan

**Purpose:** Generate realistic skincare product catalog and customer data with 3 product tiers to test orchestrator complexity and scalability.

---

## 🎯 Overview

Create a **3-tier product catalog** where:
- **Tier 1 (Essential):** Basic, entry-level products (current MVP)
- **Tier 2 (Advanced):** Mid-tier products with active ingredients, targeted treatments
- **Tier 3 (Professional):** Premium, clinical-grade products

This tests the orchestrator's ability to:
- Handle multi-tier recommendation logic
- Create upgrade paths (Tier 1 → Tier 2 → Tier 3)
- Score opportunities across tiers
- Handle tier-based business rules

---

## 📦 Product Catalog Structure

### **File:** `product_catalog.json`

**Format:** Array of product objects

### **Product Object Schema:**

```json
{
  "product_id": "P001",
  "name": "Gentle Foaming Cleanser",
  "category": "cleanser",
  "tier": 1,
  "price": 14.99,
  "margin": "medium",
  "replenishment_cycle_days": 45,
  "cross_sell_products": ["P002", "P003"],
  "upgrade_path": {
    "tier_2": "P101",
    "tier_3": "P201"
  },
  "active_ingredients": [],
  "target_concerns": ["general"],
  "description": "Basic daily cleanser for all skin types"
}
```

### **Field Definitions:**

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `product_id` | string | Yes | Unique ID (P001-P010 for Tier 1, P101-P120 for Tier 2, P201-P230 for Tier 3) |
| `name` | string | Yes | Product name |
| `category` | string | Yes | One of: "cleanser", "toner", "serum", "moisturizer", "spf", "mask", "treatment", "eye_cream" |
| `tier` | integer | Yes | 1, 2, or 3 |
| `price` | float | Yes | Product price (Tier 1: $10-25, Tier 2: $25-60, Tier 3: $60-150) |
| `margin` | string | Yes | "low", "medium", "high", "premium" |
| `replenishment_cycle_days` | integer | Yes | Days between purchases (30-90) |
| `cross_sell_products` | array | Yes | Array of product_ids that complement this product |
| `upgrade_path` | object | No | Object with `tier_2` and/or `tier_3` keys pointing to upgrade product IDs |
| `active_ingredients` | array | Yes | Array of ingredient names (empty for Tier 1, populated for Tier 2/3) |
| `target_concerns` | array | Yes | Array of skin concerns: "general", "acne", "aging", "dryness", "sensitivity", "hyperpigmentation", "fine_lines" |
| `description` | string | Yes | Brief product description |
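
The field definitions above translate fairly directly into a per-record check. The sketch below is one possible reading of the table: `validate_product` and the constant names are hypothetical, the price bounds are treated as inclusive, and the ID pattern assumes exactly three digits after `P`.

```python
import re

REQUIRED_FIELDS = {
    "product_id", "name", "category", "tier", "price", "margin",
    "replenishment_cycle_days", "cross_sell_products",
    "active_ingredients", "target_concerns", "description",
}
CATEGORIES = {"cleanser", "toner", "serum", "moisturizer",
              "spf", "mask", "treatment", "eye_cream"}
MARGINS = {"low", "medium", "high", "premium"}

# Per-tier ID numbers and price bounds, as stated in the table above.
TIER_RULES = {
    1: {"ids": range(1, 11), "price": (10.00, 25.00)},
    2: {"ids": range(101, 121), "price": (25.00, 60.00)},
    3: {"ids": range(201, 231), "price": (60.00, 150.00)},
}


def validate_product(product):
    """Return a list of problems; an empty list means the record passed."""
    missing = REQUIRED_FIELDS - set(product)
    if missing:
        return [f"missing fields: {sorted(missing)}"]
    rules = TIER_RULES.get(product["tier"])
    if rules is None:
        return [f"tier must be 1, 2, or 3, got {product['tier']!r}"]

    problems = []
    match = re.fullmatch(r"P(\d{3})", product["product_id"])
    if not match or int(match.group(1)) not in rules["ids"]:
        problems.append("product_id is outside the range for its tier")
    low, high = rules["price"]
    if not low <= product["price"] <= high:
        problems.append("price is outside the range for its tier")
    if product["category"] not in CATEGORIES:
        problems.append(f"unknown category: {product['category']}")
    if product["margin"] not in MARGINS:
        problems.append(f"unknown margin: {product['margin']}")
    if not 30 <= product["replenishment_cycle_days"] <= 90:
        problems.append("replenishment_cycle_days is outside 30-90")
    return problems


example = {
    "product_id": "P001", "name": "Gentle Foaming Cleanser",
    "category": "cleanser", "tier": 1, "price": 14.99, "margin": "medium",
    "replenishment_cycle_days": 45, "cross_sell_products": ["P002", "P003"],
    "upgrade_path": {"tier_2": "P101", "tier_3": "P201"},
    "active_ingredients": [], "target_concerns": ["general"],
    "description": "Basic daily cleanser for all skin types",
}
print(validate_product(example))
```

`upgrade_path` is left out of the required set because the table marks it as optional.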

---

## 📊 Product Distribution by Tier

### **Tier 1 (Essential) - 10 Products**
**Purpose:** Basic, entry-level products for complete routine

**Categories:**
- Cleanser (2 products)
- Toner (1 product)
- Serum (1 product)
- Moisturizer (2 products)
- SPF (2 products)
- Mask (1 product)
- Eye Cream (1 product)

- **Price Range:** $10.00 - $24.99
- **Margins:** Mix of "low" and "medium"
- **Active Ingredients:** Empty arrays (basic formulations)
- **Target Concerns:** Mostly "general", some "dryness" or "sensitivity"

**Example Tier 1 Products:**
```json
{
  "product_id": "P001",
  "name": "Gentle Foaming Cleanser",
  "category": "cleanser",
  "tier": 1,
  "price": 14.99,
  "margin": "medium",
  "replenishment_cycle_days": 45,
  "cross_sell_products": ["P002", "P003"],
  "upgrade_path": {
    "tier_2": "P101"
  },
  "active_ingredients": [],
  "target_concerns": ["general"],
  "description": "Basic daily cleanser suitable for all skin types"
}
```

### **Tier 2 (Advanced) - 20 Products**
**Purpose:** Mid-tier products with active ingredients and targeted treatments

**Categories:**
- Cleanser (3 products - different formulations)
- Toner (2 products - exfoliating, hydrating)
- Serum (4 products - vitamin C, hyaluronic acid, niacinamide, retinol)
- Moisturizer (3 products - day, night, gel)
- SPF (2 products - mineral, chemical)
- Mask (2 products - clay, hydrating)
- Treatment (2 products - spot treatment, overnight)
- Eye Cream (2 products - basic, anti-aging)

- **Price Range:** $25.00 - $59.99
- **Margins:** Mix of "medium" and "high"
- **Active Ingredients:** Include ingredients like "Vitamin C", "Hyaluronic Acid", "Niacinamide", "Retinol", "Salicylic Acid", "AHA/BHA"
- **Target Concerns:** Specific concerns like "acne", "aging", "hyperpigmentation", "fine_lines"

**Example Tier 2 Products:**
```json
{
  "product_id": "P101",
  "name": "Advanced Hydrating Cleanser with Hyaluronic Acid",
  "category": "cleanser",
  "tier": 2,
  "price": 32.99,
  "margin": "high",
  "replenishment_cycle_days": 60,
  "cross_sell_products": ["P102", "P105"],
  "upgrade_path": {
    "tier_3": "P201"
  },
  "active_ingredients": ["Hyaluronic Acid", "Ceramides"],
  "target_concerns": ["dryness", "general"],
  "description": "Advanced cleanser with hyaluronic acid for deep hydration"
}
```

### **Tier 3 (Professional) - 30 Products**
**Purpose:** Premium, clinical-grade products with high concentrations

**Categories:**
- Cleanser (4 products - various formulations)
- Toner (3 products - exfoliating, pH-balancing, hydrating)
- Serum (8 products - multiple active ingredients, high concentrations)
- Moisturizer (4 products - day, night, anti-aging, barrier repair)
- SPF (3 products - various formulations)
- Mask (3 products - various types)
- Treatment (3 products - clinical-grade)
- Eye Cream (2 products - premium formulations)

- **Price Range:** $60.00 - $149.99
- **Margins:** Mostly "high" and "premium"
- **Active Ingredients:** High-concentration ingredients, multiple actives, clinical formulations
- **Target Concerns:** Specific, advanced concerns

**Example Tier 3 Products:**
```json
{
  "product_id": "P201",
  "name": "Professional Clinical Cleanser with Peptides",
  "category": "cleanser",
  "tier": 3,
  "price": 79.99,
  "margin": "premium",
  "replenishment_cycle_days": 90,
  "cross_sell_products": ["P202", "P205", "P210"],
  "upgrade_path": {},
  "active_ingredients": ["Peptides", "Ceramides", "Niacinamide", "Hyaluronic Acid"],
  "target_concerns": ["aging", "fine_lines", "general"],
  "description": "Clinical-grade cleanser with peptide complex for anti-aging benefits"
}
```
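
The three example products form one upgrade chain (P001 to P101 to P201). A minimal sketch of walking that chain is shown here, assuming the catalog has been loaded into a dict keyed by `product_id`. `upgrade_chain` is a hypothetical helper, and the records are trimmed to the fields it reads.

```python
def upgrade_chain(product_id, catalog):
    """Follow upgrade_path one tier at a time and return the IDs visited."""
    chain = [product_id]
    current = catalog[product_id]
    while True:
        next_key = f"tier_{current['tier'] + 1}"
        next_id = current.get("upgrade_path", {}).get(next_key)
        # Stop at the top tier, on a dangling reference, or on a cycle.
        if next_id is None or next_id not in catalog or next_id in chain:
            break
        chain.append(next_id)
        current = catalog[next_id]
    return chain


# Trimmed copies of the three example products above.
catalog = {
    "P001": {"tier": 1, "upgrade_path": {"tier_2": "P101"}},
    "P101": {"tier": 2, "upgrade_path": {"tier_3": "P201"}},
    "P201": {"tier": 3, "upgrade_path": {}},
}

print(upgrade_chain("P001", catalog))
```

Stepping one tier at a time means a Tier 1 record that lists only `tier_2` still reaches Tier 3 through its Tier 2 upgrade.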

---

## 👥 Customer Data Structure

### **File:** `customers.json`

**Format:** Array of customer objects (15-20 customers)

### **Customer Object Schema:**

```json
{
  "customer_id": "C001",
  "name": "Sarah Lee",
  "loyalty_tier": "gold",
  "lifetime_value": 210.40,
  "churn_risk": 15.0,
  "price_sensitivity": "low",
  "current_products": [
    {
      "product_id": "P001",
      "purchase_date": "2024-01-15",
      "tier": 1
    }
  ],
  "purchase_history": [
    {
      "product_id": "P001",
      "purchase_date": "2024-01-15",
      "price_paid": 14.99,
      "tier": 1
    }
  ],
  "rfm_features": {
    "recency_days": 45,
    "frequency": 3,
    "monetary_value": 89.50
  },
  "preferred_categories": ["serum", "moisturizer"],
  "skin_concerns": ["general", "dryness"],
  "tier_preference": 1,
  "notes": "Prefers simple routine"
}
```

### **Field Definitions:**

| Field | Type | Required | Description |
|-------|------|----------|-------------|
| `customer_id` | string | Yes | Unique ID (C001, C002, etc.) |
| `name` | string | Yes | Customer name |
| `loyalty_tier` | string | Yes | "bronze", "silver", "gold", "platinum" |
| `lifetime_value` | float | Yes | Total CLV (range: $50-$500) |
| `churn_risk` | float | Yes | Percentage 0-100 |
| `price_sensitivity` | string | Yes | "low", "medium", "high" |
| `current_products` | array | Yes | Array of products customer currently owns (mix of tiers) |
| `purchase_history` | array | Yes | Array of past purchases (can include returns/exchanges) |
| `rfm_features` | object | Yes | Recency, Frequency, Monetary value |
| `preferred_categories` | array | Yes | Categories customer prefers |
| `skin_concerns` | array | Yes | Skin concerns: "general", "acne", "aging", "dryness", "sensitivity", "hyperpigmentation", "fine_lines" |
| `tier_preference` | integer | Yes | 1, 2, or 3 (which tier they typically buy from) |
| `notes` | string | No | Optional customer notes |

### **Customer Distribution:**

Create 15-20 customers with diverse profiles:

**Tier Preference Distribution:**
- 40% Tier 1 customers (6-8 customers)
- 40% Tier 2 customers (6-8 customers)
- 20% Tier 3 customers (3-4 customers)

**Loyalty Tier Distribution:**
- 25% Bronze (4-5 customers)
- 35% Silver (5-7 customers)
- 30% Gold (4-6 customers)
- 10% Platinum (1-2 customers)

**Price Sensitivity Distribution:**
- 30% Low (4-6 customers) - willing to pay premium
- 40% Medium (6-8 customers) - balanced
- 30% High (4-6 customers) - price-conscious

**Current Products:**
- Mix of customers with products from different tiers
- Some customers with all Tier 1, some with mix, some with all Tier 2/3
- Vary routine completeness (some missing 1-2 categories, some missing 4-5)

**Skin Concerns:**
- Distribute across concerns: general, acne, aging, dryness, sensitivity, hyperpigmentation, fine_lines
- Some customers with multiple concerns

---

## 🔗 Relationships and Rules

### **Upgrade Paths:**
- Each Tier 1 product should have a Tier 2 upgrade option
- Each Tier 2 product should have a Tier 3 upgrade option
- Upgrade products should be in the same category
- Upgrade products should address similar concerns
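
These upgrade-path rules can be checked mechanically once the catalog is generated. The sketch below is an assumption-laden reading of them: `check_upgrade_paths` is a hypothetical helper, and "similar concerns" is interpreted as sharing at least one entry in `target_concerns`.

```python
def check_upgrade_paths(catalog):
    """Return a list of rule violations for a catalog keyed by product_id."""
    problems = []
    for pid, product in catalog.items():
        path = product.get("upgrade_path", {})
        next_tier = product["tier"] + 1
        # Tier 1 and Tier 2 products should each offer a next-tier upgrade.
        if product["tier"] < 3 and f"tier_{next_tier}" not in path:
            problems.append(f"{pid}: no tier {next_tier} upgrade")
        for key, target_id in path.items():
            target = catalog.get(target_id)
            if target is None:
                problems.append(f"{pid}: {key} -> {target_id} not in catalog")
                continue
            if target["tier"] <= product["tier"]:
                problems.append(f"{pid}: {target_id} is not a higher tier")
            if target["category"] != product["category"]:
                problems.append(f"{pid}: {target_id} is in another category")
            shared = set(target["target_concerns"]) & set(product["target_concerns"])
            if not shared:
                problems.append(f"{pid}: {target_id} shares no concern")
    return problems


# Trimmed copies of the example products from the catalog section.
catalog = {
    "P001": {"tier": 1, "category": "cleanser",
             "target_concerns": ["general"],
             "upgrade_path": {"tier_2": "P101"}},
    "P101": {"tier": 2, "category": "cleanser",
             "target_concerns": ["dryness", "general"],
             "upgrade_path": {"tier_3": "P201"}},
    "P201": {"tier": 3, "category": "cleanser",
             "target_concerns": ["aging", "fine_lines", "general"],
             "upgrade_path": {}},
}

print(check_upgrade_paths(catalog))
```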

### **Cross-Sell Relationships:**
- Products in complementary categories (e.g., cleanser → toner → serum)
- Products that address similar concerns
- Products that work well together (e.g., vitamin C serum + SPF)

### **Tier-Based Business Rules:**
- Tier 1: Basic routine completion
- Tier 2: Targeted treatment recommendations
- Tier 3: Premium, clinical-grade solutions

### **Customer-Product Matching:**
- Match products to customer's `skin_concerns`
- Consider `tier_preference` when recommending
- Consider `price_sensitivity` when recommending upgrades
- Higher `loyalty_tier` customers more likely to accept Tier 2/3 recommendations
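
One way to turn these matching rules into a filter is sketched below. The thresholds are assumptions chosen for illustration, not the orchestrator's actual business rules: a customer is treated as open to one tier above `tier_preference` only when `loyalty_tier` is gold or platinum and `price_sensitivity` is not high. `recommendable` is a hypothetical name.

```python
def recommendable(customer, catalog):
    """Return product IDs that fit the customer's concerns and tier ceiling."""
    owned = {item["product_id"] for item in customer["current_products"]}

    # Assumed rule: loyal, non-price-sensitive customers may see one tier up.
    open_to_upgrade = (
        customer["price_sensitivity"] != "high"
        and customer["loyalty_tier"] in {"gold", "platinum"}
    )
    max_tier = min(3, customer["tier_preference"] + (1 if open_to_upgrade else 0))

    concerns = set(customer["skin_concerns"])
    return [
        product["product_id"]
        for product in catalog
        if product["product_id"] not in owned
        and product["tier"] <= max_tier
        and concerns & set(product["target_concerns"])
    ]


catalog = [
    {"product_id": "P001", "tier": 1, "target_concerns": ["general"]},
    {"product_id": "P101", "tier": 2, "target_concerns": ["dryness", "general"]},
    {"product_id": "P201", "tier": 3,
     "target_concerns": ["aging", "fine_lines", "general"]},
]

loyal_customer = {
    "loyalty_tier": "gold", "price_sensitivity": "low", "tier_preference": 1,
    "skin_concerns": ["general", "dryness"],
    "current_products": [{"product_id": "P001"}],
}
price_sensitive_customer = {
    "loyalty_tier": "bronze", "price_sensitivity": "high", "tier_preference": 1,
    "skin_concerns": ["general"],
    "current_products": [],
}

print(recommendable(loyal_customer, catalog))
print(recommendable(price_sensitive_customer, catalog))
```

A real scoring step would likely weight these signals rather than apply a hard cutoff; the filter only shows how the four fields interact.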

---

## 📝 Data Quality Requirements

1. **Consistency:**
   - All product_ids must be unique
   - All customer_ids must be unique
   - All product_ids referenced in customer data must exist in catalog
   - All upgrade_path product_ids must exist

2. **Realism:**
   - Prices should be realistic for skincare products
   - Margins should correlate with tiers (higher tier = higher margin)
   - Replenishment cycles should be realistic (30-90 days)
   - Customer data should be internally consistent

3. **Completeness:**
   - All required fields must be present
   - Arrays should not be empty unless intentional
   - Dates should be in "YYYY-MM-DD" format
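
The consistency and date-format requirements above lend themselves to an automated pass over both files. The following is a sketch under the schemas shown earlier: `check_references`, `find_duplicates`, and `is_iso_date` are hypothetical helpers, and the inputs are assumed to be the already-parsed JSON arrays.

```python
import re
from collections import Counter
from datetime import datetime

DATE_RE = re.compile(r"\d{4}-\d{2}-\d{2}")


def is_iso_date(text):
    """True only for zero-padded YYYY-MM-DD strings that are real dates."""
    if not DATE_RE.fullmatch(text):
        return False
    try:
        datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        return False
    return True


def find_duplicates(ids):
    """Return the sorted IDs that occur more than once."""
    return sorted(i for i, n in Counter(ids).items() if n > 1)


def check_references(products, customers):
    """Return a list of uniqueness, reference, and date-format problems."""
    problems = []
    product_ids = [p["product_id"] for p in products]
    known = set(product_ids)

    for dup in find_duplicates(product_ids):
        problems.append(f"duplicate product_id {dup}")
    for dup in find_duplicates(c["customer_id"] for c in customers):
        problems.append(f"duplicate customer_id {dup}")

    for p in products:
        for target in p.get("upgrade_path", {}).values():
            if target not in known:
                problems.append(f"{p['product_id']}: unknown upgrade {target}")

    for c in customers:
        for item in c["current_products"] + c["purchase_history"]:
            if item["product_id"] not in known:
                problems.append(f"{c['customer_id']}: unknown product {item['product_id']}")
            if not is_iso_date(item["purchase_date"]):
                problems.append(f"{c['customer_id']}: bad date {item['purchase_date']}")
    return problems


products = [
    {"product_id": "P001", "upgrade_path": {"tier_2": "P101"}},
    {"product_id": "P101", "upgrade_path": {}},
]
customers = [{
    "customer_id": "C001",
    "current_products": [{"product_id": "P001", "purchase_date": "2024-01-15"}],
    "purchase_history": [{"product_id": "P001", "purchase_date": "2024-01-15"}],
}]
problems = check_references(products, customers)
print(problems)
```

The regular expression is paired with `strptime` because `strptime` alone accepts unpadded values such as `2024-1-5`.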

---

## 🎯 Example Scenarios to Include

### **Scenario 1: Tier 1 Customer Ready for Upgrade**
- Customer with all Tier 1 products
- High loyalty tier, low price sensitivity
- Should see Tier 2 upgrade opportunities

### **Scenario 2: Mixed Tier Customer**
- Customer with mix of Tier 1 and Tier 2 products
- Should see opportunities to complete routine at appropriate tier

### **Scenario 3: Tier 3 Customer**
- Customer with all Tier 3 products
- Should see replenishment and complementary Tier 3 products

### **Scenario 4: Price-Sensitive Customer**
- High price sensitivity, low loyalty tier
- Should see Tier 1 recommendations, not upgrades

### **Scenario 5: Concern-Specific Customer**
- Customer with specific skin concern (e.g., "acne")
- Should see products targeting that concern across tiers

---

## 📋 Checklist for Data Generation

- [ ] 10 Tier 1 products (P001-P010)
- [ ] 20 Tier 2 products (P101-P120)
- [ ] 30 Tier 3 products (P201-P230)
- [ ] All products have required fields
- [ ] Upgrade paths defined for Tier 1 → Tier 2, Tier 2 → Tier 3
- [ ] Cross-sell relationships defined
- [ ] 15-20 customers with diverse profiles
- [ ] Customers have mix of tier preferences
- [ ] Customers have realistic purchase history
- [ ] All product_ids in customer data exist in catalog
- [ ] All upgrade_path product_ids exist
- [ ] JSON files are valid and parseable
- [ ] Data is internally consistent

---

## 🚀 Next Steps After Data Generation

1. **Validate Data:**
   - Check JSON validity
   - Verify all references exist
   - Check data consistency

2. **Update Orchestrator:**
   - Add tier-based scoring logic
   - Add upgrade path detection
   - Update business rules for multi-tier recommendations
   - Add tier-aware opportunity ranking

3. **Test:**
   - Test with Tier 1 customers
   - Test with Tier 2 customers
   - Test with Tier 3 customers
   - Test upgrade path recommendations
   - Test mixed-tier scenarios

---

## 💡 Tips for ChatGPT

When generating data, ask ChatGPT to:
1. Create realistic product names (not generic placeholders)
2. Use realistic skincare ingredient names
3. Ensure price progression makes sense (Tier 1 < Tier 2 < Tier 3)
4. Create diverse customer profiles (not all similar)
5. Include edge cases (customers with no products, customers with complete routines)
6. Make cross-sell relationships logical (complementary products)
7. Make upgrade paths logical (same category, similar concerns)

---

**End of Data Generation Plan**

