<a href="https://colab.research.google.com/github/micah-shull/AI_Agents/blob/main/222_Tier_2_Skincare_Dataset.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# 📘 **Tier 2 Skincare Dataset: Advanced Product Catalog & Customer Profiles**

This notebook introduces the **Tier 2 Advanced Skincare Dataset**, the second major dataset in my multi-tier AI Orchestrator project. It is designed to support the development of a more intelligent, context-aware **Cross-Sell & Upsell Orchestrator Agent**, capable of ingredient-level reasoning, upgrade detection, and skin-concern matching.

This dataset represents a significant evolution from the **Tier 1 Essential Care Dataset**, both in structure and in analytical richness. It serves as the foundation for building more advanced AI workflows and multi-node orchestrators.

---

## 🌱 **From Tier 1 → Tier 2: Why This Dataset Matters**

Tier 1 focused on:

* simple routines
* basic product relationships
* essential steps (cleanser → moisturizer → SPF)
* straightforward cross-sell logic

Tier 2 introduces realistic complexity, enabling agents to operate more like real-world skincare recommendation engines used by modern beauty brands.

This dataset adds:

* **active ingredients** (Vitamin C, Niacinamide, Retinol, AHA/BHA, etc.)
* **target skin concerns** (acne, aging, fine lines, hyperpigmentation)
* **upgrade paths** from Tier 1 to Tier 2 products
* **routine step mapping** (cleanse → prep → treat → moisturize → protect)
* **more diverse customer personas**
* **ingredient → concern alignment**
* **advanced cross-sell logic**
* **stronger RFM + churn risk patterns**

Tier 2 is where orchestrator agents begin to move from “rule-based assistants” into **contextual reasoning systems**.

---

## 🧴 **Tier 2 Product Catalog Overview**

The Tier 2 product catalog contains **20 advanced skincare products** across:

* cleansers
* toners
* serums
* moisturizers
* SPF
* masks
* targeted treatments
* eye creams

Each product has:

* realistic names
* active ingredients
* prices and margins
* replenishment cycles
* upgrade-path links
* cross-sell relationships
* concerns addressed (e.g., acne, aging, dryness, hyperpigmentation)
* routine_step classification (1–5)

These attributes enable the orchestrator to think in terms of **ingredient efficacy**, **skin needs**, and **routine-building sequences**, instead of just category matching.
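
As a rough illustration of how these attributes could be typed in code, the sketch below mirrors the keys used in the products JSON as a frozen dataclass. The `Product` class and the `replenishment_due` helper are hypothetical names introduced here, not part of the notebook; the sample values are copied from the `P110` catalog entry.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Product:
    """One catalog record; field names mirror the keys in the products JSON."""
    product_id: str
    name: str
    category: str
    tier: int
    price: float
    margin: str
    routine_step: int
    active_ingredients: tuple[str, ...]
    target_concerns: tuple[str, ...]
    upgrades_from: tuple[str, ...]
    replenishment_cycle_days: int
    recommended_cross_sells: tuple[str, ...]


def replenishment_due(product: Product, purchased_on: date) -> date:
    """Hypothetical helper: purchase date plus the product's replenishment cycle."""
    return purchased_on + timedelta(days=product.replenishment_cycle_days)


# Values copied from the P110 entry in the catalog.
p110 = Product(
    product_id="P110",
    name="Daily Defense Hydrating Moisturizer",
    category="moisturizer",
    tier=2,
    price=34.00,
    margin="medium",
    routine_step=4,
    active_ingredients=("Squalane", "Ceramides"),
    target_concerns=("dryness", "barrier_repair"),
    upgrades_from=("P004",),
    replenishment_cycle_days=40,
    recommended_cross_sells=("P103", "P108", "P114"),
)

# A purchase on 2025-11-09 with a 40-day cycle comes due on 2025-12-19.
print(replenishment_due(p110, date(2025, 11, 9)))
```

Tuples are used instead of lists so the frozen dataclass stays hashable; a plain `dict` loaded with `json.load` would work just as well for quick experiments.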

---

## 👤 **Advanced Customer Dataset Overview**

The dataset includes **16 varied customer profiles** (`C001` through `C016`), each with:

* loyalty tier
* lifetime value
* churn risk
* price sensitivity
* tier preference (1, 2, or 3)
* skin concerns
* RFM features
* advanced product ownership patterns
* consistent purchase histories
* routines ranging from minimal → complete → chaotic

Customers are designed to reflect real-world diversity:

* Tier 1 budget shoppers needing upgrades
* Tier 2 engaged skincare users
* Tier 3 premium clients ready for professional products
* customers with sensitive skin
* acne-focused customers
* anti-aging customers
* high-churn candidates needing retention flows

This dataset mirrors the kind of complexity a **mid-level AI recommendation system** has to handle, at a scale small enough to inspect by hand.
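
Because each profile carries both raw purchases and derived RFM features, a quick consistency check is a useful first step before feeding records to an agent. The sketch below is one possible check, not part of the notebook: `check_rfm` is a hypothetical helper, and the snapshot date `2025-11-21` is an assumption inferred from the records (it is not stated anywhere in the dataset). The two sample records are trimmed copies of `C002` and `C009`.

```python
from datetime import date

# Assumed snapshot date, inferred from the records rather than stated in the dataset.
AS_OF = date(2025, 11, 21)


def check_rfm(customer: dict, as_of: date = AS_OF, window_days: int = 90) -> list[str]:
    """Return human-readable inconsistencies; an empty list means the record agrees."""
    purchases = [
        (date.fromisoformat(p["date"]), p["amount"])
        for p in customer["purchase_history"]
    ]
    # Keep only purchases that fall inside the RFM window.
    recent = [(d, amt) for d, amt in purchases if 0 <= (as_of - d).days <= window_days]
    if not recent:
        return ["no purchases inside the window"]

    rfm = customer["rfm"]
    problems = []

    days_since_last = (as_of - max(d for d, _ in recent)).days
    if days_since_last != rfm["recency_days"]:
        problems.append(
            f"recency_days={rfm['recency_days']} but last purchase was "
            f"{days_since_last} days before {as_of}"
        )
    if len(recent) != rfm["frequency_90d"]:
        problems.append(f"frequency_90d={rfm['frequency_90d']} but found {len(recent)} purchases")
    total = round(sum(amt for _, amt in recent), 2)
    if total != rfm["monetary_90d"]:
        problems.append(f"monetary_90d={rfm['monetary_90d']} but amounts sum to {total}")
    return problems


# Trimmed copies of two records from the customers JSON.
c002 = {
    "customer_id": "C002",
    "purchase_history": [
        {"product_id": "P101", "date": "2025-10-12", "amount": 27.0},
        {"product_id": "P104", "date": "2025-10-14", "amount": 30.0},
        {"product_id": "P106", "date": "2025-10-17", "amount": 36.0},
    ],
    "rfm": {"recency_days": 35, "frequency_90d": 3, "monetary_90d": 93.0},
}
c009 = {
    "customer_id": "C009",
    "purchase_history": [
        {"product_id": "P116", "date": "2025-10-02", "amount": 33.0},
        {"product_id": "P119", "date": "2025-10-04", "amount": 42.0},
    ],
    "rfm": {"recency_days": 50, "frequency_90d": 2, "monetary_90d": 75.0},
}

for record in (c002, c009):
    print(record["customer_id"], check_rfm(record) or "consistent")
```

Under the assumed snapshot date, `C002` passes all three checks, while `C009` is flagged on recency: its stored value of 50 matches its first purchase rather than its most recent one.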

---

## 🤖 **What the Tier 2 Orchestrator Will Learn**

Using this dataset, the Tier 2 agent will be capable of:

### **1. Ingredient-based reasoning**

Matching Niacinamide to acne concerns, Vitamin C to dullness or dark spots, Retinol to aging, etc.
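
A minimal sketch of this kind of matching is shown here, assuming a hand-written lookup table. `INGREDIENT_CONCERNS`, `concerns_for`, and `ingredient_match` are hypothetical names, and mapping "dark spots" to the dataset's `hyperpigmentation` label is an assumption. Matching is by substring so that catalog labels such as `Niacinamide 5%` or `Vitamin C (AA2G)` still resolve.

```python
# Ingredient -> concerns, taken from the examples in the text.
# Assumption: "dark spots" corresponds to the dataset label "hyperpigmentation".
INGREDIENT_CONCERNS = {
    "niacinamide": {"acne"},
    "vitamin c": {"dullness", "hyperpigmentation"},
    "retinol": {"aging"},
}


def concerns_for(active: str) -> set[str]:
    """Concerns addressed by one active-ingredient label (substring match)."""
    label = active.lower()
    hits: set[str] = set()
    for ingredient, concerns in INGREDIENT_CONCERNS.items():
        if ingredient in label:
            hits |= concerns
    return hits


def ingredient_match(active_ingredients, skin_concerns) -> set[str]:
    """Customer concerns that at least one of the product's actives addresses."""
    covered: set[str] = set()
    for active in active_ingredients:
        covered |= concerns_for(active)
    return covered & set(skin_concerns)


# Actives copied from P105 and P107; concerns copied from C002 and C003.
print(ingredient_match(["Niacinamide 5%", "Zinc PCA"], ["acne"]))
print(ingredient_match(["Vitamin C (AA2G)", "Ferulic Acid"], ["hyperpigmentation"]))
```

Ingredients missing from the table (for example `Zinc PCA`) simply contribute nothing, so the table can grow incrementally without breaking existing matches.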

### **2. Upgrade-path detection**

Recognizing when a customer is ready to move from Tier 1 to Tier 2 products.
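
The catalog's `upgrades_from` field makes the eligibility half of this check straightforward. The sketch below only lists Tier 2 products whose `upgrades_from` includes something the customer already owns; deciding whether the customer is actually *ready* would need further signals (spend, engagement, price sensitivity) that are not modeled here. `upgrade_candidates` is a hypothetical helper, and the catalog excerpt is trimmed to the two keys it reads.

```python
def upgrade_candidates(owned_ids, catalog) -> list[str]:
    """Product IDs the customer does not own that upgrade something they do own."""
    owned = set(owned_ids)
    return [
        product["product_id"]
        for product in catalog
        if product["product_id"] not in owned
        and owned & set(product["upgrades_from"])
    ]


# Trimmed excerpt of the products JSON (only the keys used here).
catalog = [
    {"product_id": "P105", "upgrades_from": ["P003"]},
    {"product_id": "P110", "upgrades_from": ["P004"]},
    {"product_id": "P111", "upgrades_from": ["P004"]},
    {"product_id": "P112", "upgrades_from": ["P004"]},
]

# A customer who owns the Tier 1 product P004 is eligible for three upgrades.
print(upgrade_candidates(["P004"], catalog))
```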

### **3. Concern-based cross-sells**

Recommending treatment serums, masks, or advanced moisturizers depending on skin goals.
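
One simple way to approximate this is to rank unowned products by how many of the customer's `skin_concerns` appear in each product's `target_concerns`. The function name `concern_cross_sells` and the overlap-count scoring are assumptions made for illustration; the sample records are trimmed copies of `C012` and four catalog entries.

```python
def concern_cross_sells(customer: dict, catalog: list[dict]) -> list[tuple[str, int]]:
    """Unowned products ranked by number of shared concerns, best first."""
    owned = {item["product_id"] for item in customer["products_owned"]}
    concerns = set(customer["skin_concerns"])
    scored = []
    for product in catalog:
        if product["product_id"] in owned:
            continue  # never recommend something the customer already has
        overlap = concerns & set(product["target_concerns"])
        if overlap:
            scored.append((product["product_id"], len(overlap)))
    # Highest overlap first; ties broken by product ID for stable output.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))


# Trimmed copies of C012 and four catalog entries.
c012 = {
    "customer_id": "C012",
    "products_owned": [{"product_id": "P108"}],
    "skin_concerns": ["sensitivity", "dryness"],
}
catalog = [
    {"product_id": "P103", "target_concerns": ["dryness", "sensitivity"]},
    {"product_id": "P108", "target_concerns": ["sensitivity", "dryness"]},
    {"product_id": "P110", "target_concerns": ["dryness", "barrier_repair"]},
    {"product_id": "P111", "target_concerns": ["acne", "oiliness"]},
]

print(concern_cross_sells(c012, catalog))
```

For `C012` this surfaces the hydrating toner and moisturizer and skips the acne-oriented gel, which lines up with the "gentle cleanser and moisturizer upsell" suggestion in that customer's notes.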

### **4. Routine-step adjacency logic**

Ensuring recommendations follow skincare order:
cleanser → toner → serum → moisturizer → SPF.
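
A small sketch of gap detection over that order follows, using the `routine_step` values from the catalog. `ROUTINE_ORDER`, `missing_steps`, and `next_step` are hypothetical names; the step labels follow the order given above, although the catalog also places eye products at step 3.

```python
# Step labels follow the skincare order described in the text.
ROUTINE_ORDER = {1: "cleanser", 2: "toner", 3: "serum", 4: "moisturizer", 5: "SPF"}

# routine_step values copied from the catalog entries for these product IDs.
STEP_BY_PRODUCT = {"P101": 1, "P104": 2, "P106": 3, "P110": 4, "P113": 5}


def missing_steps(owned_ids, step_by_product=STEP_BY_PRODUCT) -> list[int]:
    """Routine steps (in order) not covered by any owned product."""
    covered = {step_by_product[pid] for pid in owned_ids if pid in step_by_product}
    return [step for step in sorted(ROUTINE_ORDER) if step not in covered]


def next_step(owned_ids, step_by_product=STEP_BY_PRODUCT):
    """Earliest gap in the routine, or None when every step is covered."""
    gaps = missing_steps(owned_ids, step_by_product)
    return gaps[0] if gaps else None


# C002 owns a cleanser, a toner, and a serum.
owned = ["P101", "P104", "P106"]
print(missing_steps(owned))             # [4, 5]
print(ROUTINE_ORDER[next_step(owned)])  # moisturizer
```

The result for `C002` agrees with that customer's notes, which describe the routine as missing moisturizer and SPF.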

### **5. Better revenue optimization**

Using RFM, price sensitivity, and CLV to prioritize the highest-value opportunities.
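
The notebook does not define a scoring formula, so the following is purely an illustrative sketch of how these signals might be combined. The `PRICE_FIT` multipliers, the linear recency decay, and the lifetime-value scale are all arbitrary assumptions, and `opportunity_score` is a hypothetical name. The two sample records are trimmed copies of `C006` and `C007`.

```python
# Hypothetical multipliers: how well a full-price offer fits each sensitivity level.
PRICE_FIT = {"low": 1.0, "medium": 0.8, "high": 0.6}


def opportunity_score(customer: dict, offer_price: float, window_days: int = 90) -> float:
    """Rough priority proxy for one offer; larger means higher priority."""
    rfm = customer["rfm"]
    # Recent buyers get a weight near 1; buyers near the window edge get near 0.
    recency_weight = max(0.0, 1.0 - rfm["recency_days"] / window_days)
    # Lifetime value nudges the score up; the 100.0 scale is an arbitrary choice.
    value_weight = 1.0 + customer["lifetime_value"] / 100.0
    price_fit = PRICE_FIT[customer["price_sensitivity"]]
    return offer_price * price_fit * recency_weight * value_weight


# Trimmed copies of two records from the customers JSON.
c006 = {
    "customer_id": "C006",
    "lifetime_value": 50.0,
    "price_sensitivity": "high",
    "rfm": {"recency_days": 65, "frequency_90d": 1, "monetary_90d": 37.0},
}
c007 = {
    "customer_id": "C007",
    "lifetime_value": 193.58,
    "price_sensitivity": "low",
    "rfm": {"recency_days": 3, "frequency_90d": 6, "monetary_90d": 190.0},
}

# Rank both customers for the same 34.00 offer.
ranked = sorted([c006, c007], key=lambda c: opportunity_score(c, 34.0), reverse=True)
print([c["customer_id"] for c in ranked])
```

This version ranks purely for revenue, so the engaged `C007` comes out ahead of the lapsing `C006`. A retention-oriented flow would likely want the opposite emphasis and could weight `churn_risk` instead.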

### **6. Higher-quality personalization**

Taking into account:

* skin sensitivity
* product actives
* routine completeness
* ingredient conflicts
* customer tier preference

Tier 2 lays the groundwork for **Tier 3 clinical orchestration**, which will introduce professional-grade formulations and even more precise personalization logic.

---

## 🧱 **Why This Dataset Is Valuable in a Portfolio**

This Tier 2 dataset demonstrates:

* growing complexity across orchestrator generations
* mastery of structured data design
* customer-product graph thinking
* machine-learning-ready feature engineering
* advanced recommender system logic
* real-world personalization strategy
* multi-tier product lifecycle modeling
* scalable data modeling practices

This dataset is a **key stepping stone** toward building full multi-agent, multi-tier AI systems that operate like enterprise recommendation engines.




# customers


In [None]:
[
  {
    "customer_id": "C001",
    "name": "Sarah Lee",
    "email": "sarah.lee.t2@example.com",
    "loyalty_tier": "silver",
    "lifetime_value": 198.24,
    "churn_risk": 0.06,
    "price_sensitivity": "medium",
    "products_owned": [
      {
        "product_id": "P102",
        "purchase_date": "2025-11-01",
        "amount": 29.0
      },
      {
        "product_id": "P105",
        "purchase_date": "2025-11-03",
        "amount": 28.0
      },
      {
        "product_id": "P108",
        "purchase_date": "2025-11-06",
        "amount": 38.0
      },
      {
        "product_id": "P110",
        "purchase_date": "2025-11-09",
        "amount": 34.0
      },
      {
        "product_id": "P113",
        "purchase_date": "2025-11-11",
        "amount": 35.0
      },
      {
        "product_id": "P116",
        "purchase_date": "2025-10-17",
        "amount": 33.0
      }
    ],
    "purchase_history": [
      {
        "product_id": "P102",
        "date": "2025-11-01",
        "amount": 29.0
      },
      {
        "product_id": "P105",
        "date": "2025-11-03",
        "amount": 28.0
      },
      {
        "product_id": "P108",
        "date": "2025-11-06",
        "amount": 38.0
      },
      {
        "product_id": "P110",
        "date": "2025-11-09",
        "amount": 34.0
      },
      {
        "product_id": "P113",
        "date": "2025-11-11",
        "amount": 35.0
      },
      {
        "product_id": "P116",
        "date": "2025-10-17",
        "amount": 33.0
      }
    ],
    "categories": [
      "cleanser",
      "mask",
      "moisturizer",
      "serum",
      "spf",
      "toner"
    ],
    "rfm": {
      "recency_days": 10,
      "frequency_90d": 6,
      "monetary_90d": 197.0
    },
    "skin_concerns": [
      "dryness",
      "sensitivity"
    ],
    "tier_preference": 2,
    "notes": "Complete hydration-focused routine with cleanser, toner, serum, moisturizer, SPF, and mask."
  },
  {
    "customer_id": "C002",
    "name": "Mark Johnson",
    "email": "mark.johnson.t2@example.com",
    "loyalty_tier": "bronze",
    "lifetime_value": 102.56,
    "churn_risk": 0.31,
    "price_sensitivity": "high",
    "products_owned": [
      {
        "product_id": "P101",
        "purchase_date": "2025-10-12",
        "amount": 27.0
      },
      {
        "product_id": "P104",
        "purchase_date": "2025-10-14",
        "amount": 30.0
      },
      {
        "product_id": "P106",
        "purchase_date": "2025-10-17",
        "amount": 36.0
      }
    ],
    "purchase_history": [
      {
        "product_id": "P101",
        "date": "2025-10-12",
        "amount": 27.0
      },
      {
        "product_id": "P104",
        "date": "2025-10-14",
        "amount": 30.0
      },
      {
        "product_id": "P106",
        "date": "2025-10-17",
        "amount": 36.0
      }
    ],
    "categories": [
      "cleanser",
      "serum",
      "toner"
    ],
    "rfm": {
      "recency_days": 35,
      "frequency_90d": 3,
      "monetary_90d": 93.0
    },
    "skin_concerns": [
      "acne"
    ],
    "tier_preference": 1,
    "notes": "Budget-conscious acne routine missing moisturizer and SPF; good candidate for Tier 2 moisturizer upgrade."
  },
  {
    "customer_id": "C003",
    "name": "Emily Chen",
    "email": "emily.chen.t2@example.com",
    "loyalty_tier": "bronze",
    "lifetime_value": 50.0,
    "churn_risk": 0.29,
    "price_sensitivity": "high",
    "products_owned": [
      {
        "product_id": "P107",
        "purchase_date": "2025-09-27",
        "amount": 44.0
      }
    ],
    "purchase_history": [
      {
        "product_id": "P107",
        "date": "2025-09-27",
        "amount": 44.0
      }
    ],
    "categories": [
      "serum"
    ],
    "rfm": {
      "recency_days": 55,
      "frequency_90d": 1,
      "monetary_90d": 44.0
    },
    "skin_concerns": [
      "hyperpigmentation"
    ],
    "tier_preference": 1,
    "notes": "Single-product user relying on brightening serum; ideal for cleanser and SPF cross-sell."
  },
  {
    "customer_id": "C004",
    "name": "Natalie Park",
    "email": "natalie.park.t2@example.com",
    "loyalty_tier": "gold",
    "lifetime_value": 274.72,
    "churn_risk": 0.12,
    "price_sensitivity": "low",
    "products_owned": [
      {
        "product_id": "P103",
        "purchase_date": "2025-10-22",
        "amount": 32.0
      },
      {
        "product_id": "P104",
        "purchase_date": "2025-10-24",
        "amount": 30.0
      },
      {
        "product_id": "P109",
        "purchase_date": "2025-11-01",
        "amount": 49.0
      },
      {
        "product_id": "P112",
        "purchase_date": "2025-11-03",
        "amount": 52.0
      },
      {
        "product_id": "P113",
        "purchase_date": "2025-11-05",
        "amount": 35.0
      },
      {
        "product_id": "P119",
        "purchase_date": "2025-11-07",
        "amount": 42.0
      }
    ],
    "purchase_history": [
      {
        "product_id": "P103",
        "date": "2025-10-22",
        "amount": 32.0
      },
      {
        "product_id": "P104",
        "date": "2025-10-24",
        "amount": 30.0
      },
      {
        "product_id": "P109",
        "date": "2025-11-01",
        "amount": 49.0
      },
      {
        "product_id": "P112",
        "date": "2025-11-03",
        "amount": 52.0
      },
      {
        "product_id": "P113",
        "date": "2025-11-05",
        "amount": 35.0
      },
      {
        "product_id": "P119",
        "date": "2025-11-07",
        "amount": 42.0
      }
    ],
    "categories": [
      "cleanser",
      "eye_cream",
      "moisturizer",
      "serum",
      "spf",
      "toner"
    ],
    "rfm": {
      "recency_days": 14,
      "frequency_90d": 6,
      "monetary_90d": 240.0
    },
    "skin_concerns": [
      "aging",
      "fine_lines"
    ],
    "tier_preference": 2,
    "notes": "Mature-skin customer with full anti-aging routine; potential Tier 3 candidate later."
  },
  {
    "customer_id": "C005",
    "name": "Alicia Gomez",
    "email": "alicia.gomez.t2@example.com",
    "loyalty_tier": "silver",
    "lifetime_value": 158.36,
    "churn_risk": 0.16,
    "price_sensitivity": "medium",
    "products_owned": [
      {
        "product_id": "P101",
        "purchase_date": "2025-10-30",
        "amount": 27.0
      },
      {
        "product_id": "P107",
        "purchase_date": "2025-11-02",
        "amount": 44.0
      },
      {
        "product_id": "P110",
        "purchase_date": "2025-11-04",
        "amount": 34.0
      },
      {
        "product_id": "P114",
        "purchase_date": "2025-11-06",
        "amount": 37.0
      }
    ],
    "purchase_history": [
      {
        "product_id": "P101",
        "date": "2025-10-30",
        "amount": 27.0
      },
      {
        "product_id": "P107",
        "date": "2025-11-02",
        "amount": 44.0
      },
      {
        "product_id": "P110",
        "date": "2025-11-04",
        "amount": 34.0
      },
      {
        "product_id": "P114",
        "date": "2025-11-06",
        "amount": 37.0
      }
    ],
    "categories": [
      "cleanser",
      "moisturizer",
      "serum",
      "spf"
    ],
    "rfm": {
      "recency_days": 15,
      "frequency_90d": 4,
      "monetary_90d": 142.0
    },
    "skin_concerns": [
      "aging",
      "hyperpigmentation"
    ],
    "tier_preference": 2,
    "notes": "Well-structured brightening and anti-aging routine; missing toner and eye care."
  },
  {
    "customer_id": "C006",
    "name": "Jason Patel",
    "email": "jason.patel.t2@example.com",
    "loyalty_tier": "bronze",
    "lifetime_value": 50.0,
    "churn_risk": 0.7,
    "price_sensitivity": "high",
    "products_owned": [
      {
        "product_id": "P114",
        "purchase_date": "2025-09-17",
        "amount": 37.0
      }
    ],
    "purchase_history": [
      {
        "product_id": "P114",
        "date": "2025-09-17",
        "amount": 37.0
      }
    ],
    "categories": [
      "spf"
    ],
    "rfm": {
      "recency_days": 65,
      "frequency_90d": 1,
      "monetary_90d": 37.0
    },
    "skin_concerns": [
      "general"
    ],
    "tier_preference": 1,
    "notes": "Single-product SPF user with high churn risk; ideal for re-engagement and routine-building cross-sell."
  },
  {
    "customer_id": "C007",
    "name": "Omar Hassan",
    "email": "omar.hassan.t2@example.com",
    "loyalty_tier": "gold",
    "lifetime_value": 193.58,
    "churn_risk": 0.15,
    "price_sensitivity": "low",
    "products_owned": [
      {
        "product_id": "P101",
        "purchase_date": "2025-11-13",
        "amount": 27.0
      },
      {
        "product_id": "P104",
        "purchase_date": "2025-11-14",
        "amount": 30.0
      },
      {
        "product_id": "P106",
        "purchase_date": "2025-11-15",
        "amount": 36.0
      },
      {
        "product_id": "P111",
        "purchase_date": "2025-11-16",
        "amount": 33.0
      },
      {
        "product_id": "P113",
        "purchase_date": "2025-11-17",
        "amount": 35.0
      },
      {
        "product_id": "P117",
        "purchase_date": "2025-11-18",
        "amount": 29.0
      }
    ],
    "purchase_history": [
      {
        "product_id": "P101",
        "date": "2025-11-13",
        "amount": 27.0
      },
      {
        "product_id": "P104",
        "date": "2025-11-14",
        "amount": 30.0
      },
      {
        "product_id": "P106",
        "date": "2025-11-15",
        "amount": 36.0
      },
      {
        "product_id": "P111",
        "date": "2025-11-16",
        "amount": 33.0
      },
      {
        "product_id": "P113",
        "date": "2025-11-17",
        "amount": 35.0
      },
      {
        "product_id": "P117",
        "date": "2025-11-18",
        "amount": 29.0
      }
    ],
    "categories": [
      "cleanser",
      "moisturizer",
      "serum",
      "spf",
      "toner",
      "treatment"
    ],
    "rfm": {
      "recency_days": 3,
      "frequency_90d": 6,
      "monetary_90d": 190.0
    },
    "skin_concerns": [
      "acne",
      "aging"
    ],
    "tier_preference": 2,
    "notes": "Highly engaged Tier 2 customer with acne-focused yet complete routine, including treatment and SPF."
  },
  {
    "customer_id": "C008",
    "name": "David Brooks",
    "email": "david.brooks.t2@example.com",
    "loyalty_tier": "silver",
    "lifetime_value": 111.37,
    "churn_risk": 0.39,
    "price_sensitivity": "medium",
    "products_owned": [
      {
        "product_id": "P106",
        "purchase_date": "2025-10-07",
        "amount": 36.0
      },
      {
        "product_id": "P115",
        "purchase_date": "2025-10-10",
        "amount": 39.0
      },
      {
        "product_id": "P117",
        "purchase_date": "2025-10-12",
        "amount": 29.0
      }
    ],
    "purchase_history": [
      {
        "product_id": "P106",
        "date": "2025-10-07",
        "amount": 36.0
      },
      {
        "product_id": "P115",
        "date": "2025-10-10",
        "amount": 39.0
      },
      {
        "product_id": "P117",
        "date": "2025-10-12",
        "amount": 29.0
      }
    ],
    "categories": [
      "mask",
      "serum",
      "treatment"
    ],
    "rfm": {
      "recency_days": 40,
      "frequency_90d": 3,
      "monetary_90d": 104.0
    },
    "skin_concerns": [
      "acne"
    ],
    "tier_preference": 1,
    "notes": "Treatment-heavy acne routine with no dedicated cleanser or SPF; chaotic usage pattern."
  },
  {
    "customer_id": "C009",
    "name": "Julia Walters",
    "email": "julia.walters.t2@example.com",
    "loyalty_tier": "bronze",
    "lifetime_value": 87.96,
    "churn_risk": 0.63,
    "price_sensitivity": "high",
    "products_owned": [
      {
        "product_id": "P116",
        "purchase_date": "2025-10-02",
        "amount": 33.0
      },
      {
        "product_id": "P119",
        "purchase_date": "2025-10-04",
        "amount": 42.0
      }
    ],
    "purchase_history": [
      {
        "product_id": "P116",
        "date": "2025-10-02",
        "amount": 33.0
      },
      {
        "product_id": "P119",
        "date": "2025-10-04",
        "amount": 42.0
      }
    ],
    "categories": [
      "eye_cream",
      "mask"
    ],
    "rfm": {
      "recency_days": 50,
      "frequency_90d": 2,
      "monetary_90d": 75.0
    },
    "skin_concerns": [
      "dryness"
    ],
    "tier_preference": 2,
    "notes": "Uses hydrating mask and eye cream only; strong opportunity to introduce cleanser, serum, and moisturizer."
  },
  {
    "customer_id": "C010",
    "name": "Kevin Ramirez",
    "email": "kevin.ramirez.t2@example.com",
    "loyalty_tier": "silver",
    "lifetime_value": 247.62,
    "churn_risk": 0.17,
    "price_sensitivity": "medium",
    "products_owned": [
      {
        "product_id": "P103",
        "purchase_date": "2025-11-03",
        "amount": 32.0
      },
      {
        "product_id": "P105",
        "purchase_date": "2025-11-05",
        "amount": 28.0
      },
      {
        "product_id": "P107",
        "purchase_date": "2025-11-07",
        "amount": 44.0
      },
      {
        "product_id": "P110",
        "purchase_date": "2025-11-09",
        "amount": 34.0
      },
      {
        "product_id": "P113",
        "purchase_date": "2025-11-11",
        "amount": 35.0
      },
      {
        "product_id": "P118",
        "purchase_date": "2025-11-12",
        "amount": 46.0
      }
    ],
    "purchase_history": [
      {
        "product_id": "P103",
        "date": "2025-11-03",
        "amount": 32.0
      },
      {
        "product_id": "P105",
        "date": "2025-11-05",
        "amount": 28.0
      },
      {
        "product_id": "P107",
        "date": "2025-11-07",
        "amount": 44.0
      },
      {
        "product_id": "P110",
        "date": "2025-11-09",
        "amount": 34.0
      },
      {
        "product_id": "P113",
        "date": "2025-11-11",
        "amount": 35.0
      },
      {
        "product_id": "P118",
        "date": "2025-11-12",
        "amount": 46.0
      }
    ],
    "categories": [
      "cleanser",
      "moisturizer",
      "serum",
      "spf",
      "toner",
      "treatment"
    ],
    "rfm": {
      "recency_days": 9,
      "frequency_90d": 6,
      "monetary_90d": 219.0
    },
    "skin_concerns": [
      "hyperpigmentation"
    ],
    "tier_preference": 2,
    "notes": "Complete brightening routine including dark-spot treatment; candidate for future Tier 3 clinical upgrades."
  },
  {
    "customer_id": "C011",
    "name": "Laura Kim",
    "email": "laura.kim.t2@example.com",
    "loyalty_tier": "gold",
    "lifetime_value": 156.88,
    "churn_risk": 0.24,
    "price_sensitivity": "low",
    "products_owned": [
      {
        "product_id": "P109",
        "purchase_date": "2025-10-20",
        "amount": 49.0
      },
      {
        "product_id": "P112",
        "purchase_date": "2025-10-22",
        "amount": 52.0
      },
      {
        "product_id": "P120",
        "purchase_date": "2025-10-24",
        "amount": 54.0
      }
    ],
    "purchase_history": [
      {
        "product_id": "P109",
        "date": "2025-10-20",
        "amount": 49.0
      },
      {
        "product_id": "P112",
        "date": "2025-10-22",
        "amount": 52.0
      },
      {
        "product_id": "P120",
        "date": "2025-10-24",
        "amount": 54.0
      }
    ],
    "categories": [
      "eye_cream",
      "moisturizer",
      "serum"
    ],
    "rfm": {
      "recency_days": 28,
      "frequency_90d": 3,
      "monetary_90d": 155.0
    },
    "skin_concerns": [
      "aging",
      "fine_lines"
    ],
    "tier_preference": 3,
    "notes": "Premium anti-aging shopper focused on serum, night cream, and eye serum; missing cleanser and SPF."
  },
  {
    "customer_id": "C012",
    "name": "Priya Desai",
    "email": "priya.desai.t2@example.com",
    "loyalty_tier": "silver",
    "lifetime_value": 50.0,
    "churn_risk": 0.41,
    "price_sensitivity": "high",
    "products_owned": [
      {
        "product_id": "P108",
        "purchase_date": "2025-09-24",
        "amount": 38.0
      }
    ],
    "purchase_history": [
      {
        "product_id": "P108",
        "date": "2025-09-24",
        "amount": 38.0
      }
    ],
    "categories": [
      "serum"
    ],
    "rfm": {
      "recency_days": 58,
      "frequency_90d": 1,
      "monetary_90d": 38.0
    },
    "skin_concerns": [
      "sensitivity",
      "dryness"
    ],
    "tier_preference": 1,
    "notes": "Hydration-focused but minimal routine with only a serum; sensitive skin suggests gentle cleanser and moisturizer upsell."
  },
  {
    "customer_id": "C013",
    "name": "Thomas Nguyen",
    "email": "thomas.nguyen.t2@example.com",
    "loyalty_tier": "gold",
    "lifetime_value": 245.04,
    "churn_risk": 0.06,
    "price_sensitivity": "low",
    "products_owned": [
      {
        "product_id": "P101",
        "purchase_date": "2025-11-07",
        "amount": 27.0
      },
      {
        "product_id": "P104",
        "purchase_date": "2025-11-08",
        "amount": 30.0
      },
      {
        "product_id": "P106",
        "purchase_date": "2025-11-09",
        "amount": 36.0
      },
      {
        "product_id": "P111",
        "purchase_date": "2025-11-10",
        "amount": 33.0
      },
      {
        "product_id": "P113",
        "purchase_date": "2025-11-11",
        "amount": 35.0
      },
      {
        "product_id": "P117",
        "purchase_date": "2025-11-12",
        "amount": 29.0
      },
      {
        "product_id": "P118",
        "purchase_date": "2025-11-13",
        "amount": 46.0
      }
    ],
    "purchase_history": [
      {
        "product_id": "P101",
        "date": "2025-11-07",
        "amount": 27.0
      },
      {
        "product_id": "P104",
        "date": "2025-11-08",
        "amount": 30.0
      },
      {
        "product_id": "P106",
        "date": "2025-11-09",
        "amount": 36.0
      },
      {
        "product_id": "P111",
        "date": "2025-11-10",
        "amount": 33.0
      },
      {
        "product_id": "P113",
        "date": "2025-11-11",
        "amount": 35.0
      },
      {
        "product_id": "P117",
        "date": "2025-11-12",
        "amount": 29.0
      },
      {
        "product_id": "P118",
        "date": "2025-11-13",
        "amount": 46.0
      }
    ],
    "categories": [
      "cleanser",
      "moisturizer",
      "serum",
      "spf",
      "treatment",
      "toner"
    ],
    "rfm": {
      "recency_days": 8,
      "frequency_90d": 7,
      "monetary_90d": 236.0
    },
    "skin_concerns": [
      "acne",
      "hyperpigmentation"
    ],
    "tier_preference": 2,
    "notes": "Advanced routine with cleanser, toner, serum, moisturizer, SPF, and dual treatments for acne and dark spots."
  },
  {
    "customer_id": "C014",
    "name": "Isabella Rossi",
    "email": "isabella.rossi.t2@example.com",
    "loyalty_tier": "silver",
    "lifetime_value": 137.76,
    "churn_risk": 0.12,
    "price_sensitivity": "medium",
    "products_owned": [
      {
        "product_id": "P102",
        "purchase_date": "2025-10-27",
        "amount": 29.0
      },
      {
        "product_id": "P108",
        "purchase_date": "2025-10-30",
        "amount": 38.0
      },
      {
        "product_id": "P110",
        "purchase_date": "2025-11-01",
        "amount": 34.0
      },
      {
        "product_id": "P116",
        "purchase_date": "2025-11-03",
        "amount": 33.0
      }
    ],
    "purchase_history": [
      {
        "product_id": "P102",
        "date": "2025-10-27",
        "amount": 29.0
      },
      {
        "product_id": "P108",
        "date": "2025-10-30",
        "amount": 38.0
      },
      {
        "product_id": "P110",
        "date": "2025-11-01",
        "amount": 34.0
      },
      {
        "product_id": "P116",
        "date": "2025-11-03",
        "amount": 33.0
      }
    ],
    "categories": [
      "cleanser",
      "mask",
      "moisturizer",
      "serum"
    ],
    "rfm": {
      "recency_days": 18,
      "frequency_90d": 4,
      "monetary_90d": 134.0
    },
    "skin_concerns": [
      "dryness"
    ],
    "tier_preference": 2,
    "notes": "Hydration and barrier-repair focused routine missing SPF; strong candidate for protection-focused cross-sell."
  },
  {
    "customer_id": "C015",
    "name": "Ethan Wood",
    "email": "ethan.wood.t2@example.com",
    "loyalty_tier": "platinum",
    "lifetime_value": 209.76,
    "churn_risk": 0.07,
    "price_sensitivity": "low",
    "products_owned": [
      {
        "product_id": "P109",
        "purchase_date": "2025-11-15",
        "amount": 49.0
      },
      {
        "product_id": "P112",
        "purchase_date": "2025-11-16",
        "amount": 52.0
      },
      {
        "product_id": "P120",
        "purchase_date": "2025-11-17",
        "amount": 54.0
      },
      {
        "product_id": "P114",
        "purchase_date": "2025-11-18",
        "amount": 37.0
      }
    ],
    "purchase_history": [
      {
        "product_id": "P109",
        "date": "2025-11-15",
        "amount": 49.0
      },
      {
        "product_id": "P112",
        "date": "2025-11-16",
        "amount": 52.0
      },
      {
        "product_id": "P120",
        "date": "2025-11-17",
        "amount": 54.0
      },
      {
        "product_id": "P114",
        "date": "2025-11-18",
        "amount": 37.0
      }
    ],
    "categories": [
      "eye_cream",
      "moisturizer",
      "serum",
      "spf"
    ],
    "rfm": {
      "recency_days": 3,
      "frequency_90d": 4,
      "monetary_90d": 192.0
    },
    "skin_concerns": [
      "aging",
      "fine_lines"
    ],
    "tier_preference": 3,
    "notes": "High-spend, premium shopper with advanced anti-aging focus and SPF; could be introduced to complementary masks and treatments."
  },
  {
    "customer_id": "C016",
    "name": "Mia Alvarez",
    "email": "mia.alvarez.t2@example.com",
    "loyalty_tier": "platinum",
    "lifetime_value": 50.0,
    "churn_risk": 0.86,
    "price_sensitivity": "medium",
    "products_owned": [
      {
        "product_id": "P118",
        "purchase_date": "2025-09-07",
        "amount": 46.0
      }
    ],
    "purchase_history": [
      {
        "product_id": "P118",
        "date": "2025-09-07",
        "amount": 46.0
      }
    ],
    "categories": [
      "treatment"
    ],
    "rfm": {
      "recency_days": 75,
      "frequency_90d": 1,
      "monetary_90d": 46.0
    },
    "skin_concerns": [
      "hyperpigmentation"
    ],
    "tier_preference": 3,
    "notes": "Premium dark-spot treatment user with no supporting routine; high-value target for cleanser, SPF, and moisturizer upsell."
  }
]


# products

In [None]:
[
  {
    "product_id": "P101",
    "name": "Pro Renewal Gel Cleanser",
    "category": "cleanser",
    "tier": 2,
    "price": 27.00,
    "margin": "medium",
    "routine_step": 1,
    "active_ingredients": [
      "Lactic Acid",
      "Aloe Vera"
    ],
    "target_concerns": [
      "dullness",
      "congestion"
    ],
    "upgrades_from": [
      "P001"
    ],
    "replenishment_cycle_days": 30,
    "recommended_cross_sells": [
      "P104",
      "P107",
      "P110"
    ]
  },
  {
    "product_id": "P102",
    "name": "Clarifying Deep Cleanse Foam",
    "category": "cleanser",
    "tier": 2,
    "price": 29.00,
    "margin": "high",
    "routine_step": 1,
    "active_ingredients": [
      "Salicylic Acid (BHA)",
      "Tea Tree"
    ],
    "target_concerns": [
      "acne",
      "oiliness"
    ],
    "upgrades_from": [
      "P010"
    ],
    "replenishment_cycle_days": 30,
    "recommended_cross_sells": [
      "P104",
      "P106",
      "P110"
    ]
  },
  {
    "product_id": "P103",
    "name": "HydraBalance Prep Toner",
    "category": "toner",
    "tier": 2,
    "price": 32.00,
    "margin": "medium",
    "routine_step": 2,
    "active_ingredients": [
      "Hyaluronic Acid",
      "Panthenol"
    ],
    "target_concerns": [
      "dryness",
      "sensitivity"
    ],
    "upgrades_from": [
      "P002"
    ],
    "replenishment_cycle_days": 45,
    "recommended_cross_sells": [
      "P107",
      "P110",
      "P113"
    ]
  },
  {
    "product_id": "P104",
    "name": "BrightFresh Vitamin C Toner",
    "category": "toner",
    "tier": 2,
    "price": 30.00,
    "margin": "medium",
    "routine_step": 2,
    "active_ingredients": [
      "Vitamin C (SAP)",
      "Licorice Root"
    ],
    "target_concerns": [
      "hyperpigmentation",
      "dullness"
    ],
    "upgrades_from": [
      "P002"
    ],
    "replenishment_cycle_days": 45,
    "recommended_cross_sells": [
      "P107",
      "P118",
      "P113"
    ]
  },
  {
    "product_id": "P105",
    "name": "Niacinamide Balance Serum",
    "category": "serum",
    "tier": 2,
    "price": 28.00,
    "margin": "medium",
    "routine_step": 3,
    "active_ingredients": [
      "Niacinamide 5%",
      "Zinc PCA"
    ],
    "target_concerns": [
      "acne",
      "redness"
    ],
    "upgrades_from": [
      "P003"
    ],
    "replenishment_cycle_days": 30,
    "recommended_cross_sells": [
      "P110",
      "P114",
      "P117"
    ]
  },
  {
    "product_id": "P106",
    "name": "AHA Resurfacing Serum",
    "category": "serum",
    "tier": 2,
    "price": 36.00,
    "margin": "high",
    "routine_step": 3,
    "active_ingredients": [
      "Glycolic Acid 5%",
      "Lactic Acid"
    ],
    "target_concerns": [
      "texture",
      "dullness"
    ],
    "upgrades_from": [
      "P003"
    ],
    "replenishment_cycle_days": 30,
    "recommended_cross_sells": [
      "P103",
      "P110",
      "P115"
    ]
  },
  {
    "product_id": "P107",
    "name": "Radiance Boost Vitamin C Serum",
    "category": "serum",
    "tier": 2,
    "price": 44.00,
    "margin": "high",
    "routine_step": 3,
    "active_ingredients": [
      "Vitamin C (AA2G)",
      "Ferulic Acid"
    ],
    "target_concerns": [
      "hyperpigmentation",
      "brightness"
    ],
    "upgrades_from": [
      "P003"
    ],
    "replenishment_cycle_days": 30,
    "recommended_cross_sells": [
      "P110",
      "P113",
      "P118"
    ]
  },
  {
    "product_id": "P108",
    "name": "Calm+ Barrier Repair Serum",
    "category": "serum",
    "tier": 2,
    "price": 38.00,
    "margin": "medium",
    "routine_step": 3,
    "active_ingredients": [
      "Ceramides",
      "Centella Asiatica"
    ],
    "target_concerns": [
      "sensitivity",
      "dryness"
    ],
    "upgrades_from": [
      "P003"
    ],
    "replenishment_cycle_days": 30,
    "recommended_cross_sells": [
      "P103",
      "P110",
      "P116"
    ]
  },
  {
    "product_id": "P109",
    "name": "Revive Peptide Eye Serum",
    "category": "eye_cream",
    "tier": 2,
    "price": 49.00,
    "margin": "high",
    "routine_step": 3,
    "active_ingredients": [
      "Peptides",
      "Caffeine"
    ],
    "target_concerns": [
      "fine_lines",
      "puffiness"
    ],
    "upgrades_from": [
      "P009"
    ],
    "replenishment_cycle_days": 45,
    "recommended_cross_sells": [
      "P112",
      "P110",
      "P113"
    ]
  },
  {
    "product_id": "P110",
    "name": "Daily Defense Hydrating Moisturizer",
    "category": "moisturizer",
    "tier": 2,
    "price": 34.00,
    "margin": "medium",
    "routine_step": 4,
    "active_ingredients": [
      "Squalane",
      "Ceramides"
    ],
    "target_concerns": [
      "dryness",
      "barrier_repair"
    ],
    "upgrades_from": [
      "P004"
    ],
    "replenishment_cycle_days": 40,
    "recommended_cross_sells": [
      "P103",
      "P108",
      "P114"
    ]
  },
  {
    "product_id": "P111",
    "name": "Oil-Free Clarifying Gel Moisturizer",
    "category": "moisturizer",
    "tier": 2,
    "price": 33.00,
    "margin": "medium",
    "routine_step": 4,
    "active_ingredients": [
      "Niacinamide",
      "Green Tea"
    ],
    "target_concerns": [
      "acne",
      "oiliness"
    ],
    "upgrades_from": [
      "P004"
    ],
    "replenishment_cycle_days": 40,
    "recommended_cross_sells": [
      "P102",
      "P105",
      "P114"
    ]
  },
  {
    "product_id": "P112",
    "name": "Firm+ Overnight Peptide Cream",
    "category": "moisturizer",
    "tier": 2,
    "price": 52.00,
    "margin": "high",
    "routine_step": 4,
    "active_ingredients": [
      "Peptides",
      "Retinal"
    ],
    "target_concerns": [
      "aging",
      "fine_lines"
    ],
    "upgrades_from": [
      "P004"
    ],
    "replenishment_cycle_days": 40,
    "recommended_cross_sells": [
      "P109",
      "P118",
      "P113"
    ]
  },
  {
    "product_id": "P113",
    "name": "UltraShield SPF 40",
    "category": "spf",
    "tier": 2,
    "price": 35.00,
    "margin": "medium",
    "routine_step": 5,
    "active_ingredients": [
      "Zinc Oxide",
      "Vitamin E"
    ],
    "target_concerns": [
      "sun_damage",
      "hyperpigmentation"
    ],
    "upgrades_from": [
      "P005"
    ],
    "replenishment_cycle_days": 30,
    "recommended_cross_sells": [
      "P107",
      "P110",
      "P119"
    ]
  },
  {
    "product_id": "P114",
    "name": "SPF 50 Mattifying Fluid",
    "category": "spf",
    "tier": 2,
    "price": 37.00,
    "margin": "high",
    "routine_step": 5,
    "active_ingredients": [
      "Titanium Dioxide",
      "Niacinamide"
    ],
    "target_concerns": [
      "oiliness",
      "acne"
    ],
    "upgrades_from": [
      "P005"
    ],
    "replenishment_cycle_days": 30,
    "recommended_cross_sells": [
      "P102",
      "P111",
      "P117"
    ]
  },
  {
    "product_id": "P115",
    "name": "AHA Glow Renewal Mask",
    "category": "mask",
    "tier": 2,
    "price": 39.00,
    "margin": "medium",
    "routine_step": 3,
    "active_ingredients": [
      "Glycolic Acid",
      "Pumpkin Enzymes"
    ],
    "target_concerns": [
      "texture",
      "dullness"
    ],
    "upgrades_from": [
      "P006"
    ],
    "replenishment_cycle_days": 20,
    "recommended_cross_sells": [
      "P106",
      "P103",
      "P110"
    ]
  },
  {
    "product_id": "P116",
    "name": "Barrier Repair Overnight Mask",
    "category": "mask",
    "tier": 2,
    "price": 33.00,
    "margin": "medium",
    "routine_step": 3,
    "active_ingredients": [
      "Ceramides",
      "Oat Extract"
    ],
    "target_concerns": [
      "dryness",
      "sensitivity"
    ],
    "upgrades_from": [
      "P006"
    ],
    "replenishment_cycle_days": 20,
    "recommended_cross_sells": [
      "P108",
      "P110",
      "P113"
    ]
  },
  {
    "product_id": "P117",
    "name": "Acne Control Spot Treatment",
    "category": "treatment",
    "tier": 2,
    "price": 29.00,
    "margin": "high",
    "routine_step": 3,
    "active_ingredients": [
      "Benzoyl Peroxide 2.5%",
      "Niacinamide"
    ],
    "target_concerns": [
      "acne"
    ],
    "upgrades_from": [
      "P006"
    ],
    "replenishment_cycle_days": 25,
    "recommended_cross_sells": [
      "P102",
      "P111",
      "P114"
    ]
  },
  {
    "product_id": "P118",
    "name": "Dark Spot Corrector Serum",
    "category": "treatment",
    "tier": 2,
    "price": 46.00,
    "margin": "high",
    "routine_step": 3,
    "active_ingredients": [
      "Tranexamic Acid",
      "Kojic Acid"
    ],
    "target_concerns": [
      "hyperpigmentation"
    ],
    "upgrades_from": [
      "P006"
    ],
    "replenishment_cycle_days": 25,
    "recommended_cross_sells": [
      "P107",
      "P113",
      "P110"
    ]
  },
  {
    "product_id": "P119",
    "name": "HydraRecover Soothing Mask",
    "category": "mask",
    "tier": 2,
    "price": 42.00,
    "margin": "medium",
    "routine_step": 3,
    "active_ingredients": [
      "Centella Asiatica",
      "Panthenol"
    ],
    "target_concerns": [
      "redness",
      "sensitivity"
    ],
    "upgrades_from": [
      "P006"
    ],
    "replenishment_cycle_days": 20,
    "recommended_cross_sells": [
      "P108",
      "P113",
      "P110"
    ]
  },
  {
    "product_id": "P120",
    "name": "Pro Peptide Lift Serum",
    "category": "serum",
    "tier": 2,
    "price": 54.00,
    "margin": "high",
    "routine_step": 3,
    "active_ingredients": [
      "Matrixyl Peptides",
      "Amino Acids"
    ],
    "target_concerns": [
      "aging",
      "fine_lines"
    ],
    "upgrades_from": [
      "P003"
    ],
    "replenishment_cycle_days": 30,
    "recommended_cross_sells": [
      "P112",
      "P109",
      "P113"
    ]
  }
]
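
Each record's `upgrades_from` field points back to a Tier 1 product, so an upgrade lookup can be built by inverting that field. The following is a minimal sketch, assuming the catalog is loaded as a list of dicts shaped like the records above. The helper name `build_upgrade_index` is hypothetical, and the three sample records are abbreviated copies of catalog entries.

```python
from collections import defaultdict

# Abbreviated copies of three catalog records (only the fields used here).
sample_catalog = [
    {"product_id": "P106", "name": "AHA Resurfacing Serum", "upgrades_from": ["P003"]},
    {"product_id": "P107", "name": "Radiance Boost Vitamin C Serum", "upgrades_from": ["P003"]},
    {"product_id": "P110", "name": "Daily Defense Hydrating Moisturizer", "upgrades_from": ["P004"]},
]

def build_upgrade_index(catalog):
    """Map each Tier 1 product ID to the Tier 2 product IDs that upgrade from it."""
    index = defaultdict(list)
    for product in catalog:
        for tier1_id in product.get("upgrades_from", []):
            index[tier1_id].append(product["product_id"])
    return dict(index)

upgrade_index = build_upgrade_index(sample_catalog)
print(upgrade_index)
```

An agent holding a customer's Tier 1 product ID could use such an index to list candidate upgrades before applying concern matching.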

# data validation script

In [None]:
#!/usr/bin/env python3
"""
Validation script for Tier 2 customer and product data
"""
import json
from datetime import datetime, timedelta

# Load data
with open('data_tier_2/product_catalog.json') as f:
    products = json.load(f)

with open('data_tier_2/customers.json') as f:
    customers = json.load(f)

# Get product IDs and create lookup
product_ids = {p['product_id'] for p in products}
product_lookup = {p['product_id']: p for p in products}

print("=" * 70)
print("TIER 2 DATA VALIDATION REPORT")
print("=" * 70)
print(f"\n📦 Products loaded: {len(products)}")
print(f"👥 Customers loaded: {len(customers)}\n")

errors = []
warnings = []
info = []

# Validate products first
print("🔍 Validating Product Catalog...")
for product in products:
    pid = product['product_id']

    # Check tier
    if product.get('tier') != 2:
        errors.append(f"Product {pid}: Should be tier 2, got {product.get('tier')}")

    # Check price range
    price = product.get('price', 0)
    if not (25.0 <= price <= 59.99):
        errors.append(f"Product {pid}: Price ${price:.2f} out of range ($25.00-$59.99)")

    # Check required fields
    required_fields = ['routine_step', 'active_ingredients', 'target_concerns', 'upgrades_from', 'recommended_cross_sells']
    for field in required_fields:
        if field not in product:
            errors.append(f"Product {pid}: Missing required field '{field}'")

    # Check routine_step
    if 'routine_step' in product:
        rs = product['routine_step']
        if not (1 <= rs <= 5):
            errors.append(f"Product {pid}: routine_step {rs} should be 1-5")

    # Check cross-sells
    if 'recommended_cross_sells' in product:
        cross_sells = product['recommended_cross_sells']
        if not (3 <= len(cross_sells) <= 5):
            errors.append(f"Product {pid}: Should have 3-5 cross-sells, got {len(cross_sells)}")
        if pid in cross_sells:
            errors.append(f"Product {pid}: Self-references in cross-sells")
        for cs_id in cross_sells:
            if cs_id not in product_ids:
                errors.append(f"Product {pid}: Cross-sell {cs_id} does not exist")

print(f"✅ Product validation complete: {len(errors)} errors found\n")

# Validate customers
print("🔍 Validating Customer Data...")
today = datetime.now()

for customer in customers:
    cid = customer['customer_id']

    # Check products_owned
    for product in customer.get('products_owned', []):
        pid = product['product_id']
        if pid not in product_ids:
            errors.append(f"Customer {cid}: Missing product {pid} in products_owned")
        else:
            # Check if product matches customer's skin concerns
            prod = product_lookup[pid]
            customer_concerns = set(customer.get('skin_concerns', []))
            product_concerns = set(prod.get('target_concerns', []))

            # Allow some flexibility - at least one concern should match or be general
            if not customer_concerns.intersection(product_concerns) and 'general' not in product_concerns:
                warnings.append(f"Customer {cid}: Product {pid} concerns {product_concerns} don't match customer concerns {customer_concerns}")

    # Check purchase_history
    purchase_history = customer.get('purchase_history', [])
    for purchase in purchase_history:
        pid = purchase['product_id']
        if pid not in product_ids:
            errors.append(f"Customer {cid}: Missing product {pid} in purchase_history")

    # Validate RFM consistency
    if purchase_history:
        try:
            dates = [datetime.strptime(p['date'], '%Y-%m-%d') for p in purchase_history]
            most_recent = max(dates)
            recency_calc = (today - most_recent).days
            recency_actual = customer.get('rfm', {}).get('recency_days', 0)

            if abs(recency_calc - recency_actual) > 5:
                errors.append(f"Customer {cid}: recency_days mismatch (calc: {recency_calc}, actual: {recency_actual}, most recent: {most_recent.date()})")

            cutoff = today - timedelta(days=90)
            frequency_calc = sum(1 for d in dates if d >= cutoff)
            frequency_actual = customer.get('rfm', {}).get('frequency_90d', 0)
            if frequency_calc != frequency_actual:
                errors.append(f"Customer {cid}: frequency_90d mismatch (calc: {frequency_calc}, actual: {frequency_actual})")

            monetary_calc = sum(p['amount'] for p, d in zip(purchase_history, dates) if d >= cutoff)
            monetary_actual = customer.get('rfm', {}).get('monetary_90d', 0)
            if abs(monetary_calc - monetary_actual) > 0.01:
                errors.append(f"Customer {cid}: monetary_90d mismatch (calc: ${monetary_calc:.2f}, actual: ${monetary_actual:.2f})")
        except (ValueError, KeyError) as e:
            errors.append(f"Customer {cid}: Error validating RFM - {e}")

    # Check churn_risk format
    churn = customer.get('churn_risk', 0)
    if not (0.0 <= churn <= 1.0):
        errors.append(f"Customer {cid}: churn_risk should be 0.0-1.0, got {churn}")

    # Check lifetime_value consistency
    total_history = sum(p['amount'] for p in purchase_history)
    ltv = customer.get('lifetime_value', 0)
    if abs(total_history - ltv) > 1.0:
        warnings.append(f"Customer {cid}: lifetime_value (${ltv:.2f}) differs from purchase_history sum (${total_history:.2f})")

    # Check churn_risk vs recency alignment
    recency = customer.get('rfm', {}).get('recency_days', 0)
    if recency < 30 and churn > 0.3:
        warnings.append(f"Customer {cid}: Low recency ({recency}d) but high churn_risk ({churn:.2f}) - may need review")
    elif recency > 60 and churn < 0.3:
        warnings.append(f"Customer {cid}: High recency ({recency}d) but low churn_risk ({churn:.2f}) - may need review")

    # Check routine completeness
    owned_categories = set(customer.get('categories', []))
    essential_categories = {'cleanser', 'moisturizer', 'spf'}
    missing_essentials = essential_categories - owned_categories
    if missing_essentials:
        info.append(f"Customer {cid}: Missing essential categories: {missing_essentials} (routine gap opportunity)")

# Print results
print("=" * 70)
if errors:
    print(f"❌ ERRORS FOUND: {len(errors)}")
    print("-" * 70)
    for error in errors[:20]:  # Show first 20
        print(f"  • {error}")
    if len(errors) > 20:
        print(f"  ... and {len(errors) - 20} more errors")
else:
    print("✅ No critical errors found!")

print()
if warnings:
    print(f"⚠️  WARNINGS: {len(warnings)}")
    print("-" * 70)
    for warning in warnings[:15]:  # Show first 15
        print(f"  • {warning}")
    if len(warnings) > 15:
        print(f"  ... and {len(warnings) - 15} more warnings")
else:
    print("✅ No warnings!")

print()
if info:
    print(f"ℹ️  INFO: {len(info)} routine gap opportunities found")
    print("-" * 70)
    for i in info[:10]:  # Show first 10
        print(f"  • {i}")
    if len(info) > 10:
        print(f"  ... and {len(info) - 10} more opportunities")

print()
print("=" * 70)
print("SUMMARY")
print("=" * 70)
print(f"Total Errors:   {len(errors)}")
print(f"Total Warnings: {len(warnings)}")
print(f"Opportunities:  {len(info)}")
print()

if len(errors) == 0:
    print("🎉 Data validation PASSED! Ready for orchestrator development.")
else:
    print("⚠️  Please fix errors before proceeding.")
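
Because the script calls `datetime.now()`, the recency and 90-day checks depend on the day it is run. The sketch below shows the same RFM recalculation as a function that takes an explicit reference date, which makes the check repeatable. The function name `compute_rfm` and the sample purchases are made up for illustration and are not taken from `customers.json`.

```python
from datetime import datetime, timedelta

def compute_rfm(purchase_history, as_of, window_days=90):
    """Recompute recency, frequency, and monetary values relative to `as_of`.

    Assumes a non-empty purchase_history, as the validator does.
    """
    dates = [datetime.strptime(p["date"], "%Y-%m-%d") for p in purchase_history]
    cutoff = as_of - timedelta(days=window_days)
    in_window = [p for p, d in zip(purchase_history, dates) if d >= cutoff]
    return {
        "recency_days": (as_of - max(dates)).days,
        "frequency_90d": len(in_window),
        "monetary_90d": sum(p["amount"] for p in in_window),
    }

# Hypothetical purchases for illustration only.
history = [
    {"product_id": "P110", "date": "2025-01-10", "amount": 34.00},
    {"product_id": "P113", "date": "2025-03-01", "amount": 35.00},
    {"product_id": "P107", "date": "2024-09-15", "amount": 44.00},
]
rfm = compute_rfm(history, as_of=datetime(2025, 3, 15))
print(rfm)
```

Passing a fixed `as_of` keeps stored `recency_days` values from drifting out of tolerance simply because time has passed since the data was generated.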



# validation results

In [None]:
(.venv) micahshull@Micahs-iMac LG_Cursor_030_CrossSell_Upsell_Orchestrator % cd /Users/micahshull/Documents/AI_LangGraph/LG_Cursor_030_CrossSell_Upsell_Orchestrator && python3 validate_tier2_data.py
======================================================================
TIER 2 DATA VALIDATION REPORT
======================================================================

📦 Products loaded: 20
👥 Customers loaded: 16

🔍 Validating Product Catalog...
✅ Product validation complete: 0 errors found

🔍 Validating Customer Data...
======================================================================
✅ No critical errors found!

⚠️  WARNINGS: 44
----------------------------------------------------------------------
  • Customer C001: Product P102 concerns {'acne', 'oiliness'} don't match customer concerns {'dryness', 'sensitivity'}
  • Customer C001: Product P105 concerns {'acne', 'redness'} don't match customer concerns {'dryness', 'sensitivity'}
  • Customer C001: Product P113 concerns {'sun_damage', 'hyperpigmentation'} don't match customer concerns {'dryness', 'sensitivity'}
  • Customer C001: lifetime_value ($198.24) differs from purchase_history sum ($197.00)
  • Customer C002: Product P101 concerns {'dullness', 'congestion'} don't match customer concerns {'acne'}
  • Customer C002: Product P104 concerns {'dullness', 'hyperpigmentation'} don't match customer concerns {'acne'}
  • Customer C002: Product P106 concerns {'dullness', 'texture'} don't match customer concerns {'acne'}
  • Customer C002: lifetime_value ($102.56) differs from purchase_history sum ($93.00)
  • Customer C003: lifetime_value ($50.00) differs from purchase_history sum ($44.00)
  • Customer C004: Product P103 concerns {'dryness', 'sensitivity'} don't match customer concerns {'fine_lines', 'aging'}
  • Customer C004: Product P104 concerns {'dullness', 'hyperpigmentation'} don't match customer concerns {'fine_lines', 'aging'}
  • Customer C004: Product P113 concerns {'sun_damage', 'hyperpigmentation'} don't match customer concerns {'fine_lines', 'aging'}
  • Customer C004: Product P119 concerns {'sensitivity', 'redness'} don't match customer concerns {'fine_lines', 'aging'}
  • Customer C004: lifetime_value ($274.72) differs from purchase_history sum ($240.00)
  • Customer C005: Product P101 concerns {'dullness', 'congestion'} don't match customer concerns {'aging', 'hyperpigmentation'}
  ... and 29 more warnings

ℹ️  INFO: 10 routine gap opportunities found
----------------------------------------------------------------------
  • Customer C002: Missing essential categories: {'spf', 'moisturizer'} (routine gap opportunity)
  • Customer C003: Missing essential categories: {'cleanser', 'moisturizer', 'spf'} (routine gap opportunity)
  • Customer C006: Missing essential categories: {'cleanser', 'moisturizer'} (routine gap opportunity)
  • Customer C008: Missing essential categories: {'cleanser', 'moisturizer', 'spf'} (routine gap opportunity)
  • Customer C009: Missing essential categories: {'cleanser', 'moisturizer', 'spf'} (routine gap opportunity)
  • Customer C011: Missing essential categories: {'cleanser', 'spf'} (routine gap opportunity)
  • Customer C012: Missing essential categories: {'cleanser', 'moisturizer', 'spf'} (routine gap opportunity)
  • Customer C014: Missing essential categories: {'spf'} (routine gap opportunity)
  • Customer C015: Missing essential categories: {'cleanser'} (routine gap opportunity)
  • Customer C016: Missing essential categories: {'cleanser', 'moisturizer', 'spf'} (routine gap opportunity)

======================================================================
SUMMARY
======================================================================
Total Errors:   0
Total Warnings: 44
Opportunities:  10

🎉 Data validation PASSED! Ready for orchestrator development.





## Validation results summary

### No critical errors
- All product IDs exist and are valid
- RFM metrics are consistent
- Data structure is correct
- Ready for orchestrator development

### Warnings analysis (44 total)

#### 1. Product concern mismatches (28 warnings)
Examples:
- C001 (dryness/sensitivity) owns P102 (acne/oiliness)
- C002 (acne) owns P101 (dullness/congestion)

Assessment: Acceptable. This reflects real behavior:
- Customers try products that don’t perfectly match
- Products can address multiple concerns
- Some products are general-purpose (e.g., SPF)

Recommendation: No action needed. This adds realism.

#### 2. Lifetime value discrepancies (16 warnings)
Examples:
- C001: LTV $198.24 vs history sum $197.00
- C004: LTV $274.72 vs history sum $240.00

Assessment: Likely intentional. LTV may include:
- Purchases not recorded in `purchase_history` (the check sums the full history, not just the 90-day RFM window)
- Returns/refunds adjustments
- Future projected value

Recommendation: Acceptable if intentional. If you want exact matches, we can align them.

#### 3. Routine gap opportunities (10 customers)
Examples:
- C002: Missing SPF and moisturizer
- C003: Missing cleanser, moisturizer, SPF
- C014: Missing SPF

Assessment: Good for testing. These are clear cross-sell opportunities.

Recommendation: Keep as-is. These are valuable test cases.
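
The routine-gap check can be sketched as a set difference between the essential categories and the categories a customer already covers. This is only an illustration: the validator reads a `categories` field on each customer, whereas this sketch derives categories from owned product IDs, which is an assumption. The function name `find_routine_gaps` is hypothetical, and the lookup holds abbreviated catalog entries.

```python
ESSENTIAL_CATEGORIES = {"cleanser", "moisturizer", "spf"}

def find_routine_gaps(owned_product_ids, product_lookup):
    """Return the essential categories not covered by the products a customer owns."""
    owned_categories = {
        product_lookup[pid]["category"]
        for pid in owned_product_ids
        if pid in product_lookup  # unknown IDs are ignored here; the validator reports them as errors
    }
    return ESSENTIAL_CATEGORIES - owned_categories

# Abbreviated entries from the Tier 2 catalog (category only).
product_lookup = {
    "P110": {"category": "moisturizer"},
    "P113": {"category": "spf"},
    "P107": {"category": "serum"},
}

gaps = find_routine_gaps(["P107", "P110"], product_lookup)
print(sorted(gaps))
```

Each missing category is a candidate for a cross-sell recommendation, which is why these gaps make useful test scenarios.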

## Overall assessment

The data is ready to use. The warnings are mostly acceptable and add realism. The routine gaps provide good test scenarios.



Recommendation: keep most warnings, but fix the lifetime value discrepancies.

## Recommendation: selective fixes

### Keep (realistic and useful)
1. Product concern mismatches (28 warnings)
   - Realistic: customers buy products that don’t perfectly match
   - Useful: tests the agent’s ability to:
     - Detect mismatches
     - Recommend better alternatives
     - Handle edge cases
   - Example: C001 with dryness owns an acne product → agent should suggest a better match

2. Routine gaps (10 opportunities)
   - Core use case for cross-sell/upsell
   - Clear test scenarios
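
One way an agent could "suggest a better match" is to rank catalog products by how many of the customer's concerns they target. This is a rough sketch under that assumption, not the orchestrator's actual logic. The function name `rank_by_concern_overlap` is hypothetical, and the catalog entries are abbreviated copies of Tier 2 records.

```python
def rank_by_concern_overlap(customer_concerns, catalog):
    """Rank products by how many of the customer's concerns they target."""
    wanted = set(customer_concerns)
    scored = []
    for product in catalog:
        overlap = wanted & set(product.get("target_concerns", []))
        if overlap:
            scored.append((len(overlap), product["product_id"], sorted(overlap)))
    # Highest overlap first; ties are broken by product ID for a stable order.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored

# Abbreviated catalog entries (ID and target concerns only).
catalog = [
    {"product_id": "P108", "target_concerns": ["sensitivity", "dryness"]},
    {"product_id": "P110", "target_concerns": ["dryness", "barrier_repair"]},
    {"product_id": "P117", "target_concerns": ["acne"]},
]

# C001's concerns, as reported in the validation output.
ranked = rank_by_concern_overlap(["dryness", "sensitivity"], catalog)
print(ranked)
```

For C001 this ranks the barrier repair serum ahead of the moisturizer and drops the acne treatment entirely.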

### Fix (reduces confusion)
3. Lifetime value discrepancies (16 warnings)
   - Why fix:
     - Can cause confusion during development
     - Makes it harder to verify agent calculations
     - Unlike the concern mismatches, it does not model realistic customer behavior
   - Example: C001 LTV \$198.24 vs history \$197.00 (small, but it can cause debugging issues)

## Recommended action plan

1. Fix LTV discrepancies (quick script)
   - Align `lifetime_value` with the sum of `purchase_history`
   - Takes ~2 minutes

2. Keep product concern mismatches
   - They add realism and test agent logic

3. Keep routine gaps
   - They are the core test scenarios

## Why this approach

For a learning-stage agent:
- Realistic data helps test robustness
- Clean LTV avoids confusion during debugging
- Mismatches test the agent’s reasoning

Should I create a quick script to fix the LTV discrepancies? It will:
- Calculate LTV from purchase_history
- Update the customers.json file
- Preserve everything else

This gives you realistic data with clean, verifiable metrics.
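
The three steps listed above could look roughly like the sketch below. The actual `fix_ltv_discrepancies.py` is not shown in this notebook, so this is a hypothetical reconstruction: the function name `align_lifetime_values`, the tolerance, and the sample records are assumptions, and the JSON load/save steps are left out so the logic stays testable in memory.

```python
def align_lifetime_values(customers, tolerance=0.01):
    """Set each customer's lifetime_value to the sum of their purchase_history amounts.

    Returns a list of (customer_id, old_value, new_value) for the records that changed.
    """
    changes = []
    for customer in customers:
        history_total = round(
            sum(p["amount"] for p in customer.get("purchase_history", [])), 2
        )
        old_value = customer.get("lifetime_value", 0)
        if abs(history_total - old_value) > tolerance:
            customer["lifetime_value"] = history_total  # all other fields are left untouched
            changes.append((customer["customer_id"], old_value, history_total))
    return changes

# Hypothetical records: the totals mirror the C001 discrepancy, but the
# individual purchase amounts and customer C900 are invented for illustration.
customers = [
    {"customer_id": "C001", "lifetime_value": 198.24,
     "purchase_history": [{"amount": 98.00}, {"amount": 99.00}]},
    {"customer_id": "C900", "lifetime_value": 50.00,
     "purchase_history": [{"amount": 50.00}]},
]
changes = align_lifetime_values(customers)
print(changes)
```

Returning the list of changes makes it easy to print a before/after report and to confirm that records already in agreement were not modified.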

In [None]:
(.venv) micahshull@Micahs-iMac LG_Cursor_030_CrossSell_Upsell_Orchestrator % cd /Users/micahshull/Documents/AI_LangGraph/LG_Cursor_030_CrossSell_Upsell_Orchestrator && python3 fix_ltv_discrepancies.py
======================================================================
FIXING LIFETIME VALUE DISCREPANCIES
======================================================================

📊 Found 16 customers with LTV discrepancies:

  C001: $198.24 → $197.00 (diff: $-1.24)
  C002: $102.56 → $93.00 (diff: $-9.56)
  C003: $50.00 → $44.00 (diff: $-6.00)
  C004: $274.72 → $240.00 (diff: $-34.72)
  C005: $158.36 → $142.00 (diff: $-16.36)
  C006: $50.00 → $37.00 (diff: $-13.00)
  C007: $193.58 → $190.00 (diff: $-3.58)
  C008: $111.37 → $104.00 (diff: $-7.37)
  C009: $87.96 → $75.00 (diff: $-12.96)
  C010: $247.62 → $219.00 (diff: $-28.62)
  C011: $156.88 → $155.00 (diff: $-1.88)
  C012: $50.00 → $38.00 (diff: $-12.00)
  C013: $245.04 → $236.00 (diff: $-9.04)
  C014: $137.76 → $134.00 (diff: $-3.76)
  C015: $209.76 → $192.00 (diff: $-17.76)
  C016: $50.00 → $46.00 (diff: $-4.00)

💾 Saving updated data to data_tier_2/customers.json...
✅ Successfully updated customer data!

📈 Total customers updated: 16

======================================================================
(.venv) micahshull@Micahs-iMac LG_Cursor_030_CrossSell_Upsell_Orchestrator %

In [None]:
(.venv) micahshull@Micahs-iMac LG_Cursor_030_CrossSell_Upsell_Orchestrator % cd /Users/micahshull/Documents/AI_LangGraph/LG_Cursor_030_CrossSell_Upsell_Orchestrator && python3 validate_tier2_data.py
======================================================================
TIER 2 DATA VALIDATION REPORT
======================================================================

📦 Products loaded: 20
👥 Customers loaded: 16

🔍 Validating Product Catalog...
✅ Product validation complete: 0 errors found

🔍 Validating Customer Data...
======================================================================
✅ No critical errors found!

⚠️  WARNINGS: 28
----------------------------------------------------------------------
  • Customer C001: Product P102 concerns {'oiliness', 'acne'} don't match customer concerns {'sensitivity', 'dryness'}
  • Customer C001: Product P105 concerns {'acne', 'redness'} don't match customer concerns {'sensitivity', 'dryness'}
  • Customer C001: Product P113 concerns {'sun_damage', 'hyperpigmentation'} don't match customer concerns {'sensitivity', 'dryness'}
  • Customer C002: Product P101 concerns {'dullness', 'congestion'} don't match customer concerns {'acne'}
  • Customer C002: Product P104 concerns {'dullness', 'hyperpigmentation'} don't match customer concerns {'acne'}
  • Customer C002: Product P106 concerns {'dullness', 'texture'} don't match customer concerns {'acne'}
  • Customer C004: Product P103 concerns {'sensitivity', 'dryness'} don't match customer concerns {'aging', 'fine_lines'}
  • Customer C004: Product P104 concerns {'dullness', 'hyperpigmentation'} don't match customer concerns {'aging', 'fine_lines'}
  • Customer C004: Product P113 concerns {'sun_damage', 'hyperpigmentation'} don't match customer concerns {'aging', 'fine_lines'}
  • Customer C004: Product P119 concerns {'sensitivity', 'redness'} don't match customer concerns {'aging', 'fine_lines'}
  • Customer C005: Product P101 concerns {'dullness', 'congestion'} don't match customer concerns {'aging', 'hyperpigmentation'}
  • Customer C005: Product P110 concerns {'barrier_repair', 'dryness'} don't match customer concerns {'aging', 'hyperpigmentation'}
  • Customer C005: Product P114 concerns {'acne', 'oiliness'} don't match customer concerns {'aging', 'hyperpigmentation'}
  • Customer C006: Product P114 concerns {'acne', 'oiliness'} don't match customer concerns {'general'}
  • Customer C007: Product P101 concerns {'dullness', 'congestion'} don't match customer concerns {'aging', 'acne'}
  ... and 13 more warnings

ℹ️  INFO: 10 routine gap opportunities found
----------------------------------------------------------------------
  • Customer C002: Missing essential categories: {'moisturizer', 'spf'} (routine gap opportunity)
  • Customer C003: Missing essential categories: {'moisturizer', 'spf', 'cleanser'} (routine gap opportunity)
  • Customer C006: Missing essential categories: {'moisturizer', 'cleanser'} (routine gap opportunity)
  • Customer C008: Missing essential categories: {'moisturizer', 'spf', 'cleanser'} (routine gap opportunity)
  • Customer C009: Missing essential categories: {'moisturizer', 'spf', 'cleanser'} (routine gap opportunity)
  • Customer C011: Missing essential categories: {'spf', 'cleanser'} (routine gap opportunity)
  • Customer C012: Missing essential categories: {'moisturizer', 'spf', 'cleanser'} (routine gap opportunity)
  • Customer C014: Missing essential categories: {'spf'} (routine gap opportunity)
  • Customer C015: Missing essential categories: {'cleanser'} (routine gap opportunity)
  • Customer C016: Missing essential categories: {'moisturizer', 'spf', 'cleanser'} (routine gap opportunity)

======================================================================
SUMMARY
======================================================================
Total Errors:   0
Total Warnings: 28
Opportunities:  10

🎉 Data validation PASSED! Ready for orchestrator development.
(.venv) micahshull@Micahs-iMac LG_Cursor_030_CrossSell_Upsell_Orchestrator %

LTV discrepancies fixed. Validation summary:

## Final validation results

### Before fixes
- Errors: 0
- Warnings: 44 (16 LTV + 28 product concerns)
- Opportunities: 10

### After fixes
- Errors: 0
- Warnings: 28 (only product concern mismatches)
- Opportunities: 10

All 16 LTV discrepancies are resolved.
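
The split between LTV and concern warnings can be tallied directly from the message text. The sketch below assumes the warning strings follow the wording the validator emits. The function name `categorize_warning` and the category labels are hypothetical, and the three sample messages are copied from the first validation run.

```python
from collections import Counter

def categorize_warning(message):
    """Assign a validator warning to a category based on its message text."""
    if "lifetime_value" in message:
        return "ltv_discrepancy"
    if "don't match customer concerns" in message:
        return "concern_mismatch"
    if "churn_risk" in message:
        return "churn_alignment"
    return "other"

# Three warnings copied from the first validation run.
sample_warnings = [
    "Customer C001: Product P102 concerns {'acne', 'oiliness'} don't match customer concerns {'dryness', 'sensitivity'}",
    "Customer C001: lifetime_value ($198.24) differs from purchase_history sum ($197.00)",
    "Customer C002: Product P101 concerns {'dullness', 'congestion'} don't match customer concerns {'acne'}",
]
counts = Counter(categorize_warning(w) for w in sample_warnings)
print(dict(counts))
```

Run against the full `warnings` list from the validator, a tally like this would confirm the breakdown without counting lines by hand.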

## Current data state

### Clean metrics
- All lifetime values match purchase history
- RFM metrics are consistent
- No structural errors

### Realistic test scenarios
- 28 product concern mismatches: test the agent’s ability to detect and recommend better matches
- 10 routine gaps: core cross-sell/upsell opportunities

## Data quality summary

| Metric | Status |
|--------|--------|
| Product catalog | 20 products, all valid |
| Customer data | 16 customers, all valid |
| RFM consistency | 100% accurate |
| LTV accuracy | 100% aligned |
| Product references | All exist |
| Routine gaps | 10 opportunities identified |

## Ready for orchestrator development

The Tier 2 dataset is:
- Clean and consistent (no errors)
- Realistic (product mismatches reflect real behavior)
- Test-ready (clear routine gaps for cross-sell scenarios)
- Well-structured (all fields validated)

