


# 📘 **Dataset Introduction: Tier 1 Skincare Customer & Product Foundation**

This notebook introduces the **Tier 1 Skincare Dataset**, a foundational dataset created to support the development of AI **Cross-Sell & Upsell Orchestrator Agents**. It is intentionally scoped as a **minimal, clean, and extensible MVP dataset** suitable for learning, experimentation, and future expansion into more advanced multi-agent orchestration systems.

---

## 🧩 **Purpose of the Dataset**

The primary goal of this dataset is to serve as a **ground truth environment** for building AI agents that:

* understand customer purchase behavior
* detect missing steps in a skincare routine
* recommend relevant cross-sell and upsell opportunities
* score and prioritize recommendations
* generate personalized product pathways

This dataset reflects **Tier 1 (Essential Care)** skincare products only, which represent the simplest and most universal product types in the beauty domain. Tier 1 products make it easy to learn and debug AI logic before introducing complex active ingredients, multi-tier product hierarchies, or clinical-strength products.

---

## 💡 **Why Tier 1?**

Tier 1 products include the foundational elements of a skincare routine:

* cleanser
* toner
* serum
* moisturizer
* SPF
* hydrating masks
* lip care
* basic eye cream

These products have:

* ✔ high cross-sell potential
* ✔ predictable usage patterns
* ✔ simple category relationships
* ✔ no ingredient interaction constraints

Focusing exclusively on Tier 1 allows the orchestrator to produce high-value recommendations without the added complexity of advanced formulations (retinol, acids, peptides, etc.).
This creates an ideal environment for an MVP-level orchestrator.

---

## 👤 **Customer Data Design**

Each customer entry includes the fields required for a minimal cross-sell model, plus three high-ROI Tier 2 fields that significantly improve scoring:

### **Tier 1 (Essential Fields)**

* basic identity fields
* products owned
* purchase history
* product categories used
* churn risk
* loyalty tier
* notes

### **Selected Tier 2 High-Value Fields**

* **RFM (Recency, Frequency, Monetary)**
* **Customer Lifetime Value (CLV)**
* **Price Sensitivity**

These added fields enable the orchestrator to produce smarter, business-aware recommendations without overwhelming the MVP dataset.
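
The RFM fields can in principle be derived from a customer's `purchase_history`. The following is a minimal sketch, not the method used to build this dataset: the function name `compute_rfm`, the fixed 90-day window (inferred from the `_90d` suffix), and the reference date `as_of` are all assumptions, since the notebook does not state how the stored values were produced.

```python
from datetime import date

WINDOW_DAYS = 90  # assumed from the "_90d" suffix in the field names

def compute_rfm(purchase_history, as_of):
    """Derive recency, frequency, and monetary values from purchase records.

    `as_of` is an assumed reference date; the dataset does not say which
    date its stored RFM values were measured from.
    """
    dated = [(date.fromisoformat(p["date"]), p["amount"]) for p in purchase_history]
    if not dated:
        return {"recency_days": None, "frequency_90d": 0, "monetary_90d": 0.0}
    # Keep only purchases made within the window ending at `as_of`.
    in_window = [(d, amt) for d, amt in dated if 0 <= (as_of - d).days <= WINDOW_DAYS]
    last_purchase = max(d for d, _ in dated)
    return {
        "recency_days": (as_of - last_purchase).days,
        "frequency_90d": len(in_window),
        "monetary_90d": round(sum(amt for _, amt in in_window), 2),
    }

# Purchase history copied from customer C001 in this dataset.
history = [
    {"product_id": "P001", "date": "2024-01-10", "amount": 14.99},
    {"product_id": "P004", "date": "2024-02-03", "amount": 17.99},
]
rfm = compute_rfm(history, as_of=date(2024, 2, 15))
print(rfm)
```

With the assumed reference date of 2024-02-15, the result matches the `rfm` values stored for C001 (12 days, 2 purchases, 32.98). Other customers appear to use different reference dates, so treat the date as a parameter rather than a constant.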

---

## 🎁 **Product Catalog Design**

The product catalog includes **10 essential-care skincare products**, each with:

* product ID
* name
* category
* price
* margin
* replenishment cycle
* cross-sell compatibility

All products belong to **Tier 1**, making relationships simple, clean, and ideal for early-stage agent development.

Future versions of the catalog may introduce:

* Tier 2 (Advanced Active Ingredients)
* Tier 3 (Clinical Strength Products)
* subscription products
* seasonal items
* bundle products
* tools and accessories

The dataset is structured to grow in these directions naturally as the project evolves.

---

## 🤖 **Use Cases Enabled by This Dataset**

This dataset supports the creation of:

### **MVP Cross-Sell & Upsell Orchestrator**

* detects gaps in routines
* recommends missing products
* scores opportunities using RFM + CLV
* personalizes recommendations
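
Gap detection, the first step in the list above, can be sketched as a set difference between a reference routine and the categories a customer already uses. The `CORE_ROUTINE` list and the function name `find_routine_gaps` are assumptions for illustration; the notebook names these categories but does not define which ones count as required.

```python
# Assumed core routine, in application order. Which steps are "required"
# is an illustrative choice, not something the dataset specifies.
CORE_ROUTINE = ["cleanser", "toner", "serum", "moisturizer", "spf"]

def find_routine_gaps(customer_categories, routine=CORE_ROUTINE):
    """Return the routine steps the customer has no product for, in routine order."""
    owned = set(customer_categories)
    return [step for step in routine if step not in owned]

# Category lists copied from customers C001 and C003 in this dataset.
print(find_routine_gaps(["cleanser", "moisturizer"]))  # ['toner', 'serum', 'spf']
print(find_routine_gaps(["serum", "lip"]))  # ['cleanser', 'toner', 'moisturizer', 'spf']
```

The first result agrees with the note on C001. The note on C003 does not mention toner, so the dataset's own notion of a complete routine may be looser than the five steps assumed here.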

### **Customer Journey Analyzer**

* reconstructs buying patterns
* identifies routine-building moments

### **Replenishment & Retention Model**

* predicts when customers will need refills
* detects early churn based on purchase lapses
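
Both bullets above can be approximated from a purchase date and the product's `replenishment_cycle_days`. This is a rough sketch under stated assumptions: the helper names and the 14-day grace period are hypothetical, and real usage rates will vary by customer.

```python
from datetime import date, timedelta

def next_refill_date(purchase_date, cycle_days):
    """Estimate when a product runs out: purchase date plus its replenishment cycle."""
    return date.fromisoformat(purchase_date) + timedelta(days=cycle_days)

def is_lapsed(purchase_date, cycle_days, as_of, grace_days=14):
    """Flag a purchase lapse once the refill date plus an assumed grace period has passed."""
    return as_of > next_refill_date(purchase_date, cycle_days) + timedelta(days=grace_days)

# C002 bought P002 (toner, 45-day cycle) on 2024-01-05.
due = next_refill_date("2024-01-05", 45)
print(due)  # 2024-02-19
print(is_lapsed("2024-01-05", 45, as_of=date(2024, 2, 21)))  # False
print(is_lapsed("2024-01-05", 45, as_of=date(2024, 3, 10)))  # True
```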

### **Future Multi-Agent Extensions**

* identity resolution agents
* graph-building agents
* product embedding models
* dynamic pricing agents
* marketing sequence generators

This dataset forms the **first layer** of a much larger orchestrator ecosystem.




# 🧴 Tier 1 Product Catalog (10 Essential Care Products)

# ⭐ Why This Catalog Works for Your MVP

### ✔ Clean category structure

Perfect for rule-based and graph-based cross-sell logic.

### ✔ Strong cross-sell relationships

Every product connects to others naturally based on routine steps.

### ✔ Realistic replenishment cycles

These matter later when predicting retention and refill needs.

### ✔ Varying margins

Useful when you want your orchestrator to prefer high-margin products.

### ✔ Two cleansers (P001 & P010)

Lets your agent learn diversity within the same category.

### ✔ Perfect dataset size (10 products)

Small enough for MVPs, large enough for meaningful relationships.




In [None]:
[
  {
    "product_id": "P001",
    "name": "Gentle Foaming Cleanser",
    "category": "cleanser",
    "price": 14.99,
    "margin": "medium",
    "replenishment_cycle_days": 30,
    "recommended_cross_sells": ["P002", "P003", "P004", "P005"]
  },
  {
    "product_id": "P002",
    "name": "Balancing Facial Toner",
    "category": "toner",
    "price": 12.99,
    "margin": "medium",
    "replenishment_cycle_days": 45,
    "recommended_cross_sells": ["P001", "P003", "P004"]
  },
  {
    "product_id": "P003",
    "name": "Hydrating Hyaluronic Serum",
    "category": "serum",
    "price": 19.99,
    "margin": "high",
    "replenishment_cycle_days": 30,
    "recommended_cross_sells": ["P001", "P002", "P004"]
  },
  {
    "product_id": "P004",
    "name": "Daily Lightweight Moisturizer",
    "category": "moisturizer",
    "price": 17.99,
    "margin": "medium",
    "replenishment_cycle_days": 40,
    "recommended_cross_sells": ["P001", "P002", "P003", "P005"]
  },
  {
    "product_id": "P005",
    "name": "SPF 30 Everyday Sunscreen",
    "category": "spf",
    "price": 15.99,
    "margin": "medium",
    "replenishment_cycle_days": 30,
    "recommended_cross_sells": ["P001", "P004"]
  },
  {
    "product_id": "P006",
    "name": "Soothing Aloe Gel Mask",
    "category": "mask",
    "price": 16.50,
    "margin": "medium",
    "replenishment_cycle_days": 20,
    "recommended_cross_sells": ["P003", "P004"]
  },
  {
    "product_id": "P007",
    "name": "Nourishing Lip Balm",
    "category": "lip",
    "price": 6.99,
    "margin": "high",
    "replenishment_cycle_days": 25,
    "recommended_cross_sells": ["P004"]
  },
  {
    "product_id": "P008",
    "name": "Refresh Facial Mist",
    "category": "mist",
    "price": 11.99,
    "margin": "medium",
    "replenishment_cycle_days": 35,
    "recommended_cross_sells": ["P004", "P006"]
  },
  {
    "product_id": "P009",
    "name": "Gentle Eye Cream",
    "category": "eye",
    "price": 18.99,
    "margin": "high",
    "replenishment_cycle_days": 45,
    "recommended_cross_sells": ["P004", "P003"]
  },
  {
    "product_id": "P010",
    "name": "Calming Chamomile Cleanser",
    "category": "cleanser",
    "price": 13.99,
    "margin": "medium",
    "replenishment_cycle_days": 30,
    "recommended_cross_sells": ["P002", "P003", "P004"]
  }
]
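
As one possible way to use the `recommended_cross_sells` and `margin` fields together, the sketch below collects cross-sell candidates for the products a customer owns and ranks them by margin, then by how many owned products recommend them. The `MARGIN_RANK` ordering and the function name `rank_cross_sells` are assumptions, and only P001 to P005 are copied from the catalog to keep the example short.

```python
from collections import Counter

MARGIN_RANK = {"low": 0, "medium": 1, "high": 2}  # assumed ordering of margin labels

# Excerpt of the catalog above (P001 to P005), reduced to the fields used here.
CATALOG = {
    "P001": {"margin": "medium", "recommended_cross_sells": ["P002", "P003", "P004", "P005"]},
    "P002": {"margin": "medium", "recommended_cross_sells": ["P001", "P003", "P004"]},
    "P003": {"margin": "high", "recommended_cross_sells": ["P001", "P002", "P004"]},
    "P004": {"margin": "medium", "recommended_cross_sells": ["P001", "P002", "P003", "P005"]},
    "P005": {"margin": "medium", "recommended_cross_sells": ["P001", "P004"]},
}

def rank_cross_sells(owned_ids, catalog=CATALOG):
    """Rank unowned cross-sell candidates: highest margin first, then most votes, then ID."""
    owned = set(owned_ids)
    # Count how many owned products recommend each candidate.
    votes = Counter(
        candidate
        for pid in owned
        for candidate in catalog[pid]["recommended_cross_sells"]
        if candidate not in owned and candidate in catalog
    )
    return sorted(
        votes,
        key=lambda pid: (-MARGIN_RANK[catalog[pid]["margin"]], -votes[pid], pid),
    )

# A customer who owns the cleanser (P001) and the moisturizer (P004), like C001.
print(rank_cross_sells(["P001", "P004"]))  # ['P003', 'P002', 'P005']
```

The high-margin serum (P003) ranks first; the toner and SPF tie on margin and votes, so the product ID breaks the tie.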


# Customer Data

These customer profiles are designed to be realistic and diverse, which makes them suitable for testing your MVP cross-sell/upsell orchestrator.

Each customer includes:

* Tier 1 fields (identity, products owned, purchase history, categories used)
* 3 high-ROI Tier 2 fields (RFM, CLV, price sensitivity)
* realistic product ownership using your 10-product catalog
* variation in spending, behavior, and routine completeness
* varied routine gaps, useful for testing cross-sell logic

In [None]:
[
  {
    "customer_id": "C001",
    "name": "Sarah Lee",
    "email": "sarah.lee@example.com",
    "loyalty_tier": "gold",
    "churn_risk": 0.12,
    "products_owned": [
      {"product_id": "P001", "purchase_date": "2024-01-10", "amount": 14.99},
      {"product_id": "P004", "purchase_date": "2024-02-03", "amount": 17.99}
    ],
    "purchase_history": [
      {"product_id": "P001", "date": "2024-01-10", "amount": 14.99},
      {"product_id": "P004", "date": "2024-02-03", "amount": 17.99}
    ],
    "categories": ["cleanser", "moisturizer"],
    "rfm": {"recency_days": 12, "frequency_90d": 2, "monetary_90d": 32.98},
    "lifetime_value": 210.40,
    "price_sensitivity": "medium",
    "notes": "High-value repeat customer; routine missing toner, serum, and SPF."
  },
  {
    "customer_id": "C002",
    "name": "Mark Johnson",
    "email": "mark.j@example.com",
    "loyalty_tier": "silver",
    "churn_risk": 0.28,
    "products_owned": [
      {"product_id": "P002", "purchase_date": "2024-01-05", "amount": 12.99}
    ],
    "purchase_history": [
      {"product_id": "P002", "date": "2024-01-05", "amount": 12.99}
    ],
    "categories": ["toner"],
    "rfm": {"recency_days": 47, "frequency_90d": 1, "monetary_90d": 12.99},
    "lifetime_value": 89.50,
    "price_sensitivity": "high",
    "notes": "Recently reported delivery issues; extremely incomplete routine."
  },
  {
    "customer_id": "C003",
    "name": "Emily Chen",
    "email": "emily.chen@example.com",
    "loyalty_tier": "bronze",
    "churn_risk": 0.08,
    "products_owned": [
      {"product_id": "P003", "purchase_date": "2024-02-11", "amount": 19.99},
      {"product_id": "P007", "purchase_date": "2024-02-15", "amount": 6.99}
    ],
    "purchase_history": [
      {"product_id": "P003", "date": "2024-02-11", "amount": 19.99},
      {"product_id": "P007", "date": "2024-02-15", "amount": 6.99}
    ],
    "categories": ["serum", "lip"],
    "rfm": {"recency_days": 10, "frequency_90d": 2, "monetary_90d": 26.98},
    "lifetime_value": 142.10,
    "price_sensitivity": "medium",
    "notes": "New customer who prefers hydrating products; routine missing cleanser, moisturizer, SPF."
  },
  {
    "customer_id": "C004",
    "name": "David Brooks",
    "email": "david.b@example.com",
    "loyalty_tier": "gold",
    "churn_risk": 0.35,
    "products_owned": [
      {"product_id": "P001", "purchase_date": "2023-12-20", "amount": 14.99},
      {"product_id": "P002", "purchase_date": "2023-12-20", "amount": 12.99},
      {"product_id": "P003", "purchase_date": "2024-01-02", "amount": 19.99}
    ],
    "purchase_history": [
      {"product_id": "P001", "date": "2023-12-20", "amount": 14.99},
      {"product_id": "P002", "date": "2023-12-20", "amount": 12.99},
      {"product_id": "P003", "date": "2024-01-02", "amount": 19.99}
    ],
    "categories": ["cleanser", "toner", "serum"],
    "rfm": {"recency_days": 38, "frequency_90d": 3, "monetary_90d": 47.97},
    "lifetime_value": 310.75,
    "price_sensitivity": "low",
    "notes": "High-value but current churn risk due to unresolved ticket; strong routine builder missing moisturizer and SPF."
  },
  {
    "customer_id": "C005",
    "name": "Alicia Gomez",
    "email": "alicia.g@example.com",
    "loyalty_tier": "silver",
    "churn_risk": 0.15,
    "products_owned": [
      {"product_id": "P004", "purchase_date": "2024-02-05", "amount": 17.99},
      {"product_id": "P009", "purchase_date": "2024-01-26", "amount": 18.99}
    ],
    "purchase_history": [
      {"product_id": "P004", "date": "2024-02-05", "amount": 17.99},
      {"product_id": "P009", "date": "2024-01-26", "amount": 18.99}
    ],
    "categories": ["moisturizer", "eye"],
    "rfm": {"recency_days": 20, "frequency_90d": 2, "monetary_90d": 36.98},
    "lifetime_value": 175.20,
    "price_sensitivity": "medium",
    "notes": "Routine focused on hydration and under-eye care; missing cleanser, toner, serum, SPF."
  },
  {
    "customer_id": "C006",
    "name": "Jason Patel",
    "email": "jason.p@example.com",
    "loyalty_tier": "bronze",
    "churn_risk": 0.10,
    "products_owned": [
      {"product_id": "P005", "purchase_date": "2024-02-10", "amount": 15.99}
    ],
    "purchase_history": [
      {"product_id": "P005", "date": "2024-02-10", "amount": 15.99}
    ],
    "categories": ["spf"],
    "rfm": {"recency_days": 11, "frequ_




# 🚀 **Cross-Sell & Upsell Orchestrator: Project Introduction**

This project develops a **Cross-Sell & Upsell Orchestrator Agent**, an AI system designed to identify missing skincare products in a customer's routine, recommend relevant complementary products, and prioritize suggestions by business value.

The orchestrator is powered by two aligned Tier-1 datasets:

### **Product Catalog (Tier 1 Skincare Essentials)**

A curated set of 10 essential skincare products (cleanser, toner, serum, moisturizer, SPF, mask, etc.) with realistic prices, margins, replenishment cycles, and built-in cross-sell relationships.

### **Customer Profiles (Tier 1 + High-ROI Tier 2 Features)**

10 realistic customer profiles including product ownership, purchase history, loyalty tier, RFM features, lifetime value (CLV), price sensitivity, and routine gaps.

These datasets form the foundation for a full orchestrator workflow that simulates how a real e-commerce or beauty brand would increase revenue by intelligently expanding a customer's skincare routine.

---

# 🌐 **What This Orchestrator Agent Is**

This orchestrator is more than a recommendation system or a simple rule-based engine. It is an AI agent that:

* **connects customer history with product relationships**
* **identifies gaps in a user's routine**
* **detects the highest-value cross-sell and upsell opportunities**
* **scores, ranks, and explains recommendations**
* **outputs structured insights for sales, marketing, and personalization systems**

It is designed using orchestrator principles:
multiple steps → multiple data sources → multiple business objectives → unified logic.

---

# 🌟 **Value to a Company (CEO-Level)**

If you are the CEO of a beauty or e-commerce brand, *this is the agent you'd want built immediately*. Here's why:

---

## 💰 **1. Immediate Revenue Lift from Every Customer**

Most customers buy only 1–2 skincare products.
But a full routine contains **5–7 steps**.

This orchestrator identifies missing products such as:

* cleanser
* toner
* serum
* moisturizer
* SPF
* hydration mask

…and recommends them based on:

* what the customer already owns
* what they *should* add to complete their routine
* their purchase frequency
* their lifetime value
* their price sensitivity

This **directly increases average order value (AOV)** and **expands cart size**.

A 5–15% revenue lift is common in real companies using cross-sell logic.
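
To make the prioritization concrete, here is one way the signals listed above could be combined into a single score. Everything numeric in this sketch is an illustrative assumption (the margin weights, the CLV cap, the price tolerances), as is the function name `score_opportunity`; it uses margin, lifetime value, and price sensitivity, and leaves purchase frequency out for brevity.

```python
# All weights and caps below are illustrative assumptions, not dataset values.
MARGIN_WEIGHT = {"low": 0.5, "medium": 1.0, "high": 1.5}
PRICE_TOLERANCE = {"low": None, "medium": 100.0, "high": 50.0}  # None = price ignored
CLV_CAP = 500.0

def score_opportunity(product, customer):
    """Combine product margin, customer value, and affordability into one score."""
    margin = MARGIN_WEIGHT[product["margin"]]
    # Scale lifetime value into a multiplier between 1.0 and 2.0.
    value = 1.0 + min(customer["lifetime_value"], CLV_CAP) / CLV_CAP
    tolerance = PRICE_TOLERANCE[customer["price_sensitivity"]]
    # Price-sensitive customers get a lower score for more expensive products.
    affordability = 1.0 if tolerance is None else max(0.0, 1.0 - product["price"] / tolerance)
    return margin * value * affordability

# Field values copied from the catalog and customer records in this dataset.
serum = {"product_id": "P003", "price": 19.99, "margin": "high"}
moisturizer = {"product_id": "P004", "price": 17.99, "margin": "medium"}
c002 = {"customer_id": "C002", "lifetime_value": 89.50, "price_sensitivity": "high"}
c004 = {"customer_id": "C004", "lifetime_value": 310.75, "price_sensitivity": "low"}

for customer in (c002, c004):
    for product in (serum, moisturizer):
        score = score_opportunity(product, customer)
        print(customer["customer_id"], product["product_id"], round(score, 3))
```

Under these assumptions the same product scores higher for C004 (high CLV, low price sensitivity) than for C002, and the high-margin serum outranks the moisturizer for both customers.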

---

## 🛒 **2. Turns Single-Product Buyers Into Routine Buyers**

Your dataset shows many customers who own only:

* a serum
* a moisturizer
* a cleanser
* an SPF

The orchestrator transforms them into **routine customers**, who spend 4–8× more annually.

Routine completion = predictable, recurring revenue.

---

## 🔄 **3. Drives Replenishment and Repeat Purchases**

Your product catalog includes replenishment cycles (20–45 days, depending on the product).

This allows the orchestrator to:

* predict when a customer is running low
* recommend timely refills
* automatically increase customer retention

Retention is often cited as the most profitable lever in skincare. Commonly quoted industry estimates:

* Keeping a customer costs **5–7× less** than acquiring a new one.
* A 5% increase in retention can raise profits by **25–95%**.

This agent directly targets that profit band.

---

## ⭐ **4. Builds Personalized, High-Value Journeys**

Because the dataset includes:

* RFM
* CLV
* price sensitivity
* product categories
* loyalty tier

…the orchestrator can tailor recommendations to individual customer behavior and value.

For example:

* *High CLV, low price sensitivity:* recommend premium bundles
* *High churn risk:* recommend essential low-cost products to re-engage
* *Long time since last purchase (high `recency_days`):* trigger replenishment flows
* *High frequency:* promote membership or subscription tiers

These personalized journeys typically outperform generic recommendations.
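
The four example rules above can be written as a small rule function. This is a sketch only: the thresholds (`clv_high`, `churn_high`, `lapse_days`, `freq_high`), the journey labels, and the function name `choose_journeys` are hypothetical and would need tuning against real data.

```python
def choose_journeys(customer, clv_high=200.0, churn_high=0.30, lapse_days=30, freq_high=3):
    """Map a customer record to journey labels using assumed thresholds."""
    journeys = []
    if customer["lifetime_value"] >= clv_high and customer["price_sensitivity"] == "low":
        journeys.append("premium_bundle")
    if customer["churn_risk"] >= churn_high:
        journeys.append("low_cost_reengagement")
    if customer["rfm"]["recency_days"] >= lapse_days:
        journeys.append("replenishment_flow")
    if customer["rfm"]["frequency_90d"] >= freq_high:
        journeys.append("subscription_offer")
    return journeys

# Field values copied from customers C001 and C004 in this dataset.
c001 = {
    "lifetime_value": 210.40, "price_sensitivity": "medium", "churn_risk": 0.12,
    "rfm": {"recency_days": 12, "frequency_90d": 2},
}
c004 = {
    "lifetime_value": 310.75, "price_sensitivity": "low", "churn_risk": 0.35,
    "rfm": {"recency_days": 38, "frequency_90d": 3},
}
print(choose_journeys(c001))  # []
print(choose_journeys(c004))
# ['premium_bundle', 'low_cost_reengagement', 'replenishment_flow', 'subscription_offer']
```

C004 triggers every rule at once, which shows why a real orchestrator would also need a way to prioritize between competing journeys.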

---

## 🧠 **5. Creates a Strategic Data Graph (Moat)**

This orchestrator builds relationships across:

* products
* categories
* routines
* customers
* behavior patterns

This evolving customer-product graph becomes:

* a defensible data asset
* a recommendation intelligence layer
* a competitive moat competitors can't easily copy

Over time, the orchestrator becomes smarter as more data flows through it.
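
A very small version of such a customer-product graph can be built from ownership data alone. The sketch below is an assumption about one workable representation (plain dictionaries and a co-ownership counter), not a description of the project's actual graph; the function name `build_graph` is hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

# Product ownership copied from customers C001, C004, and C005 in this dataset.
ownership = {
    "C001": ["P001", "P004"],
    "C004": ["P001", "P002", "P003"],
    "C005": ["P004", "P009"],
}

def build_graph(ownership):
    """Return product -> customers edges and counts of products owned together."""
    product_to_customers = defaultdict(set)
    co_owned = Counter()
    for customer_id, product_ids in ownership.items():
        for pid in product_ids:
            product_to_customers[pid].add(customer_id)
        # Each unordered pair of products owned by the same customer is one co-ownership edge.
        for a, b in combinations(sorted(set(product_ids)), 2):
            co_owned[(a, b)] += 1
    return dict(product_to_customers), co_owned

product_to_customers, co_owned = build_graph(ownership)
print(sorted(product_to_customers["P001"]))  # ['C001', 'C004']
print(co_owned[("P001", "P004")])  # 1
```

As more customers are added, the co-ownership counts become edge weights that can inform cross-sell rules or feed a later embedding model.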

---

## ⚙️ **6. Operates Across Departments (True Orchestration)**

Unlike simple agents that answer one question or perform one task, an orchestrator agent:

* supports marketing (campaign targeting)
* supports sales (upsell logic)
* supports retention teams (replenishment prediction)
* supports product teams (bundle identification)
* supports operations (inventory forecasting)

It becomes a **company-wide intelligence layer**, not a task bot.

---

# 🏆 **Why This Orchestrator Is Worth Building *Right Now***

If you're the CEO, here's the pitch:

> **This agent increases revenue, improves customer retention, and creates a reusable intelligence asset, all with minimal data and fast ROI.**

In beauty/skincare, routine-building is the core growth strategy.
This orchestrator automates it at scale.

It:

* increases AOV
* increases CLV
* reduces churn
* creates personalized experiences
* increases conversion from marketing campaigns
* optimizes inventory with predictable reorders

Customers **use these products every day**, so skincare purchasing resembles other recurring-consumption categories:

* food replenishment
* subscription services
* energy usage

It turns your brand from “one-time products” into **predictable, recurring revenue**.

---

# 🎯 **Conclusion: Why This Dataset + This Orchestrator Matter**

This Tier-1 dataset and cross-sell orchestrator form the foundation for a scalable intelligence system that can eventually expand into:

* Tier 2 (active ingredients)
* Tier 3 (clinical products)
* personalized regimens
* multi-agent systems
* dynamic pricing
* churn mitigation
* customer clustering
* graph-based embeddings
* and more

This initial orchestrator is your **first real AI business engine**: the one that improves revenue from day one and grows with every extension.

