# Customer Behavior Analytics Metric Views

This notebook contains the SQL DDL statements to create all 10 customer behavior analytics metric views in Databricks.

**Catalog/Schema:** `juan_dev.retail`  
**Purpose:** Centrally defined KPIs for the Customer Behavior Genie Room  
**Created:** 2025-01-29

## Overview

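Metric views define KPIs once so that calculations stay consistent across the Genie room's question types. Measures are evaluated when the view is queried, at whatever dimensions the query groups by, so no aggregation is stored unless materialization is configured separately. Each metric view defines: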
- **Dimensions**: Attributes for grouping and filtering (e.g., Segment, Channel, Product Category)
- **Measures**: Pre-defined aggregations (e.g., Customer Count, Avg CLTV, Transaction Count)
- **Source Data**: The underlying tables and transformations

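**Note:** Execute every cell to create the metric views. The views do not depend on one another, so the order matters only for readability. Each statement uses `CREATE OR REPLACE VIEW ... WITH METRICS LANGUAGE YAML`, and the YAML body defines the source query, dimensions, and measures.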
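
Once a view exists, its measures are read with the `MEASURE()` function and grouped by one or more dimensions. A minimal sketch, assuming the customer segmentation view from section 1 has been created and that names containing spaces are quoted with backticks:

```sql
-- Customer count and average lifetime value per segment.
-- MEASURE() evaluates the measure expression at the grouping chosen here.
SELECT
  `Segment`,
  MEASURE(`Customer Count`)         AS customer_count,
  MEASURE(`Average Lifetime Value`) AS avg_lifetime_value
FROM juan_dev.retail.customer_segmentation_mv
GROUP BY `Segment`
ORDER BY avg_lifetime_value DESC;
```

The column aliases are illustrative; only the dimension and measure names come from the view definition.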


## 1. Customer Segmentation Metric View

**Purpose:** Customer segment summary metrics (optimized for segmentation questions)  
**Source:** `gold_customer_dim` (filtered to is_current = TRUE)  
**Performance:** Filters to current customers in the source query, so measures aggregate only current rows

**Use Cases:**
- "What are the key customer segments?"
- "What is the average lifetime value by segment?"
- "Show me customer distribution by segment and loyalty tier"


In [0]:
-- Customer Segmentation Metric View
CREATE OR REPLACE VIEW juan_dev.retail.customer_segmentation_mv
WITH METRICS
LANGUAGE YAML
AS $$
  version: 1.1
  comment: "Customer segmentation summary with CLTV metrics by segment, loyalty tier, and geography. Use this view for questions about customer segments, segment distribution, and lifetime value by segment."
  source: |
    SELECT 
      c.customer_key,
      c.segment,
      c.loyalty_tier,
      c.geo_region,
      c.acquisition_channel,
      c.lifetime_value,
      c.acquisition_date
    FROM juan_dev.retail.gold_customer_dim c
    WHERE c.is_current = TRUE
  dimensions:
    - name: Segment
      expr: segment
      comment: "Customer segment: VIP, Premium, Loyal, Regular, New. Use to analyze customer distribution and behavior by segment."
    - name: Loyalty Tier
      expr: loyalty_tier
      comment: "Loyalty program tier. Use for loyalty program analysis and tier-based behavior comparison."
    - name: Geo Region
      expr: geo_region
      comment: "Geographic region. Use for regional customer distribution and regional segment analysis."
    - name: Acquisition Channel
      expr: acquisition_channel
      comment: "Channel where customer was acquired. Use for acquisition channel effectiveness analysis."
  measures:
    - name: Customer Count
      expr: COUNT(DISTINCT customer_key)
      comment: "Total number of customers. Use to analyze customer distribution by segment, region, or acquisition channel."
    - name: Average Lifetime Value
      expr: AVG(lifetime_value)
      comment: "Average customer lifetime value. Primary metric for segment value analysis."
    - name: Total Lifetime Value
      expr: SUM(lifetime_value)
      comment: "Total lifetime value across customers. Use for total value contribution by segment or region."
    - name: Min Lifetime Value
      expr: MIN(lifetime_value)
      comment: "Minimum lifetime value. Use for value range analysis by segment."
    - name: Max Lifetime Value
      expr: MAX(lifetime_value)
      comment: "Maximum lifetime value. Use for high-value customer identification by segment."
    - name: Median Lifetime Value
      expr: PERCENTILE_CONT(0.5) WITHIN GROUP (ORDER BY lifetime_value)
      comment: "Median lifetime value. Use for typical customer value by segment, less affected by outliers."
$$
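
The third use case above ("customer distribution by segment and loyalty tier") could be answered with a query along these lines. This is a sketch that assumes the view was created as written; the aliases are illustrative.

```sql
-- Distribution of customers across segment and loyalty tier,
-- with the median as an outlier-resistant value indicator.
SELECT
  `Segment`,
  `Loyalty Tier`,
  MEASURE(`Customer Count`)        AS customer_count,
  MEASURE(`Median Lifetime Value`) AS median_lifetime_value
FROM juan_dev.retail.customer_segmentation_mv
GROUP BY `Segment`, `Loyalty Tier`
ORDER BY `Segment`, `Loyalty Tier`;
```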


## 2. Customer RFM Analysis Metric View

**Purpose:** RFM (Recency, Frequency, Monetary) metrics per customer  
**Source:** `gold_sales_fact` + `gold_customer_dim` + `gold_date_dim`  
**Performance:** Computes recency, frequency, monetary value, and loyalty status per customer in the source query

**Use Cases:**
- "Which customers are at risk of churning?"
- "Show me customers with low recency, high frequency, high monetary value"
- "Analyze RFM patterns across customer segments"


In [0]:
-- Customer RFM Analysis Metric View
CREATE OR REPLACE VIEW juan_dev.retail.customer_rfm_analysis_mv
WITH METRICS
LANGUAGE YAML
AS $$
  version: 1.1
  comment: "RFM (Recency, Frequency, Monetary) analysis with loyalty status classification. Use this view for questions about customer churn risk, loyal customers, and RFM patterns by segment."
  source: |
    SELECT 
      c.customer_key,
      c.customer_id,
      c.segment,
      DATEDIFF(CURRENT_DATE, MAX(d.calendar_date)) as recency_days,
      COUNT(DISTINCT s.transaction_id) as frequency,
      SUM(s.net_sales_amount) as monetary_value,
      CASE 
        WHEN DATEDIFF(CURRENT_DATE, MAX(d.calendar_date)) > 90 AND COUNT(DISTINCT s.transaction_id) < 3 THEN 'At Risk'
        WHEN DATEDIFF(CURRENT_DATE, MAX(d.calendar_date)) <= 30 AND COUNT(DISTINCT s.transaction_id) >= 5 AND SUM(s.net_sales_amount) > 1000 THEN 'Champion'
        WHEN DATEDIFF(CURRENT_DATE, MAX(d.calendar_date)) <= 60 AND COUNT(DISTINCT s.transaction_id) >= 3 THEN 'Loyal'
        ELSE 'Regular'
      END as loyalty_status
    FROM juan_dev.retail.gold_customer_dim c
    JOIN juan_dev.retail.gold_sales_fact s ON c.customer_key = s.customer_key
    JOIN juan_dev.retail.gold_date_dim d ON s.date_key = d.date_key
    WHERE c.is_current = TRUE
      AND s.is_return = FALSE
      AND d.calendar_date >= DATE_SUB(CURRENT_DATE, 365)
    GROUP BY c.customer_key, c.customer_id, c.segment
  dimensions:
    - name: Segment
      expr: segment
      comment: "Customer segment. Use to analyze RFM patterns by segment."
    - name: Loyalty Status
      expr: loyalty_status
      comment: "Loyalty status: Champion (recent, frequent, high-value), Loyal (recent, frequent), At Risk (not recent, infrequent), Regular. Filter for At Risk to identify churn candidates. Customers with no non-return purchases in the last 365 days do not appear in this view."
  measures:
    - name: Customer Count
      expr: COUNT(DISTINCT customer_key)
      comment: "Number of customers. Use with Loyalty Status to count champions, at-risk customers, etc."
    - name: Average Recency Days
      expr: AVG(recency_days)
      comment: "Average days since last purchase. Lower values indicate more recent purchases."
    - name: Average Frequency
      expr: AVG(frequency)
      comment: "Average number of purchases in last 365 days. Higher values indicate more frequent purchasers."
    - name: Average Monetary Value
      expr: AVG(monetary_value)
      comment: "Average total spend in last 365 days. Higher values indicate higher-value customers."
    - name: Total Monetary Value
      expr: SUM(monetary_value)
      comment: "Total revenue from customers. Use for revenue contribution by loyalty status or segment."
    - name: Champions Count
      expr: COUNT(CASE WHEN loyalty_status = 'Champion' THEN 1 END)
      comment: "Number of champion customers (recent, frequent, high-value). Target for VIP programs."
    - name: At Risk Count
      expr: COUNT(CASE WHEN loyalty_status = 'At Risk' THEN 1 END)
      comment: "Number of at-risk customers. Target for retention campaigns."
$$
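
A sketch of how the churn-risk question might be asked of this view, assuming it was created as written. Because the source only includes customers with a non-return purchase in the last 365 days, the counts exclude customers who have been inactive for longer.

```sql
-- At-risk customers per segment, alongside the segment's total and recency.
SELECT
  `Segment`,
  MEASURE(`At Risk Count`)        AS at_risk_customers,
  MEASURE(`Customer Count`)       AS customers,
  MEASURE(`Average Recency Days`) AS avg_recency_days
FROM juan_dev.retail.customer_rfm_analysis_mv
GROUP BY `Segment`
ORDER BY at_risk_customers DESC;
```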


## 3. Customer Purchase Summary Metric View

**Purpose:** Purchase behavior metrics by segment and channel  
**Source:** `gold_sales_fact` + `gold_customer_dim` + `gold_channel_dim` + `gold_date_dim`  
**Performance:** Defines purchase measures once over a pre-joined source query

**Use Cases:**
- "What is the average order value by segment?"
- "Show me purchase frequency by channel"
- "What is total revenue by segment over time?"


In [0]:
-- Customer Purchase Summary Metric View
CREATE OR REPLACE VIEW juan_dev.retail.customer_purchase_summary_mv
WITH METRICS
LANGUAGE YAML
AS $$
  version: 1.1
  comment: "Purchase behavior summary by segment, channel, and time. Use this view for questions about average order value, purchase frequency, revenue by segment, and channel behavior."
  source: |
    SELECT 
      s.transaction_id,
      c.segment,
      ch.channel_name,
      d.month_name,
      d.year,
      d.calendar_date,
      s.customer_key,
      s.net_sales_amount
    FROM juan_dev.retail.gold_sales_fact s
    JOIN juan_dev.retail.gold_customer_dim c ON s.customer_key = c.customer_key
    JOIN juan_dev.retail.gold_channel_dim ch ON s.channel_key = ch.channel_key
    JOIN juan_dev.retail.gold_date_dim d ON s.date_key = d.date_key
    WHERE c.is_current = TRUE
      AND s.is_return = FALSE
  dimensions:
    - name: Segment
      expr: segment
      comment: "Customer segment. Use to analyze purchase behavior by segment."
    - name: Channel Name
      expr: channel_name
      comment: "Sales channel. Use for channel-specific purchase analysis."
    - name: Month Name
      expr: month_name
      comment: "Month of purchase. Use for monthly purchase trend analysis."
    - name: Year
      expr: year
      comment: "Year of purchase. Use for year-over-year purchase comparisons."
    - name: Calendar Date
      expr: calendar_date
      comment: "Purchase date. Use for daily purchase trend analysis."
  measures:
    - name: Transaction Count
      expr: COUNT(DISTINCT transaction_id)
      comment: "Total number of transactions. Use for purchase frequency analysis."
    - name: Total Revenue
      expr: SUM(net_sales_amount)
      comment: "Total revenue from purchases. Primary metric for revenue analysis by segment and channel."
    - name: Average Order Value
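      expr: SUM(net_sales_amount) / NULLIF(COUNT(DISTINCT transaction_id), 0)
      comment: "Average order value (AOV): revenue divided by distinct transactions, so multi-line transactions count once. Use to compare spending patterns by segment and channel."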
    - name: Unique Customers
      expr: COUNT(DISTINCT customer_key)
      comment: "Number of unique customers. Use for customer reach analysis."
    - name: Transactions Per Customer
      expr: COUNT(DISTINCT transaction_id) * 1.0 / NULLIF(COUNT(DISTINCT customer_key), 0)
      comment: "Average transactions per customer. Use for frequency analysis by segment."
$$
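
A sketch of the "average order value by segment" question, broken out by channel. The year filter value is a hypothetical example, and the query assumes `year` is stored as a number in `gold_date_dim`.

```sql
-- Average order value and purchase frequency by segment and channel.
SELECT
  `Segment`,
  `Channel Name`,
  MEASURE(`Average Order Value`)       AS avg_order_value,
  MEASURE(`Transactions Per Customer`) AS transactions_per_customer
FROM juan_dev.retail.customer_purchase_summary_mv
WHERE `Year` = 2024  -- hypothetical filter value
GROUP BY `Segment`, `Channel Name`
ORDER BY `Segment`, avg_order_value DESC;
```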


## 4. Product Affinity Metric View

**Purpose:** Product affinity scores with segment breakdown  
**Source:** `gold_customer_product_affinity_agg` + `gold_product_dim` + `gold_customer_dim`  
**Performance:** Pre-joins affinity data with product and customer dimensions

**Use Cases:**
- "Show me products with high customer affinity scores"
- "How does category affinity differ by customer segment?"
- "What products have the highest affinity for VIP customers?"


In [0]:
-- Product Affinity Metric View
CREATE OR REPLACE VIEW juan_dev.retail.product_affinity_mv
WITH METRICS
LANGUAGE YAML
AS $$
  version: 1.1
  comment: "Product affinity scores by segment and category with CLTV impact. Use this view for questions about product recommendations, affinity patterns by segment, and personalization effectiveness."
  source: |
    SELECT 
      a.customer_key,
      a.product_key,
      c.segment,
      p.product_name,
      p.category_level_1,
      p.category_level_2,
      a.affinity_score,
      a.purchase_count,
      a.predicted_cltv_impact,
      CASE 
        WHEN a.affinity_score >= 0.7 THEN 'High Affinity'
        WHEN a.affinity_score >= 0.4 THEN 'Medium Affinity'
        ELSE 'Low Affinity'
      END as affinity_level
    FROM juan_dev.retail.gold_customer_product_affinity_agg a
    JOIN juan_dev.retail.gold_customer_dim c ON a.customer_key = c.customer_key
    JOIN juan_dev.retail.gold_product_dim p ON a.product_key = p.product_key
    WHERE c.is_current = TRUE
      AND p.is_active = TRUE
  dimensions:
    - name: Segment
      expr: segment
      comment: "Customer segment. Use to analyze affinity patterns by segment."
    - name: Product Category Level 1
      expr: category_level_1
      comment: "Top-level product category. Use for high-level category affinity analysis."
    - name: Product Category Level 2
      expr: category_level_2
      comment: "Second-level product category. Use for detailed category affinity analysis."
    - name: Affinity Level
      expr: affinity_level
      comment: "Affinity level: High (>=0.7), Medium (>=0.4), Low (<0.4). Filter for High Affinity for strong recommendations."
  measures:
    - name: Customer Count
      expr: COUNT(DISTINCT customer_key)
      comment: "Number of customers with affinity. Use to measure reach of affinity-based targeting."
    - name: Product Count
      expr: COUNT(DISTINCT product_key)
      comment: "Number of products with affinity relationships. Use for recommendation breadth analysis."
    - name: Average Affinity Score
      expr: AVG(affinity_score)
      comment: "Average affinity score. Use to compare affinity strength by segment or category."
    - name: Total Purchases
      expr: SUM(purchase_count)
      comment: "Total purchases from affinity relationships. Use for affinity-driven sales analysis."
    - name: Average Predicted CLTV Impact
      expr: AVG(predicted_cltv_impact)
      comment: "Average predicted CLTV impact from affinity targeting. Use for ROI analysis."
    - name: Total Predicted CLTV Impact
      expr: SUM(predicted_cltv_impact)
      comment: "Total predicted CLTV impact. Use to quantify total value of affinity-based personalization."
    - name: High Affinity Customer Count
      expr: COUNT(DISTINCT CASE WHEN affinity_score >= 0.7 THEN customer_key END)
      comment: "Number of customers with high affinity (>=0.7). Target for personalization campaigns."
$$
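
A sketch of an affinity query for one segment, assuming the view was created as written. `product_name` is selected in the source but not exposed as a dimension, so this example stops at category level. The segment value `'VIP'` comes from the segment list in section 1.

```sql
-- Categories with the strongest affinity among VIP customers.
SELECT
  `Product Category Level 1`,
  MEASURE(`Average Affinity Score`)       AS avg_affinity_score,
  MEASURE(`High Affinity Customer Count`) AS high_affinity_customers
FROM juan_dev.retail.product_affinity_mv
WHERE `Segment` = 'VIP'
GROUP BY `Product Category Level 1`
ORDER BY avg_affinity_score DESC
LIMIT 10;
```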


## 5. Channel Behavior Metric View

**Purpose:** Channel performance metrics by segment  
**Source:** `gold_sales_fact` + `gold_channel_dim` + `gold_customer_dim`  
**Performance:** Defines channel measures once over a pre-joined source query

**Use Cases:**
- "Through which channels do customers prefer to shop?"
- "What is the impact of channel on purchase frequency and value?"
- "Show me channel performance by customer segment"


In [0]:
-- Channel Behavior Metric View
CREATE OR REPLACE VIEW juan_dev.retail.channel_behavior_mv
WITH METRICS
LANGUAGE YAML
AS $$
  version: 1.1
  comment: "Channel behavior metrics by segment. Use this view for questions about channel preferences, channel performance by segment, and multi-channel customer behavior."
  source: |
    SELECT 
      s.transaction_id,
      s.customer_key,
      c.segment,
      ch.channel_name,
      ch.is_digital,
      s.net_sales_amount,
      d.calendar_date
    FROM juan_dev.retail.gold_sales_fact s
    JOIN juan_dev.retail.gold_customer_dim c ON s.customer_key = c.customer_key
    JOIN juan_dev.retail.gold_channel_dim ch ON s.channel_key = ch.channel_key
    JOIN juan_dev.retail.gold_date_dim d ON s.date_key = d.date_key
    WHERE c.is_current = TRUE
      AND s.is_return = FALSE
  dimensions:
    - name: Channel Name
      expr: channel_name
      comment: "Sales channel name. Use for channel-specific behavior analysis."
    - name: Segment
      expr: segment
      comment: "Customer segment. Use to compare channel behavior by segment."
    - name: Is Digital
      expr: CASE WHEN is_digital THEN 'Digital' ELSE 'Physical' END
      comment: "Channel type: Digital or Physical. Use for digital vs physical channel comparison."
  measures:
    - name: Unique Customers
      expr: COUNT(DISTINCT customer_key)
      comment: "Number of unique customers. Use for channel reach analysis."
    - name: Transaction Count
      expr: COUNT(DISTINCT transaction_id)
      comment: "Total transactions. Use for channel volume analysis."
    - name: Total Revenue
      expr: SUM(net_sales_amount)
      comment: "Total revenue by channel. Primary metric for channel performance."
    - name: Average Order Value
      expr: SUM(net_sales_amount) / NULLIF(COUNT(DISTINCT transaction_id), 0)
      comment: "Average order value by channel: revenue divided by distinct transactions. Use to compare spending patterns across channels."
    - name: Transactions Per Customer
      expr: COUNT(DISTINCT transaction_id) * 1.0 / NULLIF(COUNT(DISTINCT customer_key), 0)
      comment: "Average transactions per customer by channel. Use for channel engagement frequency."
$$
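
A sketch comparing digital and physical channels per segment, assuming the view was created as written. The `Is Digital` dimension returns the labels `'Digital'` and `'Physical'` rather than a boolean.

```sql
-- Reach, revenue, and engagement by channel type and segment.
SELECT
  `Is Digital`,
  `Segment`,
  MEASURE(`Unique Customers`)          AS unique_customers,
  MEASURE(`Total Revenue`)             AS total_revenue,
  MEASURE(`Transactions Per Customer`) AS transactions_per_customer
FROM juan_dev.retail.channel_behavior_mv
GROUP BY `Is Digital`, `Segment`
ORDER BY `Segment`, `Is Digital`;
```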


## 6. Channel Migration Metric View

**Purpose:** Channel migration patterns from acquisition to preferred  
**Source:** `gold_customer_dim` (filtered to is_current = TRUE)  
**Performance:** Derives migration status per customer in the source query

**Use Cases:**
- "How do customers migrate between acquisition and preferred channels?"
- "What percentage of customers acquired in-store now prefer online?"
- "Show me channel migration by segment"


In [0]:
-- Channel Migration Metric View
CREATE OR REPLACE VIEW juan_dev.retail.channel_migration_mv
WITH METRICS
LANGUAGE YAML
AS $$
  version: 1.1
  comment: "Channel migration patterns from acquisition to preferred channel by segment. Use this view for questions about channel migration, omnichannel behavior, and channel loyalty."
  source: |
    SELECT 
      c.customer_key,
      c.segment,
      c.acquisition_channel,
      c.preferred_channel,
      c.lifetime_value,
      CASE 
        WHEN c.acquisition_channel = c.preferred_channel THEN 'Same Channel'
        ELSE 'Migrated'
      END as migration_status
    FROM juan_dev.retail.gold_customer_dim c
    WHERE c.is_current = TRUE
      AND c.acquisition_channel IS NOT NULL
      AND c.preferred_channel IS NOT NULL
  dimensions:
    - name: Acquisition Channel
      expr: acquisition_channel
      comment: "Channel where customer was acquired. Use to analyze migration from acquisition channel."
    - name: Preferred Channel
      expr: preferred_channel
      comment: "Customer's preferred channel. Use to analyze migration to preferred channel."
    - name: Segment
      expr: segment
      comment: "Customer segment. Use to analyze migration patterns by segment."
    - name: Migration Status
      expr: migration_status
      comment: "Migration status: Same Channel (no migration), Migrated. Use to identify channel-loyal vs multi-channel customers."
  measures:
    - name: Customer Count
      expr: COUNT(DISTINCT customer_key)
      comment: "Number of customers. Use to quantify migration patterns."
    - name: Average CLTV
      expr: AVG(lifetime_value)
      comment: "Average customer lifetime value. Use to compare value of migrated vs same-channel customers."
    - name: Total CLTV
      expr: SUM(lifetime_value)
      comment: "Total lifetime value. Use for total value contribution by migration pattern."
    - name: Migration Rate
      expr: (COUNT(CASE WHEN migration_status = 'Migrated' THEN 1 END) * 100.0) / NULLIF(COUNT(*), 0)
      comment: "Percentage of customers who migrated channels. Use to measure multi-channel adoption."
$$
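
A sketch of the migration question grouped by acquisition channel, assuming the view was created as written. Adding `` `Preferred Channel` `` to the select list and the `GROUP BY` would show where each channel's customers moved to.

```sql
-- Share of customers whose preferred channel differs from their
-- acquisition channel, per acquisition channel.
SELECT
  `Acquisition Channel`,
  MEASURE(`Customer Count`) AS customers,
  MEASURE(`Migration Rate`) AS migration_rate_pct
FROM juan_dev.retail.channel_migration_mv
GROUP BY `Acquisition Channel`
ORDER BY migration_rate_pct DESC;
```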


## 7. Engagement Funnel Metric View

**Purpose:** Engagement funnel metrics (View → Add to Cart → Purchase)  
**Source:** `gold_customer_event_fact` + `gold_sales_fact` + `gold_customer_dim`  
**Performance:** Defines funnel stage counts and conversion rates as reusable measures

**Use Cases:**
- "What are the conversion rates at each stage of the engagement funnel?"
- "Which channels drive the most engagement and conversions?"
- "Where are customers dropping off in the purchase funnel?"


In [0]:
-- Engagement Funnel Metric View
CREATE OR REPLACE VIEW juan_dev.retail.engagement_funnel_mv
WITH METRICS
LANGUAGE YAML
AS $$
  version: 1.1
  comment: "Engagement funnel metrics with conversion rates by segment and channel. Use this view for questions about funnel performance, conversion rates, drop-off analysis, and engagement by segment."
  source: |
    SELECT 
      e.session_id,
      e.customer_key,
      c.segment,
      ch.channel_name,
      d.month_name,
      d.year,
      e.event_type,
      CASE WHEN s.transaction_id IS NOT NULL THEN 1 ELSE 0 END as has_purchase
    FROM juan_dev.retail.gold_customer_event_fact e
    JOIN juan_dev.retail.gold_customer_dim c ON e.customer_key = c.customer_key
    JOIN juan_dev.retail.gold_channel_dim ch ON e.channel_key = ch.channel_key
    JOIN juan_dev.retail.gold_date_dim d ON e.date_key = d.date_key
    LEFT JOIN juan_dev.retail.gold_sales_fact s ON e.customer_key = s.customer_key 
      AND e.date_key = s.date_key
      AND s.is_return = FALSE
    WHERE c.is_current = TRUE
      AND d.calendar_date >= DATE_SUB(CURRENT_DATE, 90)
  dimensions:
    - name: Segment
      expr: segment
      comment: "Customer segment. Use to analyze funnel performance by segment."
    - name: Channel Name
      expr: channel_name
      comment: "Channel name. Use for channel-specific funnel analysis."
    - name: Month Name
      expr: month_name
      comment: "Month of engagement. Use for monthly funnel trend analysis."
    - name: Year
      expr: year
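      comment: "Year of engagement. The source covers only the last 90 days, so use this to separate periods that span a year boundary rather than for year-over-year comparison."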
  measures:
    - name: Total Sessions
      expr: COUNT(DISTINCT session_id)
      comment: "Total number of sessions. Base metric for funnel analysis."
    - name: Views
      expr: COUNT(DISTINCT CASE WHEN event_type = 'view' THEN session_id END)
      comment: "Number of sessions with product views. Top of funnel metric."
    - name: Add to Cart
      expr: COUNT(DISTINCT CASE WHEN event_type = 'add_to_cart' THEN session_id END)
      comment: "Number of sessions with add-to-cart actions. Middle of funnel metric."
    - name: Purchases
      expr: COUNT(DISTINCT CASE WHEN has_purchase = 1 THEN session_id END)
      comment: "Number of sessions whose customer made a non-return purchase on the same date. Purchases are matched by customer and date, not by session. Bottom of funnel metric."
    - name: Cart Conversion Rate
      expr: (COUNT(DISTINCT CASE WHEN event_type = 'add_to_cart' THEN session_id END) * 100.0) / NULLIF(COUNT(DISTINCT CASE WHEN event_type = 'view' THEN session_id END), 0)
      comment: "Percentage of views that result in add-to-cart. View to cart conversion rate."
    - name: Purchase Conversion Rate
      expr: (COUNT(DISTINCT CASE WHEN has_purchase = 1 THEN session_id END) * 100.0) / NULLIF(COUNT(DISTINCT CASE WHEN event_type = 'add_to_cart' THEN session_id END), 0)
      comment: "Percentage of add-to-cart that result in purchase. Cart to purchase conversion rate."
    - name: Overall Conversion Rate
      expr: (COUNT(DISTINCT CASE WHEN has_purchase = 1 THEN session_id END) * 100.0) / NULLIF(COUNT(DISTINCT CASE WHEN event_type = 'view' THEN session_id END), 0)
      comment: "Percentage of views that result in purchase. Overall funnel conversion rate."
$$
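
A sketch of a per-channel funnel readout, assuming the view was created as written. The results cover only the last 90 days because of the date filter in the source query.

```sql
-- Stage counts and stage-to-stage conversion per channel.
SELECT
  `Channel Name`,
  MEASURE(`Views`)                    AS view_sessions,
  MEASURE(`Add to Cart`)              AS cart_sessions,
  MEASURE(`Purchases`)                AS purchase_sessions,
  MEASURE(`Cart Conversion Rate`)     AS view_to_cart_pct,
  MEASURE(`Purchase Conversion Rate`) AS cart_to_purchase_pct
FROM juan_dev.retail.engagement_funnel_mv
GROUP BY `Channel Name`
ORDER BY view_sessions DESC;
```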


## 8. Cart Abandonment Metric View

**Purpose:** Cart abandonment summary with recovery tracking  
**Source:** `gold_cart_abandonment_fact` + `gold_customer_dim`  
**Performance:** Defines abandonment and recovery-rate measures once for reuse

**Use Cases:**
- "What is the rate of cart abandonment?"
- "Which stages have the highest abandonment rates?"
- "How effective are recovery campaigns?"


In [0]:
-- Cart Abandonment Metric View
CREATE OR REPLACE VIEW juan_dev.retail.cart_abandonment_mv
WITH METRICS
LANGUAGE YAML
AS $$
  version: 1.1
  comment: "Cart abandonment metrics with recovery effectiveness by segment and stage. Use this view for questions about abandonment rates, recovery campaigns, lost revenue, and abandonment patterns."
  source: |
    SELECT 
      ca.abandonment_id,
      ca.cart_id,
      c.segment,
      ca.abandonment_stage,
      ca.cart_value,
      ca.items_count,
      ca.recovery_email_sent,
      ca.is_recovered,
      ca.recovery_revenue,
      CASE 
        WHEN ca.is_recovered = TRUE THEN 'Recovered'
        WHEN ca.recovery_email_sent = TRUE THEN 'Email Sent - Not Recovered'
        ELSE 'No Recovery Attempt'
      END as recovery_status
    FROM juan_dev.retail.gold_cart_abandonment_fact ca
    JOIN juan_dev.retail.gold_customer_dim c ON ca.customer_key = c.customer_key
    WHERE c.is_current = TRUE
  dimensions:
    - name: Segment
      expr: segment
      comment: "Customer segment. Use to analyze abandonment patterns by segment."
    - name: Abandonment Stage
      expr: abandonment_stage
      comment: "Stage where abandonment occurred: cart, shipping, payment. Use to identify problematic funnel stages."
    - name: Recovery Status
      expr: recovery_status
      comment: "Recovery status: Recovered, Email Sent - Not Recovered, No Recovery Attempt. Use for recovery campaign effectiveness."
  measures:
    - name: Total Carts
      expr: COUNT(*)
      comment: "Total number of abandoned carts. Base metric for abandonment analysis."
    - name: Abandoned Carts
      expr: COUNT(CASE WHEN is_recovered = FALSE THEN 1 END)
      comment: "Number of carts that were not recovered. Use for abandonment rate calculation."
    - name: Recovered Carts
      expr: COUNT(CASE WHEN is_recovered = TRUE THEN 1 END)
      comment: "Number of carts that were recovered. Use for recovery rate calculation."
    - name: Abandonment Rate
      expr: (COUNT(CASE WHEN is_recovered = FALSE THEN 1 END) * 100.0) / NULLIF(COUNT(*), 0)
      comment: "Percentage of abandoned carts that were not recovered. The source holds only abandoned carts, so this is not a rate over all carts created."
    - name: Recovery Rate
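      expr: (COUNT(CASE WHEN is_recovered = TRUE AND recovery_email_sent = TRUE THEN 1 END) * 100.0) / NULLIF(COUNT(CASE WHEN recovery_email_sent = TRUE THEN 1 END), 0)
      comment: "Percentage of carts that received a recovery email and were then recovered. Carts recovered without an email are excluded so the rate cannot exceed 100. Campaign effectiveness metric."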
    - name: Lost Revenue
      expr: SUM(CASE WHEN is_recovered = FALSE THEN cart_value ELSE 0 END)
      comment: "Total value of abandoned carts not recovered. Lost revenue metric."
    - name: Recovered Revenue
      expr: SUM(CASE WHEN is_recovered = TRUE THEN recovery_revenue ELSE 0 END)
      comment: "Total revenue recovered from abandoned carts. Campaign ROI metric."
    - name: Average Cart Value
      expr: AVG(cart_value)
      comment: "Average value of abandoned carts. Use for abandonment value analysis."
    - name: Average Items Per Cart
      expr: AVG(items_count)
      comment: "Average number of items in abandoned carts. Use for cart composition analysis."
$$
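
A sketch of the "which stages have the highest abandonment" question, assuming the view was created as written. Stage values such as cart, shipping, and payment come from the dimension comment, not from inspected data.

```sql
-- Abandoned cart volume, value at stake, and recovery by funnel stage.
SELECT
  `Abandonment Stage`,
  MEASURE(`Total Carts`)       AS abandoned_carts,
  MEASURE(`Lost Revenue`)      AS lost_revenue,
  MEASURE(`Recovered Revenue`) AS recovered_revenue,
  MEASURE(`Recovery Rate`)     AS recovery_rate_pct
FROM juan_dev.retail.cart_abandonment_mv
GROUP BY `Abandonment Stage`
ORDER BY abandoned_carts DESC;
```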


## 9. Personalization Impact Metric View

**Purpose:** Personalization effectiveness with CLTV impact  
**Source:** `gold_customer_product_affinity_agg` + `gold_customer_dim`  
**Performance:** Defines personalization measures and predicted impact once for reuse

**Use Cases:**
- "Which segments respond best to personalization?"
- "How effective are personalized recommendations?"
- "What is the predicted impact on CLTV from affinity-based targeting?"


In [0]:
-- Personalization Impact Metric View
CREATE OR REPLACE VIEW juan_dev.retail.personalization_impact_mv
WITH METRICS
LANGUAGE YAML
AS $$
  version: 1.1
  comment: "Personalization effectiveness with predicted CLTV impact by segment and affinity level. Use this view for questions about personalization ROI, segment response to personalization, and CLTV impact predictions."
  source: |
    SELECT 
      a.customer_key,
      c.segment,
      a.affinity_score,
      a.purchase_count,
      a.predicted_cltv_impact,
      c.lifetime_value,
      CASE 
        WHEN a.affinity_score >= 0.7 THEN 'High Affinity'
        WHEN a.affinity_score >= 0.4 THEN 'Medium Affinity'
        ELSE 'Low Affinity'
      END as affinity_level
    FROM juan_dev.retail.gold_customer_product_affinity_agg a
    JOIN juan_dev.retail.gold_customer_dim c ON a.customer_key = c.customer_key
    WHERE c.is_current = TRUE
  dimensions:
    - name: Segment
      expr: segment
      comment: "Customer segment. Use to analyze personalization effectiveness by segment."
    - name: Affinity Level
      expr: affinity_level
      comment: "Affinity level: High (>=0.7), Medium (>=0.4), Low (<0.4). Use to analyze impact by affinity strength."
  measures:
    - name: Customers with Affinity
      expr: COUNT(DISTINCT customer_key)
      comment: "Number of customers with affinity relationships. Reach metric for personalization."
    - name: Average Affinity Score
      expr: AVG(affinity_score)
      comment: "Average affinity score. Use to compare affinity strength by segment."
    - name: Total Purchases from Affinity
      expr: SUM(purchase_count)
      comment: "Total purchases attributed to affinity relationships. Volume metric for personalization."
    - name: Average Purchases per Customer
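      expr: SUM(purchase_count) * 1.0 / NULLIF(COUNT(DISTINCT customer_key), 0)
      comment: "Average purchases per customer from affinity. The source has one row per customer-product pair, so purchases are summed and divided by distinct customers. Frequency metric for personalization."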
    - name: Average Predicted CLTV Impact
      expr: AVG(predicted_cltv_impact)
      comment: "Average predicted CLTV impact per customer-product affinity row. Use for ROI analysis."
    - name: Total Predicted CLTV Impact
      expr: SUM(predicted_cltv_impact)
      comment: "Total predicted CLTV impact from personalization. Primary ROI metric."
    - name: Current Average CLTV
      expr: AVG(lifetime_value)
      comment: "Current average CLTV, averaged over affinity rows, so customers with more affinity rows carry more weight. Use to compare with predicted impact."
    - name: High Affinity Customer Count
      expr: COUNT(DISTINCT CASE WHEN affinity_score >= 0.7 THEN customer_key END)
      comment: "Number of customers with high affinity. Target for priority personalization."
$$
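
A sketch of the "which segments respond best to personalization" question, assuming the view was created as written. The aliases are illustrative.

```sql
-- Reach and predicted CLTV impact by segment and affinity level.
SELECT
  `Segment`,
  `Affinity Level`,
  MEASURE(`Customers with Affinity`)     AS customers,
  MEASURE(`Total Predicted CLTV Impact`) AS total_predicted_impact
FROM juan_dev.retail.personalization_impact_mv
GROUP BY `Segment`, `Affinity Level`
ORDER BY total_predicted_impact DESC;
```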


## 10. Segment Trends Daily Metric View

**Purpose:** Daily trends by segment for time-series analysis  
**Source:** `gold_sales_fact` + `gold_customer_dim` + `gold_date_dim`  
**Performance:** Defines daily-grain measures over a pre-joined source query

**Use Cases:**
- "How have sales by segment changed over time?"
- "Show me monthly revenue trends by segment"
- "What are seasonal purchase patterns by segment?"


In [0]:
-- Segment Trends Daily Metric View
CREATE OR REPLACE VIEW juan_dev.retail.segment_trends_daily_mv
WITH METRICS
LANGUAGE YAML
AS $$
  version: 1.1
  comment: "Daily segment trends with temporal attributes for time-series analysis. Use this view for questions about segment trends over time, seasonal patterns, and historical segment performance."
  source: |
    SELECT 
      s.transaction_id,
      s.customer_key,
      c.segment,
      d.calendar_date,
      d.month_name,
      d.year,
      d.quarter,
      d.season,
      d.is_peak_season,
      s.net_sales_amount
    FROM juan_dev.retail.gold_sales_fact s
    JOIN juan_dev.retail.gold_customer_dim c ON s.customer_key = c.customer_key
    JOIN juan_dev.retail.gold_date_dim d ON s.date_key = d.date_key
    WHERE c.is_current = TRUE
      AND s.is_return = FALSE
  dimensions:
    - name: Calendar Date
      expr: calendar_date
      comment: "Transaction date. Use for daily trend analysis and time-based filtering."
    - name: Month Name
      expr: month_name
      comment: "Month name. Use for monthly trend analysis and month-over-month comparisons."
    - name: Year
      expr: year
      comment: "Year. Use for year-over-year trend comparisons."
    - name: Quarter
      expr: quarter
      comment: "Quarter (1-4). Use for quarterly trend analysis."
    - name: Season
      expr: season
      comment: "Season (Spring, Summer, Fall, Winter). Use for seasonal pattern analysis."
    - name: Peak Season Indicator
      expr: CASE WHEN is_peak_season THEN 'Peak Season' ELSE 'Regular Season' END
      comment: "Peak season indicator. Use to compare peak vs regular season performance."
    - name: Segment
      expr: segment
      comment: "Customer segment. Use for segment-specific trend analysis."
  measures:
    - name: Transaction Count
      expr: COUNT(DISTINCT transaction_id)
      comment: "Number of transactions. Use to track transaction volume trends."
    - name: Total Revenue
      expr: SUM(net_sales_amount)
      comment: "Total revenue. Primary metric for revenue trend analysis."
    - name: Unique Customers
      expr: COUNT(DISTINCT customer_key)
      comment: "Number of unique customers. Use to track customer engagement trends."
    - name: Average Order Value
      expr: SUM(net_sales_amount) / NULLIF(COUNT(DISTINCT transaction_id), 0)
      comment: "Average order value: revenue divided by distinct transactions. Use to track spending pattern trends."
    - name: Revenue Per Customer
      expr: SUM(net_sales_amount) / NULLIF(COUNT(DISTINCT customer_key), 0)
      comment: "Average revenue per customer. Use for customer value trend analysis."
$$
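
A sketch of a quarterly trend query by segment, assuming the view was created as written. Grouping by `Year` and `Quarter` rather than `Month Name` avoids the alphabetical ordering that month names would otherwise produce.

```sql
-- Revenue and active customers per segment, by year and quarter.
SELECT
  `Year`,
  `Quarter`,
  `Segment`,
  MEASURE(`Total Revenue`)        AS total_revenue,
  MEASURE(`Unique Customers`)     AS unique_customers,
  MEASURE(`Revenue Per Customer`) AS revenue_per_customer
FROM juan_dev.retail.segment_trends_daily_mv
GROUP BY `Year`, `Quarter`, `Segment`
ORDER BY `Year`, `Quarter`, `Segment`;
```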


## Metric Views Summary

### Created Metric Views (10)

1. **customer_segmentation_mv** - Customer segment summary with CLTV
2. **customer_rfm_analysis_mv** - RFM metrics with loyalty status
3. **customer_purchase_summary_mv** - Purchase behavior by segment and channel
4. **product_affinity_mv** - Product affinity with CLTV impact
5. **channel_behavior_mv** - Channel performance by segment
6. **channel_migration_mv** - Channel migration patterns
7. **engagement_funnel_mv** - Engagement funnel with conversion rates
8. **cart_abandonment_mv** - Cart abandonment with recovery tracking
9. **personalization_impact_mv** - Personalization effectiveness
10. **segment_trends_daily_mv** - Daily segment trends

### Next Steps

1. Execute each SQL cell to create the metric views in Databricks
2. Grant SELECT permissions to Genie users
3. Add metric views to Customer Behavior Genie space
4. Test with sample queries
5. Monitor query performance improvements
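
Step 2 might look like the following under Unity Catalog. This is a sketch only: the group name `genie_users` is hypothetical, and it assumes views are granted through the `TABLE` securable type. Repeat the `SELECT` grant for each of the 10 views.

```sql
-- `genie_users` is a hypothetical group name; replace with the real principal.
GRANT USE CATALOG ON CATALOG juan_dev TO `genie_users`;
GRANT USE SCHEMA ON SCHEMA juan_dev.retail TO `genie_users`;

-- One SELECT grant per metric view (first view shown).
GRANT SELECT ON TABLE juan_dev.retail.customer_segmentation_mv TO `genie_users`;
```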
