Issue Summary
The reported Lift values in the PRIM Subgroup Analysis table contain a fundamental calculation error that makes the metric uninterpretable and potentially misleading. Specifically, the EI scenario shows Lift = 0.931 despite having the highest adoption density (81.07%) of all scenarios.
The Problem
Contradictory Values
| Scenario | Density (Adoption %) | Lift | Interpretation Problem |
|---|---|---|---|
| NI | 32.46% | 1.176 | Reasonable |
| SI | 67.89% | 1.781 | Seems reasonable |
| EI | 81.07% | 0.931 | ❌ IMPOSSIBLE |
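Key Issue: A Lift below 1.0 means the segment adopts at a lower rate than its reference population. Yet the EI segment's density (81.07%) is the highest of all scenarios and exceeds every reported mean adoption rate (28.00%, 47.03%, 46.37%), so no reference mean in the report can yield Lift < 1.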
Root Cause Analysis
Expected Definition of Lift
Lift should measure: "How much better does this segment perform compared to a reference?"
Two Standard Approaches
Option A: Within-Scenario Comparison
Compare segment to the mean of its own scenario:
Lift_EI = Density_EI / Mean_Adoption_EI
Lift_EI = 0.8107 / 0.4637 = 1.748
Option B: Cross-Scenario Comparison
Compare all segments to the baseline (NI) mean:
Lift_EI = Density_EI / Mean_Adoption_NI
Lift_EI = 0.8107 / 0.28 = 2.895
Both approaches yield Lift > 1.0, which is correct for a high-adoption segment.
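As a minimal sketch, the two definitions can be written out as follows. The dictionaries hold the density and mean adoption figures quoted in this issue; the names `DENSITY`, `MEAN_ADOPTION`, `lift_within_scenario`, and `lift_vs_baseline` are illustrative and are not taken from the project code.

```python
# Figures quoted in this issue (as proportions). Names are hypothetical.
DENSITY = {"NI": 0.3246, "SI": 0.6789, "EI": 0.8107}
MEAN_ADOPTION = {"NI": 0.2800, "SI": 0.4703, "EI": 0.4637}


def lift_within_scenario(scenario: str) -> float:
    """Option A: segment density divided by the mean of the same scenario."""
    return DENSITY[scenario] / MEAN_ADOPTION[scenario]


def lift_vs_baseline(scenario: str, baseline: str = "NI") -> float:
    """Option B: segment density divided by the baseline scenario mean."""
    return DENSITY[scenario] / MEAN_ADOPTION[baseline]


if __name__ == "__main__":
    for name in DENSITY:
        a = lift_within_scenario(name)
        b = lift_vs_baseline(name)
        print(f"{name}: Option A = {a:.3f}, Option B = {b:.3f}")
```

Under either definition, a segment whose density exceeds the chosen reference mean gets a Lift above 1.0.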
Verification Against Reported Data
Using Aggregate Adoption Metrics
From the second table:
- Mean Adoption NI = 28.00%
- Mean Adoption SI = 47.03%
- Mean Adoption EI = 46.37%
Reconstructing Lift (Option A: Within-Scenario)
| Scenario | Density | Mean (Same Scenario) | Calculated Lift | Reported Lift | ✓/✗ |
|---|---|---|---|---|---|
| NI | 32.46% | 28.00% | 1.160 | 1.176 | ✓ Close |
| SI | 67.89% | 47.03% | 1.444 | 1.781 | ✗ Off by 19% |
| EI | 81.07% | 46.37% | 1.748 | 0.931 | ✗ Wrong by 88% |
Reconstructing Lift (Option B: Baseline Reference)
| Scenario | Density | Mean NI (Baseline) | Calculated Lift | Reported Lift | ✓/✗ |
|---|---|---|---|---|---|
| NI | 32.46% | 28.00% | 1.160 | 1.176 | ✓ Close |
| SI | 67.89% | 28.00% | 2.425 | 1.781 | ✗ Off by 36% |
| EI | 81.07% | 28.00% | 2.895 | 0.931 | ✗ Wrong by 211% |
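The reconstruction in both tables can be automated with a short script. This sketch measures each gap as a fraction of the reported Lift and flags anything beyond a tolerance; the 5% tolerance and all names here are assumptions chosen for illustration, not project conventions.

```python
# Figures quoted in this issue (as proportions). Names are hypothetical.
DENSITY = {"NI": 0.3246, "SI": 0.6789, "EI": 0.8107}
MEAN_ADOPTION = {"NI": 0.2800, "SI": 0.4703, "EI": 0.4637}
REPORTED_LIFT = {"NI": 1.176, "SI": 1.781, "EI": 0.931}


def relative_gap(calculated: float, reported: float) -> float:
    """Absolute gap between recomputed and reported Lift, relative to reported."""
    return abs(calculated - reported) / reported


def audit(tolerance: float = 0.05) -> dict:
    """Recompute Lift under Options A and B and compare with the reported value."""
    results = {}
    for scenario, reported in REPORTED_LIFT.items():
        option_a = DENSITY[scenario] / MEAN_ADOPTION[scenario]
        option_b = DENSITY[scenario] / MEAN_ADOPTION["NI"]
        gap_a = relative_gap(option_a, reported)
        gap_b = relative_gap(option_b, reported)
        results[scenario] = {
            "option_a": option_a,
            "option_b": option_b,
            "gap_a": gap_a,
            "gap_b": gap_b,
            "matches_a": gap_a <= tolerance,
            "matches_b": gap_b <= tolerance,
        }
    return results


if __name__ == "__main__":
    for scenario, row in audit().items():
        print(
            f"{scenario}: A={row['option_a']:.3f} (gap {row['gap_a']:.0%}), "
            f"B={row['option_b']:.3f} (gap {row['gap_b']:.0%})"
        )
```

With the figures above, only the NI row falls within the tolerance; SI and EI fail under both definitions.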
Impact on Interpretation
Current (Incorrect) Interpretation
"The EI segment has Lift = 0.931, meaning it performs 7% worse than the population average despite targeting high-trust, high-income individuals."
Correct Interpretation (Option A)
"The EI segment has Lift = 1.748, meaning it performs 75% better than the EI scenario average, successfully concentrating adoption in the targeted demographic."
Correct Interpretation (Option B)
"The EI segment has Lift = 2.895, achieving nearly 3× the baseline adoption rate, demonstrating substantial policy effectiveness."
Hypotheses for the Error
- Inconsistent denominators: Different reference means used across scenarios (e.g., NI uses scenario mean, but SI/EI accidentally use segment-specific or inverted values)
- Inverted calculation: The EI Lift may have been calculated as `Mean / Density` instead of `Density / Mean`: 0.4637 / 0.8107 = 0.572 (still doesn't match 0.931)
- Wrong variable substitution: Code may be pulling from incorrect columns in data structure
- Copying error: Manual transcription error when populating the table
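One way to narrow these hypotheses is to back out the denominator that each reported Lift implies. The sketch below assumes the numerator really was the segment density, which is itself unverified; the helper names are hypothetical.

```python
# Figures quoted in this issue (as proportions). Names are hypothetical.
DENSITY = {"NI": 0.3246, "SI": 0.6789, "EI": 0.8107}
MEAN_ADOPTION = {"NI": 0.2800, "SI": 0.4703, "EI": 0.4637}
REPORTED_LIFT = {"NI": 1.176, "SI": 1.781, "EI": 0.931}


def implied_denominator(scenario: str) -> float:
    """Denominator x that reproduces the reported Lift if Lift = density / x."""
    return DENSITY[scenario] / REPORTED_LIFT[scenario]


def nearest_known_mean(value: float) -> tuple:
    """Return (scenario, mean) for the reported mean closest to value."""
    return min(MEAN_ADOPTION.items(), key=lambda item: abs(item[1] - value))


if __name__ == "__main__":
    for scenario in DENSITY:
        x = implied_denominator(scenario)
        label, mean = nearest_known_mean(x)
        print(
            f"{scenario}: implied denominator = {x:.4f}, "
            f"nearest reported mean = {label} ({mean:.4f})"
        )
```

For EI the implied denominator comes out above the segment density itself, so none of the three reported means can explain it. That points away from a simple choice-of-reference mix-up and toward a wrong column or a transcription error.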
Required Actions
Immediate
- Audit the source code that calculates `Lift` for each scenario
- Verify which denominator is used: scenario mean, baseline mean, or other
- Check if calculation method differs between NI, SI, and EI scenarios
- Re-run analysis with corrected formula
Documentation
- Add explicit formula for Lift to methods section:
Lift_scenario = Density_segment / Mean_Adoption_scenario
- Clarify interpretation: "Lift > 1 indicates segment outperforms scenario average"
- Report corrected values in results table
Quality Control
- Implement assertion tests: `assert all(Lift > 1) if segment is "high-adoption"`
- Add sanity checks: `assert Density_segment > Mean_population if Lift > 1`
- Cross-validate with alternative calculation methods
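The assertions in this list are pseudocode. A runnable version might look like the sketch below, which checks both the direction and the magnitude of a reported Lift against `density / reference_mean`. The function name `check_lift` and the 2% relative tolerance are assumptions for illustration.

```python
import math


def check_lift(density: float, reference_mean: float, lift: float,
               rel_tol: float = 0.02) -> None:
    """Raise AssertionError if lift is inconsistent with density / reference_mean."""
    assert 0.0 <= density <= 1.0, "density must be a proportion in [0, 1]"
    assert 0.0 < reference_mean <= 1.0, "reference mean must be in (0, 1]"

    # Direction check: Lift exceeds 1 exactly when the segment beats the reference.
    assert (lift > 1.0) == (density > reference_mean), (
        f"Lift={lift} contradicts density={density} vs mean={reference_mean}"
    )

    # Magnitude check: reported Lift should be close to the recomputed ratio.
    expected = density / reference_mean
    assert math.isclose(lift, expected, rel_tol=rel_tol), (
        f"Lift={lift} differs from recomputed {expected:.3f}"
    )


if __name__ == "__main__":
    # NI row from this issue: passes with the assumed tolerance.
    check_lift(density=0.3246, reference_mean=0.2800, lift=1.176)
    # EI row from this issue: fails the direction check.
    try:
        check_lift(density=0.8107, reference_mean=0.4637, lift=0.931)
    except AssertionError as err:
        print(f"EI check failed: {err}")
```

Because `assert` statements are skipped when Python runs with `-O`, a production pipeline may prefer explicit exceptions or a test framework.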
Recommended Table Correction
Corrected PRIM Subgroup Analysis (assuming Option A: within-scenario Lift):
| Scenario | Coverage | Density | SD (Density) | Lift (Corrected) | Effect Size (d) | 95% CI Lower | 95% CI Upper | p-value | Stability | n_segment |
|---|---|---|---|---|---|---|---|---|---|---|
| NI | 0.1734 | 0.3246 | 0.1309 | 1.160 | 0 | 0 | 0 | 1.00 | 0 | 867 |
| SI | 0.1050 | 0.6789 | 0.1581 | 1.444 | 0.814 | 0.537 | 1.091 | < 0.001 | 0.067 | 525 |
| EI | 0.4126 | 0.8107 | 0.1579 | 1.748 | 1.037 | 0.760 | 1.315 | < 0.001 | 0.067 | 2063 |
Statistical Implications
This error affects:
- Interpretability: Readers cannot understand which segments truly outperform
- Reproducibility: Others cannot validate calculations without source code
- Policy recommendations: Incorrect Lift values may lead to wrong targeting decisions
- Credibility: Undermines trust in other reported metrics
Additional Concerns
While investigating Lift, review these related issues:
- Why is NI Lift ≠ 1.0? If NI is "baseline population (no segmentation)", its Lift should be exactly 1.0 by definition
- Stability = 6.67%: Why do optimal segments appear in only 1/15 iterations?
- n_segment for NI = 867: With 5,000 agents, why does "baseline population" only include 867 (17.34%)?
These suggest the NI "segment" may not actually represent the full population baseline as described.
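The segment sizes can be cross-checked against the Coverage column with a few lines. This sketch assumes a population of 5,000 agents, as stated above, and that Coverage is the fraction of agents inside each segment; the variable names are illustrative.

```python
N_AGENTS = 5000  # population size stated in this issue

# (coverage, n_segment) pairs copied from the PRIM Subgroup Analysis table.
SEGMENTS = {
    "NI": (0.1734, 867),
    "SI": (0.1050, 525),
    "EI": (0.4126, 2063),
}


def implied_segment_size(coverage: float, n_agents: int = N_AGENTS) -> int:
    """Segment size implied by coverage, rounded to a whole number of agents."""
    return round(coverage * n_agents)


if __name__ == "__main__":
    for name, (coverage, n_segment) in SEGMENTS.items():
        implied = implied_segment_size(coverage)
        status = "consistent" if implied == n_segment else "MISMATCH"
        print(f"{name}: coverage x N = {implied}, reported n = {n_segment} ({status})")
```

All three rows are consistent with Coverage × 5,000, which supports reading the NI row as a PRIM segment inside the NI scenario and not as the full population.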
Status: 🔴 BLOCKING ISSUE - Must be resolved before publication or policy recommendations.
Priority: CRITICAL - Affects core interpretation of primary results.
Assigned to: [Data Analysis Team / Lead Author]
Deadline: Before manuscript submission / next revision cycle