# Executive Summary

**Business Question:** Are our promotion-heavy customers high-value "smart shoppers" who drive revenue, or low-margin "cherry-pickers" who erode profits?

**Key Findings:**

1. **The Promotion Paradox:** Promotion-heavy customers spend 19% more annually ($2,092 vs $1,757) but buy items that are 33% cheaper ($0.92 vs $1.37 per unit). When we control for spending level, promo-light high-spenders have about 7% higher basket values ($45.10 vs $42.26), a reversal consistent with Simpson's Paradox.

2. **Margin Killers Outnumber Smart Shoppers 2:1:** We identified 238 "smart shoppers" (9.6%) who combine high promotion usage with high spending. However, they're outnumbered by 529 "margin killers" (21.4%) who use promotions heavily but maintain low basket values ($28.48).

3. **Missing the Affluent Customer:** Only 9.1% of households with $250K+ income are promo-heavy, compared to 46.4% of $50-74K households. Our promotions successfully attract budget-constrained necessity shoppers but largely miss affluent discretionary spenders.

4. **Highest Value Customers Don't Need Promotions:** The "Untapped Potential" segment (380 households, 15.4%) maintains the highest basket values ($45.10) with minimal promotion usage, showing that high value exists without heavy discounting.

**Recommendation:**

**Reduce overall promotional spending by 25-30% while shifting to segment-specific strategies:**
- PROTECT high-value, low-promo customers (offer convenience, not discounts)
- REWARD smart shoppers selectively (personalized, premium-focused offers)
- REDUCE subsidies to margin killers (basket minimums, category restrictions)
- ATTRACT missing affluent demographics (experience over price)

**Expected Impact:** ~$230,000 in annual promotional cost savings while maintaining or improving revenue from high-value segments. A 90-day pilot before full rollout manages the risk.

**Strategic Shift:** From volume-focused (attract everyone) to value-focused (attract the right customers).



## 1. Business Problem & Approach

### Why This Matters to Regork

Promotions cost money. Every discount dollar represents potential margin erosion. But promotions 
also drive traffic, build loyalty, and can attract high-value customers. The question is: **which 
customers are we attracting with our promotions?**

There are two competing hypotheses:

**Hypothesis A: "Margin Killers"**  
Promotion-heavy households are price-sensitive cherry-pickers who:
- Only shop during sales
- Buy low-margin items
- Have small basket sizes
- Would defect to competitors for better deals

**Hypothesis B: "Smart Shoppers"**  
Promotion-heavy households are savvy, high-value customers who:
- Use promotions strategically but still spend significantly
- Buy premium products at a discount
- Have large basket sizes
- Are loyal if we keep engaging them with targeted offers
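
One way to tell the two hypotheses apart is to compare basket values between promo-heavy and promo-light households both overall and within each spending tier. The sketch below uses made-up numbers (not results from this analysis) and hypothetical column names (`promo_group`, `spend_tier`, `avg_basket`) to show how an aggregate comparison can point one way while the within-tier comparison points the other.

```python
import pandas as pd

# Toy household-level data; every value here is illustrative, not a finding.
households = pd.DataFrame({
    "promo_group": ["heavy"] * 4 + ["light"] * 4,
    "spend_tier": ["high", "high", "high", "low",
                   "high", "low", "low", "low"],
    "avg_basket": [42.0, 42.0, 42.0, 20.0,
                   45.0, 22.0, 22.0, 22.0],
})

# Aggregate view: mean basket value per promotion group, ignoring spend tier.
overall = households.groupby("promo_group")["avg_basket"].mean()

# Stratified view: the same comparison within each spend tier.
by_tier = households.pivot_table(
    index="spend_tier",
    columns="promo_group",
    values="avg_basket",
    aggfunc="mean",
)

print(overall)
print(by_tier)
```

In this toy data the promo-heavy group looks stronger in aggregate only because more of its households sit in the high-spend tier; within each tier the promo-light households have the larger baskets.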

### Our Approach

We will analyze 1.47M transactions, joined to demographic data for 801 households, to:

1. **Define promotion intensity** using discount usage and coupon redemptions
2. **Segment households** into a 2×2 matrix:
   - Promotion usage (heavy vs. light)
   - Spending level (high vs. low)
3. **Identify "smart shoppers"** (high promotion + high spending)
4. **Analyze demographic patterns** (income, kids, household size)
5. **Recommend action** based on which hypothesis the data supports
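
The first three steps can be sketched roughly as follows. This is a minimal illustration rather than the analysis itself. It assumes the transactions table has `household_id`, `basket_id`, `sales_value`, `retail_disc`, and `coupon_disc` columns, defines promotion intensity as discount dollars divided by pre-discount shelf value, and uses median cutoffs as placeholders for the real thresholds. The label for the low-promo, low-spend quadrant is also a placeholder.

```python
import pandas as pd

def segment_households(tx: pd.DataFrame) -> pd.DataFrame:
    """Assign each household to a cell of the 2x2 promotion/spending matrix."""
    # abs() guards against discounts being stored as negative amounts.
    tx = tx.assign(discount=tx["retail_disc"].abs() + tx["coupon_disc"].abs())
    per_hh = tx.groupby("household_id").agg(
        total_spend=("sales_value", "sum"),
        total_discount=("discount", "sum"),
        baskets=("basket_id", "nunique"),
    )
    # Assumed definition: share of pre-discount value that was discounted.
    shelf_value = per_hh["total_spend"] + per_hh["total_discount"]
    per_hh["promo_intensity"] = per_hh["total_discount"] / shelf_value
    per_hh["avg_basket"] = per_hh["total_spend"] / per_hh["baskets"]

    # Placeholder cutoffs: median split on each axis.
    promo_heavy = per_hh["promo_intensity"] > per_hh["promo_intensity"].median()
    high_spend = per_hh["total_spend"] > per_hh["total_spend"].median()

    labels = {
        (True, True): "Smart Shopper",
        (True, False): "Margin Killer",
        (False, True): "Untapped Potential",
        (False, False): "Low Promo / Low Spend",  # placeholder name
    }
    per_hh["segment"] = [
        labels[(bool(p), bool(s))] for p, s in zip(promo_heavy, high_spend)
    ]
    return per_hh

# Toy transactions standing in for the real table.
toy = pd.DataFrame({
    "household_id": [1, 1, 2, 2, 3, 3, 4, 4],
    "basket_id": [10, 11, 20, 21, 30, 31, 40, 41],
    "sales_value": [50.0, 60.0, 10.0, 12.0, 55.0, 45.0, 9.0, 11.0],
    "retail_disc": [10.0, 12.0, 4.0, 5.0, 1.0, 0.0, 0.0, 1.0],
    "coupon_disc": [2.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
})
result = segment_households(toy)
segments = result["segment"].to_dict()
print(result[["total_spend", "promo_intensity", "avg_basket", "segment"]])
```

Coupon redemption counts could be added as a second intensity signal. The sketch uses discount dollars only to keep the definition easy to follow.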

### Data Sources

We use the Complete Journey dataset (via `completejourney_py`):

- **transactions** (1.47M records): Item-level purchases with discounts
- **demographics** (801 households): Age, income, kids, household composition
- **coupon_redemptions** (2.3K records): Specific coupon usage events
- **coupons** (116K records): Coupon metadata
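
Before analysis, it can help to confirm that the loaded tables roughly match the sizes listed above. The helper below is a hypothetical sketch: `summarize_tables` and `EXPECTED_ROWS` are illustrative names, the 5% tolerance is arbitrary, and it assumes the tables arrive as a dict of DataFrames keyed by table name.

```python
import pandas as pd

# Approximate row counts taken from the list above.
EXPECTED_ROWS = {
    "transactions": 1_470_000,
    "demographics": 801,
    "coupon_redemptions": 2_300,
    "coupons": 116_000,
}

def summarize_tables(tables, expected=EXPECTED_ROWS, tolerance=0.05):
    """Report each table's row count and whether it is near the expected size."""
    rows = []
    for name, approx in expected.items():
        if name not in tables:
            raise KeyError(f"missing table: {name}")
        n = len(tables[name])
        rows.append({
            "table": name,
            "rows": n,
            "expected_approx": approx,
            "within_tolerance": abs(n - approx) <= tolerance * approx,
        })
    return pd.DataFrame(rows).set_index("table")

# Toy check with a stand-in demographics table of 801 households.
toy_tables = {"demographics": pd.DataFrame({"household_id": range(801)})}
report = summarize_tables(toy_tables, expected={"demographics": 801})
print(report)
```

With the package installed, calling `summarize_tables(get_data())` would run the same check on the real tables, assuming `get_data()` returns them in that dict form.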



In [1]:
import os
import warnings

# Suppress library warnings to keep notebook output readable
warnings.filterwarnings('ignore')
os.environ['PYTHONWARNINGS'] = 'ignore'

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
from completejourney_py import get_data

# pandas display settings
pd.set_option('display.max_columns', 50)
pd.set_option('display.max_rows', 100)
pd.set_option('display.width', 120)
pd.set_option('display.float_format', '{:.2f}'.format)

# Plot styling
plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette("husl")
plt.rcParams['figure.figsize'] = (12, 6)
plt.rcParams['font.size'] = 10


