# Consumer Expenditure Survey Validation

This notebook validates our SPM base threshold calculations against BLS published values using Consumer Expenditure Survey methodology.

## BLS Methodology (Updated September 2021)

The SPM threshold calculation uses:
- 5 years of CE Survey data, lagged by 1 year
- FCSUti (Food, Clothing, Shelter, Utilities, telephone, internet) expenditures
- **83% of median** (47th-53rd percentile average) - changed from 33rd percentile in 2021
- FCSUti CPI-U composite index for inflation adjustment

### Data Sources

- [BLS SPM Thresholds 2024](https://www.bls.gov/pir/spm/spm_thresholds_2024.htm)
- [BLS SPM Historical Thresholds](https://www.bls.gov/pir/spm/spm_historic_thresholds.htm)
- [BLS Methodology Paper (2021)](https://www.bls.gov/pir/spm/garner_spm_choices_03_15_21.pdf)
- [CE Public Use Microdata](https://www.bls.gov/cex/pumd.htm)

In [None]:
import pandas as pd
import numpy as np

# Published BLS thresholds (reference family 2A2C)
BLS_THRESHOLDS = {
    2024: {
        "renter": 39430,
        "owner_with_mortgage": 39068,
        "owner_without_mortgage": 32586,
    },
    2023: {
        "renter": 36606,
        "owner_with_mortgage": 36192,
        "owner_without_mortgage": 30347,
    },
    2022: {
        "renter": 33402,
        "owner_with_mortgage": 32949,
        "owner_without_mortgage": 27679,
    },
    2021: {
        "renter": 31453,
        "owner_with_mortgage": 31089,
        "owner_without_mortgage": 26022,
    },
    2020: {
        "renter": 28881,
        "owner_with_mortgage": 28533,
        "owner_without_mortgage": 23948,
    },
}

print("BLS Published SPM Thresholds (Reference Family 2A2C)")
print("=" * 60)
for year, thresholds in sorted(BLS_THRESHOLDS.items(), reverse=True):
    print(f"\n{year}:")
    for tenure, value in thresholds.items():
        print(f"  {tenure:25s}: ${value:,}")

## Inflation Trends

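Published SPM thresholds rose every year from 2020 to 2024 for all three tenure types, driven largely by inflation; the renter threshold is up about 37% over the period. The cell below computes the year-over-year changes.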

In [None]:
# Calculate year-over-year changes
rows = []
years = sorted(BLS_THRESHOLDS.keys())

for i, year in enumerate(years):
    if i > 0:
        prev_year = years[i-1]
        for tenure in ["renter", "owner_with_mortgage", "owner_without_mortgage"]:
            curr = BLS_THRESHOLDS[year][tenure]
            prev = BLS_THRESHOLDS[prev_year][tenure]
            pct_change = (curr - prev) / prev * 100
            rows.append({
                "Year": year,
                "Tenure": tenure,
                "Threshold": f"${curr:,}",
                "Change": f"${curr - prev:,}",
                "% Change": f"{pct_change:.1f}%"
            })

change_df = pd.DataFrame(rows)
change_df.pivot(index="Year", columns="Tenure", values="% Change")

## FCSUti CPI-U Composite Index

BLS adjusts for inflation with a composite CPI-U index whose component weights follow FCSUti expenditure shares. The weights below are the values used in this notebook:

| Component | Weight | CPI Series |
|-----------|--------|------------|
| Food | 30% | CUUR0000SAF |
| Apparel | 5% | CUUR0000SAA |
| Shelter | 45% | CUUR0000SAH1 |
| Utilities | 12% | CUUR0000SAH2 |
| Telephone | 4% | CUUR0000SEED |
| Internet | 4% | CUUR0000SEEE02 |
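
As a rough sketch of how such a composite could be formed, the snippet below takes a fixed-weight average of component price relatives. The index levels are made-up placeholders, not CPI data, and the fixed-weight arithmetic form is an assumption for illustration rather than the documented BLS formula. The function name `composite_relative` is hypothetical.

```python
# Weights from the table above; index levels below are hypothetical placeholders.
WEIGHTS = {
    "Food": 0.30, "Apparel": 0.05, "Shelter": 0.45,
    "Utilities": 0.12, "Telephone": 0.04, "Internet": 0.04,
}


def composite_relative(base, current, weights=WEIGHTS):
    """Fixed-weight average of component price relatives (current / base)."""
    total_weight = sum(weights.values())
    if abs(total_weight - 1.0) > 1e-9:
        raise ValueError(f"weights sum to {total_weight}, expected 1.0")
    return sum(w * current[k] / base[k] for k, w in weights.items())


# Hypothetical index levels for two periods (NOT real CPI data).
base_index = {k: 100.0 for k in WEIGHTS}
current_index = {
    "Food": 104.0, "Apparel": 101.0, "Shelter": 106.0,
    "Utilities": 103.0, "Telephone": 99.0, "Internet": 100.0,
}

factor = composite_relative(base_index, current_index)
print(f"Composite inflation factor: {factor:.4f}")
```

Because shelter carries 45% of the weight, its price movement dominates the composite factor in this setup.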

In [None]:
# FCSUti component weights
FCSUTI_WEIGHTS = {
    "Food": 0.30,
    "Apparel": 0.05,
    "Shelter": 0.45,
    "Utilities": 0.12,
    "Telephone": 0.04,
    "Internet": 0.04,
}

print("FCSUti Component Weights")
print("=" * 40)
for component, weight in FCSUTI_WEIGHTS.items():
    bar = "█" * int(weight * 40)
    print(f"{component:12s} {weight*100:5.1f}% {bar}")
print(f"{'Total':12s} {sum(FCSUTI_WEIGHTS.values())*100:5.1f}%")

## CE Survey Methodology

### Step 1: Data Collection

The CE Survey collects expenditure data from ~7,000 consumer units per quarter. For SPM thresholds:
- Use 5 years of quarterly data, lagged by 1 year (e.g., 2019Q1-2023Q4 for 2024 thresholds)
- Filter to consumer units with at least one child under 18

### Step 2: Calculate FCSUti

Sum expenditures on:
- **Food**: Food at home + food away from home
- **Clothing**: Apparel and services
- **Shelter**: Rent or owner costs (interest, insurance, taxes)
- **Utilities**: Electricity, gas, water, fuel
- **Telephone**: Landline and cellular service
- **Internet**: Internet service
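
A minimal sketch of this summation with pandas follows. The column names are hypothetical stand-ins: real CE PUMD files identify expenditures by UCC codes and summary variables, which would first need to be mapped onto these six categories. The sample rows are invented.

```python
import pandas as pd

# Hypothetical column names, one per FCSUti component.
FCSUTI_COLUMNS = ["food", "clothing", "shelter", "utilities", "telephone", "internet"]


def add_fcsuti(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with an 'fcsuti' column summing the six components."""
    missing = [c for c in FCSUTI_COLUMNS if c not in df.columns]
    if missing:
        raise KeyError(f"missing expenditure columns: {missing}")
    out = df.copy()
    # Row-wise sum; pandas skips NaN by default, so gaps count as zero.
    out["fcsuti"] = out[FCSUTI_COLUMNS].sum(axis=1)
    return out


# Two invented consumer units with annual expenditures in dollars.
sample = pd.DataFrame({
    "food": [9000, 12000],
    "clothing": [1500, 2000],
    "shelter": [14000, 20000],
    "utilities": [3000, 4000],
    "telephone": [1200, 1500],
    "internet": [800, 900],
})
print(add_fcsuti(sample)["fcsuti"].tolist())
```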

### Step 3: Convert to Reference Family

Apply equivalence scale adjustment:
```
FCSUti_2A2C = FCSUti × (2.158 / raw_scale)    # 2.158 ≈ (2 + 0.5 × 2)^0.7 for 2A2C
```
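
The adjustment can be sketched as below, assuming `raw_scale` is the three-parameter equivalence scale commonly used with the SPM. Under that scale the reference family's value is (2 + 0.5 × 2)^0.7, roughly 2.16. The function names are hypothetical and not taken from any package.

```python
def three_parameter_scale(adults: int, children: int) -> float:
    """Three-parameter equivalence scale (assumed form of raw_scale)."""
    if adults < 0 or children < 0 or adults + children == 0:
        raise ValueError("need at least one person")
    if children == 0 and adults <= 2:
        # One or two adults without children.
        return adults ** 0.5
    if adults == 1:
        # Single parent: first child counts 0.8, each further child 0.5.
        return (adults + 0.8 + 0.5 * (children - 1)) ** 0.7
    # All other families.
    return (adults + 0.5 * children) ** 0.7


REFERENCE_SCALE = three_parameter_scale(2, 2)  # two adults, two children


def to_reference_family(fcsuti: float, adults: int, children: int) -> float:
    """Rescale a consumer unit's FCSUti to the 2A2C reference family."""
    return fcsuti * REFERENCE_SCALE / three_parameter_scale(adults, children)


print(f"Reference scale: {REFERENCE_SCALE:.3f}")
print(f"1A1C unit spending $30,000 -> ${to_reference_family(30000, 1, 1):,.0f}")
```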

### Step 4: Calculate Threshold

For each tenure type:
1. Pool 5 years of adjusted FCSUti values
2. Calculate 47th and 53rd percentiles
3. Average to get the "median range"
4. Multiply by 0.83 to get threshold
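
The four steps can be sketched for one tenure group as below. The sketch also accepts survey weights, which the steps above do not mention; treating each consumer unit's weight as a frequency is an assumption, and the midpoint interpolation used here is only one of several weighted-percentile conventions. The function names are hypothetical.

```python
import numpy as np


def weighted_percentile(values, weights, q):
    """Percentile q (0-100) of values under positive weights."""
    values = np.asarray(values, dtype=float)
    weights = np.asarray(weights, dtype=float)
    order = np.argsort(values)
    values, weights = values[order], weights[order]
    # Midpoint cumulative weight share of each sorted observation.
    cum_share = (np.cumsum(weights) - 0.5 * weights) / weights.sum()
    return float(np.interp(q / 100.0, cum_share, values))


def spm_base_threshold(fcsuti_2a2c, weights=None, share=0.83):
    """share x average of the 47th and 53rd percentiles of pooled FCSUti."""
    if weights is None:
        weights = np.ones(len(fcsuti_2a2c))  # unweighted fallback
    p47 = weighted_percentile(fcsuti_2a2c, weights, 47)
    p53 = weighted_percentile(fcsuti_2a2c, weights, 53)
    return share * (p47 + p53) / 2


# Toy pooled sample: the values 1..101, equally weighted.
demo_threshold = spm_base_threshold(np.arange(1, 102))
print(f"Demo threshold: {demo_threshold:.2f}")
```

Averaging the 47th and 53rd percentiles instead of taking a single median smooths the estimate against small shifts in the middle of the distribution.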

In [None]:
# Simulated CE calculation to demonstrate methodology
# In practice, this would use actual CE PUMD files

np.random.seed(42)

# Generate synthetic FCSUti distribution (roughly calibrated to actual data)
# Real implementation would load CE Interview Survey MTBI files

def simulate_ce_thresholds(n_samples=10000):
    """Simulate CE threshold calculation methodology."""
    
    # Synthetic FCSUti distributions by tenure (in 2024 dollars)
    # Mean and std are illustrative, not fitted to CE microdata, so the
    # simulated thresholds only approximate the published values
    tenure_params = {
        "renter": {"mean": 47500, "std": 18000},
        "owner_with_mortgage": {"mean": 47000, "std": 17500},
        "owner_without_mortgage": {"mean": 39200, "std": 15000},
    }
    
    results = {}
    for tenure, params in tenure_params.items():
        # Lognormal parameters that reproduce the target mean and std
        # (expenditures are right-skewed)
        cv_sq = (params["std"] / params["mean"]) ** 2
        sigma = np.sqrt(np.log(1 + cv_sq))
        mu = np.log(params["mean"]) - 0.5 * sigma**2
        fcsuti = np.random.lognormal(mu, sigma, n_samples)
        
        # Calculate 47th-53rd percentile average (median range)
        p47 = np.percentile(fcsuti, 47)
        p53 = np.percentile(fcsuti, 53)
        median_range = (p47 + p53) / 2
        
        # Apply 83% factor
        threshold = 0.83 * median_range
        
        results[tenure] = {
            "median_range": median_range,
            "threshold": threshold,
            "p47": p47,
            "p53": p53,
        }
    
    return results

simulated = simulate_ce_thresholds()

print("Simulated CE Threshold Calculation (2024)")
print("=" * 70)
print(f"{'Tenure':<25} {'P47':>10} {'P53':>10} {'Median Range':>15} {'83% Threshold':>15}")
print("-" * 70)
for tenure, vals in simulated.items():
    print(f"{tenure:<25} ${vals['p47']:>9,.0f} ${vals['p53']:>9,.0f} ${vals['median_range']:>14,.0f} ${vals['threshold']:>14,.0f}")

## Validation Against BLS Published Values

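Our spm-calculator package returns the BLS published values directly, so this check confirms that the stored values match the published table; it does not test an independent calculation.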

In [None]:
try:
    from spm_calculator import get_published_thresholds
    
    print("Validation: spm-calculator vs BLS Published Values")
    print("=" * 60)
    
    for year in [2024, 2023, 2022]:
        calc_thresholds = get_published_thresholds(year)
        bls_thresholds = BLS_THRESHOLDS[year]
        
        print(f"\n{year}:")
        all_match = True
        for tenure in ["renter", "owner_with_mortgage", "owner_without_mortgage"]:
            calc = calc_thresholds[tenure]
            bls = bls_thresholds[tenure]
            match = "✓" if calc == bls else "✗"
            if calc != bls:
                all_match = False
            print(f"  {tenure:25s}: Calculator=${calc:,} vs BLS=${bls:,} {match}")
        
        if all_match:
            print(f"  All values match for {year}!")
            
except ImportError:
    print("spm-calculator package not installed.")
    print("Install with: pip install spm-calculator")

## Tenure Type Ratios

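The ratios between tenure types are stable over time. Across 2020-2024, the owner-with-mortgage threshold stays between 0.986 and 0.991 of the renter threshold, and the owner-without-mortgage threshold between 0.826 and 0.829.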

In [None]:
print("Tenure Type Ratios (relative to Renter)")
print("=" * 50)

for year in sorted(BLS_THRESHOLDS.keys(), reverse=True):
    renter = BLS_THRESHOLDS[year]["renter"]
    mortgage = BLS_THRESHOLDS[year]["owner_with_mortgage"]
    no_mortgage = BLS_THRESHOLDS[year]["owner_without_mortgage"]
    
    print(f"\n{year}:")
    print(f"  Owner w/ mortgage:  {mortgage/renter:.3f}")
    print(f"  Owner w/o mortgage: {no_mortgage/renter:.3f}")

## Summary

### Key Findings

1. **BLS Methodology Change (2021)**: The switch from the 33rd percentile to 83% of the median range (the average of the 47th and 53rd percentiles) changed where in the FCSUti distribution the thresholds are anchored

2. **Inflation Impact**: SPM thresholds increased ~37% from 2020 to 2024 for renters ($28,881 → $39,430)

3. **Tenure Ratios**: 
   - Owner with mortgage: ~99% of renter threshold
   - Owner without mortgage: ~83% of renter threshold

4. **spm-calculator Accuracy**: Our package uses BLS published values, ensuring an exact match with the official thresholds
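
The inflation figure in finding 2 can be checked with a few lines of arithmetic. The two renter thresholds are the published values used earlier in this notebook; the compound annual rate is an added illustration, not a BLS statistic.

```python
# Renter thresholds for the 2A2C reference family, as listed in this notebook.
renter_2020 = 28881
renter_2024 = 39430

# Cumulative change over the whole period.
total_change = (renter_2024 - renter_2020) / renter_2020

# Compound annual growth across the four year-to-year steps 2020 -> 2024.
annual_rate = (renter_2024 / renter_2020) ** (1 / 4) - 1

print(f"Cumulative change: {total_change:.1%}")
print(f"Average annual growth: {annual_rate:.1%}")
```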

### Future Work

- Calculate thresholds directly from CE PUMD files
- Validate against CPS ASEC SPM thresholds
- Extend geographic adjustment calculations