# Notebook 3: Survival Analysis with SSA Life Tables

**Learning Objectives:**
- Understand actuarial life tables and mortality rates
- Sample death ages from real SSA data
- Combine mortality uncertainty with market uncertainty

---

## The Second Source of Uncertainty

In retirement planning, we face two major unknowns:
1. **Market returns** (covered in Notebook 2)
2. **Longevity**: how long the person will live (this notebook)

The Social Security Administration publishes actuarial life tables with mortality probabilities by age and gender.

In [None]:
import numpy as np
import matplotlib.pyplot as plt
np.random.seed(42)

## SSA Life Table Data

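The key quantity is **q_x**: the probability of dying within one year, given survival to age x.

Source: SSA Period Life Table, 2021. A period table reflects the mortality rates of a single calendar year and does not project future improvements in mortality.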
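
Chaining the one-year probabilities gives multi-year survival. As a sketch in standard actuarial notation (the symbol ${}_t p_x$ is introduced here and is not used elsewhere in this notebook), the probability that a person aged $x$ is still alive $t$ years later is

$$
{}_t p_x = \prod_{k=0}^{t-1} \left(1 - q_{x+k}\right),
$$

and the probability of dying at age $x+t$, that is, between ages $x+t$ and $x+t+1$, is ${}_t p_x \, q_{x+t}$. For example, with the male values $q_{65} = 0.01519$ and $q_{66} = 0.01652$ from the table below, ${}_2 p_{65} = (1 - 0.01519)(1 - 0.01652) \approx 0.9685$.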

In [None]:
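# SSA 2021 Period Life Table - Death probabilities by age
# q_x = probability of dying within one year at age x
# Ages 65-100 are tabulated yearly; above 100 only selected ages are listed,
# and get_death_probability() linearly interpolates the ages in between.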
SSA_MALE_QX = {
    65: 0.01519, 66: 0.01652, 67: 0.01797, 68: 0.01957, 69: 0.02135,
    70: 0.02333, 71: 0.02554, 72: 0.02800, 73: 0.03073, 74: 0.03374,
    75: 0.03710, 76: 0.04085, 77: 0.04503, 78: 0.04969, 79: 0.05489,
    80: 0.06072, 81: 0.06726, 82: 0.07459, 83: 0.08280, 84: 0.09197,
    85: 0.10218, 86: 0.11351, 87: 0.12604, 88: 0.13983, 89: 0.15493,
    90: 0.17137, 91: 0.18916, 92: 0.20828, 93: 0.22867, 94: 0.25024,
    95: 0.27284, 96: 0.29628, 97: 0.32033, 98: 0.34472, 99: 0.36916,
    100: 0.39336, 105: 0.52717, 110: 0.65199, 115: 0.76009, 119: 1.0
}

SSA_FEMALE_QX = {
    65: 0.01035, 66: 0.01127, 67: 0.01228, 68: 0.01340, 69: 0.01466,
    70: 0.01609, 71: 0.01772, 72: 0.01958, 73: 0.02170, 74: 0.02411,
    75: 0.02687, 76: 0.03002, 77: 0.03361, 78: 0.03770, 79: 0.04234,
    80: 0.04762, 81: 0.05362, 82: 0.06042, 83: 0.06812, 84: 0.07681,
    85: 0.08660, 86: 0.09760, 87: 0.10991, 88: 0.12364, 89: 0.13889,
    90: 0.15574, 91: 0.17426, 92: 0.19449, 93: 0.21645, 94: 0.24012,
    95: 0.26541, 96: 0.29218, 97: 0.32022, 98: 0.34925, 99: 0.37893,
    100: 0.40889, 105: 0.55000, 110: 0.68000, 115: 0.80000, 119: 1.0
}

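def get_death_probability(age, gender='M'):
    """Return q_x, the probability of dying within one year at the given age.

    gender 'M' selects the male table; any other value selects the female table.
    """
    table = SSA_MALE_QX if gender == 'M' else SSA_FEMALE_QX
    if age in table:
        return table[age]
    # Linearly interpolate between the two nearest tabulated ages
    ages = sorted(table)
    for lo, hi in zip(ages[:-1], ages[1:]):
        if lo < age < hi:
            t = (age - lo) / (hi - lo)
            return table[lo] * (1 - t) + table[hi] * t
    # Outside the table: certain death at 119 or older; otherwise fall back to
    # the youngest tabulated age (q_x below 65 is not modeled here)
    return 1.0 if age >= ages[-1] else table[ages[0]]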

# Show mortality rates
ages = range(65, 100)
male_qx = [get_death_probability(a, 'M') for a in ages]
female_qx = [get_death_probability(a, 'F') for a in ages]

plt.figure(figsize=(10, 5))
plt.plot(ages, male_qx, 'b-', label='Male')
plt.plot(ages, female_qx, 'r-', label='Female')
plt.xlabel('Age')
plt.ylabel('Probability of Death Within 1 Year')
plt.title('SSA Mortality Rates by Age')
plt.legend()
plt.grid(alpha=0.3)
plt.show()
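
The linear interpolation inside `get_death_probability` can also be expressed with `np.interp`, which accepts whole arrays of ages at once. The following is a minimal sketch, not a drop-in replacement: `interp_qx` and `QX_EXCERPT` are hypothetical names, and the excerpt copies only four male values from the table above. Note that `np.interp` clamps to the first and last tabulated values outside the table's range, whereas `get_death_probability` returns 1.0 at age 119 and above.

```python
import numpy as np

# Excerpt of the male q_x table above (illustration only)
QX_EXCERPT = {99: 0.36916, 100: 0.39336, 105: 0.52717, 110: 0.65199}

def interp_qx(age, table):
    """Linearly interpolate q_x between tabulated ages (hypothetical helper)."""
    known_ages = sorted(table)
    known_qx = [table[a] for a in known_ages]
    # np.interp returns the end values for ages outside the tabulated range
    return np.interp(age, known_ages, known_qx)

# Age 102 lies 2/5 of the way from age 100 to age 105
print(interp_qx(102, QX_EXCERPT))
# Array input returns one interpolated value per age
print(interp_qx([100, 101, 102], QX_EXCERPT))
```

Precomputing an array of interpolated values once would avoid repeating the dictionary search on every call, which matters when the function is called inside a simulation loop.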

In [None]:
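def sample_death_ages(current_age, n_samples, gender='M', max_age=119, seed=None):
    """Sample ages at death by simulating survival one year at a time.

    Each year the person dies with probability q_x; anyone still alive at
    max_age is assigned max_age. Passing seed reseeds NumPy's global
    random state.
    """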
    if seed is not None:
        np.random.seed(seed)
    death_ages = np.zeros(n_samples, dtype=int)
    for i in range(n_samples):
        age = current_age
        while age < max_age:
            if np.random.random() < get_death_probability(age, gender):
                death_ages[i] = age
                break
            age += 1
        else:
            death_ages[i] = max_age
    return death_ages

# Sample 10,000 death ages for a 65-year-old male
death_ages = sample_death_ages(65, 10000, 'M', seed=42)
print(f"Sampled {len(death_ages)} death ages")
print(f"Mean: {np.mean(death_ages):.1f}, Median: {np.median(death_ages):.0f}")
print(f"Range: {death_ages.min()} to {death_ages.max()}")
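
The year-by-year loop above is easy to follow but slow for large sample counts. One alternative, sketched here under stated assumptions, is to compute the full distribution of death ages from the q_x values and draw from it in a single call. `sample_death_ages_vectorized` is a hypothetical helper, `toy_qx` holds made-up values (not SSA data), and the sketch uses NumPy's `Generator` API rather than the global seed used elsewhere in this notebook.

```python
import numpy as np

def sample_death_ages_vectorized(qx, start_age, n_samples, rng):
    """Draw death ages from the death-age distribution implied by qx.

    qx[k] is the one-year death probability at age start_age + k. The last
    entry should be 1.0 so that the probabilities sum to one.
    """
    qx = np.asarray(qx, dtype=float)
    # Probability of being alive at the start of each year: 1, p0, p0*p1, ...
    alive = np.concatenate(([1.0], np.cumprod(1.0 - qx[:-1])))
    pmf = alive * qx           # P(death occurs at age start_age + k)
    pmf = pmf / pmf.sum()      # guard against floating-point drift
    ages = start_age + np.arange(len(qx))
    return rng.choice(ages, size=n_samples, p=pmf)

# Toy q_x values for ages 65-68 (NOT SSA data)
toy_qx = [0.1, 0.2, 0.5, 1.0]
rng = np.random.default_rng(42)
samples = sample_death_ages_vectorized(toy_qx, 65, 100_000, rng)

# Observed frequencies should be near the exact values 0.10, 0.18, 0.36, 0.36
print(np.bincount(samples - 65, minlength=4) / len(samples))
```

To apply this to the SSA tables, one could build `qx` as `[get_death_probability(a, gender) for a in range(current_age, 120)]`. The results will not match the loop version draw for draw, because the random numbers are consumed differently.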

In [None]:
# Visualize death age distribution
plt.figure(figsize=(12, 5))
# Bin edges run to 120 so that ages 118 and 119 each get their own bin
plt.hist(death_ages, bins=range(65, 121), edgecolor='black', alpha=0.7)
plt.axvline(np.mean(death_ages), color='red', linestyle='--', label=f'Mean: {np.mean(death_ages):.1f}')
plt.axvline(np.median(death_ages), color='green', linestyle='--', label=f'Median: {np.median(death_ages):.0f}')
plt.xlabel('Age at Death')
plt.ylabel('Frequency')
plt.title('Distribution of Death Ages (65-year-old Male)')
plt.legend()
plt.grid(alpha=0.3)
plt.show()

In [None]:
# Survival probabilities
print("Survival Probabilities from Age 65")
for target in [75, 80, 85, 90, 95, 100]:
    prob = np.mean(death_ages >= target)
    print(f"  P(survive to {target}): {prob*100:>5.1f}%")
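
The survival probabilities above are Monte Carlo estimates, so they carry sampling noise. As a cross-check, the same quantity can be computed exactly by multiplying one-year survival probabilities, and the noise can be gauged with the binomial standard error. This is a minimal sketch: `exact_survival` and `monte_carlo_se` are hypothetical helpers, and `qx_excerpt` copies the male values for ages 65 to 74 from the table above.

```python
import numpy as np

def exact_survival(qx_by_age, start_age, target_age):
    """P(alive at target_age | alive at start_age) from one-year q_x values."""
    return float(np.prod([1.0 - qx_by_age[a] for a in range(start_age, target_age)]))

def monte_carlo_se(p_hat, n_samples):
    """Binomial standard error of a Monte Carlo proportion estimate."""
    return float(np.sqrt(p_hat * (1.0 - p_hat) / n_samples))

# Male q_x values for ages 65-74, copied from the table above
qx_excerpt = {
    65: 0.01519, 66: 0.01652, 67: 0.01797, 68: 0.01957, 69: 0.02135,
    70: 0.02333, 71: 0.02554, 72: 0.02800, 73: 0.03073, 74: 0.03374,
}

p75 = exact_survival(qx_excerpt, 65, 75)
se = monte_carlo_se(p75, 10_000)
print(f"Exact P(survive to 75): {p75:.4f}")
print(f"Monte Carlo standard error at n=10,000: {se:.4f}")
```

A sampled estimate within about two standard errors of the exact value is consistent with the sampler working as intended.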

In [None]:
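# Compare male vs female (same seed, so both runs use the same random draws)
# Death ages are whole years (age at last birthday), so the means run
# roughly half a year below the expected exact age at death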
male_ages = sample_death_ages(65, 10000, 'M', seed=42)
female_ages = sample_death_ages(65, 10000, 'F', seed=42)

print(f"Male mean lifespan:   {np.mean(male_ages):.1f} years")
print(f"Female mean lifespan: {np.mean(female_ages):.1f} years")
print(f"Difference: {np.mean(female_ages) - np.mean(male_ages):.1f} years")
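
The third learning objective is to combine mortality uncertainty with market uncertainty. One possible way to do that, sketched here under strong simplifying assumptions, is to run a portfolio simulation for each sampled death age and count how often the money runs out first. Everything in this block is hypothetical: the function name, the balance, spending, and return parameters, and the choice of independent normal annual returns, which may differ from the return model used in Notebook 2.

```python
import numpy as np

def simulate_ruin_probability(death_ages, start_age, initial_balance,
                              annual_spending, mean_return, return_std, rng):
    """Estimate the share of simulated lives that run out of money before death.

    Returns are drawn as independent normal annual returns (an assumption).
    """
    ruined = 0
    for death_age in death_ages:
        balance = initial_balance
        # One withdrawal per year lived, including the year of death
        for _ in range(int(death_age) - start_age + 1):
            balance -= annual_spending
            if balance < 0:
                ruined += 1
                break
            balance *= 1.0 + rng.normal(mean_return, return_std)
    return ruined / len(death_ages)

rng = np.random.default_rng(42)
# Stand-in death ages; in this notebook you would pass male_ages or female_ages
placeholder_death_ages = rng.integers(70, 100, size=2_000)
p_ruin = simulate_ruin_probability(
    placeholder_death_ages, start_age=65, initial_balance=1_000_000,
    annual_spending=50_000, mean_return=0.05, return_std=0.12, rng=rng)
print(f"Estimated probability of running out of money: {p_ruin:.1%}")
```

Because each simulated life has its own horizon, long lifespans and poor return sequences compound: the same portfolio can be safe for someone who dies at 75 and inadequate for someone who lives to 100.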

## Summary

- SSA life tables give us mortality probabilities by age
- We can sample death ages by simulating year-by-year survival
- Women outlive men on average; the male vs. female comparison above prints the simulated gap for 65-year-olds
- There's significant uncertainty: some die at 70, others at 100+

**Next: Notebook 4 - Tax Strategies** (RMDs and the step-up in basis)