# ECON 0150 | Replication Notebook

**Title:** CPI and S&P 500

**Original Authors:** Stokes

**Original Date:** Fall 2024

---

This notebook replicates the analysis from a student final project in ECON 0150: Economic Data Analysis.

## About This Replication

**Research Question:** Is CPI inflation correlated with the S&P 500? The analysis below uses the index level rather than returns.

**Data Source:** FRED: CPI percent change (`CPIAUCSL_PC1`) and the S&P 500 index (`SP500`), monthly data

**Methods:** OLS regression of the S&P 500 level on the CPI inflation rate, with a quadratic specification for comparison

**Main Finding:** Significant negative relationship: higher CPI inflation is associated with lower S&P 500 levels (coef = -166.28, p < 0.001, R² = 0.29).

**Course Concepts Used:**
- Simple and polynomial regression
- Time series visualization
- Dual-axis plotting
- Hypothesis testing

---
## Step 0 | Setup

In [None]:
# Imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.formula.api as smf

In [None]:
# Load data from course website
base_url = 'https://tayweid.github.io/econ-0150/projects/replications/0041/data/'

# Load CPI and S&P 500 data
cpi_data = pd.read_csv(base_url + 'CPIAUCSL_PC1.csv')
sp500_data = pd.read_csv(base_url + 'SP500.csv')

print(f"CPI data shape: {cpi_data.shape}")
print(f"S&P 500 data shape: {sp500_data.shape}")

---
## Step 1 | Data Preparation

In [None]:
# Convert dates and merge
cpi_data['observation_date'] = pd.to_datetime(cpi_data['observation_date'])
sp500_data['observation_date'] = pd.to_datetime(sp500_data['observation_date'])

# Merge datasets
data = pd.merge(cpi_data, sp500_data, on='observation_date', how='inner')

# Add squared term for polynomial regression
data['CPIAUCSL_PC1_sq'] = data['CPIAUCSL_PC1']**2

print(f"Merged data: {len(data)} observations")
data.head()

---
## Step 2 | Data Exploration

In [None]:
# Summary statistics
print("Summary Statistics:")
print(data[['CPIAUCSL_PC1', 'SP500']].describe())

In [None]:
# Correlation
correlation = data['CPIAUCSL_PC1'].corr(data['SP500'])
print(f"Correlation between CPI inflation and S&P 500: {correlation:.3f}")

In [None]:
# Histograms
fig, axes = plt.subplots(1, 2, figsize=(12, 4))

sns.histplot(data['SP500'].dropna(), kde=True, ax=axes[0])
axes[0].set_title('Distribution of S&P 500')
axes[0].set_xlabel('S&P 500 Value')

sns.histplot(data['CPIAUCSL_PC1'].dropna(), kde=True, ax=axes[1])
axes[1].set_title('Distribution of CPI Inflation')
axes[1].set_xlabel('CPI Inflation (%)')

plt.tight_layout()
plt.show()

---
## Step 3 | Visualization

In [None]:
# Time series: CPI and S&P 500
fig, ax1 = plt.subplots(figsize=(12, 6))

ax1.plot(data['observation_date'], data['CPIAUCSL_PC1'], color='blue', label='CPI Inflation')
ax1.set_xlabel('Date')
ax1.set_ylabel('CPI Inflation (%)', color='blue')
ax1.tick_params(axis='y', labelcolor='blue')
ax1.set_title('CPI Inflation and S&P 500 Performance Over Time')
ax1.grid(True, alpha=0.3)

ax2 = ax1.twinx()
ax2.plot(data['observation_date'], data['SP500'], color='red', label='S&P 500')
ax2.set_ylabel('S&P 500 Value', color='red')
ax2.tick_params(axis='y', labelcolor='red')

lines1, labels1 = ax1.get_legend_handles_labels()
lines2, labels2 = ax2.get_legend_handles_labels()
ax2.legend(lines1 + lines2, labels1 + labels2, loc='upper left')

plt.tight_layout()
plt.show()

---
## Step 4 | Statistical Analysis

In [None]:
# Linear regression
# H0: B1 = 0 (No relationship between CPI and S&P 500)
# Model: SP500 = B0 + B1 * CPI + e

model_linear = smf.ols('SP500 ~ CPIAUCSL_PC1', data=data).fit()
print("Linear Model Summary:")
print(model_linear.summary())

In [None]:
# Polynomial regression (quadratic)
model_poly = smf.ols('SP500 ~ CPIAUCSL_PC1 + CPIAUCSL_PC1_sq', data=data).fit()
print("\nPolynomial (Quadratic) Model Summary:")
print(model_poly.summary())

In [None]:
# Scatter plot with regression line
data['linear_predicted'] = model_linear.predict(data)

plt.figure(figsize=(10, 6))
sns.scatterplot(x='CPIAUCSL_PC1', y='SP500', data=data, label='Actual S&P500', alpha=0.6)
sns.lineplot(x='CPIAUCSL_PC1', y='linear_predicted', data=data, color='red', label='Regression Line')
plt.xlabel('CPI Inflation (%)')
plt.ylabel('S&P 500 Value')
plt.title('S&P 500 vs. CPI Inflation')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()

In [None]:
# Key results
print("\n" + "="*50)
print("KEY RESULTS")
print("="*50)
print(f"\nNull Hypothesis: No relationship between CPI and S&P 500 (beta = 0)")
print(f"\nLinear Model:")
print(f"  Intercept: {model_linear.params['Intercept']:.2f}")
print(f"  CPI coefficient: {model_linear.params['CPIAUCSL_PC1']:.2f}")
print(f"  R-squared: {model_linear.rsquared:.3f}")
# .3g keeps a very small p-value readable instead of rounding it to 0.0000
print(f"  P-value: {model_linear.pvalues['CPIAUCSL_PC1']:.3g}")
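# Report the direction from the sign of the estimate rather than hard-coding it
coef = model_linear.params['CPIAUCSL_PC1']
direction = 'decrease' if coef < 0 else 'increase'
print("\nInterpretation:")
print("  Each 1 percentage point increase in CPI inflation is associated with")
print(f"  a {abs(coef):.0f} point {direction} in the S&P 500")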

---
## Step 5 | Results Interpretation

### Key Findings

| Model | Variable | Coefficient | P-value | R² |
|-------|----------|-------------|---------|----|
| Linear | CPI | -166.28 | < 0.001 | 0.29 |
| Quadratic | CPI | -517.31 | 0.081 | 0.32 |
| Quadratic | CPI² | 32.62 | 0.225 | - |

1. **Negative Relationship:** Higher inflation is associated with lower stock market levels

2. **Moderate Fit:** CPI inflation explains about 29% of the variation in the S&P 500 level (R² = 0.29)

3. **Quadratic Not Better:** The squared term is not significant (p = 0.225), and adding it raises R² only from 0.29 to 0.32
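
One way to make the linear-versus-quadratic comparison formal is an F-test on the nested models. The sketch below is illustrative only: it uses simulated data (not the FRED series), and the names `sim`, `x`, and `y` are placeholders. It relies on `compare_f_test` from statsmodels, which tests whether the extra term improves the fit.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated stand-in data (NOT the FRED series): y is truly linear in x.
rng = np.random.default_rng(1)
x = rng.uniform(0, 8, size=120)
sim = pd.DataFrame({"x": x, "y": 100 - 5 * x + rng.normal(scale=10, size=120)})

# Restricted (linear) and unrestricted (quadratic) models.
linear = smf.ols("y ~ x", data=sim).fit()
quadratic = smf.ols("y ~ x + I(x**2)", data=sim).fit()

# F-test of the restriction "coefficient on the squared term = 0".
f_stat, p_value, df_diff = quadratic.compare_f_test(linear)

print(f"R-squared, linear:    {linear.rsquared:.3f}")
print(f"R-squared, quadratic: {quadratic.rsquared:.3f}")
print(f"F = {f_stat:.3f}, p = {p_value:.3f}, extra terms = {int(df_diff)}")
```

R² can never fall when a regressor is added, so a small increase by itself is not evidence for the quadratic model. With a single added term, this F-test gives the same p-value as the t-test on the squared term.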

### Economic Interpretation

Why might higher inflation hurt stock prices?
- **Higher interest rates:** The Fed may raise rates to fight inflation, which makes bonds more attractive relative to stocks
- **Cost pressures:** Rising input costs can squeeze company profits
- **Consumer spending:** Reduced purchasing power can lower demand
- **Uncertainty:** Inflation makes future costs, prices, and policy harder to predict

### Cautions

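- **Time series issues:** The Durbin-Watson statistic (0.211) is far below 2, the value expected with no autocorrelation. This indicates strong positive autocorrelation in the residuals, so the reported standard errors and p-values are unreliable
- **Spurious correlation:** Series that trend over time, as the S&P 500 level does, can appear strongly related in a levels regression even when they are not
- **Not causal:** The regression shows an association only; many factors affect both CPI and the S&P 500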
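
The cautions above can be illustrated with a short simulation. This is a sketch on made-up data, not a re-analysis of the FRED series: it regresses one random walk on another, independent one, then repeats the regression in first differences. The names `sim`, `x`, and `y` are placeholders, and the HAC lag length of 12 is an arbitrary assumption.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.stattools import durbin_watson

# Simulated stand-in data (NOT the FRED series): two independent random walks.
rng = np.random.default_rng(0)
n = 300
sim = pd.DataFrame({
    "y": np.cumsum(rng.normal(size=n)),
    "x": np.cumsum(rng.normal(size=n)),
})

# Regression in levels: the residuals inherit the persistence of the series.
levels = smf.ols("y ~ x", data=sim).fit()
dw_levels = durbin_watson(levels.resid)

# Same regression in first differences.
diffs = sim.diff().dropna()
changes = smf.ols("y ~ x", data=diffs).fit()
dw_changes = durbin_watson(changes.resid)

# HAC (Newey-West) standard errors: same coefficients, different standard errors.
levels_hac = smf.ols("y ~ x", data=sim).fit(cov_type="HAC", cov_kwds={"maxlags": 12})

print(f"Durbin-Watson, levels:      {dw_levels:.3f}")
print(f"Durbin-Watson, differences: {dw_changes:.3f}")
print(f"Slope p-value, levels (default SEs): {levels.pvalues['x']:.3g}")
print(f"Slope p-value, levels (HAC SEs):     {levels_hac.pvalues['x']:.3g}")
print(f"Slope p-value, differences:          {changes.pvalues['x']:.3g}")
```

In simulations like this one, the levels regression typically produces a Durbin-Watson statistic near 0, while the differenced regression produces one near 2. HAC standard errors adjust inference for autocorrelation, but they do not fix a spurious relationship between trending series.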

---
## Replication Exercises

### Exercise 1: First Differences
Use changes in CPI and S&P 500 instead of levels. Does the relationship persist?
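
The exercise does not say whether "changes" means point changes or percent returns. The sketch below shows both on made-up index values (not real S&P 500 data); the column names are placeholders.

```python
import pandas as pd

# Made-up index values (NOT real S&P 500 data), chosen so the arithmetic is easy to check.
toy = pd.DataFrame({"sp500": [100.0, 110.0, 99.0]})

# Point change: this month's level minus last month's level.
toy["sp500_change"] = toy["sp500"].diff()

# Percent return: point change relative to last month's level.
toy["sp500_return_pct"] = toy["sp500"].pct_change() * 100

print(toy.round(2))
```

The first row of each new column is missing because there is no earlier month to compare against. Since the CPI series is already a percent change, calling `.diff()` on it gives the change in the inflation rate, in percentage points.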

### Exercise 2: Lagged Effects
Does last month's CPI predict this month's S&P 500 change?
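
A lagged regressor can be built with `shift(1)`, which moves each value down one row so that last month's CPI lines up with this month's S&P 500 change. The sketch below uses a tiny made-up table (not real FRED values), and the column names are placeholders.

```python
import pandas as pd

# Tiny made-up monthly series (NOT real FRED values), only to illustrate shift().
toy = pd.DataFrame({
    "observation_date": pd.date_range("2020-01-01", periods=5, freq="MS"),
    "cpi": [2.0, 2.5, 3.0, 2.8, 3.2],
    "sp500": [100.0, 102.0, 101.0, 105.0, 104.0],
})

# This month's change in the index.
toy["sp500_change"] = toy["sp500"].diff()

# Last month's CPI value, aligned with this month's row.
toy["cpi_lag1"] = toy["cpi"].shift(1)

print(toy)
```

Rows must be sorted by date before shifting. On the real data, a formula such as `'sp500_change ~ cpi_lag1'` could then be passed to `smf.ols` after dropping the rows with missing values.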

### Exercise 3: Subperiods
Has the relationship changed over time? Compare different decades.
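
One possible approach is to label each observation with its decade and fit the same regression within each group. The sketch below runs on simulated data (not the FRED series); `sim`, `x`, `y`, and `decade` are placeholder names.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Simulated stand-in data (NOT the FRED series): 20 years of monthly observations.
rng = np.random.default_rng(2)
dates = pd.date_range("2000-01-01", "2019-12-01", freq="MS")
sim = pd.DataFrame({"observation_date": dates, "x": rng.normal(2, 1, len(dates))})
sim["y"] = 50 - 3 * sim["x"] + rng.normal(scale=5, size=len(dates))

# Label each row with its decade, e.g. 2007 -> 2000.
sim["decade"] = (sim["observation_date"].dt.year // 10) * 10

# Fit the same simple regression separately within each decade.
rows = []
for decade, group in sim.groupby("decade"):
    fit = smf.ols("y ~ x", data=group).fit()
    rows.append({
        "decade": decade,
        "n": int(fit.nobs),
        "slope": fit.params["x"],
        "p_value": fit.pvalues["x"],
    })

by_decade = pd.DataFrame(rows)
print(by_decade)
```

Comparing the slopes across rows gives a first look at whether the relationship is stable. Each subperiod has fewer observations, so its estimate is noisier than the full-sample estimate.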

### Challenge Exercise
Research the Fisher Effect and stock returns. What does theory predict about inflation and nominal returns?
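
As a starting point, the Fisher equation relates the nominal rate $i$, the real rate $r$, and expected inflation $\pi^e$. The symbols are notation chosen here, not taken from the original project.

$$
1 + i = (1 + r)(1 + \pi^e) \quad\Longrightarrow\quad i \approx r + \pi^e
$$

The approximation drops the cross term $r\,\pi^e$, which is small when both rates are low. Applied to stocks, this logic would suggest that nominal returns rise roughly one-for-one with expected inflation if real returns are unaffected. Whether the data support that is the question the exercise asks you to research.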

In [None]:
# Your code for exercises

# Example: First differences
# data['SP500_change'] = data['SP500'].diff()
# data['CPI_change'] = data['CPIAUCSL_PC1'].diff()
# model_diff = smf.ols('SP500_change ~ CPI_change', data=data.dropna()).fit()
# print(model_diff.summary().tables[1])