# ECON 0150 | Replication Notebook

**Title:** Minimum Wage and Unemployment

**Original Authors:** Ortega

**Original Date:** Fall 2024

---

This notebook replicates the analysis from a student final project in ECON 0150: Economic Data Analysis.

## About This Replication

**Research Question:** What is the relationship between minimum wage and the unemployment rate across U.S. states?

**Data Source:** State-level minimum wage and unemployment rate data (2024)

**Methods:** OLS regression of unemployment rate on minimum wage

**Main Finding:** Positive correlation between minimum wage and unemployment (r = 0.35), but the relationship requires careful interpretation given confounding factors.

**Course Concepts Used:**
- Simple linear regression
- Cross-state comparison
- Scatter plots with regression lines
- Hypothesis testing
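
The regression and hypothesis test listed above can be written compactly. The notation here is an assumption for illustration ($i$ indexes states and $\varepsilon_i$ is the error term); the variable names match the ones used in the code below.

$$
\text{UnempRate}_i = \beta_0 + \beta_1\,\text{MinWage}_i + \varepsilon_i,
\qquad H_0: \beta_1 = 0 \quad \text{vs.} \quad H_1: \beta_1 \neq 0
$$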

---
## Step 0 | Setup

In [None]:
# Imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.formula.api as smf

In [None]:
# Load data from course website
base_url = 'https://tayweid.github.io/econ-0150/projects/replications/0034/data/'

df = pd.read_csv(base_url + 'cleaned_data.csv')

print(f"Number of states: {len(df)}")
df.head()

---
## Step 1 | Data Preparation

In [None]:
# Check column names and clean if needed
print("Columns:", df.columns.tolist())

# Rename columns for consistency
data = df.rename(columns={
    'MinimumWage': 'MinWage',
    'UnemploymentRate': 'UnempRate'
}).copy()

# Ensure numeric types
data['MinWage'] = pd.to_numeric(data['MinWage'], errors='coerce')
data['UnempRate'] = pd.to_numeric(data['UnempRate'], errors='coerce')

# Drop missing values
data = data.dropna(subset=['MinWage', 'UnempRate'])

print(f"\nClean data: {len(data)} states")

---
## Step 2 | Data Exploration

In [None]:
# Summary statistics
print("Summary Statistics:")
print(data[['MinWage', 'UnempRate']].describe())

In [None]:
# Distributions of minimum wage and unemployment rate
fig, axes = plt.subplots(1, 2, figsize=(12, 4))

axes[0].hist(data['MinWage'], bins=10, edgecolor='black', color='teal')
axes[0].set_xlabel('Minimum Wage ($)')
axes[0].set_ylabel('Number of States')
axes[0].set_title('Distribution of State Minimum Wages')

axes[1].hist(data['UnempRate'], bins=10, edgecolor='black', color='purple')
axes[1].set_xlabel('Unemployment Rate (%)')
axes[1].set_ylabel('Number of States')
axes[1].set_title('Distribution of Unemployment Rates')

plt.tight_layout()
plt.show()

In [None]:
# Correlation
correlation = data['MinWage'].corr(data['UnempRate'])
print(f"Correlation between minimum wage and unemployment: {correlation:.3f}")

---
## Step 3 | Visualization

In [None]:
# Scatter plot of the raw data (no fitted line)
plt.figure(figsize=(10, 6))
plt.scatter(data['MinWage'], data['UnempRate'], color='navy', alpha=0.7)
plt.xlabel('Minimum Wage ($)')
plt.ylabel('Unemployment Rate (%)')
plt.title('Minimum Wage vs Unemployment Rate by State')
plt.grid(True, alpha=0.3)
plt.show()

In [None]:
# Regression plot with confidence interval
plt.figure(figsize=(10, 6))
sns.regplot(data=data, x='MinWage', y='UnempRate', ci=95, 
            scatter_kws={'alpha': 0.7}, line_kws={'color': 'red'})
plt.xlabel('Minimum Wage ($)')
plt.ylabel('Unemployment Rate (%)')
plt.title('Minimum Wage vs Unemployment Rate with Regression Line')
plt.grid(True, alpha=0.3)
plt.show()

---
## Step 4 | Statistical Analysis

In [None]:
# OLS Regression
model = smf.ols('UnempRate ~ MinWage', data=data).fit()
print(model.summary())

In [None]:
# Key results
print("\n" + "="*50)
print("KEY RESULTS")
print("="*50)
print("Null Hypothesis: No linear relationship between minimum wage and unemployment (beta = 0)")
print("Alternative: A nonzero linear relationship between minimum wage and unemployment (beta != 0)")
print(f"\nIntercept: {model.params['Intercept']:.3f}")
print(f"MinWage coefficient: {model.params['MinWage']:.4f}")
print(f"\nInterpretation:")
print(f"  Each $1 increase in minimum wage is associated with")
print(f"  a {model.params['MinWage']:.3f} percentage point change in unemployment")
print(f"\nR-squared: {model.rsquared:.3f}")
print(f"P-value: {model.pvalues['MinWage']:.4f}")

In [None]:
# List the states with the highest and lowest unemployment rates
print("\nStates with Highest Unemployment:")
print(data.nlargest(5, 'UnempRate')[['State', 'MinWage', 'UnempRate']])

print("\nStates with Lowest Unemployment:")
print(data.nsmallest(5, 'UnempRate')[['State', 'MinWage', 'UnempRate']])

---
## Step 5 | Results Interpretation

### Key Findings

| Metric | Value |
|--------|-------|
| Correlation | 0.35 |
| R-squared | ~0.12 |
| P-value | See the Step 4 regression output |
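
With a single predictor, the R-squared of the regression equals the squared correlation, so the first two rows of the table agree with each other:

$$
R^2 = r^2 = 0.35^2 = 0.1225 \approx 0.12
$$

The test statistic for the slope depends on the same correlation and on the sample size $n$ (the number of states left after cleaning), which is why the p-value should be read from the regression output rather than from the correlation alone:

$$
t = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}, \qquad \text{with } n-2 \text{ degrees of freedom}
$$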

### Interpretation

1. **Positive Correlation:** States with higher minimum wages tend to have slightly higher unemployment rates

2. **Weak Relationship:** The R² is low, meaning minimum wage explains only a small portion of variation in unemployment

3. **Many Confounders:** This cross-sectional relationship is difficult to interpret causally

### Why This Doesn't Show Causation

**Omitted Variables:**
- States with high minimum wages (CA, NY) also have high cost of living
- Economic conditions vary across states
- Industry composition differs
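
The omitted-variable concern above can be illustrated with a small simulation. This is a minimal sketch on synthetic data, not the project's dataset: the `CostOfLiving` variable, the sample size, and every coefficient are assumptions chosen for illustration. By construction, minimum wage has no direct effect on unemployment here, yet the regression without the control still finds a positive slope.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 5000  # large synthetic sample so the pattern is easy to see

# Hypothetical confounder: cost of living pushes both variables up
cost = rng.normal(0, 1, n)
min_wage = 10 + 2.0 * cost + rng.normal(0, 1, n)
# MinWage does NOT enter this equation: its true effect is zero
unemp = 4 + 0.5 * cost + rng.normal(0, 1, n)

sim = pd.DataFrame({
    'MinWage': min_wage,
    'UnempRate': unemp,
    'CostOfLiving': cost,
})

# Same specification as Step 4, then the same model with the control added
naive = smf.ols('UnempRate ~ MinWage', data=sim).fit()
controlled = smf.ols('UnempRate ~ MinWage + CostOfLiving', data=sim).fit()

print(f"MinWage coefficient without control: {naive.params['MinWage']:.3f}")
print(f"MinWage coefficient with control:    {controlled.params['MinWage']:.3f}")
```

The slope without the control picks up the effect of the omitted variable; once the control is included, the estimate moves toward the true value of zero. The same comparison is what Exercise 2 asks for with real state-level controls.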

**Reverse Causality:**
- States may raise minimum wage in response to strong economies

**The Economics Literature:**
- Extensive research on minimum wage effects
- Results are mixed and depend on methodology
- Cross-sectional comparisons like this are not the best design

---
## Replication Exercises

### Exercise 1: Regional Analysis
Does the relationship differ by Census region? Add a regional indicator.
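
One possible starting point is sketched below. It runs on synthetic data because the course dataset may not include a region column; the `Region` column and the simulated values are assumptions, so a real version would first need to map each state to its Census region.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
regions = ['Northeast', 'Midwest', 'South', 'West']

# Synthetic stand-in for the state data, with a hypothetical Region column
demo = pd.DataFrame({
    'Region': rng.choice(regions, size=200),
    'MinWage': rng.uniform(7.25, 16.0, size=200),
})
demo['UnempRate'] = 3.5 + 0.05 * demo['MinWage'] + rng.normal(0, 0.8, size=200)

# C(Region) adds region indicators; the * also adds MinWage-by-region
# interactions, so each region gets its own intercept and slope
regional = smf.ols('UnempRate ~ MinWage * C(Region)', data=demo).fit()
print(regional.params)
```

The interaction coefficients measure how each region's slope differs from the slope of the baseline region, which the formula interface picks as the first category in alphabetical order.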

### Exercise 2: Controls
Add state-level GDP or cost of living as controls. Does the minimum wage coefficient change?

### Exercise 3: Federal Minimum Wage States
Compare states at the federal minimum ($7.25) to states with higher minimums.
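
A comparison of group means can also be run as a regression on an indicator variable, which gives a p-value for the difference. The sketch below uses synthetic data; the `FederalMin` column name and the simulated values are assumptions for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)

# Synthetic stand-in: 60 "states" with a handful of minimum wage levels
demo = pd.DataFrame({'MinWage': rng.choice([7.25, 10.0, 12.0, 15.0], size=60)})
demo['UnempRate'] = 3.8 + rng.normal(0, 0.7, size=60)

# 1 if the state is at the federal minimum, 0 if its minimum is higher
demo['FederalMin'] = (demo['MinWage'] <= 7.25).astype(int)

# Difference in average unemployment between the two groups
group_means = demo.groupby('FederalMin')['UnempRate'].mean()
diff_in_means = group_means.loc[1] - group_means.loc[0]

# Regressing on the indicator recovers the same difference as its coefficient
dummy_model = smf.ols('UnempRate ~ FederalMin', data=demo).fit()

print(f"Difference in means:    {diff_in_means:.3f}")
print(f"FederalMin coefficient: {dummy_model.params['FederalMin']:.3f}")
print(f"P-value:                {dummy_model.pvalues['FederalMin']:.3f}")
```

The coefficient on the indicator equals the difference in group means, so the regression adds a standard error and hypothesis test to the simple `groupby` comparison.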

### Challenge Exercise
Research the Card-Krueger debate on minimum wage. What methodological improvements do economists use to identify causal effects?

In [None]:
# Your code for exercises

# Example: Compare federal minimum wage states to others
# data['FederalMin'] = data['MinWage'] <= 7.25
# print(data.groupby('FederalMin')['UnempRate'].mean())