# ECON 0150 | Replication Notebook

**Title:** Beach Replenishment and Home Values

**Original Authors:** Student Author

**Original Date:** Fall 2024

---

This notebook replicates the analysis from a student final project in ECON 0150: Economic Data Analysis.

## About This Replication

**Research Question:** Does beach replenishment spending affect home values in New Jersey coastal towns?

**Data Source:** Army Corps of Engineers beach restoration spending data (2000-2024); Zillow home value data (2000-2025)

**Methods:** OLS regression: Percentage_Increase ~ Spending_Per_Mile (rescaled to millions of dollars in the code as Spending_Per_Mile_Millions)

**Main Finding:** Positive but not statistically significant relationship between beach restoration spending and home price appreciation.

**Course Concepts Used:**
- Data merging and matching
- Simple linear regression
- Scatter plots with regression lines
- Inflation adjustment
- Geographic data analysis

---
## Step 0 | Setup

In [None]:
# Imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.formula.api as smf

In [None]:
# Load processed data from course website
# Note: The original analysis used a 90 MB Zillow file; this is an aggregated version
base_url = 'https://tayweid.github.io/econ-0150/projects/replications/0058/data/'

data = pd.read_csv(base_url + 'nj_beach_towns.csv')

print(f"Number of NJ beach towns: {len(data)}")
data.head()

---
## Step 1 | Data Preparation

In [None]:
# Check columns
print("Columns:", data.columns.tolist())
print(f"\nData shape: {data.shape}")

In [None]:
# Convert spending to millions for easier interpretation
data['Spending_Per_Mile_Millions'] = data['Spending_Per_Mile'] / 1e6

# Drop rows that have a missing value in any column
data = data.dropna()

print(f"\nCleaned data: {len(data)} observations")
data.head(10)
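
Rescaling a regressor changes the units of the slope but not the quality of the fit. The sketch below uses made-up numbers (not the project data) and `np.polyfit` as a stand-in for the OLS fit. Under those assumptions, it shows that dividing spending by 1e6 multiplies the slope by 1e6 while the intercept and the correlation stay the same.

```python
import numpy as np

# Hypothetical values for illustration only (not the project data)
spending_dollars = np.array([2.0e6, 5.5e6, 9.0e6, 1.4e7, 2.1e7])
pct_increase = np.array([310.0, 420.0, 380.0, 510.0, 470.0])

spending_millions = spending_dollars / 1e6

# A degree-1 polyfit returns [slope, intercept] of the least-squares line
slope_dollars, intercept_dollars = np.polyfit(spending_dollars, pct_increase, 1)
slope_millions, intercept_millions = np.polyfit(spending_millions, pct_increase, 1)

r_dollars = np.corrcoef(spending_dollars, pct_increase)[0, 1]
r_millions = np.corrcoef(spending_millions, pct_increase)[0, 1]

print(f"Slope per $1 per mile:  {slope_dollars:.3e}")
print(f"Slope per $1M per mile: {slope_millions:.3f}")
print(f"Correlation unchanged:  {np.isclose(r_dollars, r_millions)}")
```

The slope per $1M is easier to read than the slope per $1, which is why the conversion helps interpretation without affecting the statistical conclusion.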

---
## Step 2 | Data Exploration

In [None]:
# Summary statistics
print("Summary Statistics:")
print(data[['Percentage_Increase', 'Spending_Per_Mile_Millions', 'Beach_Length_Miles']].describe())

In [None]:
# Correlation
correlation = data['Spending_Per_Mile_Millions'].corr(data['Percentage_Increase'])
print(f"\nCorrelation between spending per mile and price increase: {correlation:.3f}")
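
In a regression with a single predictor, the correlation and the regression output are two views of the same quantity: R-squared equals the squared correlation, and the slope equals the correlation times the ratio of standard deviations. The sketch below checks both identities on made-up numbers (not the project data), using `np.polyfit` as a stand-in for the OLS fit.

```python
import numpy as np

# Hypothetical values for illustration only (not the project data)
x = np.array([2.0, 5.5, 9.0, 14.0, 21.0, 3.5, 7.0])              # spending per mile ($M)
y = np.array([310.0, 420.0, 380.0, 510.0, 470.0, 290.0, 450.0])  # appreciation (%)

r = np.corrcoef(x, y)[0, 1]

# Least-squares line and its R-squared (1 - SSR/SST)
slope, intercept = np.polyfit(x, y, 1)
residuals = y - (intercept + slope * x)
r_squared = 1 - residuals.var() / y.var()

print(f"r = {r:.3f}, r^2 = {r**2:.3f}, R-squared = {r_squared:.3f}")
print(f"slope = {slope:.3f}, r * sd(y) / sd(x) = {r * y.std() / x.std():.3f}")
```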

In [None]:
# Top 5 towns by home price appreciation
print("\nTop 5 Towns by Home Price Increase (2000-2025):")
print(data.nlargest(5, 'Percentage_Increase')[['RegionName', 'Percentage_Increase', 'Spending_Per_Mile_Millions']])

In [None]:
# Top 5 towns by spending per mile
print("\nTop 5 Towns by Beach Restoration Spending Per Mile:")
print(data.nlargest(5, 'Spending_Per_Mile_Millions')[['RegionName', 'Spending_Per_Mile_Millions', 'Percentage_Increase']])

---
## Step 3 | Visualization

In [None]:
# Bar chart: Home price appreciation by town
plt.figure(figsize=(14, 6))
data_sorted = data.sort_values('Percentage_Increase', ascending=False)
sns.barplot(x='RegionName', y='Percentage_Increase', data=data_sorted)
plt.xticks(rotation=90)
plt.title('Home Price Appreciation by NJ Beach Town (2000-2025)')
plt.xlabel('Town')
plt.ylabel('Percentage Increase (%)')
plt.tight_layout()
plt.show()

In [None]:
# Bar chart: Spending per mile by town
plt.figure(figsize=(14, 6))
data_sorted = data.sort_values('Spending_Per_Mile_Millions', ascending=False)
sns.barplot(x='RegionName', y='Spending_Per_Mile_Millions', data=data_sorted)
plt.xticks(rotation=90)
plt.title('Beach Restoration Spending Per Mile by Town')
plt.xlabel('Town')
plt.ylabel('Spending Per Mile ($ Millions)')
plt.tight_layout()
plt.show()

In [None]:
# Scatter plot with regression line
plt.figure(figsize=(10, 6))
sns.regplot(x='Spending_Per_Mile_Millions', y='Percentage_Increase', data=data, 
            scatter_kws={'alpha': 0.7, 's': 80}, line_kws={'color': 'red'})
plt.title('Beach Restoration Spending vs. Home Price Appreciation')
plt.xlabel('Spending Per Mile ($ Millions)')
plt.ylabel('Percentage Increase in Home Prices (2000-2025)')
plt.grid(True, alpha=0.3)
plt.show()

---
## Step 4 | Statistical Analysis

In [None]:
# OLS Regression
model = smf.ols('Percentage_Increase ~ Spending_Per_Mile_Millions', data=data).fit()
print("OLS Regression: Percentage_Increase ~ Spending_Per_Mile_Millions")
print(model.summary())

In [None]:
# Key results
print("\n" + "="*50)
print("KEY RESULTS")
print("="*50)
print("\nNull Hypothesis: Beach spending does not affect home values (beta = 0)")
print("\nModel Results:")
print(f"  Intercept: {model.params['Intercept']:.1f}% predicted appreciation at zero spending")
print(f"  Spending coefficient: {model.params['Spending_Per_Mile_Millions']:.2f}")
print(f"  P-value: {model.pvalues['Spending_Per_Mile_Millions']:.3f}")
print(f"  R-squared: {model.rsquared:.3f}")
print("\nInterpretation:")
if model.pvalues['Spending_Per_Mile_Millions'] < 0.05:
    print("  Each additional $1M per mile is associated with")
    print(f"  {model.params['Spending_Per_Mile_Millions']:.2f} percentage points more appreciation")
else:
    print("  The relationship is NOT statistically significant (p >= 0.05)")
    print("  Fail to reject the null hypothesis at the 5% level")
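
The p-value reported by the model comes from a t-statistic: the slope divided by its standard error, compared against a t distribution with n - 2 degrees of freedom (the default, non-robust standard errors in statsmodels). The sketch below reproduces the slope, standard error, and t-statistic by hand on made-up numbers (not the project data). It stops short of the p-value, which needs the t distribution.

```python
import numpy as np

# Hypothetical values for illustration only (not the project data)
x = np.array([2.0, 5.5, 9.0, 14.0, 21.0, 3.5, 7.0])              # spending per mile ($M)
y = np.array([310.0, 420.0, 380.0, 510.0, 470.0, 290.0, 450.0])  # appreciation (%)
n = len(x)

# Closed-form simple-regression estimates
sxx = np.sum((x - x.mean()) ** 2)
sxy = np.sum((x - x.mean()) * (y - y.mean()))
slope = sxy / sxx
intercept = y.mean() - slope * x.mean()

# Residual variance uses n - 2 degrees of freedom (two estimated parameters)
residuals = y - (intercept + slope * x)
sigma2 = np.sum(residuals ** 2) / (n - 2)
se_slope = np.sqrt(sigma2 / sxx)
t_stat = slope / se_slope

print(f"slope = {slope:.3f}, se = {se_slope:.3f}, t = {t_stat:.3f}, df = {n - 2}")
```

With few towns, the standard error is large relative to the slope, so even a sizeable positive slope can fail to reach significance.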

---
## Step 5 | Results Interpretation

### Key Findings

1. **Wide Variation in Appreciation:** NJ beach town home prices increased by 260% to 790% between 2000 and 2025

2. **Spending Varies Dramatically:** Some towns received 10-20x more restoration spending per mile than others

3. **Weak Relationship:** The correlation between spending per mile and appreciation is positive but weak, and the regression slope is not statistically significant

### Why Might Spending NOT Predict Appreciation?

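- **Reverse causality:** Valuable areas may attract more protection funding, so spending could reflect home values rather than drive them
- **Other factors:** Location, amenities, school districts, and development patterns also drive appreciation
- **Storm damage:** Towns that needed more restoration may have suffered more storm damage, which could offset any gain from the spending
- **Timing:** When the spending occurred matters (before vs. after housing booms)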

### The Beach Economics Story

Beach replenishment is a significant public investment:
- **Total NJ spending:** Over $1 billion since 2000 (inflation-adjusted)
- **Purpose:** Protect coastal infrastructure and maintain tourism
- **Controversy:** Who should pay: federal, state, or local governments, or homeowners?

### Policy Implications

- Beach restoration benefits extend beyond property values
- Tourism, storm protection, and environmental factors matter
- Cost-benefit analysis requires metrics broader than home values

---
## Replication Exercises

### Exercise 1: Beach Length Analysis
Does beach length itself predict home prices? Are longer beaches associated with higher appreciation?

### Exercise 2: Outlier Analysis
Remove the towns with the highest spending (potential outliers). How does the relationship change?
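
One possible starting point, sketched on made-up towns (not the project data): drop the `k` highest-spending rows with `nlargest` and refit. The town names, the values, the helper `fit_slope`, and the choice `k = 1` are all assumptions for illustration, and `np.polyfit` stands in for the OLS fit.

```python
import numpy as np
import pandas as pd

# Hypothetical towns and values for illustration only (not the project data)
towns = pd.DataFrame({
    'RegionName': ['Town A', 'Town B', 'Town C', 'Town D', 'Town E', 'Town F'],
    'Spending_Per_Mile_Millions': [2.0, 3.5, 5.0, 6.5, 8.0, 40.0],
    'Percentage_Increase': [300.0, 420.0, 380.0, 450.0, 410.0, 360.0],
})

def fit_slope(df):
    """Least-squares slope of appreciation on spending per mile."""
    x = df['Spending_Per_Mile_Millions'].to_numpy()
    y = df['Percentage_Increase'].to_numpy()
    slope, _ = np.polyfit(x, y, 1)
    return slope

# Drop the k highest-spending towns; k = 1 is an arbitrary choice
k = 1
top_idx = towns.nlargest(k, 'Spending_Per_Mile_Millions').index
trimmed = towns.drop(index=top_idx)

print(f"Slope, all towns:       {fit_slope(towns):.2f}")
print(f"Slope, top {k} removed:  {fit_slope(trimmed):.2f}")
```

In these made-up numbers a single high-spending town pulls the slope down; the real data may behave differently, so compare the two fits before drawing conclusions.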

### Exercise 3: Total Spending
Instead of spending per mile, use total spending. Does this change the results?
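
If the dataset has no total-spending column, one can be reconstructed, assuming spending per mile was computed as total spending divided by beach length. The sketch below uses made-up towns (not the project data), and the column name `Total_Spending_Millions` is hypothetical.

```python
import pandas as pd

# Hypothetical towns and values for illustration only (not the project data)
towns = pd.DataFrame({
    'RegionName': ['Town A', 'Town B', 'Town C'],
    'Spending_Per_Mile_Millions': [4.0, 10.0, 2.5],
    'Beach_Length_Miles': [1.5, 0.8, 6.0],
})

# Assumption: spending per mile = total spending / beach length
towns['Total_Spending_Millions'] = (
    towns['Spending_Per_Mile_Millions'] * towns['Beach_Length_Miles']
)

print(towns.sort_values('Total_Spending_Millions', ascending=False))
```

Towns with long beaches can rank high on total spending while ranking low on spending per mile, so the two measures may order towns differently.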

### Challenge Exercise
Research the economics of beach nourishment. What do environmental economists say about costs and benefits?

In [None]:
# Your code for exercises

# Example: Beach length vs appreciation
# model_length = smf.ols('Percentage_Increase ~ Beach_Length_Miles', data=data).fit()
# print(model_length.summary().tables[1])