# Modeling U.S. Home Price Dynamics Over 20 Years
**Using S&P Case-Shiller Index & Economic Indicators**

This notebook walks through data collection, preprocessing, exploratory data analysis (EDA), modeling, and interpretation to examine how key economic factors relate to U.S. home prices from 2005 through 2024.

In [None]:
# 1. Data Collection
import pandas_datareader.data as web
import pandas as pd
import datetime

start = datetime.datetime(2005, 1, 1)
end = datetime.datetime(2024, 12, 31)

series = {
    'HomePriceIndex':'CSUSHPINSA',
    'MortgageRate':'MORTGAGE30US',
    'Unemployment':'UNRATE',
    'CPI':'CPIAUCSL',
    'GDP':'GDP',
    'FedFundsRate':'FEDFUNDS',
    'HousingStarts':'HOUST'
}

data = {}
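for name, code in series.items():
    s = web.DataReader(code, 'fred', start, end)[code]
    # Align every series to month-start dates: weekly data (mortgage rates)
    # is averaged within each month; monthly data passes through unchanged.
    s = s.resample('MS').mean()
    if name == 'GDP':
        # GDP is quarterly; carry each quarter's value forward to its months.
        s = s.ffill()
    data[name] = s

df = pd.concat(data, axis=1)  # columns take the dict keys as names
df = df.dropna()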
df.head()
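
The series fetched above arrive at different frequencies (weekly mortgage rates, quarterly GDP, monthly everything else), so they need a shared date index before `dropna` can keep more than a handful of rows. The sketch below illustrates one way to do that on synthetic stand-in data rather than real FRED values; the helper `to_month_start` and its `fill_forward` parameter are hypothetical names introduced only for this illustration.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins (not real FRED data) for three sampling frequencies.
weekly = pd.Series(
    np.arange(52, dtype=float),
    index=pd.date_range("2020-01-02", periods=52, freq="W-THU"),
    name="weekly_rate",
)
monthly = pd.Series(
    np.arange(12, dtype=float),
    index=pd.date_range("2020-01-01", periods=12, freq="MS"),
    name="monthly_index",
)
quarterly = pd.Series(
    [100.0, 101.0, 102.0, 103.0],
    index=pd.date_range("2020-01-01", periods=4, freq="QS"),
    name="quarterly_gdp",
)


def to_month_start(s, fill_forward=False):
    """Resample to month-start dates; optionally carry the last value forward."""
    out = s.resample("MS").mean()
    return out.ffill() if fill_forward else out


aligned = pd.concat(
    [
        to_month_start(weekly),                      # weekly -> monthly mean
        to_month_start(monthly),                     # already monthly
        to_month_start(quarterly, fill_forward=True) # quarterly -> repeated
    ],
    axis=1,
).dropna()
print(aligned.shape)
```

Note that the forward fill only reaches the month of the last quarterly observation, so the months after it are dropped by `dropna` unless the quarterly series is first reindexed to the full monthly range.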

In [None]:
# 2. Exploratory Data Analysis
import matplotlib.pyplot as plt

# Plot time series
df.plot(subplots=True, figsize=(10,12), title='Time Series of Variables')
plt.tight_layout()
plt.show()

# Correlation heatmap
import seaborn as sns
plt.figure(figsize=(8,6))
sns.heatmap(df.corr(), annot=True, cmap='coolwarm')
plt.title('Correlation Matrix')
plt.show()
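
A correlation matrix computed on levels can look stronger than the underlying relationships warrant when the series share a long-run trend. The sketch below uses two synthetic series (not the notebook's data) whose noise is independent by construction, and compares the correlation of levels with the correlation of month-over-month changes. The names `toy`, `level_corr`, and `change_corr` are illustrative only.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 240  # 20 years of monthly observations
idx = pd.date_range("2005-01-01", periods=n, freq="MS")

# Two synthetic series that share an upward trend but have independent noise.
toy = pd.DataFrame(
    {
        "a": 100 + 0.5 * np.arange(n) + rng.normal(0, 1, n),
        "b": 50 + 0.3 * np.arange(n) + rng.normal(0, 1, n),
    },
    index=idx,
)

# Correlation of the raw levels is dominated by the common trend.
level_corr = toy.corr().loc["a", "b"]
# Differencing removes the trend and leaves only the independent noise.
change_corr = toy.diff().dropna().corr().loc["a", "b"]
print(f"levels: {level_corr:.2f}, month-over-month changes: {change_corr:.2f}")
```

Applying the same idea to the notebook's data would mean running the heatmap on `df.diff()` or `df.pct_change()` alongside the levels version and comparing the two.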

In [None]:
# 3. Modeling
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

X = df.drop(columns='HomePriceIndex')
y = df['HomePriceIndex']

# shuffle=False keeps chronological order: train on the first 80%, test on the most recent 20%.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)

model = LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

print("R² Score:", r2_score(y_test, y_pred))
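# The squared=False argument is deprecated in recent scikit-learn releases,
# so take the square root of the MSE explicitly.
print("RMSE:", mean_squared_error(y_test, y_pred) ** 0.5)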
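
A single chronological split yields one estimate of out-of-sample error. One common extension, sketched here on synthetic data rather than the notebook's `X` and `y`, is expanding-window cross-validation with scikit-learn's `TimeSeriesSplit`, where every test fold comes strictly after its training fold. The variables `X_demo`, `y_demo`, and the choice of five splits are assumptions for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import TimeSeriesSplit, cross_val_score

rng = np.random.default_rng(42)
n = 200
# Synthetic features and target with a known linear relationship plus noise.
X_demo = rng.normal(size=(n, 3))
y_demo = X_demo @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=n)

# Each split trains on an initial window and tests on the block that follows it.
tscv = TimeSeriesSplit(n_splits=5)
scores = cross_val_score(
    LinearRegression(),
    X_demo,
    y_demo,
    cv=tscv,
    scoring="neg_root_mean_squared_error",
)
rmse_per_fold = -scores  # scikit-learn reports the negated RMSE
print(rmse_per_fold.round(3))
```

With the notebook's data, passing `X` and `y` in place of the synthetic arrays would give one RMSE per fold, which shows how stable the baseline is across different periods.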

In [None]:
# 4. Feature Importance
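# Coefficients are expressed in each feature's own units, so bar lengths
# are not directly comparable across features.
importance = pd.Series(model.coef_, index=X.columns)
importance.sort_values().plot(kind='barh', figsize=(6,4), title='Linear Model Coefficients (unscaled)')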
plt.xlabel('Coefficient')
plt.show()
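
Raw regression coefficients depend on each feature's units, so a feature measured in large numbers gets a small coefficient regardless of how much it matters. A minimal sketch of one common remedy, standardizing the features before fitting, is shown below on synthetic data; the feature names `small_scale` and `large_scale` are hypothetical and chosen only to make the scale effect visible.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
n = 300
X_demo = pd.DataFrame(
    {
        "small_scale": rng.normal(0, 1, n),     # e.g. a rate in percent
        "large_scale": rng.normal(0, 1000, n),  # e.g. a level in billions
    }
)
y_demo = (
    2.0 * X_demo["small_scale"]
    + 0.005 * X_demo["large_scale"]
    + rng.normal(0, 0.1, n)
)

# Fit once on raw units and once on standardized features.
raw = LinearRegression().fit(X_demo, y_demo)
pipe = make_pipeline(StandardScaler(), LinearRegression()).fit(X_demo, y_demo)

coefs = pd.DataFrame(
    {
        "raw": raw.coef_,
        "standardized": pipe.named_steps["linearregression"].coef_,
    },
    index=X_demo.columns,
)
print(coefs)
```

In this toy setup the raw coefficients rank `small_scale` first, while the standardized coefficients rank `large_scale` first, because a one-standard-deviation move in `large_scale` shifts the target more.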

## Conclusions
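- The notebook relates mortgage rates, the federal funds rate, unemployment, CPI, GDP, and housing starts to the S&P Case-Shiller home price index through time series plots, a correlation matrix, and a linear regression.
- The linear model's R² and RMSE on the most recent 20% of the sample provide a baseline; more advanced models are a natural next step.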
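
As one possible next step beyond the baseline, the sketch below compares ordinary least squares with ridge regression on a chronological 80/20 split. It uses synthetic data with two nearly collinear features, on the assumption (not verified in this notebook) that macroeconomic indicators are often strongly correlated with each other. The `alpha=1.0` value and all variable names are illustrative choices, not tuned settings.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge
from sklearn.metrics import mean_squared_error
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(7)
n = 240
base = rng.normal(size=n)
# Two nearly collinear synthetic features plus one independent feature.
X_demo = np.column_stack(
    [base, base + rng.normal(scale=0.01, size=n), rng.normal(size=n)]
)
y_demo = 3.0 * base + X_demo[:, 2] + rng.normal(scale=0.5, size=n)

split = int(n * 0.8)  # chronological 80/20 split, no shuffling
X_tr, X_te = X_demo[:split], X_demo[split:]
y_tr, y_te = y_demo[:split], y_demo[split:]

candidates = {
    "ols": make_pipeline(StandardScaler(), LinearRegression()),
    "ridge": make_pipeline(StandardScaler(), Ridge(alpha=1.0)),
}
rmse = {}
for name, est in candidates.items():
    est.fit(X_tr, y_tr)
    rmse[name] = mean_squared_error(y_te, est.predict(X_te)) ** 0.5
print(rmse)
```

The same loop accepts any scikit-learn regressor, so other candidate models can be added to `candidates` and compared against the linear baseline on equal terms.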