# Earnings Regression — Analysis (with script steps)

This notebook reproduces the analysis I ran in `scripts/run_earnings_analysis.py`.
The script, and the cells below, perform these steps:
- Load `Data/earnings_regression_panel.csv` (prepared panel of earnings announcements).
- Clean the data: rename `Δ10Y` to `D10Y` and coerce `Surprise`, `CAR`, `VIX` and `D10Y` to numeric types.
- Compute mean `Surprise` and `CAR` by `Regime` and save to `Data/means_by_regime.csv`.
- Produce and save plots: `Data/fig_surprise_box.png`, `Data/fig_car_box.png`, `Data/fig_scatter.png`.
- Run an OLS regression `CAR ~ Surprise + C(Regime) + VIX + D10Y`, save summary to `Data/regression_summary.txt`.

The following cells execute the same steps and include short explanations and inline output where useful.
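
As a quick illustration of the regression specification in the last step, here is a minimal sketch on synthetic data. The column names follow the list above, but every value is randomly generated and the coefficient used to build `CAR` (0.5) is an arbitrary assumption, so nothing in the output says anything about the real panel.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic stand-in for the panel: same column names, made-up values.
rng = np.random.default_rng(0)
n = 200
toy = pd.DataFrame({
    "Surprise": rng.normal(0.01, 0.02, n),
    "Regime": rng.integers(0, 2, n),   # 0/1 regime flag
    "VIX": rng.uniform(10, 35, n),
    "D10Y": rng.normal(0.0, 0.1, n),
})
# Assumed data-generating process, for illustration only.
toy["CAR"] = 0.5 * toy["Surprise"] + rng.normal(0.0, 0.01, n)

# C(Regime) tells the formula parser to treat Regime as categorical,
# so it enters the model as a dummy variable rather than a number.
model = smf.ols("CAR ~ Surprise + C(Regime) + VIX + D10Y", data=toy).fit()
print(model.params)
```

With the real panel, the same formula would be passed `df` after the cleaning step, and `model.summary().as_text()` gives the text that can be written to `Data/regression_summary.txt`.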

In [None]:
# Imports
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import statsmodels.formula.api as smf
import statsmodels.api as sm
from IPython.display import Image, display
sns.set(style='whitegrid')
%matplotlib inline

In [None]:
# Load prepared panel (file saved by the preparation notebook)
panel = pd.read_csv('Data/earnings_regression_panel.csv', parse_dates=['Earnings Date'])
print('panel shape:', panel.shape)
panel.head()

panel shape: (248, 7)




Unnamed: 0,Ticker,Earnings Date,Surprise,CAR,Regime,VIX,Δ10Y
0,GOOGL,2005-04-21,0.0095,0.000635,0,14.41,0.035971
1,NVDA,2005-05-12,0.004,0.026393,0,16.12,0.027027
2,AAPL,2005-07-13,0.0,0.001773,0,10.84,-0.047945
3,GOOGL,2005-07-21,0.0035,0.006394,0,10.97,-0.009259
4,NVDA,2005-08-11,0.0,0.002138,0,12.42,0.033493


In [None]:
# Cleaning and type coercion (matches the script)
df = panel.copy()
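# Rename Δ10Y to D10Y for convenience
if 'Δ10Y' in df.columns:
    df = df.rename(columns={'Δ10Y': 'D10Y'})
# Ensure numeric types where applicable
for c in ['Surprise','CAR','VIX','D10Y']:
    if c in df.columns:
        df[c] = pd.to_numeric(df[c], errors='coerce')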
# Drop rows missing key vars
df = df.dropna(subset=['Surprise','CAR'])
print('cleaned shape:', df.shape)
df.head()

cleaned shape: (248, 7)


Unnamed: 0,Ticker,Earnings Date,Surprise,CAR,Regime,VIX,Δ10Y
237,GOOGL,2025-02-04,0.02,0.01284,0,17.21,0.061033
93,AAPL,2013-01-23,0.0125,0.014481,0,12.46,0.033333
181,NVDA,2020-05-21,0.003,-0.010186,1,29.53,-0.552632
65,GOOGL,2010-10-14,0.0235,-0.001464,0,19.88,-0.16
225,AAPL,2024-02-01,0.07,0.002099,0,13.88,-0.171306
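
To make the effect of the cleaning step easier to see, here is a small sketch that wraps it in a helper and runs it on a toy frame. The helper name `clean_panel` and the toy rows are hypothetical; they are not part of the script.

```python
import pandas as pd

def clean_panel(panel: pd.DataFrame) -> pd.DataFrame:
    """Rename Δ10Y, coerce numeric columns, drop rows missing Surprise or CAR."""
    # rename returns a new frame and is a no-op if the column is absent
    df = panel.rename(columns={"Δ10Y": "D10Y"})
    for col in ["Surprise", "CAR", "VIX", "D10Y"]:
        if col in df.columns:
            # Unparseable strings become NaN instead of raising an error
            df[col] = pd.to_numeric(df[col], errors="coerce")
    return df.dropna(subset=["Surprise", "CAR"])

# Made-up rows: one bad Surprise string and one missing CAR
toy = pd.DataFrame({
    "Surprise": ["0.01", "n/a", "0.02"],
    "CAR": [0.001, 0.002, None],
    "VIX": ["14.4", "16.1", "10.8"],
    "Δ10Y": [0.03, 0.02, -0.04],
})
cleaned = clean_panel(toy)
print(cleaned)
```

Only the first toy row survives: the second has a `Surprise` value that coerces to NaN and the third has no `CAR`.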


In [None]:
# Compute means by Regime and save (script saved to Data/means_by_regime.csv)
if 'Regime' in df.columns:
    print("Regime column found in dataframe")
    means = df.groupby('Regime')[['Surprise','CAR']].mean().reset_index()
    print('Means by Regime:')
    print(means)
    means.to_csv('Data/means_by_regime.csv', index=False)
else:
    print('No Regime column found')

Regime column found in dataframe
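
Group means are easier to interpret alongside group sizes. The sketch below is an optional variation, not what the script does: it uses `agg` to report the mean and the row count per regime on a made-up four-row frame.

```python
import pandas as pd

# Made-up values, two rows per regime
toy = pd.DataFrame({
    "Regime": [0, 0, 1, 1],
    "Surprise": [0.01, 0.03, 0.00, 0.02],
    "CAR": [0.002, 0.004, -0.01, 0.01],
})

# One row per regime; columns are a (variable, statistic) MultiIndex
summary = toy.groupby("Regime")[["Surprise", "CAR"]].agg(["mean", "count"])
print(summary)
```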


In [None]:
# Visualisations: boxplots and scatter and save figures (script saved PNGs)
sns.set(style='whitegrid')
# Surprise boxplot by regime
if 'Regime' in df.columns:
    plt.figure()
    sns.boxplot(x='Regime', y='Surprise', data=df)
    plt.savefig('Data/fig_surprise_box.png', bbox_inches='tight')
    plt.show()
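
The three figures named in the step list (`fig_surprise_box.png`, `fig_car_box.png`, `fig_scatter.png`) could be produced along the lines below. This is a sketch on synthetic data that writes to a temporary directory rather than `Data/`. The scatter layout (`Surprise` against `CAR`, coloured by `Regime`) is my assumption, since the list above does not spell out the axes.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the cleaned panel (values are random, not real data)
rng = np.random.default_rng(1)
toy = pd.DataFrame({
    "Regime": rng.integers(0, 2, 120),
    "Surprise": rng.normal(0.01, 0.02, 120),
    "CAR": rng.normal(0.0, 0.01, 120),
})

out_dir = tempfile.mkdtemp()  # the notebook saves under Data/ instead
plots = [
    ("fig_surprise_box.png",
     lambda ax: sns.boxplot(data=toy, x="Regime", y="Surprise", ax=ax)),
    ("fig_car_box.png",
     lambda ax: sns.boxplot(data=toy, x="Regime", y="CAR", ax=ax)),
    ("fig_scatter.png",
     lambda ax: sns.scatterplot(data=toy, x="Surprise", y="CAR", hue="Regime", ax=ax)),
]

saved = []
for name, draw in plots:
    fig, ax = plt.subplots(figsize=(6, 4))
    draw(ax)                      # draw one plot on its own axes
    path = os.path.join(out_dir, name)
    fig.savefig(path, dpi=150, bbox_inches="tight")
    plt.close(fig)                # free the figure once it is on disk
    saved.append(path)

print(saved)
```

Giving each plot its own `Figure` and closing it after saving keeps the files independent and avoids one plot drawing over another.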