In [1]:
# ---
# title: Econometrics & Alpha Generation
# tags: [Econometrics, Statistics, HypothesisTesting, Regression]
# difficulty: Intermediate
# ---

# Econometrics: Finding Signal in the Noise

Before training neural networks, we validate our assumptions with classical statistics.

## 1. Stationarity (ADF Test)
Most time-series models assume the data is stationary: statistical properties such as the mean and variance do not change over time.
Raw prices are rarely stationary; returns often are.

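We use the **Augmented Dickey-Fuller (ADF)** test. Its null hypothesis is that the series has a unit root, i.e. it is non-stationary like a random walk. If the p-value is below 0.05, we reject that null at the 5% level and treat the series as stationary.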
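
To see what the test measures, here is a minimal from-scratch sketch of the plain (non-augmented) Dickey-Fuller regression on synthetic data. It is not the `StatsEngine` implementation, whose internals are not shown in this notebook. The function name `dickey_fuller_stat`, the simulated series, and the use of the asymptotic 5% critical value of -2.86 (constant-only case) in place of a p-value are all assumptions made for illustration.

```python
import numpy as np

def dickey_fuller_stat(series):
    """t-statistic on rho in: diff(y)[t] = alpha + rho * y[t-1] + e[t].

    Hypothetical helper: no lagged differences are added, so this is the
    plain Dickey-Fuller regression rather than the augmented version.
    """
    y = np.asarray(series, dtype=float)
    dy = np.diff(y)                                   # dependent variable: first differences
    X = np.column_stack([np.ones(len(dy)), y[:-1]])   # constant + lagged level
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    dof = len(dy) - X.shape[1]
    sigma2 = resid @ resid / dof                      # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)             # OLS coefficient covariance
    return beta[1] / np.sqrt(cov[1, 1])

# Synthetic data (illustrative only): log prices follow a random walk,
# so their first differences (log returns) are white noise.
rng = np.random.default_rng(0)
log_returns = rng.normal(0.0, 0.01, size=1000)
log_prices = np.cumsum(log_returns)

CRITICAL_5PCT = -2.86  # asymptotic Dickey-Fuller 5% critical value, constant-only case

for name, series in [("log prices", log_prices), ("log returns", log_returns)]:
    stat = dickey_fuller_stat(series)
    print(f"{name}: DF statistic = {stat:.2f}, reject unit root = {stat < CRITICAL_5PCT}")
```

A strongly negative statistic, like the ones reported for the tickers below, is evidence against a unit root. In practice, `statsmodels.tsa.stattools.adfuller` also adds lagged differences to absorb serial correlation and reports an approximate p-value.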

In [2]:
import sys
import os
sys.path.append(os.path.abspath("../src"))

from analytics.stats import StatsEngine
stats = StatsEngine(silver_path="../data/silver")
res = stats.check_stationarity()
display(res) # Pretty print in notebook


=== Hypothesis Test: Stationarity (ADF) ===
H0: Series is Non-Stationary (Random Walk)
H1: Series is Stationary (Mean Reverting)
  Ticker  ADF Statistic  p-value  Stationary (95%)
0  GOOGL       -22.5366      0.0              True
1   NVDA       -13.4032      0.0              True
2   MSFT       -22.8175      0.0              True
3   AAPL       -11.3678      0.0              True




## 2. Causality (Granger)
Correlation does not imply causation. **Granger causality** is a weaker, predictive notion: X Granger-causes Y if past values of X improve forecasts of Y beyond what Y's own past values already provide. A significant result indicates predictive power, not a causal mechanism.
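
The idea can be sketched as a comparison of two regressions: one that predicts Y from its own past only, and one that also includes the past of X. The example below is a simplified one-lag version on synthetic data, not the `test_granger_causality` implementation. The helper names `ssr` and `granger_lag1` and the simulated series are hypothetical, and the p-value relies on the asymptotic chi-square distribution with one degree of freedom.

```python
import math
import numpy as np

def ssr(X, y):
    """Sum of squared residuals from an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def granger_lag1(x, y):
    """Test whether x[t-1] helps predict y[t] beyond y[t-1] (one lag only).

    Returns the chi-square statistic (1 degree of freedom) and its
    asymptotic p-value.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    target = y[1:]
    const = np.ones(len(target))
    restricted = np.column_stack([const, y[:-1]])            # y's own past only
    unrestricted = np.column_stack([const, y[:-1], x[:-1]])  # plus x's past
    ssr_r, ssr_u = ssr(restricted, target), ssr(unrestricted, target)
    chi2 = len(target) * (ssr_r - ssr_u) / ssr_u
    # Survival function of chi-square(1): P(Z^2 > c) = erfc(sqrt(c / 2))
    p_value = math.erfc(math.sqrt(max(chi2, 0.0) / 2.0))
    return chi2, p_value

# Synthetic series (illustrative only): y depends on yesterday's x,
# while x is pure noise and does not depend on y.
rng = np.random.default_rng(1)
n = 2000
x = rng.normal(size=n)
y = np.empty(n)
y[0] = rng.normal()
y[1:] = 0.3 * x[:-1] + rng.normal(size=n - 1)

chi2_xy, p_xy = granger_lag1(x, y)
chi2_yx, p_yx = granger_lag1(y, x)
print(f"x -> y: chi2 = {chi2_xy:.1f}, p = {p_xy:.4f}")
print(f"y -> x: chi2 = {chi2_yx:.1f}, p = {p_yx:.4f}")
```

For real work, `statsmodels.tsa.stattools.grangercausalitytests` runs the same kind of comparison across several lags and reports F-test and chi-square variants.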

In [3]:
stats.test_granger_causality(output_target="AAPL")


=== Hypothesis Test: Granger Causality (Target: AAPL) ===
Does X provide statistically significant info about future AAPL?

Testing: Does GOOGL -> AAPL?
  - No significant causality found.

Testing: Does NVDA -> AAPL?
  - No significant causality found.

Testing: Does MSFT -> AAPL?
  - No significant causality found.


## 3. Factor Analysis (Lasso Regression)
Which tickers help explain `AAPL` returns? We use Lasso regression, whose L1 penalty shrinks weak coefficients to exactly zero and so leaves a sparse set of explanatory variables.
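
To show why an L1 penalty yields exact zeros, here is a minimal coordinate-descent sketch on synthetic data. It is not the `run_factor_analysis` implementation, and the function names, the simulated factors, and the penalty value `alpha=0.1` are assumptions chosen for illustration. The objective is the one scikit-learn's `Lasso` minimises: `(1 / (2n)) * ||y - Xw||^2 + alpha * ||w||_1`.

```python
import numpy as np

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; exactly 0 inside the band."""
    if abs(value) <= threshold:
        return 0.0
    return np.sign(value) * (abs(value) - threshold)

def lasso_coordinate_descent(X, y, alpha, n_iter=200):
    """Minimise (1 / (2n)) * ||y - Xw||^2 + alpha * ||w||_1 by cyclic coordinate descent."""
    n, k = X.shape
    w = np.zeros(k)
    col_scale = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(k):
            # Residual with feature j's current contribution added back
            partial_resid = y - X @ w + X[:, j] * w[j]
            rho = X[:, j] @ partial_resid / n
            w[j] = soft_threshold(rho, alpha) / col_scale[j]
    return w

# Synthetic factors (illustrative only): the first two of four columns
# carry signal, the last two are pure noise.
rng = np.random.default_rng(2)
X = rng.normal(size=(500, 4))
X = X - X.mean(axis=0)   # centre features and target so no intercept is needed
y = 0.5 * X[:, 0] + 0.3 * X[:, 1] + rng.normal(scale=0.5, size=500)
y = y - y.mean()

for alpha in (0.0, 0.1):
    w = lasso_coordinate_descent(X, y, alpha)
    print(f"alpha={alpha}: coefficients = {np.round(w, 3)}")
```

With `alpha=0.0` the loop converges to ordinary least squares, so every coefficient is nonzero. With a positive penalty, any factor whose correlation with the residual stays below `alpha` is set to exactly zero, and the surviving coefficients are shrunk toward zero.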

In [4]:
from analytics.econometrics import EconometricModels
econ = EconometricModels(silver_path="../data/silver")
coeffs = econ.run_factor_analysis(target_ticker="AAPL")

Loading returns from: ../data/silver/market_returns_20251214_022109.parquet

=== Factor Analysis (Lasso Regression) for AAPL ===
Model R-Squared (Out-of-Sample): 0.0629

Key Drivers (Coefficients):
Ticker
NVDA     0.088050
GOOGL    0.080098
MSFT     0.014522
dtype: float64
