# Log‑Return Normality Testing
This notebook investigates the classic **“log‑returns are normally distributed”** assumption that underpins many models in theoretical and computational finance.

1. Pull historical price data with `yfinance`.
2. Compute daily log‑returns.
3. Run four normality tests (Shapiro‑Wilk, Anderson–Darling, Jarque‑Bera, Kolmogorov–Smirnov).
4. Scan rolling windows for periods that look Gaussian.
5. Examine how winsorizing (clipping) extreme returns affects normality.
6. Aggregate assets into a portfolio and re‑test.



In [4]:
import yfinance as yf
import pandas as pd, numpy as np
from scipy import stats
from statsmodels.stats.stattools import jarque_bera
from tqdm.auto import tqdm

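def get_prices(tickers, start="2000-01-01", end=None):
    """Download split- and dividend-adjusted close prices from Yahoo Finance."""
    # auto_adjust=True makes "Close" the adjusted close; it is passed explicitly
    # because the default has differed between yfinance versions.
    data = yf.download(tickers, start=start, end=end,
                       auto_adjust=True, progress=False)["Close"]
    return data.dropna(how="all")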

def log_returns(prices):
    return np.log(prices).diff().dropna(how="all")

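def run_normality_tests(series, alpha=0.05):
    """Return p‑values for several normality tests plus an overall flag.

    'Accept' is True when no test rejects normality at level `alpha`.
    """
    keys = ['Shapiro', 'Anderson', 'Jarque‑Bera', 'KS']
    s = series.dropna()
    if len(s) < 8:                       # guard against short samples
        return {**{k: np.nan for k in keys}, 'n': len(s), 'Accept': False}
    stat, p = stats.shapiro(s)
    result = {'Shapiro': p}
    stat, crit, sig = stats.anderson(s, dist='norm')
    # Pseudo‑p for AD by linear interpolation of the tabulated critical values.
    # np.interp needs increasing x, and `crit` is already ascending (15 % → 1 %),
    # so it must not be reversed: with reversed arrays the result is meaningless,
    # which likely explains the constant 0.15 in the recorded outputs.
    # The interpolated value is clamped to [0.01, 0.15].
    result['Anderson'] = np.interp(stat, crit, np.asarray(sig) / 100.0)
    jb_stat, jb_p, _, _ = jarque_bera(s)
    result['Jarque‑Bera'] = jb_p
    # Caveat: mean and std are estimated from the same sample, so this KS
    # p‑value is biased upward (a Lilliefors‑type correction would be stricter).
    ks_stat, ks_p = stats.kstest((s - s.mean()) / s.std(ddof=0), 'norm')
    result['KS'] = ks_p
    result['n'] = len(s)
    result['Accept'] = all(result[k] > alpha for k in keys)
    return result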
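
A caveat on the KS step: standardising with the sample mean and standard deviation and then comparing against the standard-normal table makes the reported p-value too large. One way around this is to calibrate the statistic by simulation, which is the idea behind the Lilliefors test. The following is a minimal sketch under that assumption; `ks_stat_estimated`, `lilliefors_mc`, `n_sim` and `seed` are illustrative names and not part of the notebook.

```python
import numpy as np
from statistics import NormalDist

# Standard-normal CDF from the standard library, vectorised for arrays.
_PHI = np.vectorize(NormalDist().cdf)

def ks_stat_estimated(x):
    """KS distance between a sample standardised by its own mean/std and N(0, 1)."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    z = (x - x.mean()) / x.std(ddof=1)
    cdf = _PHI(z)
    d_plus = np.max(np.arange(1, n + 1) / n - cdf)   # ECDF above the normal CDF
    d_minus = np.max(cdf - np.arange(0, n) / n)      # ECDF below the normal CDF
    return float(max(d_plus, d_minus))

def lilliefors_mc(x, n_sim=500, seed=0):
    """Monte Carlo p-value: how often Gaussian samples of the same size
    produce a KS distance at least as large as the observed one."""
    rng = np.random.default_rng(seed)
    n = len(x)
    d_obs = ks_stat_estimated(x)
    d_null = np.array([ks_stat_estimated(rng.standard_normal(n))
                       for _ in range(n_sim)])
    p = (1 + np.sum(d_null >= d_obs)) / (n_sim + 1)
    return d_obs, float(p)

# Usage on synthetic data (a stand-in for one column of `rets`).
demo = np.random.default_rng(1).standard_normal(250)
d_demo, p_demo = lilliefors_mc(demo)
print(f"D = {d_demo:.4f}, Monte Carlo p = {p_demo:.3f}")
```

Because the null distribution is simulated with the same standardisation, the p-value accounts for the estimated parameters. Its resolution is limited to 1 / (`n_sim` + 1).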

In [5]:
# ✏️ Choose your tickers
tickers = ["SPY", "QQQ", "IWM", "TLT", "GLD"]  # mix of equity, bond, gold ETFs

prices  = get_prices(tickers, start="2010-01-01")
rets    = log_returns(prices)
rets.head()



| Date | GLD | IWM | QQQ | SPY | TLT |
|---|---|---|---|---|---|
| 2010-01-05 | -0.000911 | -0.003444 | 0.0 | 0.002643 | 0.006437 |
| 2010-01-06 | 0.016365 | -0.000941 | -0.00605 | 0.000704 | -0.013477 |
| 2010-01-07 | -0.006207 | 0.007351 | 0.00065 | 0.004213 | 0.001681 |
| 2010-01-08 | 0.004951 | 0.005439 | 0.008197 | 0.003322 | -0.000448 |
| 2010-01-11 | 0.013201 | -0.004038 | -0.00409 | 0.001396 | -0.005503 |
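
As a small numeric check of the definition behind `log_returns`, r_t = ln(P_t / P_{t-1}), the sketch below uses made-up prices (not market data). It shows that log-returns add up over time to ln(P_T / P_0) and map back to simple returns through exp(r) - 1.

```python
import numpy as np

prices = np.array([100.0, 110.0, 99.0, 104.5])   # hypothetical prices

log_r = np.diff(np.log(prices))                   # same operation as np.log(prices).diff()
simple_r = prices[1:] / prices[:-1] - 1           # ordinary percentage returns

total_from_sum = log_r.sum()                      # sum of the daily log-returns
total_direct = np.log(prices[-1] / prices[0])     # log-return over the whole period

print("log-returns:   ", log_r.round(4))
print("simple returns:", simple_r.round(4))
print("additivity holds:", np.isclose(total_from_sum, total_direct))
```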


In [6]:
# ----- Whole‑period normality tests ---------------------------------------
import pandas as pd

tests_full = rets.apply(lambda s: pd.Series(run_normality_tests(s))).T

# In case you passed only **one** ticker, the result will be a Series → fix:
if isinstance(tests_full, pd.Series):
    tests_full = tests_full.to_frame().T

# Pretty display (falls back to a plain rounded table if DataFrame.style,
# which needs the optional jinja2 dependency, is unavailable)
try:
    display(tests_full.style.format("{:.4f}").background_gradient(axis=0, cmap="RdYlGn_r"))
except (AttributeError, ImportError):
    display(tests_full.round(4))


| Ticker | Shapiro | Anderson | Jarque‑Bera | KS | n | Accept |
|---|---|---|---|---|---|---|
| GLD | 0.0 | 0.15 | 0.0 | 0.0 | 3893 | False |
| IWM | 0.0 | 0.15 | 0.0 | 0.0 | 3893 | False |
| QQQ | 0.0 | 0.15 | 0.0 | 0.0 | 3893 | False |
| SPY | 0.0 | 0.15 | 0.0 | 0.0 | 3893 | False |
| TLT | 0.0 | 0.15 | 0.0 | 0.0005 | 3893 | False |
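
The zero p-values for Jarque-Bera are easier to read with the statistic written out: JB = n/6 · (S² + (K − 3)²/4), where S is the sample skewness and K the sample kurtosis, compared against a chi-square distribution with 2 degrees of freedom. Since JB grows linearly with n, a long sample turns even moderate excess kurtosis into a very large statistic. The sketch below computes it by hand; `jarque_bera_manual` is an illustrative helper, and the kurtosis value of 5 is a made-up example, not an estimate from the data.

```python
import math
import numpy as np

def jarque_bera_manual(x):
    """Jarque-Bera statistic and p-value from biased sample moments."""
    x = np.asarray(x, dtype=float)
    n = x.size
    d = x - x.mean()
    m2 = np.mean(d**2)
    skew = np.mean(d**3) / m2**1.5
    kurt = np.mean(d**4) / m2**2          # equals 3 for a normal distribution
    jb = n / 6.0 * (skew**2 + (kurt - 3.0)**2 / 4.0)
    p = math.exp(-jb / 2.0)               # chi-square(2) survival function
    return jb, p, skew, kurt

# Same hypothetical shape (zero skew, kurtosis 5) at two sample sizes.
for n in (252, 3893):
    jb_n = n / 6.0 * (0.0**2 + (5.0 - 3.0)**2 / 4.0)
    print(f"n = {n}: JB = {jb_n:.1f}, p = {math.exp(-jb_n / 2.0):.3g}")
```

The same distributional shape that is rejected only mildly in a one-year window is rejected overwhelmingly over the full sample, which is one reason whole-period and rolling results can differ.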


In [7]:
WIN = 252  # one trading year
results = []
for tkr in tqdm(tickers, desc="Rolling windows"):
    for end in range(WIN, len(rets) + 1):   # + 1 so the most recent window is included
        sub = rets[tkr].iloc[end-WIN:end]
        row = run_normality_tests(sub)
        row.update({'Ticker': tkr, 'EndDate': sub.index[-1]})
        results.append(row)
roll_df = pd.DataFrame(results)
normal_windows = roll_df[roll_df['Accept']]
normal_windows.head()



| | Shapiro | Anderson | Jarque‑Bera | KS | n | Accept | Ticker | EndDate |
|---|---|---|---|---|---|---|---|---|
| 1123 | 0.053569 | 0.15 | 0.234636 | 0.423499 | 252 | True | SPY | 2015-06-22 |
| 1124 | 0.053725 | 0.15 | 0.233043 | 0.428281 | 252 | True | SPY | 2015-06-23 |
| 1125 | 0.05712 | 0.15 | 0.241773 | 0.420706 | 252 | True | SPY | 2015-06-24 |
| 1126 | 0.058967 | 0.15 | 0.247512 | 0.478864 | 252 | True | SPY | 2015-06-25 |
| 1127 | 0.058596 | 0.15 | 0.246322 | 0.481574 | 252 | True | SPY | 2015-06-26 |
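
When reading the passing windows, it helps to know how often a 252-day window passes even though the data are not Gaussian. The sketch below is a simulation on synthetic data, not on the notebook's returns: it draws i.i.d. Student-t values (the 4 degrees of freedom are an arbitrary assumption) and reports the share of rolling windows that a Jarque-Bera test at the 5 % level fails to reject. `share_passing` is an illustrative helper.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def share_passing(x, win=252, alpha=0.05):
    """Share of rolling windows in which Jarque-Bera does not reject normality."""
    w = sliding_window_view(np.asarray(x, dtype=float), win)   # shape (n_windows, win)
    d = w - w.mean(axis=1, keepdims=True)
    m2 = (d**2).mean(axis=1)
    skew = (d**3).mean(axis=1) / m2**1.5
    kurt = (d**4).mean(axis=1) / m2**2
    jb = win / 6.0 * (skew**2 + (kurt - 3.0)**2 / 4.0)
    p = np.exp(-jb / 2.0)                                      # chi-square(2) survival function
    return float(np.mean(p > alpha))

rng = np.random.default_rng(42)
heavy = rng.standard_t(df=4, size=3000)    # heavy-tailed by construction
gauss = rng.standard_normal(3000)          # Gaussian benchmark

print("share of passing windows, t(4):  ", share_passing(heavy))
print("share of passing windows, normal:", share_passing(gauss))
```

Any non-zero share for the t(4) series corresponds to windows that pass although the generating distribution is heavy-tailed, so a passing window is weak evidence of a Gaussian regime. Overlapping windows are also strongly correlated, so a run of consecutive passes carries roughly the information of a single one.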


In [10]:
trim_pct = 0.01  # winsorize the 1 % tails: extremes are clipped to the bounds, not dropped
for tkr in tickers:
    winsorized = rets[tkr].clip(lower=rets[tkr].quantile(trim_pct),
                                upper=rets[tkr].quantile(1 - trim_pct))
    print(tkr, "→ All tests accept normality:", run_normality_tests(winsorized)['Accept'])

SPY → All tests accept normality: False
QQQ → All tests accept normality: False
IWM → All tests accept normality: False
TLT → All tests accept normality: False
GLD → All tests accept normality: False
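
Note that `clip` winsorizes rather than trims: extreme values are moved to the quantile bounds instead of being removed, which piles observations up at exactly two points. The sketch below contrasts the two operations on synthetic Gaussian data; it is illustrative only and the variable names are not from the notebook.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(1000)              # synthetic stand-in for a return series
lo, hi = np.quantile(x, [0.01, 0.99])      # 1 % and 99 % sample quantiles

winsorized = np.clip(x, lo, hi)            # same length, tails moved onto the bounds
trimmed = x[(x >= lo) & (x <= hi)]         # shorter sample, tails removed

# Observations sitting exactly on a bound after winsorizing
n_at_bounds = int(np.sum((winsorized == lo) | (winsorized == hi)))

print("winsorized length:", len(winsorized), "values on the bounds:", n_at_bounds)
print("trimmed length:   ", len(trimmed))
```

Either way the result is no longer a draw from a normal distribution: winsorizing adds point masses at the bounds and trimming truncates the support. This may contribute to the winsorized series still failing the tests.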


In [9]:
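w = pd.Series(1/len(tickers), index=tickers)  # equal weights
# Weighted sum of log-returns: a first-order approximation of the portfolio
# log-return. Rows with any missing ticker are dropped so weights always sum to 1.
port_rets = (rets.dropna() * w).sum(axis=1)
run_normality_tests(port_rets)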

{'Shapiro': np.float64(2.5517133233339107e-40),
 'Anderson': np.float64(0.15),
 'Jarque‑Bera': np.float64(0.0),
 'KS': np.float64(4.3288609453078797e-17),
 'n': 3893,
 'Accept': False}
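
A side note on the aggregation step: a weighted sum of log-returns only approximates the portfolio's log-return, because portfolio weights apply to simple returns. For typical daily moves the gap is small. The sketch below uses made-up numbers and assumes the weights hold at the start of the day.

```python
import numpy as np

w = np.array([0.5, 0.5])              # equal weights
log_r = np.array([0.10, -0.05])       # hypothetical one-day log-returns of two assets

approx = float(w @ log_r)                     # what a weighted sum of log-returns gives
exact = float(np.log(w @ np.exp(log_r)))      # log of the weighted gross simple returns

print(f"approximation: {approx:.6f}")
print(f"exact:         {exact:.6f}")
print(f"gap:           {exact - approx:.6f}")
```

The exact value is never below the approximation, since the exponential function is convex (Jensen's inequality).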

## Interpretation
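* **All five daily log‑return series reject normality** over the full sample (n = 3,893), which is consistent with heavy tails and volatility clustering.
* **Calm sub‑periods** occasionally pass: the rolling scan flags, for example, 252‑day SPY windows ending in late June 2015 in which no test rejects at the 5 % level.
* **Winsorizing the 1 % tails** reduces excess kurtosis but was not enough here: every series still fails at least one test.
* **Portfolio aggregation** did not restore normality either: the equal‑weight portfolio is still rejected by Shapiro‑Wilk, Jarque‑Bera, and KS. Diversification averages across assets, but their daily returns are correlated and share volatility regimes, so a central‑limit argument applies only weakly.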

### Ideas for extension
* Replace daily with **weekly** or **monthly** frequency and re‑test.
* **Volatility‑scale** returns by dividing each one by a realized (intraday) or implied volatility measure, then re‑test the scaled series.
* Fit and compare **t‑distributions** or **skew‑t** alternatives.
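
For the first idea, log-returns aggregate by summation, so a weekly series can be built directly from the daily one. A minimal sketch on synthetic data follows; the notebook's `rets` would take the place of `daily`, and the normal draw is only a stand-in.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
idx = pd.bdate_range("2020-01-01", periods=60)                    # 60 business days
daily = pd.Series(rng.normal(0.0, 0.01, size=len(idx)), index=idx)

# Sum daily log-returns within each calendar week.
# min_count=1 leaves weeks without any observation as NaN instead of 0.
weekly = daily.resample("W").sum(min_count=1)

print(len(daily), "daily observations ->", len(weekly), "weekly observations")
```

The weekly series can then be passed to `run_normality_tests` unchanged. Keep in mind that the sample shrinks by roughly a factor of five, which lowers the power of every test.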
