## Notebook 06 – Bootstrap Validation

This notebook performs bootstrap validation of long-short momentum returns conditioned on volatility and dispersion regimes. `bootstrap_ci` returns the mean of a return series with its lower and upper confidence bounds. `bootstrap_mean_diff` returns the mean difference between two regimes, the lower and upper confidence bounds of that difference, and a p-value for it.
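
The helpers in `stats_helpers` are not shown in this notebook. As a rough guide to what they might do, here is a minimal sketch of a percentile bootstrap. It assumes i.i.d. resampling with replacement, 1,000 resamples, a 95% interval, and NumPy's global random state (so that a prior `np.random.seed` call controls the draws). The function names `bootstrap_ci_sketch` and `bootstrap_mean_diff_sketch`, their default arguments, and the p-value definition are all hypothetical; the actual implementations in `stats_helpers` may differ.

```python
import numpy as np


def _clean(values):
    """Convert input to a 1-D float array and drop NaNs."""
    x = np.asarray(values, dtype=float).ravel()
    return x[~np.isnan(x)]


def bootstrap_ci_sketch(returns, n_boot=1000, alpha=0.05):
    """Percentile bootstrap interval for the mean (illustrative only)."""
    x = _clean(returns)
    # Draw n_boot resamples with replacement from the global NumPy RNG,
    # so a prior np.random.seed(...) call makes the result repeatable.
    idx = np.random.randint(0, len(x), size=(n_boot, len(x)))
    boot_means = x[idx].mean(axis=1)
    lower, upper = np.percentile(
        boot_means, [100 * alpha / 2, 100 * (1 - alpha / 2)]
    )
    return {"mean": float(x.mean()),
            "ci_lower": float(lower),
            "ci_upper": float(upper)}


def bootstrap_mean_diff_sketch(a, b, n_boot=1000, alpha=0.05):
    """Percentile bootstrap for mean(a) - mean(b) (illustrative only)."""
    a, b = _clean(a), _clean(b)
    # Resample each series independently; the two may differ in length.
    idx_a = np.random.randint(0, len(a), size=(n_boot, len(a)))
    idx_b = np.random.randint(0, len(b), size=(n_boot, len(b)))
    boot_diffs = a[idx_a].mean(axis=1) - b[idx_b].mean(axis=1)
    lower, upper = np.percentile(
        boot_diffs, [100 * alpha / 2, 100 * (1 - alpha / 2)]
    )
    # Assumed two-sided p-value: twice the smaller share of resampled
    # differences on either side of zero, capped at 1.
    tail = min(np.mean(boot_diffs <= 0), np.mean(boot_diffs >= 0))
    p_value = min(1.0, 2.0 * float(tail))
    return {"mean_diff": float(a.mean() - b.mean()),
            "ci_lower": float(lower),
            "ci_upper": float(upper),
            "p_value": p_value}


# Usage on synthetic data (not the notebook's returns)
np.random.seed(0)
demo_a = np.random.normal(0.0005, 0.01, size=500)
demo_b = np.random.normal(0.0, 0.02, size=500)
print(bootstrap_ci_sketch(demo_a))
print(bootstrap_mean_diff_sketch(demo_a, demo_b))
```

The sketch resamples individual observations independently, which ignores any serial dependence in the returns. It also builds the full index matrix in memory, which is convenient for short series but may need to be replaced by a loop for very long ones.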

### Step 0 - Import packages and functions

In [1]:
import sys, os
sys.path.append(os.path.abspath("../src"))

from stats_helpers import bootstrap_ci, bootstrap_mean_diff

import pandas as pd
import numpy as np



### Step 1 - Import long-short returns for low and high volatility and dispersion regimes

In [2]:
# Import walk-forward long-short returns in low volatility regime
low = pd.read_parquet("../data/processed/wf_mom_low.parquet")["wf_mom_low_vol"]

# Import walk-forward long-short returns in high volatility regime
high = pd.read_parquet("../data/processed/wf_mom_high.parquet")["wf_mom_high_vol"]

# Import walk-forward long-short returns in low dispersion regime
low_disp = pd.read_parquet("../data/processed/wf_mom_disp_low.parquet")["wf_mom_disp_low_vol"]

# Import walk-forward long-short returns in high dispersion regime
high_disp = pd.read_parquet("../data/processed/wf_mom_disp_high.parquet")["wf_mom_disp_high_vol"]

### Step 2 - Conduct bootstrap validation on long-short returns by volatility regime

In [5]:
# Set the seed for reproducibility
np.random.seed(0)

# Bootstrap on low volatility returns
ci_low = bootstrap_ci(low)

# Bootstrap on high volatility returns
ci_high = bootstrap_ci(high)

# Bootstrap on difference between low and high volatility regime returns
diff_result = bootstrap_mean_diff(low, high)

print("Low Vol Regime:", ci_low) # Mean + Confidence interval for low volatility regime returns
print("High Vol Regime:", ci_high) # Mean + Confidence interval for high volatility regime returns
print("Low vs High Vol Difference:", diff_result) # Mean difference (low minus high), confidence interval, and p-value for volatility regime returns

Low Vol Regime: {'mean': 0.0003365919609624065, 'ci_lower': -0.00022868157723523433, 'ci_upper': 0.0008854924857720503}
High Vol Regime: {'mean': -0.0003741125617997544, 'ci_lower': -0.0015388415861314437, 'ci_upper': 0.0008574853711580156}
Low vs High Vol Difference: {'mean_diff': 0.000710704522762161, 'ci_lower': -0.0006078240770691856, 'ci_upper': 0.0019478905173025902, 'p_value': 0.514}


### Step 3 - Conduct bootstrap validation on long-short returns by dispersion regime

In [6]:
# Set the seed for reproducibility
np.random.seed(0)

# Bootstrap on low dispersion returns
ci_low_disp = bootstrap_ci(low_disp)

# Bootstrap on high dispersion returns
ci_high_disp = bootstrap_ci(high_disp)

# Bootstrap on difference between low and high dispersion regime returns
diff_result_disp = bootstrap_mean_diff(low_disp, high_disp)

print("Low Disp Regime:", ci_low_disp) # Mean + Confidence interval for low dispersion regime returns
print("High Disp Regime:", ci_high_disp) # Mean + Confidence interval for high dispersion regime returns
print("Low vs High Disp Difference:", diff_result_disp) # Mean difference (low minus high), confidence interval, and p-value for dispersion regime returns

Low Disp Regime: {'mean': 0.00020150910816641812, 'ci_lower': -0.00034550971752278313, 'ci_upper': 0.0007285933788388281}
High Disp Regime: {'mean': 8.67800443106957e-05, 'ci_lower': -0.001017544174076189, 'ci_upper': 0.0012228705453475432}
Low vs High Disp Difference: {'mean_diff': 0.00011472906385572242, 'ci_lower': -0.0011715538600531683, 'ci_upper': 0.0014323217054292227, 'p_value': 0.871}
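
To read the six results side by side, the printed values can be collected into a small table with a flag for whether each interval excludes zero. The sketch below copies the numbers from the outputs of Steps 2 and 3. The `results` dictionary, the row labels, and the `excludes_zero` column are illustrative names, not part of the notebook.

```python
import pandas as pd

# Values copied verbatim from the printed output of Steps 2 and 3
results = {
    "Low Vol": {'mean': 0.0003365919609624065,
                'ci_lower': -0.00022868157723523433,
                'ci_upper': 0.0008854924857720503},
    "High Vol": {'mean': -0.0003741125617997544,
                 'ci_lower': -0.0015388415861314437,
                 'ci_upper': 0.0008574853711580156},
    "Low - High Vol": {'mean_diff': 0.000710704522762161,
                       'ci_lower': -0.0006078240770691856,
                       'ci_upper': 0.0019478905173025902,
                       'p_value': 0.514},
    "Low Disp": {'mean': 0.00020150910816641812,
                 'ci_lower': -0.00034550971752278313,
                 'ci_upper': 0.0007285933788388281},
    "High Disp": {'mean': 8.67800443106957e-05,
                  'ci_lower': -0.001017544174076189,
                  'ci_upper': 0.0012228705453475432},
    "Low - High Disp": {'mean_diff': 0.00011472906385572242,
                        'ci_lower': -0.0011715538600531683,
                        'ci_upper': 0.0014323217054292227,
                        'p_value': 0.871},
}

rows = []
for label, res in results.items():
    rows.append({
        "series": label,
        # Regime rows report 'mean'; difference rows report 'mean_diff'
        "estimate": res.get("mean", res.get("mean_diff")),
        "ci_lower": res["ci_lower"],
        "ci_upper": res["ci_upper"],
        # Only the difference rows carry a p-value
        "p_value": res.get("p_value"),
    })

summary = pd.DataFrame(rows).set_index("series")
# An interval excludes zero when both bounds lie on the same side of it
summary["excludes_zero"] = (summary["ci_lower"] > 0) | (summary["ci_upper"] < 0)
print(summary)
```

With these inputs, `excludes_zero` is `False` in every row: each interval contains zero, so neither the regime means nor the regime differences are distinguishable from zero at the confidence level used by the helpers.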
