# VIX-Based Continuous Regression Analysis

This notebook estimates a continuous model where the regime dummies are replaced by the **VIX index** (CBOE Volatility Index). The VIX serves as a continuous proxy for market stress and uncertainty.

## Model Specification

We estimate the following regression model:

$$R_{t} = \alpha + \theta \cdot VIX_{t-1} + \beta \cdot Factors_t + \eta \cdot (VIX_{t-1} \times Factors_t) + \epsilon_t$$

### Variable Definitions

- **$R_t$**: Monthly long-short anomaly return.
- **$VIX_{t-1}$**: The **lagged** and **standardized** CBOE Volatility Index.
- **$Factors_t$**: Vector of Fama–French three factors (MKT, SMB, HML), where MKT is the market excess return (`Mkt-RF` in the code).
- **$\alpha$**: Intercept, reported as `Alpha` in the results.
- **$\theta$**: Coefficient measuring the direct effect of market stress (VIX) on anomaly returns.
- **$\beta$**: Vector of baseline factor loadings.
- **$\eta$**: Vector of interaction coefficients capturing how the factor loadings vary with the level of market stress.
- **$\epsilon_t$**: Regression error term.
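
Because the VIX enters in standardized form, the coefficients have a direct reading. As a short derivation from the specification above (a restatement, not an additional result), the factor loading conditional on last month's stress level is

$$\frac{\partial R_t}{\partial Factors_t} = \beta + \eta \cdot VIX_{t-1}$$

so $\beta$ is the loading when the VIX is at its sample mean ($VIX_{t-1} = 0$), and $\eta$ is the change in that loading per one-standard-deviation increase in the lagged VIX. Likewise, the expected return when all factors are zero is $\alpha + \theta \cdot VIX_{t-1}$.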


In [1]:
import pandas as pd
import numpy as np
import statsmodels.api as sm
from IPython.display import display

# Suppress warnings for cleaner output
import warnings
warnings.filterwarnings('ignore')

In [2]:
# 1. Load VIX Data
vix_df = pd.read_csv('VIX Regression Data/VIXCLS.csv', parse_dates=['observation_date'], index_col='observation_date')

# Handle potential non-numeric data (replace '.' with NaN if exists)
vix_df['VIXCLS'] = pd.to_numeric(vix_df['VIXCLS'], errors='coerce')

# 2. Standardization
vix_mean = vix_df['VIXCLS'].mean()
vix_std = vix_df['VIXCLS'].std()
vix_df['VIX_Std'] = (vix_df['VIXCLS'] - vix_mean) / vix_std

# 3. Lagging
# The model uses VIX_{t-1}. shift(1) lags by one row, which equals one month
# here because VIXCLS.csv holds one observation per month.
vix_df['VIX_Lagged_Std'] = vix_df['VIX_Std'].shift(1)

# Display to verify
print("VIX Data Preview (Standardized and Lagged):")
display(vix_df[['VIXCLS', 'VIX_Std', 'VIX_Lagged_Std']].head())

VIX Data Preview (Standardized and Lagged):


observation_date,VIXCLS,VIX_Std,VIX_Lagged_Std
2000-01-01,24.95,0.647924,NaN
2000-02-01,23.37,0.446203,0.647924
2000-03-01,24.11,0.54068,0.446203
2000-04-01,26.2,0.807514,0.54068
2000-05-01,23.65,0.481951,0.807514
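
The standardize-then-lag step can be checked on a toy series. The sketch below reuses the five `VIXCLS` values from the preview; because it standardizes over only those five points, its z-scores will not match the full-sample values shown above. The variable names are illustrative.

```python
import pandas as pd

# Five monthly VIX closes copied from the preview above (toy sample only)
vix = pd.Series(
    [24.95, 23.37, 24.11, 26.20, 23.65],
    index=pd.period_range("2000-01", periods=5, freq="M"),
    name="VIXCLS",
)

# z-score using the sample standard deviation (pandas default, ddof=1)
vix_std = (vix - vix.mean()) / vix.std()

# shift(1) moves each value down one row, so month t holds the month t-1 value
vix_lagged_std = vix_std.shift(1)

print(pd.DataFrame({"VIX_Std": vix_std, "VIX_Lagged_Std": vix_lagged_std}))
```

The first lagged value is missing by construction, which is why the merged dataset is passed through `dropna` later.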


In [3]:
# Load Excess Returns and Fama-French Factors
excess_returns = pd.read_excel('./Regression Data/excess_returns.xlsx', index_col=0, parse_dates=True)
ff_factors = pd.read_excel('./Regression Data/fama_french_factors.xlsx', index_col='date', parse_dates=True)

# 1. Convert Returns and Factors to Period (safe because they are freshly loaded above)
excess_returns.index = excess_returns.index.to_period('M')
ff_factors.index = ff_factors.index.to_period('M')

# 2. Convert VIX to Period (Safely)
# vix_df is loaded in the previous cell, so we check if it's already a PeriodIndex
if not isinstance(vix_df.index, pd.PeriodIndex):
    vix_df.index = vix_df.index.to_period('M')

print("Data Loaded and Indices Aligned.")

Data Loaded and Indices Aligned.
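
Converting every index to a monthly `PeriodIndex` matters because different sources may stamp the same month differently. A minimal sketch with made-up numbers, assuming one frame is dated at month start and the other at month end:

```python
import pandas as pd

# Hypothetical data: the same two months, stamped at month start vs. month end
vix = pd.DataFrame(
    {"VIX_Lagged_Std": [1.1, 1.4]},
    index=pd.to_datetime(["2003-01-01", "2003-02-01"]),
)
factors = pd.DataFrame(
    {"Mkt-RF": [-0.03, -0.02]},
    index=pd.to_datetime(["2003-01-31", "2003-02-28"]),
)

# Merging on raw timestamps finds no common dates
by_timestamp = pd.merge(vix, factors, left_index=True, right_index=True, how="inner")

# After to_period("M"), both indices collapse to 2003-01 and 2003-02
vix_m = vix.copy()
factors_m = factors.copy()
vix_m.index = vix_m.index.to_period("M")
factors_m.index = factors_m.index.to_period("M")
by_period = pd.merge(vix_m, factors_m, left_index=True, right_index=True, how="inner")

print(len(by_timestamp), len(by_period))  # prints: 0 2
```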


In [4]:
# Select required columns
ff_subset = ff_factors[['Mkt-RF', 'SMB', 'HML']]
vix_subset = vix_df[['VIX_Lagged_Std']]

# Merge all datasets
df_vix_model = pd.merge(excess_returns, ff_subset, left_index=True, right_index=True, how='inner')
df_vix_model = pd.merge(df_vix_model, vix_subset, left_index=True, right_index=True, how='inner')

# Drop any rows with NaN (likely the first row due to lagging)
df_vix_model.dropna(inplace=True)

display(df_vix_model.head())

,Accruals,Asset Growth,BM,Gross Profit,Momentum,Leverage Ret,Regime,Mkt-RF,SMB,HML,VIX_Lagged_Std
2003-01,-0.006968,0.012729,-0.003569,-0.013421,-0.048965,0.013673,Pre-Crisis,-0.0273,0.0187,0.0084,1.11648
2003-02,-0.033603,0.009119,-0.044708,0.02196,0.050196,-0.005761,Pre-Crisis,-0.0184,0.0048,0.0167,1.442043
2003-03,-0.038969,-0.003821,-0.034231,-0.012498,-0.006273,0.000917,Pre-Crisis,-0.0035,0.0048,-0.0106,1.245428
2003-04,0.001758,0.048677,0.101618,-0.10985,-0.119545,0.007383,Pre-Crisis,0.087,-0.0005,-0.0076,1.184146
2003-05,0.067344,0.076671,0.105503,-0.170907,-0.327041,0.00173,Pre-Crisis,0.0658,0.0321,0.0048,0.170432


In [5]:
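def run_vix_regressions(data, anomalies):
    """Regress each anomaly return on the factors, lagged VIX, and VIX x factor interactions."""
    # Work on a copy so the interaction columns are not added to the caller's DataFrame
    data = data.copy()
    results_list = []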
    
    # Define Core Variables
    factors = ['Mkt-RF', 'SMB', 'HML']
    vix_col = 'VIX_Lagged_Std'
    
    # Create Interaction Terms: VIX_Lagged * Factor
    # This captures the 'eta' coefficients (change in loading due to stress)
    interaction_cols = []
    for f in factors:
        int_col = f"VIX_x_{f}"
        data[int_col] = data[vix_col] * data[f]
        interaction_cols.append(int_col)
    
    # Independent Variables: Constant + Factors + VIX + Interactions
    X_cols = factors + [vix_col] + interaction_cols
    X = sm.add_constant(data[X_cols])
    
    for anomaly in anomalies:
        y = data[anomaly]
        
        # Fit OLS with Newey-West (HAC) standard errors
        model = sm.OLS(y, X).fit(cov_type='HAC', cov_kwds={'maxlags': 3})
        
        # Store Results
        results_list.append({
            'Anomaly': anomaly,
            'Alpha': model.params['const'],
            'Alpha_P_Value': model.pvalues['const'],
            
            # Direct VIX Effect (Theta)
            'Theta_VIX': model.params[vix_col],
            'Theta_P_Value': model.pvalues[vix_col],
            
            # Base Factor Loadings (Beta)
            'Beta_MKT': model.params['Mkt-RF'],
            'Beta_P_Value_MKT': model.pvalues['Mkt-RF'],
            'Beta_SMB': model.params['SMB'],
            'Beta_P_Value_SMB': model.pvalues['SMB'],
            'Beta_HML': model.params['HML'],
            'Beta_P_Value_HML': model.pvalues['HML'],
            
            # Interaction Coefficients (Eta)
            'Eta_MKT (VIX*MKT)': model.params['VIX_x_Mkt-RF'],
            'Eta_P_Value_MKT': model.pvalues['VIX_x_Mkt-RF'],
            'Eta_SMB (VIX*SMB)': model.params['VIX_x_SMB'],
            'Eta_P_Value_SMB': model.pvalues['VIX_x_SMB'],
            'Eta_HML (VIX*HML)': model.params['VIX_x_HML'],
            'Eta_P_Value_HML': model.pvalues['VIX_x_HML'],
            'Adj_R2': model.rsquared_adj
        })
        
    return pd.DataFrame(results_list)
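
To see what the interaction columns buy, the sketch below simulates returns from the specification with known coefficients and recovers them by ordinary least squares. It uses `numpy.linalg.lstsq` rather than statsmodels to stay dependency-light; the HAC option in `run_vix_regressions` changes only the standard errors, so the point estimates there are the same OLS coefficients. All parameter values and the single-factor setup are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Simulated inputs: one factor and a standardized, lagged stress variable
factor = rng.normal(0.0, 0.05, n)
vix_lag = rng.normal(0.0, 1.0, n)

# Hypothetical true coefficients: alpha, theta, beta, eta
alpha, theta, beta, eta = 0.001, 0.002, 0.5, -0.3
noise = rng.normal(0.0, 0.01, n)
ret = alpha + theta * vix_lag + beta * factor + eta * vix_lag * factor + noise

# Design matrix: constant, VIX, factor, VIX x factor
X = np.column_stack([np.ones(n), vix_lag, factor, vix_lag * factor])
coef, *_ = np.linalg.lstsq(X, ret, rcond=None)

alpha_hat, theta_hat, beta_hat, eta_hat = coef
print(f"beta_hat={beta_hat:.3f}, eta_hat={eta_hat:.3f}")
```

If the interaction column were omitted, the fitted loading would be a single average slope and the dependence on stress would be pushed into the residual.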

In [6]:
# List of anomalies (Dependent Variables)
anomalies_list = ['Accruals', 'Asset Growth', 'BM', 'Gross Profit', 'Momentum', 'Leverage Ret']

print("Running VIX-Based Interaction Model...")
vix_results = run_vix_regressions(df_vix_model, anomalies_list)

# Export and Display
vix_results.to_excel('./VIX Regression Results/vix_model_results.xlsx', index=False)
display(vix_results.round(4))

Running VIX-Based Interaction Model...


,Anomaly,Alpha,Alpha_P_Value,Theta_VIX,Theta_P_Value,Beta_MKT,Beta_P_Value_MKT,Beta_SMB,Beta_P_Value_SMB,Beta_HML,Beta_P_Value_HML,Eta_MKT (VIX*MKT),Eta_P_Value_MKT,Eta_SMB (VIX*SMB),Eta_P_Value_SMB,Eta_HML (VIX*HML),Eta_P_Value_HML,Adj_R2
0,Accruals,-0.0012,0.5594,-0.0029,0.0749,0.178,0.0129,0.1917,0.156,-0.4078,0.028,-0.0385,0.1808,0.0856,0.3366,-0.2812,0.0072,0.1599
1,Asset Growth,-0.0035,0.1637,0.001,0.607,0.2213,0.0106,0.1776,0.108,0.7999,0.0,-0.1213,0.0109,-0.0998,0.4941,-0.1188,0.4518,0.2022
2,BM,0.001,0.7662,0.001,0.6257,0.2687,0.0009,0.7457,0.0,0.9869,0.0,0.0244,0.5152,-0.1784,0.1317,-0.1367,0.3343,0.3188
3,Gross Profit,0.0019,0.5326,0.0001,0.9558,-0.2472,0.0414,-0.6737,0.0086,0.0407,0.8774,-0.0466,0.3612,-0.1329,0.4858,0.1328,0.3338,0.1612
4,Momentum,0.0023,0.6739,-0.015,0.1564,-0.1352,0.3976,-1.1043,0.0012,-0.8683,0.0446,-0.1032,0.4587,-0.5502,0.2872,-0.2466,0.6218,0.1821
5,Leverage Ret,0.0011,0.3941,0.0042,0.017,-0.0597,0.257,-0.1499,0.0155,0.2202,0.0321,-0.0187,0.5602,-0.1852,0.1526,0.2335,0.0066,0.2207
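
The interaction estimates are easiest to read as stress-dependent loadings, $\beta + \eta \cdot VIX_{t-1}$. The arithmetic below applies that expression to two interactions from the table with p-values below 0.05 (the Asset Growth market loading and the Leverage Ret HML loading). The helper name `conditional_loading` is introduced here for illustration, and the calculation ignores estimation uncertainty.

```python
def conditional_loading(beta, eta, vix_z):
    """Factor loading implied by the model at a given standardized lagged VIX."""
    return beta + eta * vix_z

# (beta, eta) pairs copied from the results table above
estimates = {
    "Asset Growth, MKT": (0.2213, -0.1213),
    "Leverage Ret, HML": (0.2202, 0.2335),
}

for label, (beta, eta) in estimates.items():
    for vix_z in (-1, 0, 1, 2):
        loading = conditional_loading(beta, eta, vix_z)
        print(f"{label}: VIX z={vix_z:+d} -> loading {loading:.4f}")
```

For Asset Growth, the market loading falls from 0.2213 at average VIX to about 0.10 when the lagged VIX is one standard deviation above its mean; for Leverage Ret, the HML loading roughly doubles over the same move.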
