# Statistical Tests for Financial Time Series

This notebook demonstrates the use of statistical tests in the MFE Toolbox for analyzing financial time series data. Statistical tests are essential for validating model assumptions, checking distribution properties, and detecting patterns in financial data.

The MFE Toolbox provides a comprehensive set of statistical tests implemented in Python with integration to NumPy, SciPy, and Pandas. Performance-critical calculations are accelerated using Numba's just-in-time compilation through `@jit` decorators.

In this notebook, we'll cover:

1. **Normality Tests**: Testing if data follows a normal distribution
2. **Distribution Tests**: Testing if data follows specific distributions
3. **Serial Correlation Tests**: Testing for autocorrelation in time series
4. **ARCH Effect Tests**: Testing for conditional heteroskedasticity
5. **Model Residual Diagnostics**: Testing model residuals to validate model assumptions
6. **Property-Based Testing**: Advanced testing techniques for distribution properties
7. **Asynchronous Testing**: Using async/await for non-blocking test execution

Let's start by importing the necessary modules:

In [None]:
# Import core packages
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import stats
import asyncio

# Import MFE Toolbox components
from mfe.models.tests import (
    jarque_bera, jarque_bera_async,
    kolmogorov_smirnov, kolmogorov_smirnov_async,
    berkowitz, berkowitz_async,
    ljung_box, ljung_box_async,
    lm_test, lm_test_async,
    pvalue_calculator
)

# Import models for generating test data
from mfe import GARCH, ARMA

# Import distributions
from mfe.models.distributions import normal, student_t, generalized_error, skewed_t

# Set up plotting
%matplotlib inline
plt.style.use('ggplot')
plt.rcParams['figure.figsize'] = (12, 6)

# Set random seed for reproducibility
np.random.seed(42)

## 1. Generating Test Data

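Let's generate several synthetic series, each with a known property (heavy tails, skewness, autocorrelation, volatility clustering), so we can check that each statistical test behaves as expected: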

In [None]:
# Generate sample sizes
n = 1000

# 1. Normal data (Gaussian white noise)
normal_data = np.random.standard_normal(n)

# 2. Student's t-distributed data (heavy tails)
t_data = stats.t.rvs(df=5, size=n)  # t-distribution with 5 degrees of freedom

# 3. Skewed data
skewed_data = stats.skewnorm.rvs(a=5, size=n)  # Positively skewed data

# 4. AR(1) process (autocorrelated data)
ar_data = np.zeros(n)
ar_data[0] = np.random.standard_normal()
phi = 0.7  # AR coefficient
for t in range(1, n):
    ar_data[t] = phi * ar_data[t-1] + np.random.standard_normal()

# 5. GARCH(1,1) process (volatility clustering)
garch_data = np.zeros(n)
volatility = np.zeros(n)
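volatility[0] = 0.01
garch_data[0] = volatility[0] * np.random.standard_normal()  # draw the first observation instead of leaving it at zero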
omega, alpha, beta = 0.00001, 0.1, 0.85  # GARCH parameters
for t in range(1, n):
    volatility[t] = np.sqrt(omega + alpha * garch_data[t-1]**2 + beta * volatility[t-1]**2)
    garch_data[t] = volatility[t] * np.random.standard_normal()

# Create a DataFrame with all the data
dates = pd.date_range(start='2020-01-01', periods=n, freq='D')
df = pd.DataFrame({
    'normal': normal_data,
    't_dist': t_data,
    'skewed': skewed_data,
    'ar1': ar_data,
    'garch': garch_data
}, index=dates)

# Display the first few rows
df.head()

In [None]:
# Plot the different time series
fig, axes = plt.subplots(5, 1, figsize=(12, 15), sharex=True)

df['normal'].plot(ax=axes[0], title='Normal Data (Gaussian White Noise)')
df['t_dist'].plot(ax=axes[1], title="Student's t-Distributed Data (Heavy Tails)")
df['skewed'].plot(ax=axes[2], title='Skewed Data')
df['ar1'].plot(ax=axes[3], title='AR(1) Process (Autocorrelated Data)')
df['garch'].plot(ax=axes[4], title='GARCH(1,1) Process (Volatility Clustering)')

for ax in axes:
    ax.set_ylabel('Value')
    
axes[-1].set_xlabel('Date')
plt.tight_layout()
plt.show()

## 2. Normality Tests

### 2.1 Jarque-Bera Test

The Jarque-Bera test examines whether data has skewness and kurtosis matching a normal distribution. The test statistic is defined as:

$$JB = \frac{n}{6} \left( S^2 + \frac{(K-3)^2}{4} \right)$$

where $n$ is the sample size, $S$ is the sample skewness, and $K$ is the sample kurtosis.

Under the null hypothesis of normality, the JB statistic follows a chi-squared distribution with 2 degrees of freedom.
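To make the formula concrete, here is a minimal sketch that computes the statistic directly from the sample moments with NumPy and SciPy. The function name `jarque_bera_manual` is hypothetical and is not part of the MFE Toolbox; the toolbox's own `jarque_bera` may differ in details such as input validation.

```python
import numpy as np
from scipy import stats

def jarque_bera_manual(x):
    """Hypothetical helper: JB statistic and asymptotic chi-squared(2) p-value."""
    x = np.asarray(x, dtype=float)
    n = x.size
    centered = x - x.mean()
    m2 = np.mean(centered**2)   # biased central moments, as in the usual JB definition
    m3 = np.mean(centered**3)
    m4 = np.mean(centered**4)
    S = m3 / m2**1.5            # sample skewness
    K = m4 / m2**2              # sample kurtosis (equals 3 for a normal distribution)
    jb = n / 6.0 * (S**2 + (K - 3.0)**2 / 4.0)
    p_value = stats.chi2.sf(jb, df=2)
    return jb, p_value

rng = np.random.default_rng(0)
sample = rng.standard_t(df=5, size=1000)   # heavy-tailed sample
jb, p = jarque_bera_manual(sample)
print(f"JB = {jb:.2f}, p-value = {p:.4g}")
```

The result can be cross-checked against `scipy.stats.jarque_bera`, which uses the same moment definitions.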

In [None]:
# Create a function to test and visualize Jarque-Bera results
def test_normality_jb(data, title):
    # Calculate Jarque-Bera statistic
    jb_stat, p_value = jarque_bera(data)
    
    # Calculate skewness and kurtosis for reference
    skewness = stats.skew(data)
    kurtosis = stats.kurtosis(data, fisher=False)  # Pearson (raw) kurtosis: equals 3 for a normal distribution, matching K in the formula
    
    # Create a histogram with normal distribution overlay
    plt.figure(figsize=(12, 6))
    
    # Plot histogram
    n, bins, patches = plt.hist(data, bins=50, density=True, alpha=0.7, color='blue')
    
    # Plot normal distribution
    mu, std = np.mean(data), np.std(data)
    x = np.linspace(mu - 4*std, mu + 4*std, 100)
    plt.plot(x, stats.norm.pdf(x, mu, std), 'r-', linewidth=2, 
             label='Normal Distribution')
    
    # Add test results to the plot
    result_text = f"Jarque-Bera Statistic: {jb_stat:.4f}\n"
    result_text += f"p-value: {p_value:.4f}\n"
    result_text += f"Skewness: {skewness:.4f}\n"
    result_text += f"Kurtosis: {kurtosis:.4f}\n"
    
    if p_value < 0.05:
        result_text += "Conclusion: Reject normality (p < 0.05)"
    else:
        result_text += "Conclusion: Cannot reject normality (p ≥ 0.05)"
    
    # Add a text box with the results
    props = dict(boxstyle='round', facecolor='wheat', alpha=0.5)
    plt.text(0.05, 0.95, result_text, transform=plt.gca().transAxes, fontsize=12,
             verticalalignment='top', bbox=props)
    
    plt.title(f'Jarque-Bera Test: {title}')
    plt.xlabel('Value')
    plt.ylabel('Density')
    plt.legend()
    plt.show()
    
    return jb_stat, p_value, skewness, kurtosis


In [None]:
# Test normality for each dataset
test_normality_jb(df['normal'], 'Normal Data')

In [None]:
test_normality_jb(df['t_dist'], "Student's t-Distributed Data")

In [None]:
test_normality_jb(df['skewed'], 'Skewed Data')

### 2.2 Comparing Jarque-Bera Results Across Datasets

Let's compare the Jarque-Bera test results for all our datasets:

In [None]:
# Test all datasets and collect results
jb_results = {}
for col in df.columns:
    jb_stat, p_value, skewness, kurtosis = test_normality_jb(df[col], col)
    jb_results[col] = {
        'JB Statistic': jb_stat,
        'p-value': p_value,
        'Skewness': skewness,
        'Kurtosis': kurtosis,
        'Reject Normality': 'Yes' if p_value < 0.05 else 'No'
    }

# Create a DataFrame with the results
jb_results_df = pd.DataFrame(jb_results).T
jb_results_df

In [None]:
# Visualize the JB statistics and p-values
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))

# Plot JB statistics
ax1.bar(jb_results_df.index, jb_results_df['JB Statistic'])
ax1.set_title('Jarque-Bera Test Statistics')
ax1.set_ylabel('JB Statistic')
ax1.tick_params(axis='x', rotation=45)  # rotate labels without resetting the tick positions

# Add a horizontal line at the 5% critical value (chi-squared with 2 df)
critical_value = stats.chi2.ppf(0.95, 2)
ax1.axhline(y=critical_value, color='r', linestyle='--', 
            label=f'5% Critical Value: {critical_value:.4f}')
ax1.legend()

# Plot p-values
ax2.bar(jb_results_df.index, jb_results_df['p-value'])
ax2.set_title('Jarque-Bera p-values')
ax2.set_ylabel('p-value')
ax2.tick_params(axis='x', rotation=45)  # rotate labels without resetting the tick positions

# Add a horizontal line at the 5% significance level
ax2.axhline(y=0.05, color='r', linestyle='--', label='5% Significance Level')
ax2.legend()

plt.tight_layout()
plt.show()

### 2.3 Asynchronous Jarque-Bera Test

The MFE Toolbox provides asynchronous versions of statistical tests for non-blocking execution. This is particularly useful for long-running tests or when processing multiple datasets in parallel.
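As a general note on the pattern, awaiting coroutines one after another runs them sequentially, while `asyncio.gather` schedules them together. The sketch below illustrates this with the standard library only. It uses SciPy's `jarque_bera` as a stand-in for a blocking test and pushes it to the default thread pool; how the toolbox's `*_async` functions are implemented internally is not stated here, so the helper names `jb_blocking`, `jb_in_thread`, and `run_all` are assumptions made for illustration.

```python
import asyncio
import numpy as np
from scipy import stats

def jb_blocking(x):
    """Ordinary synchronous test; stands in for any CPU-bound statistic."""
    res = stats.jarque_bera(x)
    return float(res[0]), float(res[1])

async def jb_in_thread(x):
    # Hypothetical wrapper: hand the blocking call to the default thread pool
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(None, jb_blocking, x)

async def run_all(datasets):
    names = list(datasets)
    # gather schedules all coroutines at once and returns results in input order
    outcomes = await asyncio.gather(*(jb_in_thread(datasets[k]) for k in names))
    return dict(zip(names, outcomes))

rng = np.random.default_rng(2)
datasets = {
    "normal": rng.standard_normal(1000),
    "t5": rng.standard_t(5, size=1000),
}
results = asyncio.run(run_all(datasets))
for name, (stat, p_value) in results.items():
    print(f"{name}: JB = {stat:.2f}, p-value = {p_value:.4g}")
```

`asyncio.run` is meant for a plain script. In a Jupyter cell an event loop is already running, so the last call would instead be written as `results = await run_all(datasets)`.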

In [None]:
# Define an asynchronous function to run Jarque-Bera tests concurrently
async def run_async_jb_tests(data_dict):
    names = list(data_dict)
    
    # asyncio.gather schedules every coroutine at once and preserves the input order
    # (awaiting the coroutines one by one in a loop would run them sequentially)
    outcomes = await asyncio.gather(
        *(jarque_bera_async(data_dict[name]) for name in names)
    )
    
    # Collect the results
    results = {}
    for name, (jb_stat, p_value) in zip(names, outcomes):
        results[name] = {
            'JB Statistic': jb_stat,
            'p-value': p_value,
            'Reject Normality': 'Yes' if p_value < 0.05 else 'No'
        }
    
    return pd.DataFrame(results).T

# Run the asynchronous tests (top-level await is supported in Jupyter/IPython)
async_results = await run_async_jb_tests(df.to_dict('series'))
async_results

## 3. Distribution Tests

### 3.1 Kolmogorov-Smirnov Test

The Kolmogorov-Smirnov (KS) test compares a sample with a reference probability distribution. The test statistic is defined as the maximum absolute difference between the empirical CDF of the sample and the reference CDF:

$$D_n = \sup_x |F_n(x) - F(x)|$$

where $F_n(x)$ is the empirical CDF and $F(x)$ is the reference CDF.
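The supremum in this definition can be evaluated exactly because the empirical CDF is a step function: it is enough to compare the reference CDF with the ECDF just before and just after each jump. The sketch below does this with NumPy; `ks_statistic` is a hypothetical helper, not the toolbox's `kolmogorov_smirnov`, and the p-value uses the asymptotic Kolmogorov distribution available in SciPy as `kstwobign`.

```python
import numpy as np
from scipy import stats

def ks_statistic(x, cdf):
    """Hypothetical helper: one-sample KS statistic sup |F_n(x) - F(x)|."""
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    F = cdf(x)
    d_plus = np.max(np.arange(1, n + 1) / n - F)   # ECDF just after each jump
    d_minus = np.max(F - np.arange(0, n) / n)      # ECDF just before each jump
    return max(d_plus, d_minus)

rng = np.random.default_rng(1)
sample = rng.standard_normal(500)
d_n = ks_statistic(sample, stats.norm.cdf)

# Asymptotic p-value: sqrt(n) * D_n follows the Kolmogorov distribution under the null
p_value = stats.kstwobign.sf(np.sqrt(sample.size) * d_n)
print(f"D_n = {d_n:.4f}, asymptotic p-value = {p_value:.4f}")
```

Comparing only $i/n$ with $F(x_i)$, as a simple ECDF plot does, checks one side of each jump and can slightly understate $D_n$.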

In [None]:
# Create a function to test and visualize Kolmogorov-Smirnov results
def test_distribution_ks(data, cdf, cdf_name, title):
    # Calculate KS statistic
    ks_stat, p_value = kolmogorov_smirnov(data, cdf)
    
    # Create a plot to visualize the empirical CDF vs. the reference CDF
    plt.figure(figsize=(12, 6))
    
    # Sort the data for ECDF
    sorted_data = np.sort(data)
    
    # Calculate ECDF
    ecdf = np.arange(1, len(sorted_data) + 1) / len(sorted_data)
    
    # Calculate reference CDF values
    ref_cdf = np.array([cdf(x) for x in sorted_data])
    
    # Plot ECDFs
    plt.plot(sorted_data, ecdf, 'b-', linewidth=2, label='Empirical CDF')
    plt.plot(sorted_data, ref_cdf, 'r-', linewidth=2, label=f'Reference CDF ({cdf_name})')
    
    # Find the point of maximum difference
    max_diff_idx = np.argmax(np.abs(ecdf - ref_cdf))
    max_diff_x = sorted_data[max_diff_idx]
    max_diff_y1 = ecdf[max_diff_idx]
    max_diff_y2 = ref_cdf[max_diff_idx]
    
    # Plot the maximum difference
    plt.plot([max_diff_x, max_diff_x], [max_diff_y1, max_diff_y2], 'g-', linewidth=2, 
             label=f'Max Difference: {ks_stat:.4f}')
    plt.scatter([max_diff_x], [max_diff_y1], color='green', s=50)
    plt.scatter([max_diff_x], [max_diff_y2], color='green', s=50)
    
    # Add test results to the plot
    result_text = f"KS Statistic: {ks_stat:.4f}\n"
    result_text += f"p-value: {p_value:.4f}\n"
    
    if p_value < 0.05:
        result_text += f"Conclusion: Reject {cdf_name} distribution (p < 0.05)\n"
    else:
        result_text += f"Conclusion: Cannot reject {cdf_name} distribution (p ≥ 0.05)\n"
    
    # Add a text box with the results
    props = dict(boxstyle='round', facecolor='wheat', alpha=0.5)
    plt.text(0.05, 0.95, result_text, transform=plt.gca().transAxes, fontsize=12,
             verticalalignment='top', bbox=props)
    
    plt.title(f'Kolmogorov-Smirnov Test: {title}')
    plt.xlabel('Value')
    plt.ylabel('Cumulative Probability')
    plt.legend()
    plt.grid(True, alpha=0.3)
    plt.show()
    
    return ks_stat, p_value


In [None]:
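# Test normal data against normal distribution
# Create a normal CDF function. The mean and standard deviation are estimated from the
# same sample, so the standard KS p-value is only approximate (it tends to be too large).
normal_cdf = lambda x: stats.norm.cdf(x, loc=np.mean(df['normal']), scale=np.std(df['normal']))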

test_distribution_ks(df['normal'], normal_cdf, 'Normal', 'Normal Data vs. Normal Distribution')

In [None]:
# Test t-distributed data against normal distribution (should reject)
t_mean, t_std = np.mean(df['t_dist']), np.std(df['t_dist'])
normal_cdf_for_t = lambda x: stats.norm.cdf(x, loc=t_mean, scale=t_std)

test_distribution_ks(df['t_dist'], normal_cdf_for_t, 'Normal', "t-Distributed Data vs. Normal Distribution")

In [None]:
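# Test t-distributed data against t distribution (should not reject)
# The sample was drawn from a standard t(5), i.e. loc=0 and scale=1. Note that `scale`
# is not the standard deviation: a t(5) variable has std sqrt(5/3), so passing
# scale=t_std would specify a different distribution from the one that generated the data.
t_cdf = lambda x: stats.t.cdf(x, df=5)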

test_distribution_ks(df['t_dist'], t_cdf, 't(5)', "t-Distributed Data vs. t(5) Distribution")

### 3.2 Berkowitz Test

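The Berkowitz test is particularly useful for evaluating density forecasts. It applies the probability integral transform with the specified CDF, maps the result through the inverse standard normal CDF, and fits an AR(1) model to the transformed series. Under the null hypothesis the transformed data are i.i.d. standard normal, so a likelihood-ratio test checks jointly for zero mean, unit variance, and no first-order autocorrelation.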
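The following sketch shows one way the test can be carried out by hand. It uses the conditional Gaussian likelihood (conditioning on the first observation), for which ordinary least squares gives the maximum-likelihood AR(1) fit, and compares it with the likelihood under zero mean, unit variance, and zero autocorrelation. This is an assumption made for brevity: the toolbox's `berkowitz` may use the exact likelihood or a different parameterization, and `berkowitz_lr` is a hypothetical name.

```python
import numpy as np
from scipy import stats

def berkowitz_lr(u):
    """Hypothetical helper: conditional-likelihood Berkowitz LR test on PIT values u."""
    u = np.clip(np.asarray(u, dtype=float), 1e-10, 1 - 1e-10)  # keep the quantiles finite
    z = stats.norm.ppf(u)                                      # normal quantile transform
    y, x = z[1:], z[:-1]
    X = np.column_stack([np.ones_like(x), x])                  # constant and lagged value
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    resid = y - X @ coef
    sigma2 = np.mean(resid**2)                                 # ML variance estimate
    ll_alt = np.sum(stats.norm.logpdf(resid, scale=np.sqrt(sigma2)))
    ll_null = np.sum(stats.norm.logpdf(y))                     # mean 0, variance 1, rho 0
    lr = -2.0 * (ll_null - ll_alt)
    return lr, stats.chi2.sf(lr, df=3)                         # three restrictions

rng = np.random.default_rng(7)
sample = rng.standard_normal(1000)
lr_ok, p_ok = berkowitz_lr(stats.norm.cdf(sample))               # correct forecast density
lr_bad, p_bad = berkowitz_lr(stats.norm.cdf(sample, scale=0.5))  # forecast variance too small
print(f"correct density:      LR = {lr_ok:.2f}, p-value = {p_ok:.4f}")
print(f"misspecified density: LR = {lr_bad:.2f}, p-value = {p_bad:.4g}")
```

With the understated forecast variance, the transformed series has a variance well above one, which the likelihood-ratio statistic picks up.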

In [None]:
# Create a function to test and visualize Berkowitz test results
def test_distribution_berkowitz(data, cdf, cdf_name, title):
    # Calculate Berkowitz statistic
    berk_stat, p_value = berkowitz(data, cdf)
    
    # Transform the data using the CDF and then the inverse normal CDF
    u = np.array([cdf(x) for x in data])
    z = stats.norm.ppf(u)
    
    # Create plots to visualize the transformation
    fig, (ax1, ax2, ax3) = plt.subplots(1, 3, figsize=(18, 6))
    
    # Plot original data histogram
    ax1.hist(data, bins=50, density=True, alpha=0.7, color='blue')
    ax1.set_title('Original Data')
    ax1.set_xlabel('Value')
    ax1.set_ylabel('Density')
    
    # Plot uniform transformation (PIT)
    ax2.hist(u, bins=50, density=True, alpha=0.7, color='green')
    ax2.plot([0, 1], [1, 1], 'r-', linewidth=2, label='Uniform(0,1)')
    ax2.set_title('Probability Integral Transform (PIT)')
    ax2.set_xlabel('Probability')
    ax2.set_ylabel('Density')
    ax2.legend()
    ax2.set_xlim(0, 1)
    
    # Plot normal transformation
    ax3.hist(z, bins=50, density=True, alpha=0.7, color='purple')
    x = np.linspace(-4, 4, 100)
    ax3.plot(x, stats.norm.pdf(x), 'r-', linewidth=2, label='Standard Normal')
    ax3.set_title('Normal Quantile Transform')
    ax3.set_xlabel('Value')
    ax3.set_ylabel('Density')
    ax3.legend()
    
    plt.suptitle(f'Berkowitz Test: {title}', fontsize=16)
    plt.tight_layout(rect=[0, 0, 1, 0.95])
    plt.show()
    
    # Display test results
    print(f"Berkowitz Test Results for {title}:")
    print(f"Test Statistic: {berk_stat:.4f}")
    print(f"p-value: {p_value:.4f}")
    
    if p_value < 0.05:
        print(f"Conclusion: Reject {cdf_name} distribution (p < 0.05)")
    else:
        print(f"Conclusion: Cannot reject {cdf_name} distribution (p ≥ 0.05)")
    print("\n")
    
    return berk_stat, p_value


In [None]:
# Test normal data against normal distribution
test_distribution_berkowitz(df['normal'], normal_cdf, 'Normal', 'Normal Data vs. Normal Distribution')

In [None]:
# Test t-distributed data against normal distribution (should reject)
test_distribution_berkowitz(df['t_dist'], normal_cdf_for_t, 'Normal', "t-Distributed Data vs. Normal Distribution")

In [None]:
# Test t-distributed data against t distribution (should not reject)
test_distribution_berkowitz(df['t_dist'], t_cdf, 't(5)', "t-Distributed Data vs. t(5) Distribution")

## 4. Serial Correlation Tests

### 4.1 Ljung-Box Test

The Ljung-Box test examines whether there is significant autocorrelation in a time series up to a specified number of lags. The test statistic is defined as:

$$Q = n(n+2) \sum_{k=1}^{h} \frac{\hat{\rho}_k^2}{n-k}$$

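where $n$ is the sample size, $h$ is the number of lags, and $\hat{\rho}_k$ is the sample autocorrelation at lag $k$. Under the null hypothesis of no autocorrelation, $Q$ asymptotically follows a chi-squared distribution with $h$ degrees of freedom.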
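As a worked example of the formula, the sketch below computes $Q$ from the sample autocorrelations and refers it to a chi-squared distribution with $h$ degrees of freedom, which is the usual asymptotic reference for a raw series. `ljung_box_manual` is a hypothetical helper and may differ from the toolbox's `ljung_box` in details such as degrees-of-freedom adjustments for fitted models.

```python
import numpy as np
from scipy import stats

def ljung_box_manual(x, lags):
    """Hypothetical helper: Ljung-Box Q statistic and chi-squared(lags) p-value."""
    x = np.asarray(x, dtype=float)
    n = x.size
    xc = x - x.mean()
    denom = np.sum(xc**2)
    # Sample autocorrelations at lags 1..h
    rho = np.array([np.sum(xc[k:] * xc[:-k]) / denom for k in range(1, lags + 1)])
    q = n * (n + 2) * np.sum(rho**2 / (n - np.arange(1, lags + 1)))
    return q, stats.chi2.sf(q, df=lags)

rng = np.random.default_rng(5)
white_noise = rng.standard_normal(1000)

# AR(1) series driven by the same shocks
ar1 = np.zeros(1000)
for t in range(1, 1000):
    ar1[t] = 0.7 * ar1[t - 1] + white_noise[t]

q_wn, p_wn = ljung_box_manual(white_noise, lags=10)
q_ar, p_ar = ljung_box_manual(ar1, lags=10)
print(f"white noise: Q = {q_wn:.2f}, p-value = {p_wn:.4f}")
print(f"AR(1):       Q = {q_ar:.2f}, p-value = {p_ar:.4g}")
```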

In [None]:
# Create a function to test and visualize Ljung-Box results
def test_autocorrelation_lb(data, lags, title):
    # Calculate Ljung-Box statistic
    lb_stat, p_value = ljung_box(data, lags=lags)
    
    # Calculate autocorrelation function
    acf_values = np.array([1.0] + [np.corrcoef(data[:-i], data[i:])[0, 1] for i in range(1, lags+1)])
    
    # Create a plot to visualize the ACF
    plt.figure(figsize=(12, 6))
    
    # Plot ACF
    plt.bar(range(lags+1), acf_values, width=0.3, color='blue', alpha=0.7)
    plt.axhline(y=0, color='black', linestyle='-', alpha=0.3)
    
    # Add confidence bands
    conf_level = 1.96 / np.sqrt(len(data))  # 95% confidence bands
    plt.axhline(y=conf_level, color='red', linestyle='--', alpha=0.7, 
                label=f'95% Confidence Bands (±{conf_level:.4f})')
    plt.axhline(y=-conf_level, color='red', linestyle='--', alpha=0.7)
    
    # Add test results to the plot
    result_text = f"Ljung-Box Statistic (Q): {lb_stat:.4f}\n"
    result_text += f"p-value: {p_value:.4f}\n"
    result_text += f"Lags: {lags}\n"
    
    if p_value < 0.05:
        result_text += "Conclusion: Reject no autocorrelation (p < 0.05)"
    else:
        result_text += "Conclusion: Cannot reject no autocorrelation (p ≥ 0.05)"
    
    # Add a text box with the results
    props = dict(boxstyle='round', facecolor='wheat', alpha=0.5)
    plt.text(0.05, 0.95, result_text, transform=plt.gca().transAxes, fontsize=12,
             verticalalignment='top', bbox=props)
    
    plt.title(f'Ljung-Box Test: {title}')
    plt.xlabel('Lag')
    plt.ylabel('Autocorrelation')
    plt.legend()
    plt.grid(True, alpha=0.3)
    plt.xticks(range(lags+1))
    plt.xlim(-0.5, lags+0.5)
    plt.show()
    
    return lb_stat, p_value, acf_values


In [None]:
# Test white noise data (should not reject)
test_autocorrelation_lb(df['normal'], lags=20, title='Normal Data (White Noise)')

In [None]:
# Test AR(1) process (should reject)
test_autocorrelation_lb(df['ar1'], lags=20, title='AR(1) Process')

In [None]:
# Test GARCH process returns (should not show significant autocorrelation)
test_autocorrelation_lb(df['garch'], lags=20, title='GARCH(1,1) Process Returns')

In [None]:
# Test GARCH process squared returns (should show significant autocorrelation)
test_autocorrelation_lb(df['garch']**2, lags=20, title='GARCH(1,1) Process Squared Returns')

### 4.2 Ljung-Box Test with Multiple Lags

Let's examine how the Ljung-Box test results change with different lag specifications:

In [None]:
# Test with multiple lags
lag_values = [5, 10, 15, 20, 25, 30]
lb_results = {}

for series_name in ['normal', 'ar1', 'garch', 'garch_squared']:
    # Use the original series or create squared returns for GARCH
    if series_name == 'garch_squared':
        series = df['garch']**2
        display_name = 'GARCH Squared Returns'
    else:
        series = df[series_name]
        display_name = series_name.capitalize()
    
    # Test with different lags
    lb_results[display_name] = {}
    for lag in lag_values:
        lb_stat, p_value = ljung_box(series, lags=lag)
        lb_results[display_name][lag] = {
            'Q-statistic': lb_stat,
            'p-value': p_value,
            'Reject': 'Yes' if p_value < 0.05 else 'No'
        }

# Create a DataFrame with p-values for different lags
p_values = {}
for series_name, results in lb_results.items():
    p_values[series_name] = [results[lag]['p-value'] for lag in lag_values]

p_values_df = pd.DataFrame(p_values, index=lag_values)
p_values_df.index.name = 'Lags'
p_values_df

In [None]:
# Visualize p-values for different lags
plt.figure(figsize=(12, 6))

for col in p_values_df.columns:
    plt.plot(p_values_df.index, p_values_df[col], marker='o', label=col)

plt.axhline(y=0.05, color='r', linestyle='--', label='5% Significance Level')
plt.title('Ljung-Box Test p-values for Different Lags')
plt.xlabel('Number of Lags')
plt.ylabel('p-value')
plt.legend()
plt.grid(True, alpha=0.3)
plt.xticks(lag_values)
plt.show()

## 5. ARCH Effect Tests

### 5.1 Lagrange Multiplier (LM) Test

The Lagrange Multiplier test examines whether there are ARCH effects (conditional heteroskedasticity) in a time series. It regresses squared residuals on lagged squared residuals and tests the joint significance of all lagged squared residuals.
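The regression described above can be sketched in a few lines. This version follows Engle's usual formulation, in which the statistic is the number of regression observations times the $R^2$ of the auxiliary regression, compared with a chi-squared distribution whose degrees of freedom equal the number of lags. `arch_lm_manual` is a hypothetical helper that assumes zero-mean residuals; the toolbox's `lm_test` may handle demeaning and edge cases differently.

```python
import numpy as np
from scipy import stats

def arch_lm_manual(resid, lags):
    """Hypothetical helper: ARCH LM statistic (T * R^2) and chi-squared(lags) p-value."""
    e2 = np.asarray(resid, dtype=float) ** 2
    y = e2[lags:]                                            # squared residuals at time t
    lagged = [e2[lags - k:-k] for k in range(1, lags + 1)]   # squared residuals at t-1..t-lags
    X = np.column_stack([np.ones(y.size)] + lagged)
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    rss = np.sum((y - X @ coef) ** 2)
    tss = np.sum((y - y.mean()) ** 2)
    r2 = 1.0 - rss / tss
    lm = y.size * r2
    return lm, stats.chi2.sf(lm, df=lags)

rng = np.random.default_rng(6)
n = 2000
z = rng.standard_normal(n)

# ARCH(1) series: the conditional variance depends on the previous squared value
arch = np.zeros(n)
for t in range(1, n):
    sigma2 = 0.2 + 0.5 * arch[t - 1] ** 2
    arch[t] = np.sqrt(sigma2) * z[t]

lm_wn, p_wn = arch_lm_manual(z, lags=5)
lm_arch, p_arch = arch_lm_manual(arch, lags=5)
print(f"white noise: LM = {lm_wn:.2f}, p-value = {p_wn:.4f}")
print(f"ARCH(1):     LM = {lm_arch:.2f}, p-value = {p_arch:.4g}")
```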

In [None]:
# Create a function to test and visualize LM test results
def test_arch_effects_lm(data, lags, title):
    # Calculate LM statistic
    lm_stat, p_value = lm_test(data, lags=lags)
    
    # Calculate autocorrelation function of squared returns
    squared_data = data**2
    acf_values = np.array([1.0] + [np.corrcoef(squared_data[:-i], squared_data[i:])[0, 1] 
                                   for i in range(1, lags+1)])
    
    # Create plots to visualize the data and squared data
    fig, (ax1, ax2, ax3) = plt.subplots(3, 1, figsize=(12, 12))
    
    # Plot the original time series
    ax1.plot(range(len(data)), data)
    ax1.set_title(f'Original Time Series: {title}')
    ax1.set_xlabel('Time')
    ax1.set_ylabel('Value')
    
    # Plot the squared time series
    ax2.plot(range(len(squared_data)), squared_data)
    ax2.set_title(f'Squared Time Series: {title}')
    ax2.set_xlabel('Time')
    ax2.set_ylabel('Squared Value')
    
    # Plot ACF of squared data
    ax3.bar(range(lags+1), acf_values, width=0.3, color='blue', alpha=0.7)
    ax3.axhline(y=0, color='black', linestyle='-', alpha=0.3)
    
    # Add confidence bands
    conf_level = 1.96 / np.sqrt(len(data))  # 95% confidence bands
    ax3.axhline(y=conf_level, color='red', linestyle='--', alpha=0.7, 
                label=f'95% Confidence Bands (±{conf_level:.4f})')
    ax3.axhline(y=-conf_level, color='red', linestyle='--', alpha=0.7)
    
    ax3.set_title(f'ACF of Squared Time Series: {title}')
    ax3.set_xlabel('Lag')
    ax3.set_ylabel('Autocorrelation')
    ax3.legend()
    ax3.grid(True, alpha=0.3)
    ax3.set_xticks(range(lags+1))
    ax3.set_xlim(-0.5, lags+0.5)
    
    plt.tight_layout()
    plt.show()
    
    # Display test results
    print(f"Lagrange Multiplier (ARCH) Test Results for {title}:")
    print(f"LM Statistic: {lm_stat:.4f}")
    print(f"p-value: {p_value:.4f}")
    print(f"Lags: {lags}")
    
    if p_value < 0.05:
        print("Conclusion: Reject no ARCH effects (p < 0.05)")
    else:
        print("Conclusion: Cannot reject no ARCH effects (p ≥ 0.05)")
    print("\n")
    
    return lm_stat, p_value, acf_values


In [None]:
# Test white noise data (should not reject)
test_arch_effects_lm(df['normal'], lags=10, title='Normal Data (White Noise)')

In [None]:
# Test GARCH process (should reject)
test_arch_effects_lm(df['garch'], lags=10, title='GARCH(1,1) Process')

### 5.2 Comparing LM Test Results Across Datasets

In [None]:
# Test all datasets with LM test
lm_results = {}
for col in df.columns:
    lm_stat, p_value, _ = test_arch_effects_lm(df[col], lags=10, title=col)
    lm_results[col] = {
        'LM Statistic': lm_stat,
        'p-value': p_value,
        'Reject No ARCH': 'Yes' if p_value < 0.05 else 'No'
    }

# Create a DataFrame with the results
lm_results_df = pd.DataFrame(lm_results).T
lm_results_df

In [None]:
# Visualize the LM statistics and p-values
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))

# Plot LM statistics
ax1.bar(lm_results_df.index, lm_results_df['LM Statistic'])
ax1.set_title('LM Test Statistics')
ax1.set_ylabel('LM Statistic')
ax1.tick_params(axis='x', rotation=45)  # rotate labels without resetting the tick positions

# Add a horizontal line at the 5% critical value (chi-squared with 10 df)
critical_value = stats.chi2.ppf(0.95, 10)  # 10 lags
ax1.axhline(y=critical_value, color='r', linestyle='--', 
            label=f'5% Critical Value: {critical_value:.4f}')
ax1.legend()

# Plot p-values
ax2.bar(lm_results_df.index, lm_results_df['p-value'])
ax2.set_title('LM Test p-values')
ax2.set_ylabel('p-value')
ax2.tick_params(axis='x', rotation=45)  # rotate labels without resetting the tick positions

# Add a horizontal line at the 5% significance level
ax2.axhline(y=0.05, color='r', linestyle='--', label='5% Significance Level')
ax2.legend()

plt.tight_layout()
plt.show()

## 6. Model Residual Diagnostics

Statistical tests are often used to validate model assumptions by testing the residuals. Let's fit some models to our data and test the residuals.
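To illustrate the idea independently of any model class, the sketch below fits an AR(1) by ordinary least squares with NumPy and compares the lag-1 autocorrelation of the raw series with that of the residuals. It is a simplified stand-in, not the estimator used by the toolbox's `ARMA` class; the helper `lag1_autocorr` is hypothetical.

```python
import numpy as np

def lag1_autocorr(x):
    """Hypothetical helper: sample autocorrelation at lag 1."""
    xc = np.asarray(x, dtype=float) - np.mean(x)
    return np.sum(xc[1:] * xc[:-1]) / np.sum(xc**2)

# Simulate an AR(1) process with coefficient 0.7
rng = np.random.default_rng(3)
n, phi = 1000, 0.7
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi * y[t - 1] + rng.standard_normal()

# OLS regression of y_t on a constant and y_{t-1}
X = np.column_stack([np.ones(n - 1), y[:-1]])
coef = np.linalg.lstsq(X, y[1:], rcond=None)[0]
const_hat, phi_hat = coef
resid = y[1:] - X @ coef

r1_raw = lag1_autocorr(y)
r1_resid = lag1_autocorr(resid)
print(f"estimated phi: {phi_hat:.3f}")
print(f"lag-1 autocorrelation, raw series: {r1_raw:.3f}")
print(f"lag-1 autocorrelation, residuals:  {r1_resid:.3f}")
```

If the model is adequate, the residuals should show far less autocorrelation than the raw series, which is what the Ljung-Box test on the residuals checks formally.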

In [None]:
# Fit an AR(1) model to the AR(1) process
ar_model = ARMA(ar_order=1, ma_order=0, include_constant=True)
ar_results = ar_model.fit(df['ar1'])

# Get the residuals
ar_residuals = ar_results.residuals

# Display model summary
print(ar_results)

In [None]:
# Test the residuals for normality
test_normality_jb(ar_residuals, 'AR(1) Model Residuals')

In [None]:
# Test the residuals for autocorrelation
test_autocorrelation_lb(ar_residuals, lags=20, title='AR(1) Model Residuals')

In [None]:
# Test the residuals for ARCH effects
test_arch_effects_lm(ar_residuals, lags=10, title='AR(1) Model Residuals')

In [None]:
# Fit a GARCH(1,1) model to the GARCH process
garch_model = GARCH(p=1, q=1, mean='zero')
garch_results = garch_model.fit(df['garch'])

# Get the standardized residuals
garch_std_residuals = garch_results.standardized_residuals

# Display model summary
print(garch_results)

In [None]:
# Test the standardized residuals for normality
test_normality_jb(garch_std_residuals, 'GARCH(1,1) Model Standardized Residuals')

In [None]:
# Test the standardized residuals for autocorrelation
test_autocorrelation_lb(garch_std_residuals, lags=20, title='GARCH(1,1) Model Standardized Residuals')

In [None]:
# Test the standardized residuals for remaining ARCH effects (the test is based on their squares)
test_arch_effects_lm(garch_std_residuals, lags=10, title='GARCH(1,1) Model Standardized Residuals')

## 7. Property-Based Testing

Property-based testing is a powerful technique for validating statistical properties across a wide range of inputs. Let's demonstrate how to use property-based testing to verify distribution properties.
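The core idea is to state a property once and then check it against many randomly generated inputs rather than a few hand-picked cases. The sketch below does this with plain NumPy and `scipy.stats.t`, drawing random degrees of freedom, location, and scale; the helper names are hypothetical, and SciPy's distribution is used as a stand-in for the toolbox's distribution classes. Dedicated libraries such as Hypothesis automate the input generation and shrink failing cases.

```python
import numpy as np
from scipy import stats

def check_ppf_cdf_roundtrip(dist, points, tol=1e-6):
    """Property: PPF(CDF(x)) is approximately x."""
    return bool(np.allclose(dist.ppf(dist.cdf(points)), points, atol=tol))

def check_cdf_monotone(dist, grid):
    """Property: the CDF never decreases (tiny tolerance for floating-point noise)."""
    return bool(np.all(np.diff(dist.cdf(grid)) >= -1e-12))

rng = np.random.default_rng(4)
unit_points = np.linspace(-3, 3, 13)
unit_grid = np.linspace(-5, 5, 201)

failures = []
for _ in range(200):
    # Randomly generated parameters: each draw is one test case
    df = rng.uniform(2.1, 30.0)
    loc = rng.uniform(-2.0, 2.0)
    scale = rng.uniform(0.1, 5.0)
    dist = stats.t(df=df, loc=loc, scale=scale)

    ok = (check_ppf_cdf_roundtrip(dist, loc + scale * unit_points)
          and check_cdf_monotone(dist, loc + scale * unit_grid))
    if not ok:
        failures.append((df, loc, scale))   # keep the counterexample for inspection

print(f"{len(failures)} failing parameter sets out of 200")
```

Recording the failing parameter sets, rather than only a pass rate, makes it possible to reproduce and investigate a counterexample.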

In [None]:
# Define a function to test distribution properties
def test_distribution_properties(dist_name, dist_params, n_samples=1000, n_tests=100):
    # Create the distribution
    if dist_name == 'normal':
        dist = normal.Normal(**dist_params)
    elif dist_name == 'student_t':
        dist = student_t.StudentT(**dist_params)
    elif dist_name == 'generalized_error':
        dist = generalized_error.GeneralizedError(**dist_params)
    elif dist_name == 'skewed_t':
        dist = skewed_t.SkewedT(**dist_params)
    else:
        raise ValueError(f"Unknown distribution: {dist_name}")
    
    # Define properties to test
    properties = [
        {
            'name': 'PDF integrates to 1',
            'test': lambda: np.abs(np.trapz(dist.pdf(np.linspace(-10, 10, 1000)), 
                                           np.linspace(-10, 10, 1000)) - 1) < 0.01
        },
        {
            'name': 'CDF is monotonically increasing',
            'test': lambda: np.all(np.diff(dist.cdf(np.linspace(-10, 10, 1000))) >= 0)
        },
        {
            'name': 'CDF(x) = P(X ≤ x)',
            'test': lambda: test_cdf_property(dist, n_samples)
        },
        {
            'name': 'PDF is non-negative',
            'test': lambda: np.all(dist.pdf(np.linspace(-10, 10, 1000)) >= 0)
        },
        {
            'name': 'PPF(CDF(x)) ≈ x',
            'test': lambda: test_ppf_cdf_property(dist)
        }
    ]
    
    # Run the tests
    results = {}
    for prop in properties:
        success_count = 0
        for _ in range(n_tests):
            try:
                if prop['test']():
                    success_count += 1
            except Exception as e:
                print(f"Error testing {prop['name']}: {e}")
        
        results[prop['name']] = {
            'success_rate': success_count / n_tests,
            'passed': success_count == n_tests
        }
    
    return results

# Helper function to test CDF property
def test_cdf_property(dist, n_samples):
    # Generate random samples
    samples = dist.rvs(size=n_samples)
    
    # Test points
    test_points = np.linspace(np.percentile(samples, 5), np.percentile(samples, 95), 10)
    
    # For each test point, check if CDF(x) ≈ proportion of samples ≤ x
    for x in test_points:
        cdf_value = dist.cdf(x)
        empirical_cdf = np.mean(samples <= x)
        if abs(cdf_value - empirical_cdf) > 0.05:  # Allow 5% error due to sampling
            return False
    
    return True

# Helper function to test PPF(CDF(x)) ≈ x property
def test_ppf_cdf_property(dist):
    # Test points
    test_points = np.linspace(-3, 3, 10)
    
    # For each test point, check if PPF(CDF(x)) ≈ x
    for x in test_points:
        y = dist.ppf(dist.cdf(x))
        if abs(x - y) > 0.01:  # Allow small numerical error
            return False
    
    return True


In [None]:
# Test normal distribution properties
normal_props = test_distribution_properties('normal', {'mu': 0, 'sigma': 1})

# Test Student's t distribution properties
t_props = test_distribution_properties('student_t', {'nu': 5})

# Test generalized error distribution properties
ged_props = test_distribution_properties('generalized_error', {'nu': 1.5})

# Test skewed t distribution properties
skewed_t_props = test_distribution_properties('skewed_t', {'nu': 5, 'lambda': 0.3})

# Combine results
all_props = {
    'Normal': normal_props,
    "Student's t": t_props,
    'Generalized Error': ged_props,
    'Skewed t': skewed_t_props
}

# Create a DataFrame with the results
prop_results = {}
for dist_name, props in all_props.items():
    for prop_name, result in props.items():
        if prop_name not in prop_results:
            prop_results[prop_name] = {}
        prop_results[prop_name][dist_name] = 'Pass' if result['passed'] else 'Fail'

prop_results_df = pd.DataFrame(prop_results)
prop_results_df

## 8. Asynchronous Testing

The MFE Toolbox provides asynchronous versions of all statistical tests, which can be useful for non-blocking execution in interactive environments or when processing multiple datasets in parallel.

In [None]:
# Define an asynchronous function to run multiple tests concurrently
async def run_all_tests_async(data):
    # Build one coroutine per test
    coroutines = {
        'Jarque-Bera': jarque_bera_async(data),
        'Kolmogorov-Smirnov': kolmogorov_smirnov_async(data, stats.norm.cdf),
        'Ljung-Box': ljung_box_async(data, lags=10),
        'LM Test': lm_test_async(data, lags=10)
    }
    
    # asyncio.gather runs the coroutines concurrently and returns results in input order
    # (awaiting them one by one in a loop would run them sequentially)
    outcomes = await asyncio.gather(*coroutines.values())
    
    # Collect the results
    results = {}
    for name, (stat, p_value) in zip(coroutines, outcomes):
        results[name] = {
            'Statistic': stat,
            'p-value': p_value,
            'Reject Null': 'Yes' if p_value < 0.05 else 'No'
        }
    
    return pd.DataFrame(results).T

# Run all tests asynchronously for normal data
normal_results = await run_all_tests_async(df['normal'])
normal_results

In [None]:
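# Run all tests for all datasets concurrently
async def test_all_datasets_async(data_dict):
    names = list(data_dict)
    # Schedule one test suite per dataset instead of awaiting them one after another
    tables = await asyncio.gather(*(run_all_tests_async(data_dict[name]) for name in names))
    return dict(zip(names, tables))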

# Run the tests
all_results = await test_all_datasets_async(df.to_dict('series'))

# Display results for each dataset
for name, result in all_results.items():
    print(f"\nTest Results for {name}:")
    display(result)

## 9. Conclusion

In this notebook, we've demonstrated the use of statistical tests in the MFE Toolbox for analyzing financial time series data. We've covered:

1. **Normality Tests**: Using the Jarque-Bera test to check if data follows a normal distribution
2. **Distribution Tests**: Using the Kolmogorov-Smirnov and Berkowitz tests to check if data follows specific distributions
3. **Serial Correlation Tests**: Using the Ljung-Box test to check for autocorrelation in time series
4. **ARCH Effect Tests**: Using the Lagrange Multiplier test to check for conditional heteroskedasticity
5. **Model Residual Diagnostics**: Testing model residuals to validate model assumptions
6. **Property-Based Testing**: Using property-based testing to verify distribution properties
7. **Asynchronous Testing**: Using async/await for non-blocking test execution

These statistical tests are essential tools for validating model assumptions, checking distribution properties, and detecting patterns in financial data. The MFE Toolbox provides a comprehensive set of tests with a consistent interface, making it easy to incorporate them into your financial analysis workflow.

The Python implementation leverages NumPy, SciPy, and Pandas for efficient computation and data handling, with performance-critical calculations accelerated using Numba's just-in-time compilation. The asynchronous interfaces enable non-blocking execution, which is particularly valuable in interactive environments or when processing multiple datasets in parallel.