# Method Comparison - What Works Best?

**Problem:** There are multiple ways to estimate the Hurst exponent and detect long memory.

**Goal:** Try different approaches side-by-side and see which works for YOUR data.

## Methods We'll Compare:
1. **R/S Analysis** (Rescaled Range) - Classic Hurst method
2. **Climacogram** - Variance scaling
3. **Different parameters** - Does window size matter?
4. **Detrending effects** - Raw vs. detrended data

**Philosophy:** Try stuff, see what sticks!

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import signal

plt.style.use('seaborn-v0_8-notebook')
%matplotlib inline

import sys
sys.path.insert(0, '../Python')

from hurst import hurst_rs
from climacogram import compute_climacogram

print("Ready to compare methods!")

## 1. Generate Test Data

Create series with KNOWN properties to test methods.
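
If you want a test series whose Hurst exponent is known by construction for any H (not only 0.5), one option is fractional Gaussian noise (fGn). The sketch below is an optional addition and not part of the `hurst` or `climacogram` modules. The function name `fgn_cholesky` is hypothetical. It draws fGn by Cholesky factorization of the exact autocovariance, which costs O(n³), so keep `n` modest.

```python
import numpy as np

def fgn_cholesky(n, hurst, seed=None):
    """Draw n samples of unit-variance fractional Gaussian noise (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    k = np.arange(n)
    # fGn autocovariance at lag k: 0.5 * (|k+1|^2H - 2|k|^2H + |k-1|^2H)
    gamma = 0.5 * (np.abs(k + 1) ** (2 * hurst)
                   - 2 * np.abs(k) ** (2 * hurst)
                   + np.abs(k - 1) ** (2 * hurst))
    # Toeplitz covariance matrix: entry (i, j) depends only on |i - j|
    cov = gamma[np.abs(k[:, None] - k[None, :])]
    chol = np.linalg.cholesky(cov)
    # Colour independent standard normals with the Cholesky factor
    return chol @ rng.standard_normal(n)

persistent = fgn_cholesky(500, hurst=0.8, seed=42)      # long memory
antipersistent = fgn_cholesky(500, hurst=0.3, seed=42)  # H < 0.5
print(persistent[:3], antipersistent[:3])
```

With `hurst=0.5` the covariance matrix is the identity, so the output reduces to plain white noise.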

In [None]:
np.random.seed(42)
n = 2000

# Known H=0.5 (white noise)
white_noise = np.random.randn(n)

# Known H≈1.0 (random walk)
random_walk = np.cumsum(np.random.randn(n))

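# AR(1) with negative lag-1 correlation: estimates are pulled below the white-noise value,
# but the process has short memory, so the asymptotic H is still 0.5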
ar_series = np.zeros(n)
for i in range(1, n):
    ar_series[i] = -0.3 * ar_series[i-1] + np.random.randn()

test_series = {
    'White Noise (H≈0.5)': white_noise,
    'Random Walk (H≈1.0)': random_walk,
    'AR(-0.3) (H<0.5)': ar_series
}

print("✓ Created 3 test series with known properties")

## 2. Method 1: R/S Analysis (Your Current Method)
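
The internals of `hurst_rs` are not shown in this notebook, so here is a minimal sketch of the classic rescaled-range idea for orientation. It is an assumption about the general technique, not the module's actual implementation. The name `rs_hurst_sketch` and its defaults are hypothetical.

```python
import numpy as np

def rs_hurst_sketch(x, min_window=10, num_windows=20):
    """Estimate H as the slope of log(R/S) against log(window size)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Log-spaced window sizes between min_window and half the series length
    sizes = np.unique(
        np.logspace(np.log10(min_window), np.log10(n // 2), num_windows).astype(int)
    )
    log_sizes, log_rs = [], []
    for w in sizes:
        rs_vals = []
        for start in range(0, n - w + 1, w):      # non-overlapping windows
            seg = x[start:start + w]
            dev = np.cumsum(seg - seg.mean())     # cumulative deviation from window mean
            r = dev.max() - dev.min()             # range of the cumulative deviation
            s = seg.std()                         # standard deviation of the window
            if s > 0:
                rs_vals.append(r / s)
        if rs_vals:
            log_sizes.append(np.log(w))
            log_rs.append(np.log(np.mean(rs_vals)))
    slope, _ = np.polyfit(log_sizes, log_rs, 1)   # slope of the log-log fit is H
    return slope

rng = np.random.default_rng(42)
print(f"White noise: H = {rs_hurst_sketch(rng.standard_normal(2000)):.3f}")
```

R/S estimates on short series tend to sit somewhat above 0.5 even for white noise, which is worth keeping in mind when reading the results below.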

In [None]:
print("R/S Analysis (Rescaled Range Method)")
print("=" * 60)

rs_results = {}

for name, data in test_series.items():
    result = hurst_rs(data, min_window=10, num_windows=30)
    rs_results[name] = result
    
    h = result['hurst']
    r2 = result['r_squared']
    print(f"{name:25s}  H={h:.4f}  R²={r2:.4f}")

print("\n💡 Check: Do estimated H values match expected?")

In [None]:
# Visualize R/S scaling for all
fig, axes = plt.subplots(1, 3, figsize=(16, 5))

for ax, (name, result) in zip(axes, rs_results.items()):
    ax.scatter(result['log_window_sizes'], result['log_rs_values'], 
               alpha=0.6, s=60)
    ax.plot(result['log_window_sizes'], result['fitted_log_rs'], 
            'r--', linewidth=2, label=f"H={result['hurst']:.3f}")
    ax.set_xlabel('log(Window Size)')
    ax.set_ylabel('log(R/S)')
    ax.set_title(name)
    ax.legend()
    ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("👀 Look for: Straight lines = good fit")

## 3. Method 2: Climacogram (Variance Scaling)
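
For orientation, a climacogram is usually built by averaging the series over non-overlapping blocks of length k and taking the variance of those block means at each scale. The sketch below assumes that definition. `compute_climacogram` may differ in detail, and the name `climacogram_sketch` is hypothetical.

```python
import numpy as np

def climacogram_sketch(x, max_scale=100):
    """Variance of non-overlapping block means at scales 1..max_scale."""
    x = np.asarray(x, dtype=float)
    scales = np.arange(1, max_scale + 1)
    variances = np.full(len(scales), np.nan)
    for i, k in enumerate(scales):
        m = len(x) // k                  # number of complete blocks at this scale
        if m < 2:
            continue                     # too few blocks for a variance; leave NaN
        block_means = x[:m * k].reshape(m, k).mean(axis=1)
        variances[i] = block_means.var(ddof=1)
    return scales, variances

rng = np.random.default_rng(0)
scales, variances = climacogram_sketch(rng.standard_normal(20000))
slope = np.polyfit(np.log10(scales), np.log10(variances), 1)[0]
print(f"White-noise slope: {slope:.3f}")
```

Scales with fewer than two complete blocks stay NaN, which is why the cell below masks NaNs before fitting the slope.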

In [None]:
print("Climacogram (Variance Scaling Method)")
print("=" * 60)

fig, axes = plt.subplots(1, 3, figsize=(16, 5))

climaco_results = {}

for ax, (name, data) in zip(axes, test_series.items()):
    scales, variances = compute_climacogram(data, max_scale=100)
    
    # Remove NaNs for slope calculation
    valid = ~np.isnan(variances)
    log_scales = np.log10(scales[valid])
    log_vars = np.log10(variances[valid])
    
    # Fit slope
    slope = np.polyfit(log_scales, log_vars, 1)[0]
    climaco_results[name] = slope
    
    # Plot
    ax.loglog(scales, variances, 'o-', markersize=4, alpha=0.7)
    ax.set_xlabel('Scale')
    ax.set_ylabel('Variance')
    ax.set_title(f"{name}\nSlope={slope:.3f}")
    ax.grid(True, alpha=0.3, which='both')

plt.tight_layout()
plt.show()

print("\nClimacogram Slopes:")
for name, slope in climaco_results.items():
    print(f"{name:25s}  Slope={slope:.4f}")

print("\n💡 For white noise, slope ≈ -1")
print("   For H > 0.5, slope > -1 (flatter)")

## 4. Comparison: Do Methods Agree?

**Question:** Do R/S and Climacogram give similar results?
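
The two methods report different quantities, so a direct comparison needs a common scale. If the climacogram is the variance of the scale-averaged series and the series is stationary, the variance scales as k^(2H - 2), which gives H = 1 + slope / 2. The helper below is a small sketch under that assumption. Its name is hypothetical.

```python
def climacogram_slope_to_hurst(slope):
    """Convert a log-log climacogram slope to H, assuming Var(k) ~ k**(2H - 2)."""
    return 1.0 + slope / 2.0

# Illustrative slopes only, not results from this notebook
for slope in (-1.0, -0.5, -1.4):
    print(f"slope={slope:+.2f}  ->  H={climacogram_slope_to_hurst(slope):.2f}")
```

Applying this to each value in `climaco_results` would give a column that can be read next to the R/S Hurst estimates.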

In [None]:
# Compare methods
comparison = []

for name in test_series.keys():
    comparison.append({
        'Series': name,
        'R/S Hurst': rs_results[name]['hurst'],
        'R/S R²': rs_results[name]['r_squared'],
        'Climaco Slope': climaco_results[name]
    })

comp_df = pd.DataFrame(comparison)
display(comp_df)

## 5. Parameter Sensitivity Test

**Critical question:** Do results change dramatically with parameters?

**Stable = Good | Unstable = Suspicious**
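
To judge whether a spread in H is large, it helps to know how much an estimate varies across independent realizations of the same process. The sketch below measures that baseline on white noise. `estimator_spread` and `toy_hurst` are hypothetical names, and `toy_hurst` is only a stand-in estimator based on the variance of block means. In this notebook you could pass `lambda x: hurst_rs(x)['hurst']` instead.

```python
import numpy as np

def estimator_spread(estimator, n=2000, n_trials=50, seed=0):
    """Mean and std of an H estimator over independent white-noise series."""
    rng = np.random.default_rng(seed)
    estimates = np.array([estimator(rng.standard_normal(n)) for _ in range(n_trials)])
    return estimates.mean(), estimates.std(ddof=1)

def toy_hurst(x, scales=(1, 2, 4, 8, 16, 32)):
    """Stand-in estimator: H = 1 + slope / 2 from the variance of block means."""
    variances = [
        x[:len(x) // k * k].reshape(-1, k).mean(axis=1).var(ddof=1) for k in scales
    ]
    slope = np.polyfit(np.log(scales), np.log(variances), 1)[0]
    return 1.0 + slope / 2.0

mean_h, std_h = estimator_spread(toy_hurst)
print(f"White-noise baseline: mean H = {mean_h:.3f}, std = {std_h:.3f}")
```

If the spread across parameter choices is no larger than this realization-to-realization spread, the parameter sensitivity is probably not the main source of uncertainty.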

In [None]:
# Test different min_window values on white noise
test_data = white_noise

min_windows = [8, 16, 32, 50, 75, 100]
sensitivity = []

for min_win in min_windows:
    try:
        result = hurst_rs(test_data, min_window=min_win, num_windows=25)
        sensitivity.append({
            'min_window': min_win,
            'Hurst': result['hurst'],
            'R²': result['r_squared']
        })
    except Exception as exc:
        # Report failures instead of silently dropping the data point
        print(f"min_window={min_win} failed: {exc}")

sens_df = pd.DataFrame(sensitivity)

# Plot
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 5))

ax1.plot(sens_df['min_window'], sens_df['Hurst'], 'o-', markersize=10, linewidth=2)
ax1.axhline(y=0.5, color='red', linestyle='--', linewidth=2, alpha=0.5, label='Expected H=0.5')
ax1.set_xlabel('Minimum Window Size')
ax1.set_ylabel('Estimated Hurst')
ax1.set_title('Parameter Sensitivity (White Noise)')
ax1.legend()
ax1.grid(True, alpha=0.3)

ax2.plot(sens_df['min_window'], sens_df['R²'], 'o-', markersize=10, linewidth=2, color='orange')
ax2.set_xlabel('Minimum Window Size')
ax2.set_ylabel('R² (Fit Quality)')
ax2.set_title('Fit Quality')
ax2.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# Stability check
h_std = sens_df['Hurst'].std()
h_mean = sens_df['Hurst'].mean()
print(f"\nStability Analysis:")
print(f"  Mean H: {h_mean:.4f}")
print(f"  Std H: {h_std:.4f}")
print(f"  CV: {h_std/h_mean*100:.2f}%")

if h_std < 0.05:
    print("  ✓ Very stable")
elif h_std < 0.1:
    print("  ✓ Reasonably stable")
else:
    print("  ⚠ Unstable - results depend heavily on parameters!")

## 6. Detrending Effects

**Question:** Does detrending help or hurt?
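
Linear detrending subtracts the least-squares straight line from the series. The sketch below checks, on synthetic data invented for illustration, that `scipy.signal.detrend` with its default linear mode matches a manual `np.polyfit` fit.

```python
import numpy as np
from scipy import signal

rng = np.random.default_rng(0)
t = np.arange(500)
series = 0.02 * t + rng.standard_normal(500)   # synthetic trend plus noise

# Manual detrending: fit a straight line and subtract it
slope, intercept = np.polyfit(t, series, 1)
manual = series - (slope * t + intercept)

# Library detrending (default type='linear')
auto = signal.detrend(series)

print("Max difference:", np.abs(manual - auto).max())
```

Only a straight line is removed. A curved or piecewise trend would leave residual structure that can still inflate H.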

In [None]:
# Create series with trend
trend = np.linspace(0, 50, n)
noise = np.random.randn(n) * 5
trending_series = trend + noise

# Detrend (remove linear trend)
detrended = signal.detrend(trending_series)

# Compare
h_raw = hurst_rs(trending_series)['hurst']
h_detrended = hurst_rs(detrended)['hurst']

print("Detrending Comparison")
print("=" * 40)
print(f"Raw (with trend):    H = {h_raw:.4f}")
print(f"Detrended:           H = {h_detrended:.4f}")
print(f"Difference:          ΔH = {abs(h_raw - h_detrended):.4f}")

# Visualize
fig, axes = plt.subplots(2, 2, figsize=(14, 8))

# Raw series
axes[0, 0].plot(trending_series, linewidth=0.8)
axes[0, 0].set_title(f'Raw (with trend) - H={h_raw:.3f}')
axes[0, 0].grid(True, alpha=0.3)

# Detrended
axes[0, 1].plot(detrended, linewidth=0.8)
axes[0, 1].set_title(f'Detrended - H={h_detrended:.3f}')
axes[0, 1].grid(True, alpha=0.3)

# R/S plots
result_raw = hurst_rs(trending_series)
result_det = hurst_rs(detrended)

axes[1, 0].scatter(result_raw['log_window_sizes'], result_raw['log_rs_values'], alpha=0.6)
axes[1, 0].plot(result_raw['log_window_sizes'], result_raw['fitted_log_rs'], 'r--', linewidth=2)
axes[1, 0].set_xlabel('log(Window)')
axes[1, 0].set_ylabel('log(R/S)')
axes[1, 0].set_title('Raw R/S Scaling')
axes[1, 0].grid(True, alpha=0.3)

axes[1, 1].scatter(result_det['log_window_sizes'], result_det['log_rs_values'], alpha=0.6)
axes[1, 1].plot(result_det['log_window_sizes'], result_det['fitted_log_rs'], 'r--', linewidth=2)
axes[1, 1].set_xlabel('log(Window)')
axes[1, 1].set_ylabel('log(R/S)')
axes[1, 1].set_title('Detrended R/S Scaling')
axes[1, 1].grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("\n💡 Detrending is usually recommended for data with visible trends")

## 7. Test on YOUR Data

Load your own data and compare methods.
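
One common way to get a series into the shape the functions here expect is to read a CSV with pandas and convert one column to a 1-D float array. The sketch below uses an in-memory stand-in so it runs as is. The column names `date` and `value` are hypothetical, and for a real file you would pass its path to `pd.read_csv`.

```python
import io
import numpy as np
import pandas as pd

# In-memory stand-in for a real file; note the missing value on the second row
csv_text = "date,value\n2024-01-01,1.2\n2024-01-02,\n2024-01-03,0.7\n2024-01-04,1.9\n"
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])

# Drop missing values and convert to a plain 1-D float array
your_data = df["value"].dropna().to_numpy(dtype=float)

# Both methods assume finite values, so check before estimating
assert np.isfinite(your_data).all()
print(len(your_data), "observations loaded")
```

Dropping rows changes the spacing of a time series, so for data with many gaps interpolation may be a better choice than `dropna()`.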

In [None]:
# Load your data here
# your_data = ...

# Example placeholder:
your_data = random_walk  # REPLACE THIS

print("Comparing methods on your data...")
print("=" * 60)

# R/S method
rs_result = hurst_rs(your_data, min_window=10, num_windows=30)
print(f"R/S Hurst:          {rs_result['hurst']:.4f}  (R²={rs_result['r_squared']:.4f})")

# Climacogram
scales, variances = compute_climacogram(your_data, max_scale=min(100, len(your_data)//2))
valid = ~np.isnan(variances)
slope = np.polyfit(np.log10(scales[valid]), np.log10(variances[valid]), 1)[0]
print(f"Climacogram Slope:  {slope:.4f}")

# Detrended
detrended_data = signal.detrend(your_data)
rs_detrended = hurst_rs(detrended_data, min_window=10, num_windows=30)
print(f"Detrended Hurst:    {rs_detrended['hurst']:.4f}  (R²={rs_detrended['r_squared']:.4f})")

print("\n" + "=" * 60)
print("Which method do you trust most for YOUR data?")

## 8. Summary & Recommendations

**What did we learn?**

Run this cell to generate summary recommendations.

In [None]:
print("\n" + "="*70)
print("SUMMARY & RECOMMENDATIONS")
print("="*70)

print("\n1. R/S Method:")
print("   - Good for: General purpose")
print("   - Watch for: Low R² values (< 0.9)")
print("   - Tip: Check parameter sensitivity")

print("\n2. Climacogram:")
print("   - Good for: Visual inspection")
print("   - Watch for: Non-linear patterns in log-log plot")
print("   - Tip: Slope ≈ -1 means H ≈ 0.5")

print("\n3. Detrending:")
print("   - Use when: Data has visible trend")
print("   - Don't use when: Trend is part of the process")
print("   - Tip: Compare both, see which makes sense")

print("\n4. Parameter Selection:")
print("   - min_window: Start with 10-20")
print("   - num_windows: Use 20-30 for stability")
print("   - ALWAYS test sensitivity!")

print("\n" + "="*70)
print("💡 BOTTOM LINE: Try multiple methods, compare results.")
print("   If they disagree wildly, something is wrong!")
print("="*70)

In [None]:
# Your experiments!
