# Time Series Exploration - Play Around!

This notebook is for **trying stuff and seeing what sticks**.

No production code, no perfect documentation - just exploration.

## What's here:
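1. Synthetic test series
2. Hurst exponent (R/S) with different parameters
3. Visual comparison of methods (R/S scaling plots, climacogram)
4. Loading and analyzing your own data
5. Side-by-side comparison of several series
6. Try your own ideas!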

**Workflow:** Change parameters → Run cell → See what happens

In [None]:
# Setup
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from pathlib import Path

# Make plots look nice
plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette("husl")
%matplotlib inline

# Import your tools
import sys
sys.path.insert(0, '../Python')

from hurst import hurst_rs
from climacogram import compute_climacogram, plot_climacogram

print("✓ Ready to explore!")

## 1. Quick Test - Generate Synthetic Data

Try different processes to see what Hurst values you get.
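
Before trusting a label like "mean reverting", it helps to check the sign of the short-range correlation. The sketch below is only an illustration: the helpers `ar1` and `lag1_autocorr` and the coefficient `-0.5` are assumptions made for this example, not part of the notebook's tools. It shows that an AR(1) series with a positive coefficient is positively autocorrelated (which tends to push R/S estimates above 0.5), while a negative coefficient gives the sign-flipping behavior usually meant by "anti-persistent".

```python
import numpy as np

def ar1(phi, n, rng):
    """Simulate x[i] = phi * x[i-1] + noise (hypothetical helper)."""
    x = np.zeros(n)
    noise = rng.standard_normal(n)
    for i in range(1, n):
        x[i] = phi * x[i - 1] + noise[i]
    return x

def lag1_autocorr(x):
    """Sample autocorrelation at lag 1."""
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))

rng = np.random.default_rng(42)
smooth = ar1(0.9, 1000, rng)    # positive coefficient: slow, smooth swings
choppy = ar1(-0.5, 1000, rng)   # negative coefficient: tends to flip sign each step

print(f"phi =  0.9 -> lag-1 autocorrelation {lag1_autocorr(smooth):+.2f}")
print(f"phi = -0.5 -> lag-1 autocorrelation {lag1_autocorr(choppy):+.2f}")
```

Feeding both series to the Hurst estimator is a quick way to see how much short-range correlation alone moves the result.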

In [None]:
# Generate different types of series
np.random.seed(42)
n = 1000

# White noise (H should be ~0.5)
white_noise = np.random.randn(n)

# Random walk (H should be ~1.0)
random_walk = np.cumsum(np.random.randn(n))

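# AR(1) with coefficient 0.9: stationary (mean-reverting) but positively
# autocorrelated, so R/S usually reports H > 0.5 here, not < 0.5.
# Try a negative coefficient (e.g. -0.5) for anti-persistent behavior.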
mean_reverting = np.zeros(n)
for i in range(1, n):
    mean_reverting[i] = 0.9 * mean_reverting[i-1] + np.random.randn()

# Linear trend plus noise (R/S reports H > 0.5, driven by the trend itself)
trending = np.linspace(0, 100, n) + np.random.randn(n) * 10

print("Generated 4 test series")

In [None]:
# Quick visualization
fig, axes = plt.subplots(2, 2, figsize=(14, 8))

series_dict = {
    'White Noise': white_noise,
    'Random Walk': random_walk,
    'Mean Reverting': mean_reverting,
    'Trending': trending
}

for ax, (name, data) in zip(axes.flatten(), series_dict.items()):
    ax.plot(data, linewidth=0.8, alpha=0.7)
    ax.set_title(name)
    ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

## 2. Calculate Hurst - See What You Get

**Try changing:**
- `min_window`: Start window size
- `num_windows`: How many windows to test
- The series itself

In [None]:
# Calculate Hurst for all series
results = {}

for name, data in series_dict.items():
    result = hurst_rs(data, min_window=10, num_windows=20)
    results[name] = result
    
    print(f"{name:20} H = {result['hurst']:.4f}  (R² = {result['r_squared']:.4f})")

### Visualize the R/S scaling

This shows the log-log plot used to calculate H: the slope of the fitted line is the Hurst estimate.

**Good fit** = points follow a straight line
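
To make the plot less of a black box, here is a rough sketch of a classical rescaled-range (R/S) calculation. It is an assumption about the general technique, not the code inside `hurst_rs`: the function name `rs_hurst_sketch`, the log-spaced window sizes, and the non-overlapping chunks are all choices made for this example.

```python
import numpy as np

def rs_hurst_sketch(x, min_window=10, num_windows=20):
    """Rough R/S estimate of H (illustrative, not the notebook's hurst_rs)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Log-spaced window sizes from min_window up to half the series length
    sizes = np.unique(
        np.logspace(np.log10(min_window), np.log10(n // 2), num_windows).astype(int)
    )
    log_sizes, log_rs = [], []
    for w in sizes:
        rs_vals = []
        for start in range(0, n - w + 1, w):        # non-overlapping chunks
            chunk = x[start:start + w]
            dev = np.cumsum(chunk - chunk.mean())   # cumulative deviation from the chunk mean
            r = dev.max() - dev.min()               # range of the cumulative deviations
            s = chunk.std()                         # standard deviation of the chunk
            if s > 0:
                rs_vals.append(r / s)
        if rs_vals:
            log_sizes.append(np.log(w))
            log_rs.append(np.log(np.mean(rs_vals)))
    # H is the slope of log(R/S) against log(window size)
    slope, _ = np.polyfit(log_sizes, log_rs, 1)
    return slope

rng = np.random.default_rng(0)
h_noise = rs_hurst_sketch(rng.standard_normal(2000))
h_walk = rs_hurst_sketch(np.cumsum(rng.standard_normal(2000)))
print(f"White noise: H = {h_noise:.3f}")
print(f"Random walk: H = {h_walk:.3f}")
```

Plain R/S is known to read somewhat above 0.5 on short uncorrelated series, so a white-noise estimate slightly over 0.5 is not by itself a sign of memory.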

In [None]:
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

for ax, (name, result) in zip(axes.flatten(), results.items()):
    # Plot log-log scaling
    ax.scatter(result['log_window_sizes'], result['log_rs_values'], 
               alpha=0.6, s=50, label='Data')
    ax.plot(result['log_window_sizes'], result['fitted_log_rs'], 
            'r--', linewidth=2, label=f'Fit (H={result["hurst"]:.3f})')
    
    ax.set_xlabel('log(Window Size)')
    ax.set_ylabel('log(R/S)')
    ax.set_title(f'{name} - R² = {result["r_squared"]:.4f}')
    ax.legend()
    ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

## 3. Try Different Parameters - See What Changes

**Experiment:** Does changing window sizes affect the result?

In [None]:
# Try different min_window values on random walk
test_series = random_walk

min_windows = [8, 16, 32, 64]
parameter_results = []

for min_win in min_windows:
    result = hurst_rs(test_series, min_window=min_win, num_windows=20)
    parameter_results.append({
        'min_window': min_win,
        'hurst': result['hurst'],
        'r_squared': result['r_squared']
    })

df = pd.DataFrame(parameter_results)
print(df)

In [None]:
# Plot how parameters affect result
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 4))

ax1.plot(df['min_window'], df['hurst'], 'o-', markersize=8)
ax1.set_xlabel('Minimum Window Size')
ax1.set_ylabel('Hurst Exponent')
ax1.set_title('How min_window affects H')
ax1.grid(True, alpha=0.3)

ax2.plot(df['min_window'], df['r_squared'], 'o-', markersize=8, color='orange')
ax2.set_xlabel('Minimum Window Size')
ax2.set_ylabel('R²')
ax2.set_title('Fit Quality')
ax2.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

## 4. Load Your Own Data

Replace the path with your actual data file.

In [None]:
# Example: Load water reserves data
# CHANGE THIS PATH to your data!
data_path = '../data/water_reserves.csv'

try:
    df_data = pd.read_csv(data_path)
    print("✓ Data loaded!")
    print(f"Shape: {df_data.shape}")
    print(f"Columns: {df_data.columns.tolist()}")
    print("\nFirst few rows:")
    display(df_data.head())
except FileNotFoundError:
    print("⚠ File not found. Update the path above!")
    df_data = None

In [None]:
# Extract a time series from your data
# MODIFY THIS based on your data structure

if df_data is not None:
    # Example: Get one reservoir's data
    # Adjust column names for your data!
    reservoir_name = 'Mornos'  # CHANGE THIS
    
    reservoir_data = df_data[df_data['Reservoir'] == reservoir_name]
    values = reservoir_data['Value'].values
    
    print(f"Extracted {len(values)} points for {reservoir_name}")
    
    # Quick plot
    plt.figure(figsize=(14, 4))
    plt.plot(values)
    plt.title(f'{reservoir_name} Time Series')
    plt.xlabel('Time')
    plt.ylabel('Value')
    plt.grid(True, alpha=0.3)
    plt.show()

In [None]:
# Calculate Hurst for your data
if df_data is not None and len(values) > 50:
    my_result = hurst_rs(values, min_window=10, num_windows=20)
    
    print(f"Hurst exponent: {my_result['hurst']:.4f}")
    print(f"R²: {my_result['r_squared']:.4f}")
    
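    # Interpretation (an estimate never lands on exactly 0.5, so use a tolerance band)
    h = my_result['hurst']
    tol = 0.05  # illustrative band width; adjust to taste
    if h < 0.5 - tol:
        interp = "Anti-persistent (mean-reverting)"
    elif h > 0.5 + tol:
        interp = "Persistent (long memory)"
    else:
        interp = "Close to 0.5 (no clear memory)"
    print(f"\nInterpretation: {interp}")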
    
    # Plot R/S scaling
    plt.figure(figsize=(8, 6))
    plt.scatter(my_result['log_window_sizes'], my_result['log_rs_values'], 
                alpha=0.6, s=60)
    plt.plot(my_result['log_window_sizes'], my_result['fitted_log_rs'], 
             'r--', linewidth=2, label=f'H = {h:.3f}')
    plt.xlabel('log(Window Size)')
    plt.ylabel('log(R/S)')
    plt.title(f'{reservoir_name} - Hurst Analysis')
    plt.legend()
    plt.grid(True, alpha=0.3)
    plt.show()

## 5. Climacogram - Alternative View

Another way to visualize long-range dependence.

**Slope in the log-log plot** reflects persistence: about -1 for uncorrelated noise, flatter (closer to 0) for persistent series.
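
The slope can be turned into a rough H estimate. The sketch below assumes the series is stationary and that the variance of the scale-`k` block average scales like `k**(2H - 2)`, so `H = 1 + slope / 2`. The function `climacogram_sketch` and the power-of-two scales are assumptions for this example; `compute_climacogram` may work differently.

```python
import numpy as np

def climacogram_sketch(x, scales):
    """Variance of the block-averaged series at each scale (illustrative)."""
    x = np.asarray(x, dtype=float)
    variances = []
    for k in scales:
        m = len(x) // k                                    # number of complete blocks
        block_means = x[:m * k].reshape(m, k).mean(axis=1)  # average each block of length k
        variances.append(block_means.var(ddof=1))
    return np.array(variances)

rng = np.random.default_rng(1)
noise = rng.standard_normal(20_000)
scales = np.array([1, 2, 4, 8, 16, 32, 64])

variances = climacogram_sketch(noise, scales)
slope, _ = np.polyfit(np.log(scales), np.log(variances), 1)
h_est = 1 + slope / 2   # from variance ~ k**(2H - 2)

print(f"Slope = {slope:.3f}, implied H = {h_est:.3f}")
```

For white noise the slope should come out near -1. The conversion is only meaningful for stationary input, so a raw random walk is better analyzed through its increments.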

In [None]:
# Try climacogram on random walk
scales, variances = compute_climacogram(random_walk, max_scale=50)

plt.figure(figsize=(10, 6))
plt.loglog(scales, variances, 'o-', markersize=6, alpha=0.7)
plt.xlabel('Scale (aggregation window)')
plt.ylabel('Variance')
plt.title('Climacogram - Random Walk')
plt.grid(True, alpha=0.3, which='both')
plt.show()

print("💡 Tip: Straight line in log-log means power-law scaling!")

## 6. Compare Multiple Series - Side by Side

**Useful for:** Comparing different reservoirs, channels, or conditions

In [None]:
# Compare all test series we created
comparison = []

for name, data in series_dict.items():
    result = hurst_rs(data, min_window=10, num_windows=20)
    comparison.append({
        'Series': name,
        'Hurst': result['hurst'],
        'R²': result['r_squared'],
        'Mean': np.mean(data),
        'Std': np.std(data)
    })

comparison_df = pd.DataFrame(comparison)
display(comparison_df.sort_values('Hurst'))

In [None]:
# Visual comparison
fig, ax = plt.subplots(figsize=(10, 6))

bars = ax.barh(comparison_df['Series'], comparison_df['Hurst'])

# Color code by value (tolerance band, since an estimate is never exactly 0.5)
tol = 0.05  # illustrative band width
for bar, h in zip(bars, comparison_df['Hurst']):
    if h < 0.5 - tol:
        bar.set_color('blue')
    elif h > 0.5 + tol:
        bar.set_color('red')
    else:
        bar.set_color('gray')

ax.axvline(x=0.5, color='black', linestyle='--', linewidth=2, alpha=0.5, label='H=0.5 (random)')
ax.set_xlabel('Hurst Exponent')
ax.set_title('Comparison of Different Series')
ax.legend()
ax.grid(True, alpha=0.3, axis='x')

plt.tight_layout()
plt.show()

print("Blue = anti-persistent | Gray = random | Red = persistent")

## 7. Your Experiments Here!

**Ideas to try:**
- Load different datasets
- Try different window parameters
- Compare before/after transformations
- Test on detrended data
- Compare different reservoirs/channels

**Just duplicate cells and modify!**
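
As a starting point for the "detrended data" idea, here is a minimal sketch that removes a least-squares straight line before any Hurst estimate. The helper `detrend_linear` is a hypothetical name, and a linear fit is only one of several reasonable ways to detrend.

```python
import numpy as np

def detrend_linear(x):
    """Subtract the least-squares straight line from x (hypothetical helper)."""
    x = np.asarray(x, dtype=float)
    t = np.arange(len(x))
    slope, intercept = np.polyfit(t, x, 1)
    return x - (slope * t + intercept)

rng = np.random.default_rng(2)
n = 1000
trending = np.linspace(0, 100, n) + rng.standard_normal(n) * 10  # same recipe as the test series
residual = detrend_linear(trending)

print(f"Std before detrending: {trending.std():.2f}")
print(f"Std after detrending:  {residual.std():.2f}")
```

Running the Hurst estimate on `trending` and on `residual` shows how much of a high H comes from the trend alone.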

In [None]:
# Your experiments here!
