# Volatility Forecasting Toolkit - Quick Start Guide

Welcome to the **Volatility Forecasting and Regime Analysis** toolkit! This notebook provides a quick introduction to the core capabilities of the project.

## What You'll Learn
1. Loading and preparing financial data
2. Computing returns and basic statistics
3. Calculating volatility using different methods
4. Classifying volatility regimes
5. Generating comprehensive reports

## Prerequisites
- Python 3.8+
- All dependencies installed (`pip install -r requirements.txt`)
- Basic understanding of financial markets

Let's get started! 🚀

## 1. Setup and Imports

First, let's import all the necessary modules and set up our environment.

In [None]:
# Standard library imports
import sys
import os
from pathlib import Path
import warnings
warnings.filterwarnings('ignore')

# Add src to path (assumes the notebook runs one directory below the project root)
sys.path.insert(0, os.path.join(os.path.dirname(os.getcwd()), 'src'))

# Data manipulation
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

# Our custom modules
from data_loader import DataLoader, fetch_prices
from returns import ReturnsCalculator, compute_log_returns
from rolling_vol import RollingVolatility
from ewma_vol import EWMAVolatility
from garch_model import GARCHModel
from volatility_regimes import VolatilityRegimes
from utils import setup_plot_style, annualize_volatility

# Setup plotting style
setup_plot_style()
plt.rcParams['figure.figsize'] = (14, 6)

print("✅ All imports successful!")
print(f"📊 Ready to analyze volatility")

## 2. Load Financial Data

We'll use Apple (AAPL) stock data for this example. The toolkit supports loading data from:
- Yahoo Finance (via yfinance)
- CSV files
- Other data sources (extensible)
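
For the CSV route, a minimal sketch in plain pandas might look like the following. The file layout (a `Date` column plus a `Close` column) and the helper name `load_prices_from_csv` are assumptions made for illustration; the toolkit's own `DataLoader` may expect something different.

```python
import io

import pandas as pd


def load_prices_from_csv(source, ticker, date_col="Date", price_col="Close"):
    """Hypothetical helper: read closing prices into a one-column DataFrame."""
    df = pd.read_csv(source, parse_dates=[date_col], index_col=date_col)
    prices = df[[price_col]].rename(columns={price_col: ticker})
    # Sort chronologically and drop rows with missing prices
    return prices.sort_index().dropna()


# An in-memory CSV stands in for a real file (made-up numbers)
csv_text = "Date,Close\n2024-01-03,101.5\n2024-01-02,100.0\n2024-01-04,\n"
demo_prices = load_prices_from_csv(io.StringIO(csv_text), "DEMO")
print(demo_prices)
```

Naming the price column after the ticker mirrors how the rest of this notebook indexes data (`prices[ticker]`).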

In [None]:
# Load 2 years of AAPL data
print("📥 Loading AAPL data from Yahoo Finance...")

ticker = 'AAPL'
period = '2y'

# Fetch prices with the convenience function
prices = fetch_prices(ticker, period=period)

print(f"\n✅ Data loaded successfully!")
print(f"📅 Date range: {prices.index[0].date()} to {prices.index[-1].date()}")
print(f"📊 Total observations: {len(prices)}")
print(f"\n💰 Price statistics:")
print(prices.describe().round(2))

In [None]:
# Visualize price history
plt.figure(figsize=(14, 6))
plt.plot(prices.index, prices[ticker], linewidth=2, color='#2E86AB')
plt.title(f'{ticker} Price History', fontsize=16, fontweight='bold')
plt.xlabel('Date', fontsize=12)
plt.ylabel('Price ($)', fontsize=12)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

# Approximate the last 52 weeks with the most recent 252 trading days
last_year = prices[ticker].iloc[-252:]
print(f"📈 Current price: ${prices[ticker].iloc[-1]:.2f}")
print(f"📉 52-week low: ${last_year.min():.2f}")
print(f"📈 52-week high: ${last_year.max():.2f}")
print(f"💹 Total return: {((prices[ticker].iloc[-1] / prices[ticker].iloc[0]) - 1) * 100:.2f}%")

## 3. Calculate Returns

Returns are the foundation of volatility analysis. We'll compute log returns: they add up across time (the log return over a period is the sum of its daily log returns), which makes them convenient to aggregate and model.
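
As a rough illustration of the calculation, the sketch below computes log returns with plain pandas and checks the additivity property. The function name `log_returns` and the price values are assumptions for this example; it is not necessarily how the toolkit's `compute_log_returns` is implemented.

```python
import numpy as np
import pandas as pd


def log_returns(prices: pd.DataFrame) -> pd.DataFrame:
    """r_t = ln(P_t / P_{t-1}); the first row has no predecessor and is dropped."""
    return np.log(prices / prices.shift(1)).dropna()


# Illustrative prices (made-up numbers)
demo = pd.DataFrame({"DEMO": [100.0, 102.0, 99.0, 101.0]})
r = log_returns(demo)

# Additivity: daily log returns sum to the log return over the whole period
total = np.log(demo["DEMO"].iloc[-1] / demo["DEMO"].iloc[0])
print(np.isclose(r["DEMO"].sum(), total))
```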

In [None]:
# Calculate log returns
print("🧮 Calculating log returns...")

returns = compute_log_returns(prices)

print(f"\n✅ Returns calculated!")
print(f"📊 Shape: {returns.shape}")
print(f"\n📈 Return statistics:")
print(returns.describe().round(6))

In [None]:
# Visualize returns distribution
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Time series
axes[0].plot(returns.index, returns[ticker] * 100, linewidth=1, alpha=0.7, color='#A23B72')
axes[0].axhline(y=0, color='black', linestyle='--', alpha=0.3)
axes[0].set_title('Daily Returns Over Time', fontsize=14, fontweight='bold')
axes[0].set_xlabel('Date', fontsize=11)
axes[0].set_ylabel('Returns (%)', fontsize=11)
axes[0].grid(True, alpha=0.3)

# Distribution
axes[1].hist(returns[ticker] * 100, bins=50, alpha=0.7, color='#F18F01', edgecolor='black')
axes[1].axvline(x=0, color='black', linestyle='--', alpha=0.3)
axes[1].set_title('Returns Distribution', fontsize=14, fontweight='bold')
axes[1].set_xlabel('Returns (%)', fontsize=11)
axes[1].set_ylabel('Frequency', fontsize=11)
axes[1].grid(True, alpha=0.3, axis='y')

plt.tight_layout()
plt.show()

# Calculate annualized metrics
mean_return = returns[ticker].mean() * 252
std_return = returns[ticker].std() * np.sqrt(252)

print(f"\n📊 Annualized Metrics:")
print(f"   Mean Return: {mean_return*100:.2f}%")
print(f"   Volatility: {std_return*100:.2f}%")
print(f"   Sharpe Ratio (approx): {mean_return/std_return:.2f}")

## 4. Calculate Volatility

Now let's compute volatility using three different methods:
1. **Rolling Volatility** - Simple moving window
2. **EWMA** - Exponentially weighted moving average
3. **GARCH(1,1)** - Generalized autoregressive conditional heteroskedasticity
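
To make the three estimators concrete, here is a minimal sketch of each written with NumPy and pandas. It assumes zero-mean daily returns, and the function names, the seeding of the first variance, and the GARCH parameter values are all illustrative assumptions rather than the toolkit's implementation. In particular, the GARCH function only filters the variance for given parameters; it does not estimate them.

```python
import numpy as np
import pandas as pd


def rolling_vol(r: pd.Series, window: int = 20) -> pd.Series:
    """Sample standard deviation over a trailing window."""
    return r.rolling(window).std()


def ewma_vol(r: pd.Series, lam: float = 0.94) -> pd.Series:
    """EWMA recursion: var_t = lam * var_{t-1} + (1 - lam) * r_{t-1}^2."""
    x = r.to_numpy()
    var = np.empty(len(x))
    var[0] = x.var()  # seeding with the sample variance is an assumption
    for t in range(1, len(x)):
        var[t] = lam * var[t - 1] + (1 - lam) * x[t - 1] ** 2
    return pd.Series(np.sqrt(var), index=r.index)


def garch11_vol(r: pd.Series, omega: float, alpha: float, beta: float) -> pd.Series:
    """GARCH(1,1) filter: var_t = omega + alpha * r_{t-1}^2 + beta * var_{t-1}."""
    x = r.to_numpy()
    var = np.empty(len(x))
    var[0] = omega / (1 - alpha - beta)  # long-run variance; needs alpha + beta < 1
    for t in range(1, len(x)):
        var[t] = omega + alpha * x[t - 1] ** 2 + beta * var[t - 1]
    return pd.Series(np.sqrt(var), index=r.index)


# Synthetic returns, for illustration only
rng = np.random.default_rng(0)
r = pd.Series(rng.normal(0.0, 0.01, 500))

# Parameter values are placeholders, not fitted estimates
vols = pd.DataFrame({
    "rolling": rolling_vol(r),
    "ewma": ewma_vol(r),
    "garch": garch11_vol(r, omega=2e-6, alpha=0.05, beta=0.93),
}) * np.sqrt(252)  # annualize daily volatility
print(vols.dropna().mean())
```

In practice the GARCH parameters are usually estimated by maximum likelihood, and the sum alpha + beta is conventionally called the persistence of the model: the closer it is to 1, the more slowly volatility shocks decay.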

In [None]:
print("📊 Computing volatility using multiple models...\n")

# 1. Rolling Volatility (20-day window)
rolling_calc = RollingVolatility()
rolling_vol = rolling_calc.compute_volatility(returns, window=20)
rolling_vol_ann = rolling_calc.annualize(rolling_vol)

print(f"✅ Rolling volatility (20d): mean = {rolling_vol_ann[ticker].mean():.4f}")

# 2. EWMA Volatility (λ=0.94, RiskMetrics standard)
ewma_calc = EWMAVolatility()
ewma_vol = ewma_calc.compute_volatility(returns, lambda_param=0.94)
ewma_vol_ann = ewma_calc.annualize(ewma_vol)

print(f"✅ EWMA volatility (λ=0.94): mean = {ewma_vol_ann[ticker].mean():.4f}")

# 3. GARCH(1,1) Model
garch = GARCHModel()
garch.fit(returns)
garch_vol = garch.get_conditional_volatility()
garch_vol_ann = garch_vol * np.sqrt(252)

print(f"✅ GARCH(1,1) volatility: mean = {garch_vol_ann[ticker].mean():.4f}")

# Get GARCH parameters
params_df = garch.get_parameters()
params = params_df.loc[ticker]
print(f"\n📐 GARCH Parameters:")
print(f"   ω (omega): {params['omega']:.6f}")
print(f"   α (alpha): {params['alpha[1]']:.6f}")
print(f"   β (beta):  {params['beta[1]']:.6f}")
print(f"   Persistence: {params['persistence']:.6f}")

In [None]:
# Compare all volatility models
plt.figure(figsize=(14, 7))

plt.plot(rolling_vol_ann.index, rolling_vol_ann[ticker], 
         label='Rolling (20d)', linewidth=2, alpha=0.7, color='#2E86AB')
plt.plot(ewma_vol_ann.index, ewma_vol_ann[ticker], 
         label='EWMA (λ=0.94)', linewidth=2, alpha=0.7, color='#A23B72')
plt.plot(garch_vol_ann.index, garch_vol_ann[ticker], 
         label='GARCH(1,1)', linewidth=2, alpha=0.7, color='#F18F01')

plt.title('Volatility Model Comparison', fontsize=16, fontweight='bold')
plt.xlabel('Date', fontsize=12)
plt.ylabel('Annualized Volatility', fontsize=12)
plt.legend(fontsize=11, loc='upper left')
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

print("\n📊 Model Comparison:")
print(f"   Rolling: {rolling_vol_ann[ticker].mean():.4f} (±{rolling_vol_ann[ticker].std():.4f})")
print(f"   EWMA:    {ewma_vol_ann[ticker].mean():.4f} (±{ewma_vol_ann[ticker].std():.4f})")
print(f"   GARCH:   {garch_vol_ann[ticker].mean():.4f} (±{garch_vol_ann[ticker].std():.4f})")

## 5. Classify Volatility Regimes

Regime classification helps identify different market conditions. We'll classify volatility into three regimes:
- **Low** volatility (favorable for trend strategies)
- **Medium** volatility (mixed conditions)
- **High** volatility (favorable for mean reversion, requires caution)
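
The percentile-based labeling described above, together with the day-to-day transition frequencies examined later in this section, can be sketched in plain pandas as follows. The function names and the synthetic volatility series are assumptions for illustration; the toolkit's `VolatilityRegimes` class may handle edge cases differently.

```python
import numpy as np
import pandas as pd


def classify_by_percentile(vol: pd.Series, percentiles=(33, 66)) -> pd.Series:
    """Label each observation Low / Medium / High using full-sample cut-offs."""
    lo, hi = np.percentile(vol.dropna(), percentiles)
    labels = pd.Series("Medium", index=vol.index, dtype=object)
    labels[vol <= lo] = "Low"
    labels[vol > hi] = "High"
    labels[vol.isna()] = np.nan  # keep gaps unlabeled
    return labels


def transition_matrix(labels: pd.Series) -> pd.DataFrame:
    """Percentage of moves from today's regime (rows) to tomorrow's (columns)."""
    today = pd.Series(labels.iloc[:-1].to_numpy(), name="from")
    tomorrow = pd.Series(labels.iloc[1:].to_numpy(), name="to")
    counts = pd.crosstab(today, tomorrow)
    return counts.div(counts.sum(axis=1), axis=0) * 100


# Synthetic annualized volatility, for illustration only
rng = np.random.default_rng(1)
vol = pd.Series(np.abs(rng.normal(0.20, 0.05, 300)))

labels = classify_by_percentile(vol)
tm = transition_matrix(labels)
print(labels.value_counts())
print(tm.round(1))
```

Because the cut-offs come from the whole sample, each regime holds roughly a third of the observations by construction; what differs between assets is how the labels cluster in time, which is what the transition matrix summarizes.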

In [None]:
print("🎯 Classifying volatility regimes...\n")

# Use EWMA volatility for regime classification
regime_classifier = VolatilityRegimes(ewma_vol_ann)

# Classify using percentiles (33rd and 66th)
regimes = regime_classifier.classify_regimes(percentiles=(33, 66))

print("✅ Regimes classified!")
print(f"\n📊 Regime Distribution:")
stats = regime_classifier.get_regime_statistics()
print(stats.to_string(index=False))

In [None]:
# Visualize regimes
plt.figure(figsize=(14, 7))

# Plot volatility
plt.plot(ewma_vol_ann.index, ewma_vol_ann[ticker], 
         linewidth=2, alpha=0.6, color='gray', label='EWMA Volatility')

# Color code by regime
for regime, color in [('Low', '#2E86AB'), ('Medium', '#F18F01'), ('High', '#A23B72')]:
    mask = regimes[ticker] == regime
    plt.scatter(ewma_vol_ann.index[mask], ewma_vol_ann[ticker][mask], 
                c=color, label=f'{regime} Volatility', alpha=0.6, s=20)

plt.title('Volatility with Regime Classification', fontsize=16, fontweight='bold')
plt.xlabel('Date', fontsize=12)
plt.ylabel('Annualized Volatility', fontsize=12)
plt.legend(fontsize=11)
plt.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

# Current regime
current_regime = regime_classifier.get_current_regime()
print(f"\n🎯 Current Regime: {current_regime[ticker]}")

In [None]:
# Analyze regime transitions
transitions = regime_classifier.analyze_transitions()
transition_matrix = transitions[ticker]['percentages']

print("\n📊 Regime Transition Matrix (%):")
print(transition_matrix.round(1))

# Calculate persistence
persistence = regime_classifier.calculate_persistence()
print(f"\n⏱️  Regime Persistence:")
print(persistence.to_string(index=False))

## 6. Performance by Regime

Let's analyze how returns differ across volatility regimes.

In [None]:
# Align returns with regimes
returns_with_regimes = pd.DataFrame({
    'returns': returns[ticker],
    'regime': regimes[ticker]
})

# Calculate statistics by regime
regime_performance = returns_with_regimes.groupby('regime')['returns'].agg([
    ('count', 'count'),
    ('mean', 'mean'),
    ('std', 'std'),
    ('min', 'min'),
    ('max', 'max')
])

# Annualize
regime_performance['annual_return'] = regime_performance['mean'] * 252
regime_performance['annual_vol'] = regime_performance['std'] * np.sqrt(252)
regime_performance['sharpe'] = regime_performance['annual_return'] / regime_performance['annual_vol']

print("📊 Performance by Volatility Regime:\n")
print(regime_performance.round(4))

In [None]:
# Visualize returns distribution by regime
fig, axes = plt.subplots(1, 3, figsize=(15, 5))

for idx, (regime, color) in enumerate([('Low', '#2E86AB'), ('Medium', '#F18F01'), ('High', '#A23B72')]):
    regime_returns = returns_with_regimes[returns_with_regimes['regime'] == regime]['returns'] * 100
    
    axes[idx].hist(regime_returns, bins=30, alpha=0.7, color=color, edgecolor='black')
    axes[idx].axvline(x=0, color='black', linestyle='--', alpha=0.5)
    axes[idx].set_title(f'{regime} Volatility Regime', fontsize=13, fontweight='bold')
    axes[idx].set_xlabel('Returns (%)', fontsize=11)
    axes[idx].set_ylabel('Frequency', fontsize=11)
    axes[idx].grid(True, alpha=0.3, axis='y')
    
    # Add statistics
    mean_ret = regime_returns.mean()
    std_ret = regime_returns.std()
    axes[idx].text(0.05, 0.95, f'μ={mean_ret:.3f}%\nσ={std_ret:.3f}%',
                   transform=axes[idx].transAxes, fontsize=10,
                   verticalalignment='top', bbox=dict(boxstyle='round', facecolor='white', alpha=0.8))

plt.tight_layout()
plt.show()

## 7. Summary and Key Takeaways

Let's summarize what we've learned and display key metrics.

In [None]:
print("=" * 70)
print(f"{'VOLATILITY ANALYSIS SUMMARY':^70}")
print("=" * 70)
print(f"\n📊 Asset: {ticker}")
print(f"📅 Period: {prices.index[0].date()} to {prices.index[-1].date()}")
print(f"📈 Total Return: {((prices[ticker].iloc[-1] / prices[ticker].iloc[0]) - 1) * 100:.2f}%")
print(f"\n📊 Annualized Metrics:")
print(f"   Mean Return: {mean_return*100:.2f}%")
print(f"   Volatility:  {std_return*100:.2f}%")
print(f"   Sharpe Ratio: {mean_return/std_return:.2f}")

print(f"\n📊 Volatility Models (Annualized):")
print(f"   Rolling (20d): {rolling_vol_ann[ticker].mean():.2%}")
print(f"   EWMA (λ=0.94): {ewma_vol_ann[ticker].mean():.2%}")
print(f"   GARCH(1,1):    {garch_vol_ann[ticker].mean():.2%}")

print(f"\n🎯 Current Regime: {current_regime[ticker]}")

print(f"\n📊 Regime Distribution:")
for _, row in stats.iterrows():
    print(f"   {row['regime']:8s}: {row['percentage']:5.1f}% ({int(row['count'])} days)")

print("\n" + "=" * 70)
print("✅ Analysis Complete!")
print("=" * 70)

## 8. Next Steps

Now that you've completed the quick start guide, here are some next steps:

### Explore More Notebooks
1. **02_data_analysis.ipynb** - Deep dive into data loading and cleaning
2. **03_volatility_models.ipynb** - Comprehensive volatility modeling
3. **04_regime_analysis.ipynb** - Advanced regime classification
4. **05_strategy_integration.ipynb** - Integrate with trading strategies

### Use the Pipeline
For production workflows, use the automated pipeline:
```bash
python run_pipeline.py --ticker AAPL --period 2y --output results/aapl
```

### Customize the Analysis
- Try different tickers (SPY, MSFT, TSLA, etc.)
- Adjust time periods
- Modify regime thresholds
- Compare multiple assets
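
As one possible starting point for comparing multiple assets, the sketch below summarizes volatility column by column for a multi-asset returns table. It uses synthetic data and hypothetical asset names, and relies only on pandas and NumPy rather than on any toolkit function.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)

# Synthetic daily log returns for three hypothetical assets with different volatility
multi_returns = pd.DataFrame(
    rng.normal(0.0, [0.008, 0.012, 0.025], size=(504, 3)),
    columns=["ASSET_A", "ASSET_B", "ASSET_C"],
)

summary = pd.DataFrame({
    # Full-sample volatility, annualized with 252 trading days
    "annual_vol": multi_returns.std() * np.sqrt(252),
    # Most recent 20-day rolling volatility, annualized the same way
    "latest_20d_vol": multi_returns.rolling(20).std().iloc[-1] * np.sqrt(252),
})
print(summary.sort_values("annual_vol").round(4))
```

Comparing the full-sample figure with the latest 20-day figure gives a quick sense of whether each asset is currently calmer or more turbulent than its own history.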

### Read the Documentation
- `README.md` - Project overview
- `PIPELINE_COMPLETION_REPORT.md` - Pipeline details
- `STRATEGY_ANALYSIS_COMPLETE.md` - Strategy integration

---

**Happy analyzing! 📈🎯**