# Oracle Risk Analysis: High-Frequency Price Movements

This notebook analyzes 1-second spot price data to understand oracle-related risks and inform maximum leverage settings.

## Executive Summary

Oracle risks directly impact:
- **Maximum Leverage**: Extreme price moves can liquidate positions before traders or the liquidation engine can react
- **Liquidation Engine**: Must handle sudden price gaps gracefully
- **Multi-Oracle Aggregation**: Understanding price deviation helps set deviation thresholds

By analyzing 1-second spot kline data (which closely tracks mark price), we can quantify:
- Distribution of price movements at various time scales
- Extreme tail events (99th, 99.9th, 99.99th percentiles)
- Maximum observed price gaps
- Safe leverage levels based on historical data

**Note**: We use the Binance Spot API for 1s data because the Futures API does not support 1s intervals. Spot price closely tracks the futures mark price for major pairs like BTC.

In [None]:
# Essential imports
import sys
sys.path.append('../src')

from risk_model.chart_config import setup_chart_style, COLORS
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from datetime import datetime, timedelta
from pathlib import Path

setup_chart_style()

# Configuration
SYMBOL = "BTCUSDT"
DATA_DIR = Path("../data/spot_klines")
DATA_DIR.mkdir(parents=True, exist_ok=True)

# Show the working directory, since the relative paths above depend on it
import os
print(f"Working directory: {os.getcwd()}")

## 1. Load Data

Data is downloaded separately using the `scripts/download_klines.py` script.

**To download data:**
```bash
# Test with 7 days
poetry run python scripts/download_klines.py --symbol BTCUSDT --days 7

# Full year
poetry run python scripts/download_klines.py --symbol BTCUSDT --days 365

# Multiple pairs
poetry run python scripts/download_klines.py --symbol BTCUSDT ETHUSDT SOLUSDT --days 365
```

The script can resume an interrupted download: rerun the same command to continue where it stopped.

In [None]:
# Load data from CSV (downloaded by scripts/download_klines.py)
csv_path = DATA_DIR / f"{SYMBOL}_1s.csv"

if csv_path.exists():
    # Get file size
    file_size_mb = csv_path.stat().st_size / (1024 * 1024)
    print(f"Loading {csv_path.name} ({file_size_mb:.1f} MB)...")
    
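    # Keep prices in float64: float32 carries only ~7 significant digits, so at
    # five-digit BTC prices its rounding error is close to the size of a one-cent
    # price change and would distort 1-second returns.
    # Volume does not need that precision, so float32 saves memory there.
    dtype_spec = {
        'open': 'float64',
        'high': 'float64',
        'low': 'float64',
        'close': 'float64',
        'volume': 'float32'
    }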
    
    df = pd.read_csv(
        csv_path, 
        parse_dates=['timestamp'],
        dtype=dtype_spec
    )
    
    # Memory usage
    mem_mb = df.memory_usage(deep=True).sum() / (1024 * 1024)
    print(f"Loaded {len(df):,} rows ({mem_mb:.1f} MB in memory)")
    print(f"Date range: {df['timestamp'].min()} to {df['timestamp'].max()}")
    print(f"Duration: {(df['timestamp'].max() - df['timestamp'].min()).days} days")
else:
    print(f"No data file found at {csv_path}")
    print(f"\nRun the downloader first:")
    print(f"  poetry run python scripts/download_klines.py --symbol {SYMBOL} --days 7")
    raise FileNotFoundError(f"Data file not found: {csv_path}")

In [None]:
# Display data overview
print(f"Data Range: {df['timestamp'].min()} to {df['timestamp'].max()}")
print(f"Total Duration: {(df['timestamp'].max() - df['timestamp'].min()).days} days")
print(f"Total Rows: {len(df):,}")
print(f"\nData Preview:")
display(df.head(10))
display(df.describe())

## 2. Price Movement Calculations

We calculate price changes at multiple time scales:
- **1 second**: Finest resolution available (one kline per second)
- **5 seconds**: Short-term volatility
- **30 seconds**: Trading decision timeframe
- **1 minute**: Standard candle interval
- **5 minutes**: Liquidation reaction window

Both absolute and percentage changes are computed.
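
Computing an N-second change as a row offset is only exact when the series has one row per second. A minimal sketch of a gap check on a small hypothetical series follows; `find_gaps`, `toy`, and the forward-fill choice are illustrative assumptions, not part of the notebook or the download script.

```python
import pandas as pd

ONE_SECOND = pd.Timedelta(seconds=1)

def find_gaps(index: pd.DatetimeIndex, step: pd.Timedelta = ONE_SECOND) -> pd.Series:
    """Return the time deltas larger than `step`, indexed by the row that follows each gap."""
    deltas = index.to_series().diff().dropna()
    return deltas[deltas > step]

# Hypothetical toy series: second 00:00:03 is missing
idx = pd.to_datetime([
    "2024-01-01 00:00:00",
    "2024-01-01 00:00:01",
    "2024-01-01 00:00:02",
    "2024-01-01 00:00:04",
])
toy = pd.Series([100.0, 100.1, 100.2, 100.4], index=idx)

gaps = find_gaps(toy.index)
print(f"{len(gaps)} gap(s) found")

# One option (an assumption, not the notebook's method): reindex to a full
# 1-second grid and carry the last known price forward, so that a row offset
# of N again corresponds to N seconds.
full_idx = pd.date_range(toy.index.min(), toy.index.max(), freq=ONE_SECOND)
toy_filled = toy.reindex(full_idx).ffill()
```

Forward-filling treats a missing second as "no price change", which understates nothing but can create artificial zero returns; dropping windows that span a gap is an equally defensible choice.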

In [None]:
# Calculate price changes at different time scales
df = df.sort_values('timestamp').reset_index(drop=True)
df.set_index('timestamp', inplace=True)

# Use close price for analysis
price = df['close']

# Calculate returns at different windows
windows = {
    '1s': 1,
    '5s': 5,
    '30s': 30,
    '1m': 60,
    '5m': 300
}

returns_df = pd.DataFrame(index=df.index)
returns_df['price'] = price

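# NOTE: `periods` counts rows, not seconds, so these windows are exact only
# if the data has one row per second with no gaps.
for name, periods in windows.items():
    returns_df[f'return_{name}'] = price.pct_change(periods=periods) * 100  # In percent
    returns_df[f'abs_change_{name}'] = price.diff(periods=periods).abs()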

# Also calculate high-low range within each second (intra-candle volatility)
returns_df['intra_range_pct'] = ((df['high'] - df['low']) / df['close']) * 100

print("Calculated return statistics:")
display(returns_df[[f'return_{w}' for w in windows.keys()]].describe())

## 3. Distribution Analysis

We visualize the distribution of price movements to understand:
- Central tendency and spread
- Fat tails (extreme events)
- Asymmetry (more down moves vs up moves?)
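
Fat tails and asymmetry can also be summarized numerically. The sketch below uses synthetic data only (a normal sample and a heavier-tailed Student-t sample) to show how the two differ; `tail_summary` is a hypothetical helper, and the samples are not market data.

```python
import numpy as np
import pandas as pd

def tail_summary(returns: pd.Series) -> dict:
    """Summarize asymmetry and tail weight of a return series."""
    return {
        "skew": returns.skew(),              # < 0: longer left (down-move) tail
        "excess_kurtosis": returns.kurt(),   # 0 for a normal distribution, > 0 for fat tails
        "share_down": (returns < 0).mean(),  # fraction of observations that are down moves
    }

# Synthetic samples with a fixed seed, for illustration only
rng = np.random.default_rng(0)
normal_returns = pd.Series(rng.normal(size=100_000))
heavy_returns = pd.Series(rng.standard_t(df=3, size=100_000))

summary_normal = tail_summary(normal_returns)
summary_heavy = tail_summary(heavy_returns)
print(pd.DataFrame({"normal": summary_normal, "student_t(3)": summary_heavy}))
```

Applied to the notebook's data, the same helper could be called on each `return_*` column after dropping NaN values.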

In [None]:
# Plot histograms of returns at different time scales
fig, axes = plt.subplots(2, 3, figsize=(15, 10))
axes = axes.flatten()

for idx, (name, periods) in enumerate(windows.items()):
    ax = axes[idx]
    data = returns_df[f'return_{name}'].dropna()
    
    # Remove extreme outliers for visualization (keep for statistics)
    q99 = data.abs().quantile(0.99)
    plot_data = data[data.abs() <= q99 * 2]
    
    ax.hist(plot_data, bins=100, alpha=0.7, color=COLORS['primary'], edgecolor='none')
    ax.axvline(x=0, color=COLORS['danger'], linestyle='--', alpha=0.7)
    
    # Add statistics annotation
    stats_text = f"Mean: {data.mean():.4f}%\nStd: {data.std():.4f}%\n99th: {data.abs().quantile(0.99):.4f}%"
    ax.text(0.95, 0.95, stats_text, transform=ax.transAxes, fontsize=9,
            verticalalignment='top', horizontalalignment='right',
            bbox=dict(boxstyle='round', facecolor='white', alpha=0.8))
    
    ax.set_title(f'{name} Returns Distribution')
    ax.set_xlabel('Return (%)')
    ax.set_ylabel('Frequency')
    ax.grid(True, alpha=0.3)

# Intra-candle volatility
ax = axes[5]
intra_data = returns_df['intra_range_pct'].dropna()
q99_intra = intra_data.quantile(0.99)
ax.hist(intra_data[intra_data <= q99_intra * 2], bins=100, alpha=0.7, 
        color=COLORS['warning'], edgecolor='none')
ax.set_title('Intra-Second Range Distribution')
ax.set_xlabel('High-Low Range (%)')
ax.set_ylabel('Frequency')
ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.savefig(DATA_DIR / 'returns_distribution.png', dpi=150, bbox_inches='tight')
plt.show()

## 4. Extreme Movement Analysis

For setting leverage limits, we need to understand tail risks:
- **99th percentile**: 1 in 100 events (~864 per day at 1s resolution)
- **99.9th percentile**: 1 in 1,000 events (~86 per day at 1s resolution)
- **99.99th percentile**: 1 in 10,000 events (~9 per day at 1s resolution)
- **Maximum observed**: Worst case in historical data

Ignoring maintenance margin and fees, a position is liquidated when price moves against it by roughly `1 / leverage`. For example:
- 10x leverage: liquidated at 10% adverse move
- 20x leverage: liquidated at 5% adverse move
- 50x leverage: liquidated at 2% adverse move
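
The arithmetic behind these figures can be sketched as follows. Both helper names are hypothetical, and the maintenance-margin parameter is an assumption that defaults to zero to match the simplified examples above.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 one-second observations per day

def liquidation_move_pct(leverage: float, maintenance_margin_pct: float = 0.0) -> float:
    """Adverse move (in %) that exhausts margin: 100 / leverage minus the maintenance margin."""
    return 100.0 / leverage - maintenance_margin_pct

def expected_exceedances_per_day(percentile: float, obs_per_day: int = SECONDS_PER_DAY) -> float:
    """Average number of observations per day that exceed the given percentile (0 to 1)."""
    return (1.0 - percentile) * obs_per_day

for lev in (10, 20, 50):
    print(f"{lev}x -> liquidated after a {liquidation_move_pct(lev):.1f}% adverse move")

for p in (0.99, 0.999, 0.9999):
    print(f"p{p * 100:g}: ~{expected_exceedances_per_day(p):.1f} exceedances per day")
```

The per-day counts assume one observation per second. Multi-second windows sampled every second overlap heavily, so their exceedances cluster rather than arriving independently.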

In [None]:
# Calculate percentile statistics for all time windows
percentiles = [0.9, 0.95, 0.99, 0.999, 0.9999]
percentile_stats = {}

for name in windows.keys():
    col = f'return_{name}'
    data = returns_df[col].dropna().abs()  # Absolute returns
    
    stats = {
        'mean': data.mean(),
        'std': data.std(),
        'max': data.max(),
        'count': len(data)
    }
    
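    for p in percentiles:
        # Key by the percentile itself ('p90', 'p95', 'p99', 'p99.9', 'p99.99');
        # int(p * 100) would map 0.99, 0.999 and 0.9999 all to 'p99'
        stats[f'p{p * 100:g}'] = data.quantile(p)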
    
    percentile_stats[name] = stats

stats_df = pd.DataFrame(percentile_stats).T
print("Extreme Movement Statistics (Absolute % Returns):")
display(stats_df.round(4))

In [None]:
# Visualize extreme percentiles across time windows
fig, ax = plt.subplots(figsize=(12, 6))

window_names = list(windows.keys())
x = np.arange(len(window_names))
width = 0.15

colors_pct = [COLORS['primary'], COLORS['secondary'], COLORS['warning'], 
              COLORS['danger'], '#8B0000']

for i, p in enumerate(percentiles):
    label = f"{p * 100:g}%"
    # Compute each percentile directly from the absolute returns so that
    # 99%, 99.9% and 99.99% get distinct bars
    values = [returns_df[f'return_{w}'].abs().quantile(p) for w in window_names]
    ax.bar(x + i*width, values, width, label=label, color=colors_pct[i], alpha=0.8)

ax.set_xlabel('Time Window')
ax.set_ylabel('Absolute Return (%)')
ax.set_title('Extreme Price Movements by Percentile and Time Window')
ax.set_xticks(x + width * 2)
ax.set_xticklabels(window_names)
ax.legend(title='Percentile')
ax.grid(True, alpha=0.3, axis='y')

plt.tight_layout()
plt.savefig(DATA_DIR / 'extreme_movements.png', dpi=150, bbox_inches='tight')
plt.show()

## 5. Maximum Drawdown Analysis

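Beyond point-to-point returns, we analyze rolling drawdowns, i.e. how far each price sits below the highest price in the trailing window:
- Captures the adverse move a long position opened at the window's peak would have suffered
- Identifies sustained adverse price moves
- Critical for understanding liquidation risk

Short positions face the mirror-image risk (run-up from the rolling low), which this section does not compute.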
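
A small worked example may make the drawdown definition concrete. The sketch below uses a five-point hypothetical price series; `rolling_drawdown_pct` follows the same rolling-peak formula used in this section, while `rolling_runup_pct` is an illustrative short-side counterpart that is not part of the notebook.

```python
import pandas as pd

def rolling_drawdown_pct(prices: pd.Series, window: int) -> pd.Series:
    """Percent drop of each price below the highest price in the trailing window (<= 0)."""
    peak = prices.rolling(window=window, min_periods=1).max()
    return (prices - peak) / peak * 100

def rolling_runup_pct(prices: pd.Series, window: int) -> pd.Series:
    """Percent rise of each price above the lowest price in the trailing window (>= 0)."""
    trough = prices.rolling(window=window, min_periods=1).min()
    return (prices - trough) / trough * 100

# Hypothetical prices, one row per second
toy = pd.Series([100.0, 102.0, 99.0, 101.0, 97.0])

dd = rolling_drawdown_pct(toy, window=3)  # adverse moves for longs
ru = rolling_runup_pct(toy, window=3)     # adverse moves for shorts

print(pd.DataFrame({"price": toy, "drawdown_pct": dd.round(3), "runup_pct": ru.round(3)}))
```

The worst drawdown here occurs at the last row, where 97 is compared with the 3-row peak of 101 rather than the earlier high of 102, which shows how the window length bounds the lookback.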

In [None]:
# Calculate rolling maximum drawdown for different windows
def calculate_max_drawdown(prices, window):
    """Return each price's percent drawdown from the peak of the trailing `window` rows (values <= 0)."""
    rolling_max = prices.rolling(window=window, min_periods=1).max()
    drawdown = (prices - rolling_max) / rolling_max * 100
    return drawdown

drawdown_windows = {
    '1m': 60,
    '5m': 300,
    '15m': 900,
    '1h': 3600
}

drawdowns = {}
for name, window in drawdown_windows.items():
    dd = calculate_max_drawdown(price, window)
    drawdowns[name] = dd
    
# Statistics
dd_stats = {}
for name, dd in drawdowns.items():
    dd_clean = dd.dropna()
    dd_stats[name] = {
        'min_dd': dd_clean.min(),  # Worst drawdown (most negative)
        'p1': dd_clean.quantile(0.01),
        'p5': dd_clean.quantile(0.05),
        'mean': dd_clean.mean(),
        'p95': dd_clean.quantile(0.95),
        'p99': dd_clean.quantile(0.99)
    }

dd_df = pd.DataFrame(dd_stats).T
print("Rolling Maximum Drawdown Statistics (%):")
display(dd_df.round(3))

In [None]:
# Plot drawdown distribution for key windows
fig, axes = plt.subplots(2, 2, figsize=(14, 10))
axes = axes.flatten()

for idx, (name, dd) in enumerate(drawdowns.items()):
    ax = axes[idx]
    dd_clean = dd.dropna()
    
    # Focus on negative drawdowns
    negative_dd = dd_clean[dd_clean < 0]
    
    ax.hist(negative_dd, bins=100, alpha=0.7, color=COLORS['danger'], edgecolor='none')
    
    # Add percentile lines
    p1 = negative_dd.quantile(0.01)
    p5 = negative_dd.quantile(0.05)
    ax.axvline(x=p1, color='black', linestyle='--', label=f'1st pct: {p1:.2f}%')
    ax.axvline(x=p5, color='gray', linestyle='--', label=f'5th pct: {p5:.2f}%')
    
    ax.set_title(f'{name} Rolling Drawdown Distribution')
    ax.set_xlabel('Drawdown (%)')
    ax.set_ylabel('Frequency')
    ax.legend(loc='upper left')
    ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.savefig(DATA_DIR / 'drawdown_distribution.png', dpi=150, bbox_inches='tight')
plt.show()

## 6. Leverage Recommendations

Based on observed price movements, we can recommend safe leverage levels.

**Methodology:**
- A position with leverage L is liquidated if price moves against it by 1/L (minus maintenance margin)
- We want liquidations to be rare events (not triggered by normal volatility)
- Target: Liquidation should not occur more often than X% of the time

**Assumptions:**
- Maintenance margin buffer: ~1% (varies by exchange)
- Liquidation engine reaction time: 5 seconds to 5 minutes depending on network conditions
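
The "no more often than X% of the time" target can be turned into a leverage cap by reading the matching quantile of absolute moves over the reaction window. This is a minimal sketch under that assumption; `max_leverage_for_target` and the sample numbers are hypothetical and are not notebook output.

```python
import numpy as np

def max_leverage_for_target(abs_moves_pct, target_freq: float, buffer_pct: float = 1.0) -> float:
    """Largest leverage whose liquidation distance is exceeded at most `target_freq` of the time.

    abs_moves_pct: absolute price moves over the reaction window, in percent
    target_freq:   tolerated exceedance frequency as a fraction (0.01 = 1%)
    buffer_pct:    extra margin buffer in percent, added to the tolerated move
    """
    moves = np.asarray(abs_moves_pct, dtype=float)
    tolerated_move = float(np.quantile(moves, 1.0 - target_freq))
    return 100.0 / (tolerated_move + buffer_pct)

# Illustrative absolute 5-minute moves in percent
moves = [0.1, 0.2, 0.15, 0.3, 0.25, 0.05, 1.0, 0.4, 0.35, 4.0]

lev_never = max_leverage_for_target(moves, target_freq=0.0)  # sized to the worst observed move
lev_half = max_leverage_for_target(moves, target_freq=0.5)   # sized to the median move
print(f"worst-case sizing: {lev_never:.1f}x, median sizing: {lev_half:.1f}x")
```

A lower tolerated frequency pushes the quantile further into the tail and therefore lowers the resulting leverage cap.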

In [None]:
# Calculate safe leverage based on observed volatility
def calculate_safe_leverage(max_move_pct, safety_buffer=1.0):
    """Calculate safe leverage given a maximum expected price move.
    
    Args:
        max_move_pct: Maximum expected price move in percent
        safety_buffer: Additional buffer in percent (default 1%)
    
    Returns:
        Maximum safe leverage
    """
    total_move = max_move_pct + safety_buffer
    if total_move <= 0:
        return float('inf')
    return 100 / total_move

# Calculate for different time windows and percentiles
leverage_recs = {}

for window in ['1m', '5m']:
    if window in drawdown_windows:
        dd = drawdowns[window].dropna()
        worst_dd = abs(dd.min())
        p1_dd = abs(dd.quantile(0.01))
        p5_dd = abs(dd.quantile(0.05))
        
        leverage_recs[window] = {
            'worst_drawdown_pct': worst_dd,
            'p1_drawdown_pct': p1_dd,
            'p5_drawdown_pct': p5_dd,
            'max_leverage_conservative': calculate_safe_leverage(worst_dd, safety_buffer=2.0),
            'max_leverage_moderate': calculate_safe_leverage(p1_dd, safety_buffer=1.5),
            'max_leverage_aggressive': calculate_safe_leverage(p5_dd, safety_buffer=1.0)
        }

lev_df = pd.DataFrame(leverage_recs).T
print(f"Leverage Recommendations for {SYMBOL}:")
print("\n(Based on rolling drawdown analysis)")
display(lev_df.round(2))

In [None]:
# Summary visualization
fig, axes = plt.subplots(1, 2, figsize=(14, 5))

# Left: Extreme moves by time window
ax = axes[0]
window_names = list(windows.keys())
max_moves = [percentile_stats[w]['max'] for w in window_names]
p99_moves = [percentile_stats[w]['p99'] for w in window_names]

x = np.arange(len(window_names))
width = 0.35
ax.bar(x - width/2, max_moves, width, label='Maximum Observed', color=COLORS['danger'])
ax.bar(x + width/2, p99_moves, width, label='99th Percentile', color=COLORS['warning'])
ax.set_xlabel('Time Window')
ax.set_ylabel('Absolute Return (%)')
ax.set_title('Extreme Price Movements')
ax.set_xticks(x)
ax.set_xticklabels(window_names)
ax.legend()
ax.grid(True, alpha=0.3, axis='y')

# Right: Safe leverage by risk tolerance
ax = axes[1]
if leverage_recs:
    categories = ['Conservative', 'Moderate', 'Aggressive']
    lev_cols = ['max_leverage_conservative', 'max_leverage_moderate', 'max_leverage_aggressive']
    colors_lev = [COLORS['success'], COLORS['warning'], COLORS['danger']]
    
    x = np.arange(len(leverage_recs))
    width = 0.25
    
    for i, (cat, col) in enumerate(zip(categories, lev_cols)):
        values = [leverage_recs[w][col] for w in leverage_recs.keys()]
        # Cap at 100x for visualization
        values = [min(v, 100) for v in values]
        ax.bar(x + i*width, values, width, label=cat, color=colors_lev[i])
    
    ax.set_xlabel('Reaction Time Window')
    ax.set_ylabel('Maximum Leverage')
    ax.set_title('Recommended Maximum Leverage')
    ax.set_xticks(x + width)
    ax.set_xticklabels(leverage_recs.keys())
    ax.legend(title='Risk Tolerance')
    ax.grid(True, alpha=0.3, axis='y')

plt.tight_layout()
plt.savefig(DATA_DIR / 'leverage_recommendations.png', dpi=150, bbox_inches='tight')
plt.show()

## 7. Oracle Deviation Analysis

For multi-oracle aggregation (as proposed in `steps.md`), we need to understand:
- How much prices typically deviate within a second
- What threshold should trigger a deviation alert

The intra-candle range (high - low) gives us insight into short-term price uncertainty.
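
To illustrate how a deviation threshold might be used once chosen, here is a minimal sketch of median-based aggregation that flags sources straying too far from the cross-source median. The function, source names, and prices are all hypothetical; the aggregation scheme in `steps.md` may differ.

```python
from statistics import median

def aggregate_prices(quotes: dict[str, float], max_deviation_pct: float) -> tuple[float, list[str]]:
    """Return the median of the accepted quotes and the names of sources flagged as deviating."""
    if not quotes:
        raise ValueError("no quotes supplied")
    reference = median(quotes.values())
    # Flag any source whose quote deviates from the median by more than the threshold
    flagged = [
        name for name, px in quotes.items()
        if abs(px - reference) / reference * 100 > max_deviation_pct
    ]
    accepted = [px for name, px in quotes.items() if name not in flagged]
    if not accepted:
        raise ValueError("all quotes exceed the deviation threshold")
    return median(accepted), flagged

# Hypothetical quotes from three sources
quotes = {"oracle_a": 65000.0, "oracle_b": 65010.0, "oracle_c": 66000.0}
price_agg, flagged = aggregate_prices(quotes, max_deviation_pct=0.5)
print(f"aggregated price: {price_agg}, flagged sources: {flagged}")
```

A threshold that is too tight flags sources during ordinary volatility, while one that is too loose lets a faulty feed shift the aggregate, which is why the observed intra-second range is a useful calibration input.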

In [None]:
# Analyze intra-second price ranges for deviation threshold recommendations
intra_range = returns_df['intra_range_pct'].dropna()

print("Intra-Second Price Range Statistics:")
print(f"  Mean: {intra_range.mean():.4f}%")
print(f"  Median: {intra_range.median():.4f}%")
print(f"  Std Dev: {intra_range.std():.4f}%")
print(f"  90th percentile: {intra_range.quantile(0.90):.4f}%")
print(f"  95th percentile: {intra_range.quantile(0.95):.4f}%")
print(f"  99th percentile: {intra_range.quantile(0.99):.4f}%")
print(f"  Maximum: {intra_range.max():.4f}%")

# Recommendation for oracle deviation threshold
recommended_threshold = intra_range.quantile(0.99) * 2  # 2x the 99th percentile
print(f"\nRecommended Oracle Deviation Threshold: {recommended_threshold:.4f}%")
print("(Based on 2x the 99th percentile of intra-second price range)")

## 8. Conclusions and Recommendations

### Key Findings

Based on historical 1-second spot price data analysis:

1. **Price Movement Distribution**: Most 1-second moves are tiny, but fat tails exist
2. **Extreme Events**: The 99.99th percentile shows potential for significant moves
3. **Rolling Drawdowns**: 5-minute windows can see substantial adverse moves

### Recommendations

| Setting | Conservative | Moderate | Aggressive |
|---------|--------------|----------|------------|
| Max Leverage | See `lev_df` (Section 6) | See `lev_df` (Section 6) | See `lev_df` (Section 6) |
| Oracle Deviation Threshold | 1% | 0.5% | 0.25% |
| Liquidation Delay | 5 min | 1 min | 30 sec |

### Next Steps

1. **Fetch Full Year Data**: `poetry run python scripts/download_klines.py --symbol BTCUSDT --days 365`
2. **Analyze Other Assets**: Add ETH, SOL, and altcoins
3. **Compare Spot vs Perp**: Analyze deviations between prices
4. **Backtest Liquidations**: Simulate liquidation engine on historical data

In [None]:
# Export summary statistics to CSV for reference
summary = {
    'symbol': SYMBOL,
    'data_start': str(df.index.min()),
    'data_end': str(df.index.max()),
    'total_rows': len(df),
    'return_1s_p99': percentile_stats['1s']['p99'],
    'return_1m_p99': percentile_stats['1m']['p99'],
    'return_5m_p99': percentile_stats['5m']['p99'],
    'max_1s_move': percentile_stats['1s']['max'],
    'max_5m_move': percentile_stats['5m']['max'],
    'recommended_deviation_threshold': recommended_threshold
}

summary_df = pd.DataFrame([summary])
summary_df.to_csv(DATA_DIR / f'{SYMBOL}_analysis_summary.csv', index=False)
print(f"Summary saved to {DATA_DIR / f'{SYMBOL}_analysis_summary.csv'}")
display(summary_df.T)