# Notebook 02: Price Band Mechanism and Order Flow Analysis

This notebook develops dynamic Price Bands to detect Mark-Book price deviations and implements order flow analysis to provide early warning signals for market stress and manipulation attempts. The Price Band serves as a circuit breaker rejecting trades outside acceptable deviation ranges, while order flow monitoring identifies unusual trading patterns that may precede or accompany attacks.

**Dependencies from Notebook 01**:
- Parkinson volatility estimates for EWMA initialization
- Amihud ILLIQ metrics for understanding OFI-liquidity relationships
- Asset tier classifications for parameter calibration
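
To make the circuit-breaker idea from the introduction concrete, here is a minimal sketch of a band check. The function name `within_price_band` and the fixed 2% `band_width` are illustrative assumptions, not parameters calibrated in this notebook; a dynamic band would presumably replace the fixed width with one driven by the volatility estimates listed above.

```python
def within_price_band(trade_price: float, mark_price: float, band_width: float = 0.02) -> bool:
    """Return True if trade_price deviates from mark_price by at most band_width (a fraction)."""
    if mark_price <= 0:
        raise ValueError("mark_price must be positive")
    # Relative deviation of the trade price from the mark price
    deviation = abs(trade_price - mark_price) / mark_price
    return deviation <= band_width

# Hypothetical prices, for illustration only
print(within_price_band(101.0, 100.0))  # 1% deviation, inside the band
print(within_price_band(103.5, 100.0))  # 3.5% deviation, outside the band
```

A trade that fails this check would be rejected rather than executed, which is the circuit-breaker behavior described in the introduction.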

## Section 1: Setup and Data Loading

In [None]:
# Import required libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from scipy import stats
from scipy.stats import mstats
import warnings
from pathlib import Path

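# Suppress library warnings to keep notebook output readable.
# Note: this also hides deprecation and numerical warnings; comment it out when debugging.
warnings.filterwarnings('ignore')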

# Set visualization style
plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette("husl")
%matplotlib inline

# Display settings
pd.set_option('display.max_columns', None)
pd.set_option('display.precision', 6)
pd.set_option('display.float_format', lambda x: f'{x:.6f}')

print("Libraries imported successfully")
print(f"NumPy version: {np.__version__}")
print(f"Pandas version: {pd.__version__}")

### Data Loading Functions

We load the same three data sources used in Notebook 01:
- **Mark Price**: Fair value used for liquidations and PnL
- **Index Price**: External spot market composite
- **Last Price (Klines)**: Actual trade executions with volume data

In [None]:
# Define symbols to analyze
SYMBOLS = ['BTCUSDT', 'ETHUSDT', 'SOLUSDT', 'BNBUSDT', 'HYPEUSDT', 'LITUSDT', 'SUIUSDT', 'ZECUSDT']

# Data paths
DATA_DIR = Path('../../data')
MARK_PRICE_DIR = DATA_DIR / 'mark_price'
INDEX_PRICE_DIR = DATA_DIR / 'index_price'
KLINES_DIR = DATA_DIR / 'klines'

def load_price_data(symbol: str, price_type: str = 'mark') -> pd.DataFrame:
    """
    Load OHLC price data for a given symbol.
    
    Args:
        symbol: Trading pair symbol (e.g., 'BTCUSDT')
        price_type: Type of price data ('mark', 'index', or 'klines')
    
    Returns:
        DataFrame with OHLC data indexed by timestamp
    """
    if price_type == 'mark':
        file_path = MARK_PRICE_DIR / f"{symbol}_1m.csv"
    elif price_type == 'index':
        file_path = INDEX_PRICE_DIR / f"{symbol}_1m.csv"
    elif price_type == 'klines':
        file_path = KLINES_DIR / f"{symbol}_1m.csv"
    else:
        raise ValueError(f"Unknown price_type: {price_type}")
    
    if not file_path.exists():
        print(f"Warning: {file_path} not found")
        return pd.DataFrame()
    
    df = pd.read_csv(file_path)
    df['open_time'] = pd.to_datetime(df['open_time'])
    df = df.set_index('open_time')
    
    # Ensure numeric columns
    for col in ['open', 'high', 'low', 'close']:
        if col in df.columns:
            df[col] = pd.to_numeric(df[col], errors='coerce')
    
    if 'volume' in df.columns:
        df['volume'] = pd.to_numeric(df['volume'], errors='coerce')
    
    return df

def load_all_symbols(price_type: str = 'mark') -> dict:
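    """
    Load data for every symbol in SYMBOLS, skipping symbols whose file is missing.
    
    Args:
        price_type: Type of price data ('mark', 'index', or 'klines')
    
    Returns:
        Dictionary mapping symbol to DataFrame
    """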
    data = {}
    for symbol in SYMBOLS:
        df = load_price_data(symbol, price_type)
        if not df.empty:
            data[symbol] = df
            print(f"Loaded {symbol} {price_type}: {len(df):,} records from {df.index[0]} to {df.index[-1]}")
    return data

print(f"Will load data for {len(SYMBOLS)} symbols")
print(f"Data directory: {DATA_DIR.absolute()}")

In [None]:
# Load all three price types
print("Loading Mark Price data...")
mark_data = load_all_symbols('mark')

print("\nLoading Index Price data...")
index_data = load_all_symbols('index')

print("\nLoading Last Price (Klines) data...")
klines_data = load_all_symbols('klines')

print(f"\nLoaded {len(mark_data)} mark, {len(index_data)} index, and {len(klines_data)} klines symbols")
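
Because `load_price_data` coerces unparseable values to NaN and silently skips missing files, a quick integrity check before analysis may be worthwhile. The helper below is only a sketch: `summarize_quality` is a hypothetical name, and the synthetic frame stands in for a loaded 1-minute series.

```python
import numpy as np
import pandas as pd

def summarize_quality(df: pd.DataFrame, freq: str = "1min") -> dict:
    """Count NaN cells, duplicate timestamps, and missing bars in an OHLC frame."""
    if df.empty:
        return {"rows": 0, "nan_cells": 0, "duplicate_timestamps": 0, "missing_bars": 0}
    # Every timestamp a gap-free series would contain between the first and last bar
    expected = pd.date_range(df.index.min(), df.index.max(), freq=freq)
    return {
        "rows": len(df),
        "nan_cells": int(df.isna().sum().sum()),
        "duplicate_timestamps": int(df.index.duplicated().sum()),
        "missing_bars": int(len(expected.difference(df.index))),
    }

# Synthetic stand-in: a five-minute window with the 00:02 bar missing and one NaN close
idx = pd.to_datetime(["2024-01-01 00:00", "2024-01-01 00:01",
                      "2024-01-01 00:03", "2024-01-01 00:04"])
demo = pd.DataFrame(
    {
        "open":  [1.0, 1.1, 1.2, 1.3],
        "high":  [1.2, 1.2, 1.4, 1.3],
        "low":   [0.9, 1.0, 1.1, 1.1],
        "close": [1.1, np.nan, 1.3, 1.2],
    },
    index=idx,
)
report = summarize_quality(demo)
print(report)
```

Applied to each entry of `mark_data`, `index_data`, and `klines_data`, a check like this would surface gaps that could otherwise distort rolling statistics later on.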

## Section 2: Order Flow Theory and Trade Classification

### Why Order Flow Matters for Risk Management

Order flow represents the sequence and direction of trades: which trades were initiated by aggressive buyers lifting the ask and which by aggressive sellers hitting the bid. In healthy markets, order flow is relatively balanced. During manipulation attacks or market crashes, order flow becomes severely imbalanced as one side overwhelms the other.

For a derivatives vault, sustained one-sided order flow in the spot market can signal an index manipulation attempt, while one-sided flow in the derivative market can signal a building liquidation cascade. The challenge is that we have only 1-minute OHLCV data, not a tick-by-tick trade tape showing whether each individual trade was a buy or a sell, so we must infer order flow direction using statistical heuristics from the market microstructure literature.
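
As an illustration of what "imbalanced" means in practice, the sketch below turns per-candle direction labels and volumes into a rolling imbalance ratio between -1 and +1. The name `rolling_flow_imbalance`, the window length, and the toy numbers are assumptions made for this example, not the notebook's calibrated definition of OFI.

```python
import pandas as pd

def rolling_flow_imbalance(direction: pd.Series, volume: pd.Series, window: int = 3) -> pd.Series:
    """Rolling (buy volume - sell volume) / total volume; lies in [-1, 1] when |direction| <= 1."""
    signed = (direction * volume).rolling(window).sum()
    total = volume.rolling(window).sum()
    # Windows with zero (or not yet available) total volume have no defined imbalance
    return signed / total.where(total > 0)

# Toy data: mixed flow at first, then three consecutive sell candles on rising volume
direction = pd.Series([1, -1, 1, -1, -1, -1])
volume = pd.Series([10.0, 10.0, 10.0, 20.0, 30.0, 40.0])
imbalance = rolling_flow_imbalance(direction, volume)
print(imbalance.round(3).tolist())
```

The first complete window is mildly buy-sided, while the final window consists entirely of sell-classified volume and reaches -1, the kind of sustained one-sided reading the paragraph above describes.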

### Trade Classification Methods

We implement three complementary approaches to classify trade direction from OHLC data:

**The Tick Rule**: Classifies based on price direction relative to the previous period. If the close price increases (C_t > C_{t-1}), the period's volume is classified as net buying; if it decreases (C_t < C_{t-1}), as net selling; if it is unchanged, the period is neutral. Reported accuracy for the Tick Rule, approximately 70-75%, refers to classifying individual trades; applied to 1-minute closes it is a coarser proxy, though still adequate for aggregate risk analysis.

**The Quote Rule Proxy**: Since we lack bid-ask quote data, we proxy trade direction using the intraday price structure. If the close is in the upper half of the High-Low range (C_t - L_t > H_t - C_t), this suggests buying pressure pushing price toward the high. Conversely, a close in the lower half suggests selling pressure. A close exactly at the midpoint, or a candle with no range (H_t = L_t), is treated as neutral.

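**Volume-Weighted Intraday Direction**: A continuous version of the Quote Rule proxy. It maps the close's position within the High-Low range to a score, 2(C_t - L_t)/(H_t - L_t) - 1, ranging from -1 (close at the low, read as all selling) to +1 (close at the high, read as all buying), with zero-range candles scored 0. The score is a rough estimate of the net share of the candle's volume that was buyer-initiated, and it can be multiplied by the candle's volume to obtain a signed-volume estimate. True uptick and downtick volume within a candle cannot be observed from 1-minute OHLCV data.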
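
Written out, the three rules as implemented in the next cell are as follows, with H_t, L_t, C_t the candle's high, low, and close. The symbols d_t^tick, d_t^quote, and d_t^range are notation introduced here for illustration only.

```latex
\begin{align*}
d_t^{\text{tick}}  &= \operatorname{sign}(C_t - C_{t-1}) \\
d_t^{\text{quote}} &= \operatorname{sign}\big((C_t - L_t) - (H_t - C_t)\big) \\
d_t^{\text{range}} &= \frac{2\,(C_t - L_t)}{H_t - L_t} - 1, \qquad H_t > L_t
\end{align*}
```

When H_t = L_t, the quote and range scores are set to 0. As a worked example with made-up numbers, take H_t = 110, L_t = 100, C_t = 107.5, and C_{t-1} = 108. The Tick Rule gives sign(107.5 - 108) = -1, the Quote Rule proxy gives sign(7.5 - 2.5) = +1, and the range score is 2(7.5)/10 - 1 = 0.5. The rules can therefore disagree on the same candle, which is why they are treated as complementary rather than interchangeable.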

In [None]:
def classify_trade_tick_rule(df: pd.DataFrame) -> pd.Series:
    """
    Classify trade direction using the Tick Rule.
    
    Args:
        df: DataFrame with 'close' column
    
    Returns:
        Series with values: 1 (buy), -1 (sell), 0 (neutral)
    """
    close_change = df['close'].diff()
    
    # Classify based on price direction.
    # The first row has no previous close (diff is NaN), so it falls through to neutral.
    direction = np.where(close_change > 0, 1,  # Uptick = buy
                        np.where(close_change < 0, -1,  # Downtick = sell
                                0))  # No change = neutral
    
    return pd.Series(direction, index=df.index)

def classify_trade_quote_rule(df: pd.DataFrame) -> pd.Series:
    """
    Classify trade direction using the Quote Rule proxy.
    
    Args:
        df: DataFrame with OHLC columns
    
    Returns:
        Series with values: 1 (buy), -1 (sell), 0 (neutral)
    """
    # Calculate position of close within High-Low range
    range_total = df['high'] - df['low']
    range_from_low = df['close'] - df['low']
    range_from_high = df['high'] - df['close']
    
    # Zero-range candles (high == low) carry no directional information, so mark them neutral
    direction = np.where(range_total == 0, 0,
                        np.where(range_from_low > range_from_high, 1,  # Close in upper half = buy
                                np.where(range_from_low < range_from_high, -1,  # Close in lower half = sell
                                        0)))  # Exactly in middle = neutral
    
    return pd.Series(direction, index=df.index)

def classify_trade_volume_weighted(df: pd.DataFrame) -> pd.Series:
    """
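    Estimate trade direction from the close's position within the High-Low range.
    
    Note: despite the name, this score does not use the 'volume' column.
    Multiply the result by volume to obtain a signed-volume estimate.
    
    Args:
        df: DataFrame with OHLC columns
    
    Returns:
        Series with continuous values from -1 (all selling) to +1 (all buying)
    """
    # Simplified model: weight by position in range
    range_total = df['high'] - df['low']
    
    # Normalize close position to [-1, 1]
    # If close = high, direction = +1 (buying)
    # If close = low, direction = -1 (selling)
    # If close = middle, direction = 0
    # Zero-range candles (high == low) are set to 0; np.where evaluates both
    # branches, so the 0/0 result for those rows is computed and then discarded
    direction = np.where(range_total == 0, 0,
                        2 * (df['close'] - df['low']) / range_total - 1)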
    
    return pd.Series(direction, index=df.index)

print("Trade classification functions defined")
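
A brief usage sketch on hand-made candles shows how the labels behave at the edges (the first row and a zero-range candle) and how a score becomes signed volume. So that it runs on its own, the snippet restates each rule as a one-liner intended to match the functions above; the candle values are invented.

```python
import numpy as np
import pandas as pd

# Invented candles; the last one has high == low (zero range)
candles = pd.DataFrame({
    "high":   [101.0, 103.0, 102.0, 102.0],
    "low":    [ 99.0, 100.0, 100.0, 102.0],
    "close":  [100.0, 102.5, 100.5, 102.0],
    "volume": [ 50.0,  80.0,  60.0,  10.0],
})

# Tick Rule: sign of the close-to-close change; the first row has no predecessor
tick = np.sign(candles["close"].diff()).fillna(0)

# Range score: close position scaled to [-1, 1]; zero-range candles are set to 0
rng = candles["high"] - candles["low"]
location = (2 * (candles["close"] - candles["low"]) / rng - 1).where(rng != 0, 0.0)

# Quote Rule proxy: the sign of the range score
quote = np.sign(location)

# Signed-volume estimate from the continuous score
signed_volume = location * candles["volume"]

print(pd.DataFrame({"tick": tick, "quote": quote,
                    "location": location.round(3),
                    "signed_volume": signed_volume.round(2)}))
```

On the last candle the Tick Rule reports buying because the close rose, while both range-based rules stay neutral because the candle has no range. Comparing the labels side by side is one way to spot candles where a single heuristic may be unreliable.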