# 🚨 Flash Crash Detector & Recovery Trade Simulator

**A Quantitative Trading Project by Gauri Gupta**

---

## Project Overview

This notebook implements a **statistical arbitrage trading strategy** that:
1. Detects cryptocurrency flash crashes using multi-factor analysis
2. Simulates buying during crashes and selling during recovery (mean reversion)
3. Analyzes profitability using industry-standard performance metrics
4. Visualizes results with professional charts

### The Strategy

**Observation:** Markets overreact to short-term fear, causing temporary price crashes that often recover within hours.

**Approach:**
- **Entry:** Buy when crash is detected (price drop + statistical significance + volume spike)
- **Hold:** Wait 24 hours for mean reversion
- **Exit:** Sell at market price
- **Risk Management:** 15% stop-loss to limit downside

---

## 1. Setup & Imports

First, let's import all required libraries and configure our environment.

In [1]:
# ============================================================================
# SETUP: Conda Environment & Package Installation
# ============================================================================
# 
# 🔧 To set up your environment, run these commands in your terminal:
#
# 1. Create dedicated environment
#    conda create -n flash-crash python=3.9
#
# 2. Activate the environment
#    conda activate flash-crash
#
# 3. Install required packages
#    pip install jupyter pandas numpy yfinance matplotlib seaborn
#
# 4. Launch Jupyter Notebook
#    jupyter notebook flash_crash_detector.ipynb
#
# 5. When done, deactivate
#    conda deactivate
#
# 6. Next time, just activate again
#    conda activate flash-crash
#
# ============================================================================

print("✅ Environment setup instructions provided above!")
print("📝 Run the conda commands in your terminal before executing the remaining cells.")

✅ Environment setup instructions provided above!
📝 Run the conda commands in your terminal before executing the remaining cells.


In [2]:
# Install required packages (run this cell first if packages not installed)
# Uncomment the line below if you need to install packages
# !pip install pandas numpy yfinance matplotlib seaborn

# Core libraries
import pandas as pd
import numpy as np
import warnings
from datetime import datetime, timedelta


# Data fetching 
import yfinance as yf

# Visualization
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import seaborn as sns

# Configuration
warnings.filterwarnings('ignore')
plt.style.use('seaborn-v0_8-darkgrid')
sns.set_palette("husl")

# Display settings
pd.set_option('display.max_columns', None)
pd.set_option('display.precision', 2)

# For better plots in Jupyter
%matplotlib inline

print("✅ All libraries imported successfully!")
print(f"📊 Pandas version: {pd.__version__}")
print(f"🔢 NumPy version: {np.__version__}")

✅ All libraries imported successfully!
📊 Pandas version: 2.3.3
🔢 NumPy version: 2.0.2


## 2. Configuration Parameters

Set the main parameters for the strategy. You can modify these to test different scenarios.

In [3]:
# ============================================================================
# CONFIGURATION - Modify these parameters to customize the strategy
# ============================================================================

# Data parameters
TICKER = 'BTC-USD'              # Cryptocurrency to analyze
START_DATE = '2023-02-08'       # Start of analysis period
END_DATE = '2025-02-08'         # End of analysis period
INTERVAL = '1d'                 # Data interval (1d = daily)

# Crash detection parameters (daily-bar friendly)
CRASH_THRESHOLD = -6            # Minimum % drop to qualify as crash (-6 = 6% drop)
ZSCORE_THRESHOLD = -1.8         # Rolling Z-score threshold for statistical significance
VOLUME_PERCENTILE = 0.90        # Volume must be above rolling percentile (0.90 = top 10%)
DRAWDOWN_THRESHOLD = -10        # Optional drawdown filter (% from rolling max)
USE_DRAWDOWN_FILTER = True      # Include drawdown filter in crash detection

# Rolling window parameters (daily data)
ZSCORE_WINDOW = 90              # Rolling window for return z-score (days)
VOLUME_WINDOW = 60              # Rolling window for volume percentile (days)
DRAWDOWN_WINDOW = 20            # Rolling window for drawdown (days)

# Trading parameters
HOLD_PERIOD = 1                 # Days to hold position (1 = 1 day)
STOP_LOSS = -15                 # Maximum loss before exiting (-15 = 15% loss)

# Display parameters
VERBOSE = True                  # Show detailed output

print("🎯 Configuration Set:")
print(f"   Ticker: {TICKER}") 
print(f"   Date Range: {START_DATE} to {END_DATE}")
print(f"   Crash Threshold: {CRASH_THRESHOLD}%")
print(f"   Z-Score Threshold: {ZSCORE_THRESHOLD}")
print(f"   Volume Percentile: {int(VOLUME_PERCENTILE * 100)}th over {VOLUME_WINDOW}d")
print(f"   Use Drawdown Filter: {USE_DRAWDOWN_FILTER}")
print(f"   Drawdown Threshold: {DRAWDOWN_THRESHOLD}% over {DRAWDOWN_WINDOW}d")
print(f"   Hold Period: {HOLD_PERIOD} days")
print(f"   Stop Loss: {STOP_LOSS}%")


🎯 Configuration Set:
   Ticker: BTC-USD
   Date Range: 2023-02-08 to 2025-02-08
   Crash Threshold: -6%
   Z-Score Threshold: -1.8
   Volume Percentile: 90th over 60d
   Use Drawdown Filter: True
   Drawdown Threshold: -10% over 20d
   Hold Period: 1 days
   Stop Loss: -15%


## 3. Data Collection

Download OHLCV price data for the configured ticker from Yahoo Finance using `yfinance`.


In [4]:
def fetch_crypto_data(ticker, start_date, end_date, interval='1h'):
    """
    Downloads cryptocurrency price data from Yahoo Finance.
    
    Parameters:
    -----------
    ticker : str
        Cryptocurrency symbol (e.g., 'BTC-USD', 'ETH-USD')
    start_date : str
        Start date in 'YYYY-MM-DD' format
    end_date : str
        End date in 'YYYY-MM-DD' format
    interval : str
        Data interval ('1h', '15m', '1d', etc.)
    
    Returns:
    --------
    DataFrame with OHLCV data
    """
    print(f"📥 Downloading {ticker} data from {start_date} to {end_date}...")
    
    try:
        data = yf.download(ticker, start=start_date, end=end_date, 
                          interval=interval, progress=False)
        
        if data.empty:
            print(f"❌ No data found for {ticker}")
            return None
        
        print(f"✅ Downloaded {len(data):,} data points")
        print(f"   Date Range: {data.index[0]} to {data.index[-1]}")
        print(f"   Price Range: ${float(data['Close'].min()):,.2f} - ${float(data['Close'].max()):,.2f}")
        
        return data
        
    except Exception as e:
        print(f"❌ Error downloading data: {e}")
        return None

# Download the data
data = fetch_crypto_data(TICKER, START_DATE, END_DATE, INTERVAL)

# Display first few rows
if data is not None:
    print("\n📊 First 5 rows of data:")
    display(data.head())
    
    print("\n📊 Data statistics:")
    display(data.describe())

📥 Downloading BTC-USD data from 2023-02-08 to 2025-02-08...
✅ Downloaded 731 data points
   Date Range: 2023-02-08 00:00:00 to 2025-02-07 00:00:00
   Price Range: $20,187.24 - $106,146.27

📊 First 5 rows of data:

| Date | Close | High | Low | Open | Volume |
|---|---|---|---|---|---|
| Ticker | BTC-USD | BTC-USD | BTC-USD | BTC-USD | BTC-USD |
| 2023-02-08 | 22939.4 | 23367.96 | 22731.1 | 23263.42 | 25371367758 |
| 2023-02-09 | 21819.04 | 22996.44 | 21773.97 | 22946.57 | 32572572185 |
| 2023-02-10 | 21651.18 | 21941.19 | 21539.39 | 21819.01 | 27078406594 |
| 2023-02-11 | 21870.88 | 21891.41 | 21618.45 | 21651.84 | 16356226232 |
| 2023-02-12 | 21788.2 | 22060.99 | 21682.83 | 21870.9 | 17821046406 |

📊 Data statistics:

| Statistic | Close | High | Low | Open | Volume |
|---|---|---|---|---|---|
| Ticker | BTC-USD | BTC-USD | BTC-USD | BTC-USD | BTC-USD |
| count | 731.0 | 731.0 | 731.0 | 731.0 | 731.0 |
| mean | 51536.22 | 52447.01 | 50478.88 | 51436.19 | 29500000000.0 |
| std | 23630.72 | 24143.91 | 23041.21 | 23596.4 | 20800000000.0 |
| min | 20187.24 | 20370.6 | 19628.25 | 20187.88 | 5330000000.0 |
| 25% | 28955.47 | 29341.55 | 28455.17 | 28876.55 | 15200000000.0 |
| 50% | 46970.5 | 48146.17 | 45260.82 | 46656.07 | 24100000000.0 |
| 75% | 66564.67 | 67666.11 | 65133.16 | 66559.86 | 37000000000.0 |
| max | 106146.27 | 109114.88 | 105291.73 | 106147.3 | 149000000000.0 |


In [5]:
# Verify data was downloaded successfully
if data is not None:
    print("✅ Data download complete!")
    print(f"Shape: {data.shape}")
else:
    print("\n❌ Data download failed. Troubleshooting:")
    print("   • Check your internet connection")
    print("   • Try a different ticker: BTC-USD, ETH-USD, SOL-USD")
    print("   • Verify your date range is valid")
    print("   • Retry the data download cell above")

✅ Data download complete!
Shape: (731, 5)
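
The five columns reported above sit under a two-level header of (field, ticker), which is why later cells flatten the columns before indexing them by name. A minimal sketch of that flattening step, using a small hand-built frame as a stand-in (the frame `raw` and its two rows are illustrative, not a live download):

```python
import pandas as pd

# Hypothetical two-level columns shaped like a yfinance result: (field, ticker).
columns = pd.MultiIndex.from_product(
    [["Close", "Volume"], ["BTC-USD"]], names=["Price", "Ticker"]
)
raw = pd.DataFrame(
    [[22939.40, 25371367758], [21819.04, 32572572185]],
    index=pd.to_datetime(["2023-02-08", "2023-02-09"]),
    columns=columns,
)

# Keep only the field level so columns can be addressed as flat["Close"].
flat = raw.copy()
flat.columns = flat.columns.get_level_values(0)

print(list(flat.columns))  # ['Close', 'Volume']
```

This only makes sense for a single-ticker download; with several tickers, dropping the ticker level would leave duplicate column names.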


## 4. Feature Engineering

Calculate returns, volatility, and other features needed for crash detection.
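
As a rough illustration of the rolling Z-score feature, the sketch below applies a rolling mean and standard deviation to a short, made-up return series. The 6-bar `window` and the return values are assumptions chosen for readability; the notebook itself uses `ZSCORE_WINDOW = 90`.

```python
import pandas as pd

# Made-up daily returns in percent: quiet days followed by one sharp drop.
returns = pd.Series([0.5, -0.4, 0.3, -0.2, 0.1, -0.3, 0.4, -7.0])

window = 6  # hypothetical short window for the example
rolling_mean = returns.rolling(window=window).mean()
rolling_std = returns.rolling(window=window).std()

# Z-score of each return relative to its own trailing window.
zscore = (returns - rolling_mean) / rolling_std

print(zscore.round(2).tolist())
```

The first `window - 1` values are NaN because the window is not yet full. Note also that the window includes the current bar, so a large drop inflates the standard deviation it is measured against; on short windows this caps how negative the Z-score can get.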

In [7]:
def calculate_features(data):
    """
    Calculates returns, volatility, and other statistical features.
    
    Features calculated:
    - returns: Daily percentage returns
    - volatility_7d: Rolling 7-day standard deviation of returns
    - volume_ma_7d: Rolling 7-day average volume
    - returns_zscore: Rolling Z-score of returns (ZSCORE_WINDOW days)
    - rolling_max: Rolling max close (DRAWDOWN_WINDOW days)
    - drawdown_pct: % drawdown from rolling max
    - volume_pctl: Rolling VOLUME_PERCENTILE quantile of volume (VOLUME_WINDOW days),
      used as the volume-spike threshold
    """
    # Check if data is None or empty
    if data is None or len(data) == 0:
        print("❌ Error: No data to process!")
        print("   The previous cell (data download) failed or returned empty data.")
        print("   Please check:")
        print("   1. Your internet connection")
        print("   2. The ticker symbol (BTC-USD, ETH-USD, SOL-USD, etc.)")
        print("   3. Re-run the Yahoo Finance data download cell above")
        return None
    
    print("📊 Calculating features...")
    
    df = data.copy()
    
    # Handle MultiIndex columns from yfinance
    # yfinance returns MultiIndex: (Price/ColumnName, Ticker)
    # Level 0 = column names (Close, High, Low, Open, Volume)
    # Level 1 = ticker symbol (BTC-USD) or empty string for calculated features
    if isinstance(df.columns, pd.MultiIndex):
        df.columns = df.columns.get_level_values(0)  # Get the first level (field names)
    
    # Calculate daily returns (percentage change)
    df['returns'] = df['Close'].pct_change() * 100
    
    # Calculate rolling 7-day volatility
    df['volatility_7d'] = df['returns'].rolling(window=7).std()
    
    # Calculate rolling 7-day volume average
    df['volume_ma_7d'] = df['Volume'].rolling(window=7).mean()
    
    # Rolling Z-score of returns (daily-friendly)
    rolling_mean = df['returns'].rolling(window=ZSCORE_WINDOW).mean()
    rolling_std = df['returns'].rolling(window=ZSCORE_WINDOW).std()
    df['returns_zscore'] = (df['returns'] - rolling_mean) / rolling_std

    # Rolling max close and drawdown
    df['rolling_max'] = df['Close'].rolling(window=DRAWDOWN_WINDOW).max()
    df['drawdown_pct'] = (df['Close'] / df['rolling_max'] - 1) * 100

    # Rolling volume percentile (daily-friendly volume spike proxy)
    df['volume_pctl'] = df['Volume'].rolling(window=VOLUME_WINDOW).quantile(VOLUME_PERCENTILE)
    
    # Remove NaN values from rolling calculations (keep datetime index)
    df = df.dropna()
    
    if len(df) == 0:
        print("❌ Error: No valid data after feature calculation!")
        return None
    
    print(f"✅ Features calculated ({len(df):,} valid data points after removing NaN)")
    print(f"   Average return: {df['returns'].mean():.4f}%")
    print(f"   Return std dev: {df['returns'].std():.4f}%")
    print(f"   Average volatility (7d): {df['volatility_7d'].mean():.4f}%")
    
    return df

# Calculate features
data = calculate_features(data)

# Display sample with new features
if data is not None:
    print("\n📊 Data with calculated features:")
    display(data[['Close', 'Volume', 'returns', 'volatility_7d', 'returns_zscore', 'drawdown_pct', 'volume_pctl']].head(10))
else:
    print("\n⚠️ Skipping feature display - please retry data download cell first")

📊 Calculating features...
✅ Features calculated (641 valid data points after removing NaN)
   Average return: 0.2262%
   Return std dev: 2.5122%
   Average volatility (7d): 2.2917%

📊 Data with calculated features:

| Date | Close | Volume | returns | volatility_7d | returns_zscore | drawdown_pct | volume_pctl |
|---|---|---|---|---|---|---|---|
| 2023-05-09 | 27658.78 | 14128593256 | -0.13 | 1.81 | -0.13 | -6.35 | 36300000000.0 |
| 2023-05-10 | 27621.76 | 20656025026 | -0.13 | 1.68 | -0.15 | -6.48 | 36300000000.0 |
| 2023-05-11 | 27000.79 | 16724343943 | -2.25 | 1.77 | -0.89 | -8.58 | 36300000000.0 |
| 2023-05-12 | 26804.99 | 19313599897 | -0.73 | 1.05 | -0.35 | -9.24 | 35800000000.0 |
| 2023-05-13 | 26784.08 | 9999171605 | -0.08 | 1.08 | -0.12 | -9.31 | 34100000000.0 |
| 2023-05-14 | 26930.64 | 10014858959 | 0.55 | 1.21 | 0.1 | -8.82 | 33400000000.0 |
| 2023-05-15 | 27192.69 | 14413231792 | 0.97 | 1.03 | 0.25 | -7.93 | 32000000000.0 |
| 2023-05-16 | 27036.65 | 12732238816 | -0.57 | 1.04 | -0.27 | -8.46 | 27600000000.0 |
| 2023-05-17 | 27398.8 | 15140006925 | 1.34 | 1.22 | 0.43 | -7.23 | 26100000000.0 |
| 2023-05-18 | 26832.21 | 15222938600 | -2.07 | 1.16 | -0.84 | -9.15 | 24700000000.0 |


In [8]:
# Diagnostic: Check actual column structure
print("Column structure of data from yfinance:")
print(f"Columns: {data.columns}")
print(f"Column levels: {data.columns.nlevels}")
if data.columns.nlevels > 1:
    for i in range(data.columns.nlevels):
        print(f"  Level {i}: {data.columns.get_level_values(i).unique()}")
print(f"\nFirst few rows:\n{data.head()}")
print(f"\nData shape: {data.shape}")
print(f"\nData dtypes:\n{data.dtypes}")

Column structure of data from yfinance:
Columns: Index(['Close', 'High', 'Low', 'Open', 'Volume', 'returns', 'volatility_7d',
       'volume_ma_7d', 'returns_zscore', 'rolling_max', 'drawdown_pct',
       'volume_pctl'],
      dtype='object', name='Price')
Column levels: 1

First few rows:
Price          Close      High       Low      Open       Volume  returns  \
Date                                                                       
2023-05-09  27658.78  27821.40  27375.60  27695.07  14128593256    -0.13   
2023-05-10  27621.76  28322.69  26883.67  27654.64  20656025026    -0.13   
2023-05-11  27000.79  27621.94  26781.83  27621.09  16724343943    -2.25   
2023-05-12  26804.99  27055.65  25878.43  26987.66  19313599897    -0.73   
2023-05-13  26784.08  27030.48  26710.87  26807.77   9999171605    -0.08   

Price       volatility_7d  volume_ma_7d  returns_zscore  rolling_max  \
Date                                                                   
2023-05-09           1.81      1

## 5. Crash Detection Algorithm

Implement the daily-bar crash detection model:
1. **Price Drop:** Returns < threshold (e.g., -6%)
2. **Rolling Z-Score:** Z-score < -1.8 (unusual move)
3. **Volume Confirmation:** Volume > rolling percentile baseline (panic activity)
4. **Optional Drawdown Filter:** Drawdown < threshold from rolling high
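
The four conditions above can be sketched as a single boolean mask. The feature rows below are hypothetical values, chosen so that each rejected row fails a different condition; the thresholds mirror the defaults from the configuration cell.

```python
import pandas as pd

# Hypothetical feature rows (illustrative numbers, not market data).
features = pd.DataFrame(
    {
        "returns":        [-7.1, -6.5, -2.0, -8.0, -6.2],
        "returns_zscore": [-3.6, -1.2, -2.5, -2.9, -2.0],
        "Volume":         [31e9, 40e9, 50e9, 20e9, 35e9],
        "volume_pctl":    [25e9, 30e9, 30e9, 30e9, 30e9],
        "drawdown_pct":   [-10.4, -12.0, -11.0, -15.0, -4.0],
    }
)

crash_threshold = -6       # % daily return
zscore_threshold = -1.8    # rolling z-score of returns
drawdown_threshold = -10   # % below rolling max

# A row is flagged only when every condition holds.
is_crash = (
    (features["returns"] < crash_threshold)
    & (features["returns_zscore"] < zscore_threshold)
    & (features["Volume"] > features["volume_pctl"])
    & (features["drawdown_pct"] < drawdown_threshold)
)

print(is_crash.tolist())  # [True, False, False, False, False]
```

Rows 1 through 4 fail on the Z-score, the price drop, the volume baseline, and the drawdown filter respectively, which shows how requiring all conditions at once keeps the event count low.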


In [9]:
def detect_flash_crashes(data, crash_threshold=-6, zscore_threshold=-1.8,
                         drawdown_threshold=-10, use_drawdown=True,
                         volume_percentile=None):
    """
    Identifies flash crash events using multi-factor analysis (daily bars).
    
    A flash crash must satisfy ALL core conditions:
    1. Price drop exceeds threshold
    2. Rolling Z-score indicates statistical unusualness
    3. Volume exceeds rolling percentile baseline
    4. Optional drawdown filter (from rolling max)
    """
    if data is None or len(data) == 0:
        print("❌ Error: No data to analyze for crashes!")
        return None

    if volume_percentile is None:
        volume_percentile = VOLUME_PERCENTILE
    
    print(f"🔍 Detecting flash crashes...")
    print(f"   Crash threshold: {crash_threshold}%")
    print(f"   Z-score threshold: {zscore_threshold}")
    print(f"   Volume percentile: {int(volume_percentile * 100)}th over {VOLUME_WINDOW}d")
    print(f"   Use drawdown filter: {use_drawdown} (threshold {drawdown_threshold}% over {DRAWDOWN_WINDOW}d)")
    
    df = data.copy()
    
    # Core crash conditions (use .values to avoid index alignment issues)
    condition_1 = df['returns'].values < crash_threshold
    condition_2 = df['returns_zscore'].values < zscore_threshold
    condition_3 = df['Volume'].values > df['volume_pctl'].values

    if use_drawdown:
        condition_4 = df['drawdown_pct'].values < drawdown_threshold
    else:
        condition_4 = True
    
    # ALL conditions must be true
    df['is_crash'] = condition_1 & condition_2 & condition_3 & condition_4
    
    # Calculate statistics
    num_crashes = df['is_crash'].sum()
    crashes = df[df['is_crash']]
    
    print(f"\n🚨 Found {num_crashes} crash events")
    
    if num_crashes > 0:
        print(f"   Average crash magnitude: {crashes['returns'].mean():.2f}%")
        print(f"   Largest crash: {crashes['returns'].min():.2f}%")
        print(f"   Smallest crash: {crashes['returns'].max():.2f}%")
        print(f"   Average volume spike: {(crashes['Volume'] / crashes['volume_ma_7d']).mean():.2f}x")
        print(f"   Average drawdown: {crashes['drawdown_pct'].mean():.2f}%")
    
    return df

# Detect crashes
data = detect_flash_crashes(data, CRASH_THRESHOLD, ZSCORE_THRESHOLD,
                            DRAWDOWN_THRESHOLD, USE_DRAWDOWN_FILTER,
                            volume_percentile=VOLUME_PERCENTILE)

# Display crash events
if data is not None and data['is_crash'].sum() > 0:
    print("\n📋 Crash Events Detail:")
    # .copy() so that adding volume_ratio does not write into a slice of `data`
    crash_details = data[data['is_crash']][['Close', 'returns', 'returns_zscore', 'Volume', 'volume_ma_7d', 'drawdown_pct']].copy()
    crash_details['volume_ratio'] = crash_details['Volume'] / crash_details['volume_ma_7d']
    display(crash_details)
elif data is not None:
    print("\n⚠️ No crashes detected with current parameters")

🔍 Detecting flash crashes...
   Crash threshold: -6%
   Z-score threshold: -1.8
   Volume percentile: 90th over 60d
   Use drawdown filter: True (threshold -10% over 20d)

🚨 Found 4 crash events
   Average crash magnitude: -7.14%
   Largest crash: -8.34%
   Smallest crash: -6.03%
   Average volume spike: 1.80x
   Average drawdown: -14.16%

📋 Crash Events Detail:

| Date | Close | returns | returns_zscore | Volume | volume_ma_7d | drawdown_pct | volume_ratio |
|---|---|---|---|---|---|---|---|
| 2023-08-17 | 26664.55 | -7.1 | -3.61 | 31120851211 | 13800000000.0 | -10.42 | 2.26 |
| 2024-03-19 | 61912.77 | -8.34 | -2.97 | 74215844794 | 57300000000.0 | -15.28 | 1.3 |
| 2024-08-02 | 61415.07 | -6.03 | -2.54 | 43060875727 | 34300000000.0 | -10.02 | 1.25 |
| 2024-08-05 | 53991.46 | -7.1 | -2.77 | 108991085584 | 45600000000.0 | -20.9 | 2.39 |


## 6. Trading Strategy Simulation

Simulate the mean-reversion trading strategy:
- **Entry:** Buy when crash is detected
- **Hold:** Wait for specified hold period (24 hours)
- **Exit:** Sell at market price or stop-loss
- **Risk Management:** Exit early if stop-loss is hit
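
To make the exit rules concrete, here is a minimal sketch of a hold-or-stop exit on a made-up close series. `exit_with_stop` is a hypothetical helper written for this example, not a function from the notebook, and it checks daily closes only, so an intraday breach between closes would go unnoticed.

```python
import pandas as pd

def exit_with_stop(closes, entry_pos, hold_period, stop_loss_pct):
    """Return (exit_pos, return_pct) for a long entered at closes[entry_pos].

    Exits at the first later close whose return falls below stop_loss_pct,
    otherwise at the close hold_period bars after entry. Assumes the full
    hold window fits inside the series.
    """
    entry_price = closes.iloc[entry_pos]
    last_pos = entry_pos + hold_period
    for pos in range(entry_pos + 1, last_pos + 1):
        ret = (closes.iloc[pos] / entry_price - 1) * 100
        if ret < stop_loss_pct:
            return pos, ret  # stopped out early
    return last_pos, (closes.iloc[last_pos] / entry_price - 1) * 100

# Made-up closes: the third bar is 16% below the entry price.
closes = pd.Series([100.0, 95.0, 84.0, 90.0, 105.0])

stopped = exit_with_stop(closes, entry_pos=0, hold_period=4, stop_loss_pct=-15)
one_day = exit_with_stop(closes, entry_pos=0, hold_period=1, stop_loss_pct=-15)
print(stopped, one_day)
```

With a 4-bar hold the position is closed on the third bar, before the later rebound; with a 1-bar hold the stop never comes into play, which is the typical case for a 15% stop on a single daily bar.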

In [None]:
def simulate_recovery_trades(data, hold_period=1, stop_loss=-15):
    """
    Simulates buying during crashes and selling during recovery.
    
    Strategy:
    1. Buy at crash price
    2. Hold for specified period (in days)
    3. Sell at close price after hold_period days
    4. Exit early if stop-loss is triggered
    """
    if data is None or 'is_crash' not in data.columns:
        print("❌ Error: No crash data to simulate trades!")
        return pd.DataFrame()
    
    print(f"\n💰 Simulating recovery trades...")
    print(f"   Hold period: {hold_period} days")
    print(f"   Stop loss: {stop_loss}%")
    
    trades = []
    
    # Loop through data to find crashes
    for i in range(len(data)):
        if data['is_crash'].iloc[i]:
            # ENTRY
            entry_price = data['Close'].iloc[i]
            entry_time = data.index[i]
            
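            # EXIT (after hold_period days, or earlier if the stop-loss is hit)
            exit_index = i + hold_period
            
            if exit_index < len(data):
                # Check for stop-loss during hold period (daily closes only,
                # so intraday breaches between closes are not detected)
                period_prices = data['Close'].iloc[i:exit_index+1]
                period_returns = (period_prices / entry_price - 1) * 100
                breaches = period_returns.values < stop_loss
                hit_stop_loss = bool(breaches.any())
                
                # Exit at the first close that breaches the stop-loss
                if hit_stop_loss:
                    exit_index = i + int(breaches.argmax())
                
                exit_price = data['Close'].iloc[exit_index]
                exit_time = data.index[exit_index]
                
                # Calculate return and worst close-to-close loss while in the trade
                trade_return = ((exit_price - entry_price) / entry_price) * 100
                max_drawdown = period_returns.iloc[:exit_index - i + 1].min()
                
                # Record trade
                trades.append({
                    'entry_time': entry_time,
                    'exit_time': exit_time,
                    'entry_price': entry_price,
                    'exit_price': exit_price,
                    'return_pct': trade_return,
                    'max_drawdown': max_drawdown,
                    'hit_stop_loss': hit_stop_loss,
                    'hold_days': exit_index - i
                })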
    
    trades_df = pd.DataFrame(trades)
    
    if len(trades_df) > 0:
        print(f"\n📈 Executed {len(trades_df)} trades")
        print(f"   Average Return: {trades_df['return_pct'].mean():.2f}%")
        print(f"   Median Return: {trades_df['return_pct'].median():.2f}%")
        print(f"   Best Trade: {trades_df['return_pct'].max():.2f}%")
        print(f"   Worst Trade: {trades_df['return_pct'].min():.2f}%")
        print(f"   Win Rate: {(trades_df['return_pct'] > 0).sum() / len(trades_df) * 100:.1f}%")
        print(f"   Stop-Loss Hit: {trades_df['hit_stop_loss'].sum()} times")
    else:
        print("❌ No trades executed (no crashes detected)")
    
    return trades_df

# Simulate trades
trades = simulate_recovery_trades(data, HOLD_PERIOD, STOP_LOSS)

# Display trade details
if len(trades) > 0:
    print("\n📋 Trade Details:")
    display(trades[['entry_time', 'entry_price', 'exit_price', 'return_pct', 'max_drawdown']])
else:
    print("\n⚠️ No trades to display")

## 7. Performance Metrics

Calculate key performance metrics used in quantitative trading:
- **Total Return:** Cumulative profit/loss
- **Sharpe Ratio:** Risk-adjusted returns
- **Maximum Drawdown:** Worst peak-to-valley loss
- **Win Rate:** Percentage of profitable trades
- **Profit Factor:** Total wins / total losses
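
A small worked example may help pin down these definitions. The four per-trade returns below are hypothetical (not results from this notebook), and the variable names are chosen for the example only.

```python
import pandas as pd

# Hypothetical per-trade returns in percent.
returns_pct = pd.Series([4.0, -2.0, 3.0, -1.0])

# Total return: compound the trades in sequence.
equity = (1 + returns_pct / 100).cumprod()
total_return = (equity.iloc[-1] - 1) * 100

# Maximum drawdown: worst fall from the running peak of the equity curve.
# The peak is floored at 1.0 so a losing first trade counts as a drawdown.
running_max = equity.cummax().clip(lower=1.0)
max_drawdown = ((equity - running_max) / running_max).min() * 100

# Win rate and profit factor (sum of gains divided by sum of losses).
win_rate = (returns_pct > 0).mean() * 100
gains = returns_pct[returns_pct > 0].sum()
losses = abs(returns_pct[returns_pct <= 0].sum())
profit_factor = gains / losses if losses > 0 else float("inf")

print(f"total={total_return:.2f}% mdd={max_drawdown:.2f}% "
      f"win={win_rate:.0f}% pf={profit_factor:.2f}")
```

Here the compounded return is about 3.93%, slightly below the 4% arithmetic sum, and the worst drawdown is the 2% loss taken right after the first win. With only a handful of trades, every one of these figures is very sensitive to a single outcome.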

In [None]:
def calculate_performance_metrics(trades_df):
    """
    Calculates comprehensive performance metrics for the trading strategy.
    """
    if len(trades_df) == 0:
        print("❌ No trades to analyze")
        return None
    
    print("\n" + "="*70)
    print("📊 STRATEGY PERFORMANCE METRICS")
    print("="*70)
    
    # Calculate cumulative returns
    trades_df['cumulative_return'] = (1 + trades_df['return_pct']/100).cumprod() - 1
    
    # Basic metrics
    total_return = trades_df['cumulative_return'].iloc[-1] * 100
    avg_return = trades_df['return_pct'].mean()
    median_return = trades_df['return_pct'].median()
    std_return = trades_df['return_pct'].std()
    
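    # Sharpe ratio: per-trade mean/std scaled by sqrt(252). This treats each trade
    # as one daily return, so the magnitude is inflated when trades are infrequent.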
    if std_return > 0:
        sharpe_ratio = (avg_return / std_return) * np.sqrt(252)
    else:
        sharpe_ratio = 0
    
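    # Maximum Drawdown (peak floored at 1.0 so a losing first trade
    # counts as a drawdown from starting capital)
    cumulative = (1 + trades_df['return_pct']/100).cumprod()
    running_max = cumulative.cummax().clip(lower=1.0)
    drawdown = (cumulative - running_max) / running_max
    max_drawdown = drawdown.min() * 100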
    
    # Win Rate
    win_rate = (trades_df['return_pct'] > 0).sum() / len(trades_df) * 100
    num_wins = (trades_df['return_pct'] > 0).sum()
    num_losses = (trades_df['return_pct'] <= 0).sum()
    
    # Profit Factor
    total_wins = trades_df[trades_df['return_pct'] > 0]['return_pct'].sum()
    total_losses = abs(trades_df[trades_df['return_pct'] <= 0]['return_pct'].sum())
    profit_factor = total_wins / total_losses if total_losses > 0 else float('inf')
    
    # Best and worst individual trades
    best_trade = trades_df['return_pct'].max()
    worst_trade = trades_df['return_pct'].min()
    
    # Create metrics dictionary
    metrics = {
        'Number of Trades': len(trades_df),
        'Winning Trades': num_wins,
        'Losing Trades': num_losses,
        'Win Rate (%)': win_rate,
        'Total Return (%)': total_return,
        'Average Return (%)': avg_return,
        'Median Return (%)': median_return,
        'Std Dev of Returns (%)': std_return,
        'Sharpe Ratio': sharpe_ratio,
        'Maximum Drawdown (%)': max_drawdown,
        'Best Trade (%)': best_trade,
        'Worst Trade (%)': worst_trade,
        'Profit Factor': profit_factor
    }
    
    # Print metrics
    for key, value in metrics.items():
        if 'Ratio' in key or 'Factor' in key:
            print(f"{key:.<50} {value:>15.3f}")
        else:
            print(f"{key:.<50} {value:>15.2f}")
    
    print("="*70)
    
    return metrics

# Calculate and display metrics
if len(trades) > 0:
    metrics = calculate_performance_metrics(trades)
else:
    print("⚠️ No trades to analyze - try adjusting parameters")

## 8. Visualizations

Create professional charts to visualize the strategy performance.

### 8.1 Price Chart with Crash Detection

In [None]:
def plot_crash_detection(data, ticker):
    """
    Creates a chart showing price with detected crash events highlighted.
    """
    fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(16, 10), sharex=True,
                                     gridspec_kw={'height_ratios': [2, 1]})
    
    # Plot 1: Price with crash markers
    ax1.plot(data.index, data['Close'], color='#2E86AB', linewidth=1.5, 
             label=f'{ticker} Price', alpha=0.8)
    
    # Mark crashes
    crashes = data[data['is_crash']]
    if len(crashes) > 0:
        ax1.scatter(crashes.index, crashes['Close'], 
                   color='#EE6352', s=200, marker='X', 
                   edgecolors='darkred', linewidths=2,
                   label=f'Flash Crash ({len(crashes)} events)', zorder=5)
    
    ax1.set_ylabel('Price (USD)', fontsize=13, fontweight='bold')
    ax1.set_title(f'{ticker} Flash Crash Detection System', 
                  fontsize=16, fontweight='bold', pad=20)
    ax1.legend(loc='upper left', fontsize=11, framealpha=0.9)
    ax1.grid(True, alpha=0.3)
    ax1.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, p: f'${x:,.0f}'))
    
    # Plot 2: Volume
    colors = ['#EE6352' if crash else '#9E9E9E' for crash in data['is_crash']]
    # Bar width is measured in days on a date axis; 0.8 suits daily bars
    ax2.bar(data.index, data['Volume'], color=colors, alpha=0.5, width=0.8)
    ax2.set_ylabel('Volume', fontsize=13, fontweight='bold')
    ax2.set_xlabel('Date', fontsize=13, fontweight='bold')
    ax2.grid(True, alpha=0.3)
    
    # Format x-axis
    ax2.xaxis.set_major_formatter(mdates.DateFormatter('%Y-%m'))
    plt.xticks(rotation=45)
    
    plt.tight_layout()
    plt.savefig('crash_detection_chart.png', dpi=300, bbox_inches='tight')
    plt.show()
    
    print("✅ Crash detection chart created and saved!")

# Generate plot
plot_crash_detection(data, TICKER)

### 8.2 Trading Performance Charts

In [None]:
def plot_trading_performance(trades_df):
    """
    Creates comprehensive trading performance visualizations.
    """
    if len(trades_df) == 0:
        print("❌ No trades to visualize")
        return
    
    # Calculate cumulative returns
    trades_df['cumulative_return_pct'] = ((1 + trades_df['return_pct']/100).cumprod() - 1) * 100
    
    fig, axes = plt.subplots(3, 1, figsize=(16, 14))
    
    # Chart 1: Cumulative Returns
    ax1 = axes[0]
    ax1.plot(trades_df['exit_time'], trades_df['cumulative_return_pct'], 
            color='#06D6A0', linewidth=3, marker='o', markersize=8)
    ax1.fill_between(trades_df['exit_time'], trades_df['cumulative_return_pct'], 
                     alpha=0.3, color='#06D6A0')
    ax1.axhline(y=0, color='black', linestyle='--', alpha=0.5, linewidth=1)
    ax1.set_ylabel('Cumulative Return (%)', fontsize=13, fontweight='bold')
    ax1.set_title('Strategy Cumulative Returns Over Time', fontsize=15, fontweight='bold')
    ax1.grid(True, alpha=0.3)
    
    # Chart 2: Distribution of Returns
    ax2 = axes[1]
    ax2.hist(trades_df['return_pct'], bins=20, color='#06D6A0', 
            alpha=0.7, edgecolor='black', linewidth=1.5)
    ax2.axvline(x=0, color='black', linestyle='--', alpha=0.5, linewidth=2)
    ax2.axvline(x=trades_df['return_pct'].mean(), color='red', linestyle='--', 
               linewidth=2, label=f'Mean: {trades_df["return_pct"].mean():.2f}%')
    ax2.set_xlabel('Return (%)', fontsize=13, fontweight='bold')
    ax2.set_ylabel('Frequency', fontsize=13, fontweight='bold')
    ax2.set_title('Distribution of Trade Returns', fontsize=15, fontweight='bold')
    ax2.legend(fontsize=11)
    ax2.grid(True, alpha=0.3)
    
    # Chart 3: Individual Trade Performance
    ax3 = axes[2]
    trade_nums = list(range(1, len(trades_df) + 1))
    bar_colors = ['#06D6A0' if x > 0 else '#EF476F' for x in trades_df['return_pct']]
    ax3.bar(trade_nums, trades_df['return_pct'], color=bar_colors, 
           alpha=0.8, edgecolor='black', linewidth=1)
    ax3.axhline(y=0, color='black', linestyle='-', alpha=0.7, linewidth=1)
    ax3.set_xlabel('Trade Number', fontsize=13, fontweight='bold')
    ax3.set_ylabel('Return (%)', fontsize=13, fontweight='bold')
    ax3.set_title('Individual Trade Performance', fontsize=15, fontweight='bold')
    ax3.grid(True, alpha=0.3, axis='y')
    
    plt.tight_layout()
    plt.savefig('trading_performance_chart.png', dpi=300, bbox_inches='tight')
    plt.show()
    
    print("✅ Trading performance charts created and saved!")

# Generate plots
if len(trades) > 0:
    plot_trading_performance(trades)

### 8.3 Crash Severity Analysis

In [None]:
def plot_crash_analysis(data):
    """
    Analyzes and visualizes crash characteristics.
    """
    crashes = data[data['is_crash']].copy()
    
    if len(crashes) == 0:
        print("❌ No crashes to analyze")
        return
    
    crashes['day'] = crashes.index.day
    crashes['volume_ratio'] = crashes['Volume'] / crashes['volume_ma_7d']
    
    fig, ax = plt.subplots(figsize=(12, 8))
    
    scatter = ax.scatter(crashes['returns'], crashes['volume_ratio'], 
                        c=crashes['day'], cmap='viridis', 
                        s=250, alpha=0.7, edgecolors='black', linewidth=1.5)
    
    ax.set_xlabel('Price Drop (%)', fontsize=13, fontweight='bold')
    ax.set_ylabel('Volume Spike (multiple of 7d avg)', fontsize=13, fontweight='bold')
    ax.set_title('Flash Crash Severity Analysis', fontsize=16, fontweight='bold', pad=20)
    ax.grid(True, alpha=0.3)
    
    # Add colorbar
    cbar = plt.colorbar(scatter, ax=ax)
    cbar.set_label('Day of Month', fontsize=12, fontweight='bold')
    
    plt.tight_layout()
    plt.savefig('crash_severity_analysis.png', dpi=300, bbox_inches='tight')
    plt.show()
    
    print("✅ Crash severity analysis created and saved!")

# Generate plot
plot_crash_analysis(data)

## 9. Export Results

Save the results to CSV files for further analysis.
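
For the follow-up analysis, the exported files need their date index restored when they are read back. A minimal round-trip sketch, using an in-memory buffer in place of a file on disk; the one-row frame and the index name `Date` are assumptions for illustration:

```python
import io
import pandas as pd

# Hypothetical one-row stand-in for the crash events table.
events = pd.DataFrame(
    {"Close": [26664.55], "returns": [-7.10]},
    index=pd.DatetimeIndex(["2023-08-17"], name="Date"),
)

# Write with the index as the first column (the default for to_csv).
buffer = io.StringIO()
events.to_csv(buffer)
buffer.seek(0)

# Read back, restoring the first column as a parsed datetime index.
restored = pd.read_csv(buffer, index_col="Date", parse_dates=True)

print(restored.index.dtype)
```

Files written with `index=False` carry their timestamps as ordinary columns instead, so those columns would need to be named in `parse_dates` when loading.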

In [None]:
# Export trade results
if len(trades) > 0:
    trades.to_csv('crash_trades_results.csv', index=False)
    print("💾 Trade results saved to 'crash_trades_results.csv'")

# Export crash events
if data['is_crash'].sum() > 0:
    crashes = data[data['is_crash']][['Close', 'returns', 'Volume', 'volatility_7d', 'returns_zscore']]
    crashes.to_csv('crash_events.csv')
    print("💾 Crash events saved to 'crash_events.csv'")

# Export full dataset
data.to_csv('full_data.csv')
print("💾 Full dataset saved to 'full_data.csv'")

print("\n✅ All results exported successfully!")

In [None]:
# Check current working directory and file locations
import os

cwd = os.getcwd()
print(f"📁 Current working directory: {cwd}")

# List the directory once and reuse the result for both the listing and the count
csv_files = [f for f in os.listdir(cwd) if f.endswith('.csv')]

print("\n📄 CSV files in current directory:")
for f in csv_files:
    size = os.path.getsize(os.path.join(cwd, f)) / 1024  # Size in KB
    print(f"   ✓ {f} ({size:.1f} KB)")

if csv_files:
    print(f"\n✅ Found {len(csv_files)} CSV file(s)")
else:
    print("\n❌ No CSV files found yet")
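
When the exported files are loaded again for further analysis, the date index comes back as plain strings unless it is parsed explicitly. The sketch below shows a round trip for a file written with its index, as `crash_events.csv` and `full_data.csv` are. It uses a small synthetic frame and a temporary directory, so the column values are stand-ins, not the notebook's results.

```python
import os
import tempfile

import pandas as pd

# Synthetic stand-in for the exported crash events (values are illustrative only)
idx = pd.date_range("2024-01-01", periods=3, freq="D", name="Date")
events = pd.DataFrame(
    {"Close": [100.0, 90.0, 95.0], "returns": [0.0, -10.0, 5.56]}, index=idx
)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "crash_events.csv")
    events.to_csv(path)  # the index is written as the first column

    # index_col=0 with parse_dates=True restores the DatetimeIndex on reload
    reloaded = pd.read_csv(path, index_col=0, parse_dates=True)

print(type(reloaded.index).__name__)
print(reloaded.head())
```

The trade results are written with `index=False`, so any date columns in that file would need to be named in `parse_dates` instead.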

## 10. Summary & Interpretation

### Key Findings

This section provides interpretation of the results and key takeaways.

In [None]:
print("\n" + "="*70)
print("📝 SUMMARY & KEY TAKEAWAYS")
print("="*70)

if len(trades) > 0 and data['is_crash'].sum() > 0:
    print(f"\n🎯 Strategy Overview:")
    print(f"   Period Analyzed: {data.index[0].date()} to {data.index[-1].date()}")
    print(f"   Total Data Points: {len(data):,}")
    print(f"   Crashes Detected: {data['is_crash'].sum()}")
    print(f"   Trades Executed: {len(trades)}")
    
    print(f"\n📊 Performance Summary:")
    print(f"   Total Return: {trades['cumulative_return'].iloc[-1] * 100:.2f}%")
    print(f"   Win Rate: {(trades['return_pct'] > 0).sum() / len(trades) * 100:.1f}%")
    print(f"   Average Return: {trades['return_pct'].mean():.2f}%")
    
    print(f"\n💡 Key Insights:")
    print(f"   1. Flash crashes occurred {data['is_crash'].sum()} times over the period")
    print(f"   2. Mean reversion worked in {(trades['return_pct'] > 0).sum()}/{len(trades)} cases")
    print(f"   3. Average crash magnitude: {data[data['is_crash']]['returns'].mean():.2f}%")
    print(f"   4. Average recovery: {trades['return_pct'].mean():.2f}% over {HOLD_PERIOD} days")
    
    print(f"\n⚠️ Important Notes:")
    print(f"   • This is a backtest - past performance doesn't guarantee future results")
    print(f"   • Transaction costs (0.1-0.5%) are not included in returns")
    print(f"   • Slippage during volatile periods could impact execution prices")
    print(f"   • Strategy assumes mean reversion, which may not work in bear markets")
    
else:
    print("\n⚠️ Insufficient data to generate summary")
    print("   Try adjusting parameters to detect more crashes")

print("\n" + "="*70)
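
The notes above point out that transaction costs of 0.1-0.5% are not included in the reported returns. One rough way to gauge their effect is to subtract a flat fee on both entry and exit from each trade's gross return. In this sketch the helper `net_returns` and the sample returns are hypothetical, and the round-trip cost is approximated as twice the per-side fee:

```python
import pandas as pd

def net_returns(gross_pct: pd.Series, fee_pct_per_side: float) -> pd.Series:
    """Subtract a flat fee on entry and on exit from each trade's gross return (in %)."""
    # Approximation: round-trip cost is 2 x per-side fee; fee compounding is ignored
    return gross_pct - 2 * fee_pct_per_side

# Hypothetical per-trade returns in percent (not the notebook's results)
gross = pd.Series([4.0, -2.0, 0.3, 6.5])

for fee in (0.1, 0.5):
    net = net_returns(gross, fee)
    win_rate = (net > 0).mean() * 100
    print(f"fee={fee:.1f}%/side  avg net={net.mean():.2f}%  win rate={win_rate:.0f}%")
```

With these made-up numbers, the marginal winner (0.3% gross) turns into a loss at the higher fee. The win rate is therefore more sensitive to costs than the average return suggests.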

## 11. Parameter Experimentation

Test different parameter combinations to optimize the strategy.
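
The search is a full Cartesian grid, so the number of backtests is the product of the list lengths. As a sketch, the combinations can be enumerated with `itertools.product`; the `grid` dictionary below is an illustrative container holding the example values from the commented-out call at the end of this section.

```python
from itertools import product

# Example values taken from the commented-out call at the end of this section
grid = {
    "crash_threshold": [-5, -6, -7, -8],
    "volume_percentile": [0.85, 0.90, 0.95],
    "hold_period": [1, 2, 3],
    "zscore_threshold": [-1.5, -1.8, -2.0],
    "drawdown_threshold": [-8, -10, -12],
}

# One dict per combination; keys follow the grid's insertion order
combos = [dict(zip(grid, values)) for values in product(*grid.values())]

print(len(combos))   # 4 * 3 * 3 * 3 * 3 combinations
print(combos[0])
```

Counting the combinations first gives a quick estimate of runtime before committing to the full search.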

In [None]:
def test_parameters(data, crash_thresholds, volume_percentiles, hold_periods,
                    zscore_thresholds=None, drawdown_thresholds=None, use_drawdown=True):
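    """
    Grid-search detection and holding parameters, ranking combinations by Sharpe ratio.

    crash_thresholds, volume_percentiles and hold_periods are lists of values to try.
    zscore_thresholds and drawdown_thresholds fall back to the notebook's global settings.
    Returns a DataFrame with one row per combination that produced at least one trade.
    """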
    print("🔬 Testing different parameter combinations...\n")
    
    results = []
    zscore_thresholds = zscore_thresholds or [ZSCORE_THRESHOLD]
    drawdown_thresholds = drawdown_thresholds or [DRAWDOWN_THRESHOLD]
    
    for ct in crash_thresholds:
        for vp in volume_percentiles:
            for hp in hold_periods:
                for zs in zscore_thresholds:
                    for dd in drawdown_thresholds:
                        # Test this combination
                        test_data = data.copy()
                        # Recompute rolling volume percentile for this vp
                        test_data['volume_pctl'] = test_data['Volume'].rolling(window=VOLUME_WINDOW).quantile(vp)
                        test_data = test_data.dropna()
                        test_data = detect_flash_crashes(test_data, crash_threshold=ct,
                                                         zscore_threshold=zs,
                                                         drawdown_threshold=dd,
                                                         use_drawdown=use_drawdown,
                                                         volume_percentile=vp)
                        test_trades = simulate_recovery_trades(test_data, hold_period=hp, stop_loss=-15)
                        
                        if len(test_trades) > 0:
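                            # Per-trade Sharpe-style ratio. The sqrt(252) scaling treats each
                            # trade like a daily return, so read it as a ranking score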
                            avg_ret = test_trades['return_pct'].mean()
                            std_ret = test_trades['return_pct'].std()
                            sharpe = (avg_ret / std_ret) * np.sqrt(252) if std_ret > 0 else 0
                            
                            results.append({
                                'crash_threshold': ct,
                                'volume_percentile': vp,
                                'hold_period': hp,
                                'zscore_threshold': zs,
                                'drawdown_threshold': dd,
                                'num_trades': len(test_trades),
                                'win_rate': (test_trades['return_pct'] > 0).sum() / len(test_trades) * 100,
                                'avg_return': avg_ret,
                                'sharpe': sharpe
                            })
    
    results_df = pd.DataFrame(results)
    
    if len(results_df) > 0:
        # Sort by Sharpe ratio
        results_df = results_df.sort_values('sharpe', ascending=False)
        
        print("\n📊 Top Parameter Combinations (by Sharpe Ratio):")
        display(results_df.head(10))
        
        # Best parameters
        best = results_df.iloc[0]
        print("\n🏆 Best Parameters:")
        print(f"   Crash Threshold: {best['crash_threshold']}%")
        print(f"   Volume Percentile: {best['volume_percentile']}")
        print(f"   Hold Period: {best['hold_period']} days")
        print(f"   Z-Score Threshold: {best['zscore_threshold']}")
        print(f"   Drawdown Threshold: {best['drawdown_threshold']}%")
        print(f"   Sharpe Ratio: {best['sharpe']:.2f}")
        print(f"   Number of Trades: {best['num_trades']:.0f}")
    else:
        print("❌ No valid parameter combinations found")
    
    return results_df

# Uncomment to run parameter optimization (4 x 3 x 3 x 3 x 3 = 324 combinations; takes a few minutes)
# param_results = test_parameters(
#     data,
#     crash_thresholds=[-5, -6, -7, -8],
#     volume_percentiles=[0.85, 0.90, 0.95],
#     hold_periods=[1, 2, 3],
#     zscore_thresholds=[-1.5, -1.8, -2.0],
#     drawdown_thresholds=[-8, -10, -12],
#     use_drawdown=True
# )

print("💡 Uncomment the code above to run parameter optimization")
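
The ranking metric in `test_parameters` is the mean of the per-trade returns divided by their standard deviation, scaled by `sqrt(252)`. A standalone sketch of that calculation follows. The helper name `trade_sharpe` and the sample returns are hypothetical. Because trades occur sporadically and are not daily observations, the scaled value is probably best read as a relative score between parameter sets and not as a true annualized Sharpe ratio.

```python
import numpy as np
import pandas as pd

def trade_sharpe(returns_pct: pd.Series, periods_per_year: int = 252) -> float:
    """Mean/std of per-trade returns, scaled by sqrt(periods_per_year)."""
    std = returns_pct.std()  # sample std (ddof=1); NaN when there is only one trade
    if not std > 0:          # catches both NaN and zero dispersion
        return 0.0
    return float(returns_pct.mean() / std * np.sqrt(periods_per_year))

print(trade_sharpe(pd.Series([2.0, -1.0, 3.0, 0.0])))  # hypothetical returns in %
print(trade_sharpe(pd.Series([2.0])))                  # single trade -> 0.0
```

A parameter set that yields only one trade scores 0 under this guard, which matches the `if std_ret > 0 else 0` fallback in the function above. Since crypto markets trade every day, `periods_per_year=365` may be a closer fit than 252 if an annualized reading is wanted.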

## 12. Conclusion

### What We've Learned

This project demonstrates:

1. **Statistical Anomaly Detection:** Using z-scores and multi-factor analysis to identify unusual market events
2. **Trading Strategy Development:** Building and backtesting a systematic mean-reversion strategy
3. **Risk Management:** Implementing stop-losses and position sizing rules
4. **Performance Analysis:** Calculating industry-standard metrics (Sharpe ratio, drawdown, win rate)
5. **Data Visualization:** Creating professional charts to communicate results

### Next Steps

**Potential Improvements:**
- Add machine learning to predict crash severity
- Implement regime detection (bull/bear/sideways markets)
- Test on multiple cryptocurrencies
- Add transaction costs and slippage modeling
- Dynamic position sizing based on confidence
- Multi-timeframe analysis

### Important Disclaimer

⚠️ **This is an educational project for learning quantitative trading concepts.**
- Past performance does not guarantee future results
- Real trading involves transaction costs, slippage, and execution risk
- This is not financial advice
- Cryptocurrency trading carries significant risk

---

**Project by Gauri Gupta**  
MSc Quantitative Finance | UCD Smurfit Graduate School of Business
