# Notebook 7: Outlier Analysis

This notebook uses Z-scores to identify high-performance trades (outliers) and examines what sets them apart from the rest of the trade log.

## Objectives
- Identify trades with Z-score > 3
- Analyze distinguishing features
- Recognize and visualize patterns
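
As a rough illustration of the first objective, the sketch below flags trades whose Z-score exceeds a threshold. It is not the implementation of `OutlierDetector`; the function name `zscore_outliers`, the choice of `pnl` as the scored column, the population standard deviation (`ddof=0`), and the one-sided test (only unusually profitable trades) are all assumptions made for the example.

```python
import pandas as pd


def zscore_outliers(trades: pd.DataFrame, column: str = "pnl",
                    threshold: float = 3.0) -> pd.DataFrame:
    """Return rows whose Z-score in `column` exceeds `threshold`.

    Hypothetical helper for illustration; the project's OutlierDetector
    may score a different column or use a two-sided test.
    """
    values = trades[column].astype(float)
    std = values.std(ddof=0)  # population standard deviation (assumption)

    # With zero (or undefined) spread no trade can be an outlier.
    if pd.isna(std) or std == 0:
        return trades.assign(z_score=float("nan")).iloc[0:0]

    # Z-score: distance from the mean in units of standard deviation.
    scored = trades.assign(z_score=(values - values.mean()) / std)

    # One-sided filter: keep only unusually profitable trades.
    return scored[scored["z_score"] > threshold]
```

Calling `zscore_outliers(trade_log)` on a frame with a `pnl` column would return the flagged rows with a `z_score` column attached, similar in shape to the `outliers` frame used later in this notebook.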

In [None]:
import sys
sys.path.insert(0, '..')

import pandas as pd
import numpy as np
import pickle
from pathlib import Path

from src.analysis.outlier_detection import OutlierDetector
from src.analysis.pattern_recognition import PatternRecognizer
from src.analysis.insights_summary import InsightsSummary

In [None]:
# Load trade log
trade_log = pd.read_csv('../results/trade_log.csv')
print(f"Total trades: {len(trade_log)}")

In [None]:
# Identify outliers
detector = OutlierDetector(z_threshold=3.0)
outliers = detector.identify_outliers(trade_log)

print(f"\nOutliers found: {len(outliers)}")
print(f"Outlier percentage: {len(outliers)/len(trade_log)*100:.2f}%")

In [None]:
# Display outlier trades
print("\nOutlier Trades:")
display(outliers[['entry_time', 'type', 'pnl', 'pnl_pct', 'z_score']])

In [None]:
# Load full analysis
with open('../models/outlier_analysis.pkl', 'rb') as f:
    analysis = pickle.load(f)

trades_with_features = analysis['all_trades_with_features']

In [None]:
# Generate insights
summary = InsightsSummary()
insights = summary.generate_full_summary(trades_with_features)
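# print_summary() is assumed to print directly; wrapping it in print()
# would also emit a stray "None" if the method returns nothing.
summary.print_summary()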

In [None]:
# Create visualizations
recognizer = PatternRecognizer()
recognizer.plot_pnl_vs_duration(trades_with_features)
recognizer.plot_feature_boxplots(trades_with_features)
recognizer.plot_correlation_heatmap(trades_with_features)

## Key Findings

- **2.75%** of trades are outliers
- Outliers contribute **3,409%** of total profits (a share above 100% means the remaining trades net to a loss)
- Dominant pattern: **Downtrend + SHORT** (70%)
- Key distinguishing feature: **Duration** (outliers are held 81x longer)
- Entry conditions (IV, Greeks) are NOT distinguishing factors

**Conclusion**: The edge comes not from better entries but from HOLDING WINNERS LONGER.
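
Headline figures like the ones above could be derived from a scored trade table along the following lines. This is a minimal sketch and not the code behind `InsightsSummary`: the function name `outlier_summary`, the `duration` column name, and the use of mean duration for the ratio are assumptions; `pnl` and `z_score` are the column names displayed earlier in this notebook.

```python
import pandas as pd


def outlier_summary(trades: pd.DataFrame, threshold: float = 3.0,
                    pnl_col: str = "pnl", z_col: str = "z_score",
                    duration_col: str = "duration") -> dict:
    """Summarize how much outliers matter (hypothetical helper)."""
    is_outlier = trades[z_col] > threshold

    total_pnl = trades[pnl_col].sum()
    outlier_pnl = trades.loc[is_outlier, pnl_col].sum()

    # Share of trades flagged as outliers, in percent.
    pct_trades = is_outlier.mean() * 100

    # Share of total profit from outliers; exceeds 100% whenever the
    # remaining trades sum to a net loss.
    pct_profit = outlier_pnl / total_pnl * 100 if total_pnl != 0 else float("nan")

    # Mean holding time of outliers relative to all other trades.
    duration_ratio = (trades.loc[is_outlier, duration_col].mean()
                      / trades.loc[~is_outlier, duration_col].mean())

    return {
        "pct_trades": pct_trades,
        "pct_profit": pct_profit,
        "duration_ratio": duration_ratio,
    }
```

For instance, with one outlier earning 100 and three other trades losing 30 each, the total profit is 10, so the outlier accounts for 1,000% of it. The same mechanism explains how a profit share can far exceed 100%.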