# ✅ CU Percentile Heatmap: Prop AMM Compute Unit Analysis

**Purpose**: Visualize worst-case Compute Unit consumption across all critical Prop AMM operations.

**Why this matters (Feb 2026)**:
- Solana enforces a hard CU limit per transaction (~1.4M CU), but the effective safe budget is much lower for high-frequency Prop AMMs
- Your `compute_swap` and update functions run thousands of times per slot
- **Percentiles matter more than averages**: p99 and max capture the "spike" cases that occur during volatility
- If p99 CU exceeds ~40k–50k, your function is likely to **fail on-chain** during real traffic
- Updates (Blind/Fast/Full) must stay very cheap (<100 CU) so they can be sent frequently
- Swap costs on the different curves show the computational cost of your pricing logic

**This visualization reveals**:
- Which curve is safest for production (lowest p99 CU)
- Whether `afterSwap` hooks or oracle updates are blowing the budget
- Why edge might disappear on-chain: high p99 CU → txs fail → stale quotes → contagion
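
To make the "percentiles matter more than averages" point concrete, here is a minimal sketch on made-up numbers. The sample counts and CU values are illustrative assumptions, not measurements. A small fraction of expensive calls barely moves the mean but fully determines p99.

```python
import numpy as np

# Hypothetical CU samples: 980 cheap calls and 20 expensive spikes (2% of calls).
cu_samples = np.array([12_000] * 980 + [60_000] * 20)

mean_cu = cu_samples.mean()
p50_cu = np.percentile(cu_samples, 50)
p99_cu = np.percentile(cu_samples, 99)

print(f"mean = {mean_cu:,.0f} CU")
print(f"p50  = {p50_cu:,.0f} CU")
print(f"p99  = {p99_cu:,.0f} CU")
```

In this toy data the mean stays close to the cheap case while p99 lands on the spike value, which is why the rest of the notebook ranks operations by p99 rather than by average.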

## Step 1: Import Required Libraries

In [None]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
import numpy as np
import os
from pathlib import Path

# Set style for professional visualizations
sns.set_style("whitegrid")
plt.rcParams['figure.figsize'] = (14, 9)
plt.rcParams['font.size'] = 10

print("✓ Libraries imported successfully")

## Step 2: Load and Prepare CU Benchmark Data

Your raw CU logs should be in CSV format with at least two columns:
- `operation` → e.g. "Swap Buy / Curve B", "FullUpdate / both-all"
- `cu` → raw compute units consumed

Update the path below to point to your CU benchmark CSV file.
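
Before loading a real file, it can help to check the schema explicitly. The sketch below uses an in-memory CSV with hypothetical rows (the CU values are invented for illustration) and verifies that both required columns are present and that `cu` is numeric.

```python
import io

import pandas as pd

# Hypothetical sample rows in the expected two-column layout.
sample_csv = io.StringIO(
    "operation,cu\n"
    "Swap Buy / Curve B,18250\n"
    "Swap Buy / Curve B,19100\n"
    "FullUpdate / both-all,640\n"
)

sample_df = pd.read_csv(sample_csv)

# Check the schema before any analysis runs.
required_columns = {"operation", "cu"}
missing = required_columns - set(sample_df.columns)
if missing:
    raise ValueError(f"CSV is missing required columns: {sorted(missing)}")

# Coerce `cu` to numeric; unparseable entries become NaN and are counted.
sample_df["cu"] = pd.to_numeric(sample_df["cu"], errors="coerce")
bad_rows = int(sample_df["cu"].isna().sum())
print(f"{len(sample_df)} rows, {bad_rows} with non-numeric cu")
```

The same two checks could be applied to `df` right after `pd.read_csv(data_path)` if your logs come from a less controlled source.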

In [None]:
# Configure your CU benchmark data path here
# Update this path to point to your CU benchmark CSV file
data_path = 'outputs/cu_benchmark_logs.csv'

# Alternative paths to check if the above doesn't exist
alternative_paths = [
    'outputs/cu_logs.csv',
    'cu_benchmark_logs.csv',
    'cu_logs.csv'
]

# Find the actual path
if not os.path.exists(data_path):
    print(f"⚠️  Primary path not found: {data_path}")
    for alt in alternative_paths:
        if os.path.exists(alt):
            data_path = alt
            print(f"✓ Using alternative path: {data_path}")
            break
    else:
        print("⚠️  WARNING: CU benchmark CSV not found!")
        print("   Please ensure your CU data is in one of these locations:")
        for path in [data_path] + alternative_paths:
            print(f"     - {path}")
# Load your raw CU benchmark data
try:
    df = pd.read_csv(data_path)
    print(f"✓ Loaded CU benchmark data: {len(df)} rows")
    print(f"\nColumns: {list(df.columns)}")
    print(f"\nFirst few rows:")
    print(df.head())
except FileNotFoundError:
    print(f"ERROR: Could not find CU benchmark data at {data_path}")
    print("Please provide your CU benchmark CSV file and update the data_path variable above.")
    df = None

# Strip stray whitespace so operation names match the labels used for row ordering in Step 3
if df is not None:
    df['operation'] = df['operation'].str.strip()
    print(f"\n✓ Found {df['operation'].nunique()} unique operations:")
    print(df['operation'].value_counts())

## Step 3: Compute Percentile Statistics

Calculate key percentiles for each operation type: min, p50, p75, p90, p95, p99, max.

These percentiles reveal:
- **min**: Best-case scenario (rarely relevant)
- **p50**: Median, the typical case
- **p75-p90**: Common spikes (important for normal operation)
- **p95-p99**: Extreme spikes (critical for on-chain safety)
- **max**: Absolute worst case (use with caution; may be an outlier)
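
The `groupby(...).quantile(...).unstack()` pattern used in the next cell can be seen on a toy table first. The operation names `A` and `B` and their CU values below are hypothetical and chosen so the result is easy to check by hand. Note that pandas interpolates linearly between samples by default.

```python
import pandas as pd

# Hypothetical measurements for two operations.
toy = pd.DataFrame({
    "operation": ["A"] * 5 + ["B"] * 5,
    "cu": [100, 200, 300, 400, 500, 10, 10, 10, 10, 1000],
})

# One row per operation, one column per requested quantile.
quantiles = [0, 0.5, 1.0]
toy_pivot = toy.groupby("operation")["cu"].quantile(quantiles).unstack()
toy_pivot.columns = ["min", "p50", "max"]
print(toy_pivot)
```

Operation `B` shows why the upper columns matter: its median equals its minimum, and only `max` reveals the single expensive call.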

In [None]:
if df is not None:
    # 1. Compute percentiles per operation
    percentiles = [0, 0.50, 0.75, 0.90, 0.95, 0.99, 1.0]
    labels = ['min', 'p50', 'p75', 'p90', 'p95', 'p99', 'max']

    pivot = df.groupby('operation')['cu'].quantile(percentiles).unstack()
    pivot.columns = labels

    print(f"✓ Computed percentiles for {len(pivot)} operation types\n")
    print(pivot)

    # 2. Reorder rows exactly like your screenshot (if available in data)
    # Update with any actual operation names from your data
    desired_order = [
        'BlindUpdate / blindupdate',
        'FastUpdate / all',
        'FullUpdate / oracle-only',
        'FullUpdate / bid-all',
        'FullUpdate / ask-all',
        'FullUpdate / both-all',
        'Swap Sell / Curve A',
        'Swap Sell / Curve B',
        'Swap Sell / Curve C',
        'Swap Sell / mixed',
        'Swap Buy / Curve A',
        'Swap Buy / Curve B',
        'Swap Buy / Curve C',
        'Swap Buy / mixed'
    ]

    # Only include operations that actually exist in the data
    available_order = [op for op in desired_order if op in pivot.index]
    if available_order:
        pivot = pivot.reindex(available_order)
        print(f"\n✓ Reordered to {len(available_order)} available operations")
    else:
        print("\nℹ️  Could not reorder by desired names. Using data as-is.")
        print(f"   Actual operation names in data: {list(pivot.index)}")
else:
    print("⚠️  Skipping percentile computation (no data loaded)")

## Step 4: Create CU Percentile Heatmap

Generate the professional YlOrRd heatmap visualization with annotations. 

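**Color coding** (approximate: the color scale is fitted to the data's own minimum and maximum via `vmin`/`vmax`, so the bands shift with your measurements):
- **Yellow**: Safe zone (CU < 10k–15k)
- **Orange**: Caution zone (CU 15k–30k)
- **Dark Red**: Danger zone (CU > 30k–40k); likely to fail on-chain under real traffic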
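
If you would rather have colors correspond to fixed CU bands than to the data's range, one option is a discrete colormap with `matplotlib.colors.BoundaryNorm`. The sketch below is an assumption about how that could look, not part of this notebook's pipeline. The band edges (15k and 30k), the 1.4M upper bound, and the hex colors are illustrative choices.

```python
from matplotlib.colors import BoundaryNorm, ListedColormap

# Illustrative band edges in CU:
# [0, 15k) yellow, [15k, 30k) orange, [30k, 1.4M) dark red.
band_edges = [0, 15_000, 30_000, 1_400_000]
band_cmap = ListedColormap(["#ffffb2", "#fd8d3c", "#bd0026"])
band_norm = BoundaryNorm(band_edges, band_cmap.N)

# Each value maps to the index of its band (0 = yellow, 1 = orange, 2 = dark red).
for cu in (5_000, 20_000, 50_000):
    print(cu, "->", int(band_norm(cu)))
```

Since `sns.heatmap` forwards extra keyword arguments to matplotlib's `pcolormesh`, passing `cmap=band_cmap, norm=band_norm` and dropping `vmin`/`vmax` should apply these bands to the heatmap.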

In [None]:
if df is not None and 'pivot' in locals():
    # Create output directories if they don't exist
    output_dir = Path('outputs/images')
    output_dir.mkdir(parents=True, exist_ok=True)

    # 1. Create the heatmap
    fig, ax = plt.subplots(figsize=(14, 10))
    
    cmap = sns.color_palette("YlOrRd", as_cmap=True)  # yellow → orange → dark red

    heatmap = sns.heatmap(
        pivot,
        annot=True,
        fmt=',.0f',
        cmap=cmap,
        linewidths=0.5,
        linecolor='white',
        cbar_kws={'label': 'Compute Units (CU)'},
        ax=ax,
        vmin=pivot.min().min(),
        vmax=pivot.max().max()
    )

    plt.title('CU Percentile Heatmap\nWorst-case Compute Cost per Operation (Critical for On-Chain Safety)', 
              fontsize=18, pad=20, fontweight='bold')
    plt.xlabel('Percentile', fontsize=14, fontweight='bold')
    plt.ylabel('Operation Type', fontsize=14, fontweight='bold')

    # Rotate labels for readability
    plt.xticks(rotation=45, ha='right')
    plt.yticks(rotation=0)

    # 2. Highlight dangerous zones (optional: red box on high values)
    danger_threshold = 40000  # Adjust based on your safety limits
    caution_threshold = 30000

    for i in range(pivot.shape[0]):
        for j in range(pivot.shape[1]):
            value = pivot.iloc[i, j]
            if value > danger_threshold:
                ax.add_patch(plt.Rectangle((j, i), 1, 1, fill=False, edgecolor='darkred', lw=3))
            elif value > caution_threshold:
                ax.add_patch(plt.Rectangle((j, i), 1, 1, fill=False, edgecolor='red', lw=2))

    plt.tight_layout()
    
    # 3. Save the figure
    output_path = output_dir / 'cu_percentile_heatmap.png'
    plt.savefig(output_path, dpi=400, bbox_inches='tight')
    print(f"✓ Heatmap saved to: {output_path}")

    plt.show()

    print("\n📊 Heatmap generated successfully!")
else:
    print("⚠️  Cannot create heatmap (missing data or pivot table)")

## Step 5: Identify Dangerous Operations

Operations where p99 CU exceeds critical thresholds pose on-chain execution risks during real traffic conditions.

**Safety guidelines** (Feb 2026; cutoffs match the 20k/30k values used in the code cells of this notebook):
- **Safe (green)**: p99 ≤ 20k CU
- **Caution (yellow)**: 20k < p99 ≤ 30k CU
- **Dangerous (red)**: p99 > 30k CU → likely transaction failures under load
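
The three-way split can also be expressed in one step with `pd.cut`, using the 20k and 30k cutoffs that this notebook's code cells apply. This is a minimal sketch. The operation names and p99 values are hypothetical and deliberately placed on and just above each cutoff to show which side the boundaries fall on.

```python
import numpy as np
import pandas as pd

# Hypothetical p99 values chosen to sit on and around the cutoffs.
p99 = pd.Series(
    [20_000, 20_001, 30_000, 30_001],
    index=["op_a", "op_b", "op_c", "op_d"],
    name="p99",
)

# Right-closed bins: (-inf, 20k] SAFE, (20k, 30k] CAUTION, (30k, inf] DANGEROUS.
risk = pd.cut(
    p99,
    bins=[-np.inf, 20_000, 30_000, np.inf],
    labels=["SAFE", "CAUTION", "DANGEROUS"],
)
print(risk)
```

Because the bins are closed on the right, a p99 of exactly 20,000 counts as safe and exactly 30,000 as caution, matching the `<=` comparisons used in the filters of the following cell.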

In [None]:
if df is not None and 'pivot' in locals():
    print("=" * 80)
    print("DANGEROUS OPERATIONS (p99 > 30k CU)")
    print("=" * 80)
    dangerous = pivot[pivot['p99'] > 30000][['p99', 'max']].sort_values('p99', ascending=False)
    if len(dangerous) > 0:
        print(dangerous)
        print(f"\n⚠️  {len(dangerous)} operation(s) exceed safe threshold!")
        print("   These are likely to fail on-chain during real traffic.")
    else:
        print("\n✓ No operations exceed 30k CU at p99 (good sign!)")

    print("\n" + "=" * 80)
    print("CAUTION OPERATIONS (20k < p99 ≤ 30k CU)")
    print("=" * 80)
    caution = pivot[(pivot['p99'] > 20000) & (pivot['p99'] <= 30000)][['p99', 'max']].sort_values('p99', ascending=False)
    if len(caution) > 0:
        print(caution)
        print(f"\n⚠️  {len(caution)} operation(s) in caution zone")
        print("   Monitor closely; may fail during extreme volatility.")
    else:
        print("\n✓ No operations in caution zone (excellent!)")

    print("\n" + "=" * 80)
    print("SAFE OPERATIONS (p99 ≤ 20k CU)")
    print("=" * 80)
    safe = pivot[pivot['p99'] <= 20000][['p99', 'max']]
    print(f"✓ {len(safe)} operation(s) are safe for production")
    if len(safe) > 0:
        print(safe.sort_values('p99', ascending=False))
else:
    print("⚠️  Cannot analyze dangerous operations (missing data)")

## Step 6: Generate Performance Insights

Extract key metrics and actionable insights for report generation and challenge submission optimization.

In [None]:
if df is not None and 'pivot' in locals():
    print("=" * 80)
    print("KEY PERFORMANCE METRICS FOR PROP AMM CHALLENGE")
    print("=" * 80)

    # 1. Cheapest update method (by p99)
    update_ops = pivot.loc[pivot.index.str.contains('Update', case=False, na=False)]
    if len(update_ops) > 0:
        cheapest_update = update_ops['p99'].idxmin()
        cheapest_cu = update_ops.loc[cheapest_update, 'p99']
        print(f"\n✓ Cheapest update method (p99): {cheapest_update}")
        print(f"  CU cost: {cheapest_cu:,.0f} (p99), {update_ops.loc[cheapest_update, 'max']:,.0f} (max)")

    # 2. Most expensive swap operation (by p99)
    swap_ops = pivot.loc[pivot.index.str.contains('Swap', case=False, na=False)]
    if len(swap_ops) > 0:
        most_expensive_swap = swap_ops['p99'].idxmax()
        expensive_cu = swap_ops.loc[most_expensive_swap, 'p99']
        print(f"\n⚠️  Most expensive swap (p99): {most_expensive_swap}")
        print(f"  CU cost: {expensive_cu:,.0f} (p99), {swap_ops.loc[most_expensive_swap, 'max']:,.0f} (max)")

    # 3. Curve comparison (if available)
    print("\n" + "-" * 80)
    print("CURVE COMPARISON (Buy/Sell Side p99 CU)")
    print("-" * 80)
    for curve in ['Curve A', 'Curve B', 'Curve C', 'mixed']:
        curve_ops = pivot.loc[pivot.index.str.contains(curve, case=False, na=False)]
        if len(curve_ops) > 0:
            avg_p99 = curve_ops['p99'].mean()
            # Width must precede the thousands separator in a format spec (`8,.0f`, not `,8.0f`)
            print(f"{curve:12s}: avg p99 = {avg_p99:8,.0f} CU")
            for op in curve_ops.index:
                print(f"       {op:50s}: {curve_ops.loc[op, 'p99']:8.0f} CU (p99)")

    # 4. Percentile progression analysis
    print("\n" + "-" * 80)
    print("PERCENTILE PROGRESSION (Global Statistics)")
    print("-" * 80)
    global_stats = pd.DataFrame({
        'min': pivot['min'].min(),
        'p50': pivot['p50'].mean(),
        'p75': pivot['p75'].mean(),
        'p90': pivot['p90'].mean(),
        'p95': pivot['p95'].mean(),
        'p99': pivot['p99'].mean(),
        'max': pivot['max'].max()
    }, index=['Global']).T
    print(global_stats)

    # 5. Summary for report
    print("\n" + "=" * 80)
    print("SUMMARY FOR CHALLENGE SUBMISSION")
    print("=" * 80)
    print(f"\n📊 Total operations analyzed: {len(pivot)}")
    print(f"📊 Operations in danger zone (p99 > 30k): {len(pivot[pivot['p99'] > 30000])}")
    print(f"📊 Operations in safe zone (p99 ≤ 20k): {len(pivot[pivot['p99'] <= 20000])}")
    
    # Calculate overall safety score
    total_ops = len(pivot)
    safe_ops = len(pivot[pivot['p99'] <= 20000])
    safety_score = (safe_ops / total_ops * 100) if total_ops > 0 else 0
    print(f"\n🎯 Safety Score: {safety_score:.1f}% ({safe_ops}/{total_ops} operations are safe)")

    if safety_score >= 80:
        print("   ✓ EXCELLENT: Your Prop AMM is well-optimized for on-chain execution!")
    elif safety_score >= 50:
        print("   ⚠️  MODERATE: Good baseline, but some operations need optimization.")
    else:
        print("   ❌ POOR: Significant optimization needed before mainnet deployment.")

    print("\n" + "=" * 80)
else:
    print("⚠️  Cannot generate insights (missing data)")

## Bonus: Export Results for Report Generation

Save the percentile matrix and analysis results to CSV/JSON for easy inclusion in your challenge report.
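
To check that exported files can be consumed downstream, a round trip is a quick test. The sketch below writes a hypothetical two-row percentile matrix (a stand-in for `pivot`, with invented values) to a temporary directory and reads it back. The file names `demo.csv` and `demo.json` are placeholders, and the JSON layout here is simpler than the summary the next cell writes.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical two-row percentile matrix standing in for `pivot`.
demo = pd.DataFrame(
    {"p50": [120.0, 18000.0], "p99": [150.0, 31000.0]},
    index=pd.Index(["FastUpdate / all", "Swap Buy / Curve B"], name="operation"),
)

with tempfile.TemporaryDirectory() as tmp:
    csv_file = Path(tmp) / "demo.csv"
    json_file = Path(tmp) / "demo.json"

    # Write both formats.
    demo.to_csv(csv_file)
    with open(json_file, "w") as f:
        json.dump(demo.to_dict(orient="index"), f, indent=2)

    # Read both back; index_col restores the operation names as the index.
    from_csv = pd.read_csv(csv_file, index_col="operation")
    with open(json_file) as f:
        from_json = json.load(f)

print(from_csv.loc["Swap Buy / Curve B", "p99"])
print(from_json["FastUpdate / all"]["p50"])
```

Passing `index_col` when reading the CSV matters: without it, the operation names come back as an ordinary column rather than as the row index.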

In [None]:
if df is not None and 'pivot' in locals():
    import json
    from pathlib import Path
    
    output_dir = Path('outputs')
    output_dir.mkdir(exist_ok=True)
    
    # 1. Export percentile matrix to CSV
    csv_path = output_dir / 'cu_percentile_analysis.csv'
    pivot.to_csv(csv_path)
    print(f"✓ Exported percentile analysis to: {csv_path}")
    
    # 2. Export summary statistics to JSON
    summary_data = {
        'total_operations': len(pivot),
        'safe_operations': int(len(pivot[pivot['p99'] <= 20000])),
        'caution_operations': int(len(pivot[(pivot['p99'] > 20000) & (pivot['p99'] <= 30000)])),
        'dangerous_operations': int(len(pivot[pivot['p99'] > 30000])),
        'global_p99_avg': float(pivot['p99'].mean()),
        'global_p99_max': float(pivot['p99'].max()),
        'global_p99_min': float(pivot['p99'].min()),
        'operations': {
            op: {
                'min': float(pivot.loc[op, 'min']),
                'p50': float(pivot.loc[op, 'p50']),
                'p75': float(pivot.loc[op, 'p75']),
                'p90': float(pivot.loc[op, 'p90']),
                'p95': float(pivot.loc[op, 'p95']),
                'p99': float(pivot.loc[op, 'p99']),
                'max': float(pivot.loc[op, 'max']),
                'risk_level': 'SAFE' if pivot.loc[op, 'p99'] <= 20000 else ('CAUTION' if pivot.loc[op, 'p99'] <= 30000 else 'DANGEROUS')
            }
            for op in pivot.index
        }
    }
    
    json_path = output_dir / 'cu_percentile_analysis.json'
    with open(json_path, 'w') as f:
        json.dump(summary_data, f, indent=2)
    print(f"✓ Exported summary stats to: {json_path}")

    print("\n✅ All results exported successfully!")
    print(f"\nYou can now use these files in your challenge report:")
    print(f"  - CSV (for spreadsheets): {csv_path}")
    print(f"  - JSON (for dashboards): {json_path}")
    print(f"  - PNG (for presentations): outputs/images/cu_percentile_heatmap.png")
else:
    print("⚠️  Cannot export results (missing data)")