# 💰 Deep Cost Analysis - Fully Dynamic

Detailed cost breakdown and projections using **100% dynamic model selection**.

**NO HARDCODED MODELS** - everything is derived from:
- 📊 **LiveBench** - Real-time benchmarks
- 💰 **Heuristic pricing** - Model size-based cost estimation
- ⚡ **Throughput metrics** - Speed data

**Analysis includes:**
- Cost per token
- Projected costs for typical workloads
- LiveBench coding scores
- Value analysis (performance / cost)

In [None]:
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns


# Setup
sns.set_style("whitegrid")
plt.rcParams["figure.figsize"] = (14, 6)

# Import model selection system
from livebench_data import (
    enrich_models_with_metrics,
    fetch_top_models,
    get_top_by_price,
    get_top_by_quality,
    get_top_by_speed,
    get_top_overall,
)


print("✅ Setup complete")

## 📊 Step 1: Dynamically Load Top Models

Load and analyze top-30 unique models from LiveBench with pricing and speed metrics.
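
Later cells read the keys `model`, `coding_score`, `price`, and `speed` from each record. The sketch below is a hypothetical sanity check (the helper `find_incomplete` and the sample records are illustrative, not part of `livebench_data`) that flags records missing any of those keys before the analysis runs.

```python
# Keys the later cells read from each model record.
# This set is an assumption inferred from how the notebook uses the records.
REQUIRED_KEYS = {"model", "coding_score", "price", "speed"}


def find_incomplete(models):
    """Return (model name, sorted missing keys) for each incomplete record."""
    problems = []
    for record in models:
        missing = REQUIRED_KEYS - set(record)
        if missing:
            problems.append((record.get("model", "<unnamed>"), sorted(missing)))
    return problems


# Hypothetical records, for illustration only
sample = [
    {"model": "example/model-a", "coding_score": 75.0, "price": 2e-6, "speed": 80.0},
    {"model": "example/model-b", "coding_score": 60.0, "price": 5e-7},
]
print(find_incomplete(sample))
```

Running a check like this on the loaded `models` list would surface a missing `price` or `speed` early, rather than as a `KeyError` in a later cell.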

In [None]:
# Fetch top models
print("🔄 Loading top models from LiveBench...")
models = fetch_top_models(top_n=50, unique=True)  # Get 50, keep ~33 unique
models = enrich_models_with_metrics(models[:30])  # Keep top-30 with pricing/speed

print(f"✅ Loaded {len(models)} unique models\n")

# Show first 10
print("First 10 models:")
for i, m in enumerate(models[:10], 1):
    print(
        f"  {i:2d}. {m['model'][:40]:40s} | Score: {m['coding_score']:5.1f}% | Price: ${m['price']:.2e}/tok"
    )

## 💵 Step 2: Calculate Costs for Different Workload Scenarios
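
The cell below prices input and output tokens at the same per-token rate. Many providers charge more for output tokens, so here is a minimal sketch of how the per-task cost could be split if separate rates were available. The function `task_cost` and the example prices are hypothetical and not part of the notebook's data.

```python
def task_cost(input_tokens, output_tokens, input_price, output_price=None):
    """Cost in dollars for one task; prices are in dollars per token.

    If output_price is omitted, input_price is used for both directions,
    which matches the single-price assumption used in this notebook.
    """
    if output_price is None:
        output_price = input_price
    return input_tokens * input_price + output_tokens * output_price


# "Medium Task" token counts with made-up prices
blended = task_cost(2000, 800, 1e-6)        # one price for both directions
split = task_cost(2000, 800, 1e-6, 4e-6)    # output priced 4x higher
print(f"blended: ${blended:.4f}")
print(f"split:   ${split:.4f}")
```

With these illustrative numbers the split-price estimate is nearly double the blended one, which shows how sensitive the projections are to the single-price assumption.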

In [None]:
# Define realistic coding scenarios
SCENARIOS = {
    "Small Task": {
        "description": "Add docstring, fix typo",
        "input_tokens": 500,
        "output_tokens": 200,
    },
    "Medium Task": {
        "description": "Add feature, refactor function",
        "input_tokens": 2000,
        "output_tokens": 800,
    },
    "Large Task": {
        "description": "Complex refactoring, new module",
        "input_tokens": 10000,
        "output_tokens": 3000,
    },
    "Daily Usage (100 tasks)": {
        "description": "100 small tasks per day",
        "input_tokens": 50000,
        "output_tokens": 20000,
    },
}

# Calculate costs for each scenario and model
scenario_costs = []

for scenario_name, scenario in SCENARIOS.items():
    input_tokens = scenario["input_tokens"]
    output_tokens = scenario["output_tokens"]

    for model in models:
        # Calculate cost (assuming same price for input and output)
        total_tokens = input_tokens + output_tokens
        cost = total_tokens * model["price"]

        scenario_costs.append(
            {
                "Scenario": scenario_name,
                "Model": model["model"],
                "LiveBench": model["coding_score"],
                "Cost": cost,
                "Quality": model["coding_score"],
            }
        )

scenario_df = pd.DataFrame(scenario_costs)

# Show each scenario
for scenario_name in SCENARIOS:
    print(f"\n{'=' * 80}")
    print(f"📋 {scenario_name}: {SCENARIOS[scenario_name]['description']}")
    print(f"   Input: {SCENARIOS[scenario_name]['input_tokens']:,} tokens")
    print(f"   Output: {SCENARIOS[scenario_name]['output_tokens']:,} tokens")
    print(f"{'=' * 80}")

    scenario_data = scenario_df[scenario_df["Scenario"] == scenario_name].copy()
    scenario_data = scenario_data.sort_values("LiveBench", ascending=False)
    scenario_data = scenario_data.head(10)  # Show top 10
    scenario_data["Rank"] = range(1, len(scenario_data) + 1)

    print(scenario_data[["Rank", "Model", "LiveBench", "Cost"]].to_string(index=False))
    print()

## 📅 Step 3: Monthly Cost Projections
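
The monthly projection is the daily token volume times the per-token price times 30 days. As a worked sketch, using a hypothetical helper `monthly_cost` and a made-up price of $1e-6 per token:

```python
def monthly_cost(price_per_token, tasks_per_day=100, input_tokens=500,
                 output_tokens=200, days=30):
    """Projected cost in dollars, assuming one price for input and output tokens."""
    tokens_per_day = tasks_per_day * (input_tokens + output_tokens)
    return tokens_per_day * days * price_per_token


# 100 small tasks/day = 70,000 tokens/day; over 30 days = 2,100,000 tokens
print(f"${monthly_cost(1e-6):.2f}")
```

The defaults mirror the "Daily Usage (100 tasks)" scenario defined earlier: 100 small tasks of 500 input and 200 output tokens each.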

In [None]:
# Calculate monthly costs (30 days of daily usage)
daily_scenario = scenario_df[scenario_df["Scenario"] == "Daily Usage (100 tasks)"].copy()
daily_scenario["Monthly Cost"] = daily_scenario["Cost"] * 30
daily_scenario = daily_scenario.sort_values("LiveBench", ascending=False)

print("\n📅 Monthly Cost Projections (100 tasks/day × 30 days):")
print("=" * 80)
print(f"{'Rank':<5} {'Model':<40} {'LiveBench':<12} {'Monthly Cost'}")
print("=" * 80)

for i, (_, row) in enumerate(daily_scenario.head(15).iterrows(), 1):  # Show top 15
    print(
        f"{i:<5} {row['Model'][:40]:40s} {row['LiveBench']:>6.1f}%     ${row['Monthly Cost']:>8.2f}"
    )

# Visualize monthly costs
fig, ax = plt.subplots(figsize=(14, 8))

# Keep the 15 highest-scoring models, ordered so the best one appears at the top of the chart
monthly_sorted = daily_scenario.nlargest(15, "LiveBench").sort_values("LiveBench", ascending=True)

bars = ax.barh(range(len(monthly_sorted)), monthly_sorted["Monthly Cost"], color="green", alpha=0.7)
ax.set_yticks(range(len(monthly_sorted)))
ax.set_yticklabels(monthly_sorted["Model"], fontsize=10)
ax.set_xlabel("Monthly Cost ($)", fontsize=12)
ax.set_title("💳 Monthly Cost - 100 tasks/day (top 15 models)", fontsize=14, fontweight="bold")
ax.grid(axis="x", alpha=0.3)

# Add LiveBench scores as text
for i, (_, row) in enumerate(monthly_sorted.iterrows()):
    ax.text(
        row["Monthly Cost"] * 1.02,
        i,
        f"{row['LiveBench']:.0f}%",
        va="center",
        fontsize=9,
        fontweight="bold",
        color="blue",
    )

plt.tight_layout()
plt.savefig("monthly_costs_dynamic.png", dpi=150, bbox_inches="tight")
plt.show()

print("\n📸 Saved: monthly_costs_dynamic.png")

## 💎 Step 4: Value Analysis - Best Models by Category
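
The value score used in this step is the LiveBench score divided by the monthly cost plus a small constant (0.01), which keeps the ratio finite for a zero-cost model. A minimal sketch with made-up models shows how that ratio behaves; the names and numbers are hypothetical.

```python
EPSILON = 0.01  # same constant this step adds to the denominator


def value_score(score, monthly_cost, epsilon=EPSILON):
    """Performance per dollar; epsilon avoids division by zero for free models."""
    return score / (monthly_cost + epsilon)


# Hypothetical (score %, monthly cost $) pairs, for illustration only
candidates = {
    "example/premium": (80.0, 20.0),
    "example/mid": (70.0, 5.0),
    "example/free": (50.0, 0.0),
}
scores = {name: value_score(s, c) for name, (s, c) in candidates.items()}
best = max(scores, key=scores.get)

for name, value in scores.items():
    print(f"{name:18s} value score: {value:10.2f}")
print("best value:", best)
```

Because the cost is in the denominator, near-free models dominate this ratio regardless of quality, so it is worth reading the "best value" result alongside a minimum-quality filter such as the 70% threshold used for the budget option.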

In [None]:
# Prepare analysis data
monthly_analysis = daily_scenario.copy()
# Add 0.01 to the denominator so a zero-cost model does not divide by zero
monthly_analysis['Value Score'] = monthly_analysis['LiveBench'] / (monthly_analysis['Monthly Cost'] + 0.01)
monthly_analysis = monthly_analysis.set_index('Model')

print("\n" + "="*80)
print("🏆 BEST MODELS BY CATEGORY")
print("="*80)

# 1. Best LiveBench performance
best_quality = monthly_analysis['LiveBench'].idxmax()
print("\n🥇 Best Quality (Highest LiveBench Score):")
print(f"   Model: {best_quality}")
print(f"   Score: {monthly_analysis.loc[best_quality, 'LiveBench']:.1f}%")
print(f"   Monthly: ${monthly_analysis.loc[best_quality, 'Monthly Cost']:.2f}")

# 2. Cheapest
cheapest = monthly_analysis['Monthly Cost'].idxmin()
print("\n💰 Cheapest:")
print(f"   Model: {cheapest}")
print(f"   Monthly: ${monthly_analysis.loc[cheapest, 'Monthly Cost']:.2f}")
print(f"   Score: {monthly_analysis.loc[cheapest, 'LiveBench']:.1f}%")

# 3. Best value (performance / cost)
best_value = monthly_analysis['Value Score'].idxmax()
print("\n💎 Best Value (Performance / Cost):")
print(f"   Model: {best_value}")
print(f"   Value Score: {monthly_analysis.loc[best_value, 'Value Score']:.2f}")
print(f"   Score: {monthly_analysis.loc[best_value, 'LiveBench']:.1f}%")
print(f"   Monthly: ${monthly_analysis.loc[best_value, 'Monthly Cost']:.2f}")

# 4. High-quality budget option (score >= 70%, lowest cost)
budget_options = monthly_analysis[monthly_analysis['LiveBench'] >= 70.0]
if not budget_options.empty:
    best_budget = budget_options['Monthly Cost'].idxmin()
    print("\n🎯 Best Budget Option (Quality ≥ 70%):")
    print(f"   Model: {best_budget}")
    print(f"   Score: {monthly_analysis.loc[best_budget, 'LiveBench']:.1f}%")
    print(f"   Monthly: ${monthly_analysis.loc[best_budget, 'Monthly Cost']:.2f}")

print("\n" + "="*80)

## 📊 Step 5: Quality vs Cost Scatter Plot

In [None]:
# Scatter plot: Quality vs Monthly Cost
plt.figure(figsize=(12, 8))

monthly_reset = monthly_analysis.reset_index()

scatter = plt.scatter(
    monthly_reset["Monthly Cost"],
    monthly_reset["LiveBench"],
    s=300,
    alpha=0.6,
    c=monthly_reset["Value Score"],
    cmap="viridis",
    edgecolors="black",
)

# Add model labels (shorter names for clarity)
for _, row in monthly_reset.iterrows():
    label = row["Model"].split("/")[-1][:35]  # drop any provider prefix, then truncate
    plt.annotate(
        label,
        xy=(row["Monthly Cost"], row["LiveBench"]),
        xytext=(8, 8),
        textcoords="offset points",
        fontsize=7,
        alpha=0.8,
        bbox={"boxstyle": "round,pad=0.3", "facecolor": "white", "alpha": 0.7},
    )

plt.xlabel("Monthly Cost ($) - 100 tasks/day", fontsize=12)
plt.ylabel("LiveBench Coding Score (%)", fontsize=12)
plt.title("Quality vs Cost - Find Your Sweet Spot", fontsize=14, fontweight="bold")
cbar = plt.colorbar(scatter, label="Value Score (higher = better)")
plt.grid(alpha=0.3)
plt.tight_layout()
plt.savefig("quality_vs_cost_scatter.png", dpi=150, bbox_inches="tight")
plt.show()

print("📸 Saved: quality_vs_cost_scatter.png")

## 🎯 Step 6: Compare the Top 3 Models in 4 Categories
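
The `get_top_by_*` helpers come from `livebench_data`, and their implementation is not shown in this notebook. As a rough mental model only, selecting the top N records by one metric can be sketched with a sort and a slice. The function `top_n_by` and the sample records below are hypothetical.

```python
def top_n_by(models, key, n=3, highest_first=True):
    """Return the n records with the best value for `key`.

    Illustrative only: the real livebench_data helpers may rank differently.
    """
    return sorted(models, key=lambda m: m[key], reverse=highest_first)[:n]


# Hypothetical records, for illustration only
sample_models = [
    {"model": "example/a", "coding_score": 80.0, "price": 4e-6},
    {"model": "example/b", "coding_score": 65.0, "price": 5e-7},
    {"model": "example/c", "coding_score": 72.0, "price": 1e-6},
]

smartest = top_n_by(sample_models, "coding_score", n=2)
cheapest = top_n_by(sample_models, "price", n=2, highest_first=False)
print([m["model"] for m in smartest])
print([m["model"] for m in cheapest])
```

Quality and speed rank highest-first while price ranks lowest-first, which is why the sketch exposes the sort direction as a parameter.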

In [None]:
# Get top models from each category
top_quality = get_top_by_quality(models, 3)
top_cheap = get_top_by_price(models, 3)
top_fast = get_top_by_speed(models, 3)
top_overall = get_top_overall(models, 3)

print("\n🏆 TOP-3 SMARTEST (Best Quality):")
for i, m in enumerate(top_quality, 1):
    monthly = (500 + 200) * m["price"] * 30 * 100  # 100 small tasks/day for 30 days
    print(
        f"  {i}. {m['model'][:45]:45s} | Score: {m['coding_score']:.1f}% | Monthly: ${monthly:.2f}"
    )

print("\n💰 TOP-3 CHEAPEST:")
for i, m in enumerate(top_cheap, 1):
    monthly = (500 + 200) * m["price"] * 30 * 100
    print(f"  {i}. {m['model'][:45]:45s} | Price: ${m['price']:.2e}/tok | Monthly: ${monthly:.2f}")

print("\n⚡ TOP-3 FASTEST:")
for i, m in enumerate(top_fast, 1):
    monthly = (500 + 200) * m["price"] * 30 * 100
    print(f"  {i}. {m['model'][:45]:45s} | Speed: {m['speed']:.0f} tok/s | Monthly: ${monthly:.2f}")

print("\n🎯 TOP-3 BEST OVERALL:")
for i, m in enumerate(top_overall, 1):
    monthly = (500 + 200) * m["price"] * 30 * 100
    print(
        f"  {i}. {m['model'][:45]:45s} | Overall: {m['overall_score']:.3f} | Monthly: ${monthly:.2f}"
    )

## ✅ Summary

This notebook provides **fully dynamic cost analysis**:

✅ **No hardcoded models** - all models and scores fetched from LiveBench  
✅ **Heuristic pricing estimates** - based on model size and type  
✅ **Quality metrics** - LiveBench coding scores  
✅ **Value analysis** - performance / cost ratios  
✅ **Realistic scenarios** - projections for typical coding workloads  

**Use the analysis above to:**
- Choose the best model for your budget
- Find the best value option
- Estimate monthly costs for typical workloads
- Balance quality and cost based on your needs

**Data updates:** To get fresh data, update the `LIVEBENCH_DATE` in `livebench_data.py`  
**Source:** https://livebench.ai/table_2025_11_25.csv
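
The source URL above embeds the release date as `YYYY_MM_DD`. Assuming other releases follow the same pattern (an inference from this single URL, not something confirmed here), the table URL for a given date could be built as follows. The helper `livebench_table_url` is hypothetical, and the actual format of `LIVEBENCH_DATE` in `livebench_data.py` is not shown in this notebook.

```python
from datetime import date


def livebench_table_url(release: date) -> str:
    """Build a table URL, assuming the table_YYYY_MM_DD.csv naming pattern."""
    return f"https://livebench.ai/table_{release:%Y_%m_%d}.csv"


print(livebench_table_url(date(2025, 11, 25)))
```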