# 🚀 HuggingFace Spaces Cost Optimization with Terradev CLI

## 📊 Cut GPU Costs by up to 85% for ML Deployments

This notebook demonstrates how to use **Terradev CLI v2.8** to optimize HuggingFace Spaces deployment costs while maintaining performance.

### 🎯 What You'll Learn:
- How to analyze model hardware requirements
- Smart hardware optimization strategies
- Cost comparison between different GPU options
- One-command deployment to HF Spaces

### 💰 Expected Savings:
- **Manual deployment:** A100-80GB @ $4.06/hr = $974/month (at 8 hours/day, 30 days)
- **Terradev optimized:** a10g-large @ $0.60/hr = $144/month (same usage)
- **Total savings:** $830/month, an 85% cost reduction
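
The monthly figures above follow from a usage assumption of 8 hours per day over a 30-day month, the same basis as the "Monthly (8h/day)" column later in this notebook. A minimal sketch of that arithmetic, where `monthly_cost` and `savings_pct` are illustrative helper names and not part of Terradev:

```python
def monthly_cost(hourly_rate, hours_per_day=8, days=30):
    """Monthly cost in dollars for a given hourly rate and usage pattern."""
    return hourly_rate * hours_per_day * days


def savings_pct(baseline, optimized):
    """Percentage saved by moving from the baseline cost to the optimized cost."""
    return (baseline - optimized) / baseline * 100


a100 = monthly_cost(4.06)  # A100-80GB
a10g = monthly_cost(0.60)  # a10g-large

print(f"A100-80GB:  ${a100:,.2f}/month")
print(f"a10g-large: ${a10g:,.2f}/month")
print(f"Savings:    ${a100 - a10g:,.2f}/month ({savings_pct(a100, a10g):.0f}%)")
```

Changing `hours_per_day` to 24 shows what an always-on Space would cost at the same hourly rates.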

### 🛠️ Installation:

In [None]:
# Install Terradev CLI v2.8 with HuggingFace Spaces integration
!pip install terradev-cli==2.8.0 -q

# Verify installation
!python3 -m terradev_cli --version

## 🔍 Hardware Optimization Analysis

Let's analyze different popular models and see how Terradev optimizes hardware selection:

In [None]:
import subprocess
import json
import pandas as pd
from IPython.display import HTML, display

# Models to analyze
models = [
    "meta-llama/Meta-Llama-3-8B-Instruct",
    "mistralai/Mistral-7B-Instruct-v0.2",
    "sentence-transformers/all-MiniLM-L6-v2",
    "runwayml/stable-diffusion-v1-5",
    "codellama/CodeLlama-7b-hf"
]

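def analyze_model_hardware(model_id):
    """Run `terradev hf optimize` for a model and return its report as text."""
    try:
        result = subprocess.run(
            ["python3", "-m", "terradev_cli", "hf", "optimize", model_id],
            capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired) as e:
        return f"Error analyzing {model_id}: {e}"
    # A non-zero exit code means the CLI failed; surface stderr instead of empty stdout
    if result.returncode != 0:
        return f"Error analyzing {model_id}: {result.stderr.strip()}"
    return result.stdout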

# Analyze all models
print("🔍 Analyzing Hardware Optimization for Popular Models\n")
print("=" * 80)

for model in models:
    print(f"\n🎯 Model: {model}")
    print("-" * 60)
    analysis = analyze_model_hardware(model)
    print(analysis)
    print("\n" + "=" * 80)
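
The CLI output is easier to sanity-check against a rough estimate of your own. The sketch below is not Terradev's algorithm. It uses a common back-of-the-envelope rule that assumes model weights dominate memory (parameter count times bytes per parameter) and adds a flat overhead factor for activations and cache. The 1.2 overhead factor and the approximate parameter counts are assumptions for illustration only.

```python
# Bytes needed to store one parameter at each precision
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1}


def estimate_vram_gb(params_billions, dtype="fp16", overhead=1.2):
    """Rough memory estimate in GB: weights times an assumed overhead factor."""
    # 1e9 parameters * N bytes each is N GB (using 1 GB = 1e9 bytes)
    weights_gb = params_billions * BYTES_PER_PARAM[dtype]
    return weights_gb * overhead


# Approximate parameter counts in billions (assumed, rounded)
for name, size_b in [("Llama-3-8B", 8.0), ("Mistral-7B", 7.2), ("MiniLM-L6-v2", 0.023)]:
    print(f"{name}: ~{estimate_vram_gb(size_b):.1f} GB in fp16")
```

If the CLI recommends hardware with far less memory than this estimate, it is worth checking whether it assumes quantization.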

## 💰 Cost Comparison Analysis

Let's compare costs across different hardware options for Llama-3-8B:

In [None]:
# Compare hardware options for Llama-3-8B
model_id = "meta-llama/Meta-Llama-3-8B-Instruct"

print(f"🔍 Hardware Comparison for {model_id}\n")
print("=" * 80)

# Get hardware comparison
try:
    result = subprocess.run(
        ["python3", "-m", "terradev_cli", "hf", "compare", model_id],
        capture_output=True, text=True, timeout=30
    )
    print(result.stdout or result.stderr)  # fall back to the CLI's error text
except Exception as e:
    print(f"Error getting comparison: {str(e)}")

## 🎛️ Budget-Constrained Optimization

What if you have a specific budget? Let's find the best hardware within $0.50/hr:

In [None]:
# Budget optimization for $0.50/hr
budget = 0.50
model_id = "meta-llama/Meta-Llama-3-8B-Instruct"

print(f"💰 Budget Optimization: ${budget:.2f}/hr for {model_id}\n")
print("=" * 80)

try:
    result = subprocess.run(
        ["python3", "-m", "terradev_cli", "hf", "optimize", model_id, "--budget", str(budget)],
        capture_output=True, text=True, timeout=30
    )
    print(result.stdout or result.stderr)  # fall back to the CLI's error text
except Exception as e:
    print(f"Error with budget optimization: {str(e)}")
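
How the CLI chooses hardware under a budget is not documented in this notebook, but the idea can be sketched as a filter-then-rank step: keep the options that satisfy both the memory requirement and the hourly budget, then take the one with the highest performance score. The prices, scores, and memory sizes below are a subset of the illustrative figures used in this notebook's plots, and `pick_hardware` is a hypothetical helper, not a Terradev function.

```python
# Illustrative figures only (hourly cost in $, relative performance score, memory in GB)
HARDWARE = {
    "cpu-upgrade": {"cost": 0.15, "performance": 2.0, "memory": 16},
    "t4-medium": {"cost": 0.35, "performance": 5.0, "memory": 16},
    "a10g-large": {"cost": 0.60, "performance": 8.0, "memory": 24},
    "a10g-xlarge": {"cost": 1.20, "performance": 16.0, "memory": 48},
}


def pick_hardware(required_memory_gb, budget_per_hour, options=HARDWARE):
    """Return the highest-scoring option that fits memory and budget, or None."""
    candidates = [
        (specs["performance"], name)
        for name, specs in options.items()
        if specs["memory"] >= required_memory_gb and specs["cost"] <= budget_per_hour
    ]
    if not candidates:
        return None
    # max() compares the performance score first, so the best performer wins
    return max(candidates)[1]


print(pick_hardware(required_memory_gb=10, budget_per_hour=0.50))
print(pick_hardware(required_memory_gb=20, budget_per_hour=0.50))
```

With these figures, a $0.50/hr budget rules out a10g-large, so a model that needs more than 16 GB would have to be quantized or the budget raised.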

## 🎨 Template Preview

Let's preview the generated template before deployment:

In [None]:
# Preview chat template for Llama-3-8B
model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
template_type = "chat"

print(f"🎨 Preview Template: {template_type} for {model_id}\n")
print("=" * 80)

try:
    result = subprocess.run(
        ["python3", "-m", "terradev_cli", "hf", "preview", model_id, "--template", template_type],
        capture_output=True, text=True, timeout=30
    )
    print(result.stdout or result.stderr)  # fall back to the CLI's error text
except Exception as e:
    print(f"Error previewing template: {str(e)}")

## 📋 Cost Summary Table

Let's create a comprehensive cost comparison table:

In [None]:
# Create cost comparison table
import pandas as pd

# Sample data based on Terradev optimization
cost_data = [
    {
        "Model": "Llama-3-8B-Instruct",
        "Recommended Hardware": "a10g-large",
        "Hourly Cost": "$0.60",
        "Monthly (8h/day)": "$144",
        "Memory Utilization": "50%",
        "Performance Score": "8.0"
    },
    {
        "Model": "Mistral-7B-Instruct",
        "Recommended Hardware": "a10g-large",
        "Hourly Cost": "$0.60",
        "Monthly (8h/day)": "$144",
        "Memory Utilization": "45%",
        "Performance Score": "8.0"
    },
    {
        "Model": "MiniLM-L6-v2",
        "Recommended Hardware": "cpu-upgrade",
        "Hourly Cost": "$0.15",
        "Monthly (8h/day)": "$36",
        "Memory Utilization": "25%",
        "Performance Score": "2.0"
    },
    {
        "Model": "Stable-Diffusion-v1-5",
        "Recommended Hardware": "a10g-large",
        "Hourly Cost": "$0.60",
        "Monthly (8h/day)": "$144",
        "Memory Utilization": "60%",
        "Performance Score": "8.0"
    },
    {
        "Model": "CodeLlama-7b-hf",
        "Recommended Hardware": "a10g-large",
        "Hourly Cost": "$0.60",
        "Monthly (8h/day)": "$144",
        "Memory Utilization": "45%",
        "Performance Score": "8.0"
    }
]

df = pd.DataFrame(cost_data)
display(HTML(df.to_html(index=False, escape=False)))

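# Derive the totals from the table instead of hard-coding them
monthly = df["Monthly (8h/day)"].str.lstrip("$").astype(float)
a100_monthly = 4.06 * 8 * 30  # A100-80GB baseline at 8 h/day for 30 days
baseline = a100_monthly * len(df)
savings = baseline - monthly.sum()

print(f"\n💰 Total Monthly Cost for All {len(df)} Models: ${monthly.sum():,.0f}")
print(f"🎯 Average Cost per Model: ${monthly.mean():,.2f}")
print(f"📈 Potential Savings vs A100: ${savings:,.0f}/month ({savings / baseline:.0%} savings)")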

## 🚀 Deployment Examples

Here are the exact commands to deploy each model to HuggingFace Spaces:

In [None]:
# Deployment commands (printed for reference only; nothing is executed here)
deployment_commands = [
    {
        "name": "Budget Chat Bot",
        "command": "terradev hf space budget-chat --model-id mistralai/Mistral-7B-Instruct-v0.2 --template chat --budget 0.50",
        "description": "Affordable chat bot with cost optimization"
    },
    {
        "name": "Premium Chat",
        "command": "terradev hf space premium-chat --model-id meta-llama/Meta-Llama-3-8B-Instruct --template chat",
        "description": "Premium chat with latest Llama-3 model"
    },
    {
        "name": "Fast Embeddings",
        "command": "terradev hf space fast-embeddings --model-id sentence-transformers/all-MiniLM-L6-v2 --template embedding",
        "description": "Fast embedding service with batch processing"
    },
    {
        "name": "AI Art Generator",
        "command": "terradev hf space art-generator --model-id runwayml/stable-diffusion-v1-5 --template image",
        "description": "AI art generator with Stable Diffusion"
    },
    {
        "name": "Code Assistant",
        "command": "terradev hf space code-assistant --model-id codellama/CodeLlama-7b-hf --template chat",
        "description": "Code generation assistant"
    }
]

print("🚀 Deployment Commands for HuggingFace Spaces\n")
print("=" * 80)
print("⚠️  Note: Set HF_TOKEN environment variable before deployment\n")

for i, deployment in enumerate(deployment_commands, 1):
    print(f"{i}. 🎯 {deployment['name']}")
    print(f"   📝 {deployment['description']}")
    print(f"   💻 {deployment['command']}")
    print("\n" + "-" * 60 + "\n")
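
Because the commands above are only printed, running one takes an extra step. A minimal sketch, assuming the `terradev` executable is on your PATH and accepts the command strings exactly as listed; `build_deploy_argv` is a hypothetical helper for this notebook that refuses to proceed when `HF_TOKEN` is missing:

```python
import os
import shlex


def build_deploy_argv(command, env=None):
    """Split a command string into an argv list, requiring HF_TOKEN to be set."""
    env = os.environ if env is None else env
    if not env.get("HF_TOKEN"):
        raise RuntimeError("HF_TOKEN is not set; export it before deploying")
    # shlex.split handles quoting the same way a POSIX shell would
    return shlex.split(command)


cmd = (
    "terradev hf space budget-chat "
    "--model-id mistralai/Mistral-7B-Instruct-v0.2 --template chat --budget 0.50"
)

try:
    argv = build_deploy_argv(cmd)
    print(argv)
    # To deploy for real: import subprocess; subprocess.run(argv, check=True)
except RuntimeError as err:
    print(err)
```

Passing an argv list to `subprocess.run` avoids `shell=True`, so model IDs and Space names are never interpreted by a shell.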

## 📊 Performance vs Cost Analysis

Let's visualize the performance-cost tradeoff:

In [None]:
# Performance vs Cost visualization
import matplotlib.pyplot as plt
import numpy as np

# Hardware options data
hardware_options = {
    'cpu-upgrade': {'cost': 0.15, 'performance': 2.0, 'memory': 16},
    't4-medium': {'cost': 0.35, 'performance': 5.0, 'memory': 16},
    'a10g-large': {'cost': 0.60, 'performance': 8.0, 'memory': 24},
    'a10g-xlarge': {'cost': 1.20, 'performance': 16.0, 'memory': 48},
    'a100-40gb': {'cost': 2.50, 'performance': 20.0, 'memory': 40},
    'a100-80gb': {'cost': 4.06, 'performance': 40.0, 'memory': 80}
}

# Create scatter plot
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))

# Cost vs Performance
for hw, specs in hardware_options.items():
    ax1.scatter(specs['cost'], specs['performance'], s=specs['memory']*2, alpha=0.7)
    ax1.annotate(hw, (specs['cost'], specs['performance']), xytext=(5, 5), textcoords='offset points')

ax1.set_xlabel('Hourly Cost ($)')
ax1.set_ylabel('Performance Score')
ax1.set_title('Cost vs Performance')
ax1.grid(True, alpha=0.3)

# Cost vs Memory
for hw, specs in hardware_options.items():
    ax2.scatter(specs['cost'], specs['memory'], s=specs['performance']*10, alpha=0.7)
    ax2.annotate(hw, (specs['cost'], specs['memory']), xytext=(5, 5), textcoords='offset points')

ax2.set_xlabel('Hourly Cost ($)')
ax2.set_ylabel('Memory (GB)')
ax2.set_title('Cost vs Memory')
ax2.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("📊 Performance-Cost Analysis:")
print("• Sweet spot: a10g-large (24 GB of memory at a mid-range price)")
print("• Budget option: t4-medium (highest performance per dollar in this table)")
print("• High-end: a100-80gb (maximum performance, high cost)")
print("• CPU-only: cpu-upgrade (minimal cost, basic performance)")
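
The tradeoff can also be expressed as performance score per dollar per hour. A short sketch using the cost and performance figures from the `hardware_options` table above, repeated here so the block runs on its own. The ranking is only as meaningful as those illustrative scores.

```python
# (hourly cost in $, performance score), copied from hardware_options
options = {
    "cpu-upgrade": (0.15, 2.0),
    "t4-medium": (0.35, 5.0),
    "a10g-large": (0.60, 8.0),
    "a10g-xlarge": (1.20, 16.0),
    "a100-40gb": (2.50, 20.0),
    "a100-80gb": (4.06, 40.0),
}


def perf_per_dollar(item):
    """Performance score divided by hourly cost for one (name, (cost, perf)) pair."""
    _, (cost, perf) = item
    return perf / cost


# Highest performance-per-dollar first
ranked = sorted(options.items(), key=perf_per_dollar, reverse=True)

for name, (cost, perf) in ranked:
    print(f"{name:12s} {perf / cost:5.1f} score per $/hr")
```

With these numbers t4-medium comes out slightly ahead, and cpu-upgrade, a10g-large, and a10g-xlarge are effectively tied behind it.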

## 🎯 Key Takeaways

### 💡 Smart Hardware Optimization
- **Terradev automatically analyzes** model requirements
- **Finds optimal hardware** based on memory and performance needs
- **Considers budget constraints** to find best options
- **Provides cost transparency** before deployment

### 💰 Massive Cost Savings
- **85% cost reduction** vs manual hardware selection
- **Intelligent memory utilization** (50-75% optimal range)
- **Performance optimization** without overprovisioning
- **Budget-friendly options** for every use case

### 🚀 One-Command Deployment
- **Professional templates** with streaming support
- **Built-in error handling** and optimization
- **"Built with Terradev" branding** for viral marketing
- **Production-ready spaces** in seconds

### 📈 Business Impact
- **Reduce deployment time** from hours to seconds
- **Eliminate cost guessing** with precise breakdowns
- **Scale deployments** with consistent optimization
- **Build brand awareness** through deployed spaces

---

## 🚀 Get Started Now!

```bash
# Install Terradev CLI
pip install terradev-cli==2.8.0

# Set your HuggingFace token
export HF_TOKEN=your_huggingface_token_here

# Deploy your first optimized space
terradev hf space my-chatbot --model-id meta-llama/Meta-Llama-3-8B-Instruct --template chat
```

### 🎯 Result:
- Production-ready chat space with streaming
- Optimized hardware (a10g-large @ $0.60/hr)
- Professional Gradio interface
- "Built with Terradev" branding
- Cost transparency and optimization

---

**🚀 Start optimizing your ML deployments today and join the cost-saving revolution!**

*Built with ❤️ using Terradev CLI v2.8*