## Coding
**Simulate a FrugalGPT cascade. Assume you have two models: A (cheap, 80% accuracy) and B (expensive, 90% accuracy). Define a simple confidence heuristic for model A (e.g. length of answer or presence of a certain keyword). Implement a policy that calls A, checks confidence; if confident, use A’s answer, if not, call B. Generate a dataset of queries with “ground truth” answers and simulate the cascade, measuring overall accuracy and cost. Compare this to always using B and always using A. Show how varying the confidence threshold produces a Pareto curve of cost vs. accuracy.**

FrugalGPT cascade: call the cheapest model first. If its confidence score meets the threshold, stop and use its answer; otherwise call the next, more expensive model and repeat the check.

Setup: two open-source models, with a confidence heuristic based on the length of the answer.


In [None]:
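# FrugalGPT cascade logic
def cascade_policy(model_a, model_b, query, threshold):
    """Return (answer, model_used) for a two-model cascade.

    Assumes each model's generate() returns an (answer, length) pair.
    """
    # Call the cheap Model A first
    answer_a, length_a = model_a.generate(query)

    # Length-based confidence heuristic: grows linearly with answer length
    # and is capped at 1.0 once the length reaches 50
    confidence = min(1.0, length_a / 50.0)

    if confidence >= threshold:
        return answer_a, "Model A"  # Stop early; only A's cost is incurred

    # Escalate to the expensive Model B (total cost = cost of A + cost of B)
    answer_b, _ = model_b.generate(query)
    return answer_b, "Model B"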
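
The exercise also asks for a simulation with accuracy and cost measurements. A minimal sketch follows, under stated assumptions: the relative costs (`COST_A`, `COST_B`), the answer-length distributions, and the premise that correct answers from A tend to be longer are all hypothetical choices made for illustration, not measured properties of any real model.

```python
import numpy as np

rng = np.random.default_rng(0)

N_QUERIES = 10_000
ACC_A, ACC_B = 0.80, 0.90     # accuracies from the exercise statement
COST_A, COST_B = 1.0, 10.0    # hypothetical relative cost per call

# Simulated ground truth: would each model answer each query correctly?
a_correct = rng.random(N_QUERIES) < ACC_A
b_correct = rng.random(N_QUERIES) < ACC_B

# Assumption: correct answers from A tend to be longer than incorrect ones,
# so the length heuristic carries some signal about correctness.
lengths = np.where(
    a_correct,
    rng.normal(40, 12, N_QUERIES),
    rng.normal(25, 12, N_QUERIES),
)
confidence = np.clip(lengths / 50.0, 0.0, 1.0)


def simulate_cascade(threshold):
    """Return (accuracy, average cost per query) at a confidence threshold."""
    use_a = confidence >= threshold
    correct = np.where(use_a, a_correct, b_correct)
    # A is always called; B's cost is added only when the query escalates.
    cost = COST_A + np.where(use_a, 0.0, COST_B)
    return correct.mean(), cost.mean()


# Baselines: always A, always B (B called directly, without A).
print(f"always A: accuracy={a_correct.mean():.3f}  avg_cost={COST_A:.2f}")
print(f"always B: accuracy={b_correct.mean():.3f}  avg_cost={COST_B:.2f}")

# Sweep the threshold; each (cost, accuracy) pair is one point on the curve.
thresholds = np.linspace(0.0, 1.0, 11)
curve = [simulate_cascade(t) for t in thresholds]
for t, (acc, cost) in zip(thresholds, curve):
    print(f"threshold={t:.1f}  accuracy={acc:.3f}  avg_cost={cost:.2f}")
```

Plotting `avg_cost` against `accuracy` for the swept thresholds gives the cost vs. accuracy trade-off curve. Note that a cascade that escalates every query costs `COST_A + COST_B`, which is more than calling B directly.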

## Coding 
**Train a basic router model. Using an open dataset like LMSYS Chatbot Arena results, extract features (possibly the user query text or embedding) and labels (which model among a pair won). Train a classifier (e.g. a small BERT or even logistic regression on embedding features) to predict if a cheaper model’s output will be rated as good as GPT-5’s. Then evaluate: for new queries, use the classifier’s prediction to decide routing (cheap vs. expensive). How much cost can you cut while maintaining quality above a threshold?**

In [None]:
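%pip install datasets sentence_transformers scikit-learn matplotlib
import pandas as pd
from datasets import load_dataset
from huggingface_hub import login

# Authenticate with the Hugging Face Hub. login() prompts for a token,
# which avoids hard-coding a secret in the notebook.
login()
dataset = load_dataset("lmsys/chatbot_arena_conversations")

# Convert the train split to a pandas DataFrame
dataset = pd.DataFrame(dataset['train'])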


**The LMSYS Chatbot Arena dataset contains pairwise preference comparisons across 20 models. We split them into cheap (weak) and expensive (strong) tiers based on approximate cost.**


Then train a binary classifier that outputs the probability that the strong model is preferred over the weak one.


In [67]:
columns_a = dataset['model_a']
unique_columns_a = list(set(columns_a))
print(unique_columns_a)
columns_b = dataset['model_b']
unique_columns_b = list(set(columns_b))
print(len(unique_columns_b))

# 1 maps to strong model, 0 maps to weak model
tier_map = {
    'gpt-4': 1, 'palm-2': 1, 'claude-v1': 1, 'gpt-3.5-turbo': 1, 'claude-instant-v1': 1,
    'guanaco-33b': 0, 'llama-13b': 0, 'vicuna-13b': 0, 'vicuna-7b': 0, 'wizardlm-13b': 0,
    'alpaca-13b': 0, 'koala-13b': 0, 'oasst-pythia-12b': 0, 'dolly-v2-12b': 0, 'mpt-7b-chat': 0,
    'RWKV-4-Raven-14B': 0, 'gpt4all-13b-snoozy': 0, 'chatglm-6b': 0, 'fastchat-t5-3b': 0, 
    'stablelm-tuned-alpha-7b': 0
}


['fastchat-t5-3b', 'mpt-7b-chat', 'wizardlm-13b', 'chatglm-6b', 'oasst-pythia-12b', 'llama-13b', 'claude-instant-v1', 'vicuna-13b', 'alpaca-13b', 'gpt-4', 'vicuna-7b', 'koala-13b', 'gpt-3.5-turbo', 'stablelm-tuned-alpha-7b', 'RWKV-4-Raven-14B', 'guanaco-33b', 'dolly-v2-12b', 'gpt4all-13b-snoozy', 'palm-2', 'claude-v1']
20


**Map each model in a comparison to 1 (strong) or 0 (weak), and filter out comparisons between models of the same tier**

In [68]:
# Overwrite the model names with their tier (1 = strong, 0 = weak); unmapped names become NaN
dataset["model_a"] = dataset["model_a"].map(tier_map)
dataset["model_b"] = dataset["model_b"].map(tier_map)

# Apply filter
filtered_dataset = dataset[
    dataset["model_a"].notna() &
    dataset["model_b"].notna() &
    (dataset["model_a"] != dataset["model_b"])
]

print(len(filtered_dataset))
print(filtered_dataset['model_a'].iloc[0])

13621
0
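
To make the filter's behavior concrete, here is a small sketch on a toy DataFrame. The rows, the `toy_tier_map`, and the `tier_a`/`tier_b` column names are hypothetical; unlike the cell above, this variant writes the tiers to new columns so the original model names stay available.

```python
import pandas as pd

# Hypothetical three-model tier map: 1 = strong, 0 = weak
toy_tier_map = {"gpt-4": 1, "vicuna-13b": 0, "koala-13b": 0}

toy = pd.DataFrame({
    "model_a": ["gpt-4", "vicuna-13b", "koala-13b", "unknown-model"],
    "model_b": ["vicuna-13b", "koala-13b", "gpt-4", "gpt-4"],
})

# Names missing from the map become NaN
toy["tier_a"] = toy["model_a"].map(toy_tier_map)
toy["tier_b"] = toy["model_b"].map(toy_tier_map)

# Keep only strong-vs-weak pairs where both models are known
mixed = toy[
    toy["tier_a"].notna()
    & toy["tier_b"].notna()
    & (toy["tier_a"] != toy["tier_b"])
]
print(mixed)
```

Only the first and third rows survive: the second pairs two weak models, and the fourth contains a model that is absent from the map.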


**Extract and encode the query**

In [69]:
def extract_query(conversation):
    """
    Extracts the content of the first 'user' role message 
    from a list of conversation turns.
    """
    for turn in conversation:
        if turn['role'] == 'user':
            return turn['content']
    return ""

# Apply to each row of the filtered pandas DataFrame
queries = filtered_dataset['conversation_a'].apply(extract_query).tolist()

print(len(queries))


13621


In [77]:
# Encode queries using embedding model
from sentence_transformers import SentenceTransformer
import numpy as np

# SentenceTransformer provides the encode() method used below
# Try alternative model names if loading one fails
try:
    encoder = SentenceTransformer('all-MiniLM-L6-v2')
except Exception:
    try:
        encoder = SentenceTransformer('sentence-transformers/all-MiniLM-L6-v2')
    except Exception:
        # Fall back to the paraphrase model
        encoder = SentenceTransformer('paraphrase-MiniLM-L6-v2')

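# Assumed normalization constant (in characters). A fixed value keeps the
# length feature on the same scale during training and single-query routing.
MAX_QUERY_CHARS = 2000

def extract_features(prompts):
    """Extract features: embeddings + normalized prompt length."""
    # SentenceTransformer.encode() returns a numpy array of shape (n, dim)
    embeddings = encoder.encode(prompts, show_progress_bar=True, convert_to_numpy=True)
    lengths = np.array([len(p) for p in prompts], dtype=float).reshape(-1, 1)
    # Dividing by the batch maximum would map every single-query batch to 1.0,
    # so divide by the fixed constant and clip instead
    norm_lengths = np.clip(lengths / MAX_QUERY_CHARS, 0.0, 1.0)
    return np.hstack((embeddings, norm_lengths))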

print("Encoding queries...")
X = extract_features(queries)
print(f"Feature shape: {X.shape}")


Unexpected exception formatting exception. Falling back to standard exception


Traceback (most recent call last):
  File "/Users/namitha/Library/Python/3.9/lib/python/site-packages/huggingface_hub/utils/_http.py", line 657, in hf_raise_for_status
  File "/Users/namitha/Library/Python/3.9/lib/python/site-packages/httpx/_models.py", line 829, in raise_for_status
    raise HTTPStatusError(message, request=request, response=self)
httpx.HTTPStatusError: Client error '404 Not Found' for url 'https://huggingface.co/api/models/sentence-transformers/all-MiniLM-L6-v2/tree/main/additional_chat_templates?recursive=false&expand=false'
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/404

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/var/folders/5z/2t6795n55wq1q_kmrdvlwflm0000gn/T/ipykernel_23322/1152804653.py", line 8, in <module>
    encoder = SentenceTransformer('all-MiniLM-L6-v2')
  File "/Users/namitha/Library/Python/3.9/lib/python/site-packages/sentence_transformers/Se

In [None]:
# Create labels from the winner column
# Rows where winner is 'model_a' or 'model_b' get a label; any other value (a tie) is dropped
# Label = 1 if strong model (tier 1) won, 0 if weak model (tier 0) won

def create_labels(filtered_dataset):
    """Create binary labels: 1 if strong model won, 0 if weak model won."""
    labels = []
    
    for i in range(len(filtered_dataset)):
        winner = filtered_dataset['winner'].iloc[i]
        tier_a = filtered_dataset['model_a'].iloc[i]  # Already mapped to 0 or 1
        tier_b = filtered_dataset['model_b'].iloc[i]  # Already mapped to 0 or 1
        
        if winner == 'model_a':
            label = tier_a  # 1 if strong, 0 if weak
        elif winner == 'model_b':
            label = tier_b  # 1 if strong, 0 if weak
        else:
            # Tie or invalid - use None (will filter out)
            label = None
        
        labels.append(label)
    
    return np.array(labels)

y = create_labels(filtered_dataset)

# Filter out None labels (ties)
valid_mask = ~pd.isna(y)
X_filtered = X[valid_mask]
y_filtered = y[valid_mask].astype(int)
queries_filtered = [q for i, q in enumerate(queries) if valid_mask[i]]

print(f"Valid examples: {len(y_filtered)}")
print(f"Strong model wins (1): {np.sum(y_filtered == 1)} ({np.mean(y_filtered == 1)*100:.1f}%)")
print(f"Weak model wins (0): {np.sum(y_filtered == 0)} ({np.mean(y_filtered == 0)*100:.1f}%)")
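
The row-by-row loop above can also be written as a vectorized operation. A minimal sketch on toy data follows; the toy rows, the tie strings, and the `tier_a`/`tier_b` column names are hypothetical stand-ins for the mapped columns in the notebook.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "winner": ["model_a", "model_b", "tie", "model_b", "tie (bothbad)"],
    "tier_a": [1, 1, 0, 0, 1],
    "tier_b": [0, 0, 1, 1, 0],
})

# Keep only rows with a decisive winner; anything else is treated as a tie
decisive = toy[toy["winner"].isin(["model_a", "model_b"])]

# Label = tier of the winning side: 1 if the strong model won, 0 otherwise
labels = np.where(
    decisive["winner"] == "model_a",
    decisive["tier_a"],
    decisive["tier_b"],
).astype(int)

print(labels)
print(f"Strong model win rate: {labels.mean():.2f}")
```

Filtering before labeling means no placeholder `None` values are needed, and the same boolean mask can be reused to select the matching feature rows.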

In [None]:
# Train binary classifier
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report, roc_auc_score, roc_curve
import matplotlib.pyplot as plt

# Split data
X_train, X_test, y_train, y_test = train_test_split(
    X_filtered, y_filtered, test_size=0.2, random_state=42, stratify=y_filtered
)

print(f"Train set: {len(X_train)} examples")
print(f"Test set: {len(X_test)} examples")

# Train logistic regression classifier
classifier = LogisticRegression(random_state=42, max_iter=1000)
classifier.fit(X_train, y_train)

# Evaluate
y_pred = classifier.predict(X_test)
y_proba = classifier.predict_proba(X_test)[:, 1]

accuracy = accuracy_score(y_test, y_pred)
auc = roc_auc_score(y_test, y_proba)

print(f"\n{'='*50}")
print("Classifier Performance:")
print(f"{'='*50}")
print(f"Accuracy: {accuracy:.4f}")
print(f"ROC-AUC: {auc:.4f}")
print(f"\nClassification Report:")
print(classification_report(y_test, y_pred, target_names=['Weak Model Wins', 'Strong Model Wins']))

In [None]:
# Visualization: ROC Curve
fpr, tpr, thresholds = roc_curve(y_test, y_proba)

plt.figure(figsize=(8, 6))
plt.plot(fpr, tpr, linewidth=2, label=f'ROC Curve (AUC = {auc:.3f})')
plt.plot([0, 1], [0, 1], 'k--', alpha=0.5, label='Random')
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve: Strong Model Wins Prediction')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()

print("Classifier ready for routing decisions!")
print("Use classifier.predict_proba(extract_features([query]))[0][1] to get probability of strong model winning.")

In [None]:
# Router function: use classifier to decide routing
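def router_policy(query, classifier, encoder, threshold=0.5):
    """
    Router policy using classifier.
    Returns: ('strong' or 'weak', probability)

    Note: `encoder` is accepted for interface clarity, but extract_features()
    uses the module-level `encoder` defined earlier.
    """
    # Build the feature vector (embedding + normalized length) for the query
    query_features = extract_features([query])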
    
    # Predict probability that strong model wins
    proba_strong = classifier.predict_proba(query_features)[0][1]
    
    if proba_strong >= threshold:
        return 'strong', proba_strong
    else:
        return 'weak', proba_strong

# Example usage
example_query = queries_filtered[0]
route, prob = router_policy(example_query, classifier, encoder, threshold=0.5)
print(f"Example query: {example_query[:100]}...")
print(f"Routing decision: {route} (probability={prob:.3f})")
print(f"Expected cost: {'HIGH' if route == 'strong' else 'LOW'}")

In [None]:
# Cost analysis: Evaluate cost savings at different thresholds
# Assume: strong model = $0.01 per query, weak model = $0.001 per query

COST_STRONG = 0.01
COST_WEAK = 0.001

thresholds = np.arange(0.3, 0.95, 0.05)
results = []

for thresh in thresholds:
    # Route the held-out test queries using the probabilities already computed
    # for X_test. This avoids re-encoding every query at each threshold and
    # keeps training queries out of the evaluation.
    routes = np.where(y_proba >= thresh, 'strong', 'weak')
    
    # Calculate costs
    n_strong = sum(1 for r in routes if r == 'strong')
    n_weak = len(routes) - n_strong
    total_cost = n_strong * COST_STRONG + n_weak * COST_WEAK
    
    # Compare to baseline (always use strong)
    baseline_cost = len(routes) * COST_STRONG
    cost_savings = (baseline_cost - total_cost) / baseline_cost * 100
    
    results.append({
        'threshold': thresh,
        'total_cost': total_cost,
        'cost_savings_pct': cost_savings,
        'strong_ratio': n_strong / len(routes)
    })

results_df = pd.DataFrame(results)
print("Cost Analysis:")
print(results_df.to_string(index=False))
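
The cost table alone does not say how much quality is lost at each threshold. One way to connect the two is sketched below on synthetic data. It assumes a simple quality proxy: a routed query counts as acceptable if it goes to the strong model, or if it goes to the weak model on a query where the weak model won. The helper names, the synthetic labels and probabilities, and the 0.9 quality target are all hypothetical.

```python
import numpy as np

COST_STRONG, COST_WEAK = 0.01, 0.001  # same assumed per-query prices as above


def savings_and_quality(y_true, proba_strong, threshold):
    """Return (cost savings in %, quality proxy) for one routing threshold."""
    to_strong = proba_strong >= threshold
    avg_cost = np.where(to_strong, COST_STRONG, COST_WEAK).mean()
    savings_pct = (COST_STRONG - avg_cost) / COST_STRONG * 100
    # Assumed proxy: strong is always acceptable; weak only where weak won
    acceptable = to_strong | (y_true == 0)
    return savings_pct, acceptable.mean()


def best_threshold(y_true, proba_strong, thresholds, min_quality):
    """Return (threshold, savings, quality) with max savings at quality >= target."""
    best = None
    for t in thresholds:
        savings, quality = savings_and_quality(y_true, proba_strong, t)
        if quality >= min_quality and (best is None or savings > best[1]):
            best = (float(t), float(savings), float(quality))
    return best


# Synthetic stand-ins for y_test and y_proba (1 = strong model won)
rng = np.random.default_rng(0)
toy_y = (rng.random(2000) < 0.6).astype(int)
toy_proba = np.clip(0.5 + 0.2 * (toy_y - 0.5) + rng.normal(0, 0.15, 2000), 0, 1)

print(best_threshold(toy_y, toy_proba, np.arange(0.0, 1.01, 0.05), min_quality=0.9))
```

With the notebook's own arrays, the call would be `best_threshold(y_test, y_proba, thresholds, min_quality=...)`. The proxy is optimistic about the strong model, so the reported quality is an upper bound under this assumption.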

In [None]:
# Visualization: Cost savings vs threshold
plt.figure(figsize=(10, 5))

plt.subplot(1, 2, 1)
plt.plot(results_df['threshold'], results_df['cost_savings_pct'], 'o-', linewidth=2)
plt.xlabel('Routing Threshold')
plt.ylabel('Cost Savings (%)')
plt.title('Cost Savings vs Routing Threshold')
plt.grid(True, alpha=0.3)

plt.subplot(1, 2, 2)
plt.plot(results_df['threshold'], results_df['strong_ratio'], 's-', linewidth=2, color='orange')
plt.xlabel('Routing Threshold')
plt.ylabel('Fraction Routed to Strong Model')
plt.title('Strong Model Usage vs Threshold')
plt.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

print("\nSummary:")
print(f"Max cost savings: {results_df['cost_savings_pct'].max():.1f}%")
print(f"At threshold={results_df.loc[results_df['cost_savings_pct'].idxmax(), 'threshold']:.2f}")