# Smelt AI — Live Demo

Interactive walkthrough of smelt-ai across **OpenAI**, **Anthropic**, and **Google Gemini**.

Tests:
1. Basic classification (all 3 providers)
2. Sentiment analysis with score validation
3. Support ticket triage (complex schema)
4. Parameter tuning (temperature, top_p)
5. Batch configuration (batch_size, concurrency)
6. Error handling (stop_on_exhaustion)
7. Async execution

## Setup

In [1]:
import os
import csv
import time
from pathlib import Path
from typing import Literal

from dotenv import load_dotenv
from pydantic import BaseModel, Field

from smelt import Model, Job, SmeltResult, SmeltMetrics, BatchError
from smelt.errors import SmeltExhaustionError

load_dotenv()

OPENAI_KEY = os.getenv("OPENAI_API_KEY")
ANTHROPIC_KEY = os.getenv("ANTHROPIC_API_KEY")
GEMINI_KEY = os.getenv("GEMINI_API_KEY")

print(f"OpenAI key:    {'set' if OPENAI_KEY else 'MISSING'}")
print(f"Anthropic key: {'set' if ANTHROPIC_KEY else 'MISSING'}")
print(f"Gemini key:    {'set' if GEMINI_KEY else 'MISSING'}")

OpenAI key:    set
Anthropic key: set
Gemini key:    set


## Load Test Data

In [2]:
DATA_DIR = Path("../tests/data")


def load_csv(filename: str) -> list[dict[str, str]]:
    """Load CSV from tests/data directory."""
    with open(DATA_DIR / filename, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


companies = load_csv("companies.csv")
products = load_csv("products.csv")
tickets = load_csv("support_tickets.csv")

print(f"Companies: {len(companies)} rows")
print(f"Products:  {len(products)} rows")
print(f"Tickets:   {len(tickets)} rows")
print()
print("Sample company:", companies[0])

Companies: 10 rows
Products:  10 rows
Tickets:   10 rows

Sample company: {'name': 'Apple Inc.', 'description': 'Designs and manufactures consumer electronics and software', 'founded': '1976', 'headquarters': 'Cupertino CA', 'employees': '164000'}


## Define Output Models

In [3]:
class IndustryClassification(BaseModel):
    """Classification of a company by industry sector."""
    sector: str = Field(description="Primary industry sector")
    sub_sector: str = Field(description="More specific sub-sector")
    is_public: bool = Field(description="Whether the company is publicly traded")


class SentimentAnalysis(BaseModel):
    """Sentiment analysis of a product review."""
    sentiment: Literal["positive", "negative", "mixed"] = Field(description="Overall sentiment")
    score: float = Field(description="Score from 0.0 (negative) to 1.0 (positive)")
    key_themes: list[str] = Field(description="Main themes in the review (1-3 items)")


class TicketTriage(BaseModel):
    """Support ticket triage result."""
    category: str = Field(description="Category: billing, technical, shipping, account, or general")
    priority: Literal["low", "medium", "high", "urgent"] = Field(description="Priority level")
    requires_human: bool = Field(description="Whether human escalation is needed")
    suggested_response: str = Field(description="Brief suggested response to the customer")


class CompanySummary(BaseModel):
    """Structured company summary."""
    one_liner: str = Field(description="One sentence description")
    industry: str = Field(description="Primary industry")
    company_size: Literal["startup", "small", "medium", "large", "enterprise"] = Field(
        description="Size classification based on employee count"
    )
    age_years: int = Field(description="Approximate age in years")


print("Output models defined.")

Output models defined.


## Helper: Pretty-Print Results

In [4]:
def show_result(label: str, result: SmeltResult) -> None:
    """Pretty-print a SmeltResult."""
    status = "SUCCESS" if result.success else "FAILED"
    m = result.metrics
    print(f"\n{'='*70}")
    print(f"  {label}")
    print(f"  Status: {status}")
    print(f"  Rows: {m.successful_rows}/{m.total_rows} successful")
    print(f"  Batches: {m.successful_batches}/{m.total_batches} successful")
    print(f"  Tokens: {m.input_tokens:,} in / {m.output_tokens:,} out")
    print(f"  Retries: {m.total_retries} | Time: {m.wall_time_seconds:.2f}s")
    if result.errors:
        print(f"  Errors: {len(result.errors)}")
        for e in result.errors:
            print(f"    - Batch {e.batch_index}: {e.error_type} ({e.attempts} attempts)")
    print(f"{'='*70}")
    print()
    for i, row in enumerate(result.data):
        print(f"  [{i}] {row}")
    if len(result.data) > 3:
        print(f"  ... and {len(result.data) - 3} more rows")

---
## 1. Basic Classification — All 3 Providers

Same task, same data, three different LLMs.

In [5]:
# OpenAI — GPT-4.1-mini
model_openai = Model(provider="openai", name="gpt-4.1-mini", api_key=OPENAI_KEY)

job = Job(
    prompt="Classify each company by its primary industry sector and sub-sector. "
    "Determine if the company is publicly traded.",
    output_model=IndustryClassification,
    batch_size=10,
    stop_on_exhaustion=False,
)

result = await job.arun(model_openai, data=companies)
show_result("OpenAI / gpt-4.1-mini — Company Classification", result)


  OpenAI / gpt-4.1-mini — Company Classification
  Status: SUCCESS
  Rows: 10/10 successful
  Batches: 1/1 successful
  Tokens: 961 in / 234 out
  Retries: 0 | Time: 3.78s

  [0] sector='Information Technology' sub_sector='Consumer Electronics and Software' is_public=True
  [1] sector='Financials' sub_sector='Investment Banking and Financial Services' is_public=True
  [2] sector='Healthcare' sub_sector='Pharmaceuticals and Biotechnology' is_public=True
  [3] sector='Consumer Discretionary' sub_sector='Electric Vehicles and Clean Energy' is_public=True
  [4] sector='Communication Services' sub_sector='Streaming Entertainment' is_public=True
  [5] sector='Industrials' sub_sector='Professional Services' is_public=False
  [6] sector='Communication Services' sub_sector='Digital Music and Podcast Streaming' is_public=True
  [7] sector='Information Technology' sub_sector='Financial Technology' is_public=False
  [8] sector='Healthcare' sub_sector='Biotechnology' is_public=True
  [9] sector='C

  PydanticSerializationUnexpectedValue(Expected `none` - serialized value may not be as expected [field_name='parsed', input_value=_SmeltBatch(rows=[_SmeltI...public=True, row_id=9)]), input_type=_SmeltBatch])
  return self.__pydantic_serializer__.to_python(


In [6]:
# Anthropic — Claude Sonnet 4
model_anthropic = Model(provider="anthropic", name="claude-sonnet-4-20250514", api_key=ANTHROPIC_KEY)

job = Job(
    prompt="Classify each company by its primary industry sector and sub-sector. "
    "Determine if the company is publicly traded.",
    output_model=IndustryClassification,
    batch_size=10,
    stop_on_exhaustion=False,
)

result = await job.arun(model_anthropic, data=companies)
show_result("Anthropic / claude-sonnet-4 — Company Classification", result)


  Anthropic / claude-sonnet-4 — Company Classification
  Status: SUCCESS
  Rows: 10/10 successful
  Batches: 1/1 successful
  Tokens: 1,386 in / 474 out
  Retries: 0 | Time: 5.74s

  [0] sector='Technology' sub_sector='Consumer Electronics' is_public=True
  [1] sector='Financial Services' sub_sector='Investment Banking' is_public=True
  [2] sector='Healthcare' sub_sector='Pharmaceuticals' is_public=True
  [3] sector='Automotive' sub_sector='Electric Vehicles' is_public=True
  [4] sector='Technology' sub_sector='Media Streaming' is_public=True
  [5] sector='Professional Services' sub_sector='Consulting' is_public=False
  [6] sector='Technology' sub_sector='Digital Media' is_public=True
  [7] sector='Financial Services' sub_sector='Payment Processing' is_public=False
  [8] sector='Healthcare' sub_sector='Biotechnology' is_public=True
  [9] sector='Technology' sub_sector='Online Marketplace' is_public=True
  ... and 7 more rows


In [7]:
# Google Gemini — Gemini 2.5 Flash
model_gemini = Model(provider="google_genai", name="gemini-2.5-flash", api_key=GEMINI_KEY)

job = Job(
    prompt="Classify each company by its primary industry sector and sub-sector. "
    "Determine if the company is publicly traded.",
    output_model=IndustryClassification,
    batch_size=10,
    stop_on_exhaustion=False,
)

result = await job.arun(model_gemini, data=companies)
show_result("Gemini / gemini-2.5-flash — Company Classification", result)

---
## 2. Sentiment Analysis — Score Validation

Analyze product reviews and verify scores are in [0, 1] range.

In [40]:
model = Model(provider="openai", name="gpt-4.1-mini", api_key=OPENAI_KEY, params={"temperature": 0})

job = Job(
    prompt="Analyze the sentiment of each product's customer_review. "
    "Identify the overall sentiment, assign a score between 0.0 and 1.0, "
    "and extract 1-3 key themes.",
    output_model=SentimentAnalysis,
    batch_size=5,
    concurrency=2,
    stop_on_exhaustion=False,
)

result = await job.arun(model, data=products)
show_result("OpenAI / gpt-4.1-mini — Sentiment Analysis", result)

# Validate scores
print("\nScore validation:")
for i, row in enumerate(result.data):
    in_range = 0.0 <= row.score <= 1.0
    print(f"  [{i}] score={row.score:.2f} sentiment={row.sentiment:8s} valid={in_range} themes={row.key_themes}")

  PydanticSerializationUnexpectedValue(Expected `none` - serialized value may not be as expected [field_name='parsed', input_value=_SmeltBatch(rows=[_SmeltS...nvenience'], row_id=9)]), input_type=_SmeltBatch])
  return self.__pydantic_serializer__.to_python(



  OpenAI / gpt-4.1-mini — Sentiment Analysis
  Status: SUCCESS
  Rows: 10/10 successful
  Batches: 2/2 successful
  Tokens: 1,514 in / 306 out
  Retries: 0 | Time: 3.09s

  [0] sentiment='positive' score=0.9 key_themes=['sound quality', 'comfort', 'long flights']
  [1] sentiment='mixed' score=0.6 key_themes=['performance on hardwood', 'battery life']
  [2] sentiment='positive' score=0.95 key_themes=['reading experience', 'portability', 'glare-free display']
  [3] sentiment='positive' score=0.9 key_themes=['meal preparation', 'ease of use', 'time-saving']
  [4] sentiment='mixed' score=0.5 key_themes=['warmth', 'seasonal suitability']
  [5] sentiment='positive' score=0.9 key_themes=['kids enjoyment', 'screen quality']
  [6] sentiment='positive' score=0.85 key_themes=['sound quality', 'portability', 'waterproof']
  [7] sentiment='positive' score=0.9 key_themes=['ergonomics', 'pain relief', 'comfort']
  [8] sentiment='positive' score=0.8 key_themes=['weight', 'value', 'cooking versatility

  PydanticSerializationUnexpectedValue(Expected `none` - serialized value may not be as expected [field_name='parsed', input_value=_SmeltBatch(rows=[_SmeltS...itability'], row_id=4)]), input_type=_SmeltBatch])
  return self.__pydantic_serializer__.to_python(


In [41]:
model = Model(provider="anthropic", name="claude-haiku-4-5-20251001", api_key=ANTHROPIC_KEY, params={"temperature": 0})

job = Job(
    prompt="Analyze the sentiment of each product's customer_review. "
    "Identify the overall sentiment, assign a score between 0.0 and 1.0, "
    "and extract 1-3 key themes.",
    output_model=SentimentAnalysis,
    batch_size=5,
    concurrency=2,
    stop_on_exhaustion=False,
)

result = await job.arun(model, data=products)
show_result("Anthropic / claude-haiku-4.5 — Sentiment Analysis", result)

print("\nScore validation:")
for i, row in enumerate(result.data):
    in_range = 0.0 <= row.score <= 1.0
    print(f"  [{i}] score={row.score:.2f} sentiment={row.sentiment:8s} valid={in_range} themes={row.key_themes}")


  Anthropic / claude-haiku-4.5 — Sentiment Analysis
  Status: SUCCESS
  Rows: 10/10 successful
  Batches: 2/2 successful
  Tokens: 2,915 in / 600 out
  Retries: 0 | Time: 2.62s

  [0] sentiment='positive' score=0.95 key_themes=['sound quality', 'comfort', 'long-distance use']
  [1] sentiment='mixed' score=0.65 key_themes=['performance on hardwood', 'battery life limitation']
  [2] sentiment='positive' score=0.9 key_themes=['portability', 'display quality', 'versatile usage']
  [3] sentiment='positive' score=0.85 key_themes=['meal preparation', 'convenience', 'time-saving']
  [4] sentiment='mixed' score=0.6 key_themes=['seasonal versatility', 'temperature limitations', 'lightweight design']
  [5] sentiment='positive' score=0.95 key_themes=['display quality', 'family satisfaction', 'product appeal']
  [6] sentiment='positive' score=0.9 key_themes=['audio quality', 'portability', 'durability']
  [7] sentiment='positive' score=0.92 key_themes=['health benefits', 'ergonomics', 'value propo

In [42]:
model = Model(provider="google_genai", name="gemini-2.0-flash", api_key=GEMINI_KEY, params={"temperature": 0})

job = Job(
    prompt="Analyze the sentiment of each product's customer_review. "
    "Identify the overall sentiment, assign a score between 0.0 and 1.0, "
    "and extract 1-3 key themes.",
    output_model=SentimentAnalysis,
    batch_size=5,
    concurrency=2,
    stop_on_exhaustion=False,
)

result = await job.arun(model, data=products)
show_result("Gemini / gemini-2.0-flash — Sentiment Analysis", result)

print("\nScore validation:")
for i, row in enumerate(result.data):
    in_range = 0.0 <= row.score <= 1.0
    print(f"  [{i}] score={row.score:.2f} sentiment={row.sentiment:8s} valid={in_range} themes={row.key_themes}")


  Gemini / gemini-2.0-flash — Sentiment Analysis
  Status: SUCCESS
  Rows: 10/10 successful
  Batches: 2/2 successful
  Tokens: 1,283 in / 606 out
  Retries: 0 | Time: 3.00s

  [0] sentiment='positive' score=0.9 key_themes=['sound quality', 'comfort']
  [1] sentiment='mixed' score=0.6 key_themes=['hardwood', 'battery life']
  [2] sentiment='positive' score=0.8 key_themes=['reading', 'beach', 'bed']
  [3] sentiment='positive' score=0.7 key_themes=['meal prep', 'easy']
  [4] sentiment='mixed' score=0.5 key_themes=['warmth', 'fall', 'winter']
  [5] sentiment='positive' score=0.9 key_themes=['kids love it', 'gorgeous screen']
  [6] sentiment='positive' score=0.85 key_themes=['poolside speaker', 'deep bass']
  [7] sentiment='positive' score=0.9 key_themes=['back pain relief', 'ergonomic']
  [8] sentiment='positive' score=0.8 key_themes=['heavy', 'worth the price', 'soups and braises']
  [9] sentiment='positive' score=0.9 key_themes=['obstacle avoidance', "doesn't eat socks"]

Score validat

---
## 3. Support Ticket Triage — Complex Schema

Tests Literal types, booleans, and longer text generation.

In [43]:
model = Model(provider="openai", name="gpt-4.1-mini", api_key=OPENAI_KEY)

job = Job(
    prompt="Triage each support ticket. Classify by category (billing, technical, "
    "shipping, account, or general), assign priority, determine if human escalation "
    "is needed, and write a brief suggested response.",
    output_model=TicketTriage,
    batch_size=5,
    concurrency=2,
    stop_on_exhaustion=False,
)

result = await job.arun(model, data=tickets)
show_result("OpenAI / gpt-4.1-mini — Ticket Triage", result)

print("\nFull triage results:")
for i, row in enumerate(result.data):
    print(f"\n  [{i}] {tickets[i]['ticket_id']}")
    print(f"      Category: {row.category} | Priority: {row.priority} | Human: {row.requires_human}")
    print(f"      Response: {row.suggested_response[:100]}...")


  OpenAI / gpt-4.1-mini — Ticket Triage
  Status: SUCCESS
  Rows: 10/10 successful
  Batches: 2/2 successful
  Tokens: 1,593 in / 506 out
  Retries: 0 | Time: 3.55s

  [0] category='shipping' priority='urgent' requires_human=True suggested_response="We're sorry to hear your laptop arrived damaged. We'll expedite a replacement immediately. Our support team will contact you shortly for details."
  [1] category='billing' priority='medium' requires_human=False suggested_response='To switch your subscription to annual billing, please follow the instructions in your account settings. Let us know if you need any further assistance.'
  [2] category='technical' priority='high' requires_human=True suggested_response="We're sorry for the inconvenience caused by the software crashing. Our technical team will investigate and get back to you with a solution as soon as possible."
  [3] category='billing' priority='high' requires_human=True suggested_response='We apologize for the double charge. Our 

  PydanticSerializationUnexpectedValue(Expected `none` - serialized value may not be as expected [field_name='parsed', input_value=_SmeltBatch(rows=[_SmeltT...s details.', row_id=9)]), input_type=_SmeltBatch])
  return self.__pydantic_serializer__.to_python(
  PydanticSerializationUnexpectedValue(Expected `none` - serialized value may not be as expected [field_name='parsed', input_value=_SmeltBatch(rows=[_SmeltT...ssistance.", row_id=4)]), input_type=_SmeltBatch])
  return self.__pydantic_serializer__.to_python(


In [44]:
model = Model(provider="anthropic", name="claude-sonnet-4-20250514", api_key=ANTHROPIC_KEY)

job = Job(
    prompt="Triage each support ticket. Classify by category (billing, technical, "
    "shipping, account, or general), assign priority, determine if human escalation "
    "is needed, and write a brief suggested response.",
    output_model=TicketTriage,
    batch_size=5,
    concurrency=2,
    stop_on_exhaustion=False,
)

result = await job.arun(model, data=tickets)
show_result("Anthropic / claude-sonnet-4 — Ticket Triage", result)

print("\nFull triage results:")
for i, row in enumerate(result.data):
    print(f"\n  [{i}] {tickets[i]['ticket_id']}")
    print(f"      Category: {row.category} | Priority: {row.priority} | Human: {row.requires_human}")
    print(f"      Response: {row.suggested_response[:100]}...")


  Anthropic / claude-sonnet-4 — Ticket Triage
  Status: SUCCESS
  Rows: 10/10 successful
  Batches: 2/2 successful
  Tokens: 2,407 in / 964 out
  Retries: 0 | Time: 8.50s

  [0] category='shipping' priority='high' requires_human=True suggested_response="I apologize for the damaged laptop. I'll immediately arrange a replacement shipment and provide a return label for the damaged unit. You should receive your replacement within 1-2 business days."
  [1] category='billing' priority='low' requires_human=False suggested_response="I can help you switch to annual billing! You can change this in your account settings under 'Billing Preferences' or I can process this change for you right now. Annual billing also includes a 15% discount."
  [2] category='technical' priority='medium' requires_human=False suggested_response="I'm sorry you're experiencing crashes with PDF export. Let's troubleshoot this - please try updating to the latest version first. If the issue persists, I'll need some system

In [45]:
model = Model(provider="google_genai", name="gemini-2.5-flash", api_key=GEMINI_KEY)

job = Job(
    prompt="Triage each support ticket. Classify by category (billing, technical, "
    "shipping, account, or general), assign priority, determine if human escalation "
    "is needed, and write a brief suggested response.",
    output_model=TicketTriage,
    batch_size=5,
    concurrency=2,
    stop_on_exhaustion=False,
)

result = await job.arun(model, data=tickets)
show_result("Gemini / gemini-2.5-flash — Ticket Triage", result)

print("\nFull triage results:")
for i, row in enumerate(result.data):
    print(f"\n  [{i}] {tickets[i]['ticket_id']}")
    print(f"      Category: {row.category} | Priority: {row.priority} | Human: {row.requires_human}")
    print(f"      Response: {row.suggested_response[:100]}...")


  Gemini / gemini-2.5-flash — Ticket Triage
  Status: SUCCESS
  Rows: 10/10 successful
  Batches: 2/2 successful
  Tokens: 1,407 in / 2,302 out
  Retries: 0 | Time: 7.67s

  [0] category='shipping' priority='urgent' requires_human=True suggested_response='We apologize for the damaged item. We will arrange for a replacement to be sent to you immediately. Please provide details.'
  [1] category='billing' priority='low' requires_human=False suggested_response="You can change your subscription billing cycle from monthly to annual in your account settings under the 'Subscription' or 'Billing' section."
  [2] category='technical' priority='high' requires_human=True suggested_response='We apologize for the software crashing. Please provide more details about your operating system and software version so our technical team can assist.'
  [3] category='billing' priority='urgent' requires_human=True suggested_response='We apologize for the duplicate charge. We are investigating this immediately

---
## 4. Parameter Tuning — Temperature Comparison

Compare temperature=0 (deterministic) vs temperature=1.0 (creative) on the same task.

In [46]:
data_subset = companies[:3]

for temp in [0, 0.5, 1.0]:
    model = Model(
        provider="openai", name="gpt-4.1-mini", api_key=OPENAI_KEY,
        params={"temperature": temp},
    )
    job = Job(
        prompt="Classify each company by industry sector.",
        output_model=IndustryClassification,
        batch_size=10,
        stop_on_exhaustion=False,
    )
    result = await job.arun(model, data=data_subset)
    show_result(f"OpenAI / gpt-4.1-mini — temp={temp}", result)

  PydanticSerializationUnexpectedValue(Expected `none` - serialized value may not be as expected [field_name='parsed', input_value=_SmeltBatch(rows=[_SmeltI...public=True, row_id=2)]), input_type=_SmeltBatch])
  return self.__pydantic_serializer__.to_python(



  OpenAI / gpt-4.1-mini — temp=0
  Status: SUCCESS
  Rows: 3/3 successful
  Batches: 1/1 successful
  Tokens: 535 in / 74 out
  Retries: 0 | Time: 1.49s

  [0] sector='Technology' sub_sector='Consumer Electronics and Software' is_public=True
  [1] sector='Financial Services' sub_sector='Investment Banking and Financial Services' is_public=True
  [2] sector='Healthcare' sub_sector='Pharmaceuticals and Biotechnology' is_public=True

  OpenAI / gpt-4.1-mini — temp=0.5
  Status: SUCCESS
  Rows: 3/3 successful
  Batches: 1/1 successful
  Tokens: 535 in / 74 out
  Retries: 0 | Time: 1.94s

  [0] sector='Technology' sub_sector='Consumer Electronics and Software' is_public=True
  [1] sector='Financial Services' sub_sector='Investment Banking and Financial Services' is_public=True
  [2] sector='Healthcare' sub_sector='Pharmaceuticals and Biotechnology' is_public=True

  OpenAI / gpt-4.1-mini — temp=1.0
  Status: SUCCESS
  Rows: 3/3 successful
  Batches: 1/1 successful
  Tokens: 535 in / 67 out

In [None]:
# Anthropic: top_p (mutually exclusive with temperature) and top_k
# NOTE: Anthropic does NOT allow setting both temperature and top_p simultaneously.
for top_p in [0.5, 0.9]:
    model = Model(
        provider="anthropic", name="claude-haiku-4-5-20251001", api_key=ANTHROPIC_KEY,
        params={"top_p": top_p, "top_k": 40},
    )
    job = Job(
        prompt="Classify each company by industry sector.",
        output_model=IndustryClassification,
        batch_size=10,
        stop_on_exhaustion=False,
    )
    result = await job.arun(model, data=data_subset)
    show_result(f"Anthropic / claude-haiku-4.5 — top_p={top_p}, top_k=40", result)

---
## 5. Batch Configuration — Size & Concurrency

Compare different batch_size and concurrency settings on the same dataset.

In [51]:
configs = [
    {"batch_size": 10, "concurrency": 1, "label": "1 batch, serial"},
    {"batch_size": 5, "concurrency": 2, "label": "2 batches, conc=2"},
    {"batch_size": 2, "concurrency": 5, "label": "5 batches, conc=5"},
    {"batch_size": 1, "concurrency": 10, "label": "10 batches, conc=10"},
]

model = Model(provider="openai", name="gpt-4.1-mini", api_key=OPENAI_KEY, params={"temperature": 0})

for cfg in configs:
    job = Job(
        prompt="Classify each company by industry sector.",
        output_model=IndustryClassification,
        batch_size=cfg["batch_size"],
        concurrency=cfg["concurrency"],
        stop_on_exhaustion=False,
    )
    result = await job.arun(model, data=companies)
    show_result(f"Config: {cfg['label']} (batch={cfg['batch_size']}, conc={cfg['concurrency']})", result)
    
    # Verify all rows present and in order
    assert len(result.data) == len(companies), f"Row count mismatch: {len(result.data)} vs {len(companies)}"
    print(f"  Row ordering verified: {len(result.data)} rows in correct order")


  Config: 1 batch, serial (batch=10, conc=1)
  Status: SUCCESS
  Rows: 10/10 successful
  Batches: 1/1 successful
  Tokens: 948 in / 223 out
  Retries: 0 | Time: 5.17s

  [0] sector='Technology' sub_sector='Consumer Electronics' is_public=True
  [1] sector='Financial Services' sub_sector='Banking and Investment' is_public=True
  [2] sector='Healthcare' sub_sector='Pharmaceuticals and Biotechnology' is_public=True
  [3] sector='Automotive' sub_sector='Electric Vehicles and Clean Energy' is_public=True
  [4] sector='Media and Entertainment' sub_sector='Streaming Services' is_public=True
  [5] sector='Professional Services' sub_sector='Audit and Consulting' is_public=False
  [6] sector='Media and Entertainment' sub_sector='Digital Music Streaming' is_public=True
  [7] sector='Financial Services' sub_sector='Financial Technology' is_public=False
  [8] sector='Healthcare' sub_sector='Biotechnology' is_public=True
  [9] sector='Technology' sub_sector='Online Marketplace' is_public=True
  ..

  PydanticSerializationUnexpectedValue(Expected `none` - serialized value may not be as expected [field_name='parsed', input_value=_SmeltBatch(rows=[_SmeltI...public=True, row_id=4)]), input_type=_SmeltBatch])
  return self.__pydantic_serializer__.to_python(



  Config: 2 batches, conc=2 (batch=5, conc=2)
  Status: SUCCESS
  Rows: 10/10 successful
  Batches: 2/2 successful
  Tokens: 1,300 in / 232 out
  Retries: 0 | Time: 2.98s

  [0] sector='Technology' sub_sector='Consumer Electronics and Software' is_public=True
  [1] sector='Financial Services' sub_sector='Investment Banking and Financial Services' is_public=True
  [2] sector='Healthcare' sub_sector='Pharmaceuticals and Biotechnology' is_public=True
  [3] sector='Automotive and Energy' sub_sector='Electric Vehicles and Clean Energy' is_public=True
  [4] sector='Media and Entertainment' sub_sector='Streaming Services' is_public=True
  [5] sector='Professional Services' sub_sector='Consulting and Audit' is_public=False
  [6] sector='Technology' sub_sector='Digital Media Streaming' is_public=True
  [7] sector='Financial Services' sub_sector='Payment Processing' is_public=False
  [8] sector='Healthcare' sub_sector='Biotechnology' is_public=True
  [9] sector='Technology' sub_sector='Online M

  PydanticSerializationUnexpectedValue(Expected `none` - serialized value may not be as expected [field_name='parsed', input_value=_SmeltBatch(rows=[_SmeltI...public=True, row_id=3)]), input_type=_SmeltBatch])
  return self.__pydantic_serializer__.to_python(
  PydanticSerializationUnexpectedValue(Expected `none` - serialized value may not be as expected [field_name='parsed', input_value=_SmeltBatch(rows=[_SmeltI...public=True, row_id=1)]), input_type=_SmeltBatch])
  return self.__pydantic_serializer__.to_python(
  PydanticSerializationUnexpectedValue(Expected `none` - serialized value may not be as expected [field_name='parsed', input_value=_SmeltBatch(rows=[_SmeltI...ublic=False, row_id=5)]), input_type=_SmeltBatch])
  return self.__pydantic_serializer__.to_python(
  PydanticSerializationUnexpectedValue(Expected `none` - serialized value may not be as expected [field_name='parsed', input_value=_SmeltBatch(rows=[_SmeltI...ublic=False, row_id=7)]), input_type=_SmeltBatch])
  return self


  Config: 5 batches, conc=5 (batch=2, conc=5)
  Status: SUCCESS
  Rows: 10/10 successful
  Batches: 5/5 successful
  Tokens: 2,356 in / 245 out
  Retries: 0 | Time: 2.03s

  [0] sector='Technology' sub_sector='Consumer Electronics and Software' is_public=True
  [1] sector='Financial Services' sub_sector='Investment Banking and Financial Services' is_public=True
  [2] sector='Healthcare' sub_sector='Pharmaceuticals & Biotechnology' is_public=True
  [3] sector='Automotive' sub_sector='Electric Vehicles & Clean Energy' is_public=True
  [4] sector='Media & Entertainment' sub_sector='Streaming Services' is_public=True
  [5] sector='Professional Services' sub_sector='Audit and Consulting' is_public=False
  [6] sector='Technology' sub_sector='Digital Media Streaming' is_public=True
  [7] sector='Financial Services' sub_sector='Payment Processing' is_public=False
  [8] sector='Healthcare' sub_sector='Biotechnology' is_public=True
  [9] sector='Consumer Services' sub_sector='Online Marketplace

  PydanticSerializationUnexpectedValue(Expected `none` - serialized value may not be as expected [field_name='parsed', input_value=_SmeltBatch(rows=[_SmeltI...public=True, row_id=0)]), input_type=_SmeltBatch])
  return self.__pydantic_serializer__.to_python(
  PydanticSerializationUnexpectedValue(Expected `none` - serialized value may not be as expected [field_name='parsed', input_value=_SmeltBatch(rows=[_SmeltI...public=True, row_id=6)]), input_type=_SmeltBatch])
  return self.__pydantic_serializer__.to_python(



  Config: 10 batches, conc=10 (batch=1, conc=10)
  Status: SUCCESS
  Rows: 10/10 successful
  Batches: 10/10 successful
  Tokens: 4,116 in / 263 out
  Retries: 0 | Time: 1.97s

  [0] sector='Technology' sub_sector='Consumer Electronics' is_public=True
  [1] sector='Financial Services' sub_sector='Banking' is_public=True
  [2] sector='Healthcare' sub_sector='Pharmaceuticals & Biotechnology' is_public=True
  [3] sector='Automotive' sub_sector='Electric Vehicles' is_public=True
  [4] sector='Technology' sub_sector='Streaming Entertainment' is_public=True
  [5] sector='Professional Services' sub_sector='Audit and Consulting' is_public=False
  [6] sector='Technology' sub_sector='Digital Media Streaming' is_public=True
  [7] sector='Financial Services' sub_sector='Payment Processing' is_public=False
  [8] sector='Healthcare' sub_sector='Biotechnology' is_public=True
  [9] sector='Consumer Services' sub_sector='Online Travel and Lodging' is_public=True
  ... and 7 more rows
  Row ordering ve

  PydanticSerializationUnexpectedValue(Expected `none` - serialized value may not be as expected [field_name='parsed', input_value=_SmeltBatch(rows=[_SmeltI...public=True, row_id=8)]), input_type=_SmeltBatch])
  return self.__pydantic_serializer__.to_python(


---
## 6. Error Handling — stop_on_exhaustion

Demonstrate graceful error handling when `stop_on_exhaustion=False` collects errors,
and when `stop_on_exhaustion=True` raises `SmeltExhaustionError` with partial results.

In [52]:
# stop_on_exhaustion=False: errors are collected, successful batches still returned
model = Model(provider="openai", name="gpt-4.1-mini", api_key=OPENAI_KEY, params={"temperature": 0})

job = Job(
    prompt="Create a concise structured summary for each company. "
    "Calculate age based on founded year (current year is 2026).",
    output_model=CompanySummary,
    batch_size=5,
    concurrency=2,
    max_retries=2,
    stop_on_exhaustion=False,  # collect errors, don't raise
)

result = await job.arun(model, data=companies)
show_result("Company Summary (stop_on_exhaustion=False)", result)

print(f"\nsuccess property: {result.success}")
print(f"result.data has {len(result.data)} rows")
print(f"result.errors has {len(result.errors)} errors")
print(f"result.metrics: {result.metrics}")


  Company Summary (stop_on_exhaustion=False)
  Status: SUCCESS
  Rows: 10/10 successful
  Batches: 2/2 successful
  Tokens: 1,430 in / 373 out
  Retries: 0 | Time: 3.92s

  [0] one_liner='Apple Inc. designs and manufactures consumer electronics and software.' industry='Consumer Electronics' company_size='enterprise' age_years=50
  [1] one_liner='JPMorgan Chase is a multinational investment bank and financial services company.' industry='Financial Services' company_size='enterprise' age_years=227
  [2] one_liner='Pfizer is a global pharmaceutical and biotechnology corporation.' industry='Pharmaceuticals' company_size='enterprise' age_years=177
  [3] one_liner='Tesla Inc. is an electric vehicle and clean energy company.' industry='Automotive' company_size='enterprise' age_years=23
  [4] one_liner='Netflix is a streaming entertainment service provider.' industry='Entertainment' company_size='medium' age_years=29
  [5] one_liner='Deloitte is a professional services network providing audit

  PydanticSerializationUnexpectedValue(Expected `none` - serialized value may not be as expected [field_name='parsed', input_value=_SmeltBatch(rows=[_SmeltC...ge_years=29, row_id=4)]), input_type=_SmeltBatch])
  return self.__pydantic_serializer__.to_python(
  PydanticSerializationUnexpectedValue(Expected `none` - serialized value may not be as expected [field_name='parsed', input_value=_SmeltBatch(rows=[_SmeltC...ge_years=18, row_id=9)]), input_type=_SmeltBatch])
  return self.__pydantic_serializer__.to_python(


In [53]:
# stop_on_exhaustion=True with a valid request — should succeed without raising
model = Model(provider="openai", name="gpt-4.1-mini", api_key=OPENAI_KEY, params={"temperature": 0})

job = Job(
    prompt="Classify each company by industry sector.",
    output_model=IndustryClassification,
    batch_size=10,
    max_retries=3,
    stop_on_exhaustion=True,  # will raise on failure
)

try:
    result = await job.arun(model, data=companies)
    show_result("Classification (stop_on_exhaustion=True, no error expected)", result)
    print("No exception raised — all batches succeeded.")
except SmeltExhaustionError as e:
    print(f"SmeltExhaustionError: {e}")
    print(f"Partial results: {len(e.partial_result.data)} rows succeeded")
    print(f"Errors: {len(e.partial_result.errors)} batches failed")


  Classification (stop_on_exhaustion=True, no error expected)
  Status: SUCCESS
  Rows: 10/10 successful
  Batches: 1/1 successful
  Tokens: 948 in / 225 out
  Retries: 0 | Time: 3.55s

  [0] sector='Technology' sub_sector='Consumer Electronics' is_public=True
  [1] sector='Financial Services' sub_sector='Banking and Investment' is_public=True
  [2] sector='Healthcare' sub_sector='Pharmaceuticals and Biotechnology' is_public=True
  [3] sector='Automotive and Energy' sub_sector='Electric Vehicles and Clean Energy' is_public=True
  [4] sector='Media and Entertainment' sub_sector='Streaming Services' is_public=True
  [5] sector='Professional Services' sub_sector='Audit and Consulting' is_public=False
  [6] sector='Media and Entertainment' sub_sector='Digital Music Streaming' is_public=True
  [7] sector='Financial Services' sub_sector='Financial Technology' is_public=False
  [8] sector='Healthcare' sub_sector='Biotechnology' is_public=True
  [9] sector='Technology' sub_sector='Online Mark

---
## 7. Async Execution

Use `await job.arun()` directly (works in Jupyter notebooks).

In [54]:
model = Model(provider="openai", name="gpt-4.1-mini", api_key=OPENAI_KEY, params={"temperature": 0})

job = Job(
    prompt="Classify each company by industry sector.",
    output_model=IndustryClassification,
    batch_size=3,
    concurrency=4,
    stop_on_exhaustion=False,
)

result = await job.arun(model, data=companies)
show_result("OpenAI / gpt-4.1-mini — Async (batch=3, conc=4)", result)
print(f"  Batches: {result.metrics.total_batches} (ceil(10/3) = 4)")


  OpenAI / gpt-4.1-mini — Async (batch=3, conc=4)
  Status: SUCCESS
  Rows: 10/10 successful
  Batches: 4/4 successful
  Tokens: 2,004 in / 244 out
  Retries: 0 | Time: 2.84s

  [0] sector='Technology' sub_sector='Consumer Electronics and Software' is_public=True
  [1] sector='Financial Services' sub_sector='Investment Banking and Financial Services' is_public=True
  [2] sector='Healthcare' sub_sector='Pharmaceuticals and Biotechnology' is_public=True
  [3] sector='Automotive' sub_sector='Electric Vehicles' is_public=True
  [4] sector='Media & Entertainment' sub_sector='Streaming Services' is_public=True
  [5] sector='Professional Services' sub_sector='Audit and Consulting' is_public=False
  [6] sector='Technology' sub_sector='Digital Media Streaming' is_public=True
  [7] sector='Financial Services' sub_sector='Financial Technology (FinTech)' is_public=False
  [8] sector='Healthcare' sub_sector='Biotechnology' is_public=True
  [9] sector='Consumer Services' sub_sector='Online Travel a

In [55]:
model = Model(provider="anthropic", name="claude-haiku-4-5-20251001", api_key=ANTHROPIC_KEY)

job = Job(
    prompt="Analyze the sentiment of each product review.",
    output_model=SentimentAnalysis,
    batch_size=5,
    concurrency=2,
    stop_on_exhaustion=False,
)

result = await job.arun(model, data=products)
show_result("Anthropic / claude-haiku-4.5 — Async Sentiment", result)


  Anthropic / claude-haiku-4.5 — Async Sentiment
  Status: SUCCESS
  Rows: 10/10 successful
  Batches: 2/2 successful
  Tokens: 2,847 in / 583 out
  Retries: 0 | Time: 2.72s

  [0] sentiment='positive' score=0.95 key_themes=['sound quality', 'comfort', 'long-duration use']
  [1] sentiment='mixed' score=0.65 key_themes=['performance on hardwood', 'battery life limitation']
  [2] sentiment='positive' score=0.9 key_themes=['reading experience', 'versatile use cases']
  [3] sentiment='positive' score=0.85 key_themes=['convenience', 'meal preparation', 'time-saving']
  [4] sentiment='mixed' score=0.7 key_themes=['warmth for fall', 'seasonal limitation', 'material quality']
  [5] sentiment='positive' score=0.9 key_themes=['Kids satisfaction', 'Display quality']
  [6] sentiment='positive' score=0.85 key_themes=['Audio quality', 'Portability']
  [7] sentiment='positive' score=0.95 key_themes=['Health benefits', 'Quality improvement']
  [8] sentiment='positive' score=0.85 key_themes=['Value fo

In [56]:
model = Model(provider="google_genai", name="gemini-2.5-flash", api_key=GEMINI_KEY)

job = Job(
    prompt="Triage each support ticket with category, priority, escalation need, "
    "and a suggested response.",
    output_model=TicketTriage,
    batch_size=10,
    stop_on_exhaustion=False,
)

result = await job.arun(model, data=tickets)
show_result("Gemini / gemini-2.5-flash — Async Ticket Triage", result)


  Gemini / gemini-2.5-flash — Async Ticket Triage
  Status: SUCCESS
  Rows: 10/10 successful
  Batches: 1/1 successful
  Tokens: 1,128 in / 2,026 out
  Retries: 0 | Time: 9.01s

  [0] category='shipping' priority='urgent' requires_human=True suggested_response='Apologies for the damaged product. We will arrange for a replacement immediately. Please provide photos of the damage for our records.'
  [1] category='billing' priority='low' requires_human=False suggested_response="You can change your subscription in your account settings under 'Billing & Plans'. Follow the steps to switch to annual billing."
  [2] category='technical' priority='high' requires_human=True suggested_response='We apologize for the inconvenience. Please try these troubleshooting steps: [link to troubleshooting guide]. If the issue persists, we will escalate to our technical team.'
  [3] category='billing' priority='high' requires_human=True suggested_response='We apologize for the double charge. We are investigat

---
## Summary

All tests complete. Smelt successfully:
- Transforms structured data through OpenAI, Anthropic, and Google Gemini
- Returns strictly typed Pydantic models
- Handles batching and concurrency
- Provides detailed metrics (tokens, timing, retries)
- Works in both sync (`job.run()`) and async (`await job.arun()`) modes

> **Note:** Jupyter notebooks run inside an event loop, so all cells use `await job.arun()`.
> Use `job.run()` in regular Python scripts where no event loop is running.