<a href="https://colab.research.google.com/github/micah-shull/AI_Agents/blob/main/111_Best_N.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>


# Claude Code Starter Notebook

This notebook is a clean template for working with **Claude** (Anthropic's models) in Colab.

It supports:
- Loading your API key from a `.env` file
- A helper function `ask_claude` for single-turn Q&A
- A simple **conversation manager** to keep history across multiple turns
- Running shell commands via `!` or `%%bash`


## 1. Install dependencies

In [1]:

!pip -q install anthropic python-dotenv rich

## 2. Load API key

In [2]:

import os
from dotenv import load_dotenv

# Adjust path to your secrets file
load_dotenv("/content/API_KEYS.env")

anthropic_key = os.getenv("ANTHROPIC_API_KEY")
if not anthropic_key:
    raise RuntimeError("Missing ANTHROPIC_API_KEY in /content/API_KEYS.env")

print("✅ Anthropic key loaded")


✅ Anthropic key loaded


## 3. Import libraries and set up client

In [3]:

from anthropic import Anthropic, APIError
from rich.console import Console
from rich.markdown import Markdown

console = Console()
client = Anthropic(api_key=anthropic_key)

# Default to Claude 3.5 Haiku for speed & low cost
MODEL_NAME = os.environ.get("CLAUDE_MODEL", "claude-3-5-haiku-latest")


## 4. Helper function for single-turn queries

In [16]:
import textwrap

def smart_print_markdown(output: str, width: int = 100):
    """Wrap plain text, preserve code fences."""
    in_code = False
    buf = []
    for line in output.splitlines():
        if line.strip().startswith("```"):
            # flush any wrapped text before toggling code mode
            if buf:
                print(textwrap.fill(" ".join(buf), width=width, replace_whitespace=False))
                print()
                buf = []
            print(line)
            in_code = not in_code
            continue
        if in_code:
            print(line)
        else:
            # collect non-code lines to wrap as paragraphs
            if line.strip() == "":
                if buf:
                    print(textwrap.fill(" ".join(buf), width=width, replace_whitespace=False))
                    print()
                    buf = []
            else:
                buf.append(line)
    if buf:
        print(textwrap.fill(" ".join(buf), width=width, replace_whitespace=False))
        print()

def ask_claude(prompt: str, system: str = "You are a helpful coding assistant.",
               render: str = "markdown",  # 'markdown' | 'wrapped' | 'none'
               return_text: bool = False) -> str | None:
    if not anthropic_key:
        raise RuntimeError("Missing ANTHROPIC_API_KEY.")
    msg = client.messages.create(
        model=MODEL_NAME,
        max_tokens=1000,
        temperature=0.2,
        system=system,
        messages=[{"role": "user", "content": prompt}],
    )
    parts = [b.text for b in msg.content if getattr(b, "type", None) == "text"]
    output = "\n\n".join(parts).strip() or "(No text)"

    if render == "markdown":
        console.print(Markdown(output))
    elif render == "wrapped":
        smart_print_markdown(output)
    # render == 'none' skips printing

    return output if return_text else None


## 5. Conversation manager for multi-turn chats

In [19]:
conversation = []

import textwrap
from rich.console import Console
from rich.markdown import Markdown

console = Console()

def smart_print_markdown(output: str, width: int = 100):
    """
    Wrap plain text, preserve fenced code blocks.
    """
    in_code = False
    para_buf = []

    def flush_paragraph():
        if para_buf:
            text = " ".join(para_buf)
            print(textwrap.fill(text, width=width, replace_whitespace=False))
            print()
            para_buf.clear()

    for line in output.splitlines():
        fence = line.strip().startswith("```")
        if fence:
            # Finish any pending wrapped paragraph before toggling code
            flush_paragraph()
            print(line)
            in_code = not in_code
            continue

        if in_code:
            # Inside code block -> print verbatim
            print(line)
        else:
            # Outside code block -> buffer/wrap paragraphs
            if line.strip() == "":
                flush_paragraph()
            else:
                para_buf.append(line)

    flush_paragraph()

def chat_with_claude(
    prompt: str,
    system: str = "You are a helpful coding assistant.",
    render: str = "markdown",      # 'markdown' | 'wrapped' | 'none'
    return_text: bool = False,
    wrap_width: int = 100,
) -> str | None:
    """
    Send a prompt with conversation memory.
    - render='markdown'  -> pretty Markdown rendering (code blocks look great)
    - render='wrapped'   -> wrap only plain text, preserve code fences
    - render='none'      -> print nothing (use return_text=True if you need the string)
    """
    if not anthropic_key:
        raise RuntimeError("Missing ANTHROPIC_API_KEY.")

    conversation.append({"role": "user", "content": prompt})

    try:
        msg = client.messages.create(
            model=MODEL_NAME,
            max_tokens=1000,
            temperature=0.2,
            system=system,
            messages=conversation,
        )
        parts = [b.text for b in msg.content if getattr(b, "type", None) == "text"]
        output = "\n\n".join(parts).strip() or "(No text)"

        if render == "markdown":
            console.print(Markdown(output))
        elif render == "wrapped":
            smart_print_markdown(output, width=wrap_width)
        # render == 'none' -> no printing

        conversation.append({"role": "assistant", "content": output})
        return output if return_text else None

    except APIError as e:
        # Drop the unanswered user turn so the history keeps alternating user/assistant
        conversation.pop()
        print("Anthropic API error:", e)
        raise

# Optional helpers
def reset_conversation():
    conversation.clear()

def last_reply() -> str | None:
    for m in reversed(conversation):
        if m["role"] == "assistant":
            return m["content"]
    return None


# Exercise: Build Multiple Versions of a Feature with Best of N

## Prerequisites
- Completed **Tutorial 1.1** (Next.js expense tracking app)  
- Basic understanding of **Git branches**  
  - If you need help, prompt Claude with:  
    *"Teach me about Git branches. Let's do it interactively. Don't create an artifact. Explain things using concrete examples with example repositories with a handful of files."*  
- Claude Code installed  
- Working expense tracker project

---

## Part 1: Understanding the Best-of-N Pattern

### What is Best-of-N?
The Best-of-N pattern leverages a unique advantage of AI labor: it's **fast, cheap, and egoless**.  
Unlike human developers, Claude doesn't get frustrated when you ask it to throw away work and try again.

#### Traditional Development
- Implement one solution  
- Stick with it (too expensive to rewrite)  
- Hope it's the best approach  

#### AI Labor Development
- Implement **3–5 different solutions**  
- Compare and evaluate all of them  
- Choose the best, or combine elements  

⏱️ **Total time**: Often still less than a single traditional implementation, because each attempt takes minutes rather than days.
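
The generate, compare, choose loop described above can be sketched generically in Python. This is only an illustration under assumptions: `best_of_n`, `generate`, and `score` are hypothetical names, and the toy generator and scorer stand in for "run one Claude attempt" and "evaluate the result".

```python
from typing import Callable, TypeVar

T = TypeVar("T")

def best_of_n(generate: Callable[[int], T], score: Callable[[T], float], n: int = 3) -> tuple[T, list[tuple[T, float]]]:
    """Produce n candidates, score each one, and return the best plus the full scoreboard."""
    if n < 1:
        raise ValueError("n must be at least 1")
    scored = [(candidate, score(candidate)) for candidate in (generate(i) for i in range(n))]
    best, _ = max(scored, key=lambda pair: pair[1])
    return best, scored

# Toy stand-ins: in practice generate() would be one Claude attempt on its own branch,
# and score() would combine automated checks with a review rubric.
candidates = ["simple button", "modal with filters and preview", "cloud sync"]
best, scoreboard = best_of_n(lambda i: candidates[i], score=len, n=3)
print(best)
```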

---

### Why This Works
- **AI is fast** → What takes days for humans takes minutes for AI.  
- **AI is cheap** → Even premium AI usage costs less than developer time.  
- **AI is creative** → Each attempt can explore radically different approaches.  
- **AI is egoless** → No hurt feelings when you discard its work.  

---

## Part 2: Setting Up for Multiple Implementations

### Step 1: Prepare Your Repository
First, make sure your expense tracker is committed and working:

```bash
cd expense-tracker-ai
git add .
git commit -m "Initial expense tracker implementation"
git status
```
---

## What to Observe

As Claude works on **Version 1**, notice:

- How it creates the **branch**
- The approach it takes (likely using **browser APIs**)
- The **simplicity** of the implementation
- **Where** it places the Export button
- **How** it formats the **CSV** data




In [23]:
chat = '''
I want to add data export functionality to my expense tracker. For this first version, implement a SIMPLE approach.

VERSION CONTROL:
- Before you start, create a new branch called "feature-data-export-v1"
- Make all your changes in this branch
- Commit your changes when complete

VERSION 1 REQUIREMENTS:
- Add an "Export Data" button to the main dashboard
- When clicked, export all expenses as a CSV file
- Include columns: Date, Category, Amount, Description
- Use a simple, straightforward implementation
- Keep the UI minimal - just a button that triggers the download

IMPLEMENTATION APPROACH:
Focus on simplicity and getting it working quickly. Don't overthink the user experience - just make it functional. Use standard browser APIs for file download.

PROCESS:
1. Create and checkout the new branch "feature-data-export-v1"
2. Implement the CSV export functionality
3. Add the export button to the dashboard
4. Test that it works correctly
5. Commit your changes with a descriptive message

Remember: This is Version 1 of 3 - keep it simple and functional.
'''

chat_with_claude(chat)

This is a clean “V1” CSV export. Let’s walk through it starting from the **export function**, and then I’ll point out small fixes that make it safer and more robust in a Next.js/React app.

---

# 1) Walkthrough: `exportExpensesToCSV(expenses)`

### What it does (step-by-step)

1. **Builds CSV text**

   ```js
   const csvContent = [
     "Date,Category,Amount,Description",
     ...expenses.map(expense =>
       `${expense.date},${expense.category},${expense.amount},${expense.description.replace(/,/g, ' ')}`
     )
   ].join('\n');
   ```

   * Creates a header row.
   * Maps each expense to a CSV line.
   * Replaces commas in `description` so they don’t break the CSV column layout.

2. **Wraps CSV in a Blob**

   ```js
   const blob = new Blob([csvContent], { type: 'text/csv;charset=utf-8;' });
   ```

   * A Blob is a file-like object the browser can download.

3. **Creates a temporary download link**

   ```js
   const link = document.createElement('a');
   const url = URL.createObjectURL(blob);

   link.setAttribute('href', url);
   link.setAttribute('download', `expenses_export_${new Date().toISOString().split('T')[0]}.csv`);
   document.body.appendChild(link);
   link.click();
   document.body.removeChild(link);
   ```

   * Generates an object URL for the Blob.
   * Sets a filename like `expenses_export_2025-09-01.csv`.
   * Programmatically clicks the link to trigger download.

### Why it works

* Browsers allow you to “download” in-memory content by linking to a Blob URL.
* No server roundtrip; the file is generated fully client-side.

---

# 2) Quick improvements (recommended for V1.1)

CSV has some finicky rules. Instead of stripping commas only, **properly escape fields** (wrap in quotes and escape inner quotes), and clean up the object URL afterwards.

### A safer, “drop-in” version (JS)

```js
// utils/csv.js
function csvEscape(value) {
  // Convert null/undefined to empty, ensure string
  const s = value == null ? "" : String(value);
  // If value contains a quote, comma, or line break (\r or \n), wrap in quotes and escape quotes
  if (/[",\r\n]/.test(s)) {
    return `"${s.replace(/"/g, '""')}"`;
  }
  return s;
}

export function exportExpensesToCSV(expenses) {
  const header = ["Date", "Category", "Amount", "Description"];
  const rows = expenses.map((e) => ([
    csvEscape(e.date),
    csvEscape(e.category),
    // ensure amount is string; you could format to 2 decimals if needed
    csvEscape(e.amount),
    csvEscape(e.description),
  ]));

  const lines = [
    header.join(","),
    ...rows.map((cols) => cols.join(",")),
  ];

  // Optional: prepend BOM so Excel opens UTF-8 correctly
  const csvContent = "\uFEFF" + lines.join("\n");

  const blob = new Blob([csvContent], { type: "text/csv;charset=utf-8;" });
  const url = URL.createObjectURL(blob);

  const link = document.createElement("a");
  link.href = url;
  link.download = `expenses_export_${new Date().toISOString().split("T")[0]}.csv`;
  document.body.appendChild(link);
  link.click();
  document.body.removeChild(link);

  // Avoid object URL leaks
  URL.revokeObjectURL(url);
}
```

### Why this is better

* Handles **commas, quotes, and newlines** correctly per CSV conventions.
* Won’t corrupt cells if a description has punctuation or newlines.
* Adds a **UTF-8 BOM** (`\uFEFF`) so Excel on Windows displays non-ASCII cleanly.
* **Revokes** the object URL to avoid memory leaks.
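
To sanity-check the escaping rule from inside this notebook, one option is a Python port of `csvEscape` compared against the standard library's `csv.writer`. This is a sketch, not part of the generated app: `csv_escape` and `stdlib_row` are hypothetical helper names introduced here.

```python
import csv
import io

def csv_escape(value) -> str:
    """Python port of the csvEscape helper: quote only when needed, double inner quotes."""
    s = "" if value is None else str(value)
    if any(ch in s for ch in '",\n'):
        return '"' + s.replace('"', '""') + '"'
    return s

def stdlib_row(fields) -> str:
    """Reference output from the standard library writer (minimal quoting)."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerow(fields)
    return buf.getvalue().rstrip("\n")

row = ["2025-09-01", "Food", 12.5, 'Lunch, with "client"']
manual = ",".join(csv_escape(f) for f in row)
print(manual)
# The hand-written rule and the stdlib writer should agree on this row
assert manual == stdlib_row(row)
```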

> If you prefer a library, `json2csv` or `papaparse` are solid; but for V1 this is sufficient and dependency-free.

---

# 3) Using it in a Next.js/React component

If you’re in Next.js **App Router**, anything that touches `document` must be a **Client Component**.

```tsx
// app/dashboard/Dashboard.tsx
"use client";

import React from "react";
import { exportExpensesToCSV } from "@/utils/csv"; // adjust the path
import { useExpenses } from "@/context/ExpenseContext";

export default function Dashboard() {
  const { expenses } = useExpenses();

  return (
    <div className="dashboard">
      {/* ...other content... */}
      <button
        onClick={() => exportExpensesToCSV(expenses)}
        className="export-button"
      >
        Export Data
      </button>
    </div>
  );
}
```

If you’re using TypeScript and have a typed `Expense`, make it explicit:

```ts
// types/expense.ts
export type Expense = {
  date: string;        // or ISO string
  category: string;
  amount: number;      // consider string if using big decimals
  description: string; // may contain commas/newlines
};
```

---

# 4) Styling (your CSS is fine)

Your button CSS is good for a V1. You can keep exactly what you had:

```css
.export-button {
  background-color: #4CAF50;
  color: white;
  border: none;
  padding: 10px 15px;
  margin: 10px 0;
  cursor: pointer;
  border-radius: 4px;
  transition: background-color 0.3s ease;
}
.export-button:hover {
  background-color: #45a049;
}
```

---

# 5) Git flow (spot on)

```bash
git checkout -b feature-data-export-v1
git add .
git commit -m "Add CSV export functionality for expenses"
git push -u origin feature-data-export-v1
```

---

# 6) Testing checklist (add a couple edge cases)

* Button renders and triggers download.
* Filename includes today’s date.
* Columns: `Date,Category,Amount,Description`.
* **Edge data**:

  * Description with commas: `Lunch, with client`.
  * Description with quotes: `Alice said "hi"`.
  * Description with newline: `first line\nsecond line`.
  * Non-ASCII: `Café, jalapeño, 東京`.
* Open in Excel/Numbers/Sheets; columns are correct.
* Large datasets: try 5k–10k rows (should still be fine in the browser).
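
The edge cases in this checklist can also be checked offline by parsing an exported file. The sketch below assumes the export is UTF-8 with an optional BOM and uses the four columns listed above. `parse_export` is a hypothetical helper, and the bytes are built in memory as a stand-in for a file the browser actually downloaded.

```python
import csv
import io

HEADER = ["Date", "Category", "Amount", "Description"]

def parse_export(raw: bytes) -> list[dict]:
    """Decode an exported CSV (UTF-8, optional BOM) and return rows keyed by header."""
    text = raw.decode("utf-8-sig")  # utf-8-sig drops the BOM if one is present
    reader = csv.DictReader(io.StringIO(text, newline=""))
    if reader.fieldnames != HEADER:
        raise ValueError(f"unexpected columns: {reader.fieldnames}")
    return list(reader)

# Stand-in for a downloaded file: the checklist's tricky descriptions, written with a BOM
tricky = ["Lunch, with client", 'Alice said "hi"', "first line\nsecond line", "Café, jalapeño, 東京"]
buf = io.StringIO(newline="")
writer = csv.writer(buf, lineterminator="\n")
writer.writerow(HEADER)
for description in tricky:
    writer.writerow(["2025-09-01", "Food", "12.50", description])
raw = ("\ufeff" + buf.getvalue()).encode("utf-8")

rows = parse_export(raw)
# Every description should survive the round trip unchanged
assert [r["Description"] for r in rows] == tricky
```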

---

# 7) Notes for later versions (V2/V3 ideas)

* **Filterable export** (date range, category, amount min/max).
* **Streaming export** if datasets get huge (web streams/worker).
* **Alternate formats** (JSON, XLSX via `xlsx` lib).
* **Server-side export** (API route that returns CSV; good for auth/logging).
* **i18n/formatting** for dates and currency (e.g., `Intl.NumberFormat`).

---

If you paste in a few sample `expense` objects, I can run a quick “tabletop test” mentally and show you exactly how the CSV lines will look given tricky descriptions (commas/quotes/newlines).




## Part 4: Implementing Version 2 — Advanced Export with Options

### Reset and Create Version 2

Now we'll ask Claude to implement the **same feature** completely differently.

This is a great prompt. It clearly resets scope (new branch), raises the bar on UX, and asks for a *different* architecture from V1. Here’s what stands out and how I’d approach it—with a ready blueprint you can hand to Claude or implement yourself.

# What the prompt does well

* **Version control discipline:** start from a clean base, new branch `feature-data-export-v2`.
* **Distinct approach:** explicitly asks for *different UI/UX + code structure* vs V1.
* **Concrete requirements:** format options (CSV/JSON/PDF), filters, preview, filename, counts, loading states.
* **Professional framing:** “power-user modal/drawer,” polish, and attention to detail.

# Risks & clarifications to add (make Claude succeed)

* **Client vs server:** CSV/JSON are easy on the client; **PDF generation is trickier** and may require a library (e.g., `jspdf`, `pdfmake`, `@react-pdf/renderer`) or a server/API route. Decide up front.
* **Next.js boundaries:** Anything touching `document`/`Blob` must be in a **Client Component** (`"use client"`). PDFs rendered server-side need an API route.
* **Large datasets:** For big exports, consider **server-side streaming**; otherwise client RAM/CPU may spike.
* **Accessibility:** Modal needs focus trap, `aria-modal`, keyboard support (ESC to close).
* **i18n/formatting:** Dates/currency formatting via `Intl.*` to avoid locale surprises.

In [24]:
prompt = '''
Excellent work on Version 1! Now I want you to implement the SAME data export feature in a completely different way.

VERSION CONTROL:
- Switch back to the original branch (before any export functionality)
- Create a new branch called "feature-data-export-v2"
- This should be a completely fresh implementation

VERSION 2 REQUIREMENTS:
Implement an ADVANCED export system with these features:
- Export modal/dialog with multiple options
- Multiple export formats: CSV, JSON, and PDF
- Date range filtering for exports (start date, end date)
- Category filtering for exports (select specific categories)
- Preview of data before export (show table of what will be exported)
- Custom filename input field
- Export summary showing how many records will be exported
- Loading states during export process

IMPLEMENTATION APPROACH:
This version should feel like a professional business application export feature. Think about what a power user would want - lots of control and options. Use a modal or drawer interface, not just a simple button.

Make this implementation completely different from Version 1:
- Different UI components and patterns
- Different user experience flow
- More sophisticated code architecture
- Professional polish and attention to detail

PROCESS:
1. Switch back to original branch
2. Create and checkout: git checkout -b feature-data-export-v2
3. Implement the advanced export system
4. Test all the functionality thoroughly
5. Commit your changes

Show me what's possible with a more sophisticated approach. Be creative!
'''

chat_with_claude(prompt)

Awesome—this is a solid “pro” V2. If you want to learn the most from it, focus on **how state, data filtering, UI composition, and exporting are wired together**. Here’s a quick roadmap with specific things to fix/learn (so this compiles cleanly and teaches you good habits).

# What to focus on (learning goals)

1. **Component architecture**

   * Modal as a focused, self-contained “export flow.”
   * Separation of concerns: UI (Modal) vs. pure helpers (filters/exporters).
   * Passing data in via props (`expenses`, `categories`) and exposing events (`onClose`).

2. **State & derived state**

   * `useState` for user choices (format, dates, categories, filename, loading).
   * `useMemo` to compute **filteredExpenses** from inputs → this is the “reactive data pipeline.”

3. **Data filtering & dates**

   * Correctly compare date ranges (watch types/timezones).
   * Category multi-select filtering.

4. **Async UI patterns**

   * `isExporting` loading state, disable/enable buttons.
   * Error paths + notifications (don’t close the modal on failure).

5. **Export implementation surface**

   * CSV escaping (commas, quotes, newlines) + UTF-8 BOM.
   * JSON pretty-print.
   * PDF strategy (client lib vs server route).
   * Creating a Blob, downloading, and **revoking** the object URL.

6. **Design system (Ant Design) use**

   * Modal, Select, RangePicker, Table, Button, Input — idiomatic usage.
   * AntD v4 vs v5 API differences (critical).
   * Accessibility defaults and keyboard handling.

7. **Type safety & props**

   * (If in TS) define `Expense`, `ExportOptions`, and component prop types.
   * `rowKey` for tables; stable keys.
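
The "filter, then preview, then export" pipeline from items 2 and 3 can be prototyped in plain Python before touching the React code. This is a minimal sketch under assumptions: `filter_expenses` is a hypothetical name, and expense dates are taken to be plain `YYYY-MM-DD` strings. Comparing calendar dates rather than timestamps sidesteps most timezone surprises.

```python
from datetime import date

def filter_expenses(expenses, start=None, end=None, categories=()):
    """Keep expenses dated within [start, end] (inclusive) whose category is selected.

    A missing bound or an empty category selection means "no restriction".
    """
    selected = set(categories)
    kept = []
    for expense in expenses:
        day = date.fromisoformat(expense["date"])  # assumes plain YYYY-MM-DD strings
        if start and day < start:
            continue
        if end and day > end:
            continue
        if selected and expense["category"] not in selected:
            continue
        kept.append(expense)
    return kept

expenses = [
    {"date": "2025-08-31", "category": "Food", "amount": 12.5},
    {"date": "2025-09-01", "category": "Travel", "amount": 80.0},
    {"date": "2025-09-15", "category": "Food", "amount": 7.25},
]
september_food = filter_expenses(
    expenses, start=date(2025, 9, 1), end=date(2025, 9, 30), categories=["Food"]
)
print(september_food)
```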

---

# Things to fix right now (so it actually runs)

1. **Ant Design v5 uses `open`, not `visible`**
   If you’re on antd v5, change:

```jsx
<Modal
  title="Export Expenses"
  visible={isVisible}    // ❌ v5
  onCancel={onClose}
  ...
/>
```

to:

```jsx
<Modal
  title="Export Expenses"
  open={isVisible}       // ✅ v5
  onCancel={onClose}
  ...
/>
```

2. **RangePicker value types (dayjs/moment) vs `Date`**
   In antd v5, RangePicker returns **dayjs** objects. You’re comparing with `new Date(...) >= startDate`, which won’t work reliably. Convert or use dayjs for both sides. Example (JS, with dayjs):

```jsx
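import dayjs from "dayjs";
import isSameOrAfter from "dayjs/plugin/isSameOrAfter";
import isSameOrBefore from "dayjs/plugin/isSameOrBefore";

// isSameOrAfter / isSameOrBefore are dayjs plugins and must be registered before use
dayjs.extend(isSameOrAfter);
dayjs.extend(isSameOrBefore);

const [range, setRange] = useState([null, null]); // [start, end]

const filteredExpenses = useMemo(() => {
  const [start, end] = range;
  return expenses.filter((expense) => {
    const t = dayjs(expense.date);                      // assume ISO string
    const inStart = !start || t.isSameOrAfter(start, "day");
    const inEnd   = !end   || t.isSameOrBefore(end, "day");
    const inCat   = selectedCategories.length === 0 || selectedCategories.includes(expense.category);
    return inStart && inEnd && inCat;
  });
}, [expenses, range, selectedCategories]);

// In JSX (onChange passes null when the picker is cleared):
<DatePicker.RangePicker value={range} onChange={(vals) => setRange(vals ?? [null, null])} />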
```

3. **`notification` not imported**
   You call `notification.error(...)` but didn’t import it. Add:

```jsx
import { notification } from "antd";
```

(or use `message.error` with `import { message } from "antd"`)

4. **Don’t auto-close on error**
   Right now `onClose()` lives in `finally`, so the modal closes even when export fails. Move `onClose()` into the **success** path:

```jsx
const handleExport = async () => {
  setIsExporting(true);
  try {
    if (exportFormat === "csv") await exportToCSV(filteredExpenses, customFilename);
    else if (exportFormat === "json") await exportToJSON(filteredExpenses, customFilename);
    else await exportToPDF(filteredExpenses, customFilename);

    onClose(); // ✅ only after success
  } catch (error) {
    notification.error({ message: "Export Failed", description: error?.message || String(error) });
  } finally {
    setIsExporting(false);
  }
};
```

5. **Prefer `loading` prop to `<Spin />`**
   Cleaner button UX:

```jsx
<Button
  key="export"
  type="primary"
  onClick={handleExport}
  loading={isExporting}                      // ✅ built-in
  disabled={filteredExpenses.length === 0}   // keep disabled on empty
>
  Export
</Button>
```

6. **Implement or import `exportToCSV/JSON/PDF`**
   Make sure these exist and handle escaping/download + `URL.revokeObjectURL`. Example CSV core:

```js
function csvEscape(v) {
  const s = v == null ? "" : String(v);
  // Quote fields containing a quote, comma, or line break (\r or \n)
  return /[",\r\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
}

export async function exportToCSV(rows, baseName = "expenses_export") {
  const header = ["Date","Category","Amount","Description"];
  const lines = [header.join(","), ...rows.map(e => [
    csvEscape(e.date),
    csvEscape(e.category),
    csvEscape(e.amount),
    csvEscape(e.description)
  ].join(","))];

  const csv = "\uFEFF" + lines.join("\n"); // BOM for Excel
  const blob = new Blob([csv], { type: "text/csv;charset=utf-8;" });
  const url = URL.createObjectURL(blob);

  const a = document.createElement("a");
  a.href = url;
  a.download = `${baseName}.csv`;
  document.body.appendChild(a);
  a.click();
  a.remove();
  URL.revokeObjectURL(url);
}
```

7. **Table preview (if you add it):**

* Provide `columns`, `dataSource`, and `rowKey` to avoid warnings and re-render churn.
* Limit preview rows (e.g., first 100) or virtualize for big sets.

8. **Client-only**
   Any code using `document`, `window`, `Blob`, or antd components must be in a **Client Component** in Next.js (`"use client"` at top of the file).

9. **Filename handling**
   Validate `customFilename` (fallback default) and **don’t double append** extensions:

```js
const safeBase = customFilename?.trim() || "expenses_export";
// exporters add the correct extension themselves
```
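
The filename rule (fall back to a default, never double-append the extension) is easy to get subtly wrong, so here is a small Python sketch of one way to express it. `safe_filename` is a hypothetical helper, and the set of characters replaced is an assumption about what counts as unsafe, not something the generated app defines.

```python
import re
from typing import Optional

def safe_filename(custom: Optional[str], extension: str, default: str = "expenses_export") -> str:
    """Build '<base>.<extension>' from user input without doubling the extension."""
    ext = "." + extension.lstrip(".").lower()
    base = (custom or "").strip()
    # Replace path separators and characters that are invalid on common file systems
    base = re.sub(r'[\\/:*?"<>|\x00-\x1f]', "_", base)
    if base.lower().endswith(ext):
        base = base[: -len(ext)]  # the user already typed the extension
    base = base.strip(" .") or default
    return base + ext

print(safe_filename("  report.csv ", "csv"))
print(safe_filename(None, "json"))
```

The same logic would translate directly to the JavaScript exporters, with each exporter passing its own extension.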

---

# Extra polish (good to learn next)

* **Accessibility**: AntD modal handles a lot, but ensure Escape closes; consider focus return on close.
* **Internationalization**: Format currency/dates using `Intl.NumberFormat` / `Intl.DateTimeFormat`.
* **Performance**: For >10k rows, consider Web Workers (offload CSV creation) or server-side streaming.
* **PDF**: Choose an approach:

  * Client: `jspdf`, `pdfmake`, `@react-pdf/renderer`.
  * Server: `/api/export/pdf` route returns a real PDF for download (often better for complex docs).
* **Types**: If you’re using TS, define `Expense`, `ExportFormat`, `ExportOptions`, and props types for the modal.

---

# TL;DR — What to learn from this code

* How to **compose a complex UI** (modal with multiple controls) that drives a **pure data pipeline** (filter → preview → export).
* How to handle **async flows** cleanly (loading, error, success).
* How to make **exports robust** (CSV quoting, JSON formatting, PDF strategy, URL cleanup).
* How design system components (AntD) fit into a **typed, client-only Next.js** flow.






# When to use Best-of-N

* **Open-ended design** (API shape, UI/UX flows, data models).
* **Algorithmic choices** (parsers, caching, pagination, deduping).
* **Refactors** where multiple architectures could fit.
* Not worth it for **tiny, deterministic edits**—use iterative refinement instead.

# How to create *real* diversity (not 5 near-duplicates)

* **Vary prompts**: give each attempt a distinct constraint (e.g., “functional style,” “OO with services,” “streaming I/O,” “no new deps,” “type-safe, strict TS”).
* **Vary sampling**: slightly different `temperature`/`top_p` across attempts.
* **Vary scaffolds**: ask for file-by-file vs. patch/diff, or client-only vs. server route.
* **Vary dependencies**: one with zero deps, one with a light utility lib, one with a full design system pattern.
* **Reset context between attempts** so they don’t converge (fresh branch, minimal prior content).
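
One way to make the diversity explicit is to write the plan down as data before running anything. The sketch below is an assumption-laden illustration: `Attempt` and `plan_attempts` are hypothetical names, the branch suffix scheme follows the `-a`, `-b`, `-c` convention used in this notebook, and the temperature steps are arbitrary (capped at 1.0, the upper bound the Anthropic API accepts).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Attempt:
    label: str
    branch: str
    constraint: str
    temperature: float

def plan_attempts(feature: str, constraints: list[str], base_temperature: float = 0.2, step: float = 0.2) -> list[Attempt]:
    """Give each attempt its own branch, its own constraint, and a slightly different temperature."""
    attempts = []
    for i, constraint in enumerate(constraints):
        label = chr(ord("a") + i)  # a, b, c, ...
        temperature = round(min(1.0, base_temperature + i * step), 2)
        attempts.append(Attempt(label, f"{feature}-{label}", constraint, temperature))
    return attempts

plan = plan_attempts(
    "feature-export-v2",
    ["no new deps", "server route for PDF", "functional style, strict TS"],
)
for attempt in plan:
    print(attempt.branch, attempt.temperature, attempt.constraint)
```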

# How to pick a winner (objective beats vibes)

1. **Automated checks** (gate):

   * Unit/integration tests pass
   * Typecheck/lint/build succeed
   * Bundle size / perf budget within limits
2. **Rubric scoring** (score 1–5 each):

   * Correctness & edge-cases
   * Complexity & readability
   * UX polish & accessibility
   * Dependencies & maintainability
   * Observability (logs, errors, testability)
3. **Tie-breaker**:

   * Run a quick **pairwise “tournament”** (A vs B, winner vs C, …).
4. **Synthesis round (optional)**:

   * Ask Claude to **merge the best parts** (“keep A’s API + B’s CSV escaping + C’s modal UX”).
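
The gate-then-score procedure above can be sketched in a few lines of Python. Everything here is illustrative: `rank_attempts`, the `CRITERIA` names, and the sample scores are assumptions made up for the example, not measured results.

```python
from statistics import mean

CRITERIA = ("correctness", "readability", "ux", "maintainability", "observability")

def rank_attempts(attempts: dict) -> list[tuple[str, float]]:
    """Gate on automated checks, then rank survivors by mean rubric score (1-5 per criterion)."""
    ranked = []
    for name, result in attempts.items():
        if not result["checks_passed"]:
            continue  # failed tests/typecheck/build: not eligible, whatever the rubric says
        scores = result["scores"]
        missing = [c for c in CRITERIA if c not in scores]
        if missing:
            raise ValueError(f"{name} is missing rubric scores for: {missing}")
        ranked.append((name, mean(scores[c] for c in CRITERIA)))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return ranked

# Made-up scores for illustration only
attempts = {
    "A": {"checks_passed": True, "scores": dict(zip(CRITERIA, (4, 5, 3, 4, 3)))},
    "B": {"checks_passed": True, "scores": dict(zip(CRITERIA, (5, 3, 5, 4, 4)))},
    "C": {"checks_passed": False, "scores": dict(zip(CRITERIA, (5, 5, 5, 5, 5)))},
}
ranking = rank_attempts(attempts)
print(ranking)
```

Attempt C has the highest rubric scores but never appears in the ranking because it failed the gate. If the top two means are equal, that is the cue for the pairwise comparison in step 3.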

# Git hygiene that makes Best-of-N painless

* Create **one branch per attempt**: `feature-export-v2-a`, `...-b`, `...-c`.
* Commit message template includes **model, temperature, prompt hash**, and a short summary.
* Open **separate PRs** with your rubric table in the description.
* Keep attempts **isolated**; don’t “fix” A with edits from B—do that in a synthesis branch.
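
A commit message template along these lines could be generated in the notebook. This is a sketch: `attempt_commit_message` and the trailer names (`Model`, `Temperature`, `Prompt-SHA256`) are hypothetical conventions chosen for the example, not a Git or Anthropic standard.

```python
import hashlib

def attempt_commit_message(summary: str, model: str, temperature: float, prompt: str) -> str:
    """Build a commit message that records how an attempt was generated."""
    # A short hash identifies the exact prompt without pasting it into the log
    prompt_hash = hashlib.sha256(prompt.encode("utf-8")).hexdigest()[:12]
    return (
        f"{summary}\n\n"
        f"Model: {model}\n"
        f"Temperature: {temperature}\n"
        f"Prompt-SHA256: {prompt_hash}\n"
    )

message = attempt_commit_message(
    "Add export modal (attempt B)",
    model="claude-3-5-haiku-latest",
    temperature=0.4,
    prompt="Implement an ADVANCED export system ...",
)
print(message)
```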

# Cost & time control (without losing quality)

* Use **Haiku** for breadth; switch to **Sonnet** only when refining the winner.
* Start with a **design doc Best-of-N** (short text) before code Best-of-N to avoid long, expensive generations.
* Ask for **patches/diffs** or a **minimal file set**; set `max_tokens` sensibly; request a **continuation** only when a reply was actually cut off (the response's `stop_reason` is `"max_tokens"`).
* Cap N at **3–5**; beyond that, returns diminish.

# Common pitfalls (and fixes)

* **Near-duplicates** → you didn’t diversify constraints; rewrite the per-attempt instructions.
* **Truncated outputs** → raise `max_tokens` or ask for a continuation in a follow-up turn; also ask for **file manifests** first.
* **Broken CSV/PDF** → insist on **escaping rules** and a **BOM for CSV**, and pick a concrete PDF strategy (client lib vs API route).
* **Invisible technical debt** → enforce the rubric and require **tests + types** in every attempt.
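
For the truncated-output case, a continuation loop might look like the sketch below. It is written against a hypothetical `send(messages)` callable that returns `(text, stop_reason)`, so it runs without network access. With the real client, `send` would wrap `client.messages.create(...)` and return the joined text together with `msg.stop_reason`. Stitching chunks together with a plain join is an assumption, and the seams should be reviewed by hand.

```python
def collect_full_reply(send, prompt: str, max_rounds: int = 3) -> str:
    """Keep asking for continuations while the reply stops because of the token limit."""
    messages = [{"role": "user", "content": prompt}]
    chunks = []
    for _ in range(max_rounds):
        text, stop_reason = send(messages)
        chunks.append(text)
        if stop_reason != "max_tokens":
            break  # the model finished on its own
        # Feed the partial answer back and ask for the rest
        messages.append({"role": "assistant", "content": text})
        messages.append({"role": "user", "content": "Continue exactly where you left off."})
    return "".join(chunks)

# Fake transport for demonstration: the first reply is cut off, the second completes it
replies = iter([("part one, ", "max_tokens"), ("part two.", "end_turn")])
full = collect_full_reply(lambda messages: next(replies), "Write a long file")
print(full)
```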

# Tiny automation sketch (you can adapt in Colab)

Ask Claude for **3 variants with distinct constraints**, then run checks:

```python
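# Shared task description; replace with your actual feature request
base_prompt = "Implement data export for the expense tracker."

variants = [
    "Variant A: client-only CSV/JSON/PDF; no new deps; accessibility-first modal.",
    "Variant B: server route for PDF; streaming CSV; strong TypeScript types.",
    "Variant C: minimal deps; functional style; strict error handling and logs.",
]

results = []
for i, constraint in enumerate(variants, 1):
    reset_conversation()  # fresh context so attempts don't converge (clears earlier chat history)
    prompt = f"{base_prompt}\n\nConstraints for this attempt:\n{constraint}\n\nReturn patch-style diffs only."
    # max_tokens is fixed inside chat_with_claude (1000); raise it there if outputs get truncated
    out = chat_with_claude(prompt, render="none", return_text=True)
    results.append({"variant": i, "constraint": constraint, "output": out})
    # Save, apply, and run your checks here (lint/test/build), then add scores to results[-1]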
```

(Then score with your rubric, keep the winner, and optionally run a **synthesis** prompt.)
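
The "run your checks" step can be scripted with `subprocess`. In this sketch `run_checks` is a hypothetical helper, and the npm script names in `PROJECT_CHECKS` are assumptions about the project; substitute whatever your `package.json` defines. The demo at the end uses the Python interpreter itself so that it runs anywhere.

```python
import subprocess
import sys

def run_checks(commands: dict, cwd=None) -> dict:
    """Run each check command and return {name: passed} without raising on failures."""
    results = {}
    for name, argv in commands.items():
        try:
            proc = subprocess.run(argv, cwd=cwd, capture_output=True, text=True)
            results[name] = proc.returncode == 0
        except FileNotFoundError:
            results[name] = False  # the tool is not installed or not on PATH
    return results

# Assumed npm scripts; adjust to your project
PROJECT_CHECKS = {
    "lint": ["npm", "run", "lint"],
    "test": ["npm", "test"],
    "build": ["npm", "run", "build"],
}

# Self-contained demo that uses the Python interpreter instead of npm
demo = run_checks({
    "passes": [sys.executable, "-c", "raise SystemExit(0)"],
    "fails": [sys.executable, "-c", "raise SystemExit(1)"],
})
print(demo)
```

Calling `run_checks(PROJECT_CHECKS, cwd="expense-tracker-ai")` on each attempt's branch would produce the pass/fail gate that feeds the rubric.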

# Prompts that work well for Best-of-N

* **“Give me three distinct approaches”** with labeled sections and pros/cons table.
* **“Return a file manifest first; wait for approval”** to control scope.
* **“Use patch/diff format and do not modify files outside this list”** to keep changes tight.
* **“After code, output a rubric self-score 1–5 per criterion with justification”** to aid selection.

