<a href="https://colab.research.google.com/github/micah-shull/AI_Agents/blob/main/055_Two_Tool_Design_Styles.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# ⚖️ Balancing Flexibility and Reliability in Data Extraction

## 🤖 Tool Design Tradeoffs: Dynamic vs. Fixed Extraction

When designing our agent’s toolset, we face a key architectural decision:

> Should we use a **general-purpose extraction tool** with dynamic schemas,  
> or build **specialized tools** with fixed schemas for specific document types?

This choice reflects a classic tradeoff between:

- 🧠 **Flexibility** — adapting on the fly
- 🛡️ **Reliability** — enforcing consistency and correctness

---

### 🧾 Example: Invoice Processing

Let’s return to our invoice processing agent.

#### 🛠️ Option 1: General-Purpose Tool (`prompt_llm_for_json`)
- The agent constructs **schemas and prompts dynamically**.
- It can handle many formats and extract varying fields based on the context.
- It adapts well to **new or unknown document types**.

**But…**
- It may produce **inconsistent schemas** over time.
- It might **omit critical fields**.
- The data might be structured in **ways that break downstream logic**.
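
To make the last point concrete, here is a minimal sketch of how schema drift can break downstream code. The two result dictionaries and the `total_due` helper are invented for illustration; they are not outputs from a real agent run.

```python
# Two hypothetical extraction results for similar invoices, each produced
# with a schema the agent made up on the fly (illustrative data only).
run_1 = {"invoice_number": "INV-001", "amount": {"value": 120.0, "currency": "USD"}}
run_2 = {"invoice_no": "INV-002", "total": "120.00 USD"}


def total_due(invoice: dict) -> float:
    """Downstream logic that assumes one fixed shape for the amount."""
    return invoice["amount"]["value"]


def try_total(invoice: dict) -> str:
    """Run the downstream logic and report whether the shape was usable."""
    try:
        return f"ok: {total_due(invoice)}"
    except KeyError as missing:
        return f"failed: missing key {missing}"


print(try_total(run_1))
print(try_total(run_2))
```

Both results contain the same information, but only the first one matches the shape the downstream function expects.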

---

#### 🔒 Option 2: Specialized Extraction Tool with a Fixed Schema
- Enforces a **strict, reusable schema** for invoices.
- Requires critical fields (like `invoice_number`, `amount`, `date`).
- Includes guidance in the prompt to focus the LLM on specific patterns.

**This gives you:**
- ✅ Data consistency
- ✅ Validation rules
- ✅ Better downstream reliability



```python
@register_tool(tags=["document_processing", "invoices"])
def extract_invoice_data(action_context: ActionContext, document_text: str) -> dict:
    """
    Extract standardized invoice data from document text. This tool enforces a consistent
    schema for invoice data extraction across all documents.

    Args:
        document_text: The text content of the invoice to process

    Returns:
        A dictionary containing extracted invoice data in a standardized format
    """
    # Define a fixed schema for invoice data
    invoice_schema = {
        "type": "object",
        "required": ["invoice_number", "date", "amount"],  # These fields must be present
        "properties": {
            "invoice_number": {"type": "string"},
            "date": {"type": "string", "format": "date"},
            "amount": {
                "type": "object",
                "properties": {
                    "value": {"type": "number"},
                    "currency": {"type": "string"}
                },
                "required": ["value", "currency"]
            },
            "vendor": {
                "type": "object",
                "properties": {
                    "name": {"type": "string"},
                    "tax_id": {"type": "string"},
                    "address": {"type": "string"}
                },
                "required": ["name"]
            },
            "line_items": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "description": {"type": "string"},
                        "quantity": {"type": "number"},
                        "unit_price": {"type": "number"},
                        "total": {"type": "number"}
                    },
                    "required": ["description", "total"]
                }
            }
        }
    }

    # Create a focused prompt that guides the LLM in invoice extraction
    extraction_prompt = f"""
    Extract invoice information from the following document text.
    Focus on identifying:
    - Invoice number (usually labeled as 'Invoice #', 'Reference', etc.)
    - Date (any dates labeled as 'Invoice Date', 'Issue Date', etc.)
    - Amount (total amount due, including currency)
    - Vendor information (company name, tax ID if present, address)
    - Line items (individual charges and their details)

    Document text:
    {document_text}
    """

    # Use our general extraction tool with the specialized schema and prompt
    return prompt_llm_for_json(
        action_context=action_context,
        schema=invoice_schema,
        prompt=extraction_prompt
    )
```


🧠 This lecture **contrasts with** the previous ones, but it doesn't contradict them: it introduces a **design tradeoff** and teaches you how to choose the right tool **based on your context**.

Let’s unpack that:

---

## ⚖️ The Two Design Styles

### 1. **General-Purpose Tool**

(*Like `prompt_llm_for_json`*)

* **Externalized prompt + schema**
* Very **modular**
* **Flexible** — handles different tasks and document types
* **Simple** and composable
* Useful for **exploration, prototyping, and rare cases**

**✅ Best when:**

* You need adaptability
* You're handling many types of documents
* You're rapidly iterating

---

### 2. **Specialized Tool with Built-In Schema + Prompt**

(*Like `extract_invoice_data`*)

* Hardcodes the **schema** and **prompt**
* Larger and more complex
* Less flexible — works for one specific task
* But: **Much more reliable**

**✅ Best when:**

* You need **data consistency**
* You have **strict downstream requirements** (e.g. database fields)
* You're in **production**

---

## 🧠 Why the Change in Design?

Because the *goal changed*.

The previous lecture was about **building a general tool**. This one is about **reliability for mission-critical workflows**.

When you're:

* Building *infrastructure*
* Feeding data into accounting systems, CRMs, or databases
* Managing invoices, legal docs, etc.

…you can't afford inconsistent keys or missing fields. You need:

* Required fields
* Field validation
* Repeatable structure

So the function **owns** the schema and prompt to enforce consistency.

---

## 🧩 Analogy: Think of It Like Web Forms

| Context                            | Tool Style                                         |
| ---------------------------------- | -------------------------------------------------- |
| **CMS or form builder**            | General tool — define your form fields dynamically |
| **Bank account registration form** | Specialized tool — fixed schema, strict validation |

---

## 💬 Key Insight

> The difference isn't that one tool is “better” — it’s that you’re learning to design **for the right purpose**.

* Simpler, general tools for adaptability
* Specialized, stricter tools for stability

Many systems use **both**, side by side:

* 🛠️ General-purpose extractor for rare or unknown documents
* 🔒 Specialized extractors for high-value, known templates



Let’s walk through the **specialized tool** from the lecture, `extract_invoice_data()`, and highlight exactly **how and why** it emphasizes **reliability and structure** over flexibility.

---

## 🔍 Specialized Tool Walkthrough: `extract_invoice_data`

Here’s the core function again (condensed for clarity):

```python
@register_tool(tags=["document_processing", "invoices"])
def extract_invoice_data(action_context: ActionContext, document_text: str) -> dict:
    """
    Extract standardized invoice data from document text.
    Enforces a consistent schema for reliability.
    """

    # ✅ 1. Fixed Schema
    invoice_schema = {
        "type": "object",
        "required": ["invoice_number", "date", "amount"],
        "properties": {
            "invoice_number": {"type": "string"},
            "date": {"type": "string", "format": "date"},
            "amount": {
                "type": "object",
                "properties": {
                    "value": {"type": "number"},
                    "currency": {"type": "string"}
                },
                "required": ["value", "currency"]
            },
            # ... (vendor and line_items omitted here)
        }
    }
```

### 🔒 This section hardcodes:

* What fields must be extracted (`required`)
* The data types for each field
* Sub-objects (`amount`, `vendor`, etc.)

> 🧠 Why? To **guarantee consistent structure** — crucial for databases and business logic.

---

```python
    # ✅ 2. Task-Specific Prompt
    extraction_prompt = f"""
    Extract invoice information from the following document text.
    Focus on identifying:
    - Invoice number
    - Date
    - Amount (with currency)
    - Vendor info
    - Line items

    Document text:
    {document_text}
    """
```

### 🗣️ This prompt is:

* Pre-written to guide the LLM clearly
* Focused on invoice-specific patterns
* Tightly scoped to match the fixed schema

> 🧠 Why? A prompt scoped to the schema's fields makes the LLM less likely to guess or invent structure.

---

```python
    # ✅ 3. Calls General Tool Underneath
    return prompt_llm_for_json(
        action_context=action_context,
        schema=invoice_schema,
        prompt=extraction_prompt
    )
```

Even though this is a “specialized” tool, it's **composed** from the general-purpose one!
The core behavior is reused — this tool just wraps it with fixed instructions.

> 🧠 Why? This shows how to **layer tools**: you can specialize by **pre-filling arguments** to a generic tool.
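
The layering idea can be sketched without an LLM at all. In the example below, `generic_extract` is a hypothetical stand-in for a general tool (it only echoes its inputs), and `INVOICE_SCHEMA` is a cut-down schema; neither comes from the course code. The point is only the shape: `functools.partial` fixes the schema, and a thin wrapper fixes the prompt template.

```python
from functools import partial


def generic_extract(schema: dict, prompt: str) -> dict:
    """Hypothetical stand-in for a general tool; a real one would call an LLM."""
    # Echo the inputs so the layering is visible without an LLM.
    return {"schema_keys": sorted(schema["properties"]), "prompt": prompt}


# Cut-down schema for illustration only.
INVOICE_SCHEMA = {
    "type": "object",
    "required": ["invoice_number"],
    "properties": {"invoice_number": {"type": "string"}, "date": {"type": "string"}},
}

# Specialize the generic tool by fixing its schema argument.
extract_with_invoice_schema = partial(generic_extract, schema=INVOICE_SCHEMA)


def extract_invoice(document_text: str) -> dict:
    """Specialized tool: fixed schema plus a fixed prompt template."""
    prompt = f"Extract invoice information from:\n{document_text}"
    return extract_with_invoice_schema(prompt=prompt)


print(extract_invoice("Invoice # 42"))
```

Callers of `extract_invoice` supply only the document text, so they cannot accidentally pass a different schema or prompt.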

---

## 💡 Why This Structure Matters

| Feature              | Benefit                                         |
| -------------------- | ----------------------------------------------- |
| Fixed schema         | Consistent outputs                              |
| Required fields      | Prevents partial/incomplete extractions         |
| Format specs         | Ensures clean integration with downstream tools |
| Guided prompt        | Reduces ambiguity for the LLM                   |
| Wrapped general tool | Code reuse and maintainability                  |

---

## 🧠 Your Takeaways

* **General tools** are building blocks
* **Specialized tools** are production-ready modules
* You can stack them: general → specialized → chained
* Design based on **what the system needs downstream**




## ✅ Benefits of Specialized Extraction Tools

Specialized tools — with fixed schemas and tightly scoped prompts — offer several key advantages:

### 📊 Data Consistency
- The fixed schema ensures that invoice data is always structured the same way.
- This makes it easier to integrate with downstream systems like:
  - Databases
  - Accounting software
  - Analytics pipelines

### 🔒 Required Fields
- You can define critical fields as `required` in the schema.
- If a required field is missing, the tool can raise an error — helping you catch issues early.
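
As a rough sketch of what "raise an error" could look like, the helper below checks an extracted dictionary against a list of required keys. The function names are hypothetical, and the check is hand-rolled for illustration; a full JSON Schema validator would cover far more than this.

```python
REQUIRED_FIELDS = ["invoice_number", "date", "amount"]


def missing_required(data: dict, required: list[str]) -> list[str]:
    """Return the required keys that are absent or None in `data`."""
    return [key for key in required if data.get(key) is None]


def check_required(data: dict, required: list[str] = REQUIRED_FIELDS) -> dict:
    """Raise ValueError if any required field is missing; otherwise return data."""
    missing = missing_required(data, required)
    if missing:
        raise ValueError(f"Missing required fields: {', '.join(missing)}")
    return data


complete = {"invoice_number": "INV-001", "date": "2024-03-15", "amount": {"value": 10}}
print(check_required(complete)["invoice_number"])
print(missing_required({"invoice_number": "INV-002"}, REQUIRED_FIELDS))
```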

### 🧪 Field Validation
- The schema can specify data formats and constraints:
  - Dates in proper format
  - Numbers within ranges
  - Strings with specific patterns
- This prevents garbage data from creeping into your systems.
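
A minimal sketch of these three kinds of checks, using only the standard library. The specific rules (ISO dates, non-negative amounts, three-letter uppercase currency codes) are assumptions chosen for illustration, not rules stated in the lecture.

```python
import re
from datetime import date

# Assumed pattern: three uppercase letters, e.g. "USD".
CURRENCY_CODE = re.compile(r"[A-Z]{3}")


def validate_invoice_fields(data: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []

    # Format check: the date must parse as an ISO calendar date.
    try:
        date.fromisoformat(data.get("date"))
    except (TypeError, ValueError):
        problems.append("date must be an ISO date such as 2024-03-15")

    amount = data.get("amount")
    if not isinstance(amount, dict):
        amount = {}

    # Range check: the value must be a non-negative number (bools excluded).
    value = amount.get("value")
    is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
    if not is_number or value < 0:
        problems.append("amount.value must be a non-negative number")

    # Pattern check: the currency must match the assumed code pattern.
    currency = amount.get("currency")
    if not isinstance(currency, str) or not CURRENCY_CODE.fullmatch(currency):
        problems.append("amount.currency must be a 3-letter uppercase code")

    return problems


print(validate_invoice_fields(
    {"date": "15/03/2024", "amount": {"value": -1, "currency": "usd"}}
))
```

Returning a list of problems rather than raising on the first one makes it easier to log every issue with a document in a single pass.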

### 🎯 Focused Prompting
- The prompt can be written specifically for invoices:
  - “Look for invoice number near the top”
  - “Line items often appear under ‘Details’ or ‘Items’”
- This tends to improve LLM accuracy and reduce hallucinations.

---

## 🤔 When to Use Each Approach

### Use **Specialized Tools** when:
- ✅ Data consistency is **critical**
- ✅ You’re working with a **fixed set of document types**
- ✅ You need to **enforce validation rules**
- ✅ The data feeds into **strict downstream systems** (e.g., SQL, ERPs)

---

### Use the **General-Purpose Tool** when:
- 🌀 You’re handling **many document types**
- 🔄 Document formats and needs **change frequently**
- 🧪 You’re **prototyping** or iterating on a new workflow
- 🧘 The downstream systems are **flexible** about data shape

---

## 🧬 Best Practice: Use Both

In practice, many systems use a **hybrid strategy**:

| Common & Critical Docs | Rare & Experimental Docs |
|------------------------|--------------------------|
| ✅ Specialized tools    | ✅ General-purpose tool   |
| 🔒 Fixed schemas       | 🧠 Dynamic, agent-generated |
| 🧪 Validated fields     | 🌀 Flexible structure     |

> 💡 This gives you the **best of both worlds**:  
> - **Reliability where it matters**,  
> - **Flexibility where it counts**.





> 🎛️ **Blend tight and loose control** based on what your pipeline needs — just like any well-designed system.


---

## 🧱 1. **Strict Schema Tools** = Precision and Structure

These are ideal when:

* 🗂️ You're inserting into **a database** (SQL, document store, etc.)
* 🧮 You're running **calculations** on the fields (e.g. totals, tax, budget logic)
* 🔄 The data is reused by **other systems** (ERP, billing, analytics)

They give you:

* 📐 Predictable structure
* 🔍 Validation and enforcement
* 🧪 Error checking early in the pipeline

Think of them as **“production-grade data extractors.”**

---

## 🌾 2. **Loose, Flexible Tools** = Exploration and Adaptation

Use these when:

* 📄 Documents vary wildly in format and structure
* 🤔 You’re not sure what’s extractable yet
* 🧠 You're prototyping with humans in the loop
* 🧪 You want the LLM to figure out what matters, instead of you dictating it

These tools trade control for:

* 🧠 Smarter behavior
* 🎨 Creative adaptation
* 🌱 Faster iteration

They’re your **“R&D or fallback extractors.”**

---

## 🔀 3. **Blend Them in a Pipeline**

Many modern agent systems combine the two styles like this:

### Example: Hybrid Document Processor

1. 🔍 Try a **specialized tool** first
   (If it's an invoice, contract, resume, etc.)

2. ❓ If format is unknown or tool fails, fall back to:

   * A general-purpose tool
   * Or even a reflection step like:
     *"What kind of document is this?"*

3. 🧼 Final steps:

   * Normalize field names
   * Add confidence scores
   * Log anomalies or ask for human review
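
The steps above can be sketched as a small dispatcher. Everything here is a toy stand-in: `classify_document` uses a keyword match instead of an LLM, and both extractors are hypothetical placeholders for real tools. Field normalization and confidence scores are left out to keep the control flow visible.

```python
from typing import Callable, Optional

Extractor = Callable[[str], dict]


def classify_document(text: str) -> Optional[str]:
    """Toy classifier (assumption): keyword match instead of an LLM call."""
    if "invoice" in text.lower():
        return "invoice"
    return None


def extract_invoice(text: str) -> dict:
    """Stand-in specialized extractor; fails if no invoice number is found."""
    for line in text.splitlines():
        if line.lower().startswith("invoice #"):
            return {"invoice_number": line.split("#", 1)[1].strip()}
    raise ValueError("no invoice number found")


def extract_generic(text: str) -> dict:
    """Stand-in general-purpose extractor; returns loosely structured data."""
    return {"raw_text": text.strip()}


# Registry of specialized tools, keyed by document type.
SPECIALIZED: dict[str, Extractor] = {"invoice": extract_invoice}


def process_document(text: str) -> dict:
    """Try a specialized tool first, fall back to the general one, log anomalies."""
    doc_type = classify_document(text)
    tool = SPECIALIZED.get(doc_type)
    anomalies = []
    if tool is not None:
        try:
            return {"source": "specialized", "data": tool(text), "anomalies": anomalies}
        except ValueError as err:
            # Record the failure so it can be reviewed, then fall through.
            anomalies.append(f"{doc_type} tool failed: {err}")
    return {"source": "general", "data": extract_generic(text), "anomalies": anomalies}


print(process_document("Invoice # 42\nTotal: 10"))
print(process_document("This invoice is unreadable"))
```

Recording which path produced each result (`source`) lets downstream code treat specialized output as trusted and route general output to review.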

