<a href="https://colab.research.google.com/github/micah-shull/AI_Agents/blob/main/029_Tools.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>



## 🧠 Agent-Building Recipe

### **Step 1: Define the Agent’s Purpose (📋 Objective)**

* *“What do I want this agent to be able to do?”*

  * Example: List files, read content, summarize docs, search for keywords, etc.

---

### **Step 2: Implement the Python Tool Functions (🔧 Actions)**

* These are the actual Python functions that carry out the agent’s work.

  * `list_files()`, `read_file(filename)`, `search_file_names(keyword)`, etc.
* Test these independently first to make sure they work.
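
As a rough illustration of testing a tool on its own, the sketch below checks a file-name search against a throwaway folder. The helper name `search_names` and the sample file names are assumptions made for this example; they are not part of the notebook.

```python
import os
import tempfile

def search_names(names, keyword, case_sensitive=False):
    """Return the names that contain keyword (hypothetical stand-in for a search tool)."""
    needle = keyword if case_sensitive else keyword.lower()
    return [n for n in names if needle in (n if case_sensitive else n.lower())]

# Build a temporary folder so the check does not depend on real documents.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("Agent_Memory.txt", "notes.txt"):
        with open(os.path.join(tmp, name), "w", encoding="utf-8") as f:
            f.write("sample")
    names = sorted(os.listdir(tmp))

# Case-insensitive search finds the file; case-sensitive search does not.
assert search_names(names, "memory") == ["Agent_Memory.txt"]
assert search_names(names, "memory", case_sensitive=True) == []
```

Checks like these catch logic errors before the LLM is involved, which makes later failures easier to attribute to the schema or the prompt.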

---

### **Step 3: Describe the Tools with JSON Schemas (📦 Tool Interfaces)**

* For each Python function, create a **tool schema**:

  * What is it called?
  * What does it do?
  * What parameters does it expect?

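✅ Each entry sent to the API must include:

```json
"type": "function"
```

This field sits on the wrapper added in Step 4, not inside the schema itself. The schema's `"parameters"` field must be a JSON Schema object (`"type": "object"`, `"properties"`, `"required"`).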

---

### **Step 4: Assemble the Tool List (🧰 Tools Array)**

* Combine the tool schemas into a `tools = [...]` list.
* Each entry in the list wraps the schema like this:

```python
{
  "type": "function",
  "function": your_tool_schema
}
```

---

### **Step 5: Build the Tool Router (🚦 Dispatcher)**

* Create a function like:

```python
def tool_router(tool_name, args):
    if tool_name == "read_file":
        return read_file(args["filename"])
    ...
```

* This lets you **translate LLM tool requests into real Python function calls**.

---

### **Step 6: Prompt the LLM with Tool Awareness (🧠 Planning)**

* Set up your LLM chat:

```python
response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Find docs about memory"}],
    tools=tools,
    tool_choice="auto"
)
```

* Let the LLM choose a tool and call it with arguments.

---

### **Step 7: Detect Tool Calls and Handle Execution (🤖 Run Tools)**

* Look at `response.choices[0].message.tool_calls`
* If a tool is requested, call it using `tool_router(...)`
* Feed results back into the conversation if needed.

---

### **Step 8 (Optional): Add Memory or Looping (🧠 Persistent Context)**

* If your agent needs memory of past actions, summarize or record history.
* You can loop tool calls or chain decisions based on results.
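
A minimal sketch of such a loop follows, under some assumptions: `call_model` is a hypothetical callable that wraps the chat completion request and returns the assistant message as a plain dict, and `run_agent_loop` and `max_turns` are illustrative names. The real SDK returns message objects rather than dicts, so an adapter would be needed in practice.

```python
import json

def run_agent_loop(call_model, tool_router, messages, max_turns=5):
    """Call the model until it answers without requesting a tool.

    call_model: hypothetical callable (messages) -> assistant message dict,
        optionally containing a "tool_calls" list.
    tool_router: callable (tool_name, args) -> result, as in Step 5.
    """
    for _ in range(max_turns):
        reply = call_model(messages)
        messages.append(reply)  # keep the assistant turn in the history
        tool_calls = reply.get("tool_calls") or []
        if not tool_calls:
            return reply.get("content")  # final answer, no tool requested
        for call in tool_calls:
            args = json.loads(call["function"]["arguments"] or "{}")
            result = tool_router(call["function"]["name"], args)
            # Each result goes back as a "tool" message tied to the call id.
            messages.append({
                "role": "tool",
                "tool_call_id": call["id"],
                "content": json.dumps(result),
            })
    return None  # no final answer within max_turns
```

The `max_turns` cap is a design choice to stop a model that keeps requesting tools; the `messages` list doubles as the agent's short-term memory because every tool result is appended to it.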

---

### **Step 9: Review and Iterate (🔁 Tweak & Refine)**

* Refine your tool descriptions to improve how well the LLM uses them.
* Add examples, tweak the prompts, test for edge cases.

---

### ✅ End Result:

You now have an LLM-powered agent that:

* Understands user goals
* Selects the right tool
* Executes actions in Python
* Returns results intelligently




In [3]:
%pip install -qU python-dotenv openai


## Imports & Environment Setup


In [4]:
from openai import OpenAI
from dotenv import load_dotenv
import os
import json
import re
import textwrap

load_dotenv("/content/API_KEYS.env")
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

# 📁 Folder containing the documents the tools will operate on
source_dir = "/content/docs_folder"

# Make sure the directory exists
if not os.path.exists(source_dir):
    raise FileNotFoundError(f"📁 Directory not found: {source_dir}")

# List and build full file paths
file_list = [
    os.path.join(source_dir, f)
    for f in os.listdir(source_dir)
    if os.path.isfile(os.path.join(source_dir, f))
]

# Display the found files
print("📂 Files found:")
for file in file_list:
    print("  -", file)

📂 Files found:
  - /content/docs_folder/004_AGENT_Tools.txt
  - /content/docs_folder/001_PArse_the Response.txt
  - /content/docs_folder/003_gent Feedback and Memory.txt
  - /content/docs_folder/000_Prompting for Agents -GAIL.txt
  - /content/docs_folder/002_Execute_the_Action.txt


## Define and Test Python Tools

In [5]:
# ✅ Tool 1: List all .txt files
def list_files():
    return [os.path.basename(f) for f in file_list if f.endswith(".txt")]

# ✅ Tool 2: Read a specific file
def read_file(filename):
    path = os.path.join(source_dir, filename)
    if not os.path.isfile(path):
        return f"⚠️ File not found: {filename}"
    with open(path, "r") as f:
        return f.read()

# ✅ Tool 3: Search for keyword in file names
def search_file_names(keyword, case_sensitive=False):
    matches = []
    for f in file_list:
        name = os.path.basename(f)
        haystack = name if case_sensitive else name.lower()
        needle = keyword if case_sensitive else keyword.lower()
        if needle in haystack:
            matches.append(name)
    return matches

print("🗂️ All .txt files:", list_files())
print("📖 Read sample file:", read_file(list_files()[0]))
print("🔍 Search 'agent':", search_file_names("agent"))


🗂️ All .txt files: ['004_AGENT_Tools.txt', '001_PArse_the Response.txt', '003_gent Feedback and Memory.txt', '000_Prompting for Agents -GAIL.txt', '002_Execute_the_Action.txt']
📖 Read sample file: 

Describing Tools to the Agent

When developing an agentic AI system, one of the most critical aspects is ensuring that the agent understands the tools it has access to. In our previous tutorial, we explored how an AI agent interacts with an environment. Now, we extend that discussion to focus on tool definition, particularly the importance of naming, parameters, and structured metadata.

Example: Automating Documentation for Python Code
Imagine we are building an AI agent that scans through all Python files in a src/ directory and automatically generates corresponding documentation files in a docs/ directory. This agent will need to:

List Python files in the src/ directory.
Read the content of each Python file.
Write documentation files in the docs/ directory.
Since file operations are str

## ✅ Tool 1: list_files

Now that we’ve built and tested the Python tools, we’ll define **tool schemas in JSON** so that an LLM can understand what each tool does and how to call it.

These schemas follow the [OpenAI tool calling spec](https://platform.openai.com/docs/guides/function-calling), where each tool is described using:

* `name`: the function name (matches the Python function name)
* `description`: what the tool does, in natural language
* `parameters`: an object with:

  * `type`: always `"object"`
  * `properties`: dictionary of input fields with their types and descriptions
  * `required`: list of required input fields


In [5]:
{
  "name": "list_files",
  "description": "Returns a list of all .txt files in the source directory.",
  "parameters": {
    "type": "object",
    "properties": {},
    "required": []
  }
}


{'name': 'list_files',
 'description': 'Returns a list of all .txt files in the source directory.',
 'parameters': {'type': 'object', 'properties': {}, 'required': []}}

## ✅ Tool 2: read_file




In [6]:
{
  "name": "read_file",
  "description": "Reads the contents of a specified file in the source directory.",
  "parameters": {
    "type": "object",
    "properties": {
      "filename": {
        "type": "string",
        "description": "The name of the file to read (including .txt)."
      }
    },
    "required": ["filename"]
  }
}


{'name': 'read_file',
 'description': 'Reads the contents of a specified file in the source directory.',
 'parameters': {'type': 'object',
  'properties': {'filename': {'type': 'string',
    'description': 'The name of the file to read (including .txt).'}},
  'required': ['filename']}}

## ✅ Tool 3: search_file_names

In [9]:
{
  "name": "search_file_names",
  "description": "Searches for files whose names include the given keyword.",
  "parameters": {
    "type": "object",
    "properties": {
      "keyword": {
        "type": "string",
        "description": "The keyword to search for in the file names."
      },
      "case_sensitive": {
        "type": "boolean",
        "description": "Whether the search should be case sensitive.",
        "default": False
      }
    },
    "required": ["keyword"]
  }
}


{'name': 'search_file_names',
 'description': 'Searches for files whose names include the given keyword.',
 'parameters': {'type': 'object',
  'properties': {'keyword': {'type': 'string',
    'description': 'The keyword to search for in the file names.'},
   'case_sensitive': {'type': 'boolean',
    'description': 'Whether the search should be case sensitive.',
    'default': False}},
  'required': ['keyword']}}

Defining each tool as its own named variable and then adding it to the `tools` list helps modularity, readability, and debugging. It allows you to:

### ✅ Benefits of Modular Tool Definitions

1. **Keep tools isolated**: You can work on or test one tool at a time.
2. **Reuse or reorganize**: Easily reuse tools across agents or notebooks.
3. **Improved debugging**: You can `print()` or inspect a single tool JSON by name.
4. **Simplified diffs**: If you're using version control like Git, it's easier to see what's changed per tool.
5. **Selective loading**: You can conditionally include tools in different toolchains.

---

### ✅ Refactored Example

The next cell rewrites the tool definitions as named variables. The pattern scales well as you add more tools; you could even store each one in a separate `.py` or JSON file if needed.



In [6]:
# 🔧 Define tools individually

list_files_tool = {
    "name": "list_files",
    "description": "Lists all .txt files in the source directory.",
    "parameters": {
        "type": "object",
        "properties": {},
        "required": []
    }
}

read_file_tool = {
    "name": "read_file",
    "description": "Reads the content of a specified file.",
    "parameters": {
        "type": "object",
        "properties": {
            "filename": {
                "type": "string",
                "description": "The name of the file to read."
            }
        },
        "required": ["filename"]
    }
}

search_file_names_tool = {
    "name": "search_file_names",
    "description": "Searches for files whose names include the given keyword.",
    "parameters": {
        "type": "object",
        "properties": {
            "keyword": {
                "type": "string",
                "description": "The keyword to search for in the file names."
            },
            "case_sensitive": {
                "type": "boolean",
                "description": "Whether the search should be case sensitive.",
                "default": False
            }
        },
        "required": ["keyword"]
    }
}

# 🧰 Combine into master tools list
tools = [list_files_tool, read_file_tool, search_file_names_tool]

So far you have:

* ✅ Built your **Python tools** (functions that actually do the work)
* ✅ Defined your **JSON tool schemas** (what the LLM sees and uses)
* ✅ Created a modular **`tools` list** for orchestration

---

### 🔜 Next Step: Connect the Tools to the Agent

We now need to wire everything together so the LLM can:

1. **See the available tools** via `tools=...` when making the API call
2. **Decide** which tool to use and provide the correct inputs (`tool_calls`)
3. **Trigger** the correct Python function to actually do the work
4. **Return** the result back to the LLM and continue the conversation if needed

---

### ✅ Here’s the Plan

We’ll now do the following:

#### 1. Create a `tool_router` function

This maps each tool name (like `"list_files"`) to its Python function.

#### 2. Write a simple LLM call that uses tools

We’ll pass the `tools` list into the `client.chat.completions.create(...)` call.

#### 3. Parse and execute tool calls

We’ll check for any `tool_calls` in the LLM response and execute the appropriate Python function using the router.






## ✅ 1. tool_router: Maps tool name → Python function
This function dispatches the correct tool using the name the LLM returns:

In [7]:
def tool_router(tool_name, args):
    if tool_name == "list_files":
        return list_files()
    elif tool_name == "read_file":
        return read_file(args["filename"])
    elif tool_name == "search_file_names":
        return search_file_names(
            keyword=args["keyword"],
            case_sensitive=args.get("case_sensitive", False)
        )
    else:
        return f"❌ Unknown tool: {tool_name}"
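
As the number of tools grows, the `if`/`elif` chain can be replaced by a dictionary lookup. The sketch below shows one way to do that; `make_router` is a hypothetical helper, and the two tool functions are simplified stand-ins for the notebook's real ones.

```python
def make_router(handlers):
    """Build a router from a {tool_name: callable} mapping."""
    def route(tool_name, args):
        handler = handlers.get(tool_name)
        if handler is None:
            return f"Unknown tool: {tool_name}"
        try:
            # Keyword expansion relies on schema property names matching
            # the Python parameter names.
            return handler(**args)
        except TypeError as exc:
            # Raised for missing or unexpected argument names (and also for
            # TypeErrors inside the handler itself).
            return f"Bad arguments for {tool_name}: {exc}"
    return route

# Stand-in tools for illustration only.
def list_files():
    return ["a.txt", "b.txt"]

def search_file_names(keyword, case_sensitive=False):
    needle = keyword if case_sensitive else keyword.lower()
    return [n for n in list_files()
            if needle in (n if case_sensitive else n.lower())]

router = make_router({
    "list_files": list_files,
    "search_file_names": search_file_names,
})
print(router("search_file_names", {"keyword": "A"}))
```

With this layout, adding a tool means adding one dictionary entry rather than another branch, at the cost of requiring the schema and the function signature to stay in sync.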

## ✅ 2. generate_agent_response:
Sends a chat completion with tool use enabled

In [8]:
def generate_agent_response(user_input):
    messages = [
        {"role": "system", "content": "You are an assistant that helps with managing files. Use tools when needed."},
        {"role": "user", "content": user_input}
    ]

    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=messages,
        tools=tools,  # ← enables tool usage
        tool_choice="auto",  # ← LLM can choose any tool
    )

    return response.choices[0]


## ✅ 3. handle_tool_call:
Executes tool call and returns result

In [9]:
def handle_tool_call(choice):
    tool_calls = choice.message.tool_calls

    if not tool_calls:
        print("🤖 LLM Response:\n")
        print(choice.message.content)
        return

    for call in tool_calls:
        tool_name = call.function.name
        args = json.loads(call.function.arguments)
        print(f"\n🛠️ Executing tool: {tool_name} with args: {args}")
        result = tool_router(tool_name, args)
        print(f"✅ Tool Result:\n{result}")


## 🧪 4. Test It!

In [10]:
choice = generate_agent_response("List the files that include 'memory'")
handle_tool_call(choice)


BadRequestError: Error code: 400 - {'error': {'message': "Missing required parameter: 'tools[0].type'.", 'type': 'invalid_request_error', 'param': 'tools[0].type', 'code': 'missing_required_parameter'}}

## Type Required



The `"type": "function"` field is required because the Chat Completions `tools` parameter expects every entry to declare its tool type.

### 🔍 Why it matters:

When you pass tools to the model, the API needs to know what kind of tool each entry is:

* 🔧 *This is a callable function* (vs. other types of tools or structured data).
* 🤖 *This tool can be selected by the model; your own code still executes it.*

So each entry in the `tools` list looks like this:

```json
{
  "type": "function",
  "function": {
    "name": "read_file",
    ...
  }
}
```

Whenever you build tools for an agent, **you must structure them to match the expected format of the model you're using.** Here's what that means in practice:

---

### ✅ Legacy `functions` parameter (introduced with `gpt-4-0613`, `gpt-3.5-turbo-0613`)

The original function-calling interface took bare schemas through the now-deprecated `functions` parameter:

```python
{
  "name": "tool_name",
  "description": "What this tool does",
  "parameters": {
    "type": "object",
    "properties": { ... },
    "required": [ ... ]
  }
}
```

---

### ✅ Current `tools` parameter (introduced with `gpt-4-1106-preview` and `gpt-3.5-turbo-1106`; also used by `gpt-4o`)

The `tools` parameter requires each schema to be wrapped with `type: "function"`, regardless of which tool-capable model you call:

```python
{
  "type": "function",
  "function": {
    "name": "tool_name",
    "description": "What this tool does",
    "parameters": {
      "type": "object",
      "properties": { ... },
      "required": [ ... ]
    }
  }
}
```
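
The 400 error seen earlier can be anticipated locally by checking the shape of each entry before making the request. This is a sketch under the assumption that only function-type tools are in use; `find_malformed_tools` is a hypothetical helper, not an SDK function.

```python
def find_malformed_tools(tools):
    """Return (index, reason) pairs for entries that lack the wrapper shape."""
    issues = []
    for i, entry in enumerate(tools):
        if entry.get("type") != "function":
            issues.append((i, "missing type='function'"))
        elif not isinstance(entry.get("function"), dict):
            issues.append((i, "missing 'function' object"))
        elif "name" not in entry["function"]:
            issues.append((i, "function has no 'name'"))
    return issues

bare_schema = {
    "name": "list_files",
    "description": "Lists all files in the source directory.",
    "parameters": {"type": "object", "properties": {}, "required": []},
}

print(find_malformed_tools([bare_schema]))                                    # bare schema
print(find_malformed_tools([{"type": "function", "function": bare_schema}]))  # wrapped
```

The first call reports the unwrapped entry at index 0, and the second returns an empty list because the wrapper is present.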

---

### 🔁 Why the difference?

The explicit `type` field makes the format extensible. Tools are no longer assumed to be just "functions", so other tool types (such as external APIs, files, or agents) could be added later without changing the overall shape.

---

### 🛠 Tip for working with tools

When building a toolchain:

1. **Write your Python tool first and test it.**
2. **Define the tool's schema** in the format matching the API parameter you're using (`tools`, or the legacy `functions`).
3. **Pass the schema to the `tools` parameter** when calling the API.
4. ✅ Then let the model decide when and how to use it (unless you're forcing `tool_choice`).




### 🔁 OpenAI APIs Can and Do Change

OpenAI periodically updates:

* **Request interfaces** (e.g. the `tools` parameter replacing the older `functions` parameter)
* **Tool schema expectations**
* **Response formatting and behavior**

If your code is tightly coupled to one version of the API and schema, **it can break when those assumptions change.**

---

### ✅ Solutions to Prevent Breakage

1. **📦 Docker or Virtual Environments**
   Use Docker or `venv` to freeze your runtime environment (Python version, library versions like `openai`, etc.).
   This helps you reproduce behavior consistently even years later.

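2. **📌 Pin API Versions and Model Names**
   Use a specific model like `"gpt-4-1106-preview"` instead of just `"gpt-4"`.
   If you use `"gpt-4"` or `"gpt-3.5-turbo"` without a version suffix, you’re opting into **auto-updates**, which can change how the model selects tools and fills in arguments. Pin the `openai` package version as well, since the request format is tied to the client and API rather than to the model.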

3. **🧪 Write Format-Aware Tool Builders**
   Create helper functions that generate the correct schema format depending on the request parameter you're using:

   ```python
   def make_tool_schema(name, description, parameters, legacy_functions=False):
       """Build an entry for `tools`, or a bare schema for the legacy `functions` parameter."""
       schema = {
           "name": name,
           "description": description,
           "parameters": parameters
       }
       if legacy_functions:
           # The deprecated `functions=[...]` parameter takes the bare schema
           return schema
       # The current `tools=[...]` parameter requires the wrapper
       return {"type": "function", "function": schema}
   ```

4. **🔒 Version Control Your Tool Schemas**
   Keep your tool definitions in versioned files (e.g. `tools_v1.json`, `tools_v2.json`) so you can track what changed and why.
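
One way to keep schemas in versioned files is to write the `tools` list to JSON and load it back at startup. The sketch below assumes the hypothetical helper names `save_tools` and `load_tools`, and uses a temporary directory in place of a real project folder.

```python
import json
import os
import tempfile

def save_tools(tools, path):
    """Write the tools list as stable, diff-friendly JSON."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(tools, f, indent=2, sort_keys=True)

def load_tools(path):
    """Read a tools list back from a JSON file."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)

tools_v1 = [{
    "type": "function",
    "function": {
        "name": "search_file_names",
        "description": "Searches for files whose names include the given keyword.",
        "parameters": {
            "type": "object",
            "properties": {
                "keyword": {"type": "string"},
                "case_sensitive": {"type": "boolean", "default": False},
            },
            "required": ["keyword"],
        },
    },
}]

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "tools_v1.json")
    save_tools(tools_v1, path)
    reloaded = load_tools(path)

assert reloaded == tools_v1  # Python False round-trips through JSON false
```

Sorting keys and fixing the indentation keeps the file byte-stable between runs, so a Git diff shows only the fields that actually changed.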

---

### 🚀 Bonus: Use Docker for Maximum Stability

A Docker setup ensures:

* Python and library versions don’t drift
* External dependencies (like API clients) don’t suddenly introduce breaking changes
* Deployment is reproducible across machines or cloud services




In [12]:
# 🔧 Define tools individually (bare schemas; the "type": "function" wrapper is added in the tools list below)
list_files_tool = {
    "name": "list_files",
    "description": "Lists all .txt files in the source directory.",
    "parameters": {
        "type": "object",
        "properties": {},
        "required": []
    }
}

read_file_tool = {
    "name": "read_file",
    "description": "Reads the content of a specified file.",
    "parameters": {
        "type": "object",
        "properties": {
            "filename": {
                "type": "string",
                "description": "The name of the file to read."
            }
        },
        "required": ["filename"]
    }
}

search_file_names_tool = {
    "name": "search_file_names",
    "description": "Searches for files whose names include the given keyword.",
    "parameters": {
        "type": "object",
        "properties": {
            "keyword": {
                "type": "string",
                "description": "The keyword to search for in the file names."
            },
            "case_sensitive": {
                "type": "boolean",
                "description": "Whether the search should be case sensitive.",
                "default": False
            }
        },
        "required": ["keyword"]
    }
}

# router
def tool_router(tool_name, args):
    if tool_name == "list_files":
        return list_files()
    elif tool_name == "read_file":
        return read_file(args["filename"])
    elif tool_name == "search_file_names":
        return search_file_names(
            keyword=args["keyword"],
            case_sensitive=args.get("case_sensitive", False)
        )
    else:
        return f"❌ Unknown tool: {tool_name}"

# response
def generate_agent_response(user_input):
    messages = [
        {"role": "system", "content": "You are an assistant that helps with managing files. Use tools when needed."},
        {"role": "user", "content": user_input}
    ]

    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=messages,
        tools=tools,
        tool_choice="auto",
    )

    message = response.choices[0].message

    if message.tool_calls:
        # The LLM decided to use a tool (only the first call is handled here)
        tool_call = message.tool_calls[0]
        return {"type": "tool", "tool_call": tool_call}
    else:
        # Regular assistant reply
        return {"type": "text", "content": message.content}

def handle_tool_call(choice):
    if choice["type"] == "tool":
        tool_name = choice["tool_call"].function.name
        args = json.loads(choice["tool_call"].function.arguments)
        result = tool_router(tool_name, args)
        print(f"🛠️ Used Tool: {tool_name}")
        print(f"📤 Args: {args}")
        print(f"📥 Result:\n{result}")
    elif choice["type"] == "text":
        print(f"💬 Assistant Response:\n{choice['content']}")
    else:
        print("❓ Unrecognized response type.")

# 🧰 Combine into master tools list
# ❌ Original (bare schemas: causes the API error shown earlier)
# tools = [list_files_tool, read_file_tool, search_file_names_tool]
#
# Why it fails:
# The `tools` parameter expects each tool definition wrapped in a dictionary with
# "type": "function" and a "function" key. Without this wrapper, the API returns:
#
# "Missing required parameter: 'tools[0].type'."

# ✅ Updated (correct format: required by OpenAI's tools schema)
tools = [
    {
        "type": "function",               # ✅ Declares this is a function-style tool
        "function": list_files_tool       # ✅ Embeds the tool definition
    },
    {
        "type": "function",
        "function": read_file_tool
    },
    {
        "type": "function",
        "function": search_file_names_tool
    }
]

user_input = "List all the files that contain the word 'memory'"
choice = generate_agent_response(user_input)
handle_tool_call(choice)


🛠️ Used Tool: search_file_names
📤 Args: {'keyword': 'memory'}
📥 Result:
['003_gent Feedback and Memory.txt']



## 🧠 Key Concepts and Learnings

### 1. **Separation of Concerns**

* You build the **tool logic in Python** — these are the actual implementations (`list_files()`, `read_file(filename)`, etc.).
* You then define the **tool schema in JSON format**, which describes **what the tool does** and **what inputs it expects**, not how it works.

  * This schema is what the LLM sees and uses to decide whether or not to invoke the tool.

---

### 2. **Tool Schema Structure (OpenAI Format)**

Each tool **must be wrapped like this** in the `tools` list:

```python
tools = [
    {
        "type": "function",        # ✅ Declares this is a function-style tool
        "function": tool_schema    # ✅ Embeds the JSON schema defined above
    },
    ...
]
```

Each schema (`tool_schema`) itself must include:

* `"name"`: Tool name (unique string).
* `"description"`: What the tool does, in natural language.
* `"parameters"`: A JSON Schema definition:

  * `"type"`: Must be `"object"`.
  * `"properties"`: Dictionary of expected inputs.
  * `"required"`: List of required argument keys.

---

### 3. **Tool Router**

You need a `tool_router(tool_name, args)` function that **maps the LLM’s tool selection back to your actual Python code**.

This lets you separate:

* LLM *decision-making* (it picks the tool),
* From Python *execution logic* (you do the work).

---

### 4. **Function Calling Flow**

Here’s the full flow when an agent receives a user request:

1. **User provides input**
2. LLM decides: “Do I need a tool?”
3. If yes, it chooses a tool from the `tools` list and passes in arguments.
4. You parse the `tool_call` from the LLM’s response.
5. You route that to your Python function via `tool_router()`.
6. You return and print the result.

---

### 5. **Why This Is Powerful**

* Tools **extend** what the LLM can do — from text generation to *real-world actions*.
* You can teach the LLM to **reason about tasks**, then **hand off specific subtasks to tools**.
* This is the core of **agentic workflows**.

---

### 6. **Pitfalls to Avoid**

* If you forget `type: "function"` or don't wrap tools properly, you'll get 400 errors.
* The model's arguments usually follow the `parameters` schema but are not guaranteed to, so validate them before calling your function.
* `args.get()` is safer than `args["key"]` to handle optional arguments.
* The LLM doesn’t know the internal implementation of your tool — just the interface.
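
The argument-handling pitfalls above can be reduced with a small parsing step before routing. The following is a sketch only; `parse_tool_args` is a hypothetical helper, and it checks just for valid JSON and required keys, not types.

```python
import json

def parse_tool_args(raw_arguments, required=()):
    """Parse the model's JSON argument string; return (args, error)."""
    try:
        # An empty string is treated as "no arguments".
        args = json.loads(raw_arguments or "{}")
    except json.JSONDecodeError as exc:
        return None, f"arguments are not valid JSON: {exc.msg}"
    if not isinstance(args, dict):
        return None, "arguments must be a JSON object"
    missing = [key for key in required if key not in args]
    if missing:
        return None, f"missing required arguments: {missing}"
    return args, None

print(parse_tool_args('{"keyword": "memory"}', required=["keyword"]))
print(parse_tool_args('{}', required=["keyword"]))
```

Returning an error string instead of raising lets the caller pass the problem back to the model as a tool result, so it has a chance to retry with corrected arguments.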

---

## ✅ Summary: Your Agent Toolkit

* 🔧 Define Python tool functions (business logic).
* 📜 Describe tools with JSON Schema, wrapped in OpenAI’s `tools` format.
* 🧠 Use `generate_agent_response()` to let the LLM decide what to do.
* 🔁 Route tool requests with `tool_router()` and print results.



