<a href="https://colab.research.google.com/github/micah-shull/AI_Agents/blob/main/025_Execute_Action.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
%pip install -qU python-dotenv openai


In [2]:
from openai import OpenAI
from dotenv import load_dotenv
import os
import json
import re

load_dotenv("/content/API_KEYS.env")
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))

source_dir = "/content/docs_folder"

# Make sure the directory exists
if not os.path.exists(source_dir):
    raise FileNotFoundError(f"📁 Directory not found: {source_dir}")

# List and build full file paths
file_list = [
    os.path.join(source_dir, f)
    for f in os.listdir(source_dir)
    if os.path.isfile(os.path.join(source_dir, f))
]

# Display the found files
print("📂 Files found:")
for file in file_list:
    print("  -", file)

📂 Files found:
  - /content/docs_folder/001_PArse_the Response.txt
  - /content/docs_folder/006_Agent Loop with Function Calling.txt
  - /content/docs_folder/000_Prompting for Agents -GAIL.txt
  - /content/docs_folder/002_Execute_the_Action.txt
  - /content/docs_folder/005_Using Function Calling Capabilities with LLMs.txt
  - /content/docs_folder/003_gent Feedback and Memory.txt
  - /content/docs_folder/004_AGENT_Tools.txt


In [4]:
# ✅ Step 1: Define the Available Tools
def list_files():
    return [os.path.basename(f) for f in file_list]

def read_file(file_name):
    try:
        path = os.path.join(source_dir, file_name)
        with open(path, "r", encoding="utf-8") as f:
            return f.read()[:500]  # Just return preview
    except (OSError, UnicodeDecodeError):  # missing, unreadable, or non-UTF-8 file
        return "File not found."

# ✅ Step 2: LLM Prompt and Call
def build_execution_prompt(user_question):
    tool_list = (
        "- list_files: takes no arguments\n"
        "- read_file: takes {\"file_name\": str}\n"
    )

    system_prompt = (
        "You are an agent that responds with a tool invocation.\n"
        "You must reply with a JSON block inside a markdown ```action block, like this:\n\n"
        "```action\n"
        "{\n"
        "  \"tool_name\": \"list_files\",\n"
        "  \"args\": {}\n"
        "}\n"
        "```\n\n"
        f"Available tools:\n{tool_list}"
    )

    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_question}
    ]

def generate_response(messages):
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # ✅ Lightweight
        messages=messages,
        max_tokens=500
    )
    return response.choices[0].message.content

# ✅ Step 3: Parsing and Execution
def extract_markdown_block(text: str, tag="action") -> str:
    pattern = rf"```{tag}\s*(.*?)```"
    match = re.search(pattern, text, re.DOTALL)
    if match:
        return match.group(1).strip()
    raise ValueError(f"No `{tag}` block found.")

def parse_action_block(llm_response):
    try:
        json_block = extract_markdown_block(llm_response, tag="action")
        return json.loads(json_block)
    except Exception as e:
        return {"tool_name": "error", "args": {"message": str(e)}}

The `execute_action` function (defined in full in the next code cell) is at the **heart of agent execution**.
Let’s walk through it branch by branch and break down exactly what's happening:

---

## 🔧 Purpose of the Function

```python
def execute_action(action):
```

This function receives a structured dictionary from the LLM like:

```json
{
  "tool_name": "read_file",
  "args": { "file_name": "lecture_01.txt" }
}
```

And decides **which tool to run**, based on the `tool_name`.

---

## 🔁 Control Flow – What It Checks

### ✅ 1. Handle `"list_files"`

```python
if action["tool_name"] == "list_files":
    return {"result": list_files()}
```

If the tool name is `list_files`, it calls the `list_files()` helper defined in Step 1 to get the list of filenames, and wraps the result in a dictionary.

---

### ✅ 2. Handle `"read_file"`

```python
elif action["tool_name"] == "read_file":
    return {"result": read_file(action["args"]["file_name"])}
```

If the LLM requested a file to be read, it looks up the filename from the `"args"` field, calls `read_file(...)`, and returns the result.

So if the action was:

```json
{
  "tool_name": "read_file",
  "args": { "file_name": "lecture_02.txt" }
}
```

It runs: `read_file("lecture_02.txt")`

---

### ⚠️ 3. Handle `"error"` (from earlier parsing)

```python
elif action["tool_name"] == "error":
    return {"error": action["args"]["message"]}
```

The LLM itself never picks an `"error"` tool. `parse_action_block` creates this action when it cannot find an `action` block in the reply or cannot decode its JSON.
This branch simply returns the message stored in `"args"`.

Example:

```json
{
  "tool_name": "error",
  "args": { "message": "Invalid JSON format." }
}
```
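
To see how such an action arises, the sketch below restates the notebook's two parsing helpers and feeds them a reply that contains no action block. `bad_reply` is a made-up string, and `FENCE` is a small convenience introduced here so the snippet does not embed literal backtick fences.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built programmatically for easy embedding

def extract_markdown_block(text: str, tag: str = "action") -> str:
    # Same idea as the notebook: capture everything between the opening and closing fence
    pattern = rf"{FENCE}{tag}\s*(.*?){FENCE}"
    match = re.search(pattern, text, re.DOTALL)
    if match:
        return match.group(1).strip()
    raise ValueError(f"No `{tag}` block found.")

def parse_action_block(llm_response: str) -> dict:
    try:
        return json.loads(extract_markdown_block(llm_response))
    except Exception as e:
        # Any parsing problem becomes a synthetic "error" action
        return {"tool_name": "error", "args": {"message": str(e)}}

bad_reply = "Sure! Here are your files."  # hypothetical reply with no action block
error_action = parse_action_block(bad_reply)
print(error_action)
```

The resulting dictionary has `tool_name` set to `"error"`, so the executor's error branch handles it like any other action.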

---

### ❌ 4. Catch-all for unknown tools

```python
else:
    return {"error": f"Unknown tool: {action['tool_name']}"}
```

This is the safety net — if the `tool_name` isn’t one of the known tools (`list_files`, `read_file`, or `error`), this will catch it and return an error message.

---

## 🧠 In Summary

This function is like the **tool dispatcher** in your agent. It:

* Receives a clean parsed action
* Checks the `tool_name`
* Calls the correct function
* Wraps the result in a standardized format
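
The dispatcher role can also be written as a lookup table instead of an `if`/`elif` chain. This is a sketch of an alternative, not the notebook's implementation: `TOOLS`, `dispatch`, and the stub tool bodies are hypothetical names introduced for illustration.

```python
def list_files():
    # Stub standing in for the notebook's list_files()
    return ["a.txt", "b.txt"]

def read_file(file_name):
    # Stub standing in for the notebook's read_file()
    return f"(contents of {file_name})"

# Hypothetical registry mapping tool names to callables
TOOLS = {
    "list_files": list_files,
    "read_file": read_file,
}

def dispatch(action):
    tool = action.get("tool_name")
    args = action.get("args", {})
    if tool == "error":
        return {"error": args.get("message", "Unknown error")}
    func = TOOLS.get(tool)
    if func is None:
        return {"error": f"Unknown tool: {tool}"}
    try:
        return {"result": func(**args)}
    except TypeError as e:
        # Raised when args do not match the function signature
        # (this would also catch a TypeError raised inside the tool itself)
        return {"error": f"Bad arguments for {tool}: {e}"}

print(dispatch({"tool_name": "read_file", "args": {"file_name": "a.txt"}}))
print(dispatch({"tool_name": "delete_file", "args": {}}))
```

With this shape, adding a tool means adding one entry to the table rather than another branch.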

---

## ✅ Example Input and Output

### Input:

```python
{
  "tool_name": "read_file",
  "args": { "file_name": "GAIL_intro.txt" }
}
```

### Output:

```python
{
  "result": "This file introduces the GAIL framework for agent design..."
}
```




In [10]:
# Minimal version (redefined by the verbose version below)
def execute_action(action):
    if action["tool_name"] == "list_files":
        return {"result": list_files()}
    elif action["tool_name"] == "read_file":
        return {"result": read_file(action["args"]["file_name"])}
    elif action["tool_name"] == "error":
        return {"error": action["args"]["message"]}
    else:
        return {"error": f"Unknown tool: {action['tool_name']}"}

# Verbose version with print statements; this definition replaces the one above
def execute_action(action):
    tool = action.get("tool_name")
    args = action.get("args", {})

    print(f"\n🔧 Executing Tool: {tool}")
    print(f"🧾 With Args: {args}")

    if tool == "list_files":
        result = list_files()
        print("📄 Result:", result)
        return {"result": result}

    elif tool == "read_file":
        file_name = args.get("file_name", "")
        result = read_file(file_name)
        print("📄 Result:", result[:300] + "..." if len(result) > 300 else result)  # limit preview
        return {"result": result}

    elif tool == "error":
        error_msg = args.get("message", "Unknown error")
        print("❌ Error:", error_msg)
        return {"error": error_msg}

    else:
        unknown_tool = f"Unknown tool: {tool}"
        print("❌ Error:", unknown_tool)
        return {"error": unknown_tool}


In [11]:
import textwrap

# ✅ Step 4: Run It!
user_input = "Can you show me what files are in the folder?"

# 🧠 LLM decides what to do
messages = build_execution_prompt(user_input)
llm_raw = generate_response(messages)

print("\n🤖 Raw LLM Output:\n")
print(textwrap.fill(llm_raw, width=80))  # Wrap to 80 characters

# 🛠️ Parse + execute
action = parse_action_block(llm_raw)
print("\n✅ Parsed Action:\n", action)

# 🔧 Execute the chosen tool
result = execute_action(action)

# 📦 Print result (wrap if it's long text)
print("\n📦 Execution Result:\n")
if isinstance(result.get("result"), str):
    print(textwrap.fill(result["result"], width=80))
else:
    print(result)



🤖 Raw LLM Output:

```action {   "tool_name": "list_files",   "args": {} } ```

✅ Parsed Action:
 {'tool_name': 'list_files', 'args': {}}

🔧 Executing Tool: list_files
🧾 With Args: {}
📄 Result: ['001_PArse_the Response.txt', '006_Agent Loop with Function Calling.txt', '000_Prompting for Agents -GAIL.txt', '002_Execute_the_Action.txt', '005_Using Function Calling Capabilities with LLMs.txt', '003_gent Feedback and Memory.txt', '004_AGENT_Tools.txt']

📦 Execution Result:

{'result': ['001_PArse_the Response.txt', '006_Agent Loop with Function Calling.txt', '000_Prompting for Agents -GAIL.txt', '002_Execute_the_Action.txt', '005_Using Function Calling Capabilities with LLMs.txt', '003_gent Feedback and Memory.txt', '004_AGENT_Tools.txt']}




## 🧩 What You're Seeing:

### ✅ **LLM + Traditional Code = Hybrid Intelligence**

You're combining:

---

### 🧠 1. **LLM as the "Planner"**

The language model:

* Interprets natural language
* Chooses what tool to run (`"tool_name"`)
* Decides what data to pass (`"args"`)
* Outputs a structured **action plan** like:

```json
{
  "tool_name": "read_file",
  "args": { "file_name": "lecture_02.txt" }
}
```

It acts like an **orchestrator or high-level brain**.

---

### 🧰 2. **Python as the "Executor"**

Your code:

* Uses `if`/`elif` to check the tool name
* Calls a specific function like `read_file(...)` or `list_files()`
* Prints or returns the result

This is the **hardcoded logic**, where Python is dependable, fast, and cheap.

---

## 🤖 This is Exactly How Real Agents Work

A similar design underlies tools like:

* LangChain
* OpenAI Function Calling
* Agent frameworks like CrewAI, AutoGen, MetaGPT, etc.

At their core, they follow the same pattern:

> 🧠 LLM decides **what to do**
> 🛠️ Code decides **how to do it**

---

## ✅ Why This Works So Well

| LLM Strengths                  | Python Strengths                |
| ------------------------------ | ------------------------------- |
| Natural language understanding | Structured, deterministic logic |
| Reasoning and planning         | Efficient execution             |
| Flexible and adaptive          | Safe, fast, and debuggable      |

You're building a **division of labor** where each side does what it's best at.



## 💰 Agents Can Get Expensive — Fast

Because agents:

* **Think in multiple steps**
* Often **call the LLM repeatedly**
* Pass along **growing memory or context**
* May use **external tools** (APIs, DBs, scraping, function calling)

A simple-sounding task like “summarize this report and suggest actions” could become:

1. One call to plan steps
2. Another to read the file
3. Another to summarize
4. Another to reflect or explain
5. Possibly another to take action...

Multiply that across users, tasks, and iterations... and costs spiral.
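
To make the growth concrete, here is a back-of-the-envelope sketch. Every number in it is an assumption chosen for illustration (a 300-token starting prompt, 200 tokens added to the context per step, a hypothetical price per 1,000 tokens); none of them are measurements or real pricing.

```python
BASE_PROMPT_TOKENS = 300      # assumed size of system prompt + user request
TOKENS_ADDED_PER_STEP = 200   # assumed growth of the context after each step
PRICE_PER_1K_TOKENS = 0.001   # hypothetical price in dollars, not a real rate

def tokens_sent(steps):
    # Each call resends the whole context so far, so the input grows every step
    return [BASE_PROMPT_TOKENS + i * TOKENS_ADDED_PER_STEP for i in range(steps)]

per_call = tokens_sent(5)
total = sum(per_call)
print(per_call)   # input tokens sent on each of the 5 calls
print(total)      # total input tokens for one task
print(f"${total / 1000 * PRICE_PER_1K_TOKENS:.4f} per task")
```

Under these assumptions, five calls send 3,500 input tokens in total, more than eleven times the size of the original 300-token request.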

---

## 🔍 That’s Why Smart Agent Design Matters

| ☁️ Naive Agent           | 🧠 Smart Agent                |
| ------------------------ | ----------------------------- |
| Calls LLM for everything | Uses LLM only for *decisions* |
| Stores full chat history | Trims or summarizes memory    |
| Reads full files         | Uses RAG / chunking           |
| No token tracking        | Monitors token usage per step |
| Runs constantly          | Triggers only when needed     |

You want the LLM to **guide** the flow, not be the engine behind every tiny operation.
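
Two of the rows above, trimming memory and tracking tokens per step, can be sketched without calling any API. `trim_history` and `UsageLog` are hypothetical helpers, and the token counts are made up.

```python
def trim_history(messages, keep_last=4):
    """Keep the system message(s) plus the most recent `keep_last` other messages."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    recent = rest[-keep_last:] if keep_last > 0 else []
    return system + recent

class UsageLog:
    """Hypothetical per-step token log."""

    def __init__(self):
        self.steps = []

    def record(self, step_name, tokens):
        self.steps.append((step_name, tokens))

    def total(self):
        return sum(tokens for _, tokens in self.steps)

# Build a fake conversation: 1 system message + 6 user/assistant pairs
history = [{"role": "system", "content": "You are an agent."}]
for i in range(6):
    history.append({"role": "user", "content": f"question {i}"})
    history.append({"role": "assistant", "content": f"answer {i}"})

trimmed = trim_history(history, keep_last=4)
print(len(history), "->", len(trimmed))

log = UsageLog()
log.record("plan", 420)        # made-up token counts
log.record("read_file", 610)
print(log.total())
```

With the OpenAI client used in this notebook, the number to record would typically come from `response.usage.total_tokens` on each completion.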

---

## 💼 Business Impact

| Situation              | Risk                               |
| ---------------------- | ---------------------------------- |
| Low-margin apps        | LLM costs can **erase profit**     |
| High user volume       | LLM usage scales with each session |
| No caching / planning  | Users repeat expensive requests    |
| Over-engineered agents | "Cool" ≠ sustainable               |

Unless you’re charging per action (e.g., usage-based pricing), you need to keep tight control of:

* Token costs
* Action frequency
* Session length
* Retrieval and caching strategies
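
Caching is the easiest of these controls to sketch. The example below memoizes a stand-in for an expensive call with `functools.lru_cache`. `expensive_answer` and its call counter are hypothetical, and a real agent would still need to decide whether identical prompts should return identical answers.

```python
from functools import lru_cache

call_count = 0  # counts how often the "expensive" function body actually runs

@lru_cache(maxsize=128)
def expensive_answer(prompt: str) -> str:
    # Stand-in for an LLM call; a real call would cost tokens each time it runs
    global call_count
    call_count += 1
    return f"answer to: {prompt}"

expensive_answer("list my files")
expensive_answer("list my files")    # served from the cache, body does not run again
expensive_answer("read notes.txt")
print(call_count)
```

Three requests result in only two executions of the function body, because the repeated prompt is answered from the cache.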

---

## ✅ Rule of Thumb

> **Use the LLM only when human-like reasoning is actually needed.**
> Let regular code do everything else.




## ✅ Example 2: Let the LLM Choose a Math Operation

We’ll create an agent in which:

* The user asks a question in natural language (e.g. *"What's 7 times 9?"*)
* The LLM decides which tool to call: `"add"`, `"subtract"`, `"multiply"`, or `"divide"`
* Python executes that operation

In [12]:
import json
import re

# ✅ Step 1: Define Your Available Tools
def add(a, b):
    return a + b

def subtract(a, b):
    return a - b

def multiply(a, b):
    return a * b

def divide(a, b):
    return a / b if b != 0 else "⚠️ Cannot divide by zero"

# ✅ Step 2: Write the Prompt Template
def build_math_prompt(user_input):
    system_prompt = (
        "You are a math agent. You always respond with a tool call in JSON format "
        "inside a ```action markdown block, like this:\n\n"
        "```action\n"
        "{\n"
        "  \"tool_name\": \"multiply\",\n"
        "  \"args\": { \"a\": 7, \"b\": 9 }\n"
        "}\n"
        "```\n\n"
        "Available tools:\n"
        "- add: takes {\"a\": number, \"b\": number}\n"
        "- subtract: takes {\"a\": number, \"b\": number}\n"
        "- multiply: takes {\"a\": number, \"b\": number}\n"
        "- divide: takes {\"a\": number, \"b\": number}"
    )

    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_input}
    ]

# ✅ Step 3: Parse the LLM Response
def extract_action_block(text):
    pattern = r"```action\s*(.*?)```"
    match = re.search(pattern, text, re.DOTALL)
    if not match:
        raise ValueError("No action block found.")
    return json.loads(match.group(1).strip())

# ✅ Step 4: Execute the Chosen Action
def execute_math_action(action):
    tool = action["tool_name"]
    args = action["args"]

    if tool == "add":
        return add(args["a"], args["b"])
    elif tool == "subtract":
        return subtract(args["a"], args["b"])
    elif tool == "multiply":
        return multiply(args["a"], args["b"])
    elif tool == "divide":
        return divide(args["a"], args["b"])
    else:
        return f"❌ Unknown tool: {tool}"

# ✅ Step 5: Put It All Together
# User input
question = "What's 7 times 9?"

# Call LLM
messages = build_math_prompt(question)
llm_output = generate_response(messages)
print("\n🤖 LLM Output:\n", llm_output)

# Parse and execute
action = extract_action_block(llm_output)
print("\n✅ Parsed Action:\n", action)

result = execute_math_action(action)
print("\n🧮 Final Result:\n", result)



🤖 LLM Output:
 ```action
{
  "tool_name": "multiply",
  "args": { "a": 7, "b": 9 }
}
```

✅ Parsed Action:
 {'tool_name': 'multiply', 'args': {'a': 7, 'b': 9}}

🧮 Final Result:
 63
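
The executor above trusts that the model supplied numeric `a` and `b`. A more defensive variant might validate the arguments before running anything. This is a sketch under assumptions: `OPERATIONS` and `safe_execute_math` are hypothetical names, and the error strings are illustrative.

```python
import operator

# Hypothetical lookup table for the four tools in this example
OPERATIONS = {
    "add": operator.add,
    "subtract": operator.sub,
    "multiply": operator.mul,
    "divide": operator.truediv,
}

def safe_execute_math(action):
    tool = action.get("tool_name")
    args = action.get("args", {})
    if tool not in OPERATIONS:
        return {"error": f"Unknown tool: {tool}"}
    a, b = args.get("a"), args.get("b")
    for name, value in (("a", a), ("b", b)):
        # bool is a subclass of int, so exclude it explicitly
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            return {"error": f"Argument '{name}' must be a number, got {value!r}"}
    if tool == "divide" and b == 0:
        return {"error": "Cannot divide by zero"}
    return {"result": OPERATIONS[tool](a, b)}

print(safe_execute_math({"tool_name": "multiply", "args": {"a": 7, "b": 9}}))
print(safe_execute_math({"tool_name": "divide", "args": {"a": 1, "b": 0}}))
print(safe_execute_math({"tool_name": "add", "args": {"a": "7", "b": 9}}))
```

Returning a dictionary with either `result` or `error` keeps the output shape consistent with `execute_action` from the first example.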


In [13]:
# ✅ Step 1: Define the To-Do List Tools
todo_list = []

def add_task(task):
    todo_list.append(task)
    return f"✅ Added: {task}"

def remove_task(task):
    if task in todo_list:
        todo_list.remove(task)
        return f"❌ Removed: {task}"
    else:
        return f"⚠️ Task not found: {task}"

def list_tasks():
    if not todo_list:
        return "📝 Your to-do list is empty."
    return "📝 To-Do List:\n" + "\n".join(f"- {t}" for t in todo_list)

# ✅ Step 2: LLM Prompt Template
def build_todo_prompt(user_input):
    system_prompt = (
        "You are a to-do list assistant. You always respond with an action in this format:\n\n"
        "```action\n"
        "{\n"
        "  \"tool_name\": \"add_task\",\n"
        "  \"args\": { \"task\": \"Buy groceries\" }\n"
        "}\n"
        "```\n\n"
        "Available tools:\n"
        "- add_task: takes {\"task\": str}\n"
        "- remove_task: takes {\"task\": str}\n"
        "- list_tasks: takes no arguments"
    )

    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_input}
    ]

# ✅ Step 3: Markdown Parsing
def extract_action_block(text):
    pattern = r"```action\s*(.*?)```"
    match = re.search(pattern, text, re.DOTALL)
    if not match:
        raise ValueError("No action block found.")
    return json.loads(match.group(1).strip())

# ✅ Step 4: Action Executor
def execute_todo_action(action):
    tool = action["tool_name"]
    args = action.get("args", {})

    if tool == "add_task":
        return add_task(args["task"])
    elif tool == "remove_task":
        return remove_task(args["task"])
    elif tool == "list_tasks":
        return list_tasks()
    else:
        return f"❌ Unknown tool: {tool}"

# ✅ Step 5: Try It Out!
# User command
command = "Add 'Write agent notebook' and 'Take a break' to my list."

# Ask the LLM what to do
messages = build_todo_prompt(command)
llm_output = generate_response(messages)

print("\n🤖 LLM Output:\n")
print(textwrap.fill(llm_output, width=80))

# Parse and execute
action = extract_action_block(llm_output)
print("\n✅ Parsed Action:\n", action)

result = execute_todo_action(action)
print("\n📦 Execution Result:\n", result)



🤖 LLM Output:

```action {   "tool_name": "add_task",   "args": { "task": "Write agent
notebook" } } ```  ```action {   "tool_name": "add_task",   "args": { "task":
"Take a break" } } ```

✅ Parsed Action:
 {'tool_name': 'add_task', 'args': {'task': 'Write agent notebook'}}

📦 Execution Result:
 ✅ Added: Write agent notebook


In [15]:
todo_list

['Write agent notebook']


## ✅ What Happened

Your prompt was:

> "Add 'Write agent notebook' and 'Take a break' to my list."

The LLM responded with:

````markdown
```action
{ "tool_name": "add_task", "args": { "task": "Write agent notebook" } }
```

```action
{ "tool_name": "add_task", "args": { "task": "Take a break" } }
```
````

So it did generate **two separate actions**, but...

---

## ❌ Why Was Only One Executed?

Your current code uses this:

```python
action = extract_action_block(llm_output)
result = execute_todo_action(action)
```

This means you're:

* ✅ Extracting only the **first** `action` block (`re.search` stops at the first match)
* ✅ Executing only **one** action

---

## ✅ How to Fix It (Multi-Action Support)

Let’s add a variant of `extract_action_block` that uses `re.findall` to grab **all** `action` blocks:

````python
def extract_all_action_blocks(text):
    pattern = r"```action\s*(.*?)```"
    matches = re.findall(pattern, text, re.DOTALL)
    if not matches:
        raise ValueError("No action blocks found.")
    return [json.loads(block.strip()) for block in matches]
````

Then update your runner:

```python
actions = extract_all_action_blocks(llm_output)

for action in actions:
    print("\n✅ Parsed Action:\n", action)
    result = execute_todo_action(action)
    print("\n📦 Execution Result:\n", result)
```




In [16]:
def extract_all_action_blocks(text):
    pattern = r"```action\s*(.*?)```"
    matches = re.findall(pattern, text, re.DOTALL)
    if not matches:
        raise ValueError("No action blocks found.")
    return [json.loads(block.strip()) for block in matches]

actions = extract_all_action_blocks(llm_output)

for action in actions:
    print("\n✅ Parsed Action:\n", action)
    result = execute_todo_action(action)
    print("\n📦 Execution Result:\n", result)

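# NOTE: the lines below repeat the earlier single-action flow. Because todo_list
# is never reset, they append 'Write agent notebook' one more time.
# User command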
command = "Add 'Write agent notebook' and 'Take a break' to my list."

# Ask the LLM what to do
messages = build_todo_prompt(command)
llm_output = generate_response(messages)

print("\n🤖 LLM Output:\n")
print(textwrap.fill(llm_output, width=80))

# Parse and execute
action = extract_action_block(llm_output)
print("\n✅ Parsed Action:\n", action)

result = execute_todo_action(action)
print("\n📦 Execution Result:\n", result)


✅ Parsed Action:
 {'tool_name': 'add_task', 'args': {'task': 'Write agent notebook'}}

📦 Execution Result:
 ✅ Added: Write agent notebook

✅ Parsed Action:
 {'tool_name': 'add_task', 'args': {'task': 'Take a break'}}

📦 Execution Result:
 ✅ Added: Take a break

🤖 LLM Output:

```action {   "tool_name": "add_task",   "args": { "task": "Write agent
notebook" } } ```  ```action {   "tool_name": "add_task",   "args": { "task":
"Take a break" } } ```

✅ Parsed Action:
 {'tool_name': 'add_task', 'args': {'task': 'Write agent notebook'}}

📦 Execution Result:
 ✅ Added: Write agent notebook


In [17]:
todo_list

['Write agent notebook',
 'Write agent notebook',
 'Take a break',
 'Write agent notebook']
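
The duplicates come from state carried over between cells: `todo_list` already held one task from the earlier run, the loop added two more, and the repeated single-action code added a fourth. A clean run might reset the list first and execute each block exactly once. The sketch below uses a canned reply (`canned_reply` and `as_block` are hypothetical) in place of a live LLM call, with simplified versions of the notebook's helpers.

```python
import json
import re

FENCE = "`" * 3  # three backticks, built programmatically for easy embedding

todo_list = []

def add_task(task):
    todo_list.append(task)
    return f"Added: {task}"

def extract_all_action_blocks(text):
    pattern = rf"{FENCE}action\s*(.*?){FENCE}"
    matches = re.findall(pattern, text, re.DOTALL)
    if not matches:
        raise ValueError("No action blocks found.")
    return [json.loads(block.strip()) for block in matches]

def as_block(action):
    # Format a dict the way the model is asked to reply
    return f"{FENCE}action\n{json.dumps(action)}\n{FENCE}"

# Canned stand-in for the model's reply: two action blocks in one message
canned_reply = "\n".join([
    as_block({"tool_name": "add_task", "args": {"task": "Write agent notebook"}}),
    as_block({"tool_name": "add_task", "args": {"task": "Take a break"}}),
])

todo_list.clear()  # start from a known-empty state
for action in extract_all_action_blocks(canned_reply):
    if action["tool_name"] == "add_task":
        print(add_task(action["args"]["task"]))

print(todo_list)
```

Starting from an empty list, the two blocks produce exactly two tasks, in the order the model listed them.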

## 🔒 Final Example: Password Strength Checker Agent

In [18]:
# ✅ Step 1: Tool Implementation
def check_password_strength(password):
    # `re` is already imported at the top of the notebook
    score = 0
    if len(password) >= 8:
        score += 1
    if re.search(r"[A-Z]", password):
        score += 1
    if re.search(r"[a-z]", password):
        score += 1
    if re.search(r"[0-9]", password):
        score += 1
    if re.search(r"[^A-Za-z0-9]", password):
        score += 1

    if score >= 4:
        return "✅ Strong password"
    elif score == 3:
        return "⚠️ Moderate password"
    else:
        return "❌ Weak password"

# ✅ Step 2: Prompt Builder
def build_password_prompt(user_input):
    system_prompt = (
        "You are a password assistant agent. Always respond with a tool invocation "
        "in a markdown ```action block like this:\n\n"
        "```action\n"
        "{\n"
        "  \"tool_name\": \"check_password_strength\",\n"
        "  \"args\": { \"password\": \"MyCat123!\" }\n"
        "}\n"
        "```\n\n"
        "Available tools:\n"
        "- check_password_strength: takes {\"password\": str}"
    )

    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_input}
    ]

# ✅ Step 3: Parser
def extract_action_block(text):
    pattern = r"```action\s*(.*?)```"
    match = re.search(pattern, text, re.DOTALL)
    if not match:
        raise ValueError("No action block found.")
    return json.loads(match.group(1).strip())

# ✅ Step 4: Execute the Action
def execute_password_action(action):
    if action["tool_name"] == "check_password_strength":
        return check_password_strength(action["args"]["password"])
    else:
        return f"❌ Unknown tool: {action['tool_name']}"

# ✅ Step 5: Run It!
user_input = "Check if this password is strong: banana"

messages = build_password_prompt(user_input)
llm_output = generate_response(messages)

print("\n🤖 LLM Output:\n", textwrap.fill(llm_output, width=80))

action = extract_action_block(llm_output)
print("\n✅ Parsed Action:\n", action)

result = execute_password_action(action)
print("\n🔐 Password Strength:\n", result)


🤖 LLM Output:
 ```action {   "tool_name": "check_password_strength",   "args": { "password":
"banana" } } ```

✅ Parsed Action:
 {'tool_name': 'check_password_strength', 'args': {'password': 'banana'}}

🔐 Password Strength:
 ❌ Weak password
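
The scoring rule is easy to check by calling the tool logic directly, without the LLM. The sketch below restates the notebook's five checks in compact form (`password_score` and `label` are hypothetical helpers that expose the raw score and the rating separately) and applies them to a few sample passwords.

```python
import re

def password_score(password):
    # One point per rule, mirroring check_password_strength above
    checks = [
        len(password) >= 8,
        re.search(r"[A-Z]", password) is not None,
        re.search(r"[a-z]", password) is not None,
        re.search(r"[0-9]", password) is not None,
        re.search(r"[^A-Za-z0-9]", password) is not None,
    ]
    return sum(checks)

def label(score):
    # Same thresholds as the notebook: 4+ strong, 3 moderate, otherwise weak
    if score >= 4:
        return "Strong"
    if score == 3:
        return "Moderate"
    return "Weak"

for pw in ["banana", "banana12", "MyCat123!"]:
    score = password_score(pw)
    print(pw, score, label(score))
```

`banana` earns a single point (lowercase letters only), which is why the agent run above reports a weak password.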
