<a href="https://colab.research.google.com/github/micah-shull/AI_Agents/blob/main/100_TxtSummarizerAgent_00.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **1. Define the Agent’s Purpose**
>
> * One short **goal statement** — what success looks like.
> * Optional constraints (time, cost, safety, privacy).
> * Example: *“Help onboard new hires by sending welcome emails and scheduling meetings.”*

## 🎯 Agent Goal

> **Goal**: “Summarize the content of a given text file into concise bullet points.”





## 🧠 What the Agent Needs to Do (Plain English Steps)

1. **Understand the Goal**
   “I need to summarize a text file into bullet points.”

2. **Find or Choose the File**
   Figure out *which* text file to summarize. (Will we hardcode it, ask the user, or list available files?)

3. **Read the File’s Contents**
   Open the file and load its text.

4. **Summarize It**
   Turn the text into a short list of bullet points — probably by calling an LLM.

5. **Return or Save the Summary**
   Print the result, return it, or save it to another file.

6. **Track Progress (optional but recommended)**
   Log what step the agent is on for debugging and transparency.





## 🧠🦾 Mind vs Body Breakdown (LLM vs Python)

Here’s your file summarizer agent mapped onto the two roles, LLM as mind and Python as body:

| Step                   | Description              | Mind (LLM)                           | Body (Python)                    |
| ---------------------- | ------------------------ | ------------------------------------ | -------------------------------- |
| 1. Understand the goal | "Summarize a file"       | ✅ (LLM needs this to choose actions) |                                  |
| 2. Choose a file       | List or pick a file name | ✅ (LLM decides *which* to summarize) | ✅ (Python lists available files) |
| 3. Read file contents  | Load text from disk      |                                      | ✅ (Python reads the file)        |
| 4. Summarize it        | Turn text into bullets   | ✅ (LLM does this)                    |                                  |
| 5. Return/save result  | Output summary           | ✅ (LLM may decide where/how)         | ✅ (Python saves/prints it)       |
| 6. Track progress      | Log steps taken          |                                      | ✅ (Python memory logging)        |




## 🔁 Simplified Agent Design (Final Table)

| Step                   | Description                           | LLM (Mind)                  | Python (Tool/Body)                     |
| ---------------------- | ------------------------------------- | --------------------------- | -------------------------------------- |
| 1. Understand the goal | “Summarize text content into bullets” | ✅                           |                                        |
| 2. Read text from file | Get contents from a known folder      |                             | ✅ (`read_txt_file(folder, file_name)`) |
| 3. Summarize contents  | Turn text into bullet points          | ✅ (this is the agent’s job) |                                        |
| 4. Save the summary    | Save to known output folder           |                             | ✅ (`save_summary(folder, file_name, content)`) |
| 5. Track progress      | Record status of steps                |                             | ✅ (`track_progress`)                   |

---

### 🚦 Control Flow

* **User**: “Hey agent, summarize `input/article1.txt`.”
* **Agent**: “Okay, I’ll call `read_txt_file` to get it.”
* **Agent**: *Summarizes the text.*
* **Agent**: Calls `save_summary` to write it to `output/article1_summary.txt`
* **Agent**: Logs progress.






## 🧩 Step 2: Identify Required Capabilities

Here’s what this step is about (per your recipe and the handbook):

> “List the **tools** the agent needs to accomplish the goal, and any **lifecycle helpers** like `create_plan` or `track_progress`.”

---

### 🧰 Required Tools

Let’s define the **minimal toolset** your agent will use:

| Tool Name        | Purpose                                  | Notes            |
| ---------------- | ---------------------------------------- | ---------------- |
| `read_txt_file`  | Reads text from a known file path        | Stateless        |
| `save_summary`   | Saves output summary to a known location | Stateless        |
| `track_progress` | Logs steps/status updates                | From your recipe |
| `create_plan`    | Agent begins by making a plan            | From your recipe |

---

### 🧠 Required Capabilities

These are **agent lifecycle modifiers** — not tools themselves, but hooks into the loop:

| Capability Name              | Purpose                          |
| ---------------------------- | -------------------------------- |
| `PlanFirstCapability`        | Ensures agent starts by planning |
| `ProgressTrackingCapability` | Adds memory log of tool progress |
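
To make "hooks into the loop" concrete, here is a minimal sketch of how the two capabilities could wrap a tool loop. The `Capability` base class, the `on_start` / `after_tool` hook names, and `run_loop` are assumptions made up for illustration; the recipe's real classes may look different.

```python
class Capability:
    """Hypothetical base class: the loop calls these hooks at fixed points."""

    def on_start(self, memory):
        pass

    def after_tool(self, memory, tool_name, result):
        pass


class PlanFirstCapability(Capability):
    """Runs the planning function once, before any tool is called."""

    def __init__(self, create_plan):
        self.create_plan = create_plan

    def on_start(self, memory):
        memory.append({"type": "plan", "content": self.create_plan(memory)})


class ProgressTrackingCapability(Capability):
    """Adds one progress entry to memory after every tool call."""

    def after_tool(self, memory, tool_name, result):
        memory.append({"type": "progress", "content": f"{tool_name} finished"})


def run_loop(goal, steps, capabilities):
    """Toy loop: store the goal, fire start hooks, then run each tool in order."""
    memory = [{"type": "goal", "content": goal}]
    for cap in capabilities:
        cap.on_start(memory)
    for tool_name, tool in steps:
        result = tool()
        for cap in capabilities:
            cap.after_tool(memory, tool_name, result)
    return memory


memory = run_loop(
    goal="Summarize input/article1.txt",
    steps=[("read_txt_file", lambda: "raw text"), ("save_summary", lambda: "ok")],
    capabilities=[
        PlanFirstCapability(lambda mem: "read -> summarize -> save"),
        ProgressTrackingCapability(),
    ],
)
for entry in memory:
    print(entry)
```

The point of the sketch is that neither capability is something the LLM calls: both change what the loop does around the tool calls.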

---

### ✅ Result of This Step:

You now have a **capability stack** and a **tool list** to implement.





## 🩰 “Dress Rehearsal” = Design Before Code

The idea is to:

* Lay out the agent’s **moving parts** (goals, tools, memory, environment, etc.)
* Show **how they connect** (who uses what, in what order)
* Catch **confusion or overload** early — before wiring and debugging

It’s like setting the stage before the actors enter. Every prop is in its place, and everyone knows their lines.

---

## ✅ So — Which Comes First: Tools/Capabilities or Rehearsal?

You're right to pause here. Here's the answer:

> **Do both together, but only at a sketch level.**

* Start with your **tools and capabilities**: you need at least their names and jobs.
* You don’t need to fully implement them yet.
* Then, **use that list to build the scaffold**, which validates:

  * Flow order
  * Tool coverage
  * Whether the agent loop is too shallow or too deep

---

## 🎬 Agent Scaffold (Plain-English Simulation)

This is a **dry run** of what the agent will do — no code yet, just **logic**.

---

### 🎯 Starting Point

* Goal: “Summarize a file into concise bullet points”
* Input: A file path like `input/article1.txt`
* Tools: `create_plan`, `read_txt_file`, `save_summary`, `track_progress`
* Capabilities: `PlanFirstCapability`, `ProgressTrackingCapability`

---

## 🛠️ Refined Scaffold (Agent Dress Rehearsal)

1. The agent receives the **goal**:
   *“Summarize the file at `input/article1.txt` into concise bullet points.”*
   This goal is stored in memory for reference.

2. Triggered by `PlanFirstCapability`, the agent first calls `create_plan`.
   It generates a short plan (e.g., “read file → summarize → save”) and stores it in memory.

3. The agent then uses the `read_txt_file` tool, passing in the path to `input/article1.txt`.
   The file’s contents are returned and stored temporarily in state.

4. The LLM summarizes the text into 4–6 bullet points.
   This is its **main job** — natural language compression.

5. The agent then calls `save_summary`, passing the output path (`output/article1_summary.txt`) and the generated summary.
   This step writes the results to disk.

6. The agent logs its progress using `track_progress`, e.g.,
   *“Step 5 complete: summary saved successfully”*

7. The loop ends with a clear final message:
   *“Summary saved to `output/article1_summary.txt`. Task complete.”*

---

## ✅ Why This Version Works

* Highlights **memory usage** (storing goal + plan)
* Keeps LLM focused only on **text-to-summary**
* Makes the flow crystal clear
* Reflects your recipe *exactly*
* Minimal and easy to wire up





🔥 The agent doesn’t magically “know” how to start summarizing — it needs a **clear, minimal prompt** to hand to the LLM when it’s time to summarize. And if we're following clean design:

> ✅ That prompt should be generated by a **tool**, not hardcoded into the agent logic.

---

## ✨ Introducing a New Tool: `generate_summary_prompt`

| Tool Name                 | Purpose                                                               |
| ------------------------- | --------------------------------------------------------------------- |
| `generate_summary_prompt` | Creates a clean, focused prompt for the LLM to summarize a given text |

---

## 🔁 Updated Flow (With Prompt Tool)

Let’s insert it into the scaffold:

1. Agent receives the goal: “Summarize input/article1.txt”
2. Calls `create_plan` → Plan: `read → generate prompt → summarize → save → log`
3. Calls `read_txt_file` to get raw text
4. Calls `generate_summary_prompt` to create the prompt from the text
5. LLM uses that prompt to generate bullet-point summary
6. Calls `save_summary`
7. Calls `track_progress`
8. Returns final message

---

### 🔧 Why Split the Prompt Tool?

* ✅ Keeps summarization **flexible** (can tune prompt later)
* ✅ Makes LLM logic **transparent**
* ✅ Easier to test the prompt generation logic
* ✅ Reusable for other agents (e.g., summarizing PDFs, chat transcripts)
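
Because the prompt tool only turns text into a prompt string, it can be written as a pure function and tested without any model. A minimal sketch, assuming a `max_bullets` parameter and this particular prompt wording, neither of which comes from the recipe:

```python
def generate_summary_prompt(text, max_bullets=6):
    """Build a summarization prompt. No file access and no LLM call happen here."""
    cleaned = text.strip()
    if not cleaned:
        raise ValueError("Cannot build a summary prompt from empty text.")
    return (
        f"Summarize the text below into at most {max_bullets} concise bullet points.\n"
        "Use only information found in the text.\n\n"
        f"TEXT:\n{cleaned}"
    )


prompt = generate_summary_prompt("  Agents pair an LLM with tools.  ")
print(prompt)
```

Tuning the summary later means editing this one function; the agent loop and the other tools stay untouched.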







This bit of code from the **Agent Builder Handbook** is quietly doing something powerful:

### **Code Pattern: GAME Scaffolding**

```python
class AgentBlueprint:
    def __init__(self, goals, instructions, actions, memory, environment):
        self.goals = goals
        self.instructions = instructions
        self.actions = actions  # abstract definitions
        self.memory = memory
        self.environment = environment  # actual implementations

# Example GAME setup
goals = ["Summarize Python files in the repo"]
instructions = ["Be concise, skip docstrings, focus on function definitions"]
actions = ["list_files", "read_file", "write_summary"]
memory = "sliding_window(5)"
environment = "local_filesystem"

agent = AgentBlueprint(goals, instructions, actions, memory, environment)
```

---

### 🔍 What That GAME Snippet Teaches

It shows that:

* Goals are specific
* Instructions are decoupled
* Actions (tools) are modular and abstract
* Memory and environment are swappable

Your insight to add a **`generate_summary_prompt`** tool aligns *exactly* with this pattern.

---

## 🧠 What the Handbook *Implies* (and your teacher reinforced)

> ❝ Make your tools small, testable, and **LLM-friendly**. Each one should do **one thing** and do it well. ❞

In this mindset:

* `generate_summary_prompt` is a **“prep” tool** → helps the LLM think clearly.
* It becomes a **reusable mental utility** — for summarizing emails, transcripts, reports, etc.
* The LLM doesn’t need to *remember* how to construct the prompt — the environment just hands it the right one.

---

## ✅ Summary

You're:

* Following **the GAME structure** ✔️
* Reducing LLM workload ✔️
* Encouraging reuse + separation of concerns ✔️
* Building a system that will be easier to simulate, swap, test ✔️

So yes — let's lock it in:

> Add `generate_summary_prompt` to the tool list
> Insert it into the scaffold right before the LLM summary generation





# 🧰 Tools (final list)

* `create_plan` — make a tiny plan for the run
* `read_txt_file(folder, file_name)` — load raw text from a known folder
* `generate_summary_prompt(text)` — craft a minimal, reusable LLM prompt for summarizing
* `save_summary(output_folder, file_name, content)` — write the summary to a known folder
* `track_progress(step, status, note?)` — log progress/status

# 🧩 Capability stack

* `PlanFirstCapability` → ensures we call `create_plan` first
* `ProgressTrackingCapability` → captures progress entries

# 🎭 Dress rehearsal (scaffold)

1. Agent receives goal: “Summarize `input/article1.txt` into concise bullet points.” (store goal)
2. `create_plan` → plan like: *read → generate\_prompt → summarize → save → log* (store plan)
3. `read_txt_file("input", "article1.txt")` → get raw text (store in state)
4. `generate_summary_prompt(text)` → produce clean summarization prompt (store prompt)
5. LLM → generate 4–6 bullet points using that prompt (core cognition)
6. `save_summary("output", "article1_summary.txt", summary)` → persist result
7. `track_progress(step="final", status="done", note="summary saved")`
8. Return: “Summary saved to `output/article1_summary.txt`.”
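
The rehearsal above can also be run as code before any real tool exists. This sketch passes every collaborator in as a plain callable and uses dictionaries as a fake file system and a lambda as a fake LLM. `run_dress_rehearsal` and all the fakes are hypothetical stand-ins, and the planning step is left out to keep it short.

```python
def run_dress_rehearsal(file_name, read, make_prompt, llm, save, log):
    """Walk the scaffold in order; each step is a callable handed in by the caller."""
    text = read(file_name)
    log("read", "done")
    prompt = make_prompt(text)
    log("generate_prompt", "done")
    summary = llm(prompt)
    log("summarize", "done")
    out_name = file_name.rsplit(".", 1)[0] + "_summary.txt"
    save(out_name, summary)
    log("save", "done")
    return f"Summary saved to output/{out_name}."


# Fakes: dictionaries play the input and output folders, a list plays the log.
fake_input = {"article1.txt": "Agents combine an LLM with tools."}
fake_output = {}
progress = []

message = run_dress_rehearsal(
    "article1.txt",
    read=fake_input.__getitem__,
    make_prompt=lambda text: f"Summarize as bullets:\n{text}",
    llm=lambda prompt: "- " + prompt.splitlines()[-1],  # stand-in for a model call
    save=fake_output.__setitem__,
    log=lambda step, status: progress.append((step, status)),
)
print(message)
print(progress)
```

If the order of steps or the data handed between them feels wrong here, it is cheaper to fix now than after the real tools are wired in.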






## 🔧 What “Dependencies” Means

No tool exists in a vacuum: each one needs access to *something* from the agent’s context (files, memory, folders, clocks, etc.).
The trick is to **list those explicitly** so tools stay stateless and testable.

---

## 🧰 Tool Dependency Table

| Tool                                              | Purpose                     | Needs From `ActionContext`                                  | Notes                                      |
| ------------------------------------------------- | --------------------------- | ----------------------------------------------------------- | ------------------------------------------ |
| `create_plan`                                     | Make a step-by-step plan    | **Goal** (from memory)                                      | Simple: just reads the goal + instructions |
| `read_txt_file(folder, file_name)`                | Load raw text               | **Folder path**, **file name**                              | Folder should come from config, not LLM    |
| `generate_summary_prompt(text)`                   | Create summarization prompt | **Text content** (already in memory/state)                  | No external deps: a pure function (text in, prompt out) |
| `save_summary(output_folder, file_name, content)` | Save the summary            | **Output folder**, **file name**, **summary text**          | Output folder fixed/configured             |
| `track_progress(step, status, note?)`             | Log what happened           | **Memory** (to append logs), **clock** (optional timestamp) | Helpful for debugging/testing              |

---

## ✅ Key Design Notes

* **Folder paths**: never chosen by the LLM — injected once into `ActionContext`.
* **Memory**: shared so both logs + results can be tracked.
* **Clock**: optional dep for `track_progress` (great for replay/debug).
* **LLM**: only gets the **prompt + text**; everything else stays in Python.
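
The optional clock is easiest to appreciate in a test: with a fixed clock injected, log entries become fully predictable. A minimal sketch, assuming memory is a plain list and the clock is any zero-argument callable returning a `datetime` (both are assumptions, not the recipe's actual types):

```python
from datetime import datetime, timezone


def track_progress(memory, step, status, note=None, _clock=None):
    """Append one progress entry to memory; add a timestamp only if a clock is given."""
    entry = {"step": step, "status": status}
    if note is not None:
        entry["note"] = note
    if _clock is not None:
        entry["timestamp"] = _clock().isoformat()
    memory.append(entry)
    return entry


def fixed_clock():
    """Test clock that always returns the same moment."""
    return datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)


memory = []
track_progress(memory, "save", "done", note="summary saved", _clock=fixed_clock)
track_progress(memory, "final", "done")
print(memory)
```

In a real run you would inject something like `lambda: datetime.now(timezone.utc)` instead of the fixed clock.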






## 🧭 The Core Design Principle

You nailed it:

> 🧱 “We want tools to be **reusable** — so inputs like folders or file paths shouldn’t be hardcoded.”

Instead, we should:

* Provide fixed values like `folder paths` via **dependency injection**
* Pass dynamic values like `file_name` via **arguments from the LLM**

This keeps the **LLM focused on decisions**, and the **environment focused on execution**.

---

## ✅ Recommendation: Inject Folder Paths via `ActionContext.config`

### How it works:

* Store known folders in `ActionContext.config`:

  ```python
  ActionContext(config={
      "input_folder": "input/",
      "output_folder": "output/"
  })
  ```

* In your tool:

  ```python
  import os

  def read_txt_file(ctx, file_name, _input_folder):
      # file_name comes from the LLM; _input_folder is injected at runtime
      full_path = os.path.join(_input_folder, file_name)
      ...
  ```

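* `_input_folder` is injected automatically from the matching `input_folder` entry in `ActionContext.deps` (the leading underscore marks a parameter the runtime fills in, so the LLM never sees it). That means the folder must also be registered there:

  ```python
  deps={"input_folder": "input/"}  # from ActionContext.deps
  ```

OR, if you prefer, skip the injected parameter and read `ctx.config["input_folder"]` inside the tool.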

Either way:

* **`file_name`** = LLM chooses it
* **`folder path`** = injected at runtime

---

### 🧪 Example Call

Agent wants to read a file:

```json
{
  "tool": "read_txt_file",
  "arguments": {
    "file_name": "article1.txt"
  }
}
```

Then:

* `read_txt_file(ctx, file_name, _input_folder)`
* Tool constructs: `input/article1.txt` internally
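
How the injection step might work under the hood: the runtime inspects the tool's signature, fills every `_`-prefixed parameter from `deps`, and passes the rest through from the LLM's arguments. `call_tool` below is a hypothetical helper written for this sketch, not the recipe's dispatcher, and the demo `read_txt_file` returns the path it would open instead of reading a file.

```python
import inspect
import os


def read_txt_file(ctx, file_name, _input_folder):
    """Demo version: return the path that would be read."""
    return os.path.join(_input_folder, file_name)


def call_tool(tool, ctx, llm_args, deps):
    """Call `tool` with the LLM's arguments plus injected `_`-prefixed parameters."""
    if any(name.startswith("_") for name in llm_args):
        raise ValueError("LLM arguments may not set injected parameters.")
    kwargs = dict(llm_args)
    for name, param in inspect.signature(tool).parameters.items():
        if not name.startswith("_"):
            continue
        key = name[1:]  # "_input_folder" is looked up as "input_folder"
        if key in deps:
            kwargs[name] = deps[key]
        elif param.default is inspect.Parameter.empty:
            raise KeyError(f"Missing dependency: {key}")
    return tool(ctx, **kwargs)


tool_call = {"tool": "read_txt_file", "arguments": {"file_name": "article1.txt"}}
path = call_tool(read_txt_file, None, tool_call["arguments"], {"input_folder": "input"})
print(path)
```

Rejecting `_`-prefixed keys in the LLM's arguments keeps the model from overriding a folder you configured.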

---

## 🔁 TL;DR — Final Rule

* ✅ **LLM provides**: file names, summary content, etc.
* ✅ **You provide**: config (folder paths), injected via `ActionContext`
* 🔁 Makes tools **testable**, **swappable**, and **clean**






## 🪛 Step 4: Define Tool Interfaces

> "Write the signature and schema of each tool — so the agent knows what to call, and Python knows what to inject."

We'll do **two things per tool**:

1. Write the **Python signature** (with `ctx` + injected deps)
2. Write the **LLM-facing schema** (JSON-style: name + args)

---

### 🧰 Tool 1: `create_plan`

* **Signature**:

  ```python
  def create_plan(ctx):
      ...
  ```

* **Schema**:

  ```json
  {
    "name": "create_plan",
    "description": "Create a short plan for completing the goal.",
    "parameters": {}
  }
  ```

---

### 🧰 Tool 2: `read_txt_file`

* **Signature**:

  ```python
  def read_txt_file(ctx, file_name, _input_folder):
      ...
  ```

* **Schema**:

  ```json
  {
    "name": "read_txt_file",
    "description": "Read a text file from the input folder.",
    "parameters": {
      "file_name": {
        "type": "string",
        "description": "The name of the file to read (e.g., article1.txt)"
      }
    }
  }
  ```

---

### 🧰 Tool 3: `generate_summary_prompt`

* **Signature**:

  ```python
  def generate_summary_prompt(ctx, text):
      ...
  ```

* **Schema**:

  ```json
  {
    "name": "generate_summary_prompt",
    "description": "Create a prompt for summarizing the given text.",
    "parameters": {
      "text": {
        "type": "string",
        "description": "The raw text to summarize"
      }
    }
  }
  ```

---

### 🧰 Tool 4: `save_summary`

* **Signature**:

  ```python
  def save_summary(ctx, file_name, content, _output_folder):
      ...
  ```

* **Schema**:

  ```json
  {
    "name": "save_summary",
    "description": "Save the summary to the output folder.",
    "parameters": {
      "file_name": {
        "type": "string",
        "description": "The output file name (e.g., article1_summary.txt)"
      },
      "content": {
        "type": "string",
        "description": "The text content to save"
      }
    }
  }
  ```

---

### 🧰 Tool 5: `track_progress`

* **Signature**:

  ```python
  def track_progress(ctx, step, status, note=None, _clock=None):
      ...
  ```

* **Schema**:

  ```json
  {
    "name": "track_progress",
    "description": "Log agent progress at each step.",
    "parameters": {
      "step": { "type": "string", "description": "Step name" },
      "status": { "type": "string", "description": "Status or result" },
      "note": { "type": "string", "description": "Optional extra info", "optional": true }
    }
  }
  ```

---

## 🧩 Recap: What We Just Did

You now have:

* **Python signatures** — with `ctx` and injected deps
* **LLM schemas** — names + parameter types

You’re ready to:

> **Step 5: Implement the tools**
> But slowly — one at a time, and testable.
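
As a preview of Step 5, here is one way the two file tools could be filled in behind the signatures above and checked in a throwaway directory. The bare-file-name check is an extra safeguard added in this sketch (it stops a name like `../secret.txt` from escaping the configured folder); it is an assumption, not something the recipe specifies.

```python
import os
import tempfile


def _require_bare_name(file_name):
    """Reject anything that carries a directory component."""
    if os.path.basename(file_name) != file_name:
        raise ValueError("file_name must be a bare file name, not a path.")


def read_txt_file(ctx, file_name, _input_folder):
    """Return the text of `file_name` inside the injected input folder."""
    _require_bare_name(file_name)
    with open(os.path.join(_input_folder, file_name), encoding="utf-8") as f:
        return f.read()


def save_summary(ctx, file_name, content, _output_folder):
    """Write `content` into the injected output folder and return the path."""
    _require_bare_name(file_name)
    os.makedirs(_output_folder, exist_ok=True)
    path = os.path.join(_output_folder, file_name)
    with open(path, "w", encoding="utf-8") as f:
        f.write(content)
    return path


# Quick check: create an input file, read it, save a summary, read it back.
with tempfile.TemporaryDirectory() as root:
    input_dir = os.path.join(root, "input")
    output_dir = os.path.join(root, "output")
    os.makedirs(input_dir)
    with open(os.path.join(input_dir, "article1.txt"), "w", encoding="utf-8") as f:
        f.write("Agents combine an LLM with tools.")

    text = read_txt_file(None, "article1.txt", input_dir)
    saved_path = save_summary(None, "article1_summary.txt", "- " + text, output_dir)
    with open(saved_path, encoding="utf-8") as f:
        saved_text = f.read()

print(saved_text)
```

`ctx` is unused in both tools here, so the check passes `None`; a fuller version might log through it.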




## ❓1. Why do we break it into “signature” and “schema”?

### ✍️ **Signature**

* The **Python function signature** (e.g., `def read_txt_file(ctx, file_name, _input_folder)`) defines how the tool is written and how it runs **in code**.
* This is what the **Agent runtime** uses when executing the tool.

### 🧠 **Schema**

* The **Schema** (like your JSON block) is how the **LLM understands the tool**.
* It’s what gets registered with the LLM as part of the agent's `actions`/tools.

### 🧩 Think of it like this:

| Aspect    | Used By | Purpose                                                          |
| --------- | ------- | ---------------------------------------------------------------- |
| Signature | Python  | How the tool runs in real life                                   |
| Schema    | LLM     | How the agent knows what it can call, and what arguments to pass |

So we need both:

* One for the **runtime**
* One for the **language model interface**

---

## ❓2. What is `ctx` in `def create_plan(ctx)`?

> `ctx` = short for **ActionContext**

It’s the context object passed into every tool. It gives the tool access to:

* Memory
* Config
* Dependencies
* Filesystem
* Clock
* Scratchpad state
* Logs
* Anything shared across tools

It’s like the **backstage pass** for tools.

You never want your tools to go off grabbing global variables or hardcoded things — so you hand them everything they need via `ctx` (or via explicitly injected parameters like `_input_folder`).

---

## ❓3. Are those schemas based on OpenAI’s API tool format?

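Mostly, with one caveat.

You're looking at a schema that is:

* ✅ Modeled on OpenAI’s function-calling (`tools`) format: a `name`, a `description`, and `parameters`
* ⚠️ Simplified: OpenAI expects `parameters` to be a JSON Schema object (`"type": "object"` with `properties` and a `required` list), and `"optional": true` is not a JSON Schema keyword
* ✅ Exactly what’s used in the **agent recipe**
* ✅ What the `LLMFunctionCaller` in the recipe expects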

This structure:

```json
{
  "name": "generate_summary_prompt",
  "description": "...",
  "parameters": {
    ...
  }
}
```

…is the common outer shape. You can test with OpenAI (once `parameters` is wrapped as JSON Schema) or use mock LLMs during local dev.
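
OpenAI's Chat Completions `tools` parameter expects each tool's `parameters` to be a JSON Schema object, so the simplified schemas in this notebook need a small wrapper before they are sent to the API. A minimal sketch: `to_openai_tool` is a hypothetical helper, and it assumes every parameter is required unless marked `"optional": true`, as in the `track_progress` schema.

```python
def to_openai_tool(simple_schema):
    """Wrap a simplified tool schema in the JSON Schema shape used for OpenAI tools."""
    properties = {}
    required = []
    for name, spec in simple_schema["parameters"].items():
        spec = dict(spec)  # copy so the caller's schema is left unchanged
        optional = spec.pop("optional", False)
        properties[name] = spec
        if not optional:
            required.append(name)
    return {
        "type": "function",
        "function": {
            "name": simple_schema["name"],
            "description": simple_schema["description"],
            "parameters": {
                "type": "object",
                "properties": properties,
                "required": required,
            },
        },
    }


track_progress_schema = {
    "name": "track_progress",
    "description": "Log agent progress at each step.",
    "parameters": {
        "step": {"type": "string", "description": "Step name"},
        "status": {"type": "string", "description": "Status or result"},
        "note": {"type": "string", "description": "Optional extra info", "optional": True},
    },
}

tool = to_openai_tool(track_progress_schema)
print(tool["function"]["parameters"]["required"])
```

The resulting dictionaries can be collected into the list passed as `tools=[...]`; a mock LLM can keep using the simplified form.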

---

## ✅ TL;DR Summary

| Concept       | Meaning                                                                           |
| ------------- | --------------------------------------------------------------------------------- |
| `signature`   | Python definition for how the tool is called at runtime                           |
| `schema`      | JSON structure the LLM uses to decide how to call tools                           |
| `ctx`         | The injected ActionContext — gives tools access to shared config, memory, and env |
| Schema format | Modeled on OpenAI's tool format (with simplified `parameters`) and what the agent recipe expects |





> 🧳 **ActionContext = the agent’s backpack.**

Let’s break it down in detail to reinforce:

---

## 🎒 What is `ActionContext`?

It’s a single object that:

* Holds everything the agent (and its tools) might need during execution.
* Gets passed into **every tool** call.
* Is **injected fresh** for each agent run — so agents don’t share backpacks.

---

## 🧩 What goes inside the backpack?

| Key       | What it holds                                         | Who sets it              |
| --------- | ----------------------------------------------------- | ------------------------ |
| `config`  | Static config (like folder paths, model IDs)          | You (the agent designer) |
| `memory`  | Persistent state — plan, goal, logs, last tool output | Agent runtime            |
| `deps`    | Injected dependencies (e.g., `input_folder`, `clock`) | You                      |
| `llm`     | Reference to an LLM tool (if one is used)             | You                      |
| `scratch` | Temporary runtime data                                | Agent + tools            |
| `clock`   | Optional timestamp system                             | You                      |
| `logger`  | Logging hook                                          | Agent system or you      |

---

## 🔁 Is ActionContext shared?

* **The `ActionContext` class is reusable** — same code for all agents.
* But each **instance is unique per agent run.**
* So:

  * ✅ **Agent A and Agent B have their own contexts**
  * ✅ Even multiple *runs* of the same agent get their own context

---

## 🧠 Why This Is Smart

It gives you:

* ✅ **Isolation**: agents don’t interfere with each other
* ✅ **Modularity**: tools stay pure, take only what they need
* ✅ **Debuggability**: you can snapshot the whole run from the context
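
The isolation claim is easy to check with a stripped-down context. This sketch uses a dataclass with only four of the fields from the table; the real `ActionContext` in the recipe may be built differently. `new_context` is a hypothetical factory.

```python
from dataclasses import dataclass, field


@dataclass
class ActionContext:
    """Minimal stand-in: each instance gets its own dicts and list."""

    config: dict = field(default_factory=dict)
    memory: list = field(default_factory=list)
    deps: dict = field(default_factory=dict)
    scratch: dict = field(default_factory=dict)


def new_context():
    """Build a fresh context for one agent run."""
    return ActionContext(config={"input_folder": "input/", "output_folder": "output/"})


run_a = new_context()
run_b = new_context()
run_a.memory.append({"step": "read", "status": "done"})

print(run_a.memory)
print(run_b.memory)
```

`default_factory` matters here: a shared mutable default would quietly leak one run's memory into the next.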

---

## 🧪 Want to See One?

Here’s what a sample `ActionContext` might look like when building your summarizer:

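```python
# ActionContext, ScratchMemory, Clock, and openai_chat_model are assumed to be
# defined elsewhere (by the agent recipe or your own setup).
ctx = ActionContext(
    config={
        "input_folder": "input/",
        "output_folder": "output/"
    },
    memory=ScratchMemory(),
    deps={
        "clock": Clock.now,
        # Folders repeated here so they can be injected as _input_folder / _output_folder
        "input_folder": "input/",
        "output_folder": "output/"
    },
    llm=openai_chat_model,
)
```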

Then you just pass `ctx` to the agent and it flows into every tool.


