<a href="https://colab.research.google.com/github/micah-shull/AI_Agents/blob/main/107_Meeting_Notes_Agent.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## GOAL

In [35]:
goal = """
Extract action items from a meeting-notes .txt file —
capturing action, owner, and deadline
(verbatim phrase + ISO date when inferable) —
then save the results to JSON and CSV in the output folder.
""".strip()

## IMPORTS


In [36]:
# ╔══════════════════════════════════════════════════════════════════════════════╗
# ║ SETUP: Notebook Environment (Optional)                                       ║
# ╚══════════════════════════════════════════════════════════════════════════════╝
# For Colab or Notebook use only. Use pip install if dependencies are not pre-installed.
!pip install -q openai python-dotenv


# ╔══════════════════════════════════════════════════════════════════════════════╗
# ║ IMPORTS                                                                      ║
# ╚══════════════════════════════════════════════════════════════════════════════╝
# Standard Library
import builtins  # Referenced explicitly so RealFS uses the real open(), even if `open` is shadowed
import csv
import inspect
import json
import os
import re
import textwrap
import time
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Any, Callable, Dict, Optional
# External Libraries
from dotenv import load_dotenv
from openai import OpenAI


# ╔══════════════════════════════════════════════════════════════════════════════╗
# ║ OPENAI CLIENT SETUP                                                          ║
# ╚══════════════════════════════════════════════════════════════════════════════╝
# Load secrets from .env — avoid hardcoding API keys!
load_dotenv("/content/API_KEYS.env")
api_key = os.getenv("OPENAI_API_KEY")

if not api_key:
    raise RuntimeError("OPENAI_API_KEY not found. Please check your .env file.")

# Initialize OpenAI client
client = OpenAI(api_key=api_key)

## MEETING NOTES

In [37]:
notes = {
    "meeting_notes_1.txt": """
        Marketing Sync — August 21, 2025

        Attendees: Sarah L., Tom H., Raj P.

        Topics Covered:
        - Campaign performance from Q3
        - Influencer pipeline updates
        - Email open rates dropped 8%

        Action Items:
        - Sarah to follow up with Design on banner asset refresh. (Due: Aug 25)
        - Tom will run A/B test on email subject lines. (No due date mentioned)
        - Raj to source 3 new influencers by next Friday.

        Other Notes:
        - Consider exploring TikTok ad spend next month.
 """,
    "meeting_notes_2.txt": """
        Engineering Standup - 08/21

        In attendance: Jesse, Lin, Martin, Ahmed

        Progress:
        - Jesse completed the logging refactor.
        - Lin is 70% through the auth system rewrite.

        Action Items:
        - Martin to finalize database migration checklist (by Thursday)
        - Ahmed needs to resolve flaky test on signup flow (ASAP)

        Blocked:
        - Auth system cannot be merged until migration is complete.
""",
    "meeting_notes_3.txt": """
        Client Kickoff: Acme Corp - August 20th

        Team: Chloe (PM), Luis (Eng), Danielle (Design)

        Summary:
        - Acme is aiming for a soft launch in late September.
        - They expect weekly updates and a shared task board.

        Action Items:
        - Chloe to set up shared Trello board by tomorrow EOD.
        - Luis will prepare system architecture diagram for Monday's call.
        - Danielle to send first design draft by next Wednesday.

        Open Questions:
        - Will Acme provide final copy or do we need to draft placeholders?
""",
}

input_folder = "/content/files"
os.makedirs(input_folder, exist_ok=True)

for name, content in notes.items():
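    # Write as UTF-8 explicitly: the notes contain non-ASCII characters and are read back as UTF-8
    with open(os.path.join(input_folder, name), "w", encoding="utf-8") as f: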
        f.write(textwrap.dedent(content).strip() + "\n")


print("Files saved!")


Files saved!


## STANDARD RESULT ENVELOPE

In [38]:
# ╔══════════════════════════════════════════════════════════════════════════════╗
# ║ STANDARD RESULT ENVELOPE                                                     ║
# ╚══════════════════════════════════════════════════════════════════════════════╝
# These functions define a common "contract" between tools and agent logic.

def ok(**data):
    """
    Return a successful result in standardized format.
    This helps agents reason about output consistently.
    """
    return {"ok": True, **data}

def err(msg, hint=None, retryable=False, **extra):
    """
    Return a failure result with optional hints and retryable flag.
    Ensures consistent structure for error handling and debugging.
    """
    out = {"ok": False, "error": msg, "retryable": retryable}
    if hint:
        out["hint"] = hint
    if extra:
        out.update(extra)
    return out


# ╔══════════════════════════════════════════════════════════════════════════════╗
# ║ FILESYSTEM ADAPTER (for underscore-DI: _fs)                                  ║
# ╚══════════════════════════════════════════════════════════════════════════════╝

class RealFS:
    """
    A pluggable filesystem adapter.
    Enables tools to work with files via dependency injection (_fs),
    allowing flexibility for testing or alternate storage layers.
    """
    path = os.path               # Standard path operations (e.g., join, exists)
    makedirs = staticmethod(os.makedirs)  # Create directories
    open = staticmethod(builtins.open)    # File open (read/write)
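
Because tools receive the filesystem through `_fs`, a test can swap in an in-memory double instead of touching disk. The sketch below is hypothetical: `InMemoryFS` is not part of this notebook, and it only covers the three attributes that `RealFS` exposes (`path`, `makedirs`, `open`), assuming tools always call `open` inside a `with` block.

```python
import io
import json
import posixpath
from contextlib import contextmanager


class InMemoryFS:
    """Hypothetical stand-in for RealFS: same attribute names, no disk access."""

    path = posixpath  # join/basename/splitext with POSIX semantics

    def __init__(self):
        self.files = {}    # path -> text content
        self.dirs = set()  # directories "created" so far

    def makedirs(self, name, exist_ok=False):
        if name in self.dirs and not exist_ok:
            raise FileExistsError(name)
        self.dirs.add(name)

    @contextmanager
    def _open_for_write(self, path):
        # Collect writes in a buffer and store the text when the block exits.
        buf = io.StringIO()
        try:
            yield buf
        finally:
            self.files[path] = buf.getvalue()

    def open(self, path, mode="r", **_unused):
        # Extra keyword arguments (encoding, newline) are accepted and ignored.
        if "w" in mode:
            return self._open_for_write(path)
        if path not in self.files:
            raise FileNotFoundError(path)
        return io.StringIO(self.files[path])


fs = InMemoryFS()
fs.makedirs("/out", exist_ok=True)
target = fs.path.join("/out", "meeting_notes_1_actions.json")
with fs.open(target, "w", encoding="utf-8") as f:
    json.dump({"actions": []}, f)

print(fs.files[target])
```

Passing an instance like this as the `fs` dependency would let the save tools be exercised without creating files.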


## MEMORY & ACTION CONTEXT

In [39]:
# ╔══════════════════════════════════════════════════════════════════════════════╗
# ║ MEMORY & ACTION CONTEXT                                                      ║
# ╚══════════════════════════════════════════════════════════════════════════════╝

class ScratchMemory:
    """
    Minimal in-memory key/value store for agent state.
    Enables steps to share information and persist outputs.
    """
    def __init__(self):
        self.store = {}

    def get(self, key, default=None):
        return self.store.get(key, default)

    def set(self, key, value):
        self.store[key] = value

# ── Valid progress states for tracking tool execution ──────────────────────────
VALID_STATUSES = {"started", "completed", "error"}


class ActionContext:
    """
    Agent context object — shared across all tools and steps.

    Attributes:
    - memory:   scratchpad state (shared step-to-step)
    - llm:      LLM wrapper for completions
    - config:   runtime config like folder paths or model names
    - deps:     injectable dependencies (_fs, _clock, etc.)

    Also handles centralized progress logging and status checks.
    """
    def __init__(self, memory, llm, config=None, deps=None):
        self.memory = memory
        self.llm = llm
        self.config = config or {}
        self.deps = deps or {}

    # ── Progress tracking: step lifecycle states ───────────────────────────────
    def track_progress(self, step, status, note=""):
        if status not in VALID_STATUSES:
            raise ValueError(f"Invalid status '{status}'. Use {VALID_STATUSES}.")
        log = self.memory.get("progress_log", [])
        log.append({
            "step": step,
            "status": status,
            "note": note,
            "time": time.strftime("%Y-%m-%d %H:%M:%S"),
        })
        self.memory.set("progress_log", log)

    def print_progress(self):
        log = self.memory.get("progress_log", [])
        print("\n📊 Progress Log:")
        for e in log:
            t = f" ({e.get('time')})" if e.get("time") else ""
            note = f" — {e['note']}" if e['note'] else ""
            print(f"- [{e['status']}] {e['step']}{t}{note}")

    def last_completed_step(self):
        log = self.memory.get("progress_log", [])
        for e in reversed(log):
            if e.get("status") == "completed":
                return e.get("step")
        return None

    def first_error(self):
        log = self.memory.get("progress_log", [])
        for e in log:
            if e.get("status") == "error":
                return e
        return None
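
The progress log records enough to resume a run after a failure. The helper below is a hypothetical sketch, not part of this notebook: `remaining_steps` assumes log entries have the shape written by `track_progress` and that a plan is a list of `(tool_name, kwargs)` tuples.

```python
def remaining_steps(progress_log, planned_steps):
    """Return the planned steps starting at the first one that never completed."""
    completed = {
        entry["step"] for entry in progress_log if entry.get("status") == "completed"
    }
    for index, (tool_name, _kwargs) in enumerate(planned_steps):
        if tool_name not in completed:
            return planned_steps[index:]
    return []


# Example log: the first step finished, the second one failed.
log = [
    {"step": "create_plan", "status": "started", "note": ""},
    {"step": "create_plan", "status": "completed", "note": "Plan created from goal."},
    {"step": "read_txt_file", "status": "started", "note": ""},
    {"step": "read_txt_file", "status": "error", "note": "File not found"},
]
plan = [
    ("create_plan", {}),
    ("read_txt_file", {"file_name": "meeting_notes_1.txt"}),
    ("parse_meeting_date", {}),
]

todo = remaining_steps(log, plan)
print([name for name, _ in todo])  # ['read_txt_file', 'parse_meeting_date']
```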

## LLM WRAPPER

In [40]:
# ╔══════════════════════════════════════════════════════════════════════════════╗
# ║ LLM WRAPPER                                                                 ║
# ╚══════════════════════════════════════════════════════════════════════════════╝

class OpenAILLM:
    """
    Wrapper for OpenAI chat models.
    Provides a simplified `.complete(prompt)` interface for agent tools.
    """
    def __init__(self, client, model="gpt-4o-mini", temperature=0.2):
        self.client = client
        self.model = model
        self.temperature = temperature

    def complete(self, prompt: str, **kwargs) -> str:
        """
        Send a simple prompt to the model and return the reply text.
        Only the `temperature` keyword is honored; other kwargs are ignored.
        """
        temp = kwargs.get("temperature", self.temperature)
        try:
            response = self.client.chat.completions.create(
                model=self.model,
                messages=[{"role": "user", "content": prompt}],
                temperature=temp,
            )
            return response.choices[0].message.content
        except Exception as e:
            # Chain the original exception so the full traceback is preserved
            raise RuntimeError(f"LLM call failed: {type(e).__name__}: {e}") from e
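
Since tools only depend on a `.complete(prompt, **kwargs)` method, an offline stand-in can replace the wrapper in tests. This is a minimal sketch under that assumption; `FakeLLM` is a hypothetical name, not something the notebook defines.

```python
class FakeLLM:
    """Hypothetical offline stand-in with the same .complete(prompt, **kwargs) method."""

    def __init__(self, replies):
        self._replies = list(replies)  # canned answers, returned in order
        self.prompts = []              # every prompt received, for later inspection

    def complete(self, prompt, **kwargs):
        self.prompts.append(prompt)
        if not self._replies:
            raise RuntimeError("FakeLLM has no replies left")
        return self._replies.pop(0)


llm = FakeLLM(['{"actions": []}'])
reply = llm.complete("Extract action items", temperature=0.0)
print(reply)  # {"actions": []}
```

Recording the prompts makes it possible to assert on what a tool actually sent to the model.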


## TOOLS

In [51]:
# ╔══════════════════════════════════════════════════════════════════════════════╗
# ║ TOOL: create_plan                                                           ║
# ╚══════════════════════════════════════════════════════════════════════════════╝
def create_plan(ctx):
    """
    Converts a high-level goal into a step-by-step plan using the LLM.
    Saves the plan in ctx.memory['plan'] for future steps.
    """
    goal = ctx.memory.get("goal")
    if not goal:
        return err("No goal provided (memory key 'goal' missing).",
                   hint="Set ctx.memory['goal'] before calling create_plan")

    prompt = f"""
    You are a senior step-planning expert.

    Task: Turn the GOAL into a concise, ordered plan that is tool-agnostic.

    Principles:
    - One concrete action per step, lead with an imperative verb.
    - Order by dependency; gather missing inputs early.
    - After transformations, include a brief verification step.
    - End with a handoff/deliverable step when applicable.

    Constraints:
    - 5–9 steps total.
    - No implementation details (no code, APIs, models, regex, etc.).
    - Be specific about outcomes (what will exist after each step).

    Output:
    - NUMBERED LIST ONLY, one step per line.

    GOAL:
    \"\"\"{goal}\"\"\"
    """.strip()


    raw = ctx.llm.complete(prompt).strip()

    # Prefer numbered format (e.g. 1. ..., 2) ...)
    numbered = re.findall(r'^\s*(?:\d+[\).\s-]+)\s*(.+)$', raw, flags=re.M)

    if numbered:
        steps = numbered
    else:
        # Fallback: bullets (-, *, •)
        bullets = re.findall(r'^\s*[-*•]\s+(.+)$', raw, flags=re.M)
        steps = bullets if bullets else [ln.strip() for ln in raw.splitlines() if ln.strip()]

    # Normalize and deduplicate steps
    clean_steps = []
    seen = set()
    for step in steps:
        norm = re.sub(r'\s+', ' ', step).strip(' .')
        if norm and norm.lower() not in seen:
            seen.add(norm.lower())
            clean_steps.append(norm)

    if not clean_steps:
        return err("Planner returned no usable steps.",
                   hint="Try refining the goal or loosening parsing rules")

    ctx.memory.set("plan", clean_steps)
    return ok(message="Plan created from goal.", steps=clean_steps)


def build_pipeline_from_goal(ctx):
    """
    Build a safe, deterministic pipeline for the Meeting Note Action Extractor
    based on the current goal + memory (file_name).
    Stores a list of (tool_name, kwargs) tuples in ctx.memory['planned_steps'].
    """
    goal = ctx.memory.get("goal")
    if not goal:
        return err("No goal found in memory.", hint="Set ctx.memory['goal'] first.")

    file_name = ctx.memory.get("file_name") or "meeting_notes_1.txt"
    steps = [
        ("create_plan", {}),
        ("read_txt_file", {"file_name": file_name}),
        ("parse_meeting_date", {}),
        ("build_extraction_prompt", {}),
        ("extract_actions", {}),
        ("save_actions_json", {}),
        ("save_actions_csv", {}),
    ]
    ctx.memory.set("planned_steps", steps)
    return ok(message="Pipeline built from goal.", steps=[{"tool": n, "args": a} for n, a in steps])

def list_txt_files(ctx):
    folder = ctx.config.get("input_folder")
    if not folder or not os.path.isdir(folder):
        return err("Invalid input_folder.")
    files = sorted([f for f in os.listdir(folder) if f.endswith(".txt")])
    ctx.memory.set("available_txt_files", files)
    return ok(message=f"{len(files)} .txt file(s) found.", files=files)

def read_txt_file(ctx, file_name):
    """
    Reads a .txt file from the configured input folder.
    Stores raw text and filename in memory.
    """
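    folder = ctx.config.get("input_folder")
    if not folder:
        # abspath("") resolves to the working directory, so check the raw config value
        return err("No input_folder in config.", hint="Set ctx.config['input_folder']")
    base = os.path.abspath(folder)
    path = os.path.abspath(os.path.join(base, file_name))

    # Reject names that resolve outside input_folder (e.g., "../secrets.txt")
    if not path.startswith(base + os.sep):
        return err("Path traversal blocked.", retryable=False)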

    if not os.path.exists(path):
        return err(f"File not found: {path}",
                   hint="Call list_txt_files to see available files",
                   retryable=True)

    with open(path, "r", encoding="utf-8") as f:
        text = f.read()

    ctx.memory.set("file_name", file_name)
    ctx.memory.set("raw_text", text)
    return ok(message="File read successfully.", length=len(text))

# ============================
# MEETING NOTE ACTION EXTRACTOR TOOLS
# ============================

# --- Helper: (lightweight) deadline normalizer for common phrases ---
WEEKDAYS = ["monday","tuesday","wednesday","thursday","friday","saturday","sunday"]

def _next_weekday(from_dt: datetime, target_name: str) -> datetime:
    target = WEEKDAYS.index(target_name.lower())
    delta = (target - from_dt.weekday() + 7) % 7
    delta = 7 if delta == 0 else delta
    return from_dt + timedelta(days=delta)

def normalize_deadline_text(deadline_text: str, now: datetime | None = None) -> dict:
    """
    Try to convert common natural language into an ISO8601 date.
    Returns {"text": original, "iso": optional ISO date string}
    """
    if not deadline_text:
        return {"text": "", "iso": None}
    txt = deadline_text.strip()
    now = now or datetime.now()

    # Absolute-like patterns: Aug 25, 2025 ; Aug 25 ; August 20th ; 08/21
    # Strip ordinal suffixes only when they follow a digit, so "August" stays intact.
    cleaned = re.sub(r"(?<=\d)(?:st|nd|rd|th)\b", "", txt)
    for fmt in ["%b %d, %Y", "%B %d, %Y", "%m/%d/%Y", "%b %d", "%B %d", "%m/%d"]:
        try:
            if "%Y" in fmt:
                dt = datetime.strptime(cleaned, fmt)
            else:
                # Year missing: parse with the reference year appended
                dt = datetime.strptime(f"{cleaned} {now.year}", f"{fmt} %Y")
                # If that date has already passed, assume next year
                if dt.date() < now.date():
                    dt = dt.replace(year=now.year + 1)
            return {"text": txt, "iso": dt.date().isoformat()}
        except ValueError:
            pass

    # Relative: tomorrow, today, EOD, ASAP, next Friday, by Thursday, Monday
    low = txt.lower()

    if "asap" in low:
        return {"text": txt, "iso": None}
    if "tomorrow" in low:
        dt = now + timedelta(days=1)
        return {"text": txt, "iso": dt.date().isoformat()}
    if "today" in low:
        return {"text": txt, "iso": now.date().isoformat()}

    # "next <weekday>"
    m = re.search(r"next\s+(monday|tuesday|wednesday|thursday|friday|saturday|sunday)", low)
    if m:
        dt = _next_weekday(now, m.group(1))
        return {"text": txt, "iso": dt.date().isoformat()}

    # "by <weekday>" or just "<weekday>"
    m = re.search(r"(?:by\s+)?(monday|tuesday|wednesday|thursday|friday|saturday|sunday)\b", low)
    if m:
        dt = _next_weekday(now - timedelta(days=1), m.group(1))  # handle "by Thursday" as the nearest upcoming
        return {"text": txt, "iso": dt.date().isoformat()}

    # EOD hints: we leave date empty unless paired with a day word; keep raw text
    return {"text": txt, "iso": None}

def parse_meeting_date(ctx):
    """
    Tries to detect a meeting date in ctx.memory['raw_text'] and stores
    ctx.memory['reference_date'] = datetime for downstream use.
    Supported formats: 'August 21, 2025', 'August 20th', '08/21', '08/21/2025'
    """
    text = ctx.memory.get("raw_text", "") or ""
    if not text:
        return err("No raw_text found.", hint="Run read_txt_file first.")

    # 1) Month-name formats
    m = re.search(r'\b(January|February|March|April|May|June|July|August|September|October|November|December)\s+(\d{1,2})(?:st|nd|rd|th)?(?:,\s*(\d{4}))?\b', text, re.I)
    if m:
        month, day, year = m.group(1), int(m.group(2)), m.group(3)
        y = int(year) if year else datetime.now().year
        try:
            dt = datetime.strptime(f"{month} {day} {y}", "%B %d %Y")
            ctx.memory.set("reference_date", dt)
            return ok(message=f"Reference date set to {dt.date().isoformat()}")
        except ValueError:
            pass

    # 2) Numeric formats like 08/21 or 08/21/2025
    m = re.search(r'\b(\d{1,2})/(\d{1,2})(?:/(\d{2,4}))?\b', text)
    if m:
        mm, dd, yy = int(m.group(1)), int(m.group(2)), m.group(3)
        y = int(yy) if yy else datetime.now().year
        if y < 100:  # 2-digit year
            y += 2000
        try:
            dt = datetime(year=y, month=mm, day=dd)
            ctx.memory.set("reference_date", dt)
            return ok(message=f"Reference date set to {dt.date().isoformat()}")
        except ValueError:
            pass

    # Fallback to "today"
    dt = datetime.now()
    ctx.memory.set("reference_date", dt)
    return ok(message=f"No explicit meeting date found. Using today: {dt.date().isoformat()}")


def build_extraction_prompt(ctx):
    """
    Create a strict JSON-only prompt to extract action items, owners, and deadlines.
    Prefers a meeting-based reference date (ctx.memory['reference_date']) for resolving
    relative deadlines; falls back to RUN_NOW (if defined) or datetime.now().
    Saves 'extraction_prompt' in memory.
    """
    note = ctx.memory.get("raw_text")
    if not note:
        return err("No raw_text in memory.", hint="Run read_txt_file first.")

    # Choose the reference date for resolving relative phrases like "next Friday"
    ref_dt = ctx.memory.get("reference_date")  # set by parse_meeting_date
    try:
        fixed_now = RUN_NOW  # optional: define RUN_NOW in Setup for reproducible runs
    except NameError:
        fixed_now = None
    base_dt = ref_dt or fixed_now or datetime.now()
    today_str = base_dt.date().isoformat()

    prompt = f"""
You are an information extraction system. From the meeting note below, extract ACTION ITEMS.

Return STRICT JSON ONLY with this shape:
{{
  "actions": [
    {{
      "action": "do something",
      "owner": "Full Name or short name",
      "deadline_text": "verbatim deadline phrase if any, else empty",
      "deadline_iso": "YYYY-MM-DD if you can infer a concrete date, else null"
    }}
  ]
}}

Rules:
- Only include genuine action items (tasks someone must do).
- Prefer a single, primary owner. If none is obvious, set owner to "".
- Use concise action phrasing (start with a verb).
- Copy the deadline phrase exactly into deadline_text when present.
- If the phrase implies a clear date (e.g., "by Thursday", "next Friday", "Aug 25"), attempt a concrete YYYY-MM-DD based on TODAY.
- If not concrete (e.g., "ASAP", "EOD", "soon"), set deadline_iso to null.
- Output JSON ONLY. No commentary.

TODAY is {today_str}.

MEETING NOTE:
\"\"\"{note}\"\"\"
"""
    ctx.memory.set("extraction_prompt", prompt)
    return ok(message="Extraction prompt created.")

def extract_actions(ctx):
    """
    Call the LLM with the extraction prompt and post-process:
    - Parse JSON
    - Normalize deadline_iso using the meeting reference date if available
    Saves 'actions' list in ctx.memory.
    """
    prompt = ctx.memory.get("extraction_prompt")
    if not prompt:
        return err("No extraction_prompt in memory.", hint="Run build_extraction_prompt first.")

    raw = ctx.llm.complete(prompt, temperature=0.0).strip()
    # Drop a surrounding Markdown code fence (``` or ```json) if the model added one
    raw = re.sub(r"^```(?:json)?\s*|\s*```$", "", raw, flags=re.I).strip()

    try:
        data = json.loads(raw)
        actions = data.get("actions", [])

        # Choose the date to resolve relative deadlines
        ref = ctx.memory.get("reference_date")
        try:
            base_dt = ref or RUN_NOW  # RUN_NOW optional (if you defined it)
        except NameError:
            base_dt = ref or datetime.now()

        # Normalize missing ISO deadlines from deadline_text
        for a in actions:
            dtxt = (a.get("deadline_text") or "").strip()
            diso = a.get("deadline_iso")
            if (not diso) and dtxt:
                norm = normalize_deadline_text(dtxt, now=base_dt)
                a["deadline_iso"] = norm["iso"]

        ctx.memory.set("actions", actions)
        return ok(message=f"Extracted {len(actions)} action items.", count=len(actions))

    except json.JSONDecodeError as e:
        return err("Model did not return valid JSON.", hint=str(e), retryable=True)


# Default to RealFS rather than os: os.open() takes integer flags, not a text mode
def save_actions_json(ctx, out_name=None, _fs=RealFS):
    """
    Save extracted actions to JSON in output_folder.
    """
    actions = ctx.memory.get("actions")
    if actions is None:
        return err("No actions in memory.", hint="Run extract_actions first.")

    out_dir = ctx.config.get("output_folder")
    if not out_dir:
        return err("No output_folder in config.", hint="Set ctx.config['output_folder']")

    _fs.makedirs(out_dir, exist_ok=True)
    src = ctx.memory.get("file_name", "actions")
    root, _ = os.path.splitext(os.path.basename(src))
    base = out_name or f"{root}_actions.json"
    path = _fs.path.join(out_dir, base)

    with _fs.open(path, "w", encoding="utf-8") as f:
        json.dump({"actions": actions}, f, ensure_ascii=False, indent=2)

    ctx.memory.set("actions_json_path", path)
    return ok(message="Actions JSON saved.", path=path)


# Default to RealFS rather than os: os.open() takes integer flags, not a text mode
def save_actions_csv(ctx, out_name=None, _fs=RealFS):
    """
    Save extracted actions to CSV in output_folder.
    Columns: action, owner, deadline_text, deadline_iso
    """
    actions = ctx.memory.get("actions")
    if actions is None:
        return err("No actions in memory.", hint="Run extract_actions first.")

    out_dir = ctx.config.get("output_folder")
    if not out_dir:
        return err("No output_folder in config.", hint="Set ctx.config['output_folder']")

    _fs.makedirs(out_dir, exist_ok=True)
    src = ctx.memory.get("file_name", "actions")
    root, _ = os.path.splitext(os.path.basename(src))
    base = out_name or f"{root}_actions.csv"
    path = _fs.path.join(out_dir, base)

    with _fs.open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["action", "owner", "deadline_text", "deadline_iso"])
        writer.writeheader()
        for a in actions:
            writer.writerow({
                "action": a.get("action",""),
                "owner": a.get("owner",""),
                "deadline_text": a.get("deadline_text",""),
                "deadline_iso": a.get("deadline_iso") or ""
            })

    ctx.memory.set("actions_csv_path", path)
    return ok(message="Actions CSV saved.", path=path)
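
The weekday arithmetic used for relative deadlines is easier to follow with concrete dates. The block below restates the same modular calculation as a standalone sketch (`next_weekday` and `WEEKDAY_NAMES` are local copies for illustration) and applies it to August 21, 2025, which is a Thursday. Note that reading "next Friday" as the nearest upcoming Friday is the simplification made here, not a universal convention.

```python
from datetime import datetime, timedelta

WEEKDAY_NAMES = ["monday", "tuesday", "wednesday", "thursday",
                 "friday", "saturday", "sunday"]


def next_weekday(from_dt, target_name):
    """Return the first date strictly after from_dt that falls on target_name."""
    target = WEEKDAY_NAMES.index(target_name.lower())
    delta = (target - from_dt.weekday()) % 7
    if delta == 0:
        delta = 7  # same weekday: jump a full week instead of returning today
    return from_dt + timedelta(days=delta)


meeting = datetime(2025, 8, 21)  # Thursday, so weekday() == 3

print(next_weekday(meeting, "friday").date())     # 2025-08-22
print(next_weekday(meeting, "wednesday").date())  # 2025-08-27
print(next_weekday(meeting, "thursday").date())   # 2025-08-28

# Shifting the start back one day lets "by Thursday" resolve to the meeting day itself.
print(next_weekday(meeting - timedelta(days=1), "thursday").date())  # 2025-08-21
```

The one-day shift in the last call is the same trick the "by <weekday>" branch relies on: it makes the reference day itself a valid answer.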


## TOOL REGISTRY

In [42]:
@dataclass
class ToolDef:
    name: str                  # Unique name for calling the tool
    func: Callable             # Actual function to execute
    description: str = ""      # Optional human-readable help
    schema: dict | None = None # Optional JSON Schema for validation
    returns: dict | None = None# Optional return schema for metadata

class ToolRegistry:
    def __init__(self):
        self._tools = {}

    def register(self, tool: ToolDef):
        self._tools[tool.name] = tool

    def get(self, name: str) -> ToolDef:
        if name not in self._tools:
            raise KeyError(f"Unknown tool: {name}")
        return self._tools[name]

    def list(self):
        return list(self._tools.keys())

# Ensure registry exists:
try:
    registry
except NameError:
    registry = ToolRegistry()

# 1) Planner
registry.register(ToolDef(
    "create_plan",
    create_plan,
    "Create a plan from goal",
    schema={ "type": "object", "properties": {}, "required": [] },
    returns={
        "type": "object",
        "properties": {
            "message": { "type": "string" },
            "steps":   { "type": "array", "items": { "type": "string" } }
        },
        "required": ["message", "steps"]
    }
))

# 2) Build Pipeline from Goal
registry.register(ToolDef(
    "build_pipeline_from_goal", build_pipeline_from_goal,
    "Deterministically compose the steps for the Meeting Note Action Extractor.",
    schema={ "type": "object", "properties": {}, "required": [] },
    returns={
        "type": "object",
        "properties": {
            "message": {"type": "string"},
            "steps":   {"type": "array", "items": {"type": "object"}}
        },
        "required": ["message","steps"]
    }
))

# 3) Read .txt file
registry.register(ToolDef(
    "read_txt_file", read_txt_file, "Read a .txt file from input_folder",
    schema={
        "type": "object",
        "properties": {"file_name": {"type": "string"}},
        "required": ["file_name"]
    },
    returns={
        "type": "object",
        "properties": {
            "message": {"type": "string"},
            "length":  {"type": "integer"}
        },
        "required": ["message"]
    }
))

# 4) Build extraction prompt
registry.register(ToolDef(
    "build_extraction_prompt", build_extraction_prompt,
    "Create JSON-only extraction prompt for actions/owners/deadlines.",
    schema={ "type": "object", "properties": {}, "required": [] },
    returns={
        "type": "object",
        "properties": { "message": { "type": "string" } },
        "required": ["message"]
    }
))

# 5) Extract actions
registry.register(ToolDef(
    "extract_actions", extract_actions,
    "Run LLM to extract structured action items into ctx.memory['actions'].",
    schema={ "type": "object", "properties": {}, "required": [] },
    returns={
        "type": "object",
        "properties": {
            "message": {"type": "string"},
            "count":   {"type": "integer"}
            # Note: actions list is stored in memory; not returned here
        },
        "required": ["message"]
    }
))

# 6) Save JSON
registry.register(ToolDef(
    "save_actions_json", save_actions_json,
    "Save extracted actions to JSON in output_folder.",
    schema={
        "type": "object",
        "properties": { "out_name": { "type": "string" } },  # optional
        "required": []
    },
    returns={
        "type": "object",
        "properties": {
            "message": {"type": "string"},
            "path":    {"type": "string"}
        },
        "required": ["message", "path"]
    }
))

# 7) Save CSV
registry.register(ToolDef(
    "save_actions_csv", save_actions_csv,
    "Save extracted actions to CSV in output_folder.",
    schema={
        "type": "object",
        "properties": { "out_name": { "type": "string" } },  # optional
        "required": []
    },
    returns={
        "type": "object",
        "properties": {
            "message": {"type": "string"},
            "path":    {"type": "string"}
        },
        "required": ["message", "path"]
    }
))

# 8) Parse meeting date
registry.register(ToolDef(
    "parse_meeting_date", parse_meeting_date,
    "Detect meeting date in note and store it as reference_date.",
    schema={"type": "object", "properties": {}, "required": []},
    returns={"type": "object", "properties": {"message": {"type":"string"}}, "required": ["message"]}
))

# 9) List .txt files
registry.register(ToolDef(
    "list_txt_files", list_txt_files, "List .txt files in input_folder",
    schema={"type":"object","properties":{},"required":[]},
    returns={"type":"object","properties":{"message":{"type":"string"}, "files":{"type":"array"}}, "required":["message","files"]}
))
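
Registering each tool by hand keeps the name, function, and schema in one visible place, but the same registry could also be filled by a decorator placed on the function itself. The sketch below shows that alternative under stated assumptions: the `tool` decorator and the `echo` function are hypothetical, and `ToolDef` / `ToolRegistry` are trimmed-down copies so the block runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ToolDef:
    name: str
    func: Callable
    description: str = ""
    schema: Optional[dict] = None


class ToolRegistry:
    def __init__(self):
        self._tools = {}

    def register(self, tool_def):
        self._tools[tool_def.name] = tool_def

    def get(self, name):
        if name not in self._tools:
            raise KeyError(f"Unknown tool: {name}")
        return self._tools[name]

    def list(self):
        return list(self._tools.keys())


def tool(registry, description="", schema=None):
    """Hypothetical decorator: register the function under its own __name__."""
    def decorator(func):
        registry.register(ToolDef(func.__name__, func, description, schema))
        return func  # the function stays directly callable
    return decorator


registry = ToolRegistry()


@tool(registry, "Echo the given text",
      schema={"type": "object",
              "properties": {"text": {"type": "string"}},
              "required": ["text"]})
def echo(ctx, text):
    return {"ok": True, "message": text}


print(registry.list())  # ['echo']
```

The trade-off is that decorator registration ties the tool name to the function name, whereas explicit `register` calls allow the two to differ.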

## ENVIRONMENT

In [43]:
# ╔══════════════════════════════════════════════════════════════════════════════╗
# ║ ENVIRONMENT — Validation, DI, Execution (Generic Scaffold)                   ║
# ╚══════════════════════════════════════════════════════════════════════════════╝

# --- Minimal JSON-schema-ish validator for tool kwargs -------------------------
_JSON_TYPES = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    # extend as needed: "array": list, "object": dict ...
}

def validate_args(schema: Optional[Dict[str, Any]], kwargs: Dict[str, Any]) -> Optional[str]:
    """Return None if valid, else an error message."""
    if not schema:
        return None
    # required keys
    missing = [k for k in schema.get("required", []) if k not in kwargs]
    if missing:
        return f"Missing required: {missing}"
    # type checks
    props = schema.get("properties") or {}
    for key, spec in props.items():
        if key in kwargs and "type" in spec:
            py_t = _JSON_TYPES.get(spec["type"])
            if py_t and not isinstance(kwargs[key], py_t):
                return f"Bad type for '{key}': expected {spec['type']}"
    return None


class Environment:
    """
    Runs tools by name with:
      - input validation (schema)
      - dependency injection (ctx + underscore deps, e.g., _fs -> ctx.deps['fs'])
      - centralized progress logging via ctx.track_progress(...)
      - standardized results (ensures {'ok': True/False, ...})
    """
    def __init__(self, ctx, registry):
        self.ctx = ctx
        self.registry = registry

    def run(self, tool_name: str, **kwargs) -> Dict[str, Any]:
        # Lookup
        tool = self.registry.get(tool_name)
        fn = tool.func
        sig = inspect.signature(fn)

        # 1) Validate input BEFORE execution
        v_err = validate_args(tool.schema, kwargs)
        if v_err:
            self._log(tool.name, "error", v_err)
            return {"ok": False, "error": v_err, "retryable": True}

        # 2) Build call args with auto-DI
        call = {}
        for pname, param in sig.parameters.items():
            if pname == "ctx":
                call[pname] = self.ctx
            elif pname.startswith("_"):  # underscore dep → ctx.deps['name']
                dep_name = pname[1:]
                if dep_name not in self.ctx.deps:
                    msg = f"Missing dep '{dep_name}' for tool '{tool_name}'"
                    self._log(tool.name, "error", msg)
                    return {"ok": False, "error": msg}
                call[pname] = self.ctx.deps[dep_name]
            else:
                if pname in kwargs:
                    call[pname] = kwargs[pname]
                elif param.default is not inspect.Parameter.empty:
                    # optional arg with default → let function use its default
                    pass
                else:
                    msg = f"Missing required arg '{pname}' for tool '{tool_name}'"
                    self._log(tool.name, "error", msg)
                    return {"ok": False, "error": msg, "retryable": True}

        # 3) Execute with logging + exception capture
        self._log(tool.name, "started", note=str(kwargs)[:180])
        try:
            result = fn(**call)
        except Exception as e:
            msg = f"{type(e).__name__}: {e}"
            self._log(tool.name, "error", msg)
            return {"ok": False, "error": msg}

        # 4) Normalize result shape + final log
        if isinstance(result, dict):
            if result.get("ok") is False:
                # tool already returned an error envelope
                self._log(tool.name, "error", note=str(result.get("error", ""))[:180])
                return result
            if "ok" not in result and "error" in result:
                # back-compat for dicts that signal error without ok flag
                self._log(tool.name, "error", note=str(result["error"])[:180])
                return {"ok": False, **result}
            # success path
            out = result if "ok" in result else {"ok": True, **result}
            self._log(tool.name, "completed", note=str(out.get("message", ""))[:180])
            return out

        # Non-dict success: wrap it
        self._log(tool.name, "completed")
        return {"ok": True, "result": result}

    # --- helper: centralized progress logging -----------------------------------
    def _log(self, step: str, status: str, note: str = "") -> None:
        # If ctx has track_progress, use it; otherwise no-op
        logger = getattr(self.ctx, "track_progress", None)
        if callable(logger):
            logger(step, status, note)
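
The underscore-DI rule in `Environment.run` is easier to see in isolation. Below is a minimal sketch of the argument-resolution step only. `resolve_call_args`, `FakeFS`, and `read_note` are hypothetical names invented for this illustration; they are not part of the notebook, and the real fs adapter may expose a different interface.

```python
import inspect
from typing import Any, Callable, Dict


def resolve_call_args(fn: Callable, ctx: Any, deps: Dict[str, Any],
                      kwargs: Dict[str, Any]) -> Dict[str, Any]:
    """Mirror the auto-DI rule: 'ctx' -> context, '_name' -> deps['name'], rest -> kwargs."""
    call: Dict[str, Any] = {}
    for pname, param in inspect.signature(fn).parameters.items():
        if pname == "ctx":
            call[pname] = ctx
        elif pname.startswith("_"):
            call[pname] = deps[pname[1:]]  # KeyError if the dep is not registered
        elif pname in kwargs:
            call[pname] = kwargs[pname]
        elif param.default is inspect.Parameter.empty:
            raise TypeError(f"Missing required arg '{pname}'")
    return call


class FakeFS:
    """Hypothetical in-memory stand-in for the notebook's fs adapter."""
    def __init__(self, files: Dict[str, str]):
        self.files = files

    def read_text(self, path: str) -> str:
        return self.files[path]


def read_note(ctx, _fs, file_name: str, encoding: str = "utf-8"):
    # 'encoding' has a default, so the resolver leaves it out of the call.
    return {"text": _fs.read_text(file_name)}


deps = {"fs": FakeFS({"notes.txt": "Alice to send the report by Friday."})}
call = resolve_call_args(read_note, ctx=None, deps=deps,
                         kwargs={"file_name": "notes.txt"})
result = read_note(**call)
print(sorted(call))
print(result["text"])
```

The tool author only declares `_fs` in the signature; the caller never passes it. Swapping `FakeFS` for a real adapter changes the `deps` dict and nothing else, which is what makes tools easy to test.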


## SCRIPTED AGENT

In [44]:
# ╔══════════════════════════════════════════════════════════════════════════════╗
# ║ SCRIPTED AGENT — Fixed Pipeline Runner (Generic Scaffold)                    ║
# ╚══════════════════════════════════════════════════════════════════════════════╝
from typing import Iterable, Tuple, Dict, Any, Optional

class ScriptedAgent:
    """
    Executes a predetermined sequence of (tool_name, kwargs) steps
    using the provided Environment.
    """
    def __init__(self, env, steps: Iterable[Tuple[str, Dict[str, Any]]]):
        self.env = env
        self.steps = list(steps)

    def run(
        self,
        max_calls: Optional[int] = None,
        stop_on_error: bool = True,
    ) -> Dict[str, Any]:
        calls = 0
        for name, kwargs in self.steps:
            if max_calls is not None and calls >= max_calls:
                return {"final": f"stopped: max_calls={max_calls}"}

            result = self.env.run(name, **(kwargs or {}))
            calls += 1

            # Optional: attach last_result to context for inspection
            if hasattr(self.env, "ctx"):
                self.env.ctx.memory.set("last_result", result)

            if stop_on_error and isinstance(result, dict) and result.get("ok") is False:
                out = {"final": f"stopped at {name}: {result.get('error', 'unknown error')}"}
                if "hint" in result:  # surface recovery tips
                    out["hint"] = result["hint"]
                return out

        return {"final": "done"}
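
To see how the stop-on-error path surfaces an error and its hint, here is a small standalone sketch. `StubEnv` and `run_steps` are hypothetical: `run_steps` is a condensed mirror of the loop above (without the memory write), and the tool names are made up for the demo rather than taken from the registry.

```python
from typing import Any, Dict, Iterable, List, Optional, Tuple


class StubEnv:
    """Hypothetical environment: returns a canned result envelope per tool name."""
    def __init__(self, canned: Dict[str, Dict[str, Any]]):
        self.canned = canned
        self.calls: List[str] = []

    def run(self, tool_name: str, **kwargs) -> Dict[str, Any]:
        self.calls.append(tool_name)
        return self.canned.get(tool_name, {"ok": True})


def run_steps(env, steps: Iterable[Tuple[str, Dict[str, Any]]],
              max_calls: Optional[int] = None) -> Dict[str, Any]:
    """Condensed mirror of ScriptedAgent.run with stop_on_error=True."""
    for calls, (name, kwargs) in enumerate(steps):
        if max_calls is not None and calls >= max_calls:
            return {"final": f"stopped: max_calls={max_calls}"}
        result = env.run(name, **(kwargs or {}))
        if isinstance(result, dict) and result.get("ok") is False:
            out = {"final": f"stopped at {name}: {result.get('error', 'unknown error')}"}
            if "hint" in result:  # surface recovery tips
                out["hint"] = result["hint"]
            return out
    return {"final": "done"}


# Tool names below are invented for the demo.
steps = [("load_file", {"file_name": "notes.txt"}), ("extract", {}), ("save", {})]
env = StubEnv({"extract": {"ok": False, "error": "empty text",
                           "hint": "check the input file"}})
final = run_steps(env, steps)
print(final)
print(env.calls)
```

The third step never runs: the loop returns as soon as a step reports `ok: False`, and the hint travels with the final message.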


## SETUP & CONFIG

In [45]:
# ╔══════════════════════════════════════════════════════════════════════════════╗
# ║ SETUP & CONFIG                                                               ║
# ╚══════════════════════════════════════════════════════════════════════════════╝

# Setup
from datetime import datetime
RUN_NOW = datetime.now()

# Memory: simple scratchpad shared across steps
memory = ScratchMemory()
memory.set("goal", goal)  # goal defined earlier as your multi-line string

# Runtime configuration knobs
config = {
    "input_folder": "/content/files",     # where input .txt files live
    "output_folder": "/content/output",   # where outputs are written
    "model": "gpt-4o-mini",               # explicit is clearer than relying on defaults
    "temperature": 0.2,
}

# LLM wrapper: single source of truth for model + defaults
llm = OpenAILLM(
    client=client,
    model=config["model"],
    temperature=config["temperature"],
)

# ╔══════════════════════════════════════════════════════════════════════════════╗
# ║ CONTEXT & ENVIRONMENT                                                        ║
# ╚══════════════════════════════════════════════════════════════════════════════╝

# Ensure tool registry exists (in case this cell runs before registration cell)
try:
    registry
except NameError:
    registry = ToolRegistry()

# Create context with DI bag pre-populated (fs adapter, clock if you want later)
ctx = ActionContext(
    memory=memory,
    llm=llm,
    config=config,
    deps={"fs": RealFS},  # add more deps later if needed
)

# Guardrails: ensure folders exist
os.makedirs(ctx.config["input_folder"], exist_ok=True)
os.makedirs(ctx.config["output_folder"], exist_ok=True)
ctx.track_progress("setup", "completed", "goal + config injected")

# Build runtime (validation + underscore-DI + centralized logging)
env = Environment(ctx, registry)


## REPORTING

In [46]:
# ╔══════════════════════════════════════════════════════════════════════════════╗
# ║ REPORTING / PRETTY PRINT SNAPSHOT                                            ║
# ╚══════════════════════════════════════════════════════════════════════════════╝

def pretty_report(ctx):
    """Prints a human-readable snapshot of memory + progress log for the extractor."""
    print("\n" + "="*80)
    print("📦 Meeting Note Action Extractor — Report")

    # Goal
    print("\n🎯 Goal:")
    print(textwrap.fill(ctx.memory.get("goal","<no goal set>"), width=80))

    # Plan
    plan = ctx.memory.get("plan") or []
    if plan:
        print("\n🗺️  Plan:")
        for step in plan:
            print(" -", step)

    # Source file + preview
    fname = ctx.memory.get("file_name") or "<unknown file>"
    raw_text = ctx.memory.get("raw_text") or ""
    print(f"\n📄 Source file: {fname}")
    if raw_text:
        print("\n📝 Note Preview:")
        print(textwrap.fill(raw_text[:600], width=80, subsequent_indent="  "))
        if len(raw_text) > 600:
            print("... (truncated)")

    # Prompt preview (extraction prompt)
    prompt = ctx.memory.get("extraction_prompt") or ""
    if prompt:
        print("\n🧾 Extraction Prompt (first 600 chars):")
        print(textwrap.fill(prompt[:600], width=80, subsequent_indent="  "))
        if len(prompt) > 600:
            print("... (truncated)")

    # Actions
    actions = ctx.memory.get("actions") or []
    print(f"\n✅ Extracted actions: {len(actions)}")
    for a in actions[:10]:  # show up to 10
        print(" -", {k: a.get(k) for k in ("action","owner","deadline_text","deadline_iso")})
    if len(actions) > 10:
        print(f" ... and {len(actions)-10} more")

    # Saved artifacts
    if ctx.memory.get("actions_json_path"):
        print("\n💾 JSON:", ctx.memory.get("actions_json_path"))
    if ctx.memory.get("actions_csv_path"):
        print("💾  CSV:", ctx.memory.get("actions_csv_path"))

    # Progress log
    ctx.print_progress()


## RUN PIPELINE

In [47]:
file_name = "meeting_notes_1.txt"
ctx.memory.set("file_name", file_name)  # optional convenience for builders

# Build steps from the goal (deterministic builder recommended)
res = env.run("build_pipeline_from_goal")
if not res.get("ok"):
    raise RuntimeError(f"Failed to build pipeline: {res.get('error')}")

steps = ctx.memory.get("planned_steps")  # list of (tool_name, kwargs)

# (optional) show the goal for clarity
def show_goal(ctx):
    print("\n🎯 GOAL\n" + "-"*60)
    print(ctx.memory.get("goal", "<no goal set>"))

show_goal(ctx)

# Run the agent
agent = ScriptedAgent(env, steps)
final = agent.run(max_calls=10)
print("Pipeline final:", final.get("final"))

# (optional) if you added the updated pretty_report
pretty_report(ctx)



🎯 GOAL
------------------------------------------------------------
Extract action items from a meeting-notes .txt file —
capturing action, owner, and deadline
(verbatim phrase + ISO date when inferable) —
then save the results to JSON and CSV in the output folder.
Pipeline final: done

📦 Meeting Note Action Extractor — Report

🎯 Goal:
Extract action items from a meeting-notes .txt file — capturing action, owner,
and deadline (verbatim phrase + ISO date when inferable) — then save the results
to JSON and CSV in the output folder.

🗺️  Plan:
 - Open the meeting-notes .txt file for reading
 - Read the content of the .txt file into a variable
 - Identify and extract action items from the text using regex or keyword search
 - For each action item, capture the verbatim phrase, owner, and deadline
 - Format the extracted data into a structured format (e.g., list of dictionaries)
 - Convert the structured data to JSON format
 - Save the JSON data to a file in the output folder
 - Convert the

## Updated Prompt Testing

**Neither a short nor a long prompt wins by default; a prompt that is clear, relevant, and enforceable does.** For step-planning, a **compact prompt with explicit rules and an output schema** usually beats a super-short prompt, and it often beats an overly verbose one too.

Here’s how to think about it:

### Quick trade-offs

* **Short & sweet**

  * ✅ Flexible, faster, fewer tokens
  * ❌ More variance; format drift; misses verification/handoff steps
* **Verbose with many constraints**

  * ✅ More consistent; better format compliance
  * ❌ Can dilute focus if bloated; brittle if rules are redundant or irrelevant

### What tends to produce the best results

Use a **Goldilocks prompt**: tight, structured, and only the constraints that actually matter. Aim for **150–350 tokens**, with:

1. **Role** (1 line)
2. **Task** (1–2 lines)
3. **Principles** (3–5 bullets max; only what changes quality)
4. **Hard constraints** (2–4 non-negotiables)
5. **Output format** (strict, minimal, parsable)
6. **Goal** (the variable content)

### A compact, strict planner prompt (drop-in)

This version is shorter than a fully rigorous planner prompt, but still strict and generic:

```python
prompt = f"""
You are a senior step-planning expert.

Task: Turn the GOAL into a concise, ordered plan that is tool-agnostic.

Principles:
- One concrete action per step, lead with an imperative verb.
- Order by dependency; gather missing inputs early.
- After transformations, include a brief verification step.
- End with a handoff/deliverable step when applicable.

Constraints:
- 5–9 steps total.
- No implementation details (no code, APIs, models, regex, etc.).
- Be specific about outcomes (what will exist after each step).

Output:
- NUMBERED LIST ONLY, one step per line.

GOAL:
\"\"\"{goal}\"\"\"
""".strip()
```

### When to prefer each style

* **Exploration/prototyping**: shorter prompt (+ slightly higher temperature) to see diverse plans.
* **Reliability/production**: compact-but-strict prompt (above) + low temperature for reproducibility.

### Make it measurable (tiny A/B harness)

If you want to prove it in your notebook, run each prompt 10 times and score compliance (number of steps in range, numbered format, presence of verification/handoff). Keep temp at 0.2 for both to compare fairly.
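
One way to score compliance is sketched below. The checks are simple heuristics and the keyword lists are assumptions you should tune. `score_plan` and `compliance_rate` are hypothetical helpers, and `generate` stands in for whatever function calls your model with a given prompt; here it is faked with a fixed string so the block runs without an API key.

```python
import re
from typing import Callable, Dict

STEP_RE = re.compile(r"^\s*\d+[.)]\s+\S")          # e.g. "1. Do X" or "2) Do Y"
VERIFY_WORDS = ("verify", "validate", "check", "confirm", "review")
HANDOFF_WORDS = ("save", "deliver", "export", "hand off", "share", "write")


def score_plan(text: str, min_steps: int = 5, max_steps: int = 9) -> Dict[str, bool]:
    """Run four pass/fail checks against one generated plan."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    numbered = [ln for ln in lines if STEP_RE.match(ln)]
    lowered = text.lower()
    return {
        "numbered_only": bool(lines) and len(numbered) == len(lines),
        "step_count_ok": min_steps <= len(numbered) <= max_steps,
        "has_verification": any(w in lowered for w in VERIFY_WORDS),
        # Handoff is only credited when it appears in the last numbered step.
        "has_handoff": bool(numbered)
        and any(w in numbered[-1].lower() for w in HANDOFF_WORDS),
    }


def compliance_rate(generate: Callable[[], str], runs: int = 10) -> Dict[str, float]:
    """Call generate() `runs` times and return the pass rate for each check."""
    totals: Dict[str, int] = {}
    for _ in range(runs):
        for check, passed in score_plan(generate()).items():
            totals[check] = totals.get(check, 0) + int(passed)
    return {check: count / runs for check, count in totals.items()}


sample_plan = """1. Locate the meeting-notes file.
2. Read the notes into working text.
3. Extract each action item with owner and deadline.
4. Verify every item has an action and an owner.
5. Save the results as JSON and CSV."""

scores = score_plan(sample_plan)
rates = compliance_rate(lambda: sample_plan, runs=3)
print(scores)
print(rates)
```

In the notebook you would pass one `generate` per prompt variant and compare the two rate dicts side by side.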

---

**Bottom line:** Go with the **compact, structured prompt**. It’s general, reusable across agents, and delivers more consistent step plans without the bloat.


In [52]:
file_name = "meeting_notes_1.txt"
ctx.memory.set("file_name", file_name)  # optional convenience for builders

# Build steps from the goal (deterministic builder recommended)
res = env.run("build_pipeline_from_goal")
if not res.get("ok"):
    raise RuntimeError(f"Failed to build pipeline: {res.get('error')}")

steps = ctx.memory.get("planned_steps")  # list of (tool_name, kwargs)

# (optional) show the goal for clarity
def show_goal(ctx):
    print("\n🎯 GOAL\n" + "-"*60)
    print(ctx.memory.get("goal", "<no goal set>"))

show_goal(ctx)

# Run the agent
agent = ScriptedAgent(env, steps)
final = agent.run(max_calls=10)
print("Pipeline final:", final.get("final"))

# (optional) if you added the updated pretty_report
pretty_report(ctx)


🎯 GOAL
------------------------------------------------------------
Extract action items from a meeting-notes .txt file —
capturing action, owner, and deadline
(verbatim phrase + ISO date when inferable) —
then save the results to JSON and CSV in the output folder.
Pipeline final: done

📦 Meeting Note Action Extractor — Report

🎯 Goal:
Extract action items from a meeting-notes .txt file — capturing action, owner,
and deadline (verbatim phrase + ISO date when inferable) — then save the results
to JSON and CSV in the output folder.

🗺️  Plan:
 - Open the meeting-notes .txt file
 - Read the contents of the file
 - Identify and extract action items from the text
 - For each action item, determine the owner and deadline if available
 - Format the extracted data into a structured format (action, owner, deadline)
 - Convert the structured data into JSON format
 - Save the JSON data to a file in the output folder
 - Convert the structured data into CSV format
 - Save the CSV data to a file in