# Script 03 — Building a Conversational Research Agent

This script combines the primitives from Scripts 01 and 02 into a
single, interactive research agent with a defined persona and
intent-driven control flow.

The agent:
- **Asks clarifying questions** before diving into research
- **Decides** what the user wants on every turn (DECIDE)
- **Plans** its research approach (PLAN)
- **Searches** the web and collects result snippets (SEARCH)
- **Synthesizes** findings with its own perspective (THOUGHT)
- **Remembers** everything across turns (MEMORY)
- **Chats** interactively with the user (CHAT)

Prerequisites:
  - Groq API key (`GROQ_API_KEY`) and Brave Search API key (`BRAVE_API_KEY`) in `.env`
  - `pip install thoughtflow`
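
The setup cell below reads both keys with an empty-string default, so a missing key is passed through silently. A minimal sketch of a fail-fast check follows; `missing_keys` is a hypothetical helper written for illustration, not part of ThoughtFlow or `_setup`.

```python
import os

# Variable names taken from the setup cell of this script.
REQUIRED_KEYS = ("GROQ_API_KEY", "BRAVE_API_KEY")


def missing_keys(environ=None, required=REQUIRED_KEYS):
    """Return the required variable names that are unset or blank."""
    environ = os.environ if environ is None else environ
    return [name for name in required if not environ.get(name, "").strip()]


# Example with an explicit mapping instead of the real environment.
print(missing_keys({"GROQ_API_KEY": "abc"}))  # ['BRAVE_API_KEY']
```

Calling `missing_keys()` with no arguments checks the real process environment, which is useful right after `load_env()` has run.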

In [1]:
from _setup import load_env, print_heading, print_separator
import os

env = load_env()

from thoughtflow import (
    LLM, MEMORY, THOUGHT, DECIDE, PLAN,
    SEARCH, CHAT,
)

llm = LLM("groq:llama-3.3-70b-versatile", key=os.environ.get("GROQ_API_KEY", ""))
brave_key = os.environ.get("BRAVE_API_KEY", "")

---
## Part 1 — Agent Architecture

The agent operates in phases, controlled by intent classification:

| Intent | What happens |
|--------|-------------|
| `new_topic` | Agent asks clarifying questions to understand the request |
| `clarify` | Agent records the user's answer as the research angle and acknowledges it |
| `research` | Agent plans searches, runs them, and synthesizes the findings |
| `deeper` | Agent digs into a sub-topic from previous results |
| `done` | Agent wraps up with a closing summary |
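
The routing in this table can be sketched as a plain dispatch table. This is an illustration of the control flow only: the handler names and return strings are hypothetical stand-ins, and the agent function in Part 3 uses an `if`/`elif` chain rather than a dictionary.

```python
def on_new_topic(state):
    state["phase"] = "discovery"
    return "ask clarifying questions"


def on_clarify(state):
    state["phase"] = "ready_to_research"
    return "acknowledge the angle"


def on_research(state):
    state["phase"] = "presented"
    return "plan, search, synthesize"


def on_deeper(state):
    return "analyze a sub-topic"


def on_done(state):
    return "closing summary"


# One handler per intent label from the table above.
HANDLERS = {
    "new_topic": on_new_topic,
    "clarify": on_clarify,
    "research": on_research,
    "deeper": on_deeper,
    "done": on_done,
}

FALLBACK = "I am not sure what you mean. Could you rephrase?"


def dispatch(intent, state):
    """Route an intent label to its handler, with a fallback for unknown labels."""
    handler = HANDLERS.get(intent)
    return handler(state) if handler else FALLBACK


state = {}
print(dispatch("new_topic", state), "|", state["phase"])
```

A dictionary makes the fallback explicit: any label the classifier returns that is not in the table falls through to the rephrase message.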

In [2]:
print_heading("1.1  System prompt — the agent's persona")

SYSTEM_PROMPT = """You are Ariel, a sharp and curious research analyst.

Your personality:
- You are direct and opinionated — you share your own synthesis, not just a dump of search results.
- You ask probing questions before researching. You want to understand WHAT the person actually needs, not just the surface request.
- When you find conflicting information, you surface the conflict honestly rather than glossing over it.
- You are concise but thorough. You respect people's time.
- You have a slight dry wit, but you never sacrifice clarity for humor.

Your workflow:
- When someone brings a new topic, ask 2-3 sharp clarifying questions FIRST.
- Only after you understand their angle do you dive into research.
- When presenting findings, lead with your own take, then support it with evidence.
- Always tell the user when you are about to search, so they know what is happening.

Rules:
- Never fabricate sources. If you searched and found nothing useful, say so.
- If you are unsure, say so and explain why.
- Keep responses focused. No walls of text."""

print("System prompt loaded ({} chars)".format(len(SYSTEM_PROMPT)))


══════════════════════════════════════════════════════════════════════
  1.1  System prompt — the agent's persona
══════════════════════════════════════════════════════════════════════

System prompt loaded (1043 chars)


---
## Part 2 — Building the Agent's Components

Each component is a reusable primitive that the agent function
orchestrates at runtime.
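
The prompts in the cells below contain placeholders such as `{conversation_context}` and `{last_user_msg}`. The sketch here assumes these are filled from same-named memory variables at call time; that is an assumption about ThoughtFlow's behavior, and `placeholders` and `fill` are hypothetical helpers that use only the standard library.

```python
from string import Formatter


def placeholders(template):
    """List the {name} fields that appear in a prompt template, in order."""
    return [field for _, field, _, _ in Formatter().parse(template) if field]


def fill(template, variables):
    """Fill each placeholder from `variables`, using "" for missing names."""
    values = {name: variables.get(name, "") for name in placeholders(template)}
    return template.format(**values)


template = "Topic: {research_topic}\nAngle: {research_angle}"
print(placeholders(template))
print(fill(template, {"research_topic": "LLMs in production"}))
```

Listing the placeholders before a run is a quick way to see which variables a component expects to find in memory.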

In [3]:
print_heading("2.1  Intent classifier (DECIDE)")

# This runs on every user turn to figure out what they want.
intent_classifier = DECIDE(
    name="intent",
    llm=llm,
    choices={
        "new_topic": "User is introducing a new research topic or question",
        "clarify": "User is answering a clarifying question or providing more context",
        "research": "User wants me to go ahead and research now",
        "deeper": "User wants to dig deeper into a sub-topic from previous results",
        "done": "User wants to wrap up or says goodbye",
    },
    prompt=(
        "Based on the conversation so far and the user's latest message, "
        "determine their intent.\n\n"
        "Conversation context:\n{conversation_context}\n\n"
        "Latest user message: {last_user_msg}"
    ),
    system_prompt=SYSTEM_PROMPT,
    max_retries=3,
)

print("Intent classifier ready:", intent_classifier)


══════════════════════════════════════════════════════════════════════
  2.1  Intent classifier (DECIDE)
══════════════════════════════════════════════════════════════════════

Intent classifier ready: Decide: intent (choices: new_topic, clarify, research, deeper, done)


In [4]:
print_heading("2.2  Discovery thought (asks clarifying questions)")

discovery_thought = THOUGHT(
    name="discovery",
    llm=llm,
    prompt=(
        "The user wants to research: {last_user_msg}\n\n"
        "What they have told you so far:\n{conversation_context}\n\n"
        "Ask 2-3 sharp, specific clarifying questions that will help you "
        "understand exactly what angle of this topic they care about. "
        "Be conversational and direct — you are Ariel."
    ),
    system_prompt=SYSTEM_PROMPT,
)

print("Discovery thought ready:", discovery_thought)


══════════════════════════════════════════════════════════════════════
  2.2  Discovery thought (asks clarifying questions)
══════════════════════════════════════════════════════════════════════

Discovery thought ready: Thought: discovery


In [5]:
print_heading("2.3  Research planner (PLAN)")

research_planner = PLAN(
    name="research_plan",
    llm=llm,
    actions={
        "search": {
            "description": "Search the web for information on a specific query",
            "params": {"query": "str"},
        },
        "analyze": {
            "description": "Analyze and extract key insights from gathered information",
            "params": {"focus": "str"},
        },
        "synthesize": {
            "description": "Create a clear synthesis of all findings",
            "params": {"format": "str?"},
        },
    },
    prompt=(
        "The user wants to know about: {research_topic}\n"
        "Specific angle: {research_angle}\n\n"
        "Create a focused research plan.  Use 2-3 search queries that "
        "cover different angles of the topic.  Then analyze and synthesize."
    ),
    system_prompt=SYSTEM_PROMPT,
    max_steps=5,
    max_parallel=2,
)

print("Research planner ready:", research_planner)


══════════════════════════════════════════════════════════════════════
  2.3  Research planner (PLAN)
══════════════════════════════════════════════════════════════════════

Research planner ready: Plan: research_plan (3 actions available)


In [6]:
print_heading("2.4  Synthesis thought (final answer)")

synthesis_thought = THOUGHT(
    name="synthesis",
    llm=llm,
    prompt=(
        "You are presenting your research findings to the user.\n\n"
        "Topic: {research_topic}\n"
        "Angle: {research_angle}\n\n"
        "Search results:\n{search_findings}\n\n"
        "Conversation so far:\n{conversation_context}\n\n"
        "Present your findings in a clear, opinionated way. "
        "Lead with your own synthesis, then support with evidence. "
        "If you found conflicting information, surface it. "
        "Keep it focused and actionable."
    ),
    system_prompt=SYSTEM_PROMPT,
)

print("Synthesis thought ready:", synthesis_thought)


══════════════════════════════════════════════════════════════════════
  2.4  Synthesis thought (final answer)
══════════════════════════════════════════════════════════════════════

Synthesis thought ready: Thought: synthesis


In [7]:
print_heading("2.5  Deeper-dive thought")

deeper_thought = THOUGHT(
    name="deeper_dive",
    llm=llm,
    prompt=(
        "The user wants to go deeper on a sub-topic.\n\n"
        "Their request: {last_user_msg}\n"
        "Previous research context:\n{search_findings}\n\n"
        "Identify the specific sub-topic they want to explore, "
        "then provide a more detailed analysis.  Use what you already "
        "know from previous research, and note what additional searching "
        "might be needed."
    ),
    system_prompt=SYSTEM_PROMPT,
)

print("Deeper-dive thought ready:", deeper_thought)


══════════════════════════════════════════════════════════════════════
  2.5  Deeper-dive thought
══════════════════════════════════════════════════════════════════════

Deeper-dive thought ready: Thought: deeper_dive


In [8]:
print_heading("2.6  Wrap-up thought")

wrapup_thought = THOUGHT(
    name="wrapup",
    llm=llm,
    prompt=(
        "The user is wrapping up the conversation.\n\n"
        "Conversation:\n{conversation_context}\n\n"
        "Give a brief, warm closing.  If you researched something, "
        "offer a 1-2 sentence takeaway.  Keep it short."
    ),
    system_prompt=SYSTEM_PROMPT,
)

print("Wrap-up thought ready:", wrapup_thought)


══════════════════════════════════════════════════════════════════════
  2.6  Wrap-up thought
══════════════════════════════════════════════════════════════════════

Wrap-up thought ready: Thought: wrapup


---
## Part 3 — The Agent Function

The agent function follows the ThoughtFlow contract: it takes a MEMORY
and returns a MEMORY.  Internally it classifies intent, then dispatches
to the appropriate phase.
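
The memory-in, memory-out shape can be sketched without the library. `ToyMemory` below is a hypothetical stand-in that mimics only the three method names used in this script (`set_var`, `get_var`, `add_msg`); it is not the real `MEMORY` class and omits channels, events, and rendering.

```python
class ToyMemory:
    """A tiny stand-in for MEMORY: named variables plus a message list."""

    def __init__(self):
        self.vars = {}
        self.msgs = []

    def set_var(self, name, value):
        self.vars[name] = value

    def get_var(self, name):
        return self.vars.get(name)

    def add_msg(self, role, content):
        self.msgs.append((role, content))


def echo_agent(memory):
    """Follow the contract: read from memory, write to memory, return it."""
    last_user = next((c for r, c in reversed(memory.msgs) if r == "user"), "")
    memory.set_var("turns", (memory.get_var("turns") or 0) + 1)
    memory.add_msg("assistant", "You said: " + last_user)
    return memory


memory = ToyMemory()
memory.add_msg("user", "hello")
memory = echo_agent(memory)
print(memory.msgs[-1], memory.get_var("turns"))
```

Because every step has the same signature, steps can be chained by reassigning `memory`, which is the pattern the agent function uses.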

In [9]:
print_heading("3.1  Defining the agent function")


def research_agent(memory):
    """
    A conversational research agent that classifies user intent on each
    turn and dispatches to discovery, research, deep-dive, or wrap-up.

    Follows the ThoughtFlow contract: memory in, memory out.

    The agent tracks its phase and accumulated findings in memory
    variables so it can maintain context across turns.
    """
    # Build a conversation summary for the LLM to reference.
    conversation_context = memory.render(
        format="conversation",
        max_total_length=3000,
    )
    memory.set_var("conversation_context", conversation_context)

    # --- Classify intent ---
    memory = intent_classifier(memory)
    intent = memory.get_var("intent_result")

    # --- Dispatch based on intent ---

    if intent == "new_topic":
        # Store the topic for later use.
        memory.set_var("research_topic", memory.last_user_msg(content_only=True))
        memory.set_var("research_phase", "discovery")
        # Ask clarifying questions.
        memory = discovery_thought(memory)
        agent_response = memory.get_var("discovery_result")

    elif intent == "clarify":
        # The user answered our questions.  Refine our understanding.
        memory.set_var("research_angle", memory.last_user_msg(content_only=True))
        memory.set_var("research_phase", "ready_to_research")
        # Acknowledge and ask for confirmation; the search itself runs on
        # a later turn, once the intent is classified as "research".
        agent_response = (
            "Got it, that gives me a clear direction.  "
            "Say the word and I will search for current information on this."
        )

    elif intent == "research":
        # Execute the research pipeline.
        agent_response = _execute_research(memory)

    elif intent == "deeper":
        # Dig deeper on a sub-topic.
        memory = deeper_thought(memory)
        agent_response = memory.get_var("deeper_dive_result")

    elif intent == "done":
        memory = wrapup_thought(memory)
        agent_response = memory.get_var("wrapup_result")

    else:
        agent_response = "I am not sure what you mean. Could you rephrase?"

    # Store the agent's response as an assistant message.
    memory.add_msg("assistant", agent_response, channel="cli")

    return memory


def _execute_research(memory):
    """
    Run the full research pipeline: plan → search → synthesize.

    Returns the synthesized response string.
    """
    # Ensure we have a topic and angle.
    topic = memory.get_var("research_topic") or memory.last_user_msg(content_only=True)
    angle = memory.get_var("research_angle") or ""
    memory.set_var("research_topic", topic)
    memory.set_var("research_angle", angle)
    memory.set_var("research_phase", "researching")

    # Step 1: Generate a research plan.
    memory = research_planner(memory)
    plan = memory.get_var("research_plan_result") or []

    # Step 2: Execute search queries from the plan.  Only "search" tasks
    # are run here; "analyze" and "synthesize" tasks are covered by Step 3.
    all_snippets = []
    search_queries = []

    for step in plan:
        for task in step:
            if task.get("action") == "search":
                query = task.get("params", {}).get("query", topic)
                search_queries.append(query)

    # Run up to 3 searches.
    for query in search_queries[:3]:
        searcher = SEARCH(
            name="live_search",
            provider="brave",
            query=query,
            api_key=brave_key,
            max_results=4,
        )
        memory = searcher(memory)
        results = memory.get_var("live_search_results") or {}
        for r in results.get("results", []):
            snippet = "- [{}] {}".format(r.get("title", ""), r.get("snippet", ""))
            all_snippets.append(snippet)

    # Store combined findings.
    findings_text = "\n".join(all_snippets) if all_snippets else "(No results found.)"
    memory.set_var("search_findings", findings_text)

    # Step 3: Synthesize.
    memory = synthesis_thought(memory)
    response = memory.get_var("synthesis_result")

    memory.set_var("research_phase", "presented")

    return response


print("research_agent() defined — ready to use.")


══════════════════════════════════════════════════════════════════════
  3.1  Defining the agent function
══════════════════════════════════════════════════════════════════════

research_agent() defined — ready to use.
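
The plan is traversed as a list of steps, each step being a list of task dictionaries. The sketch below isolates that query-extraction logic as a standalone function. `extract_queries` is a hypothetical helper; it also drops duplicate queries, which the cell above does not do. Only the first query in the sample plan comes from this script's recorded output, the other entries are made up for illustration.

```python
def extract_queries(plan, fallback, limit=3):
    """Collect search queries from a nested plan, de-duplicated and capped."""
    queries = []
    for step in plan or []:
        for task in step:
            if task.get("action") != "search":
                continue  # analyze/synthesize tasks carry no query
            query = task.get("params", {}).get("query") or fallback
            if query not in queries:
                queries.append(query)
    return queries[:limit]


sample_plan = [
    [
        {"action": "search", "params": {"query": "LLM production infrastructure costs"}},
        {"action": "search", "params": {"query": "LLM inference latency"}},  # illustrative
    ],
    [{"action": "analyze", "params": {"focus": "cost drivers"}}],  # illustrative
    [{"action": "search", "params": {}}],  # no query, so the fallback topic is used
]

print(extract_queries(sample_plan, fallback="LLMs in production"))
```

Keeping the extraction separate from the network calls makes it easy to inspect which queries a plan would trigger before spending any search quota.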


---
## Part 4 — Running the Agent with CHAT

The `CHAT` class wraps any ThoughtFlow-contract function and provides
an interactive input/output loop.

- `CHAT.run()` — blocking interactive loop (type `q` to exit)
- `CHAT.turn(text)` — single turn, ideal for cell-by-cell use
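
A single-turn wrapper of this kind can be sketched as follows. This is a guess at the general shape, not ThoughtFlow's implementation: `ToyChat` and `shout_agent` are hypothetical, memory is reduced to a plain dictionary, and the only detail taken from this script is that `history` holds `(user_text, agent_text)` pairs.

```python
class ToyChat:
    """Wrap a memory-in, memory-out function and record each turn."""

    def __init__(self, agent, greeting=""):
        self.agent = agent
        self.greeting = greeting
        self.memory = {"msgs": []}
        self.history = []

    def turn(self, text):
        msgs = self.memory["msgs"]
        msgs.append(("user", text))
        start = len(msgs)  # anything after this index was added by the agent
        self.memory = self.agent(self.memory)
        new_msgs = self.memory["msgs"][start:]
        reply = next((c for r, c in reversed(new_msgs) if r == "assistant"), None)
        self.history.append((text, reply))
        return reply


def shout_agent(memory):
    """Toy agent: reply with the user's last message in upper case."""
    user_text = memory["msgs"][-1][1]
    memory["msgs"].append(("assistant", user_text.upper()))
    return memory


chat = ToyChat(shout_agent, greeting="Hello")
print(chat.turn("hi"))
print(chat.history)
```

Looking only at messages added during the current turn avoids returning a stale reply when the agent adds no assistant message.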

In [10]:
print_heading("4.1  Programmatic turns with CHAT.turn()")

# CHAT.turn() lets you drive the conversation programmatically.
# This is perfect for notebooks where you want to see each step.

chat = CHAT(
    research_agent,
    greeting="Hi, I'm Ariel — your research analyst.  What would you like to explore?",
    channel="cli",
    user_label="You",
    agent_label="Ariel",
)

# Display the greeting.
print("Ariel:", chat.greeting)
print()


══════════════════════════════════════════════════════════════════════
  4.1  Programmatic turns with CHAT.turn()
══════════════════════════════════════════════════════════════════════

Ariel: Hi, I'm Ariel — your research analyst.  What would you like to explore?



In [11]:
# Turn 1: Introduce a topic.
print_separator()
print("You: I want to understand how companies are using LLMs in production right now.")
response = chat.turn("I want to understand how companies are using LLMs in production right now.")
print()
print("Ariel:", response)

──────────────────────────────────────────────────────────────────────
You: I want to understand how companies are using LLMs in production right now.

Ariel: So you want to know how companies are using Large Language Models (LLMs) in production. That's a pretty broad topic. To give you a more focused answer, can you tell me: 

1. Are you interested in a specific industry, like healthcare or finance, or do you want a general overview across multiple sectors?
2. What aspect of LLM usage are you most curious about - is it the types of tasks they're being used for, the benefits they're bringing, or perhaps the challenges companies are facing in implementation?
3. Are you looking for examples of well-known companies, or are you also interested in learning about smaller organizations or startups that might be using LLMs in innovative ways?


In [12]:
# Turn 2: Answer clarifying questions.
print_separator()
user_msg = (
    "I'm mostly interested in the infrastructure and cost side — "
    "how are companies managing latency, reliability, and spending at scale?"
)
print("You:", user_msg)
response = chat.turn(user_msg)
print()
print("Ariel:", response)

──────────────────────────────────────────────────────────────────────
You: I'm mostly interested in the infrastructure and cost side — how are companies managing latency, reliability, and spending at scale?

Ariel: research


In [13]:
# Turn 3: Ask the agent to go research.
print_separator()
print("You: Yes, go ahead and research that.")
response = chat.turn("Yes, go ahead and research that.")
print()
print("Ariel:", response)

──────────────────────────────────────────────────────────────────────
You: Yes, go ahead and research that.

Ariel: Based on my research, it's clear that managing latency, reliability, and spending at scale is a significant challenge for companies using Large Language Models (LLMs) in production. My synthesis is that companies must carefully weigh the benefits of using LLMs against the significant infrastructure and cost challenges. 

To mitigate these expenses, companies are exploring techniques like caching, batching, and dynamic scaling to cut inference costs. For instance, using spot instances for non-urgent tasks and reserved instances for predictable workloads can help minimize costs. Additionally, optimizing memory usage, applying lower precision calculations, and configuring GPU features for asynchronous execution can also enhance performance.

The estimated annual costs for a production-grade LLM deployment can range from $800,000 to $1,200,000, primarily due to human infrast

In [14]:
# Turn 4: Ask to go deeper.
print_separator()
print("You: Tell me more about the cost optimization strategies specifically.")
response = chat.turn("Tell me more about the cost optimization strategies specifically.")
print()
print("Ariel:", response)

──────────────────────────────────────────────────────────────────────
You: Tell me more about the cost optimization strategies specifically.

Ariel: The user wants to explore the sub-topic of cost optimization strategies for Large Language Models (LLMs) in production. Based on my previous research, I've identified some key strategies that companies are using to optimize costs, including:

1. **Caching, batching, and dynamic scaling**: These techniques can help reduce inference costs by minimizing the number of requests made to the LLM.
2. **Using spot instances for non-urgent tasks and reserved instances for predictable workloads**: This approach can help companies save on infrastructure costs by utilizing cheaper instance types for non-critical tasks.
3. **Storing data and models in the same region**: This can help minimize transfer fees and reduce overall costs.
4. **Auto-scaling with serverless endpoints**: This approach can help match demand spikes and reduce costs by only provisi

In [15]:
# Turn 5: Wrap up.
print_separator()
print("You: Great, that's very helpful. Thanks!")
response = chat.turn("Great, that's very helpful. Thanks!")
print()
print("Ariel:", response)

──────────────────────────────────────────────────────────────────────
You: Great, that's very helpful. Thanks!

Ariel: It was a pleasure helping you understand how companies are using LLMs in production. To recap, companies are employing various strategies to optimize infrastructure costs and reduce latency, with estimated annual costs for production-grade LLM deployments ranging from $800,000 to $1,200,000.


---
## Part 5 — Inspecting Memory After the Conversation

In [16]:
print_heading("5.1  Conversation history")

print("Turns recorded:", len(chat.history))
for i, (user_text, agent_text) in enumerate(chat.history):
    print("\n--- Turn {} ---".format(i + 1))
    print("  You  :", user_text[:80], "..." if len(user_text) > 80 else "")
    print("  Ariel:", (agent_text or "")[:80], "..." if len(agent_text or "") > 80 else "")


══════════════════════════════════════════════════════════════════════
  5.1  Conversation history
══════════════════════════════════════════════════════════════════════

Turns recorded: 5

--- Turn 1 ---
  You  : I want to understand how companies are using LLMs in production right now. 
  Ariel: So you want to know how companies are using Large Language Models (LLMs) in prod ...

--- Turn 2 ---
  You  : I'm mostly interested in the infrastructure and cost side — how are companies ma ...
  Ariel: research 

--- Turn 3 ---
  You  : Yes, go ahead and research that. 
  Ariel: Based on my research, it's clear that managing latency, reliability, and spendin ...

--- Turn 4 ---
  You  : Tell me more about the cost optimization strategies specifically. 
  Ariel: The user wants to explore the sub-topic of cost optimization strategies for Larg ...

--- Turn 5 ---
  You  : Great, that's very helpful. Thanks! 
  Ariel: It was a pleasure helping you understand how companies are using LLMs in pro

In [17]:
print_heading("5.2  Memory variables")

# The agent stored various state variables during the conversation.
interesting_vars = [
    "research_topic", "research_angle", "research_phase",
    "intent_result", "research_plan_result",
]

for var in interesting_vars:
    value = chat.memory.get_var(var)
    if value is not None:
        preview = str(value)[:100]
        if len(str(value)) > 100:
            preview += "..."
        print("  {} = {}".format(var, preview))


══════════════════════════════════════════════════════════════════════
  5.2  Memory variables
══════════════════════════════════════════════════════════════════════

  research_topic = I want to understand how companies are using LLMs in production right now.
  research_angle = 
  research_phase = presented
  intent_result = done
  research_plan_result = [[{'action': 'search', 'params': {'query': 'LLM production infrastructure costs'}, 'reason': 'Gather...


In [18]:
print_heading("5.3  All events in memory")

print("Total events:", len(chat.memory.events))
print("Messages    :", len(chat.memory.idx_msgs))
print("Variables   :", len(chat.memory.idx_vars))
print("Logs        :", len(chat.memory.idx_logs))
print("Reflections :", len(chat.memory.idx_refs))


══════════════════════════════════════════════════════════════════════
  5.3  All events in memory
══════════════════════════════════════════════════════════════════════

Total events: 99
Messages    : 22
Variables   : 35
Logs        : 30
Reflections : 12


---
## Part 6 — Interactive Mode (optional)

Uncomment the cell below to run a fully interactive session.
Type your messages, and the agent will respond.
Type `q` or `quit` to exit.

In [19]:
# # Uncomment to run interactively:
# interactive_chat = CHAT(
#     research_agent,
#     greeting="Hi, I'm Ariel — your research analyst.  What would you like to explore?",
#     channel="cli",
#     user_label="You",
#     agent_label="Ariel",
# )
# interactive_chat.run()

---
## Recap

In this script we built a complete agent from ThoughtFlow primitives:

| Component | Primitive | Role |
|-----------|-----------|------|
| Intent classifier | DECIDE | Routes each turn to the right phase |
| Discovery | THOUGHT | Asks clarifying questions |
| Research planner | PLAN | Creates a multi-step search strategy |
| Web search | SEARCH | Fetches real results from Brave |
| Synthesis | THOUGHT | Produces an opinionated summary |
| Deep dive | THOUGHT | Explores sub-topics in detail |
| Wrap-up | THOUGHT | Closes the conversation gracefully |
| Interactive loop | CHAT | Handles turn-taking and history |

Next: **Script 04** — Deploy this agent to the cloud with ThoughtBase.
