# Comprehensive Educational Research Notebook

This notebook demonstrates a complete multi-agent research system that combines:

- **Full Research Workflow** (from `5_full_agent.ipynb`): Complete end-to-end research process
- **MCP Integration** (from `3_research_agent_mcp.ipynb`): Model Context Protocol for tool access
- **Test-Synchronized Examples** (from `0_consolidated_research_agent.ipynb`): Deterministic demonstrations

## Learning Objectives

By the end of this notebook, you will understand:

1. **System Architecture**: How multi-agent research systems are structured
2. **MCP Integration**: How to use Model Context Protocol for tool access
3. **LLM Impact**: How different prompts and LLM settings affect research quality
4. **Workflow Orchestration**: How LangGraph coordinates complex research workflows
5. **Test-Driven Development**: How to build reliable, testable AI systems

## Prerequisites

- Basic understanding of Python and async programming
- Familiarity with Jupyter notebooks
- Understanding of LLMs and their capabilities
- Basic knowledge of agent-based systems

## Notebook Structure

This notebook is organized into distinct sections, each building upon the previous:

1. **Bootstrap & Setup**: Environment configuration and initialization
2. **Core Components**: Understanding the building blocks
3. **MCP Integration**: Tool access and async operations
4. **Research Workflow**: Complete end-to-end process
5. **LLM Impact Analysis**: Understanding prompt and model effects
6. **Test Synchronization**: Ensuring reliability and reproducibility


In [109]:
# Use the centralized notebook bootstrap helper to ensure src is on sys.path
# and to run project bootstrap. This keeps notebooks DRY.
#
# In some environments the `notebooks` directory is not an importable package
# (for example, when running the notebook from the repo root without an __init__.py).
# To avoid `ModuleNotFoundError: No module named 'notebooks'`, we load the
# helper by file path using importlib so the notebook works regardless.
import importlib.util
from pathlib import Path
def _locate_nb_bootstrap():
    candidates = []
    try:
        candidates.append(Path(__file__).parent)
    except NameError:
        pass
    candidates.append(Path.cwd())
    for c in candidates:
        p = c / 'nb_bootstrap.py'
        if p.exists():
            return p
    for parent in [Path.cwd()] + list(Path.cwd().parents):
        p = parent / 'notebooks' / 'nb_bootstrap.py'
        if p.exists():
            return p
    raise ImportError('Could not locate nb_bootstrap.py relative to notebook or cwd')
helper_path = _locate_nb_bootstrap()
spec = importlib.util.spec_from_file_location('nb_bootstrap', str(helper_path))
if spec is None or spec.loader is None:
    raise ImportError(f'Failed to load nb_bootstrap from {helper_path!s}: spec or loader is None')
nb_bootstrap = importlib.util.module_from_spec(spec)
spec.loader.exec_module(nb_bootstrap)
ensure_src_and_bootstrap = nb_bootstrap.ensure_src_and_bootstrap

# Returns (settings, console, logger)
settings, console, logger = ensure_src_and_bootstrap()


In [110]:
# Try both loggers: std logging with RichHandler and Loguru with Rich sink
import logging
from rich.logging import RichHandler
from research_agent_framework.config import get_logger, get_console
from loguru import logger as loguru_logger

console = get_console()

# Std logging with RichHandler
std_logger = logging.getLogger("rich_std_demo")
std_logger.setLevel("INFO")
if not any(isinstance(h, RichHandler) for h in std_logger.handlers):
    std_logger.addHandler(RichHandler(console=console, rich_tracebacks=True))
std_logger.info("[std] This is an info message with RichHandler.")
std_logger.warning("[std] This is a warning message with RichHandler.")

# Loguru with Rich sink (configured by bootstrap)
loguru_logger.info("[loguru] This is an info message with Rich color.")
loguru_logger.warning("[loguru] This is a warning message with Rich color.")


## Section 1: Bootstrap & Setup

The first step in any robust AI system is proper initialization and configuration. This section demonstrates:

- **Environment Setup**: Loading configuration and setting up logging
- **Path Management**: Ensuring proper imports and module discovery
- **Bootstrap Process**: Initializing the research framework
- **Console Configuration**: Setting up rich output for educational purposes

### Why Bootstrap Matters

Bootstrap ensures that:
1. Environment variables are loaded correctly
2. Logging is configured for debugging and monitoring
3. Console output is formatted for readability
4. All dependencies are properly initialized
5. Error handling is set up with rich tracebacks
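
The guarantees listed above can be illustrated with the standard library alone. The snippet below is a minimal sketch, not the framework's `bootstrap()`: the function name `bootstrap_sketch`, the logger name `bootstrap_sketch`, and the `LOG_LEVEL` variable are hypothetical, and plain `logging` stands in for the Loguru/Rich wiring the project uses. The point it shows is idempotence: calling the function twice must not attach a second handler.

```python
import logging
import os
import sys

_BOOTSTRAPPED = False  # module-level guard so repeated calls become no-ops


def bootstrap_sketch() -> logging.Logger:
    """Configure logging once and return the shared logger (hypothetical helper)."""
    global _BOOTSTRAPPED
    logger = logging.getLogger("bootstrap_sketch")
    if _BOOTSTRAPPED:
        return logger

    # Read the level from the environment, with a safe default.
    level_name = os.environ.get("LOG_LEVEL", "INFO").upper()
    logger.setLevel(getattr(logging, level_name, logging.INFO))

    # Attach exactly one console handler with a readable format.
    handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False  # avoid duplicate lines via the root logger

    _BOOTSTRAPPED = True
    return logger


first = bootstrap_sketch()
second = bootstrap_sketch()  # second call reuses the same logger and handler
first.info("bootstrap ran once; handlers attached: %d", len(first.handlers))
```

Because notebook cells are often re-run, a guard like this is what keeps a re-executed bootstrap cell from multiplying log output.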


## TODO Anchors and Test Cross‑links

- [ ] Section 2: Core Components — see `tests/test_research_agent.py`
- [ ] Section 3: MCP Integration — see `tests/test_renderer_rich.py`
- [ ] Section 4: Research Workflow — see `tests/test_end_to_end_flow.py`
- [ ] Section 5: LLM Impact Analysis — see `tests/test_llm_mock.py`
- [ ] Section 6: Test Synchronization — see `tests/test_renderer.py`
- [ ] Search Adapters (SerpAPI/Tavily) — see `tests/test_serpapi_adapter.py`, `tests/test_tavily_adapter.py`, `tests/test_serpapi_and_tavily_adapters.py`, `tests/test_adapters*.py`
- [ ] Supervisor Policy Demo — see `tests/test_supervisor_policy.py`, `tests/test_supervisor_policy_deterministic.py`
- [ ] Bootstrap & Config Walkthrough — see `tests/test_bootstrap.py`, `tests/test_bootstrap_wiring.py`, `tests/test_config.py`


In [111]:
# Reuse the bootstrap helper loaded in the first cell (`ensure_src_and_bootstrap`)
# instead of repeating the file-path import logic here.
settings, console, logger = ensure_src_and_bootstrap()

def nb_console():
    """Return the shared `Console` from `settings.console`, or a new `Console` as a fallback."""
    try:
        return settings.console
    except Exception:
        from rich.console import Console
        return Console()


### Bootstrap Process Demonstration

Now let's run the bootstrap process and see what it initializes. This demonstrates how a production AI system should start up.


In [112]:
# Bootstrap the research framework
# This initializes logging, console, and environment configuration
from research_agent_framework.bootstrap import bootstrap
from research_agent_framework.config import get_settings, get_console, get_logger
from rich.panel import Panel
from rich.table import Table

# Run bootstrap to configure the environment
console.print("🔄 Running bootstrap process...")
bootstrap()

# Get the configured settings, console, and logger
settings = get_settings()
console = settings.console  # same shared instance that get_console() returns
logger = settings.logger  # same shared instance that get_logger() returns

# Display bootstrap information
console.print(Panel(
    "[bold green]✅ Bootstrap Complete[/bold green]\n\n"
    "The research framework has been initialized with:\n"
    "• Environment variables loaded\n"
    "• Logging configured (console sink)\n"
    "• Console formatting enabled\n"
    "• Error handling set up",
    title="Bootstrap Status",
    expand=False
))

# Show configuration details
config_table = Table(title="Framework Configuration", show_header=True, header_style="bold magenta")
config_table.add_column("Component", style="cyan", width=20)
config_table.add_column("Status", style="green", width=15)
config_table.add_column("Details", style="white", width=40)

config_table.add_row("Environment", "✅ Loaded", "Variables from .env file (if present)")
config_table.add_row("Logging", "✅ Configured", "Loguru wired to Rich Console")
config_table.add_row("Console", "✅ Ready", "Rich formatting enabled")
config_table.add_row("Error Handling", "✅ Active", "Rich tracebacks installed")

console.print(config_table)


In [113]:
# Logging: after bootstrap
logger.info("Bootstrap complete. Environment, console, and logging configured.")


### Configuration impact on behavior and logging

This section explains how changing settings (env vars) impacts runtime:

- `model_name` selects the LLM and `model_temperature` controls sampling; the defaults (`mock-model`, `0.0`) keep outputs deterministic.
- `LOGGING__LEVEL` and `LOGGING__FMT` change the logging verbosity and format; the `Settings.logger` property reflects changes when `get_settings(force_reload=True)` is used.
- `enable_tracing` toggles optional tracing hooks (visualizations guarded by env).
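
One way the `force_reload` behaviour could work is a module-level cache that is rebuilt from the environment only on request. The sketch below assumes exactly that; `SketchSettings` and `get_settings_sketch` are hypothetical names, and the real `Settings` class (Pydantic-based) may be implemented differently.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SketchSettings:
    """Hypothetical stand-in for the project's Settings object."""
    logging_level: str


_cached: Optional[SketchSettings] = None


def get_settings_sketch(force_reload: bool = False) -> SketchSettings:
    """Return cached settings; re-read the environment only when asked to."""
    global _cached
    if _cached is None or force_reload:
        _cached = SketchSettings(
            logging_level=os.environ.get("LOGGING__LEVEL", "INFO").upper()
        )
    return _cached


original = os.environ.get("LOGGING__LEVEL")  # remember the caller's value
try:
    os.environ.pop("LOGGING__LEVEL", None)
    baseline = get_settings_sketch(force_reload=True)
    os.environ["LOGGING__LEVEL"] = "DEBUG"
    stale = get_settings_sketch()                   # cache hit: env change not seen
    fresh = get_settings_sketch(force_reload=True)  # cache rebuilt from the env
finally:
    # Restore the environment and the cache so later cells are unaffected.
    if original is None:
        os.environ.pop("LOGGING__LEVEL", None)
    else:
        os.environ["LOGGING__LEVEL"] = original
    get_settings_sketch(force_reload=True)

print(baseline.logging_level, stale.logging_level, fresh.logging_level)
# prints: INFO INFO DEBUG
```

The `stale` value shows why a reload is needed at all: mutating `os.environ` has no effect on an object that was already built.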

Below is a safe example demonstrating how to reload settings at runtime and observe logger level changes without restarting the notebook.

Note: This example mutates process environment variables temporarily and reloads `Settings` with `force_reload=True` to illustrate effects in a deterministic demo.


In [114]:
# Demonstration: change logging level via env and reload settings
import os
from research_agent_framework.config import get_settings, get_logger

# Show current logging level
settings = get_settings()
print("Before reload: logging.level=", settings.logging.level)

# Temporarily set environment to DEBUG and reload
os.environ["LOGGING__LEVEL"] = "DEBUG"
settings = get_settings(force_reload=True)
print("After reload: logging.level=", settings.logging.level)

# Acquire logger and show that level reflects setting
logger = get_logger()
logger.info("This is an info message (should always show at INFO/DEBUG)")
logger.debug("This is a debug message (visible only when level=DEBUG)")

# Clean up: restore env and reload to original for deterministic notebook runs
os.environ.pop("LOGGING__LEVEL", None)
settings = get_settings(force_reload=True)
print("Restored: logging.level=", settings.logging.level)


Before reload: logging.level= INFO
After reload: logging.level= DEBUG


2025-09-13T11:50:18.048206-0700 INFO This is an info message (should always show at INFO/DEBUG)
2025-09-13T11:50:18.050303-0700 DEBUG This is a debug message (visible only when level=DEBUG)


Restored: logging.level= INFO


In [115]:
# Logging: config reload demo
logger.info(f"Reloaded settings: logging.level={settings.logging.level}")


2025-09-13T11:50:18.061883-0700 INFO Reloaded settings: logging.level=INFO


## Architecture & Technologies (Brief Overview)

- **Settings & Bootstrap**: `Settings` (Pydantic) loads env; `bootstrap()` enables rich tracebacks and wires Loguru → Rich `Console`.
- **Logging**: `LoggingConfig` fields (`level`, `fmt`, `backend`) drive a lazy `logger` property; helpers delegate to the same instances.
- **Agents & Models**: Agents coordinate research steps; Pydantic models (`SerpResult`, `Scope`, etc.) provide typed state.
- **Adapters**: Search adapters (SerpAPI/Tavily) expose deterministic stubs with optional live paths.
- **Prompts/Renderer**: Jinja templates rendered with rich-markdown output for clarity.
- **Tests as Specs**: Notebook sections mirror `tests/` behaviors for deterministic, reproducible demos.


## Architecture Diagram (Components)

```mermaid
flowchart LR
  subgraph Agents
    A[Research Agent]
    S[Scoping Agent]
    SP[Supervisor]
  end

  subgraph Framework
    CFG[Settings]
    LCFG[LoggingConfig]
    CON[Console]
    LGR[Logger]
    PR[Prompt Renderer]
    LLM[LLM Client]
  end

  subgraph Adapters
    SERP[SerpAPI Adapter]
    TAV[Tavily Adapter]
  end

  A --> PR
  A --> LLM
  A --> SERP
  A --> TAV
  S --> PR
  SP --> A
  SP --> S

  CFG --> CON
  CFG --> LCFG
  LCFG --> LGR
  LGR --> CON
```

In [ ]:
import sys
import os
from pathlib import Path

# Locate repository root by searching upwards for either 'src/research_agent_framework' or 'research_agent_framework' folder
repo_cwd = Path.cwd().resolve()
found_root = None
for candidate in [repo_cwd] + list(repo_cwd.parents):
    if (candidate / 'src' / 'research_agent_framework').exists() or (candidate / 'research_agent_framework').exists():
        found_root = candidate.resolve()
        break

# If we found a root, prefer 'src' if present, otherwise use the root itself
if found_root is not None:
    src_candidate = (found_root / 'src') if (found_root / 'src' / 'research_agent_framework').exists() else found_root
    src_path = str(src_candidate)
    if src_path not in sys.path:
        sys.path.insert(0, src_path)
        # Also prepend to PYTHONPATH (keeping any existing entries) so spawned kernels may pick it up
        existing_pythonpath = os.environ.get('PYTHONPATH', '')
        os.environ['PYTHONPATH'] = os.pathsep.join(p for p in (src_path, existing_pythonpath) if p)

# Now import project bootstrap and helpers safely
from research_agent_framework.bootstrap import bootstrap
from research_agent_framework.config import get_settings, get_console, get_logger

# Initialize environment, console, and logging (idempotent)
bootstrap()


## Data Model Diagram (Key models & relationships)

```mermaid
classDiagram
    direction LR
    class SerpResult {
        string id
        string title
        string snippet
        string url
        list citations
        +from_raw(dict) SerpResult
    }

    class Scope {
        string id
        string question
        list constraints
        list clarifications
    }

    class ResearchTask {
        string id
        Scope scope
        list steps
        status
    }

    class EvalResult {
        string task_id
        bool success
        float score
        string feedback
        dict details
    }

    class AgentContext {
        settings
        console
        logger
        llm_client
        search_adapter
    }

    ResearchTask "1" o-- "0..*" SerpResult : evidence
    Scope "1" o-- "0..*" ResearchTask : generates
    ResearchTask "1" o-- "0..*" EvalResult : evaluated_by
    AgentContext "1" -- "*" ResearchTask : used_by
```
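
To make the diagram concrete, here is a rough sketch of three of the models as plain dataclasses. The project uses Pydantic models, so this is a swapped-in technique for illustration only. The `from_raw` key names, the `"pending"` status value, and the `evidence` field are assumptions, not the framework's documented schema.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SerpResult:
    id: str
    title: str
    snippet: str
    url: str
    citations: list[str] = field(default_factory=list)

    @classmethod
    def from_raw(cls, raw: dict[str, Any]) -> "SerpResult":
        """Build a result from a raw dict; missing text fields default to ''.

        The key names mirror the attribute names, which is an assumption
        about the raw payload rather than a documented provider schema.
        """
        return cls(
            id=str(raw.get("id", "")),
            title=str(raw.get("title", "")),
            snippet=str(raw.get("snippet", "")),
            url=str(raw.get("url", "")),
            citations=list(raw.get("citations", [])),
        )


@dataclass
class Scope:
    id: str
    question: str
    constraints: list[str] = field(default_factory=list)
    clarifications: list[str] = field(default_factory=list)


@dataclass
class ResearchTask:
    id: str
    scope: Scope
    steps: list[str] = field(default_factory=list)
    status: str = "pending"  # assumed status vocabulary
    evidence: list[SerpResult] = field(default_factory=list)  # assumed field name


scope = Scope(id="s1", question="Best coffee shops in SOMA?", constraints=["open now"])
task = ResearchTask(id="t1", scope=scope)
task.evidence.append(SerpResult.from_raw({"id": "r1", "title": "Sightglass Coffee"}))
print(task.status, task.evidence[0].title, repr(task.evidence[0].url))
# prints: pending Sightglass Coffee ''
```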


### Why this setup (results of the design)

- **Env overrides work**: `LOGGING__LEVEL`, `LOGGING__FMT`, `LOGGING__BACKEND` populate real fields; the `logger` property reflects them at access time.
- **Single ownership via properties**: `settings.console` and `settings.logger` are the shared instances used everywhere.
- **Helpers remain simple**: `get_console()` / `get_logger()` just delegate to those shared instances when you prefer function calls.
- **Robust bootstrap**: `bootstrap()` installs rich tracebacks, ensures a Console, and wires Loguru → Console idempotently.
- **Notebook consistency**: Cells use property access (e.g., `settings.console`) for clarity; helpers are equivalent if preferred.


## Env vars: required and optional (4.1)

Key environment variables used by the framework (recommended defaults shown):

| Variable | Required? | Default | Description |
|---|---:|---|---|
| `SERPAPI_API_KEY` | Optional | — | API key for SerpAPI; if missing, notebook defaults to `Mock` adapter |
| `TAVILY_API_KEY` | Optional | — | API key for Tavily adapter; if missing, notebook defaults to `Mock` adapter |
| `LOGGING__LEVEL` | Optional | `INFO` | Logging verbosity |
| `LOGGING__FMT` | Optional | project default | Logging format string |
| `MODEL_NAME` | Optional | `mock-model` | LLM model to use when not mocking |
| `MODEL_TEMPERATURE` | Optional | `0.0` | Controls LLM sampling |
| `ENABLE_TRACING` | Optional | `False` | Enable tracing hooks |

This section lists recommended env vars, safe defaults, and guidance on toggling live providers vs mocks.
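
As an illustration of how the defaults in the table might be resolved, the sketch below parses four of the variables into typed fields. `EnvDefaults`, `load_env_defaults`, and the set of accepted truthy spellings are hypothetical; the framework's own `Settings` class handles this in its own way.

```python
import os
from dataclasses import dataclass
from typing import Mapping


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings; everything else is False."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


@dataclass(frozen=True)
class EnvDefaults:
    model_name: str
    model_temperature: float
    logging_level: str
    enable_tracing: bool


def load_env_defaults(env: Mapping[str, str] = os.environ) -> EnvDefaults:
    """Resolve the table's variables from `env`, falling back to its defaults."""
    return EnvDefaults(
        model_name=env.get("MODEL_NAME", "mock-model"),
        model_temperature=float(env.get("MODEL_TEMPERATURE", "0.0")),
        logging_level=env.get("LOGGING__LEVEL", "INFO"),
        enable_tracing=_as_bool(env.get("ENABLE_TRACING", "False")),
    )


# Passing an explicit mapping keeps the demo independent of the real environment.
empty = load_env_defaults({})
tuned = load_env_defaults({"MODEL_TEMPERATURE": "0.7", "ENABLE_TRACING": "true"})
print(empty)
print(tuned)
```

Accepting a mapping argument rather than reading `os.environ` directly makes the parsing easy to test without mutating process state.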

In [116]:
# Safe demo: display resolved settings and show defaults
from research_agent_framework.config import get_settings

s = get_settings(force_reload=True)
print('MODEL_NAME =', s.model_name)
print('MODEL_TEMPERATURE =', s.model_temperature)
print('LOGGING__LEVEL =', s.logging.level)
print('ENABLE_TRACING =', s.enable_tracing)

# Display whether external adapter keys are present
import os
print('SERPAPI_API_KEY present:', bool(os.environ.get('SERPAPI_API_KEY')))
print('TAVILY_API_KEY present:', bool(os.environ.get('TAVILY_API_KEY')))


MODEL_NAME = mock-model
MODEL_TEMPERATURE = 0.0
LOGGING__LEVEL = INFO
ENABLE_TRACING = False
SERPAPI_API_KEY present: False
TAVILY_API_KEY present: True


### Safe defaults and fallback behavior (4.2)

This section describes how the notebook and framework behave when external API keys are not provided. By default, the notebook uses deterministic `Mock` adapters and `MockLLM` to keep examples reproducible and low-cost.

Key points:

- If neither `SERPAPI_API_KEY` nor `TAVILY_API_KEY` is set, the framework falls back to the `Mock` search adapter.
- If `MODEL_NAME` is set to a real provider name, `llm_factory` creates a live client; otherwise `MockLLM` is used.
- Call `get_settings(force_reload=True)` after mutating `os.environ` so the change takes effect at runtime; reloading is safe to repeat.

The next code cell runs a safe demo that temporarily unsets adapter-related env vars, reloads settings, and prints which adapters/LLM the framework would use.
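
The fallback rule can be sketched without the framework. The functions below are hypothetical (`resolve_search_adapter`, `resolve_llm`), and checking SerpAPI before Tavily is an assumed precedence; the real selection lives in the framework's adapters and `llm_factory`. The demo hides the provider variables for the duration of the check and restores them afterwards.

```python
import os
from typing import Mapping

ADAPTER_KEYS = ("SERPAPI_API_KEY", "TAVILY_API_KEY")


def resolve_search_adapter(env: Mapping[str, str]) -> str:
    """Name the search adapter to use; 'mock' when no provider key is set.

    Checking SerpAPI before Tavily is an assumed precedence.
    """
    if env.get("SERPAPI_API_KEY"):
        return "serpapi"
    if env.get("TAVILY_API_KEY"):
        return "tavily"
    return "mock"


def resolve_llm(env: Mapping[str, str]) -> str:
    """Name the LLM to use; the default 'mock-model' maps to the mock client."""
    model_name = env.get("MODEL_NAME", "mock-model")
    return "mock" if model_name == "mock-model" else model_name


# Temporarily hide provider settings, report the fallback, then restore them.
saved = {
    key: os.environ.pop(key)
    for key in (*ADAPTER_KEYS, "MODEL_NAME")
    if key in os.environ
}
try:
    fallback_choice = (resolve_search_adapter(os.environ), resolve_llm(os.environ))
    print("search adapter:", fallback_choice[0], "| llm:", fallback_choice[1])
    # prints: search adapter: mock | llm: mock
finally:
    os.environ.update(saved)
```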

### Switchboard helper: centralize mock/live toggles (4.3)

This small utility centralizes environment-driven toggles used by the notebook and examples. Use the helper to make notebook cells short and declarative — change the environment in one place and the helper will consistently report whether the framework will use mocks or live providers.
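
For readers curious what such a helper might look like, here is a hedged sketch. The real `use_mock_search` and `use_mock_llm` take a settings object; the versions below are hypothetical, take a plain mapping, and assume that `FORCE_USE_MOCK` overrides everything else.

```python
from typing import Mapping

_TRUTHY = {"1", "true", "yes", "on"}


def mocks_forced(env: Mapping[str, str]) -> bool:
    """True when FORCE_USE_MOCK is set to a truthy value."""
    return env.get("FORCE_USE_MOCK", "").strip().lower() in _TRUTHY


def use_mock_search_sketch(env: Mapping[str, str]) -> bool:
    """Mock search if forced, or if no search provider key is configured."""
    if mocks_forced(env):
        return True
    return not (env.get("SERPAPI_API_KEY") or env.get("TAVILY_API_KEY"))


def use_mock_llm_sketch(env: Mapping[str, str]) -> bool:
    """Mock LLM if forced, or if MODEL_NAME is left at its 'mock-model' default."""
    if mocks_forced(env):
        return True
    return env.get("MODEL_NAME", "mock-model") == "mock-model"


live_env = {"TAVILY_API_KEY": "dummy-key", "MODEL_NAME": "some-live-model"}
forced_env = {**live_env, "FORCE_USE_MOCK": "1"}
print(use_mock_search_sketch(live_env), use_mock_llm_sketch(live_env))      # False False
print(use_mock_search_sketch(forced_env), use_mock_llm_sketch(forced_env))  # True True
```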

The code cell below demonstrates toggling `FORCE_USE_MOCK` and observing the resolved behavior for search adapters and the LLM.

In [117]:
# Switchboard demo cell: toggle FORCE_USE_MOCK and show effective choices
import os
from research_agent_framework.helpers.switchboard import use_mock_search, use_mock_llm
from research_agent_framework.config import get_console, get_settings

console = get_console()

# Baseline
s = get_settings(force_reload=True)
console.print(f"Baseline: use_mock_search={use_mock_search(s)}, use_mock_llm={use_mock_llm(s)}")

# Force use of mocks
os.environ['FORCE_USE_MOCK'] = '1'
s_forced = get_settings(force_reload=True)
console.print(f"After FORCE_USE_MOCK=1: use_mock_search={use_mock_search(s_forced)}, use_mock_llm={use_mock_llm(s_forced)}")

# Clean up
os.environ.pop('FORCE_USE_MOCK', None)
s_restored = get_settings(force_reload=True)
console.print(f"After restore: use_mock_search={use_mock_search(s_restored)}, use_mock_llm={use_mock_llm(s_restored)}")


### Central switchboard (6.0) - single place to toggle mocks vs live providers

This small, editable cell is the recommended place to toggle the environment for the entire notebook when you want to run the examples against live providers. By default the notebook is mock-first (safe and deterministic).

Change `FORCE_USE_MOCK` below or set provider-specific keys (`SERPAPI_API_KEY`, `TAVILY_API_KEY`, `MODEL_NAME`) to run with live services. Use `get_settings(force_reload=True)` after editing to apply changes in subsequent cells.

In [118]:
# Central switchboard: edit this cell to toggle mock vs live globally for the notebook
# Options:
#  - Set FORCE_USE_MOCK=1 to force all mocks
#  - Unset FORCE_USE_MOCK and set provider keys (SERPAPI_API_KEY/TAVILY_API_KEY/MODEL_NAME) to use live providers
import os
# Example: force mocks for all demo cells (safe default)
os.environ['FORCE_USE_MOCK'] = os.environ.get('FORCE_USE_MOCK', '1')
# Example: to use live providers, uncomment and set real keys here (DO NOT commit secrets)
# os.environ.pop('FORCE_USE_MOCK', None)
# os.environ['SERPAPI_API_KEY'] = 'sk-...your-key...'
# os.environ['TAVILY_API_KEY'] = 'tk-...your-key...'
# os.environ['MODEL_NAME'] = 'openai-gpt-4'

# Apply settings reload so later cells observe the new environment
from research_agent_framework.config import get_settings
s = get_settings(force_reload=True)
print('Central switchboard applied. Current: FORCE_USE_MOCK=', os.environ.get('FORCE_USE_MOCK'), 'MODEL_NAME=', s.model_name)


Central switchboard applied. Current: FORCE_USE_MOCK= 1 MODEL_NAME= mock-model


## 7.0 User Clarification and Scoping Demo

This section demonstrates how the research agent iteratively refines the user's request using deterministic (mock) responses. The workflow ensures that the agent only proceeds to research after sufficient clarification, mirroring the logic in `research_agent_scope.py` and the corresponding tests.

- **7.2 Iterative refinement:** The agent asks clarifying questions until enough information is provided.
- **7.3 Scope state capture:** The agent validates and stores the scope state for downstream research.

Cells below show the clarification loop and state validation, using mock adapters for reproducibility.
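
Before the framework-backed cells, the shape of the clarification loop can be shown with a self-contained stub. Everything here is hypothetical: `ScopeState`, `stub_scope_agent`, and the keyword-based completeness rule are stand-ins for the LangGraph workflow in `research_agent_scope.py`, chosen only so the loop terminates deterministically.

```python
from dataclasses import dataclass, field


@dataclass
class ScopeState:
    messages: list[str] = field(default_factory=list)
    research_brief: str = ""


REQUIRED_HINTS = ("open now", "soma")  # assumed completeness rule for the stub


def stub_scope_agent(state: ScopeState) -> str:
    """Return a clarifying question, or fill in the brief once hints are present."""
    text = " ".join(state.messages).lower()
    missing = [hint for hint in REQUIRED_HINTS if hint not in text]
    if missing:
        return f"Could you clarify: {', '.join(missing)}?"
    state.research_brief = " ".join(state.messages)
    return "Thanks, I have enough information to begin research."


scripted_replies = iter(["Places that are open now.", "Highest rated, in SOMA."])
state = ScopeState(messages=["Find the best coffee shops in SF."])

max_turns = 3
for turn in range(1, max_turns + 1):
    reply = stub_scope_agent(state)
    print(f"Turn {turn} - Agent: {reply}")
    if state.research_brief:
        break  # scope is complete; research can start
    # Feed the next scripted user answer; stop early if the script runs out.
    answer = next(scripted_replies, None)
    if answer is None:
        break
    state.messages.append(answer)
else:
    print(f"Max turns ({max_turns}) reached without a complete scope.")
```

With the scripted replies above, the stub asks two clarifying questions and produces a brief on the third turn.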

In [119]:
# 7.2 Iterative refinement using deterministic responses (with max turns)
from deep_research_from_scratch.research_agent_scope import scope_research, AgentInputState
from deep_research_from_scratch.state_scope import AgentState
from langchain_core.messages import HumanMessage, AnyMessage
from typing import cast

# Simulate a user request that needs clarification
user_messages = [HumanMessage(content="Find the best coffee shops in SF.")]
input_state = AgentInputState(messages=cast(list[AnyMessage], user_messages))

max_turns = 3
turn = 0
while turn < max_turns:
    result = scope_research.invoke(input_state)
    clarify_msg = result["messages"][-1].content
    print(f"Turn {turn+1} - Agent: {clarify_msg}")
    # Simulate user providing more detail after each clarification
    if "clarify" in clarify_msg.lower() or "more detail" in clarify_msg.lower():
        if turn == 0:
            user_messages.append(HumanMessage(content="I want places open now and not paid."))
        elif turn == 1:
            user_messages.append(HumanMessage(content="No cover charge, open now, highest ratings in SOMA."))
        input_state = AgentInputState(messages=cast(list[AnyMessage], user_messages))
        turn += 1
    else:
        print("Agent is ready to start research or has sufficient info.")
        break
else:
    print(f"Max turns ({max_turns}) reached. Please provide more specific details to proceed.")
    print("Conversation so far:")
    for msg in user_messages:
        print("User:", msg.content)
    print("Last agent message:", clarify_msg)


Turn 1 - Agent: Thank you for your request. You are looking for the best coffee shops in San Francisco. I have sufficient information to proceed and will now begin researching the top coffee shops in SF for you.
Agent is ready to start research or has sufficient info.


In [120]:
# Logging: scoping clarification loop
logger.info("Starting scoping clarification loop for user request.")


2025-09-13T11:50:21.295960-0700 INFO Starting scoping clarification loop for user request.


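**Explanation:**

The clarification loop in cell 7.2 invokes the scoping agent, checks whether its reply asks for more detail, and supplies a scripted follow-up message if it does, for at most `max_turns` turns. The agent proceeds to research only once the scope is sufficiently defined. In the recorded run above, the agent judged the first message sufficient, so the loop exited after one turn. This mirrors the logic in `research_agent_scope.py` and is validated by tests in `test_research_agent.py`.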

### 7.3 Capture scope state object and validation

This cell demonstrates how the agent captures the scope state object after clarification and validates its structure, ensuring downstream research steps are well-defined and reproducible.

In [121]:
# 7.3 Capture scope state object and validate
from deep_research_from_scratch.research_agent_scope import scope_research, AgentInputState
from deep_research_from_scratch.state_scope import AgentState
from langchain_core.messages import HumanMessage, AnyMessage
from assertpy import assert_that
from typing import cast
from research_agent_framework.config import get_console
from rich.markdown import Markdown

console = get_console()

# Simulate clarified user request
user_messages = [
    HumanMessage(content="Find the best coffee shops in SF."),
    HumanMessage(content="I want places open now and not paid."),
    HumanMessage(content="No cover charge, open now, highest ratings in SOMA.")
]
input_state = AgentInputState(messages=cast(list[AnyMessage], user_messages))
result = scope_research.invoke(input_state)

# Validate scope state object
assert_that(result).contains_key("research_brief")
assert_that(result["research_brief"]).is_not_empty()
console.print(Markdown(f"**Research brief:**\n\n{result['research_brief']}"))


In [122]:
# Logging: scope state validation
logger.info("Validating scope state object after clarification.")


2025-09-13T11:50:23.946856-0700 INFO Validating scope state object after clarification.


## 9.1 Supervisor Policy Demo: Deterministic Multi-Agent Coordination

This section demonstrates how the supervisor agent orchestrates multiple research agents using a deterministic policy. The workflow ensures reproducibility and mirrors the logic in `multi_agent_supervisor.py` and the corresponding tests.

- **Supervisor Policy:** Coordinates agent actions, assigns tasks, and validates outcomes.
- **Deterministic Steps:** Uses mock adapters and fixed agent responses for educational clarity.

Cells below show the supervisor's decision-making process and agent coordination, with output rendered using Rich's Markdown for clarity.
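
The idea of a deterministic policy can be sketched independently of the framework. The classes below (`Task`, `Outcome`, `FixedOrderPolicy`, `SketchSupervisor`) are hypothetical and much simpler than whatever `multi_agent_supervisor.py` contains; they assume that determinism comes from a fixed task ordering plus canned agent responses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    agent_id: str
    description: str


@dataclass(frozen=True)
class Outcome:
    agent_id: str
    outcome: str


class FixedOrderPolicy:
    """Deterministic policy: always run tasks sorted by agent_id."""

    def order(self, tasks: list[Task]) -> list[Task]:
        return sorted(tasks, key=lambda task: task.agent_id)


class SketchSupervisor:
    def __init__(self, policy: FixedOrderPolicy) -> None:
        self.policy = policy

    def coordinate(self, tasks: list[Task]) -> list[Outcome]:
        # A canned response replaces real agent work, so reruns give identical output.
        return [
            Outcome(task.agent_id, f"Completed: {task.description}")
            for task in self.policy.order(tasks)
        ]


shuffled_tasks = [
    Task("A3", "Filter for no cover charge in SOMA."),
    Task("A1", "Find top-rated coffee shops in SF."),
    Task("A2", "Check which shops are open now."),
]
sketch_supervisor = SketchSupervisor(FixedOrderPolicy())
sketch_results = sketch_supervisor.coordinate(shuffled_tasks)
for item in sketch_results:
    print(item.agent_id, "->", item.outcome)
```

Regardless of the order in which tasks are submitted, the outcomes come back as A1, A2, A3, which is what makes the run reproducible.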

In [123]:
# 9.1 Supervisor Policy Demo: Deterministic Multi-Agent Coordination
from deep_research_from_scratch.multi_agent_supervisor import Supervisor, AgentTask, DeterministicPolicy
from deep_research_from_scratch.state_scope import AgentState
from research_agent_framework.config import get_console
from rich.markdown import Markdown

console = get_console()

# Define mock agent tasks
agent_tasks = [
    AgentTask(agent_id="A1", description="Find top-rated coffee shops in SF."),
    AgentTask(agent_id="A2", description="Check which shops are open now."),
    AgentTask(agent_id="A3", description="Filter for no cover charge in SOMA.")
]

# Use deterministic supervisor policy for reproducibility
policy = DeterministicPolicy()
supervisor = Supervisor(policy=policy)

# Run coordination workflow
results = supervisor.coordinate(agent_tasks)

# Validate and display results
for result in results:
    assert hasattr(result, "agent_id")
    assert hasattr(result, "outcome")
    console.print(Markdown(f"**Agent {result.agent_id} outcome:**\n\n{result.outcome}"))


In [124]:
# Logging: supervisor policy demo
logger.info("Supervisor policy demo: coordinating agent tasks.")


2025-09-13T11:50:23.970972-0700 INFO Supervisor policy demo: coordinating agent tasks.


## 9.2 Logging Agent Messages and State Transitions

This section demonstrates how the supervisor and agents log their messages and state transitions during coordination. Logging is essential for debugging, monitoring, and educational clarity.

- **Structured Logging:** Each agent logs its actions and state changes.
- **State Transition Tracking:** Supervisor logs before and after coordination steps.

Cells below show how logging is integrated into the multi-agent workflow, with output rendered using Rich for clarity.
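
Logger injection itself needs nothing beyond the standard library. In this sketch, `ListHandler` and `LoggedAgent` are hypothetical names and plain `logging` replaces the project's Loguru logger. It also clears existing handlers before attaching one, since stacking a new handler every time a cell is re-run is a common way to end up with each message printed several times.

```python
import logging


class ListHandler(logging.Handler):
    """Collect formatted records in memory so a demo or test can inspect them."""

    def __init__(self) -> None:
        super().__init__()
        self.lines: list[str] = []

    def emit(self, record: logging.LogRecord) -> None:
        self.lines.append(self.format(record))


class LoggedAgent:
    def __init__(self, agent_id: str, description: str, logger: logging.Logger) -> None:
        self.agent_id = agent_id
        self.description = description
        self.logger = logger  # injected by the caller, not looked up globally

    def run(self) -> str:
        self.logger.info("Agent %s starting: %s", self.agent_id, self.description)
        outcome = f"Completed: {self.description}"
        self.logger.info("Agent %s finished: %s", self.agent_id, outcome)
        return outcome


sketch_logger = logging.getLogger("injection_sketch")
sketch_logger.setLevel(logging.INFO)
sketch_logger.propagate = False  # keep records away from root handlers
sketch_logger.handlers.clear()   # re-running the cell must not stack handlers
capture = ListHandler()
sketch_logger.addHandler(capture)

sketch_logger.info("Supervisor: Starting coordination")
agent = LoggedAgent("A1", "Find top-rated coffee shops in SF.", sketch_logger)
agent_outcome = agent.run()
sketch_logger.info("Supervisor: Coordination complete")

for line in capture.lines:
    print(line)
```

Because the agent receives its logger as a constructor argument, a test can pass in a capturing logger and assert on the exact sequence of state transitions.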

In [125]:
# 9.2 Logging Agent Messages and State Transitions (with logger injection demo)
from deep_research_from_scratch.multi_agent_supervisor import Supervisor, AgentTask, DeterministicPolicy, LoggingSupervisor, LoggingAgentTask
from research_agent_framework.config import get_console, get_logger
from rich.markdown import Markdown

console = get_console()
logger = get_logger()

# Inject logger into agent tasks and supervisor for reproducible logging
agent_tasks = [
    LoggingAgentTask(agent_id="A1", description="Find top-rated coffee shops in SF.", logger=logger),
    LoggingAgentTask(agent_id="A2", description="Check which shops are open now.", logger=logger),
    LoggingAgentTask(agent_id="A3", description="Filter for no cover charge in SOMA.", logger=logger)
]
supervisor = LoggingSupervisor(logger=logger)
results = supervisor.coordinate(agent_tasks)

for result in results:
    console.print(Markdown(f"**Agent {result.agent_id} outcome:**\n\n{result.outcome}"))


2025-09-13T11:50:23.987732-0700 INFO Supervisor: Starting coordination
2025-09-13T11:50:23.990843-0700 INFO Agent A1 starting: Find top-rated coffee shops in SF.
2025-09-13T11:50:23.993686-0700 INFO Agent A1 finished: Completed: Find top-rated coffee shops in SF.
2025-09-13T11:50:23.996503-0700 INFO Supervisor: Agent A1 outcome: Completed: Find top-rated coffee shops in SF.
2025-09-13T11:50:23.998796-0700 INFO Agent A2 starting: Check which shops are open now.
2025-09-13T11:50:24.001496-0700 INFO Agent A2 finished: Completed: Check which shops are open now.
2025-09-13T11:50:24.004004-0700 INFO Supervisor: Agent A2 outcome: Completed: Check which shops are open now.
2025-09-13T11:50:24.006341-0700 INFO Agent A3 starting: Filter for no cover charge in SOMA.
2025-09-13T11:50:24.008526-0700 INFO Agent A3 finished: Completed: Filter for no cover charge in SOMA.
2025-09-13T11:50:24.011139-0700 INFO Supervisor: Agent A3 outcome: Completed: Filter for no cover charge in SOMA.
2025-09-13T11:50:24.014003-0700 INFO Supervisor: Coordination complete


In [126]:
# Logging: agent state transitions
logger.info("Logging agent messages and state transitions during coordination.")


2025-09-13T11:50:24.035003-0700 INFO Logging agent messages and state transitions during coordination.


## 10.0 Research Compression and Synthesis

This section demonstrates how to compress and synthesize research findings using deterministic logic (MockLLM) and structured outputs. The workflow mirrors the agent's compression node and aligns with tests in `test_supervisor_policy_deterministic.py` and `test_end_to_end_flow.py`.

- **10.1 Compression strategy:** Uses a Jinja2 template and MockLLM to compress findings.
- **10.2 Synthesis:** Produces structured objects for downstream reporting.
- **10.3 Validation:** Validates outputs and displays with rich output.

Cells below show the compression logic, synthesis, and output validation.

In [127]:
# 10.1 Compression strategy with MockLLM
from research_agent_framework.llm.client import MockLLM, LLMConfig
from research_agent_framework.prompts import renderer
from research_agent_framework.config import get_console
from rich.markdown import Markdown

console = get_console()
mock_config = LLMConfig(api_key="test", model="mock-model")
mock_llm = MockLLM(mock_config)

# Example research brief and notes
research_brief = 'Find the best coffee shops in SF with no cover charge, open now, highest ratings in SOMA.'
notes = [
    'Blue Bottle Coffee: 4.7 stars, open now, free WiFi, no cover charge.',
    'Sightglass Coffee: 4.6 stars, open now, SOMA location, no cover charge.',
    'Verve Coffee: 4.5 stars, open now, SOMA, no cover charge.'
]

# Render compression prompt using Jinja2 template
from datetime import date
compression_prompt = renderer.render_template('compress_research.j2', {
    'date': date.today().isoformat(),
    'research_brief': research_brief,
    'notes': notes
})

# Use MockLLM to compress findings
import asyncio
summary = asyncio.run(mock_llm.generate(compression_prompt))
console.print(Markdown(f'**Compressed Research Summary:**\n\n{summary}'))


In [128]:
# Logging: compression and synthesis
logger.info("Compressing and synthesizing research findings.")


2025-09-13T11:50:24.065403-0700 INFO Compressing and synthesizing research findings.


**Explanation:**

This cell demonstrates how the agent compresses research findings using a Jinja2 template and MockLLM for deterministic output. The summary is rendered with rich Markdown for clarity.
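
As a rough illustration of the same pattern, the sketch below renders a prompt and passes it to an echo-style mock using only the standard library. `string.Template` stands in for the Jinja2 template, the template text is invented for the example, and `EchoLLM` is a hypothetical stand-in that mimics the `mock response for:` prefix seen in this notebook's logs. None of it is the framework's actual implementation.

```python
import asyncio
from string import Template

# Hypothetical stand-in for compress_research.j2 (the real template is not shown here).
COMPRESS_TEMPLATE = Template(
    "Date: $date\nBrief: $research_brief\nNotes:\n$notes\nCompress the notes above."
)


class EchoLLM:
    """Assumed mock: returns a fixed prefix plus the prompt, so output is deterministic."""

    async def generate(self, prompt: str) -> str:
        return f"mock response for: {prompt}"


def render_compression_prompt(date: str, research_brief: str, notes: list) -> str:
    # Turn each note into a bullet so the prompt keeps one finding per line.
    bullet_notes = "\n".join(f"- {note}" for note in notes)
    return COMPRESS_TEMPLATE.substitute(
        date=date, research_brief=research_brief, notes=bullet_notes
    )


prompt = render_compression_prompt(
    "2025-09-13",
    "Find the best coffee shops in SOMA.",
    ["Sightglass Coffee: 4.6 stars", "Verve Coffee: 4.5 stars"],
)
summary = asyncio.run(EchoLLM().generate(prompt))
print(summary)
```

Because the mock only echoes its input, the same brief and notes always yield the same summary, which is what makes assertions on the output stable.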

In [129]:
# 10.2 Synthesize findings into structured objects
from research_agent_framework.models import EvalResult
from assertpy import assert_that

# Simulate synthesis: wrap summary into EvalResult
eval_result = EvalResult(
    task_id='synth-001',
    success=True,
    score=0.95,
    feedback=str(summary),
    details={'notes': '\n'.join(notes)}
)
# Validate structured output
assert_that(eval_result).is_instance_of(EvalResult)
assert_that(eval_result.feedback).contains('mock response for:')
console.print(Markdown(f'**EvalResult:**\n\n{eval_result}'))


**Explanation:**

This cell shows how compressed findings are synthesized into a structured EvalResult object, validated for type and content, and displayed with rich output.
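
For readers without the framework installed, here is a minimal sketch of a structured result with the same field names used in the cell above. `EvalResultSketch` is a hypothetical dataclass, and the score range check is an assumption added for illustration; the real `EvalResult` may be defined and validated differently.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class EvalResultSketch:
    """Hypothetical mirror of the fields passed to EvalResult in this notebook."""

    task_id: str
    success: bool
    score: float
    feedback: str
    details: Dict[str, Any] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Assumed constraint for illustration: scores live in [0, 1].
        if not 0.0 <= self.score <= 1.0:
            raise ValueError(f"score must be in [0, 1], got {self.score}")


result = EvalResultSketch(
    task_id="synth-001",
    success=True,
    score=0.95,
    feedback="mock response for: compress these notes",
    details={"notes": "Sightglass Coffee: 4.6 stars"},
)
print(result)
```

Validating at construction time means a malformed result fails where it is created rather than later in report generation.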

## 11.0 Final Report Generation

This section demonstrates how to assemble the final research report, render it with rich output, and provide a downloadable artifact. The workflow uses deterministic logic and aligns with renderer and output tests.

- **11.1 Assemble report sections and metadata**
- **11.2 Verify formatting against renderer tests**
- **11.3 Provide downloadable artifact (markdown export)**

Cells below show the report assembly, rendering, and export logic.
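
The sanitization step used during report assembly can be tried in isolation. The sketch below wraps the two regular expressions from the assembly cell in a helper; the function name `sanitize_prompt` and the sample input are made up for the example.

```python
import re


def sanitize_prompt(text: str) -> str:
    """Strip <...> tokens, then collapse runs of 3+ newlines into one blank line."""
    without_tags = re.sub(r"<[^>]+>", "", text)
    return re.sub(r"\n{3,}", "\n\n", without_tags).strip()


raw = (
    "<instructions>\n\n\n# Final Research Report\n\n\n\n"
    "## Findings\n- Verve Coffee: rated <4.6 of 5> stars\n"
)
clean = sanitize_prompt(raw)
print(clean)
```

Note the trade-off: the pattern removes every angle-bracketed span, so legitimate content such as `<4.6 of 5>` is dropped along with instruction tokens.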

In [130]:
# 11.1 Assemble report sections and metadata
from research_agent_framework.prompts import renderer
from research_agent_framework.llm.client import MockLLM, LLMConfig
from research_agent_framework.config import get_console
from rich.markdown import Markdown
from datetime import date

console = get_console()
mock_config = LLMConfig(api_key="test", model="mock-model")
mock_llm = MockLLM(mock_config)

# Example: assemble report sections
report_metadata = {
    'date': date.today().isoformat(),
    'research_brief': 'Find the best coffee shops in SF with no cover charge, open now, highest ratings in SOMA.',
    'findings': [
        'Blue Bottle Coffee: 4.7 stars, open now, free WiFi, no cover charge.',
        'Sightglass Coffee: 4.6 stars, open now, SOMA location, no cover charge.',
        'Verve Coffee: 4.5 stars, open now, SOMA, no cover charge.'
    ],
    'sources': []
}

# Render final report using Jinja2 template
final_report_prompt = renderer.render_template('final_report_generation_prompt.j2', report_metadata)

# DEBUG: show the rendered prompt so we can verify it contains the report content and not an agent prompt
print('--- Rendered final_report_prompt start ---')
print(final_report_prompt)
print('--- Rendered final_report_prompt end ---')

# Sanitize prompt: remove any angle-bracket instruction tokens that may have leaked in
import re
sanitized_prompt = re.sub(r'<[^>]+>', '', final_report_prompt)
# Collapse repeated blank lines for cleanliness
sanitized_prompt = re.sub(r'\n{3,}', '\n\n', sanitized_prompt).strip()

# By default, use the rendered template as the final report to ensure no agent instructions
# If you want to exercise the LLM, set `use_llm = True` (keeps deterministic default for demos/tests)
use_llm = False
import asyncio
if use_llm:
    final_report = asyncio.run(mock_llm.generate(sanitized_prompt))
else:
    final_report = sanitized_prompt

console.print(Markdown(f'**Final Research Report:**\n\n{final_report}'))


--- Rendered final_report_prompt start ---


# Final Research Report

## Research Brief
Find the best coffee shops in SF with no cover charge, open now, highest ratings in SOMA.

## Findings

- Blue Bottle Coffee: 4.7 stars, open now, free WiFi, no cover charge.

- Sightglass Coffee: 4.6 stars, open now, SOMA location, no cover charge.

- Verve Coffee: 4.5 stars, open now, SOMA, no cover charge.


---
Today's date is 2025-09-13.



--- Rendered final_report_prompt end ---


In [131]:
# 11.2 Verify formatting against renderer tests
from assertpy import assert_that

# Validate that the final report contains expected structure
assert_that(final_report).contains('# Final Research Report')
assert_that(final_report).contains('Findings')
console.print(Markdown('**Report formatting validated against renderer tests.**'))


In [132]:
# 11.3 Provide downloadable artifact (markdown export)
import os
from pathlib import Path

# Save the final report as a markdown file in the same folder as this notebook.
# Prefer the helper_path.parent (set earlier when loading nb_bootstrap), otherwise fall back to cwd.
try:
    notebook_dir = Path(helper_path).parent if 'helper_path' in globals() else None
except Exception:
    notebook_dir = None
if notebook_dir is None or not notebook_dir.exists():
    notebook_dir = Path.cwd()
output_path = notebook_dir / 'final_research_report.md'
with open(output_path, 'w', encoding='utf-8') as f:
    f.write(final_report)

console.print(f'[bold green]✅ Final report saved as {output_path!s}[/bold green]')

# Test: ensure the saved file contains the actual report, not the prompt
with open(output_path, 'r', encoding='utf-8') as f:
    saved_content = f.read()
assert saved_content.startswith('# Final Research Report'), "Report does not start with expected header!"
assert 'Findings' in saved_content, "Report does not contain a findings section!"


## 12.0 Structured Logging and LangSmith Tracing

This section demonstrates how to enable LangSmith tracing and visualize traces for the research workflow. Tracing is optional and controlled by the `ENABLE_TRACING` environment variable or settings. When enabled, traces are sent to LangSmith for inspection and debugging. This is useful for understanding agent reasoning, tool calls, and workflow execution.

- **12.1 Structured logs** are already integrated throughout the code and notebook (see previous sections).
- **12.2 LangSmith tracing hooks** are enabled below if tracing is turned on.
- **12.3 Minimal trace visualization**: If tracing is enabled, a link to the LangSmith UI is displayed for the current run. Otherwise, a message explains how to enable tracing.

**Note:** You must have a valid LangSmith API key and project configured in your environment for traces to be sent. See [LangSmith documentation](https://docs.langchain.com/docs/langsmith) for setup instructions.
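
The flag check itself is small enough to sketch on its own. The helper below is an illustrative reimplementation, not the framework's code: `tracing_enabled` and its `env` parameter are hypothetical names, and the set of truthy strings matches the one used in the cell that follows.

```python
import os
from typing import Mapping, Optional

TRUTHY = ("1", "true", "yes")


def tracing_enabled(default: bool = False, env: Optional[Mapping[str, str]] = None) -> bool:
    """Return True when ENABLE_TRACING is truthy; use `default` when it is unset."""
    source = os.environ if env is None else env
    raw = source.get("ENABLE_TRACING", str(default))
    return raw.strip().lower() in TRUTHY


# Passing an explicit mapping keeps the example independent of the real environment.
print(tracing_enabled(env={"ENABLE_TRACING": "YES"}))
print(tracing_enabled(env={}))
```

Accepting the environment as a parameter makes the check easy to test without mutating `os.environ`.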

In [133]:
# 12.2 Enable LangSmith tracing if configured
import os
from research_agent_framework.config import get_settings, get_logger, get_console

settings = get_settings()
logger = get_logger()
console = get_console()

# Check if tracing is enabled (via env or settings)
enable_tracing = os.environ.get("ENABLE_TRACING", str(settings.enable_tracing)).lower() in ("1", "true", "yes")

if enable_tracing:
    try:
        from langsmith import Client, traceable
        # Client picks up the API key and endpoint from environment variables
        client = Client()
        console.print("[bold green]✅ LangSmith tracing is ENABLED. Traces will be sent to your LangSmith project.[/bold green]")
        logger.info("LangSmith tracing enabled.")
    except ImportError:
        console.print("[bold yellow]⚠️ LangSmith tracing requested but 'langsmith' package is not installed. Please install it to enable tracing.[/bold yellow]")
        logger.warning("LangSmith tracing requested but package not installed.")
else:
    console.print("[bold cyan]ℹ️ LangSmith tracing is DISABLED. Set ENABLE_TRACING=1 and configure your LangSmith API key to enable.[/bold cyan]")
    logger.info("LangSmith tracing is disabled.")


2025-09-13T11:50:24.166444-0700 INFO LangSmith tracing is disabled.


In [134]:
# 12.3 Minimal trace visualization or LangSmith UI link
if enable_tracing:
    # If LangSmith tracing is enabled, show a link to the LangSmith project UI
    langsmith_project = os.environ.get("LANGCHAIN_PROJECT", "default")
    langsmith_url = f"https://smith.langchain.com/o/projects/{langsmith_project}"
    console.print(f"[bold blue]🔗 View your traces in LangSmith: [link={langsmith_url}]{langsmith_url}[/link][/bold blue]")
else:
    console.print("[bold cyan]ℹ️ Tracing is off. To view traces, set ENABLE_TRACING=1 and rerun this section.[/bold cyan]")


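## 13.0 Comparing Prompts and LLM Settings Side-by-Side

This section sets up side-by-side comparisons of prompt wording and LLM settings (such as `temperature` and `max_tokens`). The cells use the deterministic mock LLM, which echoes each prompt behind a fixed prefix, so they exercise the comparison workflow rather than real model behavior. Every step is logged.

### 13.1 Varying Prompts: Deterministic Output Comparison

We run several phrasings of the same request through the mock LLM and collect the outputs per prompt. Because the mock echoes its input, the outputs differ only in the echoed prompt text; with a real provider, phrasing would also change the content. All outputs and steps are logged for clarity.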
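
A comparison helper of this kind can be sketched in a few lines. The version below is an assumption about the general shape, not the framework's `compare_prompts`: `echo_generate` and `compare_prompts_sketch` are hypothetical names, and the mock simply echoes the prompt.

```python
import asyncio
from typing import Dict, List


async def echo_generate(prompt: str) -> str:
    """Assumed mock behaviour: echo the prompt behind a fixed prefix."""
    return f"mock response for: {prompt}"


async def compare_prompts_sketch(prompts: List[str]) -> Dict[str, str]:
    # gather preserves input order, so zip pairs each prompt with its own output.
    outputs = await asyncio.gather(*(echo_generate(p) for p in prompts))
    return dict(zip(prompts, outputs))


results = asyncio.run(
    compare_prompts_sketch(
        [
            "Summarize the best coffee shops in SF.",
            "Find coffee shops open now in SOMA, SF.",
        ]
    )
)
for prompt, output in results.items():
    print(f"{prompt!r} -> {output!r}")
```

Keying the result by prompt keeps each output next to the wording that produced it, which is convenient for logging and for assertions.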

In [135]:
# Compare several prompt phrasings using the mock LLM and log outputs
from research_agent_framework.llm.client import LLMConfig
from research_agent_framework.llm.compare import compare_prompts
from research_agent_framework.config import get_logger, get_console
import asyncio

logger = get_logger()
console = get_console()

prompts = [
    "Summarize the best coffee shops in SF.",
    "List top-rated coffee shops in San Francisco.",
    "What are the highest rated places for coffee in SF?",
    "Find coffee shops open now in SOMA, SF.",
]
config = LLMConfig(api_key="test", model="mock-model")

results = asyncio.run(compare_prompts(prompts, config))

for prompt, output in results.items():
    logger.info(f"Prompt: {prompt}\nOutput: {output}")
    console.print(f"[bold]Prompt:[/bold] {prompt}\n[green]Output:[/green] {output}\n")


2025-09-13T11:50:24.189747-0700 INFO Comparing prompt: Summarize the best coffee shops in SF.


2025-09-13T11:50:24.192763-0700 INFO Output: mock response for: Summarize the best coffee shops in SF.


2025-09-13T11:50:24.195442-0700 INFO Comparing prompt: List top-rated coffee shops in San Francisco.


2025-09-13T11:50:24.198380-0700 INFO Output: mock response for: List top-rated coffee shops in San Francisco.


2025-09-13T11:50:24.201454-0700 INFO Comparing prompt: What are the highest rated places for coffee in SF?


2025-09-13T11:50:24.204336-0700 INFO Output: mock response for: What are the highest rated places for coffee in SF?


2025-09-13T11:50:24.206962-0700 INFO Comparing prompt: Find coffee shops open now in SOMA, SF.


2025-09-13T11:50:24.209710-0700 INFO Output: mock response for: Find coffee shops open now in SOMA, SF.


2025-09-13T11:50:24.212823-0700 INFO Prompt: Summarize the best coffee shops in SF.
Output: mock response for: Summarize the best coffee shops in SF.


2025-09-13T11:50:24.217032-0700 INFO Prompt: List top-rated coffee shops in San Francisco.
Output: mock response for: List top-rated coffee shops in San Francisco.


2025-09-13T11:50:24.221362-0700 INFO Prompt: What are the highest rated places for coffee in SF?
Output: mock response for: What are the highest rated places for coffee in SF?


2025-09-13T11:50:24.225551-0700 INFO Prompt: Find coffee shops open now in SOMA, SF.
Output: mock response for: Find coffee shops open now in SOMA, SF.


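### 13.2 Demonstrating Temperature and Max Tokens Effects

We run one prompt under three combinations of `temperature` and `max_tokens`. With the mock LLM the output is identical for every configuration, as the logged results show, so this cell demonstrates the comparison workflow rather than the settings themselves. Run the same comparison against a real provider to observe differences. All results are logged and displayed.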

In [136]:
# Compare settings: temperature and max_tokens
from research_agent_framework.llm.compare import compare_settings

prompt = "Summarize the best coffee shops in SF."
configs = [
    LLMConfig(api_key="test", model="mock-model", temperature=0.0, max_tokens=32),
    LLMConfig(api_key="test", model="mock-model", temperature=1.0, max_tokens=128),
    LLMConfig(api_key="test", model="mock-model", temperature=2.0, max_tokens=256),
]

results = asyncio.run(compare_settings(prompt, configs))

for label, output in results.items():
    logger.info(f"Settings: {label}\nOutput: {output}")
    console.print(f"[bold]Settings:[/bold] {label}\n[green]Output:[/green] {output}\n")


2025-09-13T11:50:24.247396-0700 INFO Comparing settings: model=mock-model, temp=0.0, max_tokens=32


2025-09-13T11:50:24.250397-0700 INFO Output: mock response for: Summarize the best coffee shops in SF.


2025-09-13T11:50:24.253638-0700 INFO Comparing settings: model=mock-model, temp=1.0, max_tokens=128


2025-09-13T11:50:24.256473-0700 INFO Output: mock response for: Summarize the best coffee shops in SF.


2025-09-13T11:50:24.259173-0700 INFO Comparing settings: model=mock-model, temp=2.0, max_tokens=256


2025-09-13T11:50:24.261919-0700 INFO Output: mock response for: Summarize the best coffee shops in SF.


2025-09-13T11:50:24.264965-0700 INFO Settings: model=mock-model, temp=0.0, max_tokens=32
Output: mock response for: Summarize the best coffee shops in SF.


2025-09-13T11:50:24.269273-0700 INFO Settings: model=mock-model, temp=1.0, max_tokens=128
Output: mock response for: Summarize the best coffee shops in SF.


2025-09-13T11:50:24.273834-0700 INFO Settings: model=mock-model, temp=2.0, max_tokens=256
Output: mock response for: Summarize the best coffee shops in SF.


### 13.3 Best Practices and Trade-Offs

- **Prompt clarity matters:** Small changes in phrasing can affect LLM output quality and relevance.
- **Temperature:** Lower values make output more deterministic; higher values increase creativity and variability.
- **Max tokens:** Controls output length; set appropriately for your use case.
- **Log everything:** Use structured logging to track prompt, settings, and outputs for reproducibility and debugging.
- **Use mocks for education/testing:** Deterministic mock LLMs help you understand effects without provider variability.

**Try varying prompts and settings above to see effects in real time!**
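
To make the temperature point concrete without a real model, the sketch below applies the usual temperature-scaled softmax to a set of toy logits. The logits are invented for the example, and treating a temperature of zero as an error is a choice made here; providers commonly treat it as greedy decoding instead.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Divide logits by the temperature, then apply a numerically stable softmax."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


toy_logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(toy_logits, t)
    print(f"temperature={t}: " + ", ".join(f"{p:.3f}" for p in probs))
```

Lower temperatures concentrate probability on the top-scoring token, while higher temperatures flatten the distribution, which is why sampled output becomes more varied.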

## 14.0 MCP Server Initialization and Tool Discovery

This section demonstrates initializing a minimal in-process MCP server, registering tools, listing and describing them, and explaining async usage patterns. All steps are logged for educational clarity.

### 14.1 Start a Simple In-Process MCP Stub

We initialize the MCPStub, which acts as a minimal async message bus for tool calls.

In [137]:
# Initialize MCPStub and ToolRegistry with logging
from research_agent_framework.mcp.stub import MCPStub
from research_agent_framework.mcp.tools import ToolRegistry
from research_agent_framework.config import get_logger, get_console

logger = get_logger()
console = get_console()

mcp = MCPStub()
tool_registry = ToolRegistry(logger)
console.print("[bold green]✅ MCPStub initialized. ToolRegistry ready.[/bold green]")
logger.info("MCPStub and ToolRegistry initialized.")


2025-09-13T11:50:24.312022-0700 INFO MCPStub and ToolRegistry initialized.


### 14.2 Register, Discover, and List Tools

We register example tools, list them, and display their descriptions. All actions are logged.
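
To show what registration and description involve, here is a minimal registry sketch. `ToolRegistrySketch` is a hypothetical class that mirrors the three method names used in this notebook (`register`, `list_tools`, `describe_tools`); the real `ToolRegistry` also takes a logger and may behave differently, for example on duplicate names.

```python
import inspect
from typing import Callable, Dict


class ToolRegistrySketch:
    """Hypothetical minimal registry: a name-to-callable mapping with docstring lookup."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable] = {}

    def register(self, name: str, func: Callable) -> None:
        # Assumption: a later registration under the same name replaces the earlier one.
        self._tools[name] = func

    def list_tools(self) -> Dict[str, Callable]:
        # Return a copy so callers cannot mutate the registry by accident.
        return dict(self._tools)

    def describe_tools(self) -> Dict[str, str]:
        return {name: inspect.getdoc(func) or "" for name, func in self._tools.items()}


def square(x):
    """Returns x squared."""
    return x * x


def plus_ten(y):
    """Returns y plus 10."""
    return y + 10


registry = ToolRegistrySketch()
registry.register("square", square)
registry.register("plus_ten", plus_ten)
print(list(registry.list_tools()))
print(registry.describe_tools())
```

Using docstrings as descriptions keeps the tool's documentation in one place, next to its implementation.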

In [138]:
# Register example tools and list them

def square(x):
    """Returns x squared."""
    return x * x

def plus_ten(y):
    """Returns y plus 10."""
    return y + 10

tool_registry.register("square", square)
tool_registry.register("plus_ten", plus_ten)

console.print("[bold blue]Registered tools:[/bold blue]", list(tool_registry.list_tools().keys()))
logger.info(f"Registered tools: {list(tool_registry.list_tools().keys())}")

desc = tool_registry.describe_tools()
for name, doc in desc.items():
    console.print(f"[bold]{name}:[/bold] {doc}")
    logger.info(f"Tool {name}: {doc}")


2025-09-13T11:50:24.339683-0700 INFO Registering tool: square


2025-09-13T11:50:24.345444-0700 INFO Registering tool: plus_ten


2025-09-13T11:50:24.350182-0700 INFO Listing 2 registered tools.


2025-09-13T11:50:24.357182-0700 INFO Listing 2 registered tools.


2025-09-13T11:50:24.361318-0700 INFO Registered tools: ['square', 'plus_ten']


2025-09-13T11:50:24.366060-0700 INFO Tool: square - Returns x squared.


2025-09-13T11:50:24.369435-0700 INFO Tool: plus_ten - Returns y plus 10.


2025-09-13T11:50:24.375357-0700 INFO Tool square: Returns x squared.


2025-09-13T11:50:24.381095-0700 INFO Tool plus_ten: Returns y plus 10.


### 14.3 Async Usage Patterns Explained

The MCPStub and ToolRegistry support async tool calls and concurrent handling. This enables scalable, non-blocking workflows in agent systems.
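
The idea of a minimal async message bus can be sketched with the standard library alone. `MessageBusSketch` below is a hypothetical class that borrows the `register_handler` and `publish` method names used in this notebook; how the real `MCPStub` dispatches messages is not shown here, so running handlers concurrently with `asyncio.gather` is an assumption.

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

Handler = Callable[[Any], Awaitable[None]]


class MessageBusSketch:
    """Hypothetical in-process bus: topics map to lists of async handlers."""

    def __init__(self) -> None:
        self._handlers: Dict[str, List[Handler]] = {}

    def register_handler(self, topic: str, handler: Handler) -> None:
        self._handlers.setdefault(topic, []).append(handler)

    async def publish(self, topic: str, message: Any) -> None:
        # Run every handler for the topic concurrently and wait for all to finish.
        handlers = self._handlers.get(topic, [])
        await asyncio.gather(*(handler(message) for handler in handlers))


received: List[int] = []


async def record_square(message: int) -> None:
    received.append(message * message)


bus = MessageBusSketch()
bus.register_handler("square", record_square)
asyncio.run(bus.publish("square", 7))
print(received)
```

Publishing to a topic with no handlers is a no-op in this sketch; a stricter design could raise or log a warning instead.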

In [139]:
# Example: Async tool call via MCPStub
import asyncio

async def handle_square(message):
    result = tool_registry.list_tools()["square"](message)
    console.print(f"[bold green]Async square result:[/bold green] {result}")
    logger.info(f"Async square result: {result}")

mcp.register_handler("square", handle_square)

async def demo_async_call():
    await mcp.publish("square", 7)

asyncio.run(demo_async_call())


2025-09-13T11:50:24.410353-0700 INFO Listing 2 registered tools.


2025-09-13T11:50:24.417333-0700 INFO Async square result: 49


## Summary: MCP Server and Tool Discovery

- MCPStub provides a minimal async message bus for agent workflows.
- ToolRegistry enables tool registration, listing, and description with logging.
- Async usage patterns allow scalable, non-blocking tool calls.
- All steps are logged for reproducibility and debugging.

**Try registering your own tools and publishing async messages to see the effects!**
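
One way to keep synchronous tools from blocking the event loop is to offload them to worker threads and await the results together. The sketch below illustrates that general technique with `asyncio.to_thread` (Python 3.9+); the helper name `call_tool` is hypothetical, and the notebook's own stub may not work this way.

```python
import asyncio
from typing import Any, Callable


def square(x):
    """Returns x squared."""
    return x * x


def plus_ten(y):
    """Returns y plus 10."""
    return y + 10


async def call_tool(func: Callable[[Any], Any], arg: Any) -> Any:
    # to_thread runs the blocking function in a worker thread,
    # so the event loop stays free to schedule other tasks.
    return await asyncio.to_thread(func, arg)


async def run_tools_concurrently() -> list:
    # gather returns results in the order the awaitables were passed in.
    return await asyncio.gather(call_tool(square, 7), call_tool(plus_ten, 7))


tool_results = asyncio.run(run_tools_concurrently())
print(tool_results)
```

For tools this cheap the thread hop costs more than it saves; the pattern pays off when a tool performs slow I/O or heavy computation.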

## 15.0 MCP Filesystem Operations and Error Handling


This section demonstrates how to use the MCPFileTool to read local documentation files, handle errors safely, and use mock mode for deterministic educational runs. All actions are logged at appropriate levels for clarity and reproducibility.


### Learning Objectives

- Read local files using the MCPFileTool
- Handle file-not-found and other errors gracefully
- Use mock mode for deterministic outputs in tests and demos
- Observe structured logging at each step

In [140]:
# 15.1 Read local docs via MCPFileTool
from research_agent_framework.mcp.file_tools import MCPFileTool
from research_agent_framework.config import get_logger, get_console

logger = get_logger()
console = get_console()

# Use mock_mode=False for real file access, True for deterministic demo
def read_doc_demo(path, mock_mode=False):
    file_tool = MCPFileTool(logger, mock_mode=mock_mode)
    content = file_tool.read_file(path)
    if content is not None:
        console.print(f"[bold green]File content:[/bold green] {content[:100]}{'...' if len(content) > 100 else ''}")
        logger.info(f"Read file: {path} (length={len(content)})")
    else:
        console.print(f"[bold red]File not found or error:[/bold red] {path}")
        logger.error(f"Failed to read file: {path}")
    return content

# Example: Try reading a real file (returns None and logs an error if not present)
read_doc_demo("../docs/notes.md", mock_mode=False)

# Example: Deterministic mock mode
read_doc_demo("../docs/notes.md", mock_mode=True)


2025-09-13T11:50:24.436215-0700 INFO MCPFileTool.read_file called with path: ../docs/notes.md


2025-09-13T11:50:24.440615-0700 INFO Successfully read file: ../docs/notes.md


2025-09-13T11:50:24.450001-0700 INFO Read file: ../docs/notes.md (length=483)


2025-09-13T11:50:24.453000-0700 INFO MCPFileTool.read_file called with path: ../docs/notes.md




2025-09-13T11:50:24.461018-0700 INFO Read file: ../docs/notes.md (length=29)


'[MOCK CONTENT] File: notes.md'

### 15.2 Error Handling and Safe Patterns

The MCPFileTool logs errors at the appropriate level and returns None for missing or unreadable files. Always check for None before using file content. Mock mode ensures deterministic outputs for tests and demos.
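
The return-None-on-failure pattern can be sketched as follows. `read_file_safe` is a hypothetical function, not the `MCPFileTool` implementation; the mock string format is copied from the mock output shown in this notebook, and the set of caught exceptions is an assumption.

```python
import logging
import tempfile
from pathlib import Path
from typing import Optional

sketch_logger = logging.getLogger("file_tool_sketch")


def read_file_safe(path: str, mock_mode: bool = False) -> Optional[str]:
    """Return file text, deterministic mock text, or None when the read fails."""
    if mock_mode:
        # Mock mode never touches the filesystem, so any path works.
        return f"[MOCK CONTENT] File: {Path(path).name}"
    try:
        return Path(path).read_text(encoding="utf-8")
    except (OSError, UnicodeDecodeError) as exc:
        sketch_logger.error("Failed to read %s: %s", path, exc)
        return None


with tempfile.TemporaryDirectory() as tmp:
    real_file = Path(tmp) / "notes.md"
    real_file.write_text("# Notes\n", encoding="utf-8")
    real_content = read_file_safe(str(real_file))
    missing_content = read_file_safe(str(Path(tmp) / "missing.md"))

mock_content = read_file_safe("/any/path/to/file.md", mock_mode=True)

# Callers check for None before using the content.
for label, content in [("real", real_content), ("missing", missing_content), ("mock", mock_content)]:
    print(label, "->", "unavailable" if content is None else repr(content))
```

Returning `None` keeps call sites simple, at the cost of hiding the failure reason from the caller; the log entry is where that detail is preserved.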

In [141]:
# 15.3 Mock fallback for deterministic runs
# This cell demonstrates that mock_mode returns predictable content for any file path
mock_content = read_doc_demo("/any/path/to/file.md", mock_mode=True)
console.print(f"[bold yellow]Mock file content:[/bold yellow] {mock_content}")
logger.info(f"Mock file content: {mock_content}")


2025-09-13T11:50:24.476675-0700 INFO MCPFileTool.read_file called with path: /any/path/to/file.md




2025-09-13T11:50:24.485602-0700 INFO Read file: /any/path/to/file.md (length=28)


2025-09-13T11:50:24.491436-0700 INFO Mock file content: [MOCK CONTENT] File: file.md


In [142]:
# 15.4 Stylized Logging: Color, Bold, Italic
from rich.console import Console
from rich.text import Text

console = get_console()

# Example: Stylized log message using rich
console.print(Text("Stylized log: [INFO] Operation succeeded", style="bold green"))
console.print(Text("Stylized log: [WARNING] Check your input", style="italic yellow"))
console.print(Text("Stylized log: [ERROR] Operation failed", style="bold red"))

# You can also use rich's markup directly
console.print("[bold green]Stylized log: [INFO] Operation succeeded[/bold green]")
console.print("[italic yellow]Stylized log: [WARNING] Check your input[/italic yellow]")
console.print("[bold red]Stylized log: [ERROR] Operation failed[/bold red]")

# For structured logs, you can combine loguru/standard logging with rich output
logger.info("[bold green]Structured log: Operation succeeded[/bold green]")
logger.warning("[italic yellow]Structured log: Check your input[/italic yellow]")
logger.error("[bold red]Structured log: Operation failed[/bold red]")


2025-09-13T11:50:24.523182-0700 INFO [bold green]Structured log: Operation succeeded[/bold green]




2025-09-13T11:50:24.529605-0700 ERROR [bold red]Structured log: Operation failed[/bold red]


### Logging Troubleshooting & Stylization Tips

If log messages are not stylized (color, bold, italic), check the following:

- **Rich Console**: Use `console.print()` with rich markup or `Text` objects for stylized output in notebook cells.
- **Loguru/StdLogger**: By default, loguru and the standard `logging` module write to stderr and do not interpret rich markup, so tags such as `[bold green]` appear literally in the log line. For stylized logs, print to the console in addition to logging.
- **Custom Sink**: To route loguru logs to rich, add a custom sink using `logger.add(lambda msg: console.print(msg, style="bold green"))`.
- **Example**: See the next cell for a custom loguru-to-rich demonstration.
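
A related notebook pitfall is repeated log lines, which commonly appear when a setup cell attaches a fresh handler every time it is rerun. Whether that applies to this project's logger is an assumption; the sketch below uses the standard `logging` module and a hypothetical `get_notebook_logger` helper to show a guard against it.

```python
import logging
import sys


def get_notebook_logger(name: str = "notebook_sketch") -> logging.Logger:
    """Return a logger that has exactly one stream handler, however often this runs."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    if not logger.handlers:
        # Attach the handler only once; reruns of the setup cell reuse it.
        handler = logging.StreamHandler(sys.stdout)
        handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    # Stop records from also reaching root handlers, which would print them again.
    logger.propagate = False
    return logger


# Simulate rerunning a setup cell three times.
for _ in range(3):
    log = get_notebook_logger()

log.info("Logged once, even after three setup calls.")
```

With loguru, calling `logger.remove()` before `logger.add(...)` serves the same purpose.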

In [143]:
# 15.5 Custom Loguru Sink for Rich Stylized Logging
from loguru import logger as loguru_logger
from rich.text import Text

# Remove previous handlers to avoid duplicate logs
loguru_logger.remove()

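# Add a custom sink that prints loguru messages to the rich console with style.
# loguru messages end with a newline, so strip it to avoid blank lines between entries.
loguru_logger.add(lambda msg: console.print(Text(str(msg).rstrip("\n"), style="bold magenta")), level="INFO")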

loguru_logger.info("This is a stylized loguru INFO message (bold magenta)")
loguru_logger.warning("This is a stylized loguru WARNING message (bold magenta)")
loguru_logger.error("This is a stylized loguru ERROR message (bold magenta)")


In [144]:
# 15.6 Rich Traceback and Console Setup (Magic for Stylized Output)
from rich import traceback
from rich.console import Console

# Install rich traceback for stylized error output in notebook
traceback.install()

# Ensure a fresh rich console instance for stylized printing
console = Console()
console.print("[bold green]Rich console and traceback are now active! Stylized output will appear below.[/bold green]")
