# Agent Testing Notebook

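This notebook tests individual components of `agent.py` in isolation: the async tools, query analysis, research planning, and task decomposition.

**Workflow**:
1. Modify functions in `agent.py`
2. Run the setup cell below, which reloads the module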
3. Run individual test cells to validate changes
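
One thing to watch in this workflow is that `importlib.reload` updates the module object in place, while names copied out of the module earlier keep pointing at the old objects. The sketch below illustrates this with a throwaway module; the module name `demo_agent_module` and the `GREETING` constant are made up for the demonstration and have nothing to do with `agent.py`.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Keep the demo deterministic: never reuse a cached .pyc for the temp module.
sys.dont_write_bytecode = True


def demo_stale_binding():
    """Show that a name copied before reload() does not see the new value."""
    with tempfile.TemporaryDirectory() as tmp:
        module_path = Path(tmp) / "demo_agent_module.py"  # hypothetical module
        module_path.write_text("GREETING = 'old'\n")
        sys.path.insert(0, tmp)
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("demo_agent_module")
            # Copying the value is what `from demo_agent_module import *` does.
            copied_before_reload = module.GREETING

            # Simulate editing the file, then reload the module in place.
            module_path.write_text("GREETING = 'new value'\n")
            importlib.reload(module)

            return copied_before_reload, module.GREETING
        finally:
            sys.path.remove(tmp)
            sys.modules.pop("demo_agent_module", None)
```

Calling `demo_stale_binding()` returns `("old", "new value")`: the copied name still holds the pre-reload value, while attribute access on the module sees the edit. In a notebook this means a star-import should be re-run after every reload.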

## Setup Configuration

In [144]:
# Setup environment and configuration
import os
from dotenv import load_dotenv
import dspy
import logging
import importlib
import agent

load_dotenv(override=True)

# Reload agent.py first, then star-import, so the names used below
# (SMALL_MODEL, AsyncLeadAgent, ...) come from the reloaded module
importlib.reload(agent)
from agent import *
print("Agent module reloaded successfully!")

# Disable verbose DSPy and LiteLLM outputs
dspy.settings.configure(show_guidelines=False, show_messages=False, show_cot=False)
logging.getLogger("LiteLLM").setLevel(logging.WARNING)

# Print configuration
print(f"SMALL_MODEL: {SMALL_MODEL}")
print(f"BIG_MODEL: {BIG_MODEL}")
print(f"TEMPERATURE: {TEMPERATURE}")
print(f"MAX_TOKENS: {MAX_TOKENS}")

# Configure DSPy default
default_lm = dspy.LM(
    model=SMALL_MODEL,
    api_key=OPENROUTER_API_KEY,
    api_base=OPENROUTER_BASE_URL,
    temperature=TEMPERATURE,
    max_tokens=MAX_TOKENS
)
dspy.configure(lm=default_lm)
lead_agent = AsyncLeadAgent()

print("\nConfiguration complete!")

Agent module reloaded successfully!
SMALL_MODEL: openrouter/google/gemini-2.5-flash-lite-preview-06-17
BIG_MODEL: openrouter/openai/gpt-4.1-mini
TEMPERATURE: 1.0
MAX_TOKENS: 4000

Configuration complete!


## Test Individual Async Tools

In [136]:
# Test web_search
result = await web_search("DSPy framework", count=2)
print("=== Web Search Test ===")
print(result)

=== Web Search Test ===
Search results for 'DSPy framework':\n\n1. GitHub - stanfordnlp/dspy: DSPy: The framework for programming—not prompting—language models\n   <strong>DSPy</strong>: <strong>The</strong> <strong>framework</strong> for programming—not prompting—language models - stanfordnlp/<strong>dspy</strong>\n   https://github.com/stanfordnlp/dspy\n\n2. DSPy\n   <strong>DSPy</strong> is a declarative <strong>framework</strong> for building modular AI software.\n   https://dspy.ai/


In [137]:
# Test wikipedia_search
result = wikipedia_search("Python programming", sentences=3)
print("=== Wikipedia Search Test ===")
print(result)

=== Wikipedia Search Test ===
Wikipedia – Python (programming language)\n\nPython is a high-level, general-purpose programming language. Its design philosophy emphasizes code readability with the use of significant indentation.
Python is dynamically type-checked and garbage-collected.


In [138]:
# Test async_batch_call
calls = [
    {"tool_name": "web_search", "args": {"query": "Lamine Yamal", "count": 2}},
    {"tool_name": "wikipedia_search", "args": {"query": "Desire Doue", "sentences": 2}}
]

results = await async_batch_call(calls)
print("=== Async Batch Call Test ===")
for i, result in enumerate(results):
    print(f"\nResult {i+1}:")
    print((result[:200] + "...") if len(result) > 200 else result)

=== Async Batch Call Test ===

Result 1:
web_search: Search results for 'Lamine Yamal':\n\n1. Lamine Yamal - Wikipedia\n   <strong>Lamine</strong> <strong>Yamal</strong> Nasraoui Ebana (born 13 July 2007) is a Spanish professional footballer...

Result 2:
wikipedia_search: Wikipedia – Désiré Doué\n\nDésiré Nonka-Maho Doué (French pronunciation: [deziʁe dwe]; born 3 June 2005) is a French professional footballer who plays as an attacking midfielder or w...
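
The implementation of `async_batch_call` lives in `agent.py` and is not shown in this notebook. As a rough mental model only, a batch dispatcher of this shape can be sketched with `asyncio.gather`. Everything in the block is an assumption made for illustration: the `TOOLS` registry, the two fake tools, and the name `batch_call_sketch` are hypothetical, and only the `tool_name`/`args` call format and the `tool_name: result` output prefix are taken from the cell above.

```python
import asyncio


# Hypothetical stand-ins for the real tools; they only echo their arguments.
async def fake_web_search(query, count=2):
    await asyncio.sleep(0.01)  # pretend to wait on the network
    return f"Search results for '{query}' (top {count})"


async def fake_wikipedia_search(query, sentences=2):
    await asyncio.sleep(0.01)
    return f"Wikipedia summary of {query} ({sentences} sentences)"


# Hypothetical registry mapping tool names to coroutine functions.
TOOLS = {
    "web_search": fake_web_search,
    "wikipedia_search": fake_wikipedia_search,
}


async def batch_call_sketch(calls):
    """Run all calls concurrently and return results in the input order."""

    async def run_one(call):
        name = call["tool_name"]
        tool = TOOLS.get(name)
        if tool is None:
            return f"{name}: unknown tool"
        try:
            return f"{name}: {await tool(**call['args'])}"
        except Exception as exc:  # one failing call should not sink the batch
            return f"{name}: error: {exc}"

    # gather() preserves the order of its arguments in the returned list.
    return await asyncio.gather(*(run_one(call) for call in calls))
```

In a notebook cell this would be used as `results = await batch_call_sketch(calls)`; in a plain script, wrap the call in `asyncio.run(...)`.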


## Test Full AsyncLeadAgent Workflow with Decomposition

In [139]:
# Test AsyncLeadAgent - Query Analysis
test_query = "Desire Doue vs Lamine Yamal, who is better?"
analysis_result = await lead_agent.query_analyzer.acall(query=test_query)

print("=== AsyncLeadAgent Query Analysis Test ===")
print(f"Query: {test_query}")
print(f"\nAnalysis:")
print(f"Type: {analysis_result.analysis.query_type}")
print(f"Complexity: {analysis_result.analysis.complexity}")
print(f"Main Concepts: {analysis_result.analysis.main_concepts}")
print(f"Key Entities: {analysis_result.analysis.key_entities}")
print(f"Answer Format: {analysis_result.analysis.answer_format}")

=== AsyncLeadAgent Query Analysis Test ===
Query: Desire Doue vs Lamine Yamal, who is better?

Analysis:
Type: depth_first
Complexity: complex
Main Concepts: ['Player comparison', 'Football skills', 'Performance metrics', 'Potential']
Key Entities: ['Desire Doue', 'Lamine Yamal']
Answer Format: Comparative analysis highlighting strengths, weaknesses, and overall impact of each player.


In [140]:
# Test AsyncLeadAgent - Research Planning
# Use the analysis from the previous cell
plan_result = await lead_agent.planner.acall(
    query=test_query,
    analysis=analysis_result.analysis
)

print("=== AsyncLeadAgent Research Planning Test ===")
print(f"Reasoning: {plan_result.reasoning}")
for plan in plan_result.plans:
    print(f"Plan: {plan}")

=== AsyncLeadAgent Research Planning Test ===
Reasoning: The user wants to know who is better between Desire Doue and Lamine Yamal. The analysis indicates a need for a comparative analysis based on various factors like current performance, potential, playing style, statistics, strengths, and weaknesses.

I have already gathered initial statistics and playing style information in previous steps. Now, I need to extract specific performance metrics (goals, assists, key passes, etc.), detailed playing styles, identified strengths and weaknesses, and market values for both players for the 2023-2024 season from the most reliable sources (WhoScored, FBref, Fotmob).

The plan is to create a detailed comparison by executing the following steps:
1.  For Desire Doue:
    *   Extract his 2023-2024 season stats, playing style, strengths, and weaknesses from WhoScored.com.
    *   Extract his 2023-2024 season stats, playing style, strengths, and weaknesses from FBref.com.
    *   Extract his 2023-20

In [141]:
# Test AsyncLeadAgent - Task Decomposition (first plan step only)
decompose_result = await lead_agent.decomposer.acall(
    query=test_query,
    completed_results=[],
    plans=plan_result.plans,
    current_step=plan_result.plans[0]
)

print("=== DecomposeToTasks Test ===")
print(f"Strategy: {decompose_result.allocation.execution_strategy}")
print(f"Max concurrent: {decompose_result.allocation.max_concurrent}")

for i, task in enumerate(decompose_result.allocation.tasks):
    print(f"\nTask {i+1} (Task {task.id}): {task.description}")
    print(f"  Tools: {task.tools_to_use}")
    print(f"  Budget: {task.tool_budget} | Complexity: {task.complexity}")

=== DecomposeToTasks Test ===
Strategy: Execute the following tasks to gather information about Desire Doue's performance.
Max concurrent: 4

Task 1 (Task 0): Gather Desire Doue's 2023-2024 season stats, playing style, strengths, and weaknesses from WhoScored.com.
  Tools: ['web_search']
  Budget: 8 | Complexity: medium
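
Plan steps and tasks both carry a `depends_on` list (it is printed in the full-workflow cell). How `agent.py` schedules them is not shown here. As one possible approach, and purely as a sketch, the standard-library `graphlib.TopologicalSorter` can group items into waves whose dependencies are already satisfied. The function name `execution_waves` and the sample ids below are hypothetical and do not come from any run in this notebook.

```python
from graphlib import TopologicalSorter


def execution_waves(steps):
    """Group step ids into waves; each wave depends only on earlier waves.

    `steps` is a list of dicts with an "id" and a "depends_on" list of ids.
    """
    sorter = TopologicalSorter({step["id"]: step["depends_on"] for step in steps})
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # everything runnable right now
        waves.append(ready)
        sorter.done(*ready)  # mark the wave finished to unlock dependents
    return waves


# Made-up sample data: two independent steps and one that needs both.
sample_steps = [
    {"id": 101, "depends_on": []},
    {"id": 102, "depends_on": []},
    {"id": 103, "depends_on": [101, 102]},
]
```

With this sample, `execution_waves(sample_steps)` returns `[[101, 102], [103]]`, so the first two steps could run concurrently before the third starts.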


In [159]:
# Test AsyncLeadAgent - Full Workflow (analysis, planning, decomposition, subagent execution)
analysis, research_plan, allocation, subagent_results = await lead_agent.aforward(query=test_query)

In [164]:
print("=== AsyncLeadAgent Test ===")
print(f"Query: {test_query}")

print(f"Analysis: {analysis.analysis}")
print(f"Research Plan: ")
for plan in research_plan.plans:
    print(f"Plan: {plan}")
print(f"Allocation: ")
for task in allocation.allocation.tasks:
    print(f"Task: {task.description}")
    print(f"Tools: {task.tools_to_use}")
    print(f"Budget: {task.tool_budget}")
    print(f"Complexity: {task.complexity}")
    print(f"Depends On: {task.depends_on}")
print(f"Subagent Results: ")
for result in subagent_results:
    print(f"Task ID: {result.final_result.task_id}")
    print(f"Summary: {result.final_result.summary}")
    print(f"Finding: {result.final_result.finding}")
    print(f"Debug Info: {result.final_result.debug_info}")


=== AsyncLeadAgent Test ===
Query: Desire Doue vs Lamine Yamal, who is better?
Analysis: query_type='depth_first' complexity='complex' main_concepts=['Player comparison', 'Football skills', 'Performance metrics', 'Potential'] key_entities=['Desire Doue', 'Lamine Yamal'] relationships=['Desire Doue vs Lamine Yamal'] notes="The term 'better' is subjective and can be interpreted based on various factors like current performance, potential, playing style, statistics, etc. The response should consider multiple facets of comparison." answer_format='Comparative analysis highlighting strengths, weaknesses, and overall impact of each player.'
Research Plan: 
Plan: id=1 description="Extract Desire Doue's 2023-2024 season stats, playing style, strengths, and weaknesses from FBref.com." depends_on=[] complexity_hint='medium'
Plan: id=2 description="Extract Desire Doue's 2023-2024 season stats, playing style, strengths, and weaknesses from Fotmob." depends_on=[] complexity_hint='medium'
Plan: id=3 