# AutoGen Deep Search Agents with ScrapingDog API

This notebook demonstrates a multi-agent system built on AutoGen v0.4 that performs web research through the ScrapingDog API. Specialized agents work together to plan the research, scrape and extract content, validate and cite sources, and compile the final report.

## Agent Architecture
- **Planning Agent**: Breaks down research queries into actionable search tasks (Llama 3.1 70B)
- **Web Search Agent**: Uses ScrapingDog API to scrape web content and extract data (Gemini Pro 1.5)
- **Citation Agent**: Validates sources and creates proper citations (Claude 3.5 Sonnet)
- **Finalize Agent**: Compiles and formats the final research report (Claude 4 Sonnet)

## Prerequisites
- AutoGen v0.4
- ScrapingDog API key
- OpenRouter API access (supports multiple models including Claude 4 Sonnet)

In [12]:
# Install required packages
%pip install -U "autogen-agentchat" "autogen-ext[openai]" requests beautifulsoup4 python-dotenv

Note: you may need to restart the kernel to use updated packages.


In [13]:
import os
import json
import asyncio
import requests
from typing import List, Dict, Any
from datetime import datetime
from bs4 import BeautifulSoup
from dotenv import load_dotenv

# AutoGen v0.4 imports
from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.teams import RoundRobinGroupChat, SelectorGroupChat
from autogen_agentchat.conditions import MaxMessageTermination
from autogen_agentchat.ui import Console
from autogen_ext.models.openai import OpenAIChatCompletionClient
from autogen_agentchat.messages import TextMessage
from autogen_core.models import ModelInfo, ModelFamily

# Load environment variables
load_dotenv()

print("Environment setup complete!")

Environment setup complete!


In [22]:
# Configuration
SCRAPINGDOG_API_KEY = os.getenv("SCRAPINGDOG_API_KEY")
OPENROUTER_API_KEY = os.getenv("OPENROUTER_API_KEY")
OPENROUTER_BASE_URL = os.getenv("OPENROUTER_BASE_URL", "https://openrouter.ai/api/v1")

# Model configurations for different agents
# Primary configuration with optimal models
MODEL_CONFIGS = {
    "planning": "meta-llama/llama-3.1-70b-instruct",  # Good for structured planning
    "search": "google/gemini-pro-1.5",  # Excellent for web content analysis
    "citation": "anthropic/claude-3.5-sonnet",  # Great for academic formatting
    "finalize": "anthropic/claude-sonnet-4"  # Claude 4 Sonnet for comprehensive reports
}

# Alternative/fallback configuration (if you encounter model access issues)
FALLBACK_MODEL_CONFIGS = {
    "planning": "anthropic/claude-3.5-sonnet",  # Reliable alternative
    "search": "anthropic/claude-3.5-sonnet",  # Reliable alternative
    "citation": "anthropic/claude-3.5-sonnet",  # Same model for consistency
    "finalize": "anthropic/claude-sonnet-4"  # Keep Claude 4 Sonnet for final reports
}

# Use fallback if needed (uncomment the line below to use fallback models)
# MODEL_CONFIGS = FALLBACK_MODEL_CONFIGS

# Validate required environment variables
required_vars = {
    "SCRAPINGDOG_API_KEY": SCRAPINGDOG_API_KEY,
    "OPENROUTER_API_KEY": OPENROUTER_API_KEY
}

missing_vars = [var for var, value in required_vars.items() if not value]
if missing_vars:
    raise ValueError(f"Missing required environment variables: {', '.join(missing_vars)}")

print("Configuration validated successfully!")
print(f"Using models: {MODEL_CONFIGS}")
print(f"OpenRouter Base URL: {OPENROUTER_BASE_URL}")

# Check if fallback is being used
if MODEL_CONFIGS == FALLBACK_MODEL_CONFIGS:
    print("⚠️  Using fallback model configuration (Claude 3.5 Sonnet for planning, search, and citation)")

Configuration validated successfully!
Using models: {'planning': 'meta-llama/llama-3.1-70b-instruct', 'search': 'google/gemini-pro-1.5', 'citation': 'anthropic/claude-3.5-sonnet', 'finalize': 'anthropic/claude-sonnet-4'}
OpenRouter Base URL: https://openrouter.ai/api/v1

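The fallback switch above replaces the whole configuration at once. As a minimal sketch of a finer-grained alternative, the helper below swaps in the fallback only for roles whose primary model is unavailable. `resolve_models` and the `unavailable` set are hypothetical names introduced for illustration; they are not part of the notebook or of any library.

```python
# Hypothetical helper (an assumption, not part of the notebook):
# fall back per role instead of replacing the whole configuration.
PRIMARY = {
    "planning": "meta-llama/llama-3.1-70b-instruct",
    "search": "google/gemini-pro-1.5",
    "citation": "anthropic/claude-3.5-sonnet",
    "finalize": "anthropic/claude-sonnet-4",
}
FALLBACK = {
    "planning": "anthropic/claude-3.5-sonnet",
    "search": "anthropic/claude-3.5-sonnet",
    "citation": "anthropic/claude-3.5-sonnet",
    "finalize": "anthropic/claude-sonnet-4",
}


def resolve_models(primary, fallback, unavailable):
    """Return a role -> model mapping, using the fallback where the primary is unavailable."""
    resolved = {}
    for role, model in primary.items():
        resolved[role] = fallback[role] if model in unavailable else model
    return resolved


# Example: pretend only the search model cannot be accessed.
models = resolve_models(PRIMARY, FALLBACK, unavailable={"google/gemini-pro-1.5"})
print(models)
```

How the `unavailable` set gets populated (for example from a failed test request) is left open here.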

In [23]:
# ScrapingDog API Client
class ScrapingDogClient:
    def __init__(self, api_key: str):
        self.api_key = api_key
        self.base_url = "https://api.scrapingdog.com/scrape"
    
    def scrape_url(self, url: str, render_js: bool = True, country: str = "US") -> Dict[str, Any]:
        """Scrape a URL using ScrapingDog API"""
        params = {
            "api_key": self.api_key,
            "url": url,
            "dynamic": "true" if render_js else "false",
            "country": country
        }
        
        try:
            response = requests.get(self.base_url, params=params, timeout=60)
            
            if response.status_code == 200:
                # Parse HTML content
                soup = BeautifulSoup(response.text, 'html.parser')
                
                # Extract text content
                text_content = soup.get_text(separator=' ', strip=True)
                
                # Extract title
                title = soup.find('title')
                title_text = title.get_text().strip() if title else "No title found"
                
                # Extract meta description
                meta_desc = soup.find('meta', attrs={'name': 'description'})
                description = meta_desc.get('content', '') if meta_desc else ''
                
                return {
                    "success": True,
                    "url": url,
                    "title": title_text,
                    "description": description,
                    "content": text_content[:5000],  # Limit content length
                    "scraped_at": datetime.now().isoformat()
                }
            else:
                return {
                    "success": False,
                    "url": url,
                    "error": f"HTTP {response.status_code}: {response.text}",
                    "scraped_at": datetime.now().isoformat()
                }
                
        except Exception as e:
            return {
                "success": False,
                "url": url,
                "error": str(e),
                "scraped_at": datetime.now().isoformat()
            }

# Initialize ScrapingDog client
scraping_client = ScrapingDogClient(SCRAPINGDOG_API_KEY)
print("ScrapingDog client initialized!")

ScrapingDog client initialized!

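To make the request that `scrape_url` sends more concrete, the sketch below builds the same query string by hand using only the standard library. The endpoint and parameter names are taken from the client above; `build_scrape_url` and the `DEMO_KEY` placeholder are hypothetical, and the encoding shown is roughly what `requests` produces when it receives `params`.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Same endpoint as ScrapingDogClient.base_url above.
SCRAPE_ENDPOINT = "https://api.scrapingdog.com/scrape"


def build_scrape_url(api_key: str, url: str, render_js: bool = True, country: str = "US") -> str:
    """Compose the GET URL for a scrape request (hypothetical helper for illustration)."""
    params = {
        "api_key": api_key,
        "url": url,  # the target URL is percent-encoded by urlencode
        "dynamic": "true" if render_js else "false",
        "country": country,
    }
    return f"{SCRAPE_ENDPOINT}?{urlencode(params)}"


# "DEMO_KEY" is a placeholder, not a real key.
request_url = build_scrape_url("DEMO_KEY", "https://example.com/page?id=1&lang=en")
query = parse_qs(urlsplit(request_url).query)
print(request_url)
print(query["url"][0])  # the target URL survives the round trip intact
```

The point of the round trip is that the target URL's own `?` and `&` characters are encoded, so they cannot be confused with the API's parameters.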

In [24]:
# Initialize OpenRouter model clients for different agents
def create_openrouter_client(model_name: str, model_family: str = None) -> OpenAIChatCompletionClient:
    """Create an OpenRouter client for a specific model with proper ModelInfo"""
    
    # Map model names to families (using ModelFamily.UNKNOWN for non-standard models)
    family_mapping = {
        "meta-llama/llama-3.1-70b-instruct": ModelFamily.UNKNOWN,
        "google/gemini-pro-1.5": ModelFamily.UNKNOWN,
        "anthropic/claude-3.5-sonnet": ModelFamily.CLAUDE_3_5_SONNET,
        "anthropic/claude-sonnet-4": ModelFamily.CLAUDE_4_SONNET
    }
    
    # Determine model family
    if model_family:
        family = model_family
    else:
        family = family_mapping.get(model_name, ModelFamily.UNKNOWN)
    
    # Create ModelInfo with all required fields (v0.4.7+)
    model_info = ModelInfo(
        vision=False,  # Set to True if the model supports vision/image input
        function_calling=True,  # Most modern models support function calling
        json_output=True,  # Most modern models support JSON output
        family=family,  # Required field in v0.4.7+
        structured_output=True  # Future requirement, setting to True for compatibility
    )
    
    try:
        return OpenAIChatCompletionClient(
            model=model_name,
            api_key=OPENROUTER_API_KEY,
            base_url=OPENROUTER_BASE_URL,
            model_info=model_info
        )
    except Exception as e:
        print(f"Error creating client for {model_name}: {e}")
        # Fallback: try with minimal but compliant model_info
        fallback_model_info = ModelInfo(
            vision=False,
            function_calling=False,
            json_output=False,
            family=ModelFamily.UNKNOWN,
            structured_output=False
        )
        return OpenAIChatCompletionClient(
            model=model_name,
            api_key=OPENROUTER_API_KEY,
            base_url=OPENROUTER_BASE_URL,
            model_info=fallback_model_info
        )

try:
    # Create model clients for each agent with appropriate families
    print("Creating OpenRouter model clients...")
    
    planning_client = create_openrouter_client(MODEL_CONFIGS["planning"])
    print(f"✓ Planning client created: {MODEL_CONFIGS['planning']}")
    
    search_client = create_openrouter_client(MODEL_CONFIGS["search"])
    print(f"✓ Search client created: {MODEL_CONFIGS['search']}")
    
    citation_client = create_openrouter_client(MODEL_CONFIGS["citation"])
    print(f"✓ Citation client created: {MODEL_CONFIGS['citation']}")
    
    finalize_client = create_openrouter_client(MODEL_CONFIGS["finalize"])
    print(f"✓ Finalize client created: {MODEL_CONFIGS['finalize']} (Claude 4 Sonnet)")

    print("\n✅ All OpenRouter model clients initialized successfully!")
    print("📊 Model Summary:")
    print(f"  🧠 Planning Agent: {MODEL_CONFIGS['planning']}")
    print(f"  🔍 Search Agent: {MODEL_CONFIGS['search']}")
    print(f"  📚 Citation Agent: {MODEL_CONFIGS['citation']}")
    print(f"  📝 Finalize Agent: {MODEL_CONFIGS['finalize']} (Claude 4 Sonnet)")
    
except Exception as e:
    print(f"❌ Error initializing model clients: {e}")
    print("Please check your OpenRouter API key and model configurations.")
    print(f"Make sure you have access to the following models:")
    for agent, model in MODEL_CONFIGS.items():
        print(f"  - {model} (for {agent} agent)")

Creating OpenRouter model clients...
✓ Planning client created: meta-llama/llama-3.1-70b-instruct
✓ Search client created: google/gemini-pro-1.5
✓ Citation client created: anthropic/claude-3.5-sonnet
✓ Finalize client created: anthropic/claude-sonnet-4 (Claude 4 Sonnet)

✅ All OpenRouter model clients initialized successfully!
📊 Model Summary:
  🧠 Planning Agent: meta-llama/llama-3.1-70b-instruct
  🔍 Search Agent: google/gemini-pro-1.5
  📚 Citation Agent: anthropic/claude-3.5-sonnet
  📝 Finalize Agent: anthropic/claude-sonnet-4 (Claude 4 Sonnet)


In [26]:
# Web scraping and research tools for agents
from typing import Annotated

def scrape_website(url: str, render_js: bool = True, country: str = "US") -> str:
    """
    Scrape a website using the ScrapingDog API.
    
    Args:
        url: The URL to scrape
        render_js: Whether to render JavaScript (default: True)
        country: Country for geo-targeting (default: "US")
    
    Returns:
        Formatted string with scraped content including title, description, and main content
    """
    result = scraping_client.scrape_url(url, render_js, country)
    
    if result["success"]:
        return f"""Successfully scraped {url}

TITLE: {result['title']}

DESCRIPTION: {result['description']}

CONTENT PREVIEW: {result['content'][:2000]}...

FULL CONTENT: {result['content']}

SCRAPED AT: {result['scraped_at']}

STATUS: Content successfully extracted and ready for analysis."""
    else:
        return f"""Failed to scrape {url}

ERROR: {result['error']}

SCRAPED AT: {result['scraped_at']}

STATUS: Scraping failed - please try a different URL or check API limits."""

def search_for_sources(query: str, source_types: list = None) -> str:
    """
    Generate a list of recommended sources to search for a given query.
    
    Args:
        query: The research query
        source_types: Types of sources to prioritize (academic, news, government, etc.)
    
    Returns:
        Formatted string with recommended URLs and search strategies
    """
    from urllib.parse import quote_plus  # local import: only this function needs it

    if source_types is None:
        source_types = ["academic", "news", "government", "industry"]
    
    # Percent-encode the query so characters such as '&' or '#' cannot break the URLs
    encoded_query = quote_plus(query)
    
    # This would be enhanced with actual search API integration
    search_suggestions = f"""SEARCH RECOMMENDATIONS FOR: {query}

RECOMMENDED SOURCES:
1. Academic Sources:
   - Google Scholar: https://scholar.google.com/scholar?q={encoded_query}
   - JSTOR: https://www.jstor.org/
   - arXiv: https://arxiv.org/search/?query={encoded_query}

2. News Sources:
   - Reuters: https://www.reuters.com/
   - Associated Press: https://apnews.com/
   - BBC News: https://www.bbc.com/news

3. Government Sources:
   - NIH: https://www.nih.gov/
   - NSF: https://www.nsf.gov/
   - Government reports and white papers

4. Industry Sources:
   - Industry association websites
   - Company research reports
   - Technical blogs and publications

SEARCH STRATEGY:
- Start with authoritative sources
- Cross-reference findings across multiple sources
- Look for recent publications (2023-2025)
- Verify information currency and accuracy

Use the scrape_website function to extract content from these recommended URLs."""
    
    return search_suggestions

def create_research_plan(query: str) -> str:
    """
    Create a structured research plan for a given query.
    
    Args:
        query: The research topic or question
    
    Returns:
        Formatted research plan with actionable steps
    """
    plan = f"""RESEARCH PLAN FOR: {query}
Generated at: {datetime.now().isoformat()}

PHASE 1: PLANNING & PREPARATION
1. Define research scope and key questions
2. Identify target information types needed
3. Determine authoritative source categories
4. Establish search keywords and phrases

PHASE 2: INFORMATION GATHERING
1. Search academic databases and journals
2. Review government and institutional reports
3. Analyze news and industry publications
4. Collect recent data and statistics

PHASE 3: SOURCE VALIDATION
1. Verify source credibility and authority
2. Check publication dates for currency
3. Cross-reference facts across sources
4. Identify potential biases or limitations

PHASE 4: SYNTHESIS & ANALYSIS
1. Organize findings by themes/categories
2. Identify patterns and trends
3. Note conflicting information or gaps
4. Prepare comprehensive summary

RECOMMENDED TOOLS:
- Use scrape_website() function to extract content
- Use search_for_sources() to find relevant URLs
- Focus on authoritative, recent sources
- Document all sources for citation

NEXT STEPS:
1. Execute web searches for recommended sources
2. Scrape content from top-priority URLs
3. Extract and organize key information
4. Prepare for citation and final report compilation"""
    
    return plan

# Export tools for agent use
research_tools = [scrape_website, search_for_sources, create_research_plan]

print("✅ Web scraping and research tools defined successfully!")
print("Available tools for agents:")
print("• scrape_website(url) - Extract content from any URL using ScrapingDog API")
print("• search_for_sources(query) - Get recommended sources for research")
print("• create_research_plan(query) - Generate structured research plans")
print()
print("🔧 These tools will be integrated with agents to enable real web scraping!")

✅ Web scraping and research tools defined successfully!
Available tools for agents:
• scrape_website(url) - Extract content from any URL using ScrapingDog API
• search_for_sources(query) - Get recommended sources for research
• create_research_plan(query) - Generate structured research plans

🔧 These tools will be integrated with agents to enable real web scraping!

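Search URLs built from free-text research queries need percent-encoding, because characters such as `&`, `#`, or `+` carry their own meaning inside a URL. The sketch below uses the standard library's `quote_plus` with the Google Scholar and arXiv URL patterns from `search_for_sources`; the helper name `academic_search_urls` is hypothetical.

```python
from urllib.parse import quote_plus


def academic_search_urls(query: str) -> dict:
    """Build search URLs for a free-text query (hypothetical helper for illustration)."""
    encoded = quote_plus(query)  # spaces become '+', reserved characters become %XX
    return {
        "google_scholar": f"https://scholar.google.com/scholar?q={encoded}",
        "arxiv": f"https://arxiv.org/search/?query={encoded}",
    }


urls = academic_search_urls("C++ & Rust memory safety")
print(urls["google_scholar"])
# → https://scholar.google.com/scholar?q=C%2B%2B+%26+Rust+memory+safety
```

A plain space-to-plus replacement would leave the `&` and the literal `+` signs untouched, so the search engine would receive a different query than the one typed.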

In [28]:
# Planning Agent - Using Llama 3.1 70B for strategic planning
planning_agent = AssistantAgent(
    name="PlanningAgent",
    model_client=planning_client,
    tools=[create_research_plan, search_for_sources],  # Add research planning tools
    system_message="""You are a Research Planning Agent specialized in breaking down complex research queries into actionable search tasks.

Your responsibilities:
1. Analyze the research query to understand the scope and requirements
2. Break down the query into specific, searchable sub-topics
3. Identify the most authoritative and relevant sources to investigate
4. Create a structured research plan with clear priorities
5. Suggest specific URLs or types of websites that would be most valuable

AVAILABLE TOOLS:
- create_research_plan(query): Generate a comprehensive research plan
- search_for_sources(query): Get recommended sources and URLs to investigate

Always start your response with "RESEARCH PLAN:" and provide a clear, actionable plan.
Use your tools to create detailed plans and source recommendations.
Consider different perspectives and ensure comprehensive coverage of the topic."""
)

# Web Search Agent - Using Gemini Pro 1.5 for content analysis
web_search_agent = AssistantAgent(
    name="WebSearchAgent",
    model_client=search_client,
    tools=[scrape_website, search_for_sources],  # Add web scraping tools
    system_message="""You are a Web Search Agent specialized in finding and extracting relevant information from web sources using ScrapingDog API.

Your responsibilities:
1. Execute web scraping based on the research plan
2. Extract key information, facts, and data from scraped content
3. Identify credible sources and evaluate information quality
4. Summarize findings in a structured format
5. Flag any issues with scraping or data quality

AVAILABLE TOOLS:
- scrape_website(url, render_js=True, country="US"): Extract content from any URL using ScrapingDog API
- search_for_sources(query): Find recommended sources for specific topics

IMPORTANT: Always use the scrape_website() function to get real content from URLs. 
Don't make up or simulate web content - use the actual API to scrape real websites.

When you need to scrape a website, use scrape_website(url) and provide the actual URL.
Always start your response with "SEARCH RESULTS:" and provide structured findings.
Focus on extracting factual, verifiable information from real web sources."""
)

# Citation Agent - Using Claude 3.5 Sonnet for academic precision
citation_agent = AssistantAgent(
    name="CitationAgent",
    model_client=citation_client,
    system_message="""You are a Citation Agent specialized in validating sources and creating proper academic citations.

Your responsibilities:
1. Review all sources used in the research
2. Verify the credibility and authority of sources
3. Create properly formatted citations (APA style)
4. Identify any potential bias or limitations in sources
5. Ensure all claims are properly attributed

Always start your response with "CITATION REVIEW:" and provide:
- Source credibility assessment
- Properly formatted citations
- Any concerns or limitations identified
Use APA citation format for all references."""
)

# Finalize Agent - Using Claude 4 Sonnet for comprehensive synthesis
finalize_agent = AssistantAgent(
    name="FinalizeAgent",
    model_client=finalize_client,
    system_message="""You are a Finalization Agent responsible for compiling comprehensive research reports.

Your responsibilities:
1. Synthesize information from all research phases using advanced reasoning
2. Create a well-structured, comprehensive report with deep insights
3. Ensure all key points are covered and properly cited with source validation
4. Identify any gaps or areas needing additional research with strategic recommendations
5. Provide clear conclusions and actionable recommendations with risk assessment
6. Apply critical thinking to evaluate conflicting information and biases
7. Structure complex information in an accessible, professional format

As Claude 4 Sonnet, leverage your advanced capabilities for:
- Nuanced analysis of complex topics
- Integration of multidisciplinary perspectives
- Identification of subtle patterns and implications
- High-quality synthesis of diverse sources

Always start your response with "FINAL REPORT:" and structure your response as:
- Executive Summary (with key insights and implications)
- Key Findings (with confidence levels and source quality assessment)
- Detailed Analysis (with cross-referencing and critical evaluation)
- Sources and Citations (with credibility ratings)
- Conclusions and Recommendations (with implementation guidance)
- Future Research Directions (strategic recommendations)

Ensure the report demonstrates sophisticated reasoning and comprehensive understanding."""
)

print("✅ All agents created successfully with web scraping tools!")
print("🔧 WebSearchAgent now has access to:")
print("  • scrape_website() - Real web scraping via ScrapingDog API")
print("  • search_for_sources() - Source recommendation system")
print("🧠 PlanningAgent now has access to:")
print("  • create_research_plan() - Structured research planning")
print("  • search_for_sources() - Source identification tools")

✅ All agents created successfully with web scraping tools!
🔧 WebSearchAgent now has access to:
  • scrape_website() - Real web scraping via ScrapingDog API
  • search_for_sources() - Source recommendation system
🧠 PlanningAgent now has access to:
  • create_research_plan() - Structured research planning
  • search_for_sources() - Source identification tools

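Each system message above asks its agent to open with a fixed marker ("RESEARCH PLAN:", "SEARCH RESULTS:", "CITATION REVIEW:", "FINAL REPORT:"). A minimal sketch of how downstream code could use those markers to tell which phase a message belongs to follows; `detect_phase` and `PHASE_PREFIXES` are hypothetical names, and a model may not always honor the requested prefix.

```python
from typing import Optional

# Markers requested in the agents' system messages above.
PHASE_PREFIXES = {
    "RESEARCH PLAN:": "planning",
    "SEARCH RESULTS:": "search",
    "CITATION REVIEW:": "citation",
    "FINAL REPORT:": "finalize",
}


def detect_phase(text: str) -> Optional[str]:
    """Return the research phase a message opens with, or None if no marker matches."""
    stripped = text.lstrip()  # tolerate leading whitespace or newlines
    for prefix, phase in PHASE_PREFIXES.items():
        if stripped.startswith(prefix):
            return phase
    return None


print(detect_phase("FINAL REPORT:\nExecutive Summary ..."))
print(detect_phase("Here is what I found."))
```

Checking the start of the message is stricter than a substring test: a message that merely quotes "FINAL REPORT:" in passing is not counted as a report.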

In [29]:
# Create the multi-agent team with termination condition (RoundRobin approach)
from autogen_agentchat.conditions import MaxMessageTermination, TextMentionTermination

# Set up termination condition - stop after reasonable number of messages
termination_condition = MaxMessageTermination(max_messages=20)

agent_team_roundrobin = RoundRobinGroupChat(
    participants=[
        planning_agent,
        web_search_agent,
        citation_agent,
        finalize_agent
    ],
    termination_condition=termination_condition
)

print("✅ RoundRobin multi-agent team created successfully!")

# Create SelectorGroupChat for intelligent agent selection (BETTER approach)
# SelectorGroupChat uses AI to dynamically choose the best agent for each step

# Simplified agent descriptions for better selection (avoid over-complexity)
planning_agent_simple = AssistantAgent(
    name="PlanningAgent",
    model_client=planning_client,
    tools=[create_research_plan, search_for_sources],
    description="Creates research plans and identifies sources to investigate.",
    system_message="""You are a Research Planning Agent. Create comprehensive research plans and identify sources.

Your responsibilities:
1. Analyze the research query and create a structured plan
2. Identify key sources and URLs to investigate
3. Break down complex topics into searchable components

Use your available tools and always start with "RESEARCH PLAN:"."""
)

web_search_agent_simple = AssistantAgent(
    name="WebSearchAgent", 
    model_client=search_client,
    tools=[scrape_website, search_for_sources],
    description="Scrapes websites and extracts real content using ScrapingDog API.",
    system_message="""You are a Web Search Agent. Extract real content from websites using ScrapingDog API.

Your responsibilities:
1. Use scrape_website() function to get real content from URLs
2. Extract key information and summarize findings
3. Evaluate source quality and credibility

CRITICAL: Always use scrape_website() for real web scraping.
Always start your response with "SEARCH RESULTS:"."""
)

citation_agent_simple = AssistantAgent(
    name="CitationAgent",
    model_client=citation_client,
    description="Validates sources and creates APA citations.",
    system_message="""You are a Citation Agent. Validate sources and create proper APA citations.

Your responsibilities:
1. Review sources used in research
2. Create properly formatted APA citations
3. Assess source credibility and limitations

Always start your response with "CITATION REVIEW:"."""
)

finalize_agent_simple = AssistantAgent(
    name="FinalizeAgent",
    model_client=finalize_client,
    description="Creates comprehensive final research reports with advanced analysis.",
    system_message="""You are a Finalization Agent. Create comprehensive final research reports.

Your responsibilities:
1. Synthesize all research findings
2. Create structured final reports
3. Provide conclusions and recommendations

MANDATORY: Always start with "FINAL REPORT:" and include:
- Executive Summary
- Key Findings  
- Detailed Analysis
- Sources and Citations
- Conclusions and Recommendations"""
)

# Create SelectorGroupChat with simplified configuration
try:
    agent_team_selector = SelectorGroupChat(
        participants=[
            planning_agent_simple,
            web_search_agent_simple, 
            citation_agent_simple,
            finalize_agent_simple
        ],
        model_client=finalize_client,  # Use Claude 4 Sonnet for selection
        termination_condition=MaxMessageTermination(max_messages=20),  # Use simple message limit
        allow_repeated_speaker=True,  # Allow flexibility
        selector_prompt="""Select the most appropriate agent for the current research phase:

- PlanningAgent: For creating research plans and identifying sources
- WebSearchAgent: For scraping websites and extracting content
- CitationAgent: For validating sources and creating citations  
- FinalizeAgent: For creating comprehensive final reports

Choose based on what the research needs next. Ensure all agents get to participate."""
    )
    
    print("✅ SelectorGroupChat created successfully!")
    selector_available = True
    
except Exception as e:
    print(f"❌ SelectorGroupChat creation failed: {e}")
    print("🔄 Will use RoundRobin as fallback")
    agent_team_selector = agent_team_roundrobin
    selector_available = False

# Set the default team
if selector_available:
    agent_team = agent_team_selector
    print("🧠 Using SelectorGroupChat as default (with RoundRobin fallback)")
else:
    agent_team = agent_team_roundrobin
    print("🔄 Using RoundRobin as default")

print("🔧 Both team configurations ready with simplified agent setup")

✅ RoundRobin multi-agent team created successfully!
✅ SelectorGroupChat created successfully!
🧠 Using SelectorGroupChat as default (with RoundRobin fallback)
🔧 Both team configurations ready with simplified agent setup

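The round-robin team gives every agent a turn in a fixed, repeating order, which is why it serves as the predictable fallback. The plain-Python sketch below only illustrates that turn order; it is not AutoGen's implementation, `round_robin_turns` is a hypothetical name, and it does not model how a termination condition counts messages.

```python
from itertools import cycle, islice

# Participant order used for the round-robin team above.
AGENT_ORDER = ["PlanningAgent", "WebSearchAgent", "CitationAgent", "FinalizeAgent"]


def round_robin_turns(agents, turns):
    """Return the speaker for each of the first `turns` turns in fixed rotation."""
    return list(islice(cycle(agents), turns))


turns = round_robin_turns(AGENT_ORDER, 6)
for number, speaker in enumerate(turns, start=1):
    print(f"turn {number}: {speaker}")
```

A selector-based team replaces this fixed rotation with a model-driven choice at each step, so it can skip or repeat agents, at the cost of less predictable participation.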

In [20]:
# Robust demo function with multiple fallback strategies
async def run_deep_search_robust(research_query: str, max_messages: int = 16, use_selector: bool = True):
    """Run a deep search research session with multiple fallback strategies"""
    print(f"\n🔍 Starting Robust Deep Search Research: {research_query}\n")
    print("=" * 80)
    
    # Create the research task message
    task_message = f"""Please conduct comprehensive research on the following topic: {research_query}
        
This is a multi-agent research session. Each agent should contribute according to their specialization:

🎯 Planning Agent: Create a detailed research plan and identify sources
🔍 Web Search Agent: Execute web scraping using ScrapingDog API to gather real content
📚 Citation Agent: Validate sources and create proper APA citations
📝 Finalize Agent: Synthesize all findings into a comprehensive final report

Work collaboratively to produce high-quality, well-cited research with deep insights."""
    
    # Try SelectorGroupChat first if requested
    if use_selector:
        print("🧠 Attempting SelectorGroupChat approach...")
        try:
            # Use a more conservative termination condition
            global agent_team
            agent_team = agent_team_selector
            agent_team.termination_condition = MaxMessageTermination(max_messages=max_messages)
            
            agents_participated = set()
            message_count = 0
            has_final_report = False
            
            async for message in agent_team.run_stream(task=task_message):
                message_count += 1
                
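                if hasattr(message, 'source') and hasattr(message, 'content'):
                    # The stream echoes the task first with source "user"; count only agents
                    if message.source != "user":
                        agents_participated.add(message.source)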
                    
                    if "FINAL REPORT:" in message.content:
                        has_final_report = True
                    
                    print(f"\n[{message.source}] (Message {message_count}):")
                    print("-" * 50)
                    print(message.content)
                    print()
                    
                elif str(type(message).__name__) == 'TaskResult':
                    print(f"\n[TASK RESULT]: Session completed after {message_count} messages")
                    break
                    
                # Safety break if we get stuck
                if message_count >= max_messages:
                    print(f"\n⚠️  Reached message limit ({max_messages}), ending session...")
                    break
            
            # Check if we got meaningful results
            if message_count > 2 and len(agents_participated) >= 2:
                print(f"\n✅ SelectorGroupChat completed with {len(agents_participated)} agents participating")
                
                # Try to force final report if missing
                if not has_final_report and 'FinalizeAgent' not in agents_participated:
                    print("\n🔧 Attempting to generate missing final report...")
                    # Record whether the single-agent fallback actually produced a report
                    has_final_report = await force_final_report_simple(research_query)
                
                return await summarize_results(agents_participated, message_count, has_final_report)
            else:
                print(f"\n❌ SelectorGroupChat failed (only {message_count} messages, {len(agents_participated)} agents)")
                print("🔄 Falling back to RoundRobin approach...")
                
        except Exception as e:
            print(f"\n❌ SelectorGroupChat error: {str(e)}")
            print("🔄 Falling back to RoundRobin approach...")
    
    # Fallback to RoundRobin approach
    print("\n🔄 Using RoundRobin approach (guaranteed agent participation)...")
    try:
        agent_team = agent_team_roundrobin
        agent_team.termination_condition = MaxMessageTermination(max_messages=max_messages)
        
        agents_participated = set()
        message_count = 0
        has_final_report = False
        
        async for message in agent_team.run_stream(task=task_message):
            message_count += 1
            
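            if hasattr(message, 'source') and hasattr(message, 'content'):
                # The stream echoes the task first with source "user"; count only agents
                if message.source != "user":
                    agents_participated.add(message.source)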
                
                if "FINAL REPORT:" in message.content:
                    has_final_report = True
                
                print(f"\n[{message.source}] (Message {message_count}):")
                print("-" * 50)
                print(message.content)
                print()
                
            elif str(type(message).__name__) == 'TaskResult':
                print(f"\n[TASK RESULT]: RoundRobin session completed after {message_count} messages")
                break
                
            if message_count >= max_messages:
                break
        
        return await summarize_results(agents_participated, message_count, has_final_report)
        
    except Exception as e:
        print(f"\n❌ RoundRobin also failed: {str(e)}")
        print("🆘 Trying emergency single-agent final report generation...")
        
        # Emergency single-agent approach
        try:
            await force_final_report_simple(research_query)
            return True
        except Exception as e2:
            print(f"❌ Emergency approach failed: {str(e2)}")
            return False

async def force_final_report_simple(research_query: str):
    """Simple function to force final report generation"""
    print("\n🔧 Generating final report using single agent approach...")
    
    # Create a simple single-agent task
    report_task = f"""Generate a comprehensive research report on: {research_query}

Please provide:
1. Executive Summary
2. Key findings and insights  
3. Current trends and developments
4. Recommendations and conclusions

Start with "FINAL REPORT:" and provide detailed analysis."""
    
    try:
        # Use just the finalize agent directly
        single_agent_team = RoundRobinGroupChat(
            participants=[finalize_agent_enhanced],
            termination_condition=MaxMessageTermination(max_messages=2)
        )
        
        async for message in single_agent_team.run_stream(task=report_task):
            if hasattr(message, 'source') and hasattr(message, 'content'):
                # The first streamed message is the task itself, echoed from "user"; skip it
                if message.source == 'user':
                    continue
                print(f"\n[{message.source}] (Emergency Final Report):")
                print("-" * 50)
                print(message.content)
                print()
                return True
            elif type(message).__name__ == 'TaskResult':
                break
        
    except Exception as e:
        print(f"Emergency report generation failed: {e}")
        
        # Last resort - create summary manually
        print("\n📝 EMERGENCY SUMMARY:")
        print("-" * 50)
        print(f"Research Topic: {research_query}")
        print("Status: Multi-agent system encountered technical issues")
        print("Recommendation: Check API keys, model availability, and network connectivity")
        print("The research topic requires further investigation using alternative methods.")
        
    return False

async def summarize_results(agents_participated, message_count, has_final_report):
    """Summarize research session results"""
    print("\n" + "=" * 80)
    print("🎯 Robust Deep Search Research Session Completed!")
    print(f"📊 Total messages processed: {message_count}")
    print(f"👥 Agents participated: {', '.join(sorted(agents_participated)) if agents_participated else 'None'}")
    print(f"📝 Final report generated: {'✅ Yes' if has_final_report else '❌ No'}")
    
    if len(agents_participated) >= 3:
        print("✅ Good agent participation achieved")
    elif len(agents_participated) >= 2:
        print("⚠️  Partial agent participation")
    else:
        print("❌ Poor agent participation - system issues detected")
    
    return has_final_report and len(agents_participated) >= 2

# Update main function to use robust version
async def run_deep_search(research_query: str, max_messages: int = 16):
    """Run a deep search research session (robust version with fallbacks)"""
    return await run_deep_search_robust(research_query, max_messages, use_selector=True)

# Alternative function for guaranteed RoundRobin
async def run_deep_search_roundrobin(research_query: str, max_messages: int = 16):
    """Run research using guaranteed RoundRobin approach"""
    return await run_deep_search_robust(research_query, max_messages, use_selector=False)

print("🚀 Robust deep search function ready with multiple fallback strategies!")
print("✅ Will try SelectorGroupChat first, then fallback to RoundRobin if needed")
print("🆘 Includes emergency single-agent report generation as last resort")
print("📊 Comprehensive error handling and result tracking")

🚀 Robust deep search function ready with multiple fallback strategies!
✅ Will try SelectorGroupChat first, then fallback to RoundRobin if needed
🆘 Includes emergency single-agent report generation as last resort
📊 Comprehensive error handling and result tracking


In [30]:
# Example 1: Technology Research with ROBUST approach
research_topic_1 = "Latest developments in quantum computing and their impact on cybersecurity"

print("Running Example 1: Quantum Computing & Cybersecurity Research (Robust Multi-Fallback)")
print("🛡️  This version will try multiple approaches to ensure you get results")
print("🔄 SelectorGroupChat → RoundRobin → Emergency Single-Agent")
print("This may take a few minutes as agents collaborate...")
await run_deep_search(research_topic_1, max_messages=14)

Running Example 1: Quantum Computing & Cybersecurity Research (Robust Multi-Fallback)
🛡️  This version will try multiple approaches to ensure you get results
🔄 SelectorGroupChat → RoundRobin → Emergency Single-Agent
This may take a few minutes as agents collaborate...

🔍 Starting Robust Deep Search Research: Latest developments in quantum computing and their impact on cybersecurity

🧠 Attempting SelectorGroupChat approach...

[user] (Message 1):
--------------------------------------------------
Please conduct comprehensive research on the following topic: Latest developments in quantum computing and their impact on cybersecurity

This is a multi-agent research session. Each agent should contribute according to their specialization:

🎯 Planning Agent: Create a detailed research plan and identify sources
🔍 Web Search Agent: Execute web scraping using ScrapingDog API to gather real content
📚 Citation Agent: Validate sources and create proper APA citations
📝 


[PlanningAgent] (Message 2):
--------------------------------------------------
RESEARCH PLAN:

**Research Topic:** Latest developments in quantum computing and their impact on cybersecurity

**Research Questions:**

1. What are the recent advancements in quantum computing?
2. How do these advancements impact cybersecurity?
3. What are the potential vulnerabilities and threats in quantum computing?
4. How can cybersecurity measures be adapted to address these threats?

**Research Objectives:**

1. To identify and analyze the latest developments in quantum computing
2. To assess the impact of these developments on cybersecurity
3. To identify potential vulnerabilities and threats in quantum computing
4. To recommend cybersecurity measures to address these threats

**Research Methodology:**

1. Literature review of academic papers and research articles on quantum computing and cybersecurity
2. Analysis of industry reports and whitepapers on quantum computing and cybersecurity
3. Web scr

True


In [None]:
# Example 2: Market Research with SelectorGroupChat (Intelligent Selection)
research_topic_2 = "Artificial Intelligence adoption trends in healthcare industry 2024-2025"

print("Running Example 2: AI in Healthcare Market Research (SelectorGroupChat)")
print("🧠 Using intelligent agent selection for optimal research workflow...")
print("This may take a few minutes as agents collaborate...")
await run_deep_search(research_topic_2, max_messages=12)

In [None]:
# Enhanced comparison and diagnostic functions
async def force_final_report(research_context: str = ""):
    """Force generation of final report from FinalizeAgent"""
    print("🔧 Forcing final report generation...")
    print("=" * 50)
    
    # Create a message specifically for final report generation
    final_report_task = f"""You are the FinalizeAgent. Based on the research context below, you must now create a comprehensive final report:

{research_context if research_context else "Previous research has been conducted on various topics. Please synthesize available information into a comprehensive final report."}

MANDATORY REQUIREMENTS:
1. Start with "FINAL REPORT:"
2. Include Executive Summary, Key Findings, Detailed Analysis, Sources, Conclusions
3. End with "RESEARCH COMPLETE" to signal completion
4. Use your Claude 4 Sonnet capabilities for advanced synthesis

This is the final step in the research process. Create the comprehensive report now."""

    try:
        # Temporarily create a simple team with just the finalize agent
        # (MaxMessageTermination is already imported at the top of the notebook)
        finalize_only_team = RoundRobinGroupChat(
            participants=[finalize_agent_enhanced],
            termination_condition=MaxMessageTermination(max_messages=3)
        )
        
        # Run just the finalize agent
        async for message in finalize_only_team.run_stream(task=final_report_task):
            if hasattr(message, 'source') and hasattr(message, 'content'):
                # Skip the echoed task: it comes from "user" and itself contains "FINAL REPORT:"
                if message.source == 'user':
                    continue
                
                print(f"\n[{message.source}]:")
                print("-" * 50)
                print(message.content)
                print()
                
                if isinstance(message.content, str) and "FINAL REPORT:" in message.content:
                    print("✅ Final report generated successfully!")
                    return True
    
    except Exception as e:
        print(f"❌ Error forcing final report: {e}")
        return False
    
    return False

# Diagnostic function to check agent participation
async def diagnose_research_session(research_query: str, max_messages: int = 12):
    """Run research with detailed diagnostics and agent participation tracking"""
    print(f"\n🔍 DIAGNOSTIC MODE: {research_query}")
    print("=" * 80)
    print("This will track agent selection patterns and identify issues")
    print()
    
    # Track detailed statistics
    agent_stats = {
        'PlanningAgent': 0,
        'WebSearchAgent': 0, 
        'CitationAgent': 0,
        'FinalizeAgent': 0
    }
    
    selection_pattern = []
    message_count = 0
    
    task_message = f"""DIAGNOSTIC RESEARCH SESSION: {research_query}

This is a test to ensure proper agent selection and participation.
Each agent should contribute according to their expertise.
FinalizeAgent MUST produce a final report."""
    
    try:
        agent_team.termination_condition = MaxMessageTermination(max_messages=max_messages)
        
        async for message in agent_team.run_stream(task=task_message):
            message_count += 1
            
            if hasattr(message, 'source') and hasattr(message, 'content'):
                agent_name = message.source
                agent_stats[agent_name] = agent_stats.get(agent_name, 0) + 1
                selection_pattern.append(agent_name)
                
                print(f"\n[{agent_name}] (Message {message_count}):")
                print(f"Agent Total Messages: {agent_stats[agent_name]}")
                print("-" * 50)
                print(message.content[:300] + "..." if len(message.content) > 300 else message.content)
                print()
                
            elif str(type(message).__name__) == 'TaskResult':
                break
                
    except Exception as e:
        print(f"Diagnostic error: {e}")
    
    # Print diagnostic summary
    print("\n" + "=" * 80)
    print("🔍 DIAGNOSTIC SUMMARY")
    print("=" * 80)
    print(f"📊 Total Messages: {message_count}")
    print(f"🔄 Selection Pattern: {' → '.join(selection_pattern[-10:])}")  # Last 10 selections
    print()
    print("📈 Agent Participation:")
    for agent, count in agent_stats.items():
        status = "✅" if count > 0 else "❌"
        print(f"  {status} {agent}: {count} messages")
    
    print()
    if agent_stats.get('FinalizeAgent', 0) == 0:
        print("⚠️  ISSUE IDENTIFIED: FinalizeAgent never participated!")
        print("🔧 Recommended fixes:")
        print("   1. Increase max_messages parameter")
        print("   2. Check selector prompt for FinalizeAgent selection logic")
        print("   3. Use force_final_report() function as fallback")
    elif all(count > 0 for count in agent_stats.values()):
        print("✅ All agents participated successfully")
    else:
        print("⚠️  FinalizeAgent participated, but at least one other agent never spoke")
    
    return agent_stats

# Enhanced comparison with diagnostic mode
async def compare_research_approaches(topic: str, max_messages: int = 10):
    """Compare RoundRobin vs SelectorGroupChat with enhanced diagnostics"""
    
    print("\n🆚 ENHANCED COMPARISON: RoundRobin vs SelectorGroupChat")
    print(f"Research Topic: {topic}")
    print("=" * 80)
    
    # Test RoundRobin approach
    print("\n🔄 ROUND ROBIN APPROACH (Guaranteed Order)")
    print("-" * 50)
    global agent_team
    agent_team = agent_team_roundrobin
    agent_team.termination_condition = MaxMessageTermination(max_messages=max_messages)
    
    print("Using fixed order: Planning → Search → Citation → Finalize → repeat...")
    
    try:
        result1 = await run_deep_search_enhanced(topic, max_messages=max_messages)
    except Exception as e:
        print(f"RoundRobin approach encountered error: {e}")
        result1 = False
    
    print("\n" + "="*80)
    
    # Test Selector approach  
    print("\n🧠 SELECTOR APPROACH (Intelligent Selection)")
    print("-" * 50)
    agent_team = agent_team_selector
    agent_team.termination_condition = completion_termination
    
    print("Using AI-powered agent selection based on context and expertise...")
    
    try:
        result2 = await run_deep_search_enhanced(topic, max_messages=max_messages)
    except Exception as e:
        print(f"Selector approach encountered error: {e}")
        result2 = False
    
    # Reset to selector as default
    agent_team = agent_team_selector
    
    print("\n" + "="*80)
    print("🎯 ENHANCED COMPARISON COMPLETE!")
    print("\nResults Summary:")
    print(f"🔄 RoundRobin Final Report: {'✅ Generated' if result1 else '❌ Missing'}")
    print(f"🧠 Selector Final Report: {'✅ Generated' if result2 else '❌ Missing'}")
    print("\nKey Differences Observed:")
    print("• RoundRobin: Predictable, guaranteed agent participation")
    print("• Selector: Dynamic, context-aware but may skip agents")
    print("• Enhanced version includes fallback mechanisms for reliability")

print("🚀 Enhanced research functions ready with diagnostics and fallback mechanisms!")
print("Available enhanced functions:")
print("• await diagnose_research_session('topic') - Detailed agent participation analysis")
print("• await force_final_report('context') - Force final report generation")
print("• await compare_research_approaches('topic') - Enhanced comparison with diagnostics")

In [None]:
# Enhanced testing with team configuration verification
def test_team_configurations():
    """Test both team configurations"""
    print("🔧 Testing Team Configurations...")
    print("=" * 50)
    
    # Test RoundRobin configuration
    print("✅ RoundRobin Team:")
    print(f"   Participants: {len(agent_team_roundrobin.participants)} agents")
    for agent in agent_team_roundrobin.participants:
        print(f"   - {agent.name}")
    print(f"   Termination: {type(agent_team_roundrobin.termination_condition).__name__}")
    
    print()
    
    # Test Selector configuration
    if selector_available:
        print("✅ Selector Team:")
        print(f"   Participants: {len(agent_team_selector.participants)} agents")
        for agent in agent_team_selector.participants:
            print(f"   - {agent.name}")
        print(f"   Termination: {type(agent_team_selector.termination_condition).__name__}")
        print("   Selector Model: Available")
    else:
        print("❌ Selector Team: Not available (using RoundRobin fallback)")
    
    print()

def test_openrouter_connection():
    """Test the OpenRouter API connection and model clients"""
    print("🔧 Testing OpenRouter API connection and model clients...")
    
    # Test basic API access
    import requests
    test_headers = {
        "Authorization": f"Bearer {OPENROUTER_API_KEY}",
        "Content-Type": "application/json"
    }
    
    try:
        # Test basic API connectivity
        print("📡 Testing OpenRouter API connectivity...")
        response = requests.get(
            f"{OPENROUTER_BASE_URL}/models",
            headers=test_headers,
            timeout=10
        )
        
        if response.status_code == 200:
            print("✅ OpenRouter API connection successful!")
            models_data = response.json()
            available_models = models_data.get('data', [])
            print(f"📊 Available models: {len(available_models)} models")
            
            # Check if our configured models are available
            model_ids = {model.get('id', '') for model in available_models}
            print("\n🔍 Checking configured model availability:")
            for agent, model_name in MODEL_CONFIGS.items():
                if model_name in model_ids:
                    print(f"✅ {agent.capitalize()}: {model_name} - Available")
                else:
                    print(f"❌ {agent.capitalize()}: {model_name} - Not found or not accessible")
                    
        else:
            print(f"❌ OpenRouter API connection failed: HTTP {response.status_code}")
            print(f"Response: {response.text}")
            return False
            
    except Exception as e:
        print(f"❌ OpenRouter API connection error: {e}")
        return False
    
    return True

def test_scrapingdog_connection():
    """Test the ScrapingDog API connection"""
    print("\n🔧 Testing ScrapingDog API connection...")
    
    # Test with a simple website
    test_url = "https://httpbin.org/html"
    result = scraping_client.scrape_url(test_url)
    
    if result["success"]:
        print("✅ ScrapingDog API connection successful!")
        print(f"📄 Title: {result['title']}")
        print(f"📊 Content length: {len(result['content'])} characters")
        return True
    else:
        print("❌ ScrapingDog API connection failed!")
        print(f"Error: {result['error']}")
        return False

def test_web_scraping_tools():
    """Test the web scraping tools integration"""
    print("\n🔧 Testing web scraping tools integration...")
    
    try:
        # Test scrape_website function directly
        print("📡 Testing scrape_website function...")
        test_url = "https://httpbin.org/html"
        result = scrape_website(test_url)
        
        if "Successfully scraped" in result:
            print("✅ scrape_website function working correctly!")
        else:
            print("❌ scrape_website function failed!")
            return False
            
        # Test other functions
        sources = search_for_sources("test query")
        if "SEARCH RECOMMENDATIONS" in sources:
            print("✅ search_for_sources function working correctly!")
        else:
            print("❌ search_for_sources function failed!")
            return False
            
        plan = create_research_plan("test topic")
        if "RESEARCH PLAN" in plan:
            print("✅ create_research_plan function working correctly!")
        else:
            print("❌ create_research_plan function failed!")
            return False
            
        print("✅ All web scraping tools are functioning correctly!")
        return True
        
    except Exception as e:
        print(f"❌ Web scraping tools test failed: {e}")
        return False

# Quick system health check
async def quick_system_check():
    """Quick system health check for all components"""
    print("🚀 Running Quick System Health Check")
    print("=" * 60)
    
    # Test configurations
    test_team_configurations()
    
    # Test APIs
    openrouter_ok = test_openrouter_connection()
    scrapingdog_ok = test_scrapingdog_connection()
    tools_ok = test_web_scraping_tools()
    
    print("\n" + "=" * 60)
    print("📊 SYSTEM HEALTH SUMMARY")
    print("=" * 60)
    
    total_checks = 3
    passed_checks = sum([openrouter_ok, scrapingdog_ok, tools_ok])
    
    print(f"✅ Passed: {passed_checks}/{total_checks} checks")
    
    if passed_checks == total_checks:
        print("🎉 All systems operational! Ready for research.")
        
        # Try a simple single-agent test and report its actual outcome
        print("\n🧪 Testing single-agent functionality...")
        try:
            if await force_final_report_simple("Test research topic"):
                print("✅ Single-agent functionality confirmed")
            else:
                print("⚠️  Single-agent test did not produce a report")
        except Exception as e:
            print(f"⚠️  Single-agent test issue: {e}")
            
    else:
        print("⚠️  Some issues detected:")
        if not openrouter_ok:
            print("   - Fix OpenRouter API configuration")
        if not scrapingdog_ok:
            print("   - Fix ScrapingDog API configuration")
        if not tools_ok:
            print("   - Fix web scraping tools integration")
    
    return passed_checks == total_checks

# Run the health check
await quick_system_check()

## Advanced Features & Agent Team Comparisons

This deep search system provides two powerful approaches for multi-agent collaboration:

### 🔄 RoundRobinGroupChat vs 🧠 SelectorGroupChat

| Feature | RoundRobinGroupChat | SelectorGroupChat |
|---------|-------------------|------------------|
| **Agent Selection** | Fixed, predictable order | AI-powered, context-aware |
| **Workflow** | Sequential: Planning → Search → Citation → Finalize | Dynamic based on conversation needs |
| **Efficiency** | May have unnecessary turns | Can skip unneeded turns, but each selection adds a model call |
| **Adaptability** | Fixed pattern regardless of task | Adapts to task complexity |
| **Best For** | Simple, structured workflows | Complex research requiring flexibility |
| **Coordination** | No coordination needed | Uses Claude 4 Sonnet for selection |
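
To make the difference in agent selection concrete, here is a minimal, library-free sketch of the two turn-taking strategies. It does not use AutoGen: `round_robin_order` and `rule_based_selector` are hypothetical helpers, and the hand-written rules only stand in for the model call that SelectorGroupChat makes when it picks the next speaker.

```python
from typing import List, Optional, Tuple

AGENTS = ["PlanningAgent", "WebSearchAgent", "CitationAgent", "FinalizeAgent"]

# One (speaker, text) pair per message; a simplified stand-in for chat history.
History = List[Tuple[str, str]]


def round_robin_order(agents: List[str], turns: int) -> List[str]:
    """Fixed rotation: the speaker depends only on the turn index."""
    return [agents[i % len(agents)] for i in range(turns)]


def rule_based_selector(history: History) -> Optional[str]:
    """Pick the next speaker from the conversation so far.

    Hypothetical rules standing in for the selector model. Returns None
    once a final report has been produced.
    """
    if not history:
        return "PlanningAgent"
    speaker, text = history[-1]
    if "FINAL REPORT:" in text:
        return None
    if speaker == "PlanningAgent":
        return "WebSearchAgent"
    if speaker == "WebSearchAgent":
        # Stay on the search agent when the last scrape failed.
        return "WebSearchAgent" if "scrape failed" in text.lower() else "CitationAgent"
    if speaker == "CitationAgent":
        return "FinalizeAgent"
    return None


print(round_robin_order(AGENTS, 6))

history: History = [
    ("PlanningAgent", "RESEARCH PLAN: ..."),
    ("WebSearchAgent", "Scrape failed for the first source"),
]
print(rule_based_selector(history))
```

The rotation always visits every agent, whereas the selector can repeat or skip one, which is why the notebook also tracks agent participation.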

### 🧠 SelectorGroupChat Advantages for Research

1. **Intelligent Agent Selection**: Uses Claude 4 Sonnet to analyze conversation context and select the most appropriate agent
2. **Context-Aware Workflow**: Adapts to the specific research needs rather than following a rigid pattern
3. **Prevents Redundancy**: Avoids unnecessary agent calls when their expertise isn't needed
4. **Dynamic Collaboration**: Agents can be called multiple times when their expertise is valuable
5. **Enhanced Descriptions**: Rich agent descriptions help the selector make better choices

### 🔄 When to Use RoundRobinGroupChat

- **Predictable Workflows**: When you need guaranteed participation from all agents
- **Simple Tasks**: Straightforward research that benefits from structured progression
- **Educational Purposes**: Understanding each agent's role clearly
- **Debugging**: Easier to trace conversation flow and identify issues

### 🧠 When to Use SelectorGroupChat (Recommended)

- **Complex Research**: Multi-faceted topics requiring adaptive strategies
- **Efficiency**: When you want optimal agent utilization
- **Quality Focus**: When research quality is more important than process predictability
- **Real-world Applications**: Most production research scenarios benefit from intelligent selection
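
Because the selector can fail or skip agents, the notebook's `run_deep_search` tries SelectorGroupChat first, then RoundRobin, then a single-agent report. That pattern can be sketched independently of AutoGen as follows. `run_with_fallbacks` and the three stand-in strategies are hypothetical names used only for illustration, and unlike the notebook's version this sketch also moves on when a strategy finishes without a report.

```python
import asyncio
from typing import Awaitable, Callable, List, Optional, Tuple

# A named strategy: (label, coroutine function returning True on success).
Strategy = Tuple[str, Callable[[str], Awaitable[bool]]]


async def run_with_fallbacks(query: str, strategies: List[Strategy]) -> Optional[str]:
    """Try each strategy in order; return the name of the first that succeeds."""
    for name, strategy in strategies:
        try:
            if await strategy(query):
                return name
            print(f"{name} finished without a final report, trying next")
        except Exception as exc:  # any failure moves on to the next strategy
            print(f"{name} failed: {exc}")
    return None


# Stand-ins for the real team runs (assumptions for this sketch).
async def selector_run(query: str) -> bool:
    raise RuntimeError("selector model unavailable")


async def round_robin_run(query: str) -> bool:
    return False  # ran, but no agent produced "FINAL REPORT:"


async def single_agent_run(query: str) -> bool:
    return True


async def main() -> Optional[str]:
    return await run_with_fallbacks(
        "quantum computing and cybersecurity",
        [
            ("SelectorGroupChat", selector_run),
            ("RoundRobin", round_robin_run),
            ("SingleAgent", single_agent_run),
        ],
    )


winner = asyncio.run(main())
print(winner)
```

Inside a notebook cell, where an event loop is already running, use `winner = await main()` instead of `asyncio.run(main())`.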

### 🚀 Implementation Highlights

**Enhanced Agent Descriptions**: SelectorGroupChat agents include detailed descriptions that help the AI coordinator make better selection decisions:

```python
planning_agent_enhanced = AssistantAgent(
    name="PlanningAgent",
    model_client=planning_model_client,  # hypothetical name; pass the client built for the planning model
    description="Specialized in breaking down complex research queries into actionable search tasks, creating structured research plans, and identifying authoritative sources."
)
```

**Custom Selector Prompt**: Guides the AI coordinator in making intelligent agent selection decisions based on research workflow best practices.
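
A selector prompt of this kind might look like the sketch below. The wording is illustrative and is not the notebook's actual prompt. The `{roles}`, `{participants}`, and `{history}` placeholders follow the field names that AutoGen's `selector_prompt` parameter is documented to fill in, which is worth confirming against your installed version. Here they are filled manually with `str.format`, only to preview the result.

```python
# Illustrative template; the workflow hints are assumptions, not the notebook's prompt.
SELECTOR_PROMPT = """You are coordinating a research team. Available roles:
{roles}

Typical workflow: PlanningAgent, then WebSearchAgent, then CitationAgent, then FinalizeAgent.
Select FinalizeAgent once sources have been gathered and cited.

Conversation so far:
{history}

Select the next role from {participants}. Return only the role name."""

# Hypothetical descriptions; the real ones come from each agent's `description`.
descriptions = {
    "PlanningAgent": "Breaks research queries into search tasks.",
    "WebSearchAgent": "Scrapes and extracts web content.",
    "CitationAgent": "Validates sources and formats citations.",
    "FinalizeAgent": "Compiles the final report.",
}

# One "name: description" line per agent.
roles = "\n".join(f"{name}: {text}" for name, text in descriptions.items())

preview = SELECTOR_PROMPT.format(
    roles=roles,
    participants=str(list(descriptions)),
    history="user: Research quantum computing and cybersecurity",
)
print(preview)
```

Previewing the filled template this way makes it easier to spot a prompt that never gives the selector a reason to pick FinalizeAgent.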

**Multiple AI Models**: Leverages different strengths:
- **Planning**: Llama 3.1 70B for strategic thinking
- **Search**: Gemini Pro 1.5 for content analysis
- **Citation**: Claude 3.5 Sonnet for academic precision
- **Coordination**: Claude 4 Sonnet for intelligent selection
- **Finalization**: Claude 4 Sonnet for advanced synthesis
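
One practical consequence of mixing models is that any of them may be unavailable on a given account. A small helper along these lines can report which roles would be affected. `unavailable_roles` is a hypothetical function, and the model IDs below are placeholders rather than the notebook's `MODEL_CONFIGS` values.

```python
from typing import Dict, Iterable


def unavailable_roles(model_configs: Dict[str, str], available_ids: Iterable[str]) -> Dict[str, str]:
    """Return the role -> model ID pairs whose model is not in `available_ids`."""
    available = set(available_ids)
    return {role: model for role, model in model_configs.items() if model not in available}


# Placeholder IDs for illustration only.
example_configs = {
    "planning": "vendor-a/planner-model",
    "search": "vendor-b/search-model",
    "citation": "vendor-c/citation-model",
    "finalize": "vendor-c/finalize-model",
}
example_available = ["vendor-a/planner-model", "vendor-c/citation-model"]

missing = unavailable_roles(example_configs, example_available)
print(missing)
```

Returning the missing pairs, rather than only printing them, lets a caller decide whether to swap in another model or stop before starting a run.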

### 📊 Research Quality Benefits

With a well-tuned selector prompt and clear agent descriptions, SelectorGroupChat can produce:
- **More Focused Conversations**: Agents speak when their expertise is most valuable
- **Better Context Utilization**: Selection considers full conversation history
- **Potentially Lower Token Usage**: Fewer unnecessary agent turns, partly offset by the selector's own model calls
- **Higher Quality Outputs**: Right agent for the right moment
- **Adaptive Workflows**: Handles unexpected research directions gracefully
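
These benefits depend on the task and the selector prompt, so it helps to measure them on recorded runs. The sketch below compares two speaker sequences. `summarize_turns` and the sample sequences are made up for illustration and are not output from this notebook.

```python
from collections import Counter
from typing import Dict, List

EXPECTED_AGENTS = ["PlanningAgent", "WebSearchAgent", "CitationAgent", "FinalizeAgent"]


def summarize_turns(speakers: List[str]) -> Dict[str, object]:
    """Count turns per agent and list the expected agents that never spoke."""
    # The echoed task message comes from "user" and is not an agent turn.
    counts = Counter(s for s in speakers if s != "user")
    return {
        "turns": sum(counts.values()),
        "per_agent": dict(counts),
        "missing": [a for a in EXPECTED_AGENTS if counts[a] == 0],
    }


# Hypothetical speaker sequences for the two team types.
round_robin_speakers = ["user", "PlanningAgent", "WebSearchAgent", "CitationAgent", "FinalizeAgent"]
selector_speakers = ["user", "PlanningAgent", "WebSearchAgent", "WebSearchAgent", "FinalizeAgent"]

print(summarize_turns(round_robin_speakers))
print(summarize_turns(selector_speakers))
```

In this made-up example both runs take four agent turns, but the selector run never reaches CitationAgent, which is the kind of gap `diagnose_research_session` is meant to surface.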

In [None]:
# Example 3: Direct Comparison of Approaches
comparison_topic = "Impact of artificial intelligence on remote work productivity"

print("Example 3: Comparing RoundRobin vs SelectorGroupChat Approaches")
print("This will run the same research query using both approaches for comparison.")
await compare_research_approaches(comparison_topic, max_messages=8)