# AutoGen Group Chat with LlamaIndex Agents

This notebook shows how to combine AutoGen's group chat with LlamaIndex document retrieval so that several specialized agents can cooperate on a single task.

## Overview

This example shows:
- Setting up AutoGen agents
- Integrating LlamaIndex for document retrieval
- Creating a group chat with multiple specialized agents
- Coordinating agents to solve complex tasks

## Prerequisites

Install the required packages (the code below imports the `autogen` module that `pyautogen` provides):
```bash
pip install pyautogen llama-index openai
```

In [None]:
# Import required libraries
import os
from typing import Dict, List, Optional

import autogen
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
from llama_index.llms.openai import OpenAI as LlamaOpenAI
from llama_index.embeddings.openai import OpenAIEmbedding

## Configuration

Set up your OpenAI API key and configure the LLM settings.

In [None]:
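# Set up the OpenAI API key. Prefer exporting OPENAI_API_KEY in your shell;
# setdefault keeps an existing value instead of overwriting it with the placeholder.
os.environ.setdefault("OPENAI_API_KEY", "your-api-key-here")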

# Configure AutoGen
config_list = [
    {
        "model": "gpt-4",
        "api_key": os.environ.get("OPENAI_API_KEY"),
    }
]

# LLM configuration
llm_config = {
    "timeout": 600,
    "cache_seed": 42,
    "config_list": config_list,
    "temperature": 0,
}
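
A placeholder key only fails once the first request is sent. A minimal sketch of building the same configuration with an early check is shown here; `build_llm_config` is a hypothetical helper, not part of AutoGen, and the dictionary keys simply mirror the cell above.

```python
import os


def build_llm_config(model: str = "gpt-4", env_var: str = "OPENAI_API_KEY") -> dict:
    """Build an AutoGen-style llm_config, failing early if the key is unset.

    Hypothetical helper: the keys mirror the configuration cell above.
    """
    api_key = os.environ.get(env_var, "")
    if not api_key or api_key == "your-api-key-here":
        raise RuntimeError(f"Set {env_var} to a real API key before running the agents.")
    return {
        "timeout": 600,
        "cache_seed": 42,
        "config_list": [{"model": model, "api_key": api_key}],
        "temperature": 0,
    }
```

Calling `build_llm_config()` in place of the literal dictionary surfaces a missing key before any agent is created.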

## Set Up LlamaIndex

Configure LlamaIndex and prepare a vector index for document retrieval and question answering.

In [None]:
# Configure LlamaIndex settings
Settings.llm = LlamaOpenAI(model="gpt-4", temperature=0)
Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")

# Load documents (replace with your document directory)
# documents = SimpleDirectoryReader("./data").load_data()
# index = VectorStoreIndex.from_documents(documents)

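# For demo purposes, define sample text that stands in for a real document set.
# Note: these strings are not indexed here; the demo query function later in
# this notebook returns a canned answer instead of searching them.
sample_documents = [
    "SurfSense is an AI research assistant that integrates with your personal knowledge base.",
    "It supports multiple file formats including PDF, DOCX, images, and videos.",
    "SurfSense uses advanced RAG techniques with hybrid search and rerankers.",
    "You can generate podcasts from your chat conversations or documents.",
]

# To index the sample text instead of a directory (this calls the OpenAI embeddings API):
# from llama_index.core import Document
# index = VectorStoreIndex.from_documents([Document(text=t) for t in sample_documents])

# Create query engine
# query_engine = index.as_query_engine()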
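
Because index creation is commented out, nothing in this cell actually searches `sample_documents`. As a stand-in that needs no API key, the sketch below ranks the sample strings by word overlap with the query. It is a plain keyword match, not LlamaIndex's vector retrieval, and `keyword_search` is a hypothetical helper.

```python
import re

sample_documents = [
    "SurfSense is an AI research assistant that integrates with your personal knowledge base.",
    "It supports multiple file formats including PDF, DOCX, images, and videos.",
    "SurfSense uses advanced RAG techniques with hybrid search and rerankers.",
    "You can generate podcasts from your chat conversations or documents.",
]


def _words(text: str) -> set:
    """Lowercase the text and split it into a set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def keyword_search(query: str, documents: list, top_k: int = 2) -> list:
    """Return up to top_k documents sharing the most words with the query."""
    query_words = _words(query)
    scored = [(len(query_words & _words(doc)), doc) for doc in documents]
    # Keep only documents with at least one shared word, best matches first
    # (sorted is stable, so ties keep their original order)
    matches = sorted(
        (item for item in scored if item[0] > 0),
        key=lambda item: item[0],
        reverse=True,
    )
    return [doc for _, doc in matches[:top_k]]


print(keyword_search("Which file formats are supported?", sample_documents, top_k=1))
```

A helper like this can back the demo query function until a real index is available, so answers stay grounded in the sample text.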

## Define Agents

Create specialized agents for different tasks.

In [None]:
# 1. User Proxy Agent - represents the human user
user_proxy = autogen.UserProxyAgent(
    name="User_Proxy",
    system_message="A human user who provides tasks and feedback.",
    # Generated code runs in Docker by default; add "use_docker": False to run it locally instead
    code_execution_config={"last_n_messages": 2, "work_dir": "groupchat"},
    human_input_mode="NEVER",  # Change to "ALWAYS" for interactive mode
)

# 2. Document Researcher - uses LlamaIndex for document retrieval
document_researcher = autogen.AssistantAgent(
    name="Document_Researcher",
    llm_config=llm_config,
    system_message="""You are a document researcher. 
    Your role is to search through documents and provide relevant information.
    Use the available tools to query the document index and provide accurate, cited answers.
    Always cite your sources when providing information from documents.""",
)

# 3. Data Analyst - analyzes information and provides insights
data_analyst = autogen.AssistantAgent(
    name="Data_Analyst",
    llm_config=llm_config,
    system_message="""You are a data analyst.
    Your role is to analyze information, identify patterns, and provide insights.
    You work with data provided by other agents and synthesize meaningful conclusions.""",
)

# 4. Report Writer - creates comprehensive reports
report_writer = autogen.AssistantAgent(
    name="Report_Writer",
    llm_config=llm_config,
    system_message="""You are a professional report writer.
    Your role is to compile information from other agents into clear, well-structured reports.
    Create comprehensive summaries with proper formatting and citations.""",
)

# 5. Code Executor - writes Python code; User_Proxy runs it via its code_execution_config
code_executor = autogen.AssistantAgent(
    name="Code_Executor",
    llm_config=llm_config,
    system_message="""You are a code execution specialist.
    Your role is to write and execute Python code to process data, create visualizations, or perform calculations.
    Always provide clear explanations of your code and results.""",
)

## Create LlamaIndex Query Function

Define a function that lets agents query the LlamaIndex index, then register it as a tool.

In [None]:
def query_documents(query: str) -> str:
    """
    Query the document index using LlamaIndex.
    
    Args:
        query: The search query
        
    Returns:
        The response from the query engine
    """
    try:
        # In a real implementation, you would use:
        # response = query_engine.query(query)
        # return str(response)
        
        # Demo response
        return f"""Based on the documents, here's what I found about '{query}':
        SurfSense is a comprehensive AI research assistant that integrates with your 
        personal knowledge base. It supports 50+ file formats and uses advanced RAG 
        techniques including hybrid search, vector embeddings, and rerankers for 
        optimal information retrieval."""
    except Exception as e:
        return f"Error querying documents: {str(e)}"

# Register the function with the document researcher agent
autogen.register_function(
    query_documents,
    caller=document_researcher,
    executor=user_proxy,
    name="query_documents",
    description="Query the document index to retrieve relevant information. Use this when you need to find specific information from documents.",
)
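
The researcher's system message asks for cited answers, but `str(response)` alone drops the retrieved passages. A sketch of appending sources follows, assuming the query engine returns a LlamaIndex response object whose `source_nodes` entries expose `score` and `node.metadata`. `format_with_sources`, the `file_name` metadata key, and the file name in the demo are illustrative assumptions.

```python
from types import SimpleNamespace


def format_with_sources(response) -> str:
    """Render a response followed by a numbered list of its source nodes."""
    lines = [str(response)]
    source_nodes = getattr(response, "source_nodes", None) or []
    if source_nodes:
        lines.append("Sources:")
    for number, item in enumerate(source_nodes, start=1):
        # "file_name" is an assumed metadata key; fall back when it is missing
        name = item.node.metadata.get("file_name", "unknown source")
        score = "n/a" if item.score is None else f"{item.score:.2f}"
        lines.append(f"[{number}] {name} (score: {score})")
    return "\n".join(lines)


# Stand-in objects shaped like a response, used only to exercise the function
class FakeResponse(SimpleNamespace):
    def __str__(self) -> str:
        return self.text


fake = FakeResponse(
    text="SurfSense supports multiple file formats.",
    source_nodes=[
        # Hypothetical file name, for illustration only
        SimpleNamespace(score=0.91, node=SimpleNamespace(metadata={"file_name": "overview.md"}))
    ],
)
print(format_with_sources(fake))
```

Inside `query_documents`, `return format_with_sources(response)` could replace `return str(response)` once the real query engine is enabled.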

## Set Up Group Chat

Create a group chat with all agents and define the conversation flow.

In [None]:
# Create group chat
groupchat = autogen.GroupChat(
    agents=[user_proxy, document_researcher, data_analyst, report_writer, code_executor],
    messages=[],
    max_round=12,
    speaker_selection_method="round_robin",  # or "auto" for automatic selection
)

# Create manager
manager = autogen.GroupChatManager(
    groupchat=groupchat,
    llm_config=llm_config,
)
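
With `round_robin`, speakers follow the order of the `agents` list, so every agent gets a turn whether or not the task needs it. The sketch below only simulates that rotation with agent names. It assumes one message per round with the opening task counted as the first round, and it ignores any rerouting AutoGen may apply when a message contains a tool call.

```python
from itertools import cycle, islice

agent_names = [
    "User_Proxy",
    "Document_Researcher",
    "Data_Analyst",
    "Report_Writer",
    "Code_Executor",
]


def round_robin_order(names: list, max_round: int) -> list:
    """Return the speaker for each round, starting from the first name."""
    return list(islice(cycle(names), max_round))


order = round_robin_order(agent_names, max_round=12)
for round_number, name in enumerate(order, start=1):
    print(round_number, name)
```

Under these assumptions the twelfth round lands on Document_Researcher and Report_Writer's last turn is round 9, so consider raising `max_round` or switching to `"auto"` if the report should come last.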

## Example 1: Simple Document Query Task

Ask the agents to research and summarize information about SurfSense.

In [None]:
# Start the conversation
task_1 = """
Research SurfSense and create a brief summary report that includes:
1. What is SurfSense?
2. What are its key features?
3. What technologies does it use?

Document Researcher: Query the documents for information.
Data Analyst: Analyze the information.
Report Writer: Create a final summary report.
"""

user_proxy.initiate_chat(
    manager,
    message=task_1,
)
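
Once the chat finishes, the transcript is available in `groupchat.messages` as a list of dictionaries. A small sketch for pulling out the report follows. It assumes each entry carries `name` and `content` keys, and `last_message_from` is a hypothetical helper.

```python
from typing import Optional


def last_message_from(messages: list, agent_name: str) -> Optional[str]:
    """Return the most recent non-empty message written by agent_name, if any."""
    for message in reversed(messages):
        content = message.get("content")
        if message.get("name") == agent_name and content:
            return content
    return None


# Hand-written transcript standing in for groupchat.messages
demo_messages = [
    {"name": "User_Proxy", "content": "Research SurfSense."},
    {"name": "Report_Writer", "content": "Draft report"},
    {"name": "Code_Executor", "content": None},
    {"name": "Report_Writer", "content": "Final report"},
]
print(last_message_from(demo_messages, "Report_Writer"))
```

In the notebook, the call would be `last_message_from(groupchat.messages, "Report_Writer")`.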

## Example 2: Complex Multi-Step Task

A more complex task involving research, data analysis, and code execution.

In [None]:
task_2 = """
Perform a comprehensive analysis of AI agent frameworks:

1. Document Researcher: Find information about CrewAI, AutoGen, and LangGraph
2. Data Analyst: Compare their features and use cases
3. Code Executor: Create a simple visualization comparing the frameworks
4. Report Writer: Compile everything into a final report with recommendations

Please work together to complete this task.
"""

# Reset the group chat for a fresh conversation
groupchat.messages = []

user_proxy.initiate_chat(
    manager,
    message=task_2,
)

## Advanced: Custom Speaker Selection

Implement a custom speaker selection method for more control over the conversation flow.

In [None]:
def custom_speaker_selection_func(last_speaker, groupchat):
    """
    Custom function to determine the next speaker based on the conversation context.
    
    Args:
        last_speaker: The agent who spoke last
        groupchat: The group chat object
        
    Returns:
        The next agent to speak
    """
    messages = groupchat.messages
    
    # If no messages yet, start with user proxy
    if len(messages) == 0:
        return user_proxy
    
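    # Get the last message
    last_entry = messages[-1]

    # A pending tool call must go to the agent that executes functions
    if last_entry.get("tool_calls") or last_entry.get("function_call"):
        return user_proxy

    # "content" can be None (for example on tool-call messages), so fall back to ""
    last_message = (last_entry.get("content") or "").lower()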
    
    # Decision logic based on keywords in the last message
    if "query" in last_message or "search" in last_message or "document" in last_message:
        return document_researcher
    elif "analyze" in last_message or "compare" in last_message:
        return data_analyst
    elif "code" in last_message or "visualize" in last_message or "plot" in last_message:
        return code_executor
    elif "report" in last_message or "summary" in last_message or "conclude" in last_message:
        return report_writer
    else:
        # Default to round-robin if no keywords match
        agents = groupchat.agents
        last_idx = agents.index(last_speaker)
        return agents[(last_idx + 1) % len(agents)]

# Create group chat with custom speaker selection
groupchat_custom = autogen.GroupChat(
    agents=[user_proxy, document_researcher, data_analyst, report_writer, code_executor],
    messages=[],
    max_round=12,
    speaker_selection_method=custom_speaker_selection_func,
)

manager_custom = autogen.GroupChatManager(
    groupchat=groupchat_custom,
    llm_config=llm_config,
)
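
Substring checks such as `"code" in last_message` also fire on words like "decode", and the first matching branch wins. One possible refinement, sketched here with agent names instead of agent objects so that it runs on its own, matches whole words from an ordered routing table. `ROUTES` and `route_by_keyword` are hypothetical names; the keywords are those used in the function above.

```python
import re
from typing import Optional

# Ordered routing table: the first entry with a matching keyword wins
ROUTES = [
    ("Document_Researcher", ("query", "search", "document")),
    ("Data_Analyst", ("analyze", "compare")),
    ("Code_Executor", ("code", "visualize", "plot")),
    ("Report_Writer", ("report", "summary", "conclude")),
]


def route_by_keyword(text: Optional[str]) -> Optional[str]:
    """Return the name of the first agent whose keyword appears as a whole word."""
    words = set(re.findall(r"[a-z]+", (text or "").lower()))
    for agent_name, keywords in ROUTES:
        if words.intersection(keywords):
            return agent_name
    return None  # the caller falls back to round-robin


print(route_by_keyword("Please compare the three frameworks"))
print(route_by_keyword("We need to decode the header"))
```

Whole-word matching trades recall for precision: "documents" no longer matches "document", so plural or inflected forms would need to be added to the table explicitly.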

## Integration with SurfSense

Example of how this could integrate with SurfSense's backend.

In [None]:
# Pseudo-code for SurfSense integration

class SurfSenseAgentChat:
    """
    Integration wrapper for using AutoGen + LlamaIndex in SurfSense.
    """
    
    def __init__(self, search_space_id: int, user_id: int):
        self.search_space_id = search_space_id
        self.user_id = user_id
        # Initialize agents and index
        self.setup_agents()
        
    def setup_agents(self):
        """Set up agents with SurfSense-specific configurations."""
        # Load user's documents from SurfSense database
        # Create LlamaIndex from user's search space
        # Initialize AutoGen agents
        pass
    
    def query_with_agents(self, user_query: str) -> dict:
        """
        Process a user query using the agent group chat.
        
        Args:
            user_query: The user's question or task
            
        Returns:
            Dictionary containing the conversation history and final result
        """
        # Initiate group chat
        # Collect results
        # Return structured response
        raise NotImplementedError("Connect this method to the group chat manager")

# Example usage:
# agent_chat = SurfSenseAgentChat(search_space_id=1, user_id=123)
# result = agent_chat.query_with_agents("Analyze my recent research papers")
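
One way to fill in `query_with_agents` without tying the wrapper to a live model is to inject the function that runs the chat. The sketch below assumes the runner returns a list of message dictionaries with `name` and `content` keys. `AgentChatService`, `run_chat`, and the response fields are illustrative and do not reflect SurfSense's actual backend API.

```python
from typing import Callable, Dict, List


class AgentChatService:
    """Wrap a chat runner and return a structured result (hypothetical design)."""

    def __init__(self, run_chat: Callable[[str], List[Dict]]):
        # run_chat takes the task text and returns the chat transcript
        self.run_chat = run_chat

    def query_with_agents(self, user_query: str) -> dict:
        history = self.run_chat(user_query)
        # Treat the last message with non-empty content as the final result
        final = next((m["content"] for m in reversed(history) if m.get("content")), "")
        return {"query": user_query, "history": history, "result": final}


def fake_runner(task: str) -> List[Dict]:
    """Stand-in for a function that starts the group chat and returns its messages."""
    return [
        {"name": "User_Proxy", "content": task},
        {"name": "Report_Writer", "content": "Summary of findings"},
    ]


service = AgentChatService(run_chat=fake_runner)
print(service.query_with_agents("Analyze my recent research papers")["result"])
```

In the notebook, the runner could call `user_proxy.initiate_chat(manager, message=task)` and return `groupchat.messages`; keeping it injectable lets the wrapper be tested without API calls.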

## Best Practices

1. **Agent Specialization**: Create agents with clear, specific roles
2. **Task Decomposition**: Break complex tasks into smaller subtasks for each agent
3. **Error Handling**: Implement robust error handling in custom functions
4. **Token Management**: Monitor token usage to stay within API limits
5. **Conversation History**: Keep track of conversation context for better results
6. **Testing**: Test your agent system with various scenarios before production use
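
For the error-handling point, tool functions called by agents are easier to debug when failures come back as text the agent can read, which is what `query_documents` does with its `try`/`except`. A sketch of factoring that pattern into a decorator follows; `safe_tool` is a hypothetical helper, not an AutoGen feature.

```python
import functools
import logging

logger = logging.getLogger("agent_tools")


def safe_tool(func):
    """Return the function's result, or a readable error string if it raises."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as exc:  # report any failure back to the calling agent
            logger.exception("Tool %s failed", func.__name__)
            return f"Error in {func.__name__}: {exc}"

    return wrapper


@safe_tool
def divide(a: float, b: float) -> float:
    """Toy tool used only to demonstrate the decorator."""
    return a / b


print(divide(6, 3))
print(divide(1, 0))
```

The second call returns an error string instead of raising, while the full traceback still goes to the log for later inspection.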

## Next Steps

- Experiment with different agent configurations
- Add more specialized agents (e.g., fact checker, quality assurance)
- Integrate with real document collections
- Implement advanced speaker selection logic
- Add monitoring and logging for production use

## Resources

- [AutoGen Documentation](https://microsoft.github.io/autogen/stable/)
- [LlamaIndex Documentation](https://docs.llamaindex.ai/)
- [SurfSense Documentation](https://www.surfsense.net/docs/)