# Using Roundtable MCP Server with LlamaIndex

Integrate unified AI assistant access into your LlamaIndex workflows using the [Roundtable MCP Server](https://github.com/punkpeye/roundtable).

## Overview

**Roundtable** gives you access to multiple AI providers (OpenAI, Anthropic, Google, etc.) through a single, consistent interface. The Roundtable MCP Server exposes that interface to your LlamaIndex applications as MCP tools.

### Key Benefits for LlamaIndex Developers

- **Unified AI Access**: Switch between AI providers without changing your LlamaIndex code
- **Enhanced RAG Workflows**: Combine multiple AI assistants for complex reasoning tasks
- **Multi-Model Strategies**: Use different models for different parts of your pipeline
- **Cost Optimization**: Route queries to the most cost-effective model for each task
- **Fallback Support**: Automatic failover between AI providers for reliability

This integration is most useful for LlamaIndex applications that need:
- Complex multi-step reasoning with different AI capabilities
- Robust production systems with multiple AI provider fallbacks
- Cost-optimized AI usage across different model types
- Flexible AI provider switching without code changes

## Installation

First, install the required packages:

In [None]:
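# Install LlamaIndex, the OpenAI LLM integration, and the MCP tools used in the examples below
!pip install llama-index llama-index-llms-openai llama-index-tools-mcp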

# Install Roundtable MCP Server
!npm install -g @punkpeye/roundtable

## Set Up the Roundtable MCP Server

Configure your AI provider credentials and start the Roundtable MCP Server:

In [None]:
# Create Roundtable configuration
import json
import os

# Configure your AI providers
roundtable_config = {
    "providers": {
        "openai": {
            "apiKey": os.getenv("OPENAI_API_KEY"),
            "models": ["gpt-4", "gpt-3.5-turbo"]
        },
        "anthropic": {
            "apiKey": os.getenv("ANTHROPIC_API_KEY"),
            "models": ["claude-3-sonnet", "claude-3-haiku"]
        },
        "google": {
            "apiKey": os.getenv("GOOGLE_API_KEY"),
            "models": ["gemini-pro", "gemini-pro-vision"]
        }
    },
    "defaultProvider": "openai",
    "fallbackProviders": ["anthropic", "google"]
}

# Save configuration
with open("roundtable-config.json", "w") as f:
    json.dump(roundtable_config, f, indent=2)

print("Roundtable configuration created!")
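
Because `os.getenv` returns `None` for an unset variable, the configuration above can be written with empty credentials without any warning. A minimal sketch of a pre-flight check, assuming the three environment variable names used above (the helper name `missing_credentials` is hypothetical):

```python
import os

# Provider name -> environment variable read in the configuration cell above.
PROVIDER_ENV_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "google": "GOOGLE_API_KEY",
}


def missing_credentials(env=None):
    """Return the providers whose API key variable is unset or empty."""
    env = os.environ if env is None else env
    return [name for name, var in PROVIDER_ENV_VARS.items() if not env.get(var)]
```

Calling `missing_credentials()` before writing the file lets you stop early, or drop the affected providers from the configuration.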

Start the Roundtable MCP Server (run this in a separate terminal):

```bash
roundtable mcp --config roundtable-config.json --port 8080
```
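
Before connecting from the notebook, it can help to confirm that something is listening on the port. The sketch below only checks that a TCP connection succeeds. It does not verify that the listener is an MCP server, and the helper name `server_is_listening` is hypothetical:

```python
import socket


def server_is_listening(host="127.0.0.1", port=8080, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False
```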

## Basic Usage: Connect to Roundtable MCP Server

Load tools from the Roundtable MCP Server:

In [None]:
from llama_index.tools.mcp import aget_tools_from_mcp_url

# Connect to Roundtable MCP Server
# (top-level `await` works in a notebook; in a script, wrap the call in asyncio.run())
roundtable_tools = await aget_tools_from_mcp_url("http://127.0.0.1:8080/mcp")

print(f"Loaded {len(roundtable_tools)} tools from Roundtable MCP Server:")
for tool in roundtable_tools:
    print(f"- {tool.metadata.name}: {tool.metadata.description}")
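
If the server exposes more tools than an agent needs, you can narrow the list before building the agent. This is a small client-side sketch that relies only on the `tool.metadata.name` attribute printed above. The helper name `select_tools` is hypothetical, and the actual tool names depend on what your server reports:

```python
def select_tools(tools, allowed_names):
    """Keep only the tools whose metadata name is in allowed_names, preserving order."""
    allowed = set(allowed_names)
    return [tool for tool in tools if tool.metadata.name in allowed]
```

A smaller tool list keeps tool descriptions out of the prompt when they are not needed, which also reduces token usage.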

## Example 1: Enhanced RAG with Multi-Model Strategy

Use different AI models for different aspects of your RAG pipeline:

In [None]:
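from llama_index.core import SimpleDirectoryReader, VectorStoreIndex
from llama_index.core.agent.workflow import FunctionAgent
from llama_index.core.tools import QueryEngineTool
from llama_index.llms.openai import OpenAI

# Load your documents
documents = SimpleDirectoryReader("your_documents").load_data()
index = VectorStoreIndex.from_documents(documents)

# Expose the index as a tool so the agent can actually read the documents
document_tool = QueryEngineTool.from_defaults(
    query_engine=index.as_query_engine(),
    name="document_search",
    description="Search and summarize the loaded documents",
)

# Create an agent with the document tool and the Roundtable tools
agent = FunctionAgent(
    tools=[document_tool] + roundtable_tools,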
    llm=OpenAI(model="gpt-4"),
    system_prompt="""
    You are an advanced RAG assistant with access to multiple AI providers through Roundtable.
    
    For complex reasoning tasks, use Claude models.
    For quick factual queries, use GPT-3.5-turbo.
    For creative tasks, use GPT-4.
    For visual analysis, use Gemini Pro Vision.
    
    Always choose the most appropriate model for each subtask.
    """
)

# Enhanced query with multi-model reasoning
response = await agent.run("""
Analyze the uploaded documents and:
1. Use Claude for deep reasoning about the main themes
2. Use GPT-3.5 for quick fact extraction
3. Use GPT-4 for creative synthesis
4. Provide a comprehensive analysis combining all perspectives
""")

print(response)

## Example 2: Cost-Optimized Query Routing

Instruct the agent to route each query to the most cost-effective model:

In [None]:
# Create a cost-optimized agent
cost_optimized_agent = FunctionAgent(
    tools=roundtable_tools,
    llm=OpenAI(model="gpt-3.5-turbo"),
    system_prompt="""
    You are a cost-optimization specialist. For each query:
    
    - Simple factual questions → Use GPT-3.5-turbo (lowest cost)
    - Complex reasoning → Use Claude Haiku (balanced cost/performance)
    - Creative tasks → Use GPT-4 only when necessary
    - Code generation → Use Claude Sonnet
    
    Always justify your model choice based on cost-effectiveness.
    """
)

# Example queries with automatic cost optimization
queries = [
    "What is the capital of France?",  # Simple → GPT-3.5
    "Analyze the philosophical implications of AI consciousness",  # Complex → Claude
    "Write a creative story about time travel",  # Creative → GPT-4
    "Generate Python code for data analysis"  # Code → Claude Sonnet
]

for query in queries:
    print(f"\nQuery: {query}")
    response = await cost_optimized_agent.run(query)
    print(f"Response: {response}")
    print("-" * 50)
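
The routing above is driven entirely by the system prompt, so the model decides which provider to use. If you want a predictable choice, a deterministic pre-router is one alternative. The sketch below is an assumption for illustration, not a Roundtable feature: the keyword patterns are arbitrary, and the model labels simply mirror the ones named in the prompt.

```python
import re

# Hypothetical routing table: the first matching pattern wins.
ROUTES = [
    (re.compile(r"\b(code|python|function|script)\b", re.I), "claude-3-sonnet"),
    (re.compile(r"\b(story|poem|creative)\b", re.I), "gpt-4"),
    (re.compile(r"\b(analyze|implications|compare|why)\b", re.I), "claude-3-haiku"),
]
DEFAULT_MODEL = "gpt-3.5-turbo"  # cheapest option for everything else


def pick_model(query):
    """Return the model label for a query based on simple keyword matching."""
    for pattern, model in ROUTES:
        if pattern.search(query):
            return model
    return DEFAULT_MODEL
```

You could then include the chosen label in the request you send to the agent, for example `f"Use {pick_model(query)} for this query: {query}"`.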

## Example 3: Robust Production System with Fallbacks

Build resilient systems with automatic provider fallbacks:

In [None]:
# Create a production-ready agent with fallback support
production_agent = FunctionAgent(
    tools=roundtable_tools,
    llm=OpenAI(model="gpt-4"),
    system_prompt="""
    You are a production AI assistant with robust fallback capabilities.
    
    Primary strategy: Use OpenAI GPT-4
    Fallback 1: If OpenAI fails, use Anthropic Claude
    Fallback 2: If Anthropic fails, use Google Gemini
    
    Always ensure responses are delivered even if the primary provider fails.
    Log which provider was used for monitoring purposes.
    """
)

# Simulate production queries with error handling
async def robust_query(query: str):
    try:
        response = await production_agent.run(f"""
        Process this query with fallback support: {query}
        
        If the primary provider fails, automatically try fallback providers.
        Include in your response which AI provider successfully handled the query.
        """)
        return response
    except Exception as e:
        print(f"Error with all providers: {e}")
        return "All AI providers are currently unavailable. Please try again later."

# Test production resilience
production_query = "Analyze market trends for Q4 2024 and provide strategic recommendations"
result = await robust_query(production_query)
print(f"Production Result: {result}")
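
The example above asks the model to handle fallbacks through its prompt. A client-side fallback chain makes the order explicit and records which attempt succeeded. This is a generic sketch: `first_successful` is a hypothetical helper, and each attempt is any zero-argument coroutine function, for instance one that runs an agent built for a specific provider.

```python
import asyncio


async def first_successful(attempts):
    """Try (label, coroutine function) pairs in order.

    Returns (label, result) for the first attempt that does not raise.
    Raises RuntimeError listing every failure if all attempts fail.
    """
    errors = []
    for label, call in attempts:
        try:
            return label, await call()
        except Exception as exc:
            errors.append(f"{label}: {exc}")
    raise RuntimeError("all attempts failed: " + "; ".join(errors))
```

In a notebook you would call it as `label, result = await first_successful([...])` and log `label` for monitoring.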

## Example 4: Advanced Multi-Modal Workflow

Combine text, image, and code analysis using different specialized models:

In [None]:
# Multi-modal analysis agent
multimodal_agent = FunctionAgent(
    tools=roundtable_tools,
    llm=OpenAI(model="gpt-4"),
    system_prompt="""
    You are a multi-modal AI assistant specialist. Route tasks to the best model:
    
    - Image analysis → Gemini Pro Vision
    - Code review → Claude Sonnet
    - Document analysis → GPT-4
    - Creative writing → GPT-4
    - Technical reasoning → Claude Sonnet
    - Quick facts → GPT-3.5-turbo
    
    Coordinate between models to provide comprehensive analysis.
    """
)

# Complex multi-modal workflow
workflow_query = """
I have a project with:
1. A screenshot of a UI mockup (image)
2. Python backend code
3. A requirements document (text)

Please:
1. Analyze the UI mockup using vision capabilities
2. Review the Python code for best practices
3. Check if the code meets the requirements
4. Provide a comprehensive project assessment
"""

comprehensive_analysis = await multimodal_agent.run(workflow_query)
print(f"Multi-modal Analysis: {comprehensive_analysis}")

## Integration with LlamaIndex Query Engines

Enhance your existing LlamaIndex query engines with Roundtable's multi-provider capabilities:

In [None]:
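from llama_index.core.query_engine import SubQuestionQueryEngine
from llama_index.core.tools import QueryEngineTool

# Create enhanced query engines with Roundtable
base_query_engine = index.as_query_engine()

# SubQuestionQueryEngine only accepts QueryEngineTool instances, so the
# Roundtable MCP tools cannot be passed to it directly.
sub_question_engine = SubQuestionQueryEngine.from_defaults(
    query_engine_tools=[
        QueryEngineTool.from_defaults(
            query_engine=base_query_engine,
            name="document_search",
            description="Base document search and analysis"
        )
    ],
    llm=OpenAI(model="gpt-4")
)

# Expose the sub-question engine as a tool and combine it with the Roundtable tools in an agent
enhanced_agent = FunctionAgent(
    tools=[
        QueryEngineTool.from_defaults(
            query_engine=sub_question_engine,
            name="document_analysis",
            description="Answers questions about the documents by splitting them into sub-questions"
        )
    ] + roundtable_tools,  # Add Roundtable tools
    llm=OpenAI(model="gpt-4")
)

# Enhanced querying with multiple AI providers
enhanced_response = await enhanced_agent.run("""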
Analyze the documents using multiple AI perspectives:
1. Use Claude for philosophical analysis
2. Use GPT-4 for technical insights
3. Use Gemini for alternative viewpoints
4. Synthesize all perspectives into a unified answer
""")

print(f"Enhanced Response: {enhanced_response}")

## Monitoring and Analytics

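Ask an agent to review usage patterns and suggest improvements to your provider selection. The agent can only report on data that the Roundtable tools expose to it: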

In [None]:
# Analytics and monitoring agent
analytics_agent = FunctionAgent(
    tools=roundtable_tools,
    llm=OpenAI(model="gpt-3.5-turbo"),
    system_prompt="""
    You are an AI usage analytics specialist. Track:
    
    - Which providers are used most frequently
    - Response time patterns
    - Cost optimization opportunities
    - Error rates and fallback usage
    
    Provide actionable insights for optimization.
    """
)

# Get usage analytics
analytics_query = """
Analyze our AI provider usage and provide:
1. Most cost-effective provider combinations
2. Recommendations for query routing optimization
3. Suggestions for reducing overall AI costs
4. Performance optimization recommendations
"""

analytics_result = await analytics_agent.run(analytics_query)
print(f"Usage Analytics: {analytics_result}")

## Best Practices

### 1. Provider Selection Strategy
- **GPT-3.5-turbo**: Quick factual queries, simple tasks
- **GPT-4**: Complex reasoning, creative tasks
- **Claude Sonnet**: Code analysis, technical documentation
- **Claude Haiku**: Fast reasoning, cost-effective analysis
- **Gemini Pro**: Alternative perspectives, multimodal tasks

### 2. Cost Optimization
- Route simple queries to cheaper models
- Use expensive models only for complex tasks
- Implement intelligent caching for repeated queries
- Monitor usage patterns and adjust routing
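
As one way to approach the caching point above, here is a minimal in-memory sketch. It treats queries that differ only in case or whitespace as identical, which is an assumption you may want to tighten. `QueryCache` is a hypothetical helper, and `run` is any coroutine function that takes the query, such as a wrapper around an agent call.

```python
import hashlib


class QueryCache:
    """In-memory cache keyed by a hash of the normalized query text."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(query):
        # Collapse whitespace and ignore case so trivially different queries share a key.
        normalized = " ".join(query.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    async def get_or_run(self, query, run):
        """Return the cached result for query, or await run(query) and cache it."""
        key = self._key(query)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = await run(query)
        self._store[key] = result
        return result
```

The cache has no expiry or size limit, so it suits short-lived notebook sessions more than long-running services.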

### 3. Error Handling
- Always configure fallback providers
- Implement retry logic with exponential backoff
- Log provider failures for monitoring
- Test failover scenarios regularly
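
The retry recommendation above can be sketched as follows. The defaults (three attempts, 0.5 s base delay, up to 10% jitter) are illustrative assumptions rather than Roundtable settings, and `retry_with_backoff` is a hypothetical helper that wraps any zero-argument coroutine function.

```python
import asyncio
import random


async def retry_with_backoff(call, max_attempts=3, base_delay=0.5,
                             max_delay=8.0, sleep=asyncio.sleep):
    """Await call(); on failure wait base_delay * 2**attempt (capped) and retry.

    Re-raises the last exception once max_attempts calls have failed.
    """
    for attempt in range(max_attempts):
        try:
            return await call()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            delay = min(max_delay, base_delay * 2 ** attempt)
            # A little random jitter avoids many clients retrying in lockstep.
            await sleep(delay + random.uniform(0, delay / 10))
```

In practice you would catch only the exception types that indicate a transient failure, such as timeouts or rate-limit errors, instead of every `Exception`.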

### 4. Performance Monitoring
- Track response times by provider
- Monitor error rates and success rates
- Analyze cost per query by provider
- Set up alerts for service degradation
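
To make the monitoring points above concrete, here is a small sketch that accumulates per-provider call counts, error rates, and average latency on the client side. `ProviderMetrics` is a hypothetical helper. It assumes you know which provider handled each call and how long it took, for example by timing each agent call yourself.

```python
from collections import defaultdict


class ProviderMetrics:
    """Accumulates call counts, failures, and latency per provider label."""

    def __init__(self):
        self._calls = defaultdict(int)
        self._errors = defaultdict(int)
        self._seconds = defaultdict(float)

    def record(self, provider, seconds, ok=True):
        """Record one call to provider that took `seconds` and succeeded or failed."""
        self._calls[provider] += 1
        self._seconds[provider] += seconds
        if not ok:
            self._errors[provider] += 1

    def summary(self):
        """Return calls, error rate, and average latency for each provider."""
        return {
            provider: {
                "calls": calls,
                "error_rate": self._errors[provider] / calls,
                "avg_seconds": self._seconds[provider] / calls,
            }
            for provider, calls in self._calls.items()
        }
```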

## Configuration Options

Advanced Roundtable configuration for production use:

In [None]:
# Advanced production configuration
production_config = {
    "providers": {
        "openai": {
            "apiKey": os.getenv("OPENAI_API_KEY"),
            "models": ["gpt-4", "gpt-3.5-turbo"],
            "rateLimits": {
                "requestsPerMinute": 60,
                "tokensPerMinute": 90000
            },
            "timeout": 30000,
            "retryAttempts": 3
        },
        "anthropic": {
            "apiKey": os.getenv("ANTHROPIC_API_KEY"),
            "models": ["claude-3-sonnet", "claude-3-haiku"],
            "rateLimits": {
                "requestsPerMinute": 50,
                "tokensPerMinute": 100000
            },
            "timeout": 45000,
            "retryAttempts": 2
        }
    },
    "routing": {
        "strategy": "cost-optimized",
        "fallbackEnabled": True,
        "loadBalancing": True
    },
    "monitoring": {
        "enabled": True,
        "logLevel": "info",
        "metricsEndpoint": "/metrics"
    }
}

# Save production configuration
with open("roundtable-production.json", "w") as f:
    json.dump(production_config, f, indent=2)

print("Production configuration created!")
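
The configuration dictionary holds live API keys, so printing or logging it as-is would leak them. A minimal sketch of a redaction helper, assuming the `providers` / `apiKey` layout used above (`redact_api_keys` is a hypothetical name):

```python
import copy


def redact_api_keys(config):
    """Return a deep copy of config with every non-empty provider apiKey masked."""
    redacted = copy.deepcopy(config)
    for provider in redacted.get("providers", {}).values():
        if provider.get("apiKey"):
            provider["apiKey"] = "***"
    return redacted
```

For example, `print(json.dumps(redact_api_keys(production_config), indent=2))` shows the structure without exposing the credentials. The file written to disk still contains the real keys, so keep it out of version control.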

## Conclusion

The Roundtable MCP Server integration with LlamaIndex provides:

1. **Unified AI Access**: Single interface to multiple AI providers
2. **Enhanced Reliability**: Automatic failover and error handling
3. **Cost Optimization**: Intelligent routing to the most cost-effective models
4. **Improved Performance**: Task-specific model selection for optimal results
5. **Production Ready**: Robust configuration and monitoring capabilities

This integration is ideal for LlamaIndex applications that require:
- High availability and reliability
- Cost-effective AI usage
- Complex multi-model workflows
- Production-grade AI systems

### Next Steps

1. **Explore the [Roundtable documentation](https://github.com/punkpeye/roundtable)** for advanced features
2. **Configure your production setup** with appropriate provider credentials
3. **Implement monitoring and analytics** to optimize your AI usage
4. **Experiment with different routing strategies** for your specific use case

### Resources

- [Roundtable GitHub Repository](https://github.com/punkpeye/roundtable)
- [LlamaIndex MCP Documentation](/python/framework/module_guides/mcp/llamaindex_mcp)
- [LlamaIndex Tools Documentation](/python/framework/module_guides/deploying/agents/tools)
- [MCP Protocol Specification](https://spec.modelcontextprotocol.io/)