
Multi-Domain Agent with Pydantic AI

A production-ready, modular agent system that intelligently routes queries to specialized domain experts. Built with pydantic-ai, supporting multiple LLM providers (OpenAI, Gemini, Claude), with comprehensive testing and evaluation frameworks.

✨ Features

  • 🎯 Intelligent Domain Classification: Automatically identifies the domain of a query, with confidence scoring
  • 📚 Dynamic Context Loading: Loads domain-specific contexts, tools, and knowledge
  • 🤖 Multi-Model Support: Works with OpenAI GPT, Google Gemini, and Anthropic Claude
  • 🛠️ Domain-Specific Tools: Each domain registers specialized tools via pydantic-ai
  • 📊 Evaluation Framework: Comprehensive benchmarking and model comparison
  • ✅ Full Test Coverage: Unit and integration tests with pytest
  • ⚡ Async First: Built on asyncio for performance
  • 🔒 Type Safe: Pydantic models throughout

πŸš€ Quick Start

Installation

# Clone repository
git clone <repository-url>
cd context-loading-test

# Install dependencies
pip install -r requirements.txt

# Setup environment
cp .env.example .env
# Edit .env and add your API keys

Basic Usage

from src.models.factory import ModelFactory
from src.agents.orchestrator import MultiDomainOrchestrator

# Create model provider (OpenAI, Gemini, or Claude)
provider = ModelFactory.create_openai("gpt-4o-mini")

# Create orchestrator
orchestrator = MultiDomainOrchestrator(provider)

# Process queries
result = orchestrator.process_sync("Write a Python function to reverse a string")

print(f"Domain: {result.domain}")
print(f"Answer: {result.answer}")
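Since the system is async-first, `process_sync` presumably wraps an async `process` coroutine. A minimal sketch of that sync-wrapper pattern, using illustrative stand-in classes rather than the project's actual implementations:

```python
import asyncio
from dataclasses import dataclass

@dataclass
class Result:
    domain: str
    answer: str

class Orchestrator:
    """Illustrative stand-in for the async-first orchestrator design."""

    async def process(self, query: str) -> Result:
        # The real orchestrator would classify the query and run a domain agent here.
        return Result(domain="programming", answer=f"Handled: {query}")

    def process_sync(self, query: str) -> Result:
        # Synchronous convenience wrapper around the async entry point.
        return asyncio.run(self.process(query))

result = Orchestrator().process_sync("Reverse a string")
```

The wrapper lets callers in plain scripts avoid managing an event loop themselves.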

πŸ“ Project Structure

context-loading-test/
├── src/                          # Core source code
│   ├── agents/                   # Agent implementations
│   │   ├── classifier.py         # Domain classification
│   │   └── orchestrator.py       # Main orchestration
│   ├── domains/                  # Domain-specific contexts
│   │   ├── base.py               # Base domain class
│   │   ├── programming.py        # Programming domain
│   │   ├── mathematics.py        # Mathematics domain
│   │   ├── data_analysis.py      # Data analysis domain
│   │   ├── research.py           # Research domain
│   │   └── registry.py           # Domain registry
│   ├── models/                   # Model providers
│   │   ├── base.py               # Base provider class
│   │   ├── openai_provider.py    # OpenAI implementation
│   │   ├── gemini_provider.py    # Gemini implementation
│   │   ├── claude_provider.py    # Claude implementation
│   │   └── factory.py            # Model factory
│   └── utils/                    # Utilities
│
├── tests/                        # Test suite
│   ├── unit/                     # Unit tests
│   │   ├── test_models.py
│   │   └── test_domains.py
│   ├── integration/              # Integration tests
│   │   └── test_orchestrator.py
│   └── conftest.py               # Pytest configuration
│
├── evaluation/                   # Model evaluation
│   ├── benchmarks.py             # Test cases & suites
│   ├── evaluator.py              # Evaluation runner
│   └── metrics.py                # Metrics calculation
│
├── examples/                     # Example scripts
│   ├── basic_usage.py            # Basic usage example
│   ├── multi_model.py            # Multi-model comparison
│   └── run_evaluation.py         # Run evaluations
│
├── config/                       # Configuration files
│   └── models.yaml               # Model configurations
│
├── docs/                         # Documentation
│   ├── ARCHITECTURE.md           # Architecture guide
│   └── API_REFERENCE.md          # API documentation
│
├── requirements.txt              # Dependencies
├── pytest.ini                    # Pytest configuration
├── .env.example                  # Environment template
└── README.md                     # This file

🎯 Supported Domains

1. Programming

# Example: Code implementation, debugging, algorithms
query = "How do I implement quicksort in Python?"
# Returns: Code with explanation, complexity analysis

Tools: get_language_libraries, suggest_language, explain_complexity

2. Mathematics

# Example: Equations, calculations, proofs
query = "Solve x^2 + 5x + 6 = 0"
# Returns: Step-by-step solution with explanation

Tools: calculate_basic, solve_quadratic, get_formula

3. Data Analysis

# Example: Statistics, visualization, insights
query = "How do I detect outliers in my dataset?"
# Returns: Statistical methods, code examples

Tools: recommend_viz, suggest_statistical_test, explain_metric

4. Research

# Example: Information gathering, citations
query = "What are the effects of climate change?"
# Returns: Comprehensive answer with sources

Tools: format_citation, suggest_search_terms, evaluate_source
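For illustration, a purely keyword-based classifier with a naive confidence score might look like the sketch below. The project's actual classifier is LLM-backed; `DOMAIN_KEYWORDS` and `classify` here are hypothetical names, not part of the codebase:

```python
# Hypothetical keyword-overlap classifier (the real one is LLM-backed).
DOMAIN_KEYWORDS = {
    "programming": {"python", "function", "code", "debug", "algorithm"},
    "mathematics": {"solve", "equation", "proof", "integral"},
    "data_analysis": {"dataset", "outliers", "statistics", "visualization"},
    "research": {"sources", "citations", "effects", "literature"},
}

def classify(query: str) -> tuple[str, float]:
    """Return (domain, confidence) from keyword overlap with the query."""
    words = set(query.lower().split())
    scores = {d: len(words & kw) for d, kw in DOMAIN_KEYWORDS.items()}
    best = max(scores, key=scores.get)
    total = sum(scores.values()) or 1  # avoid division by zero
    return best, scores[best] / total

domain, confidence = classify("Write a Python function to reverse a string")
```

An LLM classifier replaces the keyword overlap with a structured-output prompt, but the (domain, confidence) contract stays the same.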

🔧 Model Providers

OpenAI

from src.models.factory import ModelFactory

provider = ModelFactory.create_openai(
    model_name="gpt-4o-mini",
    temperature=0.7
)

Available Models: gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-4, gpt-3.5-turbo

Google Gemini

provider = ModelFactory.create_gemini(
    model_name="gemini-1.5-flash",
    temperature=0.7
)

Available Models: gemini-1.5-pro, gemini-1.5-flash, gemini-1.5-flash-8b

Anthropic Claude

provider = ModelFactory.create_claude(
    model_name="claude-3-5-sonnet-20241022",
    temperature=0.7
)

Available Models: claude-3-5-sonnet, claude-3-5-haiku, claude-3-opus

Note: Claude is supported through pydantic-ai's Anthropic integration, so no separate Claude SDK is required.

📊 Evaluation & Benchmarking

Run Evaluations

python examples/run_evaluation.py

Programmatic Usage

from src.models.factory import ModelFactory
from evaluation.evaluator import ModelEvaluator

# Create evaluator
provider = ModelFactory.create_openai("gpt-4o-mini")
evaluator = ModelEvaluator(provider)

# Evaluate all domains (call from within an async function, or via asyncio.run)
results = await evaluator.evaluate_all_domains()

# Print results
for domain, domain_results in results.items():
    evaluator.print_results(domain_results)

Compare Models

# Compare different models
primary = ModelFactory.create_openai("gpt-4o-mini")
evaluator = ModelEvaluator(primary)

other_models = [
    ModelFactory.create_gemini("gemini-1.5-flash"),
    ModelFactory.create_claude("claude-3-5-sonnet-20241022")
]

comparison = await evaluator.compare_models(other_models)
evaluator.print_comparison(comparison)

Evaluation Metrics

  • Classification Accuracy: Correctness of domain identification
  • Confidence Scores: Model confidence in classifications
  • Execution Time: Response time per query
  • Keyword Match Rate: Presence of expected keywords
  • Confusion Matrix: Classification error patterns
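As a rough illustration of how the accuracy and confusion-matrix metrics could be derived, assuming hypothetical `(expected, predicted)` evaluation records rather than the project's actual data structures:

```python
from collections import Counter

# Hypothetical evaluation records: (expected_domain, predicted_domain)
records = [
    ("programming", "programming"),
    ("mathematics", "mathematics"),
    ("research", "data_analysis"),
    ("programming", "programming"),
]

# Classification accuracy: fraction of records where expected == predicted.
accuracy = sum(e == p for e, p in records) / len(records)

# Confusion matrix as sparse counts keyed by (expected, predicted).
confusion = Counter(records)

print(f"accuracy={accuracy:.2f}")
print(f"research -> data_analysis errors: {confusion[('research', 'data_analysis')]}")
```

The sparse `Counter` form makes misclassification patterns easy to scan without building a full dense matrix.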

🧪 Testing

Run All Tests

pytest

Run Specific Test Types

# Unit tests only
pytest tests/unit/

# Integration tests (requires API keys)
pytest tests/integration/

# With coverage
pytest --cov=src --cov-report=html

Test Structure

  • Unit Tests: Test individual components in isolation
  • Integration Tests: Test component interactions
  • Evaluation Tests: Benchmark performance

πŸ—οΈ Architecture

Query → Orchestrator → Classifier → Domain Context → Specialized Agent → Result
                          ↓
                    Model Provider
                   (OpenAI/Gemini/Claude)
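The flow above can be sketched with stubbed components; every name below is an illustrative stand-in, not a class from `src/`:

```python
# Stubbed pipeline: classify -> load context -> run agent.
def classify(query: str) -> str:
    # Stand-in for the LLM-backed domain classifier.
    return "programming" if "function" in query.lower() else "research"

CONTEXTS = {
    "programming": "You are an expert programmer.",
    "research": "You are a careful researcher.",
}

def run_agent(system_prompt: str, query: str) -> str:
    # Stand-in for the pydantic-ai agent call through a model provider.
    return f"[{system_prompt}] answer to: {query}"

def orchestrate(query: str) -> str:
    domain = classify(query)          # Classifier
    context = CONTEXTS[domain]        # Domain Context
    return run_agent(context, query)  # Specialized Agent

answer = orchestrate("Write a function to reverse a string")
```

The real orchestrator swaps each stub for its LLM-backed counterpart while keeping the same linear flow.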

See docs/ARCHITECTURE.md for detailed architecture documentation.

📚 API Reference

See docs/API_REFERENCE.md for complete API documentation.

🔨 Development

Adding a New Domain

  1. Create a domain class in src/domains/:
from dataclasses import dataclass

from src.domains.base import DomainContext
from pydantic_ai import Agent, RunContext

@dataclass
class MyDomain(DomainContext):
    @property
    def name(self) -> str:
        return "My Domain"

    def create_agent(self, model_string: str) -> Agent:
        agent = Agent(model_string, system_prompt=self.system_prompt)

        @agent.tool
        def my_tool(ctx: RunContext[MyDomain], param: str) -> str:
            return "result"

        return agent
  2. Register in src/domains/registry.py:
from .my_domain import MyDomain

class DomainRegistry:
    _domains = {
        ...
        "my_domain": MyDomain,
    }

Adding a New Model Provider

  1. Create provider class in src/models/:
from src.models.base import ModelProvider

class MyProvider(ModelProvider):
    def get_model_string(self) -> str:
        return f"my_provider:{self.config.model_name}"

    def validate_config(self) -> bool:
        # Validation logic
        return True
  2. Register in src/models/factory.py:
class ModelFactory:
    _providers = {
        ...
        "my_provider": MyProvider,
    }
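Both `DomainRegistry` and `ModelFactory` follow a name-to-class registry pattern, which can be sketched generically; the `Registry` and `MyProvider` classes below are illustrative, not the project's actual code:

```python
class Registry:
    """Generic name-to-class registry sketch."""

    _entries: dict = {}

    @classmethod
    def register(cls, name: str, entry: type) -> None:
        cls._entries[name] = entry

    @classmethod
    def create(cls, name: str, *args, **kwargs):
        try:
            return cls._entries[name](*args, **kwargs)
        except KeyError:
            raise ValueError(f"Unknown entry: {name}") from None

class MyProvider:
    def __init__(self, model_name: str):
        self.model_name = model_name

Registry.register("my_provider", MyProvider)
provider = Registry.create("my_provider", "my-model-v1")
```

Keeping construction behind a factory means new domains and providers plug in without touching the orchestrator.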

🤝 Contributing

Contributions welcome! Please:

  1. Fork the repository
  2. Create a feature branch
  3. Add tests for new features
  4. Ensure tests pass: pytest
  5. Submit a pull request

πŸ“ License

MIT License

πŸ™ Acknowledgments

πŸ“– Resources

πŸ› Troubleshooting

API Key Issues

# Set environment variable
export OPENAI_API_KEY='your-key-here'

# Or append to your .env file (>> avoids overwriting existing entries)
echo "OPENAI_API_KEY=your-key-here" >> .env
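If you prefer not to depend on a dotenv library, a minimal stdlib loader for simple `KEY=value` files can be sketched as follows (illustrative only; a library such as python-dotenv handles quoting and edge cases properly):

```python
import os

def load_env(path: str = ".env") -> None:
    """Load simple KEY=value lines into os.environ, skipping blanks and comments."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line and not line.startswith("#") and "=" in line:
                key, _, value = line.partition("=")
                # setdefault: real environment variables take precedence.
                os.environ.setdefault(key.strip(), value.strip())
```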

Module Not Found

pip install -r requirements.txt

Tests Failing

# Make sure API keys are set for integration tests
export OPENAI_API_KEY='your-key'

# Run without integration tests
pytest tests/unit/

📊 Example Output

πŸ” Analyzing query: 'Write a Python function to reverse a string'

📋 Domain identified: programming
   Confidence: 95.0%
   Reasoning: Query asks for Python code implementation

📚 Loading programming context...
   Context loaded: Programming

🤖 Creating specialized Programming agent...
   Executing query with OpenAI - gpt-4o-mini...

✅ Query processed successfully

πŸ“ ANSWER:
Here's a Python function to reverse a string:

def reverse_string(s: str) -> str:
    """Reverse a string using slicing."""
    return s[::-1]

# Example usage:
print(reverse_string("hello"))  # Output: "olleh"

This uses Python's slicing feature with a step of -1 to reverse the string...
