A production-ready, modular agent system that intelligently routes queries to specialized domain experts. Built with pydantic-ai, supporting multiple LLM providers (OpenAI, Gemini, Claude), with comprehensive testing and evaluation frameworks.
- Intelligent Domain Classification: Automatically identifies a query's domain, with confidence scoring
- Dynamic Context Loading: Loads domain-specific contexts, tools, and knowledge
- Multi-Model Support: Works with OpenAI GPT, Google Gemini, and Anthropic Claude
- Domain-Specific Tools: Each domain has specialized tools via pydantic-ai
- Evaluation Framework: Comprehensive benchmarking and model comparison
- Full Test Coverage: Unit and integration tests with pytest
- Async First: Built with asyncio for performance
- Type Safe: Pydantic models throughout
```bash
# Clone repository
git clone <repository-url>
cd context-loading-test

# Install dependencies
pip install -r requirements.txt

# Setup environment
cp .env.example .env
# Edit .env and add your API keys
```

```python
from src.models.factory import ModelFactory
from src.agents.orchestrator import MultiDomainOrchestrator
# Create model provider (OpenAI, Gemini, or Claude)
provider = ModelFactory.create_openai("gpt-4o-mini")
# Create orchestrator
orchestrator = MultiDomainOrchestrator(provider)
# Process queries
result = orchestrator.process_sync("Write a Python function to reverse a string")
print(f"Domain: {result.domain}")
print(f"Answer: {result.answer}")context-loading-test/
```
context-loading-test/
├── src/                        # Core source code
│   ├── agents/                 # Agent implementations
│   │   ├── classifier.py       # Domain classification
│   │   └── orchestrator.py     # Main orchestration
│   ├── domains/                # Domain-specific contexts
│   │   ├── base.py             # Base domain class
│   │   ├── programming.py      # Programming domain
│   │   ├── mathematics.py      # Mathematics domain
│   │   ├── data_analysis.py    # Data analysis domain
│   │   ├── research.py         # Research domain
│   │   └── registry.py         # Domain registry
│   ├── models/                 # Model providers
│   │   ├── base.py             # Base provider class
│   │   ├── openai_provider.py  # OpenAI implementation
│   │   ├── gemini_provider.py  # Gemini implementation
│   │   ├── claude_provider.py  # Claude implementation
│   │   └── factory.py          # Model factory
│   └── utils/                  # Utilities
│
├── tests/                      # Test suite
│   ├── unit/                   # Unit tests
│   │   ├── test_models.py
│   │   └── test_domains.py
│   ├── integration/            # Integration tests
│   │   └── test_orchestrator.py
│   └── conftest.py             # Pytest configuration
│
├── evaluation/                 # Model evaluation
│   ├── benchmarks.py           # Test cases & suites
│   ├── evaluator.py            # Evaluation runner
│   └── metrics.py              # Metrics calculation
│
├── examples/                   # Example scripts
│   ├── basic_usage.py          # Basic usage example
│   ├── multi_model.py          # Multi-model comparison
│   └── run_evaluation.py       # Run evaluations
│
├── config/                     # Configuration files
│   └── models.yaml             # Model configurations
│
├── docs/                       # Documentation
│   ├── ARCHITECTURE.md         # Architecture guide
│   └── API_REFERENCE.md        # API documentation
│
├── requirements.txt            # Dependencies
├── pytest.ini                  # Pytest configuration
├── .env.example                # Environment template
└── README.md                   # This file
```
**Programming**

```python
# Example: Code implementation, debugging, algorithms
query = "How do I implement quicksort in Python?"
# Returns: Code with explanation, complexity analysis
```

Tools: `get_language_libraries`, `suggest_language`, `explain_complexity`
**Mathematics**

```python
# Example: Equations, calculations, proofs
query = "Solve x^2 + 5x + 6 = 0"
# Returns: Step-by-step solution with explanation
```

Tools: `calculate_basic`, `solve_quadratic`, `get_formula`
**Data Analysis**

```python
# Example: Statistics, visualization, insights
query = "How do I detect outliers in my dataset?"
# Returns: Statistical methods, code examples
```

Tools: `recommend_viz`, `suggest_statistical_test`, `explain_metric`
**Research**

```python
# Example: Information gathering, citations
query = "What are the effects of climate change?"
# Returns: Comprehensive answer with sources
```

Tools: `format_citation`, `suggest_search_terms`, `evaluate_source`
**OpenAI**

```python
from src.models.factory import ModelFactory

provider = ModelFactory.create_openai(
    model_name="gpt-4o-mini",
    temperature=0.7
)
```

Available Models: gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-4, gpt-3.5-turbo
**Google Gemini**

```python
provider = ModelFactory.create_gemini(
    model_name="gemini-1.5-flash",
    temperature=0.7
)
```

Available Models: gemini-1.5-pro, gemini-1.5-flash, gemini-1.5-flash-8b
**Anthropic Claude**

```python
provider = ModelFactory.create_claude(
    model_name="claude-3-5-sonnet-20241022",
    temperature=0.7
)
```

Available Models: claude-3-5-sonnet, claude-3-5-haiku, claude-3-opus
Note: Claude models are fully supported through pydantic-ai's Anthropic integration.
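Because every provider implements the same interface, switching models is just a different factory call. The loop below reuses only the factory and orchestrator APIs shown above:

```python
from src.agents.orchestrator import MultiDomainOrchestrator
from src.models.factory import ModelFactory

# The same orchestration code runs unchanged against any provider.
for provider in (
    ModelFactory.create_openai("gpt-4o-mini"),
    ModelFactory.create_gemini("gemini-1.5-flash"),
    ModelFactory.create_claude("claude-3-5-sonnet-20241022"),
):
    orchestrator = MultiDomainOrchestrator(provider)
    result = orchestrator.process_sync("Solve x^2 + 5x + 6 = 0")
    print(result.domain)
```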
```bash
python examples/run_evaluation.py
```

```python
from src.models.factory import ModelFactory
from evaluation.evaluator import ModelEvaluator
# Create evaluator
provider = ModelFactory.create_openai("gpt-4o-mini")
evaluator = ModelEvaluator(provider)
# Evaluate all domains
results = await evaluator.evaluate_all_domains()
# Print results
for domain, domain_results in results.items():
    evaluator.print_results(domain_results)
```

```python
# Compare different models
primary = ModelFactory.create_openai("gpt-4o-mini")
evaluator = ModelEvaluator(primary)
other_models = [
ModelFactory.create_gemini("gemini-1.5-flash"),
ModelFactory.create_claude("claude-3-5-sonnet-20241022")
]
comparison = await evaluator.compare_models(other_models)
evaluator.print_comparison(comparison)
```

- Classification Accuracy: Correctness of domain identification (a computation sketch follows this list)
- Confidence Scores: Model confidence in classifications
- Execution Time: Response time per query
- Keyword Match Rate: Presence of expected keywords
- Confusion Matrix: Classification error patterns
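The accuracy and keyword metrics are simple ratios over the raw results. A minimal sketch of how they can be computed; the record field names here are illustrative assumptions, not the evaluator's actual schema:

```python
def classification_accuracy(results: list[dict]) -> float:
    """Fraction of queries routed to the expected domain."""
    if not results:
        return 0.0
    correct = sum(r["predicted_domain"] == r["expected_domain"] for r in results)
    return correct / len(results)

def keyword_match_rate(answer: str, expected_keywords: list[str]) -> float:
    """Fraction of expected keywords present in the answer."""
    if not expected_keywords:
        return 0.0
    text = answer.lower()
    return sum(kw.lower() in text for kw in expected_keywords) / len(expected_keywords)
```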
```bash
# Run all tests
pytest

# Unit tests only
pytest tests/unit/

# Integration tests (requires API keys)
pytest tests/integration/

# With coverage
pytest --cov=src --cov-report=html
```

- Unit Tests: Test individual components in isolation (an example sketch follows this list)
- Integration Tests: Test component interactions
- Evaluation Tests: Benchmark performance
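As an illustration of the unit-test layer, a sketch along the lines of `tests/unit/test_models.py`; the expected `"provider:model"` string follows the `get_model_string` pattern shown in the extension guide below and is an assumption:

```python
# Illustrative unit test; assumes get_model_string() returns "openai:<model_name>".
from src.models.factory import ModelFactory

def test_openai_model_string():
    provider = ModelFactory.create_openai("gpt-4o-mini")
    assert provider.get_model_string() == "openai:gpt-4o-mini"
```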
```
Query → Orchestrator → Classifier → Domain Context → Specialized Agent → Result
               ↓
        Model Provider
    (OpenAI/Gemini/Claude)
```
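Roughly in code, with `create_agent` and `get_model_string` taken from this README and the other method names assumed for illustration:

```python
from pydantic_ai import Agent

# Illustrative flow only; the real logic lives in src/agents/orchestrator.py.
# classify() and get_domain() are assumed names, not confirmed by this README.
async def handle_query(query: str, classifier, registry, provider):
    classification = await classifier.classify(query)        # Classifier
    domain = registry.get_domain(classification.domain)      # Domain Context
    agent: Agent = domain.create_agent(provider.get_model_string())
    return await agent.run(query)                            # Specialized Agent → Result
```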
See docs/ARCHITECTURE.md for detailed architecture documentation.
See docs/API_REFERENCE.md for complete API documentation.
- Create a domain class in `src/domains/`:

```python
from dataclasses import dataclass

from pydantic_ai import Agent, RunContext

from src.domains.base import DomainContext

@dataclass
class MyDomain(DomainContext):
    @property
    def name(self) -> str:
        return "My Domain"

    def create_agent(self, model_string: str) -> Agent:
        agent = Agent(model_string, system_prompt=self.system_prompt)

        @agent.tool
        def my_tool(ctx: RunContext[MyDomain], param: str) -> str:
            return "result"

        return agent
```

- Register it in `src/domains/registry.py`:
```python
from .my_domain import MyDomain

class DomainRegistry:
    _domains = {
        ...
        "my_domain": MyDomain,
    }
```
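Once registered, the new domain takes part in routing like the built-ins. A quick check using only the quick-start API (whether the classifier picks up newly registered domains automatically is not confirmed by this README):

```python
from src.agents.orchestrator import MultiDomainOrchestrator
from src.models.factory import ModelFactory

provider = ModelFactory.create_openai("gpt-4o-mini")
orchestrator = MultiDomainOrchestrator(provider)

# If classification succeeds, result.domain should report the new domain.
result = orchestrator.process_sync("A query that belongs to my new domain")
print(result.domain)
```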
- Create a provider class in `src/models/`:
```python
from src.models.base import ModelProvider

class MyProvider(ModelProvider):
    def get_model_string(self) -> str:
        return f"my_provider:{self.config.model_name}"

    def validate_config(self) -> bool:
        # Validation logic
        return True
```

- Register it in `src/models/factory.py`:
```python
class ModelFactory:
    _providers = {
        ...
        "my_provider": MyProvider,
    }
```
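How instances of the new provider are created depends on the factory's public API, which this README only shows via provider-specific helpers (`create_openai`, etc.). A hedged sketch assuming a generic lookup backed by the `_providers` mapping (the `create` method name is hypothetical):

```python
# Hypothetical: assumes ModelFactory exposes a generic create() backed by
# the _providers mapping; the actual factory method may differ.
provider = ModelFactory.create("my_provider", model_name="my-model")
```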
Contributions welcome! Please:

- Fork the repository
- Create a feature branch
- Add tests for new features
- Ensure tests pass: `pytest`
- Submit a pull request
MIT License
- Built with pydantic-ai
- Supports OpenAI, Google Gemini, and Anthropic Claude
```bash
# Set environment variable
export OPENAI_API_KEY='your-key-here'

# Or use .env file
echo "OPENAI_API_KEY=your-key-here" > .env
```

```bash
# Reinstall dependencies
pip install -r requirements.txt
```

```bash
# Make sure API keys are set for integration tests
export OPENAI_API_KEY='your-key'

# Run without integration tests
pytest tests/unit/
```

```
Analyzing query: 'Write a Python function to reverse a string'
Domain identified: programming
  Confidence: 95.0%
  Reasoning: Query asks for Python code implementation

Loading programming context...
  Context loaded: Programming

Creating specialized Programming agent...
  Executing query with OpenAI - gpt-4o-mini...

Query processed successfully

ANSWER:
Here's a Python function to reverse a string:

def reverse_string(s: str) -> str:
    """Reverse a string using slicing."""
    return s[::-1]

# Example usage:
print(reverse_string("hello"))  # Output: "olleh"

This uses Python's slicing feature with a step of -1 to reverse the string...
```