📋 Overview
This issue tracks the implementation of missing E2E test cases to achieve comprehensive coverage of semantic-router's core functionality. The current E2E test suite covers basic functionality, PII detection, jailbreak detection, domain classification, and semantic cache, but lacks coverage for several critical routing strategies and filters.
🎯 Scope
1️⃣ Routing Strategies (High Priority ⭐⭐⭐⭐⭐)
1.1 Keyword Routing
File: src/semantic-router/pkg/classification/keyword_classifier.go
Test Coverage Needed:
- ✅ OR operator - any keyword matches
- ✅ AND operator - all keywords must match
- ✅ NOR operator - no keywords match
- ✅ Case-sensitive vs case-insensitive matching
- ✅ Regex pattern matching
- ✅ Word boundary detection
- ✅ Priority over embedding and intent-based routing
Example Test Data:
{
"test_cases": [
{
"description": "OR operator - urgent request",
"query": "I need urgent help with my account",
"expected_category": "urgent_request",
"expected_confidence": 1.0,
"matched_keywords": ["urgent"]
},
{
"description": "AND operator - sensitive data",
"query": "My SSN and credit card were stolen",
"expected_category": "sensitive_data",
"expected_confidence": 1.0,
"matched_keywords": ["SSN", "credit card"]
}
]
}
Reference Config: config/intelligent-routing/in-tree/keyword.yaml
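As a starting point for the operator tests, here is a minimal sketch of OR/AND/NOR matching with word boundaries and optional case sensitivity. The KeywordRule type and its fields are hypothetical and written only for illustration; they are not the types in keyword_classifier.go, so adapt the assertions to the real classifier.

```go
package main

import (
	"regexp"
	"strings"
)

// KeywordRule is a hypothetical rule shape used only in this sketch.
type KeywordRule struct {
	Category      string
	Operator      string // "OR", "AND", or "NOR"
	Keywords      []string
	CaseSensitive bool
}

// containsWord reports whether keyword occurs in text on word boundaries.
// The pattern is compiled on every call for brevity.
func containsWord(text, keyword string, caseSensitive bool) bool {
	pattern := `\b` + regexp.QuoteMeta(keyword) + `\b`
	if !caseSensitive {
		pattern = `(?i)` + pattern
	}
	return regexp.MustCompile(pattern).MatchString(text)
}

// Match applies the rule's operator to the query and returns the
// keywords that were found.
func (r KeywordRule) Match(query string) (bool, []string) {
	var matched []string
	for _, kw := range r.Keywords {
		if containsWord(query, kw, r.CaseSensitive) {
			matched = append(matched, kw)
		}
	}
	switch strings.ToUpper(r.Operator) {
	case "OR":
		return len(matched) > 0, matched
	case "AND":
		return len(matched) == len(r.Keywords), matched
	case "NOR":
		return len(matched) == 0, matched
	default:
		return false, nil
	}
}
```

With this shape, the "urgent" keyword matches "urgent help" but not "urgently", which is the behavior the word boundary test should pin down.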
1.2 Embedding Routing
File: src/semantic-router/pkg/classification/embedding_classifier.go
Test Coverage Needed:
- ✅ Semantic similarity matching with embeddings
- ✅ Mean vs Max aggregation methods
- ✅ Similarity threshold validation
- ✅ Embedding model selection (auto, qwen3, gemma, bert)
- ✅ Matryoshka dimensions (768, 512, 256, 128)
- ✅ Quality vs Latency priority in auto mode
Example Test Data:
{
"test_cases": [
{
"description": "Mean aggregation - technical query",
"query": "How to implement async/await in Python?",
"keywords": ["programming", "coding", "software"],
"aggregation_method": "mean",
"threshold": 0.7,
"expected_category": "technical",
"expected_similarity": 0.85
},
{
"description": "Max aggregation - high similarity",
"query": "What is machine learning?",
"keywords": ["AI", "ML", "neural networks"],
"aggregation_method": "max",
"threshold": 0.8,
"expected_category": "ai"
}
]
}
Reference Config: config/intelligent-routing/in-tree/embedding.yaml
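To make the difference between mean and max aggregation concrete, the following sketch computes both over cosine similarities and compares the result with a threshold. The function names are made up for this example, and real tests would obtain the vectors from the configured embedding model rather than hard-coding them.

```go
package main

import "math"

// cosine returns the cosine similarity of two equal-length vectors.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// aggregate combines the query's similarity to each keyword embedding.
// method is "max" or "mean"; any other value is treated as "mean".
func aggregate(query []float64, keywords [][]float64, method string) float64 {
	if len(keywords) == 0 {
		return 0
	}
	sum, best := 0.0, math.Inf(-1)
	for _, kw := range keywords {
		s := cosine(query, kw)
		sum += s
		best = math.Max(best, s)
	}
	if method == "max" {
		return best
	}
	return sum / float64(len(keywords))
}

// matches reports whether the aggregated similarity reaches the threshold.
func matches(query []float64, keywords [][]float64, method string, threshold float64) bool {
	return aggregate(query, keywords, method) >= threshold
}
```

A query that is close to one keyword but far from the others passes under max and can fail under mean at the same threshold, so each aggregation method needs its own test case.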
1.3 MCP Routing (Model Context Protocol)
File: src/semantic-router/pkg/classification/mcp_classifier.go
Test Coverage Needed:
- ✅ MCP Stdio transport (process communication)
- ✅ MCP HTTP transport (API calls)
- ✅ Custom classification logic via external MCP servers
- ✅ Model and reasoning decision from MCP response
- ✅ Fallback to in-tree classifier on MCP failure
- ✅ Probability distribution with the with_probabilities parameter
Example Test Data:
{
"test_cases": [
{
"description": "MCP stdio - regex classifier",
"mcp_server": "server_keyword.py",
"transport": "stdio",
"query": "urgent: fix production bug",
"expected_category": "urgent",
"expected_model": "gpt-oss",
"expected_use_reasoning": true
},
{
"description": "MCP HTTP - embedding classifier",
"mcp_server": "http://localhost:8080",
"transport": "http",
"query": "Explain quantum computing",
"expected_category": "science"
}
]
}
Reference Servers: examples/mcp-classifier-server/
1.4 Hybrid Routing
Test Coverage Needed:
- ✅ Priority order: Keyword → Embedding → Intent-based → MCP
- ✅ Fallback chain when high-priority methods fail
- ✅ Combined strategy with multiple routing methods enabled
- ✅ Confidence fusion from multiple classifiers
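The priority order and fallback chain can be sketched as a first-match walk over an ordered list of strategies. Everything below (Match, Strategy, route) is hypothetical naming for illustration. Confidence fusion is deliberately left out, since how scores from several classifiers are combined is not specified here.

```go
package main

// Match is a hypothetical per-strategy classification result.
type Match struct {
	Category   string
	Confidence float64
}

// Strategy pairs a method name ("keyword", "embedding", ...) with a
// classification function that reports whether it produced a match.
type Strategy struct {
	Method   string
	Classify func(query string) (Match, bool)
}

// route walks the strategies in priority order and returns the first
// match together with the name of the method that produced it.
func route(strategies []Strategy, query string) (Match, string, bool) {
	for _, s := range strategies {
		if m, ok := s.Classify(query); ok {
			return m, s.Method, true
		}
	}
	return Match{}, "", false
}
```

A test can then assert on both the category and the method name, which maps directly onto the expected_category and expected_method fields in the test data.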
Example Test Data:
{
"test_cases": [
{
"description": "Keyword takes priority over embedding",
"query": "urgent: machine learning question",
"keyword_match": "urgent_request",
"embedding_match": "ai",
"expected_category": "urgent_request",
"expected_method": "keyword"
},
{
"description": "Fallback to embedding when keyword fails",
"query": "What is deep learning?",
"keyword_match": null,
"embedding_match": "ai",
"expected_category": "ai",
"expected_method": "embedding"
}
]
}
1.5 Entropy-Based Routing
File: src/semantic-router/pkg/utils/entropy/entropy.go
Test Coverage Needed:
- ✅ Shannon entropy and normalized entropy calculation
- ✅ Uncertainty levels: very_high, high, medium, low, very_low
- ✅ Reasoning decision based on entropy
- ✅ Weighted decision for high uncertainty (top-2 categories)
- ✅ Confidence adjustment based on uncertainty
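For reference when writing expected values, Shannon entropy normalized by log2(n) can be computed as in the sketch below. The cut-offs in uncertaintyLevel are illustrative assumptions, not the thresholds in entropy.go, so the tests should take the real boundaries from the implementation.

```go
package main

import "math"

// normalizedEntropy returns the Shannon entropy of probs (base 2)
// divided by log2(n), so the result lies in [0, 1].
func normalizedEntropy(probs []float64) float64 {
	if len(probs) < 2 {
		return 0
	}
	h := 0.0
	for _, p := range probs {
		if p > 0 {
			h -= p * math.Log2(p)
		}
	}
	return h / math.Log2(float64(len(probs)))
}

// uncertaintyLevel maps normalized entropy to a label.
// The thresholds are assumptions chosen for this example.
func uncertaintyLevel(ne float64) string {
	switch {
	case ne >= 0.8:
		return "very_high"
	case ne >= 0.6:
		return "high"
	case ne >= 0.4:
		return "medium"
	case ne >= 0.2:
		return "low"
	default:
		return "very_low"
	}
}
```

A uniform distribution over four categories has normalized entropy 1, while a distribution dominated by one category at 0.95 comes out near 0.18, so the two land at opposite ends of the scale.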
Example Test Data:
{
"test_cases": [
{
"description": "Very high entropy - enable reasoning",
"probabilities": [0.25, 0.25, 0.25, 0.25],
"expected_uncertainty": "very_high",
"expected_use_reasoning": true,
"expected_confidence": 0.3
},
{
"description": "Very low entropy - trust classification",
"probabilities": [0.95, 0.02, 0.02, 0.01],
"expected_uncertainty": "very_low",
"expected_use_reasoning": false,
"expected_confidence": 0.90
}
]
}
2️⃣ Filter Tests (High Priority ⭐⭐⭐⭐)
2.1 ReasoningControl Filter
File: src/semantic-router/pkg/extproc/req_filter_reason.go
Test Coverage Needed:
- ✅ Enable/disable reasoning with enableReasoning
- ✅ Reasoning effort levels: low, medium, high
- ✅ Reasoning families: gpt-oss, deepseek, qwen3, claude
- ✅ chat_template_kwargs for different model families
- ✅ reasoning_effort parameter (OpenAI-style)
- ✅ maxReasoningSteps limit
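One way to assert on the outgoing request is to check that the body carries the reasoning field expected for the model family. The sketch below is a hypothetical helper: the familyFields table (which key each family uses) is an assumption for illustration and must be checked against the router's actual reasoning family configuration before it is used in a test.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// familyField says where a family's reasoning switch is expected in the
// request body. NestedKey is empty when the value is a top-level field.
type familyField struct {
	TopLevelKey string
	NestedKey   string
}

// familyFields is an assumed mapping used only in this sketch.
var familyFields = map[string]familyField{
	"qwen3":    {TopLevelKey: "chat_template_kwargs", NestedKey: "enable_thinking"},
	"deepseek": {TopLevelKey: "chat_template_kwargs", NestedKey: "thinking"},
	"gpt-oss":  {TopLevelKey: "reasoning_effort"},
}

// hasReasoningField reports whether the JSON request body contains the
// field expected for the given family.
func hasReasoningField(body []byte, family string) (bool, error) {
	ff, known := familyFields[family]
	if !known {
		return false, fmt.Errorf("unknown reasoning family %q", family)
	}
	var req map[string]json.RawMessage
	if err := json.Unmarshal(body, &req); err != nil {
		return false, fmt.Errorf("decode request body: %w", err)
	}
	raw, ok := req[ff.TopLevelKey]
	if !ok {
		return false, nil
	}
	if ff.NestedKey == "" {
		return true, nil
	}
	var kwargs map[string]json.RawMessage
	if err := json.Unmarshal(raw, &kwargs); err != nil {
		return false, fmt.Errorf("decode %s: %w", ff.TopLevelKey, err)
	}
	_, ok = kwargs[ff.NestedKey]
	return ok, nil
}
```

The same helper covers the disabled case: with enableReasoning set to false, the test can assert that the field is absent or explicitly false, depending on what the filter emits.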
Example Config:
filters:
  - type: ReasoningControl
    enabled: true
    config:
      reasonFamily: "gpt-oss"
      enableReasoning: true
      reasoningEffort: "high"
      maxReasoningSteps: 15
2.2 ToolSelection Filter
Test Coverage Needed:
- ✅ Top-K tool selection
- ✅ Similarity threshold filtering
- ✅ Tools database loading from toolsDBPath
- ✅ Fallback strategy with fallbackToEmpty
- ✅ Category/tag-based tool filtering
Example Config:
filters:
  - type: ToolSelection
    enabled: true
    config:
      toolsDBPath: "tools.json"
      topK: 3
      similarityThreshold: 0.7
      fallbackToEmpty: false
Reference: examples/semanticroute/tool-selection-example.yaml
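The interaction between topK and similarityThreshold is worth a dedicated test. A minimal sketch of the expected selection logic follows, using hypothetical names (ScoredTool, selectTools). The fallbackToEmpty behavior is not modeled because its exact semantics should be taken from the filter implementation.

```go
package main

import "sort"

// ScoredTool is a hypothetical tool candidate with its similarity score.
type ScoredTool struct {
	Name       string
	Similarity float64
}

// selectTools keeps candidates at or above the threshold, orders them by
// descending similarity, and returns at most topK of them.
func selectTools(candidates []ScoredTool, topK int, threshold float64) []ScoredTool {
	kept := make([]ScoredTool, 0, len(candidates))
	for _, c := range candidates {
		if c.Similarity >= threshold {
			kept = append(kept, c)
		}
	}
	sort.SliceStable(kept, func(i, j int) bool {
		return kept[i].Similarity > kept[j].Similarity
	})
	if topK >= 0 && len(kept) > topK {
		kept = kept[:topK]
	}
	return kept
}
```

With topK: 3 and similarityThreshold: 0.7 as in the example config, five candidates scored 0.9, 0.5, 0.8, 0.75, and 0.71 should yield the three tools scored 0.9, 0.8, and 0.75.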
2.3 Filter Chain Combination
Test Coverage Needed:
- ✅ Multiple filter execution order
- ✅ Filter short-circuit (e.g., PIIDetection blocks subsequent filters)
- ✅ Filter independence (configs don't interfere)
- ✅ Performance impact of multiple filters
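Execution order and short-circuiting can be checked by recording which filters ran. The sketch below models a chain with hypothetical Request and Filter types. It is not the extproc filter interface, only a shape that makes the expected behavior explicit.

```go
package main

import "fmt"

// Request is a hypothetical request with a trace of the filters that ran.
type Request struct {
	Body  string
	Trace []string
}

// Filter is a hypothetical filter: Apply returns true to block the request.
type Filter struct {
	Name  string
	Apply func(req *Request) (bool, error)
}

// runChain applies filters in order and stops at the first one that
// blocks the request or fails. It returns the name of the blocking
// filter, or an empty string if every filter passed.
func runChain(filters []Filter, req *Request) (string, error) {
	for _, f := range filters {
		req.Trace = append(req.Trace, f.Name)
		blocked, err := f.Apply(req)
		if err != nil {
			return "", fmt.Errorf("filter %s: %w", f.Name, err)
		}
		if blocked {
			return f.Name, nil
		}
	}
	return "", nil
}
```

A short-circuit test then asserts that the trace ends at the blocking filter, and an ordering test asserts that the trace equals the configured filter list.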
Example Chain:
filters:
  - type: PIIDetection
  - type: PromptGuard
  - type: SemanticCache
  - type: ReasoningControl
  - type: ToolSelection
3️⃣ Cache Tests (Medium Priority ⭐⭐⭐)
3.1 Different Cache Backends
File: src/semantic-router/pkg/cache/
Test Coverage Needed:
- ✅ InMemory cache performance
- ✅ Milvus cache with vector database
- ✅ Hybrid cache (HNSW + Milvus)
- ✅ TTL expiration mechanism
- ✅ Eviction strategy when maxEntries is reached
3.2 Different Embedding Models for Cache
Test Coverage Needed:
- ✅ BERT (fast, 384-dim)
- ✅ Qwen3 (high quality, 1024-dim, 32K context)
- ✅ Gemma (balanced, 768-dim, 8K context)
- ✅ Matryoshka dimensions impact on cache hit rate
4️⃣ Performance & Concurrency Tests (Medium Priority ⭐⭐⭐)
4.1 Concurrent Requests
Test Coverage Needed:
- ✅ 100 concurrent classification requests
- ✅ Thread safety of classifiers
- ✅ Resource contention (cache, model loading)
- ✅ QPS (queries per second) benchmarking
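A simple harness for the concurrency cases could look like the following sketch. The classify callback stands in for whatever call the E2E framework uses to send a classification request. Both function names are hypothetical.

```go
package main

import (
	"sync"
	"sync/atomic"
	"time"
)

// runConcurrent invokes classify n times in parallel and returns the
// number of failed calls and the total elapsed time.
func runConcurrent(n int, classify func(i int) error) (int64, time.Duration) {
	var failures int64
	var wg sync.WaitGroup
	start := time.Now()
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			if err := classify(i); err != nil {
				atomic.AddInt64(&failures, 1)
			}
		}(i)
	}
	wg.Wait()
	return atomic.LoadInt64(&failures), time.Since(start)
}

// qps converts a request count and elapsed time to queries per second.
func qps(n int, elapsed time.Duration) float64 {
	if elapsed <= 0 {
		return 0
	}
	return float64(n) / elapsed.Seconds()
}
```

Running such a test with the Go race detector enabled (go test -race) helps surface thread-safety problems in shared classifier or cache state.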
4.2 Long Text Handling
Test Coverage Needed:
- ✅ 32K context with Qwen3
- ✅ 8K context with Gemma
- ✅ Token limit handling and truncation
5️⃣ Edge Cases & Error Handling (Low Priority ⭐⭐)
5.1 Configuration Errors
Test Coverage Needed:
- ✅ Invalid category mapping
- ✅ Invalid threshold values
- ✅ Missing defaultModel
- ✅ Invalid filter configurations
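The configuration error cases boil down to "invalid input yields a clear error instead of a crash". As an illustration, here is a sketch of the kind of validation these tests would exercise. RouterConfig and validate are hypothetical and much simpler than the real configuration schema.

```go
package main

import "fmt"

// RouterConfig is a hypothetical, heavily simplified configuration.
type RouterConfig struct {
	DefaultModel string
	KnownModels  []string
	Categories   map[string]string // category name -> model name
	Threshold    float64
}

// validate returns one message per problem found; an empty result means
// the configuration passed these checks.
func validate(cfg RouterConfig) []string {
	var problems []string
	if cfg.DefaultModel == "" {
		problems = append(problems, "defaultModel is required")
	}
	// Written as a negated range check so that NaN is rejected too.
	if !(cfg.Threshold >= 0 && cfg.Threshold <= 1) {
		problems = append(problems, fmt.Sprintf("threshold %v is outside [0, 1]", cfg.Threshold))
	}
	known := make(map[string]bool, len(cfg.KnownModels))
	for _, m := range cfg.KnownModels {
		known[m] = true
	}
	for category, model := range cfg.Categories {
		if !known[model] {
			problems = append(problems, fmt.Sprintf("category %q maps to unknown model %q", category, model))
		}
	}
	return problems
}
```

Each bullet above then becomes one table-driven case: a deliberately broken config and the error the router is expected to report at startup.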
5.2 Network Errors
Test Coverage Needed:
- ✅ Model service unavailable
- ✅ MCP server timeout
- ✅ Milvus connection failure
- ✅ Network timeout handling
📁 Implementation Structure
Suggested file organization:
e2e/testcases/
├── keyword_routing.go
├── embedding_routing.go
├── mcp_routing.go
├── hybrid_routing.go
├── entropy_routing.go
├── reasoning_control.go
├── tool_selection.go
├── filter_chain.go
├── cache_backends.go
├── concurrent_requests.go
└── testdata/
    ├── keyword_routing_cases.json
    ├── embedding_routing_cases.json
    ├── mcp_routing_cases.json
    ├── hybrid_routing_cases.json
    ├── entropy_routing_cases.json
    ├── reasoning_control_cases.json
    ├── tool_selection_cases.json
    └── ...
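To show how the JSON files under testdata/ could be consumed, here is a sketch of a loader for the keyword routing cases. The struct fields follow the example test data in section 1.1, while the type and function names are assumptions and the existing E2E framework may already provide its own loading helper.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// KeywordCase mirrors one entry of the keyword routing example data.
type KeywordCase struct {
	Description        string   `json:"description"`
	Query              string   `json:"query"`
	ExpectedCategory   string   `json:"expected_category"`
	ExpectedConfidence float64  `json:"expected_confidence"`
	MatchedKeywords    []string `json:"matched_keywords"`
}

type keywordCaseFile struct {
	TestCases []KeywordCase `json:"test_cases"`
}

// loadKeywordCases decodes a test data file. Unknown fields are rejected
// so that a misspelled key fails loudly instead of being ignored.
func loadKeywordCases(data []byte) ([]KeywordCase, error) {
	dec := json.NewDecoder(bytes.NewReader(data))
	dec.DisallowUnknownFields()
	var f keywordCaseFile
	if err := dec.Decode(&f); err != nil {
		return nil, fmt.Errorf("decode keyword test cases: %w", err)
	}
	if len(f.TestCases) == 0 {
		return nil, fmt.Errorf("no test cases found")
	}
	return f.TestCases, nil
}
```

A test would read the file with os.ReadFile, pass the bytes to the loader, and run one subtest per case using the description as the subtest name.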
🎯 Acceptance Criteria
- All test cases pass consistently
- Test data is stored in JSON files for maintainability
- Tests follow existing E2E framework patterns
- Documentation is updated with new test coverage
- CI/CD pipeline includes new tests
📚 References
- Existing E2E tests: e2e/testcases/
- Keyword routing docs: website/docs/tutorials/intelligent-route/keyword-routing.md
- MCP classification docs: website/docs/tutorials/mcp-classification/overview.md
- Example configs: examples/semanticroute/
- In-tree configs: config/intelligent-routing/in-tree/
🤝 Contributing
This is a great opportunity for new contributors! Each test case can be implemented independently. Feel free to:
- Pick any test case from the list above
- Comment on this issue to claim it
- Submit a PR with your implementation
For questions or guidance, please comment on this issue or join our community discussions.