Test Coverage Audit

arminrad edited this page Mar 16, 2026 · 2 revisions

Part of: Testing Guide | See also: CM Unit Test Coverage Report (conceptual model coverage), Testing Plan (manual test cases)


TL;DR: 28 of 29 feature areas have automated test coverage; only Guardrails has none, and it does not appear to be implemented yet. The suite has 332 test files and an estimated 1,500+ test functions. 3 areas have only partial coverage. This page maps test files to feature areas; use it to answer "does feature X have tests?"


Systematic audit of automated test coverage vs. the Conceptual Model and Testing Plan.

Last updated: 2026-03-06 | Test files: 332 | Test directories: 20+


Summary

| Metric | Value |
|---|---|
| Feature areas defined in Conceptual Model | 29 |
| Feature areas with automated test coverage | 28 |
| Feature areas with NO coverage | 1 (Guardrails) |
| Feature areas with partial coverage | 3 |
| Total test files | 332 |
| Total test functions (estimated) | 1,500+ |

Coverage by Feature Area

Legend

| Symbol | Meaning |
|---|---|
| ✅ | Excellent coverage: all key behaviors tested |
| ⚠️ | Partial coverage: core paths tested, gaps in edge cases |
| ❌ | No coverage |

1. System Utilities & Ping ✅

| Testing Plan Case | Automated Test File | Status |
|---|---|---|
| 1.1 Root endpoint | tests/routes/test_root.py | ✅ |
| 1.2 Ping with uptime | tests/routes/test_ping.py | ✅ |
| 1.3 Ping stats | tests/routes/test_ping.py | ✅ |
| 1.4 Velocity mode status | tests/routes/test_ping.py | ✅ |

2. Authentication ✅

| Testing Plan Case | Automated Test File | Status |
|---|---|---|
| 2.1 Login with Privy token | tests/routes/test_auth_v2.py | ✅ |
| 2.2 New user creation | tests/routes/test_auth_v2.py | ✅ |
| 2.3 Login rate limit | tests/services/test_auth_rate_limiting.py | ✅ |
| 2.4 Register rate limit | tests/services/test_auth_rate_limiting.py | ✅ |
| 2.5 Auth health check | tests/routes/test_auth_v2.py | ✅ |
| 2.6 Invalid token rejected | tests/routes/test_auth_v2.py | ✅ |

Additional coverage: Temp email detection (test_auth_temp_email_bot_status.py), Google/GitHub OAuth (test_auth_v2.py), auth caching (test_auth_timeout_fixes.py).


3. Chat & Inference ✅

3.1 OpenAI-Compatible

| Testing Plan Case | Automated Test File | Status |
|---|---|---|
| 3.1.1 Non-streaming | tests/routes/test_chat.py, test_chat_comprehensive.py | ✅ |
| 3.1.2 Streaming (SSE) | tests/routes/test_chat_completions_comprehensive.py | ✅ |
| 3.1.3 JSON mode | tests/routes/test_responses.py | ✅ |
| 3.1.4 Tool/function calling | tests/routes/test_chat_function_calling.py | ✅ |
| 3.1.5 Logprobs | tests/routes/test_chat_completions_comprehensive.py | ✅ |
| 3.1.6 Credits deducted | tests/routes/test_chat.py | ✅ |
| 3.1.7 Insufficient credits (402) | tests/routes/test_chat.py | ✅ |
| 3.1.8 Expired trial (402) | tests/routes/test_chat.py | ✅ |
| 3.1.9 Invalid model | tests/routes/test_chat_comprehensive.py | ✅ |
| 3.1.10 Anonymous + whitelisted model | tests/routes/test_chat.py | ✅ |
| 3.1.11 Anonymous + non-whitelisted | tests/routes/test_chat.py | ✅ |
| 3.1.12 Rate limit enforced | tests/routes/test_chat_comprehensive.py | ✅ |

3.2 Anthropic-Compatible

| Testing Plan Case | Automated Test File | Status |
|---|---|---|
| 3.2.1 Non-streaming messages | tests/routes/test_messages.py | ✅ |
| 3.2.2 Streaming messages | tests/routes/test_messages.py | ✅ |

3.3 Model Aliasing & Routing

| Testing Plan Case | Automated Test File | Status |
|---|---|---|
| 3.3.1 Short alias resolves | tests/unit/test_model_transformations.py | ✅ |
| 3.3.2 Canonical ID works | tests/unit/test_model_transformations.py | ✅ |
| 3.3.3 Provider failover | tests/services/test_provider_failover.py | ✅ |

3.4 Intelligent Routing

| Testing Plan Case | Automated Test File | Status |
|---|---|---|
| 3.4.1-3.4.4 General router modes | tests/services/test_general_router.py | ✅ |
| 3.4.5-3.4.8 Code router modes | tests/services/test_code_router.py | ✅ |

3.5–3.8 Chat History, Feedback, Sharing, Metrics

All covered across tests/db/test_chat_history.py, tests/db/test_feedback.py, tests/routes/test_share.py, tests/routes/test_chat_metrics.py, and related files.


4. Models & Catalog ✅

| Testing Plan Case | Automated Test File | Status |
|---|---|---|
| 4.1.1 List all models | tests/routes/test_catalog_endpoints.py | ✅ |
| 4.1.2 Filter by provider | tests/routes/test_catalog_endpoints.py | ✅ |
| 4.1.3 Unique models | tests/routes/test_catalog_endpoints.py | ✅ |
| 4.1.4 Search models | tests/routes/test_catalog_endpoints.py | ✅ |
| 4.1.5 Trending models | tests/routes/test_catalog_endpoints.py | ✅ |
| 4.1.7 Model detail | tests/routes/test_api_models.py | ✅ |
| 4.1.11 HuggingFace enrichment | tests/test_huggingface_hub_integration.py | ✅ |
| 4.1.12 Pricing completeness | tests/services/test_pricing_lookup.py | ✅ |
| 4.2.1-4.2.3 Modelz registry | tests/services/test_models.py, tests/db/test_models_catalog_db.py | ✅ |
| 4.3.1-4.3.4 Providers | tests/services/test_providers.py, tests/routes/test_catalog_endpoints.py | ✅ |
| 4.4.1-4.4.4 Gateways | tests/routes/test_catalog_endpoints.py | ✅ |
| 4.5.1-4.5.6 HuggingFace | tests/test_huggingface_hub_integration.py | ✅ |
| 4.6.1-4.6.6 Model Health | tests/db/test_model_health.py, tests/routes/test_health.py | ✅ |
| 4.7.1 Rankings | tests/routes/test_ranking.py | ✅ |
| 4.8.1-4.8.7 Availability | tests/routes/test_availability.py, tests/services/test_model_availability.py | ✅ |

5. Circuit Breakers ✅

| Testing Plan Case | Automated Test File | Status |
|---|---|---|
| 5.1 List states | tests/routes/test_monitoring.py | ✅ |
| 5.2 Specific provider | tests/services/test_redis_metrics.py | ✅ |
| 5.3 Reset specific | tests/routes/test_monitoring.py | ✅ |
| 5.4 Reset all | tests/routes/test_monitoring.py | ✅ |

6. Code Router ✅

All 5 test cases covered in tests/services/test_code_router.py.


7. General Router ✅

All 5 test cases covered in tests/services/test_general_router.py.


8. Users ✅

All subsections (Profile, Activity, API Keys, Rate Limits, Plans, Referrals) covered across:

  • tests/routes/test_users.py
  • tests/routes/test_api_keys.py
  • tests/routes/test_activity.py
  • tests/routes/test_rate_limits.py
  • tests/routes/test_plans.py
  • tests/routes/test_referral.py

9. Credits & Billing ✅

| Testing Plan Case | Automated Test File | Status |
|---|---|---|
| 9.1.1-9.1.7 Credit operations | tests/routes/test_credits.py | ✅ |
| 9.2.1 Pre-flight credit check | tests/services/test_credit_precheck.py | ✅ |
| 9.2.2 Idempotent deduction | tests/db/test_credit_transactions.py | ✅ |
| 9.2.3 Subscription allowance first | tests/routes/test_chat.py | ✅ |
| 9.2.4 Provider error auto-refund | tests/integration/test_chat_errors.py | ✅ |

10. Coupons ✅

All 6 test cases covered in tests/routes/test_coupons.py, tests/db/test_coupons.py.


11. Payments (Stripe) ✅

All 8 test cases covered in tests/routes/test_payments.py, tests/integration/test_stripe_webhook_metadata.py.


12. Health & Monitoring ✅

All 16 test cases covered across tests/routes/test_health.py (19 test classes), tests/health/test_gateway_health.py.


13. Metrics & Observability ✅

All 12 test cases covered across tests/routes/test_monitoring.py, tests/test_prometheus_metrics.py, tests/routes/test_instrumentation.py.


14. Diagnostics ✅

Both test cases covered in tests/routes/test_monitoring.py.


15. Status ✅

All 9 test cases covered in tests/test_status_page.py and related route tests.


16. Analytics ✅

All 5 test cases covered in tests/routes/test_analytics.py.


17. Error Monitoring ✅

All 7 test cases covered in tests/routes/test_error_monitor.py.


18. Image Generation ✅

All 3 test cases covered in tests/routes/test_images.py, tests/e2e/test_images_e2e.py.


19. Audio Transcription ✅

Both test cases covered in tests/routes/test_audio.py.


20. Server-Side Tools ✅

All 6 test cases covered in tests/services/test_tools.py, tests/services/test_web_search_tool.py.


21. IP Allowlist ✅

All 5 test cases covered in tests/security/test_security.py.


22. Partner Trials ✅

All 5 test cases covered in tests/routes/test_partner_trials.py.


23. Notifications ⚠️

| Testing Plan Case | Automated Test File | Status |
|---|---|---|
| 23.1 Preferences | tests/routes/test_notifications.py | ✅ |
| 23.2 Usage report | tests/routes/test_notifications.py | ✅ |
| 23.3 Test notification | tests/routes/test_notifications.py | ✅ |

Gaps: Email delivery verification, notification content validation, multi-channel support.


24. Admin ✅

All critical admin test cases covered across tests/routes/test_admin.py, tests/routes/test_roles.py, tests/routes/test_audit.py.


25. Conceptual Model Conformance ✅

| # | Test | Automated Coverage | Status |
|---|---|---|---|
| 25.1 | OpenAI SDK drop-in | tests/e2e/test_chat_completions_e2e.py | ✅ |
| 25.2 | Anthropic SDK drop-in | tests/e2e/test_messages_e2e.py | ✅ |
| 25.3 | Failover on failure | tests/services/test_provider_failover.py | ✅ |
| 25.4 | 400 no failover | tests/services/test_provider_failover.py | ✅ |
| 25.5 | Circuit breaker opens | tests/services/test_redis_metrics.py | ✅ |
| 25.6 | Breaker recovery | tests/services/test_redis_metrics.py | ✅ |
| 25.7 | IP-level rate limits | tests/middleware/test_security_middleware.py | ✅ |
| 25.8 | API key rate limits | tests/services/test_rate_limiting.py | ✅ |
| 25.9 | Anonymous stricter | tests/services/test_anonymous_rate_limiter.py | ✅ |
| 25.10 | Redis down fallback | tests/services/test_redis_integration_basic.py | ✅ |
| 25.11 | Catalog cache fast | tests/services/test_unified_catalog_cache.py | ✅ |
| 25.12 | Auth cache reduces latency | tests/services/test_redis_integration_basic.py | ✅ |
| 25.13 | High-value model pricing | tests/services/test_pricing_lookup.py | ✅ |
| 25.14 | Subscription allowance first | tests/routes/test_chat.py | ✅ |
| 25.15 | New user $5 + 3-day trial | tests/routes/test_auth_v2.py | ✅ |
| 25.16 | Expired trial + :free | tests/services/test_trial_service.py | ✅ |
| 25.17 | Keys encrypted at rest | tests/security/test_security.py | ✅ |
| 25.18 | SSE format correct | tests/routes/test_stream_generator.py | ✅ |
| 25.19 | Aliases resolve | tests/unit/test_model_transformations.py | ✅ |
| 25.20 | Health always 200 | tests/routes/test_health.py | ✅ |

Cross-Cutting Coverage

Security ✅

| Area | Test Files |
|---|---|
| Encryption (Fernet) | tests/security/test_security.py, tests/utils/test_crypto.py |
| HMAC-SHA256 | tests/security/test_security.py |
| RBAC | tests/services/test_roles.py, tests/routes/test_roles.py |
| SQL Injection | tests/security/test_injection.py |
| XSS Prevention | tests/security/test_injection.py |
| Command Injection | tests/security/test_injection.py |
| Path Traversal | tests/security/test_injection.py |
| LDAP/Header/JSON Injection | tests/security/test_injection.py |

Rate Limiting (3 layers) ✅

| Layer | Test Files |
|---|---|
| Layer 1: IP/Security Middleware | tests/middleware/test_security_middleware.py (12 test classes) |
| Layer 2: API Key (Redis) | tests/services/test_rate_limiting.py, tests/services/test_auth_rate_limiting.py |
| Layer 3: Anonymous | tests/services/test_anonymous_rate_limiter.py |
| Fallback (Redis down) | tests/services/test_redis_integration_basic.py |

Caching ✅

| Cache Layer | Test Files |
|---|---|
| Response cache | tests/services/test_response_cache.py |
| Auth cache | tests/services/test_redis_integration_basic.py |
| Catalog cache (L1/L2) | tests/services/test_unified_catalog_cache.py, tests/test_cache.py |
| DB query cache | tests/services/test_redis_integration_basic.py |
| User lookup cache | tests/services/test_user_lookup_cache.py |
| In-memory fallback | tests/services/test_redis_integration_basic.py |

Provider Failover ✅

| Behavior | Test Files |
|---|---|
| Failover chain building | tests/services/test_provider_failover.py |
| Error → failover mapping | tests/services/test_provider_failover.py |
| Zero-model fallback | tests/services/test_zero_model_fallback.py |
| Model-aware rules | tests/services/test_provider_failover.py |
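
To illustrate the kind of behavior the error → failover mapping tests check, here is a minimal sketch of conformance cases 25.3 (failover on failure) and 25.4 (no failover on a 400). The names `should_failover` and `call_with_failover` are hypothetical, and the rule "only 5xx responses trigger failover" is an assumption made for illustration. The real rules in the codebase may differ, for example by being model-aware.

```python
from typing import Callable, Sequence, Tuple


def should_failover(status_code: int) -> bool:
    # Assumption: only server-side errors (5xx) justify trying another provider.
    # A 400 means the request itself is invalid, so another provider would not help.
    return status_code >= 500


def call_with_failover(
    providers: Sequence[str], send: Callable[[str], int]
) -> Tuple[str, int]:
    """Try providers in order and return (provider, status) of the final attempt."""
    if not providers:
        raise ValueError("at least one provider is required")
    result = (providers[0], 0)
    for name in providers:
        result = (name, send(name))
        if not should_failover(result[1]):
            break  # success or a client error: stop walking the chain
    return result
```

In a test, `send` would be a stub returning canned status codes, so the two cases reduce to asserting which providers were called and which result came back.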

Provider Clients ✅

30+ provider client test files in tests/services/: test_openrouter_client.py, test_groq_client.py, test_cerebras_client.py, test_together_client.py, test_featherless_client.py, test_xai_client.py, test_simplismart_client.py, test_sybil_client.py, test_near_client.py, test_nosana_client.py, test_cloudflare_workers_ai_client.py, test_aihubmix_client.py, test_canopywave_client.py, test_morpheus_client.py, test_butter_client.py, test_helicone_client.py, and more.


The Gap: Guardrails ❌

The Conceptual Model (section 2.2) describes guardrails that have no automated test coverage:

| Guardrail | Description | Test Coverage |
|---|---|---|
| PII Detection | Scan prompts for phone numbers, SSNs, emails, credit cards | ❌ None |
| Prompt Injection Defense | Detect/block injection patterns overriding system prompts | ❌ None |
| Topic Restrictions | Per-API-key domain restrictions | ❌ None |
| Content Moderation | Block harmful/policy-violating inputs | ❌ None |
| Output Content Filtering | Scan responses for policy violations | ❌ None |
| Structured Output Validation | Validate JSON schema conformance of responses | ❌ None |
| Hallucination Flags | Surface provider-side safety metadata | ❌ None |

Note: These appear in the Conceptual Model as the target architecture. The guardrails may not yet be implemented — the gap is in both implementation and testing.


Partial Coverage Areas ⚠️

Notifications

  • Basic route tests exist (tests/routes/test_notifications.py)
  • Missing: Email delivery verification, notification content validation, multi-channel (Slack/Discord), scheduling/queues

Diagnostics

  • Concurrency and provider timing are tested
  • Missing: Resource utilization diagnostics, detailed performance diagnostics

Admin Bulk Operations

  • Core admin paths well-covered
  • Missing: Comprehensive model sync testing, pricing scheduler edge cases, bulk operations stress testing

Schemas

  • No dedicated schema validation test files (Pydantic schemas in src/schemas/ have 0 test files)
  • Schemas are implicitly tested through route tests

E2E Test Coverage

| E2E Test File | What it validates |
|---|---|
| test_chat_completions_e2e.py | Full OpenAI-compatible chat flow |
| test_messages_e2e.py | Full Anthropic-compatible messages flow |
| test_responses_e2e.py | Responses API flow |
| test_images_e2e.py | Image generation flow |
| test_streaming_providers_e2e.py | Multi-provider streaming |
| test_allenai_models_e2e.py | AllenAI provider integration |
| test_simplismart_models_e2e.py | Simplismart provider integration |
| test_sybil_models_e2e.py | Sybil provider integration |

Recommendations

  1. Guardrails: When guardrails are implemented, create tests/services/test_guardrails.py and tests/routes/test_guardrails.py covering all 7 guardrail types
  2. Schema Tests: Consider adding tests/schemas/ to validate Pydantic model serialization/deserialization edge cases
  3. Notification Delivery: Add integration tests for email delivery via Resend
  4. Conformance Test Suite: Consider creating a dedicated tests/conformance/ directory that maps 1:1 to the Testing Plan sections, making it easy to track which plan cases have passing automated tests
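
As a starting point for recommendation 4, the plan-to-test mapping in this audit could be kept as data and checked automatically. The sketch below is one possible shape, not an existing tool: `PLAN_TO_TESTS` and `missing_test_files` are hypothetical names, and the mapping shows only a few rows copied from the tables above.

```python
from pathlib import Path

# Hypothetical excerpt: Testing Plan case ID -> test files, taken from this audit.
PLAN_TO_TESTS = {
    "1.1": ["tests/routes/test_root.py"],
    "2.3": ["tests/services/test_auth_rate_limiting.py"],
    "3.1.4": ["tests/routes/test_chat_function_calling.py"],
    "9.2.2": ["tests/db/test_credit_transactions.py"],
}


def missing_test_files(mapping: dict, repo_root: str = ".") -> dict:
    """Return {case_id: [paths that do not exist]} for cases with missing files."""
    root = Path(repo_root)
    missing = {}
    for case_id, paths in mapping.items():
        absent = [p for p in paths if not (root / p).is_file()]
        if absent:
            missing[case_id] = absent
    return missing
```

Run from the repository root, an empty result means every listed file still exists. It only confirms the files exist, so it cannot show that the tests in them pass or cover the plan case.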