Testing Guide
Your starting point for everything testing-related. This page explains what testing docs exist, how they connect, and how to actually run tests.
TL;DR: We have 1,500+ automated tests across 332 test files. Run them with `pytest`. Coverage is strong (28 of 29 feature areas), but 66 conceptual model claims have zero automated verification. The biggest gaps are in auto-refunds, high-value model protection, token estimation, and API compatibility features. Below is a map of all testing docs and when to use each one.
# All tests
pytest
# With coverage
pytest --cov=src
# Specific test type
pytest tests/unit/
pytest tests/integration/
pytest tests/e2e/
# Conceptual Model tests (186 tests verifying code matches the spec)
pytest tests/conceptual_model/ -v
# Only CM tests that should pass (code matches spec)
pytest tests/conceptual_model/ -m cm_verified -v
# Only CM tests documenting gaps (code doesn't match spec yet)
pytest tests/conceptual_model/ -m cm_gap -v
# Specific feature area
pytest tests/routes/test_chat.py
pytest tests/services/test_circuit_breaker.py
Prerequisites: Python 3.10+, Redis running (for integration tests), Supabase running (for DB tests). See Troubleshooting if things break.
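The `-m cm_verified` and `-m cm_gap` selections in the commands above rely on pytest markers. As a minimal sketch, markers like these could be registered in a `conftest.py` through pytest's `pytest_configure` hook. The file location and the marker descriptions here are assumptions for illustration; the project's actual setup is described in tests/conceptual_model/README.md and may differ.

```python
# conftest.py (hypothetical location; check the project's own configuration)

# Marker name -> description. The names come from the pytest commands above;
# the descriptions are illustrative assumptions, not the project's wording.
CM_MARKERS = {
    "cm_verified": "code matches the Conceptual Model; test is expected to pass",
    "cm_gap": "documents a gap where code does not match the Conceptual Model yet",
}


def pytest_configure(config):
    """Register the CM markers so `pytest -m cm_gap` runs without unknown-marker warnings."""
    for name, description in CM_MARKERS.items():
        # addinivalue_line appends an entry to the "markers" ini option.
        config.addinivalue_line("markers", f"{name}: {description}")
```

With markers registered this way, a test tagged `@pytest.mark.cm_gap` is picked up by `pytest -m cm_gap` and skipped by `pytest -m cm_verified`.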
We have 5 testing documents. Here's how they connect:
Conceptual Model (the spec)
│
├──► CM Unit Testing Plan ──► CM Unit Test Coverage Report
│ "What SHOULD be tested" "What IS vs ISN'T tested"
│
└──► Testing Plan ──────────► Test Coverage Audit
"Manual API test cases" "Automated coverage by feature"
1. Testing Plan — Manual API Test Cases
What: 250+ manual test cases organized by feature. Each is a curl/Postman command with an expected result.
When to use: When you want to manually verify a feature works end-to-end. Great for QA sessions, pre-release verification, or when you're debugging something and want to isolate it.
Format: Tables with test #, method, endpoint, auth requirement, and expected response.
2. Conceptual-Model-Unit-Testing-Plan — What Should Be Tested
What: 186 theoretical unit tests derived from every testable claim in the Conceptual Model. Each has a test ID (CM-{area}.{number}), the unit under test, assertions, and mocks needed.
When to use: When you want to write a new test and need to know what the Conceptual Model requires to be verified, or when reviewing whether a feature's tests are complete.
Format: Tables with test ID, test name, unit under test, assertions, mocks.
Where's the code? The actual test implementations live in `tests/conceptual_model/` (18 files, 186 tests). See `tests/conceptual_model/README.md` for the full guide: file layout, markers, fixtures, and how to add new tests. The `conceptual-model-tests.yml` GitHub Actions workflow runs them on every PR and posts a per-section pass/fail summary as a sticky comment.
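Test IDs in the plan follow the pattern CM-{area}.{number}. The sketch below shows one way to parse and sort such IDs. It assumes that both {area} and {number} are plain decimal integers, which the plan does not state explicitly, and the names `CmTestId` and `parse_cm_id` are hypothetical helpers, not part of the repository.

```python
import re
from typing import NamedTuple


class CmTestId(NamedTuple):
    area: int
    number: int


# Assumption: both {area} and {number} are decimal integers, e.g. "CM-3.12".
_CM_ID = re.compile(r"CM-(\d+)\.(\d+)")


def parse_cm_id(test_id: str) -> CmTestId:
    """Split an ID such as 'CM-3.12' into its area and number parts."""
    match = _CM_ID.fullmatch(test_id)
    if match is None:
        raise ValueError(f"not a CM test ID: {test_id!r}")
    return CmTestId(area=int(match.group(1)), number=int(match.group(2)))


# Sorting by the parsed tuple orders IDs numerically rather than as strings,
# so "CM-3.2" comes before "CM-3.12".
ids = ["CM-3.12", "CM-10.1", "CM-3.2"]
ordered = sorted(ids, key=parse_cm_id)
```

Parsing into integers matters mainly for ordering: a plain string sort would place "CM-10.1" before "CM-3.2".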
3. CM Unit Test Coverage Report — What IS vs ISN'T Tested
What: Compares the 186 theoretical tests from the CM Unit Testing Plan against the 5,491 actual tests. Shows which conceptual model claims are covered, partially covered, or completely missing.
When to use: When you want to find the gaps — which promises does the spec make that have zero automated verification? This is your priority list for writing new tests.
Key numbers:
| Status | Count | % |
|---|---|---|
| Covered | 89 | 47.8% |
| Partial | 31 | 16.7% |
| Missing | 66 | 35.5% |
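The percentages in the table follow directly from the counts, since 89 + 31 + 66 = 186 matches the number of planned tests. A short sketch of the arithmetic, where the function name `coverage_percentages` is a hypothetical helper:

```python
def coverage_percentages(counts: dict[str, int]) -> dict[str, float]:
    """Return each status as a percentage of the total, rounded to one decimal."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no tests to summarize")
    return {status: round(100 * n / total, 1) for status, n in counts.items()}


# Counts taken from the table above.
counts = {"Covered": 89, "Partial": 31, "Missing": 66}
percentages = coverage_percentages(counts)

# Share of claims with at least some automated verification.
covered_or_partial = round(100 * (counts["Covered"] + counts["Partial"]) / sum(counts.values()), 1)
```

Here `percentages` reproduces the table (47.8, 16.7, 35.5), and `covered_or_partial` is 64.5, meaning roughly two thirds of the planned tests have at least partial coverage.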
4. Test Coverage Audit — Automated Coverage by Feature
What: Feature-by-feature audit of automated test coverage. Maps the Testing Plan test cases to actual test files.
When to use: When you want to know "does feature X have tests?" Shows test files, test functions, and coverage status per feature area.
Key number: 28 of 29 feature areas have automated coverage (only Guardrails has none — because it's not implemented).
5. Test Mapping — Raw Test-to-Feature Mapping
What: The raw data — 5,491 tests mapped to features. This is a large reference file (7,978 lines), not meant for reading start to finish.
When to use: When you need to find a specific test for a specific feature, or when you're generating coverage reports.
| I want to... | Go to... |
|---|---|
| Run the tests right now | `pytest` (commands above) |
| Manually verify a feature | Testing Plan |
| Write a new test | Conceptual-Model-Unit-Testing-Plan for what to test, CM Unit Test Coverage Report for what's missing |
| Check if a feature has tests | Test Coverage Audit |
| Find the test file for feature X | Test Mapping |
- Features Acceptance Criteria: What each feature must do (criteria the tests should verify)
- Delta Report: P0/P1/P2 issues that need test coverage
- Testing-Workflows-Locally: Running GitHub Actions CI locally with `act`