Testing Guide

Your starting point for everything testing-related. This page explains what testing docs exist, how they connect, and how to actually run tests.

TL;DR: We have 1,500+ automated tests across 332 test files. Run them with pytest. Feature coverage is broad (28 of 29 feature areas have automated tests), but 66 of the 186 testable Conceptual Model claims have zero automated verification. The biggest gaps are in auto-refunds, high-value model protection, token estimation, and API compatibility features. Below is a map of all testing docs and when to use each one.

How to Run Tests

```bash
# All tests
pytest

# With coverage
pytest --cov=src

# Specific test type
pytest tests/unit/
pytest tests/integration/
pytest tests/e2e/

# Specific feature area
pytest tests/routes/test_chat.py
pytest tests/services/test_circuit_breaker.py
```

Prerequisites: Python 3.10+, Redis running (for integration tests), Supabase running (for DB tests). See Troubleshooting if things break.
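
A quick pre-flight check can save a confusing failure later. The sketch below is not part of the repository. It assumes Redis listens on localhost:6379 (the Redis default) and that a local Supabase stack exposes its API on localhost:54321 (the Supabase CLI default); adjust both to match your setup.

```python
import socket
import sys


def python_ok(minimum=(3, 10)):
    """Return True if the running interpreter meets the minimum version."""
    return sys.version_info[:2] >= minimum


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def preflight(services):
    """Map each prerequisite name to True (ready) or False (not ready)."""
    report = {"python>=3.10": python_ok()}
    for name, (host, port) in services.items():
        report[name] = port_open(host, port)
    return report


if __name__ == "__main__":
    # Hosts and ports are assumptions, not values taken from this project.
    assumed = {"redis": ("localhost", 6379), "supabase": ("localhost", 54321)}
    for name, ready in preflight(assumed).items():
        print(f"{name}: {'ok' if ready else 'NOT READY'}")
```

An open port only shows that something is listening there, not that the service is healthy, so treat a passing check as a hint rather than a guarantee.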

The Testing Docs — What They Are and When to Use Them

We have 5 testing documents. Here's how they connect:

Conceptual Model (the spec)
    │
    ├──► CM Unit Testing Plan ──► CM Unit Test Coverage Report
    │    "What SHOULD be tested"     "What IS vs ISN'T tested"
    │
    └──► Testing Plan ──────────► Test Coverage Audit
         "Manual API test cases"     "Automated coverage by feature"

1. Testing Plan — Manual API Test Cases

What: 250+ manual test cases organized by feature. Each is a curl or Postman request paired with its expected result.

When to use: When you want to manually verify a feature works end-to-end. Great for QA sessions, pre-release verification, or when you're debugging something and want to isolate it.

Format: Tables with test #, method, endpoint, auth requirement, and expected response.
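
To make the table format concrete, here is a sketch of one test-case row modelled as data and rendered as a curl command. The endpoint, base URL, token, and the Bearer auth scheme are placeholders invented for illustration, not entries from the Testing Plan.

```python
import shlex
from dataclasses import dataclass


@dataclass
class ManualCase:
    number: int            # "test #" column
    method: str            # HTTP method column
    endpoint: str          # endpoint column
    requires_auth: bool    # auth requirement column
    expected_status: int   # expected response, reduced here to a status code


def to_curl(case, base_url, token=None):
    """Render a test-case row as a curl command that prints only the status code."""
    parts = ["curl", "-s", "-o", "/dev/null", "-w", "%{http_code}",
             "-X", case.method]
    if case.requires_auth:
        if token is None:
            raise ValueError(f"case {case.number} needs a token")
        # Bearer auth is an assumption; use whatever scheme the API expects.
        parts += ["-H", f"Authorization: Bearer {token}"]
    parts.append(base_url.rstrip("/") + case.endpoint)
    return " ".join(shlex.quote(p) for p in parts)


# Hypothetical row: not copied from the Testing Plan.
example = ManualCase(1, "GET", "/health", requires_auth=False, expected_status=200)
print(to_curl(example, "http://localhost:8000"))
```

Run the printed command in a shell and compare the status code it prints against `expected_status`.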

2. Conceptual-Model-Unit-Testing-Plan — What Should Be Tested

What: 186 theoretical unit tests derived from every testable claim in the Conceptual Model. Each has a test ID (CM-{area}.{number}), the unit under test, assertions, and mocks needed.

When to use: When you are writing a new test and need to know what the Conceptual Model requires to be verified, or when you are reviewing whether a feature's tests are complete.

Format: Tables with test ID, test name, unit under test, assertions, mocks.
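
A plan row translates fairly directly into a pytest test: the unit under test is what you call, the mocks column is what you replace, and the assertions column becomes the assert statements. The sketch below is illustrative only. The ID `CM-7.3`, the `refund_on_failure` function, and the `billing` client are all hypothetical and do not come from the plan or the codebase.

```python
from unittest.mock import Mock


def refund_on_failure(billing, user_id, amount, upstream_ok):
    """Hypothetical unit under test: refund the user when the upstream call failed."""
    if upstream_ok:
        return False
    billing.refund(user_id, amount)
    return True


def test_cm_7_3_refund_issued_when_upstream_fails():
    """CM-7.3 (hypothetical ID): a failed upstream call triggers exactly one refund."""
    billing = Mock()  # mocks column: stand-in for the billing client

    refunded = refund_on_failure(billing, "user-1", 5, upstream_ok=False)

    # assertions column
    assert refunded is True
    billing.refund.assert_called_once_with("user-1", 5)


def test_cm_7_3_no_refund_when_upstream_succeeds():
    """Same hypothetical ID, negative case: a successful call must not refund."""
    billing = Mock()

    refunded = refund_on_failure(billing, "user-1", 5, upstream_ok=True)

    assert refunded is False
    billing.refund.assert_not_called()
```

Carrying the plan ID in the test name or docstring is one way to keep tests searchable when cross-checking them against the coverage report.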

3. CM Unit Test Coverage Report — What IS vs ISN'T Tested

What: Compares the 186 theoretical tests from the CM Unit Testing Plan against the 5,491 actual tests. Shows which conceptual model claims are covered, partially covered, or completely missing.

When to use: When you want to find the gaps — which promises does the spec make that have zero automated verification? This is your priority list for writing new tests.

Key numbers:

Status	Count	%
Covered	89	47.8%
Partial	31	16.7%
Missing	66	35.5%
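
The percentages follow from the counts and the 186-test total. A short sketch to recompute them if the counts change:

```python
# Counts taken from the table above.
counts = {"Covered": 89, "Partial": 31, "Missing": 66}


def percentages(counts):
    """Return each status as a percentage of the total, rounded to one decimal."""
    total = sum(counts.values())
    return {status: round(100 * n / total, 1) for status, n in counts.items()}


total = sum(counts.values())
print(total)                # 186
print(percentages(counts))  # {'Covered': 47.8, 'Partial': 16.7, 'Missing': 35.5}
```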

4. Test Coverage Audit — Automated Coverage by Feature

What: Feature-by-feature audit of automated test coverage. Maps the Testing Plan test cases to actual test files.

When to use: When you want to know "does feature X have tests?" Shows test files, test functions, and coverage status per feature area.

Key number: 28 of 29 feature areas have automated coverage (only Guardrails has none — because it's not implemented).

5. Test Mapping — Raw Test-to-Feature Mapping

What: The raw data — 5,491 tests mapped to features. This is a large reference file (7,978 lines), not meant for reading start to finish.

When to use: When you need to find a specific test for a specific feature, or when you're generating coverage reports.

Where to Start

I want to...	Go to...
Run the tests right now	`pytest` (commands above)
Manually verify a feature	Testing Plan
Write a new test	Conceptual-Model-Unit-Testing-Plan for what to test, CM Unit Test Coverage Report for what's missing
Check if a feature has tests	Test Coverage Audit
Find the test file for feature X	Test Mapping
