-
-
Couldn't load subscription status.
- Fork 106
Description
Context
Analysis of 55 bug fixes over the past 4 months reveals recurring patterns that could be caught by better unit tests. Many bugs were only discovered in integration tests or production.
Key Bug Patterns Identified
1. Isolated Node Issues (3+ bugs)
- fix: Check local storage directly in request_get() before network operations #1806: GET on isolated nodes
- Fix: WebSocket SubscribeResponse not being delivered to clients #1844: Subscribe on isolated nodes
- UPDATE operations timeout on isolated nodes without returning UpdateResponse #1884: UPDATE on isolated nodes (new)
Gap: Operations tested in multi-node setups break on isolated nodes
2. Self-Routing Prevention (4+ bugs)
- fix: Check local storage directly in request_get() before network operations #1806: GET routing to self
- fix: Prevent nodes from forwarding GET requests to themselves #1786: Forwarding to self
- fix: allow nodes to consider themselves as caching peers #1781: Self-caching consideration
- Fix self connection attempts #1827: Self-connection attempts
Gap: No unit tests validating "never route to self" invariant
3. Actor/Router Delivery (3+ bugs)
- Fix: WebSocket SubscribeResponse not being delivered to clients #1844: SubscribeResponse not delivered
- fix: Critical subscription routing fixes (Phases 1-3) #1854: Subscription routing fixes (3 phases)
- Revert Fix WebSocket UPDATE race condition #1736: Critical WebSocket subscription bug
Gap: ResultRouter → SessionActor flow has minimal unit tests (only creation test!)
4. Race Conditions (4+ flaky tests)
- Fix race condition in state store cache updates #1753: State store cache race
- Fix flaky test_multiple_clients_subscription in CI #1754: Multiple client subscription race
- Fix test_put_contract flakiness by adding worker_threads specification #1764: test_put_contract flaky
- Various flaky message handling tests
Gap: No systematic race condition testing framework
5. Contract Initialization (#1842)
Gap: No tests for operations during initialization
6. Edge Cases
- fix: send UpdateResponse to clients when update results in no state change #1734: UpdateResponse when no state change
- Oversized message handling
Gap: No systematic edge case testing
Proposed Test Improvements
Priority 1: Isolated Node Test Suite
// Generic test pattern for ANY operation on isolated node
#[tokio::test]
async fn test_operation_on_isolated_node<Op>() {
// Verify: no self-routing, no network timeout, uses local cache
}- Expand isolated_node_regression.rs to cover UPDATE
- Create reusable isolated node test framework
Priority 2: Self-Routing Prevention
Create routing_invariants.rs:
#[test]
fn test_never_routes_to_self() {
let node = create_node();
let target = same_location_as_node();
let selected_peer = select_peer_for_target(target);
assert_ne!(selected_peer, node.peer_id());
}
#[test]
fn test_connection_rejects_self() {
// Should reject connection attempts to own peer ID
}Priority 3: Actor/Router Delivery Path Tests
Expand result_router.rs tests (currently only 2 trivial tests!):
#[tokio::test]
async fn test_result_router_delivers_to_session_actor() {
// Send result through router, verify session actor receives it
}
#[tokio::test]
async fn test_instant_completion_race() {
// Operation completes before client 2 registers
// Verify both clients still get results
}
#[tokio::test]
async fn test_multiple_clients_receive_response() {
// Verify message delivery to multiple WebSocket clients
}Priority 4: Race Condition Testing Framework
Add Loom dependency for deterministic race detection:
#[test]
fn test_state_cache_concurrent_updates() {
loom::model(|| {
// Model concurrent cache updates deterministically
});
}Priority 5: Contract Initialization Edge Cases
Create initialization_edge_cases.rs:
#[tokio::test]
async fn test_operations_queued_during_init() {
let contract = start_initializing();
send_operation(Op1);
send_operation(Op2);
complete_init();
// Verify ops executed in order with correct params
}Priority 6: Edge Case Response Tests
#[test]
fn test_update_no_state_change_returns_response() {
let result = update(same_state);
assert!(matches!(result, UpdateResponse { .. }));
}Concrete Action Plan
- Add comprehensive tests to
result_router.rs(currently minimal coverage) - Create
routing_invariants.rswith self-routing prevention tests - Expand
isolated_node_regression.rsto cover UPDATE operation - Add Loom dependency for deterministic race condition testing
- Create
initialization_edge_cases.rsfor contract init scenarios - Add edge case matrix tests for all operations (empty, max size, over-limit, no-change)
Expected Impact
These test improvements would have caught most of the 55 bugs before they reached integration testing or production, significantly improving development velocity and code quality.
Current Test Coverage
- 52 files with unit tests in src/
- 3 integration test files in tests/
- Many critical paths (result router, request router, self-routing prevention) have minimal unit test coverage
[AI-assisted debugging and comment]
Metadata
Metadata
Assignees
Labels
Type
Projects
Status