fix #228: Critical: Inconsistent reads in single-node mode #229
fix #228: Critical: Inconsistent reads in single-node mode #229
Conversation
Problem: - put() → get_linearizable() immediately after returned None - Root cause: leader commits log but doesn't wait for state machine apply - Only reproducible on first startup (no existing data) Solution (following OpenRaft best practices): - Add tokio::sync::watch channel to notify when last_applied advances - ensure_state_machine_upto_commit_index() now waits for apply completion - Configurable timeout: state_machine_sync_timeout_ms (default: 10ms) Performance impact: - Linearizable Read: +0.8% (no regression) - LeaseRead: +9.5% (unexpected improvement) - Eventual Read: -6.1% (will optimize in v0.3.0) - Write: -3.0% (acceptable for correctness) Trade-off: -3~6% performance for critical data consistency fix.
- Add mock expectations for wait_applied() and read_from_state_machine() in leader_state_test - Fix quick-start-embedded example to use get_linearizable instead of get_eventual - Add GET verification to service-discovery-embedded after PUT operations - Add comprehensive integration tests for linearizable read after write (Test 1 & Test 5b)
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. 📝 WalkthroughWalkthroughThe pull request adds linearizable read support to a Raft-based key-value system by introducing state machine synchronization. A new configuration parameter specifies the state machine sync timeout, a Changes
Sequence DiagramsequenceDiagram
participant Client
participant LocalKvClient
participant LeaderState
participant StateMachineHandler
participant RaftApply
Client->>LocalKvClient: get_linearizable(key)
LocalKvClient->>LeaderState: handle_client_read_request()
rect rgb(220, 240, 250)
Note over LeaderState,StateMachineHandler: State Machine Synchronization
LeaderState->>LeaderState: ensure_state_machine_upto_commit_index()
LeaderState->>StateMachineHandler: wait_applied(target_index, timeout)
activate StateMachineHandler
par Async Wait
StateMachineHandler->>StateMachineHandler: watch for last_applied advancement
Note over RaftApply: Concurrent apply loop<br/>processes committed entries
RaftApply->>StateMachineHandler: notify applied_notify_tx
StateMachineHandler->>StateMachineHandler: target_index reached
end
StateMachineHandler-->>LeaderState: Ok(())
deactivate StateMachineHandler
end
rect rgb(240, 250, 220)
Note over LeaderState,Client: Safe Read Phase
LeaderState->>LeaderState: read from state machine
LeaderState-->>LocalKvClient: value
LocalKvClient-->>Client: value
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
There was a problem hiding this comment.
Pull request overview
This PR fixes a critical bug (#228) where linearizable reads in single-node mode could return stale or missing data immediately after writes. The fix implements a proper wait mechanism to ensure the state machine has applied all committed entries before serving linearizable reads, maintaining Raft's linearizability guarantee across both single-node and multi-node deployments.
Key Changes
- Introduces
wait_applied()method inStateMachineHandlertrait to wait for state machine to catch up with commit index - Adds watch channel notification mechanism in
DefaultStateMachineHandlerto signal when entries are applied - Updates
ensure_state_machine_upto_commit_index()inLeaderStateto asynchronously wait for state machine synchronization - Adds comprehensive integration tests validating linearizable read guarantees with no artificial delays
Reviewed changes
Copilot reviewed 11 out of 11 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| d-engine-core/src/state_machine_handler/mod.rs | Adds wait_applied() trait method for linearizable read support |
| d-engine-core/src/state_machine_handler/default_state_machine_handler.rs | Implements watch-based notification mechanism and wait logic for state machine synchronization |
| d-engine-core/src/raft_role/leader_state.rs | Updates linearizable read path to asynchronously wait for state machine before serving reads |
| d-engine-core/src/config/raft.rs | Adds configurable timeout for state machine synchronization (default 10ms) |
| config/base/raft.toml | Documents new state_machine_sync_timeout_ms configuration parameter |
| d-engine-server/tests/embedded/local_kv_client_integration_test.rs | Adds focused tests for linearizable read-after-write and sequential write scenarios |
| d-engine-server/tests/cluster_start_stop/cluster_integration_test.rs | Adds multi-node test to verify consistent behavior across cluster sizes |
| d-engine-server/tests/components/raft_role/leader_state_test.rs | Updates mock tests to handle new async behavior and wait_applied expectations |
| examples/service-discovery-embedded/server.rs | Adds verification of linearizable reads after writes with assertions |
| examples/quick-start-embedded/src/main.rs | Replaces eventual consistency reads with linearizable reads and adds error handling |
| README.md | Simplifies quick start example to use linearizable reads by default |
Comments suppressed due to low confidence (1)
d-engine-core/src/config/raft.rs:996
- The validate method does not check if state_machine_sync_timeout_ms is greater than 0. A timeout value of 0 would cause tokio::time::timeout to return immediately with an error, which could lead to unexpected behavior. Consider adding validation to ensure this value is greater than 0, similar to how lease_duration_ms is validated.
impl ReadConsistencyConfig {
fn validate(&self) -> Result<()> {
// Validate read consistency configuration
if self.lease_duration_ms == 0 {
return Err(Error::Config(ConfigError::Message(
"read_consistency.lease_duration_ms must be greater than 0".into(),
)));
}
Ok(())
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
d-engine-server/tests/cluster_start_stop/cluster_integration_test.rs
Outdated
Show resolved
Hide resolved
d-engine-core/src/state_machine_handler/default_state_machine_handler.rs
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Actionable comments posted: 0
🧹 Nitpick comments (3)
d-engine-core/src/config/raft.rs (1)
987-996: Consider adding validation for state_machine_sync_timeout_ms.The
validate()method checkslease_duration_ms > 0but doesn't validate the newstate_machine_sync_timeout_msfield. For consistency and to prevent misconfigurations, consider adding a similar check.🔎 Suggested validation addition
impl ReadConsistencyConfig { fn validate(&self) -> Result<()> { // Validate read consistency configuration if self.lease_duration_ms == 0 { return Err(Error::Config(ConfigError::Message( "read_consistency.lease_duration_ms must be greater than 0".into(), ))); } + if self.state_machine_sync_timeout_ms == 0 { + return Err(Error::Config(ConfigError::Message( + "read_consistency.state_machine_sync_timeout_ms must be greater than 0".into(), + ))); + } Ok(()) } }d-engine-server/tests/components/raft_role/leader_state_test.rs (1)
857-947: Linearizable read test now correctly exercises state machine handlerWiring
MockStateMachineHandlerinto theMockBuilderand assertingupdate_pending+wait_applied+read_from_state_machineintest_handle_raft_event_case6_2gives you an end‑to‑end check of the linearizable read path. If you want this even tighter, you could add.times(1)onread_from_state_machineas well, but that’s optional.d-engine-core/src/state_machine_handler/default_state_machine_handler.rs (1)
105-109:wait_applied+ watch-channel wiring for linearizable reads looks correct; consider a couple of refinements
- Using a
watchchannel keyed bylast_appliedand cloning the receiver inwait_appliedis a good fit: waiters see the current value immediately and then block onchanged()only when necessary.- Updating
last_appliedwithOrdering::Releaseinapply_chunkand sending onapplied_notify_txensureswait_appliedobserves applied entries in order, and the >= check is safe under the Raft “apply in log order” assumption.- Seeding the channel with
last_applied_indexinnewmeans fresh waiters won’t unnecessarily block when state is already caught up.Two small follow-ups you might consider:
- For timeouts and channel-closure, you currently return
Error::Fatal. If a slow/blocked state machine should not be treated as process‑fatal, you may want a more specific non‑fatal error for read paths so the node can remain healthy while signaling a failed linearizable read.- If there are code paths (e.g., snapshot application) that can advance the state machine’s applied index without going through
apply_chunk, it would be safer to also updatelast_appliedand send onapplied_notify_txthere to keepwait_appliedin sync.Also applies to: 163-187, 238-246, 789-823
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (11)
README.mdconfig/base/raft.tomld-engine-core/src/config/raft.rsd-engine-core/src/raft_role/leader_state.rsd-engine-core/src/state_machine_handler/default_state_machine_handler.rsd-engine-core/src/state_machine_handler/mod.rsd-engine-server/tests/cluster_start_stop/cluster_integration_test.rsd-engine-server/tests/components/raft_role/leader_state_test.rsd-engine-server/tests/embedded/local_kv_client_integration_test.rsexamples/quick-start-embedded/src/main.rsexamples/service-discovery-embedded/server.rs
🧰 Additional context used
🧬 Code graph analysis (4)
d-engine-core/src/state_machine_handler/mod.rs (1)
d-engine-core/src/state_machine_handler/default_state_machine_handler.rs (1)
wait_applied(163-187)
d-engine-server/tests/embedded/local_kv_client_integration_test.rs (1)
d-engine-core/src/watch/manager.rs (1)
key(58-60)
d-engine-server/tests/components/raft_role/leader_state_test.rs (1)
d-engine-server/src/test_utils/mock/mock_node_builder.rs (1)
new(139-160)
d-engine-core/src/state_machine_handler/default_state_machine_handler.rs (1)
d-engine-core/src/state_machine_handler/mod.rs (1)
wait_applied(87-91)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
- GitHub Check: Agent
- GitHub Check: Lint and Format Check
- GitHub Check: Lint and Format Check
🔇 Additional comments (15)
examples/quick-start-embedded/src/main.rs (2)
63-68: LGTM! Linearizable read correctly replaces eventual consistency.The change from
get_eventualtoget_linearizableensures the read reflects the immediately preceding write, which is the correct consistency level for this demo workflow.
81-94: Good addition of error handling for read verification.The error message correctly identifies that a None result after a successful PUT indicates a bug when using linearizable reads. The fallback error path improves the robustness of the example.
examples/service-discovery-embedded/server.rs (2)
88-98: Excellent verification pattern for linearizable reads.The immediate read-after-write verification effectively demonstrates the linearizable consistency guarantee. The panic message clearly identifies this as a bug scenario, which is appropriate for example code.
110-120: LGTM! Consistent verification after UPDATE.The verification pattern after UPDATE matches the PUT verification, maintaining consistency in the example and reinforcing the linearizable read guarantee.
d-engine-core/src/config/raft.rs (1)
953-959: Well-documented configuration field.The documentation clearly explains the purpose, typical latency expectations, and rationale for the default value. The 10ms default provides a reasonable safety margin over the typical <1ms apply latency.
README.md (2)
45-70: LGTM! Quick start example properly demonstrates linearizable reads.The updated example correctly uses
get_linearizableto demonstrate strong consistency guarantees. The simplified error handling withunwrap()is appropriate for a quick-start example, keeping the focus on the core API usage.
74-99: Clear distinction between integration modes.The restructured documentation effectively communicates the embedded vs standalone trade-offs, with clear "Use when" and "Why" guidance for each mode. The performance note provides valuable context for users making deployment decisions.
config/base/raft.toml (1)
86-89: Well-documented configuration parameter.The inline documentation clearly explains the purpose of the timeout, provides context about typical apply latency, and justifies the default value. The 10ms default provides an appropriate safety buffer for linearizable reads.
d-engine-server/tests/components/raft_role/leader_state_test.rs (2)
389-447: Good coverage ofensure_state_machine_upto_commit_indexhappy/no-op pathsThe two tests correctly assert that
update_pending/wait_appliedare invoked only whenlast_applied < commit_index, and skipped whenlast_applied >= commit_index, and they’ve been updated to await the now-async method. This is a solid regression check for the new behavior.
4926-4940: Unspecified policy test now validates linearizable default pathThe
handle_client_read_request::test_handle_client_read_unspecified_policy_leaderupdate to mockwait_applied/read_from_state_machineand inject the handler viawith_state_machine_handlermatches the new linearizable default behavior and should catch regressions in the read path.d-engine-server/tests/cluster_start_stop/cluster_integration_test.rs (1)
260-340: Multi-node linearizable read test is well-structured and reuses helpersThe new
test_multi_node_linearizable_read_consistencycleanly reuses the existing cluster setup andtest_put_gethelper to assert PUT→linearizable GET without sleep and sequential overwrite behavior in a 3‑node cluster. This directly codifies the regression scenario and looks correct.d-engine-core/src/state_machine_handler/mod.rs (1)
81-92: Trait-levelwait_appliedAPI is clear and appropriately scopedAdding
wait_applied(target_index, timeout)toStateMachineHandlerwith explicit linearizability docs cleanly captures the new responsibility and matches the implementation in the default handler. Signature (u64index +Durationtimeout, returningResult<()>) is flexible enough for other implementations.d-engine-server/tests/embedded/local_kv_client_integration_test.rs (1)
377-481: New LocalKvClient linearizable-read tests accurately encode the bug scenarioBoth tests precisely capture the intended guarantees:
test_linearizable_read_after_write_no_sleepvalidates read‑after‑write with no delay on a single node.test_linearizable_read_sees_latest_valuechecks sequential overwrites always surface the latest value viaget_linearizable.They’re minimal, deterministic, and directly guard against regressions in the new
wait_appliedpath.d-engine-core/src/raft_role/leader_state.rs (2)
914-920: Linearizable read fix correctly awaits state machine synchronization.The change properly ensures that linearizable reads wait for the state machine to apply all committed entries before serving the request. The error handling is appropriate, failing the read request with a descriptive message if the state machine fails to catch up within the configured timeout.
1848-1872: State machine synchronization implementation is correct.The async implementation properly waits for the state machine to catch up to the commit index before allowing linearizable reads to proceed. Configuration verification confirms that
state_machine_sync_timeout_msis properly defined (default: 10ms) ind-engine-core/src/config/raft.rs, and thewait_appliedmethod inDefaultStateMachineHandlercorrectly handles timeouts and returns immediately when the state machine has already reached the target index.The
last_appliedparameter comparison could theoretically be stale, but thewait_appliedimplementation gracefully handles this scenario by checking if the current applied index is already at or beyond the target before blocking, avoiding any correctness issues.
Codecov Report❌ Patch coverage is
📢 Thoughts on this report? Let us know! |
- Add test_put_get_no_sleep() for true no-sleep linearizable read tests - Fix test_remove_node connection conflict with independent mock addresses - Increase state_machine_sync_timeout to 100ms in tests for CI stability - Replace test_put_get with test_put_get_no_sleep in multi-node linearizable test
…gnostics Include current state machine index in timeout error to help diagnose: - State machine stuck at specific index - Apply progress too slow - No progress at all
- Reduce example code from 30 to 13 lines (keep essentials) - Add links to examples/quick-start-embedded - Add links to examples/service-discovery-embedded - Add link to examples/quick-start-standalone
# fix #228: Critical: Inconsistent reads in single-node mode (#229) ## Problem **Critical bug**: `put()` → `get_linearizable()` returned `None` immediately after write - **Root cause**: Leader commits log but doesn't wait for state machine to apply - **Impact**: Violates linearizability guarantee - reads don't reflect committed writes - **Reproducibility**: Only on first startup (no existing data) due to timing race ## Solution Implemented **wait-for-apply mechanism**: 1. **Watch channel for apply notifications** - `tokio::sync::watch` notifies when `last_applied` advances - `ensure_state_machine_upto_commit_index()` waits until target index applied 2. **Configurable timeout** - `state_machine_sync_timeout_ms` (default: 10ms for local SSD) - Tests use 100ms for CI environment reliability 3. **Enhanced error diagnostics** - Timeout errors now include `current_applied` index - Helps diagnose: stuck state machine, slow apply, or no progress
Type
Related Issues
Checklist
Summary by CodeRabbit
New Features
Configuration
state_machine_sync_timeout_msparameter (default 10ms) for read consistency tuning.Documentation
Tests
✏️ Tip: You can customize this high-level summary in your review settings.