feat(cluster): batched Raft commands for compaction manifests#403
Merged
feat(cluster): batched Raft commands for compaction manifests#403
Conversation
Replace the O(N) sequential RegisterFile/DeleteFile applies in the CompletionWatcher with a single CommandBatchFileOps log entry per manifest. For a typical manifest with 20 outputs + 20 deleted sources this drops Raft round-trips from 40 to 1, cutting apply latency from ~200ms to ~5ms. - Add CommandBatchFileOps (type 10) to the FSM with BatchFileOp / BatchFileOpsPayload types and an applyBatchFileOps handler that dispatches to the existing idempotent register/delete helpers. - Add Node.BatchFileOps and Coordinator.BatchFileOpsInManifest with the same leader-or-forward semantics as the single-command path. - Extend ManifestBridge with BatchFileOps; rewrite CompletionWatcher. applyOne to build register/delete slices and issue one batch call, preserving the output_written → sources_deleted two-phase ordering invariant. - Add CommandBatchFileOps to the forward-apply security allowlist. - 13 new tests across FSM, watcher, and bridge layers.
Contributor
There was a problem hiding this comment.
Code Review
This pull request optimizes compaction manifest processing by batching RegisterFile and DeleteFile operations into a single Raft log entry (CommandBatchFileOps). This change reduces Raft round-trips from O(N) to 1 per manifest, significantly cutting apply latency while maintaining idempotency and the two-phase write-then-delete invariant. The implementation includes updates to the CompactionBridge, Coordinator, Raft FSM, and CompletionWatcher, along with comprehensive test coverage. I have no feedback to provide.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
RegisterFile/DeleteFileRaft applies per compaction manifest with a singleCommandBatchFileOpslog entry, cutting apply latency from ~200ms to ~5ms for typical manifestsCommandBatchFileOps(type 10) FSM handler dispatches to existing idempotent register/delete helpers — all files in a batch share the same LSN (the batch log index), which is semantically correct since a compaction job's outputs are causally simultaneousManifestBridgeinterface extended withBatchFileOps;CompletionWatcher.applyOnerewritten to issue one batch call while preserving theoutput_written → sources_deletedtwo-phase ordering invariantCommandBatchFileOpsadded to the forward-apply security allowlist with the same HMAC + role authorization as existing file-manifest commandsCloses #399
Test plan
go test ./internal/cluster/raft/... -run TestFSMBatch— 4 FSM tests (happy path, registers-only, unknown op type, empty batch)go test ./internal/compaction/... -run TestWatcher— all watcher tests including 4 new batch-specific onesgo test ./internal/cluster/... -run TestCompactionBridge_Batch— 4 bridge testsgo build ./...— clean build