PHOENIX-7793 Replication Log writer improvements #2433
Merged
Summary
Rewrites the `ReplicationLog` writer rotation mechanism to be lock-free and asynchronous, improving throughput on the disruptor consumer hot path.

- Replaces the `ReentrantLock`-based synchronous `rotateLog()` with an `AtomicReference<LogFileWriter> pendingWriter` staging pattern. `LogRotationTask` creates new writers on a background thread and stages them in `pendingWriter`. The disruptor consumer thread atomically drains the staged writer inside `apply()` before each action, replays any unsynced appends, and continues.
- `closeExecutor` for background close, keeping the hot path unblocked.
- […] `getWriter()`, `requestRotationIfNeeded()` is called post-action and submits a `LogRotationTask` to the executor if the size threshold is exceeded. An `AtomicBoolean rotationRequested` prevents duplicate submissions.
- `initialDelay` is computed via `computeInitialDelay()` to align with replication round boundaries. The rotation interval is now derived from `replicationRoundDurationSeconds` rather than a separate config key.
- `apply()`: The first failure retries on the same writer, on the assumption that the error is transient. The second failure triggers `requestRotation()` to create a new writer on the background thread. The retry sleep gives the background thread time to stage it, and the next attempt drains it.
- `RotationReason` enum and per-reason metrics: eliminated `TIME`/`SIZE`/`ERROR` rotation reason tracking. Only total rotation count and failure count remain.
- `closed` field upgraded to `AtomicBoolean` for safe concurrent access without locks.

Test changes
- Adds `recreateLogGroup()`, `overrideConf()`, and `waitForRotationTick()` helpers to `ReplicationLogBaseTest`
- `TestableLog` overrides `startRotationExecutor()` with a full-round initial delay to prevent flaky boundary-related failures
- `ReplicationLogTest` with unit tests for `computeInitialDelay`
- `forceRotation()` instead of removed `rotateLog()`

Test plan
- `mvn test -pl phoenix-core -Dtest=ReplicationLogTest`
- `mvn test -pl phoenix-core -Dtest=ReplicationLogGroupTest`
- `mvn test -pl phoenix-core -Dtest=ReplicationLogDiscoveryForwarderTest`
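
For readers unfamiliar with the staging pattern named in the Summary, the following is a minimal sketch of the idea, not the code in this PR. `StagedWriterSketch`, its nested `Writer` class (a stand-in for `LogFileWriter`), `currentWriterId()`, and `shutdownAndAwait()` are hypothetical names. The sketch also assumes the `rotationRequested` flag is cleared when the consumer drains the staged writer; the PR may clear it elsewhere. Replay of unsynced appends and the background close are omitted.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicReference;

/** Illustrative only; Writer is a hypothetical stand-in for LogFileWriter. */
class StagedWriterSketch {
    static final class Writer {
        final int id;
        long size; // bytes appended so far
        Writer(int id) { this.id = id; }
    }

    // Slot where the background thread stages the next writer.
    private final AtomicReference<Writer> pendingWriter = new AtomicReference<>();
    // Guards against submitting more than one rotation at a time.
    private final AtomicBoolean rotationRequested = new AtomicBoolean(false);
    private final ExecutorService rotationExecutor = Executors.newSingleThreadExecutor();
    private final long sizeThreshold;
    // Read and written only by the consumer thread, so no lock is needed.
    private Writer current = new Writer(0);

    StagedWriterSketch(long sizeThreshold) { this.sizeThreshold = sizeThreshold; }

    /** Consumer-thread entry point: drain a staged writer if present, then append. */
    void apply(long bytes) {
        Writer staged = pendingWriter.getAndSet(null); // atomic drain
        if (staged != null) {
            current = staged;             // real code would also replay unsynced appends
            rotationRequested.set(false); // assumption: flag is cleared on drain
        }
        current.size += bytes;
        requestRotationIfNeeded();
    }

    /** Post-action size check; the CAS ensures at most one in-flight request. */
    private void requestRotationIfNeeded() {
        if (current.size >= sizeThreshold && rotationRequested.compareAndSet(false, true)) {
            final int nextId = current.id + 1;
            // Writer creation happens off the consumer thread.
            rotationExecutor.execute(() -> pendingWriter.set(new Writer(nextId)));
        }
    }

    int currentWriterId() { return current.id; }

    /** Stops the background executor and waits for any staged writer to land. */
    void shutdownAndAwait() {
        rotationExecutor.shutdown();
        try {
            rotationExecutor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        StagedWriterSketch log = new StagedWriterSketch(10);
        log.apply(10);          // reaches the threshold and requests a rotation
        log.shutdownAndAwait(); // let the background thread stage the new writer
        log.apply(1);           // drains the staged writer before appending
        System.out.println("current writer id = " + log.currentWriterId());
    }
}
```

The consumer never waits for writer creation: it either finds a staged writer at the start of `apply()` or keeps appending to the current one.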
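
The description does not give the formula behind `computeInitialDelay()`. One plausible reading of "align with replication round boundaries" is sketched below, assuming rounds start at multiples of the round duration since the epoch. The class name, the signature, and the choice to return a full round when already on a boundary are assumptions, not the PR's implementation.

```java
/** Illustrative only; the signature and semantics are assumed, not taken from the PR. */
final class InitialDelaySketch {
    private InitialDelaySketch() { }

    /**
     * Returns the milliseconds from nowMs until the next multiple of roundMs.
     * On an exact boundary this returns a full round (an assumed convention).
     */
    static long computeInitialDelay(long nowMs, long roundMs) {
        if (roundMs <= 0) {
            throw new IllegalArgumentException("roundMs must be positive");
        }
        long elapsedInRound = Math.floorMod(nowMs, roundMs);
        return roundMs - elapsedInRound;
    }

    public static void main(String[] args) {
        long roundMs = 60_000L; // hypothetical 60-second replication round
        long delay = computeInitialDelay(System.currentTimeMillis(), roundMs);
        System.out.println("first rotation tick in " + delay + " ms");
    }
}
```

A delay computed this way could be passed as the `initialDelay` argument of `ScheduledExecutorService.scheduleAtFixedRate`, with the round duration as the period, so that every tick lands on a round boundary.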