New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KAFKA-9087 Replace log high watermark by future log high watermark wh… #13075
Conversation
…en adding log dir fetchers
Is it possible to add a test? |
So the error occurs when the fetch offset is not the futureLog's log end offset. Just curious -- is this because the log's highwatermark is lower or higher than the futureLog's? It seems that the value we put into the InitialFetchState is the first offset we fetch from, so just wanted to clarify the race and why this fixes the issue. (Not doubting you, just trying to understand 😄 ) |
@jolshan : It seems that this is a regression introduced in #6841. When a future log is just created, the current log's HWM could be higher than the future replica's HWM (since it's bounded by its log end offset). When the future replica starts fetching, we want it to fetch from its local HWM, not the current log's HWM. Otherwise, there will be holes in the future log. |
higher, and thanks to @junrao for the great explanation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chia7712 : Thanks for the PR. Just one minor comment.
// this method should use hw of future log to create log dir fetcher. Otherwise, it causes offset mismatch error | ||
val result = rm.maybeAddLogDirFetchers(Set(partition), new LazyOffsetCheckpoints(rm.highWatermarkCheckpoints), _ => None) | ||
assertEquals(1, result.size) | ||
assertEquals(0L, result(new TopicPartition(topic, 0)).initOffset) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@junrao Please take a look at line#258
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chia7712 : Thanks for the updated PR. A couple of more comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@chia7712 : Thanks for the updated PR. LGTM if the tests pass.
unrelated error
will merge it |
…9-jan-2023 * apache/trunk: (16 commits) KAFKA-14570: Fix parenthesis in verifyFullFetchResponsePartitions output (apache#13072) MINOR: Remove public mutable fields from ProducerAppendInfo (apache#13091) KAFKA-14558: Move/Rewrite LastRecord, TxnMetadata, BatchMetadata, ProducerStateEntry, and ProducerAppendInfo to the storage module (apache#13043) KAFKA-14535: Fix flaky EndToEndAuthorization tests which were sensitive to ACL change reordering (apache#13086) KAFKA-9087 Replace log high watermark by future log high watermark wh… (apache#13075) MINOR: add error reason when controller failed to handle events (apache#13050) MINOR: doc: note how JDK-8136913 can affect client SASL (apache#13071) 2023 (apache#13083) KAFKA-14279; Add 3.3.x to core compatibility tests (apache#13076) MINOR Fixed doc generation for LogConfig class in genTopicConfigDocs. (apache#13079) ...
apache#13075) Reviewers: Ismael Juma <ismael@juma.me.uk>, Justine Olshan <jolshan@confluent.io>, Jun Rao <junrao@gmail.com>
https://issues.apache.org/jira/browse/KAFKA-9087
The race condition of processing LeaderAndIsrRequest and AlterReplicaLogDirsRequest causes above error (on V2 message format). Also, the error can be reproduced easily on V1 since there is no epoch cache. The ReplicaAlterLogDirsThread checks the offset of “future log” rather than “log. Hence, here is my two cents, we can replace log.highWatermark by futureLog.highWatermark to resolve this issue. I tested it on our cluster and it works well (on both V1 and V2).
Committer Checklist (excluded from commit message)