
KAFKA-7401: Fix inconsistent range exception on segment recovery #6220

Merged
2 commits merged into apache:1.1 on Feb 12, 2019

Conversation

@apovzner (Contributor) commented on Feb 1, 2019

This PR fixes the "java.lang.IllegalArgumentException: inconsistent range" that occurs on broker startup after an unclean shutdown during the log cleaning phase that creates swap files, in the case where a segment's base offset is less than the log start offset. Added testRecoveryAfterCrashAndIncrementedLogStartOffset, which reproduces KAFKA-7401.
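The failure mode can be sketched as follows. This is a hedged illustration, not Kafka's actual code: `checkRange` and the offset values are hypothetical, but they mirror the situation described above, where the reload range ends up with its end offset below its start offset.

```java
// Hypothetical sketch of the failure described above (not Kafka source).
// After log cleaning produces a swap segment whose base offset is below the
// log start offset, recovery can compute a range whose end is smaller than
// its start, and a range validity check then rejects it.
public class InconsistentRangeSketch {
    public static void checkRange(long start, long end) {
        if (end < start)
            throw new IllegalArgumentException("inconsistent range");
    }

    public static void main(String[] args) {
        long logStartOffset = 100L;   // log start offset moved up by retention
        long segmentBaseOffset = 40L; // swap segment created by the cleaner
        try {
            // Reloading producer state from logStartOffset up to the segment's
            // base offset: here end (40) < start (100), so the check fires.
            checkRange(logStartOffset, segmentBaseOffset);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage()); // prints "inconsistent range"
        }
    }
}
```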

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

val fetchDataInfo = segment.read(startOffset, None, Int.MaxValue)
if (fetchDataInfo != null)
  loadProducersFromLog(stateManager, fetchDataInfo.records)
if (stateManager.mapEndOffset < segment.baseOffset) {
Contributor:
The basic intent of this logic is to ensure that we have consistent producer state up to the base offset of the segment we're recovering before we begin the actual recovery. In this case, the log start offset is somewhere above the base offset of this segment, which makes the call to truncateAndReload above a little strange (i.e. the end offset is smaller than the start offset). This seems to work, but feels brittle. Since we do not actually do partial segment recovery, I wonder if we could do something like this:

val recoveryStartOffset = math.min(logStartOffset, segment.baseOffset)
stateManager.truncateAndReload(recoveryStartOffset, segment.baseOffset, time.milliseconds)

Then the map end offset is always no greater than the segment base offset. What do you think?
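The clamp suggested above can be sketched outside of Kafka as follows. This is a minimal Java sketch; the method name `recoveryStartOffset` is made up for illustration, and in the real code these bounds are what get passed to `stateManager.truncateAndReload`.

```java
// Minimal sketch of the clamping suggested above (illustrative, not Kafka code).
public class RecoveryOffsets {
    // Clamp the reload start so the reload range is never inverted: the start
    // offset passed to the reload is at most the segment base offset.
    public static long recoveryStartOffset(long logStartOffset, long segmentBaseOffset) {
        return Math.min(logStartOffset, segmentBaseOffset);
    }

    public static void main(String[] args) {
        // KAFKA-7401 case: segment base offset below the log start offset.
        System.out.println(recoveryStartOffset(100L, 40L)); // prints 40
        // Normal case: log start offset at or below the segment base offset.
        System.out.println(recoveryStartOffset(10L, 40L));  // prints 10
    }
}
```

With this clamp, the map end offset after the reload is always no greater than the segment base offset, which is the invariant the comment above calls out.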

@apovzner (Contributor Author) replied:
Thanks for the context. Given we want to have consistent producer state up to the base offset of the segment we are recovering, and truncateAndReload will set the map end offset properly in this case, this approach seems better. I also tested it just now, and it works (fixes the original bug). I will update the PR with your suggestion.

@hachikuji (Contributor) left a comment:

LGTM. Thanks for the patch.

@hachikuji merged commit 1b6bfda into apache:1.1 on Feb 12, 2019
@landau commented on Dec 2, 2019

Was this fix released in Kafka 1.1.2? I cannot find the download here https://kafka.apache.org/downloads or here https://archive.apache.org/dist/kafka/

The ticket, https://issues.apache.org/jira/browse/KAFKA-7401, claims it was released as part of 1.1.2.

If it hasn't been released, what would it take to get it released? We are currently unable to start one of our brokers because of this. Thank you!
