
KAFKA-14481: Move LogSegment/LogSegments to storage module #14529

Merged
merged 7 commits into apache:trunk from ijuma:log-segment-java on Oct 16, 2023

Conversation

@ijuma
Contributor

@ijuma ijuma commented Oct 11, 2023

A few notes:

  • Delete a few methods from `UnifiedLog` that were simply invoking the related method in `LogFileUtils`
  • Fix `CoreUtils.swallow` to use the passed-in `logging`
  • Fix `LogCleanerParameterizedIntegrationTest` to close `log` before reopening
  • Minor tweaks in `LogSegment` for readability

For broader context on this change, please check:

  • KAFKA-14470: Move log layer to storage module

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@ijuma ijuma force-pushed the log-segment-java branch 2 times, most recently from 32ac0c1 to db75c14 Compare October 11, 2023 14:24
-        LogSegmentData segmentData = new LogSegmentData(logFile.toPath(), toPathIfExists(segment.lazyOffsetIndex().get().file()),
-                toPathIfExists(segment.lazyTimeIndex().get().file()), Optional.ofNullable(toPathIfExists(segment.txnIndex().file())),
+        LogSegmentData segmentData = new LogSegmentData(logFile.toPath(), toPathIfExists(segment.offsetIndex().file()),
+                toPathIfExists(segment.timeIndex().file()), Optional.ofNullable(toPathIfExists(segment.txnIndex().file())),
Contributor Author

@satishd Is it intentional that we force the indexes to be materialized here? We could pass the file without materializing if that's better.

Contributor

This metadata is passed to the RSM plugin, which is external to Kafka. I would like to hide the details (that we have a lazily materialized index) from the external RSM plugin and instead have a clean contract which states: "Kafka guarantees that these files will be present; RSM can pick them up and upload them". This provides a clean decoupling where Kafka <-> RSM plugin state sharing happens only via files.
That is why we need to materialize the indexes before giving them to the RSM plugin.

Contributor Author

There is a cost to materializing, which is why we should be sure it's needed. If that's the case here, then we're all good.

Contributor

I have been thinking about this. I take back my original statement.

We want to ensure that the file we are passing to the RSM plugin contains all the data present in the mapped byte buffer, i.e. we should have flushed the buffer to the file using force(). In Kafka, when we close a segment, indexes are flushed asynchronously. Hence, it is possible that when we pass the file to RSM, the file doesn't yet contain the flushed data. This is a bug, but it is not related to the change we are trying to make here. I will create a separate JIRA for this.

RSM doesn't need to read the index file during archiving, hence it's OK to pass just the file without materializing the index into the in-memory mmapped buffers.

In short, we should flush() the content of the indexes into the file before this operation, but it is not necessary to materialize them (i.e. read the content of the file into memory).

@satishd we require your opinion here.

Contributor Author

Perhaps we can file the JIRA and discuss it there.

Member

@ijuma, good catch! AFAIK, there is no need to load the lazyIndex to write the contents of the index files to remote storage.

@Divij Right, we do not need to materialize the indexes for RSM to write them to the remote storage. Whenever a log segment is rolled over, the segment and its indexes are flushed to disk in an asynchronous manner. As the indexes are mmapped, any file reads fetch from the page cache, which will be consistent with whatever was written to memory. We can explore whether the flush is really needed. RLM has access to the segment, which can be flushed before the files are passed to RSM.

Filed https://issues.apache.org/jira/browse/KAFKA-15612 for follow-up discussion.
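
To make the flush-versus-materialize distinction in this thread concrete, here is a minimal, editor-added sketch (plain Java, not Kafka code; the file name and size are arbitrary): mapping the file is the "materialization" cost under discussion, while force() is the flush that pushes in-memory writes out to the file before it is handed to an external component such as the RSM plugin.

```java
import java.io.RandomAccessFile;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

public class MmapFlushDemo {
    public static void main(String[] args) throws Exception {
        try (RandomAccessFile raf = new RandomAccessFile("demo-index.tmp", "rw");
             FileChannel channel = raf.getChannel()) {
            // Materialize: map the file into memory. This is the cost the
            // reviewers want to avoid when only the file path is needed.
            MappedByteBuffer mmap = channel.map(FileChannel.MapMode.READ_WRITE, 0, 1024);
            mmap.putLong(0, 42L); // the write lands in the page cache
            // Flush: force dirty pages to the underlying file so an external
            // reader of the file sees the in-memory writes.
            mmap.force();
        }
    }
}
```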

@@ -1945,26 +1947,17 @@ object UnifiedLog extends Logging {
logOffsetsListener)
}

def logFile(dir: File, offset: Long, suffix: String = ""): File = LogFileUtils.logFile(dir, offset, suffix)
Contributor Author

This and other deleted methods were simply passing through to LogFileUtils and hence did not add enough value to retain.

case Level.WARN => logging.warn(e.getMessage, e)
case Level.INFO => logging.info(e.getMessage, e)
case Level.DEBUG => logging.debug(e.getMessage, e)
case Level.TRACE => logging.trace(e.getMessage, e)
Contributor Author

Fixed a bug where we were not using the passed-in logging.
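
For reference, a sketch of the fixed behavior; the method shape is assumed from the call sites in this PR (e.g. Utils.swallow(LOGGER, Level.WARN, "maybeAppend", ...)), and the point is that every branch logs through the caller-supplied logging instance rather than a static logger:

```java
import org.slf4j.Logger;
import org.slf4j.event.Level;

public final class SwallowSketch {
    // Runs `code`, swallowing any Throwable and logging it at the requested
    // level on the passed-in logger (the bug was logging elsewhere).
    public static void swallow(Logger logging, Level level, String name, Runnable code) {
        try {
            code.run();
        } catch (Throwable e) {
            switch (level) {
                case ERROR: logging.error(e.getMessage(), e); break;
                case WARN:  logging.warn(e.getMessage(), e);  break;
                case INFO:  logging.info(e.getMessage(), e);  break;
                case DEBUG: logging.debug(e.getMessage(), e); break;
                case TRACE: logging.trace(e.getMessage(), e); break;
            }
        }
    }
}
```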

Contributor

do we want to add a test (perhaps for one of the functions that are using this utility) which could have caught this? Could be done as a separate JIRA.

Contributor Author

I vote for separate JIRA.

Contributor

separate JIRA is good. Could be a great newbie task.

@ijuma ijuma force-pushed the log-segment-java branch 6 times, most recently from ac459c9 to 1600c76 Compare October 12, 2023 07:24
return lazyTimeIndex.get();
}

public File timeIndexFile() {
Contributor

We are leaking an implementation detail of the index here (the fact that it is backed by a file). IMO, this should not be a public method. If someone wants to access the underlying index file, they should ask the Index for it, i.e. use LogSegment.timeIndex().file(), rather than asking the LogSegment directly.

Contributor Author

@ijuma ijuma Oct 12, 2023

It's public because it needs to be a public method (this is an internal class though). What you suggested doesn't work because it forces materialization and it would be a serious regression.

Contributor

We are leaking the fact that the index is implemented lazily outside the segment.

Does the caller know that the File returned by timeIndexFile() may not be consistent with the in-memory state of the index? We are putting the responsibility on the caller to ensure that it calls flush() if it requires consistency. This means that we are leaking the internal implementation of the index (being lazy) to the caller. This is concerning because it can cause bugs where authors using this method in other parts of the code may not realize that it could be eventually inconsistent.

If we really want to provide an index reference which doesn't require materialization, why not share the LazyIndex with the caller, so the caller explicitly knows that this index is lazily evaluated?

Also, where are we using this function outside this class? Should this be private?

Contributor Author

We are not leaking anything that is not already exposed. It would be different if it were not already accessible via the overall interface. The discussion here is not whether this is exposed (it already is), but how to expose it.

Contributor

We are not leaking anything that is not already exposed. It would be different if it were not already accessible via the overall interface. The discussion here is not whether this is exposed (it already is), but how to expose it.

Fair point. Let's tackle the how question in this PR and take up plugging the underlying file-implementation leak separately later.

On the how part, can we make the index or LazyIndex the components that choose to expose internal implementation details (instead of LogSegment)? That means we would probably have a function here in LogSegment which returns a LazyIndex.

Thoughts?

Contributor Author

LazyIndex exposes a larger surface area, though, and it's confusing to expose both LazyIndex and the materialized index. The way it is now is actually simpler: local indexes are file-based, and that's core to how they work. LogSegment gives you the file if that's all you need, or the materialized index. LazyIndex is used internally to simplify the implementation.
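
To illustrate the design being described, an editor-added, hypothetical sketch of the lazy-index pattern (names and structure are illustrative, not Kafka's actual LazyIndex): the backing file is always available cheaply, while the index itself is only materialized on first get(). LogSegment can then expose both a file accessor and a materialized-index accessor without callers ever touching the lazy wrapper.

```java
import java.io.File;
import java.util.function.Supplier;

public final class LazyIndexSketch<T> {
    private final File file;          // known up front, no I/O required
    private final Supplier<T> loader; // mmaps/loads the index on demand
    private volatile T index;         // materialized lazily

    public LazyIndexSketch(File file, Supplier<T> loader) {
        this.file = file;
        this.loader = loader;
    }

    // Cheap: hands out the backing file without materializing the index.
    public File file() {
        return file;
    }

    // Potentially expensive on first call: materializes the index.
    public T get() {
        T result = index;
        if (result == null) {
            synchronized (this) {
                result = index;
                if (result == null)
                    index = result = loader.get();
            }
        }
        return result;
    }
}
```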

Contributor

OK, could we add a javadoc on this method noting that the physical file pointed to by this File object may not be consistent with the in-memory copy of the index? It doesn't totally address my concern about a potential bug when someone uses this File and assumes that it represents a consistent view of the index, but I am happy with a javadoc for now. The whole index-implemented-by-a-file thing needs to be hidden away, but that belongs in a separate JIRA.

Contributor Author

@ijuma ijuma Oct 13, 2023

The indexes are memory mapped, so in theory it should be consistent for reads that go through the page cache.

Contributor Author

One last clarification: I am not opposed to improving the overall modularity of the classes in the storage layer. As you said, in a separate PR/discussion.

@ijuma
Contributor Author

ijuma commented Oct 12, 2023

Thanks for the prompt review @divijvaidya! Please note that this is still in draft because there are still failing tests. Since it was still in draft mode, I have been force pushing and so on. In the future, please let me know if you are reviewing a draft PR so I can avoid force pushes (which are painful for you as the reviewer).

TestUtils.waitUntilTrue(() => log.logStartOffset == endOffset,
"Timed out waiting for deletion of old segments")
assertEquals(1, log.numberOfSegments)

cleaner.shutdown()
closeLog(log)
Contributor Author

I noticed we did not close the log before reopening it a bit later. It didn't seem to cause any problems, but it made it more difficult to debug test failures (since the behavior is not clearly defined in this case).

Contributor

Good catch!

* The first time this is invoked, it will result in a time index lookup (including potential materialization of
* the time index).
*/
public TimestampOffset readMaxTimestampAndOffsetSoFar() throws IOException {
Contributor Author

@ijuma ijuma Oct 12, 2023

I added a read prefix to this to make it clearer that it does more than read the field maxTimestampAndOffsetSoFar. The conversion from Scala to Java had originally caused some code to use the field instead of the method, which led to subtly different behavior in some cases.
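
An abridged sketch of the resulting pattern, in the style of the surrounding excerpts (simplified; lastEntry() is assumed here to be the time index's accessor for its final entry): the read prefix signals that the first call can trigger index work rather than a plain field read.

```java
private volatile TimestampOffset maxTimestampAndOffsetSoFar = TimestampOffset.UNKNOWN;

// The first call may materialize the time index to look up its last entry;
// later calls return the cached field.
public TimestampOffset readMaxTimestampAndOffsetSoFar() throws IOException {
    if (maxTimestampAndOffsetSoFar == TimestampOffset.UNKNOWN)
        maxTimestampAndOffsetSoFar = timeIndex().lastEntry();
    return maxTimestampAndOffsetSoFar;
}
```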

@ijuma
Contributor Author

ijuma commented Oct 12, 2023

I believe the tests should pass this time, let's see.

@ijuma ijuma marked this pull request as ready for review October 12, 2023 16:19
@ijuma ijuma requested a review from satishd October 12, 2023 16:20
@ijuma
Contributor Author

ijuma commented Oct 12, 2023

The Java 11 build has 3 unrelated failures:

Build / JDK 11 and Scala 2.13 / org.apache.kafka.common.network.SslTransportLayerTest.[3] tlsProtocol=TLSv1.3, useInlinePem=false
Build / JDK 11 and Scala 2.13 / org.apache.kafka.streams.integration.NamedTopologyIntegrationTest.shouldAddNamedTopologyToRunningApplicationWithSingleInitialNamedTopology()
Build / JDK 11 and Scala 2.13 / org.apache.kafka.streams.processor.internals.StreamsAssignmentScaleTest.testFallbackPriorTaskAssignorLargePartitionCount

This is ready for review.

@ijuma
Contributor Author

ijuma commented Oct 13, 2023

Test failures:

Build / JDK 17 and Scala 2.13 / kafka.api.PlaintextConsumerTest.testSubsequentPatternSubscription()
Build / JDK 21 and Scala 2.13 / kafka.api.AuthorizerIntegrationTest.testAuthorizeByResourceTypePrefixedResourceDenyDominate(String).quorum=zk
Build / JDK 21 and Scala 2.13 / kafka.api.ConsumerBounceTest.testCloseDuringRebalance()
Build / JDK 21 and Scala 2.13 / kafka.api.DelegationTokenEndToEndAuthorizationWithOwnerTest.testCreateTokenForOtherUserFails(String).quorum=kraft
Build / JDK 21 and Scala 2.13 / org.apache.kafka.trogdor.coordinator.CoordinatorTest.testTaskRequestWithOldStartMsGetsUpdated()
Build / JDK 11 and Scala 2.13 / integration.kafka.server.FetchFromFollowerIntegrationTest.testRackAwareRangeAssignor()
Build / JDK 11 and Scala 2.13 / org.apache.kafka.tools.MetadataQuorumCommandTest.[1] Type=Raft-Combined, Name=testDescribeQuorumReplicationSuccessful, MetadataVersion=3.7-IV0, Security=PLAINTEXT

Java 8 passed and the failures look unrelated, but I kicked off another build to get more signal.

@satishd
Member

satishd commented Oct 13, 2023

@ijuma Thanks for the PR. I will review it tomorrow.

Contributor

@divijvaidya divijvaidya left a comment

Thank you for patiently answering my comments, Ismael. This looks good to me.

@ijuma
Contributor Author

ijuma commented Oct 13, 2023

Thank you for patiently answering my comments

Thanks for the review and for paying close attention to code quality (modularity, readability, etc.) - it's important!

@ijuma
Contributor Author

ijuma commented Oct 13, 2023

@ijuma Thanks for the PR. I will review it tomorrow.

Thanks @satishd. I won't merge until Tuesday to give you a chance to review.

Member

@satishd satishd left a comment

Thanks @ijuma for the PR covering the refactoring and the cleanup. LGTM.

else if (e instanceof RuntimeException)
throw (RuntimeException) e;
else
throw new IllegalStateException("Unexpected exception thrown: " + e, e);
Member

Good change to maintain the semantics while moving to Java.

public void close() throws IOException {
if (maxTimestampAndOffsetSoFar != TimestampOffset.UNKNOWN)
Utils.swallow(LOGGER, Level.WARN, "maybeAppend", () -> timeIndex().maybeAppend(maxTimestampSoFar(), offsetOfMaxTimestampSoFar(), true));
Utils.closeQuietly(lazyOffsetIndex, "offsetIndex", LOGGER);
Member

Nice to use closeQuietly here instead of wrapping the close in swallow.
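
For context, a sketch of what a closeQuietly-style helper does; the shape is assumed from the call site above, not copied from Kafka's Utils: any exception from close() is logged and suppressed, so one index failing to close cannot prevent the others from closing.

```java
import org.slf4j.Logger;

public final class CloseQuietlySketch {
    // Closes a resource, logging rather than propagating any failure.
    public static void closeQuietly(AutoCloseable closeable, String name, Logger log) {
        if (closeable != null) {
            try {
                closeable.close();
            } catch (Throwable t) {
                log.warn("Failed to close {}", name, t);
            }
        }
    }
}
```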

@ijuma ijuma merged commit 1073d43 into apache:trunk Oct 16, 2023
1 check was pending
@ijuma ijuma deleted the log-segment-java branch October 16, 2023 13:37
AnatolyPopov pushed a commit to aiven/kafka that referenced this pull request Feb 16, 2024
KAFKA-14481: Move LogSegment/LogSegments to storage module (apache#14529)

A few notes:
* Delete a few methods from `UnifiedLog` that were simply invoking the related method in `LogFileUtils`
* Fix `CoreUtils.swallow` to use the passed in `logging`
* Fix `LogCleanerParameterizedIntegrationTest` to close `log` before reopening
* Minor tweaks in `LogSegment` for readability
 
For broader context on this change, please check:

* KAFKA-14470: Move log layer to storage module

Reviewers: Divij Vaidya <diviv@amazon.com>, Satish Duggana <satishd@apache.org>