Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KAFKA-16552: Create an internal config to control InitialTaskDelayMs in LogManager to speed up tests #15719

Merged
merged 7 commits into from Apr 20, 2024

Conversation

brandboat
Copy link
Member

related to KAFKA-16552,

Introduce a new internal config log.initial.task.delay.ms to control InitialTaskDelayMs in LogManager to speed up tests

Committer Checklist (excluded from commit message)

  • Verify design and implementation
  • Verify test coverage and CI build status
  • Verify documentation (including upgrade notes)

@brandboat brandboat marked this pull request as ready for review April 15, 2024 16:30
Copy link
Contributor

@showuon showuon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@brandboat , thanks for quick PR, overall LGTM! Left some comments.

@@ -835,6 +836,7 @@ object KafkaConfig {
.define(CreateTopicPolicyClassNameProp, CLASS, null, LOW, CreateTopicPolicyClassNameDoc)
.define(AlterConfigPolicyClassNameProp, CLASS, null, LOW, AlterConfigPolicyClassNameDoc)
.define(LogMessageDownConversionEnableProp, BOOLEAN, LogConfig.DEFAULT_MESSAGE_DOWNCONVERSION_ENABLE, LOW, LogMessageDownConversionEnableDoc)
.defineInternal(LogInitialTaskDelayMsProp, LONG, LogConfig.DEFAULT_INITIAL_TASK_DELAY_MS, LOW)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we add a doc at the last parameter for other developers know what this config is doing for?
Ex:
The initial task delay in millisecond when initializing tasks in LogManager. This should be used for testing only.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, we can set atLeast(0) in the defineInternal method. So that we don't need additional validator below.

@@ -179,6 +179,7 @@ public LogManager build() {
logDirFailureChannel,
time,
keepPartitionMetadataFile,
remoteStorageSystemEnable);
remoteStorageSystemEnable,
LogConfig.DEFAULT_INITIAL_TASK_DELAY_MS);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know currently we don't use LogManagerBuilder in the tests, but I still think we should add a initialTaskDelayMs setting and set default value to LogConfig.DEFAULT_INITIAL_TASK_DELAY_MS.

Comment on lines -416 to 418
time.sleep(logManager.InitialTaskDelayMs)
time.sleep(logManager.initialTaskDelayMs)
assertEquals(6, log.numberOfSegments, "Now there should be exactly 6 segments")
time.sleep(log.config.fileDeleteDelayMs + 1)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we create a test in LogManagerTest to verify the logManager will start these tasks after customized initialTaskDelayMs? Adding a simple test like what we see here, or maybe we can directly add verification inside these tests?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we create a test in LogManagerTest to verify the logManager will start these tasks after customized initialTaskDelayMs

Huge thanks ! I'm a little confused about the comment above, the test in LogManagerTest itself verify that tasks like log cleanup, flush logs are triggered after sleeping initialTaskDelayMs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huge thanks ! I'm a little confused about the comment above, the test in LogManagerTest itself verify that tasks like log cleanup, flush logs are triggered after sleeping initialTaskDelayMs.

Yes, and currently, all of them are setting initialTaskDelayMs=LogConfig.DEFAULT_INITIAL_TASK_DELAY_MS. I'm thinking we could have a test and set initialTaskDelayMs to, let's say, 1000, and we can verify the change is adopted by verifying if we sleep only 1000ms, the log cleanup will be triggered.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation !

or maybe we can directly add verification inside these tests?

I decided to follow the comment as you mentioned earlier, and updated the initialTaskDelayMs to 10s in LogManagerTests. Please take another look 😃

@@ -1524,7 +1524,8 @@ object TestUtils extends Logging {
logDirFailureChannel = new LogDirFailureChannel(logDirs.size),
keepPartitionMetadataFile = true,
interBrokerProtocolVersion = interBrokerProtocolVersion,
remoteStorageSystemEnable = remoteStorageSystemEnable)
remoteStorageSystemEnable = remoteStorageSystemEnable,
initialTaskDelayMs = LogConfig.DEFAULT_INITIAL_TASK_DELAY_MS)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the goal is to speed up tests, shouldn't we use a lower value here?

Copy link
Member Author

@brandboat brandboat Apr 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your comment @soarez 😃 ! If I understand correctly TestUtils#createLogManager use MockTime and the clock would advance immediately after invoking the time#sleep method in the test, pointing to the corresponding sleep time thereafter. The goal in this pr is to introduce a new internal config to speed up for tests like e2e/integration tests which can't use MockTime as above.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Thanks for clarifying

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No problem, thanks for reviewing the pr !

Copy link
Contributor

@chia7712 chia7712 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@brandboat thanks for offering this nice function. one small comment left. overall + 1

@@ -55,6 +55,7 @@ public class LogManagerBuilder {
private Time time = Time.SYSTEM;
private boolean keepPartitionMetadataFile = true;
private boolean remoteStorageSystemEnable = false;
private long initialTaskDelayMs = LogConfig.DEFAULT_INITIAL_TASK_DELAY_MS;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add setter for it

Copy link
Contributor

@showuon showuon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@soarez
Copy link
Member

soarez commented Apr 18, 2024

Thanks for the changes.
You have some compilation errors:

[2024-04-17T15:27:16.786Z] > Task :jmh-benchmarks:compileJava
[2024-04-17T15:27:16.786Z] /home/jenkins/jenkins-agent/workspace/Kafka_kafka-pr_PR-15719/jmh-benchmarks/src/main/java/org/apache/kafka/jmh/server/CheckpointBench.java:110: error: method createLogManager in class TestUtils cannot be applied to given types;
[2024-04-17T15:27:16.786Z]         this.logManager = TestUtils.createLogManager(JavaConverters.asScalaBuffer(files),
[2024-04-17T15:27:16.786Z]                                    ^
[2024-04-17T15:27:16.786Z]   required: Seq<File>,LogConfig,ConfigRepository,CleanerConfig,MockTime,MetadataVersion,int,boolean,Option<UnifiedLog>,boolean,long
[2024-04-17T15:27:16.786Z]   found: Buffer<File>,LogConfig,MockConfigRepository,CleanerConfig,MockTime,MetadataVersion,int,boolean,Option<Object>,boolean
[2024-04-17T15:27:16.786Z]   reason: actual and formal argument lists differ in length
[2024-04-17T15:27:16.786Z] 1 error

Can you address these?

@brandboat
Copy link
Member Author

brandboat commented Apr 18, 2024

Thanks for the reminder, soarez ! Already fixed the error.

@chia7712
Copy link
Contributor

@brandboat this PR is great. However, I'd like to merge it after #15569. #15569 is a huge PR which refactor the KafkaConfig and LogConfig, and I try to alleviate the pain of fixing conflicts continually.

@brandboat
Copy link
Member Author

@brandboat this PR is great. However, I'd like to merge it after #15569. #15569 is a huge PR which refactor the KafkaConfig and LogConfig, and I try to alleviate the pain of fixing conflicts continually.

No problem ! Thanks for the notification !

@chia7712
Copy link
Contributor

@brandboat please fix the conflicts, thanks!

@chia7712 chia7712 merged commit ced79ee into apache:trunk Apr 20, 2024
1 check failed
@brandboat brandboat deleted the KAFKA-16552 branch April 20, 2024 15:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants