KAFKA-13817 Always sync nextTimeToEmit with wall clock #12166

qingwei91 · 2022-05-16T13:57:00Z

We should sync nextTimeToEmit with the wall clock on each method call so that throttling works correctly when the clock drifts.
If we don't, a significant clock drift can leave throttling ineffective for a long time, which can hurt performance.

I've added a unit test to simulate clock drift and verify my change works.
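
To illustrate the description above, here is a minimal, self-contained sketch of the throttling problem. It is not the Kafka Streams code: the `TimeTracker` class, the `tryEmitUnsynced`/`tryEmitSynced` methods, and the 1000 ms interval are all hypothetical, and the "unsynced" variant is only an assumption about how a throttle that initialises `nextTimeToEmit` once might behave. The scenario is one call at t=1000 ms, after which the wall clock jumps to t=100000 ms and ten calls arrive 1 ms apart.

```java
// Sketch only: simplified stand-ins, not the actual KStreamKStreamJoin implementation.
class EmitThrottleSketch {

    /** Hypothetical stand-in for a shared time tracker. */
    static final class TimeTracker {
        final long emitIntervalMs;
        long nextTimeToEmit = 0L;

        TimeTracker(long emitIntervalMs) {
            this.emitIntervalMs = emitIntervalMs;
        }
    }

    /** Assumed "before" behaviour: initialise once, then only advance by the interval. */
    static boolean tryEmitUnsynced(TimeTracker t, long nowMs) {
        if (nowMs < t.nextTimeToEmit) {
            return false; // throttled
        }
        if (t.nextTimeToEmit == 0L) {
            t.nextTimeToEmit = nowMs;
        }
        t.nextTimeToEmit += t.emitIntervalMs;
        return true;
    }

    /** "After" behaviour: re-sync with the wall clock on every emit, then advance. */
    static boolean tryEmitSynced(TimeTracker t, long nowMs) {
        if (nowMs < t.nextTimeToEmit) {
            return false; // throttled
        }
        t.nextTimeToEmit = nowMs + t.emitIntervalMs;
        return true;
    }

    /** Counts emits for one call at t=1000, then ten calls right after a jump to t=100000. */
    static int countEmits(boolean synced) {
        TimeTracker tracker = new TimeTracker(1_000L);
        long[] callTimes = new long[11];
        callTimes[0] = 1_000L;
        for (int i = 1; i <= 10; i++) {
            callTimes[i] = 100_000L + (i - 1); // wall clock has jumped forward
        }
        int emits = 0;
        for (long now : callTimes) {
            boolean emitted = synced ? tryEmitSynced(tracker, now) : tryEmitUnsynced(tracker, now);
            if (emitted) {
                emits++;
            }
        }
        return emits;
    }

    public static void main(String[] args) {
        System.out.println("unsynced emits: " + countEmits(false)); // 11: no throttling after the jump
        System.out.println("synced emits: " + countEmits(true));    // 2: one per interval
    }
}
```

In the unsynced variant, `nextTimeToEmit` trails the wall clock by roughly the size of the jump and only catches up by one interval per call, so every call passes the check until it does. Re-syncing on each emit removes that lag.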

Committer Checklist (excluded from commit message)

Verify design and implementation
Verify test coverage and CI build status
Verify documentation (including upgrade notes)


qingwei91 · 2022-05-16T13:58:33Z

streams/src/main/java/org/apache/kafka/streams/kstream/internals/KStreamKStreamJoin.java

+
+            // Ensure `nextTimeToEmit` is synced with `currentSystemTimeMs`, if we dont set it everytime,
+            // they can get out of sync during a clock drift
+            sharedTimeTracker.nextTimeToEmit = internalProcessorContext.currentSystemTimeMs();


Is it OK to have comments here? It wasn't obvious to me at first what this piece of code was doing, so I thought a comment might help. I don't feel strongly about it; please let me know if you'd like it removed.

I'm ok with the comment

qingwei91 · 2022-05-16T14:00:58Z

streams/src/test/java/org/apache/kafka/streams/kstream/internals/KStreamKStreamJoinTest.java

@@ -333,6 +352,87 @@ public void shouldJoinWithCustomStoreSuppliers() {
        runJoin(streamJoined.withOtherStoreSupplier(otherStoreSupplier), joinWindows);
    }

+    @Test
+    public void shouldThrottleEmitNonJoinedOuterRecordsEvenWhenClockDrift() {


This test is quite convoluted because it relies on the low-level API. This appears to be the first test here to do so (the other tests rely on the higher-level API). Is this acceptable?

I resorted to this approach because we need to manipulate TimeTracker, which isn't available in the high-level API, and I don't feel comfortable making a larger change to the codebase.

Please let me know if you think there's a better way.

It's a minor improvement to the actual code, so it might be OK not to add a test for it? Otherwise, I don't have a proposal for better code... It's in the guts, so it's messy (and thus maybe not worth it?) to test.

There is KStreamWindowAggregateTest#shouldEmitWithLargeInterval() that tests a similar thing.

I am happy to defer to you or fellow contributors/maintainers.

My personal view is that the behavior change is quite subtle, so having a test to codify it is useful, but if we are happy to merge it without a unit test, I am happy too.

Hi @qingwei91, thanks for the fix and the great test coverage! Regarding test complexity, can you do something similar to https://github.com/apache/kafka/blob/trunk/streams/src/test/java/org/apache/kafka/streams/kstream/internals/KStreamWindowAggregateTest.java#L768 to test time drift? Instead of mocking low-level stores, can you check the final results?

Thanks for the advice, I will try to mimic that.

Hi @lihaosky , I changed the test to this: 26f6fa6

Is it ok?

lihaosky · 2022-07-06T17:32:32Z

I can also take a look by end of this week.

lihaosky

Thanks @qingwei91 ! The fix looks good. Just one comment about the test to see if we can make it simpler.

lihaosky · 2022-07-21T23:09:51Z

streams/src/main/java/org/apache/kafka/streams/kstream/internals/KStreamKStreamJoin.java

+
+            // Ensure `nextTimeToEmit` is synced with `currentSystemTimeMs`, if we dont set it everytime,
+            // they can get out of sync during a clock drift
+            sharedTimeTracker.nextTimeToEmit = internalProcessorContext.currentSystemTimeMs();


I'm ok with the comment

mjsax · 2022-11-02T23:48:57Z

@qingwei91 -- What is the status of this PR? It seems there are open comments that need to be addressed. It would be great if we could push this over the finish line.

qingwei91 · 2022-11-03T21:36:28Z

@mjsax sorry, I will try to pick this back up this weekend

lihaosky

Thanks @qingwei91 ! LGTM. Failed test doesn't seem related.

lihaosky · 2022-11-07T21:06:51Z

@mjsax can help approve and merge as a committer.

mjsax · 2022-12-28T20:33:05Z

Thanks for the PR! Merged to trunk.

Reviewers: Matthias J. Sax <matthias@confluent.io>, Hao Li <hli@confluent.io>


qingwei91 marked this pull request as ready for review May 16, 2022 18:06

qingwei91 requested a review from mjsax June 30, 2022 12:27

lihaosky reviewed Jul 21, 2022


Address comment, dont use mock in test

26f6fa6

qingwei91 force-pushed the qing/kafka-13817 branch from 637b1f0 to 26f6fa6 Compare November 6, 2022 16:56

lihaosky approved these changes Nov 7, 2022


mjsax added the streams label Dec 28, 2022

mjsax merged commit 9c6c6bf into apache:trunk Dec 28, 2022

guozhangwang pushed a commit to guozhangwang/kafka that referenced this pull request Jan 25, 2023

KAFKA-13817 Always sync nextTimeToEmit with wall clock (apache#12166)

5256225

Reviewers: Matthias J. Sax <matthias@confluent.io>, Hao Li <hli@confluent.io>

