
[FLINK-18119][table-runtime-blink] Retract old records in time range … #12680

Closed
wants to merge 7 commits

Conversation

hyeonseop-lee
Contributor

…bounded preceding functions

What is the purpose of the change

This fixes a bug in time range bounded preceding functions where old records that are no longer required are retracted only when a new record with the same key arrives. The fix prevents unbounded state growth, especially when the key space mutates over time.

Brief change log

  • Register a retraction timer when a new record arrives
  • Retract all records when the timer fires and no newer record has arrived (see the sketch below)
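
A minimal sketch of the timer-based retraction described above, for readers unfamiliar with the pattern. This is not the PR's actual code: the class RetractOnIdleFunction, the state lastTsState, and the parameter precedingOffset are illustrative assumptions, and the real bounded preceding functions keep full aggregation state rather than a single timestamp.

```java
// Illustrative only: registers a per-key processing-time timer when a record
// arrives and clears the key's state once the time range has fully passed
// without a newer record. Names here are hypothetical, not from the PR.
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

public class RetractOnIdleFunction extends KeyedProcessFunction<String, Long, Long> {

    private final long precedingOffset; // width of the preceding time range, in ms
    private transient ValueState<Long> lastTsState; // processing time of the latest record per key

    public RetractOnIdleFunction(long precedingOffset) {
        this.precedingOffset = precedingOffset;
    }

    @Override
    public void open(Configuration parameters) {
        lastTsState = getRuntimeContext().getState(
                new ValueStateDescriptor<>("lastTs", Types.LONG));
    }

    @Override
    public void processElement(Long value, Context ctx, Collector<Long> out) throws Exception {
        long now = ctx.timerService().currentProcessingTime();
        lastTsState.update(now);
        // Register a timer that fires after the record falls out of the range;
        // if no newer record arrives before then, the key's state is cleared.
        ctx.timerService().registerProcessingTimeTimer(now + precedingOffset + 1);
        out.collect(value);
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<Long> out) throws Exception {
        Long lastTs = lastTsState.value();
        // Clean up only if no record arrived after the one that registered this timer.
        if (lastTs != null && lastTs + precedingOffset + 1 <= timestamp) {
            lastTsState.clear();
        }
    }
}
```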

Verifying this change

This change is already covered by existing tests, such as OverWindowITCase.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): yes
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn/Mesos, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no
  • If yes, how is the feature documented? not applicable

@flinkbot
Collaborator

Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community
to review your pull request. We will use this comment to track the progress of the review.

Automated Checks

Last check on commit a71ca38 (Tue Jun 16 11:22:27 UTC 2020)

Warnings:

  • No documentation files were touched! Remember to keep the Flink docs up to date!

Mention the bot in a comment to re-run the automated checks.

Review Progress

  • ❓ 1. The [description] looks good.
  • ❓ 2. There is [consensus] that the contribution should go into Flink.
  • ❓ 3. Needs [attention] from.
  • ❓ 4. The change fits into the overall [architecture].
  • ❓ 5. Overall code [quality] is good.

Please see the Pull Request Review Guide for a full explanation of the review process.


The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.

Bot commands
The @flinkbot bot supports the following commands:

  • @flinkbot approve description to approve one or more aspects (aspects: description, consensus, architecture and quality)
  • @flinkbot approve all to approve all aspects
  • @flinkbot approve-until architecture to approve everything until architecture
  • @flinkbot attention @username1 [@username2 ..] to require somebody's attention
  • @flinkbot disapprove architecture to remove an approval you gave earlier

@flinkbot
Collaborator

flinkbot commented Jun 16, 2020

CI report:

Bot commands

The @flinkbot bot supports the following commands:
  • @flinkbot run travis re-run the last Travis build
  • @flinkbot run azure re-run the last Azure build

@KurtYoung
Contributor

Thanks for your contribution. Could you add a unit test for this issue? This can help ensure we don't reintroduce this bug in the future.

@hyeonseop-lee
Contributor Author

@KurtYoung Added unit tests and verified that the tests fail without the fix.

@KurtYoung
Contributor

Since I'm not super familiar with over aggregate, cc @wuchong to give another review.

@libenchao
Member

@protos37 Thanks for your contribution. The changes LGTM in general; I only left some minor comments.
Let's wait for @wuchong's final review.

@wuchong
Member

Thanks for the great work @protos37, nice catch! The changes look good to me in general.

But I would like to go further with this PR. Actually, I don't think we need state TTL (TableConfig.setIdleStateRetentionTime) for bounded over aggregates. A bounded over aggregate is just like a processing-/event-time interval join or window aggregation: the state size is bounded and stable. The operator should expire state automatically without losing correctness. We also didn't introduce state TTL for interval join and window aggregation. Therefore, I think we can remove the state TTL logic in ProcTimeRangeBoundedPrecedingFunction and ProcTimeRowsBoundedPrecedingFunction, which means they shouldn't extend KeyedProcessFunctionWithCleanupState. What do you think?
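
To make the "expire state automatically without losing correctness" argument concrete, here is a hedged event-time sketch. It is not the merged code: the class EventTimeRangeCleanup, the state inputState, and precedingOffset are illustrative, and the aggregation/emission logic of the real RowTimeRangeBoundedPrecedingFunction is omitted. The point is that a watermark-driven timer at ts + precedingOffset + 1 is enough to know when a row can never again fall into any record's preceding range.

```java
// Illustrative event-time cleanup driven purely by watermarks, without any
// idle-state retention / state TTL. Names are hypothetical, not Flink's code.
import org.apache.flink.api.common.state.MapState;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.types.Row;
import org.apache.flink.util.Collector;

import java.util.ArrayList;
import java.util.List;

public class EventTimeRangeCleanup extends KeyedProcessFunction<String, Row, Row> {

    private final long precedingOffset; // width of the preceding range, in ms
    private transient MapState<Long, List<Row>> inputState; // rows bucketed by event time

    public EventTimeRangeCleanup(long precedingOffset) {
        this.precedingOffset = precedingOffset;
    }

    @Override
    public void open(Configuration parameters) {
        inputState = getRuntimeContext().getMapState(new MapStateDescriptor<>(
                "inputState", Types.LONG, Types.LIST(Types.GENERIC(Row.class))));
    }

    @Override
    public void processElement(Row row, Context ctx, Collector<Row> out) throws Exception {
        long ts = ctx.timestamp(); // assumes event-time timestamps are assigned upstream
        List<Row> rows = inputState.get(ts);
        if (rows == null) {
            rows = new ArrayList<>();
        }
        rows.add(row);
        inputState.put(ts, rows);
        // ... aggregate over [ts - precedingOffset, ts] and emit results (omitted) ...
        // Register the point at which this row leaves every possible preceding range.
        ctx.timerService().registerEventTimeTimer(ts + precedingOffset + 1);
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<Row> out) throws Exception {
        // The watermark has passed `timestamp`, so rows with event time
        // <= timestamp - precedingOffset - 1 can never be needed again.
        long expired = timestamp - precedingOffset - 1;
        List<Long> expiredKeys = new ArrayList<>();
        for (Long ts : inputState.keys()) {
            if (ts <= expired) {
                expiredKeys.add(ts);
            }
        }
        for (Long ts : expiredKeys) {
            inputState.remove(ts);
        }
    }
}
```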

@hyeonseop-lee
Contributor Author

@wuchong Considering the cases you've mentioned, it totally makes sense to me to remove the state TTL from time range bounded over aggregations. Thank you for the opinion; I'd be glad to work on that.

// update timestamp and register timer if needed
Long curCleanupTimestamp = cleanupTsState.value();
if (curCleanupTimestamp == null || curCleanupTimestamp < cleanupTimestamp) {
    // we don't delete the existing timer, since doing so may delete a timer used for data processing
Member

This may cause performance problems if we register a timer for each record, because each timer is an entry in state. A better solution might be to use the InternalTimerService provided by AbstractStreamOperator, which can register timers by namespace; we could then separate the namespaces for cleanup and data processing.

Besides, it would also be better if we made the cleanup timestamp a range instead of a point: e.g. if the current cleanup timer is already in (timestamp + precedingOffset, timestamp + precedingOffset * 1.5) (similar to CleanupState#registerProcessingCleanupTimer), then we don't need to register a new one. This avoids a remove/register per record and is friendlier to the state backend.

This might be a big refactoring, so I'm fine with adding a TODO comment here and creating a follow-up issue to do that.
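
A minimal sketch of the suggested range-based registration, loosely modeled on the idea behind CleanupState#registerProcessingCleanupTimer. The helper name RangedCleanupTimer and the hard-coded 1.5 factor mirror the discussion above but are assumptions, not the code that was eventually merged.

```java
import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.streaming.api.TimerService;

/** Hypothetical helper: only registers a new cleanup timer when the existing one fires too early. */
public final class RangedCleanupTimer {

    private RangedCleanupTimer() {}

    public static void register(
            TimerService timerService,
            ValueState<Long> cleanupTsState,
            long currentTime,
            long precedingOffset) throws Exception {

        long minCleanupTime = currentTime + precedingOffset + 1;
        long maxCleanupTime = currentTime + (long) (precedingOffset * 1.5) + 1;

        Long curCleanupTime = cleanupTsState.value();
        if (curCleanupTime == null || curCleanupTime < minCleanupTime) {
            // The registered cleanup timer (if any) would fire before this record's range
            // has fully passed, so register a later one. The old timer is left in place
            // because deleting by timestamp could remove a timer still needed for data processing.
            timerService.registerProcessingTimeTimer(maxCleanupTime);
            cleanupTsState.update(maxCleanupTime);
        }
        // Otherwise the stored cleanup time already lies in [minCleanupTime, maxCleanupTime],
        // so registering another timer for this record is unnecessary.
    }
}
```

In such a scheme, register would be called from processElement with the current processing time, and the onTimer cleanup would compare the firing timestamp against the stored cleanup time before clearing state.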

Contributor Author

About having the cleanup timestamp as a range: if I understood correctly, this is a tradeoff between immediate state reduction and timer-related overhead. Since we don't have a specific criterion like maxRetentionTime, how should we choose the appropriate slack for cleanup? Is it okay to go with 1.5 times, as you mentioned?

Member

Yes. This is a tradeoff to avoid too many timers. But the factor of 1.5 is up for discussion.

@hyeonseop-lee
Contributor Author

@wuchong Applied the ranged timer timestamp. The timer namespace change seems to be a big refactoring, as you said, so I left it as a TODO.

@wuchong
Member

Thanks for the update. LGTM.

@wuchong
Member

Oops, we need to update OverWindowHarnessTest because we changed the state TTL behavior; otherwise, the tests fail.

@wuchong
Member

+1 to merge once the build passes.

@hyeonseop-lee
Contributor Author

It seems that the e2e test failed due to a timeout. Is there anything I can do?

@wuchong
Member

wuchong commented Jun 19, 2020

You can rebase onto master and force-push to trigger the build again.

wuchong pushed a commit to wuchong/flink that referenced this pull request Jun 19, 2020
…for time range bounded over aggregation
@hyeonseop-lee
Contributor Author

Rebased and force-pushed the branch. I hope it didn't break anything you were doing.

@wuchong
Member

wuchong commented Jun 20, 2020

@wuchong
Member

wuchong commented Jun 20, 2020

Will merge it.

@wuchong closed this in 22abfe2 on Jun 20, 2020
wuchong pushed a commit that referenced this pull request Jun 20, 2020
…for time range bounded over aggregation

This changes the state expiration behavior for RowTimeRangeBoundedPrecedingFunction and ProcTimeRangeBoundedPrecedingFunction. In the previous version, we used TableConfig.setIdleStateRetentionTime to clean up state after it had been idle for some time. However, a bounded over aggregation is just like a processing-/event-time interval join or window aggregation: the state size should be bounded and stable. The operator should expire state automatically based on watermarks and processing time without losing correctness.

This closes #12680
@hyeonseop-lee deleted the flink-18119 branch on June 23, 2020
zhangjun0x01 pushed a commit to zhangjun0x01/flink that referenced this pull request Jul 8, 2020
…for time range bounded over aggregation