Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FLINK-30570] Fix error inferred partition condition #21622

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

Aitozi
Copy link
Contributor

@Aitozi Aitozi commented Jan 9, 2023

What is the purpose of the change

This PR is meant to fix the unexpected partition pruning. It fix in two parts:

  • The partition condition should contains the InputRef of the partition field. Otherwise, it will be grouped to non-partition predicates
  • The partition condition should only accept the deterministic predicates.

Verifying this change

Some tests are added to verify it.

@flinkbot
Copy link
Collaborator

flinkbot commented Jan 9, 2023

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

@godfreyhe
Copy link
Contributor

cc @swuferhong

Copy link
Contributor

@swuferhong swuferhong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution @Aitozi, I go through the code and left some comments

Comment on lines +1222 to 1248
remainingPartitions = Collections.emptyList();
this.data.put(Collections.emptyMap(), Collections.emptyList());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this cast affect the final result of tests? Or you think it is meaningless?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This cast will fail, because a map can not be cast to a list (it fails when I running test that pushing the empty list partition down)

@@ -209,7 +209,7 @@ object RexNodeExtractor extends Logging {

val (partitionPredicates, nonPartitionPredicates) =
conjunctions.partition(isSupportedPartitionPredicate(_, partitionFieldNames, inputFieldNames))
(partitionPredicates, nonPartitionPredicates)
(partitionPredicates.filter(p => RexUtil.isDeterministic(p)), nonPartitionPredicates)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the better way to implement this code is to judge the isDeterministic() in isSupportedPartitionPredicate instead of adding filter after getting partitionPredicates. These rexNode which is deterministic need to be added into the list of nonPartitionPredicates.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, I will fix this.

Comment on lines 274 to 282
val inputRefFinder = new InputRefVisitor
predicate.accept(inputRefFinder)
// if no fields reached, it's not partition condition
if (inputRefFinder.getFields.length == 0) {
return false
}
try {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no need to new an InputRefVisitor. I think you need to deal it in isSupportedPartitionPredicate.visitor. Changing the logic of visitor.visitInputRef and override visitRexCall if needed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea, I will fix this.

Comment on lines 151 to 160

@Test
def testRandCondition(): Unit = {
util.verifyRelPlan("SELECT * FROM PartitionableTable WHERE rand(1) < 0.001")
}

@Test
def testRandCondition2(): Unit = {
util.verifyRelPlan("SELECT * FROM PartitionableTable WHERE rand(part2) < 0.001")
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are the same tests as tests you added in PushPartitionIntoLegacyTableSourceScanRuleTest. I think there is no need to add these tests. WDYT?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mainly follow the suggestion as here: #12966 (review)
Added the test in the rule test PushPartitionIntoLegacyTableSourceScanRuleTest and PartitionableSourceTest

Array("p"))
Assert.assertTrue(p.isEmpty)
Assert.assertTrue(nonP.isEmpty)
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can it be combined into one test? Roughly speaking, these tests are similar logic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I combine the three test to two, and added some comments.

@Aitozi
Copy link
Contributor Author

Aitozi commented Feb 19, 2023

Hi @swuferhong , thank you for helping review this PR, I have fixed your comment, please take a look again.

@Aitozi
Copy link
Contributor Author

Aitozi commented Feb 20, 2023

The failed CI is caused by: https://issues.apache.org/jira/browse/FLINK-31120

@Aitozi
Copy link
Contributor Author

Aitozi commented Feb 20, 2023

@flinkbot run azure

@Aitozi Aitozi requested review from swuferhong and removed request for godfreyhe February 20, 2023 13:26
@Aitozi
Copy link
Contributor Author

Aitozi commented Feb 23, 2023

@swuferhong could you help take a look again?

@swuferhong
Copy link
Contributor

LGTM, cc @godfreyhe

@Aitozi
Copy link
Contributor Author

Aitozi commented Apr 14, 2023

@godfreyhe Can you help take a final look on this PR ? Or the patch need to keep update to fix conflict

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants