Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DRILL-6865: Query returns wrong result when filter pruning happens #1552

Closed
wants to merge 2 commits into from

Conversation

vvysotskyi
Copy link
Member

This PR contains two commits:

  • The first commit contains changes to preserve predicates from filter condition which weren't used for filter pushdown to avoid the case when the filter is pruned. Instead of pruning whole the filter, only predicates which were used in the row group filtering are removed. Please note, that this problem happened only for the case when row group fully matches to the predicates which are used in the filter pushdown.
  • The second commit contains changes to remove filter from the plan when parquet table has a single row group and fully matches the filter.

For problem descriptions please see DRILL-6865.

if (groupScan.isMatchAllRowGroups()) {
RelNode child = project == null ? scan : project;
// if current row group fully matches filter,
// but row group pruning wasn't happened, removes filter.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did not happen, remove the filter

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, changed.

@@ -85,6 +85,8 @@

private List<EndpointAffinity> endpointAffinities;
private ParquetGroupScanStatistics parquetGroupScanStatistics;
// whether all row groups of this group scan fully matches the filter
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fully match

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed

// but row group pruning wasn't happened, removes filter.
if (nonConvertedPredList.size() == 0) {
call.transformTo(child);
} else if (nonConvertedPredList.size() < predList.size()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Won't be else enough? Why check that non converted list is smaller?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the case when nonConvertedPredList.size() == predList.size(), none of the predicates participated in filter pushdown, so call.transformTo() shouldn't be called for this case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This clear, my concern was second condition } else if (nonConvertedPredList.size() < predList.size()). Why we cannot use else instead?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, else will include both nonConvertedPredList.size() < predList.size() and nonConvertedPredList.size() == predList.size() cases, but as I pointed in the comment above, we shouldn't do anything for the last case.

@@ -63,6 +63,7 @@
static final Logger logger = LoggerFactory.getLogger(ParquetFilterBuilder.class);

private final UdfUtilities udfUtilities;
private final boolean omitUnsupportedExprs;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add javadoc explaining cases when we want to omit unsupported expressions and when we don't.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, added.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please describe cases when we need this flag.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added to its Javadoc case when it should be used.

Copy link
Member Author

@vvysotskyi vvysotskyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arina-ielchiieva, thanks for the review, I have addressed your comments, could you please take a look again?

@@ -85,6 +85,8 @@

private List<EndpointAffinity> endpointAffinities;
private ParquetGroupScanStatistics parquetGroupScanStatistics;
// whether all row groups of this group scan fully matches the filter
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed

@@ -63,6 +63,7 @@
static final Logger logger = LoggerFactory.getLogger(ParquetFilterBuilder.class);

private final UdfUtilities udfUtilities;
private final boolean omitUnsupportedExprs;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, added.

if (groupScan.isMatchAllRowGroups()) {
RelNode child = project == null ? scan : project;
// if current row group fully matches filter,
// but row group pruning wasn't happened, removes filter.
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks, changed.

// but row group pruning wasn't happened, removes filter.
if (nonConvertedPredList.size() == 0) {
call.transformTo(child);
} else if (nonConvertedPredList.size() < predList.size()) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the case when nonConvertedPredList.size() == predList.size(), none of the predicates participated in filter pushdown, so call.transformTo() shouldn't be called for this case.

Copy link
Member Author

@vvysotskyi vvysotskyi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arina-ielchiieva, I have made required changes, could you please take a look?

// but row group pruning wasn't happened, removes filter.
if (nonConvertedPredList.size() == 0) {
call.transformTo(child);
} else if (nonConvertedPredList.size() < predList.size()) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, else will include both nonConvertedPredList.size() < predList.size() and nonConvertedPredList.size() == predList.size() cases, but as I pointed in the comment above, we shouldn't do anything for the last case.

@@ -63,6 +63,7 @@
static final Logger logger = LoggerFactory.getLogger(ParquetFilterBuilder.class);

private final UdfUtilities udfUtilities;
private final boolean omitUnsupportedExprs;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added to its Javadoc case when it should be used.

@arina-ielchiieva
Copy link
Member

+1, LGTM.

@asfgit asfgit closed this in 99a3d76 Nov 26, 2018
mattpollack pushed a commit to mattpollack/drill that referenced this pull request Feb 25, 2019
lushuifeng pushed a commit to lushuifeng/drill that referenced this pull request Jun 21, 2019
xiangt920 pushed a commit to xiangt920/drill that referenced this pull request Dec 26, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants