Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CORE][VL] Fix BatchScanExec filter pushdown logic #4132

Merged
merged 1 commit into from
Jan 4, 2024

Conversation

liujiayi771
Copy link
Contributor

What changes were proposed in this pull request?

Before #3843 the filters pushdown was incorporated into the runtimeFilters of BatchScanExecTransformer, which was ineffective. Hence, #3843 removed this logic. Now, the logic to pushdown filters in BatchScanExecTransformer has been restored and correctly set into a new pushdownFilters variable. This will be accessed when the filterExprs method is called, and afterwards, it can be pushdown into Velox's TableScan.

How was this patch tested?

Add new VeloxScanSuite test case to test subquery filter pushdown in tpch q22.

Copy link

Thanks for opening a pull request!

Could you open an issue for this pull request on Github Issues?

https://github.com/oap-project/gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

See also:

Copy link

Run Gluten Clickhouse CI

Copy link

Run Gluten Clickhouse CI

@liujiayi771
Copy link
Contributor Author

liujiayi771 commented Dec 20, 2023

Before(Filter has been executed):
image
After(Filter has not been executed):
image

@zhouyuan
Copy link
Contributor

@liujiayi771 just pushed the Velox rebase for 12-20, you may need to do a rebase

-yuan

Copy link

Run Gluten Clickhouse CI

1 similar comment
Copy link

Run Gluten Clickhouse CI

@liujiayi771
Copy link
Contributor Author

@rui-mo Could you help review?

rui-mo
rui-mo previously approved these changes Jan 2, 2024
Copy link

github-actions bot commented Jan 3, 2024

Run Gluten Clickhouse CI

Copy link
Contributor

@PHILO-HE PHILO-HE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a trivial comment. Thanks!

@@ -392,7 +393,45 @@ object FilterHandler extends PredicateHelper {
fileSourceScan,
reuseSubquery,
extraFilters = leftFilters)
case batchScan: BatchScanExec =>
val leftFilters = batchScan.scan match {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think "remaining" is more formal and less misleading than "left". Could you revise the naming here and at some other places? cc @rui-mo

Copy link

github-actions bot commented Jan 3, 2024

Run Gluten Clickhouse CI

1 similar comment
Copy link

github-actions bot commented Jan 3, 2024

Run Gluten Clickhouse CI

case other =>
throw new UnsupportedOperationException(s"${other.getClass.toString} is not supported.")
}
}

object ScanHandler {
def getBatchScanTransformer(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move this method into BatchScanExecTransformer.scala? Thanks!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is better to put it in ScanTransformerFactory, I will organize the code and only provide a createBatchScanTransformer method in the ScanTransformerFactory.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok to me. Thanks!

Copy link

github-actions bot commented Jan 3, 2024

Run Gluten Clickhouse CI

Copy link

github-actions bot commented Jan 3, 2024

Run Gluten Clickhouse CI

Copy link

github-actions bot commented Jan 3, 2024

Run Gluten Clickhouse CI

Copy link
Contributor

@PHILO-HE PHILO-HE left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will merge the patch soon. Thanks!

@PHILO-HE PHILO-HE merged commit 1953283 into apache:main Jan 4, 2024
19 checks passed
@GlutenPerfBot
Copy link
Contributor

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

query log/native_4132_time.csv log/native_master_01_03_2024_38de952fe_time.csv difference percentage
q1 32.87 32.78 -0.095 99.71%
q2 24.93 23.75 -1.186 95.24%
q3 37.79 37.47 -0.320 99.15%
q4 39.69 40.67 0.979 102.47%
q5 71.64 73.42 1.772 102.47%
q6 7.07 6.78 -0.288 95.92%
q7 88.67 85.99 -2.679 96.98%
q8 86.31 88.32 2.018 102.34%
q9 120.49 126.96 6.475 105.37%
q10 42.79 44.95 2.168 105.07%
q11 20.26 19.97 -0.285 98.59%
q12 26.16 28.06 1.900 107.26%
q13 46.41 46.25 -0.168 99.64%
q14 14.82 17.13 2.308 115.58%
q15 29.16 28.28 -0.874 97.00%
q16 15.14 15.84 0.708 104.68%
q17 102.52 102.58 0.054 100.05%
q18 152.39 148.96 -3.426 97.75%
q19 12.70 14.29 1.595 112.57%
q20 27.99 28.11 0.124 100.44%
q21 225.46 232.01 6.550 102.91%
q22 13.83 13.97 0.137 100.99%
total 1239.08 1256.55 17.469 101.41%

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants