[GLUTEN-3582][CH] Remove Not Support PushDownFilters #4624

Merged


baibaichen
Contributor

@baibaichen baibaichen commented Feb 2, 2024

This is a follow-up to #4582. This PR removes unsupported Parquet filters for the ClickHouse backend. It also refactors WholeStageTransformerSuite to add GlutenParquetFilterSuite:

  1. Introduce Arm.withResource to close files, fixing a resource leak.
  2. Introduce withDataframe for dry-running SQL.
  3. Introduce tpchSQL to simplify getting TPC-H SQL from resources.
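
The first item describes what is usually called the loan pattern: a helper takes ownership of a resource, lends it to a function, and closes it in a `finally` block. The following is a minimal sketch of that idea, not the code from this PR. The object name `LoanPatternSketch`, the signature of `withResource`, and the `readSql` helper are all assumptions made for illustration; the actual `Arm.withResource` and `tpchSQL` in Gluten may look different.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, Path}
import scala.io.Source

object LoanPatternSketch {
  // Hypothetical helper (not Gluten's Arm.withResource): runs `body` on the
  // resource and always closes the resource afterwards, even if `body` throws.
  def withResource[R <: AutoCloseable, T](resource: R)(body: R => T): T = {
    try body(resource)
    finally resource.close()
  }

  // Hypothetical stand-in for a helper that loads query text from a file.
  // The Source is closed as soon as the text has been read.
  def readSql(path: Path): String =
    withResource(Source.fromFile(path.toFile, "UTF-8"))(_.mkString)

  def main(args: Array[String]): Unit = {
    // Write a throwaway SQL file so the example is self-contained.
    val tmp = Files.createTempFile("query", ".sql")
    Files.write(tmp, "SELECT 1".getBytes(StandardCharsets.UTF_8))
    try println(readSql(tmp))
    finally Files.delete(tmp)
  }
}
```

Without such a helper, a `Source` opened in a test and never closed keeps its file handle until garbage collection, which is the kind of leak the first item refers to. On Scala 2.13 and later, `scala.util.Using` in the standard library offers similar behavior.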

What changes were proposed in this pull request?

(Please fill in changes proposed in this fix)

(Fixes: #3582)

How was this patch tested?

Added new unit tests.

Refactor WholeStageTransformerSuite to add GlutenParquetFilterSuite
1. Introduce Arm.withResource to close files, fixing a resource leak.
2. Introduce withDataframe for dry-running SQL.
3. Introduce tpchSQL to simplify getting TPC-H SQL from resources.

github-actions bot commented Feb 2, 2024

#3582


github-actions bot commented Feb 2, 2024

Run Gluten Clickhouse CI


github-actions bot commented Feb 2, 2024

Run Gluten Clickhouse CI

Contributor

@zzcclp zzcclp left a comment


LGTM

@baibaichen baibaichen merged commit 493cffd into apache:main Feb 4, 2024
19 checks passed
@baibaichen baibaichen deleted the feature/removeNotSupportPushDownFilters branch February 4, 2024 02:04
@GlutenPerfBot
Contributor

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

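| query | log/native_4624_time.csv | log/native_master_02_03_2024_75285bcb8_time.csv | difference | percentage |
|-------|-------------------------:|------------------------------------------------:|-----------:|-----------:|
| q1    | 34.97   | 33.28   | -1.694 | 95.16%  |
| q2    | 24.32   | 24.26   | -0.055 | 99.77%  |
| q3    | 39.07   | 38.78   | -0.287 | 99.27%  |
| q4    | 37.70   | 37.64   | -0.066 | 99.83%  |
| q5    | 71.50   | 70.97   | -0.529 | 99.26%  |
| q6    | 6.95    | 8.56    | 1.608  | 123.13% |
| q7    | 82.54   | 84.29   | 1.755  | 102.13% |
| q8    | 84.90   | 84.63   | -0.264 | 99.69%  |
| q9    | 125.15  | 121.75  | -3.392 | 97.29%  |
| q10   | 42.65   | 43.29   | 0.636  | 101.49% |
| q11   | 19.97   | 20.58   | 0.613  | 103.07% |
| q12   | 26.76   | 26.43   | -0.328 | 98.77%  |
| q13   | 44.83   | 45.82   | 0.988  | 102.20% |
| q14   | 22.19   | 21.75   | -0.446 | 97.99%  |
| q15   | 26.80   | 29.07   | 2.277  | 108.50% |
| q16   | 14.24   | 14.30   | 0.057  | 100.40% |
| q17   | 101.26  | 103.31  | 2.048  | 102.02% |
| q18   | 148.01  | 149.45  | 1.442  | 100.97% |
| q19   | 14.16   | 13.09   | -1.078 | 92.39%  |
| q20   | 23.88   | 26.67   | 2.790  | 111.68% |
| q21   | 220.47  | 222.74  | 2.269  | 101.03% |
| q22   | 13.72   | 13.69   | -0.031 | 99.77%  |
| total | 1226.03 | 1234.34 | 8.311  | 100.68% |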
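
The report does not say how the last two columns are derived. Judging from the rows, they appear to be the master time minus the PR time, and the master time divided by the PR time, so a percentage above 100% would mean the PR run was faster. The sketch below encodes that reading; the formulas and the names `PerfDelta`, `difference`, and `percentage` are assumptions inferred from the data, not taken from the bot's source.

```scala
object PerfDelta {
  // Assumed relation: difference = master (baseline) time - PR time.
  def difference(prTime: Double, baselineTime: Double): Double =
    baselineTime - prTime

  // Assumed relation: percentage = baseline time / PR time, as a percent.
  def percentage(prTime: Double, baselineTime: Double): Double =
    baselineTime / prTime * 100.0

  def main(args: Array[String]): Unit = {
    // Values copied from the "total" row of the report.
    val prTotal = 1226.03
    val baselineTotal = 1234.34
    val d = difference(prTotal, baselineTotal)
    val p = percentage(prTotal, baselineTotal)
    println(f"difference=$d%.3f percentage=$p%.2f%%")
  }
}
```

Recomputing from the two-decimal times shown in the report can differ from the reported figures in the last digit, presumably because the bot works from unrounded timings.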

Development

Successfully merging this pull request may close these issues.

[CH]Improve parquet reader performacne
3 participants