[CORE] Fix miss RowToColumnar with columnar table cache in AQE #4104

Merged
2 commits merged into apache:main
Dec 20, 2023

ulysses-you
What changes were proposed in this pull request?

Since apache/spark#43484, Spark supports caching columnar batches, so the outputsColumnar of AQE can be true. This breaks the assumption that the output is always row-based. This PR makes the fallback policy compatible with this behavior.
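
To illustrate the idea, here is a minimal sketch of a fallback step that honors the requested output format instead of assuming rows. It uses toy plan classes rather than the real Spark or Gluten types, and `FallbackSketch.alignOutput` is a hypothetical name; the actual change in this PR may be structured differently. In real Spark, the corresponding pieces are `SparkPlan.supportsColumnar`, `RowToColumnarExec`, and `ColumnarToRowExec`.

```scala
// Toy stand-ins for physical plan nodes. These are NOT the real Spark or
// Gluten classes; they only model whether a node produces columnar output.
sealed trait Plan { def supportsColumnar: Boolean }
case class RowScan(name: String) extends Plan { val supportsColumnar = false }
case class ColumnarScan(name: String) extends Plan { val supportsColumnar = true }
case class RowToColumnar(child: Plan) extends Plan { val supportsColumnar = true }
case class ColumnarToRow(child: Plan) extends Plan { val supportsColumnar = false }

object FallbackSketch {
  // Hypothetical helper: after a plan has fallen back, add a transition only
  // when the plan's output format differs from what the caller expects.
  // With a columnar table cache, outputsColumnar may be true, so the
  // fallen-back row plan needs a RowToColumnar on top.
  def alignOutput(plan: Plan, outputsColumnar: Boolean): Plan =
    (plan.supportsColumnar, outputsColumnar) match {
      case (false, true) => RowToColumnar(plan) // row plan, columnar consumer
      case (true, false) => ColumnarToRow(plan) // columnar plan, row consumer
      case _             => plan                // formats already agree
    }
}

object FallbackSketchDemo {
  def main(args: Array[String]): Unit = {
    // A row-based plan whose consumer (for example a columnar cache) wants batches.
    println(FallbackSketch.alignOutput(RowScan("t"), outputsColumnar = true))
    // The same plan when the consumer expects rows: returned unchanged.
    println(FallbackSketch.alignOutput(RowScan("t"), outputsColumnar = false))
  }
}
```

The point of the sketch is that the transition is chosen from both the plan's format and the expected output format, so neither side is hard-coded to rows.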

How was this patch tested?

Added a test.

Thanks for opening a pull request!

Could you open an issue for this pull request on GitHub Issues?

https://github.com/oap-project/gluten/issues

Then could you also rename commit message and pull request title in the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

Run Gluten Clickhouse CI

yaooqinn previously approved these changes Dec 19, 2023
@PHILO-HE PHILO-HE left a comment

Looks good! Just a comment for typo. Thanks!

}
}

override def apply(plan: SparkPlan): SparkPlan = {
// By default, the outputsColumnar if always false.

typo: is always false.

Run Gluten Clickhouse CI

@PHILO-HE PHILO-HE left a comment

Thanks!

@PHILO-HE PHILO-HE merged commit f845d1f into apache:main Dec 20, 2023
17 checks passed
@ulysses-you ulysses-you deleted the cache branch December 20, 2023 03:55
@GlutenPerfBot

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

query  log/native_4104_time.csv  log/native_master_12_19_2023_521cad261_time.csv  difference  percentage
q1     32.40                     32.55                                            0.156       100.48%
q2     24.85                     25.18                                            0.337       101.36%
q3     38.69                     37.70                                            -0.987      97.45%
q4     37.89                     39.88                                            1.993       105.26%
q5     71.78                     71.42                                            -0.363      99.49%
q6     7.23                      5.41                                             -1.823      74.79%
q7     87.52                     87.04                                            -0.482      99.45%
q8     87.14                     86.34                                            -0.796      99.09%
q9     127.43                    122.28                                           -5.153      95.96%
q10    44.87                     44.36                                            -0.511      98.86%
q11    21.50                     20.12                                            -1.381      93.58%
q12    25.95                     26.74                                            0.793       103.06%
q13    46.21                     46.35                                            0.142       100.31%
q14    18.50                     17.44                                            -1.061      94.26%
q15    29.66                     29.09                                            -0.574      98.07%
q16    15.72                     15.70                                            -0.028      99.82%
q17    102.43                    102.87                                           0.441       100.43%
q18    151.62                    149.94                                           -1.678      98.89%
q19    13.92                     12.78                                            -1.138      91.83%
q20    27.90                     28.17                                            0.266       100.95%
q21    230.30                    227.31                                           -2.994      98.70%
q22    13.91                     13.95                                            0.042       100.30%
total  1257.42                   1242.62                                          -14.797     98.82%
