[VL] Fix and use flattenVector #4783

Merged: 2 commits, Mar 7, 2024

@marin-ma marin-ma commented Feb 26, 2024

This patch re-enables the flattenVector optimization.

The flattenVector optimization first landed in #4415 but was partially reverted in #4474 due to bugs on the Celeborn code path.

The Celeborn integration tests should already cover this code path.

Thanks for opening a pull request!

Could you open an issue for this pull request on GitHub Issues?

https://github.com/oap-project/gluten/issues

Then could you also rename the commit message and pull request title to the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}

See also:

@marin-ma
Contributor Author

/Benchmark Velox

@GlutenPerfBot
Contributor

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

| query | log/native_4783_time.csv | log/native_master_02_26_2024_6acd1b367_time.csv | difference | percentage |
|---|---|---|---|---|
| q1 | 35.66 | 33.25 | -2.416 | 93.23% |
| q2 | 24.03 | 24.24 | 0.208 | 100.87% |
| q3 | 37.30 | 38.32 | 1.025 | 102.75% |
| q4 | 37.29 | 37.90 | 0.614 | 101.65% |
| q5 | 68.41 | 72.46 | 4.052 | 105.92% |
| q6 | 6.61 | 6.95 | 0.340 | 105.14% |
| q7 | 82.44 | 83.72 | 1.277 | 101.55% |
| q8 | 85.66 | 86.53 | 0.869 | 101.01% |
| q9 | 117.67 | 127.60 | 9.932 | 108.44% |
| q10 | 43.04 | 47.09 | 4.046 | 109.40% |
| q11 | 20.15 | 20.47 | 0.325 | 101.61% |
| q12 | 29.16 | 28.59 | -0.565 | 98.06% |
| q13 | 45.89 | 47.07 | 1.174 | 102.56% |
| q14 | 19.59 | 18.34 | -1.250 | 93.62% |
| q15 | 28.43 | 27.73 | -0.697 | 97.55% |
| q16 | 14.50 | 14.65 | 0.144 | 100.99% |
| q17 | 99.65 | 101.67 | 2.021 | 102.03% |
| q18 | 144.97 | 147.95 | 2.983 | 102.06% |
| q19 | 13.92 | 15.14 | 1.219 | 108.76% |
| q20 | 29.59 | 27.29 | -2.304 | 92.21% |
| q21 | 225.00 | 228.47 | 3.465 | 101.54% |
| q22 | 14.76 | 13.67 | -1.088 | 92.63% |
| total | 1223.71 | 1249.09 | 25.374 | 102.07% |

@marin-ma marin-ma marked this pull request as ready for review March 4, 2024 14:06
@marin-ma
Contributor Author

marin-ma commented Mar 4, 2024

/Benchmark Velox

@@ -222,6 +222,7 @@ arrow::Status VeloxShuffleWriter::init() {

ARROW_ASSIGN_OR_RAISE(
partitioner_, Partitioner::make(options_.partitioning, numPartitions_, options_.startPartitionId));
DLOG(INFO) << "Create partitioning type: " << std::to_string(options_.partitioning);
Contributor

this looks like a debug log?

Contributor Author

Yes. DLOG ensures that the message is only printed in debug builds. We probably need this log for debugging, because the shuffle operator is sometimes omitted from the Spark UI, for example with single partitioning after a limit operator.
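
The debug-only behaviour described in this reply can be illustrated without glog. The sketch below is a stand-in under stated assumptions, not glog's implementation: the hypothetical `DEBUG_LOG` macro writes to a stream only when `NDEBUG` is not defined, and `describePartitioning` is an invented example function rather than anything in the shuffle writer.

```cpp
#include <sstream>
#include <string>

// Hypothetical stand-in for a debug-only log macro (not glog's DLOG).
// In release builds (NDEBUG defined) the statement expands to an empty
// block, so the message expression is never evaluated.
#ifdef NDEBUG
#define DEBUG_LOG(stream, msg) \
  do {                         \
  } while (false)
#else
#define DEBUG_LOG(stream, msg) \
  do {                         \
    (stream) << msg << '\n';   \
  } while (false)
#endif

// Hypothetical example: records the partitioning name only in debug builds.
std::string describePartitioning(const std::string& partitioning) {
  std::ostringstream log;
  DEBUG_LOG(log, "Create partitioning type: " << partitioning);
  return log.str();  // empty when compiled with NDEBUG
}
```

Compiled without `-DNDEBUG`, `describePartitioning("single")` returns the logged line; compiled with it, the call returns an empty string and the log statement costs nothing at run time.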

Contributor

@zhouyuan zhouyuan left a comment

👍

@marin-ma marin-ma merged commit f775d42 into apache:main Mar 7, 2024
20 checks passed
@GlutenPerfBot
Contributor

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

| query | log/native_4783_time.csv | log/native_master_03_06_2024_bddc3fd79_time.csv | difference | percentage |
|---|---|---|---|---|
| q1 | 37.58 | 38.77 | 1.188 | 103.16% |
| q2 | 24.01 | 24.36 | 0.345 | 101.44% |
| q3 | 37.71 | 39.60 | 1.894 | 105.02% |
| q4 | 37.31 | 37.58 | 0.268 | 100.72% |
| q5 | 69.39 | 69.92 | 0.532 | 100.77% |
| q6 | 7.36 | 8.34 | 0.975 | 113.25% |
| q7 | 82.86 | 84.40 | 1.536 | 101.85% |
| q8 | 85.58 | 86.73 | 1.149 | 101.34% |
| q9 | 120.26 | 119.49 | -0.775 | 99.36% |
| q10 | 44.15 | 43.18 | -0.966 | 97.81% |
| q11 | 19.55 | 20.86 | 1.312 | 106.71% |
| q12 | 25.38 | 28.05 | 2.672 | 110.53% |
| q13 | 46.70 | 44.55 | -2.150 | 95.40% |
| q14 | 19.58 | 17.03 | -2.548 | 86.98% |
| q15 | 31.00 | 28.15 | -2.851 | 90.80% |
| q16 | 14.67 | 14.07 | -0.595 | 95.94% |
| q17 | 102.10 | 101.46 | -0.642 | 99.37% |
| q18 | 140.46 | 145.26 | 4.802 | 103.42% |
| q19 | 14.99 | 13.99 | -1.005 | 93.30% |
| q20 | 26.82 | 28.21 | 1.393 | 105.19% |
| q21 | 230.66 | 224.18 | -6.479 | 97.19% |
| q22 | 13.55 | 14.99 | 1.434 | 110.58% |
| total | 1231.69 | 1233.17 | 1.489 | 100.12% |

loneylee pushed a commit to loneylee/gluten that referenced this pull request Mar 15, 2024
taiyang-li pushed a commit to bigo-sg/gluten that referenced this pull request Mar 25, 2024