[VL][CI] Fix fallback to columnar shuffle in celeborn test #4474

Merged: 2 commits, Jan 29, 2024

Conversation

PHILO-HE
Contributor

What changes were proposed in this pull request?

Celeborn is not installed correctly in CI, so the Celeborn test actually falls back to columnar shuffle.

How was this patch tested?

CI verification.


Thanks for opening a pull request!

Could you open an issue for this pull request on GitHub Issues?

https://github.com/oap-project/gluten/issues

Then could you also rename the commit message and pull request title to the following format?

[GLUTEN-${ISSUES_ID}][COMPONENT]feat/fix: ${detailed message}


Run Gluten Clickhouse CI

@zhouyuan
Contributor

@kerwin-zk could you please take a look at this? It seems TPC-DS Q51 is producing wrong results.

thanks, -yuan

@PHILO-HE
Contributor Author

PHILO-HE commented Jan 23, 2024

Posting the error message:

Executing SQL query from resource path /tpcds-queries/q51.sql...
Error running query q51.  Error: FATAL: java.lang.RuntimeException: Error while decoding: java.lang.ArithmeticException: Decimal precision 34 exceeds max precision 27

@kerwin-zk
Contributor

OK, I'll take a look.

@kerwin-zk
Contributor

@PHILO-HE @zhouyuan I checked and found that the exception is caused by part of the code introduced in PR #4415. CI passes if that part of the code is rolled back.


Run Gluten Clickhouse CI

@ulysses-you
Contributor

@kerwin-zk does this mean the Celeborn shuffle code path is currently broken?

@kerwin-zk
Contributor

@ulysses-you Yes, code containing #4415 might have issues with some corner cases when using Celeborn.

@ulysses-you
Contributor

Thank you for confirming.

@marin-ma
Contributor

@PHILO-HE Thanks for investigating this issue. Let's merge this one first to ensure the CI operates correctly. I will look into the corner case issue.

@PHILO-HE
Contributor Author

> @PHILO-HE Thanks for investigating this issue. Let's merge this one first to ensure the CI operates correctly. I will look into the corner case issue.

@marin-ma, thanks so much for your time!

PHILO-HE merged commit 7ed8e37 into apache:main on Jan 29, 2024
19 checks passed
@GlutenPerfBot
Contributor

===== Performance report for TPCH SF2000 with Velox backend, for reference only ====

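| query | log/native_4474_time.csv | log/native_master_01_28_2024_d041ce584_time.csv | difference | percentage |
|-------|--------------------------|--------------------------------------------------|------------|------------|
| q1 | 33.82 | 32.99 | -0.831 | 97.54% |
| q2 | 24.23 | 23.85 | -0.380 | 98.43% |
| q3 | 39.07 | 37.74 | -1.331 | 96.59% |
| q4 | 38.16 | 38.98 | 0.814 | 102.13% |
| q5 | 68.23 | 67.95 | -0.287 | 99.58% |
| q6 | 7.07 | 7.02 | -0.046 | 99.35% |
| q7 | 83.43 | 80.87 | -2.556 | 96.94% |
| q8 | 86.14 | 84.35 | -1.790 | 97.92% |
| q9 | 124.64 | 121.55 | -3.091 | 97.52% |
| q10 | 43.92 | 43.06 | -0.861 | 98.04% |
| q11 | 20.13 | 20.27 | 0.146 | 100.72% |
| q12 | 25.99 | 27.38 | 1.393 | 105.36% |
| q13 | 45.40 | 45.45 | 0.056 | 100.12% |
| q14 | 20.94 | 22.94 | 2.001 | 109.55% |
| q15 | 27.10 | 28.71 | 1.620 | 105.98% |
| q16 | 14.78 | 14.08 | -0.691 | 95.32% |
| q17 | 100.85 | 102.67 | 1.825 | 101.81% |
| q18 | 146.33 | 148.89 | 2.561 | 101.75% |
| q19 | 13.85 | 12.47 | -1.382 | 90.02% |
| q20 | 26.43 | 26.11 | -0.319 | 98.79% |
| q21 | 225.67 | 226.03 | 0.362 | 100.16% |
| q22 | 13.58 | 13.55 | -0.032 | 99.76% |
| total | 1229.75 | 1226.93 | -2.820 | 99.77% |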
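
For readers interpreting the report, the last two columns appear to be derived from the two timing columns as "master minus PR" and "master divided by PR". This is inferred from the numbers themselves, not stated by the bot, and the function name `compare` below is purely illustrative. A minimal Python sketch under that assumption:

```python
def compare(pr_time, master_time):
    """Return (difference, percentage) of the master run relative to the PR run.

    Assumed definitions, inferred from the report's numbers:
      difference = master_time - pr_time
      percentage = master_time / pr_time * 100
    """
    difference = master_time - pr_time
    percentage = master_time / pr_time * 100.0
    return difference, percentage


# Rounded timings copied from the report: (PR #4474 run, master run).
rows = {
    "q1": (33.82, 32.99),
    "q14": (20.94, 22.94),
    "total": (1229.75, 1226.93),
}

for name, (pr_time, master_time) in rows.items():
    diff, pct = compare(pr_time, master_time)
    print(f"{name}: difference={diff:+.2f} percentage={pct:.2f}%")
```

Because the report prints timings rounded to two decimals, recomputed values can differ from the reported ones in the last digit. Under this reading, a percentage above 100% means the PR run was faster than master for that query.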
