-
Notifications
You must be signed in to change notification settings - Fork 409
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[GLUTEN-4424][CORE] Upgrade spark version to 3.5.1 in Gluten #4822
Conversation
Thanks for opening a pull request! Could you open an issue for this pull request on Github Issues? https://github.com/oap-project/gluten/issues Then could you also rename commit message and pull request title in the following format?
See also: |
Run Gluten Clickhouse CI |
the failed tests in |
Run Gluten Clickhouse CI |
Run Gluten Clickhouse CI |
@holdenk @ulysses-you @PHILO-HE @zhouyuan Can you help to review? Thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thank you @JkSelf for the work!
shims/spark32/src/main/scala/io/glutenproject/execution/GenerateTreeStringShim.scala
Outdated
Show resolved
Hide resolved
gluten-core/src/main/scala/org/apache/spark/sql/execution/ColumnarShuffleExchangeExec.scala
Outdated
Show resolved
Hide resolved
shims/spark32/src/main/scala/io/glutenproject/execution/GenerateTreeStringShim.scala
Outdated
Show resolved
Hide resolved
Run Gluten Clickhouse CI |
Run Gluten Clickhouse CI |
Awesome, thanks for taking on the upgrade :) P.S. my pronouns are she/her :) |
Co-authored-by: Holden Karau <holden@pigscanfly.ca>
There's a new operator for partial window need to be supported, fired one issue to track this #4836 |
/Benchmark Velox |
===== Performance report for TPCH SF2000 with Velox backend, for reference only ====
|
CC: @zzcclp |
shims/spark34/src/main/scala/io/glutenproject/execution/GenerateTreeStringShim.scala
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm except one comment
Run Gluten Clickhouse CI |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some trivial comments. Could you check and see if it makes sense? Thanks!
assert(child.isInstanceOf[TransformSupport]) | ||
|
||
def stageId: Int = transformStageId | ||
|
||
def substraitPlanJsonValue: String = substraitPlanJson |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we remove the new method substraitPlanJsonValue
and just use the existing substraitPlanJson
? We may just need to declare substraitPlanJson
in the extending trait.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. I will updated in the following PRs. Thanks @PHILO-HE
* @since 1.3.0 | ||
*/ | ||
@Stable | ||
class ExtendedAnalysisException protected[sql] ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just keep one for this new exception class? Assume they have no difference. Maybe, we can move it into a common place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean to move it in shim/common? This class only need in 32/33/34. And 35 is not needed. It maybe not make sense to move to shim/common.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a strong feeling on this. Keeping the current change is OK to me. BTW, it would be better to also add some comment for these classes to let people know why they are introduced. Thanks!
===== Performance report for TPCH SF2000 with Velox backend, for reference only ====
|
What changes were proposed in this pull request?
This PR is a follow-up to PR#4425. Many thanks to @holdenk for her previous work
The mainly changes as below:
WhostageTransfomer
: Spark 3.5 has changed the parameter type of thegenerateTreeString
API in TreeNode, which preventsWhostageTransfomer
from directly overriding thegenerateTreeString
method. AGenerateTreeStringShim
trait is defined in the shim to overridegenerateTreeString
.PartitionDirectory
API andPartitionedFileUtil.splitFiles
API. To be compatible with 3.5, we have made these changes in the shim layer.ColumnarShuffleExchangeExec
: Spark 3.5 has added a new interface,advisoryPartitionSize
, inShuffleExchangeLike
.GlutenQueryTest
: Spark 3.5 has introduced ExtendedAnalysisException to replace the previous AnalysisException. The solution is to introduce ExtendedAnalysisException in versions prior to 3.5.ColumnarBuildSideRelation
andBroadcastUtils
: Spark 3.5 has modified thefromAttributes
andtoAttributes
interfaces inStructType
.How was this patch tested?
Add tpcds/h test