What is the problem the feature request solves?
COMET_EXEC_BROADCAST_ENABLED is currently disabled by default. The reason is that Comet's broadcast exchange operator, CometBroadcastExchange, is column-based, so the Spark planner adds a ColumnarToRow between CometBroadcastExchange and the downstream operator (e.g., BroadcastHashJoinExec). This causes the following error at runtime:
org.apache.spark.SparkUnsupportedOperationException: ColumnarToRow does not implement doExecuteBroadcast.
at org.apache.spark.sql.errors.QueryExecutionErrors$.doExecuteBroadcastNotImplementedError(QueryExecutionErrors.scala:2552)
at org.apache.spark.sql.execution.SparkPlan.doExecuteBroadcast(SparkPlan.scala:326)
at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeBroadcast$1(SparkPlan.scala:208)
at org.apache.spark.sql.execution.SparkPlan.$anonfun$executeQuery$1(SparkPlan.scala:246)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:243)
at org.apache.spark.sql.execution.SparkPlan.executeBroadcast(SparkPlan.scala:204)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.prepareBroadcast(BroadcastHashJoinExec.scala:207)
at org.apache.spark.sql.execution.joins.BroadcastHashJoinExec.prepareRelation(BroadcastHashJoinExec.scala:221)
at org.apache.spark.sql.execution.joins.HashJoin.codegenInner(HashJoin.scala:390)
This happens because Spark inserts ColumnarToRow between a row-based operator (BroadcastHashJoinExec) and a column-based operator (CometBroadcastExchange). As a result, the build side of BroadcastHashJoinExec, which is expected to be a broadcast operator, becomes ColumnarToRow. BroadcastHashJoinExec still invokes doExecuteBroadcast on this new build side, and ColumnarToRow does not implement it.
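The failure mode described above can be reproduced in miniature without Spark. The following is only a toy sketch: every class and method name in it (`ToyPlan`, `ToyColumnarBroadcastExchange`, `ToyColumnarToRow`, `ToyBroadcastHashJoin`, `describe`) is hypothetical and is not part of Spark or Comet. It assumes only what the stack trace shows: the base plan node throws by default when asked to broadcast, and a wrapper node that does not override that method fails even if its child supports broadcasting.

```java
public class BroadcastPlanSketch {

    // Toy stand-in for a physical plan node; NOT Spark's SparkPlan.
    static abstract class ToyPlan {
        abstract String name();

        // Default behavior modeled on the stack trace: a node that does not
        // override this method cannot be broadcast.
        String executeBroadcast() {
            throw new UnsupportedOperationException(
                name() + " does not implement doExecuteBroadcast.");
        }
    }

    // Toy column-based broadcast exchange: it supports broadcasting.
    static final class ToyColumnarBroadcastExchange extends ToyPlan {
        @Override String name() { return "ToyColumnarBroadcastExchange"; }
        @Override String executeBroadcast() { return "broadcast(columnar batches)"; }
    }

    // Toy transition node inserted between a columnar child and a row-based
    // parent. It deliberately does not override executeBroadcast().
    static final class ToyColumnarToRow extends ToyPlan {
        final ToyPlan child;
        ToyColumnarToRow(ToyPlan child) { this.child = child; }
        @Override String name() { return "ToyColumnarToRow"; }
    }

    // Toy row-based join: it asks its build side for a broadcast, whatever
    // node the build side happens to be.
    static final class ToyBroadcastHashJoin {
        final ToyPlan buildSide;
        ToyBroadcastHashJoin(ToyPlan buildSide) { this.buildSide = buildSide; }
        String prepareBroadcast() { return buildSide.executeBroadcast(); }
    }

    // Runs the join's broadcast step and reports either the result or the error.
    static String describe(ToyPlan buildSide) {
        try {
            return new ToyBroadcastHashJoin(buildSide).prepareBroadcast();
        } catch (UnsupportedOperationException e) {
            return "error: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        ToyPlan exchange = new ToyColumnarBroadcastExchange();
        // Build side is the exchange itself: broadcasting works.
        System.out.println(describe(exchange));
        // Build side is wrapped by the transition node: broadcasting fails.
        System.out.println(describe(new ToyColumnarToRow(exchange)));
    }
}
```

In this sketch the exchange itself is unchanged in both cases; the only difference is the wrapper sitting between it and the join, which matches the description of the build side being replaced by ColumnarToRow.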
Describe the potential solution
No response
Additional context
No response