[SPARK-50789][CONNECT] The inputs for typed aggregations should be analyzed #49449

Closed
ueshin wants to merge 1 commit into apache:master from ueshin:issues/SPARK-50789/typed_agg

ueshin (Member) commented Jan 10, 2025

What changes were proposed in this pull request?

Fixes SparkConnectPlanner to analyze the inputs for typed aggregations.

Why are the changes needed?

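The inputs for typed aggregations should be analyzed before SparkConnectPlanner reads their output. Otherwise, an input that still contains an unresolved expression, such as the star in select("*"), triggers an internal error.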

For example:

val ds = Seq("abc", "xyz", "hello").toDS().select("*").as[String]
ds.groupByKey(_.length).reduceGroups(_ + _).show()

fails with:

org.apache.spark.SparkException: [INTERNAL_ERROR] Invalid call to toAttribute on unresolved object SQLSTATE: XX000
  org.apache.spark.sql.catalyst.analysis.Star.toAttribute(unresolved.scala:439)
  org.apache.spark.sql.catalyst.plans.logical.Project.$anonfun$output$1(basicLogicalOperators.scala:74)
  scala.collection.immutable.List.map(List.scala:247)
  scala.collection.immutable.List.map(List.scala:79)
  org.apache.spark.sql.catalyst.plans.logical.Project.output(basicLogicalOperators.scala:74)
  org.apache.spark.sql.connect.planner.SparkConnectPlanner.transformExpressionWithTypedReduceExpression(SparkConnectPlanner.scala:2340)
  org.apache.spark.sql.connect.planner.SparkConnectPlanner.$anonfun$transformKeyValueGroupedAggregate$1(SparkConnectPlanner.scala:2244)
  scala.collection.immutable.List.map(List.scala:247)
  scala.collection.immutable.List.map(List.scala:79)
  org.apache.spark.sql.connect.planner.SparkConnectPlanner.transformKeyValueGroupedAggregate(SparkConnectPlanner.scala:2244)
  org.apache.spark.sql.connect.planner.SparkConnectPlanner.transformAggregate(SparkConnectPlanner.scala:2232)
...
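
To illustrate why the trace ends in Star.toAttribute, here is a minimal toy model in plain Scala. It is only a sketch: the types Expr, Attr, Star, Plan, Relation, Project and the ToyAnalyzer object are hypothetical stand-ins that imitate the shape of the Catalyst classes named in the trace, not Spark's real API, and this is not the code changed by this PR.

```scala
// Toy model only: hypothetical types that imitate the shape of Catalyst's
// plan classes. They are NOT Spark's real API.
sealed trait Expr
case class Attr(name: String) extends Expr
case object Star extends Expr // stands in for an unresolved `*`

sealed trait Plan { def output: Seq[Attr] }
case class Relation(output: Seq[Attr]) extends Plan
case class Project(exprs: Seq[Expr], child: Plan) extends Plan {
  // As in the stack trace, computing the output requires every expression
  // to already be a concrete attribute; an unresolved star cannot be one.
  def output: Seq[Attr] = exprs.map {
    case a: Attr => a
    case Star =>
      throw new IllegalStateException(
        "Invalid call to toAttribute on unresolved object")
  }
}

object ToyAnalyzer {
  // Hypothetical analysis step: expand the star into the child's attributes.
  def analyze(plan: Plan): Plan = plan match {
    case Project(exprs, child) =>
      val resolvedChild = analyze(child)
      val resolvedExprs = exprs.flatMap[Expr] {
        case Star  => resolvedChild.output
        case other => Seq(other)
      }
      Project(resolvedExprs, resolvedChild)
    case other => other
  }
}
```

In this model, calling output on Project(Seq(Star), Relation(Seq(Attr("value")))) throws, while calling it on the result of ToyAnalyzer.analyze succeeds and yields the single attribute named value. The real planner and analyzer are far more involved; the sketch only shows why the order "analyze first, then read the output" matters.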

Does this PR introduce any user-facing change?

Yes: the example above no longer fails with the internal error.
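
For reference, the grouping and reduction in the example above can be mimicked locally with plain Scala collections. This is a sketch without Spark, and the object name ExpectedResult is made up for illustration. It only shows what the query computes semantically; in Spark, string concatenation is not commutative, so the order of concatenation within a group may differ from this local result.

```scala
// Plain Scala collections, no Spark: a local stand-in for
// groupByKey(_.length).reduceGroups(_ + _) on the same input.
object ExpectedResult {
  val input: Seq[String] = Seq("abc", "xyz", "hello")

  // Group by string length, then concatenate the strings in each group.
  val reduced: Map[Int, String] =
    input.groupBy(_.length).map { case (len, group) =>
      len -> group.reduce(_ + _)
    }
}
```

Locally, ExpectedResult.reduced maps 3 to "abcxyz" and 5 to "hello".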

How was this patch tested?

Added the related tests.

Was this patch authored or co-authored using generative AI tooling?

No.

ueshin (Member, Author) commented Jan 13, 2025

The remaining test failures are not related to this PR.

ueshin (Member, Author) commented Jan 13, 2025

Thanks! Merging to master.

ueshin closed this in 3569e76 on Jan 13, 2025