
[SUPPORT] Support Apache Spark 3.2 #4202

Closed
melin opened this issue Dec 3, 2021 · 6 comments
Labels: feature-enquiry, spark

Comments


melin commented Dec 3, 2021

@pengzhiwei2018

[INFO] Compiling 67 source files to /Users/huaixin/Documents/codes/bigdata/hudi/hudi-spark-datasource/hudi-spark/target/classes at 1638515566715
[ERROR] /Users/huaixin/Documents/codes/bigdata/hudi/hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieFileIndex.scala:562: error: value literals is not a member of org.apache.spark.sql.execution.datasources.PartitioningUtils.PartitionValues
[ERROR]           partitionValues.map(_.literals.map(_.value))
[ERROR]                                 ^
[ERROR] /Users/huaixin/Documents/codes/bigdata/hudi/hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieFileIndex.scala:563: error: missing argument list for method fromSeq in object InternalRow
[ERROR] Unapplied methods are only converted to functions when a function type is expected.
[ERROR] You can make this conversion explicit by writing `fromSeq _` or `fromSeq(_)` instead of `fromSeq`.
[ERROR]             .map(InternalRow.fromSeq)
[ERROR]                              ^
[ERROR] /Users/huaixin/Documents/codes/bigdata/hudi/hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/avro/HoodieAvroDeserializer.scala:28: error: overloaded method constructor AvroDeserializer with alternatives:
[ERROR]   (rootAvroType: org.apache.avro.Schema,rootCatalystType: org.apache.spark.sql.types.DataType,datetimeRebaseMode: String)org.apache.spark.sql.avro.AvroDeserializer <and>
[ERROR]   (rootAvroType: org.apache.avro.Schema,rootCatalystType: org.apache.spark.sql.types.DataType,positionalFieldMatch: Boolean,datetimeRebaseMode: org.apache.spark.sql.internal.SQLConf.LegacyBehaviorPolicy.Value,filters: org.apache.spark.sql.catalyst.StructFilters)org.apache.spark.sql.avro.AvroDeserializer
[ERROR]  cannot be applied to (org.apache.avro.Schema, org.apache.spark.sql.types.DataType)
[ERROR]   extends AvroDeserializer(rootAvroType, rootCatalystType) {
[ERROR]           ^
[WARNING] /Users/huaixin/Documents/codes/bigdata/hudi/hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/DataSkippingUtils.scala:169: warning: non-variable type argument org.apache.spark.sql.catalyst.expressions.Literal in type pattern Seq[org.apache.spark.sql.catalyst.expressions.Literal] (the underlying of Seq[org.apache.spark.sql.catalyst.expressions.Literal]) is unchecked since it is eliminated by erasure
[WARNING]       case In(attribute: AttributeReference, list: Seq[Literal]) =>
[WARNING]                                                    ^
[WARNING] /Users/huaixin/Documents/codes/bigdata/hudi/hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/DataSkippingUtils.scala:178: warning: non-variable type argument org.apache.spark.sql.catalyst.expressions.Literal in type pattern Seq[org.apache.spark.sql.catalyst.expressions.Literal] (the underlying of Seq[org.apache.spark.sql.catalyst.expressions.Literal]) is unchecked since it is eliminated by erasure
[WARNING]       case Not(In(attribute: AttributeReference, list: Seq[Literal])) =>
[WARNING]                                                        ^
[ERROR] /Users/huaixin/Documents/codes/bigdata/hudi/hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/analysis/HoodieAnalysis.scala:427: error: wrong number of arguments for pattern org.apache.spark.sql.execution.command.ShowPartitionsCommand(tableName: org.apache.spark.sql.catalyst.TableIdentifier,output: Seq[org.apache.spark.sql.catalyst.expressions.Attribute],spec: Option[org.apache.spark.sql.catalyst.catalog.CatalogTypes.TablePartitionSpec])
[ERROR]       case ShowPartitionsCommand(tableName, specOpt)
[ERROR]                                 ^
[ERROR] /Users/huaixin/Documents/codes/bigdata/hudi/hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/command/AlterHoodieTableAddColumnsCommand.scala:90: error: overloaded method value checkDataColNames with alternatives:
[ERROR]   (provider: String,schema: org.apache.spark.sql.types.StructType)Unit <and>
[ERROR]   (table: org.apache.spark.sql.catalyst.catalog.CatalogTable,schema: org.apache.spark.sql.types.StructType)Unit
[ERROR]  cannot be applied to (org.apache.spark.sql.catalyst.catalog.CatalogTable, Seq[String])
[ERROR]     DDLUtils.checkDataColNames(table, colsToAdd.map(_.name))
[ERROR]              ^
[ERROR] /Users/huaixin/Documents/codes/bigdata/hudi/hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/command/CreateHoodieTableCommand.scala:201: error: not found: value DATASOURCE_SCHEMA_NUMPARTS
[ERROR]     properties.put(DATASOURCE_SCHEMA_NUMPARTS, parts.size.toString)
[ERROR]                    ^
[ERROR] /Users/huaixin/Documents/codes/bigdata/hudi/hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/spark/sql/hudi/command/MergeIntoHoodieTableCommand.scala:206: error: wrong number of arguments for pattern org.apache.spark.sql.catalyst.expressions.Cast(child: org.apache.spark.sql.catalyst.expressions.Expression,dataType: org.apache.spark.sql.types.DataType,timeZoneId: Option[String],ansiEnabled: Boolean)
[ERROR]       case Cast(attr: AttributeReference, _, _) if sourceColumnName.find(resolver(_, attr.name)).get.equals(targetColumnName) => true
[ERROR]                ^
[WARNING] two warnings found
[ERROR] 7 errors found
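
Most of these errors come from Spark 3.2 changing the shape of catalyst case classes and command extractors that Hudi pattern-matches on: per the log above, Cast gained an ansiEnabled field, ShowPartitionsCommand gained an output parameter, and AvroDeserializer and DDLUtils.checkDataColNames changed signatures. As a rough illustration only (not the actual Hudi fix, and the helper name is made up), a match written against the Spark 3.1 Cast extractor needs one extra wildcard to compile against the 3.2 signature reported above:

```scala
import org.apache.spark.sql.catalyst.expressions.{AttributeReference, Cast, Expression}

// Illustrative sketch only. On Spark 3.1 the extractor was
// Cast(child, dataType, timeZoneId), so `case Cast(attr: AttributeReference, _, _)`
// compiled. On Spark 3.2 the pattern is Cast(child, dataType, timeZoneId, ansiEnabled)
// and needs a fourth wildcard, as the "wrong number of arguments" error indicates.
def castedAttributeName(expr: Expression): Option[String] = expr match {
  case Cast(attr: AttributeReference, _, _, _) => Some(attr.name)
  case attr: AttributeReference                => Some(attr.name)
  case _                                       => None
}
```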


xushiyan commented Dec 4, 2021

@melin Understood the need for 3.2 support. We have been tracking this in https://issues.apache.org/jira/browse/HUDI-2811 and will be prioritizing it in the next release.
cc @YannByron

xushiyan closed this as completed on Dec 4, 2021.
xushiyan added the jira-filed, priority:critical, spark, and feature-enquiry labels and removed the priority:critical label on Dec 4, 2021.

maddy2u commented Dec 26, 2021

Facing the same issue. When will this be fixed? Is it part of the next minor release, 0.10.1?

xushiyan (Member) commented:

> Facing the same issue. When will this be fixed? Is it part of the next minor release, 0.10.1?

@maddy2u Please let me clarify: Spark 3.2 support is a feature to be added, not an issue.
And no, 0.10.1 will be a bug-fix release; no new features should be added there. This is expected in the 0.11.0 major release.


maddy2u commented Dec 27, 2021

If I try to compile the master branch, the build fails on Spark 3.2. I have to revert to the earlier release (0.10.0) to get it to compile. Not sure if it is something I am doing wrong with the configuration.

xushiyan (Member) commented:

> If I try to compile the master branch, the build fails on Spark 3.2. I have to revert to the earlier release (0.10.0) to get it to compile. Not sure if it is something I am doing wrong with the configuration.

It can be compiled now. Spark 3.2 support was added in #4270.
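
For anyone verifying the upgrade, a minimal read smoke test against a Spark 3.2 build is just the standard Hudi DataSource usage; the table path below is only an example and the app name is made up, so adjust both to your environment:

```scala
import org.apache.spark.sql.SparkSession

// Minimal smoke test, assuming a Hudi table already exists at the example path below.
val spark = SparkSession.builder()
  .appName("hudi-spark-3.2-smoke-test")
  .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .getOrCreate()

// Read the table back through the Hudi data source and print a few rows.
val df = spark.read.format("hudi").load("/tmp/hudi_trips_cow")
df.show(10, truncate = false)
```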


maddy2u commented Mar 11, 2022

It works. Thanks.
