[SUPPORT] Using SparkSQL DDL To Create External Table Error. #6405

@JoshuaZhuCN

Description


What causes this error when using Spark SQL to create a table?

Error: Hoodie table not found in path Unable to find a hudi table for the user provided paths.

The SQL DDL looks like:

CREATE TABLE IF NOT EXISTS `xxx`.`finance_receipt` (
   source_from,
    xxxx
) USING HUDI
OPTIONS (
   `hoodie.query.as.ro.table` = 'false'
)
TBLPROPERTIES (
     type = 'mor'
    ,primaryKey = 'source_from,finance_receipt_id'
    ,preCombineField = 'precombine_field'
    ,`hoodie.compaction.payload.class` = 'org.apache.hudi.common.model.OverwriteWithLatestAvroPayload'
    ,`hoodie.datasource.write.hive_style_partitioning` = 'false'
    ,`hoodie.table.keygenerator.class` = 'org.apache.hudi.keygen.ComplexKeyGenerator'
    ,`hoodie.index.type` = 'GLOBAL_BLOOM'
)
COMMENT 'Receipt records table'
PARTITIONED BY (source_from)
LOCATION 'hdfs://localhost:8020/hoodie/xxx/xxx/finance_receipt'

To Reproduce

Steps to reproduce the behavior:

1. Make the directory manually.
2. Execute the Spark SQL DDL.
3. The statement fails with: Hoodie table not found in path Unable to find a hudi table for the user provided paths.
4. If step 1 is skipped, it instead fails with: File does not exist: hdfs://localhost:8020/hoodie/xxx/xxx/finance_receipt
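One hedged reading of the stack trace further below: the statement is executed through Spark's generic `CreateDataSourceTableCommand`, which calls Hudi's `DefaultSource.createRelation` and therefore expects an already-initialized Hudi table (a `.hoodie` metadata directory) under LOCATION, which explains both errors depending on whether the directory exists. A workaround sketch, not a confirmed fix, is to drop the OPTIONS clause and keep every Hudi setting in TBLPROPERTIES so that Hudi's SQL extension handles the create. The column types, and the presence of `finance_receipt_id` and `precombine_field` columns, are assumptions, since the report elides the schema:

```sql
-- Workaround sketch, not a confirmed fix. Assumptions: the column types,
-- and that the elided columns include finance_receipt_id and
-- precombine_field (both referenced by primaryKey/preCombineField).
CREATE TABLE IF NOT EXISTS `xxx`.`finance_receipt` (
    source_from STRING,
    finance_receipt_id STRING,
    precombine_field BIGINT
) USING HUDI
PARTITIONED BY (source_from)
COMMENT 'Receipt records table'
LOCATION 'hdfs://localhost:8020/hoodie/xxx/xxx/finance_receipt'
TBLPROPERTIES (
     type = 'mor'
    ,primaryKey = 'source_from,finance_receipt_id'
    ,preCombineField = 'precombine_field'
    ,'hoodie.compaction.payload.class' = 'org.apache.hudi.common.model.OverwriteWithLatestAvroPayload'
    ,'hoodie.datasource.write.hive_style_partitioning' = 'false'
    ,'hoodie.table.keygenerator.class' = 'org.apache.hudi.keygen.ComplexKeyGenerator'
    ,'hoodie.index.type' = 'GLOBAL_BLOOM'
)
```

`hoodie.query.as.ro.table` is dropped here; for an MOR table Hudi exposes the read-optimized and realtime views itself, so setting it at create time should not be needed (an assumption worth verifying). Also confirm the session was started with `--conf 'spark.sql.extensions=org.apache.spark.sql.hudi.HoodieSparkSessionExtension'`; without that extension, Spark falls back to the generic datasource path seen in the stack trace.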

Environment Description

  • Hudi version : 0.10.1

  • Spark version : 3.1.3

  • Hive version : 3.1.0

  • Hadoop version : 3.1.1

  • Storage (HDFS/S3/GCS..) : HDFS

  • Running on Docker? (yes/no) : no

Stacktrace

Exception in thread "main" org.apache.hudi.exception.TableNotFoundException: Hoodie table not found in path Unable to find a hudi table for the user provided paths.
	at org.apache.hudi.DataSourceUtils.getTablePath(DataSourceUtils.java:86)
	at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:103)
	at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:353)
	at org.apache.spark.sql.execution.command.CreateDataSourceTableCommand.run(createDataSourceTables.scala:78)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:70)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:68)
	at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:79)
	at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:228)
	at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3700)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:103)
	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
	at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3698)
	at org.apache.spark.sql.Dataset.<init>(Dataset.scala:228)
	at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:99)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
	at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:96)
	at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:618)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:775)
	at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:613)


    Labels

    area:sql (SQL interfaces), priority:high (Significant impact; potential bugs)
