[SUPPORT] Hoodie table not found in path Unable to find a hudi table for the user provided paths. #2282
Comments
It looks like the error happens while loading the data at hdfs://nameservice/data/wdt/sqoop/cow/inc/stockout_order_20201125/837b6714-40b3-4a00-bcf5-97a6f33d2af7.parquet. Can you check whether hdfs://nameservice/data/wdt/sqoop/cow/inc/stockout_order_20201125/ is a Hudi table? Do you see a hdfs://nameservice/data/wdt/sqoop/cow/inc/stockout_order_20201125/.hoodie folder? Can you list the entire folder hdfs://nameservice/data/wdt/sqoop/cow/inc/stockout_order_20201125 and attach the output?
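For context, Hudi decides whether a path belongs to a table by looking for a `.hoodie` metadata folder at or above the supplied path, which is why the maintainer asks about that folder. A rough local-filesystem sketch of that kind of lookup (a hypothetical helper, not Hudi's actual `TablePathUtils` code, with a plain directory standing in for HDFS):

```python
import os

def find_hudi_table_path(path):
    """Walk upward from a data file or directory, looking for the .hoodie
    metadata folder that marks the root of a Hudi table. Raises if no
    ancestor carries the marker, mirroring Hudi's TableNotFoundException."""
    current = path if os.path.isdir(path) else os.path.dirname(path)
    while True:
        if os.path.isdir(os.path.join(current, ".hoodie")):
            return current  # found the table root
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root without a match
            raise FileNotFoundError(
                "Hoodie table not found in path: no .hoodie folder above " + path)
        current = parent
```

A raw Sqoop output directory has no `.hoodie` folder, so a lookup like this fails exactly the way the stack trace below shows.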
The entire folder (hdfs://nameservice/data/wdt/sqoop/cow/inc/stockout_order_20201125) is listed as follows.
The problem is that this worked when I used 0.5.3, but it does not work with 0.6.0. hdfs://nameservice/data/wdt/sqoop/cow/inc/stockout_order_20201125 is the destination of a Sqoop import, not a Hudi table directory.
@wosow : If this is a plain parquet dataset, you should read it with spark.read.parquet("hdfs://nameservice/data/wdt/sqoop/cow/inc/stockout_order_20201125/*") rather than the hudi format.
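Following that advice, the choice between the hudi and plain parquet readers can be reduced to checking for the `.hoodie` marker up front. A minimal sketch (hypothetical helper name, local directory standing in for the HDFS path):

```python
import os

def pick_spark_read_format(path):
    """Return the Spark data source format to use for a directory:
    'hudi' if the directory carries Hudi's .hoodie metadata folder,
    otherwise plain 'parquet' (e.g. for a raw Sqoop import target)."""
    if os.path.isdir(os.path.join(path, ".hoodie")):
        return "hudi"
    return "parquet"
```

With that result in hand, you would call spark.read.format("hudi").load(path) for a real Hudi table, or spark.read.parquet(path + "/*") for a plain parquet directory such as a Sqoop destination.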
Thank you, I will try that.
@wosow : Please reopen if you are still stuck. |
Tips before filing an issue
Have you gone through our FAQs?
Join the mailing list to engage in conversations and get faster support at dev-subscribe@hudi.apache.org.
If you have triaged this as a bug, then file an issue directly.
An error occurred when I used Hudi 0.6.0 with Spark 2.4.4 to write data to Hudi and synchronize it to Hive, as follows:
20/11/26 14:22:51 INFO ui.JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /SQL/execution/json.
20/11/26 14:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@bd93bc3{/SQL/execution/json,null,AVAILABLE,@spark}
20/11/26 14:22:51 INFO ui.JettyUtils: Adding filter org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter to /static/sql.
20/11/26 14:22:51 INFO handler.ContextHandler: Started o.s.j.s.ServletContextHandler@4e67cfe1{/static/sql,null,AVAILABLE,@spark}
20/11/26 14:22:52 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Registered executor NettyRpcEndpointRef(spark-client://Executor) (10.228.86.12:42864) with ID 3
20/11/26 14:22:52 INFO state.StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
20/11/26 14:22:52 INFO storage.BlockManagerMasterEndpoint: Registering block manager lake03:40372 with 8.4 GB RAM, BlockManagerId(3, lake03, 40372, None)
20/11/26 14:22:52 INFO fs.FSUtils: Hadoop Configuration: fs.defaultFS: [hdfs://nameservice], Config:[Configuration: core-default.xml, core-site.xml, mapred-default.xml, mapred-site.xml, yarn-default.xml, yarn-site.xml, hdfs-default.xml, hdfs-site.xml, spark_hadoop_conf.xml, file:/opt/modules/spark-2.4.4/conf/hive-site.xml], FileSystem: [DFS[DFSClient[clientName=DFSClient_NONMAPREDUCE_1481461246_1, ugi=root (auth:SIMPLE)]]]
20/11/26 14:22:52 INFO hudi.DataSourceUtils: Getting table path..
20/11/26 14:22:52 INFO util.TablePathUtils: Getting table path from path : hdfs://nameservice/data/wdt/sqoop/cow/inc/stockout_order_20201125/837b6714-40b3-4a00-bcf5-97a6f33d2af7.parquet
Exception in thread "main" org.apache.hudi.exception.TableNotFoundException: Hoodie table not found in path Unable to find a hudi table for the user provided paths.
at org.apache.hudi.DataSourceUtils.getTablePath(DataSourceUtils.java:120)
at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:72)
at org.apache.hudi.DefaultSource.createRelation(DefaultSource.scala:51)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:318)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:223)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:211)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:178)
at com.ws.hudi.wdt.cow.StockOutOrder$.stockOutOrderIncUpdate(StockOutOrder.scala:104)
at com.ws.hudi.wdt.cow.StockOutOrder$.main(StockOutOrder.scala:41)
at com.ws.hudi.wdt.cow.StockOutOrder.main(StockOutOrder.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:845)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:920)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:929)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
20/11/26 14:22:52 INFO spark.SparkContext: Invoking stop() from shutdown hook
20/11/26 14:22:52 INFO server.AbstractConnector: Stopped Spark@76b224cd{HTTP/1.1,[http/1.1]}{0.0.0.0:4040}
20/11/26 14:22:52 INFO ui.SparkUI: Stopped Spark web UI at http://lake03:4040
20/11/26 14:22:52 INFO cluster.YarnClientSchedulerBackend: Interrupting monitor thread
20/11/26 14:22:52 INFO cluster.YarnClientSchedulerBackend: Shutting down all executors
20/11/26 14:22:52 INFO cluster.YarnSchedulerBackend$YarnDriverEndpoint: Asking each executor to shut down
20/11/26 14:22:52 INFO cluster.SchedulerExtensionServices: Stopping SchedulerExtensionServices
(serviceOption=None,
services=List(),
started=false)
20/11/26 14:22:52 INFO cluster.YarnClientSchedulerBackend: Stopped
20/11/26 14:22:55 INFO spark.MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
20/11/26 14:22:55 INFO memory.MemoryStore: MemoryStore cleared
20/11/26 14:22:55 INFO storage.BlockManager: BlockManager stopped
20/11/26 14:22:55 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
20/11/26 14:22:55 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
20/11/26 14:22:55 INFO spark.SparkContext: Successfully stopped SparkContext
20/11/26 14:22:55 INFO util.ShutdownHookManager: Shutdown hook called
Environment Description

Hudi version : 0.6.0
Spark version : 2.4.4
Hive version : 2.3.1
Hadoop version : 2.7.5
Storage (HDFS/S3/GCS..) : HDFS
Running on Docker? (yes/no) : no