Add tdunning json library to spark and utilities bundle #14459

@hudi-bot

Description

Exception during Hive Sync:


An error occurred while calling o175.save.
: java.lang.NoClassDefFoundError: org/json/JSONException
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeCreateTable(SemanticAnalyzer.java:10847)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.genResolvedParseTree(SemanticAnalyzer.java:10047)
    at org.apache.hadoop.hive.ql.parse.SemanticAnalyzer.analyzeInternal(SemanticAnalyzer.java:10128)
    at org.apache.hadoop.hive.ql.parse.CalcitePlanner.analyzeInternal(CalcitePlanner.java:209)
    at org.apache.hadoop.hive.ql.parse.BaseSemanticAnalyzer.analyze(BaseSemanticAnalyzer.java:227)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:424)
    at org.apache.hadoop.hive.ql.Driver.compile(Driver.java:308)
    at org.apache.hadoop.hive.ql.Driver.compileInternal(Driver.java:1122)
    at org.apache.hadoop.hive.ql.Driver.runInternal(Driver.java:1170)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1059)
    at org.apache.hadoop.hive.ql.Driver.run(Driver.java:1049)
    at org.apache.hudi.hive.HoodieHiveClient.updateHiveSQLs(HoodieHiveClient.java:515)
    at org.apache.hudi.hive.HoodieHiveClient.updateHiveSQLUsingHiveDriver(HoodieHiveClient.java:498)
    at org.apache.hudi.hive.HoodieHiveClient.updateHiveSQL(HoodieHiveClient.java:488)
    at org.apache.hudi.hive.HoodieHiveClient.createTable(HoodieHiveClient.java:273)
    at org.apache.hudi.hive.HiveSyncTool.syncSchema(HiveSyncTool.java:146)
    at

This is from using hudi-spark-bundle. [https://github.com//issues/1787]

The JSONException class comes from https://mvnrepository.com/artifact/org.json/json. That artifact has a licensing issue and is therefore not part of the Hudi bundle packages. The underlying issue is due to Hive 1.x vs 2.x (see https://issues.apache.org/jira/browse/HUDI-150?jql=text%20~%20%22org.json%22%20and%20project%20%3D%20%22Apache%20Hudi%22%20)

Spark's Hive integration still brings in Hive 1.x jars, which depend on org.json. I believe this jar was already provided in users' environments, which is why we have not seen people complaining about this issue.

Even though this is not a Hudi issue per se, let me check a jar with a compatible license: https://mvnrepository.com/artifact/com.tdunning/json/1.8. If it works, we will add it to the 0.6 bundles after discussing with the community.
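
One way to sanity-check a candidate jar before bundling it is to look inside the archive for the class named in the stack trace. The following is a minimal sketch in Python, assuming the jar has been downloaded to a local path; `jar_has_class` is a hypothetical helper written for illustration and is not part of Hudi. Whether a given replacement jar actually exposes the `org.json` package is exactly the assumption this check is meant to test.

```python
import sys
import zipfile


def jar_has_class(jar_path, class_name="org.json.JSONException"):
    """Return True if the jar (a zip archive) contains the given class.

    Hypothetical helper for illustration only. A class such as
    org.json.JSONException is stored in a jar under the entry name
    org/json/JSONException.class.
    """
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


if __name__ == "__main__":
    # Usage: python check_jar.py /path/to/some.jar
    for path in sys.argv[1:]:
        status = "contains" if jar_has_class(path) else "is missing"
        print(f"{path} {status} org.json.JSONException")
```

A jar that passes this check can still fail at runtime if it is not visible to the class loader that loads hive-exec, so this only rules out the simplest cause.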

JIRA info


Comments

22/Jul/20 00:07;vbalaji;This can also potentially be solved by including the hive-exec package inside the bundle, but that is riskier and involves more testing.;;;


01/Aug/20 01:15;akmodi;We've run into this error with {{hive-exec}} multiple times at Uber. We've found that the safest workaround is to add the json jars to the Spark extraClassPath:

{{"spark.driver.extraClassPath": "json-20090211.jar",
"spark.executor.extraClassPath": "json-20090211.jar"}};;;
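
The workaround above can be sketched as a small script that builds the two settings and turns them into `spark-submit` arguments. This is a rough illustration in Python, not something taken from the thread: `extra_classpath_conf` and `to_submit_args` are hypothetical helper names, and the jar path is assumed to resolve on the driver and on every executor host. Passing the values with `--conf` at launch matters for the driver setting, since in client mode the driver JVM has already started by the time application code could set it.

```python
def extra_classpath_conf(jar_path):
    """Build the two Spark settings from the workaround.

    Hypothetical helper; jar_path is assumed to be valid on the
    driver and on each executor host.
    """
    return {
        "spark.driver.extraClassPath": jar_path,
        "spark.executor.extraClassPath": jar_path,
    }


def to_submit_args(conf):
    """Render a settings dict as spark-submit '--conf key=value' pairs."""
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


if __name__ == "__main__":
    conf = extra_classpath_conf("json-20090211.jar")
    # Prints the arguments to append to a spark-submit command line.
    print(" ".join(to_submit_args(conf)))
```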


09/Jul/21 19:29;vino;Even after adding the JSON jar to the classpath, the issue was not resolved.;;;
