[SPARK-24930][SQL] Improve exception information when using LOAD DATA LOCAL INPATH #21881

Closed
wants to merge 1 commit into from


@ouyangxiaochen ouyangxiaochen commented Jul 26, 2018

What changes were proposed in this pull request?

  1. As the root user, create a file test.txt containing the single record '123' in the /root/ directory.
  2. Switch to the mr user and run spark-shell --master local:
scala> spark.version
res2: String = 2.2.1

scala> spark.sql("create table t1(id int) partitioned by(area string)");
2018-07-26 17:20:37,523 WARN org.apache.hadoop.hive.metastore.HiveMetaStore: Location: hdfs://nameservice/spark/t1 specified for non-external table:t1
res4: org.apache.spark.sql.DataFrame = []

scala> spark.sql("load data local inpath '/root/test.txt' into table t1 partition(area ='025')")
org.apache.spark.sql.AnalysisException: LOAD DATA input path does not exist: /root/test.txt;
 at org.apache.spark.sql.execution.command.LoadDataCommand.run(tables.scala:339)
 at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
 at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
 at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:67)
 at org.apache.spark.sql.Dataset.<init>(Dataset.scala:183)
 at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:68)
 at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:639)
 ... 48 elided

scala>

In fact, the input path exists, but the mr user does not have permission to access the /root/ directory, so the message thrown with the AnalysisException can confuse users.
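
To illustrate the distinction described above, here is a minimal sketch of one way to tell "does not exist" apart from "cannot be determined". It is not the change made in this PR: it swaps `java.io.File.exists()` for `java.nio.file.Files.exists` plus `Files.notExists`, which both return false when existence cannot be determined (for example, when a parent directory is not accessible). The object name `LoadPathCheck`, its members, and the second message text are hypothetical.

```scala
import java.nio.file.{Files, Path, Paths}

// Hypothetical helper, not part of Spark.
object LoadPathCheck {
  sealed trait PathStatus
  case object Present extends PathStatus
  case object Missing extends PathStatus
  case object Undetermined extends PathStatus

  // Files.exists and Files.notExists are both false when existence
  // cannot be determined, e.g. because access to a parent directory is denied.
  def classify(path: Path): PathStatus =
    if (Files.exists(path)) Present
    else if (Files.notExists(path)) Missing
    else Undetermined

  // Returns None when the path is usable, otherwise an illustrative message.
  // The wording of the Undetermined message is an assumption.
  def errorMessage(path: String): Option[String] =
    classify(Paths.get(path)) match {
      case Present => None
      case Missing => Some(s"LOAD DATA input path does not exist: $path")
      case Undetermined =>
        Some(s"LOAD DATA input path cannot be accessed (check permissions): $path")
    }
}
```

With a check along these lines, the mr user in the scenario above would presumably get the second message for /root/test.txt rather than "does not exist".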

How was this patch tested?

existing test case


@gatorsmile
Member

ok to test

@SparkQA

SparkQA commented Jul 26, 2018

Test build #93622 has finished for PR 21881 at commit 0eb648c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Member

@xuanyuanking xuanyuanking left a comment

LGTM, just nits about exception msg.

throw new AnalysisException(s"LOAD DATA input path does not exist: $path")
// If the user has no permission to access the given input path, `File.exists()` returns false,
// so `LOAD DATA input path does not exist` can confuse users.
throw new AnalysisException(s"LOAD DATA input path does not exist: `$path` or current " +
Member

Nit: no need to print the $path twice.

Author

OK, Thanks!

@gatorsmile
Member

Test case?

@ouyangxiaochen
Author

ouyangxiaochen commented Jul 31, 2018

@gatorsmile Hi, I am not sure how to construct this scenario in a test case. Should I just assert that the exception message contains the key phrase "have no permission to access the input path"?
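
One possible way to set up such a scenario is sketched below, under stated assumptions: a POSIX file system (otherwise `getPosixFilePermissions` throws `UnsupportedOperationException`) and a non-root user (root bypasses the permission check, so the file stays visible). The object `LockedDirScenario` and its method are hypothetical and do not touch Spark. The sketch only shows that `File.exists()` can report false for a file that is really there.

```scala
import java.io.File
import java.nio.file.{Files, Path}
import java.nio.file.attribute.PosixFilePermissions

// Hypothetical helper, not part of Spark or its test suites.
object LockedDirScenario {
  // Creates test.txt inside `dir`, revokes all permissions on `dir`,
  // and returns what File.exists() reports for the file while locked.
  // Permissions are restored and the file is removed afterwards.
  def existsWhileLocked(dir: Path): Boolean = {
    val file = Files.createFile(dir.resolve("test.txt"))
    val original = Files.getPosixFilePermissions(dir)
    try {
      Files.setPosixFilePermissions(dir, PosixFilePermissions.fromString("---------"))
      // As a non-root user this is expected to be false even though the file exists.
      new File(file.toString).exists()
    } finally {
      Files.setPosixFilePermissions(dir, original)
      Files.delete(file)
    }
  }
}
```

In an actual test, the LOAD DATA statement would presumably be issued at the point where `exists()` is called here, with an assertion on the resulting exception message.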

@AmplabJenkins

Can one of the admins verify this patch?

@github-actions

github-actions bot commented Jan 9, 2020

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label Jan 9, 2020
@github-actions github-actions bot closed this Jan 10, 2020