[SPARK-33358][SQL] Return code when command process failed #30263

artiship · 2020-11-05T10:17:05Z

What changes were proposed in this pull request?

Exit the Spark SQL CLI processing loop as soon as one of the commands (sub SQL statements) fails, and return its error code.
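
The intended behavior can be illustrated with a minimal shell sketch. This is not the Spark code touched by this PR: `run_statement` and `process_statements` are hypothetical names, and `run_statement` is only a stand-in that fails for any statement mentioning `nonexistent_table`.

```shell
# Hypothetical stand-in for executing a single SQL statement (not Spark code).
# Fails for any statement that mentions "nonexistent_table".
run_statement() {
  case "$1" in
    *nonexistent_table*)
      echo "Error in query: Table or view not found" >&2
      return 1
      ;;
    *)
      echo "ran: $1"
      return 0
      ;;
  esac
}

# Process statements in order and stop at the first failure,
# handing that statement's status back to the caller.
process_statements() {
  for stmt in "$@"; do
    run_statement "$stmt" && status=0 || status=$?
    if [ "$status" -ne 0 ]; then
      return "$status"
    fi
  done
  return 0
}
```

Calling `process_statements "select * from nonexistent_table" "select 2"` in this sketch reports the error and returns a non-zero status without running the second statement.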

Why are the changes needed?

This is a regression at Apache Spark 3.0.0.

$ cat 1.sql
select * from nonexistent_table;
select 2;

Apache Spark 2.4.7

spark-2.4.7-bin-hadoop2.7:$ bin/spark-sql -f 1.sql
20/11/15 16:14:38 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Error in query: Table or view not found: nonexistent_table; line 1 pos 14

Apache Spark 3.0.1

$ bin/spark-sql -f 1.sql
Error in query: Table or view not found: nonexistent_table; line 1 pos 14;
'Project [*]
+- 'UnresolvedRelation [nonexistent_table]

2
Time taken: 2.786 seconds, Fetched 1 row(s)

Apache Hive 1.2.2

apache-hive-1.2.2-bin:$ bin/hive -f 1.sql

Logging initialized using configuration in jar:file:/Users/dongjoon/APACHE/hive-release/apache-hive-1.2.2-bin/lib/hive-common-1.2.2.jar!/hive-log4j.properties
FAILED: SemanticException [Error 10001]: Line 1:14 Table not found 'nonexistent_table'

Does this PR introduce any user-facing change?

Yes. This fixes a regression.

How was this patch tested?

Pass the UT.

HyukjinKwon · 2020-11-05T10:59:47Z

@artiship how was this patch tested?

artiship · 2020-11-05T11:35:37Z

@HyukjinKwon I reproduced this issue and then verified the fix manually in my dev environment. I would like to add a unit test to cover this.

HyukjinKwon · 2020-11-05T12:58:21Z

@artiship can you elaborate on the reproduction steps and how you tested manually in your env?

artiship · 2020-11-05T16:13:47Z

@HyukjinKwon You can see that after the first statement failed, the following statement still got executed, and in the end the whole script was reported as successful.

env

spark version: 3.0.1
os: centos 7

/tmp/tmp.sql

select * from nonexistent_table;
select 2;

submit command:

export HADOOP_USER_NAME=my-hadoop-user
bin/spark-sql  \
--master yarn \
--deploy-mode client \
--queue my.queue.name \
--conf spark.driver.host=$(hostname -i) \
--conf spark.app.name=spark-test  \
--name "spark-test" \
-f /tmp/tmp.sql

execution log:

# bin/spark-sql  \
> --master yarn \
> --deploy-mode client \
> --queue my.queue.name \
> --conf spark.driver.host=$(hostname -i) \
> --conf spark.app.name=spark-test  \
> --name "spark-test" \
> -f /tmp/tmp.sql
20/11/06 00:06:19 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/11/06 00:06:20 WARN HiveConf: HiveConf of name hive.spark.client.rpc.server.address.use.ip does not exist
20/11/06 00:06:20 WARN HiveConf: HiveConf of name hive.spark.client.submit.timeout.interval does not exist
20/11/06 00:06:20 WARN HiveConf: HiveConf of name hive.enforce.bucketing does not exist
20/11/06 00:06:20 WARN HiveConf: HiveConf of name hive.server2.enable.impersonation does not exist
20/11/06 00:06:20 WARN HiveConf: HiveConf of name hive.run.timeout.seconds does not exist
20/11/06 00:06:20 WARN HiveConf: HiveConf of name hive.support.sql11.reserved.keywords does not exist
20/11/06 00:06:20 WARN DomainSocketFactory: The short-circuit local reads feature cannot be used because libhadoop cannot be loaded.
20/11/06 00:06:20 WARN SparkConf: Note that spark.local.dir will be overridden by the value set by the cluster manager (via SPARK_LOCAL_DIRS in mesos/standalone/kubernetes and LOCAL_DIRS in YARN).
20/11/06 00:06:22 WARN Client: Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.

Error in query: Table or view not found: nonexistent_table; line 1 pos 14;
'Project [*]
+- 'UnresolvedRelation [nonexistent_table]

2
2
Time taken: 4.437 seconds, Fetched 1 row(s)

dongjoon-hyun · 2020-11-06T00:04:32Z

ok to test

SparkQA · 2020-11-06T00:54:46Z

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35287/

SparkQA · 2020-11-06T00:55:35Z

Test build #130676 has finished for PR 30263 at commit 96c753c.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2020-11-06T01:16:50Z

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35287/

HyukjinKwon · 2020-11-06T04:23:44Z

Looks like something went wrong during rebase. You can either resolve it here or just open another PR.

SparkQA · 2020-11-06T05:04:58Z

Test build #130687 has finished for PR 30263 at commit 3d71aca.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2020-11-06T05:06:57Z

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35298/

artiship · 2020-11-06T05:12:20Z

add a new test case

SparkQA · 2020-11-06T05:35:28Z

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35298/

artiship · 2020-11-06T08:58:45Z

> Looks like something went wrong during rebase. You can either resolve it here or just open another PR.

@HyukjinKwon Sorry for the incorrect rebase. It is resolved now, and the fix and the test case have been combined into one commit.

SparkQA · 2020-11-06T09:37:44Z

Test build #130710 has finished for PR 30263 at commit db231c3.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

SparkQA · 2020-11-06T10:32:39Z

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35320/

SparkQA · 2020-11-06T10:54:53Z

Kubernetes integration test status success
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/35320/

dongjoon-hyun · 2020-11-11T17:01:04Z

Hi, @artiship . Does this bug happen only in a YARN environment? It seems to work correctly in the local environment.

spark-3.0.1-bin-hadoop3.2:$ cat 1.sql
select * from n;
select 2;

spark-3.0.1-bin-hadoop3.2:$ bin/spark-sql -f 1.sql
Error in query: Table or view not found: n; line 1 pos 14;
'Project [*]
+- 'UnresolvedRelation [n]

2
Time taken: 2.244 seconds, Fetched 1 row(s)

artiship · 2020-11-12T03:44:47Z

@dongjoon-hyun

I've verified it again using a distribution downloaded from the Apache Spark website:

https://www.apache.org/dyn/closer.lua/spark/spark-3.0.1/spark-3.0.1-bin-hadoop2.7.tgz

This bug can still be reproduced in local mode. spark-sql should stop after the first statement fails.

➜  spark-3.0.1-bin-hadoop2.7 cat 1.sql
select * from n;
select 2+2;
➜  spark-3.0.1-bin-hadoop2.7 bin/spark-sql -f 1.sql
20/11/12 11:35:28 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/11/12 11:35:38 WARN HiveConf: HiveConf of name hive.stats.jdbc.timeout does not exist
20/11/12 11:35:38 WARN HiveConf: HiveConf of name hive.stats.retries.wait does not exist
20/11/12 11:35:46 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 2.3.0
20/11/12 11:35:46 WARN ObjectStore: setMetaStoreSchemaVersion called but recording version is disabled: version = 2.3.0, comment = Set by MetaStore lichuanliang@10.104.16.36
Error in query: Table or view not found: n; line 1 pos 14;
'Project [*]
+- 'UnresolvedRelation [n]

4
Time taken: 4.39 seconds, Fetched 1 row(s)

With no failing statement:

➜  spark-3.0.1-bin-hadoop2.7 cat 2.sql
select 1+1;
select 2+2;

➜  spark-3.0.1-bin-hadoop2.7 bin/spark-sql -f 2.sql
20/11/12 11:45:56 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
20/11/12 11:46:06 WARN HiveConf: HiveConf of name hive.stats.jdbc.timeout does not exist
20/11/12 11:46:06 WARN HiveConf: HiveConf of name hive.stats.retries.wait does not exist
20/11/12 11:46:14 WARN ObjectStore: Version information not found in metastore. hive.metastore.schema.verification is not enabled so recording the schema version 2.3.0
20/11/12 11:46:14 WARN ObjectStore: setMetaStoreSchemaVersion called but recording version is disabled: version = 2.3.0, comment = Set by MetaStore lichuanliang@10.104.16.36
2
Time taken: 5.167 seconds, Fetched 1 row(s)
4
Time taken: 0.151 seconds, Fetched 1 row(s)

dongjoon-hyun

Given @artiship 's results and mine, it seems that this does not always happen.

For me, this doesn't happen on Mac (Big Sur) so far.

spark-3.0.1-bin-hadoop3.2:$ sw_vers
ProductName:	macOS
ProductVersion:	11.0.1

spark-3.0.1-bin-hadoop3.2:$ bin/spark-sql -f 1.sql
Error in query: Table or view not found: n; line 1 pos 14;
'Project [*]
+- 'UnresolvedRelation [n]

4
Time taken: 3.251 seconds, Fetched 1 row(s)

spark-3.0.1-bin-hadoop2.7:$ bin/spark-sql -f 1.sql
Error in query: Table or view not found: n; line 1 pos 14;
'Project [*]
+- 'UnresolvedRelation [n]

4
Time taken: 5.6 seconds, Fetched 1 row(s)

I'm looking at a Linux environment.

dongjoon-hyun · 2020-11-15T23:59:02Z

Ah, it seems that there is a misunderstanding due to the PR description, @artiship .

In the PR description and your comment (#30263 (comment)), you mentioned that the output 2 is repeated twice. Is that true?

dongjoon-hyun · 2020-11-16T00:38:21Z

sql/hive-thriftserver/src/test/scala/org/apache/spark/sql/hive/thriftserver/CliSuite.scala

@@ -571,4 +571,13 @@ class CliSuite extends SparkFunSuite with BeforeAndAfterAll with Logging {
    // the date formatter for `java.sql.LocalDate` must output negative years with sign.
    runCliWithin(1.minute)("SELECT MAKE_DATE(-44, 3, 15);" -> "-0044-03-15")
  }
+
+  test("SPARK-33358 CLI should break when have command failed") {


This test case seems to succeed without your patch.

dongjoon-hyun

@artiship and @HyukjinKwon . I verified this patch manually.
However, the test code is wrong because it always succeeds.
In addition, I realized that the current test framework runCliWithin has a limitation for testing this kind of thing. So, I'll proceed to merge without the test case now. We can revise the test framework later.
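
For manual verification outside the Scala test framework, one option is to check both the exit status and the absence of output from later statements. The sketch below is an assumption about how such a check could look, not part of this patch or of CliSuite; `check_fail_fast` and the marker string are hypothetical.

```shell
# check_fail_fast MARKER CMD [ARGS...]
# Hypothetical helper: runs CMD, captures stdout and stderr plus the exit
# status, and succeeds only if the status is non-zero AND MARKER does not
# appear anywhere in the captured output.
check_fail_fast() {
  marker=$1
  shift
  output=$("$@" 2>&1) && status=0 || status=$?
  if [ "$status" -eq 0 ]; then
    echo "FAIL: command exited with status 0"
    return 1
  fi
  case "$output" in
    *"$marker"*)
      echo "FAIL: output of a later statement was found"
      return 1
      ;;
  esac
  echo "OK: exit status $status, no later output"
  return 0
}
```

For example, if the second statement of `1.sql` were changed to `select 'SHOULD_NOT_RUN';` (a distinctive marker chosen for this sketch), the check would be `check_fail_fast SHOULD_NOT_RUN bin/spark-sql -f 1.sql`. A bare `2` is a poor marker because it also matches timestamps in the log lines.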

Exit Spark SQL CLI processing loop if one of the commands (sub sql statement) process failed

This is a regression at Apache Spark 3.0.0.

```
$ cat 1.sql
select * from nonexistent_table;
select 2;
```

**Apache Spark 2.4.7**
```
spark-2.4.7-bin-hadoop2.7:$ bin/spark-sql -f 1.sql
20/11/15 16:14:38 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Error in query: Table or view not found: nonexistent_table; line 1 pos 14
```

**Apache Spark 3.0.1**
```
$ bin/spark-sql -f 1.sql
Error in query: Table or view not found: nonexistent_table; line 1 pos 14;
'Project [*]
+- 'UnresolvedRelation [nonexistent_table]

2
Time taken: 2.786 seconds, Fetched 1 row(s)
```

**Apache Hive 1.2.2**
```
apache-hive-1.2.2-bin:$ bin/hive -f 1.sql

Logging initialized using configuration in jar:file:/Users/dongjoon/APACHE/hive-release/apache-hive-1.2.2-bin/lib/hive-common-1.2.2.jar!/hive-log4j.properties
FAILED: SemanticException [Error 10001]: Line 1:14 Table not found 'nonexistent_table'
```

Yes. This is a fix of regression.

Pass the UT.

Closes #30263 from artiship/SPARK-33358.

Authored-by: artiship <meilziner@gmail.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
(cherry picked from commit 1ae6d64)
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>

dongjoon-hyun · 2020-11-16T01:00:06Z

Thank you so much for your contribution, @artiship .
This landed in the master branch for Apache Spark 3.1.0 and in branch-3.0 for Apache Spark 3.0.2.
SPARK-33358 will be assigned to you. What is your JIRA id?

dongjoon-hyun · 2020-11-16T01:01:38Z

I found your id from the commit log.
SPARK-33358 is assigned to you. Thanks.

artiship · 2020-11-17T02:45:27Z

@dongjoon-hyun Thanks for your careful review. The output 2 repeated twice is a test result I got from my production environment. It seems that environment might have a configuration that makes it print both the command and the result.

github-actions bot added the SQL label Nov 5, 2020

github-actions bot added BUILD DOCS INFRA YARN labels Nov 6, 2020

[SPARK-33358][SQL] Spark SQL CLI should exit if part of the statement…

db231c3

… error

artiship force-pushed the SPARK-33358 branch from 3d71aca to db231c3 Compare November 6, 2020 08:52

dongjoon-hyun reviewed Nov 15, 2020

dongjoon-hyun reviewed Nov 16, 2020

dongjoon-hyun closed this in 1ae6d64 Nov 16, 2020
