[SPARK-6324] [core] Centralize handling of script usage messages. #5841

vanzin · 2015-05-01T18:49:29Z

Reorganize code so that the launcher library handles most of the work
of printing usage messages, instead of having an awkward protocol between
the library and the scripts for that.

This mostly applies to SparkSubmit, since the launcher lib does not do
command line parsing for classes invoked in other ways, and thus cannot
handle failures for those. Most scripts end up going through SparkSubmit,
though, so it all works.

The change adds a new, internal command line switch, "--usage-error",
which prints the usage message and exits with a non-zero status. Scripts
can override the command printed in the usage message by setting an
environment variable - this avoids having to grep the output of
SparkSubmit to remove references to the "spark-submit" script.

The only sub-optimal part of the change is the special handling for the
spark-sql usage, which is now done in SparkSubmitArguments.

Reorganize code so that the launcher library handles most of the work of printing usage messages, instead of having an awkward protocol between the library and the scripts for that. This mostly applies to SparkSubmit, since the launcher lib does not do command line parsing for classes invoked in other ways, and thus cannot handle failures for those. Most scripts end up going through SparkSubmit, though, so it all works. The change adds a new, internal command line switch, "--usage-error", which prints the usage message and exits with a non-zero status. Scripts can override the command printed in the usage message by setting an environment variable - this avoids having to grep the output of SparkSubmit to remove references to the "spark-submit" script. The only sub-optimal part of the change is the special handling for the spark-sql usage, which is now done in SparkSubmitArguments.

AmplabJenkins · 2015-05-01T18:52:10Z

Merged build triggered.

AmplabJenkins · 2015-05-01T18:52:19Z

Merged build started.

SparkQA · 2015-05-01T18:52:56Z

Test build #31582 has started for PR 5841 at commit bf139b5.

SparkQA · 2015-05-01T21:08:37Z

Test build #31582 has finished for PR 5841 at commit bf139b5.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

AmplabJenkins · 2015-05-01T21:08:42Z

Merged build finished. Test PASSed.

AmplabJenkins · 2015-05-01T21:08:42Z

Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/31582/
Test PASSed.

AmplabJenkins · 2015-06-02T15:52:13Z

Merged build triggered.

AmplabJenkins · 2015-06-02T15:52:18Z

Merged build started.

SparkQA · 2015-06-02T15:54:59Z

Test build #33996 has started for PR 5841 at commit 2821481.

srowen · 2015-06-02T17:27:36Z

LGTM after a read through the changes. I understand the intent and it makes sense. I haven't tested it myself.

SparkQA · 2015-06-02T18:25:00Z

Test build #33996 timed out for PR 5841 at commit 2821481 after a configured wait of 150m.

AmplabJenkins · 2015-06-02T18:25:06Z

Merged build finished. Test FAILed.

srowen · 2015-06-02T20:40:31Z

Jenkins, retest this please

AmplabJenkins · 2015-06-02T20:42:14Z

Merged build triggered.

AmplabJenkins · 2015-06-02T20:42:23Z

Merged build started.

SparkQA · 2015-06-02T20:43:21Z

Test build #34019 has started for PR 5841 at commit 2821481.

SparkQA · 2015-06-02T23:10:48Z

Test build #34019 has finished for PR 5841 at commit 2821481.

This patch passes all tests.
This patch merges cleanly.
This patch adds no public classes.

AmplabJenkins · 2015-06-02T23:10:53Z

Merged build finished. Test PASSed.

Reorganize code so that the launcher library handles most of the work of printing usage messages, instead of having an awkward protocol between the library and the scripts for that. This mostly applies to SparkSubmit, since the launcher lib does not do command line parsing for classes invoked in other ways, and thus cannot handle failures for those. Most scripts end up going through SparkSubmit, though, so it all works. The change adds a new, internal command line switch, "--usage-error", which prints the usage message and exits with a non-zero status. Scripts can override the command printed in the usage message by setting an environment variable - this avoids having to grep the output of SparkSubmit to remove references to the "spark-submit" script. The only sub-optimal part of the change is the special handling for the spark-sql usage, which is now done in SparkSubmitArguments. Author: Marcelo Vanzin <vanzin@cloudera.com> Closes apache#5841 from vanzin/SPARK-6324 and squashes the following commits: 2821481 [Marcelo Vanzin] Merge branch 'master' into SPARK-6324 bf139b5 [Marcelo Vanzin] Filter output of Spark SQL CLI help. c6609bf [Marcelo Vanzin] Fix exit code never being used when printing usage messages. 6bc1b41 [Marcelo Vanzin] [SPARK-6324] [core] Centralize handling of script usage messages.

### What changes were proposed in this pull request? This PR aims to remove the deprecate Java `System.setSecurityManager` usage for Apache Spark 4.0. Note that this is the only usage in Apache Spark AS-IS code. ``` $ git grep setSecurityManager core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala: System.setSecurityManager(sm) core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala: System.setSecurityManager(currentSm) ``` This usage was added at `Apache Spark 1.5.0`. - #5841 Since `Apache Spark 2.4.0`, we don't need `setSecurityManager` due to the following improvement. - #20925 ### Why are the changes needed? ``` $ java -version openjdk version "21-ea" 2023-09-19 OpenJDK Runtime Environment (build 21-ea+32-2482) OpenJDK 64-Bit Server VM (build 21-ea+32-2482, mixed mode, sharing) ``` ``` max spark-3.5.0-bin-hadoop3:$ bin/spark-sql --help ... CLI options: Exception in thread "main" java.lang.UnsupportedOperationException: The Security Manager is deprecated and will be removed in a future release at java.base/java.lang.System.setSecurityManager(System.java:429) at org.apache.spark.deploy.SparkSubmitArguments.getSqlShellOptions(SparkSubmitArguments.scala:623) ``` ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manual test. ``` $ build/sbt test:package -Phive -Phive-thriftserver $ bin/spark-sql --help ... CLI options: -d,--define <key=value> Variable substitution to apply to Hive commands. e.g. -d A=B or --define A=B --database <databasename> Specify the database to use -e <quoted-query-string> SQL from command line -f <filename> SQL from files -H,--help Print help information --hiveconf <property=value> Use value for given property --hivevar <key=value> Variable substitution to apply to Hive commands. e.g. --hivevar A=B -i <filename> Initialization SQL file -S,--silent Silent mode in interactive shell -v,--verbose Verbose mode (echo executed SQL to the console) ``` ### Was this patch authored or co-authored using generative AI tooling? No. Closes #42901 from dongjoon-hyun/SPARK-45147. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>

### What changes were proposed in this pull request? This PR aims to remove the deprecate Java `System.setSecurityManager` usage for Apache Spark 4.0. Note that this is the only usage in Apache Spark AS-IS code. ``` $ git grep setSecurityManager core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala: System.setSecurityManager(sm) core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala: System.setSecurityManager(currentSm) ``` This usage was added at `Apache Spark 1.5.0`. - apache#5841 Since `Apache Spark 2.4.0`, we don't need `setSecurityManager` due to the following improvement. - apache#20925 ### Why are the changes needed? ``` $ java -version openjdk version "21-ea" 2023-09-19 OpenJDK Runtime Environment (build 21-ea+32-2482) OpenJDK 64-Bit Server VM (build 21-ea+32-2482, mixed mode, sharing) ``` ``` max spark-3.5.0-bin-hadoop3:$ bin/spark-sql --help ... CLI options: Exception in thread "main" java.lang.UnsupportedOperationException: The Security Manager is deprecated and will be removed in a future release at java.base/java.lang.System.setSecurityManager(System.java:429) at org.apache.spark.deploy.SparkSubmitArguments.getSqlShellOptions(SparkSubmitArguments.scala:623) ``` ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Manual test. ``` $ build/sbt test:package -Phive -Phive-thriftserver $ bin/spark-sql --help ... CLI options: -d,--define <key=value> Variable substitution to apply to Hive commands. e.g. -d A=B or --define A=B --database <databasename> Specify the database to use -e <quoted-query-string> SQL from command line -f <filename> SQL from files -H,--help Print help information --hiveconf <property=value> Use value for given property --hivevar <key=value> Variable substitution to apply to Hive commands. e.g. --hivevar A=B -i <filename> Initialization SQL file -S,--silent Silent mode in interactive shell -v,--verbose Verbose mode (echo executed SQL to the console) ``` ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#42901 from dongjoon-hyun/SPARK-45147. Authored-by: Dongjoon Hyun <dhyun@apple.com> Signed-off-by: Dongjoon Hyun <dhyun@apple.com>

Marcelo Vanzin added 3 commits May 1, 2015 11:47

Fix exit code never being used when printing usage messages.

c6609bf

Filter output of Spark SQL CLI help.

bf139b5

Merge branch 'master' into SPARK-6324

2821481

asfgit closed this in 700312e Jun 5, 2015

vanzin deleted the SPARK-6324 branch June 5, 2015 16:53

dongjoon-hyun mentioned this pull request Sep 13, 2023

[SPARK-45147][CORE] Remove System.setSecurityManager usage #42901

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[SPARK-6324] [core] Centralize handling of script usage messages. #5841

[SPARK-6324] [core] Centralize handling of script usage messages. #5841

vanzin commented May 1, 2015

AmplabJenkins commented May 1, 2015

AmplabJenkins commented May 1, 2015

SparkQA commented May 1, 2015

SparkQA commented May 1, 2015

AmplabJenkins commented May 1, 2015

AmplabJenkins commented May 1, 2015

AmplabJenkins commented Jun 2, 2015

AmplabJenkins commented Jun 2, 2015

SparkQA commented Jun 2, 2015

srowen commented Jun 2, 2015

SparkQA commented Jun 2, 2015

AmplabJenkins commented Jun 2, 2015

srowen commented Jun 2, 2015

AmplabJenkins commented Jun 2, 2015

AmplabJenkins commented Jun 2, 2015

SparkQA commented Jun 2, 2015

SparkQA commented Jun 2, 2015

AmplabJenkins commented Jun 2, 2015

[SPARK-6324] [core] Centralize handling of script usage messages. #5841

[SPARK-6324] [core] Centralize handling of script usage messages. #5841

Conversation

vanzin commented May 1, 2015

AmplabJenkins commented May 1, 2015

AmplabJenkins commented May 1, 2015

SparkQA commented May 1, 2015

SparkQA commented May 1, 2015

AmplabJenkins commented May 1, 2015

AmplabJenkins commented May 1, 2015

AmplabJenkins commented Jun 2, 2015

AmplabJenkins commented Jun 2, 2015

SparkQA commented Jun 2, 2015

srowen commented Jun 2, 2015

SparkQA commented Jun 2, 2015

AmplabJenkins commented Jun 2, 2015

srowen commented Jun 2, 2015

AmplabJenkins commented Jun 2, 2015

AmplabJenkins commented Jun 2, 2015

SparkQA commented Jun 2, 2015

SparkQA commented Jun 2, 2015

AmplabJenkins commented Jun 2, 2015