
Conversation

@wangmiao1981 (Contributor)

What changes were proposed in this pull request?

In 2.0, ./bin/spark-submit doesn't print out usage; instead it raises an exception. This PR adds exception handling in Main.java: when the exception is thrown and there are no additional arguments, the usage message is printed.

How was this patch tested?

Manually tested:
./bin/spark-submit
Usage: spark-submit [options] <app jar | python file> [app arguments]
Usage: spark-submit --kill [submission ID] --master [spark://...]
Usage: spark-submit --status [submission ID] --master [spark://...]
Usage: spark-submit run-example [options] example-class [example args]

Options:
  --master MASTER_URL         spark://host:port, mesos://host:port, yarn, or local.
  --deploy-mode DEPLOY_MODE   Whether to launch the driver program locally ("client") or
                              on one of the worker machines inside the cluster ("cluster")
                              (Default: client).
  --class CLASS_NAME          Your application's main class (for Java / Scala apps).
  --name NAME                 A name of your application.
  --jars JARS                 Comma-separated list of local jars to include on the driver
                              and executor classpaths.
  --packages                  Comma-separated list of maven coordinates of jars to include
                              on the driver and executor classpaths. Will search the local
                              maven repo, then maven central and any additional remote
                              repositories given by --repositories. The format for the
                              coordinates should be groupId:artifactId:version.

@srowen (Member)

This duplicates all of the usage info in SparkSubmitArguments. We definitely don't want that.

Contributor

Agree with Sean, this copy-paste is really not a good way to handle this issue. We should find the root cause of why this broke compared to 1.6 and then fix it that way.

@wangmiao1981 (Author)

Yes, this one is copied from SparkSubmitArguments, which is on the Scala side. I didn't find Main.java calling anything from SparkSubmitArguments in the Scala file, so it seems the Scala version is not used. I will debug more to find out the reason.

@srowen (Member)

srowen commented May 18, 2016

Is the fix not to just pull up the call to builder.buildCommand(env); in Main from line 86 to after line 60? It then does require rethinking a bit how the error path is handled, where it creates a builder for the 'help' command. But this would at least use the existing mechanism for handling arg-parsing errors.

CC @vanzin
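
A minimal sketch of that suggestion, for illustration only: SparkSubmitCommandBuilder and buildCommand(env) are the real launcher APIs, but the wrapper class and method below are hypothetical and not the actual Main.java error path.

import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch (would live in org.apache.spark.launcher to reach the
// package-private builder): pull the buildCommand() call up so that
// argument-parsing errors surface early, then reuse the existing builder
// machinery to produce the 'help' command instead of propagating the error.
class MainErrorPathSketch {
  static List<String> buildSubmitCommand(List<String> args) throws Exception {
    Map<String, String> env = new HashMap<>();
    try {
      return new SparkSubmitCommandBuilder(args).buildCommand(env);
    } catch (IllegalArgumentException e) {
      // Parse error: fall back to a builder for the 'help' command.
      return new SparkSubmitCommandBuilder(Arrays.asList("--help")).buildCommand(env);
    }
  }
}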

@SparkQA

SparkQA commented May 18, 2016

Test build #58751 has finished for PR 13163 at commit ce0d2c0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented May 18, 2016

Test build #58752 has finished for PR 13163 at commit 8d14bcf.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin (Contributor)

vanzin commented May 18, 2016

Sorry, this was me. But as others said, this is not the right fix. Here's a patch that does it:

diff --git a/launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java b/launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java
index 76897c4..a7c4179 100644
--- a/launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java
+++ b/launcher/src/main/java/org/apache/spark/launcher/SparkSubmitCommandBuilder.java
@@ -107,28 +107,37 @@ class SparkSubmitCommandBuilder extends AbstractCommandBuilder {

   SparkSubmitCommandBuilder(List<String> args) {
     this.allowsMixedArguments = false;
+    this.sparkArgs = new ArrayList<>();

     boolean isExample = false;
     List<String> submitArgs = args;
-    if (args.size() > 0 && args.get(0).equals(PYSPARK_SHELL)) {
-      this.allowsMixedArguments = true;
-      appResource = PYSPARK_SHELL;
-      submitArgs = args.subList(1, args.size());
-    } else if (args.size() > 0 && args.get(0).equals(SPARKR_SHELL)) {
-      this.allowsMixedArguments = true;
-      appResource = SPARKR_SHELL;
-      submitArgs = args.subList(1, args.size());
-    } else if (args.size() > 0 && args.get(0).equals(RUN_EXAMPLE)) {
-      isExample = true;
-      submitArgs = args.subList(1, args.size());
+    if (args.size() > 0) {
+      switch (args.get(0)) {
+        case PYSPARK_SHELL:
+          this.allowsMixedArguments = true;
+          appResource = PYSPARK_SHELL;
+          submitArgs = args.subList(1, args.size());
+          break;
+
+        case SPARKR_SHELL:
+          this.allowsMixedArguments = true;
+          appResource = SPARKR_SHELL;
+          submitArgs = args.subList(1, args.size());
+          break;
+
+        case RUN_EXAMPLE:
+          isExample = true;
+          submitArgs = args.subList(1, args.size());
+      }
+
+      OptionParser parser = new OptionParser();
+      parser.parse(submitArgs);
+      this.printInfo = parser.infoRequested;
+    } else {
+      this.printInfo = true;
     }

-    this.sparkArgs = new ArrayList<>();
     this.isExample = isExample;
-
-    OptionParser parser = new OptionParser();
-    parser.parse(submitArgs);
-    this.printInfo = parser.infoRequested;
   }

   @Override
@@ -147,7 +156,7 @@ class SparkSubmitCommandBuilder extends AbstractCommandBuilder {
     List<String> args = new ArrayList<>();
     SparkSubmitOptionParser parser = new SparkSubmitOptionParser();

-    if (!allowsMixedArguments) {
+    if (!allowsMixedArguments && !printInfo) {
       checkArgument(appResource != null, "Missing application resource.");
     }

@wangmiao1981 (Author)

wangmiao1981 commented May 18, 2016

@vanzin Thanks for your clarification. Let me learn from it and revise the change. '--help' is also broken. I am studying the flow of spark-submit and how it invokes the Scala logic to print out usage.

@vanzin (Contributor)

vanzin commented May 18, 2016

--help seems to also work with my patch.

@vanzin (Contributor)

vanzin commented May 18, 2016

BTW this should be a good opportunity to add a unit test to make sure --help works, for example.

@wangmiao1981 (Author)

@vanzin Sure. I will do it after I fully understand the logic. It's good to learn how spark-submit works. Thanks for your time!

wangmiao1981 force-pushed the submit branch 2 times, most recently from c125dd1 to 9034dd3 (May 18, 2016 19:13)
@SparkQA

SparkQA commented May 18, 2016

Test build #58810 has finished for PR 13163 at commit 9034dd3.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangmiao1981 (Author)

retest this please

@SparkQA

SparkQA commented May 18, 2016

Test build #58813 has finished for PR 13163 at commit 9034dd3.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangmiao1981 (Author)

retest this please

@vanzin (Contributor)

vanzin commented May 18, 2016

@wangmiao1981 while you wait for the tests to pass, could you try to add a unit test to SparkSubmitCommandBuilderSuite.java?

@wangmiao1981 (Author)

@vanzin Based on my understanding, the help message is printed to the console, and the launcher should not expect the help message as a String. So I will just expect no exception.

@vanzin (Contributor)

vanzin commented May 19, 2016

I'm not really sure I understand what you're saying, but it should be possible to write a unit test where you create a SparkSubmitCommandBuilder to execute --help and verify that it doesn't throw an exception like it did before this fix. You don't need to check the output.
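
A minimal sketch of such a test, modeled on the suite's existing style (the buildCommand(args, env) helper is assumed from SparkSubmitCommandBuilderSuite, imports such as org.junit.Test and java.util.* follow the suite's, and the method name is hypothetical):

// Hypothetical test sketch: --help should build a command rather than throw.
@Test
public void testHelpArgDoesNotThrow() throws Exception {
  SparkSubmitOptionParser parser = new SparkSubmitOptionParser();
  Map<String, String> env = new HashMap<>();
  List<String> helpArgs = Arrays.asList(parser.HELP);
  // Before the fix this threw IllegalArgumentException ("Missing application
  // resource."); with the fix it should return a runnable command.
  List<String> cmd = buildCommand(helpArgs, env);
  assertTrue(cmd.contains(parser.HELP));
}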

@wangmiao1981 (Author)

@vanzin Yes, that is what I meant and wanted to confirm with you. I will only check that no exception is thrown.

@SparkQA

SparkQA commented May 19, 2016

Test build #58826 has finished for PR 13163 at commit bfc4876.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@wangmiao1981 (Author)

@vanzin I will check and fix the existing unit test failure tonight.

[error] Test org.apache.spark.launcher.SparkSubmitCommandBuilderSuite.testExamplesRunner failed:

Thanks!

@SparkQA

SparkQA commented May 19, 2016

Test build #58832 has finished for PR 13163 at commit b257891.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented May 19, 2016

Test build #58852 has finished for PR 13163 at commit e2776eb.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@vanzin (Contributor)

vanzin commented May 19, 2016

retest this please

@SparkQA

SparkQA commented May 19, 2016

Test build #58873 has finished for PR 13163 at commit 9e90805.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

     SparkSubmitOptionParser parser = new SparkSubmitOptionParser();

-    if (!allowsMixedArguments) {
+    if (!allowsMixedArguments &!printInfo) {
@vanzin (Contributor)

this should be ... && !printInfo

@vanzin (Contributor)

vanzin commented May 19, 2016

Looks good, just a remaining nit.

@SparkQA

SparkQA commented May 19, 2016

Test build #58903 has finished for PR 13163 at commit 8b632e3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Map<String, String> env = new HashMap<>();
List<String> cmd = buildCommand(sparkSubmitArgs, env);

List<String> sparkEmptyArgs = Arrays.asList("");
@vanzin (Contributor)

Ah, there's a problem here. This line should be testing sparkEmptyArgs. And at that point the assert needs to be moved up a bit.

In fact, it's better to have two separate test methods (one for help, one for empty).

@wangmiao1981 (Author)

I added one more assert for sparkSubmitArgs and modified the message for sparkEmptyArgs. The local unit test passes.

List<String> cmd = buildCommand(sparkSubmitArgs, env);
assertTrue("--help should be contained in the final cmd.", cmd.contains(parser.HELP));

List<String> sparkEmptyArgs = Arrays.asList("");
@vanzin (Contributor)

This is still wrong; you're not testing sparkEmptyArgs at all!

If you break this into two different tests, you'll see what I mean, because sparkSubmitArgs won't be defined when you're supposed to be testing the empty args list.

(If that makes it clearer, call the first list helpArgs instead of sparkSubmitArgs; then it will be easier to see that you're using the wrong list in this call.)

@vanzin (Contributor)

Oh, and BTW, you should use Collections.emptyList(), because you don't really have an empty args list, but an args list with a single argument that happens to be an empty string.
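
For illustration, a sketch of the companion empty-args test with that change applied (again assuming the suite's buildCommand(args, env) helper and a java.util.Collections import; the method name is hypothetical):

// Hypothetical test sketch: a truly empty args list should still yield a
// SparkSubmit command (which prints usage) instead of throwing.
@Test
public void testEmptyArgsBuildsUsageCommand() throws Exception {
  Map<String, String> env = new HashMap<>();
  List<String> emptyArgs = Collections.emptyList();
  List<String> cmd = buildCommand(emptyArgs, env);
  assertTrue(cmd.contains("org.apache.spark.deploy.SparkSubmit"));
}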

@wangmiao1981 (Author)

Sorry for this obvious mistake! It is really a stupid mistake. Thanks for your time!

@SparkQA

SparkQA commented May 20, 2016

Test build #58914 has finished for PR 13163 at commit 79d2aa8.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented May 20, 2016

Test build #58937 has finished for PR 13163 at commit 2941e62.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


List<String> sparkEmptyArgs = Collections.emptyList();
cmd = buildCommand(sparkEmptyArgs, env);
assertTrue("org.apache.spark.deploy.SparkSubmit should be contained in the final cmd of empty input.",
@vanzin (Contributor)

nit: alignment is incorrect here. I'll fix during merge.

@vanzin (Contributor)

vanzin commented May 20, 2016

LGTM, merging to master / 2.0.

asfgit closed this in fe2fcb4 on May 20, 2016
asfgit pushed a commit that referenced this pull request on May 20, 2016
…rguments is specified

Author: wm624@hotmail.com <wm624@hotmail.com>

Closes #13163 from wangmiao1981/submit.

(cherry picked from commit fe2fcb4)
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>