[SPARK-15360][Spark-Submit]Should print spark-submit usage when no arguments is specified #13163
Conversation
This duplicates all of the usage info in SparkSubmitArguments. We definitely don't want that.
I agree with Sean; this copy-paste is really not a good way to handle this issue. We should find the root cause of why this broke compared to 1.6 and then fix it that way.
Yes, this one is copied from SparkSubmitArguments. SparkSubmitArguments is in the Scala project. I didn't find Main.java calling anything from SparkSubmitArguments in the Scala file, which suggests the Scala version is not being used. I will debug more to find out the reason.
Is the fix not to just pull up the call to …? CC @vanzin
Test build #58751 has finished for PR 13163 at commit
Test build #58752 has finished for PR 13163 at commit
Sorry, this was me. But as others said, this is not the right fix. Here's a patch that does it: …
@vanzin Thanks for your clarification. Let me learn and revise the change. '--help' is also broken. I am learning the flow of spark-submit and how to invoke the Scala logic to print out the usage.
BTW, this should be a good opportunity to add a unit test to make sure …
@vanzin Sure. I will do it after I fully understand the logic. It's good to learn how spark-submit works. Thanks for your time!
Force-pushed from c125dd1 to 9034dd3.
Test build #58810 has finished for PR 13163 at commit
retest this please
Test build #58813 has finished for PR 13163 at commit
retest this please
@wangmiao1981 while you wait for tests to pass, could you try to add a unit test to SparkSubmitCommandBuilderSuite.java?
@vanzin Based on my understanding, the help message is printed on the console, and the launcher should not expect a help-message String. So should I just expect no exception?
I'm not really sure I understand what you're saying; but it should be possible to write a unit test where you create a builder with an empty argument list and just check that no exception is thrown.
@vanzin Yes, that is what I meant, and I wanted to confirm it with you. I only check that no exception is thrown.
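For concreteness, a minimal sketch of what such a test might look like, assuming the suite's existing buildCommand(List, Map) helper; the method name is illustrative:

```java
@Test
public void testEmptyArgumentsDoNotThrow() throws Exception {
  Map<String, String> env = new HashMap<>();
  // Building the command for an empty argument list should not throw;
  // reaching the end of the method is the whole assertion.
  buildCommand(Collections.<String>emptyList(), env);
}
```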
Test build #58826 has finished for PR 13163 at commit
@vanzin I will check and fix the existing unit test failure tonight: `[error] Test org.apache.spark.launcher.SparkSubmitCommandBuilderSuite.testExamplesRunner failed`. Thanks!
Test build #58832 has finished for PR 13163 at commit
Test build #58852 has finished for PR 13163 at commit
retest this please
Test build #58873 has finished for PR 13163 at commit
```diff
  SparkSubmitOptionParser parser = new SparkSubmitOptionParser();
  ...
- if (!allowsMixedArguments) {
+ if (!allowsMixedArguments &!printInfo) {
```
this should be `... && !printInfo`
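For clarity, the corrected guard (a sketch of just this line; the surrounding body is elided):

```java
// '&' on two booleans is a non-short-circuiting AND and does compile, but
// '&&' is the idiomatic choice and skips evaluating printInfo when the
// first operand is already false.
if (!allowsMixedArguments && !printInfo) {
  // ... existing argument validation ...
}
```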
Looks good, just a remaining nit.
Test build #58903 has finished for PR 13163 at commit
```java
Map<String, String> env = new HashMap<>();
List<String> cmd = buildCommand(sparkSubmitArgs, env);
...
List<String> sparkEmptyArgs = Arrays.asList("");
```
Ah, there's a problem here. This line should be testing sparkEmptyArgs. And at that point the assert needs to be moved up a bit.
In fact, it's better to have two separate test methods (one for help, one for empty).
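A rough sketch of that split, assuming the suite's buildCommand(List, Map) helper (method names are illustrative, not the merged code):

```java
@Test
public void testHelpFlag() throws Exception {
  SparkSubmitOptionParser parser = new SparkSubmitOptionParser();
  Map<String, String> env = new HashMap<>();
  List<String> cmd = buildCommand(Arrays.asList(parser.HELP), env);
  assertTrue("--help should be contained in the final cmd.", cmd.contains(parser.HELP));
}

@Test
public void testEmptyArgs() throws Exception {
  Map<String, String> env = new HashMap<>();
  // A genuinely empty list; Arrays.asList("") would instead pass a single
  // empty-string argument (see the Collections.emptyList() note below).
  List<String> cmd = buildCommand(Collections.<String>emptyList(), env);
  assertTrue("SparkSubmit should be in the final cmd for empty input.",
    cmd.contains("org.apache.spark.deploy.SparkSubmit"));
}
```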
I added one more assert for sparkSubmitArgs and modified the message for sparkEmptyArgs. The local unit test works fine.
```java
List<String> cmd = buildCommand(sparkSubmitArgs, env);
assertTrue("--help should be contained in the final cmd.", cmd.contains(parser.HELP));
...
List<String> sparkEmptyArgs = Arrays.asList("");
```
This is still wrong; you're not testing sparkEmptyArgs at all!
If you break this into two different tests, you'll see what I mean, because sparkSubmitArgs won't be defined when you're supposed to be testing the empty args list.
(If that makes it clearer, call the first list helpArgs instead of sparkSubmitArgs; then it will be easier to see that you're using the wrong list in this call.)
Oh, and BTW, you should use Collections.emptyList(), because you don't really have an empty args list, but an args list with a single argument that happens to be an empty string.
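The difference in miniature (plain Java, just to illustrate the point):

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class EmptyArgsDemo {
  public static void main(String[] args) {
    List<String> oneEmptyString = Arrays.asList("");    // size() == 1: a single "" argument
    List<String> trulyEmpty = Collections.emptyList();  // size() == 0: no arguments at all
    System.out.println(oneEmptyString.size() + " vs " + trulyEmpty.size());  // prints "1 vs 0"
  }
}
```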
Sorry for this obvious mistake! It is really a stupid mistake. Thanks for your time!
Test build #58914 has finished for PR 13163 at commit
Test build #58937 has finished for PR 13163 at commit
```java
List<String> sparkEmptyArgs = Collections.emptyList();
cmd = buildCommand(sparkEmptyArgs, env);
assertTrue("org.apache.spark.deploy.SparkSubmit should be contained in the final cmd of empty input.",
  cmd.contains("org.apache.spark.deploy.SparkSubmit"));
```
nit: alignment is incorrect here. I'll fix during merge.
LGTM, merging to master / 2.0.
[SPARK-15360][Spark-Submit] Should print spark-submit usage when no arguments is specified
Author: wm624@hotmail.com <wm624@hotmail.com>
Closes #13163 from wangmiao1981/submit.
(cherry picked from commit fe2fcb4)
Signed-off-by: Marcelo Vanzin <vanzin@cloudera.com>
What changes were proposed in this pull request?
In 2.0, ./bin/spark-submit doesn't print out usage; instead, it raises an exception. This PR adds exception handling in Main.java: when the exception is thrown and no additional arguments were given, the usage is printed.
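A rough sketch of that handling, illustrative only and not the exact merged diff (it assumes the launcher's existing --usage-error option, which makes SparkSubmit print usage and exit with an error):

```java
// Inside Main.java: building the command for an empty argument list throws
// IllegalArgumentException; instead of letting it escape, fall back to a
// builder invocation that asks SparkSubmit itself to print the usage text.
AbstractCommandBuilder builder;
try {
  builder = new SparkSubmitCommandBuilder(args);
} catch (IllegalArgumentException e) {
  List<String> help = new ArrayList<>();
  help.add(new SparkSubmitOptionParser().USAGE_ERROR);  // "--usage-error"
  builder = new SparkSubmitCommandBuilder(help);
}
```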
How was this patch tested?
Manually tested:

```
./bin/spark-submit
Usage: spark-submit [options] <app jar | python file> [app arguments]
Usage: spark-submit --kill [submission ID] --master [spark://...]
Usage: spark-submit --status [submission ID] --master [spark://...]
Usage: spark-submit run-example [options] example-class [example args]

Options:
  --master MASTER_URL         spark://host:port, mesos://host:port, yarn, or local.
  --deploy-mode DEPLOY_MODE   Whether to launch the driver program locally ("client") or
                              on one of the worker machines inside the cluster ("cluster")
                              (Default: client).
  --class CLASS_NAME          Your application's main class (for Java / Scala apps).
  --name NAME                 A name of your application.
  --jars JARS                 Comma-separated list of local jars to include on the driver
                              and executor classpaths.
  --packages                  Comma-separated list of maven coordinates of jars to include
                              on the driver and executor classpaths. Will search the local
                              maven repo, then maven central and any additional remote
                              repositories given by --repositories. The format for the
                              coordinates should be groupId:artifactId:version.
```