[SPARK-9270] [PySpark] allow --name option in pyspark #7610
Conversation
That seems fine if it still sets the app name as desired and improves consistency.
Test build #38189 has finished for PR 7610 at commit
retest this please.
This is not critical, but not great either; it points at some inconsistency in the code somewhere. From looking at SparkSubmit.scala, it should work, but I haven't explicitly tested it. Is that true?
Test build #38224 has finished for PR 7610 at commit
@vanzin thank you for your comments!
So if you run the following cmd-
It invokes the following cmd b/c
In summary,
So the Python part is OK but now does
@srowen In short,
Test build #38232 has finished for PR 7610 at commit
Hmm, Jenkins unstable?
Hmm, it may not work, but I don't think that's the cause. With your changes, that line should never be reached when starting the shell. What I think is happening is:
So it seems like an ordering issue in SparkSubmit.scala. In any case, it doesn't seem important enough to change just for this particular edge case. The change LGTM.
Test build #38256 has finished for PR 7610 at commit
@@ -82,4 +82,4 @@ fi

 export PYSPARK_DRIVER_PYTHON
 export PYSPARK_DRIVER_PYTHON_OPTS
-exec "$SPARK_HOME"/bin/spark-submit pyspark-shell-main "$@"
+exec "$SPARK_HOME"/bin/spark-submit pyspark-shell-main --name "PySparkShell" "$@"
Should we support bin/pyspark --name MyName? spark-shell supports that.
--name MyName works. The final command will be like this:
spark-submit pyspark-shell-main --name "PySparkShell" --name "MyName" <other args>
Then MyName takes effect since it comes later than PySparkShell.
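The "later --name wins" behavior described above can be illustrated with a minimal sketch. This is not Spark's actual parser (SparkSubmit.scala parses arguments in Scala); it is a hypothetical shell loop that, like a typical argument parser, overwrites the stored value each time the flag appears, so the last occurrence takes effect:

```shell
# Hypothetical sketch: each --name overwrites the previous value,
# so the last --name on the command line wins.
pick_app_name() {
  name=""
  while [ $# -gt 0 ]; do
    case "$1" in
      --name) name="$2"; shift 2 ;;
      *) shift ;;
    esac
  done
  echo "$name"
}

pick_app_name --name "PySparkShell" --name "MyName"  # prints: MyName
```

This is why the wrapper script can safely prepend a default --name "PySparkShell": any user-supplied --name comes after it and overrides it.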
Just to be clear, this is how --name is supported in spark-shell too (#7512). I am following the pattern introduced by that patch.
Merged to master. Thanks!
This is a continuation of #7512, which added the --name option to spark-shell. This PR adds the same option to pyspark. Note that --conf spark.app.name on the command line has no effect in spark-shell and pyspark; instead, --name must be used. This is in fact an inconsistency with spark-sql, which doesn't accept the --name option while it does accept --conf spark.app.name. I am not fixing this inconsistency in this PR. IMO, only one of --name and --conf spark.app.name is needed, not both. But since I cannot decide which to choose, I am not making any change here.
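The overall argument flow can be sketched with a stub standing in for spark-submit (the stub just echoes its arguments so the ordering is visible; this is an illustration, not runnable against a real Spark install):

```shell
# Stub standing in for spark-submit: echoes its args so the order is visible.
spark_submit_stub() { echo "$@"; }

# Mirrors the bin/pyspark change in this PR: the default app name comes
# first, then the user's arguments, so a user --name overrides the default.
pyspark_stub() {
  spark_submit_stub pyspark-shell-main --name "PySparkShell" "$@"
}

pyspark_stub --name "MyName"
# prints: pyspark-shell-main --name PySparkShell --name MyName
```

Running bin/pyspark with no --name at all leaves only the default, so the shell still gets the app name "PySparkShell" as before.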