[SPARK-45756][CORE] Support spark.master.useAppNameAsAppId.enabled
#43743
Conversation
Could you review this experimental feature, @viirya?
```scala
val appId = if (useAppNameAsAppId) {
  desc.name.toLowerCase().replaceAll("\\s+", "")
} else {
  newApplicationId(date)
}
```
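To illustrate the normalization above, here is a small standalone sketch of the same lowercase-and-strip-whitespace logic (the `derive` helper name is hypothetical, not part of Spark's API):

```scala
object AppIdFromName {
  // Mirrors the PR's logic: lowercase the app name and remove all whitespace.
  def derive(appName: String): String =
    appName.toLowerCase().replaceAll("\\s+", "")

  def main(args: Array[String]): Unit = {
    println(derive("Spark Pi"))       // prints "sparkpi"
    println(derive("My ETL Job 42"))  // prints "myetljob42"
  }
}
```

Note that two different app names such as `"Spark Pi"` and `"spark pi"` collapse to the same id, which is exactly the duplicate-id concern raised below.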
Hmm, how does Spark respond to a duplicate appId, if two submitted applications have the same app name?
Should we append `nextAppNumber` to the appId as a suffix, as usual?
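For context, the default standalone scheme combines a submission timestamp with a monotonically increasing counter. The sketch below is reconstructed from memory of `Master.newApplicationId`, so details may differ slightly from the actual Spark source:

```scala
import java.text.SimpleDateFormat
import java.util.Date

object DefaultAppId {
  // Sketch of the default standalone id scheme: "app-<yyyyMMddHHmmss>-<counter>".
  private val createDateFormat = new SimpleDateFormat("yyyyMMddHHmmss")
  private var nextAppNumber = 0

  def newApplicationId(submitDate: Date): String = {
    val appId = "app-%s-%04d".format(createDateFormat.format(submitDate), nextAppNumber)
    nextAppNumber += 1  // counter makes ids unique even for identical submit times
    appId
  }
}
```

The zero-padded counter is what guarantees uniqueness under the default scheme; the PR's name-based scheme drops that guarantee by design.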
This experimental design is for advanced users who maintain unique app names.
Looks good, especially since this is still an internal config.
Thank you, @viirya!
spark.master.useAppNameAsAppId.enabled
Oops. I used the umbrella JIRA id.
spark.master.useAppNameAsAppId.enabled
What is the behavior if this is turned on and there is a conflict? Will it result in application submission failure? Overwriting the running app's state with the new app (or vice versa)?
Thank you for the review, @mridulm.
Yes, it will. Nothing will work correctly in that case. It's the user's responsibility to keep the app name unique as an ID.
This feature gives users a lot of control. For simple examples,
Based on (1), (2) and (3), they can build their service very easily by linking jobs and logs (event/driver) effortlessly with their patterns.
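As one concrete illustration of "linking jobs and logs with their patterns": with a predictable appId, a surrounding service can compute Spark History Server URLs from the app name alone, without querying the Master first. The base URL below is hypothetical, and the normalization mirrors the PR's logic:

```scala
object AppLinks {
  // Hypothetical SHS endpoint; substitute your own deployment's host.
  private val shsBase = "https://shs.example.com/history"

  // With spark.master.useAppNameAsAppId.enabled=true, the appId is just the
  // normalized app name, so the link is computable from the name alone.
  def historyUrl(appName: String): String = {
    val appId = appName.toLowerCase().replaceAll("\\s+", "")
    s"$shsBase/$appId"
  }
}
```

A scheduler that submits `"Daily Report"` every night can thus hard-code the corresponding SHS link in its dashboard ahead of time.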
Why not continue to use the existing scheme (IIRC prefix + time; I am AFK, so can't confirm), but make the prefix customizable instead? Pushing responsibility to the user and documenting it as such would still result in hard-to-debug failure modes :-) I don't use standalone mode, so I'm not sure how common this can become, but something to think about.
@mridulm, it seems that you mean SPARK-45754 (Support
Let me give you the context, @mridulm. There is an umbrella JIRA, and most customizable configurations have been delivered already.
This is not answering my query, @dongjoon-hyun, as to why we are allowing explicitly setting an application id and introducing hard-to-debug modes, instead of relying on minimizing the issue.
To @mridulm: the above was the exact answer to your question. What I meant is that Apache Spark already provides more patterns, including
For the following question, that is the exact design goal of this PR, where an upper service system uses
In addition, I already answered with the examples. Let me reiterate that: you can generate an SHS link like Lastly, this configuration is introduced as
In a specific ecosystem, this can be designed away from being an issue (have a narrow-waist submission layer above Spark which ensures uniqueness, among other things), and so there is no risk to debugging in that ecosystem. Thoughts?
Sure, I'd love to provide more documentation on this because I want to advertise it. Currently, I'm revisiting the documentation on this. Does it meet your requirement, @mridulm?
Sounds good to me, thanks @dongjoon-hyun!
### What changes were proposed in this pull request?

This PR aims to support `spark.master.useAppNameAsAppId.enabled` as an experimental feature in Spark Standalone cluster.

### Why are the changes needed?

This allows the users to control the appID completely.

<img width="359" alt="Screenshot 2023-11-09 at 5 33 45 PM" src="https://github.com/apache/spark/assets/9700541/ad2b89ce-9d7d-4144-bd52-f29b94051103">

### Does this PR introduce _any_ user-facing change?

No.

### How was this patch tested?

Manual tests with the following procedure.

```
$ SPARK_MASTER_OPTS="-Dspark.master.useAppNameAsAppId.enabled=true" sbin/start-master.sh
$ bin/spark-shell --master spark://max.local:7077
```

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#43743 from dongjoon-hyun/SPARK-45756.

Authored-by: Dongjoon Hyun <dhyun@apple.com>
Signed-off-by: Dongjoon Hyun <dhyun@apple.com>
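For readers unfamiliar with how Spark declares such flags: internal configs like this are typically defined with Spark's `ConfigBuilder` DSL. The snippet below is only a sketch in that style, not the exact upstream definition; the doc string and version are assumptions:

```scala
// Sketch of an internal boolean config entry, in the style of
// org.apache.spark.internal.config; not the exact upstream definition.
val MASTER_USE_APP_NAME_AS_APP_ID =
  ConfigBuilder("spark.master.useAppNameAsAppId.enabled")
    .internal()
    .doc("(Experimental) If true, the Spark master uses the user-provided " +
      "appName (lowercased, with whitespace removed) as the appId.")
    .version("4.0.0")
    .booleanConf
    .createWithDefault(false)
```

Marking the entry `.internal()` keeps it out of the public configuration documentation, matching the reviewers' note that this remains an internal, experimental config.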