[SPARK-19522] Fix executor memory in local-cluster mode #16975
@@ -466,7 +466,7 @@ object SparkSubmit extends CommandLineUtils {
       // Other options
       OptionAssigner(args.executorCores, STANDALONE | YARN, ALL_DEPLOY_MODES,
         sysProp = "spark.executor.cores"),
-      OptionAssigner(args.executorMemory, STANDALONE | MESOS | YARN, ALL_DEPLOY_MODES,
+      OptionAssigner(args.executorMemory, ALL_CLUSTER_MGRS, ALL_DEPLOY_MODES,

Is the change in As far as I understand, the last value in the local-cluster master is the amount of memory the worker has available; you may, for whatever reason, want to run executors with less than that, which your change doesn't seem to allow.

If this were the only change then specifying

Yeah, I thought about this long and hard but I just couldn't come up with a case where you would possibly want the worker size to be different from executor size in local-cluster mode. If you want to launch 5 workers (2GB), each with 2 executors (1GB), then you might as well just launch 10 executors (1GB) or run real standalone mode locally. I think it's better to fix the out-of-the-box case than to try to cover all potentially non-existent corner cases.

Well, it would make (e.g. the same command line with a different master would behave differently - you'd fall back to having 1g of memory for executors instead of whatever was defined in the

(Anyway, either way is probably fine, so go with your judgement. It just seems like a lot of code in SparkContext just to support that use case.)

The inconsistency is already inherent with the parameters in

also we're talking about a net addition of 7 LOC in

         sysProp = "spark.executor.memory"),
       OptionAssigner(args.totalExecutorCores, STANDALONE | MESOS, ALL_DEPLOY_MODES,
         sysProp = "spark.cores.max"),
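
To see why widening the mask matters, here is a minimal sketch of bit-flag matching in the style the diff suggests. The flag values, the `applies` helper, and the assumption that a `local-cluster[...]` master is classified under a `LOCAL` flag are illustrative and not taken from this diff.

```scala
// Illustrative flags only: the concrete values are assumptions, not copied from SparkSubmit.
object ClusterManagerFlags {
  val YARN = 1
  val STANDALONE = 2
  val MESOS = 4
  val LOCAL = 8
  val ALL_CLUSTER_MGRS: Int = YARN | STANDALONE | MESOS | LOCAL

  /** Hypothetical helper: true when `current` is one of the managers encoded in `allowed`. */
  def applies(allowed: Int, current: Int): Boolean = (allowed & current) != 0
}
```

Under these assumptions, `applies(STANDALONE | MESOS | YARN, LOCAL)` is false, so an option guarded by the old mask would be skipped for a local master, while `applies(ALL_CLUSTER_MGRS, LOCAL)` is true.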
Could you at least change this so that `spark.executor.memory` takes precedence if it's set? Then both use cases are possible. (Maybe someone is crazy enough to be trying dynamic allocation in local-cluster mode, or something else...)

sure
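
The precedence rule agreed on above could look roughly like the sketch below. It is not the code from this PR: `LocalClusterMemory`, `executorMemoryMb`, and the simplified size parser are hypothetical, and the master is assumed to have the form `local-cluster[workers, coresPerWorker, memoryPerWorkerMB]`.

```scala
// Hypothetical helper, not Spark's API: pick the executor memory for a given master.
object LocalClusterMemory {
  // Matches masters such as local-cluster[2,1,1024] (assumed: workers, cores, MB per worker).
  private val LocalCluster =
    """local-cluster\[\s*([0-9]+)\s*,\s*([0-9]+)\s*,\s*([0-9]+)\s*\]""".r

  /** An explicit spark.executor.memory wins; otherwise use the worker size, else the default. */
  def executorMemoryMb(
      master: String,
      conf: Map[String, String],
      defaultMb: Int = 1024): Int =
    conf.get("spark.executor.memory").map(parseMb).getOrElse {
      master match {
        case LocalCluster(_, _, workerMb) => workerMb.toInt
        case _ => defaultMb
      }
    }

  // Simplified size parsing for the sketch: accepts "2g", "512m", or a plain number of MB.
  private def parseMb(s: String): Int = {
    val v = s.trim.toLowerCase
    if (v.endsWith("g")) v.dropRight(1).toInt * 1024
    else if (v.endsWith("m")) v.dropRight(1).toInt
    else v.toInt
  }
}
```

With this shape, `local-cluster[5,2,2048]` yields 2048 MB executors out of the box, while setting `spark.executor.memory` to `1g` on the same master yields 1024 MB, so both use cases from the discussion remain possible.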