[SPARK-19373][MESOS] Base spark.scheduler.minRegisteredResourceRatio on registered cores rather than accepted cores #17045
…ather than acquired cores in the Mesos Coarse Grained Scheduler
  override def sufficientResourcesRegistered(): Boolean = {
-   totalCoresAcquired >= maxCores * minRegisteredRatio
+   totalCoreCount.get >= maxCoresOption.getOrElse(0) * minRegisteredRatio
This is the only substantive change.
Just to clarify: totalCoreCount holds the total number of cores in the cluster registered by executors that have connected back to the scheduler. So as soon as all the executors are connected you can start the tasks, instead of gradually registering executors and running tasks on them as they arrive.
Yep.
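For readers following the thread, the difference between the two checks can be sketched in isolation as below. This is a simplified, hypothetical stand-in (the `RegistrationGate` class and its `acceptOffer` / `registerExecutor` helpers are invented for illustration and are not Spark's scheduler backend); only the names `totalCoresAcquired`, `totalCoreCount`, `maxCoresOption` and `minRegisteredRatio` come from the diff above. For simplicity both checks use `maxCoresOption.getOrElse(0)`, whereas the original line used `maxCores`.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical, simplified model of the readiness check; not Spark's actual class.
class RegistrationGate(maxCoresOption: Option[Int], minRegisteredRatio: Double) {
  // Cores reported by executors that have connected back to the scheduler.
  val totalCoreCount = new AtomicInteger(0)
  // Cores accepted from resource offers; their executors may not be running yet.
  var totalCoresAcquired: Int = 0

  // Illustrative helpers (assumptions, not Spark APIs).
  def acceptOffer(cores: Int): Unit = totalCoresAcquired += cores
  def registerExecutor(cores: Int): Unit = totalCoreCount.addAndGet(cores)

  // Old behaviour: ready as soon as enough cores have been accepted.
  def sufficientByAcquired: Boolean =
    totalCoresAcquired >= maxCoresOption.getOrElse(0) * minRegisteredRatio

  // New behaviour: ready only once enough cores have actually registered.
  def sufficientByRegistered: Boolean =
    totalCoreCount.get >= maxCoresOption.getOrElse(0) * minRegisteredRatio
}
```

With a limit of 8 cores and a ratio of 1.0, accepting an offer for 8 cores satisfies the old check immediately, while the new check stays false until executors holding those 8 cores have registered.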
Test build #73370 has finished for PR 17045 at commit
Test build #73371 has finished for PR 17045 at commit
Please avoid using "fix" as the description in a PR -- it doesn't tell us anything substantive about the nature of the problem or its resolution, so any future reviewing of commit messages will require digging into the actual committed code to get even a minimal idea of what was done.
@markhamstra Updated
thanks
Test build #73441 has finished for PR 17045 at commit
LGTM. I ran it locally and it works fine.
@srowen Can we get a merge? This is a bugfix, so it potentially belongs in all supported branches (1.6, 2.0, 2.1, master). It's not a major bug, though, so I'll leave the backport decisions up to you.
Merged to master. The cherry-pick wasn't clean into 2.1, and it wasn't entirely trivial. Rather than try to fix and get it wrong, if you'd care to open a PR against 2.1 I can merge that back and see how far that gets.
…on registered cores rather than accepted cores
See JIRA
Unit tests, Mesos/Spark integration tests
cc skonto susanxhuynh
Author: Michael Gummelt <mgummelt@mesosphere.io>
Closes apache#17045 from mgummelt/SPARK-19373-registered-resources.
…on registered cores rather than accepted cores
See JIRA
Unit tests, Mesos/Spark integration tests
cc skonto susanxhuynh
Author: Michael Gummelt <mgummelt@mesosphere.io>
Closes #17045 from mgummelt/SPARK-19373-registered-resources.
## What changes were proposed in this pull request?
(Please fill in changes proposed in this fix)
## How was this patch tested?
(Please explain how this patch was tested. E.g. unit tests, integration tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise, remove this)
Please review http://spark.apache.org/contributing.html before opening a pull request.
Author: Michael Gummelt <mgummelt@mesosphere.io>
Closes #17129 from mgummelt/SPARK-19373-registered-resources-2.1.
What changes were proposed in this pull request?
See JIRA
How was this patch tested?
Unit tests, Mesos/Spark integration tests
cc @skonto @susanxhuynh