[SPARK-14082][MESOS] Enable GPU support with Mesos #14644
Conversation
Does Mesos have a general node labeling mechanism like YARN that one could use to choose machines with a certain property? It seems OK to also support a Mesos-specific property under spark.mesos.* if this is broken out separately in Mesos, but just wondering if that provides a simpler and more generic solution.
Test build #63776 has finished for PR 14644 at commit
Test build #63777 has finished for PR 14644 at commit
Test build #63792 has finished for PR 14644 at commit
@srowen Mesos supports node labels as well (which is how constraints are implemented in the Spark framework). However, GPUs are implemented as a resource, since we want to account for the number of GPUs instead of just placing a task there. As for the config name, I just picked that to begin with. I was also thinking we should consider a generic config name (spark.gpus?) as I believe it could be reused. But I wasn't sure how we want to account for this yet, as GPUs are quite different from CPUs (Mesos currently just does an integer number of GPUs, with no sharing or topology information yet). Do you have suggestions?
OK, that makes sense. I think a property under spark.mesos makes sense right now.
@tnachen does this resolve SPARK-14082? /cc @mgummelt
@@ -103,6 +103,7 @@ private[spark] class MesosCoarseGrainedSchedulerBackend(
   private val stateLock = new ReentrantLock

   val extraCoresPerExecutor = conf.getInt("spark.mesos.extra.cores", 0)
+  val maxGpus = conf.getInt("spark.mesos.gpus.max", 0)
I know in a previous iteration of this patch we were using the config variable spark.mesos.gpu.enabled=true instead of spark.mesos.gpu.max=XX. Why did this change? If I need to set a max, how do I know how big to make the value? Do the Spark configuration scripts allow me to run bash commands to programmatically count the total number of GPUs on the machine, if that's my desired max?
My thought was that with only a boolean flag, a Spark job either uses all GPUs from a host or none, which means different GPU devices can't be shared by different jobs. By specifying a limit, a job at least has the ability to say how many GPUs it should grab per node. Thoughts?
Yeah, it makes sense. The only additional question I had was whether there's a way to auto-discover the number of GPUs on the machine and do some math on it inside the config, i.e. does the config allow bash syntax or something, so I can inspect /dev/ and count the number of GPUs installed? I'm picturing a scenario where I'd like to deploy the same config to a bunch of hosts, but have the hosts autodiscover this value themselves.
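For illustration, here is a minimal Scala sketch of the kind of autodiscovery being asked about, assuming NVIDIA device nodes named nvidia0, nvidia1, ... under /dev (this helper is hypothetical and not part of the patch). Spark properties files are plain key-value pairs and don't evaluate shell, so a value like this would have to be computed by a wrapper script before submission:

```scala
import java.io.File

object GpuCount {
  // Count device nodes matching nvidia<N> under /dev; returns 0 if the
  // directory can't be listed. Purely illustrative autodiscovery.
  def countGpus(devDir: String = "/dev"): Int =
    Option(new File(devDir).listFiles())
      .map(_.count(_.getName.matches("nvidia\\d+")))
      .getOrElse(0)

  def main(args: Array[String]): Unit =
    println(countGpus()) // e.g. feed this into a generated Spark config
}
```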
@klueska I think you do not need to autodiscover anything; the concept is similar to cores.max in the scheduler. Thoughts?
@tnachen I think there should be some logic checking the current total against the configured max GPUs, as in the case of cpusMax, and I don't see any. I expect offers to be split; in that case we need to check the sum of the assigned GPUs against the max, right?
Am I missing something?
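For concreteness, a minimal sketch of the accounting being suggested here, with illustrative names that may not match the final patch: keep a running total across offers so the cap holds however the offers are split.

```scala
// Illustrative GPU accounting across (possibly split) resource offers.
class GpuAccounting(maxGpus: Int) {
  private var totalGpusAcquired = 0

  // How many GPUs to take from a single offer so that the running total
  // never exceeds the configured aggregate maximum.
  def gpusToTake(offeredGpus: Int): Int = {
    val remaining = math.max(0, maxGpus - totalGpusAcquired)
    val taken = math.min(remaining, offeredGpus)
    totalGpusAcquired += taken
    taken
  }
}
```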
I just looked up what the semantics of "cores.max" are, and they seem to be slightly different from what this patch includes:
http://spark.apache.org/docs/latest/running-on-mesos.html#coarse-grained
It says that if "cores.max" is not set, Spark will just blindly accept all CPU offers it is given. In the current patch, no GPU resources will be accepted if "gpus.max" is not set.
Which sounds sensible to me, since GPUs are not usually required to run your Spark job. Also, cores.max is an aggregate max, whereas gpus.max in the current patch is a per-node max. I think I will change this to work the way cores.max does, but default to 0.
I'm not saying it's not sensible. I'm just trying to figure out what I can do to tell it to accept all GPUs in an offer (which is what I want in my setup). Some offers have more than others, and it feels weird to just pick a really big number to ensure that I get them all.
I see; in this case it's the same semantics as cores.max, so I think using a really big number seems right to me.
@tnachen I think it is only a threshold for each individual offer, not truly per node. You may get multiple offers for GPUs from the same node, correct? From what I see, this PR does not do any counting of how many GPUs have been assigned per node so far.
I just tested this against the newest GPU support in Mesos 1.0.1 and everything seems to work as expected. My only question is the use of
Tim, please file a JIRA!
Test build #65700 has finished for PR 14644 at commit
Test build #65708 has finished for PR 14644 at commit
Test build #65775 has finished for PR 14644 at commit
@klueska Just updated the patch, and I think it's using the right semantics now: it has a global GPU max, just like cores. Can you try it out?
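For reference, a hedged sketch of what setting the new aggregate limit might look like from application code (the master URL, app name, and value of 8 are placeholders; spark.mesos.gpus.max is the property this patch introduces):

```scala
import org.apache.spark.{SparkConf, SparkContext}

object GpuJob {
  def main(args: Array[String]): Unit = {
    // Cap this job at 8 GPUs in aggregate across the cluster, mirroring
    // how spark.cores.max caps CPU cores; the default of 0 requests none.
    val conf = new SparkConf()
      .setMaster("mesos://zk://zk-host:2181/mesos") // placeholder master URL
      .setAppName("gpu-example")
      .set("spark.mesos.gpus.max", "8")
    val sc = new SparkContext(conf)
    // ... job body ...
    sc.stop()
  }
}
```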
2,3: we don't fail if you ask for more GPUs, since it's not a hard requirement but simply a max, just like how cores.max works. I didn't add a required-amount setting, but we can certainly add it in the future.
...so if you ask for 1 GPU, you may only get 0?
Yeah, the default GPU requirement I have is 0 (cores per executor/node is 1).
Test build #66160 has finished for PR 14644 at commit
Test build #66613 has finished for PR 14644 at commit
Merged to master
## What changes were proposed in this pull request?
Enable GPU resources to be used when running coarse grain mode with Mesos.
## How was this patch tested?
Manual test with GPU.
Author: Timothy Chen <tnachen@gmail.com>
Closes apache#14644 from tnachen/gpu_mesos.
We have some servers running 8 GPUs on Mesos. I would like to run Spark on them, but I need to be able to allocate a GPU per map phase from Spark. On Hadoop 3.0 you can do spark.yarn.executor.resource.yarn.io/gpu. I have a Spark job that receives a list of files to process; each map in Spark should call a C program that reads a chunk of the list and processes it on the GPU. For this I need Spark to recognize the GPU allocated by Mesos (i.e. "GPU0 is yours"), and of course Mesos needs to mark that GPU as used. With gpus.max this is not possible.