[SPARK-20992][Scheduler] Add support for Nomad as a scheduler backend #18209
Conversation
I don't believe we'd accept a change this big into the core project. This isn't a widely used scheduler AFAICT (I had never heard of it). If possible, it can begin life as an external add-on.
Test build #3782 has finished for PR 18209 at commit
I very much appreciate this effort. We have been using Nomad for a year now and it has proven to be a very well-designed and robust cluster manager. Currently we use Nomad to deploy a standalone Spark cluster, which then blocks a lot of resources on Nomad just to be ready for jobs to be submitted. If we were able to submit jobs directly to Nomad, it could heavily improve our resource utilization. Thanks a lot for the contribution @barnardb; we will definitely look into it and give it a try.
Hey @srowen. Nomad is an unusual product for HashiCorp in that we have had tremendous interest from the top of the funnel at an early stage (due in part to the success of our other products). This PR is the result of a first-class internal initiative that was driven by a number of top-tier customers. The majority of the changes are unit tests and documentation, and most of the complexity is isolated from the Spark code base. We fully expect, and have planned for, the onus of maintenance to be on us. We are committed to the integration and will dedicate resources to resolve issues quickly as they arise.
The code should also just live outside Spark if that's true. If there are SPIs that need to be established to make that happen then that is what should be proposed. I know this has been rejected in the past though. |
[Mesos coarse-grained mode](running-on-mesos.html#mesos-run-modes).
[standalone mode](spark-standalone.html), [YARN mode](running-on-yarn.html),
[Mesos coarse-grained mode](running-on-mesos.html#mesos-run-modes), and
[Nomad mode](running-on-nomade.html#dynamic-allocation-of-executors).
It should be `running-on-nomad.html`, shouldn't it?
+1 to this. Nomad is our chosen scheduler and adding Nomad Spark support would be a huge benefit, adding additional flexibility to the great Spark project. Thanks to @barnardb for the great work on this.
All: it's not relevant whether Nomad is a good piece of software or not, or whether you use it. I do not believe it makes sense to make the project support this code. Similar things have been rejected in the past. Please focus on what SPIs might enable you to support this externally.
+1 to these efforts.
Nomad is a great scheduler; it solves a lot of scaling issues that most schedulers out there today cannot. I have been around the open source community a bit, and I can say that if support for something isn't merged directly into a project and is instead part of some external add-on, supporting it will be a nightmare and support will probably eventually disappear, since there is most likely not going to be a full-time person monitoring PRs to see if a change will break support for this "add-on". Given the code is of quality, I am in favor of merging this PR into Spark.
I don't see why Spark should have native Mesos support but not native Nomad support; it doesn't make sense. We heavily rely on Nomad for all ephemeral workload scheduling, and if this doesn't go into the trunk, people will always have to follow a cumbersome patching (or just a "don't run the real thing, run this instead") process.

I think it's worth noting why we can't just run Mesos: when we run a Spark workload in the cloud, we provision tens of thousands of VMs and immediately want to start using them for a short burst of work. When we're done, it all goes away. Mesos does not handle this sort of extreme elasticity very well, but Nomad excels at it. We've benchmarked Nomad at scheduling nearly 2,500 containers per second, which is crazy fast compared to both Mesos and Kubernetes.

It's a win for everyone when there's more choice, especially since I'm sure HashiCorp will support the ongoing efforts to support and maintain this integration. Thanks for the hard work on this, @barnardb. We tested it thoroughly and found it great for our needs. Would love to see this pushed through.
Thanks very much @barnardb! We've been testing this patch extensively and it seems to be working great for us; it solves the problem much better than the workaround we had built ourselves while waiting for a patch like this. I'd love to see this get support!
The next one to add is probably Kubernetes. Even Spark on Kubernetes is going through this cycle of being maintained as a separate project first. Adding a new scheduler has huge overhead. It is not the initial development that's the issue, but the subsequent testing, releases, etc. In the past, it has almost always been bugs in scheduler integration that block a Spark release.
Given that at Facebook we use our own in-house scheduler, I see why people would want their scheduler implementations added right into the Spark codebase as first-class citizens. Like @srowen said, this should not be part of the Spark codebase. Many schedulers will be developed in the future, and if we keep adding them to Spark, it will be a lot of maintenance burden. Mesos and YARN have their place in the codebase given that they were there in the early days of the project and are widely used in deployments (not saying that this is good or bad). This PR makes things worse for the future by taking the easier route of adding yet another scheduler to the core.

I would recommend adding interfaces in Spark code to abstract things out, so that plugging in a new scheduler would be as simple as adding a jar to the classpath. As a matter of fact, we are currently able to use Spark with our own scheduler by extending the existing interfaces. The only patching we have to do is 3-4 lines, which is not a huge pain during upgrades. It seems like you need more than the current interfaces. Having said this, I do appreciate the effort put into this PR.
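For context, Spark does already ship one such hook: the `ExternalClusterManager` trait (SPARK-13904), discovered at runtime via Java's `ServiceLoader`. A minimal sketch of plugging a custom scheduler in through it might look like the following. The trait and its four methods are Spark's real API; `MySchedulerClusterManager`, the `myscheduler://` URL scheme, and the omitted backend are hypothetical, and note that because `TaskSchedulerImpl` is `private[spark]`, a real plugin today has to live under the `org.apache.spark` namespace, which is part of why a proper public SPI (SPARK-19700) keeps coming up:

```scala
// Placed under org.apache.spark so the private[spark] TaskSchedulerImpl is
// visible — a limitation a public plugin SPI would need to remove.
package org.apache.spark.scheduler.example

import org.apache.spark.SparkContext
import org.apache.spark.scheduler.{ExternalClusterManager, SchedulerBackend, TaskScheduler, TaskSchedulerImpl}

// Hypothetical cluster-manager plugin. Spark discovers it through a
// META-INF/services/org.apache.spark.scheduler.ExternalClusterManager
// resource file listing this class name.
class MySchedulerClusterManager extends ExternalClusterManager {

  // Claim master URLs of the (hypothetical) form "myscheduler://host:port".
  override def canCreate(masterURL: String): Boolean =
    masterURL.startsWith("myscheduler://")

  override def createTaskScheduler(sc: SparkContext, masterURL: String): TaskScheduler =
    new TaskSchedulerImpl(sc)

  override def createSchedulerBackend(
      sc: SparkContext,
      masterURL: String,
      scheduler: TaskScheduler): SchedulerBackend =
    // This is where the cluster-manager-specific logic (e.g. talking to
    // Nomad's API to launch executors) would live; omitted in this sketch.
    ???

  override def initialize(scheduler: TaskScheduler, backend: SchedulerBackend): Unit =
    scheduler.asInstanceOf[TaskSchedulerImpl].initialize(backend)
}
```

Packaged in a jar together with that `META-INF/services` registration file on the driver's classpath, a sketch like this would let `--master myscheduler://...` resolve without patching Spark itself.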
@rxin is there any timeline / plan around Kubernetes? Where can I follow that activity? Thanks! |
@manojlds I'm part of the Spark-on-k8s team that's currently building k8s integration for Spark outside of the Apache Spark repo. You can follow our work at https://github.com/apache-spark-on-k8s/spark, including the current outstanding diff on top of upstream Spark. You can follow the original design and discussion for the Kubernetes efforts at https://issues.apache.org/jira/browse/SPARK-18278.

One of the major pieces of feedback we received back in November was that, if possible, the Kubernetes integration should be done as a plugin to the Apache Spark repo. Unfortunately that plugin point does not yet exist in Spark, so we filed https://issues.apache.org/jira/browse/SPARK-19700 to begin the process of creating that API, but have not yet devoted effort to building that plugin point. Once active development on the k8s integration begins to slow down and near completion (we're starting to see signs of this slowing this month), we'll probably shift focus to the plugin point as a means of gaining even wider use of the k8s integration. On what a plugin point could possibly look like, we produced a thought experiment at palantir#81 which you may find interesting.

As of now, it remains unclear whether the k8s code will be brought in as a first-class citizen alongside Mesos and YARN, or whether Spark will gain a plugin point through which the k8s integration is delivered. We'd be happy to be a part of any wider efforts to create a plugin point for custom resource managers at SPARK-19700. Is that something the Nomad team would be interested in contributing to?
The Apache Spark fork enhanced to support HashiCorp Nomad as a scheduler is now located at: https://github.com/hashicorp/nomad-spark. If you are interested in trying it out, the best place to get started is here: https://www.nomadproject.io/guides/spark/spark.html.
@rcgenova @barnardb Where does that fork stand in terms of support or feature readiness?
Is there a document on how "add-ons" are maintained? My first thought was https://spark-packages.org/, but Nomad as a scheduler seems to require rebuilding all of Spark for some reason, not just a drop-in "add-on" package...
Do such interfaces exist in Spark at present? What efforts do you know of to make plugging in a scheduler "as simple as adding a jar to the classpath"? I see https://issues.apache.org/jira/browse/SPARK-19700 is still open...
Well, looks like it's first class now. Is there historical context one can read on how that decision was ultimately made, instead of working on the plugin interface as discussed?
Follow-up question: what's the
What changes were proposed in this pull request?
Adds support for Nomad as a scheduler backend. Nomad is a cluster manager designed for both long lived services and short lived batch processing workloads.
The integration supports client and cluster mode and dynamic allocation (increasing only), has basic support for Python and R applications, and works with applications packaged either as JARs or as Docker images.
Documentation is in docs/running-on-nomad.md.
This will be presented at Spark Summit 2017.
A build of the pull request with Nomad support is available here.
Feedback would be much appreciated.
How was this patch tested?
This patch was tested with integration and manual tests, and a load test was performed to ensure it doesn't perform worse than the YARN integration.
The feature was developed and tested against Nomad 0.5.6 (the current stable version) on Spark 2.1.0, then rebased to 2.1.1 and retested, and finally rebased to master and retested.