SPARK-1818 Freshen Mesos documentation #756
Conversation
Place more emphasis on using precompiled binary versions of Spark and Mesos instead of encouraging the reader to compile from source.
# Installing Mesos

Spark {{site.SPARK_VERSION}} is designed for use with Mesos {{site.MESOS_VERSION}} and does not
require any special patches of Mesos.
Incidentally I think `MESOS_VERSION` needs to be updated to 0.18 - mind doing that?
Updated to 0.18.1 per the pom.xml
## Verification

To verify that the Mesos cluster is ready for Spark, navigate to the Mesos master webui at port
:5050 Confirm that all expected machines are present in the slaves tab
Period at the end of this sentence?
Made a few surface-level comments - this is a great and much-needed update. Thanks!
At least that's how I read SparkILoop#createSparkContext()
Made changes to cover Patrick's suggestions. I had some trouble getting LZO libraries accessible from the Mesos executor and never actually got that working. Do you know if there's a way to add additional .jars to the Mesos executors, or do all .jars need to be bundled in the distribution tarball downloaded from e.g. HDFS? If so I'd like to add that to the documentation too.
cc: @ceteri -- we're updating the Spark documentation for running on Mesos. Given your expertise with this setup (I believe you did the video and wrote the docs below), would you mind taking a look at this documentation refresh? This will likely end up being the documentation for the next Spark release, and I'd like to get it in a great state for the big 1.0. https://www.youtube.com/watch?v=KVWMhIeKM_A Thanks!
* `export SPARK_EXECUTOR_URI=<path to spark-{{site.SPARK_VERSION}}.tar.gz uploaded above>`.
* `export MASTER=mesos://HOST:PORT` where HOST:PORT is the host and port (default: 5050) of your Mesos master (or `zk://...` if using Mesos with ZooKeeper).

8. To run a Spark application against the cluster, when you create your `SparkContext`, pass the string `mesos://HOST:PORT` as the master URL. In addition, you'll need to set the `spark.executor.uri` property. For example:
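The quoted doc snippet ends with "For example:"; a minimal Scala sketch of what such an example could look like follows. The master host, app name, and tarball path are placeholders, and the `SparkConf`-based constructor shown is the Spark 1.0-era API:

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Placeholder values: substitute your Mesos master and the tarball you uploaded.
val conf = new SparkConf()
  .setMaster("mesos://HOST:5050")
  .setAppName("spark-on-mesos-example")
  .set("spark.executor.uri", "hdfs://HOST/path/to/spark-{{site.SPARK_VERSION}}.tar.gz")
val sc = new SparkContext(conf)
```

Executors on the Mesos slaves fetch the tarball named by `spark.executor.uri`, which is why it must sit at a location reachable by every slave.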
# Why Mesos
I'd like to see a chapter on "How Spark works on Mesos" that explains why the assembly needs to be in a location reachable by the slaves, how the Spark driver program becomes the 'scheduler' in Mesos terms, and how the tasks get delivered to the executors.
Great work.
Hi @maasg, thanks for the review! I added a section on troubleshooting and debugging, and also a quick overview of how Spark and Mesos interact in a "How it works" section. Do those start to cover what you were looking for as an addition?
When running a shell the `spark.executor.uri` parameter is inherited from `SPARK_EXECUTOR_URI`, so
it does not need to be redundantly passed in as a system property.

{% highlight shell %}
I think this needs to be `highlight bash`. At least on my version of jekyll I get an error otherwise when compiling docs.
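The reviewer's suggested fix is a one-word change to the Liquid tag; a sketch of the corrected block (the body here is a placeholder command, not taken from the diff):

```
{% highlight bash %}
./bin/spark-shell
{% endhighlight %}
```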
I'm going to pull this in with the required build fix so it gets into a release candidate. This is a major improvement to the Mesos docs. Please feel free to submit follow-on work in a new PR.
Place more emphasis on using precompiled binary versions of Spark and Mesos instead of encouraging the reader to compile from source.

Author: Andrew Ash <andrew@andrewash.com>

Closes #756 from ash211/spark-1818 and squashes the following commits:

7ef3b33 [Andrew Ash] Brief explanation of the interactions between Spark and Mesos
e7dea8e [Andrew Ash] Add troubleshooting and debugging section
956362d [Andrew Ash] Don't need to pass spark.executor.uri into the spark shell
de3353b [Andrew Ash] Wrap to 100char
7ebf6ef [Andrew Ash] Polish on the section on Mesos Master URLs
3dcc2c1 [Andrew Ash] Use --tgz parameter of make-distribution
41b68ed [Andrew Ash] Period at end of sentence; formatting on :5050
8bf2c53 [Andrew Ash] Update site.MESOS_VERSIOn to match /pom.xml
74f2040 [Andrew Ash] SPARK-1818 Freshen Mesos documentation

(cherry picked from commit d1d41cc)
Signed-off-by: Patrick Wendell <pwendell@gmail.com>
Thanks Patrick!
Thanks for the updates! +1 (after the facts :) )
The Apache Mesos project only publishes source package releases, no binary releases. But other
third party projects publish binary releases that may be helpful in setting Mesos up.

One of those is Mesosphere. To install Mesos using the binary releases provided by Mesosphere:
@ash211 @pwendell, I missed this when the PR was being reviewed, but we need to list the source instructions first. We are first and foremost an Apache project, and need to tell people how to use Apache software to get their work done. Third parties can go ahead and publish binary distributions if they want, but we can't be posting that as the preferred way to install another Apache project. We do the same thing for Hadoop distros -- we say how to build against an Apache release of Hadoop, and then we have a page on building against third-party distros.
Also I would call these "third-party packages" instead of "prebuilt packages". Again, since this is going on an Apache web page, we can't be misrepresenting the Mesos project. Bits that are not built by Apache are unfortunately third-party.
Got it. I'm starting a cleanup PR that should be in within an hour.