
SPARK-1818 Freshen Mesos documentation #756

Closed
wants to merge 9 commits into from

Conversation

ash211
Contributor

@ash211 ash211 commented May 13, 2014

Place more emphasis on using precompiled binary versions of Spark and Mesos
instead of encouraging the reader to compile from source.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.


# Installing Mesos

Spark {{site.SPARK_VERSION}} is designed for use with Mesos {{site.MESOS_VERSION}} and does not
Contributor

Incidentally I think MESOS_VERSION needs to be updated to 0.18 - mind doing that?

Contributor Author

Updated to 0.18.1 per the pom.xml

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished. All automated tests passed.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14936/

## Verification

To verify that the Mesos cluster is ready for Spark, navigate to the Mesos master webui at port
:5050 Confirm that all expected machines are present in the slaves tab
Contributor

Period at the end of this sentence?
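As an aside, the manual check described in the excerpt can also be scripted. A Mesos master of this era serves its cluster state as JSON at `/master/state.json` on the web UI port; the sketch below counts registered slaves in a canned copy of that payload (the sample JSON and host names are illustrative, not output from a real master):

```shell
# Count slaves registered with the master. Against a live cluster this would be:
#   curl -s http://mesos-master.example.com:5050/master/state.json
# Here a tiny canned payload stands in for the real response.
state='{"slaves": [{"hostname": "node1"}, {"hostname": "node2"}]}'
echo "$state" | grep -o '"hostname"' | wc -l
```

Each entry in the `slaves` array corresponds to one machine shown in the slaves tab of the web UI.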

@pwendell
Contributor

Made a few surface level comments - this is a great and much-needed update. Thanks!

@AmplabJenkins

Merged build finished. All automated tests passed.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14937/

@ash211
Contributor Author

ash211 commented May 13, 2014

Made changes to cover Patrick's suggestions.

I had some trouble getting LZO libraries accessible from the Mesos executor and never actually got that working. Do you know if there's a way to add additional .jars to the Mesos executors, or do all .jars need to be bundled in the distribution tarball downloaded from e.g. HDFS?

If so I'd like to add that to the documentation too.

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@ash211
Contributor Author

ash211 commented May 13, 2014

cc: @ceteri -- we're updating the Spark documentation for running on Mesos. Given your expertise with this setup (I believe you did the video and wrote the docs below), would you mind taking a look at this documentation refresh?

This will likely end up being the documentation for the next Spark release, and I'd like to get it in a great state for the big 1.0.

https://www.youtube.com/watch?v=KVWMhIeKM_A
http://mesosphere.io/learn/run-spark-on-mesos

Thanks!
Andrew

@AmplabJenkins

Merged build finished. All automated tests passed.

@AmplabJenkins

All automated tests passed.
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14938/

* `export SPARK_EXECUTOR_URI=<path to spark-{{site.SPARK_VERSION}}.tar.gz uploaded above>`.
* `export MASTER=mesos://HOST:PORT` where HOST:PORT is the host and port (default: 5050) of your Mesos master (or `zk://...` if using Mesos with ZooKeeper).
8. To run a Spark application against the cluster, when you create your `SparkContext`, pass the string `mesos://HOST:PORT` as the master URL. In addition, you'll need to set the `spark.executor.uri` property. For example:
# Why Mesos

I miss a chapter on "How Spark works on Mesos" that helps the reader understand why the assembly needs to be in a location reachable by the slaves, and how the Spark driver program becomes the 'scheduler' in Mesos terms while the tasks get delivered to the executors.

@maasg

maasg commented May 13, 2014

Great work.
I'd love to see some more background on the dynamics of Spark running on Mesos. It has been a tough learning experience to get our Spark + Spark Streaming running on a Mesos cluster, and I think the docs can make life easier for 'future generations' :-)

@ash211
Contributor Author

ash211 commented May 14, 2014

Hi @maasg thanks for the review!

I added a section on troubleshooting and debugging, and also a quick overview of how Spark and Mesos interact in a How it works section.

Do those start to cover what you were looking for as an addition?

@AmplabJenkins

Merged build triggered.

@AmplabJenkins

Merged build started.

@AmplabJenkins

Merged build finished.

@AmplabJenkins

Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/14961/

When running a shell the `spark.executor.uri` parameter is inherited from `SPARK_EXECUTOR_URI`, so
it does not need to be redundantly passed in as a system property.

{% highlight shell %}
Contributor

I think this needs to be highlight bash. At least on my version of jekyll I get an error otherwise when compiling docs.
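For reference, the environment-variable route described in the excerpt might look like the following; the master host and tarball path are illustrative placeholders, not values from this PR:

```shell
# Illustrative placeholders -- substitute your own Mesos master and upload location
export SPARK_EXECUTOR_URI=hdfs://namenode.example.com/spark-1.0.0.tar.gz
export MASTER=mesos://mesos-master.example.com:5050

# spark-shell then inherits spark.executor.uri from SPARK_EXECUTOR_URI,
# so it does not need to be passed redundantly as a system property:
# ./bin/spark-shell
echo "$MASTER"
```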

@pwendell
Contributor

I'm going to pull this in with the required build fix so it gets into a release candidate. This is a major improvement to the Mesos docs. Please feel free to submit follow-on work in a new PR.

asfgit pushed a commit that referenced this pull request May 14, 2014
Place more emphasis on using precompiled binary versions of Spark and Mesos
instead of encouraging the reader to compile from source.

Author: Andrew Ash <andrew@andrewash.com>

Closes #756 from ash211/spark-1818 and squashes the following commits:

7ef3b33 [Andrew Ash] Brief explanation of the interactions between Spark and Mesos
e7dea8e [Andrew Ash] Add troubleshooting and debugging section
956362d [Andrew Ash] Don't need to pass spark.executor.uri into the spark shell
de3353b [Andrew Ash] Wrap to 100char
7ebf6ef [Andrew Ash] Polish on the section on Mesos Master URLs
3dcc2c1 [Andrew Ash] Use --tgz parameter of make-distribution
41b68ed [Andrew Ash] Period at end of sentence; formatting on :5050
8bf2c53 [Andrew Ash] Update site.MESOS_VERSIOn to match /pom.xml
74f2040 [Andrew Ash] SPARK-1818 Freshen Mesos documentation
(cherry picked from commit d1d41cc)

Signed-off-by: Patrick Wendell <pwendell@gmail.com>
@asfgit asfgit closed this in d1d41cc May 14, 2014
@ash211
Contributor Author

ash211 commented May 14, 2014

Thanks Patrick!

@maasg

maasg commented May 15, 2014

Thanks for the updates! +1 (after the facts :) )

The Apache Mesos project only publishes source package releases, no binary releases. But other
third party projects publish binary releases that may be helpful in setting Mesos up.

One of those is Mesosphere. To install Mesos using the binary releases provided by Mesosphere:
Contributor

@ash211 @pwendell, I missed this when the PR was being reviewed, but we need to list the source instructions first. We are first and foremost an Apache project, and need to tell people how to use Apache software to get their work done. Third parties can go ahead and publish binary distributions if they want, but we can't be posting that as the preferred way to install another Apache project. We do the same thing for Hadoop distros -- we say how to build against an Apache release of Hadoop, and then we have a page on building against third-party distros.

Contributor

Also I would call these "third-party packages" instead of "prebuilt packages". Again, since this is going on an Apache web page, we can't be misrepresenting the Mesos project. Bits that are not built by Apache are unfortunately third-party.

Contributor Author

Got it. I'm starting a cleanup PR that should be in within an hour.


Contributor Author

#805


pdeyhim pushed a commit to pdeyhim/spark-1 that referenced this pull request Jun 25, 2014

5 participants