Build Spark from source in Docker image #530

Merged
merged 1 commit into from
Jun 29, 2016

@mariussoutier
Contributor

mariussoutier commented Jun 28, 2016

This allows compiling for Scala 2.11. To build for Scala 2.11, run SCALA_VERSION="2.11.8" sbt docker.
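As a rough illustration of how a build could pick up the SCALA_VERSION variable used in the command above, here is a minimal sketch. The object name, the helper functions, and the 2.10.6 default are assumptions made for this example and are not taken from the PR's Build.scala.

```scala
// Hypothetical helper, not part of the PR: reads SCALA_VERSION from an
// environment map and derives the Scala binary version from it.
object ScalaVersionFromEnv {
  // Assumed default, for illustration only.
  val DefaultScalaVersion = "2.10.6"

  /** Full Scala version: SCALA_VERSION if set and non-empty, else the default. */
  def fullVersion(env: Map[String, String]): String =
    env.get("SCALA_VERSION").map(_.trim).filter(_.nonEmpty).getOrElse(DefaultScalaVersion)

  /** Binary version: the first two components, e.g. "2.11.8" becomes "2.11". */
  def binaryVersion(full: String): String =
    full.split('.').take(2).mkString(".")

  def main(args: Array[String]): Unit = {
    val full = fullVersion(sys.env)
    println(s"scalaVersion=$full scalaBinaryVersion=${binaryVersion(full)}")
  }
}
```

In an sbt build the same lookup would typically feed the scalaVersion setting, with sbt deriving the binary version itself.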

I've already built 0.6.2 images for both Scala versions using this build file:
https://hub.docker.com/r/mariussoutier/spark-jobserver/tags/

The 0.7.0 snapshot has an Akka issue inside Docker which I'm still investigating, but it doesn't seem related to building from source.

@codecov-io

codecov-io commented Jun 28, 2016

Current coverage is 62.06%

Merging #530 into master will increase coverage by 0.05%

@@             master       #530   diff @@
==========================================
  Files            64         64          
  Lines          1740       1740          
  Methods        1627       1627          
  Messages          0          0          
  Branches        102        102          
==========================================
+ Hits           1079       1080     +1   
+ Misses          661        660     -1   
  Partials          0          0          

Powered by Codecov. Last updated by a3f6293...0ccb870

@noorul
Contributor

noorul commented Jun 28, 2016

Can you squash them to single commit?

@mariussoutier
Contributor Author

Done.

@noorul
Contributor

noorul commented Jun 28, 2016

  1. I think it would be great to take the Hadoop version as input as well.
  2. I am not sure whether we should be compiling Spark inside Docker. Is there a benefit to doing this instead of using the binary?

@mariussoutier
Contributor Author

This allows compiling for Scala 2.11.

@mariussoutier mariussoutier mentioned this pull request Jun 28, 2016
5 tasks
@hntd187
Member

hntd187 commented Jun 28, 2016

This is probably not super far off from the Dockerfile I made for this a while back. https://github.com/hntd187/spark-jobserver-docker/blob/master/Dockerfile

@@ -123,21 +136,28 @@ object JobServerBuild extends Build {
copy(artifact, artifactTargetPath)
copy(baseDirectory(_ / "bin" / "server_start.sh").value, file("app/server_start.sh"))
copy(baseDirectory(_ / "bin" / "server_stop.sh").value, file("app/server_stop.sh"))
copy(baseDirectory(_ / "bin" / "manager_start.sh").value, file("app/manager_start.sh"))
Contributor

Thanks. I thought we had added this fix before, but it must have somehow slipped.... :-p

Contributor Author

Actually in 0.6.2 it was present...

@velvia
Contributor

velvia commented Jun 28, 2016

Looks fine to me. If it is also possible to download a binary with Scala 2.11 support, that would be better, as it would significantly reduce Docker build times.

@mariussoutier
Contributor Author

mariussoutier commented Jun 28, 2016

If that were possible I wouldn't bother building from source :) A Scala 2.11 build is not available as a pre-built binary. We could download the 2.10 version, but that would make it harder to test. Locally the build takes around 20 minutes.

@hntd187
Member

hntd187 commented Jun 28, 2016

Also @velvia, soon enough with Spark 2.0, Scala 2.11 will be the default.

val sparkBuild = s"spark-$sparkVersion"
val sparkBuildCmd = scalaBinaryVersion.value match {
case "2.10" =>
"./make-distribution.sh -Phadoop-2.4 -Phive"
Contributor

What if someone wants to compile it with Hadoop 2.6?

Contributor

Yes, this is a fair point. We might as well make this Hadoop version configurable.

On Jun 28, 2016, at 8:21 PM, Noorul Islam K M notifications@github.com wrote:

In project/Build.scala #530 (comment):

@@ -108,6 +108,19 @@ object JobServerBuild extends Build {
dockerfile in docker := {
val artifact = (assemblyOutputPath in assembly in jobServerExtras).value
val artifactTargetPath = s"/app/${artifact.name}"
+
+ val sparkBuild = s"spark-$sparkVersion"
+ val sparkBuildCmd = scalaBinaryVersion.value match {
+   case "2.10" =>
+     "./make-distribution.sh -Phadoop-2.4 -Phive"

What if someone wants to compile it with hadoop 2.6?



Contributor Author

The current Docker image is also fixed to Hadoop 2.4, and I didn't want to change everything in one PR. I suggest trying this PR out to see if it fits the build process, then expanding on it afterwards (like configuring Hadoop and Java, building SparkR, ...).
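For reference, the follow-up discussed in this thread (a configurable Hadoop version) could look roughly like the sketch below. It is not the code merged in this PR. The object and function names are hypothetical, and the Scala 2.11 branch is an assumption based on the Spark 1.6 build documentation, which describes running dev/change-scala-version.sh and passing -Dscala-2.11. Only the "2.10" command string matches the snippet under review.

```scala
// Hypothetical sketch: builds the Spark distribution command from the Scala
// binary version and a configurable Hadoop profile version.
object SparkBuildCommand {
  /** Maven profile flags; hadoopVersion is a profile suffix such as "2.4" or "2.6". */
  def profiles(hadoopVersion: String): String =
    s"-Phadoop-$hadoopVersion -Phive"

  /** Shell command for the given Scala binary version; Hadoop defaults to 2.4 as in the PR. */
  def forScala(scalaBinaryVersion: String, hadoopVersion: String = "2.4"): String =
    scalaBinaryVersion match {
      case "2.10" =>
        s"./make-distribution.sh ${profiles(hadoopVersion)}"
      case "2.11" =>
        // Assumption (Spark 1.6 docs): switch the POMs to 2.11 first,
        // then build with -Dscala-2.11.
        s"./dev/change-scala-version.sh 2.11 && " +
          s"./make-distribution.sh -Dscala-2.11 ${profiles(hadoopVersion)}"
      case other =>
        throw new IllegalArgumentException(s"Unsupported Scala binary version: $other")
    }
}
```

Calling SparkBuildCommand.forScala("2.10", "2.6") would yield the Hadoop 2.6 variant asked about above. The Hadoop version itself could come from an environment variable in the same way as SCALA_VERSION, which is again only a suggestion.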

@noorul noorul merged commit afb081b into spark-jobserver:master Jun 29, 2016
@noorul
Contributor

noorul commented Jun 29, 2016

I will add to notes once #517 is complete.

@mariussoutier
Contributor Author

What needs to be done for your release process?

5 participants