Build Spark from source in Docker image #530
Current coverage is 62.06%
@@            master   #530   diff @@
==========================================
  Files         64     64
  Lines       1740   1740
  Methods     1627   1627
  Messages       0      0
  Branches     102    102
==========================================
+ Hits        1079   1080     +1
+ Misses       661    660     -1
  Partials       0      0
Can you squash them into a single commit?
Done.
This is probably not far off from the Dockerfile I made for this a while back: https://github.com/hntd187/spark-jobserver-docker/blob/master/Dockerfile
copy(artifact, artifactTargetPath)
copy(baseDirectory(_ / "bin" / "server_start.sh").value, file("app/server_start.sh"))
copy(baseDirectory(_ / "bin" / "server_stop.sh").value, file("app/server_stop.sh"))
copy(baseDirectory(_ / "bin" / "manager_start.sh").value, file("app/manager_start.sh"))
Thanks. I thought we had added this fix before, but it must have slipped somehow... :-p
Actually in 0.6.2 it was present...
Looks fine to me. If it is also possible to download the binary with Scala 2.11 support, that would be better, as it would significantly reduce Docker build times.
If that were possible I wouldn't bother to build from source :) Scala 2.11 is not available as a pre-built binary. We could download the 2.10 version, but that would make it harder to test. Locally the build takes around 20 minutes.
Also, @velvia, soon enough with Spark 2.0, Scala 2.11 will be the default.
val sparkBuild = s"spark-$sparkVersion"
val sparkBuildCmd = scalaBinaryVersion.value match {
  case "2.10" =>
    "./make-distribution.sh -Phadoop-2.4 -Phive"
What if someone wants to compile it with Hadoop 2.6?
Yes, this is a fair point. We might as well make this Hadoop version configurable.
On Jun 28, 2016, at 8:21 PM, Noorul Islam K M notifications@github.com wrote:
In project/Build.scala #530 (comment):
@@ -108,6 +108,19 @@ object JobServerBuild extends Build {
   dockerfile in docker := {
     val artifact = (assemblyOutputPath in assembly in jobServerExtras).value
     val artifactTargetPath = s"/app/${artifact.name}"
+
+    val sparkBuild = s"spark-$sparkVersion"
+    val sparkBuildCmd = scalaBinaryVersion.value match {
+      case "2.10" =>
+        "./make-distribution.sh -Phadoop-2.4 -Phive"
What if someone wants to compile it with hadoop 2.6?
The current Docker image is also fixed to Hadoop 2.4, and I didn't want to change everything in one PR. I suggest trying this PR out and seeing whether it fits the build process, then expanding on it afterwards (for example, configuring Hadoop and Java, building SparkR, ...).
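For illustration only, a configurable Hadoop profile along the lines discussed in this thread might look roughly like the sketch below. This is not part of the PR: the `SparkBuildCmd` object, its method names, and the `HADOOP_VERSION` environment variable are all hypothetical, and only the `./make-distribution.sh -Phadoop-2.4 -Phive` command line is taken from the diff.

```scala
// Hypothetical helper, not part of the PR: assembles the make-distribution.sh
// command line for a chosen Hadoop profile instead of hard-coding 2.4.
object SparkBuildCmd {
  // Assumption: the Hadoop version is supplied as a string such as "2.4" or "2.6".
  // HADOOP_VERSION is a made-up variable name; the default mirrors the diff.
  def hadoopVersion(env: Map[String, String]): String =
    env.getOrElse("HADOOP_VERSION", "2.4")

  // Same flags as the "2.10" case in the diff, with the profile parameterized.
  def makeDistribution(hadoop: String): String =
    s"./make-distribution.sh -Phadoop-$hadoop -Phive"
}

object SparkBuildCmdDemo extends App {
  // Prints the command for whatever HADOOP_VERSION is set in the environment.
  println(SparkBuildCmd.makeDistribution(SparkBuildCmd.hadoopVersion(sys.env)))
}
```

Passing the environment in as a `Map` keeps the helper easy to exercise without touching the real process environment.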
I will add this to the notes once #517 is complete.
What needs to be done for your release process?
This allows compiling for Scala 2.11. To build for Scala 2.11, run
SCALA_VERSION="2.11.8" sbt docker
I've already built 0.6.2 images for both Scala versions using this build file:
https://hub.docker.com/r/mariussoutier/spark-jobserver/tags/
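As a rough sketch of how a build could pick up the `SCALA_VERSION` variable used in the command above and derive the binary version that the build matches on: the object name, method names, and the 2.10.6 fallback are assumptions for illustration, not code confirmed from this PR.

```scala
// Hypothetical sketch: resolve the Scala version from the environment and
// derive the binary version ("2.10" or "2.11") from it.
object ScalaVersionFromEnv {
  // Assumption: fall back to a 2.10.x release when SCALA_VERSION is unset.
  val DefaultScalaVersion = "2.10.6"

  def fullVersion(env: Map[String, String]): String =
    env.getOrElse("SCALA_VERSION", DefaultScalaVersion)

  // Keep only the major and minor components, e.g. "2.11.8" becomes "2.11".
  def binaryVersion(full: String): String =
    full.split('.').take(2).mkString(".")
}

object ScalaVersionDemo extends App {
  val full = ScalaVersionFromEnv.fullVersion(sys.env)
  println(s"Scala $full, binary version ${ScalaVersionFromEnv.binaryVersion(full)}")
}
```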
The 0.7.0 snapshot has an Akka issue inside Docker which I'm still investigating, but it doesn't seem related to building from source.