
neil-rubens/docker-spark

 
 


As of June 2015, Apache Spark binary distributions are built only for Scala 2.10 (as is the sequenceiq/docker-spark image).

This fork provides a Spark Docker image for Scala 2.11, built by compiling Spark from source.

Getting the Docker Image

Pre-Built Image (recommended)

docker pull activeintel/spark:1.4.0

Manually Build Docker Image

You can also build the image yourself.

Clone or download this project and, from its root, run:

docker build --rm -t activeintel/spark:1.4.0 .

Note: this is a lengthy process that takes about 30 minutes (downloads, compilation, etc.).

Verification

To verify that the image works, do the following.

Run the image:

docker run -it -h sandbox activeintel/spark:1.4.0 bash

# Execute the following command, which should write "Pi is roughly 3.1418" into the logs.
# Note: you must pass the --files argument in cluster mode to enable metrics.
spark-submit \
--class org.apache.spark.examples.SparkPi \
--files $SPARK_HOME/conf/metrics.properties \
--master yarn-client \
--driver-memory 1g \
--executor-memory 1g \
--executor-cores 1 \
$SPARK_HOME/examples/target/scala-2.11/spark-examples-1.4.0-hadoop2.6.0.jar
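
If you would rather check the result automatically than scan the output by eye, a minimal sketch along these lines may help. It assumes the spark-submit output was captured to a file; the helper name `has_pi_result` and the log path are hypothetical and not part of this image.

```shell
# Hypothetical helper (not shipped with the image): succeeds if the given
# file contains a SparkPi result line such as "Pi is roughly 3.1418".
has_pi_result() {
  grep -Eq 'Pi is roughly 3\.[0-9]+' "$1"
}

# Example usage, assuming the spark-submit output was saved beforehand,
# e.g. by appending `2>&1 | tee /tmp/sparkpi.log` to the command:
# has_pi_result /tmp/sparkpi.log && echo "SparkPi produced a result"
```

The pattern matches any value starting with `3.` because the estimate varies slightly between runs.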

Note that the examples jar is located in examples/target/scala-2.11/, unlike in sequenceiq/docker-spark, where it is in lib/.
For more information, see issue #1.
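
Because the jar location differs between the two images, a script meant to work with either could probe both directories. The following is only a sketch under that assumption; `find_examples_jar` is a hypothetical helper, and the two directories are the ones named above.

```shell
# Hypothetical helper: print the path of the first spark-examples jar found
# under the given Spark home, checking this fork's layout first and the
# sequenceiq/docker-spark layout (lib/) second.
find_examples_jar() {
  spark_home="$1"
  for dir in "$spark_home/examples/target/scala-2.11" "$spark_home/lib"; do
    for jar in "$dir"/spark-examples-*.jar; do
      if [ -f "$jar" ]; then
        printf '%s\n' "$jar"
        return 0
      fi
    done
  done
  return 1
}

# Example usage inside the container:
# spark-submit --class org.apache.spark.examples.SparkPi --master yarn-client \
#   "$(find_examples_jar "$SPARK_HOME")"
```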

Tested with Docker 1.5.

Running Container

docker run -d -h sandbox activeintel/spark:1.4.0 -d

See Also

sequenceiq/docker-spark

