This is a giter8 template for generating a new Spark application project. It comes bundled with:
- main and test source directories
- ScalaTest
- ScalaCheck
- sbt 0.13.0 configuration with Scala 2.10.4 and ScalaTest 2.0 dependencies
- project name, package, and version customizable as template variables
First, install Conscript:
$ curl https://raw.github.com/n8han/conscript/master/setup.sh | sh
The Conscript command is installed as ~/bin/cs. Once ~/bin is on your PATH, you can install giter8:
$ cs n8han/giter8
Next, the following command generates a skeleton Spark application:
$ g8 nttdata-oss/basic-spark-project
After answering a few short questions, you will have a project directory (default: basic-spark). Inside it you will find README.rst, which explains how to run the sample applications.
You can also read the sample project's README.rst from this link.
0.1.4 (Change versions of CDH, Spark and Scala)
-----------------------------------------------

- Spark 1.2.0 -> Spark 1.3.1
- CDH5.2.1 -> CDH5.3.3
- Spark 1.1.0 -> Spark 1.2.1
- CDH5.2.1 -> CDH5.3.1
Added the following sample applications, slightly modified from the official Spark examples. The differences are not in the algorithms but in how classes and parameters are handled.
- WordCount, RandomTextWriter (the test data generator for WordCount) and Words (dictionary file)
- GroupByTest
- SparkLR
- SparkHdfsLR and SparkLRTestDataGenerator (the test data generator for SparkHdfsLR)
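The WordCount sample follows the classic Spark pattern: split each line into words, pair every word with a count of 1, then sum the counts per word. The same pipeline can be sketched on a plain Scala collection (no Spark dependency; the `lines` input here is made up), since RDD operators mirror these collection methods:

```scala
// Hypothetical input standing in for an RDD of text lines.
val lines = Seq("to be or not to be", "to think")

// Same shape as the Spark version: flatMap -> map -> reduceByKey.
// On a local collection, reduceByKey becomes groupBy + summing counts.
val counts: Map[String, Int] =
  lines
    .flatMap(_.split("\\s+"))      // split each line into words
    .map(word => (word, 1))        // pair each word with a count of 1
    .groupBy(_._1)                 // group the pairs by word
    .map { case (word, pairs) => (word, pairs.map(_._2).sum) }

counts.toSeq.sortBy(-_._2).foreach { case (w, n) => println(s"$w $n") }
```

In the actual Spark sample, `lines` would be an RDD from `sc.textFile(...)` and the grouping step would be `reduceByKey(_ + _)`, which aggregates per partition before shuffling.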
- ScalaTest 2.0 for testing
- sbt 0.12.4
- Scala 2.10.3
- Spark 0.9.0
- Hadoop 2.3 (CDH5)
- SparkPi
- ScalaTest 2.0 for testing
- sbt 0.12.4
- Scala 2.10.3
- Spark 0.9.0
- Hadoop 2.2 (CDH5b2)
- SparkPi
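The SparkPi sample listed above estimates pi by Monte Carlo: throw random points into a square and count how many land inside the inscribed circle. The estimator itself needs no Spark; a minimal plain-Scala sketch (function name and sample size are illustrative, not from the template):

```scala
import scala.util.Random

// Monte Carlo estimate of pi: the fraction of uniform random points in
// the square [-1, 1] x [-1, 1] that fall inside the unit circle
// approaches pi / 4 as the sample count grows.
def estimatePi(samples: Int, seed: Long = 42L): Double = {
  val rng = new Random(seed)
  val inside = (1 to samples).count { _ =>
    val x = rng.nextDouble() * 2 - 1
    val y = rng.nextDouble() * 2 - 1
    x * x + y * y <= 1.0
  }
  4.0 * inside / samples
}

println(f"pi ~= ${estimatePi(1000000)}%.4f")
```

The Spark version distributes the same loop by parallelizing the sample indices across partitions and summing the per-partition hit counts.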