This README outlines how we set up Spark for our testing environment.
The test we used for Spark's batch processing was the sample WordCount program, which counts the number of occurrences of each word in a given file and can be found here.
The test we used for Spark's realtime processing was ________.
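For readers unfamiliar with the word-count pattern, here is a minimal sketch of the logic in plain Python (not the Spark API; the function name is illustrative). The Spark WordCount example expresses the same idea with `flatMap`/`reduceByKey` over a distributed dataset.

```python
from collections import Counter

def word_count(text):
    # Split on whitespace and tally each word, mirroring what the
    # Spark WordCount example does across a cluster.
    return Counter(text.split())

counts = word_count("to be or not to be")
print(counts["to"])  # 2
```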
- Install Spark dependencies: Java (at time of initial setup, our version was "1.8.0_151") and Maven.

  ```shell
  sudo apt-get install default-jdk
  sudo apt-get install maven
  ```
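To confirm the dependencies installed correctly, the standard version flags can be used (a sketch; the exact output is environment-dependent):

```shell
# Check the installed toolchain versions (output varies by system;
# at our initial setup, Java reported "1.8.0_151")
java -version
mvn -version
```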
- Download the Spark binaries into the `~/server` directory (we tested on version 1.6.3):

  ```shell
  cd ~
  mkdir server
  cd server
  wget http://<spark download link>/spark-1.6.3-bin-hadoop2.6.tgz
  ```
- Extract the files:

  ```shell
  tar -xzvf spark-1.6.3-bin-hadoop2.6.tgz
  ```
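As a quick smoke test after extraction, the scripts bundled with the Spark distribution can report the version and run a sample job. This is a sketch assuming the directory layout from the steps above, not part of the original setup:

```shell
# Sanity-check the extracted distribution (path assumes the steps above)
cd ~/server/spark-1.6.3-bin-hadoop2.6
./bin/spark-submit --version   # should report Spark 1.6.3
./bin/run-example SparkPi 10   # runs a bundled example job locally
```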
Batch-processing test instructions can be found at SparkFiles/batch.md.
Stream-processing test instructions can be found at SparkFiles/stream.md.