- Use of flatMap() on JavaRDD - https://github.com/pnijem/spark-for-java-devs/blob/master/src/main/java/com/pnijem/spark/MainFlatMap.java
- Use of map() on JavaRDD - https://github.com/pnijem/spark-for-java-devs/blob/master/src/main/java/com/pnijem/spark/MainMapping.java
- Ranking of keywords from a file using Spark - https://github.com/pnijem/spark-for-java-devs/blob/master/src/main/java/com/pnijem/spark/MainKeywordRanking.java
- Loading a file from disk into a Spark RDD - https://github.com/pnijem/spark-for-java-devs/blob/master/src/main/java/com/pnijem/spark/MainReadFromDisk.java
- Working with JavaPairRDD - https://github.com/pnijem/spark-for-java-devs/blob/master/src/main/java/com/pnijem/spark/MainPairRDD.java

In the file below you will find a boolean flag `testMode`. Set it to `true` to use the hardcoded data. When it is `false`, the hardcoded data is ignored and the relevant files are loaded instead. These files can be found here: https://github.com/pnijem/spark-for-java-devs/tree/master/src/main/resources/viewing%20figures

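The `testMode` switch described above can be sketched in plain Java. This is only a minimal illustration, not the repository's code: the class `TestModeSketch`, the method `loadRows`, the sample rows, and the default file name `views.csv` are all hypothetical, and Spark is deliberately left out so the snippet runs on a bare JDK 8.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

class TestModeSketch {

    // Hypothetical hardcoded rows used only in test mode; not the repository's data.
    static final List<String> HARDCODED_ROWS = Arrays.asList("14,96", "14,97", "13,96");

    // Returns the hardcoded rows when testMode is true;
    // otherwise ignores them and reads every line of the given file.
    static List<String> loadRows(boolean testMode, Path file) {
        if (testMode) {
            return HARDCODED_ROWS;
        }
        try {
            return Files.readAllLines(file, StandardCharsets.UTF_8);
        } catch (IOException e) {
            // Surface I/O failures without forcing callers to declare IOException.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        boolean testMode = true; // flip to false to load from disk instead

        // Hypothetical default path; pass a real file as the first argument.
        Path file = Paths.get(args.length > 0 ? args[0] : "views.csv");

        List<String> rows = loadRows(testMode, file);
        System.out.println(rows.size() + " rows loaded (testMode=" + testMode + ")");
    }
}
```

In a Spark job, the same branch would presumably choose between `JavaSparkContext.parallelize(...)` on the in-memory list and `JavaSparkContext.textFile(...)` on the resource files.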
- Java 8
- Spark Core 2.0.0
- Hadoop HDFS 2.2.0