This project contains the solutions to some basic exercises using Spark.
The initial idea was to practice strategies for solving a wide variety of problems that process small data sets, using Scala, Apache Spark and SBT. The exercises are:
- Find the total expenses per user.
- Group friends by age.
- Get the minimum temperature from a set of measurements taken in 1800.
- Find the most popular hero.
- Find the top 10 most popular movies.
- Count the number of ratings per movie.
- Find the degrees of separation between heroes (without GraphX).
- Find similar movies based on each user's rating history.
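
As an illustration of the kind of job these exercises involve, here is a minimal sketch of the "total expenses per user" exercise using the RDD API. It is not the code from this repo: the object name `TotalExpensesPerUser`, the helper `parseLine`, the `userId,itemId,amount` line format and the inline sample data are all assumptions made for the example.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object TotalExpensesPerUser {

  // Assumed (hypothetical) input format: "userId,itemId,amount" per line.
  // Returns the user id paired with the amount spent on that line.
  def parseLine(line: String): (Int, Double) = {
    val fields = line.split(",")
    (fields(0).trim.toInt, fields(2).trim.toDouble)
  }

  def main(args: Array[String]): Unit = {
    // Run locally on all available cores; no cluster is needed for small data.
    val conf = new SparkConf()
      .setAppName("TotalExpensesPerUser")
      .setMaster("local[*]")
    val sc = new SparkContext(conf)

    // Inline sample data stands in for the real data set (assumption).
    val lines = sc.parallelize(Seq(
      "1,100,10.50",
      "2,101,5.00",
      "1,102,4.50"
    ))

    val totals = lines
      .map(parseLine)                   // (userId, amount)
      .reduceByKey(_ + _)               // sum the amounts per user
      .sortBy(_._2, ascending = false)  // biggest spender first

    // Collecting to the driver is fine here because the result is small.
    totals.collect().foreach { case (user, total) =>
      println(s"$user: $total")
    }

    sc.stop()
  }
}
```

Keeping the parsing in a plain function such as `parseLine` means it can be unit tested without starting a Spark context, which fits the plan to add unit tests to these exercises.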

This repo will be merged with Spark-Training once I add unit tests for all of these exercises, since all the code in that project is unit tested.

This repo will be updated from time to time because I need to get used to SBT. I like it, but it takes up too many resources.