tusharkm/spark_movie_superheros

Spark with Scala

Objective:

To learn Spark, Spark SQL and GraphX with Scala by building:
• A movie recommender system for the MovieLens 100k dataset
• An item-based movie recommender hosted on AWS EMR with 3 machines, computing recommendations on the MovieLens 1M dataset using cosine distance
• A list of the most popular movies in the MovieLens 100k dataset
• Marvel superheroes' friend connections for 20,000 characters, using the breadth-first search algorithm
• Spark GraphX to find the degrees of separation between superheroes and identify the most famous Marvel superhero
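
Two of the techniques named above, item-to-item cosine scoring and breadth-first search for degrees of separation, can be sketched in plain Scala. This is only an illustration under stated assumptions, not the code in this repository: it uses in-memory collections instead of Spark RDDs or GraphX, the names `RecommenderSketches`, `cosineSimilarity` and `degreesOfSeparation` are hypothetical, and the sample ratings and graph are made up. It computes cosine similarity over co-rating users; the corresponding cosine distance would be 1 minus that value.

```scala
import scala.collection.immutable.Queue

// Hypothetical helper object; names and data are illustrative only.
object RecommenderSketches {

  // Cosine similarity between two items, each given as userId -> rating.
  // Assumption: only users who rated both items contribute to the score.
  def cosineSimilarity(a: Map[Int, Double], b: Map[Int, Double]): Double = {
    val shared = a.keySet.intersect(b.keySet).toSeq
    if (shared.isEmpty) 0.0
    else {
      val dot   = shared.map(u => a(u) * b(u)).sum
      val normA = math.sqrt(shared.map(u => a(u) * a(u)).sum)
      val normB = math.sqrt(shared.map(u => b(u) * b(u)).sum)
      if (normA == 0.0 || normB == 0.0) 0.0 else dot / (normA * normB)
    }
  }

  // Breadth-first search over an adjacency map (heroId -> connected heroIds).
  // Returns the number of hops from start to target, or None if unreachable.
  def degreesOfSeparation(graph: Map[Int, Set[Int]], start: Int, target: Int): Option[Int] = {
    @annotation.tailrec
    def loop(frontier: Queue[(Int, Int)], visited: Set[Int]): Option[Int] =
      frontier.dequeueOption match {
        case None => None
        case Some(((node, dist), rest)) =>
          if (node == target) Some(dist)
          else {
            // Enqueue unvisited neighbours one hop further away.
            val next = graph.getOrElse(node, Set.empty[Int]).diff(visited)
            loop(rest ++ next.toSeq.map(n => (n, dist + 1)), visited ++ next)
          }
      }
    loop(Queue((start, 0)), Set(start))
  }

  def main(args: Array[String]): Unit = {
    // Made-up ratings for two movies, keyed by user id.
    val movieA = Map(1 -> 4.0, 2 -> 5.0, 3 -> 1.0)
    val movieB = Map(1 -> 5.0, 2 -> 4.0, 4 -> 3.0)
    println(s"cosine similarity: ${cosineSimilarity(movieA, movieB)}")

    // Made-up undirected co-appearance graph: 1 - 2 - 3 - 4.
    val graph = Map(1 -> Set(2), 2 -> Set(1, 3), 3 -> Set(2, 4), 4 -> Set(3))
    println(s"degrees of separation 1 -> 4: ${degreesOfSeparation(graph, 1, 4)}")
  }
}
```

On a cluster, the same ideas would typically be expressed as a self-join of ratings by user followed by a per-movie-pair aggregation, and as iterative frontier expansion over an RDD or a GraphX Pregel computation.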

Technologies used: Scala, Spark, Spark SQL, GraphX, Amazon EMR and EC2, sbt