Movie Recommendation System Using K-Means Clustering

My bachelor thesis at KTH Royal Institute of Technology, where I implemented a recommendation system for movies using K-means. The program uses a movie database called omdb0216. Recommendations are given to the user based on a movie the user previously like, i.e. the user needs to query a movie to base the recommendations on. The program gives 5 recommendations of movies that the program thinks are relevant. The user can also query based on movie attributes. The program uses Apache Spark™ to do the distributed computing of the k-means algorithm.

Files

KmeansClusering.scala: The main file that executes the program. The program sorts out the most relevant movies based on IMDB rating > 5.0, as the original database contains over 1M movies. Next the program converts each movie into a unique vector, based on selected attributes. The then program clusters similar movies together and assigns a centroid to each movie using the distributed k-means clustering algorithm in Apache Spark™. This program also handles the users query and gives the recommendations.
Movie.scala: Converts each movie in the database to an object.

Name		Name	Last commit message	Last commit date
Latest commit History 11 Commits
src		src
README.md		README.md
The_Thesis.pdf		The_Thesis.pdf

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Movie Recommendation System Using K-Means Clustering

Files

About

Releases

Packages

Languages

aktersnurra/Recommendation_System_for_Movies

Folders and files

Latest commit

History

Repository files navigation

Movie Recommendation System Using K-Means Clustering

Files

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages