Script to run and find similarities between movies from a MovieLens data set using Python & Spark Clustering.

vaibhavmagon/Spark-Python-MovieReviews

Python & Spark Collaborative Filtering Script using the MovieLens Dataset.

This is a script, bundled with a dataset, that finds similarities between movies in a large data set using Python and Spark. You pass the id of a movie, and the script finds similar movies using item-based collaborative filtering. You can change the threshold values in the script to make the matches stricter or looser.

More here: https://realpython.com/build-recommendation-engine-collaborative-filtering/
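
As a rough illustration of the item-based idea described above, here is a minimal sketch in plain Python (no Spark), so it can be run on its own. It is not the code from `movie-similarities.py`. The toy ratings, the choice of cosine similarity, the function names, and the default values `score_threshold=0.97` and `min_co_ratings=2` are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical toy ratings as (user_id, movie_id, rating) tuples.
RATINGS = [
    (1, 50, 5.0), (1, 172, 5.0), (1, 133, 1.0),
    (2, 50, 4.0), (2, 172, 4.0), (2, 133, 2.0),
    (3, 50, 5.0), (3, 172, 4.0),
]


def cosine_similarity(pairs):
    """Return (score, count) for a list of (rating_a, rating_b) pairs."""
    sum_xx = sum(a * a for a, _ in pairs)
    sum_yy = sum(b * b for _, b in pairs)
    sum_xy = sum(a * b for a, b in pairs)
    denom = sqrt(sum_xx) * sqrt(sum_yy)
    score = sum_xy / denom if denom else 0.0
    return score, len(pairs)


def similar_movies(ratings, movie_id, score_threshold=0.97, min_co_ratings=2):
    """Return (other_movie_id, score, co_rating_count), best match first."""
    # Group each user's ratings together.
    by_user = defaultdict(list)
    for user, movie, rating in ratings:
        by_user[user].append((movie, rating))

    # For every pair of movies rated by the same user, collect both ratings.
    pair_ratings = defaultdict(list)
    for rated in by_user.values():
        for (m1, r1), (m2, r2) in combinations(sorted(rated), 2):
            pair_ratings[(m1, m2)].append((r1, r2))

    # Score the pairs that involve movie_id and apply both thresholds.
    results = []
    for (m1, m2), pairs in pair_ratings.items():
        if movie_id not in (m1, m2):
            continue
        score, count = cosine_similarity(pairs)
        if score > score_threshold and count >= min_co_ratings:
            other = m2 if m1 == movie_id else m1
            results.append((other, score, count))
    return sorted(results, key=lambda r: r[1], reverse=True)


if __name__ == "__main__":
    for other, score, count in similar_movies(RATINGS, 50):
        print(other, round(score, 4), count)
```

Lowering `score_threshold` lets weaker matches through, and raising `min_co_ratings` requires more users to have rated both movies before a pair is counted.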

Files

To Run

  • Install Spark & Python on your system.
spark-submit movie-similarities.py <id>

(`<id>` is the id of the movie to find similar movies for; for example, 50 is Star Wars.)

Maintainers

  • Vaibhav Magon
