This project implements a Recommendation System for Movies.
The client is interested in a targeted recommendation system to improve stickiness on the gaming site. We were hired to research a collaborative filtering system.
The data is the MovieLens 100K dataset from the GroupLens research lab at the University of Minnesota. It consists of 4 files:
- 100K user ratings of movies (0.5 - 5)
- Tags for descriptions
- Movies
- Links to tmdb, imdb
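
As a rough illustration of the ratings file, the sketch below parses a few rows into `(user, movie, rating)` tuples using only the standard library. It assumes the usual MovieLens CSV header (`userId,movieId,rating,timestamp`); the sample rows and the `load_ratings` helper are hypothetical and are not taken from the project code.

```python
import csv
import io

# Hypothetical sample rows, laid out like a MovieLens ratings file (assumed header).
SAMPLE = """userId,movieId,rating,timestamp
1,1,4.0,964982703
1,3,4.0,964981247
2,1,3.5,1445714835
"""


def load_ratings(handle):
    """Return a list of (user_id, movie_id, rating) tuples from an open CSV handle."""
    reader = csv.DictReader(handle)
    return [
        (int(row["userId"]), int(row["movieId"]), float(row["rating"]))
        for row in reader
    ]


# io.StringIO stands in for open("ratings.csv", newline="") in this sketch.
ratings = load_ratings(io.StringIO(SAMPLE))
print(ratings)
```

The timestamp column is dropped here because the models discussed in this README only need the user, the movie, and the rating.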
The data contains 100K user ratings from 600 users and 3,000 descriptive tags. The first word cloud shows the tags from all movies with a rating greater than 4.5. The second shows all of the tags for all tagged movies, regardless of rating.
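
The tag counts behind two such word clouds could be computed roughly as follows. This is a minimal sketch, not the notebook's code: it assumes "rating greater than 4.5" means the mean rating per movie, and the toy `ratings` and `tags` lists and both helper functions are hypothetical.

```python
from collections import Counter, defaultdict

# Toy data (hypothetical): (user, movie, rating) and (movie, tag) rows.
ratings = [(1, 10, 5.0), (2, 10, 5.0), (1, 20, 3.0), (2, 30, 4.5)]
tags = [(10, "classic"), (10, "heist"), (20, "boring"), (30, "classic"), (10, "Classic")]


def mean_ratings(rows):
    """Average rating per movie."""
    totals = defaultdict(lambda: [0.0, 0])
    for _, movie, rating in rows:
        totals[movie][0] += rating
        totals[movie][1] += 1
    return {movie: total / count for movie, (total, count) in totals.items()}


def tag_counts(rows, keep=None):
    """Count lower-cased tags, optionally restricted to the movies in `keep`."""
    return Counter(
        tag.lower() for movie, tag in rows if keep is None or movie in keep
    )


means = mean_ratings(ratings)
top_movies = {movie for movie, avg in means.items() if avg > 4.5}

top_tag_counts = tag_counts(tags, top_movies)  # input for the first word cloud
all_tag_counts = tag_counts(tags)              # input for the second word cloud
```

Either `Counter` can then be handed to whatever word cloud renderer is in use, since most accept a word-to-frequency mapping.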
We performed K-fold cross-validation on the movie ratings with matrix reduction algorithms and tuned them with grid search. We looked for minimal error, choosing RMSE as our main metric, and also tracked the time it takes to fit the model, since the model will need to be refit after a user updates their ratings.
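
The evaluation loop described above can be sketched without any third-party library. The example below is only an illustration under stated assumptions: it swaps in a simple regularized item-bias baseline rather than the matrix reduction models used in the project, the synthetic `data` is made up, and `kfold`, `fit_item_bias`, `rmse`, and `grid_search` are hypothetical helper names.

```python
import math
import random
import time
from collections import defaultdict


def kfold(data, k, seed=0):
    """Yield k (train, test) splits after one seeded shuffle."""
    rows = data[:]
    random.Random(seed).shuffle(rows)
    for i in range(k):
        test = rows[i::k]
        train = [row for j, row in enumerate(rows) if j % k != i]
        yield train, test


def fit_item_bias(train, reg):
    """Stand-in model: global mean plus a shrunken per-item offset."""
    mu = sum(r for _, _, r in train) / len(train)
    sums, counts = defaultdict(float), defaultdict(int)
    for _, item, r in train:
        sums[item] += r - mu
        counts[item] += 1
    bias = {item: sums[item] / (reg + counts[item]) for item in sums}
    return mu, bias


def rmse(model, test):
    """Root mean squared error; unseen items fall back to the global mean."""
    mu, bias = model
    squared = sum((r - (mu + bias.get(item, 0.0))) ** 2 for _, item, r in test)
    return math.sqrt(squared / len(test))


def grid_search(data, regs, k=5):
    """Return the best reg and {reg: (mean RMSE, mean fit seconds)}."""
    results = {}
    for reg in regs:
        scores, fit_seconds = [], 0.0
        for train, test in kfold(data, k):
            start = time.perf_counter()
            model = fit_item_bias(train, reg)
            fit_seconds += time.perf_counter() - start
            scores.append(rmse(model, test))
        results[reg] = (sum(scores) / k, fit_seconds / k)
    best = min(results, key=lambda reg: results[reg][0])
    return best, results


# Synthetic ratings (hypothetical): 20 users x 10 items with a little noise.
rng = random.Random(42)
data = [
    (user, item, 3.0 + (item % 3 - 1) + rng.choice([-0.5, 0.0, 0.5]))
    for user in range(20)
    for item in range(10)
]

regs = [0.0, 1.0, 5.0, 25.0]
best, results = grid_search(data, regs)
print("best reg:", best, "mean RMSE:", round(results[best][0], 4))
```

Timing only the fit call, not the scoring, mirrors the concern above: the fit is the step that has to be repeated when a user updates their ratings.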
We used a model-based collaborative filtering approach for this first implementation.
The best predictive results came from the SVD algorithm, with an RMSE of 0.8696 and a fit time of 4.99 seconds, which was reduced to 2 seconds with hyperparameter tuning.
Model | RMSE | MAE | Fit Time (sec) |
---|---|---|---|
SVD | 0.8696 | 0.6670 | 4.99 |
BaselineOnlyALS | 0.8725 | 0.6727 | 0.25 |
KNNwMeans | 0.8924 | 0.6815 | 0.24 |
CoClustering | 0.9420 | 0.7295 | 2.89 |
KNNBasic | 0.9458 | 0.7254 | 0.18 |
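
To make the SVD entry more concrete, here is a minimal sketch of a biased matrix factorization trained with stochastic gradient descent, which is the technique recommender libraries commonly expose under the name "SVD". It is an assumption that the project's SVD model works this way. The toy ratings, the hyperparameter defaults, and the function names are hypothetical, and the code is not tuned for speed.

```python
import math
import random


def train_mf(ratings, n_factors=2, n_epochs=200, lr=0.01, reg=0.02, seed=0):
    """Fit r_ui ~ mu + b_u + b_i + p_u . q_i by SGD on the observed ratings."""
    rng = random.Random(seed)
    mu = sum(r for _, _, r in ratings) / len(ratings)
    users = sorted({u for u, _, _ in ratings})
    items = sorted({i for _, i, _ in ratings})
    bu = {u: 0.0 for u in users}
    bi = {i: 0.0 for i in items}
    p = {u: [rng.gauss(0, 0.1) for _ in range(n_factors)] for u in users}
    q = {i: [rng.gauss(0, 0.1) for _ in range(n_factors)] for i in items}
    for _ in range(n_epochs):
        for u, i, r in ratings:
            dot = sum(a * b for a, b in zip(p[u], q[i]))
            err = r - (mu + bu[u] + bi[i] + dot)
            bu[u] += lr * (err - reg * bu[u])
            bi[i] += lr * (err - reg * bi[i])
            for f in range(n_factors):
                puf, qif = p[u][f], q[i][f]  # use the old values for both updates
                p[u][f] += lr * (err * qif - reg * puf)
                q[i][f] += lr * (err * puf - reg * qif)
    return mu, bu, bi, p, q


def predict(model, u, i):
    """Estimate a rating; unknown users or items fall back to the known terms."""
    mu, bu, bi, p, q = model
    est = mu + bu.get(u, 0.0) + bi.get(i, 0.0)
    if u in p and i in q:
        est += sum(a * b for a, b in zip(p[u], q[i]))
    return min(5.0, max(0.5, est))  # clip to the 0.5 - 5 rating scale


def rmse(model, ratings):
    return math.sqrt(
        sum((r - predict(model, u, i)) ** 2 for u, i, r in ratings) / len(ratings)
    )


def mae(model, ratings):
    return sum(abs(r - predict(model, u, i)) for u, i, r in ratings) / len(ratings)


# Toy ratings (hypothetical): 6 users x 5 items with user and item effects.
ratings = [
    (u, i, 3.0 + 0.5 * (u % 3 - 1) + (i % 3 - 1))
    for u in range(6)
    for i in range(5)
]

untrained = train_mf(ratings, n_epochs=0)  # global mean plus tiny random factors
model = train_mf(ratings)
print("RMSE before:", round(rmse(untrained, ratings), 4))
print("RMSE after: ", round(rmse(model, ratings), 4))
print("MAE after:  ", round(mae(model, ratings), 4))
```

These errors are measured on the training data purely to show the fit improving; a comparison between models should use held-out folds. On any fixed set of predictions, MAE can never exceed RMSE, which is a quick sanity check when reading error tables.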
Challenges
- Finalize the web app so it is fully operational
- Implement a hybrid collaborative model
- NLP on a sentence of interest, mood, or type
- Flexibility to switch between recommender models
See the full analysis in the Jupyter Notebook (RecSys_notebook.ipynb) or review the presentation (presentation.pdf).
For additional info, contact Daniel M. Smith at danielmsmith1@gmail.com
├── code
│ ├── init.py
│ ├── __.py
├── data
├── images
├── init.py
├── README.md
├── presentation.pdf
├── gitbhub.pdf
├── notebook_RecSys.pdf
├── RecSys_notebook.ipynb