Movie Recommender Case Study (Galvanize g88 - Spring 2019)

Movie Recommender Case Study


  • The Team's presentation slides can be found here
  • src
    • model & code for sharing updates to Slack
  • exploratory_data_analysis
    • 4 Jupyter Notebooks covering the EDA performed
      • focus on genre and movie
  • Results
    • Model predicts only marginally better than random guessing
    • Average rating for suggested content is 3.59 (our model) vs. 3.54 (random)
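The comparison in Results can be sketched as follows. This is a minimal illustration, not the team's scoring code: it assumes the score is the mean actual rating of each user's highest-scored movies, and the column names (`user`, `actual`, `model_score`, `random_score`), the helper `mean_rating_of_top_picks`, and the toy data are all hypothetical.

```python
import numpy as np
import pandas as pd

def mean_rating_of_top_picks(df, score_col, top_n=2):
    """Average actual rating of each user's top_n movies ranked by score_col.

    Assumption: suggestions are the top_n highest-scored movies per user.
    """
    top = (
        df.sort_values(score_col, ascending=False)
          .groupby("user")
          .head(top_n)
    )
    # Average within each user first, then across users.
    return top.groupby("user")["actual"].mean().mean()

# Toy data (made up for illustration only).
toy = pd.DataFrame({
    "user":        [1, 1, 1, 1, 2, 2, 2, 2],
    "actual":      [5, 4, 2, 1, 3, 5, 1, 4],
    "model_score": [4.8, 4.1, 2.5, 1.0, 3.0, 4.5, 1.2, 2.0],
})

# Random baseline: score every movie with a random number.
rng = np.random.default_rng(0)
toy["random_score"] = rng.random(len(toy))

model_avg = mean_rating_of_top_picks(toy, "model_score")
random_avg = mean_rating_of_top_picks(toy, "random_score")
oracle_avg = mean_rating_of_top_picks(toy, "actual")  # best achievable

print(f"model: {model_avg:.2f}  random: {random_avg:.2f}  oracle: {oracle_avg:.2f}")
```

Including an oracle score (ranking by the actual ratings) gives an upper bound, which helps put a small gap between model and random into context.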


Today you are going to have a little friendly competition with your classmates.

You are going to build a recommendation system based on data from the MovieLens dataset. It includes movie information, user information, and the users' ratings. Your goal is to build a recommendation system and to suggest movies to users!

The movies data and user data are in data/movies.dat and data/users.dat.

The ratings data can be found in data/training.csv. The users' ratings have been broken into a training and test set for you (to obtain the test set, we split off the most recent 20% of the ratings).
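A time-based split like the one described above can be sketched as follows, which is handy for carving a validation set out of the training data in the same way. The column name `timestamp`, the helper `split_most_recent`, and the toy rows are assumptions for illustration; check the actual header of data/training.csv before reusing this.

```python
import pandas as pd

def split_most_recent(ratings, test_frac=0.2, time_col="timestamp"):
    """Hold out the most recent test_frac of ratings, ordered by time_col."""
    ordered = ratings.sort_values(time_col, kind="stable").reset_index(drop=True)
    n_test = int(round(len(ordered) * test_frac))
    cutoff = len(ordered) - n_test
    return ordered.iloc[:cutoff], ordered.iloc[cutoff:]

# Toy ratings (made up); in practice this would come from pd.read_csv.
ratings = pd.DataFrame({
    "user":      [1, 1, 2, 2, 3, 3, 4, 4, 5, 5],
    "movie":     [10, 11, 10, 12, 11, 13, 12, 13, 10, 11],
    "rating":    [5, 3, 4, 2, 1, 5, 4, 3, 2, 5],
    "timestamp": [50, 10, 90, 30, 70, 20, 100, 40, 60, 80],
})

train, holdout = split_most_recent(ratings)
print(len(train), len(holdout))
```

Splitting by time rather than at random means the holdout set contains only ratings made after everything in the training portion, which mirrors how a recommender is used in practice.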

The Team's Matrix Decomposition Model

  • Assume “latent features” in our movies and users
    • Use Alternating Least Squares (ALS) to estimate the latent features
      • Fall back to the average rating per movie when confronted with a new user
      • Thumbs Method
        • Converts ratings to a binary system of approval and disapproval based on ratings above and below 4
      • The Thumbs method takes user bias into account and is intuitive
    • Pros
      • Identifies Hidden Connections Organically
      • Calculates all known user-movie combos
    • Areas for Improvement
      • Fits poorly on “sparse” data
      • Does not solve the “cold start” problem
  • Reflection
    • What went well?
      • Good ideation
        • Feature Engineering
        • “Thumbs” method
        • EDA
      • The real 5-star rating is the friends you make along the way :)
    • Potential Future Work
      • Adjust predicted ratings based on
        • Average Movie Rating
        • User Genre Profile (cosine similarity)
        • Movie similarity to highly rated movies
      • Compare models
        • Adjust hyperparameters
        • Try linear regression
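The matrix decomposition approach outlined in this section (latent features fit with ALS, the Thumbs conversion, and the per-movie average fallback for new users) can be sketched in plain NumPy. This is an illustrative sketch, not the team's implementation: the function names, the hyperparameters `k`, `reg`, and `n_iters`, and the toy ratings are hypothetical, and it assumes a rating of exactly 4 counts as a thumbs up, which the outline does not specify.

```python
import numpy as np

def to_thumbs(stars, threshold=4):
    """Convert star ratings to 1.0 (up) / 0.0 (down); NaN stays NaN.

    Assumption: ratings >= threshold count as a thumbs up.
    """
    thumbs = np.full(stars.shape, np.nan)
    seen = ~np.isnan(stars)
    thumbs[seen] = stars[seen] >= threshold
    return thumbs

def als(R, k=2, reg=0.1, n_iters=20, seed=0):
    """Fit R ~ U @ V.T on observed (non-NaN) entries by alternating ridge solves."""
    mask = ~np.isnan(R)
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = rng.normal(scale=0.1, size=(n_users, k))
    V = rng.normal(scale=0.1, size=(n_items, k))
    ridge = reg * np.eye(k)
    for _ in range(n_iters):
        # Hold movie factors fixed and solve for each user's factors.
        for u in range(n_users):
            seen = mask[u]
            if seen.any():
                Vs = V[seen]
                U[u] = np.linalg.solve(Vs.T @ Vs + ridge, Vs.T @ R[u, seen])
        # Hold user factors fixed and solve for each movie's factors.
        for i in range(n_items):
            seen = mask[:, i]
            if seen.any():
                Us = U[seen]
                V[i] = np.linalg.solve(Us.T @ Us + ridge, Us.T @ R[seen, i])
    return U, V

def predict_for_user(user_idx, U, V, R):
    """Predicted scores for all movies; unknown users get per-movie averages."""
    if user_idx is None or user_idx >= U.shape[0]:
        return np.nanmean(R, axis=0)  # cold-start fallback
    return U[user_idx] @ V.T

nan = np.nan
# Toy star ratings (made up): rows are users, columns are movies.
stars = np.array([
    [5, 4, nan, 1],
    [4, nan, 5, 2],
    [nan, 5, 4, 1],
    [1, 2, nan, 5],
    [2, nan, 1, 4],
])

thumbs = to_thumbs(stars)
U, V = als(thumbs)

observed = ~np.isnan(thumbs)
rmse = np.sqrt(np.mean(((U @ V.T)[observed] - thumbs[observed]) ** 2))
cold = predict_for_user(None, U, V, thumbs)
print(f"train RMSE: {rmse:.3f}")
print("new-user scores:", np.round(cold, 2))
```

Each ALS half-step is an ordinary ridge regression, so the fit only ever touches observed user-movie pairs. The fallback makes explicit why the "cold start" problem remains open: a new user receives the same per-movie averages as every other new user.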
