# Item-Based Collaborative Filtering

## Summary

This model uses item-based collaborative filtering. Recommendations are calculated from the correlations between movies, weighted by the movies in the input profile and the ratings given to them. A rating of 1 has the reverse effect: it steers the recommendations away from similar movies.
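
The idea can be sketched in a few lines of pandas. This is an illustrative sketch, not the code behind `MovieRecommender`: the toy ratings, the `recommend` helper, and the choice to center ratings on a midpoint of 3.0 are all assumptions, made only to show why a low rating can push correlated movies down.

```python
import pandas as pd

# Toy user x movie rating matrix (hypothetical data, not MovieLens).
ratings = pd.DataFrame(
    {
        "Home Alone": [5, 4, 1, 2, 5],
        "Aladdin": [5, 5, 2, 1, 4],
        "Seven": [1, 2, 5, 5, 2],
        "Alien": [2, 1, 5, 4, 1],
    },
    index=["u1", "u2", "u3", "u4", "u5"],
    dtype=float,
)

# Item-item Pearson correlation: every movie column against every other.
item_corr = ratings.corr(method="pearson")


def recommend(profile, corr, midpoint=3.0):
    """Score unseen movies as a correlation-weighted sum of centered ratings.

    Centering on `midpoint` is an assumption of this sketch: ratings below
    it get a negative weight, so movies correlated with them are pushed down.
    """
    scores = pd.Series(0.0, index=corr.index)
    for title, rating in profile.items():
        scores += corr[title] * (rating - midpoint)
    # Remove the movies the user already rated and rank the rest.
    return scores.drop(labels=list(profile)).sort_values(ascending=False)


liked = recommend({"Seven": 5.0}, item_corr)
disliked = recommend({"Seven": 1.0}, item_corr)
print(liked)
print(disliked)
```

In this toy data, rating "Seven" 5.0 ranks "Alien" first, while rating it 1.0 flips the signs and ranks "Alien" last.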

## Demo

Get the data from the MovieLens database, generate the correlation matrix, and calculate the average score for each movie.

In [1]:
import os

from models.spark import preprocessing as pp

if not os.path.exists('../data/processed/movie_ratings.csv'):
    pp.load_and_transform(dataset='ml-100k')
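
For readers who want a feel for what this step involves, the following is a minimal pandas sketch of building a movie correlation matrix and per-movie averages from long-format ratings. It is not the implementation of `pp.load_and_transform`; the sample rows, the `userId`/`title` column names, and the `min_periods` threshold are assumptions chosen for illustration.

```python
import pandas as pd

# Long-format ratings shaped like (user, movie, rating) rows.
# The values are made up for illustration.
raw = pd.DataFrame(
    {
        "userId": [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4],
        "title": ["A", "B", "C", "A", "B", "C", "A", "B", "C", "A", "B"],
        "rating": [5.0, 4.0, 1.0, 4.0, 5.0, 2.0, 1.0, 2.0, 5.0, 2.0, 1.0],
    }
)

# Pivot to a user x movie matrix; missing ratings stay NaN.
user_movie = raw.pivot_table(index="userId", columns="title", values="rating")

# Pairwise Pearson correlation between movies. `min_periods` skips pairs
# with too few users in common (the threshold here is an arbitrary choice).
corr_matrix = user_movie.corr(method="pearson", min_periods=3)

# Per-movie average rating and number of ratings.
movie_stats = raw.groupby("title")["rating"].agg(rating="mean", ratingCount="count")

print(corr_matrix.round(2))
print(movie_stats)
```

A minimum overlap matters on real data: two movies rated by only a handful of the same users can show a spuriously perfect correlation.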

Initialize the recommendation module.

In [2]:
from models.itembased import MovieRecommender

model = MovieRecommender()

Now create a profile and see what the model recommends.

In [3]:
from models import RatedMovie

movies = model.get_recommendations(profile=[
    RatedMovie('Home Alone (1990)', 5.0),
    RatedMovie('Rent-a-Kid (1995)', 5.0),
    RatedMovie('Seven (Se7en) (1995)', 5.0),
    RatedMovie('E.T. the Extra-Terrestrial (1982)', 4.0),
])

movies.head(10)

| Unnamed: 0 | movieId | preference_score | title | rating | ratingCount | score |
|---|---|---|---|---|---|---|
| 171 | 174 | 2.656222 | Raiders of the Lost Ark (1981) | 4.252381 | 420 | 2.735863 |
| 201 | 204 | 2.647904 | Back to the Future (1985) | 3.834286 | 350 | 2.646994 |
| 169 | 172 | 2.617145 | Empire Strikes Back, The (1980) | 4.20436 | 367 | 2.636031 |
| 67 | 69 | 2.712046 | Forrest Gump (1994) | 3.853583 | 321 | 2.575977 |
| 77 | 79 | 2.602998 | Fugitive, The (1993) | 4.044643 | 336 | 2.539488 |
| 207 | 210 | 2.612617 | Indiana Jones and the Last Crusade (1989) | 3.930514 | 331 | 2.526427 |
| 80 | 82 | 2.928413 | Jurassic Park (1993) | 3.720307 | 261 | 2.479591 |
| 26 | 28 | 2.661909 | Apollo 13 (1995) | 3.931159 | 276 | 2.322538 |
| 192 | 195 | 2.461329 | Terminator, The (1984) | 3.933555 | 301 | 2.253258 |
| 20 | 22 | 2.458629 | Braveheart (1995) | 4.151515 | 297 | 2.233889 |


Now create a family-friendly version of this profile by rating 'Seven (Se7en) (1995)' 1.0 instead of 5.0, and see what the model recommends.

In [4]:
from models import RatedMovie

family_friendly = model.get_recommendations(profile=[
    RatedMovie('Home Alone (1990)', 5.0),
    RatedMovie('Rent-a-Kid (1995)', 5.0),
    RatedMovie('Seven (Se7en) (1995)', 1.0),
    RatedMovie('E.T. the Extra-Terrestrial (1982)', 4.0),
])

family_friendly.head(10)

| Unnamed: 0 | movieId | preference_score | title | rating | ratingCount | score |
|---|---|---|---|---|---|---|
| 201 | 204 | 1.667173 | Back to the Future (1985) | 3.834286 | 350 | 1.6666 |
| 80 | 82 | 1.947777 | Jurassic Park (1993) | 3.720307 | 261 | 1.649252 |
| 390 | 393 | 2.247687 | Mrs. Doubtfire (1993) | 3.411458 | 192 | 1.636718 |
| 207 | 210 | 1.64095 | Indiana Jones and the Last Crusade (1989) | 3.930514 | 331 | 1.586815 |
| 67 | 69 | 1.660524 | Forrest Gump (1994) | 3.853583 | 321 | 1.577212 |
| 169 | 172 | 1.555961 | Empire Strikes Back, The (1980) | 4.20436 | 367 | 1.56719 |
| 171 | 174 | 1.516067 | Raiders of the Lost Ark (1981) | 4.252381 | 420 | 1.561523 |
| 26 | 28 | 1.72611 | Apollo 13 (1995) | 3.931159 | 276 | 1.506046 |
| 69 | 71 | 1.781792 | Lion King, The (1994) | 3.781818 | 220 | 1.383185 |
| 92 | 95 | 1.755203 | Aladdin (1992) | 3.812785 | 219 | 1.359528 |
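
To make the effect of the low rating easier to see, the two top-10 lists can be compared directly. The snippet below is a small sketch that hard-codes the titles from the two outputs above; the variable names `original_top10`, `family_top10`, `entered`, and `dropped` are introduced here for illustration only.

```python
# Top-10 titles copied from the first profile's output.
original_top10 = [
    "Raiders of the Lost Ark (1981)",
    "Back to the Future (1985)",
    "Empire Strikes Back, The (1980)",
    "Forrest Gump (1994)",
    "Fugitive, The (1993)",
    "Indiana Jones and the Last Crusade (1989)",
    "Jurassic Park (1993)",
    "Apollo 13 (1995)",
    "Terminator, The (1984)",
    "Braveheart (1995)",
]

# Top-10 titles copied from the family-friendly profile's output.
family_top10 = [
    "Back to the Future (1985)",
    "Jurassic Park (1993)",
    "Mrs. Doubtfire (1993)",
    "Indiana Jones and the Last Crusade (1989)",
    "Forrest Gump (1994)",
    "Empire Strikes Back, The (1980)",
    "Raiders of the Lost Ark (1981)",
    "Apollo 13 (1995)",
    "Lion King, The (1994)",
    "Aladdin (1992)",
]

# Titles that appear only after 'Seven' is rated 1.0, and titles that vanish.
entered = [t for t in family_top10 if t not in original_top10]
dropped = [t for t in original_top10 if t not in family_top10]

print("Entered:", entered)
# Entered: ['Mrs. Doubtfire (1993)', 'Lion King, The (1994)', 'Aladdin (1992)']
print("Dropped:", dropped)
# Dropped: ['Fugitive, The (1993)', 'Terminator, The (1984)', 'Braveheart (1995)']
```

In the notebook itself, the same lists could presumably be taken from the result frames, for example `movies.head(10)['title'].tolist()`, given that the outputs show a `title` column.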


Finally, create an action and crime profile and see what the model recommends.

In [5]:
from models.itembased import RatedMovie

action_and_crime = model.get_recommendations(profile=[
    RatedMovie('Terminator 2: Judgment Day (1991)', 5.0),
    RatedMovie('Silence of the Lambs, The (1991)', 4.0),
    RatedMovie('Seven (Se7en) (1995)', 5.0),
])

action_and_crime.head(10)

| Unnamed: 0 | movieId | preference_score | title | rating | ratingCount | score |
|---|---|---|---|---|---|---|
| 54 | 56 | 2.997991 | Pulp Fiction (1994) | 4.060914 | 394 | 3.054396 |
| 170 | 174 | 2.95637 | Raiders of the Lost Ark (1981) | 4.252381 | 420 | 3.04501 |
| 191 | 195 | 3.166409 | Terminator, The (1984) | 3.933555 | 301 | 2.898733 |
| 77 | 79 | 2.935267 | Fugitive, The (1993) | 4.044643 | 336 | 2.86365 |
| 168 | 172 | 2.804641 | Empire Strikes Back, The (1980) | 4.20436 | 367 | 2.824881 |
| 172 | 176 | 2.905088 | Aliens (1986) | 3.947183 | 284 | 2.574647 |
| 200 | 204 | 2.560548 | Back to the Future (1985) | 3.834286 | 350 | 2.559668 |
| 179 | 183 | 2.752769 | Alien (1979) | 4.034364 | 291 | 2.472763 |
| 20 | 22 | 2.690105 | Braveheart (1995) | 4.151515 | 297 | 2.444205 |
| 67 | 69 | 2.564026 | Forrest Gump (1994) | 3.853583 | 321 | 2.435384 |
