# Simple Item-Based Collaborative Filtering

This notebook demonstrates a simple implementation of item-based collaborative filtering using the MovieLens dataset.

The process includes:
1. Loading the MovieLens dataset
2. Training an item-based KNN model
3. Finding similar items to those rated by a test user
4. Generating personalized recommendations

In [1]:
# Import required libraries
from recsys.MovieLens import MovieLens
from surprise import KNNBasic
import heapq
from collections import defaultdict
from operator import itemgetter

## Setup and Data Loading

We'll use user ID 85 as our test subject and set k=10, the number of the user's highest-rated items that serve as seeds for finding similar items.

In [None]:
# Set parameters
testSubject = '85'
k = 10

# Load the MovieLens dataset
ml, data, ratings = MovieLens.load()

# Build the training set
trainSet = data.build_full_trainset()
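
Surprise's training set refers to users and items by contiguous integer inner IDs rather than by the raw IDs found in the data file, which is why later cells call `to_inner_uid` and `to_raw_iid`. Raw IDs read from a file are strings, hence `testSubject = '85'`. The sketch below illustrates that kind of bookkeeping on toy data; `build_id_maps` and the sample IDs are hypothetical and are not part of Surprise.

```python
def build_id_maps(raw_ids):
    """Assign each distinct raw ID the next free integer, in order of first appearance."""
    raw_to_inner = {}
    for raw in raw_ids:
        if raw not in raw_to_inner:
            raw_to_inner[raw] = len(raw_to_inner)
    # Reverse lookup, analogous in spirit to to_raw_iid
    inner_to_raw = {inner: raw for raw, inner in raw_to_inner.items()}
    return raw_to_inner, inner_to_raw


# Hypothetical raw item IDs as they might appear in a ratings file (note the repeat)
raw_item_ids = ["318", "50", "318", "2571"]
raw_to_inner, inner_to_raw = build_id_maps(raw_item_ids)

print(raw_to_inner)
print(inner_to_raw[2])
```

Inner IDs double as row and column indices into the similarity matrix, so they must be converted back to raw IDs before looking up movie titles.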

## Train the Model

We'll train an item-based KNN model using cosine similarity.
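
To make the similarity measure concrete, here is a minimal sketch of cosine similarity between two items, computed only over the users who rated both, which is how Surprise documents its cosine measure. The function name `cosine_on_common_users` and the toy ratings are assumptions for illustration; the library's own implementation is vectorized and also supports options such as `min_support`.

```python
from math import sqrt


def cosine_on_common_users(ratings_i, ratings_j):
    """Cosine similarity between two items, restricted to users who rated both.

    Each argument maps a user ID to that user's rating of the item.
    Returns 0.0 when the items share no raters.
    """
    common = ratings_i.keys() & ratings_j.keys()
    if not common:
        return 0.0
    dot = sum(ratings_i[u] * ratings_j[u] for u in common)
    norm_i = sqrt(sum(ratings_i[u] ** 2 for u in common))
    norm_j = sqrt(sum(ratings_j[u] ** 2 for u in common))
    return dot / (norm_i * norm_j)


# Hypothetical toy ratings: {user: rating} for three items
item_a = {"u1": 5.0, "u2": 3.0, "u3": 4.0}
item_b = {"u1": 4.0, "u2": 2.0}
item_c = {"u4": 1.0}

sim_ab = cosine_on_common_users(item_a, item_b)  # shares u1 and u2
sim_ac = cosine_on_common_users(item_a, item_c)  # no shared raters

print(round(sim_ab, 3), sim_ac)
```

Because ratings are all positive, cosine scores on co-rated items tend to cluster near 1, which is worth keeping in mind when reading the recommendation scores later on.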

In [None]:
# Configure and train the model
sim_options = {
    'name': 'cosine',
    'user_based': False
}

model = KNNBasic(sim_options=sim_options)
model.fit(trainSet)
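# fit() already computes the item-item similarity matrix and stores it in model.sim,
# so there is no need to compute it a second time
simsMatrix = model.sim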

## Generate Recommendations

For our test user, we'll:
1. Get their k highest-rated items
2. Look up each of those items' similarity to every other item
3. Score each candidate by summing similarity weighted by the user's rating (scaled to 0-1), then skip items the user has already rated

In [None]:
# Get the test user's inner ID
testUserInnerID = trainSet.to_inner_uid(testSubject)

# Get the top K items rated by the user
testUserRatings = trainSet.ur[testUserInnerID]
kNeighbors = heapq.nlargest(k, testUserRatings, key=lambda t: t[1])

# Find similar items to those rated highly by the user
candidates = defaultdict(float)
for itemID, rating in kNeighbors:
    similarityRow = simsMatrix[itemID]
    for innerID, score in enumerate(similarityRow):
        candidates[innerID] += score * (rating / 5.0)

# Track items the user has already rated so they are not recommended again
watched = {itemID for itemID, _ in testUserRatings}

# Get and display top recommendations
print("Top 10 recommendations for user", testSubject, ":\n")
pos = 0
for itemID, ratingSum in sorted(candidates.items(), key=itemgetter(1), reverse=True):
    if itemID not in watched:
        movieID = trainSet.to_raw_iid(itemID)
        print(ml.getMovieName(int(movieID)), ratingSum)
        pos += 1
        if pos >= 10:  # stop after exactly 10 recommendations
            break
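
The scoring loop above can be traced by hand on a tiny example. The sketch below repeats the same steps (pick the top-k rated items, accumulate rating-weighted similarities, drop already-rated items) on a hypothetical 4x4 similarity matrix; the `recommend` helper and all toy values are assumptions for illustration, not part of the notebook's pipeline.

```python
import heapq
from collections import defaultdict
from operator import itemgetter


def recommend(user_ratings, sims, k=2, n=2):
    """Return up to n (item, score) pairs for items the user has not rated.

    user_ratings: list of (inner item ID, rating) pairs
    sims: square item-item similarity matrix indexed by inner item ID
    """
    # Seeds: the user's k highest-rated items
    top_rated = heapq.nlargest(k, user_ratings, key=itemgetter(1))

    # Accumulate similarity to each seed, weighted by the rating scaled to 0-1
    candidates = defaultdict(float)
    for item, rating in top_rated:
        for other, score in enumerate(sims[item]):
            candidates[other] += score * (rating / 5.0)

    # Drop items the user has already rated, then keep the n best
    seen = {item for item, _ in user_ratings}
    ranked = sorted(candidates.items(), key=itemgetter(1), reverse=True)
    return [(item, score) for item, score in ranked if item not in seen][:n]


# Hypothetical similarity matrix for four items (symmetric, 1.0 on the diagonal)
toy_sims = [
    [1.0, 0.2, 0.9, 0.1],
    [0.2, 1.0, 0.3, 0.8],
    [0.9, 0.3, 1.0, 0.4],
    [0.1, 0.8, 0.4, 1.0],
]
# The toy user rated item 0 with 5.0 and item 1 with 2.5
toy_user_ratings = [(0, 5.0), (1, 2.5)]

toy_recs = recommend(toy_user_ratings, toy_sims)
print(toy_recs)
```

Item 2 ranks first because it is very similar to item 0, which the user rated 5.0; item 3 is more similar to item 1, but that item's lower rating halves its contribution.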