## KNN Item-Based Collaborative Filtering

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

To build a movie recommender, I use the MovieLens datasets.
URL - https://grouplens.org/datasets/movielens/latest/

In [2]:
movies_data = pd.read_csv('./Datasets/ml-25m/movies.csv', usecols=['movieId', 'title'])
rating_data = pd.read_csv('./Datasets/ml-25m/ratings.csv', usecols=['userId', 'movieId', 'rating'])

First, we need to transform the ratings dataframe into a format that a KNN model can consume: one row per movie, one column per user, with each cell holding that user's rating for that movie.

In [3]:
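# Join titles onto ratings via the shared movieId column.
data = pd.merge(movies_data, rating_data, on='movieId')
# Pivot the first 10,000,000 rows into a movie-by-user table;
# movies a user has not rated get 0.
user_movie_table = data.iloc[:10000000, :].pivot(
    index='title',
    columns='userId',
    values='rating'
).fillna(0)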
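
To see what the pivot produces, here is a minimal sketch on a toy ratings frame. The titles and ratings are made up for illustration and are not taken from MovieLens. Each row becomes one movie's rating vector across users, and unrated cells are filled with 0.

```python
import pandas as pd

# Hypothetical toy data: three users rating three movies.
toy = pd.DataFrame({
    'userId': [1, 1, 2, 3, 3],
    'title': ['A', 'B', 'A', 'B', 'C'],
    'rating': [4.0, 3.0, 5.0, 2.0, 1.0],
})

# Rows are movies, columns are users; missing ratings become 0.
toy_table = toy.pivot(index='title', columns='userId', values='rating').fillna(0)
print(toy_table)
```

Note that `pivot` raises a `ValueError` when the same (title, userId) pair appears more than once; in that case `pivot_table` with an aggregation function is one possible alternative.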

In [4]:
user_movie_table

title \ userId,1,2,3,4,5,6,7,8,9,10,...,162532,162533,162534,162535,162536,162537,162538,162539,162540,162541
'Til There Was You (1997),0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"'burbs, The (1989)",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.5
1-900 (06) (1994),0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
101 Dalmatians (1996),0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,5.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,3.0,3.0
12 Angry Men (1957),0.0,0.0,0.0,0.0,0.0,5.0,0.0,4.0,0.0,0.0,...,0.0,0.0,4.5,0.0,0.0,0.0,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
Young Guns II (1990),0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
"Young Poisoner's Handbook, The (1995)",0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
Zero Effect (1998),0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
Zero Kelvin (Kjærlighetens kjøtere) (1995),0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


## Let’s Make Some Movie Recommendations

In [5]:
from sklearn.neighbors import KNeighborsClassifier

In [6]:
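# n_neighbors = 11 because the query movie is its own nearest neighbor, leaving 10 recommendations.
# The titles are passed as labels only because fit() requires a target; only kneighbors() is used later.
recommender = KNeighborsClassifier(n_neighbors = 11, algorithm = 'brute', metric = 'cosine')
recommender.fit(user_movie_table.to_numpy(), user_movie_table.index)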

KNeighborsClassifier(algorithm='brute', leaf_size=30, metric='cosine',
                     metric_params=None, n_jobs=None, n_neighbors=11, p=2,
                     weights='uniform')
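
The classifier here serves only as a neighbor lookup; no class label is ever predicted. scikit-learn also provides `NearestNeighbors`, an unsupervised estimator that exposes the same `kneighbors` method without needing labels. A minimal sketch on a made-up matrix (the `toy_titles` and `toy_ratings` names and values are hypothetical):

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

# Hypothetical item-by-user rating matrix: 4 movies, 3 users.
toy_titles = ['A', 'B', 'C', 'D']
toy_ratings = np.array([
    [5.0, 4.0, 0.0],
    [4.0, 5.0, 0.0],
    [0.0, 0.0, 5.0],
    [5.0, 5.0, 1.0],
])

nn = NearestNeighbors(n_neighbors=3, algorithm='brute', metric='cosine')
nn.fit(toy_ratings)

# Query with movie 'A'; the query itself comes back first, at distance ~0.
dist, idx = nn.kneighbors(toy_ratings[[0]])
neighbors = [toy_titles[i] for i in idx[0] if toy_titles[i] != 'A']
print(neighbors)
```

This is an alternative sketch, not the approach used in this notebook; on the full table it would be fitted with `user_movie_table.to_numpy()` in the same way.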

In [7]:
query = 'Jumanji (1995)'
distances, indices = recommender.kneighbors(user_movie_table.loc[query].values.reshape(1, -1))
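
With `metric = 'cosine'`, the values in `distances` are cosine distances: one minus the cosine similarity between two rating vectors. As an illustration, the same quantity can be computed by hand with NumPy. The helper name `cosine_distance` and the vectors below are made up for this sketch.

```python
import numpy as np

def cosine_distance(u, v):
    """Return 1 - cos(u, v) for two non-zero 1-D vectors."""
    u = np.asarray(u, dtype=float)
    v = np.asarray(v, dtype=float)
    return 1.0 - np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

a = np.array([5.0, 4.0, 0.0])  # rated by users 1 and 2
b = np.array([4.0, 5.0, 0.0])  # rated by the same users, similar scores
c = np.array([0.0, 0.0, 5.0])  # rated only by user 3

print(cosine_distance(a, b))  # small: similar rating patterns
print(cosine_distance(a, c))  # 1.0: no raters in common
```

Because zeros stand in for missing ratings, two movies with no raters in common always end up at distance 1, regardless of how they were rated.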

In [8]:
print('recommendations for', query, 'are-', end = '\n\n')

# Map neighbor row indices back to titles, skipping the query movie itself.
recommendations = []
for i in indices[0]:
    movie = user_movie_table.index[i]
    if movie != query:
        recommendations.append(movie)

for movie in recommendations:
    print(movie)

recommendations for Jumanji (1995) are-

Mask, The (1994)
Lion King, The (1994)
Mrs. Doubtfire (1993)
Home Alone (1990)
Jurassic Park (1993)
Aladdin (1992)
Speed (1994)
Beauty and the Beast (1991)
Santa Clause, The (1994)
Waterworld (1995)


## Our recommender system works

The neighbors returned for Jumanji are all films released between 1990 and 1995, and several are family titles. We now have our own movie recommender.