# Assignment - Collaborative Filtering

- by Qi Sun

The requirements of this assignment are:

"Your task is to try different similarity methods: jaccard and centered-cosine (pearson’s)

You should implement using both user-user and item-item collaborative filtering.

You should implement using both (1) 1 to 5 ratings, and (2) 0 for ratings 1, 2, or 3, and 1 for ratings 4 and 5.

The above means that you’ll try 8 different methods!"
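The count of eight in the requirements above comes from three two-way choices. A minimal sketch that enumerates them, ordered to match the numbered sections of this notebook (the label strings are illustrative, not taken from the assignment):

```python
from itertools import product

# Three independent two-way choices; the label strings are illustrative.
rating_scales = ["1 to 5", "binary 0/1"]
approaches = ["user-user", "item-item"]
similarities = ["jaccard", "centered-cosine (pearson)"]

# 2 x 2 x 2 = 8 combinations
methods = list(product(rating_scales, approaches, similarities))
for number, (scale, approach, similarity) in enumerate(methods, start=1):
    print(f"{number}. {approach}, {scale} ratings, {similarity}")
```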

I built the recommenders for this assignment by adapting code from:

https://github.com/Arvind-Maan/Reco-Me/blob/master/movie_recommendation.py


I changed the original code to fit the assignment requirements and my dataset.


# Prepare data

In [252]:
import pandas as pd
import numpy as np

# survey results
df = pd.read_csv('https://raw.githubusercontent.com/susanqisun/DAV6300/main/movie%20recommender.csv')
df


Unnamed: 0,UserID,The Little Things,The White Tiger,The Dig,Soul,Wonder Woman 1984,Promising Young Woman
0,1,3,4,5,Not Sure,1,3.0
1,2,Not Sure,Not Sure,4,5,5,4.0
2,3,5,4,5,5,3,4.0
3,4,Not Sure,Not Sure,Not Sure,3,4,5.0
4,5,5,5,4,3,4,2.0
5,6,3,2,,3,4,5.0
6,7,,1,,,4,


In [253]:
# treat 'Not Sure' as a missing rating (np.nan; the np.NaN alias was removed in NumPy 2.0)
df02 = df.replace('Not Sure',np.nan)
df02

Unnamed: 0,UserID,The Little Things,The White Tiger,The Dig,Soul,Wonder Woman 1984,Promising Young Woman
0,1,3.0,4.0,5.0,,1,3.0
1,2,,,4.0,5.0,5,4.0
2,3,5.0,4.0,5.0,5.0,3,4.0
3,4,,,,3.0,4,5.0
4,5,5.0,5.0,4.0,3.0,4,2.0
5,6,3.0,2.0,,3.0,4,5.0
6,7,,1.0,,,4,


In [255]:
df03a = df02.copy()

def top(x):
    # x holds the rows of one user; index by UserID so that only the movie columns are summed
    x.set_index('UserID', inplace=True)
    # one-row table with the three highest-rated movies and their ratings
    result = pd.DataFrame({'1st Max':[],'Max1Value':[],'2nd Max':[],'Max2Value':[],'3rd Max':[],'Max3Value':[]})
    result.index.name='User'
    # missing ratings (NaN) sum to 0, so an unrated movie can show up here with value 0.0
    result.loc[x.index.values[0],['1st Max', '2nd Max','3rd Max']] = x.sum().nlargest(3).index.tolist()
    result.loc[x.index.values[0],['Max1Value', 'Max2Value','Max3Value']] = x.sum().nlargest(3).values
    return result

df_top = df03a.groupby('UserID').apply(top).reset_index(level=1, drop=True).reset_index()
df_top

Unnamed: 0,UserID,1st Max,Max1Value,2nd Max,Max2Value,3rd Max,Max3Value
0,1,The Dig,5.0,The White Tiger,4.0,The Little Things,3.0
1,2,Soul,5.0,Wonder Woman 1984,5.0,The Dig,4.0
2,3,The Little Things,5.0,The Dig,5.0,Soul,5.0
3,4,Promising Young Woman,5.0,Wonder Woman 1984,4.0,Soul,3.0
4,5,The Little Things,5.0,The White Tiger,5.0,The Dig,4.0
5,6,Promising Young Woman,5.0,Wonder Woman 1984,4.0,The Little Things,3.0
6,7,Wonder Woman 1984,4.0,The White Tiger,1.0,The Little Things,0.0


In [256]:
df03 = df02.fillna(0)
df03

Unnamed: 0,UserID,The Little Things,The White Tiger,The Dig,Soul,Wonder Woman 1984,Promising Young Woman
0,1,3,4,5,0,1,3.0
1,2,0,0,4,5,5,4.0
2,3,5,4,5,5,3,4.0
3,4,0,0,0,3,4,5.0
4,5,5,5,4,3,4,2.0
5,6,3,2,0,3,4,5.0
6,7,0,1,0,0,4,0.0


In [158]:
# read movie ID
movie = pd.read_csv('https://raw.githubusercontent.com/susanqisun/DAV6300/main/movieID.csv')
movie

Unnamed: 0,movieID,movie
0,101,The Little Things
1,102,The White Tiger
2,103,The Dig
3,104,Soul
4,105,Wonder Woman 1984
5,106,Promising Young Woman


In [193]:
df04 = df03.copy()
df04.index = np.arange(1, len(df03) + 1)
df04

Unnamed: 0,UserID,The Little Things,The White Tiger,The Dig,Soul,Wonder Woman 1984,Promising Young Woman
1,1,3,4,5,0,1,3.0
2,2,0,0,4,5,5,4.0
3,3,5,4,5,5,3,4.0
4,4,0,0,0,3,4,5.0
5,5,5,5,4,3,4,2.0
6,6,3,2,0,3,4,5.0
7,7,0,1,0,0,4,0.0


In [246]:
df05 = df04.drop(['UserID'], axis=1)

df_survey = df05.stack().reset_index()
df_survey.columns=['UserId', 'movie', 'rating']


In [360]:
df_survey.head()

Unnamed: 0,UserId,movie,rating
0,1,The Little Things,3
1,1,The White Tiger,4
2,1,The Dig,5
3,1,Soul,0
4,1,Wonder Woman 1984,1


In [354]:
# merge together
ratings01 = pd.merge(left=df_survey, right=movie, how='outer')

ratings01.sort_values(by='UserId')

# take an explicit copy so that the dtype change below does not trigger SettingWithCopyWarning
ratings = ratings01[['UserId','movieID','rating']].copy()


In [355]:
# change data type
ratings['rating'] = ratings['rating'].astype('int64')
ratings.dtypes


UserId     int64
movieID    int64
rating     int64
dtype: object

In [349]:
n_ratings = len(ratings)
n_movies = ratings['movieID'].nunique()
n_users = ratings['UserId'].nunique()

print(f"Number of ratings: {n_ratings}")
print(f"Number of unique movieId's: {n_movies}")
print(f"Number of unique users: {n_users}")
print(f"Average number of ratings per user: {round(n_ratings/n_users, 2)}")
print(f"Average number of ratings per movie: {round(n_ratings/n_movies, 2)}")

Number of ratings: 42
Number of unique movieId's: 6
Number of unique users: 7
Average number of ratings per user: 6.0
Average number of ratings per movie: 7.0
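The count of 42 above includes the zero placeholders that `fillna(0)` wrote for unrated movies. A minimal sketch of counting only the observed ratings, assuming 0 always means "not rated" (the matrix is copied from the filled table earlier in this notebook; the variable names are illustrative):

```python
import numpy as np

# User x movie matrix copied from the filled table above; 0 marks "not rated".
rating_matrix = np.array([
    [3, 4, 5, 0, 1, 3],
    [0, 0, 4, 5, 5, 4],
    [5, 4, 5, 5, 3, 4],
    [0, 0, 0, 3, 4, 5],
    [5, 5, 4, 3, 4, 2],
    [3, 2, 0, 3, 4, 5],
    [0, 1, 0, 0, 4, 0],
])

n_cells = rating_matrix.size                           # every user/movie pair
n_observed = int(np.count_nonzero(rating_matrix))      # pairs with a real rating
density = n_observed / n_cells

print(f"Cells in the matrix: {n_cells}")
print(f"Observed ratings: {n_observed}")
print(f"Density: {density:.2f}")
```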


# Create function for user-based recommender

https://github.com/Arvind-Maan/Reco-Me/blob/master/movie_recommendation.py


In [364]:
"""
similarity matrix
    returns a list of the rows most similar to [user] and how similar they are:
    [ (similarity, other), (similarity, other), ... ]
    uses [fn] to calculate the similarity
    and keeps only the top [clusters] entries
"""
def simularity_matrix(user, fn, clusters):
    # score every other row with fn; `user` and `other` are 0-based row indices into mu_matrix
    # NOTE: range(0, len(mu_matrix)-1) stops one short, so the last row is never considered
    toReturn = [(fn(user, other),other) for other in range(0,len(mu_matrix)-1) if other != user]
    # sort from most to least similar and keep the top `clusters`
    toReturn.sort(reverse=True)
    toReturn = toReturn[0:clusters] 
    return toReturn

In [365]:
import operator,collections

"""
filter_results
sorts all the recommended movies and returns an array of the top (num_of_recommendations)
parameters:
    all_outputs: a dictionary consisting of ALL recommendations
    num_of_recommendations: the number of recommendations we want
"""
def filter_results(all_outputs, num_of_recommendations):
    # sort the recommendations by predicted rating, highest first
    sorted_recommendations = sorted(all_outputs.items(), key=operator.itemgetter(1),reverse=True)
    # return the top n
    sorted_recommendations = sorted_recommendations[0:num_of_recommendations]
    return sorted_recommendations


In [366]:
"""
recommend
Recommends [k] movies to [user], using [fn] as the similarity function and the [clusters] most similar users as neighbours.
Parameters:
    user: the user we are recommending to (a 0-based row index into mu_matrix, so 1 selects the second row)
    clusters: how many similar users we compare to this user
    fn: the function used for calculating similarity
    k: the number of recommendations
"""
def recommend (user, clusters, fn, k):
    print("############ RECOMMENDATION FOR USER %d ############" %(user))
    print("Using %s for calculating similarity/distance with %d clusters" %(fn,clusters))
    # use the similarity function to find the neighbours whose ratings drive the predictions
    distances = simularity_matrix(user,fn,clusters)
    # for each (similarity, neighbour) pair
    all_unseen_movies = {}
    for dist,o in distances: 
        # each movie is a column index
        # NOTE: range(0, len(...)-1) stops one short, so the last column is never scored
        for i in range(0, len(mu_matrix[o])-1):
            # similarity-weighted rating, truncated to an integer
            predicted_rank = int(dist*mu_matrix[o][i])
            # only score movies that the neighbour rated and our user did not
            if (mu_matrix[o][i] > 0 and mu_matrix[user][i] <= 0):
                # if the movie is already a candidate, accumulate the similarity and the weighted rating
                all_unseen_movies[i] = (dist, [predicted_rank]) if i not in all_unseen_movies else (all_unseen_movies[i][0] + dist, all_unseen_movies[i][1] + [predicted_rank])   
    # turn each (similarity sum, weighted ratings) pair into a weighted average
    for m in all_unseen_movies: 
        all_unseen_movies[m] = sum(all_unseen_movies[m][1])/all_unseen_movies[m][0] if all_unseen_movies[m][0] != 0 else 0 
    # at this point we have a dictionary of all possible recommendations with the format:
    # { movie_index : predicted_rating }
    # pick the top k
    recommend_movies = filter_results(all_unseen_movies,k)
    # recommend_movies is a list formatted like so: [ (movie_index, predicted rating) ]
    # print it in a readable way:
    print_recommendations(recommend_movies)
    print("############ END OF RECOMMENDATION FOR USER %d ############" %(user))
    return recommend_movies
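`print_recommendations` is called above but is not defined anywhere in this notebook; it presumably comes from the borrowed script. A minimal sketch of such a helper, assuming each key is a 0-based row position in the `movie` table and that the output should follow the layout printed in the results below (the name `print_recommendations_sketch`, its `movies` parameter and the stand-in table are hypothetical):

```python
import pandas as pd

# Stand-in for the `movie` table loaded earlier in this notebook.
movie_table = pd.DataFrame({
    "movieID": [101, 102, 103, 104, 105, 106],
    "movie": ["The Little Things", "The White Tiger", "The Dig",
              "Soul", "Wonder Woman 1984", "Promising Young Woman"],
})

def print_recommendations_sketch(recommendations, movies=movie_table):
    """Print (row position, predicted rating) pairs; hypothetical helper."""
    lines = []
    for rank, (position, predicted) in enumerate(recommendations, start=1):
        row = movies.iloc[position]  # position is a 0-based row number
        lines.append(f"----Recommendation {rank}----")
        lines.append(f"{row['movieID']} | {row['movie']} | ")
        lines.append(f"Predicted Rating: {predicted:f}")
        lines.append("-" * 25)
    print("\n".join(lines))
    return lines

output_lines = print_recommendations_sketch([(0, 1.565217391304348), (1, 1.565217391304348)])
```

Returning the printed lines is a convenience for checking the output; the original helper may only print.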

# Use user1 as an example

In [367]:
"""
print_user_summary
Prints the user's survey answers and returns the user's top 3 movies
Parameters:
    the user
"""

def print_user_summary(user):
    # the header
    print("*****USER SUMMARY FOR USER %d*****" %(user))
    user_x = df.loc[user-1]
    print(user_x)
    print("==Users Top 3 Favourite Movies==")
    return df_top.loc[user-1]

In [290]:
print_user_summary(1)

*****USER SUMMARY FOR USER 1*****
UserID                          1
The Little Things               3
The White Tiger                 4
The Dig                         5
Soul                     Not Sure
Wonder Woman 1984               1
Promising Young Woman           3
Name: 0, dtype: object
==Users Top 3 Favourite Movies==


UserID                       1
1st Max                The Dig
Max1Value                    5
2nd Max        The White Tiger
Max2Value                    4
3rd Max      The Little Things
Max3Value                    3
Name: 0, dtype: object

# Part 1: User-based Recommender system


## 1. User-based Recommender using Jaccard similarity


In [375]:
movie_features = ratings.pivot(index='UserId',columns='movieID',values='rating').fillna(0)
mu_matrix = np.array(movie_features.values, dtype=int)
mu_matrix


array([[3, 4, 5, 0, 1, 3],
       [0, 0, 4, 5, 5, 4],
       [5, 4, 5, 5, 3, 4],
       [0, 0, 0, 3, 4, 5],
       [5, 5, 4, 3, 4, 2],
       [3, 2, 0, 3, 4, 5],
       [0, 1, 0, 0, 4, 0]])

In [368]:
"""
Jaccard Similarity
computes a Jaccard-style score on two rows of mu_matrix and returns the value
Parameters: the row indices of the two users we are comparing
"""
def jaccard_similarity(user1, user2):
    # textbook Jaccard similarity is |intersection| / |union| of the two sets of rated items
    # this variant sums the ratings instead of counting the items
    union_ab = []         # ratings on items rated by exactly one of the two rows (not a true union)
    intersection_ab = []  # user1's ratings on items rated by both rows
    for i in range (0, len(mu_matrix[user1])):
        if mu_matrix[user1][i] > 0 and mu_matrix[user2][i] > 0:
            intersection_ab.append(mu_matrix[user1][i])
        else: # can't be both
            if mu_matrix[user1][i] > 0:
                union_ab.append(mu_matrix[user1][i])
            elif mu_matrix[user2][i] > 0:
                union_ab.append(mu_matrix[user2][i])    
    # typically jaccard only factors in set membership
    # for the sake of the movie example, we use the actual ratings to see if this makes it more accurate
    # NOTE: the result is 1 - sum(rated by both) / sum(rated by exactly one); it is not bounded to [0, 1] and can be negative
    return 1 - (sum(intersection_ab)/sum(union_ab)) if sum(union_ab) != 0 else 0
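For comparison with the rating-weighted variant above, a minimal sketch of the textbook Jaccard similarity, which only looks at which items were rated. It assumes 0 means "not rated"; the function name `jaccard_on_rated_items` is illustrative, and the two example rows are the first two rows of the user-movie matrix in this notebook.

```python
import numpy as np

def jaccard_on_rated_items(ratings_a, ratings_b):
    """Textbook Jaccard: |rated by both| / |rated by either| (0 = not rated)."""
    rated_a = np.asarray(ratings_a) > 0
    rated_b = np.asarray(ratings_b) > 0
    union_size = np.count_nonzero(rated_a | rated_b)
    if union_size == 0:
        return 0.0  # neither row rated anything
    return np.count_nonzero(rated_a & rated_b) / union_size

user_1 = [3, 4, 5, 0, 1, 3]
user_2 = [0, 0, 4, 5, 5, 4]
example_similarity = jaccard_on_rated_items(user_1, user_2)
print(example_similarity)
```

This version always lies between 0 and 1, which makes it usable directly as a weight.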



In [285]:
# top 2 recommendations by using jaccard_similarity
# the first argument is a 0-based row index into mu_matrix, so 1 selects the second row (UserID 2)
recommend(1, 2, jaccard_similarity, 2)

############ RECOMMENDATION FOR USER 1 ############
Using <function jaccard_similarity at 0x12e845f80> for calculating similarity/distance with 2 clusters
----Recommendation 1----
101 | The Little Things | 
Predicted Rating: 1.565217
-------------------------
----Recommendation 2----
102 | The White Tiger | 
Predicted Rating: 1.565217
-------------------------
############ END OF RECOMMENDATION FOR USER 1 ############


[(0, 1.565217391304348), (1, 1.565217391304348)]

## 2. User-based Recommender using Pearson similarity


In [288]:
import math
"""
Pearson Similarity
returns the value of the correlation coefficient in the pearson similarity algorithm
Parameters: the row indices of the two users we are comparing
"""
def pearson_similarity(user1,user2):
    # get the ratings on items that both rows have rated
    # NOTE: the range stops at len(...)-1, so the last column is never compared
    common_rankings1 = [mu_matrix[user1][i] for i in range(0,len(mu_matrix[user1])-1) if mu_matrix[user1][i] > 0 and mu_matrix[user2][i] > 0]
    common_rankings2 = [mu_matrix[user2][i] for i in range(0,len(mu_matrix[user1])-1) if mu_matrix[user1][i] > 0 and mu_matrix[user2][i] > 0]
    # get the variables needed for the algorithm
    n = len(common_rankings2) # both lists have the same length, so either gives n
    # sum of the squared ratings of each row (np.square squares element-wise)
    sum_x_sqr = sum(np.square(common_rankings1))
    sum_y_sqr = sum(np.square(common_rankings2))
    # sum of the products of paired rankings
    sum_xy = sum([common_rankings1[i] * common_rankings2[i] for i in range (0, n)])
    # we have everything we need, find the correlation coefficient
    denom = math.sqrt((n * sum_x_sqr - (sum(common_rankings1)**2)) * (n * sum_y_sqr - (sum(common_rankings2)**2)))
    # NOTE: the textbook Pearson numerator is n*sum_xy - sum(x)*sum(y);
    # this line subtracts (sum(x) - sum(y)) instead, so the result is not bounded to [-1, 1].
    # It is kept as-is because the outputs below were produced with it.
    return ((n * sum_xy - (sum(common_rankings1) - sum(common_rankings2)))/denom) if denom != 0 else 0
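For reference, a minimal sketch of the textbook Pearson correlation (centered cosine) restricted to co-rated items. It assumes 0 means "not rated" and centers each row on its mean over the co-rated items only, which is one of several conventions; the function name `pearson_on_co_rated` is illustrative, and the example rows are the first and third rows of the user-movie matrix in this notebook.

```python
import numpy as np

def pearson_on_co_rated(ratings_a, ratings_b):
    """Pearson correlation over items rated by both rows (0 = not rated)."""
    a = np.asarray(ratings_a, dtype=float)
    b = np.asarray(ratings_b, dtype=float)
    both = (a > 0) & (b > 0)
    if np.count_nonzero(both) < 2:
        return 0.0  # not enough overlap to measure correlation
    a_centered = a[both] - a[both].mean()
    b_centered = b[both] - b[both].mean()
    denom = np.sqrt((a_centered ** 2).sum() * (b_centered ** 2).sum())
    if denom == 0:
        return 0.0  # one row is constant on the shared items
    return float((a_centered * b_centered).sum() / denom)

user_1 = [3, 4, 5, 0, 1, 3]
user_3 = [5, 4, 5, 5, 3, 4]
example_correlation = pearson_on_co_rated(user_1, user_3)
print(round(example_correlation, 4))
```

Unlike the function above, this value always falls in [-1, 1], so it can be compared across user pairs with different numbers of shared ratings.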


In [289]:
# top 2 recommendations for user1 by using pearson_similarity
recommend(1, 2, pearson_similarity, 2)

############ RECOMMENDATION FOR USER 1 ############
Using <function pearson_similarity at 0x12cc2b200> for calculating similarity/distance with 2 clusters
----Recommendation 1----
101 | The Little Things | 
Predicted Rating: 4.993737
-------------------------
----Recommendation 2----
102 | The White Tiger | 
Predicted Rating: 4.626305
-------------------------
############ END OF RECOMMENDATION FOR USER 1 ############


[(0, 4.993736951983299), (1, 4.6263048016701465)]

# Part 2: Item-based Recommender system

## Prepare movie description data

In [292]:
# create table for movie genres
data = {'title':  ['The Little Things', 'The White Tiger','The Dig','Soul','Wonder Woman 1984','Promising Young Woman'],
        'genre': ['Crime, Drama, Thriller', 'Crime, Drama','Biography, Drama, History','Animation, Adventure, Comedy','Action, Adventure, Fantasy','Crime, Drama, Thriller']
        }

df_genres = pd.DataFrame (data, columns = ['title','genre'])
df_genres


Unnamed: 0,title,genre
0,The Little Things,"Crime, Drama, Thriller"
1,The White Tiger,"Crime, Drama"
2,The Dig,"Biography, Drama, History"
3,Soul,"Animation, Adventure, Comedy"
4,Wonder Woman 1984,"Action, Adventure, Fantasy"
5,Promising Young Woman,"Crime, Drama, Thriller"


In [293]:
# create table for movie description
data02 = {'movie':  ['The Little Things', 'The White Tiger','The Dig','Soul','Wonder Woman 1984','Promising Young Woman'],
        'description': ['Kern County Deputy Sheriff Joe Deacon is sent to Los Angeles for what should have been a quick evidence-gathering assignment. Instead, he becomes embroiled in the search for a serial killer who is terrorizing the city.', 
                        'An ambitious Indian driver uses his wit and cunning to escape from poverty and rise to the top. An epic journey based on the New York Times bestseller.',
                        'An archaeologist embarks on the historically important excavation of Sutton Hoo in 1938.',
                        'After landing the gig of a lifetime, a New York jazz pianist suddenly finds himself trapped in a strange land between Earth and the afterlife.',
                        'Diana must contend with a work colleague and businessman, whose desire for extreme wealth sends the world down a path of destruction, after an ancient artifact that grants wishes goes missing.',
                        'A young woman, traumatized by a tragic event in her past, seeks out vengeance against those who crossed her path.']
        }

df_desc = pd.DataFrame (data02, columns = ['movie','description'])
df_desc

Unnamed: 0,movie,description
0,The Little Things,Kern County Deputy Sheriff Joe Deacon is sent ...
1,The White Tiger,An ambitious Indian driver uses his wit and cu...
2,The Dig,An archaeologist embarks on the historically i...
3,Soul,"After landing the gig of a lifetime, a New Yor..."
4,Wonder Woman 1984,Diana must contend with a work colleague and b...
5,Promising Young Woman,"A young woman, traumatized by a tragic event i..."


In [297]:
# merge together
df_movie = pd.merge(left=movie, right=df_desc, how='outer')

df_movie.sort_values(by='movie')
df_movie

Unnamed: 0,movieID,movie,description
0,101,The Little Things,Kern County Deputy Sheriff Joe Deacon is sent ...
1,102,The White Tiger,An ambitious Indian driver uses his wit and cu...
2,103,The Dig,An archaeologist embarks on the historically i...
3,104,Soul,"After landing the gig of a lifetime, a New Yor..."
4,105,Wonder Woman 1984,Diana must contend with a work colleague and b...
5,106,Promising Young Woman,"A young woman, traumatized by a tragic event i..."


## Create function for item-based recommender

In [300]:
"""
print_movie_summary
Prints a header and returns the movie's row (ID, title, description)
Parameters:
    movieID: the ID of the movie
"""

def print_movie_summary(movieID):
    # the header
    print("*****Movie SUMMARY FOR Movie %d*****" %(movieID))
    return df_movie[df_movie['movieID']==movieID]

In [301]:
print_movie_summary(101)

*****Movie SUMMARY FOR Movie 101*****


Unnamed: 0,movieID,movie,description
0,101,The Little Things,Kern County Deputy Sheriff Joe Deacon is sent ...


In [291]:
movie_features = ratings.pivot(index='movieID',columns='UserId',values='rating').fillna(0)
mu_matrix = np.array(movie_features.values, dtype=int)
mu_matrix

array([[3, 0, 5, 0, 5, 3, 0],
       [4, 0, 4, 0, 5, 2, 1],
       [5, 4, 5, 0, 4, 0, 0],
       [0, 5, 5, 3, 3, 3, 0],
       [1, 5, 3, 4, 4, 4, 4],
       [3, 4, 4, 5, 2, 5, 0]])

In [371]:
"""
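recommend02
Item-based version of recommend: the rows of mu_matrix are movies and the columns are users.
Parameters:
    user: the movie we start from (the header looks it up as df_movie.loc[user-1], while the similarity code uses row mu_matrix[user])
    clusters: how many similar movies we compare to this movie
    fn: the function used for calculating similarity
    k: the number of recommendations
Returns a list of (column index, predicted rating) pairs; in the item-based matrix the column indices refer to users.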
"""
def recommend02 (user, clusters, fn, k):
    print("############ RECOMMENDATION FOR MOVIE %d ############" %(df_movie.loc[user-1]['movieID']))
    print("Using %s for calculating similarity/distance with %d clusters" %(fn,clusters))
    # we want to use our simularity functions to develop a list of ratings that our user might rate a movie as.
    # this allows us to think what our user will like the most in comparison
    distances = simularity_matrix(user,fn,clusters)
    #for each distance 
    all_unseen_movies = {}
    for dist,o in distances: 
        # each movie is an index, the number of movies is the LENGTH of the items row
        for i in range(0, len(mu_matrix[o])-1):
            predicted_rank = int(dist*mu_matrix[o][i])
            # update recommendations i to be the value of the similarity predicted rank 
            if (mu_matrix[o][i] > 0 and mu_matrix[user][i] <= 0): #if they ranked it but we didn't
                # if we are already planning to recommend it, add more weight to the prediction and increase similarity!
                all_unseen_movies[i] = (dist, [predicted_rank]) if i not in all_unseen_movies else (all_unseen_movies[i][0] + dist, all_unseen_movies[i][1] + [predicted_rank])   
    # update all the values
    for m in all_unseen_movies: 
        all_unseen_movies[m] = sum(all_unseen_movies[m][1])/all_unseen_movies[m][0] if all_unseen_movies[m][0] != 0 else 0 
    # at this point we have a dictionary of all candidates with the format:
    # { column_index : predicted_rating }  (columns are users in the item-based matrix)
    # pick the top k
    recommend_movies = filter_results(all_unseen_movies,k)
    # recommend_movies is a list formatted like so: [ (column_index, predicted rating) ]
    # print it in a readable way:
    print_recommendations(recommend_movies)
    return recommend_movies
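For comparison, a minimal sketch of a conventional item-item prediction: the rating of one user for one movie is estimated from that user's own ratings on the most similar movies. It swaps in plain cosine similarity on co-rated entries rather than the notebook's similarity functions, and assumes 0 means "not rated"; `cosine_on_co_rated`, `predict_item_based` and `n_neighbors` are illustrative names.

```python
import numpy as np

# Movie x user matrix copied from this notebook; 0 marks "not rated".
item_user = np.array([
    [3, 0, 5, 0, 5, 3, 0],
    [4, 0, 4, 0, 5, 2, 1],
    [5, 4, 5, 0, 4, 0, 0],
    [0, 5, 5, 3, 3, 3, 0],
    [1, 5, 3, 4, 4, 4, 4],
    [3, 4, 4, 5, 2, 5, 0],
], dtype=float)

def cosine_on_co_rated(a, b):
    """Cosine similarity over the users who rated both movies."""
    both = (a > 0) & (b > 0)
    if not both.any():
        return 0.0
    return float(a[both] @ b[both] / (np.linalg.norm(a[both]) * np.linalg.norm(b[both])))

def predict_item_based(matrix, item, user, n_neighbors=2):
    """Similarity-weighted mean of the user's ratings on the most similar movies."""
    candidates = [
        (cosine_on_co_rated(matrix[item], matrix[other]), matrix[other, user])
        for other in range(matrix.shape[0])
        if other != item and matrix[other, user] > 0  # only movies this user rated
    ]
    neighbors = sorted(candidates, reverse=True)[:n_neighbors]
    weight_sum = sum(similarity for similarity, _ in neighbors)
    if weight_sum == 0:
        return 0.0
    return sum(similarity * rating for similarity, rating in neighbors) / weight_sum

# Row 0 is movie 101; column 1 is UserID 2, who has not rated it.
predicted = predict_item_based(item_user, item=0, user=1)
print(round(predicted, 2))
```

Because the weights are non-negative, the prediction stays within the range of the ratings it averages.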

## 3. Item-based Recommender using Jaccard similarity


In [337]:
recommend02(1, 1, jaccard_similarity, 2)

############ RECOMMENDATION FOR MOVIE 101 ############
Using <function jaccard_similarity at 0x12e845f80> for calculating similarity/distance with 1 clusters
----Recommendation 1----
102 | The White Tiger | 
Predicted Rating: 0.000000
-------------------------
----Recommendation 2----
104 | Soul | 
Predicted Rating: 0.000000
-------------------------


[(1, 0.0), (3, 0.0)]

## 4. Item-based Recommender using Pearson similarity


In [340]:
recommend02(1, 1, pearson_similarity, 2)

############ RECOMMENDATION FOR MOVIE 101 ############
Using <function pearson_similarity at 0x12cc2b200> for calculating similarity/distance with 1 clusters
----Recommendation 1----
102 | The White Tiger | 
Predicted Rating: 4.000000
-------------------------


[(1, 4.0)]

# Part 3: User-based Recommender - group ratings into 0/1


In this part the ratings are regrouped into two values: 0 for ratings 1, 2, or 3, and 1 for ratings 4 and 5.

## Prepare data - group ratings

In [356]:
ratings.head()

Unnamed: 0,UserId,movieID,rating
0,1,101,3
1,2,101,0
2,3,101,5
3,4,101,0
4,5,101,5


In [357]:
ratings02 = ratings.copy()


In [358]:
# group ratings: 0 for ratings of 3 or less (this includes the 0 used for unrated movies), 1 for ratings 4 and 5
ratings02.loc[ratings02['rating'] <= 3, 'rating'] = 0
ratings02.loc[ratings02['rating'] > 3, 'rating'] = 1
ratings02.head()

Unnamed: 0,UserId,movieID,rating
0,1,101,0
1,2,101,0
2,3,101,1
3,4,101,0
4,5,101,1
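In the grouping above, unrated movies (stored as 0) and low ratings (1 to 3) end up with the same value. A minimal sketch of one way to keep them apart, assuming that a rating of 0 always means "not rated"; the `liked` column and the small example table are illustrative:

```python
import numpy as np
import pandas as pd

# First five rows of the ratings table shown above.
ratings_example = pd.DataFrame({
    "UserId": [1, 2, 3, 4, 5],
    "movieID": [101, 101, 101, 101, 101],
    "rating": [3, 0, 5, 0, 5],
})

binary = ratings_example.copy()
# 1 for ratings 4 and 5, 0 for ratings 1 to 3
binary["liked"] = np.where(binary["rating"] >= 4, 1.0, 0.0)
# keep "not rated" distinct from "rated low"
binary.loc[binary["rating"] == 0, "liked"] = np.nan
print(binary)
```

A similarity function could then skip the missing entries instead of treating them as dislikes.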


## 5. User-based Recommender using Jaccard similarity


In [361]:
movie_features = ratings02.pivot(index='UserId',columns='movieID',values='rating').fillna(0)
mu_matrix = np.array(movie_features.values, dtype=int)


In [369]:
# top 2 recommendations for user1 by using jaccard_similarity
recommend(1, 2, jaccard_similarity, 2)

############ RECOMMENDATION FOR USER 1 ############
Using <function jaccard_similarity at 0x12ce4be60> for calculating similarity/distance with 2 clusters
----Recommendation 1----
102 | The White Tiger | 
Predicted Rating: 0.000000
-------------------------
----Recommendation 2----
101 | The Little Things | 
Predicted Rating: 0.000000
-------------------------
############ END OF RECOMMENDATION FOR USER 1 ############


[(1, 0.0), (0, 0.0)]

## 6. User-based Recommender using Pearson similarity



In [370]:
# top 2 recommendations for user1 by using pearson_similarity
recommend(1, 2, pearson_similarity, 2)

############ RECOMMENDATION FOR USER 1 ############
Using <function pearson_similarity at 0x12cc2b200> for calculating similarity/distance with 2 clusters
----Recommendation 1----
101 | The Little Things | 
Predicted Rating: 0.000000
-------------------------
----Recommendation 2----
102 | The White Tiger | 
Predicted Rating: 0.000000
-------------------------
############ END OF RECOMMENDATION FOR USER 1 ############


[(0, 0), (1, 0)]

# Part 4: Item-based Recommender - group ratings into 0/1

## 7. Item-based Recommender using Jaccard similarity


In [372]:
movie_features = ratings02.pivot(index='movieID',columns='UserId',values='rating').fillna(0)
mu_matrix = np.array(movie_features.values, dtype=int)

In [373]:
recommend02(1, 1, jaccard_similarity, 2)

############ RECOMMENDATION FOR MOVIE 101 ############
Using <function jaccard_similarity at 0x12ce4be60> for calculating similarity/distance with 1 clusters
----Recommendation 1----
102 | The White Tiger | 
Predicted Rating: 0.000000
-------------------------
----Recommendation 2----
104 | Soul | 
Predicted Rating: 0.000000
-------------------------


[(1, 0.0), (3, 0.0)]

## 8. Item-based Recommender using Pearson similarity

In [374]:
recommend02(1, 1, pearson_similarity, 2)

############ RECOMMENDATION FOR MOVIE 101 ############
Using <function pearson_similarity at 0x12cc2b200> for calculating similarity/distance with 1 clusters
----Recommendation 1----
102 | The White Tiger | 
Predicted Rating: 0.000000
-------------------------
----Recommendation 2----
104 | Soul | 
Predicted Rating: 0.000000
-------------------------


[(1, 0), (3, 0)]