# **Song Recommender System - Individual Project - Data 417**

## DATA
There are two data files: **triplet_file** and **metadata_file**. <br>
triplet_file - user_id, song_id, listen_count <br>
metadata_file - song_id, title, release, year, artist_name

## **Importing Required Modules**

In [66]:
import pandas as pd
import numpy as np

## **Loading the Data Sets**

In [67]:
song_df_1 = pd.read_csv('triplets_file.csv')
song_df_1.head()

Unnamed: 0,user_id,song_id,listen_count
0,b80344d063b5ccb3212f76538f3d9e43d87dca9e,SOAKIMP12A8C130995,1
1,b80344d063b5ccb3212f76538f3d9e43d87dca9e,SOBBMDR12A8C13253B,2
2,b80344d063b5ccb3212f76538f3d9e43d87dca9e,SOBXHDL12A81C204C0,1
3,b80344d063b5ccb3212f76538f3d9e43d87dca9e,SOBYHAJ12A6701BF1D,1
4,b80344d063b5ccb3212f76538f3d9e43d87dca9e,SODACBL12A8C13C273,1


In [68]:
song_df_2 = pd.read_csv('song_data.csv')
song_df_2.head()

Unnamed: 0,song_id,title,release,artist_name,year
0,SOQMMHC12AB0180CB8,Silent Night,Monster Ballads X-Mas,Faster Pussy cat,2003
1,SOVFVAK12A8C1350D9,Tanssi vaan,Karkuteillä,Karkkiautomaatti,1995
2,SOGTUKN12AB017F4F1,No One Could Ever,Butter,Hudson Mohawke,2006
3,SOBNYVR12A8C13558C,Si Vos Querés,De Culo,Yerba Brava,2003
4,SOHSBXH12A8C13B0DF,Tangle Of Aspens,Rene Ablaze Presents Winter Sessions,Der Mystic,0


## **Combining Both Datasets into One**

In [69]:
song_df = pd.merge(song_df_1, song_df_2.drop_duplicates(['song_id']), on='song_id', how='left')
song_df.head()

Unnamed: 0,user_id,song_id,listen_count,title,release,artist_name,year
0,b80344d063b5ccb3212f76538f3d9e43d87dca9e,SOAKIMP12A8C130995,1,The Cove,Thicker Than Water,Jack Johnson,0
1,b80344d063b5ccb3212f76538f3d9e43d87dca9e,SOBBMDR12A8C13253B,2,Entre Dos Aguas,Flamenco Para Niños,Paco De Lucia,1976
2,b80344d063b5ccb3212f76538f3d9e43d87dca9e,SOBXHDL12A81C204C0,1,Stronger,Graduation,Kanye West,2007
3,b80344d063b5ccb3212f76538f3d9e43d87dca9e,SOBYHAJ12A6701BF1D,1,Constellations,In Between Dreams,Jack Johnson,2005
4,b80344d063b5ccb3212f76538f3d9e43d87dca9e,SODACBL12A8C13C273,1,Learn To Fly,There Is Nothing Left To Lose,Foo Fighters,1999
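
The merge above deduplicates the metadata on `song_id` before joining. A minimal sketch of why that matters, using made-up rows (the tiny frames `plays` and `meta` and their values are hypothetical, not taken from the data set): a duplicated `song_id` in the metadata would otherwise duplicate listening rows.

```python
import pandas as pd

# Hypothetical toy data: two listening records and metadata with a duplicated song_id.
plays = pd.DataFrame({
    "user_id": ["u1", "u2"],
    "song_id": ["S1", "S2"],
    "listen_count": [1, 3],
})
meta = pd.DataFrame({
    "song_id": ["S1", "S1", "S2"],
    "title": ["Song A", "Song A (dup)", "Song B"],
})

# Without deduplication the S1 record is matched twice.
naive = pd.merge(plays, meta, on="song_id", how="left")

# Keeping the first metadata row per song_id preserves one row per record.
clean = pd.merge(plays, meta.drop_duplicates(["song_id"]), on="song_id", how="left")

print(len(plays), len(naive), len(clean))
```

With `how='left'`, every listening record is kept even if its song has no metadata; those rows simply get missing values in the metadata columns.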


## **Data Preprocessing**

### Creating another column that combines the song title and the artist name. The column name is "song"

In [70]:
song_df['song'] = song_df['title']+' - '+song_df['artist_name']
song_df.head()

Unnamed: 0,user_id,song_id,listen_count,title,release,artist_name,year,song
0,b80344d063b5ccb3212f76538f3d9e43d87dca9e,SOAKIMP12A8C130995,1,The Cove,Thicker Than Water,Jack Johnson,0,The Cove - Jack Johnson
1,b80344d063b5ccb3212f76538f3d9e43d87dca9e,SOBBMDR12A8C13253B,2,Entre Dos Aguas,Flamenco Para Niños,Paco De Lucia,1976,Entre Dos Aguas - Paco De Lucia
2,b80344d063b5ccb3212f76538f3d9e43d87dca9e,SOBXHDL12A81C204C0,1,Stronger,Graduation,Kanye West,2007,Stronger - Kanye West
3,b80344d063b5ccb3212f76538f3d9e43d87dca9e,SOBYHAJ12A6701BF1D,1,Constellations,In Between Dreams,Jack Johnson,2005,Constellations - Jack Johnson
4,b80344d063b5ccb3212f76538f3d9e43d87dca9e,SODACBL12A8C13C273,1,Learn To Fly,There Is Nothing Left To Lose,Foo Fighters,1999,Learn To Fly - Foo Fighters
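
One thing to keep in mind with this string concatenation: if a row had a missing `title` or `artist_name` (which a left merge can produce when a `song_id` has no metadata), the combined value would be missing as well. A small sketch on hypothetical rows; the `fillna` placeholder is an assumption for illustration, not something the notebook does:

```python
import pandas as pd

# Hypothetical rows: the second song has no metadata after the left merge.
toy = pd.DataFrame({
    "title": ["Stronger", None],
    "artist_name": ["Kanye West", None],
})

# Missing values propagate through string concatenation.
combined = toy["title"] + " - " + toy["artist_name"]
print(combined.isna().tolist())

# One possible fallback (an assumption): substitute a placeholder label.
labelled = combined.fillna("Unknown song")
print(labelled.tolist())
```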


### Create a subset of data in order to get quick results

In [71]:
# Copy the slice so that later column assignments modify an independent DataFrame
song_df = song_df.head(10000).copy()

### Giving Each User a Number - ( For easy identification )

In [72]:
song_df['user_id'], _ = pd.factorize(song_df['user_id'])
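
As a quick illustration of what `pd.factorize` returns, here is a minimal sketch on hypothetical, shortened IDs. Each distinct value gets an integer code in order of first appearance, and the second return value holds the original values so the codes can be mapped back.

```python
import pandas as pd

# Hypothetical shortened user IDs (the real ones are 40-character hashes).
ids = pd.Series(["b803", "b803", "85c1", "b803", "bd4c"])

codes, uniques = pd.factorize(ids)
print(codes.tolist())   # integer code per row
print(list(uniques))    # original ID for each code, indexed by code
```

The notebook discards the second value (`_`), so the original hashes are no longer recoverable from `song_df` after this step.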

# **(1)** 

# **Existing Popularity Recommendations**

### Defining the recommender class

In [73]:
class popularity_recommender_py():
    def __init__(self):
        self.train_data = None
        self.user_id = None
        self.item_id = None
        self.popularity_recommendations = None
        
    #Create the model by counting the listening records of each song (rows, not the sum of listen_count)
    def create(self, train_data, user_id, item_id):
        self.train_data = train_data
        self.user_id = user_id
        self.item_id = item_id


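        #Get a count of user_ids for each unique item as the recommendation score
        train_data_grouped = train_data.groupby([self.item_id]).agg({self.user_id: 'count'}).reset_index()
        #Rename using self.user_id so the class also works when the user column has another name
        train_data_grouped.rename(columns = {self.user_id: 'score'}, inplace=True)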
    
        #Sorting the songs according to the score of each song
        train_data_sort = train_data_grouped.sort_values(['score', self.item_id], ascending = [False, True])
    
        #Generate a recommendation rank based upon score
        train_data_sort['Rank'] = train_data_sort['score'].rank(ascending=False, method='first')
        
        #Get the top 10 recommendations
        self.popularity_recommendations = train_data_sort.head(10)

    #Use the popularity based recommender system model to make recommendations
    def recommend(self, user_id):    
        #Work on a copy so the stored recommendations are not modified in place
        user_recommendations = self.popularity_recommendations.copy()
        
        #Add user_id column for which the recommendations are being generated
        user_recommendations['user_id'] = user_id
    
        #Bring user_id column to the front
        cols = user_recommendations.columns.tolist()
        cols = cols[-1:] + cols[:-1]
        user_recommendations = user_recommendations[cols]
        
        return user_recommendations

In [74]:
pr = popularity_recommender_py()
pr.create(song_df, 'user_id', 'song')

pr.recommend(song_df['user_id'][50])

Unnamed: 0,user_id,song,score,Rank
3660,2,Sehr kosmisch - Harmonia,45,1.0
4678,2,Undo - Björk,32,2.0
5105,2,You're The One - Dwight Yoakam,32,3.0
1071,2,Dog Days Are Over (Radio Edit) - Florence + Th...,28,4.0
3655,2,Secrets - OneRepublic,28,5.0
4378,2,The Scientist - Coldplay,27,6.0
4712,2,Use Somebody - Kings Of Leon,27,7.0
3476,2,Revelry - Kings Of Leon,26,8.0
1387,2,Fireflies - Charttraxx Karaoke,24,9.0
1862,2,Horn Concerto No. 4 in E flat K495: II. Romanc...,23,10.0


In [75]:
pr = popularity_recommender_py()
pr.create(song_df, 'user_id', 'song')

pr.recommend(song_df['user_id'][1000])

Unnamed: 0,user_id,song,score,Rank
3660,19,Sehr kosmisch - Harmonia,45,1.0
4678,19,Undo - Björk,32,2.0
5105,19,You're The One - Dwight Yoakam,32,3.0
1071,19,Dog Days Are Over (Radio Edit) - Florence + Th...,28,4.0
3655,19,Secrets - OneRepublic,28,5.0
4378,19,The Scientist - Coldplay,27,6.0
4712,19,Use Somebody - Kings Of Leon,27,7.0
3476,19,Revelry - Kings Of Leon,26,8.0
1387,19,Fireflies - Charttraxx Karaoke,24,9.0
1862,19,Horn Concerto No. 4 in E flat K495: II. Romanc...,23,10.0
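
Because the score depends only on how many rows each song has in the data, the two tables above are identical apart from the `user_id` column: a popularity model is not personalised. A minimal sketch of the same counting logic on hypothetical toy rows (the helper name `top_n_popular` is an assumption, not part of the notebook):

```python
import pandas as pd

def top_n_popular(df, user_col, item_col, n=10):
    """Rank items by the number of rows (listening records) they appear in."""
    scores = (
        df.groupby(item_col)[user_col]
        .count()
        .rename("score")
        .reset_index()
        # Highest score first; ties broken alphabetically by item name.
        .sort_values(["score", item_col], ascending=[False, True])
    )
    scores["Rank"] = scores["score"].rank(ascending=False, method="first")
    return scores.head(n).reset_index(drop=True)

# Hypothetical data: song A has three records, B two, C one.
toy = pd.DataFrame({
    "user_id": [0, 1, 2, 0, 1, 2],
    "song": ["A", "A", "A", "B", "B", "C"],
})
top = top_n_popular(toy, "user_id", "song", n=2)
print(top)
```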


# **Modified Popularity Recommendations (Unpopular Recommendations)**

In [76]:
class unpopularity_recommender_py():
    
    def __init__(self):
        self.train_data = None
        self.user_id = None
        self.item_id = None
        self.popularity_recommendations = None
        
    def create(self, train_data, user_id, item_id):
        self.train_data = train_data
        self.user_id = user_id
        self.item_id = item_id

        # Get a count of user_ids for each unique item as recommendation score
        train_data_grouped = train_data.groupby([self.item_id]).agg({self.user_id: 'count'}).reset_index()
        train_data_grouped.rename(columns = {self.user_id: 'score'}, inplace=True)

        # Sort the items based upon recommendation score in ascending order
        train_data_sort = train_data_grouped.sort_values(['score', self.item_id], ascending = [True, True]) # The modification is done here by changing the order

        # Generate a recommendation rank based upon score
        train_data_sort['Rank'] = train_data_sort['score'].rank(ascending=True, method='first')

        # Get the top 10 recommendations (Less Popular recommendations)
        self.popularity_recommendations = train_data_sort.head(10)
            
    def recommend(self, user_id):    
        # Work on a copy so the stored recommendations are not modified in place
        user_recommendations = self.popularity_recommendations.copy()

        # Add user_id column for which the recommendations are being generated
        user_recommendations['user_id'] = user_id

        # Bring user_id column to the front
        cols = user_recommendations.columns.tolist()
        cols = cols[-1:] + cols[:-1]
        user_recommendations = user_recommendations[cols]

        return user_recommendations

In [77]:
upr = unpopularity_recommender_py()
upr.create(song_df, 'user_id', 'song')

# Display the 10 least popular songs
upr.recommend(song_df['user_id'][50])

Unnamed: 0,user_id,song,score,Rank
0,2,#40 - DAVE MATTHEWS BAND,1,1.0
5,2,(Anaesthesia) Pulling Teath - Metallica,1,2.0
6,2,(I Cant Get No) Satisfaction - Cat Power,1,3.0
7,2,(I Can't Get Me No) Satisfaction - Devo,1,4.0
13,2,(iii) - The Gerbils,1,5.0
15,2,1.36 - Coldplay,1,6.0
16,2,100° - Shout Out Louds,1,7.0
18,2,14 Years - Guns N' Roses,1,8.0
20,2,15 Step - Radiohead,1,9.0
22,2,19-2000 - Gorillaz,1,10.0


In [78]:
# Display the 10 least popular songs
upr.recommend(song_df['user_id'][1000])

Unnamed: 0,user_id,song,score,Rank
0,19,#40 - DAVE MATTHEWS BAND,1,1.0
5,19,(Anaesthesia) Pulling Teath - Metallica,1,2.0
6,19,(I Cant Get No) Satisfaction - Cat Power,1,3.0
7,19,(I Can't Get Me No) Satisfaction - Devo,1,4.0
13,19,(iii) - The Gerbils,1,5.0
15,19,1.36 - Coldplay,1,6.0
16,19,100° - Shout Out Louds,1,7.0
18,19,14 Years - Guns N' Roses,1,8.0
20,19,15 Step - Radiohead,1,9.0
22,19,19-2000 - Gorillaz,1,10.0
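
Every song in these tables has a score of 1, so the order is decided entirely by the alphabetical tie-break on the song name, which is why titles starting with `#`, `(` and digits come first. A small sketch with hypothetical rows showing the same effect:

```python
import pandas as pd

# Hypothetical data: "Hit" has two records, every other song has one.
toy = pd.DataFrame({
    "user_id": [0, 1, 2, 3, 4],
    "song": ["Zebra", "Hit", "Hit", "#40", "(iii)"],
})

counts = toy.groupby("song")["user_id"].count().rename("score").reset_index()

# Ascending on both keys: least-played first, ties in alphabetical order.
least = counts.sort_values(["score", "song"], ascending=[True, True])
print(least["song"].tolist())
```

With many songs tied at the minimum score, the selection among them is arbitrary with respect to listening behaviour; a random sample from the tied songs would be an equally valid alternative.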


# **(2)**

# **Existing Collaborative Recommendations**

In [79]:
class item_similarity_recommender_py():
    def __init__(self):
        self.train_data = None
        self.user_id = None
        self.item_id = None
        self.cooccurence_matrix = None
        self.songs_dict = None
        self.rev_songs_dict = None
        self.item_similarity_recommendations = None
        
    #Get unique items (songs) corresponding to a given user
    def get_user_items(self, user):
        user_data = self.train_data[self.train_data[self.user_id] == user]
        user_items = list(user_data[self.item_id].unique())
        
        return user_items
        
    #Get unique users for a given item (song)
    def get_item_users(self, item):
        item_data = self.train_data[self.train_data[self.item_id] == item]
        item_users = set(item_data[self.user_id].unique())
            
        return item_users
        
    #Get unique items (songs) in the training data
    def get_all_items_train_data(self):
        all_items = list(self.train_data[self.item_id].unique())
            
        return all_items
        
    #Construct cooccurence matrix
    def construct_cooccurence_matrix(self, user_songs, all_songs):
            
        #Get users for all songs in user_songs.
        user_songs_users = []        
        for i in range(0, len(user_songs)):
            user_songs_users.append(self.get_item_users(user_songs[i]))
            
        #Initialize the item cooccurence matrix of size len(user_songs) X len(songs)
        cooccurence_matrix = np.matrix(np.zeros(shape=(len(user_songs), len(all_songs))), float)
           
        #Calculate similarity between user songs and all unique songs in the training data
        for i in range(0,len(all_songs)):
            
            #Calculate unique listeners (users) of song (item) i
            songs_i_data = self.train_data[self.train_data[self.item_id] == all_songs[i]]
            users_i = set(songs_i_data[self.user_id].unique())
            
            for j in range(0,len(user_songs)):       
                    
                #Get unique listeners (users) of song (item) j
                users_j = user_songs_users[j]
                    
                #Calculate intersection of listeners of songs i and j
                users_intersection = users_i.intersection(users_j)
                
                #Calculate cooccurence_matrix[j,i] as Jaccard Index
                if len(users_intersection) != 0:
                    #Calculate union of listeners of songs i and j
                    users_union = users_i.union(users_j)
                    
                    cooccurence_matrix[j,i] = float(len(users_intersection))/float(len(users_union))
                else:
                    cooccurence_matrix[j,i] = 0
                    
        
        return cooccurence_matrix

    
    #Use the cooccurence matrix to make top recommendations
    def generate_top_recommendations(self, user, cooccurence_matrix, all_songs, user_songs):
        print("Non zero values in cooccurence_matrix :%d" % np.count_nonzero(cooccurence_matrix))
        
        #Calculate the average of the scores in the cooccurence matrix over all user songs.
        user_sim_scores = cooccurence_matrix.sum(axis=0)/float(cooccurence_matrix.shape[0])
        user_sim_scores = np.array(user_sim_scores)[0].tolist()
 
        #Sort the indices of user_sim_scores based upon their value
        #Also maintain the corresponding score
        sort_index = sorted(((e,i) for i,e in enumerate(list(user_sim_scores))), reverse=True)
    
        #Create a dataframe from the following
        columns = ['user_id', 'song', 'score', 'rank']
        #index = np.arange(1) # array of numbers for the number of samples
        df = pd.DataFrame(columns=columns)
         
        #Fill the dataframe with top 10 item based recommendations
        rank = 1 
        for i in range(0,len(sort_index)):
            if not np.isnan(sort_index[i][0]) and all_songs[sort_index[i][1]] not in user_songs and rank <= 10:
                df.loc[len(df)]=[user,all_songs[sort_index[i][1]],sort_index[i][0],rank]
                rank = rank+1
        
        #Handle the case where there are no recommendations
        if df.shape[0] == 0:
            print("The current user has no songs for training the item similarity based recommendation model.")
            return -1
        else:
            return df
 
    #Create the item similarity based recommender system model
    def create(self, train_data, user_id, item_id):
        self.train_data = train_data
        self.user_id = user_id
        self.item_id = item_id

    #Use the item similarity based recommender system model to make recommendations
    def recommend(self, user):
        
        
        #A. Get all unique songs for this user
        user_songs = self.get_user_items(user)    
            
        print("No. of unique songs for the user: %d" % len(user_songs))
        
        #B. Get all unique items (songs) in the training data
        all_songs = self.get_all_items_train_data()
        
        print("no. of unique songs in the training set: %d" % len(all_songs))
         
        #C. Construct item cooccurence matrix of size  len(user_songs) X len(songs)
        cooccurence_matrix = self.construct_cooccurence_matrix(user_songs, all_songs)
        
        #D. Use the cooccurence matrix to make recommendations
        df_recommendations = self.generate_top_recommendations(user, cooccurence_matrix, all_songs, user_songs)
                
        return df_recommendations


In [80]:
ir = item_similarity_recommender_py()
ir.create(song_df, 'user_id', 'song')

In [81]:
user_items = ir.get_user_items(song_df['user_id'][1000])
# display user songs history
for user_item in user_items:
    print(user_item)

Uprising - Muse
No One Else - Weezer
Runaway - Yeah Yeah Yeahs
Losing Touch - The Killers
Don't Haunt This Place - The Rural Alberta Advantage
Dog Days Are Over (Radio Edit) - Florence + The Machine
At The Bottom Of Everything - Bright Eyes
Lucky (Album Version) - Jason Mraz & Colbie Caillat
Island In The Sun - Weezer
They Might Follow You - Tiny Vipers
Innocent Son - Fleet Foxes
Bleed It Out [Live At Milton Keynes] - Linkin Park
Yawns - Frightened Rabbit
El Scorcho - Weezer
Clocks - Coldplay
Whataya Want From Me - Adam Lambert
Somebody To Love - Justin Bieber
Waking Up In Vegas (Calvin Harris Remix Edit) - Katy Perry
Mia - Emmy The Great
My Name Is Jonas - Weezer
Radar Detector - Darwin Deez
Rehab - Rihanna
Teenager - Camera Obscura
Not Big - Lily Allen
Give It To Me - Timbaland / Justin Timberlake / Nelly Furtado
Falling Through Your Clothes - The New Pornographers
Trouble (Album Version) - Ray LaMontagne
Soft Shock - Yeah Yeah Yeahs
Old Soul Song - Bright Eyes
These Old Shoes - Deer

In [82]:
# give song recommendation for that user
ir.recommend(song_df['user_id'][1000])

No. of unique songs for the user: 118
no. of unique songs in the training set: 5151
Non zero values in cooccurence_matrix :41544


Unnamed: 0,user_id,song,score,rank
0,19,Just Dance - Lady GaGa / Colby O'Donis,0.055901,1
1,19,Vanilla Twilight - Owl City,0.054669,2
2,19,Pursuit Of Happiness (nightmare) - Kid Cudi / ...,0.053688,3
3,19,Here Without You - 3 Doors Down,0.053491,4
4,19,The Funeral (Album Version) - Band Of Horses,0.052492,5
5,19,The Only Exception (Album Version) - Paramore,0.051594,6
6,19,Halo - Beyoncé,0.050433,7
7,19,Breakeven - The Script,0.049297,8
8,19,Bulletproof - La Roux,0.048103,9
9,19,Bad Company - Five Finger Death Punch,0.047367,10


In [83]:
# give song recommendation for a user
ir.recommend(song_df['user_id'][50])

No. of unique songs for the user: 8
no. of unique songs in the training set: 5151
Non zero values in cooccurence_matrix :524


Unnamed: 0,user_id,song,score,rank
0,2,Isolation - Joy Division,0.104167,1
1,2,Rome - Phoenix,0.104167,2
2,2,Shadowplay - Joy Division,0.104167,3
3,2,Dead Souls [Re-mastered] - Joy Division,0.104167,4
4,2,Little L - Jamiroquai,0.104167,5
5,2,Digital - Joy Division,0.104167,6
6,2,Space Cowboy - Jamiroquai,0.104167,7
7,2,Amor de Loca Juventud - Buena Vista Social Club,0.104167,8
8,2,Fences - Phoenix,0.09375,9
9,2,Elevator - The Black Keys,0.083333,10
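
The `score` in these tables is the mean Jaccard index between a candidate song and each song in the user's history, where the Jaccard index of two songs is the number of shared listeners divided by the number of listeners of either song. A minimal worked sketch with hypothetical listener sets (the names `jaccard` and `listeners` are illustrative only):

```python
def jaccard(a, b):
    """Shared listeners divided by all distinct listeners; 0.0 if both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical listener sets keyed by song.
listeners = {
    "history_1": {1, 2, 3},
    "history_2": {3, 4},
    "candidate": {2, 3, 5},
}

# Score the candidate against every song in the user's history, then average.
history = ["history_1", "history_2"]
scores = [jaccard(listeners[h], listeners["candidate"]) for h in history]
mean_score = sum(scores) / len(scores)
print(scores, mean_score)
```

Here the candidate shares 2 of 4 distinct listeners with the first history song and 1 of 4 with the second, giving a mean of 0.375.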


# **Modified Collaborative Recommendations**

In [84]:
class item_dissimilarity_recommender_py:
    def __init__(self):
        self.train_data = None
        self.user_id = None
        self.item_id = None
        self.cooccurence_matrix = None
        self.songs_dict = None
        self.rev_songs_dict = None
        self.item_similarity_recommendations = None
        
    def get_user_items(self, user):
        user_data = self.train_data[self.train_data[self.user_id] == user]
        user_items = list(user_data[self.item_id].unique())
        return user_items
        
    def get_item_users(self, item):
        item_data = self.train_data[self.train_data[self.item_id] == item]
        item_users = set(item_data[self.user_id].unique())
        return item_users
        
    def get_all_items_train_data(self):
        all_items = list(self.train_data[self.item_id].unique())
        return all_items
        
    def construct_cooccurence_matrix(self, user_songs, all_songs):
        user_songs_users = []
        for i in range(0, len(user_songs)):
            user_songs_users.append(self.get_item_users(user_songs[i]))
            
        cooccurence_matrix = np.matrix(np.zeros(shape=(len(user_songs), len(all_songs))), float)
        
        for i in range(len(all_songs)):
            songs_i_data = self.train_data[self.train_data[self.item_id] == all_songs[i]]
            users_i = set(songs_i_data[self.user_id].unique())

            for j in range(len(user_songs)):
                users_j = user_songs_users[j]
                users_intersection = users_i.intersection(users_j)

                if len(users_intersection) != 0:
                    users_union = users_i.union(users_j)
                    similarity_score = float(len(users_intersection)) / float(len(users_union))
                    cooccurence_matrix[j, i] = 1 - similarity_score  # Dissimilarity
                else:
                    cooccurence_matrix[j, i] = 1  # Maximally dissimilar if no intersection

        return cooccurence_matrix

 
    # Modified method to use the cooccurrence matrix to make top recommendations based on dissimilarity
    def generate_top_recommendations(self, user, cooccurence_matrix, all_songs, user_songs):
        print("Non zero values in cooccurence_matrix :%d" % np.count_nonzero(cooccurence_matrix))
        
        # Calculate the average dissimilarity of each song to all of the user's songs.
        user_dissim_scores = cooccurence_matrix.sum(axis=0) / float(cooccurence_matrix.shape[0])
        user_dissim_scores = np.array(user_dissim_scores)[0].tolist()

        # Subtract from the maximum so the most dissimilar songs get the lowest value (0.0)
        max_score = max(user_dissim_scores)
        user_dissim_scores = [max_score - x for x in user_dissim_scores]

        # Sort the indices of user_dissim_scores by value in ascending order, so the most dissimilar songs come first
        sort_index = sorted(((e, i) for i, e in enumerate(user_dissim_scores)), reverse=False)
    
        # Create a dataframe from the following
        columns = ['user_id', 'song', 'score', 'rank']
        df = pd.DataFrame(columns=columns)
         
        # Fill the dataframe with top 10 least similar item based recommendations
        rank = 1 
        for i in range(0, len(sort_index)):
            if not np.isnan(sort_index[i][0]) and all_songs[sort_index[i][1]] not in user_songs and rank <= 10:
                df.loc[len(df)] = [user, all_songs[sort_index[i][1]], sort_index[i][0], rank]
                rank = rank + 1
        
        # Handle the case where there are no recommendations
        if df.shape[0] == 0:
            print("The current user has no songs for training the item dissimilarity based recommendation model.")
            return -1
        else:
            return df


    def create(self, train_data, user_id, item_id):
        self.train_data = train_data
        self.user_id = user_id
        self.item_id = item_id

    def recommend(self, user):
        user_songs = self.get_user_items(user)
        print("No. of unique songs for the user: %d" % len(user_songs))
        
        all_songs = self.get_all_items_train_data()
        print("no. of unique songs in the training set: %d" % len(all_songs))
        
        cooccurence_matrix = self.construct_cooccurence_matrix(user_songs, all_songs)
        df_recommendations = self.generate_top_recommendations(user, cooccurence_matrix, all_songs, user_songs)
        
        return df_recommendations

In [85]:
dis = item_dissimilarity_recommender_py()
dis.create(song_df, 'user_id', 'song')
user_items_2 = dis.get_user_items(song_df['user_id'][1000])
# display user songs history
for user_item in user_items_2:
    print(user_item)

Uprising - Muse
No One Else - Weezer
Runaway - Yeah Yeah Yeahs
Losing Touch - The Killers
Don't Haunt This Place - The Rural Alberta Advantage
Dog Days Are Over (Radio Edit) - Florence + The Machine
At The Bottom Of Everything - Bright Eyes
Lucky (Album Version) - Jason Mraz & Colbie Caillat
Island In The Sun - Weezer
They Might Follow You - Tiny Vipers
Innocent Son - Fleet Foxes
Bleed It Out [Live At Milton Keynes] - Linkin Park
Yawns - Frightened Rabbit
El Scorcho - Weezer
Clocks - Coldplay
Whataya Want From Me - Adam Lambert
Somebody To Love - Justin Bieber
Waking Up In Vegas (Calvin Harris Remix Edit) - Katy Perry
Mia - Emmy The Great
My Name Is Jonas - Weezer
Radar Detector - Darwin Deez
Rehab - Rihanna
Teenager - Camera Obscura
Not Big - Lily Allen
Give It To Me - Timbaland / Justin Timberlake / Nelly Furtado
Falling Through Your Clothes - The New Pornographers
Trouble (Album Version) - Ray LaMontagne
Soft Shock - Yeah Yeah Yeahs
Old Soul Song - Bright Eyes
These Old Shoes - Deer

In [86]:
# give the dissimilarity recommendation for that user
dis.recommend(song_df['user_id'][1000])

No. of unique songs for the user: 118
no. of unique songs in the training set: 5151
Non zero values in cooccurence_matrix :606934


Unnamed: 0,user_id,song,score,rank
0,19,The Best of Times - Sage Francis,0.0,1
1,19,Sun Hands - Local Natives,0.0,2
2,19,Belle - Jack Johnson,0.0,3
3,19,Auto Rock - Mogwai,0.0,4
4,19,Who Knows Who Cares - Local Natives,0.0,5
5,19,Armistice - Phoenix,0.0,6
6,19,Tell Me Why - Supermode,0.0,7
7,19,If I Can't Have You - Mount Sims,0.0,8
8,19,Angel On My Shoulder (EDX Radio Edit) - Kaskade,0.0,9
9,19,Three Days (2006 Remastered Album Version) - J...,0.0,10


In [87]:
# give the dissimilarity recommendation for another user
dis.recommend(song_df['user_id'][10])

No. of unique songs for the user: 45
no. of unique songs in the training set: 5151
Non zero values in cooccurence_matrix :231474


Unnamed: 0,user_id,song,score,rank
0,0,The Best of Times - Sage Francis,0.0,1
1,0,Belle - Jack Johnson,0.0,2
2,0,Auto Rock - Mogwai,0.0,3
3,0,Who Knows Who Cares - Local Natives,0.0,4
4,0,Girlfriend - Phoenix,0.0,5
5,0,Armistice - Phoenix,0.0,6
6,0,Streets On Lock - Young Jeezy,0.0,7
7,0,The Way Things Go - Octopus Project,0.0,8
8,0,A Pain That Im Used To - Depeche Mode,0.0,9
9,0,Medicating - Boys Night Out,0.0,10
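
The `score` column is 0.0 in every row because the reported value is `max_score` minus the mean dissimilarity. Any song that shares no listeners with the user's history has a mean dissimilarity of 1, which is presumably also the maximum in these runs, so all such songs tie at 0.0 and are listed in the order they occur in `all_songs`. A minimal sketch with a hypothetical 2 x 3 dissimilarity matrix:

```python
import numpy as np

# Hypothetical dissimilarities (1 - Jaccard): rows are songs in the user's
# history, columns are candidate songs.
dissim = np.array([
    [1.0, 0.5, 1.0],
    [1.0, 0.75, 1.0],
])

mean_dissim = dissim.mean(axis=0)            # per-candidate average
reported = mean_dissim.max() - mean_dissim   # the value stored as "score"

# Sort (score, index) pairs in ascending order, as the class does.
order = sorted((s, i) for i, s in enumerate(reported.tolist()))
print(order)
```

Candidates 0 and 2 have no overlap with the history and tie at 0.0, ahead of candidate 1. Since most of the catalogue is tied in this way, the top 10 mainly reflects row order in the data rather than a graded measure of dissimilarity.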
