# CONTENT BASED MUSIC RECOMMENDATION

The same process that was used for the movie recommendations has been used here.
The dataset contains the name, artist, and lyrics of 57,650 English-language songs. The data was scraped from LyricsFreak.



In [1]:
import numpy as np
import pandas as pd



In [2]:
from typing import List, Dict

In [3]:
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

In [7]:
songs = pd.read_csv('songdata.csv')

In [8]:
songs.head()

Unnamed: 0,artist,song,link,text
0,ABBA,Ahe's My Kind Of Girl,/a/abba/ahes+my+kind+of+girl_20598417.html,"Look at her face, it's a wonderful face \nAnd..."
1,ABBA,"Andante, Andante",/a/abba/andante+andante_20002708.html,"Take it easy with me, please \nTouch me gentl..."
2,ABBA,As Good As New,/a/abba/as+good+as+new_20003033.html,I'll never know why I had to go \nWhy I had t...
3,ABBA,Bang,/a/abba/bang_20598415.html,Making somebody happy is a question of give an...
4,ABBA,Bang-A-Boomerang,/a/abba/bang+a+boomerang_20002668.html,Making somebody happy is a question of give an...


In [9]:
songs = songs.sample(n=5000).drop('link', axis=1).reset_index(drop=True)

In [10]:
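# Strip literal newline characters; regex=False is explicit so the result does not depend on the pandas version's default.
songs['text'] = songs['text'].str.replace('\n', '', regex=False)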



We use a TF-IDF vectorizer to turn each song's lyrics into a weighted term vector, and cosine similarity as the metric for comparing songs and producing recommendations.
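To make the two steps concrete, here is a minimal sketch on three made-up lyric snippets (the `toy_*` names and the texts are assumptions for illustration, not part of the dataset). It uses the same `TfidfVectorizer` settings as this notebook and shows that songs sharing more vocabulary get a higher cosine similarity.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical toy lyrics, not taken from the dataset.
toy_lyrics = [
    "love me tender love me sweet",
    "tender love on a sweet summer night",
    "highway driving all night long",
]

toy_tfidf = TfidfVectorizer(analyzer='word', stop_words='english')
toy_matrix = toy_tfidf.fit_transform(toy_lyrics)   # sparse matrix, one row per song
toy_sim = cosine_similarity(toy_matrix)            # dense array of shape (3, 3)

# Rows and columns follow the order of toy_lyrics; the diagonal is each song against itself.
print(toy_sim.round(3))
```

Songs 0 and 1 share several words, so their score is the highest off-diagonal value; songs 0 and 2 share none, so their score is zero.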

In [11]:
tfidf = TfidfVectorizer(analyzer='word', stop_words='english')

In [12]:
lyrics_matrix = tfidf.fit_transform(songs['text'])

In [13]:
cosine_similarities = cosine_similarity(lyrics_matrix) 

In [14]:
similarities = {}

In [15]:
for i in range(len(cosine_similarities)):
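    # Sort row i of cosine_similarities and take the indices of the 49 highest scores, in descending order.
    similar_indices = cosine_similarities[i].argsort()[:-50:-1] 
    # Store (score, song, artist) for each, dropping the first entry (normally the song itself), which leaves 48 songs.
    # Note: the dict is keyed by title, so songs that share a title overwrite each other.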
    
    similarities[songs['song'].iloc[i]] = [(cosine_similarities[i][x], songs['song'][x], songs['artist'][x]) for x in similar_indices][1:]
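The slicing in the loop above is easy to misread, so here is a small sketch of how `argsort` and a negative-step slice pick the top entries. The `row` values and the helper `top_k_similar` are hypothetical. Filtering the query index explicitly, rather than assuming it sorts first, is a design choice of this sketch and not the notebook's approach.

```python
import numpy as np

# Hypothetical similarity row for song 0 against six songs (index 0 is the song itself).
row = np.array([1.0, 0.10, 0.45, 0.30, 0.05, 0.60])

# A slice of the form [:-n:-1] returns n - 1 indices, highest score first.
print(row.argsort()[:-4:-1])   # three indices, including the song itself

def top_k_similar(row, self_index, k):
    """Return the indices of the k highest scores, excluding self_index."""
    order = np.argsort(row)[::-1]        # indices sorted by descending score
    order = order[order != self_index]   # drop the query song explicitly
    return order[:k]

print(top_k_similar(row, self_index=0, k=3))
```

Excluding the query by index also covers the case where another song has identical lyrics and ties at a score of 1.0.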

We define a class to produce the recommendations. Its input is a dictionary holding the song title and the number of recommendations to return.

In [16]:
class ContentBasedRecommender:
    def __init__(self, matrix):
        self.matrix_similar = matrix

    def _print_message(self, song, recom_song):
        rec_items = len(recom_song)
        
        print(f'The {rec_items} recommended songs for {song} are:')
        for i in range(rec_items):
            print(f"Number {i+1}:")
            print(f"{recom_song[i][1]} by {recom_song[i][2]} with {round(recom_song[i][0], 3)} similarity score") 
            print("--------------------")
        
    def recommend(self, recommendation):
        # Get song to find recommendations for
        song = recommendation['song']
        # Get number of songs to recommend
        number_songs = recommendation['number_songs']
        # Take the number_songs most similar songs from the similarity dictionary
        recom_song = self.matrix_similar[song][:number_songs]
        # print each item
        self._print_message(song=song, recom_song=recom_song)
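The class above prints its results and raises a bare `KeyError` for an unknown title. As a possible variation, the sketch below returns the recommendations so they can be reused, and reports a missing title with a clearer message. The function `get_recommendations` and the toy data are hypothetical; only the `(score, song, artist)` tuple layout comes from the `similarities` dictionary built earlier.

```python
from typing import Dict, List, Tuple

# (score, song title, artist), the same tuple layout as the notebook's similarities dict.
Recommendation = Tuple[float, str, str]

def get_recommendations(matrix_similar: Dict[str, List[Recommendation]],
                        song: str, number_songs: int) -> List[Recommendation]:
    """Hypothetical helper: return the top entries instead of printing them."""
    if song not in matrix_similar:
        raise KeyError(f"No similarity list stored for song {song!r}")
    return matrix_similar[song][:number_songs]

# Made-up data for illustration only.
toy_similarities = {
    "Song A": [
        (0.42, "Song B", "Artist X"),
        (0.31, "Song C", "Artist Y"),
        (0.12, "Song D", "Artist Z"),
    ],
}

top_two = get_recommendations(toy_similarities, "Song A", 2)
for score, title, artist in top_two:
    print(f"{title} by {artist} with {round(score, 3)} similarity score")
```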

In [17]:
recommedations = ContentBasedRecommender(similarities)

In [20]:
recommendation = {
    "song": songs['song'].iloc[10],
    "number_songs": 5
}

In [21]:
recommedations.recommend(recommendation)

The 5 recommended songs for Searching are:
Number 1:
Heart Of Gold by Willie Nelson with 0.363 similarity score
--------------------
Number 2:
Could It Be You? by 'n Sync with 0.33 similarity score
--------------------
Number 3:
Can't Stop Loving You by Tom Jones with 0.318 similarity score
--------------------
Number 4:
Can't Stop Thinking About You by George Harrison with 0.267 similarity score
--------------------
Number 5:
Won't Stop by Justin Bieber with 0.257 similarity score
--------------------


In [22]:
recommendation2 = {
    "song": songs['song'].iloc[120],
    "number_songs": 4 
}

In [23]:
recommedations.recommend(recommendation2)

The 4 recommended songs for Basin Street Blues are:
Number 1:
If New Orleans Is Beat by Tragically Hip with 0.26 similarity score
--------------------
Number 2:
Free by Prince with 0.221 similarity score
--------------------
Number 3:
I'll Meet You There by Owl City with 0.194 similarity score
--------------------
Number 4:
Do You Know What It Means To Miss New Orleans? by Billie Holiday with 0.183 similarity score
--------------------


In [24]:
recommendation3 = {
    "song": songs['song'].iloc[12],
    "number_songs": 10 
}

In [25]:
recommedations.recommend(recommendation3)

The 10 recommended songs for Think For Yourself are:
Number 1:
Do You Know by Fleetwood Mac with 0.257 similarity score
--------------------
Number 2:
Pick Myself Up by Peter Tosh with 0.203 similarity score
--------------------
Number 3:
Don't Think So by Quiet Riot with 0.202 similarity score
--------------------
Number 4:
Before You Make Up Your Mind by Dolly Parton with 0.192 similarity score
--------------------
Number 5:
Let Go by Ne-Yo with 0.19 similarity score
--------------------
Number 6:
Just This One Time by Cher with 0.18 similarity score
--------------------
Number 7:
A Reason To Believe by Wilson Phillips with 0.176 similarity score
--------------------
Number 8:
I Don't Know by Zucchero with 0.172 similarity score
--------------------
Number 9:
Sundays by Counting Crows with 0.172 similarity score
--------------------
Number 10:
I Want It All by Depeche Mode with 0.168 similarity score
--------------------


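Judging by these three examples, the recommendations look plausible: "Basin Street Blues", for instance, retrieves two songs with New Orleans in the title. The similarity scores are modest (roughly 0.17 to 0.36), and no quantitative evaluation was performed, so this is only an informal check.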