
**Song Recommendation Using Word2Vec Model**

This notebook uses the Word2Vec algorithm to learn song embeddings from human-curated playlists. Each song is treated as a "word" and each playlist as a "sentence." Training on playlists lets the model learn which songs tend to co-occur, so songs that often appear in the same playlists end up with similar vectors in the embedding space. Given a song, its nearest neighbors in that space serve as recommendations. The recommendations therefore reflect listening patterns aggregated across many playlists rather than the taste of any single user.
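
To make the "playlist as sentence" idea concrete, the minimal sketch below lists the (song, context song) pairs that fall inside a sliding window. It only illustrates the kind of co-occurrence signal Word2Vec trains on and is not gensim's implementation. The toy playlists, the `context_pairs` helper, and the window size are hypothetical.

```python
from collections import Counter

def context_pairs(playlist, window):
    """Yield (target, context) pairs for songs at most `window` positions apart."""
    for i, target in enumerate(playlist):
        lo = max(0, i - window)
        hi = min(len(playlist), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                yield target, playlist[j]

# Hypothetical toy playlists; each string stands in for a song ID
toy_playlists = [
    ["a", "b", "c"],
    ["b", "c", "d"],
]

pair_counts = Counter()
for playlist in toy_playlists:
    pair_counts.update(context_pairs(playlist, window=1))

# Songs "b" and "c" are adjacent in both playlists
print(pair_counts[("b", "c")])  # 2
# Songs "a" and "d" never share a playlist
print(pair_counts[("a", "d")])  # 0
```

Pairs that occur often pull the two songs' vectors together during training, which is why frequent playlist neighbors end up close in the embedding space.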


A second approach applies Word2Vec to artists. Each "sentence" is an artist's name followed by the titles of that artist's songs. Querying the model with a known artist name returns the tokens whose embeddings are closest to that name.

In [23]:
import pandas as pd
import numpy as np

In [24]:
from google.colab import files
import pandas as pd

# Step 1: Upload the playlist file (play_list_train.txt)
uploaded = files.upload()

Saving play_list_train.txt to play_list_train (1).txt


In [25]:
from google.colab import files
import pandas as pd

# Step 1 (continued): Upload the song metadata file (playlist _song_hash.txt)
uploaded = files.upload()

Saving playlist _song_hash.txt to playlist _song_hash (1).txt


In [26]:
import os
os.listdir('/content/')

['.config',
 'play_list_train (1).txt',
 'song_artist_word2vec.model',
 'play_list_train.txt',
 'playlist _song_hash (1).txt',
 'playlist _song_hash.txt',
 'sample_data']

In [27]:


# The uploaded files will be in '/content/', so you can use these paths
train_file_path = '/content/play_list_train.txt'  # Replace with the name of your uploaded file
song_file_path = '/content/playlist _song_hash.txt'  # Replace with the name of your uploaded file

# Step 2: Read the playlist dataset file, skipping its first two lines
with open(train_file_path, 'r') as file:
    lines = file.read().split('\n')[2:]

# Step 3: Split each line into song IDs and drop playlists with fewer than two songs
playlists = [s.rstrip().split() for s in lines if len(s.split()) > 1]

# Step 4: Read song metadata from the uploaded location
with open(song_file_path, 'r') as file:
    songs_file = file.read().split('\n')

# Step 5: Parse the song metadata
songs = [s.rstrip().split('\t') for s in songs_file]

# Step 6: Create a DataFrame for the song metadata
songs_df = pd.DataFrame(data=songs, columns=['id', 'title', 'artist'])
songs_df = songs_df.set_index('id')

# Display the first few rows of the DataFrame
songs_df.head()


id,title,artist
0,Gucci Time (w\/ Swizz Beatz),Gucci Mane
1,Aston Martin Music (w\/ Drake & Chrisette Mich...,Rick Ross
2,Get Back Up (w\/ Chris Brown),T.I.
3,Hot Toddy (w\/ Jay-Z & Ester Dean),Usher
4,Whip My Hair,Willow
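
The parsing steps in the cell above can be tried on a small in-memory example. This is a sketch under assumptions: the sample text is made up to mirror what the cell expects (two lines to skip, then one whitespace-separated playlist per line, and tab-separated `id`, `title`, `artist` fields for the metadata).

```python
# Hypothetical playlist file contents: two lines to skip, then one playlist per line
raw_playlists = "header line 1\nheader line 2\n0 1 2 \n3\n4 5 \n"

lines = raw_playlists.split("\n")[2:]  # skip the first two lines
# Keep only lines that contain at least two song IDs
playlists = [s.split() for s in lines if len(s.split()) > 1]
print(playlists)  # [['0', '1', '2'], ['4', '5']]

# Hypothetical metadata file contents: id, title, artist separated by tabs
raw_songs = "0\tSong A\tArtist X\n1\tSong B\tArtist Y\n"

# Skipping blank lines avoids an empty trailing record
songs = [s.rstrip().split("\t") for s in raw_songs.split("\n") if s.strip()]
print(songs)  # [['0', 'Song A', 'Artist X'], ['1', 'Song B', 'Artist Y']]
```

The blank-line filter in the metadata step is an addition for illustration; the notebook cell keeps every line of the file.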


In [28]:
## Model Training

In [29]:
from gensim.models import Word2Vec
# Train our Word2Vec model on the playlists (each playlist is one sentence)
model = Word2Vec(
    playlists, vector_size=32, window=20, negative=50, min_count=1, workers=4
)


In [30]:
song_id = 2172
# Ask the model for songs similar to song #2172
model.wv.most_similar(positive=str(song_id))


[('6641', 0.9969308972358704),
 ('1922', 0.9966843128204346),
 ('11473', 0.9964725971221924),
 ('6626', 0.9962006211280823),
 ('11517', 0.9957231283187866),
 ('1954', 0.9956360459327698),
 ('2014', 0.9954926371574402),
 ('2849', 0.995324432849884),
 ('5586', 0.9948760271072388),
 ('3116', 0.9945465326309204)]
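
The scores returned by `most_similar` are cosine similarities between embedding vectors. The following minimal sketch computes that measure on made-up three-dimensional vectors rather than the model's 32-dimensional ones; the vectors and the `cosine_similarity` helper are illustrative only.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical embeddings
song_a = [1.0, 2.0, 0.0]
song_b = [2.0, 4.0, 0.0]  # points in the same direction as song_a
song_c = [0.0, 0.0, 3.0]  # orthogonal to song_a

print(cosine_similarity(song_a, song_b))  # very close to 1.0
print(cosine_similarity(song_a, song_c))  # 0.0
```

Because the measure depends only on direction, two songs can score near 1.0 even when their vectors differ in length.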

In [31]:
# Note: the list above contains the songs whose embeddings are most similar to song ID 2172


In [32]:
# Let's find out which song corresponds to song ID 2172

In [33]:
print(songs_df.iloc[2172])

title     Fade To Black
artist        Metallica
Name: 2172 , dtype: object


In [34]:
# Define print_recommendations, which takes a song_id and returns a DataFrame
# with the title and artist of the five most similar songs
import numpy as np
def print_recommendations(song_id):
    similar_songs = np.array(
        model.wv.most_similar(positive=str(song_id), topn=5)
    )[:, 0]
    # Cast the string IDs to int so that iloc receives positional integers
    return songs_df.iloc[similar_songs.astype(int)]
# Extract recommendations
print_recommendations(2172)


id,title,artist
6641,Shout At The Devil,Motley Crue
1922,One,Metallica
11473,Little Guitars,Van Halen
6626,Blackout,Scorpions
11517,Mary Had A Little Lamb,Stevie Ray Vaughan & Double Trouble


In [35]:
# Check the number of unique artists
songs_df["artist"].nunique()

15976

In [36]:
songs_df.shape

(75263, 2)

In [37]:
songs_df["artist"].value_counts()

artist,count
-,1812
The Beatles,201
Frank Sinatra,166
Vicente Fernandez,166
Metallica,141
...,...
"Peedi Crakk, Beanie Sigel, Freeway & Young Chris",1
Blackberry Smoke,1
Earl Hooker,1
Mable John,1


In [38]:
songs_df[songs_df["artist"]=="The Beatles"]

id,title,artist
1675,Let It Be,The Beatles
2578,Don't Let Me Down (w\/ Billy Preston),The Beatles
2663,Come Together,The Beatles
2788,Sgt. Pepper's Lonely Hearts Club Band,The Beatles
2789,A Day In The Life,The Beatles
...,...,...
73490,Goodnight,The Beatles
73494,Anna (Go To Him),The Beatles
74639,The Continuing Story Of Bungalow Bill,The Beatles
74946,"I'll Get You (Mono, Past masters)",The Beatles


In [39]:
# Extract recommendations
print_recommendations(1675)

id,title,artist
3106,You Ain't Seen Nothing Yet,Bachman-Turner Overdrive
2819,Magic Carpet Ride,Steppenwolf
3065,Centerfield,John Fogerty
9623,Piece Of My Heart,Big Brother & The Holding Company
9567,Have You Ever Seen The Rain,Creedence Clearwater Revival


For "Let It Be", the songs recommended from playlist co-occurrence include "You Ain't Seen Nothing Yet", "Magic Carpet Ride", and "Centerfield".

## **Recommending Songs Based on Artist Name Using the Word2Vec Model**



In [41]:
from collections import defaultdict
from gensim.models import Word2Vec

# 1. Group song titles by artist
songs_by_artist = defaultdict(list)
for artist, title in zip(songs_df['artist'], songs_df['title']):
    songs_by_artist[artist].append(title)

# 2. Prepare the corpus for Word2Vec: one sentence per artist,
#    made of the artist name followed by that artist's song titles
corpus = [[artist] + titles for artist, titles in songs_by_artist.items()]

# 3. Train a separate Word2Vec model (CBOW, sg=0) so the playlist model above is not overwritten
artist_model = Word2Vec(sentences=corpus, vector_size=32, window=20, min_count=1, sg=0)

# 4. Save the model for later use (optional)
artist_model.save("song_artist_word2vec.model")

# 5. Print the tokens most similar to an artist name.
#    The vocabulary holds both artist names and song titles, so results can be either.
def recommend_songs(artist_name, top_n=5):
    try:
        similar_tokens = artist_model.wv.most_similar(artist_name, topn=top_n)
    except KeyError:
        print(f"Artist '{artist_name}' not found in the model.")
        return
    print(f"Songs recommended based on artist {artist_name}:")
    for token, _ in similar_tokens:
        print(f"- {token}")

In [42]:
# Example of getting song recommendations based on artist
recommend_songs('Gucci Mane')

Songs recommended based on artist Gucci Mane:
- Don't Pull Your Love
- Palito Ortega
- Don't Count Me Out
- Everything You Need Is Right Here
- Spanish Castle Magic


In [43]:
recommend_songs('The Beatles')

Songs recommended based on artist The Beatles:
- A Swingin' Safari
- The Big Idea
- Let The Music Play
- All I Want Is You
- Rastafari Anthem
