This notebook recommends songs based on a song you already like, as long as that song is in my data. If you would like to tweak more parameters, I encourage you to look through the functions in recommendation_engine.py.

In [1]:
import pandas as pd
from gensim.models.doc2vec import Doc2Vec
from recommendation_engine import (clean_features, decoder, feature_similarity,
                                   recommend, similarity_df)
import ipywidgets as widgets
from ipywidgets import interactive, Layout

"""Load in all our data needed to make recommendations.

    feature_df: all of the features extracted from the lyrics; these are used to compute feature similarity.

    model: pretrained doc2vec model; the source of the doc2vec similarity scores.

    decode_dict: dictionary used to decode our recommendations (song_id --> song title, artist).
    
"""
feature_df = pd.read_csv('song_features.csv',index_col=0)
model = Doc2Vec.load('doc2vec_len100.model')
decode_df = pd.read_csv('song_ids_full.txt',sep='\t')
decode_dict = decoder(decode_df)

artists = sorted(decode_df['artist_name'].unique().tolist())
songs = sorted(decode_df['song_title'].tolist())
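
To make the song_id --> (song title, artist) mapping concrete, here is a minimal sketch of building such a dictionary from tab-separated text. It is not the decoder function from recommendation_engine.py; build_decode_dict is a hypothetical name, the tuple order is an assumption, and the song ids in the sample are made up. Only the column names (song_id, artist_name, song_title) come from this notebook.

```python
import csv
import io


def build_decode_dict(tsv_text):
    """Map song_id -> (song_title, artist_name) from tab-separated text.

    Hypothetical helper for illustration; the real decoder() may differ.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row["song_id"]: (row["song_title"], row["artist_name"]) for row in reader}


# Tiny stand-in for song_ids_full.txt (ids are invented for the example).
sample = (
    "song_id\tartist_name\tsong_title\n"
    "id_001\tDJ Khaled\tCandy Paint\n"
    "id_002\tBun B\tRidin’ Slow\n"
)

decode_example = build_decode_dict(sample)
title, artist = decode_example["id_001"]
print(f"'{title}' by {artist}")
```

Note that csv reads every field as a string, whereas pandas may parse numeric ids as integers, so the key type can differ from what decode_dict uses.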

The next cells create a couple of widgets that make the recommender easier to use. These cells need to be run with a fresh kernel.

In [2]:
def view_artist(x=''):
    # Show every row of decode_df for the selected artist (nothing until one is picked).
    if x == '':
        return None
    return decode_df[decode_df.artist_name == x]

artist_sel = widgets.Select(options=artists)
artist_sel.layout.height='300px'
interactive(view_artist, x=artist_sel)



interactive(children=(Select(description='x', layout=Layout(height='300px'), options=('03 Greedo', '070 Shake'…

After selecting an artist above, rerun the cell below to see their available songs. The two widgets are not linked yet: linking them was crashing Jupyter, but hopefully I can get that working soon.
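
A minimal sketch of one way the linking could work, in case it is useful: refresh the song list from an observer on the artist widget. It relies on the ipywidgets `observe(handler, names='value')` mechanism, where the handler receives a change mapping whose 'new' entry holds the new value. The helper names songs_by_artist and link_selects are hypothetical, and this has not been checked against the crash mentioned above.

```python
from collections import defaultdict


def songs_by_artist(rows):
    """Group song titles by artist from (artist_name, song_title) pairs."""
    grouped = defaultdict(list)
    for artist, title in rows:
        grouped[artist].append(title)
    return {artist: sorted(titles) for artist, titles in grouped.items()}


def link_selects(artist_sel, song_sel, lookup):
    """Keep song_sel.options in sync with artist_sel.value.

    Both arguments are expected to behave like ipywidgets Select widgets:
    they expose .value and .options, and artist_sel supports .observe().
    """

    def on_artist_change(change):
        # 'new' holds the newly selected artist name.
        song_sel.options = lookup.get(change["new"], [])

    # Fill the song list for the current artist, then subscribe to changes.
    song_sel.options = lookup.get(artist_sel.value, [])
    artist_sel.observe(on_artist_change, names="value")
    return on_artist_change


# Possible wiring in this notebook (assumes song_sel has already been created):
# lookup = songs_by_artist(zip(decode_df.artist_name, decode_df.song_title))
# link_selects(artist_sel, song_sel, lookup)
```

Building the lookup once keeps the observer cheap, since each change is a dictionary lookup rather than a filter over decode_df.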

In [5]:
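def view_song(x=''):
    # Show every row of decode_df with the selected title (nothing until one is picked).
    if x == '':
        return None
    return decode_df[decode_df.song_title == x]

# Offer only the songs by the artist currently selected in the widget above.
artist_songs = decode_df[decode_df.artist_name == artist_sel.value]['song_title'].to_list()
song_sel = widgets.Select(options=artist_songs)
song_sel.layout.height = '300px'
interactive(view_song, x=song_sel)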

interactive(children=(Select(description='x', layout=Layout(height='300px'), options=('365', 'Addicted', 'All …

Tweak the parameters! (Don't forget that the doc2vec model you choose also affects which songs are returned.)

- feature_importance is the weight we put on the similarities found from our extracted features (syllables per second, sentiment, etc.).

- doc_importance is the weight we put on the similarities found from our doc2vec model.

- std_drop drops songs with features that are more than std_drop standard deviations from our mean (3 in the cell below). This is mostly used to exclude non-songs.

- time_exclusion drops songs that don't have a recorded duration, which also helps remove non-songs. If time_exclusion = False, time-based features are removed from similarity scoring. If the initial song has no duration, this defaults to False.

- songs_returned is the number of songs to recommend.
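
To give a feel for what std_drop and the two importance weights do, here is a rough sketch on plain dictionaries. It is not the code in recommendation_engine.py: drop_outliers, blend_similarities and top_n are hypothetical names, and the use of the sample standard deviation and of a plain weighted sum are assumptions.

```python
import statistics


def drop_outliers(features, std_drop=3):
    """Keep songs whose every feature is within std_drop sample stds of the mean.

    `features` maps song_id -> {feature_name: value}.
    """
    names = next(iter(features.values())).keys()
    keep = set(features)
    for name in names:
        values = [feats[name] for feats in features.values()]
        mean = statistics.mean(values)
        std = statistics.stdev(values)  # assumption: sample standard deviation
        if std == 0:
            continue  # constant feature, nothing to drop
        keep -= {sid for sid, feats in features.items()
                 if abs(feats[name] - mean) > std_drop * std}
    return {sid: feats for sid, feats in features.items() if sid in keep}


def blend_similarities(feature_sim, doc_sim, feature_importance=0.5, doc_importance=0.5):
    """Weighted sum of two {song_id: similarity} dicts over the ids they share."""
    shared = feature_sim.keys() & doc_sim.keys()
    return {sid: feature_importance * feature_sim[sid] + doc_importance * doc_sim[sid]
            for sid in shared}


def top_n(scores, n=5):
    """Return the n song ids with the highest blended score."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


# Made-up similarity scores for three songs.
feature_sim = {"a": 0.9, "b": 0.2, "c": 0.6}
doc_sim = {"a": 0.1, "b": 0.9, "c": 0.6}
print(top_n(blend_similarities(feature_sim, doc_sim), n=2))
print(top_n(blend_similarities(feature_sim, doc_sim, 1.0, 0.0), n=2))
```

With equal weights the two sources balance each other, while setting doc_importance to 0 ranks purely on the extracted features, so the same seed song can produce quite different lists.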

In [6]:

feature_importance = 0.5
doc_importance = 0.5
std_drop = 3
time_exclusion = True
songs_returned = 5

# Look up the song_id for the artist and title currently selected in the widgets above.
selected = (decode_df.artist_name == artist_sel.value) & (decode_df.song_title == song_sel.value)
song_id = decode_df[selected]['song_id'].values[0]

# Clean the features, compute feature similarity, combine it with the doc2vec
# similarity, and print the recommendations.
feature_df = clean_features(feature_df, std_drop=std_drop)
similarities1 = feature_similarity(song_id, feature_df, time_exclusion=time_exclusion, std_drop=std_drop)
similarities2 = similarity_df(similarities1, model, song_id, feature_importance=feature_importance, doc_importance=doc_importance)
recommend(similarities2, decode_dict, song_id, n=songs_returned)

Recommendations for Candy Paint by DJ Khaled

Try out 'Keep On Pushin' by King Chip
Try out 'Ridin’ Slow' by Bun B
Try out 'Cadillac & Benz' by Chamillionaire
Try out 'Recognize A Playa' by Slim Thug
Try out '30 Inches' by Juicy J
