## Case Study: The site for recommendations - "Gnod"

### Algorithm 'Song_finder_bff'

Antonio Montilla

This notebook combines work from other notebooks in this repository to build a song recommendation algorithm using the Spotify API.

The algorithm asks the user for the name of a song and returns a recommendation based on the following criteria:
- First, it checks whether the input song is currently a hot song, i.e. whether it appears in the hot_songs_all dataframe (collected with web scraping from https://www.popvortex.com/music/charts/top-100-songs.php and https://www.billboard.com/charts/greatest-hot-100-singles/ and saved in the CSV file 'hot_songs_all.csv').
- If the input song is a hot song, it recommends another hot song from the list.
- If it is NOT a hot song, then it:
    * Collects the audio features for the input song from the Spotify API via a request.
    * Sends those audio features to a clustering model that was fitted on 30,000 songs from different Spotify playlists (saved in the CSV file 'songs_total_output.csv').
    * Uses the cluster number that the model assigns to the input song to recommend a song from the same cluster.
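
The two branches described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: `HOT_SONGS`, `CLUSTERED_SONGS` and the `predict_cluster` callback are hypothetical stand-ins for the CSV data and the fitted model used later in the notebook.

```python
import random

# Hypothetical stand-ins for the real data: a small hot list and a
# song -> cluster mapping. In the notebook these come from CSV files.
HOT_SONGS = ["Blinding Lights", "Cruel Summer", "Flowers"]
CLUSTERED_SONGS = {
    "Song A by Artist 1": 0,
    "Song B by Artist 2": 1,
    "Song C by Artist 3": 1,
}

def recommend(song_name, predict_cluster, rng=random):
    """Return a recommendation following the two-branch logic above."""
    wanted = song_name.lower()
    hot_lower = [s.lower() for s in HOT_SONGS]
    if wanted in hot_lower:
        # Hot branch: suggest a different song from the hot list.
        others = [s for s in HOT_SONGS if s.lower() != wanted]
        return rng.choice(others)
    # Non-hot branch: find the song's cluster, then pick from that cluster.
    cluster_id = predict_cluster(song_name)
    same_cluster = [s for s, c in CLUSTERED_SONGS.items() if c == cluster_id]
    return rng.choice(same_cluster)
```

Passing the cluster lookup in as a function keeps the sketch independent of Spotify and of the pickled model, so the branching can be checked on its own.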

In [1]:
#importing libraries
import requests
import pandas as pd
from pandas import json_normalize
import numpy as np
import random
from random import randint
from sklearn import cluster, datasets
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials
import pickle

### (1) Connecting with Spotify API

In [2]:
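#reading "key: value" pairs from secrets.txt (one pair per line)
secrets_dict = {}
with open("secrets.txt", "r") as secrets_file:
    for line in secrets_file.read().split('\n'):
        if len(line) > 0:
            #splitting on the first ':' only, so values containing ':' stay intact
            key, value = line.split(':', 1)
            secrets_dict[key.strip()] = value.strip()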

In [3]:
#Initialize SpotiPy with user credentials
sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials(client_id=secrets_dict['clientid'],
                                                           client_secret=secrets_dict['clientsecret']))

### (2) Importing CSV files, scaler and fitted K-Means model

In [10]:
#hot_song csv file
hot_songs_all = pd.read_csv('hot_songs_all.csv')
#songs_total_output csv file
songs_total_output = pd.read_csv('songs_total_output.csv')
#scaler
with open('Standardtransformer.pkl', 'rb') as f:
    Standardtransformer = pickle.load(f)
#K-means model
with open('kmean.pkl', 'rb') as f:
    kmeans = pickle.load(f)
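
For intuition about what the loaded scaler and K-Means model do at prediction time, here is a minimal plain-Python sketch: standardize the features with the stored means and standard deviations, then pick the nearest centroid. All numbers and the helper names `standardize` and `nearest_centroid` are hypothetical. The notebook itself relies on the pickled scikit-learn objects, not on this code.

```python
import math

def standardize(features, means, stds):
    """Scale each feature as (x - mean) / std, like a fitted StandardScaler."""
    return [(x - m) / s for x, m, s in zip(features, means, stds)]

def nearest_centroid(point, centroids):
    """Return the index of the closest centroid by Euclidean distance."""
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))

# Hypothetical values for two features (say, danceability and tempo).
means = [0.5, 120.0]
stds = [0.2, 30.0]
centroids = [[-1.0, -1.0], [1.0, 1.0]]

scaled = standardize([0.7, 150.0], means, stds)
cluster_id = nearest_centroid(scaled, centroids)
```

Scaling matters here because features such as tempo and duration_ms have far larger raw values than danceability and would otherwise dominate the distance.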

### (3) Building algorithm: song_finder_bff

In [11]:
def song_finder_bff(song_name):
    lower_case_song = song_name.lower()
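    hot_lower = hot_songs_all['song'].str.lower()
    if lower_case_song in hot_lower.tolist():
        #picking another hot song, excluding the one the user typed
        other_hot_songs = hot_songs_all['song'][hot_lower != lower_case_song].tolist()
        recommended_song = random.choice(other_hot_songs)
        return f"We recommend you to listen '{recommended_song}' as well, one of top hot hits right now"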
    else:
        results = sp.search(q=song_name.lower(), limit=10)
        tracks = json_normalize(results["tracks"]["items"])
        def expand_list_dict(row):
            df = json_normalize(row['artists'])
            df['song_id'] = row['id']
            return df
        tracks['artists_dfs'] = tracks.apply(expand_list_dict, axis=1)
        #stacking the per-track artist dataframes into a single dataframe
        artist_df = pd.concat(tracks['artists_dfs'].tolist(), axis=0)
        df_merged = pd.merge(left=tracks,
                    right=artist_df,
                    how='inner',
                    left_on='id',
                    right_on='song_id')
        #saving into a df_final the name of the song, artist and song_id associated with input song
        df_final = df_merged[['name_x', 'name_y', 'song_id']]
        #now need to confirm with user the song and artist from list.
        #if yes, then do another request to Spotify to get song data; if not, ask the user again until possible.
        row_index = 0 #so then it starts selecting the first row
        song_info = None #stays None if the user rejects every candidate
        while row_index < len(df_final):
            x = input('Did you mean '+df_final['name_x'].iloc[row_index]+' by '+df_final['name_y'].iloc[row_index]+'?').lower()
            if x in ['yes', 'y', 'ys', 'es', 'si', 'oui']:
                song_info = json_normalize(sp.audio_features(df_final['song_id'].iloc[row_index]))
                break #to break the while loop
            else:
                print('ok, let me try again')
                row_index += 1 #to repeat the process but taking the next song in the df_final
        if song_info is None: #no candidate was confirmed, so there is nothing to cluster
            return "Sorry, I could not find the song you meant. Please try another search."
        song_input_df = song_info[['danceability', 'energy', 'key', 'loudness', 'mode', 'speechiness', 'acousticness', 'instrumentalness', 'liveness', 'valence', 'tempo', 'duration_ms', 'time_signature']]
        song_input_df_scaled = Standardtransformer.transform(song_input_df)#scaling
        song_input_df_scaled = pd.DataFrame(song_input_df_scaled,columns=song_input_df.columns)#creating df
        cluster_song = kmeans.predict(song_input_df_scaled)
        array = songs_total_output.song_artist[songs_total_output.cluster == cluster_song[0]].reset_index(drop=True)
        recommended_song = random.choice(array)
        return f"We recommend you to listen '{recommended_song}' as well:)"
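
The confirmation loop inside the function can also be pulled out into a small helper, which makes it testable without typing answers by hand. The sketch below assumes a hypothetical `confirm_candidate` function and hypothetical song ids. It is not part of the original notebook.

```python
def confirm_candidate(candidates, ask=input):
    """Ask about each (title, artist, song_id) in turn.

    Returns the confirmed song_id, or None if every candidate is rejected.
    """
    accepted = {"yes", "y", "si", "oui"}
    for title, artist, song_id in candidates:
        answer = ask(f"Did you mean {title} by {artist}? ").strip().lower()
        if answer in accepted:
            return song_id
        print("ok, let me try again")
    return None

# Scripted answers replace keyboard input for this demonstration.
scripted = iter(["no", "yes"])
chosen = confirm_candidate(
    [("Music", "Madonna", "id-1"), ("Poker Face", "Lady Gaga", "id-2")],
    ask=lambda prompt: next(scripted),
)
```

Returning `None` when nothing is confirmed gives the caller an explicit signal to stop before requesting audio features.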

In [12]:
#trying with 'poker face'
song_finder_bff('poker face')

Did you mean Poker Face by Lady Gaga?si


"We recommend you to listen 'Genova by Lilla Sällskapet' as well:)"

In [13]:
#searching for 'music'
song_finder_bff('music madonna')

Did you mean Music by Madonna?no
ok, let me try again
Did you mean Popular (with Playboi Carti & Madonna) - Music from the HBO Original Series by The Weeknd?yes


"We recommend you to listen 'Karavaani by Roope Salminen & Koirat' as well:)"

In [14]:
#now for an artist 'shakira'
song_finder_bff('Shakira')

Did you mean Shakira: Bzrp Music Sessions, Vol. 53 by Bizarrap?si


"We recommend you to listen 'Naked In The Rain by Infinity' as well:)"

In [15]:
#trying something from the hot list:
song_finder_bff('Blinding Lights')

"We recommend you to listen 'Cruel Summer' as well, one of top hot hits right now"