<h1>Song recommender</h1>

**Let's go through all the steps of our pipeline:**

1) User inputs a song

2) Get the audio features of the user's song

3) Predict user's song cluster membership (using the best clustering model)

4) Is the user's song included in the hot songs database?

 - If yes: Recommend another song from the same cluster from the hot_songs database
        
 - If no: Recommend another song from the same cluster from the not_hot_songs database

5) Does the user want another recommendation?

 - If yes: Start again from the first step
        
 - If no: End
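
Step 4 amounts to a filtered random pick. The following is a minimal sketch of that selection rule on a tiny hand-made catalogue rather than the real database. The names `CATALOGUE` and `recommend`, the rows, and the choice to exclude the input song from the candidates are all assumptions made for illustration.

```python
import random

# Hypothetical toy catalogue: each row carries a cluster label and a type,
# mirroring the idea of hot and not-hot songs grouped by cluster.
CATALOGUE = [
    {"uri": "a", "title": "Song A", "cluster": 0, "type": "hot"},
    {"uri": "b", "title": "Song B", "cluster": 0, "type": "hot"},
    {"uri": "c", "title": "Song C", "cluster": 0, "type": "not_hot"},
    {"uri": "d", "title": "Song D", "cluster": 1, "type": "not_hot"},
    {"uri": "e", "title": "Song E", "cluster": 1, "type": "not_hot"},
]


def recommend(uri, cluster, catalogue, rng=random):
    """Pick a random song from the same cluster and the same pool as `uri`."""
    hot_uris = {row["uri"] for row in catalogue if row["type"] == "hot"}
    pool_type = "hot" if uri in hot_uris else "not_hot"
    candidates = [
        row for row in catalogue
        if row["cluster"] == cluster
        and row["type"] == pool_type
        and row["uri"] != uri  # assumption: never recommend the input song itself
    ]
    # None signals that the pool has no other song in this cluster.
    return rng.choice(candidates) if candidates else None


print(recommend("a", 0, CATALOGUE))
```

Returning `None` for an empty candidate list keeps the caller in charge of what to do when a cluster has no other song in the relevant pool.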

In [1]:
from config import *
import spotipy
import json
import pandas as pd
from spotipy.oauth2 import SpotifyClientCredentials
import pickle
from sklearn.cluster import KMeans

In [2]:
sp = spotipy.Spotify(
    auth_manager=SpotifyClientCredentials(client_id=Client_ID, client_secret=Client_Secret)
)

In [3]:
song_db=pd.read_csv("final_dataset_with_prediction.csv")

In [4]:
def load(filename="filename.pickle"):
    """Load a pickled object from `filename`; return None if the file is missing."""
    try:
        with open(filename, "rb") as file:
            return pickle.load(file)
    except FileNotFoundError:
        print("File not found!")
        return None
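
This notebook only reads pickles; how the scaler and model files were written is not shown here. A plausible counterpart to `load` might look like the sketch below. The `save` helper is hypothetical, and the round trip uses a plain dictionary as a stand-in for a fitted object.

```python
import os
import pickle
import tempfile


def save(obj, filename="filename.pickle"):
    """Serialize `obj` to `filename` so a pickle-based loader can read it back."""
    with open(filename, "wb") as file:
        pickle.dump(obj, file)


# Round trip with a stand-in object instead of a real scaler or model.
path = os.path.join(tempfile.mkdtemp(), "example.pickle")
save({"n_clusters": 7}, path)
with open(path, "rb") as file:
    restored = pickle.load(file)
print(restored)
```

Pickles should only be loaded from trusted sources, since unpickling can execute arbitrary code.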

In [5]:
scalar = load("scalers/standard.pickle")

In [6]:
model = load("models/kmeans_7.pickle")

In [7]:
def search_song(title, artist, ask_for_options=True):
    """Return the Spotify URI of the chosen track, or '' if nothing matches."""
    print("Searching for song {} of artist {}".format(title, artist))
    # Reuse the module-level client `sp` instead of creating a new one per call.
    results = sp.search(q="track:" + title + " artist:" + artist, limit=10)
    items = results['tracks']['items']
    if not items:
        return ''
    if ask_for_options and len(items) > 1:
        # Several matches: list them and let the user pick one by row number.
        for count, item in enumerate(items):
            print(count, "Song: '{}', artist: '{}', album: '{}', duration: {} minutes".format(
                item['name'], item['artists'][0]['name'], item['album']['name'],
                round(item['duration_ms'] / 60000, 2)))
        song_index = int(input("Desirable song (row number): "))
        return items[song_index]['uri']
    # Single match, or no prompt requested: take the first result.
    return items[0]['uri']
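
The row-number prompt converts the reply with `int(...)`, which raises `ValueError` on non-numeric input, and an out-of-range number raises `IndexError` when the list is indexed. One way to guard against both is a small validation helper. `parse_choice` below is a hypothetical name and not part of the notebook.

```python
def parse_choice(raw, n_options):
    """Return a valid 0-based index, or None if `raw` is not usable."""
    try:
        index = int(raw.strip())
    except ValueError:
        return None
    # Reject negative numbers too, so a typo cannot silently index from the end.
    return index if 0 <= index < n_options else None


print(parse_choice("3", 10))
print(parse_choice("12", 10))
print(parse_choice("abc", 10))
```

A caller could loop on the prompt until `parse_choice` returns something other than `None`.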

In [8]:
def is_hot_song(uri):
    # Select by boolean mask rather than by label: `[0]` only works if the
    # matching row happens to carry index label 0.
    return bool((song_db.loc[song_db['uri'] == uri, 'type'] == 'hot').any())

In [9]:
def search_url(title, artist):
    """Return (name, first artist, Spotify URL) of the top search result."""
    # Reuse the module-level client `sp` instead of creating a new one per call.
    results = sp.search(q="track:" + title + " artist:" + artist, limit=10)
    top = results['tracks']['items'][0]  # raises IndexError if the search is empty
    return top['name'], top['artists'][0]['name'], top['external_urls']['spotify']

In [10]:
def get_song():
    feature_columns = ['danceability', 'energy', 'key', 'loudness', 'mode',
                       'speechiness', 'acousticness', 'instrumentalness',
                       'liveness', 'tempo', 'time_signature']
    answer = True
    while answer:
        artist = str(input("Please, enter the name of the artist: "))
        song = str(input("Please, enter the name of the track: "))
        song_id = search_song(song, artist)
        if song_id == '':
            print("Spotify doesn't recognize that song")
            continue  # ask again instead of recursing and then using an empty URI
        # Fetch the audio features of the user's song and keep the model's input columns.
        audio_features = pd.DataFrame(sp.audio_features(song_id))
        y = audio_features[feature_columns]
        y_scaled = scalar.transform(y)
        pred_cluster = model.predict(y_scaled)
        # Recommend from the hot pool for hot songs, from the remaining songs otherwise.
        if is_hot_song(song_id):
            pool = song_db[song_db['type'] == 'hot']
        else:
            pool = song_db[song_db['type'] != 'hot']
        same_cluster_songs = pool[pool['kmeans_7'] == pred_cluster[0]]
        # `.iloc[0]` turns the one-row sample into scalars, so plain strings reach the search.
        sample_song = same_cluster_songs.sample().iloc[0]
        # Look the sampled song up once to get its display name, artist, and URL.
        track_sample, artist_sample, url = search_url(sample_song['track_name'],
                                                      sample_song['artist_name'])
        print(url)
        print("Song: " + track_sample + " from Artist: " + artist_sample)
        user_answer = str(input("Would like to hear another song (yes or no): "))
        answer = user_answer.lower() == 'yes'
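
To make the `transform` and `predict` calls less opaque: a standard scaler rescales each feature as z = (x - mean) / std, and a fitted k-means model assigns a point to its nearest centroid by Euclidean distance. The pure-Python sketch below mimics both steps on two features. The helper names and all numbers are hypothetical and are not taken from the pickled scaler or model.

```python
import math


def standardize(row, means, stds):
    """Apply z = (x - mean) / std to each feature, as a standard scaler does."""
    return [(x - m) / s for x, m, s in zip(row, means, stds)]


def nearest_centroid(point, centroids):
    """Return the index of the centroid closest to `point` (Euclidean distance)."""
    distances = [math.dist(point, c) for c in centroids]
    return distances.index(min(distances))


# Hypothetical statistics and centroids for two features (say, energy and tempo).
means, stds = [0.5, 120.0], [0.2, 30.0]
centroids = [[-1.0, -1.0], [1.0, 1.0]]

scaled = standardize([0.8, 150.0], means, stds)
print(scaled, nearest_centroid(scaled, centroids))
```

This is also why a new song has to go through the same scaler as the training data: the centroids live in the scaled feature space.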

In [11]:
get_song()

Please, enter the name of the artist: Coldplay
Please, enter the name of the track: Viva la Vida
Searching for song Viva la Vida of artist Coldplay
0 Song: 'Viva La Vida', artist: 'Coldplay', album: 'Viva La Vida or Death and All His Friends', duration: 4.04 minutes
1 Song: 'Viva La Vida', artist: 'Coldplay', album: 'keeping your head up - trending virals', duration: 4.04 minutes
2 Song: 'Viva La Vida', artist: 'Coldplay', album: 'Christmas Break!', duration: 4.04 minutes
3 Song: 'Viva La Vida', artist: 'Coldplay', album: 'Happy 00s', duration: 4.04 minutes
4 Song: 'Viva La Vida', artist: 'Coldplay', album: 'Viva La Vida (Prospekt's March Edition)', duration: 4.04 minutes
5 Song: 'Viva La Vida - Live from Spotify London', artist: 'Coldplay', album: 'Live from Spotify London', duration: 3.9 minutes
6 Song: 'Viva La Vida - Live in Buenos Aires', artist: 'Coldplay', album: 'Live in Buenos Aires', duration: 4.18 minutes
7 Song: 'Viva La Vida - Live', artist: 'Coldplay', album: 'Live 2012',



https://open.spotify.com/track/2nxSAQBvF6gDIwZmG6B9nO
Song: I Should Be Proud from Artist: Martha Reeves & The Vandellas
Would like to hear another song (yes or no): no
