# Song Recommender Project

The goal of this notebook is to develop a new Song Recommender product based on the Billboard Hot 100 list (https://www.billboard.com/charts/hot-100/).

When the user enters the name of a song included in the hot list, the Song Recommender will suggest another song from the hot list.

If the song entered by the user is not on the Billboard list, the Song Recommender connects to Spotify, retrieves the audio features of that song, and recommends another song with similar features.
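The two-branch behaviour described above can be sketched without any external service. This is only an illustration: `recommend`, `find_similar` and the titles are hypothetical names, and `find_similar` stands in for the Spotify lookup that the notebook builds later.

```python
import random


def recommend(title, hot_titles, find_similar, rng=random):
    """Return a recommendation for `title`.

    hot_titles: titles currently on the chart.
    find_similar: fallback callable (a stand-in for the Spotify-based lookup).
    """
    wanted = title.strip().lower()
    on_chart = [t for t in hot_titles if t.strip().lower() == wanted]
    if on_chart:
        # Prefer a different chart song when one exists.
        others = [t for t in hot_titles if t.strip().lower() != wanted]
        return rng.choice(others) if others else on_chart[0]
    # Not on the chart: delegate to the feature-based fallback.
    return find_similar(title)


hot = ["Song A", "Song B", "Song C"]  # made-up titles
print(recommend("song a", hot, find_similar=lambda t: "fallback pick"))
print(recommend("Unknown", hot, find_similar=lambda t: "fallback pick"))
```

Matching titles case-insensitively is a design choice of this sketch, not something the notebook does.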

Song Recommender will base its recommendations on the following features: 

* **Danceability**: Danceability describes how suitable for dancing a track is, based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. A value of 0.0 is least danceable and 1.0 is most danceable.

* **Acousticness**: A measure from 0.0 to 1.0 of whether the track is acoustic.

* **Energy**: Energy is a measure from 0.0 to 1.0 and represents a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy.

* **Instrumentalness**: Predicts whether a track contains no vocals. The closer the instrumentalness value is to 1.0, the greater likelihood the track contains no vocal content.

* **Liveness**: Detects the presence of an audience in the recording. Higher liveness values represent an increased probability that the track was performed live.

* **Loudness**: The overall loudness of a track in decibels (dB). Loudness values are averaged across the entire track. Values typically range between -60 and 0 dB.

* **Speechiness**: Speechiness detects the presence of spoken words in a track. The more exclusively speech-like the recording (e.g. talk show, audio book, poetry), the closer to 1.0 the attribute value.

* **Tempo**: The overall estimated tempo of a track in beats per minute (BPM). In musical terminology, tempo is the speed or pace of a given piece and derives directly from the average beat duration.

* **Valence**: A measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry).
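As a quick sanity check on the ranges listed above, the features bounded to the interval from 0.0 to 1.0 can be validated before they are used. This is a minimal sketch: `UNIT_RANGE_FEATURES` and `out_of_range` are hypothetical helpers, and the sample values are the ones returned for the "hello" search later in this notebook.

```python
# Features described above as measures or probabilities between 0.0 and 1.0.
UNIT_RANGE_FEATURES = [
    "danceability", "acousticness", "energy", "instrumentalness",
    "liveness", "speechiness", "valence",
]


def out_of_range(features):
    """Return the unit-range feature names that are missing or outside [0.0, 1.0]."""
    problems = []
    for name in UNIT_RANGE_FEATURES:
        value = features.get(name)
        if value is None or not 0.0 <= value <= 1.0:
            problems.append(name)
    return problems


sample = {
    "danceability": 0.905, "energy": 0.647, "loudness": -5.065,
    "speechiness": 0.107, "acousticness": 0.0187, "instrumentalness": 0,
    "liveness": 0.282, "valence": 0.367, "tempo": 130.97,
}
print(out_of_range(sample))
```

Loudness and tempo are left out on purpose because they are not bounded to the unit interval.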


In [2]:
# Import all necessary libraries:

import json
import pickle
import random
from random import randint
from time import sleep

import numpy as np
import pandas as pd
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials
import config  # local module holding client_id and client_secret
from sklearn import datasets # sklearn comes with some toy datasets to practise
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from matplotlib import pyplot
from IPython.display import Image, display

In [3]:
# Helper to load the pickled objects saved by the "Spotify Clustering" notebook (available in the repository)

def load(filename = "filename.pickle"): 
    try: 
        with open(filename, "rb") as f: 
            return pickle.load(f) 
    except FileNotFoundError: 
        print("File not found!")
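The loader above assumes the pickle files were written by another notebook. A round trip in a temporary directory shows how the two sides fit together; `save` is a hypothetical counterpart that is not part of this notebook, and `load` is restated here (with an explicit `return None`) so the example runs on its own.

```python
import os
import pickle
import tempfile


def save(obj, filename):
    """Hypothetical counterpart to load(): write obj to a pickle file."""
    with open(filename, "wb") as f:
        pickle.dump(obj, f)


def load(filename):
    """Read a pickled object, or return None when the file does not exist."""
    try:
        with open(filename, "rb") as f:
            return pickle.load(f)
    except FileNotFoundError:
        print("File not found!")
        return None


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.pickle")
    save({"n_clusters": 6}, path)
    restored = load(path)
    missing = load(os.path.join(tmp, "absent.pickle"))

print(restored, missing)
```

Because a missing file yields `None` rather than an exception, callers should check the result before using it as a model.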

In [4]:
# Load the scaler fitted in the "Spotify Clustering" notebook (available in the repository)
scaler2 = load("Model/scaler.pickle")
scaler2

StandardScaler()

In [5]:
# Load the K-Means model fitted in the "Spotify Clustering" notebook (available in the repository)
kmeans2 = load("Model/kmeans_4.pickle")
kmeans2

KMeans(n_clusters=6, random_state=1234)

# Loading data

In [6]:

df = pd.read_csv("Data/hot100.csv")  # Exported by the "Bilboard-The hot 100" notebook (available in the repository)

clus_df = pd.read_csv("Data/clustered_df.csv")  # Exported by the "Spotify Clustering" notebook (available in the repository)

sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials(client_id=config.client_id,
                                                           client_secret=config.client_secret)) # Credentials to connect to Spotify
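The `config` module used above is not shown in this notebook. One possible way to keep the credentials out of the repository is to read them from environment variables. The sketch below is an assumption about how such a helper could look: `read_credentials` is a hypothetical name, and the variable names are the ones Spotipy looks for by default.

```python
import os


def read_credentials(env=os.environ):
    """Return (client_id, client_secret) from the environment, or raise if unset."""
    client_id = env.get("SPOTIPY_CLIENT_ID")
    client_secret = env.get("SPOTIPY_CLIENT_SECRET")
    if not client_id or not client_secret:
        raise RuntimeError("Set SPOTIPY_CLIENT_ID and SPOTIPY_CLIENT_SECRET first")
    return client_id, client_secret


# Example with a plain dict standing in for the real environment.
fake_env = {"SPOTIPY_CLIENT_ID": "my-id", "SPOTIPY_CLIENT_SECRET": "my-secret"}
print(read_credentials(fake_env))
```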


# Connect with Spotify

In [7]:
# Connect with Spotify in order to get the most important features of a song:

song = sp.search(q="Bohemian Rhapsody", limit=1)
# The first search hit carries the track URI that audio_features() expects.
sp.audio_features(song["tracks"]["items"][0]["uri"])[0]

{'danceability': 0.392,
 'energy': 0.402,
 'key': 0,
 'loudness': -9.961,
 'mode': 0,
 'speechiness': 0.0536,
 'acousticness': 0.288,
 'instrumentalness': 0,
 'liveness': 0.243,
 'valence': 0.228,
 'tempo': 143.883,
 'type': 'audio_features',
 'id': '7tFiyTwD0nx5a1eklYtX2J',
 'uri': 'spotify:track:7tFiyTwD0nx5a1eklYtX2J',
 'track_href': 'https://api.spotify.com/v1/tracks/7tFiyTwD0nx5a1eklYtX2J',
 'analysis_url': 'https://api.spotify.com/v1/audio-analysis/7tFiyTwD0nx5a1eklYtX2J',
 'duration_ms': 354320,
 'time_signature': 4}
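The lookup above indexes straight into `song["tracks"]["items"][0]`, which fails when a search returns no hits. A small guard could look like this; `first_track_uri` is a hypothetical helper, and the dictionaries below only imitate the part of the search response that this notebook uses.

```python
def first_track_uri(search_result):
    """Return the URI of the first track in a search response, or None if there is none."""
    items = search_result.get("tracks", {}).get("items", [])
    return items[0]["uri"] if items else None


# Trimmed-down stand-ins for search responses.
hit = {"tracks": {"items": [{"uri": "spotify:track:7tFiyTwD0nx5a1eklYtX2J"}]}}
no_hit = {"tracks": {"items": []}}

print(first_track_uri(hit))
print(first_track_uri(no_hit))
```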

In [8]:
song = sp.search(q="hello", limit=1)
features = sp.audio_features(song["tracks"]["items"][0]["uri"])[0]

In [9]:
features

{'danceability': 0.905,
 'energy': 0.647,
 'key': 10,
 'loudness': -5.065,
 'mode': 0,
 'speechiness': 0.107,
 'acousticness': 0.0187,
 'instrumentalness': 0,
 'liveness': 0.282,
 'valence': 0.367,
 'tempo': 130.97,
 'type': 'audio_features',
 'id': '2r6OAV3WsYtXuXjvJ1lIDi',
 'uri': 'spotify:track:2r6OAV3WsYtXuXjvJ1lIDi',
 'track_href': 'https://api.spotify.com/v1/tracks/2r6OAV3WsYtXuXjvJ1lIDi',
 'analysis_url': 'https://api.spotify.com/v1/audio-analysis/2r6OAV3WsYtXuXjvJ1lIDi',
 'duration_ms': 190534,
 'time_signature': 4}

In [10]:
type(features)

dict

In [11]:
my_dict = sp.audio_features(song["tracks"]["items"][0]["uri"])[0] # you can provide a list of URIs

print(my_dict)
# Wrap every value in a one-element list so that pandas builds a single-row DataFrame.
my_dict_new = {key: [value] for key, value in my_dict.items()}
#my_dict_new['name'] = [song["tracks"]["items"][0]['name']]
print(my_dict_new)

pd.DataFrame(my_dict_new)


{'danceability': 0.905, 'energy': 0.647, 'key': 10, 'loudness': -5.065, 'mode': 0, 'speechiness': 0.107, 'acousticness': 0.0187, 'instrumentalness': 0, 'liveness': 0.282, 'valence': 0.367, 'tempo': 130.97, 'type': 'audio_features', 'id': '2r6OAV3WsYtXuXjvJ1lIDi', 'uri': 'spotify:track:2r6OAV3WsYtXuXjvJ1lIDi', 'track_href': 'https://api.spotify.com/v1/tracks/2r6OAV3WsYtXuXjvJ1lIDi', 'analysis_url': 'https://api.spotify.com/v1/audio-analysis/2r6OAV3WsYtXuXjvJ1lIDi', 'duration_ms': 190534, 'time_signature': 4}
{'danceability': [0.905], 'energy': [0.647], 'key': [10], 'loudness': [-5.065], 'mode': [0], 'speechiness': [0.107], 'acousticness': [0.0187], 'instrumentalness': [0], 'liveness': [0.282], 'valence': [0.367], 'tempo': [130.97], 'type': ['audio_features'], 'id': ['2r6OAV3WsYtXuXjvJ1lIDi'], 'uri': ['spotify:track:2r6OAV3WsYtXuXjvJ1lIDi'], 'track_href': ['https://api.spotify.com/v1/tracks/2r6OAV3WsYtXuXjvJ1lIDi'], 'analysis_url': ['https://api.spotify.com/v1/audio-analysis/2r6OAV3WsYtX

| | danceability | energy | key | loudness | mode | speechiness | acousticness | instrumentalness | liveness | valence | tempo | type | id | uri | track_href | analysis_url | duration_ms | time_signature |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.905 | 0.647 | 10 | -5.065 | 0 | 0.107 | 0.0187 | 0 | 0.282 | 0.367 | 130.97 | audio_features | 2r6OAV3WsYtXuXjvJ1lIDi | spotify:track:2r6OAV3WsYtXuXjvJ1lIDi | https://api.spotify.com/v1/tracks/2r6OAV3WsYtX... | https://api.spotify.com/v1/audio-analysis/2r6O... | 190534 | 4 |


In [12]:
Z=pd.DataFrame(my_dict_new)

In [13]:
# Keep only the numerical columns (drops type, id, uri, track_href and analysis_url):
Z = Z.select_dtypes(include="number")
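The same filtering can be illustrated on the raw dictionary, before any DataFrame is built. This is a sketch only: `numeric_only` is a hypothetical helper and the sample is a shortened version of the feature dictionary shown earlier.

```python
from numbers import Number


def numeric_only(features):
    """Keep the entries whose values are numbers and drop strings such as ids and URLs."""
    return {key: value for key, value in features.items() if isinstance(value, Number)}


sample = {
    "danceability": 0.905,
    "key": 10,
    "tempo": 130.97,
    "type": "audio_features",
    "id": "2r6OAV3WsYtXuXjvJ1lIDi",
    "uri": "spotify:track:2r6OAV3WsYtXuXjvJ1lIDi",
}
print(numeric_only(sample))
```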

In [14]:
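# Scale with the scaler fitted in the "Spotify Clustering" notebook. Fitting a new
# StandardScaler on this single row would map every feature to 0.
Z_scaled = scaler2.transform(Z)
Z_scaled_df = pd.DataFrame(Z_scaled, columns = Z.columns)
display(Z.head())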

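| | danceability | energy | key | loudness | mode | speechiness | acousticness | instrumentalness | liveness | valence | tempo | duration_ms | time_signature |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.905 | 0.647 | 10 | -5.065 | 0 | 0.107 | 0.0187 | 0 | 0.282 | 0.367 | 130.97 | 190534 | 4 |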
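Standardization replaces each feature x with (x - mean) / std, where the mean and standard deviation come from the data the scaler was fitted on. The plain-Python sketch below, with made-up training values for danceability and tempo, illustrates why those statistics should come from the training set: fitting on the single song being classified gives a zero spread, so every scaled feature becomes 0. The helpers `fit` and `transform` are hypothetical, and treating a zero spread as 1.0 mirrors how scikit-learn avoids dividing by zero.

```python
import statistics


def fit(rows):
    """Return column-wise means and population standard deviations."""
    columns = list(zip(*rows))
    means = [statistics.fmean(column) for column in columns]
    # A zero spread is replaced by 1.0 so the division below stays defined.
    stds = [statistics.pstdev(column) or 1.0 for column in columns]
    return means, stds


def transform(row, means, stds):
    """Standardize one row with previously fitted statistics."""
    return [(x - m) / s for x, m, s in zip(row, means, stds)]


training = [[0.2, 90.0], [0.6, 120.0], [1.0, 150.0]]  # made-up (danceability, tempo) rows
new_song = [0.905, 130.97]

scaled_with_training = transform(new_song, *fit(training))
scaled_with_itself = transform(new_song, *fit([new_song]))

print(scaled_with_training)
print(scaled_with_itself)  # [0.0, 0.0]
```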


In [15]:
clusters = kmeans2.predict(Z_scaled_df)

In [16]:
clusters

array([1])
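Conceptually, predicting with a fitted K-Means model means assigning a point to the cluster whose centroid is closest. The sketch below shows that idea in plain Python; the centroids are made up and `nearest_centroid` is a hypothetical helper, not the scikit-learn implementation.

```python
import math


def nearest_centroid(point, centroids):
    """Return the index of the centroid with the smallest Euclidean distance to point."""
    distances = [math.dist(point, centroid) for centroid in centroids]
    return distances.index(min(distances))


centroids = [[-1.0, -1.0], [0.0, 0.0], [2.0, 1.5]]  # made-up cluster centres
print(nearest_centroid([0.1, -0.2], centroids))  # 1
```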

# Create the recommender function

In [17]:
choice = ["is a great choice!", "is amazing!", "is cool!"]

def user_selection():
    from IPython.display import IFrame

    user = input("Please enter the title of a hot song: ")
    if user in df.values:
        # The song is on the hot list: suggest a random title from the same list.
        print(user, random.choice(choice), "You might like this one from the hot list too:", random.choice(df["titles"]))
    else:
        # Otherwise look the song up on Spotify and fetch its audio features.
        song = sp.search(q=user, limit=1)
        items = song["tracks"]["items"]
        if not items:
            print("Sorry, no song with that title was found on Spotify.")
            return
        features = sp.audio_features(items[0]["uri"])[0]

        # Keep the numeric features, scale them with the fitted scaler and predict the cluster.
        Z = pd.DataFrame(features, index=[0])
        Z = Z.select_dtypes(include="number")
        Z_scaled = scaler2.transform(Z)
        Z_predict_cluster = kmeans2.predict(Z_scaled)[0]

        # Recommend a random song from the same cluster and show its Spotify player.
        song_to_recommend = clus_df[clus_df.cluster == Z_predict_cluster].sample(1)
        print(song_to_recommend)
        track_id = song_to_recommend["id"].iloc[0]
        display(IFrame(src=f"https://open.spotify.com/embed/track/{track_id}", width="320", height="80"))

In [21]:
user_selection()

Please enter the title of a hot song: lose yourself
      danceability  energy  key  loudness  mode  speechiness  acousticness  \
4622         0.746   0.779    7    -3.947     0       0.0927          0.01   

      instrumentalness  liveness  valence  ...  time_signature  cluster  \
4622               0.0     0.256    0.824  ...               4        1   

            songs       artists                                  uris  \
4622  Rahan takii  Antti Tuisku  spotify:track:4UDDfd77rlXmNjgW1NKhWq   

                type                      id  \
4622  audio_features  4UDDfd77rlXmNjgW1NKhWq   

                                       uri  \
4622  spotify:track:4UDDfd77rlXmNjgW1NKhWq   

                                             track_href  \
4622  https://api.spotify.com/v1/tracks/4UDDfd77rlXm...   

                                           analysis_url  
4622  https://api.spotify.com/v1/audio-analysis/4UDD...  

[1 rows x 22 columns]


In [23]:
from IPython.display import IFrame
track_id = "7kOJsVkJXvLQPQ9osGWeKd"
IFrame(src=f"https://open.spotify.com/embed/track/{track_id}",
       width="320",
       height="80",
       frameborder="0",
       allowtransparency="true",
       allow="encrypted-media",
      )
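The embed URL above is built from a track id, while the clustered data also stores full URIs of the form `spotify:track:<id>`. A small helper can turn such a URI into the embed URL; `embed_url` is a hypothetical name, and the URI used below is the one from the recommendation printed earlier.

```python
def embed_url(uri):
    """Build the Spotify embed URL for a track URI of the form 'spotify:track:<id>'."""
    prefix = "spotify:track:"
    if not uri.startswith(prefix):
        raise ValueError(f"Not a Spotify track URI: {uri!r}")
    track_id = uri[len(prefix):]
    return f"https://open.spotify.com/embed/track/{track_id}"


print(embed_url("spotify:track:4UDDfd77rlXmNjgW1NKhWq"))
```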
