### danceability

Danceability describes how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. A value of 0.0 is least danceable and 1.0 is most danceable.

### energy

Energy ranges from 0.0 to 1.0 and represents a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy. For example, death metal has high energy, while a Bach prelude scores low on the scale. Perceptual features contributing to this attribute include dynamic range, perceived loudness, timbre, onset rate, and general entropy.

### loudness

The overall loudness of a track in decibels (dB). Loudness is the quality of a sound that is the primary psychological correlate of physical strength (amplitude). Values are averaged across the entire track, which makes them useful for comparing the relative loudness of tracks. Values typically range between -60 and 0 dB.
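Because loudness is the only feature in this list that is not already on a 0.0 to 1.0 scale, it can help to see what a fixed rescaling would look like. The sketch below is an illustration only: the helper name `rescale_loudness` is hypothetical, and the -60 to 0 dB bounds are taken from the "typical" range quoted above, so values outside it are clamped. Decibels are logarithmic, so this linear mapping is linear in dB, not in amplitude.

```python
def rescale_loudness(db: float, floor: float = -60.0, ceiling: float = 0.0) -> float:
    """Clamp a loudness value in dB to [floor, ceiling] and map it linearly onto [0, 1]."""
    clamped = max(floor, min(ceiling, db))
    return (clamped - floor) / (ceiling - floor)


# A few illustrative values (not taken from any real track)
for db in (-60.0, -30.0, -6.0, 2.5):
    print(db, "->", round(rescale_loudness(db), 3))
```

Unlike a scaler fitted to one playlist, a fixed range like this gives the same output for the same input regardless of which other tracks are in the dataset.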

### valence

A measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track. Tracks with high valence sound more positive (e.g. happy, cheerful, euphoric), while tracks with low valence sound more negative (e.g. sad, depressed, angry).

### speechiness

Speechiness detects the presence of spoken words in a track. The more exclusively speech-like the recording (e.g. talk show, audio book, poetry), the closer to 1.0 the attribute value. Values above 0.66 describe tracks that are probably made entirely of spoken words. Values between 0.33 and 0.66 describe tracks that may contain both music and speech, either in sections or layered, including such cases as rap music. Values below 0.33 most likely represent music and other non-speech-like tracks.
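The three speechiness bands described above can be written as a small lookup. This is a minimal sketch: the function name and the band labels are made up for illustration, and how the exact boundary values 0.33 and 0.66 are assigned is an assumption, since the description only says "above", "between", and "below".

```python
def speechiness_category(value: float) -> str:
    """Map a speechiness score (0.0-1.0) to the rough bands described in the text."""
    if not 0.0 <= value <= 1.0:
        raise ValueError("speechiness is expected to lie between 0.0 and 1.0")
    if value > 0.66:
        return "spoken word"        # probably made entirely of spoken words
    if value >= 0.33:
        return "music and speech"   # e.g. rap, or tracks with spoken sections
    return "music"                  # most likely music or other non-speech audio


for score in (0.04, 0.45, 0.92):
    print(score, "->", speechiness_category(score))
```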

In [None]:
# get a list of playlists containing different moods for training for personality 


In [1]:
import pandas as pd
import numpy as np
import requests
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials


In [2]:
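# read your IDs (from a new project in the Spotify Developer Dashboard) from environment
# variables instead of hard-coding them; never commit real credentials to a notebook.
# SPOTIPY_CLIENT_ID / SPOTIPY_CLIENT_SECRET are the variable names spotipy itself looks for.
import os
CLIENT_ID = os.environ['SPOTIPY_CLIENT_ID']
CLIENT_SECRET = os.environ['SPOTIPY_CLIENT_SECRET']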
# generate access token

# authentication URL
AUTH_URL = 'https://accounts.spotify.com/api/token'

# POST
auth_response = requests.post(AUTH_URL, {
    'grant_type': 'client_credentials',
    'client_id': CLIENT_ID,
    'client_secret': CLIENT_SECRET,
})

# convert the response to JSON
auth_response_data = auth_response.json()

# save the access token
access_token = auth_response_data['access_token']

# used for authenticating all API calls
headers = {'Authorization': 'Bearer {token}'.format(token=access_token)}

# base URL of all Spotify API endpoints
BASE_URL = 'https://api.spotify.com/v1/'
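
The cell above indexes `auth_response_data['access_token']` directly, which raises a bare `KeyError` if the credentials are rejected. A small guard makes that failure easier to read. This is a sketch under assumptions: the helper name `extract_access_token` is hypothetical, and the `error` / `error_description` keys follow the general OAuth 2.0 error-response convention rather than anything confirmed in this notebook.

```python
def extract_access_token(payload: dict) -> str:
    """Return the access token from a token-endpoint response body, or raise a readable error."""
    token = payload.get("access_token")
    if not token:
        # Assumption: failed requests carry OAuth 2.0 style 'error' / 'error_description' keys
        reason = payload.get("error_description") or payload.get("error") or "unknown error"
        raise RuntimeError(f"Token request failed: {reason}")
    return token


# Illustrative payloads, not real API responses
ok_payload = {"access_token": "example-token", "token_type": "Bearer", "expires_in": 3600}
print(extract_access_token(ok_payload))
```

In the notebook this would be called as `access_token = extract_access_token(auth_response.json())`.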

In [None]:
#pip install spotipy --upgrade -q

In [3]:

#!pip install spotipy --upgrade -q

client_credentials_manager = SpotifyClientCredentials(client_id=CLIENT_ID, client_secret=CLIENT_SECRET)
sp = spotipy.Spotify(client_credentials_manager = client_credentials_manager)
def call_playlist(creator_id, playlist_url, limit, offset):
    # NOTE: use playlist_url instead of playlist_id. playlist_id doesn't accept limit and offset parameters for some reason?
    
    # create an empty list and an empty df
    playlist_features_list = ['artist', 'album', 'track_name', 'track_id', 'danceability', 'energy', 'key', 'loudness', 'mode', 
                              'speechiness', 'instrumentalness', 'liveness', 'valence', 'tempo', 'duration_ms', 'time_signature', 'acousticness']
    playlist_df = pd.DataFrame(columns = playlist_features_list)
    
    # loop through the specified playlist and extract wanted features
    playlist = sp.user_playlist_tracks(creator_id, playlist_url, limit=limit, offset=offset)["items"]
    for track in playlist:

        # create empty dict
        playlist_features = {}

        # get metadata
        playlist_features['artist'] = track['track']['album']['artists'][0]['name']
        playlist_features['album'] = track['track']['album']['name']
        playlist_features["track_name"] = track["track"]["name"]
        playlist_features["track_id"] = track["track"]["id"]
        # playlist_features["explicit"] = track["track"]["explicit"]
        playlist_features["popularity"] = track["track"]["popularity"]
        playlist_features["album_release_date"] = track["track"]["album"]["release_date"]
        playlist_features["duration_ms"] = track["track"]["duration_ms"]
        # playlist_features['added_by'] = track["added_by"]["id"]
        # playlist_features['added_at'] = track["added_at"]
        
        # get audio features
        audio_features = sp.audio_features(playlist_features["track_id"])[0]
        for feature in playlist_features_list[4:]:
            playlist_features[feature] = audio_features[feature]
        
        # concat dfs
        track_df = pd.DataFrame(playlist_features, index = [0])
        playlist_df = pd.concat([playlist_df, track_df], ignore_index = True)

    # return df
    return playlist_df
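
`call_playlist` requests audio features one track at a time, which is likely a large part of the load time. `sp.audio_features` also accepts a list of track IDs, so the IDs can be sent in batches instead. The sketch below shows only the batching step; `chunked` is a hypothetical helper, and the batch size of 100 is an assumption based on the per-request limit the Web API documents for that endpoint, so check it before relying on it.

```python
from typing import Iterator, List


def chunked(items: List[str], size: int = 100) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each holding at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Placeholder IDs standing in for real Spotify track IDs
track_ids = [f"track{i}" for i in range(250)]
batches = list(chunked(track_ids))
print([len(batch) for batch in batches])

# In the notebook, each batch could then be passed to spotipy in one call, e.g.:
# features = [f for batch in chunked(track_ids) for f in sp.audio_features(batch)]
```

Collecting the resulting rows in a plain list and building the DataFrame once at the end would also avoid calling `pd.concat` inside the loop.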


In [4]:
from sklearn import preprocessing
scaler = preprocessing.MinMaxScaler()


In [5]:
username = 'Spotify'

## usually takes 1m44s to load
upbeat_pls = [
    'https://open.spotify.com/playlist/37i9dQZF1DX3rxVfibe1L0', # mood booster
    'https://open.spotify.com/playlist/37i9dQZF1DX66m4icL86Ru', #BBE
    'https://open.spotify.com/playlist/37i9dQZF1DWYBO1MoTDhZI', # good vibes
    'https://open.spotify.com/playlist/37i9dQZF1DWSf2RDTDayIx', # happy beats
    'https://open.spotify.com/playlist/37i9dQZF1DX7KNKjOK0o75', # have a great day
    'https://open.spotify.com/playlist/37i9dQZF1DXdPec7aLTmlC', # happy hits
    'https://open.spotify.com/playlist/37i9dQZF1DX0Uv9tZ47pWo', #girls night
    'https://open.spotify.com/playlist/37i9dQZF1DWXti3N4Wp5xy', #pop party
    'https://open.spotify.com/playlist/37i9dQZF1DWSqmBTGDYngZ', #sing in the shower
    'https://open.spotify.com/playlist/37i9dQZF1DWZixSclZdoFE' #energy boost
]


# call the call_playlist function for each playlist URL and concatenate the results
upbeats = pd.concat([call_playlist(username, url, 100, 0) for url in upbeat_pls], ignore_index=True)
upbeats['vibe']='upbeat'


Couldn't write token to cache at: .cache

In [6]:
import os
os.chdir('/Users/lphan/Documents/Spotify')

In [7]:

# upbeats['vibe']='upbeat'
upbeats.to_csv('upbeats.csv',index=False)

In [8]:
#usually takes 
sad_pls = [
    'https://open.spotify.com/playlist/37i9dQZF1DX7qK8ma5wgG1',
    'https://open.spotify.com/playlist/37i9dQZF1DX3YSRoSdA634', #life sucks
    'https://open.spotify.com/playlist/37i9dQZF1DWSqBruwoIXkA',#sad hours
    'https://open.spotify.com/playlist/37i9dQZF1DX64Y3du11rR1',
    'https://open.spotify.com/playlist/37i9dQZF1DX46zHlTTyolY', #heartache
    'https://open.spotify.com/playlist/37i9dQZF1DX3bgBpcV2oGY', #sad guitar instrumental
    'https://open.spotify.com/playlist/37i9dQZF1DX15JKV0q7shD', #classics for crying
    'https://open.spotify.com/playlist/37i9dQZF1DWVV27DiNWxkR', #sad indie
    'https://open.spotify.com/playlist/37i9dQZF1EIfAoIM3ht61G', #mellow mix
    'https://open.spotify.com/playlist/37i9dQZF1EIg6gLNLe52Bd' #lonely sad mix
    ]     
sads = pd.concat([call_playlist(username, url, 100, 0) for url in sad_pls], ignore_index=True)
sads['vibe']='sad'


In [12]:
from sklearn import preprocessing
scaler = preprocessing.MinMaxScaler()
import plotly.graph_objs as go

In [22]:
upbeats_feature = upbeats[['danceability', 'energy', 'loudness','speechiness',
                             'valence', 'tempo', 'popularity']]

music_feature_upbeats = pd.DataFrame(scaler.fit_transform(upbeats_feature), columns=upbeats_feature.columns)

In [23]:
sads_feature = sads[['danceability', 'energy', 'loudness','speechiness',
                             'valence', 'tempo', 'popularity']]

music_feature_sads = pd.DataFrame(scaler.fit_transform(sads_feature), columns=sads_feature.columns)
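
One thing to keep in mind when reading the charts that follow: `scaler.fit_transform` is called separately on the upbeat and sad data, so each dataset is stretched to its own 0 to 1 range. The pure-Python sketch below reproduces the min-max formula on made-up loudness values to show the effect; `min_max` is a hypothetical helper, not the scikit-learn implementation, and returning 0.0 for a constant column is a choice made here for simplicity.

```python
def min_max(values):
    """Rescale a list of numbers to [0, 1] using (v - min) / (max - min)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column: no spread to rescale
    return [(v - lo) / (hi - lo) for v in values]


loud_playlist = [-10.0, -5.0, 0.0]      # made-up loudness values in dB
quiet_playlist = [-40.0, -30.0, -20.0]  # made-up loudness values in dB

# Scaled separately, both playlists look identical
print(min_max(loud_playlist))
print(min_max(quiet_playlist))

# Scaled on a shared range, the difference between them is preserved
combined = min_max(loud_playlist + quiet_playlist)
print(combined[:3], combined[3:])
```

If the goal is to compare playlists against each other, fitting one scaler on the combined data (or using a fixed range) keeps the scaled values comparable.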

In [28]:
# Create radar chart
fig1 = go.Figure(data=go.Scatterpolar(
    r=list(round(music_feature_upbeats.median(),2)),
    theta=['Danceability', 'Energy', 'Loudness', 'Speechiness', 'Valence','Tempo','Popularity'],
    fill='toself'
))

# Update chart layout
fig1.update_layout(
    polar=dict(
        radialaxis=dict(
            # gridcolor="skyblue",
            visible=True,
            range=[0,1],
            tickfont=dict(color='slategrey')
        )),
    showlegend=False,
    title={'text': "Upbeat Playlists: Audio Features", 'x': 0.5, 'y': 0.95},
    margin=dict(l=0, r=0, t=60, b=0),
    font=dict(size=14)
)
# sandybrown, seagreen, seashell, sienna, silver,
#             skyblue, slateblue, slategray, slategrey, snow,
#             springgreen, steelblue, tan, teal, thistle, tomato,
#             turquoise, violet, wheat, white, whitesmoke,
#             yellow, yellowgreen
# Show chart
fig1.show()

#change the website color
# from IPython.display import display, HTML
# display(HTML('<style>body{background-color: white;}</style>'))


In [29]:
# Create radar chart
fig2 = go.Figure(data=go.Scatterpolar(
    r=list(round(music_feature_sads.median(),2)),
    theta=['Danceability', 'Energy', 'Loudness', 'Speechiness', 'Valence','Tempo','Popularity'],
    fill='toself'
))

# Update chart layout
fig2.update_layout(
    polar=dict(
        radialaxis=dict(
            # gridcolor="skyblue",
            visible=True,
            range=[0,1],
            tickfont=dict(color='slategrey')
        )),
    showlegend=False,
    title={'text': "Sad Playlists: Audio Features", 'x': 0.5, 'y': 0.95},
    margin=dict(l=0, r=0, t=60, b=0),
    font=dict(size=14)
)
fig2.show()



### Findings:
Upbeat: High energy, high valence, high loudness

Sad: Low energy, low valence, high loudness

In [17]:
my_songs = pd.read_csv('AudioFeaturesTable.csv')

In [30]:
my_features = my_songs[['danceability', 'energy', 'loudness','speechiness',
                             'valence', 'tempo', 'popularity']]

music_feature_me = pd.DataFrame(scaler.fit_transform(my_features), columns=my_features.columns)

In [31]:
# Create radar chart
fig3 = go.Figure(data=go.Scatterpolar(
    r=list(round(music_feature_me.median(),2)),
    theta=['Danceability', 'Energy', 'Loudness', 'Speechiness', 'Valence','Tempo','Popularity'],
    fill='toself'
))

# Update chart layout
fig3.update_layout(
    polar=dict(
        radialaxis=dict(
            # gridcolor="skyblue",
            visible=True,
            range=[0,1],
            tickfont=dict(color='slategrey')
        )),
    showlegend=False,
    title={'text': "Your Audio Features", 'x': 0.5, 'y': 0.95},
    margin=dict(l=0, r=0, t=60, b=0),
    font=dict(size=14)
)
fig3.show()
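
The three radar-chart cells differ only in the DataFrame they read from. A small helper that prepares the `theta` and `r` inputs would let one plotting function serve all of them. This is a sketch: `radar_inputs` is a hypothetical name, and it assumes the medians arrive as a plain dict, for example from `music_feature_me.median().to_dict()`.

```python
def radar_inputs(medians: dict, ndigits: int = 2):
    """Turn {feature_name: median} into (theta labels, rounded r values) for a polar chart."""
    theta = [name.capitalize() for name in medians]
    r = [round(value, ndigits) for value in medians.values()]
    return theta, r


# Made-up medians for illustration
theta, r = radar_inputs({"danceability": 0.612, "energy": 0.734, "valence": 0.5})
print(theta)
print(r)

# In the notebook, the result could feed the same trace used above:
# go.Scatterpolar(r=r, theta=theta, fill='toself')
```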



In [None]:
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans
from sklearn.metrics.pairwise import cosine_similarity
my_features = my_songs[['danceability', 'energy', 'loudness','speechiness',
                             'valence', 'tempo']]
sad_features = sads[['danceability', 'energy', 'loudness','speechiness',
                             'valence', 'tempo']]
# min-max scale each feature set, reusing that frame's own column names
# (the original referenced an undefined `features_only`)
sad_features = pd.DataFrame(scaler.fit_transform(sad_features), columns=sad_features.columns)

upbeat_features = upbeats[['danceability', 'energy', 'loudness','speechiness',
                             'valence', 'tempo']]
upbeat_features = pd.DataFrame(scaler.fit_transform(upbeat_features), columns=upbeat_features.columns)
# df_person = pd.read_csv('AudioFeaturesTable.csv')

# Normalize the audio features of the personal playlist
my_features = pd.DataFrame(scaler.fit_transform(my_features), columns=my_features.columns)
# Cluster songs in person's playlist
kmeans = KMeans(n_clusters=5)
kmeans.fit(my_features)
clusters = kmeans.predict(my_features)
centroids = kmeans.cluster_centers_

# Compute centroids for reference playlists

centroid_upbeats = upbeat_features.mean(axis=0).values



centroid_sads = sad_features.mean(axis=0).values

# Compute similarity scores for reference playlists
similarity_upbeats = cosine_similarity(centroids, centroid_upbeats.reshape(1, -1))
similarity_sads = cosine_similarity(centroids, centroid_sads.reshape(1, -1))

# each value is the sum of cosine similarities over the 5 cluster centroids
print('Similarity to upbeat reference playlists:', similarity_upbeats.sum())
print('Similarity to sad reference playlists:', similarity_sads.sum())
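
For reference, cosine similarity is the dot product of two vectors divided by the product of their lengths, so it measures direction rather than magnitude. The pure-Python sketch below spells that out on made-up vectors; `cosine` is a hypothetical helper for illustration, not the scikit-learn function used above.

```python
import math


def cosine(a, b):
    """Cosine similarity: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


print(cosine([1.0, 0.0], [0.0, 1.0]))  # perpendicular vectors
print(cosine([1.0, 2.0], [2.0, 4.0]))  # same direction, different length
print(cosine([0.8, 0.9], [0.2, 0.3]))  # made-up min-max scaled feature vectors
```

Two points follow for the cell above. Min-max scaled features are never negative, so every similarity falls between 0 and 1. And because the printed score sums over five cluster centroids, its maximum is 5, not 1.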
