# Valence-Arousal Recommendation System

In this notebook, we use a dataset of around 11,000 Spotify tracks, each with a `valence` and an `energy` value (with `energy` standing in for arousal), to build a simple mood-based recommendation system.

In [8]:
import pandas as pd
import random
import authorization
import numpy as np
from numpy.linalg import norm

## 1. Preparations

__Load Data__

In [3]:
df = pd.read_csv("valence_arousal_dataset.csv")
print(df.shape)
df.head()

(11091, 6)


Unnamed: 0,id,genre,track_name,artist_name,valence,energy
0,7DneDMOlQ1ol2bd2tWukd2,acoustic,Mercy - Radio 1 Live Lounge Session (Duffy cover),OneRepublic,0.815,0.53
1,2CjMm3TDd9BS8xAcvbe6yY,acoustic,Let Her Go (feat. Hannah Trigwell),Boyce Avenue,0.307,0.333
2,0O36Yqb2aFcThBphczYoAY,acoustic,A Drop In The Ocean,Ron Pope,0.473,0.452
3,5GCgC77m6EaAqu7ZlukMu2,acoustic,Umbrella,Train,0.627,0.849
4,37z3ghNvcoPvHypKWTb2Sz,acoustic,Left for America,Ciaran Lavery,0.25,0.384


To compute the distance between two tracks, we combine the separate `valence` and `energy` columns into a single `mood_vec` column.
This is done by selecting both columns and converting each row to a list with `.values.tolist()`.

__Create Mood Vector__

In [4]:
df["mood_vec"] = df[["valence", "energy"]].values.tolist()
df["mood_vec"].head()

0     [0.815, 0.53]
1    [0.307, 0.333]
2    [0.473, 0.452]
3    [0.627, 0.849]
4     [0.25, 0.384]
Name: mood_vec, dtype: object
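
As a small illustration of the step above, the following sketch builds the same kind of column on a toy DataFrame containing the first three rows shown earlier. The names `toy_df` and `mood_matrix` are assumptions made for this example, and the 2-D array variant is an alternative that the notebook itself does not use.

```python
import pandas as pd

# Toy frame with the valence/energy values of the first three rows shown above
toy_df = pd.DataFrame({
    "valence": [0.815, 0.307, 0.473],
    "energy": [0.530, 0.333, 0.452],
})

# Same pattern as in the cell above: one [valence, energy] list per row
toy_df["mood_vec"] = toy_df[["valence", "energy"]].values.tolist()

# Alternative (assumption, not used in the notebook): a 2-D array of shape
# (n_tracks, 2), which is convenient for vectorized distance computations
mood_matrix = toy_df[["valence", "energy"]].to_numpy()

print(toy_df["mood_vec"].tolist())
print(mood_matrix.shape)
```

Storing lists in a column keeps each track's vector next to its metadata, while the array form tends to be faster when distances to all tracks are computed at once.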

__Authorize Spotify API Access__

In [22]:
sp = authorization.authorize() # Use the authorization script provided earlier in the blog post

## 2. Recommendation Algorithm

The algorithm that finds tracks similar to a given input track is now very simple:
1. Fetch the track's `valence` and `energy` values from the Spotify API.
2. Compute the Euclidean distance from the input track to each track in the reference dataset.
3. Sort the reference tracks from lowest to highest distance.
4. Return the `n` most similar tracks.

In [45]:
def recommend(track_id, ref_df, sp, n_recs = 5):
    
    # Fetch valence and energy (arousal) of the given track from the Spotify API
    track_features = sp.track_audio_features(track_id)
    track_moodvec = np.array([track_features.valence, track_features.energy])
    print(f"mood_vec for {track_id}: {track_moodvec}")
    
    # Compute distances to all reference tracks
    ref_df["distances"] = ref_df["mood_vec"].apply(lambda x: norm(track_moodvec-np.array(x)))
    # Sort distances from lowest to highest
    ref_df_sorted = ref_df.sort_values(by = "distances", ascending = True)
    # If the input track is in the reference set, it has a distance of 0 but should not be recommended
    ref_df_sorted = ref_df_sorted[ref_df_sorted["id"] != track_id]
    
    # Return n recommendations
    return ref_df_sorted.iloc[:n_recs]
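
For trying out the ranking logic without Spotify credentials, a variant that takes the mood vector directly can be handy. The sketch below is not the notebook's function: `recommend_from_vector`, its `exclude_id` parameter, and the toy ids are hypothetical names introduced here. It also computes all distances in one vectorized call and leaves the caller's DataFrame unchanged.

```python
import numpy as np
import pandas as pd


def recommend_from_vector(mood_vec, ref_df, n_recs=5, exclude_id=None):
    """Rank reference tracks by Euclidean distance to a known mood vector.

    Hypothetical helper: it skips the Spotify API call and returns a new
    DataFrame, so ref_df is not modified.
    """
    target = np.asarray(mood_vec, dtype=float)
    ref_matrix = ref_df[["valence", "energy"]].to_numpy(dtype=float)

    # One vectorized norm over all rows instead of a row-wise apply
    distances = np.linalg.norm(ref_matrix - target, axis=1)

    ranked = ref_df.assign(distances=distances).sort_values("distances")
    if exclude_id is not None:
        # Drop the input track itself if it is part of the reference set
        ranked = ranked[ranked["id"] != exclude_id]
    return ranked.head(n_recs)


# Toy reference set; the ids are placeholders, not real Spotify ids
toy_df = pd.DataFrame({
    "id": ["a", "b", "c"],
    "valence": [0.815, 0.307, 0.473],
    "energy": [0.530, 0.333, 0.452],
})

top = recommend_from_vector([0.30, 0.35], toy_df, n_recs=2)
print(top)
```

With roughly 11,000 reference tracks either approach is fast enough. The vectorized form mainly helps if the reference set grows or many queries are made.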

Let us try it out using some random tracks from our dataset.

In [47]:
track1 = random.choice(df["id"])
recommend(track_id = track1, ref_df = df, sp = sp, n_recs = 5)

mood_vec for 2eFpNOUGhlyScJIJwbq1E7: [0.221 0.998]


Unnamed: 0,id,genre,track_name,artist_name,valence,energy,mood_vec,distances
8551,1R1yJO9oVNcq1ae8AZH0kT,punk,Quicksand,The Story So Far,0.223,0.995,"[0.223, 0.995]",0.003606
4068,3VFm6SnmdlTR5AtW5lNbuu,grindcore,Repression Out Of Uniform,Napalm Death,0.221,0.994,"[0.221, 0.994]",0.004
636,5siwqJepQkbKKaAmLRveYY,black-metal,Retribution - Storm Of The Light's Bane,Dissection,0.223,0.993,"[0.223, 0.993]",0.005385
7051,7nPQ3oS7C3opzuxofBhw5k,metalcore,Through the Darkest Dark and Brightest Bright,We Came As Romans,0.217,0.992,"[0.217, 0.992]",0.007211
4669,3ha9AklaYxF0TjACgmtAin,hardcore,Billy,Bad Religion,0.23,0.993,"[0.23, 0.993]",0.010296
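
The `distances` column can be checked by hand. The sketch below copies the mood vectors and reported distances from the output above and recomputes the Euclidean distance to the input track's mood vector. The variable names are chosen for this example only.

```python
import numpy as np

# Mood vector of the input track, as printed above
target = np.array([0.221, 0.998])

# valence/energy of the five recommended tracks, copied from the table above
recs = np.array([
    [0.223, 0.995],
    [0.221, 0.994],
    [0.223, 0.993],
    [0.217, 0.992],
    [0.230, 0.993],
])

# Distances as reported in the table above
reported = np.array([0.003606, 0.004, 0.005385, 0.007211, 0.010296])

# Euclidean distance of each recommended track to the input track
distances = np.linalg.norm(recs - target, axis=1)
print(np.round(distances, 6))
```

For the first row, the difference is `[-0.002, 0.003]`, so the distance is `sqrt(0.002**2 + 0.003**2)`, which is about 0.003606.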


In [48]:
mad_world = "3JOVTQ5h8HGFnDdp4VT3MP"
recommend(track_id = mad_world, ref_df = df, sp = sp, n_recs = 5)

mood_vec for 3JOVTQ5h8HGFnDdp4VT3MP: [0.304  0.0581]


Unnamed: 0,id,genre,track_name,artist_name,valence,energy,mood_vec,distances
4980,2WP3DiCNEhMu7yNG8CxORz,holidays,Glory Manger,Harry Belafonte,0.306,0.0535,"[0.306, 0.0535]",0.005016
2703,0tgZmJoPVxLuQ4T6bILgSs,disney,Just When I Brought You a Mother / Banished,Various Artists,0.316,0.0478,"[0.316, 0.0478]",0.015814
1583,4dHEQ1W1jRmNjjB0S7vB8V,chill,re:stacks,Bon Iver,0.304,0.0813,"[0.304, 0.0813]",0.0232
1707,1Org5XClvfulvp1mrLUQuT,classical,"Symphonie fantastique, Op. 14: IV. Marche au s...",Hector Berlioz,0.285,0.0765,"[0.285, 0.0765]",0.026449
7897,1XXAqff63KKumpw6plBqgL,piano,O My Father,Paul Cardall,0.297,0.031,"[0.297, 0.031]",0.027989


In [49]:
rosanna = "37BTh5g05cxBIRYMbw8g2T"
recommend(track_id = rosanna, ref_df = df, sp = sp, n_recs = 5)

mood_vec for 37BTh5g05cxBIRYMbw8g2T: [0.739 0.513]


Unnamed: 0,id,genre,track_name,artist_name,valence,energy,mood_vec,distances
6590,7ctLV5QnYVq3V89EK3XNsa,latin,Sentimientos De Cartón,Duelo,0.74,0.504,"[0.74, 0.504]",0.009055
6514,2lENjjnvVaHFdfs7Zz02lp,kids,Jack Tar on Shore - with Broken Social Scene,Various Artists,0.747,0.507,"[0.747, 0.507]",0.01
10474,0fbyHCbGs4F9B8I2BqAvOq,swedish,Ett Snedsteg Bort,Kapten Röd,0.727,0.512,"[0.727, 0.512]",0.012042
6548,17LxkTp8UNbPcYrDrI6UOq,latin,Contigo Siempre,Alejandro Fernández,0.731,0.503,"[0.731, 0.503]",0.012806
11054,2qudLQVpY4JV1eq7ned5v1,world-music,Amor Verdadero,Afro-Cuban All Stars,0.745,0.501,"[0.745, 0.501]",0.013416
