# Spotify Mood Session

## 1. Algorithm

a. **First Selection**. The user selects a song $s_0$ from the set of all songs $S$ by searching (by name or artist) and plays that song.

b. **Played List**. Let $P$ be a new list of played songs, initialised as $P \gets \{ s_0 \}$.

c. **Cold Start**. Draw a random bootstrap sample $R$ of $n$ songs from $S$. Calculate the distance between each song in $R$ and the first selection, and select the song closest to it. Formally:

$$
s^* = \arg\min_{s_i \in R} \text{dist}(s_i, s_0)
$$

*Potential parameter: add the $k$ closest songs, rather than only the closest one, to the initial bootstrap played list.*

d. **Rating**. A song that the user plays for more than 30 seconds is rated $\text{pos}$; a song skipped within the first 30 seconds is rated $\text{neg}$. Songs that have not been played are rated $\text{non}$.

e. **Recommending**. Take the current set of song vectors and select the next song using an aggregated mean or a nearest-neighbour method.

Repeat (d) and (e) indefinitely.
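Steps (d) and (e) can be sketched as follows. The algorithm leaves the vector representation and the aggregation open, so this sketch makes assumptions: `features` is a hypothetical array with one row of numeric features per song, the aggregation is the mean of the positively rated songs, and the distance is Euclidean. The helper names `rate` and `recommend` are illustrative only.

```python
import numpy as np

SKIP_THRESHOLD_S = 30  # threshold from step (d)

def rate(seconds_played):
    """Map listening time to a rating; None means the song was never played."""
    if seconds_played is None:
        return "non"
    return "pos" if seconds_played > SKIP_THRESHOLD_S else "neg"

def recommend(features, ratings):
    """Return the index of the unplayed song nearest to the mean of the
    positively rated songs, or None if every song has been played."""
    ratings = np.asarray(ratings)
    unplayed = np.flatnonzero(ratings == "non")
    if unplayed.size == 0:
        return None
    # Aggregated mean of the songs the user liked (assumed aggregation)
    centroid = features[ratings == "pos"].mean(axis=0)
    # Euclidean distance from each unplayed song to that mean (assumed metric)
    dists = np.linalg.norm(features[unplayed] - centroid, axis=1)
    return int(unplayed[np.argmin(dists)])

# Toy data: 4 songs with 2 made-up features each
features = np.array([[0.0, 0.0], [1.0, 1.0], [0.2, 0.1], [4.0, 4.0]])
seconds = [200, 5, None, None]       # song 0 played, song 1 skipped
ratings = [rate(t) for t in seconds]
nxt = recommend(features, ratings)
print(ratings, nxt)
```

Because the first selection is always rated $\text{pos}$, the set of positively rated songs is never empty when `recommend` is called in this loop.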

Formally, steps (a) to (c) are:

$$
\text{let: } s_0 \gets \text{ user}(S) \\
\text{let: } P = \{ s_0 \} \\
\text{let: } R \subset S, \ |R| = n \\
\text{let: } D = \emptyset \\
\text{for } s_i \text{ in } R:\\
\quad D_i = \text{dist}(s_i, s_0) \\
\text{end for }\\
\text{select } s_j \text{ with } j = \arg\min_i D_i \\
$$
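The cold-start selection can be sketched in code. The playlist data used in this notebook only carries metadata, so the feature vectors are an assumption: `features` is a hypothetical array with one row per song. The function name `cold_start`, the parameter `k` and the Euclidean distance are illustrative choices, not a fixed part of the algorithm.

```python
import numpy as np

def cold_start(features, s0, n, k=1, seed=None):
    """Return the indices of the k songs in a random sample R closest to s0.

    features: hypothetical (num_songs, num_features) array, one row per song.
    s0: row index of the user's first selection.
    n: size of the bootstrap sample R.
    """
    rng = np.random.default_rng(seed)
    # The candidate pool excludes the first selection itself
    candidates = np.delete(np.arange(len(features)), s0)
    # R: random sample of (at most) n songs, drawn without replacement
    R = rng.choice(candidates, size=min(n, len(candidates)), replace=False)
    # D_i = dist(s_i, s_0), here Euclidean (assumed metric)
    D = np.linalg.norm(features[R] - features[s0], axis=1)
    # Closest songs have the smallest distances
    return R[np.argsort(D)[:k]]

# Toy data: 6 songs with 2 made-up features each
features = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0],
                     [0.0, 0.3], [9.0, 1.0], [2.0, 2.0]])
closest = cold_start(features, s0=0, n=5, k=2, seed=0)
print(closest)
```

With `k=1` this reduces to selecting the single closest song; a larger `k` seeds the played list with several nearby songs.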

In [24]:
import json
import pandas as pd
import os
from IPython.display import display

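# Load the playlists from the first few entries of the data directory
playlists = []
path = 'data'
max_files = 5  # limit how many directory entries are read
for filename in os.listdir(path)[:max_files]:
    f = os.path.join(path, filename)
    if os.path.isfile(f):
        # Use a context manager so the file handle is closed after reading
        with open(f) as fh:
            d = json.load(fh)
        playlists.append(pd.DataFrame(d['playlists']))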

playlists = pd.concat(playlists)
playlists = playlists.reset_index(drop=True)
print(playlists.shape)

(5000, 12)


In [25]:
# Add all songs from the playlists into a single list of songs
songs = [track for tracks in playlists['tracks'] for track in tracks]

songs_df = pd.DataFrame(songs)
print(songs_df.shape)
songs_df.head()

(334487, 8)


Unnamed: 0,pos,artist_name,track_uri,artist_uri,track_name,album_uri,duration_ms,album_name
0,0,Original Broadway Cast - The Little Mermaid,spotify:track:5IbCV9Icebx8rR6wAp5hhP,spotify:artist:3TymzPhJTMyupk7P5xkahM,Fathoms Below - Broadway Cast Recording,spotify:album:3ULJeOMgroG27dpn27MDfS,154506,The Little Mermaid: Original Broadway Cast Rec...
1,1,Original Broadway Cast - The Little Mermaid,spotify:track:6rKVAvjHcxAzZ1BHtwh5yC,spotify:artist:3TymzPhJTMyupk7P5xkahM,Daughters Of Triton - Broadway Cast Recording,spotify:album:3ULJeOMgroG27dpn27MDfS,79066,The Little Mermaid: Original Broadway Cast Rec...
2,2,Original Broadway Cast - The Little Mermaid,spotify:track:6Jlkb1Wh08RYHstWScsTvg,spotify:artist:3TymzPhJTMyupk7P5xkahM,The World Above - Broadway Cast Recording,spotify:album:3ULJeOMgroG27dpn27MDfS,94600,The Little Mermaid: Original Broadway Cast Rec...
3,3,Original Broadway Cast - The Little Mermaid,spotify:track:0XhC8bfStML9ygBmfOt1JJ,spotify:artist:3TymzPhJTMyupk7P5xkahM,Human Stuff - Broadway Cast Recording,spotify:album:3ULJeOMgroG27dpn27MDfS,151480,The Little Mermaid: Original Broadway Cast Rec...
4,4,Original Broadway Cast - The Little Mermaid,spotify:track:0ABxAcsRWlqckkyONsfP67,spotify:artist:3TymzPhJTMyupk7P5xkahM,I Want the Good Times Back - Broadway Cast Rec...,spotify:album:3ULJeOMgroG27dpn27MDfS,297920,The Little Mermaid: Original Broadway Cast Rec...


In [26]:
# Remove duplicate songs by URI (the original index labels are kept, so they are no longer contiguous)
songs_df = songs_df.drop_duplicates('track_uri')

In [27]:
# Add a score placeholder column to the songs, initialised to 0
songs_df.insert(len(songs_df.columns), "class", 0)  # Append "class" as the last column

In [28]:
# Print song data to ensure the score was added
print(songs_df.iloc[0])

pos                                                            0
artist_name          Original Broadway Cast - The Little Mermaid
track_uri                   spotify:track:5IbCV9Icebx8rR6wAp5hhP
artist_uri                 spotify:artist:3TymzPhJTMyupk7P5xkahM
track_name               Fathoms Below - Broadway Cast Recording
album_uri                   spotify:album:3ULJeOMgroG27dpn27MDfS
duration_ms                                               154506
album_name     The Little Mermaid: Original Broadway Cast Rec...
class                                                          0
Name: 0, dtype: object


In [51]:
# Search for a particular song
print("Search for a song name")
s = input()

# Filter for rows whose track name contains the string entered by the user
# (case-sensitive, literal match)
results = songs_df[songs_df.track_name.str.contains(s, regex=False, na=False)]
results

Search for a song name
leon


Unnamed: 0,pos,artist_name,track_uri,artist_uri,track_name,album_uri,duration_ms,album_name,class
14511,147,Culture Club,spotify:track:2wSAWEYUHkt92X4SBAPqZE,spotify:artist:6kz53iCdBSqhQCZ21CoLcc,Karma Chameleon - 2002 - Remaster,spotify:album:51NPMfa9QfxsYtqzcB2VfY,252773,Colour By Numbers,0
54000,12,Freedom Fry,spotify:track:2rkJBc8XbmMFpdNf5TLcJr,spotify:artist:195hFqaTDENqLCcG8uGtM7,Napoleon,spotify:album:43aNIk2XCYEK3liM60kDWv,202637,Napoleon,0
169846,102,Shaggy,spotify:track:3T3PZrwCEhWnfxL8H8b3PZ,spotify:artist:5EvFsr3kj42KNv97ZEnqij,I Need Your Love (Don Corleon Dancehall Remix),spotify:album:3bB0arzOnJWYtHVRWUNXdX,212509,I Need Your Love (Don Corleon Dancehall Remix),0
206164,34,Yuri,spotify:track:4Wie4t5LeGwqQKdU405itH,spotify:artist:4OgNARLQSC4yy7Dsa5cqxx,Karma Camaleon,spotify:album:1ZEQ9ACZzt6mGTM1oNW9XT,247866,Viva La Diva,0
249325,72,At The Drive In,spotify:track:34q1w3Eh0vocZj0FOtVQAS,spotify:artist:5E2rtn57BM2WPjwak4kGd5,Napoleon Solo,spotify:album:35qZXJifEQcpWnKP6E4oNv,288066,In / Casino / Out,0
287341,23,KING,spotify:track:70JSnplpMA0L55S7A5K8pG,spotify:artist:0FPWyyf6MD4QZTj3aypD3O,Mister Chameleon,spotify:album:3FYKiMNG19UUdbs8xhpZc7,215468,We Are KING,0
300765,30,Pnau,spotify:track:4pbLMXPtU8ruMCMPmQNY4q,spotify:artist:6n28c9qs9hNGriNa72b26u,Chameleon,spotify:album:4zZhV656BJMvD2hSAveA91,198020,Changa,0
329237,4,Grey,spotify:track:2M0PF3WQt38vwoKjay5Ioh,spotify:artist:4lDBihdpMlOalxy1jkUbPl,Chameleon,spotify:album:1jF5ctJkMohGQj087HI78I,186512,Chameleon,0


In [53]:
# Ask the user to enter a track URI for the first song they want to play
print("Select your first song by copying the track_uri from above (e.g. spotify:track:2wSAWEYUHkt92X4SBAPqZE)")
s = input()

# Look up the chosen song, which is to be marked as "Played" with a positive rating (as the user chose it)
index = songs_df.index[songs_df['track_uri'] == s]     # Index labels of the rows whose URI is 's'
print(index)
songs_df.loc[index]     # Label-based lookup (.iloc is positional, and the index has gaps after drop_duplicates)


Select your first song by copying the track_uri from above (e.g. spotify:track:2wSAWEYUHkt92X4SBAPqZE)
spotify:track:2rkJBc8XbmMFpdNf5TLcJr
Int64Index([54000], dtype='int64')


Unnamed: 0,pos,artist_name,track_uri,artist_uri,track_name,album_uri,duration_ms,album_name,class
54000,12,Freedom Fry,spotify:track:2rkJBc8XbmMFpdNf5TLcJr,spotify:artist:195hFqaTDENqLCcG8uGtM7,Napoleon,spotify:album:43aNIk2XCYEK3liM60kDWv,202637,Napoleon,0
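The selected song can then be recorded as played. The following is a minimal sketch, assuming ratings are encoded in the `class` column as 1 ($\text{pos}$), -1 ($\text{neg}$) and 0 ($\text{non}$). This encoding and the helper `mark_rating` are illustrative assumptions, and the toy frame with placeholder URIs stands in for `songs_df`. Its index labels are deliberately non-contiguous, as they are after `drop_duplicates`, which is why the update is label-based (`.loc`) rather than positional (`.iloc`).

```python
import pandas as pd

RATING_CODES = {"pos": 1, "neg": -1, "non": 0}  # assumed encoding

def mark_rating(df, track_uri, rating):
    """Set the class of the song with the given URI, in place."""
    mask = df["track_uri"] == track_uri
    if not mask.any():
        raise KeyError(f"unknown track URI: {track_uri}")
    # Boolean mask with .loc selects rows by label, not by position
    df.loc[mask, "class"] = RATING_CODES[rating]

# Toy stand-in for songs_df with placeholder URIs and gaps in the index
toy = pd.DataFrame(
    {"track_uri": ["spotify:track:a", "spotify:track:b", "spotify:track:c"],
     "track_name": ["A", "B", "C"],
     "class": [0, 0, 0]},
    index=[0, 7, 54000],
)
mark_rating(toy, "spotify:track:c", "pos")

# The played list P is then the set of rows with a non-zero class
played = toy[toy["class"] != 0]
print(played)
```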
