# Spotify Mood Session

## 1. Algorithm

a. **First Selection**. User selects a song $s_0$ from the set of all songs $S$ using some kind of search method (by name / artist) and plays that song.

b. **Played List**.  Let $P$ be a new list of played songs and let $P \gets s_0$   

c. **Cold Start**. Create a bootstrap random selection of $n$ songs from $S$ called $R$. Calculate the distance between each song and the first selection and select the closest to the chosen song . Formally:

$$
\forall s \in R \text{ dist}(s_i, s_0)
$$

*potential parameter: use a value to add the closest n songs to the initial bootstrap played list*

d. **Rating**. For every song that a user plays for more than 30 seconds, the song is rated $\text{pos}$, songs that are skipped within this time period are rated $\text{neg}$.  Songs that have not been played are rated $\text{non}$

e. **Recommending**.  Take the current set of vectors and select the next song based on some sort of agregated mean or nearest neighbour system.

repeat (d) and (e) ad infinitum.

Formally:

$$
\text{let: } s_0 \gets \text{ user}(S) \\
\text{let: } P = \{ s_0 \} \\
\text{let: } R \subset S \\
\text{let: } D = \emptyset \\ 
\text{for } s_i \text{ in } R:\\
D_i = \text{dist}(s_i, s_0) \\
\text{end for }\\
\text{select}(\text{max}(D))\\
$$

In [17]:
import json
import pandas as pd
import os
from IPython.display import display

# create list of playlists
playlists = []
path = 'data'
count = 5
for filename in os.listdir(path):
    count -= 1
    if count < 0:
        break
    f = os.path.join(path, filename)
    if os.path.isfile(f):
        d = json.load(open(f))
        playlists.append(pd.DataFrame(d['playlists']))

playlists = pd.concat(playlists)
playlists = playlists.reset_index(drop=True)
print(playlists.shape)

(5000, 12)


In [23]:
# Add all songs from the playlists into a single list of songs
songs = [] 
for i in range(len(playlists)):
    tracks = playlists.iloc[1].loc['tracks']
    for track in tracks:
        songs.append(track)

print(len(songs))
print(songs[0])

130000
{'pos': 0, 'artist_name': 'Erykah Badu', 'track_uri': 'spotify:track:6AaQoliicVSZhobw0OSSM0', 'artist_uri': 'spotify:artist:7IfculRW2WXyzNQ8djX8WX', 'track_name': 'Hello', 'album_uri': 'spotify:album:5q1BjSadVkASNdJ4neVmt6', 'duration_ms': 319958, 'album_name': 'But You Caint Use My Phone'}


In [24]:
songs_df = pd.DataFrame(songs)
print(songs_df.shape)

songs_df.head()

(130000, 8)


Unnamed: 0,pos,artist_name,track_uri,artist_uri,track_name,album_uri,duration_ms,album_name
0,0,Erykah Badu,spotify:track:6AaQoliicVSZhobw0OSSM0,spotify:artist:7IfculRW2WXyzNQ8djX8WX,Hello,spotify:album:5q1BjSadVkASNdJ4neVmt6,319958,But You Caint Use My Phone
1,1,Erykah Badu,spotify:track:74HYrIbnpc2xKCTenv5qKM,spotify:artist:7IfculRW2WXyzNQ8djX8WX,Window Seat,spotify:album:1MOub955Uer957RVqqkF2a,289720,New Amerykah Part Two: Return Of The Ankh
2,2,Erykah Badu,spotify:track:2BZDLYv27Jzi3fC01K9E8O,spotify:artist:7IfculRW2WXyzNQ8djX8WX,Honey,spotify:album:0Rq4F1xHOjSTIHF9YziK1F,320986,New Amerykah Part One (4th World War)
3,3,Erykah Badu,spotify:track:1MCem6JigI6jgQPMgriKbU,spotify:artist:7IfculRW2WXyzNQ8djX8WX,Tyrone - Live Version,spotify:album:7Cg83CbNY30zxin7u5zwSX,221000,Live
4,4,Drake,spotify:track:4eSGSqP2TZvvX0kadZZttM,spotify:artist:3TVXtAsR1Inumwj472S9r4,Doing It Wrong,spotify:album:6X1x82kppWZmDzlXXK3y3q,265120,Take Care


In [25]:
# Remove duplicate songs by URI
songs_df.drop_duplicates('track_name')

Unnamed: 0,pos,artist_name,track_uri,artist_uri,track_name,album_uri,duration_ms,album_name
0,0,Erykah Badu,spotify:track:6AaQoliicVSZhobw0OSSM0,spotify:artist:7IfculRW2WXyzNQ8djX8WX,Hello,spotify:album:5q1BjSadVkASNdJ4neVmt6,319958,But You Caint Use My Phone
1,1,Erykah Badu,spotify:track:74HYrIbnpc2xKCTenv5qKM,spotify:artist:7IfculRW2WXyzNQ8djX8WX,Window Seat,spotify:album:1MOub955Uer957RVqqkF2a,289720,New Amerykah Part Two: Return Of The Ankh
2,2,Erykah Badu,spotify:track:2BZDLYv27Jzi3fC01K9E8O,spotify:artist:7IfculRW2WXyzNQ8djX8WX,Honey,spotify:album:0Rq4F1xHOjSTIHF9YziK1F,320986,New Amerykah Part One (4th World War)
3,3,Erykah Badu,spotify:track:1MCem6JigI6jgQPMgriKbU,spotify:artist:7IfculRW2WXyzNQ8djX8WX,Tyrone - Live Version,spotify:album:7Cg83CbNY30zxin7u5zwSX,221000,Live
4,4,Drake,spotify:track:4eSGSqP2TZvvX0kadZZttM,spotify:artist:3TVXtAsR1Inumwj472S9r4,Doing It Wrong,spotify:album:6X1x82kppWZmDzlXXK3y3q,265120,Take Care
5,5,Drake,spotify:track:74atKkOasLOVzvqB6mYgga,spotify:artist:3TVXtAsR1Inumwj472S9r4,The Real Her,spotify:album:6X1x82kppWZmDzlXXK3y3q,321080,Take Care
6,6,Drake,spotify:track:7JXZq0JgG2zTrSOAgY8VMC,spotify:artist:3TVXtAsR1Inumwj472S9r4,Jungle,spotify:album:0ptlfJfwGTy0Yvrk14JK1I,320400,If You're Reading This It's Too Late
7,7,OutKast,spotify:track:2faSzprTWJ7L1EkZNko4ww,spotify:artist:1G9G7WwrXka3Z1r7aIDjI7,Take Off Your Cool,spotify:album:1UsmQ3bpJTyK6ygoOOjG1r,158093,Speakerboxxx/The Love Below
8,8,Lianne La Havas,spotify:track:6FZTAXHzXjkZxhfiHeLG6n,spotify:artist:2RP4pPHTXlQpDnO9LvR7Yt,Lost & Found,spotify:album:5202ndVOTQnaBapl3YjNme,267746,Is Your Love Big Enough?
9,9,Lianne La Havas,spotify:track:77GGyeCfAbhPlWjNDCwwr4,spotify:artist:2RP4pPHTXlQpDnO9LvR7Yt,Gone,spotify:album:5202ndVOTQnaBapl3YjNme,264973,Is Your Love Big Enough?
