#### Project Goal: create insightful visualizations of my Spotify streaming history for the past year [07.02.2020 - 07.02.2021].

I requested my data from Spotify on 07.02.2021 and received it as .json files; here we will use only the 'Streaming History' files. In the code below I clean, manipulate and store this data as a pandas DataFrame. I also collect additional information on the tracks' audio features from Spotify's API and save the final result in a csv file.

In [1]:
#Function for getting the streaming history from the 'StreamingHistory' .json files
import json
import os
from typing import List

def get_streamings(path: str = 'SpotifyData') -> List[dict]:
    
    #Build each file path from `path` so that the argument is actually respected
    files = [os.path.join(path, x) for x in os.listdir(path)
             if x.split('.')[0][:-1] == 'StreamingHistory']
    
    all_streamings = []
    
    for file in files: 
        with open(file, 'r', encoding='UTF-8') as f:
            #The files are JSON, so parse them with json rather than ast.literal_eval
            all_streamings += json.load(f)
    return all_streamings

In [2]:
streamings = get_streamings()

In [33]:
import pandas as pd
df = pd.DataFrame(streamings)
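
For orientation, here is a minimal sketch of what one of these files seems to contain, judging by the columns of the resulting table: a JSON list of records with `endTime`, `artistName`, `trackName` and `msPlayed`. The two records are copied from rows of my own table; the names `sample` and `sample_df` are only for illustration.

```python
import json
import pandas as pd

# Two records in the same shape as the StreamingHistory entries
sample = '''[
  {"endTime": "2020-11-18 17:01", "artistName": "The Cure",
   "trackName": "Lullaby", "msPlayed": 0},
  {"endTime": "2020-07-07 12:41", "artistName": "Phosphorescent",
   "trackName": "Song for Zula", "msPlayed": 822658}
]'''

# A list of dicts becomes one row per dict, one column per key
records = json.loads(sample)
sample_df = pd.DataFrame(records)
print(sample_df)
```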

In [34]:
df = df.sort_values(by=['msPlayed'])
df

Unnamed: 0,endTime,artistName,trackName,msPlayed
28235,2020-12-30 22:38,Cage The Elephant,Come a Little Closer,0
24508,2020-11-18 16:59,Nina Simone,Suzanne,0
24511,2020-11-18 17:01,The Velvet Underground,Heroin,0
24515,2020-11-18 17:01,The Cure,Lullaby,0
24516,2020-11-18 17:01,Pulp,Underwear,0
...,...,...,...,...
23688,2020-11-12 03:13,Conspiracy Theories,Government Poisoned Alcohol Pt. 1,2789473
23689,2020-11-12 04:00,Conspiracy Theories,Government Poisoned Alcohol Pt. 2,2802247
23687,2020-11-12 02:26,Conspiracy Theories,Best of 2019: Watergate Pt. 2,2883775
18387,2020-08-03 23:23,LeVar Burton Reads,"""End Game"" by Nancy Kress",3171165


In [35]:
#Delete songs that were listened to for less than a minute
df = df[df.msPlayed >= 60000]

#Delete podcasts by assuming no song's duration is greater than 20 minutes
df = df[df.msPlayed <= 20 * 60 * 1000] #20 minutes in milliseconds
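
The two filters above can also be written as a single range check. This is only a sketch: `keep_probable_songs`, `MIN_MS` and `MAX_MS` are names I made up, and the toy rows reuse tracks and play times from my table.

```python
import pandas as pd

MIN_MS = 60 * 1000        # one minute, in milliseconds
MAX_MS = 20 * 60 * 1000   # twenty minutes, in milliseconds

def keep_probable_songs(plays: pd.DataFrame) -> pd.DataFrame:
    # between() is inclusive on both ends, matching the >= and <= used above
    mask = plays['msPlayed'].between(MIN_MS, MAX_MS)
    return plays[mask]

toy = pd.DataFrame({
    'trackName': ['Suzanne', 'ocean eyes',
                  'The Phantom Dark Age Pt. 2', '"End Game" by Nancy Kress'],
    'msPlayed': [0, 60160, 868247, 3171165],
})
kept = keep_probable_songs(toy)
print(kept['trackName'].tolist())
```

Note that a podcast episode played for less than 20 minutes still passes this filter, so the threshold alone cannot remove every podcast.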

In [36]:
#A quick look at the remaining table (still sorted by msPlayed) shows that my longest song, 'Dance Warriors', is 5th from last,
#so let's delete the last 4 rows
df = df[:-4]
df

Unnamed: 0,endTime,artistName,trackName,msPlayed
11204,2020-03-03 01:36,Darondo,Didn't I (Dave Allison Rework),60010
17062,2020-07-10 11:18,Bloody Hawk,Dachtylidi,60015
6355,2021-01-29 00:58,Etta James,I Just Want To Make Love To You - Single Version,60070
26326,2020-12-09 19:53,Angus Stone,Paper Aeroplane,60074
8559,2020-02-10 11:07,Billie Eilish,ocean eyes,60160
...,...,...,...,...
16847,2020-07-07 12:41,Phosphorescent,Song for Zula,822658
5438,2021-01-25 22:49,William Onyeabor,When the Going is Smooth & Good,844897
470,2021-01-02 09:09,Tara Brach,Meditation: A Listening Presence (2020-02-05),865690
23694,2020-11-12 06:53,Conspiracy Theories,The Phantom Dark Age Pt. 2,868247
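
Dropping rows by position depends on the sort order and on reading the table by eye. A possible alternative, sketched here, is to filter by show name. The set `PODCAST_SHOWS` and the helper `drop_podcasts` are my own assumptions; the list would have to be maintained by hand.

```python
import pandas as pd

# Assumption: a hand-maintained set of shows that I consider podcasts
PODCAST_SHOWS = {'Conspiracy Theories', 'LeVar Burton Reads', 'Tara Brach'}

def drop_podcasts(plays: pd.DataFrame, shows=PODCAST_SHOWS) -> pd.DataFrame:
    # Keep only rows whose artistName is NOT one of the listed shows
    return plays[~plays['artistName'].isin(shows)]

toy = pd.DataFrame({
    'artistName': ['Phosphorescent', 'Tara Brach', 'Conspiracy Theories'],
    'trackName': ['Song for Zula',
                  'Meditation: A Listening Presence (2020-02-05)',
                  'The Phantom Dark Age Pt. 2'],
})
songs_only = drop_podcasts(toy)
print(songs_only)
```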


In [37]:
#Keep only date from endTime
df['endTime'] = pd.to_datetime(df['endTime']).dt.normalize()

#Delete msPlayed column (won't need it anymore)
df = df.drop(['msPlayed'], axis=1)

#Rename columns 
df = df.rename(columns={"trackName": "Track", "artistName": "Artist", "endTime": "Date"})
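
As a small illustration of the date step: `dt.normalize()` keeps the column as a datetime but resets the time of day to midnight, which is why only the date shows up in the table. The two timestamps are taken from my data; the variable names are just for this example.

```python
import pandas as pd

stamps = pd.Series(['2020-03-03 01:36', '2021-01-29 00:58'])

# Parse the strings, then set every time of day to 00:00
dates = pd.to_datetime(stamps).dt.normalize()

print(dates.dt.strftime('%Y-%m-%d').tolist())  # ['2020-03-03', '2021-01-29']
```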

In [38]:
df

Unnamed: 0,Date,Artist,Track
11204,2020-03-03,Darondo,Didn't I (Dave Allison Rework)
17062,2020-07-10,Bloody Hawk,Dachtylidi
6355,2021-01-29,Etta James,I Just Want To Make Love To You - Single Version
26326,2020-12-09,Angus Stone,Paper Aeroplane
8559,2020-02-10,Billie Eilish,ocean eyes
...,...,...,...
16847,2020-07-07,Phosphorescent,Song for Zula
5438,2021-01-25,William Onyeabor,When the Going is Smooth & Good
470,2021-01-02,Tara Brach,Meditation: A Listening Presence (2020-02-05)
23694,2020-11-12,Conspiracy Theories,The Phantom Dark Age Pt. 2


In [9]:
#Get a list of all the unique tracks in the table
unique_tracks = list(set(df['Track']))

In [10]:
#Get my personal token for connecting to my Spotify developer app
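import os
import spotipy.util as util

username = '12160047479'
#Read the app credentials from environment variables instead of hard-coding them in the notebook
client_id = os.environ['SPOTIPY_CLIENT_ID']
client_secret = os.environ['SPOTIPY_CLIENT_SECRET']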
redirect_uri = 'http://localhost:7777/callback'
scope = 'user-read-recently-played'

token = util.prompt_for_user_token(username=username, 
                                   scope=scope, 
                                   client_id=client_id,   
                                   client_secret=client_secret,     
                                   redirect_uri=redirect_uri)

In [11]:
print(token)

BQDbS9ZT34Xc1gBCuwlEcRDwaSxgnl6z7Z1KjEcDTPOKW3XrK4C62C-ytrYZLaoR9VmHsejoJ1AGmIMUx3cPNtnWyWgKQ4HfwZygx6qBMoWIqUjgwa8nIqr9m3rKvVELCsOfkBXfQLr813Kkyc1nwGhU


In [12]:
#Function for getting the Track ID from Spotify given the track name and my token
import requests
from typing import Optional

def get_id(track_name: str, token: str) -> Optional[str]:
    headers = {
        'Accept': 'application/json',
        'Content-Type': 'application/json',
        'Authorization': f'Bearer {token}',
    }
    params = [
        ('q', track_name),
        ('type', 'track'),
    ]
    try:
        response = requests.get('https://api.spotify.com/v1/search', 
                    headers = headers, params = params, timeout = 5)
        results = response.json()
        #Take the first search result; the name alone may match another artist's track
        first_result = results['tracks']['items'][0]
        return first_result['id']
    except (requests.RequestException, KeyError, IndexError, ValueError):
        #Return None on network errors, error responses or empty results
        return None
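
Searching by track name alone can return a different artist's song with the same title. One possible refinement, sketched here and not used in the rest of this notebook, is to add the artist through the `track:` and `artist:` field filters of Spotify's search query syntax. The helper `build_search_params` is hypothetical, and the toy table stands in for the `Artist` and `Track` columns of `df`.

```python
import pandas as pd

def build_search_params(track_name: str, artist_name: str) -> list:
    # Narrow the search to a given track title by a given artist
    query = f'track:{track_name} artist:{artist_name}'
    return [('q', query), ('type', 'track'), ('limit', 1)]

# Toy stand-in for the Artist/Track columns of the streaming table
toy = pd.DataFrame({
    'Artist': ['The Cure', 'The Cure', 'Nina Simone'],
    'Track': ['Lullaby', 'Lullaby', 'Suzanne'],
})

# One search per unique (Artist, Track) pair rather than per track name
pairs = toy[['Artist', 'Track']].drop_duplicates()
all_params = [build_search_params(row.Track, row.Artist)
              for row in pairs.itertuples(index=False)]
print(all_params[0])
```

The resulting list could be passed as `params` to the same `requests.get` call used in `get_id`.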

In [13]:
#Test function above with a specific song in the table
lonely_id = get_id('Lonely', token)
print(lonely_id)

2ZEq4HT450Ye9IFGPTl9qV


In [14]:
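#Function for getting music features for each track 

import spotipy 
from typing import Optional

def get_features(track_id: Optional[str], token: str) -> Optional[dict]:
    #No ID means the search found nothing, so there is nothing to look up
    if track_id is None:
        return None
    sp = spotipy.Spotify(auth=token)
    try:
        features = sp.audio_features([track_id])
        return features[0]
    except Exception:
        return None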

In [15]:
#Test function above
lonely_features = get_features(lonely_id, token)
print(lonely_features)

{'danceability': 0.697, 'energy': 0.921, 'key': 5, 'loudness': -4.283, 'mode': 1, 'speechiness': 0.0424, 'acousticness': 0.0946, 'instrumentalness': 1.9e-05, 'liveness': 0.159, 'valence': 0.722, 'tempo': 123.988, 'type': 'audio_features', 'id': '2ZEq4HT450Ye9IFGPTl9qV', 'uri': 'spotify:track:2ZEq4HT450Ye9IFGPTl9qV', 'track_href': 'https://api.spotify.com/v1/tracks/2ZEq4HT450Ye9IFGPTl9qV', 'analysis_url': 'https://api.spotify.com/v1/audio-analysis/2ZEq4HT450Ye9IFGPTl9qV', 'duration_ms': 190955, 'time_signature': 4}


In [18]:
#Get the features for each unique track in the table and collect them in a list of dicts
all_features = {}
for track in unique_tracks:
    track_id = get_id(track, token)
    features = get_features(track_id, token)
    if features:
        all_features[track] = features
        
with_features = []
for track_name, features in all_features.items():
    with_features.append({'name': track_name, **features})
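
The loop above makes one features request per track. Since `audio_features` accepts a list of up to 100 IDs, the lookups could be batched. This is a rough sketch under that assumption: `chunks` and `get_features_batched` are names I made up, and `sp` is expected to be a `spotipy.Spotify` client.

```python
from typing import Dict, Iterator, List

def chunks(items: List[str], size: int = 100) -> Iterator[List[str]]:
    # Yield consecutive slices of at most `size` items
    for start in range(0, len(items), size):
        yield items[start:start + size]

def get_features_batched(sp, track_ids: List[str]) -> Dict[str, dict]:
    # Map each track ID to its features, skipping IDs with no features
    found = {}
    for batch in chunks(track_ids):
        for feats in sp.audio_features(batch):
            if feats:  # entries can be None when no features are available
                found[feats['id']] = feats
    return found

# How 250 IDs would be split into requests
print([len(batch) for batch in chunks([str(i) for i in range(250)])])  # [100, 100, 50]
```

The track IDs would still have to be collected first, for example with `get_id`, and mapped back to track names afterwards.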

In [25]:
features = pd.DataFrame(with_features)
features

Unnamed: 0,name,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,type,id,uri,track_href,analysis_url,duration_ms,time_signature
0,Together Onetime,0.740,0.8280,7,-8.151,1,0.0698,0.00762,0.558000,0.1120,0.7110,124.021,audio_features,3rWSVB8ImDF3x1P22txCPe,spotify:track:3rWSVB8ImDF3x1P22txCPe,https://api.spotify.com/v1/tracks/3rWSVB8ImDF3...,https://api.spotify.com/v1/audio-analysis/3rWS...,419015,4
1,When You Were Mine,0.694,0.4440,9,-13.254,1,0.0648,0.00657,0.776000,0.0719,0.8790,143.078,audio_features,6Kbkge4WbvwWv1jVzSQsr8,spotify:track:6Kbkge4WbvwWv1jVzSQsr8,https://api.spotify.com/v1/tracks/6Kbkge4WbvwW...,https://api.spotify.com/v1/audio-analysis/6Kbk...,225547,4
2,Stuck on the puzzle,0.580,0.6820,4,-8.523,1,0.0332,0.07920,0.031600,0.1610,0.7740,77.530,audio_features,3cUxncrTWSA9lhlQbuIwUY,spotify:track:3cUxncrTWSA9lhlQbuIwUY,https://api.spotify.com/v1/tracks/3cUxncrTWSA9...,https://api.spotify.com/v1/audio-analysis/3cUx...,211053,4
3,Don't Let Me Down - Remastered 2009,0.593,0.2890,4,-11.049,1,0.0259,0.56800,0.000829,0.0928,0.7330,77.119,audio_features,3evG0BIqEFMMP7lVJh1cSf,spotify:track:3evG0BIqEFMMP7lVJh1cSf,https://api.spotify.com/v1/tracks/3evG0BIqEFMM...,https://api.spotify.com/v1/audio-analysis/3evG...,215733,4
4,Oli I Rempetes Tou Ntounia [1937] - Όλοι οι ρε...,0.596,0.5500,0,-3.985,0,0.0352,0.73400,0.005880,0.0878,0.8960,156.843,audio_features,0yjkutezuqVOUyVNUdO5Sc,spotify:track:0yjkutezuqVOUyVNUdO5Sc,https://api.spotify.com/v1/tracks/0yjkutezuqVO...,https://api.spotify.com/v1/audio-analysis/0yjk...,191533,4
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3018,My Ships,0.755,0.2860,0,-14.017,1,0.0411,0.77000,0.003080,0.1130,0.5910,71.456,audio_features,35UT9ZB7UROy8W8JvMmzMD,spotify:track:35UT9ZB7UROy8W8JvMmzMD,https://api.spotify.com/v1/tracks/35UT9ZB7UROy...,https://api.spotify.com/v1/audio-analysis/35UT...,165827,4
3019,SKETCH 7,0.256,0.0188,0,-28.463,1,0.0351,0.99100,0.962000,0.0622,0.0765,70.459,audio_features,0DFkHoN7FKjLPRQ1RIOhDR,spotify:track:0DFkHoN7FKjLPRQ1RIOhDR,https://api.spotify.com/v1/tracks/0DFkHoN7FKjL...,https://api.spotify.com/v1/audio-analysis/0DFk...,206967,5
3020,Dynata (Homecoming),0.573,0.7500,11,-11.073,0,0.0478,0.63100,0.004950,0.6540,0.6760,88.177,audio_features,18xYnRWJ8sSu9lFXI6uyX5,spotify:track:18xYnRWJ8sSu9lFXI6uyX5,https://api.spotify.com/v1/tracks/18xYnRWJ8sSu...,https://api.spotify.com/v1/audio-analysis/18xY...,214507,4
3021,Girls Just Wanna Have Some,0.667,0.6290,0,-8.493,1,0.0324,0.17600,0.117000,0.0915,0.4540,114.999,audio_features,1WM80A5a4xDtlndjqjZQIv,spotify:track:1WM80A5a4xDtlndjqjZQIv,https://api.spotify.com/v1/tracks/1WM80A5a4xDt...,https://api.spotify.com/v1/audio-analysis/1WM8...,223237,4


In [39]:
#Delete columns I will not be using (drop returns a new dataframe, so assign the result)
features = features.drop(columns=['type','uri','track_href','analysis_url','duration_ms'])

#Rename the track column to match the dataframe above
features = features.rename(columns={"name": "Track"})

In [40]:
features

Unnamed: 0,Track,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,id,time_signature
0,Together Onetime,0.740,0.8280,7,-8.151,1,0.0698,0.00762,0.558000,0.1120,0.7110,124.021,3rWSVB8ImDF3x1P22txCPe,4
1,When You Were Mine,0.694,0.4440,9,-13.254,1,0.0648,0.00657,0.776000,0.0719,0.8790,143.078,6Kbkge4WbvwWv1jVzSQsr8,4
2,Stuck on the puzzle,0.580,0.6820,4,-8.523,1,0.0332,0.07920,0.031600,0.1610,0.7740,77.530,3cUxncrTWSA9lhlQbuIwUY,4
3,Don't Let Me Down - Remastered 2009,0.593,0.2890,4,-11.049,1,0.0259,0.56800,0.000829,0.0928,0.7330,77.119,3evG0BIqEFMMP7lVJh1cSf,4
4,Oli I Rempetes Tou Ntounia [1937] - Όλοι οι ρε...,0.596,0.5500,0,-3.985,0,0.0352,0.73400,0.005880,0.0878,0.8960,156.843,0yjkutezuqVOUyVNUdO5Sc,4
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
3018,My Ships,0.755,0.2860,0,-14.017,1,0.0411,0.77000,0.003080,0.1130,0.5910,71.456,35UT9ZB7UROy8W8JvMmzMD,4
3019,SKETCH 7,0.256,0.0188,0,-28.463,1,0.0351,0.99100,0.962000,0.0622,0.0765,70.459,0DFkHoN7FKjLPRQ1RIOhDR,5
3020,Dynata (Homecoming),0.573,0.7500,11,-11.073,0,0.0478,0.63100,0.004950,0.6540,0.6760,88.177,18xYnRWJ8sSu9lFXI6uyX5,4
3021,Girls Just Wanna Have Some,0.667,0.6290,0,-8.493,1,0.0324,0.17600,0.117000,0.0915,0.4540,114.999,1WM80A5a4xDtlndjqjZQIv,4


In [41]:
#Merge the two dataframes using Track as key
result = pd.merge(df, features, on="Track")
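
`pd.merge` does an inner join by default, so plays whose track has no row in `features` are silently dropped. A quick way to see which tracks are affected is a left join with `indicator=True`, sketched here on toy data. The valence number is a placeholder, not a real feature value.

```python
import pandas as pd

plays = pd.DataFrame({
    'Track': ['Lullaby', 'Suzanne', 'Lullaby'],
    'Artist': ['The Cure', 'Nina Simone', 'The Cure'],
})
# Placeholder feature table: only one of the two tracks was found
feats = pd.DataFrame({'Track': ['Lullaby'], 'valence': [0.5]})

# how='left' keeps every play; the _merge column records where each row matched
checked = pd.merge(plays, feats, on='Track', how='left', indicator=True)
missing = checked.loc[checked['_merge'] == 'left_only', 'Track'].unique().tolist()

print(missing)  # ['Suzanne']
```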

In [44]:
#Save final table as csv locally
result.to_csv('myspotify.csv')

I will visualize the myspotify.csv file using Tableau.