<img src="./music_analysis.png" alt="Logo" style="width: 250px;"/>

# Personal Spotify Music Trends Analysis

[This blog post](https://www.ivaylopavlov.com/personal-spotify-music-trend-analysis/) explains how to set up a Spotify Developer App and authenticate requests to the Web API so that the queries below succeed.

#### by Ivaylo Pavlov (12 Jan 2020)

In [221]:
pip install spotipy

Note: you may need to restart the kernel to use updated packages.


In [223]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import spotipy
from pandas import json_normalize  # pandas >= 1.0; the pandas.io.json import path is deprecated

### Get Top Artists

In [224]:
username='<username>'
scope = 'user-library-read user-top-read'
client_id = '<client_id>'
client_secret = '<client_secret>'
redirect_uri = '<redirect_uri>'

tk = spotipy.util.prompt_for_user_token(username, scope, client_id, client_secret, redirect_uri)



            User authentication requires interaction with your
            web browser. Once you enter your credentials and
            give authorization, you will be redirected to
            a url.  Paste that url you were directed to to
            complete the authorization.

        
Opened https://accounts.spotify.com/authorize?client_id=aa500e662d85432e9b23239504676dd9&response_type=code&redirect_uri=http%3A%2F%2Flocalhost%2Fcallback&scope=user-library-read+user-top-read in your browser


Enter the URL you were redirected to: http://localhost/callback?code=AQDnCw9spxK7XGr-LyLYsIVDWcvODIoHF56mRm6OFBpbk5mVC01ppGq69jvkOZ1liE06UIE4D00lpXqZXP4T_l6dVAgaBGLcydpa61q8igYdTFVSbMUhsUkNp6ssPmfxxIbFATYx1PdsAZZ9UM8dUxtf_vR-8kY8W1QXomKAKQIqkpFPWuSCcg9sopCQKWI-lFK7gsvhz7YTvG1w57Axt3MrN40lf_XwCvzhFisxjd4J
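
The URL opened in the browser and the URL pasted back are the two halves of an OAuth authorization-code exchange: the first carries the client ID, redirect URI and scopes in its query string, and the second returns a `code` parameter that is traded for a token. A minimal standard-library sketch of both steps follows. The helper names `build_authorize_url` and `extract_code` are hypothetical (they are not part of spotipy), and the client ID and code are placeholders.

```python
from urllib.parse import parse_qs, urlencode, urlparse

AUTHORIZE_ENDPOINT = "https://accounts.spotify.com/authorize"


def build_authorize_url(client_id, redirect_uri, scope):
    """Assemble the authorization URL (hypothetical helper, not a spotipy function)."""
    params = {
        "client_id": client_id,
        "response_type": "code",
        "redirect_uri": redirect_uri,
        "scope": scope,  # space-separated scopes; urlencode turns spaces into '+'
    }
    return f"{AUTHORIZE_ENDPOINT}?{urlencode(params)}"


def extract_code(redirected_url):
    """Pull the 'code' query parameter out of the URL the browser was redirected to."""
    query = parse_qs(urlparse(redirected_url).query)
    codes = query.get("code")
    if not codes:
        raise ValueError("no 'code' parameter in the redirect URL")
    return codes[0]


# Placeholder values, for illustration only
url = build_authorize_url(
    "my-client-id",
    "http://localhost/callback",
    "user-library-read user-top-read",
)
code = extract_code("http://localhost/callback?code=PLACEHOLDER_CODE")
print(url)
print(code)
```

In the notebook itself `prompt_for_user_token` performs both steps and the token exchange, so this sketch is only meant to show what the printed URLs contain.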




In [225]:
sp = spotipy.Spotify(auth=tk)

In [226]:
ARTISTS_SAMPLE_SIZE=20
ARTISTS_OFFSET=0

In [227]:
st_artists = sp.current_user_top_artists(limit=ARTISTS_SAMPLE_SIZE, offset=ARTISTS_OFFSET, time_range='short_term')
mt_artists = sp.current_user_top_artists(limit=ARTISTS_SAMPLE_SIZE, offset=ARTISTS_OFFSET, time_range='medium_term')
lt_artists = sp.current_user_top_artists(limit=ARTISTS_SAMPLE_SIZE, offset=ARTISTS_OFFSET, time_range='long_term')

In [228]:
short_term_top_artists_df = json_normalize(st_artists["items"])
medium_term_top_artists_df = json_normalize(mt_artists["items"])
long_term_top_artists_df = json_normalize(lt_artists["items"])

In [229]:
short_term_top_artists_df = short_term_top_artists_df[['name','popularity','genres', 'followers.total', 'uri']].sort_values(by=['popularity'], ascending=False)
medium_term_top_artists_df = medium_term_top_artists_df[['name','popularity','genres', 'followers.total', 'uri']].sort_values(by=['popularity'], ascending=False)
long_term_top_artists_df = long_term_top_artists_df[['name','popularity','genres', 'followers.total', 'uri']].sort_values(by=['popularity'], ascending=False)

### Short-term Top Artists

In [230]:
short_term_top_artists_df.head(10)

| | name | popularity | genres | followers.total | uri |
|---|---|---|---|---|---|
| 6 | Post Malone | 100 | [dfw rap, melodic rap, rap] | 21268403 | spotify:artist:246dkjvS1zLTtiykXe5h60 |
| 19 | Billie Eilish | 98 | [electropop, pop] | 18760441 | spotify:artist:6qqNVTkY8uBg9cP3Jd7DAH |
| 12 | blackbear | 90 | [pop, pop rap] | 2701737 | spotify:artist:2cFrymmkijnjDg9SS92EPM |
| 3 | Machine Gun Kelly | 84 | [pop rap, rap] | 1810247 | spotify:artist:6TIYQ3jFPwQSRmorSezPxX |
| 13 | Joyner Lucas | 81 | [boston hip hop, hip hop, pop rap, rap] | 1000504 | spotify:artist:6C1ohJrd5VydigQtaGy5Wa |
| 8 | TOOL | 78 | [alternative metal, alternative rock, art rock... | 1121789 | spotify:artist:2yEwvVSSSUkcLeSTNyHKh8 |
| 7 | Dennis Lloyd | 76 | [israeli pop] | 363825 | spotify:artist:3EOEK57CV77D4ovYVcmiyt |
| 9 | Yelawolf | 74 | [alabama rap, hip hop, pop rap, rap, southern ... | 1229736 | spotify:artist:68DWke2VjdDmA75aJX5C57 |
| 0 | Chris Webby | 70 | [deep underground hip hop, indie pop rap, pop ... | 302284 | spotify:artist:3IstlZaHyUP9SONpulb4SM |
| 4 | Jungle | 69 | [indie soul, indietronica, uk contemporary r&b] | 447681 | spotify:artist:59oA5WbbQvomJz2BuRG071 |


### Medium-term Top Artists

In [231]:
medium_term_top_artists_df.head(10)

| | name | popularity | genres | followers.total | uri |
|---|---|---|---|---|---|
| 1 | Post Malone | 100 | [dfw rap, melodic rap, rap] | 21268403 | spotify:artist:246dkjvS1zLTtiykXe5h60 |
| 12 | Ariana Grande | 97 | [dance pop, pop, post-teen pop] | 40870799 | spotify:artist:66CXWjxzNUsdJxJ2JdwvnR |
| 13 | Kanye West | 94 | [chicago rap, rap] | 10985124 | spotify:artist:5K4W6rqBFWDnAN6FQUkS6x |
| 0 | Labrinth | 83 | [pop] | 1023234 | spotify:artist:2feDdbD5araYcm6JhFHHw7 |
| 8 | George Ezra | 81 | [folk-pop, neo-singer-songwriter, pop] | 2872697 | spotify:artist:2ysnwxxNtSgbb9t1m2Ur4j |
| 18 | AJR | 79 | [modern rock] | 1093587 | spotify:artist:6s22t5Y3prQHyaHWUN1R1C |
| 15 | The Vamps | 79 | [boy band, dance pop, pop, post-teen pop, trop... | 3293560 | spotify:artist:7gAppWoH7pcYmphCVTXkzs |
| 2 | Chris Webby | 70 | [deep underground hip hop, indie pop rap, pop ... | 302284 | spotify:artist:3IstlZaHyUP9SONpulb4SM |
| 5 | Liam Gallagher | 69 | [britpop, modern rock, rock] | 555821 | spotify:artist:6sN51vEARnAAdBw1IKZ8Q9 |
| 4 | The Faim | 61 | [australian alternative rock] | 38586 | spotify:artist:6VsU92soWFLtVsSP65rkrN |


### Long-term Top Artists

In [232]:
long_term_top_artists_df.head(10)

| | name | popularity | genres | followers.total | uri |
|---|---|---|---|---|---|
| 4 | Post Malone | 100 | [dfw rap, melodic rap, rap] | 21268403 | spotify:artist:246dkjvS1zLTtiykXe5h60 |
| 5 | J Balvin | 99 | [latin, reggaeton] | 17070424 | spotify:artist:1vyhD5VmyZ7KMfW5gqLgo5 |
| 13 | Ed Sheeran | 98 | [pop, uk pop] | 57349360 | spotify:artist:6eUKZXaKkcviH0Ku9w2n3V |
| 17 | Camila Cabello | 96 | [dance pop, pop, post-teen pop] | 14204502 | spotify:artist:4nDoRrQiYLoBzwC5BhVJzF |
| 19 | The Chainsmokers | 91 | [electropop, pop, tropical house] | 15116376 | spotify:artist:69GGBxA162lTqCwzJG5jLp |
| 12 | David Guetta | 90 | [dance pop, edm, pop, tropical house] | 20788110 | spotify:artist:1Cs0zKBU1kc0i8ypK3B9ai |
| 10 | Sia | 90 | [australian dance, australian pop, dance pop, ... | 15290212 | spotify:artist:5WUlDfRSoLAfcVSX1WnrxN |
| 6 | G-Eazy | 87 | [hip hop, indie pop rap, pop rap, rap] | 4232632 | spotify:artist:02kJSzxNuaWGqwubyUba0Z |
| 7 | R3HAB | 85 | [big room, dance pop, dutch house, edm, electr... | 1937236 | spotify:artist:6cEuCEZu7PAE9ZSzLLc2oQ |
| 3 | Labrinth | 83 | [pop] | 1023234 | spotify:artist:2feDdbD5araYcm6JhFHHw7 |


### Get Top Tracks' Features Data for the 3 Time Frames

In [233]:
TRACKS_SAMPLE_SIZE = 1000
TRACKS_OFFSET = 0

In [234]:
st_tracks = sp.current_user_top_tracks(limit=TRACKS_SAMPLE_SIZE, offset=TRACKS_OFFSET, time_range='short_term')
mt_tracks = sp.current_user_top_tracks(limit=TRACKS_SAMPLE_SIZE, offset=TRACKS_OFFSET, time_range='medium_term')
lt_tracks = sp.current_user_top_tracks(limit=TRACKS_SAMPLE_SIZE, offset=TRACKS_OFFSET, time_range='long_term')
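
Spotify's Web API reference lists a maximum `limit` of 50 per request for the top-artists and top-tracks endpoints, so a single call asking for 1000 items will probably not return 1000 rows. One way to collect more is to page through the results with `offset`. The sketch below assumes a hypothetical helper, `fetch_all_top_items`, which accepts any callable taking the same `limit` and `offset` keywords as `sp.current_user_top_tracks`. A stub stands in for the API so the block runs offline.

```python
def fetch_all_top_items(fetch_page, max_items, page_size=50, **kwargs):
    """Collect up to max_items by calling fetch_page with a growing offset.

    Hypothetical helper: fetch_page must accept limit= and offset= keywords
    and return a dict with an "items" list, like spotipy's top-items calls.
    """
    items = []
    offset = 0
    while len(items) < max_items:
        limit = min(page_size, max_items - len(items))
        page = fetch_page(limit=limit, offset=offset, **kwargs)
        batch = page.get("items", [])
        if not batch:
            break  # the service has nothing more to return
        items.extend(batch)
        offset += len(batch)
    return items


def fake_top_tracks(limit, offset, time_range="medium_term"):
    """Offline stand-in for the API: pretends the user has 120 top tracks."""
    catalogue = [{"id": f"track-{i}", "time_range": time_range} for i in range(120)]
    return {"items": catalogue[offset:offset + limit]}


tracks = fetch_all_top_items(fake_top_tracks, max_items=1000, time_range="short_term")
print(len(tracks))
```

With spotipy the call would look like `fetch_all_top_items(sp.current_user_top_tracks, max_items=TRACKS_SAMPLE_SIZE, time_range='short_term')`. How many top items Spotify actually keeps per user is not something this notebook establishes.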

In [235]:
short_term_top_tracks_df = json_normalize(st_tracks["items"])
medium_term_top_tracks_df = json_normalize(mt_tracks["items"])
long_term_top_tracks_df = json_normalize(lt_tracks["items"])

In [236]:
properNameFunc = lambda raw_col_value: raw_col_value[0]['name']

short_term_top_tracks_df = short_term_top_tracks_df[['artists', 'name','popularity', 'id', 'duration_ms', 'album.name', 'album.release_date']].sort_values(by=['popularity'], ascending=False)
medium_term_top_tracks_df = medium_term_top_tracks_df[['artists', 'name','popularity', 'id', 'duration_ms', 'album.name', 'album.release_date']].sort_values(by=['popularity'], ascending=False)
long_term_top_tracks_df = long_term_top_tracks_df[['artists', 'name','popularity', 'id', 'duration_ms', 'album.name', 'album.release_date']].sort_values(by=['popularity'], ascending=False)

short_term_top_tracks_df['artists'] = short_term_top_tracks_df['artists'].apply(properNameFunc)
medium_term_top_tracks_df['artists'] = medium_term_top_tracks_df['artists'].apply(properNameFunc)
long_term_top_tracks_df['artists'] = long_term_top_tracks_df['artists'].apply(properNameFunc)
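
`properNameFunc` keeps only the first credited artist, so a collaboration is attributed to a single name. If every credited artist is wanted, a variant could join the names instead. `all_artist_names` below is a hypothetical helper, and the sample record only mimics the shape of the `artists` field (a list of dicts with a `name` key); it is not taken from an API response.

```python
def all_artist_names(raw_col_value, separator=", "):
    """Join the names of every artist credited on a track (hypothetical helper)."""
    return separator.join(artist["name"] for artist in raw_col_value)


# Illustrative record shaped like the 'artists' column before flattening
sample_artists = [{"name": "SHAED"}, {"name": "ZAYN"}]

first_only = sample_artists[0]["name"]   # what properNameFunc returns
joined = all_artist_names(sample_artists)
print(first_only)
print(joined)
```

It would be applied the same way as the original lambda, for example `df['artists'].apply(all_artist_names)`.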

### Short-term Top Tracks

In [237]:
short_term_top_tracks_df.head()

| | artists | name | popularity | id | duration_ms | album.name | album.release_date |
|---|---|---|---|---|---|---|---|
| 7 | NF | PAID MY DUES | 78 | 18czZN7uruOjftj71Kt8oj | 211643 | PAID MY DUES | 2019-12-03 |
| 16 | Tame Impala | Borderline | 76 | 3O8X1DE9btbzy4UH9cSX9a | 274293 | Borderline | 2019-04-12 |
| 5 | Tame Impala | Posthumous Forgiveness | 74 | 3rQ0ZaLrkLv8HhAAKSbAC0 | 366066 | Posthumous Forgiveness | 2019-12-03 |
| 38 | Hailee Steinfeld | Afterlife | 71 | 6PFyGdWw0qnstR8KJAWrrr | 209200 | Afterlife (Dickinson) | 2019-09-19 |
| 17 | Dimitri Vegas & Like Mike | Boomshakalaka - Dimitri Vegas & Like Mike vs. ... | 71 | 3o7TK4inMYzJC3ZMJCWsRC | 211519 | Boomshakalaka (Dimitri Vegas & Like Mike vs. A... | 2019-11-29 |


### Medium-term Top Tracks

In [238]:
medium_term_top_tracks_df.head()

| | artists | name | popularity | id | duration_ms | album.name | album.release_date |
|---|---|---|---|---|---|---|---|
| 7 | Post Malone | Circles | 98 | 21jGcNKet2qwijlDFuPiPb | 215280 | Hollywood's Bleeding | 2019-09-06 |
| 30 | Travis Scott | HIGHEST IN THE ROOM | 94 | 3eekarcy7kvN4yt5ZFzltW | 175720 | HIGHEST IN THE ROOM | 2019-10-04 |
| 0 | SHAED | Trampoline (with ZAYN) | 91 | 1iQDltZqI7BXnHrFy4Qo1k | 184280 | Trampoline (with ZAYN) | 2019-09-26 |
| 1 | Post Malone | Hollywood's Bleeding | 85 | 7sWRlDoTDX8geTR8zzr2vt | 156266 | Hollywood's Bleeding | 2019-09-06 |
| 15 | Post Malone | Die For Me (feat. Future & Halsey) | 82 | 2C6WXnmZ66tHhHlnvwePiK | 245266 | Hollywood's Bleeding | 2019-09-06 |


### Long-term Top Tracks

In [239]:
long_term_top_tracks_df.head()

| | artists | name | popularity | id | duration_ms | album.name | album.release_date |
|---|---|---|---|---|---|---|---|
| 9 | SHAED | Trampoline (with ZAYN) | 91 | 1iQDltZqI7BXnHrFy4Qo1k | 184280 | Trampoline (with ZAYN) | 2019-09-26 |
| 21 | Post Malone | Hollywood's Bleeding | 85 | 7sWRlDoTDX8geTR8zzr2vt | 156266 | Hollywood's Bleeding | 2019-09-06 |
| 40 | Logic | Everyday | 77 | 4EAV2cKiqKP5UPZmY6dejk | 204746 | Bobby Tarantino II | 2018-03-09 |
| 34 | Madison Beer | Home with You | 73 | 0iwsQWgtjSq2kUXuZwTDAL | 190576 | As She Pleases | 2018-02-02 |
| 20 | Bryce Vine | Drew Barrymore | 73 | 0OgGn1ofaj55l2PcihQQGV | 191146 | Drew Barrymore | 2017-12-01 |


### Fetch the audio features of the top songs for each time frame and calculate the popularity-weighted average

In [241]:
trends_set = [
    { 'ref': long_term_top_tracks_df,   'term': 'Long'   },
    { 'ref': medium_term_top_tracks_df, 'term': 'Medium' },
    { 'ref': short_term_top_tracks_df,  'term': 'Short'  }
]

trend_loudness = {}
trend_tempo = {}
trend_mode = {}
trend_danceability = {}
trend_energy = {}
trend_speechiness = {}
trend_acousticness = {}
trend_instrumentalness = {}
trend_liveness = {}
trend_valence = {}

for df_item in trends_set:
    dict_data = df_item['ref'].to_dict(orient='index')
    track_popularity = {v['id']: v['popularity'] for k,v in dict_data.items()}

    try:
        # audio_features takes a list of track IDs (Spotify accepts at most 100 per request)
        tracks_list_features = sp.audio_features(list(track_popularity.keys()))
    except Exception as err:
        print(f"Couldn't get Track Features Analytics: {err}")
        continue

    # Attach each track's popularity to its feature record;
    # the API returns None for tracks that have no audio features
    feature_rows = []
    for a in tracks_list_features:
        if a is None:
            continue
        a['popularity'] = track_popularity[a['id']]
        feature_rows.append(a)

    # Build the DataFrame in one step (DataFrame.append was removed in pandas 2.0)
    track_analytics = pd.DataFrame(feature_rows)

        
    trend_loudness[df_item['term']] = np.average(track_analytics['loudness'], weights=track_analytics['popularity'])
    trend_tempo[df_item['term']] = np.average(track_analytics['tempo'], weights=track_analytics['popularity'])
    trend_mode[df_item['term']] = np.average(track_analytics['mode'], weights=track_analytics['popularity'])
    trend_danceability[df_item['term']] = np.average(track_analytics['danceability'], weights=track_analytics['popularity'])
    trend_energy[df_item['term']] = np.average(track_analytics['energy'], weights=track_analytics['popularity'])
    trend_speechiness[df_item['term']] = np.average(track_analytics['speechiness'], weights=track_analytics['popularity'])
    trend_acousticness[df_item['term']] = np.average(track_analytics['acousticness'], weights=track_analytics['popularity'])
    trend_instrumentalness[df_item['term']] = np.average(track_analytics['instrumentalness'], weights=track_analytics['popularity'])
    trend_liveness[df_item['term']] = np.average(track_analytics['liveness'], weights=track_analytics['popularity'])
    trend_valence[df_item['term']] = np.average(track_analytics['valence'], weights=track_analytics['popularity'])
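
`np.average(values, weights=w)` computes `sum(w_i * x_i) / sum(w_i)`, so tracks with a higher popularity pull each feature's average toward their own value. A dependency-free sketch of the same calculation, using made-up numbers rather than the notebook's data:

```python
def weighted_average(values, weights):
    """Return sum(w * x) / sum(w), the quantity np.average computes with weights."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ZeroDivisionError("weights sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Illustrative values only: two tracks, one far more popular than the other
tempos = [100.0, 140.0]
popularities = [90, 10]

weighted = weighted_average(tempos, popularities)   # (100*90 + 140*10) / 100
unweighted = sum(tempos) / len(tempos)
print(weighted, unweighted)
```

When every track carries the same weight, the weighted average reduces to the ordinary mean.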

### Short-term Track List Features Table

In [242]:
track_analytics.head()

| | danceability | energy | key | loudness | mode | speechiness | acousticness | instrumentalness | liveness | valence | tempo | type | id | uri | track_href | analysis_url | duration_ms | time_signature | popularity |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.812 | 0.784 | 2 | -4.208 | 1 | 0.248 | 0.136 | 0.0 | 0.109 | 0.38 | 145.995 | audio_features | 18czZN7uruOjftj71Kt8oj | spotify:track:18czZN7uruOjftj71Kt8oj | https://api.spotify.com/v1/tracks/18czZN7uruOj... | https://api.spotify.com/v1/audio-analysis/18cz... | 211644 | 4 | 23 |
| 1 | 0.598 | 0.71 | 0 | -6.839 | 0 | 0.0272 | 0.0476 | 1.77e-05 | 0.1 | 0.726 | 97.976 | audio_features | 3O8X1DE9btbzy4UH9cSX9a | spotify:track:3O8X1DE9btbzy4UH9cSX9a | https://api.spotify.com/v1/tracks/3O8X1DE9btbz... | https://api.spotify.com/v1/audio-analysis/3O8X... | 274294 | 4 | 23 |
| 2 | 0.228 | 0.615 | 7 | -5.626 | 1 | 0.0328 | 0.0487 | 0.00293 | 0.327 | 0.12 | 166.035 | audio_features | 3rQ0ZaLrkLv8HhAAKSbAC0 | spotify:track:3rQ0ZaLrkLv8HhAAKSbAC0 | https://api.spotify.com/v1/tracks/3rQ0ZaLrkLv8... | https://api.spotify.com/v1/audio-analysis/3rQ0... | 366067 | 4 | 23 |
| 3 | 0.673 | 0.61 | 1 | -4.73 | 0 | 0.0356 | 0.00662 | 0.0 | 0.0572 | 0.343 | 122.063 | audio_features | 6PFyGdWw0qnstR8KJAWrrr | spotify:track:6PFyGdWw0qnstR8KJAWrrr | https://api.spotify.com/v1/tracks/6PFyGdWw0qns... | https://api.spotify.com/v1/audio-analysis/6PFy... | 209200 | 4 | 23 |
| 4 | 0.752 | 0.82 | 1 | -2.976 | 1 | 0.0461 | 0.0134 | 0.000804 | 0.0692 | 0.822 | 95.987 | audio_features | 3o7TK4inMYzJC3ZMJCWsRC | spotify:track:3o7TK4inMYzJC3ZMJCWsRC | https://api.spotify.com/v1/tracks/3o7TK4inMYzJ... | https://api.spotify.com/v1/audio-analysis/3o7T... | 211520 | 4 | 23 |


## Audio Features Analysis Results

The descriptions below are taken from [Spotify's Web API reference for audio features](https://developer.spotify.com/documentation/web-api/reference/tracks/get-audio-features/).

__Loudness__ - The overall loudness of a track in decibels (dB).

__Tempo__ - The overall estimated tempo of a track in beats per minute (BPM).

__Mode__ - Indicates the modality (major or minor) of a track, the type of scale from which its melodic content is derived. Major is represented by 1 and minor by 0.

__Acousticness__ - A confidence measure from 0.0 to 1.0 of whether the track is acoustic. 1.0 represents high confidence the track is acoustic.

__Danceability__ - Danceability describes how suitable a track is for dancing based on a combination of musical elements including tempo, rhythm stability, beat strength, and overall regularity. A value of 0.0 is least danceable and 1.0 is most danceable.

__Energy__ - Energy is a measure from 0.0 to 1.0 and represents a perceptual measure of intensity and activity. Typically, energetic tracks feel fast, loud, and noisy. 

__Instrumentalness__ - Predicts whether a track contains no vocals. “Ooh” and “aah” sounds are treated as instrumental in this context. Rap or spoken word tracks are clearly “vocal”. The closer the instrumentalness value is to 1.0, the greater likelihood the track contains no vocal content. Values above 0.5 are intended to represent instrumental tracks, but confidence is higher as the value approaches 1.0.

__Liveness__ - Detects the presence of an audience in the recording. Higher liveness values represent an increased probability that the track was performed live. A value above 0.8 provides strong likelihood that the track is live. 

__Speechiness__ - Speechiness detects the presence of spoken words in a track. The more exclusively speech-like the recording (e.g. talk show, audio book, poetry), the closer to 1.0 the attribute value. Values above 0.66 describe tracks that are probably made entirely of spoken words. Values between 0.33 and 0.66 describe tracks that may contain both music and speech, either in sections or layered, including such cases as rap music. Values below 0.33 most likely represent music and other non-speech-like tracks.

__Valence__ - A measure from 0.0 to 1.0 describing the musical positiveness conveyed by a track. Tracks with high valence sound more positive, while tracks with low valence sound more negative.
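
The thresholds quoted in these descriptions can be turned into a rough labelling function. The sketch below is an assumption-laden illustration: `describe_features` is a hypothetical helper, the label wording is invented here, and the "below 0.33 is mostly music" reading comes from the same Spotify reference page. The sample values are the first row of the short-term features table.

```python
def describe_features(features):
    """Translate a few audio-feature values into rough labels (hypothetical helper)."""
    labels = ["major" if features["mode"] == 1 else "minor"]

    speechiness = features["speechiness"]
    if speechiness > 0.66:
        labels.append("mostly spoken word")
    elif speechiness >= 0.33:
        labels.append("mix of music and speech")
    else:
        labels.append("mostly music")

    if features["instrumentalness"] > 0.5:
        labels.append("likely instrumental")
    if features["liveness"] > 0.8:
        labels.append("likely live")
    return labels


# Values from the first row of the short-term features table
sample = {"mode": 1, "speechiness": 0.248, "instrumentalness": 0.0, "liveness": 0.109}
labels = describe_features(sample)
print(labels)
```

The cut-offs are confidence thresholds rather than hard categories, so labels near a boundary should be read loosely.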

## Results

In [243]:
# One row per audio feature, one column per time frame.
# Built in a single step because DataFrame.append was removed in pandas 2.0.
results_analytics = pd.DataFrame(
    [trend_loudness, trend_tempo, trend_mode, trend_danceability, trend_energy,
     trend_speechiness, trend_acousticness, trend_instrumentalness, trend_liveness, trend_valence],
    columns=['Long', 'Medium', 'Short'],
    index=['Loudness', 'Tempo', 'Modality', 'Danceability', 'Energy', 'Speechiness', 'Acousticness', 'Instrumentalness', 'Liveness', 'Valence'],
)
results_analytics

| | Long | Medium | Short |
|---|---|---|---|
| Loudness | -6.4836 | -6.00914 | -6.99 |
| Tempo | 125.86322 | 121.42226 | 115.96248 |
| Modality | 0.4 | 0.42 | 0.38 |
| Danceability | 0.6963 | 0.62812 | 0.66182 |
| Energy | 0.6412 | 0.6515 | 0.63808 |
| Speechiness | 0.126974 | 0.10553 | 0.141792 |
| Acousticness | 0.162564 | 0.226112 | 0.231854 |
| Instrumentalness | 0.015104 | 0.002905 | 0.003496 |
| Liveness | 0.181488 | 0.19299 | 0.175538 |
| Valence | 0.476496 | 0.463896 | 0.500322 |
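
One simple way to read the table as a trend is to subtract the long-term value from the short-term value for each feature. The sketch below copies a few rows from the results above; `short_minus_long` is a hypothetical helper, and because the features use different scales (dB, BPM, 0 to 1 scores) the differences are only comparable within a single row.

```python
# Weighted averages copied from the results table (Long and Short columns only)
results = {
    "Loudness": {"Long": -6.4836, "Short": -6.99},
    "Tempo": {"Long": 125.86322, "Short": 115.96248},
    "Danceability": {"Long": 0.6963, "Short": 0.66182},
    "Valence": {"Long": 0.476496, "Short": 0.500322},
}


def short_minus_long(table):
    """Return Short minus Long for every feature (hypothetical helper)."""
    return {feature: row["Short"] - row["Long"] for feature, row in table.items()}


shifts = short_minus_long(results)
for feature, delta in shifts.items():
    direction = "higher" if delta > 0 else "lower"
    print(f"{feature}: short-term is {direction} than long-term by {abs(delta):.4f}")
```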
