This notebook follows the tutorial below to fetch data from the Spotify API:
    - https://towardsdatascience.com/extracting-song-data-from-the-spotify-api-using-python-b1e79388d50

Tip: open the article in Incognito mode to read it as many times as you want.

The tutorial series has four parts:
- Part I: (This article)
- Part II: EDA and Clustering
- Part III: Building a Song Recommendation System with Spotify
- Part IV: Deploying a Spotify Recommendation Model with Flask

This notebook covers Part I.

In [1]:
# IMPORT PACKAGES
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials
from utils import config  # it hosts the credentials for the Spotify API
import pandas as pd
import numpy as np

In [2]:
# Authentication - without user login
# All we need are the IDs, client and secret. Then, we can create our "Spotify" object.
client_credentials_manager = SpotifyClientCredentials(
    client_id=config.CLIENT_ID, client_secret=config.CLIENT_SECRET
)
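
The `utils.config` module is not shown in this notebook. As a rough sketch of what such a module might do, the helper below reads the two credentials from environment variables. The function name `load_credentials` and the variable names `SPOTIPY_CLIENT_ID` / `SPOTIPY_CLIENT_SECRET` are assumptions for illustration, not the author's actual setup.

```python
import os


def load_credentials(env=None):
    """Return (client_id, client_secret) read from environment variables."""
    # Assumption: the credentials live in these two variables.
    env = os.environ if env is None else env
    try:
        return env["SPOTIPY_CLIENT_ID"], env["SPOTIPY_CLIENT_SECRET"]
    except KeyError as missing:
        raise RuntimeError(f"Missing environment variable: {missing}") from None


# Example with an explicit mapping instead of the real environment
CLIENT_ID, CLIENT_SECRET = load_credentials(
    {"SPOTIPY_CLIENT_ID": "my-id", "SPOTIPY_CLIENT_SECRET": "my-secret"}
)
```

Keeping the secret out of the notebook itself means the file can be shared without leaking credentials.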

In [3]:
# Spotipy object
sp = spotipy.Spotify(client_credentials_manager=client_credentials_manager)

In [4]:
# Extracting Tracks From a Playlist
playlist_link = (
    # "https://open.spotify.com/playlist/37i9dQZEVXbNG2KDcFcKOF?si=1333723a6eff4b7f"
    "https://open.spotify.com/playlist/4RGIAb8NJ8sp7j2fqQFYTN?si=717f665607754a8a"
)

In [5]:
# Keep the last path segment of the link and drop the query string (?si=...) to get the playlist ID
playlist_URI = playlist_link.split("/")[-1].split("?")[0]

In [6]:
playlist_URI

'37i9dQZEVXbNG2KDcFcKOF'
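
The string splitting above works for links of this exact shape. A slightly more defensive sketch, using only the standard library, parses the URL first and checks that it really is a playlist link. The helper name `playlist_id_from_link` is hypothetical.

```python
from urllib.parse import urlparse


def playlist_id_from_link(link):
    """Return the playlist ID from an open.spotify.com playlist link."""
    path_parts = urlparse(link).path.strip("/").split("/")
    # Expect a path ending in "playlist/<id>"; the query string (?si=...) is ignored
    if len(path_parts) < 2 or path_parts[-2] != "playlist":
        raise ValueError(f"Not a playlist link: {link}")
    return path_parts[-1]


example_link = "https://open.spotify.com/playlist/4RGIAb8NJ8sp7j2fqQFYTN?si=717f665607754a8a"
print(playlist_id_from_link(example_link))  # 4RGIAb8NJ8sp7j2fqQFYTN
```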

Next, get the URI of each track in the playlist.
Why: a Uniform Resource Identifier (URI) is a unique sequence of
characters that identifies a logical or physical resource used by web technologies.
Here, each URI identifies exactly one track on Spotify.
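
To make the URI format concrete, here is a small sketch that splits a track URI into its parts. It assumes the three-part form `spotify:<type>:<id>` seen in this notebook's output; the helper name `parse_spotify_uri` is hypothetical.

```python
def parse_spotify_uri(uri):
    """Split a Spotify URI such as 'spotify:track:<id>' into (type, id)."""
    # Assumption: the URI has exactly three colon-separated parts
    scheme, resource_type, resource_id = uri.split(":")
    if scheme != "spotify":
        raise ValueError(f"Not a Spotify URI: {uri}")
    return resource_type, resource_id


print(parse_spotify_uri("spotify:track:3k79jB4aGmMDUQzEwa46Rz"))
# ('track', '3k79jB4aGmMDUQzEwa46Rz')
```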

In [7]:
# for each track inside the playlist get the URI
track_uris = [x["track"]["uri"] for x in sp.playlist_tracks(playlist_URI)["items"]]
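
Note that `sp.playlist_tracks` returns one page of results (spotipy's default `limit` is 100), so the comprehension above only sees the first page of a longer playlist. A possible way to collect every page is to follow the `next` links with the client's `next()` method. The sketch below uses a stand-in client (`FakePagingClient`, hypothetical) so it runs without credentials.

```python
def collect_all_items(client, first_page):
    """Follow the 'next' links of a paged result and return every item."""
    items = list(first_page["items"])
    page = first_page
    while page.get("next"):
        page = client.next(page)
        items.extend(page["items"])
    return items


class FakePagingClient:
    """Stand-in for a spotipy client so the sketch runs offline."""

    def __init__(self, pages):
        self._pages = pages

    def next(self, page):
        return self._pages[page["next"]]


pages = {"page2": {"items": ["c"], "next": None}}
first = {"items": ["a", "b"], "next": "page2"}
print(collect_all_items(FakePagingClient(pages), first))  # ['a', 'b', 'c']
```

With the real client this would presumably be called as `collect_all_items(sp, sp.playlist_tracks(playlist_URI))`.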

In [8]:
track_uris

['spotify:track:3k79jB4aGmMDUQzEwa46Rz',
 'spotify:track:1BxfuPKGuaTgP7aM0Bbdwr',
 'spotify:track:3qQbCzHBycnDpGskqOWY0E',
 'spotify:track:7ro0hRteUMfnOioTFI5TG1',
 'spotify:track:4DHcnVTT87F0zZhRPYmZ3B',
 'spotify:track:2UW7JaomAMuX9pZrjVpHAU',
 'spotify:track:6pD0ufEQq0xdHSsRbg9LBK',
 'spotify:track:1odExI7RdWc4BT515LTAwj',
 'spotify:track:4Dvkj6JhhA12EX05fT7y2e',
 'spotify:track:1Qrg8KqiBpW07V7PNxwwwL',
 'spotify:track:7FbrGaHYVDmfr7KoLIZnQ7',
 'spotify:track:4eMKD8MRroxCqugpsxCCNb',
 'spotify:track:7mXuWTczZNxG5EDcjFEuJR',
 'spotify:track:2FDTHlrBguDzQkp7PVj16Q',
 'spotify:track:7ABLbnD53cQK00mhcaOUVG',
 'spotify:track:6XSqqQIy7Lm7SnwxS4NrGx',
 'spotify:track:1UMm1Qs3u59Wvk53zBUE8r',
 'spotify:track:5AqiaZwhmC6dIbgWrD5SzV',
 'spotify:track:3Ua0m0YmEjrMi9XErKcNiR',
 'spotify:track:5XeFesFbtLpXzIVDNQP22n',
 'spotify:track:1s7oOCT8vauUh01PbJD6ps',
 'spotify:track:368eeEO3Y2uZUQ6S5oIjcu',
 'spotify:track:4rXLjWdF2ZZpXCVTfWcshS',
 'spotify:track:0DWdj2oZMBFSzRsi2Cvfzf',
 'spotify:track:

While we're here, we can also extract the name of each track,
the name of the album it belongs to, and the popularity
of the track (which we expect to be high in this case, since
we're looking at the most popular songs globally).
From the artist, we can find a genre (though not airtight, since
artists can make songs in multiple genres) and an artist popularity score.

In [9]:
# Fetch the playlist items once and call sp.artist() once per track
# instead of three times (each call is a separate API request).
data = []
for item in sp.playlist_tracks(playlist_URI)["items"]:
    track = item["track"]
    artist = sp.artist(track["artists"][0]["uri"])  # first listed artist only
    data.append(
        {
            "track_uri": track["uri"],
            "track_name": track["name"],
            "artist_uri": track["artists"][0]["uri"],
            "artist_info": artist,
            "artist_name": track["artists"][0]["name"],
            "artist_pop": artist["popularity"],
            "artist_genres": artist["genres"],
            "album": track["album"]["name"],
            "track_pop": track["popularity"],
        }
    )
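
Several tracks in a playlist can share the same artist, so the same artist may be requested more than once. One option is to cache the lookups. The sketch below wraps any one-argument fetch function in a dictionary cache; `make_cached_lookup` and `fake_artist` are hypothetical names, and `fake_artist` only stands in for `sp.artist` so the example runs offline.

```python
def make_cached_lookup(fetch):
    """Wrap a one-argument fetch function so each key is fetched only once."""
    cache = {}

    def lookup(key):
        if key not in cache:
            cache[key] = fetch(key)
        return cache[key]

    return lookup


# Stand-in for sp.artist that records how often it is called
calls = []


def fake_artist(uri):
    calls.append(uri)
    return {"uri": uri, "popularity": 84, "genres": ["pop"]}


get_artist = make_cached_lookup(fake_artist)
get_artist("spotify:artist:1McMsnEElThX1knmY4oliG")
get_artist("spotify:artist:1McMsnEElThX1knmY4oliG")
print(len(calls))  # 1
```

With the real client, `make_cached_lookup(sp.artist)` would play the same role.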

In [10]:
data

[{'track_uri': 'spotify:track:3k79jB4aGmMDUQzEwa46Rz',
  'track_name': 'vampire',
  'artist_uri': 'spotify:artist:1McMsnEElThX1knmY4oliG',
  'artist_info': {'external_urls': {'spotify': 'https://open.spotify.com/artist/1McMsnEElThX1knmY4oliG'},
   'followers': {'href': None, 'total': 26234930},
   'genres': ['pop'],
   'href': 'https://api.spotify.com/v1/artists/1McMsnEElThX1knmY4oliG',
   'id': '1McMsnEElThX1knmY4oliG',
   'images': [{'height': 640,
     'url': 'https://i.scdn.co/image/ab6761610000e5eb977ea0d43b234fefb825f480',
     'width': 640},
    {'height': 320,
     'url': 'https://i.scdn.co/image/ab67616100005174977ea0d43b234fefb825f480',
     'width': 320},
    {'height': 160,
     'url': 'https://i.scdn.co/image/ab6761610000f178977ea0d43b234fefb825f480',
     'width': 160}],
   'name': 'Olivia Rodrigo',
   'popularity': 84,
   'type': 'artist',
   'uri': 'spotify:artist:1McMsnEElThX1knmY4oliG'},
  'artist_name': 'Olivia Rodrigo',
  'artist_pop': 84,
  'artist_genres': ['pop']

In [11]:
# test an example
print(sp.audio_features(data[0]['track_uri']))

[{'danceability': 0.511, 'energy': 0.532, 'key': 5, 'loudness': -5.745, 'mode': 1, 'speechiness': 0.056, 'acousticness': 0.169, 'instrumentalness': 0, 'liveness': 0.311, 'valence': 0.322, 'tempo': 137.827, 'type': 'audio_features', 'id': '3k79jB4aGmMDUQzEwa46Rz', 'uri': 'spotify:track:3k79jB4aGmMDUQzEwa46Rz', 'track_href': 'https://api.spotify.com/v1/tracks/3k79jB4aGmMDUQzEwa46Rz', 'analysis_url': 'https://api.spotify.com/v1/audio-analysis/3k79jB4aGmMDUQzEwa46Rz', 'duration_ms': 219724, 'time_signature': 4}]
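
Two of these fields are encoded as integers: `key` uses pitch-class notation (0 = C, 1 = C#, and so on, with -1 meaning no key was detected) and `mode` is 1 for major and 0 for minor. A small sketch that turns them into a readable label (the helper name `describe_key` is hypothetical):

```python
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def describe_key(key, mode):
    """Translate the numeric key and mode into a readable name."""
    if key == -1:
        return "unknown"  # -1 means no key was detected
    return f"{PITCH_CLASSES[key]} {'major' if mode == 1 else 'minor'}"


# Values taken from the example output above: 'key': 5, 'mode': 1
print(describe_key(5, 1))  # F major
```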


In [12]:
# Collect the audio features: the outer loop walks through each track dict in `data`,
# and the inner loop unpacks the one-element list that sp.audio_features() returns for its 'track_uri'
audio_features = [feature for row in data for feature in sp.audio_features(row["track_uri"])]
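
The cell above makes one request per track. According to the spotipy documentation, `audio_features` also accepts a list of tracks (up to 100 per call), so the requests could be batched. The sketch below shows the idea with a stand-in client; `chunked`, `fetch_audio_features` and `FakeFeaturesClient` are hypothetical names.

```python
def chunked(seq, size):
    """Yield consecutive slices of seq with at most `size` elements each."""
    for start in range(0, len(seq), size):
        yield seq[start:start + size]


def fetch_audio_features(client, track_uris, batch_size=100):
    """Request audio features in batches instead of one track per call."""
    features = []
    for batch in chunked(track_uris, batch_size):
        features.extend(client.audio_features(batch))
    return features


class FakeFeaturesClient:
    """Stand-in for a spotipy client; records the size of each request."""

    def __init__(self):
        self.batch_sizes = []

    def audio_features(self, tracks):
        self.batch_sizes.append(len(tracks))
        return [{"uri": uri} for uri in tracks]


client = FakeFeaturesClient()
uris = [f"spotify:track:{n}" for n in range(250)]
result = fetch_audio_features(client, uris)
print(client.batch_sizes)  # [100, 100, 50]
```

With the real client the call would presumably be `fetch_audio_features(sp, [row["track_uri"] for row in data])`.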

In [13]:
# convert dict data to dataframe
df_artist = pd.DataFrame(data)
# df_artist = df_artist.rename(columns={"artist_uri": "uri"})
# set the index to artist_uri in order to join with the audio_features dataset
# df_artist = df_artist.set_index('uri')
df_artist.head(5)

Unnamed: 0,track_uri,track_name,artist_uri,artist_info,artist_name,artist_pop,artist_genres,album,track_pop
0,spotify:track:3k79jB4aGmMDUQzEwa46Rz,vampire,spotify:artist:1McMsnEElThX1knmY4oliG,{'external_urls': {'spotify': 'https://open.sp...,Olivia Rodrigo,84,[pop],vampire,97
1,spotify:track:1BxfuPKGuaTgP7aM0Bbdwr,Cruel Summer,spotify:artist:06HL4z0CvFAxyc27GXpf02,{'external_urls': {'spotify': 'https://open.sp...,Taylor Swift,100,[pop],Lover,99
2,spotify:track:3qQbCzHBycnDpGskqOWY0E,Ella Baila Sola,spotify:artist:0XeEobZplHxzM9QzFQWLiR,{'external_urls': {'spotify': 'https://open.sp...,Eslabon Armado,83,"[corrido, corridos tumbados, sad sierreno, sie...",DESVELADO,93
3,spotify:track:7ro0hRteUMfnOioTFI5TG1,WHERE SHE GOES,spotify:artist:4q3ewBCX7sLwd24euuV69X,{'external_urls': {'spotify': 'https://open.sp...,Bad Bunny,95,"[reggaeton, trap latino, urbano latino]",WHERE SHE GOES,100
4,spotify:track:4DHcnVTT87F0zZhRPYmZ3B,Flowers,spotify:artist:5YGY8feqx7naU7z4HrwZM6,{'external_urls': {'spotify': 'https://open.sp...,Miley Cyrus,85,[pop],Endless Summer Vacation,91


In [14]:
# convert audio_features to dataframe
df_audio_features = pd.DataFrame(audio_features)
# set the index to uri in order to join with the artist dataset
# df_audio_features = df_audio_features.set_index('uri')
df_audio_features.head(5)

Unnamed: 0,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,type,id,uri,track_href,analysis_url,duration_ms,time_signature
0,0.511,0.532,5,-5.745,1,0.056,0.169,0.0,0.311,0.322,137.827,audio_features,3k79jB4aGmMDUQzEwa46Rz,spotify:track:3k79jB4aGmMDUQzEwa46Rz,https://api.spotify.com/v1/tracks/3k79jB4aGmMD...,https://api.spotify.com/v1/audio-analysis/3k79...,219724,4
1,0.552,0.702,9,-5.707,1,0.157,0.117,2.1e-05,0.105,0.564,169.994,audio_features,1BxfuPKGuaTgP7aM0Bbdwr,spotify:track:1BxfuPKGuaTgP7aM0Bbdwr,https://api.spotify.com/v1/tracks/1BxfuPKGuaTg...,https://api.spotify.com/v1/audio-analysis/1Bxf...,178427,4
2,0.668,0.758,5,-5.176,0,0.0332,0.483,1.9e-05,0.0837,0.834,147.989,audio_features,3qQbCzHBycnDpGskqOWY0E,spotify:track:3qQbCzHBycnDpGskqOWY0E,https://api.spotify.com/v1/tracks/3qQbCzHBycnD...,https://api.spotify.com/v1/audio-analysis/3qQb...,165671,3
3,0.652,0.8,9,-4.019,0,0.0614,0.143,0.629,0.112,0.234,143.978,audio_features,7ro0hRteUMfnOioTFI5TG1,spotify:track:7ro0hRteUMfnOioTFI5TG1,https://api.spotify.com/v1/tracks/7ro0hRteUMfn...,https://api.spotify.com/v1/audio-analysis/7ro0...,231704,4
4,0.707,0.681,0,-4.325,1,0.0668,0.0632,5e-06,0.0322,0.646,117.999,audio_features,4DHcnVTT87F0zZhRPYmZ3B,spotify:track:4DHcnVTT87F0zZhRPYmZ3B,https://api.spotify.com/v1/tracks/4DHcnVTT87F0...,https://api.spotify.com/v1/audio-analysis/4DHc...,200455,4


In [19]:
# join the two dataframes on the track URI ('track_uri' on the left, 'uri' on the right) so that all the information is in one place
df_artist_track_merged = pd.merge(df_artist, df_audio_features, left_on='track_uri', right_on='uri')
df_artist_track_merged.head()

Unnamed: 0,track_uri,track_name,artist_uri,artist_info,artist_name,artist_pop,artist_genres,album,track_pop,danceability,...,liveness,valence,tempo,type,id,uri,track_href,analysis_url,duration_ms,time_signature
0,spotify:track:3k79jB4aGmMDUQzEwa46Rz,vampire,spotify:artist:1McMsnEElThX1knmY4oliG,{'external_urls': {'spotify': 'https://open.sp...,Olivia Rodrigo,84,[pop],vampire,97,0.511,...,0.311,0.322,137.827,audio_features,3k79jB4aGmMDUQzEwa46Rz,spotify:track:3k79jB4aGmMDUQzEwa46Rz,https://api.spotify.com/v1/tracks/3k79jB4aGmMD...,https://api.spotify.com/v1/audio-analysis/3k79...,219724,4
1,spotify:track:1BxfuPKGuaTgP7aM0Bbdwr,Cruel Summer,spotify:artist:06HL4z0CvFAxyc27GXpf02,{'external_urls': {'spotify': 'https://open.sp...,Taylor Swift,100,[pop],Lover,99,0.552,...,0.105,0.564,169.994,audio_features,1BxfuPKGuaTgP7aM0Bbdwr,spotify:track:1BxfuPKGuaTgP7aM0Bbdwr,https://api.spotify.com/v1/tracks/1BxfuPKGuaTg...,https://api.spotify.com/v1/audio-analysis/1Bxf...,178427,4
2,spotify:track:3qQbCzHBycnDpGskqOWY0E,Ella Baila Sola,spotify:artist:0XeEobZplHxzM9QzFQWLiR,{'external_urls': {'spotify': 'https://open.sp...,Eslabon Armado,83,"[corrido, corridos tumbados, sad sierreno, sie...",DESVELADO,93,0.668,...,0.0837,0.834,147.989,audio_features,3qQbCzHBycnDpGskqOWY0E,spotify:track:3qQbCzHBycnDpGskqOWY0E,https://api.spotify.com/v1/tracks/3qQbCzHBycnD...,https://api.spotify.com/v1/audio-analysis/3qQb...,165671,3
3,spotify:track:7ro0hRteUMfnOioTFI5TG1,WHERE SHE GOES,spotify:artist:4q3ewBCX7sLwd24euuV69X,{'external_urls': {'spotify': 'https://open.sp...,Bad Bunny,95,"[reggaeton, trap latino, urbano latino]",WHERE SHE GOES,100,0.652,...,0.112,0.234,143.978,audio_features,7ro0hRteUMfnOioTFI5TG1,spotify:track:7ro0hRteUMfnOioTFI5TG1,https://api.spotify.com/v1/tracks/7ro0hRteUMfn...,https://api.spotify.com/v1/audio-analysis/7ro0...,231704,4
4,spotify:track:4DHcnVTT87F0zZhRPYmZ3B,Flowers,spotify:artist:5YGY8feqx7naU7z4HrwZM6,{'external_urls': {'spotify': 'https://open.sp...,Miley Cyrus,85,[pop],Endless Summer Vacation,91,0.707,...,0.0322,0.646,117.999,audio_features,4DHcnVTT87F0zZhRPYmZ3B,spotify:track:4DHcnVTT87F0zZhRPYmZ3B,https://api.spotify.com/v1/tracks/4DHcnVTT87F0...,https://api.spotify.com/v1/audio-analysis/4DHc...,200455,4
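
`pd.merge` defaults to an inner join, so a track without a matching audio-features row would be dropped without any warning. As a sketch on toy data (the frames `left` and `right` are made-up stand-ins for `df_artist` and `df_audio_features`), `how="left"`, `validate` and `indicator` can make such cases visible:

```python
import pandas as pd

# Toy stand-ins for df_artist and df_audio_features
left = pd.DataFrame({"track_uri": ["t1", "t2", "t3"], "track_name": ["a", "b", "c"]})
right = pd.DataFrame({"uri": ["t1", "t2"], "tempo": [120.0, 90.0]})

merged = pd.merge(
    left,
    right,
    how="left",             # keep every track, even without audio features
    left_on="track_uri",
    right_on="uri",
    validate="one_to_one",  # raise if a URI appears twice on either side
    indicator=True,         # adds a "_merge" column: "both" or "left_only"
)
print(merged[["track_uri", "tempo", "_merge"]])
```

Rows marked `left_only` are tracks for which no audio features were found.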


In [20]:
df_artist_track_merged.columns

Index(['track_uri', 'track_name', 'artist_uri', 'artist_info', 'artist_name',
       'artist_pop', 'artist_genres', 'album', 'track_pop', 'danceability',
       'energy', 'key', 'loudness', 'mode', 'speechiness', 'acousticness',
       'instrumentalness', 'liveness', 'valence', 'tempo', 'type', 'id', 'uri',
       'track_href', 'analysis_url', 'duration_ms', 'time_signature'],
      dtype='object')

In [23]:
# keep only the useful columns
df_artist_track_merged[['track_uri', 'track_name', 'artist_uri', 'artist_info', 'artist_name',
       'artist_genres', 'album', 'danceability', 'energy', 'loudness', 'speechiness', 'acousticness',
       'instrumentalness', 'liveness', 'valence', 'tempo', 'type', 'id', 'uri',
       'track_href']]

Unnamed: 0,track_uri,track_name,artist_uri,artist_info,artist_name,artist_genres,album,danceability,energy,loudness,speechiness,acousticness,instrumentalness,liveness,valence,tempo,type,id,uri,track_href
0,spotify:track:3k79jB4aGmMDUQzEwa46Rz,vampire,spotify:artist:1McMsnEElThX1knmY4oliG,{'external_urls': {'spotify': 'https://open.sp...,Olivia Rodrigo,[pop],vampire,0.511,0.532,-5.745,0.056,0.169,0.0,0.311,0.322,137.827,audio_features,3k79jB4aGmMDUQzEwa46Rz,spotify:track:3k79jB4aGmMDUQzEwa46Rz,https://api.spotify.com/v1/tracks/3k79jB4aGmMD...
1,spotify:track:1BxfuPKGuaTgP7aM0Bbdwr,Cruel Summer,spotify:artist:06HL4z0CvFAxyc27GXpf02,{'external_urls': {'spotify': 'https://open.sp...,Taylor Swift,[pop],Lover,0.552,0.702,-5.707,0.157,0.117,2.1e-05,0.105,0.564,169.994,audio_features,1BxfuPKGuaTgP7aM0Bbdwr,spotify:track:1BxfuPKGuaTgP7aM0Bbdwr,https://api.spotify.com/v1/tracks/1BxfuPKGuaTg...
2,spotify:track:3qQbCzHBycnDpGskqOWY0E,Ella Baila Sola,spotify:artist:0XeEobZplHxzM9QzFQWLiR,{'external_urls': {'spotify': 'https://open.sp...,Eslabon Armado,"[corrido, corridos tumbados, sad sierreno, sie...",DESVELADO,0.668,0.758,-5.176,0.0332,0.483,1.9e-05,0.0837,0.834,147.989,audio_features,3qQbCzHBycnDpGskqOWY0E,spotify:track:3qQbCzHBycnDpGskqOWY0E,https://api.spotify.com/v1/tracks/3qQbCzHBycnD...
3,spotify:track:7ro0hRteUMfnOioTFI5TG1,WHERE SHE GOES,spotify:artist:4q3ewBCX7sLwd24euuV69X,{'external_urls': {'spotify': 'https://open.sp...,Bad Bunny,"[reggaeton, trap latino, urbano latino]",WHERE SHE GOES,0.652,0.8,-4.019,0.0614,0.143,0.629,0.112,0.234,143.978,audio_features,7ro0hRteUMfnOioTFI5TG1,spotify:track:7ro0hRteUMfnOioTFI5TG1,https://api.spotify.com/v1/tracks/7ro0hRteUMfn...
4,spotify:track:4DHcnVTT87F0zZhRPYmZ3B,Flowers,spotify:artist:5YGY8feqx7naU7z4HrwZM6,{'external_urls': {'spotify': 'https://open.sp...,Miley Cyrus,[pop],Endless Summer Vacation,0.707,0.681,-4.325,0.0668,0.0632,5e-06,0.0322,0.646,117.999,audio_features,4DHcnVTT87F0zZhRPYmZ3B,spotify:track:4DHcnVTT87F0zZhRPYmZ3B,https://api.spotify.com/v1/tracks/4DHcnVTT87F0...
5,spotify:track:2UW7JaomAMuX9pZrjVpHAU,La Bebe - Remix,spotify:artist:1NNRWkhwmcXRimFYSBpB1y,{'external_urls': {'spotify': 'https://open.sp...,Yng Lvcas,[reggaeton],La Bebe (Remix),0.812,0.479,-5.678,0.333,0.213,1e-06,0.0756,0.559,169.922,audio_features,2UW7JaomAMuX9pZrjVpHAU,spotify:track:2UW7JaomAMuX9pZrjVpHAU,https://api.spotify.com/v1/tracks/2UW7JaomAMuX...
6,spotify:track:6pD0ufEQq0xdHSsRbg9LBK,un x100to,spotify:artist:6XkjpgcEsYab502Vr1bBeW,{'external_urls': {'spotify': 'https://open.sp...,Grupo Frontera,[musica chihuahuense],un x100to,0.569,0.724,-4.076,0.0474,0.228,0.0,0.27,0.562,83.118,audio_features,6pD0ufEQq0xdHSsRbg9LBK,spotify:track:6pD0ufEQq0xdHSsRbg9LBK,https://api.spotify.com/v1/tracks/6pD0ufEQq0xd...
7,spotify:track:1odExI7RdWc4BT515LTAwj,Daylight,spotify:artist:33NVpKoXjItPwUJTMZIOiY,{'external_urls': {'spotify': 'https://open.sp...,David Kushner,"[gen z singer-songwriter, singer-songwriter pop]",Daylight,0.508,0.43,-9.475,0.0335,0.83,0.000441,0.093,0.324,130.09,audio_features,1odExI7RdWc4BT515LTAwj,spotify:track:1odExI7RdWc4BT515LTAwj,https://api.spotify.com/v1/tracks/1odExI7RdWc4...
8,spotify:track:4Dvkj6JhhA12EX05fT7y2e,As It Was,spotify:artist:6KImCVD70vtIoJWnq6nGn3,{'external_urls': {'spotify': 'https://open.sp...,Harry Styles,[pop],Harry's House,0.52,0.731,-5.338,0.0557,0.342,0.00101,0.311,0.662,173.93,audio_features,4Dvkj6JhhA12EX05fT7y2e,spotify:track:4Dvkj6JhhA12EX05fT7y2e,https://api.spotify.com/v1/tracks/4Dvkj6JhhA12...
9,spotify:track:1Qrg8KqiBpW07V7PNxwwwL,Kill Bill,spotify:artist:7tYKF4w9nC0nq9CsPZTHyP,{'external_urls': {'spotify': 'https://open.sp...,SZA,"[pop, r&b, rap]",SOS,0.644,0.735,-5.747,0.0391,0.0521,0.144,0.161,0.418,88.98,audio_features,1Qrg8KqiBpW07V7PNxwwwL,spotify:track:1Qrg8KqiBpW07V7PNxwwwL,https://api.spotify.com/v1/tracks/1Qrg8KqiBpW0...
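
The selection above is only displayed; `df_artist_track_merged` itself still holds every column. To reuse the reduced table later, the result can be assigned to a new variable. A minimal sketch on a toy frame (the column choice in `keep` is only an illustration):

```python
import pandas as pd

# Toy frame with a few of the merged columns
df = pd.DataFrame(
    {
        "track_uri": ["spotify:track:t1"],
        "track_name": ["a"],
        "tempo": [120.0],
        "type": ["audio_features"],
        "uri": ["spotify:track:t1"],
    }
)

# Hypothetical choice of columns to keep
keep = ["track_uri", "track_name", "tempo"]
df_selected = df[keep].copy()  # plain indexing leaves df unchanged
print(df_selected.columns.tolist())  # ['track_uri', 'track_name', 'tempo']
```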
