***Info for: pull_spotify_data.ipynb file***

This notebook uses the Spotipy library to query the Spotify Web API and generate the features_df.csv and track_df.csv files that we use for our data processing and analysis. To pull the data yourself, pair it with a config.py file containing your Client ID and Client Secret (generated through the Spotify developer portal); otherwise, use the CSV files already included in this repository.

In [1]:

# Import dependencies (pandas builds the data frames that are then saved as CSV files)
import pandas as pd

In [2]:
# Import the Client ID and Client Secret from config.py
from config import cid, secret
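
The config.py file itself is not shown here. A minimal sketch of what it could look like follows, assuming the credentials are read from environment variables rather than hard-coded. Only the names `cid` and `secret` come from the import above; the environment variable names and the placeholder fallbacks are assumptions for illustration.

```python
# config.py (hypothetical sketch; keep this file out of version control)
import os

# Assumption: the credentials are exported as environment variables.
# The fallback strings are placeholders, not working keys.
cid = os.environ.get("SPOTIPY_CLIENT_ID", "your-client-id")
secret = os.environ.get("SPOTIPY_CLIENT_SECRET", "your-client-secret")
```

Reading the values from the environment is one way to keep real keys out of the repository; pasting them directly into config.py also works as long as the file is never committed.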


In [3]:
# Spotipy client for the Spotify Web API (used to pull data)
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials
client_credentials_manager = SpotifyClientCredentials(client_id=cid, client_secret=secret)
sp = spotipy.Spotify(client_credentials_manager=client_credentials_manager)

In [4]:
# Define the lists that will become the columns of the CSV files
artist_name = []
track_name = []
popularity = []
track_id = []
audio_features = []

# Loop through the first 1000 search results for the year 2023 (50 tracks per request)
for offset in range(0, 1000, 50):
    track_results = sp.search(q='year:2023', type='track', limit=50, offset=offset)
    for t in track_results['tracks']['items']:
        artist_name.append(t['artists'][0]['name'])
        track_name.append(t['name'])
        track_id.append(t['id'])
        popularity.append(t['popularity'])
        # audio_features returns a list, so each entry appended here is a one-item list
        audio_features.append(sp.audio_features(t['id']))
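
The loop above makes one audio-features request per track. Spotipy's `audio_features` also accepts a list of track IDs (up to 100 per call), so the requests could be batched. The sketch below is one possible way to do that, not the approach used in this notebook; `chunked` and `fetch_audio_features` are hypothetical helper names, and `client` is assumed to be an object like `sp`.

```python
def chunked(ids, size=100):
    """Yield consecutive slices of ids, each with at most `size` items."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def fetch_audio_features(client, ids, batch_size=100):
    """Collect audio features for all ids using one request per batch.

    Assumes client.audio_features(list_of_ids) returns one entry per id,
    in the same order, as Spotipy's method does.
    """
    features = []
    for batch in chunked(ids, batch_size):
        features.extend(client.audio_features(batch))
    return features
```

With this helper, `fetch_audio_features(sp, track_id)` would return a flat list with one entry per track instead of a list of one-item lists, so a later step that unwraps each entry with `.str[0]` would need to be adjusted accordingly.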

In [5]:
# Make a dataframe with the track name, artist name, track id, and popularity of the track
track_df = pd.DataFrame({'artist_name' : artist_name, 'track_name' : track_name, 'track_id' : track_id, 'popularity' : popularity})
print(track_df.shape)
track_df.head()

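# Make a dataframe with the audio features of each track (the 'id' column holds the track ID)
features_df = pd.DataFrame({'AudioFeatures' : audio_features})
# Each entry is a one-item list: take its first element (a dict) and expand it into columns
features_df = features_df['AudioFeatures'].str[0].apply(pd.Series)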

# Save both the dataframes to csv files
track_df.to_csv('track_df.csv')
features_df.to_csv('features_df.csv')

features_df.head()

(1000, 4)


Unnamed: 0,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,type,id,uri,track_href,analysis_url,duration_ms,time_signature
0,0.847,0.622,1,-6.747,0,0.0903,0.119,0.0,0.285,0.22,130.001,audio_features,4rXLjWdF2ZZpXCVTfWcshS,spotify:track:4rXLjWdF2ZZpXCVTfWcshS,https://api.spotify.com/v1/tracks/4rXLjWdF2ZZp...,https://api.spotify.com/v1/audio-analysis/4rXL...,125040,4
1,0.511,0.532,5,-5.745,1,0.056,0.169,0.0,0.311,0.322,137.827,audio_features,3k79jB4aGmMDUQzEwa46Rz,spotify:track:3k79jB4aGmMDUQzEwa46Rz,https://api.spotify.com/v1/tracks/3k79jB4aGmMD...,https://api.spotify.com/v1/audio-analysis/3k79...,219724,4
2,0.557,0.774,7,-5.275,0,0.351,0.012,0.0,0.396,0.397,111.975,audio_features,67nepsnrcZkowTxMWigSbb,spotify:track:67nepsnrcZkowTxMWigSbb,https://api.spotify.com/v1/tracks/67nepsnrcZko...,https://api.spotify.com/v1/audio-analysis/67ne...,246134,4
3,0.712,0.603,8,-5.52,1,0.0262,0.186,0.0,0.115,0.67,97.994,audio_features,1Lo0QY9cvc8sUB2vnIOxDT,spotify:track:1Lo0QY9cvc8sUB2vnIOxDT,https://api.spotify.com/v1/tracks/1Lo0QY9cvc8s...,https://api.spotify.com/v1/audio-analysis/1Lo0...,265493,4
4,0.444,0.0911,0,-17.665,1,0.0307,0.959,1e-06,0.098,0.142,78.403,audio_features,6wf7Yu7cxBSPrRlWeSeK0Q,spotify:track:6wf7Yu7cxBSPrRlWeSeK0Q,https://api.spotify.com/v1/tracks/6wf7Yu7cxBSP...,https://api.spotify.com/v1/audio-analysis/6wf7...,222370,4
