# Getting data 3 - audio features of sample dr artist album tracks

Now that I have the sample of dr tracks, I need to get the audio features of those tracks.

I have a CSV with the following data:
- track id
- track artist
- track title
- track explicit flag

<br/>

Next steps are:
- getting audio features from dr tracks

In [1]:
# importing needed libraries

import pandas as pd
import numpy as np
import getpass
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

In [8]:
# importing data csv

dr_sample_tracks = pd.read_csv('../data/dr_tracks_id_title_expl_artist_album.csv')

In [9]:
# checking the columns

dr_sample_tracks.columns

Index(['Unnamed: 0', 'id', 'track_title', 'track_explicit', 'track_artist'], dtype='object')

In [10]:
# removing the unneeded index column

dr_sample_tracks = dr_sample_tracks.drop(['Unnamed: 0'], axis = 1)
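
An `Unnamed: 0` column usually appears when a DataFrame was saved together with its index and then read back. The sketch below reproduces that with a toy frame, assuming the CSV was written with the default `to_csv` settings (an assumption, since the notebook that wrote the file is not shown here). The name `toy_df` and the buffer names are made up for illustration.

```python
import io

import pandas as pd

# toy frame standing in for the real track data (hypothetical values)
toy_df = pd.DataFrame({"id": ["a", "b"], "track_title": ["x", "y"]})

# default to_csv writes the index as an unnamed first column
buffer_with_index = io.StringIO()
toy_df.to_csv(buffer_with_index)
buffer_with_index.seek(0)
reloaded_with_index = pd.read_csv(buffer_with_index)

# index=False leaves the index out of the file
buffer_without_index = io.StringIO()
toy_df.to_csv(buffer_without_index, index=False)
buffer_without_index.seek(0)
reloaded_without_index = pd.read_csv(buffer_without_index)

print(list(reloaded_with_index.columns))     # ['Unnamed: 0', 'id', 'track_title']
print(list(reloaded_without_index.columns))  # ['id', 'track_title']
```

Writing with `index=False`, or reading with `index_col=0`, would be two ways to avoid the extra `drop` step.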

In [12]:
# checking length 

len(dr_sample_tracks)

28077

# Getting audio features

To get the audio features I will use spotipy's `audio_features(tracks=[])` function, which takes a list of track ids and returns one dict of audio features per track.

In [15]:
# entering the API credentials without showing them in the notebook

client_id = getpass.getpass('client_id?')
client_secret = getpass.getpass('client_secret?')

client_id?········
client_secret?········


In [16]:
# connecting to the Spotify API

sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials(client_id=client_id,
                                                           client_secret=client_secret))

In [30]:
# creating list of all track ids from df to iterate over to get audio features

track_ids = list(dr_sample_tracks['id'].unique())

In [19]:
# building a function to split the track ids in track_ids into chunks
# the audio features endpoint only accepts a limited number of ids per request
# (at most 100), so I send 50 tracks at a time

def chunks(lst, n):
    """Yield successive n-sized chunks from lst."""
    for i in range(0, len(lst), n):
        yield lst[i:i + n]
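
To see what the generator produces, here is a small sketch that runs it on a short list. The ids in `example_ids` are placeholders, not real Spotify ids.

```python
def chunks(lst, n):
    """Yield successive n-sized chunks from lst."""
    for i in range(0, len(lst), n):
        yield lst[i:i + n]


# placeholder ids, only for illustration
example_ids = ["id1", "id2", "id3", "id4", "id5", "id6", "id7"]

# the last chunk is shorter when the list length is not a multiple of n
example_chunks = list(chunks(example_ids, 3))
print(example_chunks)
# [['id1', 'id2', 'id3'], ['id4', 'id5', 'id6'], ['id7']]
```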

In [31]:
# getting the audio features for the dr sample track ids
# iterating over the track id list with the chunks function

dr_sample_tracks_audiofeatures = []

for chunk in list(chunks(track_ids, 50)):
    dr_sample_tracks_audiofeatures.append(sp.audio_features(chunk))

In [32]:
# turning the audio feature list into a df (each cell still holds the dict of one track)

dr_sample_tracks_audiofeatures_df = pd.DataFrame(dr_sample_tracks_audiofeatures)

In [33]:
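# we need to unpack those dicts
# each chunk is a list of dicts, so I build one df per chunk and concatenate them
# (DataFrame.append was removed in pandas 2.0, so pd.concat is used instead)

chunk_dfs = []

for chunk_features in dr_sample_tracks_audiofeatures:
    try:
        chunk_dfs.append(pd.DataFrame(chunk_features))
    except Exception:
        # skipping chunks that cannot be turned into a df
        continue

dr_sample_tracks_audiofeatures_df = pd.concat(chunk_dfs)

dr_sample_tracks_audiofeatures_df.head()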

Unnamed: 0,0,acousticness,analysis_url,danceability,duration_ms,energy,id,instrumentalness,key,liveness,loudness,mode,speechiness,tempo,time_signature,track_href,type,uri,valence
0,,0.568,https://api.spotify.com/v1/audio-analysis/5mrN...,0.726,173003.0,0.684,5mrNkGIEJ0pSdXIR0eMwZQ,2e-06,6.0,0.131,-6.125,0.0,0.387,145.955,4.0,https://api.spotify.com/v1/tracks/5mrNkGIEJ0pS...,audio_features,spotify:track:5mrNkGIEJ0pSdXIR0eMwZQ,0.318
1,,0.636,https://api.spotify.com/v1/audio-analysis/1EFb...,0.765,143867.0,0.646,1EFbLCDFWIIiC6IMgthYsN,5e-06,1.0,0.118,-5.292,1.0,0.308,173.915,4.0,https://api.spotify.com/v1/tracks/1EFbLCDFWIIi...,audio_features,spotify:track:1EFbLCDFWIIiC6IMgthYsN,0.541
2,,0.138,https://api.spotify.com/v1/audio-analysis/3czN...,0.551,196682.0,0.638,3czNqQU3YvEcxaxIz16nb7,0.000339,9.0,0.147,-5.8,1.0,0.367,94.261,4.0,https://api.spotify.com/v1/tracks/3czNqQU3YvEc...,audio_features,spotify:track:3czNqQU3YvEcxaxIz16nb7,0.158
3,,0.143,https://api.spotify.com/v1/audio-analysis/4oqj...,0.708,227509.0,0.76,4oqjF0RgQmtGGogG3paqOT,4e-05,7.0,0.111,-4.53,1.0,0.442,170.071,4.0,https://api.spotify.com/v1/tracks/4oqjF0RgQmtG...,audio_features,spotify:track:4oqjF0RgQmtGGogG3paqOT,0.359
4,,0.521,https://api.spotify.com/v1/audio-analysis/5E3d...,0.724,130719.0,0.708,5E3dnmFiGepqED3KpwJalV,0.0,8.0,0.165,-6.23,1.0,0.287,127.914,4.0,https://api.spotify.com/v1/tracks/5E3dnmFiGepq...,audio_features,spotify:track:5E3dnmFiGepqED3KpwJalV,0.554
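
The Spotify API can return `None` in place of a track's features when none are available, which may explain the empty `0` column above. As an alternative, the nested list could be flattened and the `None` entries dropped before building the DataFrame. The sketch below uses made-up data in `nested_features` instead of a real API response.

```python
import pandas as pd

# hypothetical stand-in for the list of chunk results (not real API output)
nested_features = [
    [{"id": "a", "tempo": 120.0}, None],
    [{"id": "b", "tempo": 95.5}],
    [None],
]

# flatten the chunks and drop entries where no features were returned
flat_features = [
    features
    for chunk_features in nested_features
    for features in chunk_features
    if features is not None
]

# one row per track, one column per audio feature
features_df = pd.DataFrame(flat_features)
print(features_df)
```

With this approach no chunk has to be skipped as a whole, so tracks that share a chunk with a missing entry are still kept.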


In [34]:
# comparing the number of rows with audio features to the number of sample tracks

len(dr_sample_tracks_audiofeatures_df), len(dr_sample_tracks)

(26735, 28077)
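
The two lengths differ. One possible reason (an assumption, not verified here) is that some track ids occur more than once in the sample, since the features were requested for unique ids only. A quick check could look like the sketch below, shown on a toy frame named `toy_tracks`.

```python
import pandas as pd

# toy frame with one repeated id (hypothetical values)
toy_tracks = pd.DataFrame({
    "id": ["a", "b", "a", "c"],
    "track_title": ["x", "y", "x", "z"],
})

n_rows = len(toy_tracks)
n_unique_ids = toy_tracks["id"].nunique()
# duplicated() marks every repeat after the first occurrence
n_duplicate_rows = int(toy_tracks["id"].duplicated().sum())

print(n_rows, n_unique_ids, n_duplicate_rows)  # 4 3 1
```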

In [35]:
# merging both dfs

dr_sample_tracks_audiofeatures_df_merged = pd.merge(dr_sample_tracks, dr_sample_tracks_audiofeatures_df, how='left', on=['id'])
dr_sample_tracks_audiofeatures_df_merged.tail()

Unnamed: 0,id,track_title,track_explicit,track_artist,0,acousticness,analysis_url,danceability,duration_ms,energy,...,liveness,loudness,mode,speechiness,tempo,time_signature,track_href,type,uri,valence
28072,6T3HETqZz2jWruWfQdMnnM,Gold,True,Jalle,,0.249,https://api.spotify.com/v1/audio-analysis/6T3H...,0.883,236190.0,0.565,...,0.114,-9.079,1.0,0.0527,126.048,4.0,https://api.spotify.com/v1/tracks/6T3HETqZz2jW...,audio_features,spotify:track:6T3HETqZz2jWruWfQdMnnM,0.17
28073,4x2ZGBnrYWKOHnd24PZUqQ,Gang Gang,True,Jalle,,0.505,https://api.spotify.com/v1/audio-analysis/4x2Z...,0.847,162462.0,0.548,...,0.108,-9.007,1.0,0.0694,129.992,4.0,https://api.spotify.com/v1/tracks/4x2ZGBnrYWKO...,audio_features,spotify:track:4x2ZGBnrYWKOHnd24PZUqQ,0.0802
28074,5akBfZirObWAdp2ZlY5zmo,Was du laberst,True,Jalle,,0.00801,https://api.spotify.com/v1/audio-analysis/5akB...,0.875,173015.0,0.666,...,0.0986,-10.176,0.0,0.226,149.942,4.0,https://api.spotify.com/v1/tracks/5akBfZirObWA...,audio_features,spotify:track:5akBfZirObWAdp2ZlY5zmo,0.573
28075,7vHJDGjGiJQrGORYZuXpVa,Fuck12,True,Jalle,,0.00219,https://api.spotify.com/v1/audio-analysis/7vHJ...,0.79,173514.0,0.72,...,0.11,-9.051,0.0,0.113,148.031,4.0,https://api.spotify.com/v1/tracks/7vHJDGjGiJQr...,audio_features,spotify:track:7vHJDGjGiJQrGORYZuXpVa,0.365
28076,3EYwKISvUTT2rFbK0eIBVE,Alles brennt,True,Jalle,,0.0818,https://api.spotify.com/v1/audio-analysis/3EYw...,0.662,192414.0,0.665,...,0.141,-6.311,0.0,0.0534,144.847,4.0,https://api.spotify.com/v1/tracks/3EYwKISvUTT2...,audio_features,spotify:track:3EYwKISvUTT2rFbK0eIBVE,0.153


In [36]:
# checking that the left merge kept all rows of the sample

len(dr_sample_tracks_audiofeatures_df_merged)

28077
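
The row count only shows that no sample track was lost. It does not show how many tracks ended up without audio features. A minimal sketch of one way to check this uses the `indicator` parameter of `pd.merge`. The frames `toy_tracks` and `toy_features` are made up for illustration.

```python
import pandas as pd

# toy frames standing in for the track sample and the audio features
toy_tracks = pd.DataFrame({"id": ["a", "b", "c"], "track_title": ["x", "y", "z"]})
toy_features = pd.DataFrame({"id": ["a", "c"], "tempo": [120.0, 95.5]})

# indicator=True adds a _merge column telling where each row was found
toy_merged = pd.merge(toy_tracks, toy_features, how="left", on="id", indicator=True)

# rows marked left_only have no matching audio features
missing_features = toy_merged[toy_merged["_merge"] == "left_only"]

print(len(toy_merged), len(missing_features))  # 3 1
```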

In [37]:
# saving df as csv

dr_sample_tracks_audiofeatures_df_merged.to_csv('../data/dr_sample_audio_features.csv')