In [97]:
import pandas as pd

# Part 1: Spotify API

This notebook walks through calling the Spotify API to obtain audio features for each song in a playlist. It is the first of three notebooks in my supervised learning capstone.

Data was obtained from Spotify using *Spotipy*, a lightweight Python library for the Spotify Web API.  

http://spotipy.readthedocs.io/en/latest/#

### API Authorization 

In [98]:
# Import spotipy
import spotipy
import spotipy.util as util

# Client credentials
cid = "5b14ab139b5b428abb0631807fbe753d"
secret = "50cc15b5c98d412e8bb5d6f81e44696c"
username = 'sdvzhr40fdjbtr1dp7mbnlfk4'

# Redirect URI on localhost (must match the URI registered for the app)
redirect = "http://localhost:8888/callback"

# Authorization scope: read access to private playlists
scope = 'playlist-read-private'

# Generate access token
token = util.prompt_for_user_token(username, scope=scope, client_id=cid, client_secret=secret, redirect_uri=redirect)
sp = spotipy.Spotify(auth=token)
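
Hard-coding the client ID and secret in a shared notebook exposes them to anyone who can read it. Below is a minimal sketch of one alternative, assuming the credentials were exported as environment variables before the notebook was started. Spotipy documents the variable names `SPOTIPY_CLIENT_ID`, `SPOTIPY_CLIENT_SECRET`, and `SPOTIPY_REDIRECT_URI`; the helper name `load_credentials` is hypothetical and introduced here only for illustration.

```python
import os


def load_credentials(env=os.environ):
    """Read Spotify credentials from a mapping (by default the process environment).

    `load_credentials` is a hypothetical helper, not part of Spotipy.
    """
    names = ("SPOTIPY_CLIENT_ID", "SPOTIPY_CLIENT_SECRET", "SPOTIPY_REDIRECT_URI")

    # Fail early with a readable message if anything is unset or empty
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))

    # Keys match the keyword arguments used in the authorization cell above
    return {
        "client_id": env["SPOTIPY_CLIENT_ID"],
        "client_secret": env["SPOTIPY_CLIENT_SECRET"],
        "redirect_uri": env["SPOTIPY_REDIRECT_URI"],
    }
```

The returned dictionary could then be unpacked into the existing call, for example `util.prompt_for_user_token(username, scope=scope, **load_credentials())`.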

### Playlist IDs

In [99]:
# Uncomment one playlist_id below to get its audio features.
# Repeat the notebook for both playlists.

# Like
#playlist_id = '1qKLFl2MHnP0RP8xiD26aB'

# Dislike
#playlist_id = '6z3MjcPbgCiAIJYfrPevin'

### Query Track Information From Playlist

Code for this section inspired by:

https://stackoverflow.com/questions/39086287/spotipy-how-to-read-more-than-100-tracks-from-a-playlist

Thanks ackleyrc!

In [100]:
# Request the first page of tracks in the playlist
playlist = sp.user_playlist_tracks(username, playlist_id)

# Store playlist items in tracks variable
tracks = playlist['items']

# The API returns at most 100 items per request
# Use sp.next to fetch each following page until none remain
# Use extend to append each page's items to tracks
while playlist['next']:
    playlist = sp.next(playlist)
    tracks.extend(playlist['items'])
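
The paging loop above can also be wrapped in a small function so it can be reused for both playlists and tried out without network access. This is only a sketch: `collect_all_items` is a hypothetical name, and it assumes each page is a dictionary with `'items'` and `'next'` keys, as in the cell above.

```python
def collect_all_items(first_page, fetch_next):
    """Gather the items from every page of a paged response.

    first_page: dict with 'items' (list) and 'next' (falsy on the last page)
    fetch_next: callable that takes the current page and returns the next one
    """
    # Copy so the caller's first page is not modified in place
    items = list(first_page["items"])
    page = first_page

    # Keep following 'next' until the last page is reached
    while page["next"]:
        page = fetch_next(page)
        items.extend(page["items"])

    return items
```

In this notebook the call would look like `tracks = collect_all_items(sp.user_playlist_tracks(username, playlist_id), sp.next)`.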

In [101]:
# Verify all tracks were extracted  
len(tracks)

517

### Extract Song ID from Track Information

In [102]:
# Information is stored as nested dictionaries
# Path to the value I want: tracks (list) --> item (dict) --> track (dict) --> id (str)

# Create empty ids list
ids = []

# Loop through tracks
for item in tracks:

    # Get song id from track dictionary
    song_id = item['track']['id']

    # Append to master ids list
    ids.append(song_id)
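
The loop above assumes every playlist item carries a track with an ID. As a hedged sketch, the variant below skips items where `'track'` or its `'id'` is missing or `None`. I am assuming this can happen for entries such as local files, and I have not verified it for these two playlists. `extract_track_ids` is a hypothetical helper name.

```python
def extract_track_ids(items):
    """Return the track id of every playlist item that has one."""
    ids = []
    for item in items:
        # 'track' may be missing or None for unavailable entries (assumption)
        track = item.get("track")

        # Keep only tracks with a non-empty id
        if track and track.get("id"):
            ids.append(track["id"])
    return ids
```

If this were used in place of the loop, `len(ids)` could come out smaller than `len(tracks)`, so the length check in the next cell would then report how many items were skipped.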

In [112]:
# Confirm all ids extracted.  Should match length of tracks.
len(ids)

517

### Get Audio Features Using Song ID

In [113]:
# Create empty list to store each song's audio features
# sp.audio_features returns a list, so each entry is a one-item list holding a dictionary
features = []

# Loop through ids
for song_id in ids:

    # Get song features for each item
    song_feature = sp.audio_features(song_id)

    # Append to master features list
    features.append(song_feature)
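
The loop above makes one request per song (517 requests for this playlist). Spotipy's `audio_features` also accepts a list of up to 100 IDs, so the same data could be fetched in a handful of requests. The sketch below shows the batching logic only; `chunked` and `fetch_features_in_batches` are hypothetical helper names, and `fetch` stands in for `sp.audio_features`.

```python
def chunked(seq, size):
    """Yield consecutive slices of seq with at most `size` elements each."""
    for start in range(0, len(seq), size):
        yield seq[start:start + size]


def fetch_features_in_batches(ids, fetch, batch_size=100):
    """Call fetch once per batch of ids and return one flat list of results."""
    results = []
    for batch in chunked(ids, batch_size):
        # fetch is expected to return one entry per id in the batch
        results.extend(fetch(batch))
    return results
```

A call such as `fetch_features_in_batches(ids, sp.audio_features)` would return a flat list of dictionaries rather than the list of one-item lists built above, so the extraction step that follows would need to be adjusted accordingly.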

In [114]:
# Confirm length of list matches len(ids)
print(len(features))

517


### Extract Audio Features to Data Frame

In [106]:
# The information I want is in a list of lists of dictionaries.
# Audio features for each song are stored in a dictionary.
# Each dictionary is wrapped in a one-item list, which is in turn
# an item of the features list.

# [[{}], [{}], ..., [{}]]

# Goal: format data into a list of dictionaries to populate the df

# Examine data structure 
# features[0:1]

In [115]:
# Create empty list to store extracted key:value pairs
features_list = []

# Loop through features list to pull out each song's dictionary
for song_feature in features:

    # Each entry is a one-item list; copy the dictionary it holds
    dict_item = dict(song_feature[0])

    # Store dictionary in list
    features_list.append(dict_item)
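
The extraction above will fail if any entry holds `None` instead of a dictionary, which I am assuming the API may return for a track it has no audio features for (not observed in this run, where all 517 songs came back). A defensive sketch of the same flattening step follows; `flatten_features` is a hypothetical helper name.

```python
def flatten_features(nested):
    """Turn [[{...}], [{...}], ...] into [{...}, {...}, ...].

    Entries that are empty or hold None are skipped.
    """
    flat = []
    for entry in nested:
        # Skip empty lists and None placeholders (assumed possible)
        if entry and entry[0] is not None:
            # Copy so the original dictionaries are left untouched
            flat.append(dict(entry[0]))
    return flat
```

The result can be passed to `pd.DataFrame` in the same way as `features_list`.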

In [116]:
# Confirm length 
len(features_list)

517

In [117]:
# Create dataframe
features = pd.DataFrame(features_list)

# Check it out for good measure
features.head()

Unnamed: 0,acousticness,analysis_url,danceability,duration_ms,energy,id,instrumentalness,key,liveness,loudness,mode,speechiness,tempo,time_signature,track_href,type,uri,valence
0,0.422,https://api.spotify.com/v1/audio-analysis/5X39...,0.582,200373,0.452,5X39KNrxjqJCJHcG1pqWRZ,0.00256,1,0.296,-12.315,1,0.0439,96.277,4,https://api.spotify.com/v1/tracks/5X39KNrxjqJC...,audio_features,spotify:track:5X39KNrxjqJCJHcG1pqWRZ,0.406
1,0.0155,https://api.spotify.com/v1/audio-analysis/4YaN...,0.566,283538,0.587,4YaNLEPw3MrIgkGOkBrAh2,2.1e-05,6,0.111,-5.96,1,0.0308,82.046,4,https://api.spotify.com/v1/tracks/4YaNLEPw3MrI...,audio_features,spotify:track:4YaNLEPw3MrIgkGOkBrAh2,0.246
2,0.625,https://api.spotify.com/v1/audio-analysis/0Ktz...,0.446,236795,0.303,0KtzMx1GbkkPyA2TQceHoM,0.00136,0,0.0596,-7.031,1,0.0306,120.212,4,https://api.spotify.com/v1/tracks/0KtzMx1GbkkP...,audio_features,spotify:track:0KtzMx1GbkkPyA2TQceHoM,0.057
3,0.0381,https://api.spotify.com/v1/audio-analysis/3VXr...,0.579,208018,0.687,3VXrpkM94UBgL4voR20tZq,0.0107,9,0.155,-6.859,0,0.0596,86.97,4,https://api.spotify.com/v1/tracks/3VXrpkM94UBg...,audio_features,spotify:track:3VXrpkM94UBgL4voR20tZq,0.545
4,0.00534,https://api.spotify.com/v1/audio-analysis/4x6a...,0.663,271136,0.84,4x6aawsLyPkF3aGPXGvbat,0.342,9,0.169,-5.618,0,0.0453,108.022,4,https://api.spotify.com/v1/tracks/4x6aawsLyPkF...,audio_features,spotify:track:4x6aawsLyPkF3aGPXGvbat,0.2


### Write to CSV

In [118]:
# Specify columns to keep. Drop columns not related to audio features.
columns = ['acousticness', 'danceability', 'duration_ms', 'energy', 'instrumentalness', 'key', 'liveness',
           'loudness', 'mode', 'speechiness', 'tempo', 'time_signature', 'valence']

# Export to csv. Change the file name to match the playlist being exported.
features.to_csv('disliked_playlist.csv', columns=columns)

# Remember to repeat for both playlists!