# How to get data from Spotify


*Author: Marine JACQUEMIN-LORRIAUX*

Here is a code to extract data from Spotify, using Spotify API.

To do so, here are the requirements:
- Create a Spotify account and get a user name
- Get your Credentials
- Get your secret key

Got all this ? Let's get started ! 

In [1]:
import spotipy
import pandas as pd
from spotipy.oauth2 import SpotifyClientCredentials


# authenticate and connect to the API
client_credentials_manager = SpotifyClientCredentials(client_id='*****',   #insert your client ID
                                                      client_secret='*****') #insert your secret key
sp = spotipy.Spotify(client_credentials_manager=client_credentials_manager)


## Extract data from a given playlist

To extract data from a given public playlist, you will need:
 - the name of whoever creates the playlist
 - the playlist ID
 
In this example, I'll get the complete Spotify playlist of the most played songs in 2018. 

As the user_playlist function is by default limited to 100 songs, let's create first the **get_playlist_tracks function** that gets the tracks of the whole playlist.
Then, we loop over the tracks to return a list of the tracks' ids.

In [7]:
# get track ids from playlist

def get_playlist_tracks(username,playlist_id):
    results = sp.user_playlist_tracks(username,playlist_id)
    tracks = results['items']
    while results['next']:
        results = sp.next(results)
        tracks.extend(results['items'])
    return tracks

tracks = get_playlist_tracks('Spotify', '37i9dQZF1DX1HUbZS4LEyL') # insert here author name and playlist id

ids = []
for item in tracks:
    track = item['track']
    ids.append(track['id'])

We can now get the features we need for each track

In [8]:
# get song info and audio analysis from song ids
def getTrackFeatures(id):
      meta = sp.track(id)
      features = sp.audio_features(id)

#Features can be removed/added according to the needs.
      # Meta
      name = meta['name']
      album = meta['album']['name']
      artist = meta['album']['artists'][0]['name']
      release_date = meta['album']['release_date']
      duration_ms = meta['duration_ms']
      popularity = meta['popularity']
      explicit = meta['explicit']
      available_markets = meta["available_markets"]
      #image_url = meta['album']['images'][1]['url'] #get the 300x300 format album image

      # Features
      acousticness = features[0]['acousticness']
      danceability = features[0]['danceability']
      energy = features[0]['energy']
      instrumentalness = features[0]['instrumentalness']
      liveness = features[0]['liveness']
      loudness = features[0]['loudness']
      speechiness = features[0]['speechiness']
      tempo = features[0]['tempo']
      valence = features[0]['valence']
      time_signature = features[0]['time_signature']

      track = [name, album, artist, release_date, duration_ms, popularity,explicit,available_markets, danceability, acousticness, danceability, energy, instrumentalness, liveness, loudness, speechiness, tempo, time_signature]
      return track

In [9]:
# loop over track ids to create dataset
tracks = []
for i in range(0, len(ids)):
        track = getTrackFeatures(ids[i])
        tracks.append(track)

df = pd.DataFrame(tracks, columns = ['name', 'album', 'artist', 'release_date', 'duration_ms', 'popularity','explicit','available_markets', 'danceability', 'acousticness', 'danceability', 'energy', 'instrumentalness', 'liveness', 'loudness', 'speechiness', 'tempo', 'time_signature'])


We get a dataset with chosen features ready to be downloaded.

In [10]:
df.head()

Unnamed: 0,name,album,artist,release_date,duration_ms,popularity,explicit,available_markets,danceability,acousticness,danceability.1,energy,instrumentalness,liveness,loudness,speechiness,tempo,time_signature
0,God's Plan,Scorpion,Drake,2018-06-29,198973,90,True,"[AD, AE, AR, AT, AU, BE, BG, BH, BO, BR, CA, C...",0.754,0.0332,0.754,0.449,8.3e-05,0.552,-9.211,0.109,77.169,4
1,SAD!,?,XXXTENTACION,2018-03-16,166605,92,True,"[AD, AE, AR, AT, AU, BE, BG, BH, BO, BR, CA, C...",0.74,0.258,0.74,0.613,0.00372,0.123,-4.88,0.145,75.023,4
2,rockstar (feat. 21 Savage),beerbongs & bentleys,Post Malone,2018-04-27,218146,92,True,"[AD, AE, AR, AT, AU, BE, BG, BH, BO, BR, CA, C...",0.587,0.117,0.587,0.535,6.6e-05,0.131,-6.09,0.0898,159.847,4
3,Psycho (feat. Ty Dolla $ign),beerbongs & bentleys,Post Malone,2018-04-27,221440,89,True,"[AD, AE, AR, AT, AU, BE, BG, BH, BO, BR, CA, C...",0.739,0.58,0.739,0.559,0.0,0.112,-8.011,0.117,140.124,4
4,In My Feelings,Scorpion,Drake,2018-06-29,217925,89,True,"[AD, AE, AR, AT, AU, BE, BG, BH, BO, BR, CA, C...",0.835,0.0589,0.835,0.626,6e-05,0.396,-5.833,0.125,91.03,4


In [157]:
#Save final dataset
df.to_csv("/Users/*INSERT_PATH*/most_played_2018", sep = ',')