# Working with the Spotify Web API in Python with spotipy

In this notebook, the Spotify Web API is accessed with the help of spotipy, a Python client library for that API.

The Spotify Web API provides all the data needed to build up-to-date datasets of the most popular music of 2020.

### The following steps are included in the notebook:

 - Web API calls with spotipy
 - extracting data from the API as JSON
 - building pandas DataFrames from the Web API data
 - storing the data in CSV files for later analysis
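
The steps above can be sketched without any network access. The snippet below is a minimal illustration, assuming a hand-written dictionary (`sample_response`, made up here and far smaller than a real API response) in place of actual Spotify data. It flattens the JSON-like structure into rows, builds a DataFrame and writes it out as CSV text.

```python
import io

import pandas as pd

# Hypothetical, heavily trimmed stand-in for a playlist response (not real API output).
sample_response = {
    "name": "Example playlist",
    "tracks": {
        "items": [
            {"track": {"id": "id-1", "name": "Song A", "popularity": 80}},
            {"track": {"id": "id-2", "name": "Song B", "popularity": 65}},
        ]
    },
}

# Flatten the nested structure into one flat dict per track.
rows = [
    {
        "playlist": sample_response["name"],
        "id": item["track"]["id"],
        "title": item["track"]["name"],
        "popularity": item["track"]["popularity"],
    }
    for item in sample_response["tracks"]["items"]
]

df = pd.DataFrame(rows)

# Write to an in-memory buffer here; passing a file path works the same way.
buffer = io.StringIO()
df.to_csv(buffer, index=False)
csv_text = buffer.getvalue()
```

Passing `index=False` keeps the running row number out of the file, so the CSV contains only the data columns.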

### Goal

The later analysis will focus on the most popular tracks of 2020. \
For the dataset we therefore need the Spotify playlists of the most streamed tracks and artists, both globally and for Germany.

Track information and artist information are necessary. The popularity value, a Spotify index, is also key for the later analysis.

#### In addition, the Spotify audio features, values from their AI analysis that describe how the music sounds (e.g. acousticness, danceability), will be gathered and stored for each track id.
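
Two of the audio features are encoded as integers rather than measured on a scale. According to Spotify's documentation, `key` uses pitch class notation (0 = C, 1 = C#/Db, and so on up to 11 = B, with -1 when no key was detected) and `mode` is 1 for major and 0 for minor. A small helper could translate them for the later analysis; `describe_key` is a hypothetical name used only in this sketch and is not part of spotipy.

```python
# Pitch class notation as used by the Spotify audio features (0 = C ... 11 = B).
PITCH_CLASSES = ["C", "C#/Db", "D", "D#/Eb", "E", "F",
                 "F#/Gb", "G", "G#/Ab", "A", "A#/Bb", "B"]


def describe_key(key, mode):
    """Return a readable label such as 'C#/Db major' for a key/mode pair."""
    if key is None or not 0 <= key < len(PITCH_CLASSES):
        # -1 means that no key was detected
        return "unknown"
    quality = "major" if mode == 1 else "minor"
    return f"{PITCH_CLASSES[key]} {quality}"


print(describe_key(1, 1))  # C#/Db major
```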





## Preparation:

In [1]:
# import 
import pandas as pd
import requests
import spotipy 
import json
from spotipy.oauth2 import SpotifyClientCredentials

In [2]:
# show DataFrames without truncation
from IPython.display import display
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
pd.set_option('display.width', None)
pd.set_option('display.max_colwidth', None)

To work with the API, the client id and client secret have to be stored for later use. \
The variable "sp" holds the API access.

In [3]:
# client credentials (redacted here; insert your own values)

client_id = '####'

secret = '###'

manager = SpotifyClientCredentials(client_id=client_id, client_secret=secret)
# plain request to the API base URL; it is independent of the spotipy client below
response = requests.get('https://api.spotify.com/v1')
response
# set up the Spotify client
sp = spotipy.Spotify(client_credentials_manager=manager)
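
Hard-coding the client id and secret makes it easy to leak them when the notebook is shared. One possible alternative, sketched below, reads them from environment variables. `SPOTIPY_CLIENT_ID` and `SPOTIPY_CLIENT_SECRET` are the variable names spotipy itself looks up when `SpotifyClientCredentials` is created without arguments; the helper `load_credentials` is a hypothetical name introduced only for this illustration.

```python
import os


def load_credentials(env=None):
    """Read the Spotify client id and secret from environment variables."""
    env = os.environ if env is None else env
    client_id = env.get("SPOTIPY_CLIENT_ID")
    secret = env.get("SPOTIPY_CLIENT_SECRET")
    if not client_id or not secret:
        raise RuntimeError(
            "Set SPOTIPY_CLIENT_ID and SPOTIPY_CLIENT_SECRET before running the notebook."
        )
    return client_id, secret


# Usage in the notebook (needs both variables to be set, so it is left commented out):
# client_id, secret = load_credentials()
# manager = SpotifyClientCredentials(client_id=client_id, client_secret=secret)
```

The optional `env` argument only exists so the helper can be tried with a plain dictionary instead of the real environment.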

In [4]:
# DataFrame checker for data assessment
# self-made

def checkdf(df):
    print("shape:", df.shape)
    display(df.head(5))
    display(df.tail(5))
    display(df.describe())
    # info() prints its report itself and returns None, so it is not wrapped in display()
    df.info()
    print("number of unique values:")
    display(df.nunique())


Below I wrote a function that automatically gathers track, artist and audio feature data from different API endpoints.
The Spotify data structure was a bit difficult to read; in particular, the artist elements sit in different places.
The function also gathers the data correctly when several artists worked on one track.

All data is collected in one list, and this list is converted into a pandas DataFrame at the end of the function.

#### The playlist_tracks function only needs the playlist id as a string to work.
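
To illustrate the structure described above: each playlist item wraps a `track` object, and the artists sit in a list inside it, so a track with several contributors has several entries. The snippet below is a sketch on a hand-written, heavily trimmed item. `sample_item` reuses values from the output tables in this notebook but is not a complete API response, and `artist_names` is a hypothetical helper name.

```python
# Trimmed stand-in for one element of plist['tracks']['items'].
sample_item = {
    "track": {
        "id": "24Yi9hE78yPEbZ4kxyoXAI",
        "name": "Roses - Imanbek Remix",
        "album": {"name": "Roses (Imanbek Remix)"},
        "artists": [{"name": "SAINt JHN"}, {"name": "Imanbek"}],
    }
}


def artist_names(item, sep=", "):
    """Join the names of all artists of a playlist item into one string."""
    return sep.join(artist["name"] for artist in item["track"]["artists"])


print(artist_names(sample_item))  # SAINt JHN, Imanbek
```

Because the join runs over the whole list, the same code covers tracks with one artist and tracks with several.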

 

In [5]:
# function to get the playlist tracks together with their audio features

def playlist_tracks(playlist_id):
    # API call: the playlist response embeds the first page of its tracks
    plist = sp.playlist(playlist_id)

    # playlist information, identical for every track
    playlistname = plist['name']
    descr = plist['description']

    # empty list that collects one dict per track
    tracks_list = []

    # use the first 50 entries; slicing avoids an IndexError on shorter playlists
    for item in plist['tracks']['items'][:50]:
        track = item['track']

        # join all artist names, whether the track has one artist or several
        artists = ", ".join(artist['name'] for artist in track['artists'])

        # audio features: the call returns a list, here with a single element
        feats = sp.audio_features(track['id'])[0]

        # build one row of the later DataFrame
        tracks_list.append({
            'playlist': playlistname,
            'description': descr,
            'id': track['id'],
            'title': track['name'],
            'album': track['album']['name'],
            'artist/s': artists,
            'popularity': track['popularity'],
            'explicit': track['explicit'],
            'duration': feats['duration_ms'],
            'danceability': feats['danceability'],
            'energy': feats['energy'],
            'key': feats['key'],
            'loudness': feats['loudness'],
            'mode': feats['mode'],
            'speechiness': feats['speechiness'],
            'acousticness': feats['acousticness'],
            'instrumentalness': feats['instrumentalness'],
            'liveness': feats['liveness'],
            'valence': feats['valence'],
            'tempo': feats['tempo'],
            'time_signature': feats['time_signature'],
        })

    df = pd.DataFrame(tracks_list)
    return df
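
The function above sends one audio-features request per track. `sp.audio_features` also accepts a list of track ids (up to 100 per call according to the spotipy documentation), so the requests could be batched. The sketch below shows that idea under stated assumptions: `chunked` and `audio_features_by_id` are hypothetical helper names, and `FakeClient` only exists so the example runs without credentials.

```python
def chunked(values, size):
    """Yield consecutive slices of `values` with at most `size` elements."""
    for start in range(0, len(values), size):
        yield values[start:start + size]


def audio_features_by_id(client, track_ids, batch_size=100):
    """Fetch audio features in batches and return them keyed by track id."""
    features = {}
    for batch in chunked(track_ids, batch_size):
        # The endpoint returns one entry per id; None marks tracks without features.
        for track_id, feats in zip(batch, client.audio_features(batch)):
            if feats is not None:
                features[track_id] = feats
    return features


class FakeClient:
    """Stand-in for spotipy.Spotify that counts how many calls were made."""

    def __init__(self):
        self.calls = 0

    def audio_features(self, tracks):
        self.calls += 1
        return [{"id": t, "tempo": 120.0} for t in tracks]


client = FakeClient()
ids = [f"track-{n}" for n in range(250)]
features = audio_features_by_id(client, ids)
print(len(features), client.calls)  # 250 tracks fetched with 3 requests
```

In the notebook, `sp` would take the place of `FakeClient()`, and the returned dictionary could be looked up by track id while the rows are built.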

In [6]:
#spotify:playlist:37i9dQZF1DX4HROODZmf5u
# most streamed tracks2020 germany
most_tracks_ger = playlist_tracks('37i9dQZF1DX4HROODZmf5u')
most_tracks_ger.to_csv('most_streamed_tracks2020_germany.csv',index=False)

In [7]:
most_tracks_ger

Unnamed: 0,playlist,description,id,title,album,artist/s,popularity,explicit,duration,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,time_signature
0,Top Tracks 2020 Deutschland,Die meistgestreamten Tracks 2020 in Deutschland. Cover: The Weeknd,0VjIjW4GlUZAMYd2vXMi3b,Blinding Lights,After Hours,The Weeknd,96,False,200040,0.514,0.73,1,-5.934,1,0.0598,0.00146,9.5e-05,0.0897,0.334,171.005,4
1,Top Tracks 2020 Deutschland,Die meistgestreamten Tracks 2020 in Deutschland. Cover: The Weeknd,6hw1Sy9wZ8UCxYGdpKrU6M,Roller,Platte,Apache 207,72,True,157093,0.941,0.758,10,-6.47,0,0.17,0.0256,0.00258,0.193,0.683,128.017,4
2,Top Tracks 2020 Deutschland,Die meistgestreamten Tracks 2020 in Deutschland. Cover: The Weeknd,24Yi9hE78yPEbZ4kxyoXAI,Roses - Imanbek Remix,Roses (Imanbek Remix),"SAINt JHN, Imanbek",90,True,176840,0.77,0.724,8,-5.484,1,0.0495,0.0167,0.0107,0.353,0.898,121.975,4
3,Top Tracks 2020 Deutschland,Die meistgestreamten Tracks 2020 in Deutschland. Cover: The Weeknd,3H7ihDc1dqLriiWXwsc2po,Breaking Me,Breaking Me,"Topic, A7S",91,False,166794,0.789,0.72,8,-5.652,0,0.218,0.223,0.0,0.129,0.664,122.031,4
4,Top Tracks 2020 Deutschland,Die meistgestreamten Tracks 2020 in Deutschland. Cover: The Weeknd,7ytR5pFWmSjzHJIeQkgog4,ROCKSTAR (feat. Roddy Ricch),BLAME IT ON BABY,"DaBaby, Roddy Ricch",93,True,181733,0.746,0.69,11,-7.956,1,0.164,0.247,0.0,0.101,0.497,89.977,4
5,Top Tracks 2020 Deutschland,Die meistgestreamten Tracks 2020 in Deutschland. Cover: The Weeknd,5ZULALImTm80tzUbYQYM9d,Dance Monkey,The Kids Are Coming,Tones And I,80,False,209438,0.824,0.588,6,-6.4,0,0.0924,0.692,0.000104,0.149,0.513,98.027,4
6,Top Tracks 2020 Deutschland,Die meistgestreamten Tracks 2020 in Deutschland. Cover: The Weeknd,5CqkgDH8QZjSqqI3HmYxDD,Airwaves,Airwaves,Pashanim,74,False,178933,0.885,0.578,5,-7.416,0,0.0783,0.0739,0.0903,0.102,0.388,131.999,4
7,Top Tracks 2020 Deutschland,Die meistgestreamten Tracks 2020 in Deutschland. Cover: The Weeknd,3goH7O78TLkV9RhCAyM4AT,Fame,Fame,Apache 207,69,False,173761,0.811,0.645,1,-8.008,0,0.104,0.184,0.0,0.103,0.474,129.967,4
8,Top Tracks 2020 Deutschland,Die meistgestreamten Tracks 2020 in Deutschland. Cover: The Weeknd,57tck8MCxkY6tcQC6VhahR,Bläulich,Bläulich,Apache 207,71,False,196230,0.796,0.687,1,-7.855,1,0.357,0.0633,0.000443,0.118,0.222,153.959,4
9,Top Tracks 2020 Deutschland,Die meistgestreamten Tracks 2020 in Deutschland. Cover: The Weeknd,1xQ6trAsedVPCdbtDAmk0c,Savage Love (Laxed - Siren Beat),Savage Love (Laxed - Siren Beat),"Jawsh 685, Jason Derulo",90,True,171375,0.767,0.481,0,-8.52,0,0.0803,0.234,0.0,0.269,0.761,150.076,4


In [8]:

# most streamed artists in germany, male and female
#https://open.spotify.com/playlist/37i9dQZF1DWTdV9tXbHOAv?si=A9OeQAOuSmWhOki7-kdiqg
most_artist_ger = playlist_tracks('37i9dQZF1DWTdV9tXbHOAv')
most_artist_ger.to_csv('most_streamed_artist2020_germany.csv',index=False)

In [9]:
most_artist_ger.head(60)

Unnamed: 0,playlist,description,id,title,album,artist/s,popularity,explicit,duration,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,time_signature
0,Top Künstler*innen 2020,Entdecke hier die meistgestreamten Künstler*innen 2020 in Deutschland. Cover: Capital Bra,1Nn6miPlilYF40aQ7u4KF8,Der Bratan bleibt der gleiche,CB7,Capital Bra,54,True,192107,0.64,0.689,0,-4.097,0,0.363,0.129,0.0,0.562,0.713,95.04,4
1,Top Künstler*innen 2020,Entdecke hier die meistgestreamten Künstler*innen 2020 in Deutschland. Cover: Capital Bra,2WRTnY0slmFgWcrmEr8dPj,Bläulich,Treppenhaus,Apache 207,70,False,196213,0.79,0.704,10,-7.935,0,0.417,0.069,0.000658,0.113,0.212,154.007,4
2,Top Künstler*innen 2020,Entdecke hier die meistgestreamten Künstler*innen 2020 in Deutschland. Cover: Capital Bra,1VEMUkGFio5uURvN8x1DZJ,Tilidin,Tilidin,"Capital Bra, Samra",65,True,172036,0.631,0.673,6,-4.892,0,0.384,0.407,0.0,0.154,0.721,83.504,4
3,Top Künstler*innen 2020,Entdecke hier die meistgestreamten Künstler*innen 2020 in Deutschland. Cover: Capital Bra,0atjNG0rNW2m6zPaiK4Co7,Roadrunner,Hollywood,Bonez MC,71,False,149028,0.804,0.606,7,-9.045,1,0.242,0.137,0.245,0.111,0.247,99.024,4
4,Top Künstler*innen 2020,Entdecke hier die meistgestreamten Künstler*innen 2020 in Deutschland. Cover: Capital Bra,3yZCjDTxzZxx1kll1sRjGc,Emotions 2.0,Emotions 2.0,"Ufo361, CÉLINE",70,False,146028,0.788,0.491,8,-9.884,1,0.109,0.386,0.00088,0.107,0.271,98.543,4
5,Top Künstler*innen 2020,Entdecke hier die meistgestreamten Künstler*innen 2020 in Deutschland. Cover: Capital Bra,52BoEcA1JE2Q4cosNsb5gy,Mios mit Bars,EXOT,Luciano,56,False,232667,0.689,0.63,9,-7.405,0,0.0472,0.104,0.0998,0.229,0.396,144.967,4
6,Top Künstler*innen 2020,Entdecke hier die meistgestreamten Künstler*innen 2020 in Deutschland. Cover: Capital Bra,4UOXo3xWnxYdTf3sFyRaUG,500 PS,Palmen aus Plastik 2,"Bonez MC, RAF Camora",69,True,181971,0.805,0.924,7,-5.883,0,0.0439,0.391,0.00448,0.0902,0.836,139.976,4
7,Top Künstler*innen 2020,Entdecke hier die meistgestreamten Künstler*innen 2020 in Deutschland. Cover: Capital Bra,2vnzjMNMDNL4CxxfNl1rxW,Puste sie weg,Vollmond,Kontra K,64,False,191103,0.82,0.693,2,-6.808,1,0.0996,0.0708,0.039,0.0906,0.575,92.453,4
8,Top Künstler*innen 2020,Entdecke hier die meistgestreamten Künstler*innen 2020 in Deutschland. Cover: Capital Bra,1V7JaMp11LKGwKiVmSetf0,Baby,Baby,"Joker Bra, VIZE",70,False,148125,0.771,0.921,1,-4.228,0,0.109,0.166,0.000211,0.119,0.56,128.021,4
9,Top Künstler*innen 2020,Entdecke hier die meistgestreamten Künstler*innen 2020 in Deutschland. Cover: Capital Bra,3hbHVdyA6cyVUBBLQ0RbUU,Dicka Was,Dicka Was,"Kool Savas, Sido, Nessi",69,True,155483,0.791,0.659,5,-5.6,0,0.226,0.257,0.0,0.101,0.473,93.956,4


In [10]:
# top global track 2020
#https://open.spotify.com/playlist/37i9dQZF1DX7Jl5KP2eZaS?si=T0vjmXV7TbqJ6yakXhikWw

most_tracks_glob = playlist_tracks('37i9dQZF1DX7Jl5KP2eZaS')
most_tracks_glob.to_csv('most_streamed_tracks2020_global.csv',index=False)

In [11]:
most_tracks_glob.head(60)

Unnamed: 0,playlist,description,id,title,album,artist/s,popularity,explicit,duration,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,time_signature
0,Top Tracks of 2020,The Global Top Tracks of 2020. Cover: The Weeknd,0VjIjW4GlUZAMYd2vXMi3b,Blinding Lights,After Hours,The Weeknd,96,False,200040,0.514,0.73,1,-5.934,1,0.0598,0.00146,9.5e-05,0.0897,0.334,171.005,4
1,Top Tracks of 2020,The Global Top Tracks of 2020. Cover: The Weeknd,1rgnBhdG2JDFTbYkYRZAku,Dance Monkey,Dance Monkey,Tones And I,70,False,209755,0.825,0.593,6,-6.401,0,0.0988,0.688,0.000161,0.17,0.54,98.078,4
2,Top Tracks of 2020,The Global Top Tracks of 2020. Cover: The Weeknd,0nbXyq5TXYPCO7pr3N8S4I,The Box,Please Excuse Me For Being Antisocial,Roddy Ricch,89,True,196653,0.896,0.586,10,-6.687,0,0.0559,0.104,0.0,0.79,0.642,116.971,4
3,Top Tracks of 2020,The Global Top Tracks of 2020. Cover: The Weeknd,2Wo6QQD1KMDWeFkkjLqwx5,Roses - Imanbek Remix,Roses (Imanbek Remix),"SAINt JHN, Imanbek",77,True,176219,0.785,0.721,8,-5.457,1,0.0506,0.0149,0.00432,0.285,0.894,121.962,4
4,Top Tracks of 2020,The Global Top Tracks of 2020. Cover: The Weeknd,3PfIrDoz19wz7qK7tYeu62,Don't Start Now,Future Nostalgia,Dua Lipa,84,False,183290,0.793,0.793,11,-4.521,0,0.083,0.0123,0.0,0.0951,0.679,123.95,4
5,Top Tracks of 2020,The Global Top Tracks of 2020. Cover: The Weeknd,7ytR5pFWmSjzHJIeQkgog4,ROCKSTAR (feat. Roddy Ricch),BLAME IT ON BABY,"DaBaby, Roddy Ricch",93,True,181733,0.746,0.69,11,-7.956,1,0.164,0.247,0.0,0.101,0.497,89.977,4
6,Top Tracks of 2020,The Global Top Tracks of 2020. Cover: The Weeknd,6UelLqGlWMcVH1E5c4H7lY,Watermelon Sugar,Fine Line,Harry Styles,93,False,174000,0.548,0.816,0,-4.209,1,0.0465,0.122,0.0,0.335,0.557,95.39,4
7,Top Tracks of 2020,The Global Top Tracks of 2020. Cover: The Weeknd,7eJMfftS33KTjuF7lTsMCx,death bed (coffee for your head),death bed (coffee for your head),"Powfu, beabadoobee",90,False,173333,0.726,0.431,8,-8.765,0,0.135,0.731,0.0,0.696,0.348,144.026,4
8,Top Tracks of 2020,The Global Top Tracks of 2020. Cover: The Weeknd,2rRJrJEo19S2J82BDsQ3F7,Falling,Nicotine,Trevor Daniel,78,False,159382,0.784,0.43,10,-8.756,0,0.0364,0.123,0.0,0.0887,0.236,127.087,4
9,Top Tracks of 2020,The Global Top Tracks of 2020. Cover: The Weeknd,7qEHsqek33rTcFNT9PFqLf,Someone You Loved,Divinely Uninspired To A Hellish Extent,Lewis Capaldi,90,False,182161,0.501,0.405,1,-5.679,1,0.0319,0.751,0.0,0.105,0.446,109.891,4


In [12]:
# Spotify's own playlists (exploratory, disabled; not needed for the datasets)
# spotys = sp.user_playlists('spotify', limit=50)
#
# for playlist in spotys['items']:
#     print(playlist['name'])

In [13]:
#https://open.spotify.com/playlist/37i9dQZF1DWXgY89J4Sjdb?si=wA30zNdCSOqmMvi7fgmHKQ
#top artists 2020 globally

most_artist_glob =playlist_tracks('37i9dQZF1DWXgY89J4Sjdb')
most_artist_glob.to_csv('most_streamed_artist2020_global.csv',index=False)

In [15]:
most_artist_glob.head(60)

Unnamed: 0,playlist,description,id,title,album,artist/s,popularity,explicit,duration,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,time_signature
0,Top Artists of 2020,The Global Top Artists of 2020. Cover: Bad Bunny,47EiUVwUp4C9fGccaPuUCS,Dakiti,Dakiti,"Bad Bunny, Jhay Cortez",100,True,205090,0.731,0.573,4,-10.059,0,0.0544,0.401,5.2e-05,0.113,0.145,109.928,4
1,Top Artists of 2020,The Global Top Artists of 2020. Cover: Bad Bunny,466cKvZn1j45IpxDdYZqdA,Toosie Slide,Dark Lane Demo Tapes,Drake,79,True,247059,0.83,0.49,1,-8.82,0,0.209,0.289,3e-06,0.113,0.845,81.604,4
2,Top Artists of 2020,The Global Top Artists of 2020. Cover: Bad Bunny,2lCkncy6bIB0LTMT7kvrD1,Azul,Colores,J Balvin,87,False,205933,0.843,0.836,11,-2.474,0,0.0695,0.0816,0.00138,0.0532,0.65,94.018,4
3,Top Artists of 2020,The Global Top Artists of 2020. Cover: Bad Bunny,2Y0wPrPQBrGhoLn14xRYCG,Come & Go (with Marshmello),Legends Never Die,"Juice WRLD, Marshmello",89,True,205485,0.625,0.814,0,-5.181,1,0.0657,0.0172,0.0,0.158,0.535,144.991,4
4,Top Artists of 2020,The Global Top Artists of 2020. Cover: Bad Bunny,0VjIjW4GlUZAMYd2vXMi3b,Blinding Lights,After Hours,The Weeknd,96,False,200040,0.514,0.73,1,-5.934,1,0.0598,0.00146,9.5e-05,0.0897,0.334,171.005,4
5,Top Artists of 2020,The Global Top Artists of 2020. Cover: Bad Bunny,2QyuXBcV1LJ2rq01KhreMF,ON,MAP OF THE SOUL : 7,BTS,82,False,246381,0.583,0.817,9,-5.146,0,0.0987,0.118,0.0,0.338,0.438,105.936,4
6,Top Artists of 2020,The Global Top Artists of 2020. Cover: Bad Bunny,54bFM56PmE4YLRnqpW6Tha,Therefore I Am,Therefore I Am,Billie Eilish,95,False,174321,0.889,0.34,11,-7.773,0,0.0697,0.218,0.13,0.055,0.716,94.009,4
7,Top Artists of 2020,The Global Top Artists of 2020. Cover: Bad Bunny,4R2kfaDFhslZEMJqAFNpdd,cardigan,folklore,Taylor Swift,83,False,239560,0.613,0.581,0,-8.588,0,0.0424,0.537,0.000345,0.25,0.551,130.033,4
8,Top Artists of 2020,The Global Top Artists of 2020. Cover: Bad Bunny,7xQAfvXzm3AkraOtGPWIZg,Wow.,Hollywood's Bleeding,Post Malone,83,True,149547,0.829,0.539,11,-7.359,0,0.208,0.136,2e-06,0.103,0.388,99.96,4
9,Top Artists of 2020,The Global Top Artists of 2020. Cover: Bad Bunny,3eekarcy7kvN4yt5ZFzltW,HIGHEST IN THE ROOM,HIGHEST IN THE ROOM,Travis Scott,88,True,175721,0.598,0.427,7,-8.764,0,0.0317,0.0546,6e-06,0.21,0.0605,76.469,4
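
The four export cells above repeat the same two steps with different ids. As a possible refactoring, the ids and file names could live in one dictionary and be processed in a loop. The sketch below takes the fetch and save steps as parameters so it can be tried without API access; `export_playlists` and `fake_fetch` are hypothetical names, while the ids and file names are the ones used in this notebook.

```python
# Playlist ids from the cells above mapped to their output files.
PLAYLISTS = {
    "37i9dQZF1DX4HROODZmf5u": "most_streamed_tracks2020_germany.csv",
    "37i9dQZF1DWTdV9tXbHOAv": "most_streamed_artist2020_germany.csv",
    "37i9dQZF1DX7Jl5KP2eZaS": "most_streamed_tracks2020_global.csv",
    "37i9dQZF1DWXgY89J4Sjdb": "most_streamed_artist2020_global.csv",
}


def export_playlists(playlists, fetch, save):
    """Fetch every playlist and hand the result to `save` with its file name."""
    exported = []
    for playlist_id, filename in playlists.items():
        data = fetch(playlist_id)
        save(data, filename)
        exported.append(filename)
    return exported


# Stand-in for the real fetch so the sketch runs offline.
def fake_fetch(playlist_id):
    return [{"playlist_id": playlist_id}]


saved = {}
exported = export_playlists(
    PLAYLISTS, fake_fetch, lambda data, name: saved.update({name: data})
)
```

In the notebook one would pass `playlist_tracks` as `fetch` and something like `lambda df, name: df.to_csv(name, index=False)` as `save`.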


#### To be continued