# ![logo_ironhack_blue 7](https://user-images.githubusercontent.com/23629340/40541063-a07a0a8a-601a-11e8-91b5-2f13e4e6b441.png)

# Lab | API wrappers - Create your collection of songs & audio features


#### Instructions 


To move forward with the project, you need to create a collection of songs with their audio features - as large as possible! 

These are the songs that we will cluster. And, later, when the user inputs a song, we will find the cluster to which the song belongs and recommend a song from the same cluster.
The more songs you have, the more accurate and diverse recommendations you'll be able to give. Although... you might want to make sure the collected songs are "curated" in a certain way. Try to find playlists of songs that are diverse, but also that meet certain standards.

The process of sending hundreds or thousands of requests can take some time - it's normal if you have to wait a few minutes (or, if you're ambitious, even hours) to get all the data you need.

One way to collect as many songs as possible is to start with all the songs of a big, diverse playlist, then go to every artist present in the playlist and grab every song from every album of that artist. The number of songs you collect per playlist multiplies quickly: it is roughly the number of artists times the albums per artist times the tracks per album.
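To get a feel for how quickly the collection grows, the sizes can simply be multiplied out. The sketch below is only a back-of-the-envelope estimate: `estimate_collection` is a hypothetical helper, the averages passed to it are made-up inputs, and the request count assumes one `artist_albums` call per artist, one `album_tracks` call per album, and audio features requested in batches of 100 track IDs.

```python
import math


def estimate_collection(n_artists, albums_per_artist, tracks_per_album, batch_size=100):
    """Rough size of a collection built by expanding every artist of a playlist.

    All inputs are assumed averages, not values returned by the API.
    """
    n_albums = n_artists * albums_per_artist
    n_tracks = n_albums * tracks_per_album
    n_requests = (
        n_artists                           # one artist_albums call per artist
        + n_albums                          # one album_tracks call per album
        + math.ceil(n_tracks / batch_size)  # audio features, batch_size IDs per call
    )
    return {"albums": n_albums, "tracks": n_tracks, "requests": n_requests}


# Made-up example: 68 artists, 20 albums each, 12 tracks per album
estimate = estimate_collection(n_artists=68, albums_per_artist=20, tracks_per_album=12)
print(estimate)
# {'albums': 1360, 'tracks': 16320, 'requests': 1592}
```

With these made-up numbers a single playlist expands to roughly 16,000 tracks and about 1,600 requests, which is why a full run can take a while. The real number of unique songs will be lower, since the same track often appears on several albums.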

In [127]:
import os

import pandas as pd
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

# Read the credentials from environment variables instead of hard-coding them in the notebook
sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials(client_id=os.environ["SPOTIPY_CLIENT_ID"],
                                                           client_secret=os.environ["SPOTIPY_CLIENT_SECRET"]))

In [128]:
playlist = sp.user_playlist_tracks("turgon1993", "44sPcoQ10on4NGGNtvsjI4")
playlist.keys()
#playlist['total']

dict_keys(['href', 'items', 'limit', 'next', 'offset', 'previous', 'total'])

`playlist` is a dictionary that holds everything the API returned for the requested playlist, parsed from the JSON response.

In [5]:
len(playlist['items']) 

91

We are interested in the items, which are the tracks in the playlist.

In [17]:
results = sp.user_playlist_tracks("turgon1993", "44sPcoQ10on4NGGNtvsjI4")
tracks = results['items']

while results['next']:
    results = sp.next(results)
    tracks.extend(results['items'])

#This while loop follows the 'next' link page by page
#and appends each page's items to the tracks list

#The tracks list contains one dictionary per playlist entry

#The song title of an entry can be accessed like this:
tracks[0]['track']['name']

'Under Pressure - Remastered 2011'

In [18]:
len(tracks)

91

In [24]:
#We get the artists participating in a song this way:
tracks[0]['track']['artists']

[{'external_urls': {'spotify': 'https://open.spotify.com/artist/1dfeR4HaWDbWqFHLkxsg1d'},
  'href': 'https://api.spotify.com/v1/artists/1dfeR4HaWDbWqFHLkxsg1d',
  'id': '1dfeR4HaWDbWqFHLkxsg1d',
  'name': 'Queen',
  'type': 'artist',
  'uri': 'spotify:artist:1dfeR4HaWDbWqFHLkxsg1d'},
 {'external_urls': {'spotify': 'https://open.spotify.com/artist/0oSGxfWSnnOXhD2fKuz2Gy'},
  'href': 'https://api.spotify.com/v1/artists/0oSGxfWSnnOXhD2fKuz2Gy',
  'id': '0oSGxfWSnnOXhD2fKuz2Gy',
  'name': 'David Bowie',
  'type': 'artist',
  'uri': 'spotify:artist:0oSGxfWSnnOXhD2fKuz2Gy'}]

In [37]:
#Collect the names of all the songs in the playlist
song_names = [item['track']['name'] for item in tracks]
    
#song_names

### Get info using functions

In [28]:
username="turgon1993"
playlist_id="44sPcoQ10on4NGGNtvsjI4"

In [29]:
def get_playlist_tracks(username, playlist_id):
    
    results = sp.user_playlist_tracks(username, playlist_id)
    tracks = results['items']
    
    while results['next']:
        results = sp.next(results)
        tracks.extend(results['items'])
    
    return tracks

In [38]:
#get_playlist_tracks(username, playlist_id)

In [39]:
def get_playlist_tracklist(username, playlist_id):
    
    # Reuse get_playlist_tracks for the pagination and keep only the song names
    tracks = get_playlist_tracks(username, playlist_id)
    
    return [item['track']['name'] for item in tracks]

In [41]:
#get_playlist_tracklist(username, playlist_id)

In [31]:
def get_artists_from_playlist(username,playlist_id):
    
    tracks_from_playlist = get_playlist_tracks(username, playlist_id)
    
    artists = []
    
    for track in tracks_from_playlist:
        artists_info = track['track']['artists']
        
        for artist_info in artists_info:
            artists.append(artist_info['name'])
    
    return list(set(artists))

In [36]:
artists_playlist=get_artists_from_playlist(username,playlist_id)
len(artists_playlist)

68

In [33]:
def get_artists_ids_from_playlist(username,playlist_id):
    
    tracks_from_playlist = get_playlist_tracks(username, playlist_id)
    
    artists_ids = []
    
    for track in tracks_from_playlist:
        artists_info = track['track']['artists']
        
        for artist_info in artists_info:
            artists_ids.append(artist_info['id'])
            
    return list(set(artists_ids))

In [42]:
#get_artists_ids_from_playlist(username,playlist_id)
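The two functions above each download the playlist again and return names and IDs separately. As a sketch of an alternative, the hypothetical helper `artists_from_tracks` below takes the list already returned by `get_playlist_tracks` and builds one mapping from artist ID to artist name, so names and IDs stay paired. The sample data reuses the two artists printed earlier.

```python
def artists_from_tracks(tracks):
    """Map artist ID -> artist name for every artist credited on the given playlist items."""
    artists = {}
    for item in tracks:
        track = item.get("track") or {}      # tolerate entries without track data
        for artist in track.get("artists", []):
            if artist.get("id"):             # skip artists without a Spotify ID
                artists[artist["id"]] = artist["name"]
    return artists


sample_tracks = [
    {"track": {"name": "Under Pressure - Remastered 2011",
               "artists": [{"id": "1dfeR4HaWDbWqFHLkxsg1d", "name": "Queen"},
                           {"id": "0oSGxfWSnnOXhD2fKuz2Gy", "name": "David Bowie"}]}},
]
print(artists_from_tracks(sample_tracks))
# {'1dfeR4HaWDbWqFHLkxsg1d': 'Queen', '0oSGxfWSnnOXhD2fKuz2Gy': 'David Bowie'}
```

`list(artists)` then gives the IDs and `list(artists.values())` the names, from a single download of the playlist.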

Now we have the list of artists in a playlist. The next step is to find all the songs by those artists.

In [43]:
id_from_playlist=get_artists_ids_from_playlist(username,playlist_id)

In [44]:
id_from_playlist[0]

'07XSN3sPlIlB2L2XNcTwJw'

In [48]:
uri_="spotify:artist:"+id_from_playlist[0]

In [71]:
artistAlbums = sp.artist_albums(uri_)

albums=[]
album_ids=[]
for album in artistAlbums['items']:
    albums.append(album['name'])
    album_ids.append(album['id'])

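Now we can see the albums of a single artist. Only the first page of results is listed (`artist_albums` returns 20 albums per call by default), and the same title can appear several times, presumably because separate releases of an album are returned as separate entries: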

In [63]:
albums

['Destroyer (45th Anniversary Super Deluxe)',
 'KISS Off The Soundboard: Tokyo 2001 (Live At Tokyo Dome, Japan 3/13/2001)',
 'KISS Off The Soundboard: Tokyo 2001 (Live)',
 'Monster',
 'Monster',
 'Monster',
 'Alive! 1975-2000',
 'Alive: The Millennium Concert',
 'Alive: The Millennium Concert',
 'Symphony: Alive IV',
 'Symphony: Alive IV',
 'Symphony: Alive IV',
 'KISS Box Set',
 'Psycho Circus',
 'Psycho Circus',
 'Carnival Of Souls: The Final Sessions',
 'Carnival Of Souls: The Final Sessions',
 'MTV Unplugged',
 'MTV Unplugged',
 'Alive III']

In [72]:
album_ids[1]

'6x60FPv5I0t19eqKx76bv9'

In [77]:
album_1=sp.album_tracks(album_ids[1])['items']
album_tracks=[]
album_track_ids=[]
for track in album_1:
    album_tracks.append(track['name'])
    album_track_ids.append(track['id'])
    
album_track_ids

['3KCj9jctvenpacWp9eWsJI',
 '0leY2a4PFdVmZmTLszWY9B',
 '2zK0pXoWSUE7qlqWfh3hGN',
 '4FexXd7TDJ0mavesN9FIfR',
 '2QuI8OhH9nlFdUVb1reiSO',
 '3TrGudXYaVjzecoskfeBLg',
 '7l5EzQGZcKBRWqkzuusMIQ',
 '1TC5R51aA9YYWbcs20s4Li',
 '1qxLw4umgCtbXnodB1I5Xs',
 '1ooGiGW6z9BpTtJqmIqBns',
 '1WvSJEcIt2DhmL6PhBL1qL',
 '2B5e7nVUYXJfP1FdmvqQ1K',
 '7gkY7sHevG9XK4cRCsAVg2',
 '0cb5cP6LZ9uZYAIpC4zqjL',
 '29Xq4rBtAT1Vew1fr9xqE5',
 '5BMPjPOLwNceFkBeYQMJXm',
 '37Khm6ADp0zgiMKPVxhtBm',
 '5hRYVKUuQTSC21PMMumNPX',
 '3vwD37cDKBGv2va6lczTRW',
 '6ARXXXmlJxy346utWmmYxq',
 '7IxmvQKuXm5Rk6EcTzFXaT']
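The manual steps above (the albums of one artist, then the tracks of one album) can be wrapped into a single function. This is a sketch under assumptions: `get_artist_track_ids` is a hypothetical helper, `sp` is expected to be an authenticated spotipy client like the one created earlier, and skipping albums whose name was already seen is a simplification chosen here to avoid downloading the repeated titles visible in the album list. It may also drop releases that genuinely differ.

```python
def get_artist_track_ids(sp, artist_id, skip_duplicate_album_names=True):
    """Collect the unique track IDs from all albums of one artist."""
    track_ids = []
    seen_album_names = set()

    albums = sp.artist_albums(artist_id, limit=50)
    while albums:
        for album in albums["items"]:
            if skip_duplicate_album_names and album["name"] in seen_album_names:
                continue
            seen_album_names.add(album["name"])

            tracks = sp.album_tracks(album["id"])
            while tracks:
                track_ids.extend(track["id"] for track in tracks["items"])
                tracks = sp.next(tracks) if tracks["next"] else None

        albums = sp.next(albums) if albums["next"] else None

    # dict.fromkeys removes repeated IDs while keeping the first-seen order
    return list(dict.fromkeys(track_ids))


# Usage with a real client (not executed here):
# all_ids = []
# for artist_id in id_from_playlist:
#     all_ids.extend(get_artist_track_ids(sp, artist_id))
```

Returning IDs rather than names keeps the result ready to pass to `sp.audio_features`.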

In [111]:
sp.audio_features(album_track_ids)

[{'danceability': 0.181,
  'energy': 0.98,
  'key': 0,
  'loudness': -5.571,
  'mode': 0,
  'speechiness': 0.154,
  'acousticness': 0.00367,
  'instrumentalness': 0.362,
  'liveness': 0.691,
  'valence': 0.219,
  'tempo': 176.289,
  'type': 'audio_features',
  'id': '3KCj9jctvenpacWp9eWsJI',
  'uri': 'spotify:track:3KCj9jctvenpacWp9eWsJI',
  'track_href': 'https://api.spotify.com/v1/tracks/3KCj9jctvenpacWp9eWsJI',
  'analysis_url': 'https://api.spotify.com/v1/audio-analysis/3KCj9jctvenpacWp9eWsJI',
  'duration_ms': 298467,
  'time_signature': 4},
 {'danceability': 0.289,
  'energy': 0.958,
  'key': 6,
  'loudness': -6.704,
  'mode': 0,
  'speechiness': 0.095,
  'acousticness': 0.00485,
  'instrumentalness': 0.0244,
  'liveness': 0.872,
  'valence': 0.3,
  'tempo': 131.89,
  'type': 'audio_features',
  'id': '0leY2a4PFdVmZmTLszWY9B',
  'uri': 'spotify:track:0leY2a4PFdVmZmTLszWY9B',
  'track_href': 'https://api.spotify.com/v1/tracks/0leY2a4PFdVmZmTLszWY9B',
  'analysis_url': 'https://api

In [82]:
pd.DataFrame(sp.audio_features(album_track_ids)).head()

| | danceability | energy | key | loudness | mode | speechiness | acousticness | instrumentalness | liveness | valence | tempo | type | id | uri | track_href | analysis_url | duration_ms | time_signature |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.181 | 0.98 | 0 | -5.571 | 0 | 0.154 | 0.00367 | 0.362 | 0.691 | 0.219 | 176.289 | audio_features | 3KCj9jctvenpacWp9eWsJI | spotify:track:3KCj9jctvenpacWp9eWsJI | https://api.spotify.com/v1/tracks/3KCj9jctvenp... | https://api.spotify.com/v1/audio-analysis/3KCj... | 298467 | 4 |
| 1 | 0.289 | 0.958 | 6 | -6.704 | 0 | 0.095 | 0.00485 | 0.0244 | 0.872 | 0.3 | 131.89 | audio_features | 0leY2a4PFdVmZmTLszWY9B | spotify:track:0leY2a4PFdVmZmTLszWY9B | https://api.spotify.com/v1/tracks/0leY2a4PFdVm... | https://api.spotify.com/v1/audio-analysis/0leY... | 326173 | 4 |
| 2 | 0.362 | 0.914 | 3 | -6.604 | 1 | 0.0749 | 0.0713 | 1.5e-05 | 0.681 | 0.433 | 135.286 | audio_features | 2zK0pXoWSUE7qlqWfh3hGN | spotify:track:2zK0pXoWSUE7qlqWfh3hGN | https://api.spotify.com/v1/tracks/2zK0pXoWSUE7... | https://api.spotify.com/v1/audio-analysis/2zK0... | 236507 | 4 |
| 3 | 0.32 | 0.911 | 11 | -6.613 | 1 | 0.0846 | 0.0238 | 0.00102 | 0.766 | 0.358 | 130.005 | audio_features | 4FexXd7TDJ0mavesN9FIfR | spotify:track:4FexXd7TDJ0mavesN9FIfR | https://api.spotify.com/v1/tracks/4FexXd7TDJ0m... | https://api.spotify.com/v1/audio-analysis/4Fex... | 265600 | 4 |
| 4 | 0.437 | 0.95 | 3 | -6.29 | 1 | 0.0538 | 0.0538 | 1.5e-05 | 0.504 | 0.326 | 86.444 | audio_features | 2QuI8OhH9nlFdUVb1reiSO | spotify:track:2QuI8OhH9nlFdUVb1reiSO | https://api.spotify.com/v1/tracks/2QuI8OhH9nlF... | https://api.spotify.com/v1/audio-analysis/2QuI... | 243000 | 4 |
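Passing a whole list of IDs to `sp.audio_features`, as above, is much cheaper than one request per track, but the endpoint accepts at most 100 IDs per call. A minimal sketch of batching, assuming `sp` is an authenticated spotipy client (`fetch_audio_features` is a hypothetical helper, not a spotipy function):

```python
def fetch_audio_features(sp, track_ids, batch_size=100):
    """Request audio features in batches and return a list of feature dictionaries."""
    rows = []
    ids = [track_id for track_id in track_ids if track_id]  # drop None or empty IDs
    for start in range(0, len(ids), batch_size):
        batch = ids[start:start + batch_size]
        features = sp.audio_features(batch) or []
        # The response can contain None for tracks that have no features
        rows.extend(entry for entry in features if entry is not None)
    return rows


# Usage with a real client (not executed here):
# features_df = pd.DataFrame(fetch_audio_features(sp, album_track_ids))
```

Building the DataFrame once from the full list of rows also avoids concatenating many one-row DataFrames.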


# New Approach
The dataset we'll work with can also be built from the songs in all of a given user's playlists.

In [129]:
username="turgon1993"
# Request the user's playlists once and reuse the response
user_playlists=sp.user_playlists(username)['items']
playlist_id=user_playlists[4]['id']
playlist_name=user_playlists[4]['name']
print(playlist_name)
# Likewise, request the playlist's tracks once
first_track=sp.playlist_tracks(playlist_id)['items'][0]['track']
track_name=first_track['name']
track_id=first_track['id']
print(track_name)

Boletus
Resonance


In [130]:
track_features=sp.audio_features(track_id)
track_features[0]['name']=track_name
track_features[0]['playlist_id']=playlist_id
track_features[0]['playlist_name']=playlist_name
track_features
track_features_df=pd.DataFrame(track_features)
track_features_df

| | danceability | energy | key | loudness | mode | speechiness | acousticness | instrumentalness | liveness | valence | ... | type | id | uri | track_href | analysis_url | duration_ms | time_signature | name | playlist_id | playlist_name |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.567 | 0.743 | 8 | -8.614 | 1 | 0.0626 | 0.0356 | 0.847 | 0.221 | 0.423 | ... | audio_features | 1TuopWDIuDi1553081zvuU | spotify:track:1TuopWDIuDi1553081zvuU | https://api.spotify.com/v1/tracks/1TuopWDIuDi1... | https://api.spotify.com/v1/audio-analysis/1Tuo... | 212881 | 5 | Resonance | 4KunhoChStbIcMI8bprq0E | Boletus |


In [147]:
username="turgon1993"

In [148]:
#playlist_ids=[]
#i=0
#while i < len(sp.user_playlists(username)['items']):
 #   playlist_id=sp.user_playlists(username)['items'][i]['id']
 #   playlist_name=sp.user_playlists(username)['items'][i]['name']
 #   playlist_ids.append(playlist_id)
 #   i+=1

In [149]:
#playlist_ids
def get_playlists(username):
    # Request the playlists once instead of on every loop iteration.
    # Only the first page is read (at most 50 playlists per call).
    playlists=sp.user_playlists(username)['items']
    return [playlist['id'] for playlist in playlists]

In [155]:
def get_features_from_playlist(playlist_id):
    list_ft_dfs=[]
    j=0
    while j < len(sp.playlist_tracks(playlist_id)['items']):
        track_name=sp.playlist_tracks(playlist_id)['items'][j]['track']['name']
        track_id=sp.playlist_tracks(playlist_id)['items'][j]['track']['id']
    #print(track_name)
        track_features=sp.audio_features(track_id)
        track_features[0]['name']=track_name
        track_features[0]['playlist_id']=playlist_id
        track_features_df=pd.DataFrame(track_features)
        list_ft_dfs.append(track_features_df)
        j+=1
    dataset_playlist=pd.concat(list_ft_dfs)
    return dataset_playlist

In [167]:
def user_songs_dataset(username):
    playlist_ids=get_playlists(username)
    list_of_feat_dataframes=[]
    counter=1
    for ids in playlist_ids:
        print("Getting tracks from playlist",counter,"of",len(playlist_ids))
        list_of_feat_dataframes.append(get_features_from_playlist(ids))
        counter+=1
    dataset=pd.concat(list_of_feat_dataframes)
    return dataset

In [157]:
#get_playlists(username)

In [158]:
#user_songs_dataset(username)

### CAUTION: The following code takes a lot of time to run!

In [169]:
user_songs_ds=user_songs_dataset(username)

Getting tracks from playlist 1 of 50
Getting tracks from playlist 2 of 50
Getting tracks from playlist 3 of 50
Getting tracks from playlist 4 of 50
Getting tracks from playlist 5 of 50
Getting tracks from playlist 6 of 50
Getting tracks from playlist 7 of 50
Getting tracks from playlist 8 of 50
Getting tracks from playlist 9 of 50
Getting tracks from playlist 10 of 50
Getting tracks from playlist 11 of 50
Getting tracks from playlist 12 of 50
Getting tracks from playlist 13 of 50
Getting tracks from playlist 14 of 50
Getting tracks from playlist 15 of 50
Getting tracks from playlist 16 of 50
Getting tracks from playlist 17 of 50
Getting tracks from playlist 18 of 50
Getting tracks from playlist 19 of 50
Getting tracks from playlist 20 of 50
Getting tracks from playlist 21 of 50
Getting tracks from playlist 22 of 50
Getting tracks from playlist 23 of 50
Getting tracks from playlist 24 of 50
Getting tracks from playlist 25 of 50
Getting tracks from playlist 26 of 50
Getting tracks from p

TypeError                                 Traceback (most recent call last)
<ipython-input-169-dc1f639c253c> in <module>
----> 1 user_songs_ds=user_songs_dataset(username)

<ipython-input-167-96fba97e1406> in user_songs_dataset(username)
      5     for ids in playlist_ids:
      6         print("Getting tracks from playlist",counter,"of",len(playlist_ids))
----> 7         list_of_feat_dataframes.append(get_features_from_playlist(ids))
      8         counter+=1
      9     dataset=pd.concat(list_of_feat_dataframes)

<ipython-input-155-5032bb732d04> in get_features_from_playlist(playlist_id)
      6         track_id=sp.playlist_tracks(playlist_id)['items'][j]['track']['id']
      7     #print(track_name)
----> 8         track_features=sp.audio_features(track_id)
      9         track_features[0]['name']=track_name
     10         track_features[0]['playlist_id']=playlist_id

/opt/anaconda3/lib/python3.8/site-packages/spotipy/client.py in audio_features(self, tracks)
   1680             results = self._get("audio-features/?ids=" + trackid)
   1681         else:
-> 1682             tlist = [self._get_id("track", t) for t in tracks]
   1683             results = self._get("audio-features/?ids=" + ",".join(tlist))
   1684         # the response has changed, look for the new style first, and if

TypeError: 'NoneType' object is not iterable
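The traceback shows `audio_features` failing while it iterates over `tracks`, which means the `track_id` passed in was `None`. A playlist entry can lack a Spotify ID, for example when it is a local file. That cause is an assumption here, since the traceback does not say which entry triggered the error. A minimal sketch of a guard, with `usable_tracks` as a hypothetical helper name, filters such entries out before any features are requested. It works on a list of items fetched once, rather than calling `sp.playlist_tracks` again for every track as `get_features_from_playlist` does:

```python
def usable_tracks(items):
    """Return (track_id, track_name) pairs for playlist items that have a Spotify ID."""
    pairs = []
    for item in items:
        track = item.get("track")
        if not track or not track.get("id"):
            continue  # nothing to look up for this entry
        pairs.append((track["id"], track["name"]))
    return pairs


items = [
    {"track": {"id": "1TuopWDIuDi1553081zvuU", "name": "Resonance"}},
    {"track": {"id": None, "name": "Some local file"}},  # made-up entry without an ID
    {"track": None},                                    # made-up entry without track data
]
print(usable_tracks(items))
# [('1TuopWDIuDi1553081zvuU', 'Resonance')]

# Usage with a real client (not executed here):
# items = sp.playlist_tracks(playlist_id)["items"]
# ids = [track_id for track_id, _ in usable_tracks(items)]
# features = sp.audio_features(ids)
```

Fetching the items once and filtering them turns three requests per track into a single request per page of the playlist.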

In [168]:
user_songs_ds['playlist_id'].value_counts()

3eg1sv7mBR5m2IwyxnZE1y    3250
Name: playlist_id, dtype: int64
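In a long run like the one above, a single failing playlist raises an exception before the final `pd.concat` is reached, so nothing from that run is returned. One possible way to keep partial results is to catch the error per playlist and record it. The sketch below assumes nothing about spotipy: `collect_per_playlist` is a hypothetical helper, and `fetch` stands for any function that takes a playlist ID, such as `get_features_from_playlist`.

```python
def collect_per_playlist(playlist_ids, fetch):
    """Call fetch(playlist_id) for each ID and keep partial results when a call fails."""
    results = {}
    failures = {}
    for position, playlist_id in enumerate(playlist_ids, start=1):
        print("Getting tracks from playlist", position, "of", len(playlist_ids))
        try:
            results[playlist_id] = fetch(playlist_id)
        except Exception as error:  # broad on purpose: record the problem and continue
            failures[playlist_id] = repr(error)
    return results, failures


# Usage with the functions defined earlier (not executed here):
# results, failures = collect_per_playlist(get_playlists(username), get_features_from_playlist)
# user_songs_ds = pd.concat(list(results.values()))
# print(failures)  # playlists that need another look
```

Keeping the failures in a dictionary makes it possible to retry only those playlists later instead of repeating the whole run.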