# Lab | API wrappers - Create your collection of songs & audio features

Instructions

To move forward with the project, you need to create a collection of songs with their audio features - as large as possible!

These are the songs that we will cluster. Later, when the user inputs a song, we will find the cluster that song belongs to and recommend another song from the same cluster. The more songs you have, the more accurate and diverse your recommendations can be. That said, you may want the collection to be "curated" in some way: look for playlists that are diverse but that still meet certain standards.
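The recommendation step described above can be sketched in plain Python. This is only an illustration under assumptions: the centroids and per-song cluster labels are taken as given (they would come from whatever clustering step is used later), distance is Euclidean, and the names `nearest_cluster` and `recommend` as well as the toy feature values are hypothetical.

```python
import math
import random

def nearest_cluster(features, centroids):
    """Return the index of the centroid closest to `features` (Euclidean)."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(features, centroids[i]))

def recommend(features, centroids, songs, labels, rng=random):
    """Pick a random song from the cluster that `features` falls into."""
    cluster = nearest_cluster(features, centroids)
    candidates = [song for song, label in zip(songs, labels) if label == cluster]
    return rng.choice(candidates) if candidates else None

# Toy data (made up): two features per song, two clusters.
centroids = [(0.9, 0.1), (0.2, 0.8)]
songs = ["Song A", "Song B", "Song C", "Song D"]
labels = [0, 0, 1, 1]

# The input song is closest to the second centroid, so the pick
# comes from the songs labelled 1.
print(recommend((0.25, 0.7), centroids, songs, labels))
```

With more songs in each cluster, the random pick has more candidates to choose from, which is why a larger collection tends to give more varied recommendations.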

The process of sending hundreds or thousands of requests can take some time - it's normal if you have to wait a few minutes (or, if you're ambitious, even hours) to get all the data you need.
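One way to keep the number of requests down is to ask for audio features in batches. The sketch below assumes spotipy's `audio_features` call, which takes a list of track IDs (the Spotify endpoint accepts up to 100 per request). The helper names `chunked` and `fetch_audio_features` and the `pause` parameter are hypothetical; `client` is any object with an `audio_features` method, such as the `sp` client created in this notebook.

```python
import time

def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def fetch_audio_features(client, track_ids, batch_size=100, pause=0.0):
    """Collect audio features for many tracks, one request per batch."""
    features = []
    for batch in chunked(track_ids, batch_size):
        response = client.audio_features(batch) or []
        # Entries can be None when no features are available for a track.
        features.extend(f for f in response if f is not None)
        time.sleep(pause)  # optional delay between requests
    return features
```

For example, `fetch_audio_features(sp, track_ids, pause=0.5)` would send one request per 100 IDs and wait half a second between them; the right pause, if any, depends on how the API responds to your request volume.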

One way to collect as many songs as possible is to start with all the songs in a big, diverse playlist, then go to every artist present in that playlist and grab every song from every album by that artist. The number of songs you collect per playlist multiplies quickly this way!
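The expansion idea above could look roughly like the following. It is a sketch, not a definitive implementation: it assumes a spotipy-style client exposing `artist_albums`, `album_tracks`, and `next`, and it only follows the first-listed artist of each playlist track. The function names `iter_items` and `expand_playlist` are hypothetical.

```python
def iter_items(client, page):
    """Yield every item of a paged response, following its `next` links."""
    while page:
        yield from page["items"]
        page = client.next(page) if page.get("next") else None

def expand_playlist(client, playlist_items):
    """Return {track_id: (track_name, artist_name)} for every album track
    of every first-listed artist found in `playlist_items`."""
    # Deduplicate artists while keeping their order of appearance.
    artist_ids = list(dict.fromkeys(
        item["track"]["artists"][0]["id"]
        for item in playlist_items
        if item.get("track") and item["track"]["artists"]
    ))

    collected = {}
    for artist_id in artist_ids:
        albums = client.artist_albums(artist_id, limit=50)
        for album in iter_items(client, albums):
            for track in iter_items(client, client.album_tracks(album["id"])):
                # Keyed by track ID, so repeated tracks collapse into one entry.
                collected[track["id"]] = (track["name"],
                                          track["artists"][0]["name"])
    return collected
```

Keying the result by track ID keeps the collection free of duplicates even when the same track appears on several albums.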

In [1]:
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

In [2]:
# Credentials are read from the SPOTIPY_CLIENT_ID and SPOTIPY_CLIENT_SECRET
# environment variables, so the secrets stay out of the notebook.
sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials())

In [3]:
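playlist = sp.user_playlist_tracks('spotify', '5S8SJdl1BDc0ugpkEvFsIL')
tracks = playlist['items']
# Follow the 'next' link until the last page. Calling sp.next() once per
# iteration avoids requesting every page twice.
while playlist['next']:
    playlist = sp.next(playlist)
    tracks.extend(playlist['items'])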

In [23]:
playlist['next']

In [4]:
playlist.keys()

dict_keys(['href', 'items', 'limit', 'next', 'offset', 'previous', 'total'])

In [5]:
playlist['total']

10000

In [6]:
playlist['items']

[{'added_at': '2021-08-23T06:11:25Z',
  'added_by': {'external_urls': {'spotify': 'https://open.spotify.com/user/twgeb7mzdcv4u8h191dxrvlpc'},
   'href': 'https://api.spotify.com/v1/users/twgeb7mzdcv4u8h191dxrvlpc',
   'id': 'twgeb7mzdcv4u8h191dxrvlpc',
   'type': 'user',
   'uri': 'spotify:user:twgeb7mzdcv4u8h191dxrvlpc'},
  'is_local': False,
  'primary_color': None,
  'track': {'album': {'album_type': 'compilation',
    'artists': [{'external_urls': {'spotify': 'https://open.spotify.com/artist/3MKCzCnpzw3TjUYs2v7vDA'},
      'href': 'https://api.spotify.com/v1/artists/3MKCzCnpzw3TjUYs2v7vDA',
      'id': '3MKCzCnpzw3TjUYs2v7vDA',
      'name': 'Pyotr Ilyich Tchaikovsky',
      'type': 'artist',
      'uri': 'spotify:artist:3MKCzCnpzw3TjUYs2v7vDA'}],
    'available_markets': ['AD',
     'AE',
     'AG',
     'AL',
     'AM',
     'AO',
     'AR',
     'AT',
     'AU',
     'AZ',
     'BA',
     'BB',
     'BD',
     'BE',
     'BF',
     'BG',
     'BH',
     'BI',
     'BJ',
     'BN

In [7]:
# Note: `playlist` is now the last page fetched in In [3], so these are only that page's items
tracks = playlist['items']

In [8]:
songs = {
    track["track"]["name"]
    for track in playlist['items']
}

In [17]:
len(songs)

96
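Because the set above is keyed by track name, two different tracks that share a title collapse into one entry. A possible alternative, sketched under the assumption that each playlist item has the shape shown in the `playlist['items']` output (a `track` dict with `id`, `name`, and `artists`), is to key on the track ID instead. The function name `unique_tracks` is hypothetical, and skipping items without usable track data is a defensive assumption.

```python
def unique_tracks(items):
    """Return {track_id: (track_name, first_artist_name)} for playlist items."""
    result = {}
    for item in items:
        track = item.get("track")
        if not track or not track.get("id"):
            continue  # skip entries without usable track data
        artists = track.get("artists") or [{}]
        result[track["id"]] = (track["name"], artists[0].get("name"))
    return result
```

Calling `unique_tracks(playlist['items'])` would then keep the ID needed for later audio-feature requests alongside each name.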

In [9]:
songs

{'1/1 - Remastered 2004',
 '14:31',
 '3 Hours Of Relaxing Rain For Deep Sleep, Baby Sleep, Meditation, Studying',
 '??? (Bonus Track)',
 'A Cappella History (Bonus Cut)',
 'A Magic Thunderstorm In The Night',
 "Alice's Restaurant Massacree",
 'Backyard Wind Chimes on a Breezy Day for Deep Meditation',
 'Big Impressive Ocean Waves Hitting The Pebble Beach In Le Havre (Normandie, France)',
 'Birds Chirping in a Woodland Meadow',
 'Birds Singing In The Woods In Spring (Vögelstimmen In Frühjahr)',
 'Birdsong in a Natural Setting for Yoga Concentration',
 'Blue Room',
 'Brahms: Piano Quintet in F Minor, Op. 34a: I. Allegro non troppo',
 'Brahms: Piano Quintet in F Minor, Op. 34a: IV. Finale. Poco sostenuto',
 'Brazilian Jungle at Dusk for Healing Strength',
 'Calm Seaside Surf Deep Relaxation',
 'Chacone in G Major, HWV 435',
 'Children of the Son',
 'Clarinet Concerto in A, K.622 : 1. Allegro',
 'Constant Waterfall Power for Spiritual Well Being',
 'Crocodile in the Bath - Story',
 'Dense 

In [10]:
artists = {
    track['track']['artists'][0]['name']
    for track in playlist['items']
}

In [11]:
artists

{'Antonín Dvořák',
 'Arlo Guthrie',
 'Banco De Gaia',
 'Bluecoats Drum and Bugle Corp',
 'Brian Eno',
 'Cecil Taylor',
 'Charles Mingus',
 'Charles Vald',
 'Chris Butler',
 'Dream Theater',
 'Echoes Of Nature',
 'Elton John',
 'George Frideric Handel',
 'Glad',
 'Global Communication',
 'Jethro Tull',
 'Johann Sebastian Bach',
 'Johannes Brahms',
 'Kings of Nature',
 'Life Sounds Nature',
 'Markus Guentner',
 'Meditation Zen Master',
 'Nektar',
 'On The Rocks',
 'Ornette Coleman Trio',
 'P C III',
 'Paul McCartney',
 'Pro Sound Effects Library',
 'Pyotr Ilyich Tchaikovsky',
 'Ralph Vaughan Williams',
 'Supertramp',
 'The Charlie Daniels Band',
 'The Killers',
 'The Orb',
 'Ween',
 'Wilderness Studios Australia',
 'Wolfgang Amadeus Mozart',
 'Yes',
 '寺島尚彦'}

In [12]:
albums = []
for artist in artists:
    results = sp.search(q=f"artist:{artist}", type='album')
    albums_data = results['albums']['items']
    for album in albums_data:
        album_name = album.get('name')
        if album_name:
            albums.append(album_name)

In [13]:
albums

['Aqualung (Special Edition)',
 'Benefit',
 'Stand Up',
 'RökFlöte',
 'War Child (2002 Remaster)',
 'Stand Up',
 'The Very Best of Jethro Tull',
 'Living in the Past',
 'Songs from the Wood',
 'M.U. - The Best of Jethro Tull',
 'Journey To The Centre Of The Eye',
 '...Sounds Like This',
 'A Tab In The Ocean',
 'Recycled',
 'Down To Earth',
 'Remember The Future',
 'Ena Tragoudi Gia Sena',
 'The Other Side',
 'Magic Is A Child',
 'Esmeralda',
 'Breakfast In America (Deluxe Edition)',
 'Even In The Quietest Moments',
 'Crime Of The Century (Remastered)',
 'Breakfast In America',
 'Crisis? What Crisis?',
 'Famous Last Words (Remastered)',
 'The Very Best Of Supertramp',
 'Crime Of The Century (Deluxe)',
 'Breakfast In America (Remastered)',
 'Retrospectacle - The Supertramp Anthology',
 'Apollo',
 'Another Green World',
 'Ambient 1: Music For Airports (Remastered 2004)',
 'Small Craft On A Milk Sea',
 'Here Come The Warm Jets',
 'Wrong Way Up [Expanded Edition]',
 'Before And After Scienc

In [14]:
tracks = []
for artist in artists:
    results = sp.search(q=f"artist:{artist}", type='album')
    albums_data = results['albums']['items']
    for album in albums_data:
        album_id = album.get('id')
        if album_id:
            album_tracks = sp.album_tracks(album_id)['items']
            for track in album_tracks:
                track_name = track.get('name')
                if track_name:
                    tracks.append(track_name)

In [15]:
tracks

['Aqualung',
 'Cross-Eyed Mary',
 'Cheap Day Return',
 'Mother Goose',
 "Wond'ring Aloud",
 'Up to Me',
 'My God',
 'Hymn 43',
 'Slipstream',
 'Locomotive Breath',
 'Wind-Up',
 'Lick Your Fingers Clean',
 'Wind-Up',
 'Ian Anderson Interview',
 'With You There to Help Me - 2001 Remaster',
 'Nothing to Say - 2001 Remaster',
 'Alive and Well and Living In - 2001 Remaster',
 'Son - 2001 Remaster',
 'For Michael Collins, Jeffrey and Me - 2001 Remaster',
 'To Cry You a Song - 2001 Remaster',
 'A Time for Everything - 2001 Remaster',
 'Inside - 2001 Remaster',
 'Play in Time - 2001 Remaster',
 "Sossity; You're a Woman - 2001 Remaster",
 'Singing All Day - 2001 Remaster',
 "The Witch's Promise - 2001 Remaster",
 'Just Trying to Be - 2001 Remaster',
 'Teacher - Single Mix; 2001 Remaster',
 'A New Day Yesterday',
 'Jeffrey Goes to Leicester Square',
 'Bourée',
 'Back to the Family',
 'Look into the Sun',
 'Nothing Is Easy',
 'Fat Man',
 'We Used to Know',
 'Reasons for Waiting',
 'For a Thousand

In [16]:
len(tracks)

5463

In [24]:
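# As a function
import numpy as np

def get_songs_from_playlist(playlist_id, num_songs):
    """Return unique (track_id, track_name, artist_name) rows built from the
    top tracks of the first artist of every track in the playlist."""
    # Fetch every page of the playlist
    tracks = []
    playlist = sp.user_playlist_tracks("spotify", playlist_id)
    tracks.extend(playlist['items'])
    while playlist['next']:
        playlist = sp.next(playlist)
        tracks.extend(playlist['items'])

    songs = []
    seen_artists = set()
    for item in tracks:
        artist = item['track']['artists'][0]
        if artist['id'] in seen_artists:
            continue  # top tracks for this artist were already collected
        seen_artists.add(artist['id'])
        results = sp.artist_top_tracks(artist['id'])
        # Spotify returns at most 10 top tracks per artist
        for top_track in results['tracks'][:num_songs]:
            songs.append((top_track['id'], top_track['name'], artist['name']))

    return np.unique(songs, axis=0)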

In [None]:
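playlist_ids = ["3ggHWuEOItFO23NDng5pFt", "444NypIIQHk4YdoosaVpIy", "7d91tCzneuBicY1L9dZTaA", "3gxZ8SVUUiDqxsHBogDkQS",
               "37i9dQZF1DX8C9xQcOrE6T", "37i9dQZF1DWTALrdBtcXjw", "6DKA5wog6VHfm4bgtN3oJ8", "1xnPxImqdydUvAkHODCkZb"]
num_songs = 30
# get_songs_from_playlist takes a single playlist ID, so call it once per
# playlist and merge the rows into one set to drop duplicates.
all_songs = set()
for playlist_id in playlist_ids:
    all_songs.update(map(tuple, get_songs_from_playlist(playlist_id, num_songs)))
len(all_songs)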

In [25]:
import csv

with open('songs_list.csv', 'w', newline='', encoding='utf-8') as file:
    writer = csv.writer(file)