# Lab 8.05

## Instructions

To move forward with the project, you need to create a collection of songs with their audio features - as large as possible!

These are the songs that we will cluster. Later, when the user inputs a song, we will find the cluster that song belongs to and recommend another song from the same cluster. The more songs you have, the more accurate and diverse your recommendations can be. That said, you might want to make sure the collected songs are "curated" in some way: try to find playlists that are diverse but that also meet certain standards.

The process of sending hundreds or thousands of requests can take some time - it's normal if you have to wait a few minutes (or, if you're ambitious, even hours) to get all the data you need.

One way to collect as many songs as possible is to start with all the songs of a big, diverse playlist, then go to every artist present in that playlist and grab every song of every album by that artist. The number of songs you collect per playlist will grow very quickly, because each artist in the playlist contributes an entire discography.
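The playlist-to-artist-to-album expansion described above could be sketched roughly as follows. This is a minimal sketch, not the lab's reference solution: `collect_artist_track_ids` and `iter_pages` are hypothetical helper names, `sp` is assumed to be an authenticated `spotipy.Spotify` client, and the artist IDs are assumed to come from `track['track']['artists'][0]['id']` on each playlist item.

```python
def iter_pages(sp, page):
    """Yield every item of a paginated Spotify response (hypothetical helper)."""
    while page:
        yield from page["items"]
        # 'next' is None on the last page, which ends the loop.
        page = sp.next(page) if page.get("next") else None


def collect_artist_track_ids(sp, artist_ids):
    """Return the sorted, de-duplicated track IDs of every album by each artist.

    Assumption: `sp` is an authenticated spotipy.Spotify client.
    """
    track_ids = set()
    # A set avoids re-crawling artists that appear several times in a playlist.
    for artist_id in set(artist_ids):
        albums = sp.artist_albums(artist_id, album_type="album", limit=50)
        for album in iter_pages(sp, albums):
            tracks = sp.album_tracks(album["id"], limit=50)
            for track in iter_pages(sp, tracks):
                if track.get("id"):  # skip entries without an ID
                    track_ids.add(track["id"])
    return sorted(track_ids)
```

Collecting IDs in a set matters here because the same track can appear on several albums, and the same artist can appear on several playlists.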

In [26]:
import getpass

import pandas as pd
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

# Prompt for the client secret so it is never stored in the notebook.
client_secret = getpass.getpass()


In [27]:
sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials(client_id='9d5b81b176e145c48ed913c9881044ad',
                                                           client_secret=client_secret))

In [28]:
rock_playlist_id = '37i9dQZF1DWZJhOVGWqUKF'
classical_playlist_id = '37i9dQZF1DWWEJlAGA9gs0'
pop_playlist_id = '37i9dQZF1DXbKGrOUA30KN'
kids_playlist_id = '37i9dQZF1DXecgCDnEPSBf'
jazz_playlist_id = '37i9dQZF1DX7YCknf2jT6s'

In [29]:
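def get_playlist_tracks(username, playlist_id):
    """Return every item of a playlist, following pagination links."""
    # Recent spotipy releases deprecate user_playlist_tracks in favour of
    # sp.playlist_items(playlist_id); the call is kept here as originally written.
    results = sp.user_playlist_tracks(username, playlist_id)
    tracks = results['items']
    # The API returns one page at a time; 'next' is None on the last page.
    while results['next']:
        results = sp.next(results)
        tracks.extend(results['items'])
    return tracks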

In [30]:
rock_playlist = get_playlist_tracks("spotify", rock_playlist_id)
classical_playlist = get_playlist_tracks("spotify", classical_playlist_id)
pop_playlist = get_playlist_tracks("spotify", pop_playlist_id)
kids_playlist = get_playlist_tracks("spotify", kids_playlist_id)
jazz_playlist = get_playlist_tracks("spotify", jazz_playlist_id)

In [33]:
rock_playlist[0]['track']

{'album': {'album_type': 'single',
  'artists': [{'external_urls': {'spotify': 'https://open.spotify.com/artist/1aOt6LvXOV6I8dv1A5Diia'},
    'href': 'https://api.spotify.com/v1/artists/1aOt6LvXOV6I8dv1A5Diia',
    'id': '1aOt6LvXOV6I8dv1A5Diia',
    'name': 'DON BROCO',
    'type': 'artist',
    'uri': 'spotify:artist:1aOt6LvXOV6I8dv1A5Diia'}],
  'available_markets': ['AD', 'AE', 'AG', 'AL', 'AM', 'AO', 'AR', 'AT', 'AU', 'AZ',
   'BA', 'BB', 'BD', 'BE', 'BF', 'BG', 'BH', 'BI', 'BJ', 'BN',
   'BO', 'BR', 'BS', 'BT', 'BW', 'BY', 'BZ', 'CA', 'CD', 'CG',
   'CH', 'CI', 'CL', 'CM', 'CO', 'CR', 'CV', 'CW', 'CY', 'CZ',
   'DE', 'DJ', 'DK', 'DM', 'DO', 'DZ', 'EC', 'EE', 'EG', 'ES',
   'FI', 'FJ', 'FM', 'FR', 'GA', 'GB', 'GD', 'GE', 'GH', 'GM',
   'GN', 'GQ', 'GR', 'GT', 'GW', 'GY', 'HK', 'HN', 'HR', 'HT',

In [16]:
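def get_artists_from_playlist(playlist_id, tracks):
    """Return the name of the first-listed artist of each playlist track."""
    # playlist_id is not used here; it is kept so the existing calls still work.
    artists = []

    for track in tracks:
        # Only the first credited artist is kept, so collaborations lose the rest.
        artist_info = track['track']['artists'][0]
        artists.append(artist_info['name'])

    return artists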

In [17]:
rock_artists = get_artists_from_playlist(rock_playlist_id, rock_playlist)
classical_artists = get_artists_from_playlist(classical_playlist_id, classical_playlist)
pop_artists = get_artists_from_playlist(pop_playlist_id, pop_playlist)
kids_artists = get_artists_from_playlist(kids_playlist_id, kids_playlist)
jazz_artists = get_artists_from_playlist(jazz_playlist_id, jazz_playlist)

In [18]:
def get_song_names(tracks):
    """Return two parallel lists: the song names and the track IDs."""
    song_names = [track['track']['name'] for track in tracks]
    track_ids = [track['track']['id'] for track in tracks]

    return song_names, track_ids

In [20]:
rock_song_name, rock_track_id = get_song_names(rock_playlist)
classical_song_name, classical_track_id = get_song_names(classical_playlist)
pop_song_name, pop_track_id = get_song_names(pop_playlist)
kids_song_name, kids_track_id = get_song_names(kids_playlist)
jazz_song_name, jazz_track_id = get_song_names(jazz_playlist)

In [32]:
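def main_features_df(track_ids):
    """Fetch audio features for each track ID and return them as a DataFrame."""
    features_per_call = []
    for track_id in track_ids:
        # One request per track; audio_features returns a list, even for one ID.
        features_per_call.append(sp.audio_features(track_id))
    # Flatten the list of one-element lists into a flat list of feature dicts.
    converted = [features for result in features_per_call for features in result]
    df = pd.json_normalize(converted)

    return df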
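
Since `main_features_df` sends one request per track, a large collection means a large number of calls. The Spotify audio-features endpoint accepts up to 100 track IDs per request, so batching the IDs may cut the request count considerably. The following is a minimal sketch under that assumption; `chunked`, `fetch_audio_features`, and the `pause_s` parameter are hypothetical names introduced for illustration, and `sp` is assumed to be an authenticated `spotipy.Spotify` client.

```python
import time


def chunked(ids, size=100):
    """Yield consecutive slices of `ids`, each with at most `size` elements."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def fetch_audio_features(sp, track_ids, batch_size=100, pause_s=0.0):
    """Fetch audio features in batches instead of one request per track.

    Assumption: `sp` is an authenticated spotipy.Spotify client.
    Entries may be None for tracks that have no audio features.
    """
    features = []
    for batch in chunked(list(track_ids), batch_size):
        features.extend(sp.audio_features(batch))
        # Optional pause between requests to stay clear of rate limits.
        time.sleep(pause_s)
    return features
```

The returned list keeps the order of `track_ids`, so it can be passed to `pd.json_normalize` in the same way as `converted` once any `None` entries have been handled.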

In [31]:
df_rock = main_features_df(rock_track_id)
df_classical = main_features_df(classical_track_id)
df_pop = main_features_df(pop_track_id)
df_kids = main_features_df(kids_track_id)
df_jazz = main_features_df(jazz_track_id)

In [36]:
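def add_features_to_df(song_names, artists, features_df):
    """Append song_name and artist columns to the audio-features DataFrame."""
    df_song_names = pd.DataFrame(song_names, columns=['song_name'])
    df_artists = pd.DataFrame(artists, columns=['artist'])

    # Columns are aligned by row position, so all three inputs must be in the same track order.
    final_df = pd.concat([features_df, df_song_names, df_artists], axis=1)

    return final_df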
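
Because `add_features_to_df` joins its inputs by row position, a single missing track or missing feature entry would shift every later row. One possible alternative is to join on the track `id` instead. The sketch below uses plain dictionaries to do so; `build_song_records` is a hypothetical helper, and it assumes the playlist items and feature dictionaries have the shapes shown earlier in this notebook.

```python
def build_song_records(playlist_items, features):
    """Join playlist items and audio features on track ID (hypothetical helper).

    Items without a track, and tracks without features, are skipped rather
    than left to misalign the rows.
    """
    features_by_id = {f["id"]: f for f in features if f is not None}
    records = []
    for item in playlist_items:
        track = item.get("track")
        if not track or track.get("id") not in features_by_id:
            continue
        record = dict(features_by_id[track["id"]])  # copy, then add metadata
        record["song_name"] = track["name"]
        record["artist"] = track["artists"][0]["name"]  # first credited artist
        records.append(record)
    return records
```

The resulting list of records can be passed to `pd.DataFrame(records)` to obtain the same columns as `final_df`.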

In [38]:
final_rock_df = add_features_to_df(rock_song_name, rock_artists, df_rock)
final_classical_df = add_features_to_df(classical_song_name, classical_artists, df_classical)
final_pop_df = add_features_to_df(pop_song_name, pop_artists, df_pop)
final_kids_df = add_features_to_df(kids_song_name, kids_artists, df_kids)
final_jazz_df = add_features_to_df(jazz_song_name, jazz_artists, df_jazz)

In [46]:
songs_df = pd.concat([final_rock_df, final_classical_df, final_pop_df, final_kids_df, final_jazz_df], axis=0)
songs_df.reset_index(drop=True, inplace=True)

In [48]:
songs_df

Unnamed: 0,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,type,id,uri,track_href,analysis_url,duration_ms,time_signature,song_name,artist
0,0.613,0.946,1,-4.522,0,0.1130,0.003910,0.033800,0.1100,0.374,141.994,audio_features,5spwALkCxee9CstWeKG7gF,spotify:track:5spwALkCxee9CstWeKG7gF,https://api.spotify.com/v1/tracks/5spwALkCxee9...,https://api.spotify.com/v1/audio-analysis/5spw...,245244,4,Fingernails,DON BROCO
1,0.639,0.860,11,-4.270,1,0.0690,0.000306,0.005490,0.0507,0.328,112.025,audio_features,0UuNO0yYvsMPlyLF7RfQlg,spotify:track:0UuNO0yYvsMPlyLF7RfQlg,https://api.spotify.com/v1/tracks/0UuNO0yYvsMP...,https://api.spotify.com/v1/audio-analysis/0UuN...,224801,4,Angst,Rammstein
2,0.473,0.989,6,-4.212,0,0.0597,0.000645,0.000006,0.3420,0.327,93.988,audio_features,51lcM37Li2HOhk8F8kPwUv,spotify:track:51lcM37Li2HOhk8F8kPwUv,https://api.spotify.com/v1/tracks/51lcM37Li2HO...,https://api.spotify.com/v1/audio-analysis/51lc...,243561,4,AfterLife,Five Finger Death Punch
3,0.580,0.841,5,-5.274,0,0.0320,0.000318,0.007700,0.3440,0.850,97.000,audio_features,5VDiBRQ6k1RW7H6HGcyme8,spotify:track:5VDiBRQ6k1RW7H6HGcyme8,https://api.spotify.com/v1/tracks/5VDiBRQ6k1RW...,https://api.spotify.com/v1/audio-analysis/5VDi...,212621,4,Like A Drug,BRKN LOVE
4,0.554,0.846,1,-3.453,0,0.0427,0.001270,0.000000,0.3260,0.520,103.998,audio_features,3kpW19uTVTQF9EYJ9jhIOG,spotify:track:3kpW19uTVTQF9EYJ9jhIOG,https://api.spotify.com/v1/tracks/3kpW19uTVTQF...,https://api.spotify.com/v1/audio-analysis/3kpW...,218553,4,Foxglove,Boston Manor
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
471,0.457,0.645,1,-11.235,1,0.0946,0.904000,0.898000,0.1160,0.675,177.678,audio_features,1LBemY1Vmr78TYltVOiX2N,spotify:track:1LBemY1Vmr78TYltVOiX2N,https://api.spotify.com/v1/tracks/1LBemY1Vmr78...,https://api.spotify.com/v1/audio-analysis/1LBe...,214433,3,Don't Break,Immanuel Wilkins
472,0.276,0.525,2,-14.073,0,0.0449,0.305000,0.850000,0.1080,0.385,168.833,audio_features,5n2ungarkfmrpDS5VhKja6,spotify:track:5n2ungarkfmrpDS5VhKja6,https://api.spotify.com/v1/tracks/5n2ungarkfmr...,https://api.spotify.com/v1/audio-analysis/5n2u...,248562,4,Unrest II,Brandee Younger
473,0.368,0.297,0,-12.584,1,0.0342,0.971000,0.421000,0.1120,0.304,172.846,audio_features,4miACj3Pi7Pox3ylFNFZ9s,spotify:track:4miACj3Pi7Pox3ylFNFZ9s,https://api.spotify.com/v1/tracks/4miACj3Pi7Po...,https://api.spotify.com/v1/audio-analysis/4miA...,195778,4,Reste un oiseau,Anne Paceo
474,0.324,0.284,0,-19.674,0,0.0327,0.907000,0.876000,0.1200,0.170,102.037,audio_features,55sg7XJ6nyfOv1VM1OHdHD,spotify:track:55sg7XJ6nyfOv1VM1OHdHD,https://api.spotify.com/v1/tracks/55sg7XJ6nyfO...,https://api.spotify.com/v1/audio-analysis/55sg...,277000,4,What Happens Next?,Little North
