# Spotify Afrobeats Recommendation System
By Afolabi Cardoso

## Data Gathering

This notebook features the data gathering process. Using the Spotify API and Spotipy library, I was able to create functions to collect the useful metadata and audio features from the tracks in the playlist. After collecting the features as a dataframe, I exported them as a csv

#### Import Libraries

In [67]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline

import time

#### Authentication

To gain access to the Spotify API, I need to create a client id and client secret.

In [43]:

client_credentials_manager = SpotifyClientCredentials(client_id = '2101cd224f5948e19c4c782d76744ed3',
                                                      client_secret = '879abdfca432449facc9d8566fb40ab6')

#### Spotipy Object

Using the client id and client secret created above, I create a spotipy Object using SpotifyClientCredentials

In [68]:
sp = spotipy.Spotify(client_credentials_manager = client_credentials_manager)

#### Get track metadata and features

This function takes in a playlist URI and returns a dataframe containing metadata such as artist name, genre, album name, track_uri, danceability, energy, loudness, instrumentalness etc

In [48]:
def get_track_info(playlist):
    
    #split the playlist_uri is at the end of the playlists url. I'll use the .split method to extract it
    uri = playlist.split("/")[-1].split("?")[0]
    
    #from the spotipy library, use the playlist_tracks() method to extract each track from the playlist uri
    #It comes in a nested dictionary format
    
    results = sp.playlist_tracks(uri)
    tracks = results['items']
    
    while results['next']:
        results = sp.next(results)
        tracks.extend(results['items'])
    
    
    #create an empty dictionary with the info we want to extract as columns
    info = {
    'track_uri':[],
    'track_name':[],
    'artist_name':[],
    'artist_info':[],
    'artist_uri':[],
    'artist_popularity':[],
    'artist_genre':[],
    'album':[],
    'track_pop':[],
    'year_released':[]
    }
    
    features = {'danceability': [],
     'energy': [],
     'key': [],
     'loudness': [],
     'mode': [],
     'speechiness': [],
     'acousticness': [],
     'instrumentalness': [],
     'liveness': [],
     'valence': [],
     'tempo': [],
     'type': [],
     'id': [],
     'uri': [],
     'track_href': [],
     'analysis_url': [],
     'duration_ms': [],
     'time_signature': []
               }
    
    #using a for loop, get the the info for each song and put it into the empty dictionary
    for track in tracks:
        #URI
        info['track_uri'].append((track["track"]["uri"]).split(':')[2])

        #Track name
        info['track_name'].append(track["track"]["name"])

        #Main Artist
        info['artist_uri'].append((track["track"]["artists"][0]["uri"]).split(':')[2])
        info['artist_info'].append(sp.artist(track["track"]["artists"][0]["uri"]))

        #Name, popularity, genre
        info['artist_name'].append(track["track"]["artists"][0]["name"])
        info['artist_popularity'].append(sp.artist(track["track"]["artists"][0]["uri"])["popularity"])
        info['artist_genre'].append(sp.artist(track["track"]["artists"][0]["uri"])["genres"])

        #Album
        info['album'].append(track["track"]["album"]["name"])

        #Popularity of the track, year released
        info['track_pop'].append(track["track"]["popularity"])
        info['year_released'].append(track["track"]['album']['release_date'])
        
        #Transform the info dictionary into a dataframe
        info_df = pd.DataFrame(info)
        
        #loop through the tracks to their features and assign it to the empty dictionary
        track_uri = track["track"]["uri"].split(':')[2] 
        
        try:
            for key,value in (sp.audio_features(track_uri)[0]).items():
                features[key].append(value)
            
        except:
            print(f'failed on track {track["track"]["name"]}')
            continue
        #time.sleep(5)
        
    #Transform the features dictionary into a dataframe
    features_df = pd.DataFrame(features)
    
        
    
    return info_df.join(features_df)
        

#### Add label to the playlist dataframe


This function calls the get_track_info function above and creates a column with the playlist genre or given username

In [49]:
def raw_data(user_playlist_url, genre):
    user_playlist_info = get_track_info(user_playlist_url)
    #add user genre
    user_playlist_info.loc[:,'genre'] = genre
    return user_playlist_info

#### Afrobeats playlist

I created an afrobeats playlist with spotify that contains 1500+ tracks. Using the functions above, I will fetch the tracks metadata and features

In [50]:
afrobeats_playlist_url= "https://open.spotify.com/playlist/5ZCzd0nCLqiIX1jwQWfazW"

In [51]:
afrobeats_df = raw_data(afrobeats_playlist_url, 'afrobeats')

failed on track Lonely


In [52]:
afrobeats_df.head(2)

Unnamed: 0,track_uri,track_name,artist_name,artist_info,artist_uri,artist_popularity,artist_genre,album,track_pop,year_released,...,valence,tempo,type,id,uri,track_href,analysis_url,duration_ms,time_signature,genre
0,0mDRuZmReEm6DquPLJlcEm,Oleku (feat. Brymo),Ice Prince,{'external_urls': {'spotify': 'https://open.sp...,1sSt1DqqqFLkPwfrqafVyn,54,"[afro dancehall, afropop, azontobeats, nigeria...",Oleku (feat. Brymo),49,2010-09-21,...,0.751,162.112,audio_features,0mDRuZmReEm6DquPLJlcEm,spotify:track:0mDRuZmReEm6DquPLJlcEm,https://api.spotify.com/v1/tracks/0mDRuZmReEm6...,https://api.spotify.com/v1/audio-analysis/0mDR...,291364.0,5.0,afrobeats
1,1tvi8tv0eykhNcV1WtaIqO,Move Back,5five,{'external_urls': {'spotify': 'https://open.sp...,37zb1JQnDV9dRLatrASEj1,29,"[afro dancehall, azonto, hiplife]",Move Back,43,2011-04-08,...,0.793,125.032,audio_features,1tvi8tv0eykhNcV1WtaIqO,spotify:track:1tvi8tv0eykhNcV1WtaIqO,https://api.spotify.com/v1/tracks/1tvi8tv0eykh...,https://api.spotify.com/v1/audio-analysis/1tvi...,237107.0,4.0,afrobeats


In [69]:
len(afrobeats_df)

1874

In [70]:
afrobeats_df.isna().sum().sum()

0

In [71]:
afrobeats_df[afrobeats_df['energy'].isnull()]

Unnamed: 0,track_uri,track_name,artist_name,artist_info,artist_uri,artist_popularity,artist_genre,album,track_pop,year_released,...,valence,tempo,type,id,uri,track_href,analysis_url,duration_ms,time_signature,genre


In [72]:
afrobeats_df.dropna(inplace=True)

In [73]:
afrobeats_df.isnull().sum().sum()

0

#### Exporta afrobeats as a csv

In [58]:
afrobeats_df.to_csv('../data/afrobeats.csv', index = False)

#### Jacks playlist

My classmate Jack volunteered his spotify playlist which contains mostly classical music. 

In [81]:
jacks_playlist_url = 'https://open.spotify.com/playlist/6UlskZAcTPzcGMnQaMnIVm?si=5cbb031d1fdd4064'

In [82]:
jacks_playlist_df = raw_data(jacks_playlist_url, 'jack')

In [83]:
jacks_playlist_df.head()

Unnamed: 0,track_uri,track_name,artist_name,artist_info,artist_uri,artist_popularity,artist_genre,album,track_pop,year_released,...,valence,tempo,type,id,uri,track_href,analysis_url,duration_ms,time_signature,genre
0,1YzcrcgR3T2RwAZg5tSvYP,Die Walküre / Erster Aufzug: Orchestervorspiel,Richard Wagner,{'external_urls': {'spotify': 'https://open.sp...,1C1x4MVkql8AiABuTw6DgE,58,"[classical, german opera, german romanticism, ...",Solti - Wagner - The Operas,15,2012-01-01,...,0.164,113.033,audio_features,1YzcrcgR3T2RwAZg5tSvYP,spotify:track:1YzcrcgR3T2RwAZg5tSvYP,https://api.spotify.com/v1/tracks/1YzcrcgR3T2R...,https://api.spotify.com/v1/audio-analysis/1Yzc...,196000,3,jack
1,6JmduA0I9QYtD1RiHQgWjj,"Götterdämmerung, WWV 86D, Act III: Siegfrieds ...",Richard Wagner,{'external_urls': {'spotify': 'https://open.sp...,1C1x4MVkql8AiABuTw6DgE,58,"[classical, german opera, german romanticism, ...","Wagner: Götterdämmerung, WWV 86D",3,2018-11-09,...,0.0396,66.858,audio_features,6JmduA0I9QYtD1RiHQgWjj,spotify:track:6JmduA0I9QYtD1RiHQgWjj,https://api.spotify.com/v1/tracks/6JmduA0I9QYt...,https://api.spotify.com/v1/audio-analysis/6Jmd...,409787,3,jack
2,1U1i1HBJ5H8DY5J4fO8ySg,Tannhäuser: Overture,Richard Wagner,{'external_urls': {'spotify': 'https://open.sp...,1C1x4MVkql8AiABuTw6DgE,58,"[classical, german opera, german romanticism, ...",Wagner: Orchestral Favourites,44,1994-01-01,...,0.0579,81.802,audio_features,1U1i1HBJ5H8DY5J4fO8ySg,spotify:track:1U1i1HBJ5H8DY5J4fO8ySg,https://api.spotify.com/v1/tracks/1U1i1HBJ5H8D...,https://api.spotify.com/v1/audio-analysis/1U1i...,853827,4,jack
3,5loYnrcJ1XTIbs0MXKntlr,"Götterdämmerung, WWV 86D, Prologue: Siegfrieds...",Richard Wagner,{'external_urls': {'spotify': 'https://open.sp...,1C1x4MVkql8AiABuTw6DgE,58,"[classical, german opera, german romanticism, ...","Wagner: Götterdämmerung, WWV 86D",0,2018-11-09,...,0.0622,128.729,audio_features,5loYnrcJ1XTIbs0MXKntlr,spotify:track:5loYnrcJ1XTIbs0MXKntlr,https://api.spotify.com/v1/tracks/5loYnrcJ1XTI...,https://api.spotify.com/v1/audio-analysis/5loY...,309053,3,jack
4,3ci6KfIfr3UiIRgJ3WVYhd,"Piano Concerto No. 1 in D Minor, Op. 15: I. Ma...",Johannes Brahms,{'external_urls': {'spotify': 'https://open.sp...,5wTAi7QkpP6kp8a54lmTOq,68,"[classical, german romanticism, late romantic ...",Brahms: Piano Concerto No.1,31,2005-01-01,...,0.0395,87.459,audio_features,3ci6KfIfr3UiIRgJ3WVYhd,spotify:track:3ci6KfIfr3UiIRgJ3WVYhd,https://api.spotify.com/v1/tracks/3ci6KfIfr3Ui...,https://api.spotify.com/v1/audio-analysis/3ci6...,1407000,4,jack


#### Exporta Jack's playlist as a csv

In [84]:
jacks_playlist_df.to_csv('../data/jack.csv', index = False)

#### Ankita playlist

Ankita, also a classmate, gave me her spotify url. Her playlist is a mixture of pop and Indian music

In [63]:
ankita_playlist_url = 'https://open.spotify.com/playlist/6qzbkhrmxdFj5TtaXR0sfI?si=ElY40mMjQ7apOsHTGlQI7A'

In [64]:
ankita_playlist_df = raw_data(ankita_playlist_url, 'ankita')

In [65]:
ankita_playlist_df.head()

Unnamed: 0,track_uri,track_name,artist_name,artist_info,artist_uri,artist_popularity,artist_genre,album,track_pop,year_released,...,valence,tempo,type,id,uri,track_href,analysis_url,duration_ms,time_signature,genre
0,068HSvCf5MbQfhV4qqaelg,"Haan Main Galat (From ""Love Aaj Kal"")",Pritam,{'external_urls': {'spotify': 'https://open.sp...,1wRPtKGflJrBx9BmLsSwlU,84,"[desi pop, filmi, indian instrumental, modern ...","Haan Main Galat (From ""Love Aaj Kal"")",62,2020-01-29,...,0.893,117.979,audio_features,068HSvCf5MbQfhV4qqaelg,spotify:track:068HSvCf5MbQfhV4qqaelg,https://api.spotify.com/v1/tracks/068HSvCf5MbQ...,https://api.spotify.com/v1/audio-analysis/068H...,218644,4,ankita
1,2aHz87L2Z4a0ZEQ7vMlH8z,Badal Pe Paon Hai,Salim–Sulaiman,{'external_urls': {'spotify': 'https://open.sp...,6ohaQzKaXrobAL8paLSaxq,64,"[desi pop, filmi, indian folk, modern bollywood]",Chak De India,55,2007-08-01,...,0.869,104.999,audio_features,2aHz87L2Z4a0ZEQ7vMlH8z,spotify:track:2aHz87L2Z4a0ZEQ7vMlH8z,https://api.spotify.com/v1/tracks/2aHz87L2Z4a0...,https://api.spotify.com/v1/audio-analysis/2aHz...,243931,4,ankita
2,0mACGvgvCcwjfJCPFLeabh,Ziddi Dil,Vishal Dadlani,{'external_urls': {'spotify': 'https://open.sp...,6CXEwIaXYfVJ84biCxqc9k,72,"[desi hip hop, desi pop, filmi, modern bollywood]",Mary Kom,55,2018-05-27,...,0.541,99.007,audio_features,0mACGvgvCcwjfJCPFLeabh,spotify:track:0mACGvgvCcwjfJCPFLeabh,https://api.spotify.com/v1/tracks/0mACGvgvCcwj...,https://api.spotify.com/v1/audio-analysis/0mAC...,286093,4,ankita
3,0dT6g7oqjEqphjXMk8MKHz,Dangal,Pritam,{'external_urls': {'spotify': 'https://open.sp...,1wRPtKGflJrBx9BmLsSwlU,84,"[desi pop, filmi, indian instrumental, modern ...",Dangal (Original Motion Picture Soundtrack),57,2016-12-14,...,0.665,178.017,audio_features,0dT6g7oqjEqphjXMk8MKHz,spotify:track:0dT6g7oqjEqphjXMk8MKHz,https://api.spotify.com/v1/tracks/0dT6g7oqjEqp...,https://api.spotify.com/v1/audio-analysis/0dT6...,299326,4,ankita
4,77UjLW8j5UAGAGVGhR5oUK,Pray For Me (with Kendrick Lamar),The Weeknd,{'external_urls': {'spotify': 'https://open.sp...,1Xyo4u8uXC1ZmMpatF05PJ,97,"[canadian contemporary r&b, canadian pop, pop]",Black Panther The Album Music From And Inspire...,73,2018-02-09,...,0.188,100.584,audio_features,77UjLW8j5UAGAGVGhR5oUK,spotify:track:77UjLW8j5UAGAGVGhR5oUK,https://api.spotify.com/v1/tracks/77UjLW8j5UAG...,https://api.spotify.com/v1/audio-analysis/77Uj...,211440,4,ankita


#### Export Ankita's playlist as a csv

In [66]:
ankita_playlist_df.to_csv('../data/ankita.csv', index = False)

#### Playlist featuring top songs from Fela

Fela pioneered the Afrobeats sound. He called his sound Afrobeat (without the s). I will compare the difference in sounds in the EDA section

In [75]:
fela_playlist_url = 'https://open.spotify.com/playlist/1bSsiqaBobgbnZTTVEo4Qh?si=3093a84902434561'

In [76]:
fela_playlist_df = raw_data(fela_playlist_url, 'fela')

In [77]:
fela_playlist_df.head()

Unnamed: 0,track_uri,track_name,artist_name,artist_info,artist_uri,artist_popularity,artist_genre,album,track_pop,year_released,...,valence,tempo,type,id,uri,track_href,analysis_url,duration_ms,time_signature,genre
0,11GDQVqIEKAB4QKOcIVOvG,Zombie,Fela Kuti,{'external_urls': {'spotify': 'https://open.sp...,5CG9X521RDFWCuAhlo6QoR,53,"[afrobeat, afropop, funk, world]",Zombie,43,1977-01-01,...,0.585,131.812,audio_features,11GDQVqIEKAB4QKOcIVOvG,spotify:track:11GDQVqIEKAB4QKOcIVOvG,https://api.spotify.com/v1/tracks/11GDQVqIEKAB...,https://api.spotify.com/v1/audio-analysis/11GD...,745653,4,fela
1,6sNNtFKdCz0bnjx7IEXyl2,Expensive Shit,Fela Kuti,{'external_urls': {'spotify': 'https://open.sp...,5CG9X521RDFWCuAhlo6QoR,53,"[afrobeat, afropop, funk, world]",Expensive Shit,37,1975-01-01,...,0.682,122.635,audio_features,6sNNtFKdCz0bnjx7IEXyl2,spotify:track:6sNNtFKdCz0bnjx7IEXyl2,https://api.spotify.com/v1/tracks/6sNNtFKdCz0b...,https://api.spotify.com/v1/audio-analysis/6sNN...,793200,4,fela
2,7nM9ZnOD6nL04E5dOjjZiw,Let's Start (feat. Ginger Baker) - Live,Fela Kuti,{'external_urls': {'spotify': 'https://open.sp...,5CG9X521RDFWCuAhlo6QoR,53,"[afrobeat, afropop, funk, world]",Fela With Ginger Baker Live!,38,1971-01-01,...,0.743,102.327,audio_features,7nM9ZnOD6nL04E5dOjjZiw,spotify:track:7nM9ZnOD6nL04E5dOjjZiw,https://api.spotify.com/v1/tracks/7nM9ZnOD6nL0...,https://api.spotify.com/v1/audio-analysis/7nM9...,465280,4,fela
3,1HC6o3lTQvNmtH3ejYG4hs,Gentleman,Fela Kuti,{'external_urls': {'spotify': 'https://open.sp...,5CG9X521RDFWCuAhlo6QoR,53,"[afrobeat, afropop, funk, world]",Gentleman,39,1973-01-01,...,0.767,86.621,audio_features,1HC6o3lTQvNmtH3ejYG4hs,spotify:track:1HC6o3lTQvNmtH3ejYG4hs,https://api.spotify.com/v1/tracks/1HC6o3lTQvNm...,https://api.spotify.com/v1/audio-analysis/1HC6...,881240,4,fela
4,4fSGItb7Y1uOGfSoZDadhn,Trouble Sleep Yanga Wake Am,Fela Kuti,{'external_urls': {'spotify': 'https://open.sp...,5CG9X521RDFWCuAhlo6QoR,53,"[afrobeat, afropop, funk, world]",Roforofo Fight,33,1972-01-01,...,0.599,142.04,audio_features,4fSGItb7Y1uOGfSoZDadhn,spotify:track:4fSGItb7Y1uOGfSoZDadhn,https://api.spotify.com/v1/tracks/4fSGItb7Y1uO...,https://api.spotify.com/v1/audio-analysis/4fSG...,726027,4,fela


#### Exporta fela playlist as a csv

In [80]:
fela_playlist_df.to_csv('../data/fela.csv', index = False)

The next notebook features me performing exploratory data analysis on the csv exported. 