# This notebook will try to streamline/automate getting my data

The plan is to have one function that takes a genre and finds the playlist IDs with more than 20,000 followers. I'm also going to combine all of my previous functions into one function that takes a playlist ID and returns a complete DataFrame. Then I will use a for loop to go through each 'valid' playlist ID for one genre. I ended up with roughly 30 to 50 playlists for each genre (country, alternative, pop, rap, club, and hits). The last step is to concatenate all of the genre DataFrames into the final DataFrame.

In [1]:
import requests
import pandas as pd
import numpy as np
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

In [2]:
#Credentials
cid ='XX'
secret = "XX"

client_credentials_manager = SpotifyClientCredentials(client_id=cid, client_secret=secret)
sp = spotipy.Spotify(client_credentials_manager=client_credentials_manager)

## Getting playlist IDs

In [3]:
#sp.search(q='country', type='playlist', market = 'US', limit=50)

This sp.search call will give me 50 playlists with 'country' in the title. I'm going to create a function that takes a genre (e.g. country) and returns a list of IDs for playlists that have that genre in the title and more than 20,000 followers.

Drilling down into the search API response:

In [4]:
temp = sp.search(q='country', type='playlist', market = 'US', limit=50)

In [5]:
ids=[]
for i in temp['playlists']['items']:
    ids.append(i['id'])

In [6]:
ids

['37i9dQZF1DX8WMG8VPSOJC',
 '37i9dQZF1DWYnwbYQ5HnZU',
 '37i9dQZF1DXaiEFNvQPZrM',
 '37i9dQZF1DXdfhOsjRMISB',
 '37i9dQZF1DWYiR2Uqcon0X',
 '6nU0t33tQA2i0qTI5HiyRV',
 '37i9dQZF1DWZBCPUIUs2iR',
 '37i9dQZF1DX6RCydf9ytsj',
 '37i9dQZF1DWYP5PUsVbso9',
 '1f6kfopjy0uVYzKN9MUVMr',
 '37i9dQZF1DX1lVhptIYRda',
 '37i9dQZF1DX7CGYgLhqwu5',
 '37i9dQZF1DWWH0izG4erma',
 '4SssSas9VvpPXlMCWXnu91',
 '37i9dQZF1DWWnpcjfCqaW0',
 '37i9dQZF1DX4bf0P6HTTom',
 '37i9dQZF1DXdmMcgFhLQ8u',
 '37i9dQZF1DWXi7h4mmmkzD',
 '2NVuOHHEOImpkirdrcufV3',
 '37i9dQZF1DWU2LcZVHsTdv',
 '37i9dQZF1DWUgBy0IJPlHq',
 '37i9dQZF1DXaJXCbmtHVHV',
 '37i9dQZF1DWVpjAJGB70vU',
 '37i9dQZF1DWXepGEFFmQXJ',
 '37i9dQZF1DX13ZzXoot6Jc',
 '37i9dQZF1DWSK8os4XIQBk',
 '37i9dQZF1DWW7RgkOJG32Y',
 '37i9dQZF1DWZQIRdXaBqdE',
 '37i9dQZF1DX5mB2C8gBeUM',
 '37i9dQZF1DWTkxQvqMy4WW',
 '37i9dQZF1DX49poIUZYXp7',
 '37i9dQZF1DX9hWdQ46pHPo',
 '37i9dQZF1DWYubIwLN4Hq2',
 '63cMB6ED7OxsjRrVLqOebZ',
 '6gXgPeZLfob30T6RSoy9nb',
 '37i9dQZF1DX3RCeShx2suK',
 '7Cne8ivJEQGRKPHmxDu66D',
 

Finding the number of followers:

In [7]:
test = sp.playlist('37i9dQZF1DXcBWIGoYBM5M')

In [8]:
test['followers']['total']

25831409

Creating the function:

In [9]:
# Takes a genre and returns the IDs of playlists with that genre in the title and over 20,000 followers

def get_playlist_ids(genre):
    # first store the IDs of all the playlists that have the genre in the title
    ids = []
    genre_playlists = sp.search(q=genre, type='playlist', market='US', limit=50)
    for i in genre_playlists['playlists']['items']:
        ids.append(i['id'])

    # now create a 'sublist' with only the playlist IDs that have more than 20,000 followers
    follower_ids = []
    for i in ids:
        playlist = sp.playlist(i)
        if playlist['followers']['total'] > 20_000:
            follower_ids.append(i)
    print(f'There are {len(follower_ids)} {genre} playlists with over 20,000 followers')
    return follower_ids

I had to tweak the follower threshold so that the function would return a reasonable number of playlists: 200,000 followers gave only two playlists, 50,000 gave around 30, and 20,000 gave somewhere in the mid-40s.
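
One way to make this kind of tuning quicker is to fetch the follower totals once and count them against several cutoffs, rather than re-running the whole search for each threshold. The snippet below is only a sketch of that idea: `count_above` is a hypothetical helper and the follower numbers are made-up stand-ins for `playlist['followers']['total']`.

```python
def count_above(follower_counts, thresholds):
    """Return {threshold: number of playlists with strictly more followers}."""
    return {t: sum(1 for n in follower_counts if n > t) for t in thresholds}


# Hypothetical follower totals, standing in for playlist['followers']['total'].
sample_counts = [25_831_409, 310_000, 64_000, 48_500, 21_000, 9_800]

for threshold, n in count_above(sample_counts, [20_000, 50_000, 200_000]).items():
    print(f"{threshold:>7,} followers: {n} playlists")
```

With real data, `sample_counts` would be built by calling `sp.playlist(i)` once per ID and keeping the totals in a list.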

In [10]:
#Checking that the function works
get_playlist_ids('country')

There are 47 country playlists with over 20,000 followers


['37i9dQZF1DX8WMG8VPSOJC',
 '37i9dQZF1DWYnwbYQ5HnZU',
 '37i9dQZF1DXaiEFNvQPZrM',
 '37i9dQZF1DXdfhOsjRMISB',
 '37i9dQZF1DWYiR2Uqcon0X',
 '6nU0t33tQA2i0qTI5HiyRV',
 '37i9dQZF1DWZBCPUIUs2iR',
 '37i9dQZF1DX6RCydf9ytsj',
 '37i9dQZF1DWYP5PUsVbso9',
 '1f6kfopjy0uVYzKN9MUVMr',
 '37i9dQZF1DX1lVhptIYRda',
 '37i9dQZF1DX7CGYgLhqwu5',
 '37i9dQZF1DWWH0izG4erma',
 '4SssSas9VvpPXlMCWXnu91',
 '37i9dQZF1DWWnpcjfCqaW0',
 '37i9dQZF1DX4bf0P6HTTom',
 '37i9dQZF1DXdmMcgFhLQ8u',
 '37i9dQZF1DWXi7h4mmmkzD',
 '2NVuOHHEOImpkirdrcufV3',
 '37i9dQZF1DWU2LcZVHsTdv',
 '37i9dQZF1DWUgBy0IJPlHq',
 '37i9dQZF1DXaJXCbmtHVHV',
 '37i9dQZF1DWVpjAJGB70vU',
 '37i9dQZF1DWXepGEFFmQXJ',
 '37i9dQZF1DX13ZzXoot6Jc',
 '37i9dQZF1DWSK8os4XIQBk',
 '37i9dQZF1DWW7RgkOJG32Y',
 '37i9dQZF1DWZQIRdXaBqdE',
 '37i9dQZF1DX5mB2C8gBeUM',
 '37i9dQZF1DWTkxQvqMy4WW',
 '37i9dQZF1DX49poIUZYXp7',
 '37i9dQZF1DX9hWdQ46pHPo',
 '37i9dQZF1DWYubIwLN4Hq2',
 '6gXgPeZLfob30T6RSoy9nb',
 '37i9dQZF1DX3RCeShx2suK',
 '7Cne8ivJEQGRKPHmxDu66D',
 '0dZ6X4lEBOHBgQwRtg1cwP',
 

In [11]:
country_ids = get_playlist_ids('country')

There are 47 country playlists with over 20,000 followers


In [12]:
country_playlist_names = []
for i in country_ids:
    country_playlist_names.append(sp.playlist(i)['name'])

In [13]:
# shows each playlist name and its corresponding ID
pd.DataFrame(country_playlist_names, country_ids)

Unnamed: 0,0
37i9dQZF1DX8WMG8VPSOJC,Country Kind of Love
37i9dQZF1DWYnwbYQ5HnZU,Country Gold
37i9dQZF1DXaiEFNvQPZrM,Country's Greatest Hits: The '90s
37i9dQZF1DXdfhOsjRMISB,Country Drive
37i9dQZF1DWYiR2Uqcon0X,Country Coffeehouse
6nU0t33tQA2i0qTI5HiyRV,Country Hits
37i9dQZF1DWZBCPUIUs2iR,Country Music 101: Country's Greatest Hits
37i9dQZF1DX6RCydf9ytsj,Country's Greatest Hits: The '80s
37i9dQZF1DWYP5PUsVbso9,Country's Greatest Hits: The '70's
1f6kfopjy0uVYzKN9MUVMr,Country Hits 2010s


## Combining 3 previous functions (track attributes, artist attributes, and audio) into 1 function

In [14]:
# FINAL FUNCTION THAT TAKES A PLAYLIST ID AND OUTPUTS A COMPLETE DF WITH ALL THE DESIRED ATTRIBUTES

def playlist_id(num):
    #playlist info
    playlist = sp.playlist_tracks(num)
    playlist = playlist['items']
    playlist = pd.DataFrame(playlist)
   
    #getting track ID, popularity, and name
    track_ids = []
    track_pop = []
    track_name = []
    for track in playlist['track']:
        try:                                    # music videos come back as None, so the code just skips them
            track_ids.append(track['id'])
        except Exception:
            continue
        track_pop.append(track['popularity'])
        track_name.append(track['name'])

    #Getting artist and artist ID
    artist = []
    artist_ids = []
    for i in playlist['track']:
        try:                       # skip music videos here as well
            artist.append(i['artists'][0]['name']) # the first artist listed on the track is the main one
        except Exception:
            continue
        artist_ids.append(i['artists'][0]['id'])

    #Creating featured artist column (1 if the track lists more than one artist, else 0)
    featured_artist = []

    for i in playlist['track']:
        try:                       # skip music videos here as well
            if len(i['artists']) > 1:
                featured_artist.append(1)
            else:
                featured_artist.append(0)
        except Exception as e:
            print(e)
            continue
    
    playlist_df = [track_ids, track_pop, track_name, artist, artist_ids, featured_artist]
    playlist_df = pd.DataFrame(data = playlist_df).T
    playlist_df.columns = ['track_ids', 'track_pop', 'track_name', 'artist', 'artist_ids', 'featured_artist']
    
    #Adding the audio features to the dataframe
    a = pd.DataFrame(sp.audio_features(tracks = list(playlist_df['track_ids'])))
    a = a[['danceability', 'energy', 'key', 'loudness', 'mode', 'speechiness',
       'acousticness', 'instrumentalness', 'liveness', 'valence', 'tempo',
       'type','duration_ms', 'time_signature']]
    playlist_df = playlist_df.join(a, how = 'right')

    #Adding the artist genre and their popularity score to the dataframe
    artist_pop = []
    genre = []

    for i in playlist_df['artist_ids']:
        temp = sp.artist(i)
        artist_pop.append(temp['popularity'])
        if temp['genres'] == [] : #fixing the empty list issue
            genre.append('NA')
        else:
            genre.append(temp['genres'][0])
    
    playlist_df['artist_pop'] = artist_pop
    playlist_df['genre'] = genre
    
    return playlist_df

I checked that the combined function produces the same dataset as my previous three working functions. The comparison isn't shown here, but the outputs were identical, so the final function works as intended.
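
A caveat on the function above: `sp.playlist_tracks` returns a single page of at most 100 items, and `sp.audio_features` accepts at most 100 track IDs per call, so a longer playlist contributes only its first page. The sketch below shows one way to follow the paging links with spotipy's `next()` and to batch IDs in groups of 100. `all_playlist_items`, `chunks`, and `FakeClient` are hypothetical names; `FakeClient` is only an offline stand-in so the example runs without credentials, and none of this is part of the original notebook.

```python
def all_playlist_items(client, playlist_id):
    """Collect every item of a playlist by following the paging links.

    `client` is assumed to behave like a spotipy.Spotify instance:
    playlist_tracks() returns one page (a dict with 'items' and 'next'),
    and next(page) fetches the following page.
    """
    page = client.playlist_tracks(playlist_id)
    items = list(page['items'])
    while page.get('next'):
        page = client.next(page)
        items.extend(page['items'])
    return items


def chunks(seq, size=100):
    """Yield consecutive slices of at most `size` elements."""
    for start in range(0, len(seq), size):
        yield seq[start:start + size]


class FakeClient:
    """Offline stand-in for spotipy.Spotify, serving 250 items in pages of 100."""

    def __init__(self, total=250, page_size=100):
        self.total, self.page_size = total, page_size

    def _page(self, offset):
        end = min(offset + self.page_size, self.total)
        return {
            'items': [{'track': {'id': f'id{i}'}} for i in range(offset, end)],
            # The real API stores a URL here; the stand-in stores the next offset.
            'next': end if end < self.total else None,
        }

    def playlist_tracks(self, playlist_id):
        return self._page(0)

    def next(self, page):
        return self._page(page['next'])


items = all_playlist_items(FakeClient(), 'hypothetical-playlist-id')
track_ids = [item['track']['id'] for item in items]
batches = list(chunks(track_ids))
print(len(items), [len(b) for b in batches])
```

In the notebook, `client` would be the existing `sp` object, and each batch would be passed to `sp.audio_features` separately before the results are joined back together.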

In [15]:
#Example of the dataset that my function produces
RapCaviar = playlist_id('37i9dQZF1DX0XUsuxWHRQd')
RapCaviar

'NoneType' object is not subscriptable
'NoneType' object is not subscriptable
'NoneType' object is not subscriptable
'NoneType' object is not subscriptable


Unnamed: 0,track_ids,track_pop,track_name,artist,artist_ids,featured_artist,danceability,energy,key,loudness,...,acousticness,instrumentalness,liveness,valence,tempo,type,duration_ms,time_signature,artist_pop,genre
0,3Q6F8RByyhRTJpRtZLY3cg,86,WHATS POPPIN,Jack Harlow,2LIk90788K0zvyj2JJVwkJ,0,0.923,0.604,11,-6.671,...,0.017,0.0,0.272,0.826,145.062,audio_features,139741,4,79,deep underground hip hop
1,4iiWcajF1fEUpwcUewc464,78,"Life Is Good (feat. Drake, DaBaby & Lil Baby) ...",Future,1RyvyyTE3xzB2ZywiAwp0i,1,0.81,0.566,5,-8.33,...,0.107,0.0,0.122,0.582,142.069,audio_features,315346,4,94,atl hip hop
2,733c1CWmIGymoQXdp7Us88,83,"Numbers (feat. Roddy Ricch, Gunna and London O...",A Boogie Wit da Hoodie,31W5EY0aAly4Qieq6OFu6I,1,0.819,0.654,11,-6.665,...,0.517,0.0,0.0996,0.455,133.503,audio_features,188563,4,94,melodic rap
3,3qHgGyJY4GpXNOK4WL4NSo,72,Red Eye,YoungBoy Never Broke Again,7wlFDEWiM5OoIAt8RSli8b,0,0.485,0.814,9,-3.907,...,0.217,0.0,0.112,0.327,159.894,audio_features,156077,4,91,baton rouge rap
4,0nbXyq5TXYPCO7pr3N8S4I,100,The Box,Roddy Ricch,757aE44tKEUQEqRuT6GnEB,0,0.896,0.586,10,-6.687,...,0.104,0.0,0.79,0.642,116.971,audio_features,196653,4,96,melodic rap
5,6wJYhPfqk3KGhHRG76WzOh,88,Blueberry Faygo,Lil Mosey,5zctI4wO9XSKS8XwcnqEHk,0,0.774,0.554,0,-7.909,...,0.207,0.0,0.132,0.349,99.034,audio_features,162547,4,86,melodic rap
6,07KXEDMj78x68D884wgVEm,90,High Fashion (feat. Mustard),Roddy Ricch,757aE44tKEUQEqRuT6GnEB,1,0.831,0.499,11,-8.442,...,0.269,0.0,0.3,0.511,97.956,audio_features,220487,4,96,melodic rap
7,6gi6y1xwmVszDWkUqab1qw,87,OUT WEST (feat. Young Thug),JACKBOYS,7A8S43ryYdbWpJKeHRZRcq,1,0.802,0.591,8,-4.895,...,0.0104,0.0,0.196,0.309,139.864,audio_features,157712,4,83,rap
8,2FvD20Z8aoWIePi7PoN8sG,87,TOES (feat. Lil Baby & Moneybagg Yo),DaBaby,4r63FhuTkUYltbVAg5TQnk,1,0.816,0.582,8,-4.141,...,0.0794,6e-06,0.0916,0.542,160.004,audio_features,136366,4,95,nc hip hop
9,0JjM9bKm4wrwohMslcm892,82,Still Be Friends (feat. Tory Lanez & Tyga),G-Eazy,02kJSzxNuaWGqwubyUba0Z,1,0.803,0.759,7,-4.692,...,0.00509,0.0,0.0921,0.284,104.0,audio_features,213308,4,87,indie pop rap


In [16]:
# Checking that it runs on playlists that contain over 100 songs. It does, but sp.playlist_tracks returns at most 100 items per call, so only the first 100 tracks are collected
hot_country = playlist_id('37i9dQZF1DX1lVhptIYRda')
hot_country

Unnamed: 0,track_ids,track_pop,track_name,artist,artist_ids,featured_artist,danceability,energy,key,loudness,...,acousticness,instrumentalness,liveness,valence,tempo,type,duration_ms,time_signature,artist_pop,genre
0,0NSwXfEWMG7HIRvXioGu03,64,Here And Now,Kenny Chesney,3grHWM9bx2E9vwJCdlRv9O,0,0.52,0.841,9,-3.048,...,0.0477,0.0,0.25,0.705,147.963,audio_features,171493,4,77,contemporary country
1,7Ce11Oh1kConF6UlkBWnaZ,0,God Whispered Your Name,Keith Urban,0u2FHSq3ln94y5Q57xazwf,0,0.594,0.6,2,-6.712,...,0.306,0.0,0.149,0.5,145.967,audio_features,232165,4,76,australian country
2,53F1MVa1BWUkTBbVqgVAfN,76,Kinfolks,Sam Hunt,2kucQ9jQwuD8jWdtR9Ef38,0,0.565,0.805,0,-4.457,...,0.0808,0.0,0.173,0.869,76.04,audio_features,181933,4,78,contemporary country
3,7dnDBbHKyJNFXoeVwO8KBY,74,Blessings,Florida Georgia Line,3b8QkneNDz4JHKKKlLgYZg,0,0.466,0.672,4,-5.926,...,0.737,0.0,0.0858,0.334,89.081,audio_features,198520,4,84,contemporary country
4,6U4S5Fcy4sNgYnKPh6LLdh,70,These Days,MacKenzie Porter,6nXco5Q3cJJ0ZutnBOsSpq,0,0.463,0.778,9,-3.013,...,0.413,0.0,0.16,0.424,175.888,audio_features,172755,4,61,alberta country
5,7oSq5fbsFkTS9zeJQwMihf,74,This Bar,Morgan Wallen,4oUHIQIBe0LHzYfvXNW4QM,0,0.545,0.884,2,-5.208,...,0.115,0.000424,0.0496,0.272,110.015,audio_features,185733,4,82,contemporary country
6,6D2Up6lpoFazlmFn0Zpkty,32,Homesick,Kane Brown,3oSJ7TBVCWMDMiYjXNiCKE,0,0.701,0.5,8,-8.82,...,0.46,0.0,0.234,0.409,97.028,audio_features,205400,4,82,contemporary country
7,53cGGBxGPaRjP0HVUESyW5,68,Catch,Brett Young,0fiWOxhsBsQQvFDtxUQWo0,0,0.566,0.765,2,-6.337,...,0.0248,0.0,0.0783,0.604,156.036,audio_features,196440,4,76,contemporary country
8,7idmHTAQQPUFqdjXkoooXD,72,Beer Can’t Fix,Thomas Rhett,6x2LnllRG5uGarZMsD4iO8,1,0.711,0.774,7,-4.068,...,0.0317,0.0,0.124,0.939,111.016,audio_features,209733,4,80,contemporary country
9,7yNJCsUH3tXlpQiHSsAc5l,73,"Big, Big Plans",Chris Lane,68abRTdO4meYReMWHvBYb0,0,0.574,0.58,7,-6.091,...,0.076,0.0,0.12,0.4,149.97,audio_features,187307,4,71,contemporary country


## Example of a playlist that contained music videos
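
Since the unusable entries show up with `track` set to `None`, one alternative to wrapping each loop in `try`/`except` is to filter the items once up front, which also keeps the track, artist, and featured-artist lists aligned. The following is a minimal sketch of that idea on made-up data; `split_valid_tracks` is a hypothetical helper, and treating a missing track ID as unusable is an assumption rather than something the notebook checks.

```python
def split_valid_tracks(items):
    """Separate playlist items with a usable track from those without.

    Returns (valid_tracks, skipped_positions).
    """
    valid, skipped = [], []
    for position, item in enumerate(items):
        track = item.get('track')
        # Assumption: an entry is unusable if the track or its ID is missing.
        if track is None or track.get('id') is None:
            skipped.append(position)
        else:
            valid.append(track)
    return valid, skipped


# Made-up items shaped like the entries of sp.playlist_tracks(...)['items'].
sample_items = [
    {'track': None},
    {'track': {'id': 'a1', 'name': 'Song A', 'artists': [{'name': 'X'}]}},
    {'track': {'id': None, 'name': 'No ID', 'artists': []}},
    {'track': {'id': 'b2', 'name': 'Song B', 'artists': [{'name': 'Y'}, {'name': 'Z'}]}},
]

valid, skipped = split_valid_tracks(sample_items)
print([t['name'] for t in valid], skipped)
```

Returning the skipped positions makes it easy to report how many entries were dropped from each playlist.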

In [None]:
#RapCaviar playlist, which initially broke my code because of the music videos
# the first track is 'None'
sp.playlist('37i9dQZF1DX0XUsuxWHRQd')

In [18]:
#the function skips the first two tracks on RapCaviar because they were music videos
playlist_id('37i9dQZF1DX0XUsuxWHRQd')

'NoneType' object is not subscriptable
'NoneType' object is not subscriptable
'NoneType' object is not subscriptable
'NoneType' object is not subscriptable


Unnamed: 0,track_ids,track_pop,track_name,artist,artist_ids,featured_artist,danceability,energy,key,loudness,...,acousticness,instrumentalness,liveness,valence,tempo,type,duration_ms,time_signature,artist_pop,genre
0,3Q6F8RByyhRTJpRtZLY3cg,86,WHATS POPPIN,Jack Harlow,2LIk90788K0zvyj2JJVwkJ,0,0.923,0.604,11,-6.671,...,0.017,0.0,0.272,0.826,145.062,audio_features,139741,4,79,deep underground hip hop
1,4iiWcajF1fEUpwcUewc464,78,"Life Is Good (feat. Drake, DaBaby & Lil Baby) ...",Future,1RyvyyTE3xzB2ZywiAwp0i,1,0.81,0.566,5,-8.33,...,0.107,0.0,0.122,0.582,142.069,audio_features,315346,4,94,atl hip hop
2,733c1CWmIGymoQXdp7Us88,83,"Numbers (feat. Roddy Ricch, Gunna and London O...",A Boogie Wit da Hoodie,31W5EY0aAly4Qieq6OFu6I,1,0.819,0.654,11,-6.665,...,0.517,0.0,0.0996,0.455,133.503,audio_features,188563,4,94,melodic rap
3,3qHgGyJY4GpXNOK4WL4NSo,72,Red Eye,YoungBoy Never Broke Again,7wlFDEWiM5OoIAt8RSli8b,0,0.485,0.814,9,-3.907,...,0.217,0.0,0.112,0.327,159.894,audio_features,156077,4,91,baton rouge rap
4,0nbXyq5TXYPCO7pr3N8S4I,100,The Box,Roddy Ricch,757aE44tKEUQEqRuT6GnEB,0,0.896,0.586,10,-6.687,...,0.104,0.0,0.79,0.642,116.971,audio_features,196653,4,96,melodic rap
5,6wJYhPfqk3KGhHRG76WzOh,88,Blueberry Faygo,Lil Mosey,5zctI4wO9XSKS8XwcnqEHk,0,0.774,0.554,0,-7.909,...,0.207,0.0,0.132,0.349,99.034,audio_features,162547,4,86,melodic rap
6,07KXEDMj78x68D884wgVEm,90,High Fashion (feat. Mustard),Roddy Ricch,757aE44tKEUQEqRuT6GnEB,1,0.831,0.499,11,-8.442,...,0.269,0.0,0.3,0.511,97.956,audio_features,220487,4,96,melodic rap
7,6gi6y1xwmVszDWkUqab1qw,87,OUT WEST (feat. Young Thug),JACKBOYS,7A8S43ryYdbWpJKeHRZRcq,1,0.802,0.591,8,-4.895,...,0.0104,0.0,0.196,0.309,139.864,audio_features,157712,4,83,rap
8,2FvD20Z8aoWIePi7PoN8sG,87,TOES (feat. Lil Baby & Moneybagg Yo),DaBaby,4r63FhuTkUYltbVAg5TQnk,1,0.816,0.582,8,-4.141,...,0.0794,6e-06,0.0916,0.542,160.004,audio_features,136366,4,95,nc hip hop
9,0JjM9bKm4wrwohMslcm892,82,Still Be Friends (feat. Tory Lanez & Tyga),G-Eazy,02kJSzxNuaWGqwubyUba0Z,1,0.803,0.759,7,-4.692,...,0.00509,0.0,0.0921,0.284,104.0,audio_features,213308,4,87,indie pop rap


## Looping through all the playlist IDs to create one big dataframe for one genre
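
The same loop is repeated for every genre below, so it could be wrapped in a small helper. This is a sketch under assumptions, not the code that produced the outputs in this notebook: `collect_frames` and `fake_build` are hypothetical names, and `fake_build` stands in for `playlist_id` so that the example runs offline.

```python
def collect_frames(playlist_ids, build_frame):
    """Call build_frame on each playlist ID, keeping results and failures apart.

    Returns (frames, failed), where failed is a list of (playlist_id, error) pairs.
    """
    frames, failed = [], []
    for pid in playlist_ids:
        try:
            frames.append(build_frame(pid))
        except Exception as exc:  # keep going, but remember what failed and why
            failed.append((pid, repr(exc)))
    return frames, failed


def fake_build(pid):
    """Stand-in for playlist_id(): fails for one hypothetical ID."""
    if pid == 'bad':
        raise TypeError("'NoneType' object is not subscriptable")
    return {'playlist_id': pid}


frames, failed = collect_frames(['p1', 'bad', 'p2'], fake_build)
print(len(frames), [pid for pid, _ in failed])
```

With the notebook's own functions this would be called as `collect_frames(country_ids, playlist_id)`, followed by `pd.concat(frames)` to build the genre dataframe. Keeping the error message next to each failed ID shows why a playlist was skipped, not just which one.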

### Country Dataframe - 3,533 songs

In [19]:
country_ids = get_playlist_ids('country')

There are 47 country playlists with over 20,000 followers


In [20]:
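country_frames = []
for i in country_ids:
    try:
        result = playlist_id(i)
        result['playlist_id'] = i
        result['playlist_name'] = sp.playlist(i)['name']
        country_frames.append(result)
    except Exception:
        print(i)  # prints the ID of any playlist that breaks the code
# combine the per-playlist frames into one genre dataframe
country_dataframe = pd.concat(country_frames)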

retrying ...3secs
retrying ...1secs
retrying ...1secs
retrying ...1secs
retrying ...1secs
retrying ...1secs
'NoneType' object is not subscriptable
retrying ...3secs
retrying ...1secs


In [21]:
country_dataframe['category'] = 'country'

In [24]:
country_dataframe.head()

Unnamed: 0,track_ids,track_pop,track_name,artist,artist_ids,featured_artist,danceability,energy,key,loudness,...,valence,tempo,type,duration_ms,time_signature,artist_pop,genre,playlist_id,playlist_name,category
0,3GJ4hzg4lrGwU51Y3VARbF,78,Speechless,Dan + Shay,7z5WFjZAIYejWy0NI5lv4T,0,0.616,0.438,1,-5.968,...,0.386,135.929,audio_features,213387,4,85,contemporary country,37i9dQZF1DX8WMG8VPSOJC,Country Kind of Love,country
1,2rxQMGVafnNaRaXlRMWPde,79,Beautiful Crazy,Luke Combs,718COspgdWOnwOFpJHRZHS,0,0.552,0.402,11,-7.431,...,0.382,103.313,audio_features,193200,4,86,contemporary country,37i9dQZF1DX8WMG8VPSOJC,Country Kind of Love,country
2,7Lr4XaxGpkAwa37IVgg22k,69,Back To Life,Rascal Flatts,0a1gHP0HAqALbEyxaD5Ngn,0,0.467,0.701,4,-4.997,...,0.28,132.041,audio_features,214219,3,76,contemporary country,37i9dQZF1DX8WMG8VPSOJC,Country Kind of Love,country
3,5kp3JbZL1ROMxc32pcpn29,58,Slow Dance In A Parking Lot,Jordan Davis,77kULmXAQ6vWer7IIHdGzI,0,0.519,0.833,7,-4.47,...,0.514,81.992,audio_features,193929,4,72,contemporary country,37i9dQZF1DX8WMG8VPSOJC,Country Kind of Love,country
4,1wPiUPw9IqSchKwinw7dCf,10,Made For You,Jake Owen,1n2pb9Tsfe4SwAjmUac6YT,0,0.541,0.452,1,-6.129,...,0.325,82.052,audio_features,238040,4,72,contemporary country,37i9dQZF1DX8WMG8VPSOJC,Country Kind of Love,country


In [23]:
country_dataframe.shape

(3533, 25)

### Alternative Dataframe - 2,412 songs

In [25]:
alternative_ids = get_playlist_ids('alternative')

There are 33 alternative playlists with over 20,000 followers


In [26]:
alternative_frames = []
for i in alternative_ids:
    try:
        result = playlist_id(i)
        result['playlist_id'] = i
        result['playlist_name'] = sp.playlist(i)['name']
        alternative_frames.append(result)
    except Exception:
        print(i)  # prints the ID of any playlist that breaks the code
# combine the per-playlist frames into one genre dataframe
alternative_dataframe = pd.concat(alternative_frames)

retrying ...3secs
retrying ...4secs
retrying ...1secs
retrying ...1secs
retrying ...1secs
retrying ...3secs
retrying ...3secs
2VCUG2HWlEeq1zvUvRkN80
retrying ...1secs
retrying ...3secs
retrying ...1secs
retrying ...1secs


In [27]:
alternative_dataframe['category'] = 'alternative'

In [28]:
alternative_dataframe.shape

(2412, 25)

### Pop Dataframe - 3,321 songs

In [29]:
pop_ids = get_playlist_ids('pop')

There are 48 pop playlists with over 20,000 followers


In [30]:
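pop_frames = []
for i in pop_ids:
    try:
        result = playlist_id(i)
        result['playlist_id'] = i
        result['playlist_name'] = sp.playlist(i)['name']
        pop_frames.append(result)
    except Exception:
        print(i)  # prints the ID of any playlist that breaks the code
# combine the per-playlist frames into one genre dataframe
pop_dataframe = pd.concat(pop_frames)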

retrying ...1secs
retrying ...2secs
retrying ...1secs
retrying ...1secs
retrying ...2secs
retrying ...2secs
retrying ...1secs
37i9dQZF1DX9ASuQophyb3
'NoneType' object is not subscriptable
'NoneType' object is not subscriptable
retrying ...1secs
'NoneType' object is not subscriptable
retrying ...2secs
'NoneType' object is not subscriptable
retrying ...2secs
retrying ...3secs
retrying ...1secs
retrying ...1secs
retrying ...3secs
retrying ...3secs
retrying ...1secs
'NoneType' object is not subscriptable
5oxZIYU1L9N1CczN0C4JkM
retrying ...3secs
'NoneType' object is not subscriptable
retrying ...3secs


In [31]:
pop_dataframe['category'] = 'pop'

In [32]:
pop_dataframe.shape

(3321, 25)

### Rap Dataframe - 2,620 songs

In [33]:
rap_ids = get_playlist_ids('rap')

There are 47 rap playlists with over 20,000 followers


In [34]:
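rap_frames = []
for i in rap_ids:
    try:
        result = playlist_id(i)
        result['playlist_id'] = i
        result['playlist_name'] = sp.playlist(i)['name']
        rap_frames.append(result)
    except Exception:
        print(i)  # prints the ID of any playlist that breaks the code
# combine the per-playlist frames into one genre dataframe
rap_dataframe = pd.concat(rap_frames)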

'NoneType' object is not subscriptable
'NoneType' object is not subscriptable
'NoneType' object is not subscriptable
'NoneType' object is not subscriptable
retrying ...2secs
retrying ...3secs
retrying ...1secs
retrying ...4secs
retrying ...2secs
retrying ...3secs
53fB7YLpgL2nIr6Yke4133
retrying ...2secs
37i9dQZF1DX6PKX5dyBKeq
retrying ...3secs
retrying ...1secs
retrying ...1secs
retrying ...1secs
retrying ...1secs
retrying ...1secs
retrying ...1secs


In [35]:
rap_dataframe['category'] = 'rap'

In [36]:
rap_dataframe.shape

(2620, 25)

### Club Dataframe - 1,974 songs

In [37]:
club_ids = get_playlist_ids('club')

There are 32 club playlists with over 20,000 followers


In [38]:
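club_frames = []
for i in club_ids:
    try:
        result = playlist_id(i)
        result['playlist_id'] = i
        result['playlist_name'] = sp.playlist(i)['name']
        club_frames.append(result)
    except Exception:
        print(i)  # prints the ID of any playlist that breaks the code
# combine the per-playlist frames into one genre dataframe
club_dataframe = pd.concat(club_frames)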

retrying ...2secs
'NoneType' object is not subscriptable
retrying ...1secs
retrying ...1secs
retrying ...2secs
retrying ...2secs
retrying ...3secs
37i9dQZF1DWYWyJFR69WAN
retrying ...1secs
retrying ...3secs
retrying ...1secs
retrying ...4secs
retrying ...3secs
retrying ...1secs
retrying ...1secs
0uxHEGXSlleLGA0NUvzMhe
retrying ...1secs


In [39]:
club_dataframe['category'] = 'club'

In [40]:
club_dataframe.shape

(1974, 25)

### Hits Dataframe - 3,005 songs

In [41]:
hits_ids = get_playlist_ids('hits')

There are 48 hits playlists with over 20,000 followers


In [42]:
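hits_frames = []
for i in hits_ids:
    try:
        result = playlist_id(i)
        result['playlist_id'] = i
        result['playlist_name'] = sp.playlist(i)['name']
        hits_frames.append(result)
    except Exception:
        print(i)  # prints the ID of any playlist that breaks the code
# combine the per-playlist frames into one genre dataframe
hits_dataframe = pd.concat(hits_frames)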

retrying ...2secs
retrying ...3secs
retrying ...3secs
retrying ...2secs
retrying ...1secs
retrying ...1secs
retrying ...1secs
retrying ...2secs
'NoneType' object is not subscriptable
retrying ...1secs
retrying ...1secs
retrying ...1secs
retrying ...3secs
retrying ...1secs
retrying ...3secs
retrying ...1secs
3Di88mvYplBtkDBIzGLiiM
retrying ...3secs


In [43]:
hits_dataframe['category'] = 'hits'

In [44]:
hits_dataframe.shape

(3005, 25)

## Concatenating all genre dataframes into one

In [45]:
#16,865 songs
result = pd.concat([country_dataframe, alternative_dataframe, pop_dataframe, rap_dataframe, club_dataframe, hits_dataframe])

In [46]:
result.head()

Unnamed: 0,track_ids,track_pop,track_name,artist,artist_ids,featured_artist,danceability,energy,key,loudness,...,valence,tempo,type,duration_ms,time_signature,artist_pop,genre,playlist_id,playlist_name,category
0,3GJ4hzg4lrGwU51Y3VARbF,78,Speechless,Dan + Shay,7z5WFjZAIYejWy0NI5lv4T,0,0.616,0.438,1,-5.968,...,0.386,135.929,audio_features,213387,4,85,contemporary country,37i9dQZF1DX8WMG8VPSOJC,Country Kind of Love,country
1,2rxQMGVafnNaRaXlRMWPde,79,Beautiful Crazy,Luke Combs,718COspgdWOnwOFpJHRZHS,0,0.552,0.402,11,-7.431,...,0.382,103.313,audio_features,193200,4,86,contemporary country,37i9dQZF1DX8WMG8VPSOJC,Country Kind of Love,country
2,7Lr4XaxGpkAwa37IVgg22k,69,Back To Life,Rascal Flatts,0a1gHP0HAqALbEyxaD5Ngn,0,0.467,0.701,4,-4.997,...,0.28,132.041,audio_features,214219,3,76,contemporary country,37i9dQZF1DX8WMG8VPSOJC,Country Kind of Love,country
3,5kp3JbZL1ROMxc32pcpn29,58,Slow Dance In A Parking Lot,Jordan Davis,77kULmXAQ6vWer7IIHdGzI,0,0.519,0.833,7,-4.47,...,0.514,81.992,audio_features,193929,4,72,contemporary country,37i9dQZF1DX8WMG8VPSOJC,Country Kind of Love,country
4,1wPiUPw9IqSchKwinw7dCf,10,Made For You,Jake Owen,1n2pb9Tsfe4SwAjmUac6YT,0,0.541,0.452,1,-6.129,...,0.325,82.052,audio_features,238040,4,72,contemporary country,37i9dQZF1DX8WMG8VPSOJC,Country Kind of Love,country


In [47]:
result.shape

(16865, 25)

In [48]:
result.to_csv('../../Data/final_with_duplicates.csv')
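
As the file name indicates, the combined data still contains repeated tracks, which is expected when the same song sits on several playlists. The snippet below is a small, library-free sketch of how the repeats could be counted from a list of track IDs; `repeated_ids` and the sample IDs are hypothetical.

```python
from collections import Counter


def repeated_ids(track_ids):
    """Return {track_id: count} for IDs that occur more than once."""
    counts = Counter(track_ids)
    return {tid: n for tid, n in counts.items() if n > 1}


# Made-up IDs standing in for the values of the 'track_ids' column.
sample = ['a', 'b', 'a', 'c', 'b', 'a']
print(repeated_ids(sample))
```

On the dataframe itself, `result['track_ids'].duplicated().sum()` would give the number of rows that repeat an earlier track ID.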