## Chapter 5) Methodology
This notebook contains the analysis code and ML workflow used to build a recommender system and to find similarities between music items, based on a user's listening history and liked/saved songs.


##### Steps (Brief Overview):

**Data Collection & Processing (EDA Part)**
1. Load all available data and store it in pandas DataFrames
2. Connect to the Spotify API (credentials from the developer console) to extract song features
3. Create separate DataFrames for the songs/playlists collected from friends and for the MBTI playlists downloaded from Kaggle

**Feature Extraction & Selection**
4. Clean the data and select the feature columns for the model

**Content-Based Filtering on the Base Dataset**
5. Apply different ML models to the baseline dataset using content-based filtering and evaluate the initial results

**Incorporating MBTI Personality Types in the Recommendation Process**
6. Add a feature column for the MBTI personality type and create MBTI-based DataFrames from the Kaggle datasets
7. Apply user-item matrix factorization and evaluate the results

**Compare Results**
8. Compare the results of the baseline model with those of the MBTI-enhanced model
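
To make the content-based filtering step more concrete, the following is a minimal sketch rather than the model used later in this chapter. It assumes a toy catalogue with made-up feature values, a hypothetical helper `recommend`, and cosine similarity between standardized audio features and the mean vector of the liked songs.

```python
import numpy as np
import pandas as pd

# Toy catalogue; the feature values are invented for illustration only.
songs = pd.DataFrame(
    {
        "track_name": ["A", "B", "C", "D"],
        "danceability": [0.80, 0.78, 0.20, 0.50],
        "energy": [0.90, 0.85, 0.10, 0.50],
        "valence": [0.70, 0.75, 0.15, 0.40],
    }
)
feature_cols = ["danceability", "energy", "valence"]


def recommend(liked_names, songs, feature_cols, top_n=2):
    """Rank songs that are not yet liked by cosine similarity to the liked-song profile."""
    X = songs[feature_cols].to_numpy(dtype=float)
    # Standardize each feature so that no single scale dominates the similarity.
    # (Assumes every feature has non-zero variance, which holds for the toy data.)
    X = (X - X.mean(axis=0)) / X.std(axis=0)

    liked_mask = songs["track_name"].isin(liked_names).to_numpy()
    profile = X[liked_mask].mean(axis=0)  # mean vector of the liked songs

    # Cosine similarity between every song and the user profile
    norms = np.linalg.norm(X, axis=1) * np.linalg.norm(profile)
    scores = (X @ profile) / np.where(norms == 0, 1.0, norms)

    ranked = songs.assign(score=scores)[~liked_mask]
    return ranked.sort_values("score", ascending=False).head(top_n)


recs = recommend(["A"], songs, feature_cols)
print(recs[["track_name", "score"]])
```

In this toy example, song B is ranked first because its features are closest to those of the liked song A.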

In [8]:
# importing libraries
import pandas as pd
import seaborn as sns

# importing spotify_songs dataset downloaded from Kaggle (https://www.kaggle.com/datasets/joebeachcapital/30000-spotify-songs/data?select=spotify_songs.csv)
df_spotify_songs = pd.read_csv('/Users/khieuvon/Documents/10_Personal Stuff/01_Masterarbeit/Data for ML Model/Kaggle Datasets/30000 Spotify Songs Dataset/spotify_songs.csv')

df_spotify_songs.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 32833 entries, 0 to 32832
Data columns (total 23 columns):
 #   Column                    Non-Null Count  Dtype  
---  ------                    --------------  -----  
 0   track_id                  32833 non-null  object 
 1   track_name                32828 non-null  object 
 2   track_artist              32828 non-null  object 
 3   track_popularity          32833 non-null  int64  
 4   track_album_id            32833 non-null  object 
 5   track_album_name          32828 non-null  object 
 6   track_album_release_date  32833 non-null  object 
 7   playlist_name             32833 non-null  object 
 8   playlist_id               32833 non-null  object 
 9   playlist_genre            32833 non-null  object 
 10  playlist_subgenre         32833 non-null  object 
 11  danceability              32833 non-null  float64
 12  energy                    32833 non-null  float64
 13  key                       32833 non-null  int64  
 14  loudne

In [10]:
# summarizing shape of the dataset's distribution
df_spotify_songs.describe()

Unnamed: 0,track_popularity,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,duration_ms
count,32833.0,32833.0,32833.0,32833.0,32833.0,32833.0,32833.0,32833.0,32833.0,32833.0,32833.0,32833.0,32833.0
mean,42.477081,0.65485,0.698619,5.374471,-6.719499,0.565711,0.107068,0.175334,0.084747,0.190176,0.510561,120.881132,225799.811622
std,24.984074,0.145085,0.18091,3.611657,2.988436,0.495671,0.101314,0.219633,0.22423,0.154317,0.233146,26.903624,59834.006182
min,0.0,0.0,0.000175,0.0,-46.448,0.0,0.0,0.0,0.0,0.0,0.0,0.0,4000.0
25%,24.0,0.563,0.581,2.0,-8.171,0.0,0.041,0.0151,0.0,0.0927,0.331,99.96,187819.0
50%,45.0,0.672,0.721,6.0,-6.166,1.0,0.0625,0.0804,1.6e-05,0.127,0.512,121.984,216000.0
75%,62.0,0.761,0.84,9.0,-4.645,1.0,0.132,0.255,0.00483,0.248,0.693,133.918,253585.0
max,100.0,0.983,1.0,11.0,1.275,1.0,0.918,0.994,0.994,0.996,0.991,239.44,517810.0


In [13]:
df_spotify_songs.head(10)

Unnamed: 0,track_id,track_name,track_artist,track_popularity,track_album_id,track_album_name,track_album_release_date,playlist_name,playlist_id,playlist_genre,...,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo,duration_ms
0,6f807x0ima9a1j3VPbc7VN,I Don't Care (with Justin Bieber) - Loud Luxur...,Ed Sheeran,66,2oCs0DGTsRO98Gh5ZSl2Cx,I Don't Care (with Justin Bieber) [Loud Luxury...,2019-06-14,Pop Remix,37i9dQZF1DXcZDD7cfEKhW,pop,...,6,-2.634,1,0.0583,0.102,0.0,0.0653,0.518,122.036,194754
1,0r7CVbZTWZgbTCYdfa2P31,Memories - Dillon Francis Remix,Maroon 5,67,63rPSO264uRjW1X5E6cWv6,Memories (Dillon Francis Remix),2019-12-13,Pop Remix,37i9dQZF1DXcZDD7cfEKhW,pop,...,11,-4.969,1,0.0373,0.0724,0.00421,0.357,0.693,99.972,162600
2,1z1Hg7Vb0AhHDiEmnDE79l,All the Time - Don Diablo Remix,Zara Larsson,70,1HoSmj2eLcsrR0vE9gThr4,All the Time (Don Diablo Remix),2019-07-05,Pop Remix,37i9dQZF1DXcZDD7cfEKhW,pop,...,1,-3.432,0,0.0742,0.0794,2.3e-05,0.11,0.613,124.008,176616
3,75FpbthrwQmzHlBJLuGdC7,Call You Mine - Keanu Silva Remix,The Chainsmokers,60,1nqYsOef1yKKuGOVchbsk6,Call You Mine - The Remixes,2019-07-19,Pop Remix,37i9dQZF1DXcZDD7cfEKhW,pop,...,7,-3.778,1,0.102,0.0287,9e-06,0.204,0.277,121.956,169093
4,1e8PAfcKUYoKkxPhrHqw4x,Someone You Loved - Future Humans Remix,Lewis Capaldi,69,7m7vv9wlQ4i0LFuJiE2zsQ,Someone You Loved (Future Humans Remix),2019-03-05,Pop Remix,37i9dQZF1DXcZDD7cfEKhW,pop,...,1,-4.672,1,0.0359,0.0803,0.0,0.0833,0.725,123.976,189052
5,7fvUMiyapMsRRxr07cU8Ef,Beautiful People (feat. Khalid) - Jack Wins Remix,Ed Sheeran,67,2yiy9cd2QktrNvWC2EUi0k,Beautiful People (feat. Khalid) [Jack Wins Remix],2019-07-11,Pop Remix,37i9dQZF1DXcZDD7cfEKhW,pop,...,8,-5.385,1,0.127,0.0799,0.0,0.143,0.585,124.982,163049
6,2OAylPUDDfwRGfe0lYqlCQ,Never Really Over - R3HAB Remix,Katy Perry,62,7INHYSeusaFlyrHSNxm8qH,Never Really Over (R3HAB Remix),2019-07-26,Pop Remix,37i9dQZF1DXcZDD7cfEKhW,pop,...,5,-4.788,0,0.0623,0.187,0.0,0.176,0.152,112.648,187675
7,6b1RNvAcJjQH73eZO4BLAB,Post Malone (feat. RANI) - GATTÜSO Remix,Sam Feldt,69,6703SRPsLkS4bPtMFFJes1,Post Malone (feat. RANI) [GATTÜSO Remix],2019-08-29,Pop Remix,37i9dQZF1DXcZDD7cfEKhW,pop,...,4,-2.419,0,0.0434,0.0335,5e-06,0.111,0.367,127.936,207619
8,7bF6tCO3gFb8INrEDcjNT5,Tough Love - Tiësto Remix / Radio Edit,Avicii,68,7CvAfGvq4RlIwEbT9o8Iav,Tough Love (Tiësto Remix),2019-06-14,Pop Remix,37i9dQZF1DXcZDD7cfEKhW,pop,...,8,-3.562,1,0.0565,0.0249,4e-06,0.637,0.366,127.015,193187
9,1IXGILkPm0tOCNeq00kCPa,If I Can't Have You - Gryffin Remix,Shawn Mendes,67,4QxzbfSsVryEQwvPFEV5Iu,If I Can't Have You (Gryffin Remix),2019-06-20,Pop Remix,37i9dQZF1DXcZDD7cfEKhW,pop,...,2,-4.552,1,0.032,0.0567,0.0,0.0919,0.59,124.957,253040


In [11]:
# identifying columns with null_values
df_spotify_songs.isnull().sum()

track_id                    0
track_name                  5
track_artist                5
track_popularity            0
track_album_id              0
track_album_name            5
track_album_release_date    0
playlist_name               0
playlist_id                 0
playlist_genre              0
playlist_subgenre           0
danceability                0
energy                      0
key                         0
loudness                    0
mode                        0
speechiness                 0
acousticness                0
instrumentalness            0
liveness                    0
valence                     0
tempo                       0
duration_ms                 0
dtype: int64

In [12]:
# counting duplicate entries in the dataframe
df_spotify_songs.duplicated().sum()

0

# Extracting Disliked Songs of a User using streaming history

In [132]:
# loading the streaming history from JSON and keeping the relevant columns in a pandas DataFrame

import json
import pandas as pd

# Read the JSON file
with open('/Users/khieuvon/Documents/10_Personal Stuff/01_Masterarbeit/Data for ML Model/Collected Spotify Data from Friends/00_Data_Wadthy_ESFJ/Spotify Account Data/StreamingHistory_music_0.json', 'r') as file:
    data = json.load(file)

# Extract the required fields
extracted_data = [
    {
        'artistName': item['artistName'],
        'trackName': item['trackName'],
        'msPlayed': item['msPlayed']
    }
    for item in data
]

# Create a pandas DataFrame
df = pd.DataFrame(extracted_data)

# Display the first few rows of the DataFrame
print(df.head())

# Save the DataFrame to a CSV file (optional)
# df.to_csv('streaming_history.csv', index=False)

     artistName     trackName  msPlayed
0  Taylor Swift     Enchanted     86091
1    Bruno Mars      Treasure    178560
2          ZICO      Any song    186465
3            IU          Coin    193080
4            IU  Hold my hand    195213


In [133]:
df.head(10)

Unnamed: 0,artistName,trackName,msPlayed
0,Taylor Swift,Enchanted,86091
1,Bruno Mars,Treasure,178560
2,ZICO,Any song,186465
3,IU,Coin,193080
4,IU,Hold my hand,195213
5,Eric Nam,Miss You,172634
6,IU,LILAC,214253
7,IU,Meaning of you,195631
8,G-DRAGON,WHO YOU?,201428
9,LAS,"Love, This",168663


In [134]:
# creating a "like_dislike" column based on msPlayed (msPlayed <= 30000 ms -> 0 / dislike, otherwise 1 / like)
df['like_dislike'] = (df['msPlayed'] > 30000).astype(int)
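
The labeling rule above can be isolated in a small function, which makes the boundary case explicit: a play of exactly 30000 ms counts as a dislike. This is only a sketch; the names `SKIP_THRESHOLD_MS` and `label_plays` are hypothetical and do not appear elsewhere in this notebook.

```python
import pandas as pd

SKIP_THRESHOLD_MS = 30_000  # plays at or below this duration are treated as dislikes


def label_plays(history, threshold_ms=SKIP_THRESHOLD_MS):
    """Return a copy of the history with a binary like_dislike column."""
    out = history.copy()  # leave the caller's DataFrame untouched
    out["like_dislike"] = (out["msPlayed"] > threshold_ms).astype(int)
    return out


toy = pd.DataFrame(
    {"trackName": ["long play", "boundary", "skipped"], "msPlayed": [86091, 30000, 4410]}
)
labeled = label_plays(toy)
print(labeled)
```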

In [135]:
df.head(10)

Unnamed: 0,artistName,trackName,msPlayed,like_dislike
0,Taylor Swift,Enchanted,86091,1
1,Bruno Mars,Treasure,178560,1
2,ZICO,Any song,186465,1
3,IU,Coin,193080,1
4,IU,Hold my hand,195213,1
5,Eric Nam,Miss You,172634,1
6,IU,LILAC,214253,1
7,IU,Meaning of you,195631,1
8,G-DRAGON,WHO YOU?,201428,1
9,LAS,"Love, This",168663,1


In [136]:
df_dislike = df[df['like_dislike'] == 0]

In [137]:
df_dislike.head(30)

Unnamed: 0,artistName,trackName,msPlayed,like_dislike
14,Taylor Swift,Lavender Haze - Felix Jaehn Remix,4410,0
27,Whethan,Can't Hide,20145,0
34,Anne-Marie,2002,21382,0
45,ASL,When Loving You,1440,0
46,Anderson .Paak,Fire In The Sky,1650,0
49,MYSM,Indie Feel,6832,0
51,Bruno Mars,Just the Way You Are,4769,0
54,Jack Johnson,Washing Dishes,2430,0
57,Jay Chou,青花瓷,17950,0
58,The Script,The Man Who Can't Be Moved,1760,0


In [139]:
# Create the 'like_dislike' column
df['like_dislike'] = (df['msPlayed'] > 30000).astype(int)

# Filter for disliked songs (like_dislike = 0)
df_dislike = df[df['like_dislike'] == 0]

# Filter for liked songs (like_dislike = 1)
df_like = df[df['like_dislike'] == 1]

# Remove entries where trackName is "Unknown Track"
df_dislike = df_dislike[df_dislike['trackName'] != "Unknown Track"]

# Drop duplicate values
df_dislike = df_dislike.drop_duplicates()

# drawing a reproducible random sample of 200 entries from the disliked songs
df_dislike_sampled = df_dislike.sample(n=200, random_state=42)
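
Because labels are assigned per play, the same track can be labeled both liked and disliked (for example, one long and one very short play of the same song). One possible way to handle this, shown here only as a sketch and not as the approach taken in this notebook, is to aggregate per track and keep only the tracks whose longest play stays at or below the threshold. The helper name `tracks_never_played_long` is hypothetical.

```python
import pandas as pd


def tracks_never_played_long(history, threshold_ms=30_000):
    """Return (artistName, trackName) pairs whose longest play is at or below the threshold."""
    longest = history.groupby(["artistName", "trackName"], as_index=False)["msPlayed"].max()
    return longest[longest["msPlayed"] <= threshold_ms].reset_index(drop=True)


toy = pd.DataFrame(
    {
        "artistName": ["Taylor Swift", "Taylor Swift", "The xx"],
        "trackName": ["Enchanted", "Enchanted", "Islands"],
        "msPlayed": [86091, 110, 29200],
    }
)
strict_dislikes = tracks_never_played_long(toy)
print(strict_dislikes)
```

With this stricter rule, "Enchanted" is no longer treated as disliked, since it was played for more than 30 seconds at least once.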

In [140]:
df_dislike_sampled.head(30)

Unnamed: 0,artistName,trackName,msPlayed,like_dislike
1576,The xx,Islands,29200,0
4631,Quinn XCII,Too Late (with AJR),4202,0
1220,Parachute,Without You,0,0
4726,Jung Kook,Seven (feat. Latto),880,0
5353,Nicky Youre,Sunroof,0,0
5962,The Band CAMINO,2 / 14,590,0
3775,Marek Hemmann,Gemini,1630,0
328,French Montana,Unforgettable,24685,0
3403,Troye Sivan,YOUTH,1250,0
3410,Taylor Swift,Enchanted,110,0


In [141]:
df_dislike_sampled.shape

(200, 4)

In [142]:
df_dislike_sampled.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 200 entries, 1576 to 3316
Data columns (total 4 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   artistName    200 non-null    object
 1   trackName     200 non-null    object
 2   msPlayed      200 non-null    int64 
 3   like_dislike  200 non-null    int64 
dtypes: int64(2), object(2)
memory usage: 7.8+ KB


In [143]:
df_dislike_sampled.isnull().sum()

artistName      0
trackName       0
msPlayed        0
like_dislike    0
dtype: int64

In [144]:
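import os
import pandas as pd
import requests
import base64

# Spotify API credentials, read from environment variables so that the secret
# is not stored in the notebook (the variable names are an assumption)
client_id = os.environ['SPOTIFY_CLIENT_ID']
client_secret = os.environ['SPOTIFY_CLIENT_SECRET']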

# Function to get access token (client credentials flow)
def get_access_token(client_id, client_secret):
    auth_url = 'https://accounts.spotify.com/api/token'
    auth_header = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    auth_data = {'grant_type': 'client_credentials'}
    auth_response = requests.post(auth_url, headers={'Authorization': f'Basic {auth_header}'}, data=auth_data)
    auth_response.raise_for_status()  # fail early if the credentials are rejected
    return auth_response.json()['access_token']

# Function to search for a track and get its URI
def get_track_uri(artist_name, track_name, access_token):
    search_url = 'https://api.spotify.com/v1/search'
    query = f"track:{track_name} artist:{artist_name}"
    search_params = {
        'q': query,
        'type': 'track',
        'limit': 1
    }
    search_response = requests.get(search_url, headers={'Authorization': f'Bearer {access_token}'}, params=search_params)

    if search_response.status_code == 200:
        results = search_response.json()
        if results['tracks']['items']:
            return results['tracks']['items'][0]['uri']
    return None

# Get access token
access_token = get_access_token(client_id, client_secret)

# Function to apply to each row of the DataFrame
def get_uri_for_row(row):
    return get_track_uri(row['artistName'], row['trackName'], access_token)

# Apply the function to each row and create a new 'track_uri' column
df_dislike_sampled['track_uri'] = df_dislike_sampled.apply(get_uri_for_row, axis=1)

# Display the first few rows of the updated DataFrame
print(df_dislike_sampled[['artistName', 'trackName', 'track_uri']].head())

# Optionally, save the updated DataFrame to a CSV file
# df_dislike_sampled.to_csv('disliked_songs_with_uris.csv', index=False)

       artistName            trackName                             track_uri
1576       The xx              Islands  spotify:track:6i5tYaGlOFDLILEB6HfJAa
4631   Quinn XCII  Too Late (with AJR)  spotify:track:3KE6KppohrzZMbo2ao7CZ2
1220    Parachute          Without You  spotify:track:6R6ux6KaKrhAg2EIB2krdU
4726    Jung Kook  Seven (feat. Latto)  spotify:track:7x9aauaA9cu6tyfpHnqDLo
5353  Nicky Youre              Sunroof  spotify:track:5YqEzk3C5c3UZ1D5fJUlXA
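
The row-wise lookup above sends one search request per row, even when the same artist/track pair occurs several times. A small cache around the search function avoids the repeated requests. The sketch below uses a stand-in `fake_search` so that it runs without network access; `make_cached_lookup` is a hypothetical helper, and in this notebook the wrapped function could be something like `lambda a, t: get_track_uri(a, t, access_token)`.

```python
def make_cached_lookup(search_fn):
    """Wrap a search function so that each (artist, track) pair is looked up only once."""
    cache = {}

    def lookup(artist_name, track_name):
        # Normalize the key so that trivial differences in case/whitespace share one entry
        key = (artist_name.strip().lower(), track_name.strip().lower())
        if key not in cache:
            cache[key] = search_fn(artist_name, track_name)
        return cache[key]

    return lookup


# Stand-in for the real API call; it records how often it is invoked.
calls = []


def fake_search(artist_name, track_name):
    calls.append((artist_name, track_name))
    return f"spotify:track:fake-{len(calls)}"


lookup = make_cached_lookup(fake_search)
first = lookup("IU", "Coin")
second = lookup("iu ", "coin")  # served from the cache, no second call
print(first, second, len(calls))
```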


In [145]:
df_dislike_sampled.head(20)

Unnamed: 0,artistName,trackName,msPlayed,like_dislike,track_uri
1576,The xx,Islands,29200,0,spotify:track:6i5tYaGlOFDLILEB6HfJAa
4631,Quinn XCII,Too Late (with AJR),4202,0,spotify:track:3KE6KppohrzZMbo2ao7CZ2
1220,Parachute,Without You,0,0,spotify:track:6R6ux6KaKrhAg2EIB2krdU
4726,Jung Kook,Seven (feat. Latto),880,0,spotify:track:7x9aauaA9cu6tyfpHnqDLo
5353,Nicky Youre,Sunroof,0,0,spotify:track:5YqEzk3C5c3UZ1D5fJUlXA
5962,The Band CAMINO,2 / 14,590,0,spotify:track:2QwpEi3eNToZCCMMRcOj0u
3775,Marek Hemmann,Gemini,1630,0,spotify:track:5dwjQsS2ezI4NDnokvK7IM
328,French Montana,Unforgettable,24685,0,spotify:track:3B54sVLJ402zGa6Xm4YGNe
3403,Troye Sivan,YOUTH,1250,0,spotify:track:1cOyWWUr3oXJIxY0AjJEx9
3410,Taylor Swift,Enchanted,110,0,spotify:track:3sW3oSbzsfecv9XoUdGs7h


In [146]:
df_dislike_sampled.isnull().sum()

artistName      0
trackName       0
msPlayed        0
like_dislike    0
track_uri       7
dtype: int64

In [147]:
df_dislike_sampled_final = df_dislike_sampled.drop(['msPlayed' ,'like_dislike'], axis=1)

In [131]:
df_dislike_sampled_final.head(20)

1576    spotify:track:6i5tYaGlOFDLILEB6HfJAa
4631    spotify:track:3KE6KppohrzZMbo2ao7CZ2
1220    spotify:track:6R6ux6KaKrhAg2EIB2krdU
4726    spotify:track:7x9aauaA9cu6tyfpHnqDLo
5353    spotify:track:5YqEzk3C5c3UZ1D5fJUlXA
5962    spotify:track:2QwpEi3eNToZCCMMRcOj0u
3775    spotify:track:5dwjQsS2ezI4NDnokvK7IM
328     spotify:track:3B54sVLJ402zGa6Xm4YGNe
3403    spotify:track:1cOyWWUr3oXJIxY0AjJEx9
3410    spotify:track:3sW3oSbzsfecv9XoUdGs7h
5831    spotify:track:3JbVKGhp0WXqO3DK8Xeg0n
948     spotify:track:5eO04wLeM487N9qhPHPPoB
3719    spotify:track:0dbQ4h3cs8QE5fOPMYdDrX
4450    spotify:track:5acSb48zFAcXTdL5Wsk8xx
4650    spotify:track:3QkU3yJDpu7XITJz4uWSSg
2211    spotify:track:6r3duEAfFTH83DuoywkG20
3082    spotify:track:6UDxsNdXBPvlLOydyncRDa
8744    spotify:track:4KoecuyOpZaNFZ0UqVsllc
9856    spotify:track:4BShF07Q4mZh0L9Juoes0Z
5426    spotify:track:2jdAk8ATWIL3dwT47XpRfu
Name: track_uri, dtype: object

In [148]:
df_dislike_sampled_final.shape

(200, 3)

In [149]:
df_dislike_sampled_final.head(20)

Unnamed: 0,artistName,trackName,track_uri
1576,The xx,Islands,spotify:track:6i5tYaGlOFDLILEB6HfJAa
4631,Quinn XCII,Too Late (with AJR),spotify:track:3KE6KppohrzZMbo2ao7CZ2
1220,Parachute,Without You,spotify:track:6R6ux6KaKrhAg2EIB2krdU
4726,Jung Kook,Seven (feat. Latto),spotify:track:7x9aauaA9cu6tyfpHnqDLo
5353,Nicky Youre,Sunroof,spotify:track:5YqEzk3C5c3UZ1D5fJUlXA
5962,The Band CAMINO,2 / 14,spotify:track:2QwpEi3eNToZCCMMRcOj0u
3775,Marek Hemmann,Gemini,spotify:track:5dwjQsS2ezI4NDnokvK7IM
328,French Montana,Unforgettable,spotify:track:3B54sVLJ402zGa6Xm4YGNe
3403,Troye Sivan,YOUTH,spotify:track:1cOyWWUr3oXJIxY0AjJEx9
3410,Taylor Swift,Enchanted,spotify:track:3sW3oSbzsfecv9XoUdGs7h


In [150]:
# dropping null values in the track_uri column
df_dislike_sampled_final = df_dislike_sampled_final.dropna()

In [151]:
df_dislike_sampled_final.shape

(193, 3)

In [152]:
# splitting the track_uri column to get the track_id separately
df_dislike_sampled_final['track_id'] = df_dislike_sampled_final['track_uri'].str.split(':').str.get(-1)
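
Taking the last segment after splitting on `:` works for well-formed track URIs. A slightly more defensive variant, sketched below with the hypothetical helper `track_id_from_uri`, returns `None` for missing values or for URIs that do not have the form `spotify:track:<id>`.

```python
def track_id_from_uri(uri):
    """Return the ID from a 'spotify:track:<id>' URI, or None if the value is not a track URI."""
    parts = uri.split(":") if isinstance(uri, str) else []
    if len(parts) == 3 and parts[0] == "spotify" and parts[1] == "track" and parts[2]:
        return parts[2]
    return None


print(track_id_from_uri("spotify:track:6i5tYaGlOFDLILEB6HfJAa"))
print(track_id_from_uri("spotify:album:abc"))
print(track_id_from_uri(None))
```

Applied to a pandas column, this could be used as `df['track_uri'].map(track_id_from_uri)`.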

In [153]:
df_dislike_sampled_final.head(10)


Unnamed: 0,artistName,trackName,track_uri,track_id
1576,The xx,Islands,spotify:track:6i5tYaGlOFDLILEB6HfJAa,6i5tYaGlOFDLILEB6HfJAa
4631,Quinn XCII,Too Late (with AJR),spotify:track:3KE6KppohrzZMbo2ao7CZ2,3KE6KppohrzZMbo2ao7CZ2
1220,Parachute,Without You,spotify:track:6R6ux6KaKrhAg2EIB2krdU,6R6ux6KaKrhAg2EIB2krdU
4726,Jung Kook,Seven (feat. Latto),spotify:track:7x9aauaA9cu6tyfpHnqDLo,7x9aauaA9cu6tyfpHnqDLo
5353,Nicky Youre,Sunroof,spotify:track:5YqEzk3C5c3UZ1D5fJUlXA,5YqEzk3C5c3UZ1D5fJUlXA
5962,The Band CAMINO,2 / 14,spotify:track:2QwpEi3eNToZCCMMRcOj0u,2QwpEi3eNToZCCMMRcOj0u
3775,Marek Hemmann,Gemini,spotify:track:5dwjQsS2ezI4NDnokvK7IM,5dwjQsS2ezI4NDnokvK7IM
328,French Montana,Unforgettable,spotify:track:3B54sVLJ402zGa6Xm4YGNe,3B54sVLJ402zGa6Xm4YGNe
3403,Troye Sivan,YOUTH,spotify:track:1cOyWWUr3oXJIxY0AjJEx9,1cOyWWUr3oXJIxY0AjJEx9
3410,Taylor Swift,Enchanted,spotify:track:3sW3oSbzsfecv9XoUdGs7h,3sW3oSbzsfecv9XoUdGs7h


In [154]:
df_dislike_sampled_final.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 193 entries, 1576 to 3316
Data columns (total 4 columns):
 #   Column      Non-Null Count  Dtype 
---  ------      --------------  ----- 
 0   artistName  193 non-null    object
 1   trackName   193 non-null    object
 2   track_uri   193 non-null    object
 3   track_id    193 non-null    object
dtypes: object(4)
memory usage: 7.5+ KB


In [155]:
import os
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

# credentials are read from environment variables (the variable names are an assumption)
client_credentials_manager = SpotifyClientCredentials(client_id=os.environ['SPOTIFY_CLIENT_ID'], client_secret=os.environ['SPOTIFY_CLIENT_SECRET'])
sp = spotipy.Spotify(client_credentials_manager=client_credentials_manager)

In [156]:
import time

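def get_audio_features_batch(track_ids, batch_size=100):
    """Fetch audio features in batches (the endpoint accepts at most 100 IDs per request)."""
    audio_features = []
    for i in range(0, len(track_ids), batch_size):
        batch = track_ids[i:i+batch_size]
        features = sp.audio_features(batch)
        # the result may contain None for IDs without audio features; skip those entries
        audio_features.extend(f for f in features if f is not None)
        time.sleep(1)  # Add a 1-second delay between batches to respect rate limits
    return audio_features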
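
The batching logic can be checked without calling the API by injecting the fetch function. The following sketch assumes a stand-in `fake_fetch` that mimics a response with a missing entry; `chunked` and `collect_features` are hypothetical helper names.

```python
def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def collect_features(track_ids, fetch_batch, batch_size=100):
    """Call fetch_batch once per chunk and drop entries that came back as None."""
    collected = []
    for batch in chunked(track_ids, batch_size):
        collected.extend(f for f in fetch_batch(batch) if f is not None)
    return collected


# Stand-in for the real API call: returns None for one unknown ID.
def fake_fetch(batch):
    return [None if tid == "missing" else {"id": tid, "energy": 0.5} for tid in batch]


ids = [f"t{i}" for i in range(5)] + ["missing"]
features = collect_features(ids, fake_fetch, batch_size=4)
print(len(features))
```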

In [157]:
all_track_ids = df_dislike_sampled_final['track_id'].tolist()
all_audio_features = get_audio_features_batch(all_track_ids)

In [158]:
audio_features_df = pd.DataFrame(all_audio_features)
result_df = pd.merge(df_dislike_sampled_final, audio_features_df, left_on='track_id', right_on='id', how='left')
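
One caveat with a left merge: if the same track ID occurs more than once on both sides, the matching rows are multiplied. The toy example below (invented values) illustrates the effect and one possible safeguard, namely deduplicating the feature table on `id` and asking pandas to verify the relationship with `validate="many_to_one"`.

```python
import pandas as pd

tracks = pd.DataFrame({"track_id": ["a", "b", "a"], "trackName": ["x", "y", "x"]})
features = pd.DataFrame({"id": ["a", "b", "a"], "energy": [0.1, 0.2, 0.1]})

# Naive merge: each of the two "a" rows matches both "a" feature rows
naive = tracks.merge(features, left_on="track_id", right_on="id", how="left")

# Deduplicate the feature table first, then let pandas check the relationship
features_unique = features.drop_duplicates(subset="id")
merged = tracks.merge(
    features_unique,
    left_on="track_id",
    right_on="id",
    how="left",
    validate="many_to_one",  # raises MergeError if 'id' is not unique on the right
)

print(len(tracks), len(naive), len(merged))
```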

In [159]:
result_df.head(20)

Unnamed: 0,artistName,trackName,track_uri,track_id,danceability,energy,key,loudness,mode,speechiness,...,liveness,valence,tempo,type,id,uri,track_href,analysis_url,duration_ms,time_signature
0,The xx,Islands,spotify:track:6i5tYaGlOFDLILEB6HfJAa,6i5tYaGlOFDLILEB6HfJAa,0.871,0.472,4,-11.077,1,0.0756,...,0.145,0.707,124.04,audio_features,6i5tYaGlOFDLILEB6HfJAa,spotify:track:6i5tYaGlOFDLILEB6HfJAa,https://api.spotify.com/v1/tracks/6i5tYaGlOFDL...,https://api.spotify.com/v1/audio-analysis/6i5t...,160720,4
1,Quinn XCII,Too Late (with AJR),spotify:track:3KE6KppohrzZMbo2ao7CZ2,3KE6KppohrzZMbo2ao7CZ2,0.731,0.63,4,-6.943,1,0.0637,...,0.262,0.634,89.025,audio_features,3KE6KppohrzZMbo2ao7CZ2,spotify:track:3KE6KppohrzZMbo2ao7CZ2,https://api.spotify.com/v1/tracks/3KE6KppohrzZ...,https://api.spotify.com/v1/audio-analysis/3KE6...,175760,4
2,Parachute,Without You,spotify:track:6R6ux6KaKrhAg2EIB2krdU,6R6ux6KaKrhAg2EIB2krdU,0.564,0.864,4,-5.121,1,0.0341,...,0.182,0.534,95.984,audio_features,6R6ux6KaKrhAg2EIB2krdU,spotify:track:6R6ux6KaKrhAg2EIB2krdU,https://api.spotify.com/v1/tracks/6R6ux6KaKrhA...,https://api.spotify.com/v1/audio-analysis/6R6u...,228933,4
3,Jung Kook,Seven (feat. Latto),spotify:track:7x9aauaA9cu6tyfpHnqDLo,7x9aauaA9cu6tyfpHnqDLo,0.788,0.841,11,-3.955,1,0.0432,...,0.0772,0.904,124.986,audio_features,7x9aauaA9cu6tyfpHnqDLo,spotify:track:7x9aauaA9cu6tyfpHnqDLo,https://api.spotify.com/v1/tracks/7x9aauaA9cu6...,https://api.spotify.com/v1/audio-analysis/7x9a...,184400,4
4,Nicky Youre,Sunroof,spotify:track:5YqEzk3C5c3UZ1D5fJUlXA,5YqEzk3C5c3UZ1D5fJUlXA,0.768,0.714,10,-5.11,1,0.0401,...,0.15,0.842,131.443,audio_features,5YqEzk3C5c3UZ1D5fJUlXA,spotify:track:5YqEzk3C5c3UZ1D5fJUlXA,https://api.spotify.com/v1/tracks/5YqEzk3C5c3U...,https://api.spotify.com/v1/audio-analysis/5YqE...,163026,4
5,The Band CAMINO,2 / 14,spotify:track:2QwpEi3eNToZCCMMRcOj0u,2QwpEi3eNToZCCMMRcOj0u,0.607,0.709,9,-7.428,1,0.0538,...,0.294,0.551,97.006,audio_features,2QwpEi3eNToZCCMMRcOj0u,spotify:track:2QwpEi3eNToZCCMMRcOj0u,https://api.spotify.com/v1/tracks/2QwpEi3eNToZ...,https://api.spotify.com/v1/audio-analysis/2Qwp...,163690,4
6,Marek Hemmann,Gemini,spotify:track:5dwjQsS2ezI4NDnokvK7IM,5dwjQsS2ezI4NDnokvK7IM,0.793,0.637,8,-11.994,1,0.093,...,0.143,0.32,128.001,audio_features,5dwjQsS2ezI4NDnokvK7IM,spotify:track:5dwjQsS2ezI4NDnokvK7IM,https://api.spotify.com/v1/tracks/5dwjQsS2ezI4...,https://api.spotify.com/v1/audio-analysis/5dwj...,547560,4
7,French Montana,Unforgettable,spotify:track:3B54sVLJ402zGa6Xm4YGNe,3B54sVLJ402zGa6Xm4YGNe,0.726,0.769,6,-5.043,1,0.123,...,0.104,0.733,97.985,audio_features,3B54sVLJ402zGa6Xm4YGNe,spotify:track:3B54sVLJ402zGa6Xm4YGNe,https://api.spotify.com/v1/tracks/3B54sVLJ402z...,https://api.spotify.com/v1/audio-analysis/3B54...,233902,4
8,Troye Sivan,YOUTH,spotify:track:1cOyWWUr3oXJIxY0AjJEx9,1cOyWWUr3oXJIxY0AjJEx9,0.628,0.737,7,-4.437,1,0.041,...,0.0777,0.591,91.505,audio_features,1cOyWWUr3oXJIxY0AjJEx9,spotify:track:1cOyWWUr3oXJIxY0AjJEx9,https://api.spotify.com/v1/tracks/1cOyWWUr3oXJ...,https://api.spotify.com/v1/audio-analysis/1cOy...,185194,4
9,Taylor Swift,Enchanted,spotify:track:3sW3oSbzsfecv9XoUdGs7h,3sW3oSbzsfecv9XoUdGs7h,0.52,0.553,8,-3.546,1,0.0269,...,0.165,0.227,81.949,audio_features,3sW3oSbzsfecv9XoUdGs7h,spotify:track:3sW3oSbzsfecv9XoUdGs7h,https://api.spotify.com/v1/tracks/3sW3oSbzsfec...,https://api.spotify.com/v1/audio-analysis/3sW3...,353253,4


In [160]:
result_df['MBTI'] = 'ESFJ' # --> change as per MBTI type

In [161]:
# Save the extended DataFrame
result_df.to_csv('extended_songs_wadthy_dislike.csv', index=False)  # --> change as per Name