## Chapter 5) Methodology
This chapter contains the analysis code and ML workflow used to build a recommender system and to find similarities between music items, based on a user's listening history and liked/saved songs.


##### Steps (Brief Overview):

**Data Collection & Processing (EDA Part)**
1. Load all available data and store it in Pandas DataFrames.
2. Connect to the Spotify API via the developer console to extract song features.
3. Create separate DataFrames for the songs/playlists collected from friends and for the MBTI playlists downloaded from Kaggle.

**Feature Extraction & Selection**
4. Clean the data and select the feature columns for the model.

**Content-Based Filtering on Base Dataset**
5. Apply the different ML models to the baseline dataset using content-based filtering and evaluate the initial results.

**Incorporating MBTI Personality Types in the Recommendation Process**
6. Add a feature column for the MBTI personality type and create MBTI-based DataFrames from the Kaggle datasets.
7. Apply user-item matrix factorization and evaluate the results.

**Compare Results**
8. Compare the results of the baseline model with those of the MBTI-based model.
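
Step 5 can be illustrated with a minimal sketch of content-based filtering. It assumes each song is described by a few numeric audio features and that similarity is measured with cosine similarity; the three songs and their values are invented for illustration, the function name `most_similar` is hypothetical, and the models actually evaluated in this work may differ.

```python
import numpy as np
import pandas as pd

# Toy feature table; the values are invented purely for illustration.
songs = pd.DataFrame(
    {
        "title": ["Song A", "Song B", "Song C"],
        "danceability": [0.80, 0.75, 0.20],
        "energy": [0.70, 0.65, 0.10],
        "valence": [0.60, 0.55, 0.90],
    }
)
feature_cols = ["danceability", "energy", "valence"]


def most_similar(df, seed_title, feature_cols, top_n=2):
    """Rank the other songs by cosine similarity to the seed song."""
    X = df[feature_cols].to_numpy(dtype=float)
    # Normalise each row to unit length so a dot product equals cosine similarity.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    X_unit = X / np.where(norms == 0, 1.0, norms)
    # Positional index of the seed song (first match on the title).
    pos = int(np.flatnonzero((df["title"] == seed_title).to_numpy())[0])
    scores = X_unit @ X_unit[pos]
    ranked = df.assign(similarity=scores).drop(index=df.index[pos])
    return ranked.sort_values("similarity", ascending=False).head(top_n)


recommendations = most_similar(songs, "Song A", feature_cols)
print(recommendations[["title", "similarity"]])
```

In practice the features would usually be scaled first (tempo and loudness have very different ranges from the 0 to 1 features), otherwise a single column can dominate the similarity.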

#### Data Collection & Processing (EDA Part)

In [2]:
import pandas as pd
import seaborn as sns

# Save the playlists obtained from friends and the Kaggle datasets as Pandas DataFrames ("Your Library" = liked songs)


In [22]:
# example that worked for 1 DF

import json
import pandas as pd

# JSON file path of user
json_file_path_wadthy = '/Users/khieuvon/Documents/10_Personal Stuff/01_Masterarbeit/Data for ML Model/Collected Spotify Data from Friends/All Extracted Library/00_YourLibrary_wadthy.json'

# Load the JSON file
with open(json_file_path_wadthy, 'r') as file:
    data = json.load(file)

# Extract song information
songs = data['tracks']

# Create a list to hold song data
song_list = []

# Iterate over each song and extract relevant details
for song in songs:
    song_info = {
        'Title': song.get('track', 'N/A'),
        'Artist': song.get('artist', 'N/A'),
        'Album': song.get('album', 'N/A'),
        'URI': song.get('uri', 'N/A')
    }
    song_list.append(song_info)

# Create a DataFrame from the song list
df = pd.DataFrame(song_list)

# Display the DataFrame
df.head(10)

Unnamed: 0,Title,Artist,Album,URI
0,Smells Like Teen Spirit,Nirvana,Nirvana,spotify:track:4hy4fb5D1KL50b3sng9cjw
1,Sure Thing,Miguel,All I Want Is You,spotify:track:0JXXNGljqupsJaZsgSbMZV
2,Fancy Shoes,The Walters,Songs for Dads,spotify:track:1YVVAiBD5WhX2ZdHtlSOhz
3,Tokyo Inn,HYUKOH,23,spotify:track:4myeBw35GUMw5FyDGZcOON
4,Konoha Peace,Kato,Naruto Vibes,spotify:track:0wIfYaveiZku0eL44UXtHk
5,Wake Up Call - Mark Ronson Remix,Maroon 5,It Won't Be Soon Before Long,spotify:track:4Q36omJXoeMD8LnnoOXOu7
6,Stolen Dance - Acoustic Version,Milky Chance,Sadnecessary,spotify:track:3DUhxfgyzNi0cCmZWDWfnS
7,Williamsburg - Urban Contact Remix,Purple Souls,Williamsburg,spotify:track:3FRQ6ZBnsbiUEgr3fwyzHV
8,DUELE EL CORAZON (feat. Wisin),Enrique Iglesias,DUELE EL CORAZON (feat. Wisin),spotify:track:6YZdkObH88npeKrrkb8Ggf
9,Blank Space,Taylor Swift,1989,spotify:track:1u8c2t2Cy7UBoG4ArRcF5g


In [23]:
import json
import pandas as pd
import os

# List of JSON file paths
file_paths = [
    '/Users/khieuvon/Documents/10_Personal Stuff/01_Masterarbeit/Data for ML Model/Collected Spotify Data from Friends/All Extracted Library/00_YourLibrary_wadthy.json',
    '/Users/khieuvon/Documents/10_Personal Stuff/01_Masterarbeit/Data for ML Model/Collected Spotify Data from Friends/All Extracted Library/01_YourLibrary_withy.json',
    '/Users/khieuvon/Documents/10_Personal Stuff/01_Masterarbeit/Data for ML Model/Collected Spotify Data from Friends/All Extracted Library/02_YourLibrary_yoojin.json'
]

# Dictionary to store DataFrames
dataframes = {}

# Loop through each file, load the data, and create a DataFrame
for file_path in file_paths:
    with open(file_path, 'r') as file:
        data = json.load(file)

    # Extract song information
    songs = data['tracks']

    # Create a list to hold song data
    song_list = []

    # Iterate over each song and extract relevant details
    for song in songs:
        song_info = {
            'Title': song.get('track', 'N/A'),
            'Artist': song.get('artist', 'N/A'),
            'Album': song.get('album', 'N/A'),
            'URI': song.get('uri', 'N/A')
        }
        song_list.append(song_info)

    # Create a DataFrame from the song list
    df = pd.DataFrame(song_list)

    # Use the file name (without extension) as the key for the DataFrame in the dictionary
    file_name = os.path.splitext(os.path.basename(file_path))[0]
    dataframes[file_name] = df

# Now dataframes dictionary holds DataFrames for each JSON file
for name, df in dataframes.items():
    print(f"DataFrame for {name}:")
    print(df)

DataFrame for 00_YourLibrary_wadthy:
                        Title                 Artist                Album  \
0     Smells Like Teen Spirit                Nirvana              Nirvana   
1                  Sure Thing                 Miguel    All I Want Is You   
2                 Fancy Shoes            The Walters       Songs for Dads   
3                   Tokyo Inn                 HYUKOH                   23   
4                Konoha Peace                   Kato         Naruto Vibes   
...                       ...                    ...                  ...   
3152            Snow (Hey Oh)  Red Hot Chili Peppers     Stadium Arcadium   
3153      Tell me you love me                Tophyun  Tell me you love me   
3154                 Sunshine              Matisyahu         Spark Seeker   
3155                     Sing             Ed Sheeran                    x   
3156             Blue Spirits                 DWLLRS         Blue Spirits   

                                      

In [25]:
import json
import pandas as pd
import os

# List of JSON file paths --> local file path
file_paths = [
    '/Users/khieuvon/Documents/10_Personal Stuff/01_Masterarbeit/Data for ML Model/Collected Spotify Data from Friends/All Extracted Library/00_YourLibrary_wadthy.json',
    '/Users/khieuvon/Documents/10_Personal Stuff/01_Masterarbeit/Data for ML Model/Collected Spotify Data from Friends/All Extracted Library/01_YourLibrary_withy.json',
    '/Users/khieuvon/Documents/10_Personal Stuff/01_Masterarbeit/Data for ML Model/Collected Spotify Data from Friends/All Extracted Library/02_YourLibrary_yoojin.json',
    '/Users/khieuvon/Documents/10_Personal Stuff/01_Masterarbeit/Data for ML Model/Collected Spotify Data from Friends/All Extracted Library/03_YourLibrary_moni.json',
    '/Users/khieuvon/Documents/10_Personal Stuff/01_Masterarbeit/Data for ML Model/Collected Spotify Data from Friends/All Extracted Library/04_YourLibrary_nga.json',
    '/Users/khieuvon/Documents/10_Personal Stuff/01_Masterarbeit/Data for ML Model/Collected Spotify Data from Friends/All Extracted Library/05_YourLibrary_makra.json',
    '/Users/khieuvon/Documents/10_Personal Stuff/01_Masterarbeit/Data for ML Model/Collected Spotify Data from Friends/All Extracted Library/06_YourLibrary_sören.json',
    '/Users/khieuvon/Documents/10_Personal Stuff/01_Masterarbeit/Data for ML Model/Collected Spotify Data from Friends/All Extracted Library/07_YourLibrary_simon.json',
    '/Users/khieuvon/Documents/10_Personal Stuff/01_Masterarbeit/Data for ML Model/Collected Spotify Data from Friends/All Extracted Library/09_YourLibrary_yeonju.json',
    '/Users/khieuvon/Documents/10_Personal Stuff/01_Masterarbeit/Data for ML Model/Collected Spotify Data from Friends/All Extracted Library/10_YourLibrary_han.json'
]

# Dictionary to store DataFrames
dataframes = {}

# Loop through each file, load the data, and create a DataFrame
for file_path in file_paths:
    with open(file_path, 'r') as file:
        data = json.load(file)

    # Extract song information
    songs = data['tracks']

    # Create a list to hold song data
    song_list = []

    # Iterate over each song and extract relevant details
    for song in songs:
        song_info = {
            'Title': song.get('track', 'N/A'),
            'Artist': song.get('artist', 'N/A'),
            'Album': song.get('album', 'N/A'),
            'URI': song.get('uri', 'N/A')
        }
        song_list.append(song_info)

    # Create a DataFrame from the song list
    df = pd.DataFrame(song_list)

    # Use the file name (without extension) as the key for the DataFrame in the dictionary
    file_name = os.path.splitext(os.path.basename(file_path))[0]
    dataframes[file_name] = df

# Accessing and analyzing individual DataFrames
for name, df in dataframes.items():
    print(f"DataFrame for {name}:")
    print(df.head())  # Display the first few rows of the DataFrame

    # Example analysis: Print the number of songs in each DataFrame
    print(f"Number of songs in {name}: {len(df)}")

    # Optionally, save each DataFrame to a separate CSV file
    df.to_csv(f'{name}_songs.csv', index=False)

# Example of specific DataFrame access for further analysis


# Access specific DataFrames by key, e.g. "05_YourLibrary_makra", to get one DataFrame per collected user
df_wadthy = dataframes.get("00_YourLibrary_wadthy")
df_withy = dataframes.get("01_YourLibrary_withy")
df_yoojin = dataframes.get("02_YourLibrary_yoojin")
df_moni = dataframes.get("03_YourLibrary_moni")
df_nga = dataframes.get("04_YourLibrary_nga")
df_makra = dataframes.get("05_YourLibrary_makra")
df_sören = dataframes.get("06_YourLibrary_sören")
df_simon = dataframes.get("07_YourLibrary_simon")
# Van's data
df_yeonju = dataframes.get("09_YourLibrary_yeonju")
df_han = dataframes.get("10_YourLibrary_han")
# Trang's data

# Pick one DataFrame for a closer look (any of the keys above works)
df_specific = df_makra

if df_specific is not None:
    # Perform analysis on the specific DataFrame
    print(df_specific.describe())
else:
    print("The specified DataFrame does not exist.")

DataFrame for 00_YourLibrary_wadthy:
                     Title       Artist              Album  \
0  Smells Like Teen Spirit      Nirvana            Nirvana   
1               Sure Thing       Miguel  All I Want Is You   
2              Fancy Shoes  The Walters     Songs for Dads   
3                Tokyo Inn       HYUKOH                 23   
4             Konoha Peace         Kato       Naruto Vibes   

                                    URI  
0  spotify:track:4hy4fb5D1KL50b3sng9cjw  
1  spotify:track:0JXXNGljqupsJaZsgSbMZV  
2  spotify:track:1YVVAiBD5WhX2ZdHtlSOhz  
3  spotify:track:4myeBw35GUMw5FyDGZcOON  
4  spotify:track:0wIfYaveiZku0eL44UXtHk  
Number of songs in 00_YourLibrary_wadthy: 3157
DataFrame for 01_YourLibrary_withy:
                                       Title           Artist          Album  \
0  Ordinaryish People (feat. Blue Man Group)              AJR   OK ORCHESTRA   
1                                   Good Day       Jake Scott       Lavender   
2              
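
The loading cells above repeat the same loop for every file. A minimal sketch of how it could be wrapped in a reusable function, assuming every export has a top-level `tracks` list whose entries carry the `track`, `artist`, `album`, and `uri` keys used above. The helper names `library_to_df` and `load_libraries` and the file-name pattern are hypothetical, and the demo writes a tiny stand-in file to a temporary directory instead of reading the real exports.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd

# Output column name -> key in the exported JSON (as used in the cells above).
COLUMN_KEYS = {"Title": "track", "Artist": "artist", "Album": "album", "URI": "uri"}


def library_to_df(data):
    """Convert one parsed library export into a DataFrame."""
    rows = [
        {column: song.get(key, "N/A") for column, key in COLUMN_KEYS.items()}
        for song in data.get("tracks", [])
    ]
    return pd.DataFrame(rows, columns=list(COLUMN_KEYS))


def load_libraries(folder, pattern="*_YourLibrary_*.json"):
    """Load every matching export in `folder`, keyed by file name without extension."""
    frames = {}
    for path in sorted(Path(folder).glob(pattern)):
        with open(path, "r", encoding="utf-8") as file:
            frames[path.stem] = library_to_df(json.load(file))
    return frames


# Demo with a stand-in export so the sketch runs without the original files.
with tempfile.TemporaryDirectory() as tmp:
    sample = {"tracks": [{"track": "Blank Space", "artist": "Taylor Swift",
                          "album": "1989",
                          "uri": "spotify:track:1u8c2t2Cy7UBoG4ArRcF5g"}]}
    demo_file = Path(tmp) / "00_YourLibrary_demo.json"
    demo_file.write_text(json.dumps(sample), encoding="utf-8")
    dataframes_demo = load_libraries(tmp)

print(dataframes_demo["00_YourLibrary_demo"])
```

Globbing the folder avoids maintaining the list of absolute paths by hand, and opening the files with an explicit `encoding='utf-8'` keeps titles with non-ASCII characters intact across platforms.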

In [33]:
# Access a specific DataFrame by its key, e.g., "05_YourLibrary_makra"
df_wadthy = dataframes.get("00_YourLibrary_wadthy")
df_wadthy.head(20)


Unnamed: 0,Title,Artist,Album,URI
0,Smells Like Teen Spirit,Nirvana,Nirvana,spotify:track:4hy4fb5D1KL50b3sng9cjw
1,Sure Thing,Miguel,All I Want Is You,spotify:track:0JXXNGljqupsJaZsgSbMZV
2,Fancy Shoes,The Walters,Songs for Dads,spotify:track:1YVVAiBD5WhX2ZdHtlSOhz
3,Tokyo Inn,HYUKOH,23,spotify:track:4myeBw35GUMw5FyDGZcOON
4,Konoha Peace,Kato,Naruto Vibes,spotify:track:0wIfYaveiZku0eL44UXtHk
5,Wake Up Call - Mark Ronson Remix,Maroon 5,It Won't Be Soon Before Long,spotify:track:4Q36omJXoeMD8LnnoOXOu7
6,Stolen Dance - Acoustic Version,Milky Chance,Sadnecessary,spotify:track:3DUhxfgyzNi0cCmZWDWfnS
7,Williamsburg - Urban Contact Remix,Purple Souls,Williamsburg,spotify:track:3FRQ6ZBnsbiUEgr3fwyzHV
8,DUELE EL CORAZON (feat. Wisin),Enrique Iglesias,DUELE EL CORAZON (feat. Wisin),spotify:track:6YZdkObH88npeKrrkb8Ggf
9,Blank Space,Taylor Swift,1989,spotify:track:1u8c2t2Cy7UBoG4ArRcF5g


In [34]:
df_withy = dataframes.get("01_YourLibrary_withy")
df_withy.head(20)

Unnamed: 0,Title,Artist,Album,URI
0,Ordinaryish People (feat. Blue Man Group),AJR,OK ORCHESTRA,spotify:track:3sBdf3nxnEC9e2GcdP9d3j
1,Good Day,Jake Scott,Lavender,spotify:track:5zsTxXxpVzWC9MqeNJ3Pes
2,Oh My Love,The Score,ATLAS,spotify:track:4Q99wmrzQFtK3oweyLkWij
3,Conversations,Marney,Conversations,spotify:track:6XfjfDpIUWI0sE0WBBnQfD
4,Bones,Imagine Dragons,Bones,spotify:track:0HqZX76SFLDz2aW8aiqi7G
5,Uma Thurman,Fall Out Boy,American Beauty/American Psycho,spotify:track:5PUawWFG1oIS2NwEcyHaCr
6,Ghosts,BANNERS,Ghosts,spotify:track:2YGqpzBhtJxt9M5g1czYg4
7,all the good girls go to hell,Billie Eilish,"WHEN WE ALL FALL ASLEEP, WHERE DO WE GO?",spotify:track:6IRdLKIyS4p7XNiP8r6rsx
8,Control,Broken Bells,After the Disco,spotify:track:1QZJiYulh7ak7GpZ8OAdwI
9,I Like Me Better,Lauv,I met you when I was 18.,spotify:track:0EcQcdcbQeVJn9fknj44Be


In [14]:
import json
import pandas as pd

# JSON file path of user
json_file_path_wadthy = '/Users/khieuvon/Documents/10_Personal Stuff/01_Masterarbeit/Data for ML Model/Collected Spotify Data from Friends/All Extracted Library/00_YourLibrary_wadthy.json'

# Load the JSON file
with open(json_file_path_wadthy, 'r') as file:
    data = json.load(file)

# Inspect the structure of the JSON
print(json.dumps(data, indent=2))

# Extract song information (the export stores the songs under the top-level key 'tracks')
songs = data['tracks']

# Create a list to hold song data
song_list = []

# Iterate over each song and extract relevant details.
# Each entry is a flat dict, so song['track'] is the title string, not a nested dict;
# .get() returns 'N/A' for any key that is missing from the export.
for song in songs:
    song_info = {
        'Title': song.get('track', 'N/A'),
        'Artist': song.get('artist', 'N/A'),
        'Album': song.get('album', 'N/A'),
        'Added Date': song.get('added_at', 'N/A')
    }
    song_list.append(song_info)

# Create a DataFrame from the song list
df = pd.DataFrame(song_list)

# Display the DataFrame
print(df)


In [16]:
df.head(20)

Unnamed: 0,name,artist,album,added_at,uri
0,,,,,spotify:track:4hy4fb5D1KL50b3sng9cjw
1,,,,,spotify:track:0JXXNGljqupsJaZsgSbMZV
2,,,,,spotify:track:1YVVAiBD5WhX2ZdHtlSOhz
3,,,,,spotify:track:4myeBw35GUMw5FyDGZcOON
4,,,,,spotify:track:0wIfYaveiZku0eL44UXtHk
5,,,,,spotify:track:4Q36omJXoeMD8LnnoOXOu7
6,,,,,spotify:track:3DUhxfgyzNi0cCmZWDWfnS
7,,,,,spotify:track:3FRQ6ZBnsbiUEgr3fwyzHV
8,,,,,spotify:track:6YZdkObH88npeKrrkb8Ggf
9,,,,,spotify:track:1u8c2t2Cy7UBoG4ArRcF5g


In [8]:
import pandas as pd
import json
import spotipy
from spotipy.oauth2 import SpotifyClientCredentials

# JSON file path of user
json_file_path_wadthy = '/Users/khieuvon/Documents/10_Personal Stuff/01_Masterarbeit/Data for ML Model/Collected Spotify Data from Friends/All Extracted Library/00_YourLibrary_wadthy.json'

# load the JSON file
with open(json_file_path_wadthy, 'r', encoding='utf-8') as file:
    data = json.load(file)


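# If the loaded data is itself a JSON-encoded string, parse it once more
if isinstance(data, str):
    data = json.loads(data)

# Use the data directly if it is a list; otherwise take the list stored under 'tracks'
tracks = data if isinstance(data, list) else data.get('tracks', [])

# Create a list to store the extracted information
tracks_data = []

# Extract relevant information from each track.
# Note: this library export uses the keys 'track', 'artist' and 'album' (see the cell that worked),
# so the camelCase keys below are not found and fall back to ''; only 'uri' is filled in the output.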
for track in tracks:
    if isinstance(track, dict):
        tracks_data.append({
            'name': track.get('trackName', ''),
            'artist': track.get('artistName', ''),
            'album': track.get('albumName', ''),
            'added_at': track.get('addedDate', ''),
            'uri': track.get('uri', '')
        })
    elif isinstance(track, str):
        # If track is a string, it might be a JSON string
        try:
            track_dict = json.loads(track)
            tracks_data.append({
                'name': track_dict.get('trackName', ''),
                'artist': track_dict.get('artistName', ''),
                'album': track_dict.get('albumName', ''),
                'added_at': track_dict.get('addedDate', ''),
                'uri': track_dict.get('uri', '')
            })
        except json.JSONDecodeError:
            print(f"Skipping invalid track data: {track}")

# Create a pandas DataFrame
df = pd.DataFrame(tracks_data)

# Display the first few rows of the DataFrame
print(df.head(20))




   name artist album added_at                                   uri
0                              spotify:track:4hy4fb5D1KL50b3sng9cjw
1                              spotify:track:0JXXNGljqupsJaZsgSbMZV
2                              spotify:track:1YVVAiBD5WhX2ZdHtlSOhz
3                              spotify:track:4myeBw35GUMw5FyDGZcOON
4                              spotify:track:0wIfYaveiZku0eL44UXtHk
5                              spotify:track:4Q36omJXoeMD8LnnoOXOu7
6                              spotify:track:3DUhxfgyzNi0cCmZWDWfnS
7                              spotify:track:3FRQ6ZBnsbiUEgr3fwyzHV
8                              spotify:track:6YZdkObH88npeKrrkb8Ggf
9                              spotify:track:1u8c2t2Cy7UBoG4ArRcF5g
10                             spotify:track:3WU2blK7YBcugEEU9CATvF
11                             spotify:track:27tNWlhdAryQY04Gb2ZhUI
12                             spotify:track:1zPfk2vaGcqXMn22PD9DuT
13                             spotify:track:3oD

In [9]:
# `data` is a dict, so `data[:2]` raises "TypeError: unhashable type: 'slice'".
# Slice the list stored under 'tracks' instead to print the first two items.
print(json.dumps(data['tracks'][:2], indent=2))
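
Because the export is a dictionary at the top level, slicing it directly fails. A small sketch of how the structure could be summarised before choosing which keys to read; the helper name `summarise_structure` is hypothetical and the sample dictionary is a one-song stand-in for the real export.

```python
def summarise_structure(obj, max_depth=3, depth=0):
    """Return a nested description of key names, value types and container sizes."""
    if isinstance(obj, dict):
        if depth >= max_depth:
            return f"dict({len(obj)} keys)"
        return {key: summarise_structure(value, max_depth, depth + 1)
                for key, value in obj.items()}
    if isinstance(obj, list):
        if not obj or depth >= max_depth:
            return f"list({len(obj)} items)"
        # Describe only the first element; the rest are assumed to look alike.
        return [f"list({len(obj)} items), first:",
                summarise_structure(obj[0], max_depth, depth + 1)]
    return type(obj).__name__


sample = {
    "tracks": [
        {"artist": "Nirvana", "album": "Nirvana",
         "track": "Smells Like Teen Spirit",
         "uri": "spotify:track:4hy4fb5D1KL50b3sng9cjw"},
    ],
}
summary = summarise_structure(sample)
print(summary)
```

Printing the key names this way shows at a glance whether a field such as `track` holds a plain string or a nested object, which decides between `song.get('track')` and `song['track']['trackName']`.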

In [7]:
# Save the DataFrame to a CSV file (optional)
df.to_csv('liked_songs.csv', index=False)

In [None]:
# You would normalize the songs in each playlist like this:
playlists = pd.json_normalize(data, record_path=['playlists', 'items'],
                              meta=[['playlists', 'items']])

# Step 3: Create a DataFrame
# If 'data' is already in a format that can be directly converted to a DataFrame, you could skip step 2
df = pd.DataFrame(playlists)

# Rename columns if necessary
df.rename(columns={'playlists.name': 'Playlist Name'}, inplace=True)

# Display the DataFrame
print(df)



# Extract liked songs from the 'tracks' key
liked_songs = data.get('tracks', [])

# Create a list to store the extracted information
tracks_data = []

# Extract relevant information from each track
for track in liked_songs:
    tracks_data.append({
        'name': track.get('track', {}).get('trackName', ''),
        'artist': track.get('track', {}).get('artistName', ''),
        'album': track.get('track', {}).get('albumName', ''),
        'added_at': track.get('addedDate', ''),
        'uri': track.get('uri', '')
    })

# Create a DataFrame from the extracted track information
df = pd.DataFrame(tracks_data)

# Display the first few rows of the DataFrame
print(df.head())

In [None]:
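import os

# Spotify credentials: read them from environment variables instead of hard-coding secrets in the notebook
# (SPOTIPY_CLIENT_ID / SPOTIPY_CLIENT_SECRET are the variable names spotipy itself looks for)
client_id = os.environ['SPOTIPY_CLIENT_ID']
client_secret = os.environ['SPOTIPY_CLIENT_SECRET']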

# Authenticate with Spotify
client_credentials_manager = SpotifyClientCredentials(client_id=client_id, client_secret=client_secret)
sp = spotipy.Spotify(client_credentials_manager=client_credentials_manager)

# List of track IDs (replace these with the actual Spotify track IDs you're interested in)
track_ids = ['1KU5EHSz04JhGg3rReGJ0N']

# Initialize empty list to hold track details and audio features
tracks_data = []

# Retrieve track details and audio features for each track
for track_id in track_ids:
    track_details = sp.track(track_id)
    audio_features = sp.audio_features(track_id)[0]

    track_info = {
        'title': track_details['name'],
        'artist_name': track_details['artists'][0]['name'],
        'release_date': track_details['album']['release_date'],
        'genre': '',  # Spotify API does not provide genre at the track level, usually available at the artist level
        'duration_ms': track_details['duration_ms'],
        'danceability': audio_features['danceability'],
        'energy': audio_features['energy'],
        'key': audio_features['key'],
        'loudness': audio_features['loudness'],
        'mode': audio_features['mode'],
        'speechiness': audio_features['speechiness'],
        'acousticness': audio_features['acousticness'],
        'instrumentalness': audio_features['instrumentalness'],
        'liveness': audio_features['liveness'],
        'valence': audio_features['valence'],
        'tempo': audio_features['tempo'],
    }

    tracks_data.append(track_info)

# Convert list of dicts to pandas DataFrame
df_tracks = pd.DataFrame(tracks_data)

# Display the DataFrame
print(df_tracks)


In [2]:
df_tracks.head()


Unnamed: 0,title,artist_name,release_date,genre,duration_ms,danceability,energy,key,loudness,mode,speechiness,acousticness,instrumentalness,liveness,valence,tempo
0,Black Velvet,Alannah Myles,1989-03-14,,287440,0.754,0.366,8,-10.07,1,0.0312,0.273,9e-05,0.106,0.469,91.147
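
The loop above issues two requests per track, which becomes slow for libraries with thousands of songs. A hedged sketch of batching the audio-feature lookups, assuming the endpoint accepts up to 100 track IDs per call and is available to the app. `chunked`, `fetch_audio_features`, and `FakeSpotify` are hypothetical names; the fake client only stands in for `spotipy.Spotify` so the sketch runs without credentials, and the values it returns are placeholders, not real data.

```python
import pandas as pd


def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def fetch_audio_features(client, track_ids, batch_size=100):
    """Collect audio features for many tracks using one request per batch."""
    rows = []
    for batch in chunked(list(track_ids), batch_size):
        # The client is expected to return one dict (or None) per requested ID.
        for features in client.audio_features(batch):
            if features is not None:
                rows.append(features)
    return pd.DataFrame(rows)


class FakeSpotify:
    """Stand-in for spotipy.Spotify; returns placeholder values, not real data."""

    def __init__(self):
        self.calls = 0

    def audio_features(self, tracks):
        self.calls += 1
        return [{"id": track_id, "danceability": 0.5, "tempo": 120.0}
                for track_id in tracks]


client = FakeSpotify()
ids = [f"track{i}" for i in range(250)]
df_features = fetch_audio_features(client, ids)
print(len(df_features), "rows from", client.calls, "requests")
```

With a real client, `sp` from the cell above would be passed in place of `FakeSpotify()`. Skipping `None` entries keeps the DataFrame clean when an ID has no audio features.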


### Retrieving the Data from Spotify as JSON and Normalizing It into a Pandas DataFrame

In [4]:
import json

In [7]:
# Path to your JSON file
json_file_path = '/Users/khieuvon/Documents/10_Personal Stuff/01_Masterarbeit/Data for ML Model/Spotify Data/Spotify-Data_Sören/Playlist1.json'

# Step 1: Read the JSON file
with open(json_file_path, 'r') as file:
    data = json.load(file)

# 'data' is assumed to contain a list of playlists, each with a list of items (songs).
# Adjust `record_path` below if your JSON is structured differently.

# Step 2: Normalize the data (if necessary)
# For example, if your data structure is:
# {
#    "playlists": [
#        {
#            "name": "Playlist 1",
#            "lastModifiedDate": "2022-09-17",
#            "items": [
#                {
#                    "track": {
#                        "trackName": "Gold",
#                        "artistName": "Chet Faker",
#                        "albumName": "Built on Glass",
#                        "trackUri": "spotify:track:1Ll09EiN5ffeFl1xNZB2Uk"
#                    },
#                    "episode": null,
#                    "localTrack": null,
#                    "addedDate": "2022-08-22"
#                },
#                ...
#            ]
#        },
#        ...
#    ]
# }
# You would normalize the songs in each playlist like this:
playlists = pd.json_normalize(data, record_path=['playlists', 'items'],
                              meta=[['playlists', 'items']])

# Step 3: Create a DataFrame
# If 'data' is already in a format that can be directly converted to a DataFrame, you could skip step 2
df = pd.DataFrame(playlists)

# Rename columns if necessary
df.rename(columns={'playlists.name': 'Playlist Name'}, inplace=True)

# Display the DataFrame
print(df)


     episode localTrack   addedDate             track.trackName  \
0       None       None  2022-08-22                        Gold   
1       None       None  2022-08-22                   Sunflower   
2       None       None  2022-08-22      Trampoline (with ZAYN)   
3       None       None  2022-08-22                         You   
4       None       None  2022-08-22                 Imagination   
...      ...        ...         ...                         ...   
1200    None       None  2022-10-06                        Lies   
1201    None       None  2022-10-18   CHANT (feat. Tones And I)   
1202    None       None  2022-10-18              3 Tage am Meer   
1203    None       None  2022-10-20                   telepatía   
1204    None       None  2022-10-20  Mind Over Matter (Reprise)   

       track.artistName                          track.albumName  \
0            Chet Faker                           Built on Glass   
1     Rex Orange County                                Sunf

In [8]:
df.head(20)

Unnamed: 0,episode,localTrack,addedDate,track.trackName,track.artistName,track.albumName,track.trackUri,playlists.items
0,,,2022-08-22,Gold,Chet Faker,Built on Glass,spotify:track:1Ll09EiN5ffeFl1xNZB2Uk,"[{'track': {'trackName': 'Gold', 'artistName':..."
1,,,2022-08-22,Sunflower,Rex Orange County,Sunflower,spotify:track:4EpZ4eYuZOwPSSwyqpdHnJ,"[{'track': {'trackName': 'Gold', 'artistName':..."
2,,,2022-08-22,Trampoline (with ZAYN),SHAED,Melt (Deluxe),spotify:track:2ez6qvOTHKeI3ss80NGqnI,"[{'track': {'trackName': 'Gold', 'artistName':..."
3,,,2022-08-22,You,Regard,You,spotify:track:2cc8Sw1OnCuA5bV8nqWqpE,"[{'track': {'trackName': 'Gold', 'artistName':..."
4,,,2022-08-22,Imagination,Gorgon City,Sirens,spotify:track:3ZrWmt3DGH75hItHp6uWLz,"[{'track': {'trackName': 'Gold', 'artistName':..."
5,,,2022-08-22,I'm Into You,Chet Faker,Thinking In Textures,spotify:track:2mbu1ssfb7h1RNO5jBv4cW,"[{'track': {'trackName': 'Gold', 'artistName':..."
6,,,2022-08-27,Black Velvet,Alannah Myles,Alannah Myles,spotify:track:1KU5EHSz04JhGg3rReGJ0N,"[{'track': {'trackName': 'Gold', 'artistName':..."
7,,,2022-09-12,Powerslide,Ryan Beatty,Boy in Jeans,spotify:track:5x1Ctt9JCZNU5UhJWbHoQp,"[{'track': {'trackName': 'Gold', 'artistName':..."
8,,,2022-09-17,Fade Away,Susanne Sundfør,Ten Love Songs,spotify:track:0IWBaEf7GOwKPKHyC32E1z,"[{'track': {'trackName': 'Gold', 'artistName':..."
9,,,2022-07-24,Shout (Tears for Fears) - Remix,Alex Graham,Shout (Tears for Fears),spotify:track:5FFiJ0zzCJmzz0HDrPCdh7,[{'track': {'trackName': 'Shout (Tears for Fea...
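
The rename to `Playlist Name` has no effect in the cell above, because `meta=[['playlists', 'items']]` produces a `playlists.items` column (a copy of the whole item list on every row) rather than `playlists.name`. A minimal sketch of how the playlist name could be carried onto each song row instead, assuming every playlist object has a `name` field as in the commented structure. The sample is a reduced stand-in built from two rows shown above, and the playlist names are placeholders.

```python
import pandas as pd

sample = {
    "playlists": [
        {"name": "Playlist 1", "items": [
            {"track": {"trackName": "Gold",
                       "artistName": "Chet Faker",
                       "albumName": "Built on Glass",
                       "trackUri": "spotify:track:1Ll09EiN5ffeFl1xNZB2Uk"},
             "episode": None, "localTrack": None, "addedDate": "2022-08-22"},
        ]},
        {"name": "Playlist 2", "items": [
            {"track": {"trackName": "Shout (Tears for Fears) - Remix",
                       "artistName": "Alex Graham",
                       "albumName": "Shout (Tears for Fears)",
                       "trackUri": "spotify:track:5FFiJ0zzCJmzz0HDrPCdh7"},
             "episode": None, "localTrack": None, "addedDate": "2022-07-24"},
        ]},
    ]
}

# One row per playlist item; the playlist's name is repeated on each of its rows.
songs = pd.json_normalize(
    sample,
    record_path=["playlists", "items"],
    meta=[["playlists", "name"]],
)
songs = songs.rename(columns={"playlists.name": "Playlist Name"})
print(songs[["Playlist Name", "track.trackName", "track.artistName"]])
```

Keeping only the name as metadata also avoids storing the full item list in every row, which keeps the DataFrame small enough to inspect and save.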


In [15]:
# Step 2: extract the track ID from the `track.trackUri` column by dropping the 14-character prefix "spotify:track:"
# df_adjusted_trackUri = df

df['trackUri_ID'] = df["track.trackUri"].str[14:]

In [16]:
df.head(20)

Unnamed: 0,episode,localTrack,addedDate,track.trackName,track.artistName,track.albumName,track.trackUri,playlists.items,trackUri_ID
0,,,2022-08-22,Gold,Chet Faker,Built on Glass,spotify:track:1Ll09EiN5ffeFl1xNZB2Uk,"[{'track': {'trackName': 'Gold', 'artistName':...",1Ll09EiN5ffeFl1xNZB2Uk
1,,,2022-08-22,Sunflower,Rex Orange County,Sunflower,spotify:track:4EpZ4eYuZOwPSSwyqpdHnJ,"[{'track': {'trackName': 'Gold', 'artistName':...",4EpZ4eYuZOwPSSwyqpdHnJ
2,,,2022-08-22,Trampoline (with ZAYN),SHAED,Melt (Deluxe),spotify:track:2ez6qvOTHKeI3ss80NGqnI,"[{'track': {'trackName': 'Gold', 'artistName':...",2ez6qvOTHKeI3ss80NGqnI
3,,,2022-08-22,You,Regard,You,spotify:track:2cc8Sw1OnCuA5bV8nqWqpE,"[{'track': {'trackName': 'Gold', 'artistName':...",2cc8Sw1OnCuA5bV8nqWqpE
4,,,2022-08-22,Imagination,Gorgon City,Sirens,spotify:track:3ZrWmt3DGH75hItHp6uWLz,"[{'track': {'trackName': 'Gold', 'artistName':...",3ZrWmt3DGH75hItHp6uWLz
5,,,2022-08-22,I'm Into You,Chet Faker,Thinking In Textures,spotify:track:2mbu1ssfb7h1RNO5jBv4cW,"[{'track': {'trackName': 'Gold', 'artistName':...",2mbu1ssfb7h1RNO5jBv4cW
6,,,2022-08-27,Black Velvet,Alannah Myles,Alannah Myles,spotify:track:1KU5EHSz04JhGg3rReGJ0N,"[{'track': {'trackName': 'Gold', 'artistName':...",1KU5EHSz04JhGg3rReGJ0N
7,,,2022-09-12,Powerslide,Ryan Beatty,Boy in Jeans,spotify:track:5x1Ctt9JCZNU5UhJWbHoQp,"[{'track': {'trackName': 'Gold', 'artistName':...",5x1Ctt9JCZNU5UhJWbHoQp
8,,,2022-09-17,Fade Away,Susanne Sundfør,Ten Love Songs,spotify:track:0IWBaEf7GOwKPKHyC32E1z,"[{'track': {'trackName': 'Gold', 'artistName':...",0IWBaEf7GOwKPKHyC32E1z
9,,,2022-07-24,Shout (Tears for Fears) - Remix,Alex Graham,Shout (Tears for Fears),spotify:track:5FFiJ0zzCJmzz0HDrPCdh7,[{'track': {'trackName': 'Shout (Tears for Fea...,5FFiJ0zzCJmzz0HDrPCdh7
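
Slicing at a fixed offset works as long as every value starts with `spotify:track:`. A hedged alternative checks the prefix explicitly and leaves anything else as missing, for example rows without a track URI. The regular expression assumes track IDs are 22 alphanumeric characters, which matches the URIs shown above; the sample frame is a small stand-in, and the episode URI in it is made up.

```python
import pandas as pd

sample = pd.DataFrame({
    "track.trackUri": [
        "spotify:track:1Ll09EiN5ffeFl1xNZB2Uk",
        "spotify:track:4EpZ4eYuZOwPSSwyqpdHnJ",
        None,                    # e.g. an episode or local file without a track URI
        "spotify:episode:abc",   # made-up non-track URI for illustration
    ]
})

# Capture the ID only when the value has the expected track-URI shape.
pattern = r"^spotify:track:([A-Za-z0-9]{22})$"
sample["trackUri_ID"] = sample["track.trackUri"].str.extract(pattern, expand=False)

# Keep only the rows for which an ID could be extracted.
tracks_only = sample.dropna(subset=["trackUri_ID"])
print(tracks_only)
```

Rows that fail the pattern end up as `NaN` in `trackUri_ID`, so they can be dropped before the IDs are sent to the Spotify API.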
