<div style="text-align: center; background-color: #750E21; font-family: 'Trebuchet MS', Arial, sans-serif; color: white; padding: 20px; font-size: 40px; font-weight: bold; border-radius: 0 0 0 0; box-shadow: 0px 6px 8px rgba(0, 0, 0, 0.2);">
  FINAL PROJECT: RESEARCHING MUSIC TASTE WORLDWIDE 📌
</div>

<div style="text-align: center; background-color: #0766AD; font-family: 'Trebuchet MS', Arial, sans-serif; color: white; padding: 20px; font-size: 40px; font-weight: bold; border-radius: 0 0 0 0; box-shadow: 0px 6px 8px rgba(0, 0, 0, 0.2);">
  Stage 01 - Data collection 📌
</div>

## **IMPORT LIBRARIES** 🎄

In [4]:
import requests 
import pandas as pd
import numpy as np
import re
from bs4 import BeautifulSoup
from googleapiclient.discovery import build
import isodate
from datetime import datetime
import threading
import time
from concurrent.futures import ThreadPoolExecutor, as_completed

import spotipy
from spotipy.oauth2 import SpotifyOAuth


<div style="text-align: left; font-family: 'Trebuchet MS', Arial, sans-serif; color: #FF90BC; padding: 20px; font-size: 30px; font-weight: bold; border-radius: 0 0 0 0">
  STEP 1: Get the top music videos on YouTube from Kworb.net statistics 🔥
</div>

In [2]:
soup = BeautifulSoup(requests.get("https://kworb.net/youtube/topvideos.html").content, "html.parser")

music_data = []
for rank,tr in enumerate(soup.find_all("tr")[1:]):
    tds = tr.find_all("td")
    
    music_data.append({
        'Ranking': rank + 1,
        'Video Url': tds[0].a['href'],
        'Title': tds[0].text,
        'Views': tds[1].text,
        'Yesterday Views': tds[2].text,
    })

music_data = pd.DataFrame(music_data).set_index('Ranking')
music_data

Ranking,Video Url,Title,Views,Yesterday Views
1,video/kJQP7kiw5Fk.html,Luis Fonsi - Despacito ft. Daddy Yankee,8326595309,688561
2,video/JGwWNGJdvx8.html,Ed Sheeran - Shape of You (Official Music Video),6148215569,693362
3,video/RgKAFK5djSk.html,Wiz Khalifa - See You Again ft. Charlie Puth [...,6107590085,954835
4,video/OPf0YbXqDm0.html,Mark Ronson - Uptown Funk (Official Video) ft....,5097482695,723078
5,video/9bZkp7q19f0.html,PSY - GANGNAM STYLE(강남스타일) M/V,4975484655,987234
...,...,...,...,...
2496,video/HC172grgTwU.html,Same Time Same Jagah (Chaar Din) ● Sandeep Bra...,325054072,89076
2497,video/cAMHx-m9oh8.html,Kya Loge Tum | Akshay Kumar | Amyra Dastur | B...,324747138,324683
2498,video/Fd7lYEtevxQ.html,Xúc Xắc Xúc Xẻ - Bé Bảo An ft Phi Long,324464042,155994
2499,video/Fp8msa5uYsc.html,Justin Bieber - Ghost,324372318,256700


In [3]:
music_video_id = []
for url in music_data['Video Url']:
    # Escape the dot so that only a literal ".html" suffix is matched
    music_video_id.append(re.findall(r'video/(.*)\.html', url)[0])

def generate_video_url(video_id):
    url_arr = []
    for video in video_id:
        url_arr.append(f'https://www.youtube.com/watch?v={video}')
    return url_arr

def save_to_txt(url_arr, file_name):
    with open('../data/' + file_name, 'w') as f:
        for url in url_arr:
            f.write(url + '\n')
    print('Save to txt file successfully!')

youtube_video_url = generate_video_url(music_video_id)
save_to_txt(youtube_video_url, 'youtube_video_url.txt')

# Build the full Kworb URL of each video from the 'Video Url' column
kworb_video_url = music_data['Video Url'].to_numpy()
kworb_video_url = ['https://kworb.net/youtube/' + url for url in kworb_video_url]

save_to_txt(kworb_video_url, 'kworb_video_url.txt')

Save to txt file successfully!
Save to txt file successfully!


<div style="text-align: left; font-family: 'Trebuchet MS', Arial, sans-serif; color: #FF90BC; padding: 20px; font-size: 30px; font-weight: bold; border-radius: 0 0 0 0">
  STEP 2: Crawl data from YouTube using an API key
</div>

- To crawl data from YouTube through its API, we first need to create an API key in the Google Cloud Console. We have already done this.

In [16]:
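import os

# Read the key from the environment instead of hardcoding it in the notebook
# (the variable name YOUTUBE_API_KEY is an assumption).
api_key = os.environ['YOUTUBE_API_KEY']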

- Since we need the number of subscribers of each channel, we create a function that uses the API key and a channel ID to retrieve this information.

In [5]:
def get_channel_info_youtube(api_key, channel_id):
    youtube = build('youtube', 'v3', developerKey=api_key)

    try:
        response = youtube.channels().list(
            part='snippet, contentDetails, statistics',
            id=channel_id
        ).execute()

        channel_info = response['items'][0]

        # Extract relevant information
        channel_name = channel_info['snippet']['title']
        # The subscriber count may be absent when a channel hides it
        subscriber_count = channel_info['statistics'].get('subscriberCount', np.nan)
        country = channel_info['snippet'].get("country", "")

        return {
            'channel_name': channel_name,
            'subscriber_count': subscriber_count,
            'country': country
        }

    except Exception as e:
        print(f'An error occurred: {e}')
        return None
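
Many of the top videos share a channel, and `channels().list` accepts up to 50 comma-separated IDs in its `id` parameter, so the lookups could be batched to save API quota. The following is a minimal sketch, not the method used in this notebook: `chunked` and `get_channels_batched` are hypothetical helpers, and `youtube` is assumed to be a client already created with `build('youtube', 'v3', developerKey=api_key)`.

```python
def chunked(items, size=50):
    """Yield successive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def get_channels_batched(youtube, channel_ids):
    """Return {channel_id: info} using one request per 50 unique channel IDs.

    `youtube` is assumed to be a client built with googleapiclient's build().
    """
    unique_ids = list(dict.fromkeys(channel_ids))  # de-duplicate, keep order
    channels = {}
    for batch in chunked(unique_ids):
        response = youtube.channels().list(
            part='snippet,statistics',
            id=','.join(batch)
        ).execute()
        for item in response.get('items', []):
            channels[item['id']] = {
                'channel_name': item['snippet']['title'],
                # None when the channel does not expose its subscriber count
                'subscriber_count': item['statistics'].get('subscriberCount'),
                'country': item['snippet'].get('country', ''),
            }
    return channels
```

With such a helper, the per-video function would only need to look up `channels[channel_id]` instead of issuing one request per video.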

- Next, we crawl further information from YouTube: `view`, `like`, `duration`, `channel name`, `subscriber`, `publish time`, and `hashtag`.
- Some videos have been removed from YouTube, so we check whether the `items` field of the response is empty; if it is, we set every value to `NaN`.
- In addition, some videos do not expose their `like` count; when it is missing, we also set it to `NaN`.
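
The missing-value rules above can be illustrated on plain dictionaries shaped like a `videos().list` response. This is a minimal sketch under assumptions: `extract_video_fields` is a hypothetical helper, the sample responses are made up for illustration, and hashtags are counted with a regular expression rather than by counting `#` characters.

```python
import math
import re
from datetime import datetime


def extract_video_fields(response):
    """Return view, like, publish time and hashtag count, using NaN for missing values."""
    fields = {'view': math.nan, 'like': math.nan,
              'publish_time': math.nan, 'hashtag': math.nan}
    items = response.get('items', [])
    if not items:
        # Removed video: every attribute stays NaN
        return fields

    video = items[0]
    stats = video['statistics']
    snippet = video['snippet']
    fields['view'] = int(stats['viewCount'])
    if 'likeCount' in stats:
        # The key is absent when the like count is not exposed
        fields['like'] = int(stats['likeCount'])
    fields['publish_time'] = datetime.strptime(snippet['publishedAt'], '%Y-%m-%dT%H:%M:%SZ')
    # Count tokens such as "#pop" instead of every '#' character
    fields['hashtag'] = len(re.findall(r'#\w+', snippet.get('description', '')))
    return fields


# Made-up sample responses: a removed video and a video without a like count
removed = extract_video_fields({'items': []})
hidden_likes = extract_video_fields({'items': [{
    'statistics': {'viewCount': '100'},
    'snippet': {'publishedAt': '2021-10-08T04:00:10Z',
                'description': 'New single #pop #ghost (track # 3)'},
}]})
```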

In [13]:
# Lock guarding the shared result lists: each video's values are appended
# together, so the lists stay row-aligned when several threads finish at once.
append_lock = threading.Lock()

def get_video_info_youtube(api_key, video_id, view_list, like_list, duration_list, channel_name_list, subscriber_list, 
                   publish_time_list, hashtag_list, video_id_list, country_list):
    youtube = build('youtube', 'v3', developerKey=api_key)
    
    response = youtube.videos().list(
        part='snippet,contentDetails,statistics',
        id=video_id
    ).execute()

    # Default every attribute to NaN; a removed video keeps these values.
    views = likes = duration = channel_name = np.nan
    subscribers = country = publish_time = hashtag_count = np.nan

    if response['items']:
        video_info = response['items'][0]

        # Extract relevant information
        views = video_info['statistics']['viewCount']
        # Some videos do not expose their like count
        likes = video_info['statistics'].get('likeCount', np.nan)
        channel_id = video_info['snippet']['channelId']

        # Count the '#' characters in the description as the number of hashtags
        hashtag_count = video_info['snippet']['description'].count('#')

        # Get published time
        publish_time = datetime.strptime(video_info['snippet']['publishedAt'], '%Y-%m-%dT%H:%M:%SZ')

        # Convert ISO 8601 duration to a human-readable string
        duration = str(isodate.parse_duration(video_info['contentDetails']['duration']))

        # Extract channel name and subscribers (the lookup returns None on failure)
        channel_data = get_channel_info_youtube(api_key, channel_id)
        if channel_data is not None:
            channel_name = channel_data['channel_name']
            subscribers = channel_data['subscriber_count']
            country = channel_data['country']

    with append_lock:
        video_id_list.append(video_id)
        view_list.append(views)
        like_list.append(likes)
        duration_list.append(duration)
        channel_name_list.append(channel_name)
        subscriber_list.append(subscribers)
        country_list.append(country)
        publish_time_list.append(publish_time)
        hashtag_list.append(hashtag_count)

In [14]:
def collect_data_youtube(music_video_id, api_key):
    # Init empty list to store the values of each attribute.
    view_list = []
    like_list = []
    duration_list = []
    channel_name_list = []
    subscriber_list = []
    country_list = []
    publish_time_list = []
    hashtag_list = []
    video_id_list = []
    
    threads = []
    for video_id in music_video_id:
        # Checking whether video_id is blank or not
        if (video_id == ''): 
            continue
        
        # Create thread
        while (threading.active_count() > 20):
            time.sleep(0.1)
        
        thread = threading.Thread(target=get_video_info_youtube, args=(api_key, video_id, view_list, like_list, duration_list, 
                                                               channel_name_list, subscriber_list, publish_time_list, 
                                                               hashtag_list, video_id_list, country_list))
        threads.append(thread)
        thread.start()
        
    for thread in threads:
        thread.join()
        
    data = pd.DataFrame({'Id': video_id_list,
                         'View': view_list,
                         'Like': like_list,
                         'Duration': duration_list,
                         'Channel_name': channel_name_list,
                         'Subscriber': subscriber_list,
                         'Country': country_list,
                         'Publish_time': publish_time_list,
                         'Hashtag': hashtag_list})
    
    return data
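
Because every worker appends to several shared lists, rows stay aligned only if each video's values are added together. One alternative, sketched below under assumptions, is to have each worker return a complete row and let `ThreadPoolExecutor.map` preserve the input order. `fetch_row` is a hypothetical stand-in for the real API call, and `collect_rows` is not part of this notebook.

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_row(video_id):
    """Hypothetical stand-in for the API call: returns one complete row per video."""
    return {'Id': video_id, 'View': len(video_id)}


def collect_rows(video_ids, fetch=fetch_row, max_workers=20):
    """Fetch rows concurrently; map() yields results in the order of the inputs."""
    ids = [video_id for video_id in video_ids if video_id != '']
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(fetch, ids))


rows = collect_rows(['kJQP7kiw5Fk', '', 'JGwWNGJdvx8'])
```

Passing `rows` to `pd.DataFrame` would then produce one aligned row per video without any shared mutable state.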

In [7]:
# Lock guarding the shared result lists so each page's values are appended together
web_append_lock = threading.Lock()

def get_video_info_web(url, Title, Most_view_in4_1, Most_view_in4_2, Rank_in4_1, Rank_in4_2, Rank_in4_3, Video_id):
    html_content = requests.get(url)
    soup = BeautifulSoup(html_content.content, "lxml")

    video_title = soup.title.text.split(" – ")[0].split("YouTube Stats of ")[1]

    # "Most views in a day: <count> (<date>)" -> count as int, date without parentheses
    most_view_in4 = soup.text.split("Most views in a day: ")[1].split("\n")[0].split(" ")
    most_view_in4[0] = int(most_view_in4[0].replace(',', ''))
    most_view_in4[1] = most_view_in4[1][1:-1]

    # The "Peaked at #" text is absent for videos that never charted
    rank_in4 = soup.text.split("Peaked at #")
    if len(rank_in4) == 1:
        rank_in4 = [np.nan, np.nan, np.nan]
    else:
        tokens = rank_in4[1].split("\n")[0].split(" ")
        if len(tokens) <= 6:
            rank_in4 = [tokens[0], 'nan', tokens[4]]
        else:
            rank_in4 = [tokens[0], tokens[2], tokens[7]]
    rank_in4 = [float(value) for value in rank_in4]

    # The video ID sits between "https://kworb.net/youtube/video/" and ".html"
    video_id = url[32:-5]

    # Append all values of this page together so the lists stay row-aligned
    with web_append_lock:
        Title.append(video_title)
        Most_view_in4_1.append(most_view_in4[0])
        Most_view_in4_2.append(most_view_in4[1])
        Rank_in4_1.append(rank_in4[0])
        Rank_in4_2.append(rank_in4[1])
        Rank_in4_3.append(rank_in4[2])
        Video_id.append(video_id)
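
The string splitting above depends on exact token positions. As a hedged alternative, a regular expression can capture the same two "most views" values. The sample line is an assumption inferred from the parsing code (a count with thousands separators followed by a date in parentheses), and `parse_most_views` is a hypothetical helper.

```python
import re

MOST_VIEWS_PATTERN = re.compile(r'Most views in a day: ([\d,]+) \(([^)]+)\)')


def parse_most_views(page_text):
    """Return (views, date) from the page text, or (None, None) if absent."""
    match = MOST_VIEWS_PATTERN.search(page_text)
    if match is None:
        return None, None
    return int(match.group(1).replace(',', '')), match.group(2)


# Assumed sample text, modeled on what the splitting code expects
sample = "Most views in a day: 25,794,523 (2017/08/05)\n"
views, date = parse_most_views(sample)
```

Returning `(None, None)` instead of raising lets the caller decide how to record a page that lacks this line.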

In [8]:
def collect_data_web(course_urls_file):
    # Load the URLs from the file, one per line, and close the file afterwards
    with open(course_urls_file) as url_file:
        urls_filtered = [line.strip() for line in url_file if line.strip()]
    
    #init empty list to store the values of each attribute.
    Title = []
    Most_view_in4_1 = []
    Most_view_in4_2 = []
    Rank_in4_1 = []
    Rank_in4_2 = []
    Rank_in4_3 = []
    Video_id = []

    num_threads = 4

    with ThreadPoolExecutor(max_workers=num_threads) as executor:
        # Submit tasks to the thread pool
        futures = [executor.submit(get_video_info_web, url, Title, Most_view_in4_1, 
                                   Most_view_in4_2, Rank_in4_1, Rank_in4_2, Rank_in4_3, 
                                   Video_id) for url in urls_filtered]

        # Wait for all tasks to complete
        for future in futures:
            future.result()
    
    data = pd.DataFrame({"Title": Title,
                         "Most view per day": Most_view_in4_1,
                         "Most-view-date": Most_view_in4_2, 
                         "Highest rank": Rank_in4_1,
                         "Time to Highest rank": Rank_in4_2,
                         "Charted-duration": Rank_in4_3, 
                         "Id": Video_id})

    return data

In [9]:
kworb_df = collect_data_web('../data/kworb_video_url.txt')
kworb_df

,Title,Most view per day,Most-view-date,Highest rank,Time to Highest rank,Charted-duration,Id
0,Mark Ronson - Uptown Funk (Official Video) ft....,6365428,2015/03/21,1.0,7.0,462.0,OPf0YbXqDm0
1,Luis Fonsi - Despacito ft. Daddy Yankee,25794523,2017/08/05,1.0,35.0,359.0,kJQP7kiw5Fk
2,Wiz Khalifa - See You Again ft. Charlie Puth [...,8818084,2015/05/23,1.0,17.0,449.0,RgKAFK5djSk
3,Ed Sheeran - Shape of You (Official Music Video),14390704,2017/05/13,1.0,4.0,356.0,JGwWNGJdvx8
4,Maroon 5 - Sugar (Official Music Video),10684581,2015/01/16,1.0,1.0,304.0,09R8_2nJtjg
...,...,...,...,...,...,...,...
2495,Same Time Same Jagah (Chaar Din) ● Sandeep Bra...,216961,2021/02/15,,,,HC172grgTwU
2496,Justin Bieber - Ghost,3261913,2021/10/08,14.0,,10.0,Fp8msa5uYsc
2497,Xúc Xắc Xúc Xẻ - Bé Bảo An ft Phi Long,842423,2020/05/01,,,,Fd7lYEtevxQ
2498,Kya Loge Tum | Akshay Kumar | Amyra Dastur | B...,15118533,2023/05/19,1.0,1.0,17.0,cAMHx-m9oh8


In [17]:
youtube_df = collect_data_youtube(music_video_id, api_key)
youtube_df

,Id,View,Like,Duration,Channel_name,Subscriber,Country,Publish_time,Hashtag
0,9bZkp7q19f0,4977100201,27830052,0:04:13,officialpsy,18400000,,2012-07-15 07:46:32,4.0
1,hT_nvWreIhg,3927343823,17491157,0:04:44,OneRepublicVEVO,5470000,,2013-05-31 07:00:36,2.0
2,JGwWNGJdvx8,6149237372,32323841,0:04:24,Ed Sheeran,53900000,,2017-01-30 10:57:50,3.0
3,lp-EO5I60KA,3700047693,14912926,0:04:57,Ed Sheeran,53900000,,2014-10-07 13:57:37,3.0
4,CevxZvSJLk8,3918654705,16596026,0:04:30,KatyPerryVEVO,24600000,US,2013-09-05 20:00:22,0.0
...,...,...,...,...,...,...,...,...,...
2495,HC172grgTwU,325090545,1693226,0:05:17,Lokdhun Punjabi,13300000,IN,2016-01-18 03:30:00,8.0
2496,Fd7lYEtevxQ,324741605,867653,0:03:07,Ruby Bảo An,1640000,,2011-01-31 13:52:25,8.0
2497,Fp8msa5uYsc,324440961,3355081,0:03:33,JustinBieberVEVO,31500000,US,2021-10-08 04:00:10,3.0
2498,6EGg0_l-edc,324572283,1617971,0:03:11,Henrique e Juliano,15800000,BR,2021-09-03 15:00:14,0.0


In [20]:
kworb_df.to_csv('../data/raw_kworb_data.csv', index=False)

In [21]:
youtube_df.to_csv('../data/raw_youtube_data.csv', index=False)

In [22]:
youtube_kworb_df = pd.merge(youtube_df, kworb_df, on='Id')
youtube_kworb_df

,Id,View,Like,Duration,Channel_name,Subscriber,Country,Publish_time,Hashtag,Title,Most view per day,Most-view-date,Highest rank,Time to Highest rank,Charted-duration
0,9bZkp7q19f0,4977100201,27830052,0:04:13,officialpsy,18400000,,2012-07-15 07:46:32,4.0,PSY - GANGNAM STYLE(강남스타일) M/V,14924298,2012/12/21,1.0,36.0,482.0
1,hT_nvWreIhg,3927343823,17491157,0:04:44,OneRepublicVEVO,5470000,,2013-05-31 07:00:36,2.0,OneRepublic - Counting Stars,3288973,2018/11/10,4.0,,482.0
2,JGwWNGJdvx8,6149237372,32323841,0:04:24,Ed Sheeran,53900000,,2017-01-30 10:57:50,3.0,Ed Sheeran - Shape of You (Official Music Video),14390704,2017/05/13,1.0,4.0,356.0
3,lp-EO5I60KA,3700047693,14912926,0:04:57,Ed Sheeran,53900000,,2014-10-07 13:57:37,3.0,Ed Sheeran - Thinking Out Loud (Official Music...,3771622,2015/02/14,3.0,1.0,304.0
4,CevxZvSJLk8,3918654705,16596026,0:04:30,KatyPerryVEVO,24600000,US,2013-09-05 20:00:22,0.0,Katy Perry - Roar,11294380,2013/09/07,2.0,5.0,445.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2495,HC172grgTwU,325090545,1693226,0:05:17,Lokdhun Punjabi,13300000,IN,2016-01-18 03:30:00,8.0,Same Time Same Jagah (Chaar Din) ● Sandeep Bra...,216961,2021/02/15,,,
2496,Fd7lYEtevxQ,324741605,867653,0:03:07,Ruby Bảo An,1640000,,2011-01-31 13:52:25,8.0,Xúc Xắc Xúc Xẻ - Bé Bảo An ft Phi Long,842423,2020/05/01,,,
2497,Fp8msa5uYsc,324440961,3355081,0:03:33,JustinBieberVEVO,31500000,US,2021-10-08 04:00:10,3.0,Justin Bieber - Ghost,3261913,2021/10/08,14.0,,10.0
2498,6EGg0_l-edc,324572283,1617971,0:03:11,Henrique e Juliano,15800000,BR,2021-09-03 15:00:14,0.0,Henrique e Juliano - A MAIOR SAUDADE - DVD Ma...,2562984,2022/07/11,56.0,,29.0


In [23]:
youtube_kworb_df.to_csv('../data/raw_youtube_kworb_data.csv', index=False)

<div style="text-align: left; font-family: 'Trebuchet MS', Arial, sans-serif; color: #FF90BC; padding: 20px; font-size: 30px; font-weight: bold; border-radius: 0 0 0 0">
  STEP 4: Crawl data with the Spotify API
</div>

In [2]:
import os

# Read the credentials from the environment instead of hardcoding them in the notebook
SPOTIPY_CLIENT_ID = os.environ['SPOTIPY_CLIENT_ID']
SPOTIPY_CLIENT_SECRET = os.environ['SPOTIPY_CLIENT_SECRET']
SPOTIPY_REDIRECT_URI = 'http://localhost:8888/callback'

In [5]:
sp = spotipy.Spotify(auth_manager=SpotifyOAuth(client_id=SPOTIPY_CLIENT_ID,
                                                  client_secret=SPOTIPY_CLIENT_SECRET,
                                                  redirect_uri=SPOTIPY_REDIRECT_URI,
                                                  scope="user-library-read"))


In [6]:
data_kworb = pd.read_csv('../data/raw_youtube_kworb_data.csv')
data_kworb

,Id,View,Like,Duration,Channel_name,Subscriber,Country,Publish_time,Hashtag,Title,Most view per day,Most-view-date,Highest rank,Time to Highest rank,Charted-duration
0,9bZkp7q19f0,4.977100e+09,27830052.0,0:04:13,officialpsy,18400000.0,,2012-07-15 07:46:32,4.0,PSY - GANGNAM STYLE(강남스타일) M/V,14924298,2012/12/21,1.0,36.0,482.0
1,hT_nvWreIhg,3.927344e+09,17491157.0,0:04:44,OneRepublicVEVO,5470000.0,,2013-05-31 07:00:36,2.0,OneRepublic - Counting Stars,3288973,2018/11/10,4.0,,482.0
2,JGwWNGJdvx8,6.149237e+09,32323841.0,0:04:24,Ed Sheeran,53900000.0,,2017-01-30 10:57:50,3.0,Ed Sheeran - Shape of You (Official Music Video),14390704,2017/05/13,1.0,4.0,356.0
3,lp-EO5I60KA,3.700048e+09,14912926.0,0:04:57,Ed Sheeran,53900000.0,,2014-10-07 13:57:37,3.0,Ed Sheeran - Thinking Out Loud (Official Music...,3771622,2015/02/14,3.0,1.0,304.0
4,CevxZvSJLk8,3.918655e+09,16596026.0,0:04:30,KatyPerryVEVO,24600000.0,US,2013-09-05 20:00:22,0.0,Katy Perry - Roar,11294380,2013/09/07,2.0,5.0,445.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2495,HC172grgTwU,3.250905e+08,1693226.0,0:05:17,Lokdhun Punjabi,13300000.0,IN,2016-01-18 03:30:00,8.0,Same Time Same Jagah (Chaar Din) ● Sandeep Bra...,216961,2021/02/15,,,
2496,Fd7lYEtevxQ,3.247416e+08,867653.0,0:03:07,Ruby Bảo An,1640000.0,,2011-01-31 13:52:25,8.0,Xúc Xắc Xúc Xẻ - Bé Bảo An ft Phi Long,842423,2020/05/01,,,
2497,Fp8msa5uYsc,3.244410e+08,3355081.0,0:03:33,JustinBieberVEVO,31500000.0,US,2021-10-08 04:00:10,3.0,Justin Bieber - Ghost,3261913,2021/10/08,14.0,,10.0
2498,6EGg0_l-edc,3.245723e+08,1617971.0,0:03:11,Henrique e Juliano,15800000.0,BR,2021-09-03 15:00:14,0.0,Henrique e Juliano - A MAIOR SAUDADE - DVD Ma...,2562984,2022/07/11,56.0,,29.0


In [7]:
# Show each title in full instead of truncating it
pd.set_option('display.max_colwidth', None)
data_kworb['Title']

0                                                              PSY - GANGNAM STYLE(강남스타일) M/V
1                                                                OneRepublic - Counting Stars
2                                            Ed Sheeran - Shape of You (Official Music Video)
3                                       Ed Sheeran - Thinking Out Loud (Official Music Video)
4                                                                           Katy Perry - Roar
                                                ...                                          
2495    Same Time Same Jagah (Chaar Din) ● Sandeep Brar ● Kulwinder Billa ● New Punjabi Songs
2496                                                   Xúc Xắc Xúc Xẻ - Bé Bảo An ft Phi Long
2497                                                                    Justin Bieber - Ghost
2498                            Henrique e Juliano -  A MAIOR SAUDADE - DVD Manifesto Musical

In [8]:
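def get_track(song_title):
    # Search Spotify for the title and keep the first (best) match
    result = sp.search(q=song_title, limit=10)
    try:
        return result['tracks']['items'][0]
    except (KeyError, IndexError, TypeError):
        # No match was returned for this title
        return None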
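
YouTube titles often carry suffixes such as "(Official Music Video)" that are not part of the song name and could affect which track the search returns first. A possible pre-processing step, sketched here as an assumption rather than as the method used in this notebook, is to strip such decorations before querying. `clean_title` and its keyword list are hypothetical.

```python
import re

# Bracketed fragments that mention these words are treated as decoration (assumed list)
NOISE_PATTERN = re.compile(
    r'\s*[\(\[][^\)\]]*\b(official|video|audio|lyrics?)\b[^\)\]]*[\)\]]',
    flags=re.IGNORECASE,
)


def clean_title(title):
    """Remove bracketed decorations and collapse repeated whitespace."""
    cleaned = NOISE_PATTERN.sub('', title)
    return re.sub(r'\s+', ' ', cleaned).strip()


cleaned_example = clean_title('Ed Sheeran - Shape of You (Official Music Video)')
unchanged_example = clean_title('Katy Perry - Roar')
```

The cleaned string could then be passed to `get_track` in place of the raw title; whether this improves the matches would need to be checked on the data.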

In [9]:
song_titles = data_kworb['Title'].to_numpy()
with ThreadPoolExecutor() as executor:
    tracks = list(executor.map(lambda i, title: (i, get_track(title)), range(len(song_titles)), song_titles))


KeyboardInterrupt: 

In [15]:
# Keep only the titles for which Spotify returned a match
tracks = [track for track in tracks if track[1] is not None]
# Get track information: song name, artist name, popularity, and release date
track_info = [(track[1]['name'], track[1]['artists'][0]['name'], track[1]['popularity'], track[1]['album']['release_date']) for track in tracks]
track_info = pd.DataFrame(track_info, columns=['Song Name', 'Artist', 'Popularity Score (Spotify)', 'Release date (Spotify)'])
track_info

,Song Name,Artist,Popularity Score (Spotify),Release date (Spotify)
0,Uptown Funk (feat. Bruno Mars),Mark Ronson,85,2015-01-12
1,Despacito,Luis Fonsi,81,2019-02-01
2,See You Again (feat. Charlie Puth),Wiz Khalifa,61,2016-01-29
3,Shape of You - Stormzy Remix,Ed Sheeran,55,2017-02-24
4,Sugar - Remix,Maroon 5,31,2015-05-18
...,...,...,...,...
2495,Char Din Char Juni,Ankita Pun,20,2022-06-08
2496,Ghost,Justin Bieber,89,2021-03-19
2497,Xúc Xắc Xúc Xẻ,Bé Bảo An,11,2020-08-08
2498,Kya Loge Tum,B Praak,73,2023-05-15


In [23]:
genres = []
for track in tracks:
    genres.append(sp.artist(track[1]['artists'][0]['id'])['genres'])

track_info['Genre'] = genres
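
Looking up one artist per track repeats requests for artists that appear several times. spotipy's `artists` method accepts a list of up to 50 artist IDs, so genres could be fetched once per unique artist. The sketch below is an assumption about how this could be organized: `get_genres_batched` is a hypothetical helper, and `sp` is assumed to be the client created earlier.

```python
def get_genres_batched(sp, artist_ids, batch_size=50):
    """Return one genre list per input ID, fetching each unique artist once."""
    unique_ids = list(dict.fromkeys(artist_ids))  # de-duplicate, keep order
    genres_by_id = {}
    for start in range(0, len(unique_ids), batch_size):
        batch = unique_ids[start:start + batch_size]
        # sp.artists returns a dict with an 'artists' list of artist objects
        for artist in sp.artists(batch)['artists']:
            if artist is not None:
                genres_by_id[artist['id']] = artist['genres']
    # Artists that could not be resolved get an empty genre list
    return [genres_by_id.get(artist_id, []) for artist_id in artist_ids]
```

It could be called as `get_genres_batched(sp, [track[1]['artists'][0]['id'] for track in tracks])` and the result assigned to `track_info['Genre']`.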

In [29]:
track_info

,Song Name,Artist,Popularity Score (Spotify),Release date (Spotify),Genre
0,Uptown Funk (feat. Bruno Mars),Mark Ronson,85,2015-01-12,[pop soul]
1,Despacito,Luis Fonsi,81,2019-02-01,"[latin pop, puerto rican pop]"
2,See You Again (feat. Charlie Puth),Wiz Khalifa,61,2016-01-29,"[hip hop, pittsburgh rap, pop rap, rap, southern hip hop, trap]"
3,Shape of You - Stormzy Remix,Ed Sheeran,55,2017-02-24,"[pop, singer-songwriter pop, uk pop]"
4,Sugar - Remix,Maroon 5,31,2015-05-18,[pop]
...,...,...,...,...,...
2495,Char Din Char Juni,Ankita Pun,20,2022-06-08,[nepali pop]
2496,Ghost,Justin Bieber,89,2021-03-19,"[canadian pop, pop]"
2497,Xúc Xắc Xúc Xẻ,Bé Bảo An,11,2020-08-08,[nhac thieu nhi]
2498,Kya Loge Tum,B Praak,73,2023-05-15,"[desi pop, filmi, punjabi pop]"


In [None]:
track_info.to_csv('../data/raw_combined_data.csv', index=False)