## Kworb Webscraping

### Table of Contents
1. [Imports and Request](#1)
2. [Data Extraction](#2)
3. [Data Save](#3)

This notebook outlines my process for developing the web scraper for kworb.net, the website that Wikipedia uses to get its Spotify data.

### 1. Imports and Request <a title = '1'></a>
This section imports the necessary libraries and requests the HTML from kworb.

In [1]:
# import the needed libraries
import requests
from bs4 import BeautifulSoup
import pandas as pd
from datetime import datetime, timedelta

In [2]:
# request raw HTML Code from kworb with requests and turn it into soup (a cleaner format)

# link to website
url = 'https://kworb.net/spotify/country/global_weekly_totals.html'
# request the raw HTML from the website
r = requests.get(url)
# turn the raw html into soup (name the parser explicitly to avoid bs4's parser warning)
soup = BeautifulSoup(r.content, 'html.parser')
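
A slightly more defensive version of the request step could look like the sketch below. The helper name `fetch_soup`, the 30-second timeout, and the choice of the built-in `html.parser` are assumptions made for illustration; they are not part of the original notebook.

```python
import requests
from bs4 import BeautifulSoup


def fetch_soup(url, timeout=30):
    """Download a page and parse it (hypothetical helper, not from the notebook)."""
    # a timeout keeps the cell from hanging if the site does not respond
    r = requests.get(url, timeout=timeout)
    # raise an HTTPError on 4xx/5xx responses instead of parsing an error page
    r.raise_for_status()
    # name the parser explicitly so the result does not depend on what is installed
    return BeautifulSoup(r.content, "html.parser")
```

With this helper, `soup = fetch_soup(url)` would stand in for the request and parsing lines in the cell above.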

### 2. Data Extraction <a title = '2'></a>
In this section, data is taken from the HTML and put into a table.

#### Column Extraction

In [3]:
#get the head of the table
colNames = soup.find("thead").find_all('th')
#new list to store column names
cols = []
# for column in list, get string (value of attribute)
for col in colNames:
    #add the name of the column to the list
    cols.append(col.string)
print(cols)

['Pos', 'Artist and Title', 'Wks', 'T10', 'Pk', '(x?)', 'PkStreams', 'Total']


In [4]:
# I want to separate title and artist
artTitle = cols[1].split()
# ['Artist', 'and', 'Title'] 
# we want artTitle[0] and artTitle[2]
columns = [cols[0], artTitle[0], artTitle[2]] + cols[2:9]
print(columns)

['Pos', 'Artist', 'Title', 'Wks', 'T10', 'Pk', '(x?)', 'PkStreams', 'Total']
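
The same header split can be written as a small reusable function. This is only a sketch: the name `split_combined_header` and its `index` and `joiner` parameters are hypothetical, and it assumes the combined header always has the form "X and Y".

```python
def split_combined_header(cols, index=1, joiner="and"):
    """Replace cols[index] (e.g. 'Artist and Title') with its separate parts."""
    # drop the joining word and keep the remaining words as column names
    parts = [p for p in cols[index].split() if p != joiner]
    # slicing keeps every other column, however many there are
    return cols[:index] + parts + cols[index + 1:]


cols = ['Pos', 'Artist and Title', 'Wks', 'T10', 'Pk', '(x?)', 'PkStreams', 'Total']
columns = split_combined_header(cols)
print(columns)
# ['Pos', 'Artist', 'Title', 'Wks', 'T10', 'Pk', '(x?)', 'PkStreams', 'Total']
```

Slicing with `cols[index + 1:]` avoids hard-coding the end of the list, so the function keeps working if kworb adds or removes a column.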


In [5]:
# The columns are what I want them to be
# Create DF to store the data in
df = pd.DataFrame(columns = columns)

#### Full data (every row) extraction

In [6]:
# get the rows of the table
rows = soup.find("tbody").find_all('tr')

In [7]:
## Main code for scraping all of the data
print("INIT: SearchAllRows")
# for each row in the group of rows
for row in rows:
    # get every cell in this row
    entries = row.find_all('td')
    # log progress using the position stored in the first cell
    print(f"INIT: SearchRow{entries[0].string}")
    entry = []
    # for each cell in the row
    for i, cell in enumerate(entries):
        # The way the data is stored in the table, the artist and song are in 1 column (index 1)
        # This if statement separates them into their own columns
        if i == 1:
            artistSong = cell.find_all('a')
            # add artist to list
            entry.append(artistSong[0].string)
            # add song to list
            entry.append(artistSong[1].string)
        else:
            # if not column 1 (artist and song) just add the cell's value
            entry.append(cell.string)
    # write the data scraped into a new row
    df.loc[len(df.index)] = entry
print("END: SearchAllRows")

INIT: SearchAllRows
INIT: SearchRow1
INIT: SearchRow2
INIT: SearchRow3
...
INIT: SearchRow6012
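
Writing into the DataFrame with `df.loc[len(df.index)]` grows the frame one row at a time, which gets slow with thousands of rows. One alternative is to collect the rows in a plain list and build the DataFrame once at the end. The sketch below runs on a tiny made-up HTML snippet; the markup, the `" - "` separator between the links, and the names `records` and `df_demo` are assumptions, not the real kworb page.

```python
import pandas as pd
from bs4 import BeautifulSoup

# tiny stand-in for the kworb table (made-up markup, assumed to have the same shape)
html = """
<table>
  <tbody>
    <tr><td>1</td><td><a href="#">The Weeknd</a> - <a href="#">Blinding Lights</a></td><td>172</td></tr>
    <tr><td>2</td><td><a href="#">Ed Sheeran</a> - <a href="#">Shape of You</a></td><td>308</td></tr>
  </tbody>
</table>
"""
demo_soup = BeautifulSoup(html, "html.parser")

records = []
for row in demo_soup.find("tbody").find_all("tr"):
    cells = row.find_all("td")
    links = cells[1].find_all("a")
    # artist and title come from the two links in the second cell
    record = [
        cells[0].get_text(strip=True),
        links[0].get_text(strip=True),
        links[1].get_text(strip=True),
    ]
    # the remaining cells are copied as plain text
    record += [c.get_text(strip=True) for c in cells[2:]]
    records.append(record)

# build the DataFrame once instead of appending row by row
df_demo = pd.DataFrame(records, columns=["Pos", "Artist", "Title", "Wks"])
print(df_demo)
```

Note that `get_text()` returns an empty string for an empty cell, whereas `.string` returns `None`, so missing values would need handling before any numeric conversion.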


In [8]:
# view the new data
df.head()

|   | Pos | Artist | Title | Wks | T10 | Pk | (x?) | PkStreams | Total |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | The Weeknd | Blinding Lights | 172 | 64 | 1 | (x13) | 52375259 | 3406980588 |
| 1 | 2 | Ed Sheeran | Shape of You | 308 | 34 | 1 | (x14) | 64217796 | 3302350300 |
| 2 | 3 | Lewis Capaldi | Someone You Loved | 218 | 15 | 4 |  | 24962682 | 2623186016 |
| 3 | 4 | Tones And I | Dance Monkey | 151 | 41 | 1 | (x17) | 52055226 | 2538070349 |
| 4 | 5 | Post Malone | Sunflower - Spider-Man: Into the Spider-Verse | 225 | 31 | 1 | (x2) | 34579416 | 2498983472 |


In [8]:
# the commas in PkStreams and Total mean that the entries are strings; remove them and convert to numbers
df['PkStreams'] = pd.to_numeric(df['PkStreams'].str.replace(',', ''))
df['Total'] = pd.to_numeric(df['Total'].str.replace(',', ''))
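
Removing the commas leaves the values as strings. To do arithmetic on them, the column also needs a numeric dtype. The sketch below shows one way to do this on a small sample frame. The `demo` frame and its missing value are made up for illustration, and the nullable `Int64` dtype is an assumption about what is wanted.

```python
import pandas as pd

# sample values with thousands separators, plus one missing entry
demo = pd.DataFrame({"PkStreams": ["52,375,259", "64,217,796", None]})

# strip the separators as literal text, not as a regular expression
cleaned = demo["PkStreams"].str.replace(",", "", regex=False)
# convert to numbers; anything unparseable becomes a missing value
demo["PkStreams"] = pd.to_numeric(cleaned, errors="coerce").astype("Int64")

print(demo.dtypes)
```

The nullable `Int64` dtype keeps whole-number stream counts as integers even when some rows are missing, which the default `float64` result would not.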


In [9]:
# Print DF to see changes
df

|   | Pos | Artist | Title | Wks | T10 | Pk | (x?) | PkStreams | Total |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | The Weeknd | Blinding Lights | 174 | 64 | 1 | (x13) | 52375259 | 3433500819 |
| 1 | 2 | Ed Sheeran | Shape of You | 310 | 34 | 1 | (x14) | 64217796 | 3318579901 |
| 2 | 3 | Lewis Capaldi | Someone You Loved | 220 | 15 | 4 |  | 24962682 | 2642951162 |
| 3 | 4 | Tones And I | Dance Monkey | 151 | 41 | 1 | (x17) | 52055226 | 2538070349 |
| 4 | 5 | Post Malone | Sunflower - Spider-Man: Into the Spider-Verse | 226 | 31 | 1 | (x2) | 34579416 | 2506532527 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 6198 | 6199 | Mariah Carey | Infinity | 1 |  | 220 |  | 177495 | 177495 |
| 6199 | 6200 | Twenty One Pilots | Fairly Local | 1 |  | 234 |  | 176713 | 176713 |
| 6200 | 6201 | Ludacris | Good Lovin | 1 |  | 223 |  | 174256 | 174256 |
| 6201 | 6202 | iSHi | Push It | 1 |  | 236 |  | 173886 | 173886 |


### 3. Data Save <a title = '3'></a>
Save the data for use in the main notebook.

In [10]:
# The changes look good! We can export to a TSV (rather than CSV because of commas in song names and artist names)
df.to_csv('Data/spotifyTopSongs.tsv', sep='\t')

In [11]:
# Check if import works
pd.read_csv('Data/spotifyTopSongs.tsv', sep='\t')

|   | Unnamed: 0 | Pos | Artist | Title | Wks | T10 | Pk | (x?) | PkStreams | Total |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 1 | The Weeknd | Blinding Lights | 174 | 64.0 | 1 | (x13) | 52375259 | 3433500819 |
| 1 | 1 | 2 | Ed Sheeran | Shape of You | 310 | 34.0 | 1 | (x14) | 64217796 | 3318579901 |
| 2 | 2 | 3 | Lewis Capaldi | Someone You Loved | 220 | 15.0 | 4 |  | 24962682 | 2642951162 |
| 3 | 3 | 4 | Tones And I | Dance Monkey | 151 | 41.0 | 1 | (x17) | 52055226 | 2538070349 |
| 4 | 4 | 5 | Post Malone | Sunflower - Spider-Man: Into the Spider-Verse | 226 | 31.0 | 1 | (x2) | 34579416 | 2506532527 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 6198 | 6198 | 6199 | Mariah Carey | Infinity | 1 |  | 220 |  | 177495 | 177495 |
| 6199 | 6199 | 6200 | Twenty One Pilots | Fairly Local | 1 |  | 234 |  | 176713 | 176713 |
| 6200 | 6200 | 6201 | Ludacris | Good Lovin | 1 |  | 223 |  | 174256 | 174256 |
| 6201 | 6201 | 6202 | iSHi | Push It | 1 |  | 236 |  | 173886 | 173886 |
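
The read-back above includes an extra `Unnamed: 0` column because the DataFrame index was written to the file as an unnamed first column. If that column is not wanted, one option is to skip the index on export and pass the tab separator on both sides. The sketch below does a round trip through an in-memory buffer instead of a file; the `demo` frame and the use of `io.StringIO` are only for illustration.

```python
import io

import pandas as pd

demo = pd.DataFrame({
    "Artist": ["Tones And I", "Post Malone"],
    "Title": ["Dance Monkey", "Sunflower - Spider-Man: Into the Spider-Verse"],
    "Total": [2538070349, 2506532527],
})

# write tab-separated text without the index column
buffer = io.StringIO()
demo.to_csv(buffer, sep="\t", index=False)

# rewind and read it back with the same separator
buffer.seek(0)
restored = pd.read_csv(buffer, sep="\t")
print(restored.columns.tolist())
```

Using the same `sep` for writing and reading matters: without it, `read_csv` would look for commas and fold each tab-separated row into a single column.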
