# Changing Music Preferences
### Do Spotify users listen to different music during the COVID-19 pandemic? 
### Does this relate to the number of confirmed cases?

To Do: 

1. Scrape Spotify Top 200 data
2. Get song metadata through the Spotify API
3. .....


## 1. Scraping the Top 200 data
Scrape the Top 200 data for 2019 (baseline) and 2020 (pandemic).

https://spotifycharts.com contains the official Spotify Charts. 

Luckily, each day/week is available as a .csv file!
<br> <br> <br>
**Goal: get all the weekly .csv files from 2019 and 2020 and combine them into one file.**

In [55]:
import pandas as pd
import numpy as np
from tqdm import tqdm
import requests
from bs4 import BeautifulSoup
import io

In [5]:
url = 'https://spotifycharts.com/regional/nl/weekly/latest'

r = requests.get(url)

soup = BeautifulSoup(r.content, 'html.parser')

In [21]:
dates = [item["data-value"] for item in soup.find_all(attrs={"data-value": True}) if item["data-value"].startswith('2')]
dates[:5]

['2020-10-16--2020-10-23',
 '2020-10-09--2020-10-16',
 '2020-10-02--2020-10-09',
 '2020-09-25--2020-10-02',
 '2020-09-18--2020-09-25']
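
The goal above mentions only 2019 and 2020, while `dates` lists every week the site offers. A minimal sketch of narrowing the list, assuming every entry keeps the `start--end` form shown in the output; the helper name `filter_weeks` and the sample list are hypothetical and not part of the original notebook.

```python
def filter_weeks(dates, years=("2019", "2020")):
    """Keep the week ranges whose start date falls in one of the given years."""
    # The start date is the part before "--"; its first four characters are the year.
    return [d for d in dates if d.split("--")[0][:4] in years]


# Illustrative sample, not scraped data
sample = [
    "2020-10-16--2020-10-23",
    "2019-12-27--2020-01-03",
    "2018-12-28--2019-01-04",
]
selected = filter_weeks(sample)
print(selected)
```

Filtering on the start date assigns a week that straddles New Year to the year it began in; filtering on the end date would be an equally reasonable choice.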

In [34]:
urls = ["https://spotifycharts.com/regional/nl/weekly/" + date + "/download" for date in dates]
urls[:5]

['https://spotifycharts.com/regional/nl/weekly/2020-10-16--2020-10-23/download',
 'https://spotifycharts.com/regional/nl/weekly/2020-10-09--2020-10-16/download',
 'https://spotifycharts.com/regional/nl/weekly/2020-10-02--2020-10-09/download',
 'https://spotifycharts.com/regional/nl/weekly/2020-09-25--2020-10-02/download',
 'https://spotifycharts.com/regional/nl/weekly/2020-09-18--2020-09-25/download']

### Warning: running the block below downloads 200 .csv files!

In [60]:
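data = []

# Download every weekly chart and tag its rows with the week range
for date in tqdm(dates):
    url = "https://spotifycharts.com/regional/nl/weekly/" + date + "/download"
    response = requests.get(url)
    response.raise_for_status()  # stop on a failed download instead of parsing an error page
    file_object = io.StringIO(response.content.decode('utf-8'))
    df = pd.read_csv(file_object, header=1)  # column names are on the second line
    df["Date"] = date
    data.append(df)

# Combine the weekly frames into one and save the raw result
df = pd.concat(data, ignore_index=True)
df.to_csv("..\\data\\raw\\top200_2017_2020.csv", index=False)
data = []  # release the per-week frames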

100%|██████████| 200/200 [05:47<00:00,  1.74s/it]
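
The parsing step in the loop can be tried without touching the network. The sketch below assumes, as `header=1` suggests, that each downloaded file has one extra line above the column names. The helper `read_chart_csv` and the shortened sample text are hypothetical; the two rows reuse values from the chart output in this notebook, with the `URL` column left out.

```python
import io

import pandas as pd


def read_chart_csv(text, date):
    """Parse one chart file whose column names sit on the second line."""
    df = pd.read_csv(io.StringIO(text), header=1)  # skip the line above the header
    df["Date"] = date  # remember which week the rows belong to
    return df


# Illustrative stand-in for one downloaded file
raw = (
    "note line that precedes the column names\n"
    "Position,Track Name,Artist,Streams\n"
    "1,Mood (feat. iann dior),24kGoldn,1637451\n"
    "2,Head & Heart (feat. MNEK),Joel Corry,1285202\n"
)

chart = read_chart_csv(raw, "2020-10-16--2020-10-23")
combined = pd.concat([chart, chart], ignore_index=True)  # stacking works the same for 200 weeks
print(chart)
```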


In [112]:
df = pd.read_csv("..\\data\\raw\\top200_2017_2020.csv")
df

Unnamed: 0.1,Unnamed: 0,Position,Track Name,Artist,Streams,URL,Date
0,0,1,Mood (feat. iann dior),24kGoldn,1637451,https://open.spotify.com/track/3tjFYV6RSFtuktY...,2020-10-16--2020-10-23
1,1,2,"Lemonade (feat. Gunna, Don Toliver & NAV)",Internet Money,1421080,https://open.spotify.com/track/7hxHWCCAIIxFLCz...,2020-10-16--2020-10-23
2,2,3,Head & Heart (feat. MNEK),Joel Corry,1285202,https://open.spotify.com/track/6cx06DFPPHchuUA...,2020-10-16--2020-10-23
3,3,4,Holy (feat. Chance The Rapper),Justin Bieber,1244608,https://open.spotify.com/track/5u1n1kITHCxxp8t...,2020-10-16--2020-10-23
4,4,5,Jerusalema (feat. Nomcebo Zikode),Master KG,1173119,https://open.spotify.com/track/2MlOUXmcofMackX...,2020-10-16--2020-10-23
...,...,...,...,...,...,...,...
39995,39995,196,Sex,Cheat Codes,114030,https://open.spotify.com/track/5DA77EqppDmCTWG...,2016-12-23--2016-12-30
39996,39996,197,Ain't My Fault,Zara Larsson,113974,https://open.spotify.com/track/0ADG9OgdVTL7fgR...,2016-12-23--2016-12-30
39997,39997,198,Please Come Home for Christmas,Luther Vandross,113779,https://open.spotify.com/track/2mOtx6P21hecOcP...,2016-12-23--2016-12-30
39998,39998,199,Jodge Me Niet - Titelsong Van De Film “SOOF 2”,Jayh,113763,https://open.spotify.com/track/2VxAfqI3vIOaPSl...,2016-12-23--2016-12-30


In [83]:
df.drop(columns=["Unnamed: 0"], inplace=True)
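
Columns named `Unnamed: ...` typically appear when a CSV that was saved together with its index is read back. When more than one of them has piled up, they can be removed in one step by matching on the name prefix. This is a sketch on a small hypothetical frame, not the scraped data.

```python
import pandas as pd

# Hypothetical frame with two leftover index columns
frame = pd.DataFrame(
    {
        "Unnamed: 0.1": [0, 1],
        "Unnamed: 0": [0, 1],
        "Position": [1, 2],
        "Artist": ["24kGoldn", "Internet Money"],
    }
)

# Keep only the columns whose name does not start with "Unnamed"
cleaned = frame.loc[:, ~frame.columns.str.startswith("Unnamed")]
print(cleaned.columns.tolist())
```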

In [91]:
df[['Start Week', 'End Week']] = df['Date'].str.split('--', n=1, expand=True)
df.to_csv("..\\data\\processed\\top200_2017_2020.csv", index=False)

In [115]:
df = pd.read_csv("..\\data\\processed\\top200_2017_2020.csv")
df[['Start Week', 'End Week']] = df[['Start Week', 'End Week']].apply(pd.to_datetime, format="%Y-%m-%d")
df.head()

Unnamed: 0,Position,Track Name,Artist,Streams,URL,Date,Start Week,End Week
0,1,Mood (feat. iann dior),24kGoldn,1637451,https://open.spotify.com/track/3tjFYV6RSFtuktY...,2020-10-16--2020-10-23,2020-10-16,2020-10-23
1,2,"Lemonade (feat. Gunna, Don Toliver & NAV)",Internet Money,1421080,https://open.spotify.com/track/7hxHWCCAIIxFLCz...,2020-10-16--2020-10-23,2020-10-16,2020-10-23
2,3,Head & Heart (feat. MNEK),Joel Corry,1285202,https://open.spotify.com/track/6cx06DFPPHchuUA...,2020-10-16--2020-10-23,2020-10-16,2020-10-23
3,4,Holy (feat. Chance The Rapper),Justin Bieber,1244608,https://open.spotify.com/track/5u1n1kITHCxxp8t...,2020-10-16--2020-10-23,2020-10-16,2020-10-23
4,5,Jerusalema (feat. Nomcebo Zikode),Master KG,1173119,https://open.spotify.com/track/2MlOUXmcofMackX...,2020-10-16--2020-10-23,2020-10-16,2020-10-23
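
With `Start Week` parsed as a datetime, each row can be tagged as baseline or pandemic, following the 2019/2020 split described at the start of this section. The sketch below assumes a plain calendar-year split on the start date; the column name `Period` and the sample week ranges are hypothetical.

```python
import pandas as pd

# Illustrative week ranges in the same "start--end" form as the Date column
weeks = pd.DataFrame(
    {"Date": ["2020-10-16--2020-10-23", "2019-03-01--2019-03-08", "2018-06-01--2018-06-08"]}
)

# Split the range and parse both halves, as in the cells above
weeks[["Start Week", "End Week"]] = weeks["Date"].str.split("--", n=1, expand=True)
weeks[["Start Week", "End Week"]] = weeks[["Start Week", "End Week"]].apply(
    pd.to_datetime, format="%Y-%m-%d"
)

# Years outside the mapping become NaN and are dropped
weeks["Period"] = weeks["Start Week"].dt.year.map({2019: "baseline", 2020: "pandemic"})
labelled = weeks.dropna(subset=["Period"])
print(labelled[["Start Week", "Period"]])
```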
