## Scraping song lyrics using the Genius API

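Import the necessary packages.

Note: Run this cell only once per kernel session. It changes the working directory using a relative path, so running it again moves one more level up. Restart the kernel before re-running this code chunk.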

In [1]:
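import json
import os
from pprint import pprint

import pandas as pd
import requests  # imported explicitly because requests.Session() is used below

# Move up one level so that relative paths such as ./credentials.json
# and ./data/... resolve from the project root.
os.chdir("..")

from dees_package.genius_functions import *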

Check the current working directory. Note: Do not re-run the cell above that changes the directory.

In [2]:
print("Current working directory:", os.getcwd())

Current working directory: /Users/hanbinfeng/Desktop/LSE_Data_Science/ds105a-project-dees-nuts
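
The run-once restriction comes from the relative `os.chdir` call. One possible way to make that step safe to repeat is to search upward from the current directory for a marker file and change into the directory that contains it. The sketch below is only an illustration: the helper name `find_project_root` and the choice of `credentials.json` as the marker are assumptions, not part of `dees_package`.

```python
from pathlib import Path


def find_project_root(start, marker="credentials.json"):
    """Return the nearest directory at or above `start` that contains `marker`.

    Hypothetical helper: the marker file name is an assumption.
    """
    start = Path(start).resolve()
    for candidate in (start, *start.parents):
        if (candidate / marker).exists():
            return candidate
    raise FileNotFoundError(f"No {marker!r} found at or above {start}")


# Possible usage in the notebook (os is already imported there):
# os.chdir(find_project_root(os.getcwd()))
```

Because the search starts from wherever the kernel currently is, running the cell from the project root or from a subdirectory of it ends in the same place.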


Open the JSON file containing the credentials

In [3]:
credentials_file_path = './credentials.json'

with open(credentials_file_path, 'r') as f:
    credentials = json.load(f)

Initialise a new `requests` session, which is reused for every lyrics request below

In [4]:
my_session = requests.Session()

### **Scrape lyrics of songs in CSV**

At this point in the data collection, we have a pandas dataframe of songs that were already selected and filtered using the YouTube API. Critically, the dataframe contains the name and artist of each song.

We now want to add the lyrics of each song to the dataframe.

In [5]:
test_df = pd.read_csv('./data/test_10_songs.csv')
test_df.head()

Unnamed: 0,Artist,Song
0,The Weeknd,Blinding Lights
1,Glass Animals,Heat Waves
2,Harry Styles,As It Was
3,The Kid LAROI & Justin Bieber,Stay
4,The Weeknd & Ariana Grande,Save Your Tears


In [6]:
# add Genius URL of each song to dataframe
test_df['Genius_URL'] = test_df.apply(lambda row: generate_song_url(row['Artist'], row['Song']), axis=1)

In [7]:
test_df.head()

Unnamed: 0,Artist,Song,Genius_URL
0,The Weeknd,Blinding Lights,https://genius.com/the-weeknd-blinding-lights-...
1,Glass Animals,Heat Waves,https://genius.com/glass-animals-heat-waves-ly...
2,Harry Styles,As It Was,https://genius.com/harry-styles-as-it-was-lyrics
3,The Kid LAROI & Justin Bieber,Stay,https://genius.com/the-kid-laroi-and-justin-bi...
4,The Weeknd & Ariana Grande,Save Your Tears,https://genius.com/the-weeknd-and-ariana-grand...
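
The URLs above follow a visible pattern: artist and song are lower-cased, `&` becomes `and`, words are joined with hyphens, and `-lyrics` is appended. The sketch below reproduces that pattern for illustration only. It is not the implementation of `generate_song_url`; the function name `sketch_song_url` is hypothetical, and real Genius URLs have edge cases (featured artists, accents, remixes) that it ignores.

```python
import re


def sketch_song_url(artist, song):
    """Build a Genius-style lyrics URL from an artist and song name.

    Illustrative only: NOT the implementation of generate_song_url.
    """
    text = f"{artist} {song}".lower()
    text = text.replace("&", " and ")          # "A & B" -> "a and b"
    text = re.sub(r"[^a-z0-9\s-]", "", text)   # drop remaining punctuation
    slug = re.sub(r"[\s-]+", "-", text).strip("-")
    return f"https://genius.com/{slug}-lyrics"


print(sketch_song_url("Harry Styles", "As It Was"))
# https://genius.com/harry-styles-as-it-was-lyrics
```

A URL built this way is only a guess at where the page lives, so a scraper using it still has to handle pages that do not exist.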


In [8]:
# add Genius lyrics of each song to dataframe
test_df['Genius_lyrics'] = test_df.apply(lambda row: scrape_lyrics(my_session, row['Genius_URL']), axis=1)

In [9]:
test_df.head()

Unnamed: 0,Artist,Song,Genius_URL,Genius_lyrics
0,The Weeknd,Blinding Lights,https://genius.com/the-weeknd-blinding-lights-...,Yeah I've been tryna call I've been on my own ...
1,Glass Animals,Heat Waves,https://genius.com/glass-animals-heat-waves-ly...,"(Last night, all I think about is you) ( Don't..."
2,Harry Styles,As It Was,https://genius.com/harry-styles-as-it-was-lyrics,"Come on, Harry, we wanna say goodnight to you ..."
3,The Kid LAROI & Justin Bieber,Stay,https://genius.com/the-kid-laroi-and-justin-bi...,I do the same thing I told you that I never wo...
4,The Weeknd & Ariana Grande,Save Your Tears,https://genius.com/the-weeknd-and-ariana-grand...,
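
In the output above, the `Genius_lyrics` cell for "Save Your Tears" appears empty, which suggests that the scrape for that row returned no text. A minimal sketch of how such rows could be flagged before saving is shown below. It uses a small stand-in dataframe rather than `test_df`, and the helper name `missing_lyrics_mask` is hypothetical. Whether `scrape_lyrics` returns an empty string or `None` on failure is not known here, so the check covers both.

```python
import pandas as pd


def missing_lyrics_mask(df, column="Genius_lyrics"):
    """Return a boolean Series that is True where lyrics are absent."""
    # Treat None/NaN, empty strings and whitespace-only strings as missing.
    return df[column].fillna("").astype(str).str.strip().eq("")


# Stand-in data for illustration; not the real scraped results.
demo_df = pd.DataFrame({
    "Song": ["Song A", "Song B", "Song C"],
    "Genius_lyrics": ["some lyrics", "", None],
})
print(demo_df.loc[missing_lyrics_mask(demo_df), "Song"].tolist())
# ['Song B', 'Song C']
```

Applied to `test_df`, the same mask would list the songs whose generated URL or scrape needs a manual check.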


In [10]:
# save to JSON for use in data visualisation
test_df.to_json('./data/10_songs_with_lyrics.json')