# Lyrics download workflow

You can get an access token here: https://genius.com/api-clients
Just log in or create an account, create a new API Client and then click on "Generate Access Token".

In [2]:
import lyricsgenius
import re
import pandas as pd

If you save the token in a file called 'geniustoken.txt', the following cell opens the file, reads the token into the variable "token" and closes the file.

In [3]:
with open('geniustoken.txt', 'r') as file:
    token = file.read().strip()  # the with block closes the file on exit
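
If the token file is missing, the cell above fails with a bare FileNotFoundError. The following is a minimal sketch of a slightly more forgiving loader. The helper name `load_token` and the fallback to an environment variable are assumptions for illustration and are not part of the original notebook; the variable name `GENIUS_ACCESS_TOKEN` is likewise just a chosen default.

```python
import os
from pathlib import Path

def load_token(path="geniustoken.txt", env_var="GENIUS_ACCESS_TOKEN"):
    """Return the token stored in `path`, or in the environment variable `env_var`."""
    token_file = Path(path)
    if token_file.is_file():
        # strip() removes the trailing newline most editors add
        token = token_file.read_text(encoding="utf-8").strip()
        if token:
            return token
    # Hypothetical fallback: look for the token in the environment instead
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"No token found in {path!r} or in ${env_var}")
    return token
```

With this helper, the cell above could be reduced to `token = load_token()`.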

This creates the object we need to use the API. Every call to the API is made as genius.whatever_method_we_need.

In [4]:
genius = lyricsgenius.Genius(token)

Since we only need lyrics for individual songs, the only method we need is genius.search_song(), and we only use 3 of its parameters:

The title and the artist, both as strings, and get_full_info, which when True fetches a lot of extra information we don't need. We always pass False and read only the lyrics from the result.

The dataset we're using has quite a few songs with multiple artists. When there are only two, they are separated by an '&'. When there are three or more, only the last one is separated by an '&' and the others by commas.

Examples:
Kanye West, Big Sean, Pusha T & 2 Chainz OR Jay-Z & Kanye West

The regex below keeps only the first artist name, which is enough to find the related song. It should only fail for artists that have a comma or an '&' in the middle of their name, which hopefully applies to none of them.
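
To check the first-artist rule without calling the API, here is a small offline sketch that runs the same idea on the examples above. The name `first_artist` and the extra `.strip()` are assumptions added for illustration; the notebook's own function keeps the trailing space that precedes an '&'.

```python
import re

# Everything up to (not including) the first ',' or '&'
FIRST_ARTIST = re.compile(r"[^,&]*")

def first_artist(artist):
    # strip() drops the space left in front of a '&' (e.g. "Jay-Z " -> "Jay-Z")
    return FIRST_ARTIST.match(artist).group().strip()

examples = [
    "Kanye West, Big Sean, Pusha T & 2 Chainz",
    "Jay-Z & Kanye West",
    "Sharon Van Etten",
]
for name in examples:
    print(repr(first_artist(name)))
# 'Kanye West'
# 'Jay-Z'
# 'Sharon Van Etten'
```

The same rule also shows the known limitation: a group whose own name contains an '&' is cut at that character, so it would be searched under its first word only.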


Assuming that the artists and related songs are saved in a list of lists (e.g. [['Sharon Van Etten', 'Seventeen'], ['Crash of Rhinos', 'Big Sea'], ['Jay-Z & Kanye West', 'Niggas in Paris']]), we can use a single function to get all the related lyrics. The function returns a list of lists, each with three elements: [artist, song_title, lyrics].



In [5]:
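def get_first_artist(artist):
    # Match up to the first ',' or '&' ('|' inside [...] is a literal pipe, so it is dropped)
    pattern = r'[^,&]*'
    return re.match(pattern, artist).group()

def get_lyrics(artist_song_list):
    results = []
    for artist, song in artist_song_list:
        found = genius.search_song(song, get_first_artist(artist), get_full_info=False)
        # search_song returns None when nothing matches, so guard before reading .lyrics
        results.append([artist, song, found.lyrics if found else None])
    return results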

A small test with songs that have two artists and three or more artists, to check the regex.

In [6]:
artist_list = [['Sharon Van Etten', 'Seventeen'], ['P Diddy, Usher & Loon', 'I Need A Girl'], ['Jay-Z & Kanye West', 'Niggas in Paris']]

In [7]:
lyrics = get_lyrics(artist_list)

Searching for "Seventeen" by Sharon Van Etten...
Done.
Searching for "I Need A Girl" by P Diddy...
Done.
Searching for "Niggas in Paris" by Jay-Z ...
Done.


In [8]:
lyrics

[['Sharon Van Etten',
  'Seventeen',
  "[Verse 1]\nI know what you wanna say\nI think that you're all the same\nConstantly being led astray\nYou think you know something you don't\n\n[Chorus 1]\nDowntown hotspot halfway up the street\nI used to be free, I used to be seventeen\nFollow my shadow around your corner\nI used to be seventeen, now you're just like me\n\n[Verse 2]\nDown beneath the ashes and the stone\nSure of what I've lived and have known\nI see you so uncomfortably alone\nI wish I could show you how much you've grown\n\n[Chorus 2]\nDowntown hotspot used to be on this street\nI used to be seventeen, I used to be seventeen\nNow you're a hotshot hanging on my block\nSun coming up, who's my shadow?\n\n[Post-Chorus]\nLa-la-la-la-la-la-la\nLa-la-la-la-la-la-la\nLa-la-la-la-la-la-la\nLa-la-la-la-la-la-la\n\n[Bridge]\nI know what you're gonna be\nI know that you're gonna be\nYou'll crumble it up just to see\nAfraid that you'll be just like me\n\n[Chorus 3]\nDowntown hotspot halfway

IMPORTANT: Genius lyrics include annotations in square brackets that mark the sections of a song or which artist is singing which part. They can be removed with a regex. We can first gather all the lyrics into a list of lists like the one returned by the function above and then run the regex over it.

In [44]:
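def remove_brackets(lyrics_list):
    # Non-greedy, so two tags on the same line are removed separately
    pattern = r'\[.*?\]'
    for song in lyrics_list:  # iterate over the argument, not the global `lyrics`
        if song[2]:  # skip entries without lyrics
            song[2] = re.sub(pattern, '', song[2])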

In [45]:
remove_brackets(lyrics)
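
Removing the tags leaves empty lines where the section headers used to be. As a rough sketch of one way to tidy that up, the helper below works on a single string and also collapses the leftover blank lines. The name `strip_section_tags` and the sample text are made up for illustration and are not part of the original notebook.

```python
import re

def strip_section_tags(text):
    # Drop bracketed annotations such as "[Verse 1]"; non-greedy, so it stops at the first ']'
    text = re.sub(r"\[.*?\]", "", text)
    # Collapse the runs of blank lines that the removed headers leave behind
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()

sample = "[Verse 1]\nfirst line\nsecond line\n\n[Chorus]\nthird line"
print(repr(strip_section_tags(sample)))
# 'first line\nsecond line\n\nthird line'
```

It could be applied to each entry with something like `song[2] = strip_section_tags(song[2])`, keeping one blank line between sections.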

In [47]:
lyrics[0]

['Sharon Van Etten',
 'Seventeen',
 "\nI know what you wanna say\nI think that you're all the same\nConstantly being led astray\nYou think you know something you don't\n\n\nDowntown hotspot halfway up the street\nI used to be free, I used to be seventeen\nFollow my shadow around your corner\nI used to be seventeen, now you're just like me\n\n\nDown beneath the ashes and the stone\nSure of what I've lived and have known\nI see you so uncomfortably alone\nI wish I could show you how much you've grown\n\n\nDowntown hotspot used to be on this street\nI used to be seventeen, I used to be seventeen\nNow you're a hotshot hanging on my block\nSun coming up, who's my shadow?\n\n\nLa-la-la-la-la-la-la\nLa-la-la-la-la-la-la\nLa-la-la-la-la-la-la\nLa-la-la-la-la-la-la\n\n\nI know what you're gonna be\nI know that you're gonna be\nYou'll crumble it up just to see\nAfraid that you'll be just like me\n\n\nDowntown hotspot halfway through this life\nI used to feel free, or was it just a dream?\nNow yo