## Scraping Lyrics Using MetroLyrics and LastFM
###### Updated on 10/7/2019
----------------

#### To create the conda environment from file:

1. `cd` into the `pyGhostWriter` repo
2. Run `conda env create --file environment.yml`
3. (Optional) Run `source ~/.bashrc`

In [None]:
!pip install tswift

In [19]:
import random

import numpy as np
import pandas as pd
import requests as r
from tswift import Artist, Song

### MetroLyrics with `tswift` Usage Example

In [7]:
# Verify tswift is operational
the_cure = Artist('The Cure')
the_cure.songs[:10]

[Song(title='I Dont Know Whats Going On', artist='The Cure'),
 Song(title='1015 Saturday Night', artist='The Cure'),
 Song(title='13Th', artist='The Cure'),
 Song(title='2 Late', artist='The Cure'),
 Song(title='39', artist='The Cure'),
 Song(title='A Boy I Never Knew', artist='The Cure'),
 Song(title='A Chain Of Flowers', artist='The Cure'),
 Song(title='A Few Hours After This', artist='The Cure'),
 Song(title='A Foolish Arrangement', artist='The Cure'),
 Song(title='A Forest', artist='The Cure')]

In [12]:
# <|endoftext|> is the GPT-2 training data delimiter
song = random.choice(the_cure.songs)
print(song.lyrics+'\n<|endoftext|>')

I ride into your town on a big black trojan horse
I'm looking to have some fun
Some kind of trigger-happy intercourse
'Club America salutes you,' says the girl on the door
'We accept all major lies
We love any kind of fraud
So go on in and enjoy...
Go on in and enjoy!'

I'm buying for my bright new friends
Blue Suzannes all round
And my mood is heavily pregnant...
Yeah you're right
I couldn't help but notice your icy blue eyes
They've been burning two holes in the sides of my head
Since the second I arrived

And it's not too hard to guess from your stick-on stars
And your canary feather dress
Your hair in such a carefully careless mess
That you're really trying very hard to impress

You're such a wonderful person living a fabulous life
Sensational dazzling perfectly sized
Such a wonderful person living a fabulous life
Sharing it with me in Club America tonight...

So we talk for a while about some band you saw on TV
But I don't listen to you and you don't listen to me
Yeah it's an old 
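
The cell above prints a single song followed by the delimiter. The same idea extends to a whole catalogue when assembling a training file. The sketch below is only an illustration under stated assumptions, not the notebook's own method: `build_corpus` is a hypothetical helper, the sample strings are placeholders standing in for `song.lyrics` values, and lyrics are assumed to arrive as plain strings (or `None` when missing).

```python
# Hypothetical helper: join many lyrics into one GPT-2 style training string.
DELIMITER = '<|endoftext|>'


def build_corpus(lyrics_list, delimiter=DELIMITER):
    """Return all non-empty lyrics, each followed by the delimiter on its own line."""
    chunks = []
    for lyrics in lyrics_list:
        text = (lyrics or '').strip()
        if not text:
            # Skip songs whose lyrics are missing or blank
            continue
        chunks.append(text + '\n' + delimiter)
    return '\n'.join(chunks)


# Placeholder strings standing in for song.lyrics values
sample = ['first song line one\nline two', '', 'second song']
corpus = build_corpus(sample)
print(corpus)
```

Skipping blank entries keeps the delimiter from appearing twice in a row, which would otherwise create empty training documents.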

In [13]:
# Map each song title to its lyrics for The Cure
# (songs sharing a title collapse into a single dictionary entry)
lyrics_dict = {s.title: s.lyrics for s in the_cure.songs}
lyrics_zipped = list(lyrics_dict.items())

In [17]:
lyrics_df = pd.DataFrame(lyrics_zipped, columns=['title', 'lyrics'])

print('Observations/Songs: ', len(lyrics_df))
lyrics_df.head()

Observations/Songs:  301


|   | title | lyrics |
|---|-------|--------|
| 0 | I Dont Know Whats Going On | I don't know what's going on\nI am so up close... |
| 1 | 1015 Saturday Night | 10.15 on a Saturday night\nAnd the tap\nDrips ... |
| 2 | 13Th | 'Everyone feels good in the room,' she swings\... |
| 3 | 2 Late | So I'll wait for you\nWhere I always wait\nBeh... |
| 4 | 39 | So the fire is almost out and there's nothing ... |
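
Some scraped entries may come back without lyrics. That is an assumption here, since the notebook does not show how `tswift` reports a missing page. A minimal sketch of filtering such rows before building the DataFrame follows. It uses plain tuples so it runs without network access. `keep_songs_with_lyrics` and `min_chars` are hypothetical names, and the pairs are placeholders shaped like `lyrics_zipped`.

```python
def keep_songs_with_lyrics(pairs, min_chars=1):
    """Keep (title, lyrics) pairs whose stripped lyrics have at least min_chars characters."""
    kept = []
    for title, lyrics in pairs:
        # Treat None the same as an empty string
        text = (lyrics or '').strip()
        if len(text) >= min_chars:
            kept.append((title, text))
    return kept


# Placeholder pairs shaped like lyrics_zipped
pairs = [('A Forest', 'placeholder lyrics'), ('39', ''), ('2 Late', None)]
cleaned = keep_songs_with_lyrics(pairs)
print(len(cleaned), 'of', len(pairs), 'songs kept')
```

The filtered list can then be passed to `pd.DataFrame(cleaned, columns=['title', 'lyrics'])` in the same way as `lyrics_zipped`.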


### LastFM Query Examples

In [22]:
# Query artists similar to The Cure
# Replace API_KEY in the URL with your own LastFM API key before running
response = r.get('http://ws.audioscrobbler.com/2.0/?method=artist.getsimilar&artist=thecure&api_key=API_KEY&format=json')
response

<Response [200]>

In [36]:
resp_json = response.json()['similarartists']['artist']

print([band['name'] for band in resp_json])

['The Glove', 'Siouxsie and the Banshees', 'New Order', 'Joy Division', 'Bauhaus', 'The Smiths', 'Echo & the Bunnymen', 'Depeche Mode', 'The Sisters of Mercy', 'The Chameleons', 'Peter Murphy', 'The Jesus and Mary Chain', 'Nick Cave & The Bad Seeds', 'The Church', 'The Psychedelic Furs', 'Cocteau Twins', 'Love And Rockets', 'The Sound', 'Talking Heads', 'Morrissey', 'Killing Joke', 'Japan', 'Christian Death', 'Public Image Ltd.', 'Sonic Youth', 'The The', 'Gang of Four', 'Xmal Deutschland', 'Simple Minds', 'Sad Lovers and Giants', 'The Mission', 'New Model Army', 'Modern English', 'Tones on Tail', 'The Creatures', 'Duran Duran', 'XTC', 'Wire', 'The Stranglers', 'Devo', 'Adam and the Ants', 'Pixies', 'Interpol', 'The Damned', 'Placebo', 'Magazine', 'Gene Loves Jezebel', 'Alien Sex Fiend', 'She Wants Revenge', 'Tears for Fears', "The B-52's", 'R.E.M.', 'Red Lorry Yellow Lorry', 'Gary Numan', 'Television', 'Fad Gadget', 'U2', 'The Cult', 'Soft Cell', 'Pet Shop Boys', 'Talk Talk', 'Oingo B
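
Writing the query string by hand, as in the request above, means artist names containing spaces or `&` have to be encoded manually. A minimal sketch of building the same URL from a parameter dictionary with the standard library is shown below. `lastfm_url` is a hypothetical helper, `API_KEY` remains a placeholder, and the endpoint and parameter names are copied from the request above rather than from any other reference.

```python
from urllib.parse import urlencode

# Endpoint taken from the request in the cell above
API_ROOT = 'http://ws.audioscrobbler.com/2.0/'


def lastfm_url(method, api_key, **params):
    """Build a LastFM query URL; extra keyword arguments become query parameters."""
    query = {'method': method, **params, 'api_key': api_key, 'format': 'json'}
    # urlencode escapes spaces, ampersands, and other reserved characters
    return API_ROOT + '?' + urlencode(query)


url = lastfm_url('artist.getsimilar', 'API_KEY', artist='thecure')
print(url)

# An artist name with spaces and '&' is escaped automatically
encoded = lastfm_url('artist.getsimilar', 'API_KEY', artist='Echo & the Bunnymen')
print(encoded)
```

The resulting string can be passed to `r.get(url)`. `requests` also accepts a `params=` dictionary on `r.get`, which performs equivalent encoding without a separate helper.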

In [37]:
# Query top tags for The Cure
response = r.get('http://ws.audioscrobbler.com/2.0/?method=artist.getTopTags&artist=thecure&api_key=API_KEY&format=json')
response

<Response [200]>

In [42]:
resp_json = response.json()['toptags']['tag']

print([tag['name'] for tag in resp_json])

['post-punk', 'new wave', 'alternative', '80s', 'rock', 'seen live', 'alternative rock', 'goth', 'british', 'indie', 'Gothic Rock', 'Gothic', 'The Cure', 'pop', 'Post punk', 'darkwave', 'punk', 'goth rock', 'indie rock', 'classic rock', 'dark', 'UK', '90s', 'britpop', 'cold wave', 'electronic', 'melancholic', 'dark wave', 'male vocalists', 'Cure', '70s', 'favorites', 'english', "80's", 'synth pop', 'Love', 'emo', 'england', 'robert smith', 'punk rock', 'synthpop', 'psychedelic', 'indie pop', 'melancholy']


In [43]:
# Query the top artists for the post-punk tag (The Cure's top tag)
response = r.get('http://ws.audioscrobbler.com/2.0/?method=tag.gettopartists&tag=post-punk&api_key=API_KEY&format=json')
response

<Response [200]>

In [46]:
resp_json = response.json()['topartists']['artist']

print([band['name'] for band in resp_json])

['The Cure', 'Joy Division', 'Nick Cave & The Bad Seeds', 'Swans', 'Siouxsie and the Banshees', 'Echo & the Bunnymen', 'Bauhaus', 'She Wants Revenge', 'Wire', 'Parquet Courts', 'The Fall', 'Killing Joke', 'Gang of Four', 'Motorama', 'Iceage', 'Television', 'Idles', 'The Chameleons', 'Public Image Ltd.', 'Savages', 'These New Puritans', 'Les Savy Fav', 'The Soft Moon', 'Protomartyr', 'New Model Army', 'The The', 'Mission of Burma', 'Suicide', 'Felt', 'Lebanon Hanover', 'The Durutti Column', 'The Birthday Party', 'Crystal Stilts', 'Peter Murphy', 'The Gun Club', 'The Sound', 'The Feelies', 'Orange Juice', 'Pere Ubu', 'Magazine', 'Young Marble Giants', 'Sleaford Mods', 'Ought', 'Love And Rockets', 'Preoccupations', 'DRAB MAJESTY', 'Tuxedomoon', 'The Clean', "I Love You But I've Chosen Darkness", 'Television Personalities']
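
The three queries above share one pattern: index into an outer key, then an inner list, and read each `name`. The sketch below wraps that pattern in a single function and raises a readable error when the expected keys are absent, which is assumed to happen when the API returns an error payload instead of results. `extract_names` is a hypothetical helper, and the sample payload is a hand-written stand-in for `response.json()`, not real API output.

```python
def extract_names(payload, outer_key, inner_key):
    """Return the 'name' of every item under payload[outer_key][inner_key]."""
    try:
        items = payload[outer_key][inner_key]
    except (KeyError, TypeError) as exc:
        raise ValueError(
            f'unexpected response shape: missing {outer_key}/{inner_key}'
        ) from exc
    return [item['name'] for item in items]


# Hand-written stand-in shaped like the toptags response
sample_payload = {'toptags': {'tag': [{'name': 'post-punk'}, {'name': 'new wave'}]}}
names = extract_names(sample_payload, 'toptags', 'tag')
print(names)
```

With a real response, the calls would mirror the cells above: `extract_names(response.json(), 'similarartists', 'artist')`, `extract_names(response.json(), 'toptags', 'tag')`, and `extract_names(response.json(), 'topartists', 'artist')`.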


### Combining LastFM and MetroLyrics