# Got Raps?

## Background

I am someone who *loves* to listen to music. One of my favorite genres is hip-hop/rap because of its focus, in some cases, on lyricism.

In recent years, lyricism has received less attention, leading some fans of "true hip-hop" to look down upon the musicians currently in the spotlight. Part of the criticism of modern rap is that its lyrics and songs are forgettable because they are so simplistic.

While learning about *recurrent neural networks* (RNNs), I came across a [video](https://www.youtube.com/watch?v=ZMudJXhsUpY) by Laurence Moroney explaining how AI can be used to generate poetry after training on a corpus of Irish poems. This sparked the idea of trying the same with modern song lyrics. I chose rap specifically because I am familiar with the genre and because I expected the songs to contain more words, since the artists are (most of the time) not singing.

I will scrape lyrics from [azlyrics](https://www.azlyrics.com/t/tyga.html) using BeautifulSoup and build my RNN with Keras.

## Imports

In [1]:
import requests
import regex
import pickle
import pandas as pd
import numpy as np
import functions as dlf
from bs4 import BeautifulSoup

## Getting Soupy

In [2]:
## Starting page + Q.C. of response
start_url = 'https://www.azlyrics.com/t/tyga.html'
start_resp = requests.get(start_url)
print(f'Starting Response: {start_resp}')

Starting Response: <Response [200]>


In [51]:
## Creating soup + Q.C.
start_soup = BeautifulSoup(start_resp.text, 'html.parser')
print(start_soup.prettify()[8422:8946])

      <div class="album" id="9188">
       album:
       <b>
        "No Introduction"
       </b>
       (2008)
      </div>
      <div class="listalbum-item">
       <a href="../lyrics/tyga/diamondlife.html" target="_blank">
        Diamond Life
       </a>
      </div>
      <div class="listalbum-item">
       <a href="../lyrics/tyga/coconutjuice.html" target="_blank">
        Coconut Juice
       </a>
      </div>
      <div class="listalbum-item">
       <a href="../lyrics/tyga/supersizeme.html" target="_blank">
 


### Linking Up

In [8]:
## Collecting all album/song titles + song links
albums_songs = start_soup.find_all('div', attrs={'class': ['album', 'listalbum-item']})

In [16]:
## Aggregate each album into a dict of dicts: {album: {song: song_link, ...}}
album_mid = dlf.album_aggregator(albums_songs)
## Extra function needed for sorting 'Other songs'
album_dict = dlf.legendary_album_splitter(album_mid)

1st Album!
Empty list!
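The helpers in `functions` are not shown here, so the following is only a minimal sketch of how the `{album: {song: song_link}}` structure could be built. It is not the actual `dlf.album_aggregator`. The function name `aggregate_albums`, the `(css_class, text, href)` row format, and the `other_key` fallback are all hypothetical. With BeautifulSoup, such rows could presumably be derived from each tag in `albums_songs` (its class, its text, and the `href` of its link, if any).

```python
def aggregate_albums(rows, other_key='Other songs'):
    """Group (css_class, text, href) rows into {album: {song: link}}.

    Hypothetical helper: rows arrive in page order, so every song row
    belongs to the most recent album row seen before it.
    """
    albums = {}
    current = None
    for css_class, text, href in rows:
        if css_class == 'album':
            # 'album: "No Introduction" (2008)' -> 'No Introduction'
            parts = text.split('"')
            current = parts[1] if len(parts) >= 3 else text.strip()
            albums.setdefault(current, {})
        elif css_class == 'listalbum-item':
            if current is None:
                # Songs listed before any album header (assumed fallback)
                current = other_key
                albums.setdefault(current, {})
            # '../lyrics/tyga/x.html' -> '/lyrics/tyga/x.html'
            albums[current][text.strip()] = href.lstrip('.')
    return albums


# Rows mirroring the HTML excerpt printed earlier
rows = [
    ('album', 'album: "No Introduction" (2008)', None),
    ('listalbum-item', 'Diamond Life', '../lyrics/tyga/diamondlife.html'),
    ('listalbum-item', 'Coconut Juice', '../lyrics/tyga/coconutjuice.html'),
]
demo_albums = aggregate_albums(rows)
```

Stripping the leading dots keeps each link in the form that can later be appended directly to `https://www.azlyrics.com`.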


#### Link Scraping

**STRATEGY FOR SCRAPING LYRICS**
* Copy `album_dict` into `res_dict`
* Iterate through each album:
    * Iterate through each song:
        * Scrape using link (**NEED TO CHECK FORMAT**)
        * Replace link w/scraped soup
* Return *copied* `res_dict`
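The strategy above can be sketched roughly as follows. This is a sketch under assumptions, not the final scraper: `scrape_albums`, the injectable `fetch` callable, and the `delay` parameter are hypothetical names introduced for illustration. Passing the download step in as a function keeps the loop testable without touching the site.

```python
import copy
import time

BASE_URL = 'https://www.azlyrics.com'


def scrape_albums(album_dict, fetch, base_url=BASE_URL, delay=0.0):
    """Return a copy of album_dict with every song link replaced by fetch(url).

    fetch: callable taking a full URL and returning the scraped result
           (for example a BeautifulSoup object).
    delay: seconds to wait between requests (assumed courtesy pause).
    """
    res_dict = copy.deepcopy(album_dict)  # leave the original links untouched
    for songs in res_dict.values():
        for song, link in songs.items():
            # Replacing the value of an existing key is safe while iterating
            songs[song] = fetch(base_url + link)
            time.sleep(delay)
    return res_dict


# Stand-in for the real download step so the sketch runs offline
def fake_fetch(url):
    return f'<html>{url}</html>'


demo_links = {'Well Done 4': {'Bang Out': '/lyrics/tyga/bangout.html'}}
demo_res = scrape_albums(demo_links, fake_fetch)
```

For the real run, `fetch` could be something like `lambda url: BeautifulSoup(requests.get(url).text, 'html.parser')`, with a non-zero `delay` to avoid hammering the site.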

In [44]:
song_start_url = 'https://www.azlyrics.com'
test_album = album_dict['Well Done 4']

for song in test_album:
    end_url = test_album[song]
    full_url = song_start_url + end_url
    resp = requests.get(full_url)
    print(f'Song: {song}')
    song_soup = BeautifulSoup(resp.text, 'html.parser')
    
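    ## Divs without a class; the lyrics div is the one that also lacks an id
    song_lyrics = song_soup.find_all('div', attrs={'class': None})
    t_tag = None
    for tag in song_lyrics:
        if not tag.has_attr('id'):
            t_tag = tag
            print('Lyrics!')

    ## Skip songs where no lyrics div was found
    if t_tag is None:
        continue

    ## Section headers such as [Verse 1: Tyga] are wrapped in <i> tags
    tester = t_tag.find_all('i')

    for tag in tester:
        print(tag.text)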
        
    print('--'*30)

Song: Word On The Street
Starting Response: <Response [200]>
Lyrics!
[Intro]
[Verse 1:]
[Verse 2:]
Song: Bang Out
Starting Response: <Response [200]>
Lyrics!
[Verse 1: Tyga]
[Hook]
[Verse 2: Tyga]
[Hook]
[Verse 3: Eazy-E]
[Outro: Ice Cube]
Song: Back 2 Basics
Starting Response: <Response [200]>
Lyrics!
Song: Good Day
Starting Response: <Response [200]>
Lyrics!
[Intro: Lil Wayne]
[Bridge: Lil Wayne]
[Verse 1: Tyga]
[Bridge]
[Hook: Lil Wayne]
[Verse 2: Lil Wayne]
[Bridge]
[Hook]
[Verse 3: Meek Mill]
[Bridge]
[Hook]
Song: Young Kobe
Starting Response: <Response [200]>
Lyrics!
[Hook:]
[Verse 1:]
[Pre-Hook:]
[Hook x2]
[Verse 2:]
[Pre-Hook]
[Hook x2]
[Outro:]
Song: Day One
Starting Response: <Response [200]>
Lyrics!
[Verse 1]
[Hook 1 (x2)]
[Bridge]
[Hook 2]
[Verse]
[Hook 2 (x2)]
Song: Maniac
Starting Response: <Response [200]>
Lyrics!
[Hook: Tyga]
[Verse 1: Tyga]
[Hook]
[Verse 2: Fabolous]
[Hook]
Song: Pressed
Starting Response: <Response [200]>
Lyrics!
[Hook: Tyga]
[Verse 1: Tyga]
[Hook: Ho

In [35]:
test_start_url = 'https://www.azlyrics.com'
# print(test_start_url)
test_album = album_dict['Well Done 4']
test_end_url = test_album['Bang Out']
# print(test_end_url)
test_url = test_start_url + test_end_url
print(test_url)
test_resp = requests.get(test_url)
print(f'Starting Response: {test_resp}')
test_soup = BeautifulSoup(test_resp.text, 'html.parser')
# test_soup

https://www.azlyrics.com/lyrics/tyga/bangout.html
Starting Response: <Response [200]>


In [34]:
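## Divs without a class; the lyrics div is the one that also lacks an id
test_lyrics = test_soup.find_all('div', attrs={'class': None})
t_tag = None
for tag in test_lyrics:
    if not tag.has_attr('id'):
        t_tag = tag
        print('Lyrics!')

## Section headers such as [Verse 1: Tyga] are wrapped in <i> tags
tester = t_tag.find_all('i') if t_tag is not None else []

for tag in tester:
    print(tag.text)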

Lyrics!
[Verse 1: Tyga]
[Hook]
[Verse 2: Tyga]
[Hook]
[Verse 3: Eazy-E]
[Outro: Ice Cube]
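Since the `<i>` tags hold headers such as `[Verse 3: Eazy-E]`, one possible next step is to split each header into a section label and an artist, for example to keep only Tyga's verses later on. The sketch below is an assumption about how that could be done. `parse_section_header` and `HEADER_RE` are hypothetical names, and it uses the standard-library `re` module rather than `regex`.

```python
import re

# Matches headers such as "[Verse 1: Tyga]", "[Hook]" or "[Verse 1:]"
HEADER_RE = re.compile(r'^\[(?P<label>[^\]:]+)(?::\s*(?P<artist>[^\]]*))?\]$')


def parse_section_header(text):
    """Return (label, artist) for a section header, or None for any other text.

    artist is None when the header names no performer.
    """
    match = HEADER_RE.match(text.strip())
    if match is None:
        return None
    label = match.group('label').strip()
    artist = (match.group('artist') or '').strip() or None
    return label, artist


headers = ['[Verse 1: Tyga]', '[Hook]', '[Verse 3: Eazy-E]', '[Verse 1:]']
parsed = [parse_section_header(h) for h in headers]
```

Ordinary lyric lines do not match the pattern and return `None`, so the same function can be used to tell headers apart from lyrics.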


In [12]:
test_lyrics

[<div id="RTK_67Y3"></div>, <div>
 <!-- Usage of azlyrics.com content by any third-party lyrics provider is prohibited by our licensing agreement. Sorry about that. -->
 <i>[Verse 1: Tyga]</i><br/>
 Hold up, money talk so you know what?<br/>
 Ain’t nothing to talk about, you ain’t got enough cuz<br/>
 Rock star drugs break a bitch heart, no love<br/>
 Emma Watts, Charlie Sheen, fuckin with no scrub<br/>
 Niggas want connects, got no plugs<br/>
 Nigga say they high, got no buzz<br/>
 Popsicle niggas wanna talk shit then say you froze up<br/>
 Young niggas wanna pop pills, just po up<br/>
 Went on a bang, Went on a bang<br/>
 Bitches came for me and my nigga eazy<br/>
 Threw that bitch out, got that ho one way<br/>
 Said she tryna stay, told that bitch no way<br/>
 That’s a preme nigga, B ripper, grim reaper<br/>
 I don’t get mad bitch, I just get even<br/>
 T-Raw magician, I don’t gotta trick or treat it<br/>
 That Ferrari California make a bitch a believer<br/>
 <br/>
 <i>[Ho

#### Link Saving

In [2]:
#### WRITING ####
# with open('Pickles/album_dict.pickle', 'wb') as f:
#     pickle.dump(album_dict, f)

#### READING ####
## The with block closes the file on exit, so no explicit close() is needed
with open('Pickles/album_dict.pickle', 'rb') as f:
    album_dict = pickle.load(f)
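Toggling the write and read halves by commenting them out works, but the two could also be folded into one helper that loads the pickle when it exists and otherwise builds and saves it. The following is a minimal sketch of that idea. `load_or_build` and `fake_scrape` are hypothetical names, and the demo writes to a temporary directory rather than the real `Pickles/` folder.

```python
import os
import pickle
import tempfile


def load_or_build(path, builder):
    """Return the object pickled at path, building and saving it first if missing."""
    if os.path.exists(path):
        with open(path, 'rb') as f:
            return pickle.load(f)  # only unpickle files you created yourself
    obj = builder()
    os.makedirs(os.path.dirname(path) or '.', exist_ok=True)
    with open(path, 'wb') as f:
        pickle.dump(obj, f)
    return obj


# Demo with a throwaway directory and a stand-in for the scraping step
calls = []


def fake_scrape():
    calls.append(1)
    return {'Well Done 4': {'Bang Out': '/lyrics/tyga/bangout.html'}}


with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, 'Pickles', 'album_dict.pickle')
    first = load_or_build(demo_path, fake_scrape)   # builds and saves
    second = load_or_build(demo_path, fake_scrape)  # loads from disk
```

The builder runs only on the first call, so re-running the notebook would not trigger a second round of requests.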