# Visualisation - *grey poupon*

Data for the lyric 'grey poupon' was collected with the scraper.

This notebook explores that data.

In [268]:
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
from ast import literal_eval
from fuzzywuzzy import process
import re

---

In [296]:
def drop_breaks(lyrics):
    '''
    Strip section markers, e.g. [Verse 1], from a line of lyrics.
    A line holding only a marker becomes an empty string.
    '''
    return re.sub(r'\[(.*?)\]', '', lyrics)

In [300]:
def clean_sections(lyrics):
    '''
    Strip breaks and clean empty elements from lyrics lists
    '''
    return list(filter(None, [drop_breaks(x) for x in lyrics]))
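
As a quick check of the two cleaning helpers above, the sketch below runs them on a small made-up list. The `sample` data is hypothetical, not taken from the scraped CSV, and the helpers are repeated so the snippet runs on its own.

```python
import re

def drop_breaks(line):
    # Remove anything in square brackets, e.g. "[Verse 1]"
    return re.sub(r'\[(.*?)\]', '', line)

def clean_sections(lyrics):
    # filter(None, ...) drops the empty strings left behind by marker-only lines
    return list(filter(None, [drop_breaks(x) for x in lyrics]))

# Hypothetical input: two section markers, one empty line, two real lines
sample = ['[Verse 1]', 'first line', '', '[Chorus]', 'hook line']
cleaned = clean_sections(sample)
print(cleaned)
```

Only the two real lines should remain. Note that a marker sharing a line with lyrics (say `'[Hook] yeah'`) would leave a leading space behind, since nothing here strips whitespace.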

In [304]:
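def import_lyrics(lyric):
    '''
    Load '<lyric>.csv' and return a DataFrame with parsed years and cleaned lyrics lists
    '''
    df = pd.read_csv('{}.csv'.format(lyric), index_col=0)
    df['year'] = pd.to_datetime(df['year'])
    # lyrics are stored as stringified lists, so parse them back into Python lists
    df['lyrics'] = df['lyrics'].apply(literal_eval)
    df['lyrics'] = df['lyrics'].apply(clean_sections)

    return df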
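
The `literal_eval` step is needed because a list written to CSV comes back as a single string. A minimal sketch of that round trip, assuming a cell value shaped like the hypothetical `raw` below:

```python
from ast import literal_eval

# Hypothetical cell value, as it might look after a round trip through CSV
raw = "['[Verse 1]', 'line one', 'line two']"

# literal_eval safely parses Python literals (lists, strings, numbers)
# without executing arbitrary code, unlike eval
parsed = literal_eval(raw)

print(type(raw).__name__, '->', type(parsed).__name__)
print(parsed[1])
```

After parsing, the value is a real list again, so it can be indexed line by line and passed to `clean_sections`.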

In [307]:
def rhyming_pattern(string):
    '''
    Find the lines either side of each line that fuzzy-matches the global `lyric`
    '''
    surrounding_lyrics = []
    for i, v in enumerate(string):
        if process.default_scorer(lyric.lower(), v.lower()) > 70:
            # slice rather than index, so a match on the first or last line
            # keeps whatever neighbours exist instead of wrapping round or raising
            context = string[max(i - 1, 0):i + 2]
            surrounding_lyrics.append([x.lower() for x in context])

    return surrounding_lyrics
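
The context-window idea can be sketched without the fuzzy-matching dependency. The version below is an illustration, not the notebook's method: `surrounding_lines` is a hypothetical name, the phrase is passed in explicitly rather than read from a global, and a plain case-insensitive substring test stands in for the fuzzy scorer.

```python
def surrounding_lines(lines, phrase):
    """Return the previous, matching and next line for each line containing phrase."""
    found = []
    for i, line in enumerate(lines):
        # Stand-in for the fuzzy score threshold: simple substring match
        if phrase.lower() in line.lower():
            # Clamp the start at 0 so a match on the first line does not
            # wrap round to the end of the list via a negative index
            start = max(i - 1, 0)
            found.append([s.lower() for s in lines[start:i + 2]])
    return found

# Hypothetical song with a match on the first line
song = ['Pass the Grey Poupon', 'second line', 'third line']
context = surrounding_lines(song, 'grey poupon')
print(context)
```

Slicing handles both ends of the song: a match on the first or last line simply yields a shorter list, so every entry in the result has the same type.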

In [None]:
def create_report(lyric):
    '''
    Generate report on a certain lyric:
    - Show when first used and most recently used
    '''
    df = import_lyrics(lyric)
    # idxmin/idxmax return index labels, so look the rows up with .loc
    first = df.loc[df['year'].idxmin()]
    latest = df.loc[df['year'].idxmax()]
    print('''The lyric "{}" first appeared in {} in the song: {} - {}.\n\nMost recently it appeared in {} in the song: {} - {}.'''\
          .format(lyric,
                  first['year'].year, first['title'], first['artist'],
                  latest['year'].year, latest['title'], latest['artist']))

---

In [324]:
lyric = 'grey poupon'

In [326]:
create_report(lyric)

The lyric "grey poupon" first appeared in 1992 in the song: East Coast - Das EFX.

Most recently it appeared in 2020 in the song: Royce Da 5'9" Freestyle | L.A. Leakers - Freestyle #100 - Royce da 5'9".


In [327]:
gp = import_lyrics(lyric)
gp['lyrics'].apply(rhyming_pattern)[0]

[["i stay modest 'bout it, ayy, she elaborate it, ayy",
  'this that grey poupon, that evian, that ted talk, ayy',
  'watch my soul speak, you let the meds talk, ayy']]