In [1]:
%%capture
%run 'lyrics_analysis.ipynb'

# What does it take to win SOTY? Lyrical analysis of Grammy Song of the Year winners and nominees

### Nour Elkassabany
### COMM313 SPRING2020

## INTRODUCTION

Despite perennial debates over the Grammys’ relevance and objections such as #GrammysTooWhite, the Recording Academy’s awards remain one of the most legible ways of signaling achievement or recognition in the music industry.

Because so few of the 84 awards handed out (as of 2017) are part of the televised program and so many inherently subjective factors contribute to song selection, like interpretation of “excellence” and “prominence” and the particularities of voting members’ tastes, the Grammy Awards are something of a black box. The aim of this project, then, is to see what insights we might find by taking a closer look at one of the major awards, Song of the Year (SOTY).

The Song of the Year award “recognizes the songwriter(s) who wrote and composed the song.” It is distinct from Record of the Year, which “recognizes the artist’s performance as well as the overall contributions of the producer(s), recording engineer(s), and/or mixer(s), if other than the performing artist” (grammys.com). Along with the awards for Album of the Year and Best New Artist, these make up the “Big Four” at the Grammys each year. 

Using the body of songs that have been nominated for the Song of the Year award from the first award year in 1959 to the present day, I will explore whether lyrical patterns in the winners imply a certain “formula” or strategy of songwriting to create a winning song. Thus, the primary dimension of analysis is award status (winner or nominee).


## COLLECTING THE CORPUS
Information about the recipients is contained in a single table on the Wikipedia page for the Song of the Year award (https://en.wikipedia.org/wiki/Grammy_Award_for_Song_of_the_Year). I used web scraping to obtain the award year, names of the winner(s) of the award, titles of songs, and performing artists. A similar process was used for the nominees, for which there was a column within the recipients table containing nominees, title of song, and performing artist. This information was stored in two lists of dictionaries (one for winners, one for nominees), where each dictionary in the list contains the relevant details for one song.
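
As a rough illustration of the list-of-dictionaries structure described above, the sketch below parses a small hand-written HTML table with Python's standard-library `html.parser`. The sample markup, the column order, and the key names (`year`, `winners`, `title`, `artist`) are assumptions made for illustration. The actual Wikipedia table is more complex, and this is not the scraping code used in the project.

```python
from html.parser import HTMLParser


class TableRows(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []     # finished rows, each a list of cell strings
        self._row = None   # cells of the row currently being read
        self._cell = None  # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Hand-written stand-in for the recipients table (assumed layout).
sample_html = """
<table>
  <tr><th>Year</th><th>Winner(s)</th><th>Work</th><th>Performing artist(s)</th></tr>
  <tr><td>1984</td><td>Sting</td><td>"Every Breath You Take"</td><td>The Police</td></tr>
  <tr><td>1989</td><td>Bobby McFerrin</td><td>"Don't Worry, Be Happy"</td><td>Bobby McFerrin</td></tr>
</table>
"""

parser = TableRows()
parser.feed(sample_html)

keys = ["year", "winners", "title", "artist"]  # hypothetical key names
header, *body = parser.rows                    # first row holds the column headers
winners = [dict(zip(keys, row)) for row in body]
for song in winners:
    song["title"] = song["title"].strip('"')   # drop the quotation marks around titles

print(winners[0])
```

Each dictionary can then be passed on to the lyrics lookup step, one song at a time.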

The Genius.com API was then used to retrieve the lyrics for the songs. I addressed cases of Genius returning no results or incorrect results manually. Because the corpora were small, song lyrics were typically short, and lyrics usually contain repetition, mistakes were easy to detect visually. These cases were often born of discrepancies in how certain details were displayed on Wikipedia versus Genius, particularly for songs with featured artists, songs written for films or musicals, songs that were covers of earlier recordings, and instrumental tracks.

To fix lyrical errors more easily, I wrote a function that would isolate a desired song dictionary and obtain lyrics based on the details within that dictionary. While this function worked, it relied on the dictionary values being compatible with the Genius search. Incompatibility between the two was often the cause of the error in the first place, so I frequently needed to look up individual tracks to uncover the nature of the disconnect.
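
One way to reduce this kind of mismatch is to simplify the title and artist before searching. The sketch below is a hypothetical helper, not the function used in this project: the name `build_search_query`, the dictionary keys, and the cleaning rules (dropping parenthetical subtitles and featured artists) are all assumptions, and it makes no call to the Genius API.

```python
import re


def build_search_query(song):
    """Return a simplified 'title artist' search string for one song dictionary."""
    # Drop parenthetical subtitles such as "(Put a Ring on It)".
    title = re.sub(r"\s*\([^)]*\)", "", song["title"]).strip()
    # Keep only the lead artist when a featured artist is credited.
    artist = re.split(r"\s+(?:featuring|feat\.|ft\.)\s+", song["artist"],
                      maxsplit=1, flags=re.IGNORECASE)[0].strip()
    return "{} {}".format(title, artist)


song = {"title": "Single Ladies (Put a Ring on It)", "artist": "Beyoncé"}
print(build_search_query(song))
```

Rules like these would not cover every case listed above (film songs, covers, instrumentals), so some manual lookup would likely still be needed.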

Overall, the corpus contains 63 winning songs and 258 non-winning (nominated) songs.


## FINDINGS

### "Wordiness" -- Type-token ratios and part-of-speech tagging

##### Are winning songs utilizing a larger vocabulary?

In the same loop used for tokenizing lyrics and creating word/bigram frequency lists, I added a step to calculate each song's type-token ratio and store it in a list for each corpus, `win_ttr` and `nom_ttr`. Before averaging the values in each list, I removed the type-token ratios associated with instrumental songs. Within the song dictionaries, the value matched to `['lyrics']` was either `'none'` or `'Instrumental'` for these tracks, resulting in a type-token ratio of `100.0` where it should be undefined (a song with no lyrics has zero tokens, so the ratio would divide by zero).
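
A minimal sketch of this calculation is shown below. It assumes whitespace tokenization and lowercase matching of the placeholder values, which may differ from the tokenizer used in the project. The function name `type_token_ratio` and the toy lyrics are made up for illustration.

```python
def type_token_ratio(tokens):
    """Type-token ratio as a percentage, or None when there are no tokens."""
    if not tokens:
        return None
    return len(set(tokens)) / len(tokens) * 100


# Placeholder values that mark instrumental tracks, as described above.
NO_LYRICS = {"none", "instrumental"}

songs = [
    {"title": "Song A", "lyrics": "la la la love me do"},  # made-up lyrics
    {"title": "Song B", "lyrics": "Instrumental"},
]

ttrs = []
for song in songs:
    # Skip instrumental tracks so the placeholder is not scored as a one-word song.
    if song["lyrics"].strip().lower() in NO_LYRICS:
        continue
    tokens = song["lyrics"].lower().split()
    ttrs.append(type_token_ratio(tokens))

average_ttr = sum(ttrs) / len(ttrs)
print(round(average_ttr, 2))
```

Filtering before the ratio is computed avoids having to remove the spurious `100.0` values afterwards.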

In [2]:
print('The average type-token ratio among winning songs is {}.'.format(sum(win_ttr) / len(win_ttr)))
print('The average type-token ratio among nominated songs is {}.'.format(sum(nom_ttr) / len(nom_ttr)))

The average type-token ratio among winning songs is 40.784626607625476.
The average type-token ratio among nominated songs is 39.211339585603504.


The proximity of these values suggests that, on average, songs that won SOTY do not draw on a more varied vocabulary than the nominees. The next step I took was to see if different *kinds* of words were being used, applying part-of-speech tagging to each corpus. I chose to focus on verbs and adjectives because they might give insight into more action-oriented songs and songs containing more flowery, descriptive language. All percentages below are calculated from the nested list structures `win_tagged` and `nom_tagged`.
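
The percentages could be computed along the following lines. This is a sketch under assumptions: it treats each corpus as a list of songs, each song as a list of `(token, tag)` pairs, and the tags as Penn Treebank labels (verbs starting with `VB`, adjectives with `JJ`). The toy data is hand-tagged and the helper name `tag_percentage` is hypothetical.

```python
def tag_percentage(tagged_songs, tag_prefix):
    """Percent of all tokens whose part-of-speech tag starts with tag_prefix."""
    total = 0
    matches = 0
    for song in tagged_songs:
        for token, tag in song:
            total += 1
            if tag.startswith(tag_prefix):
                matches += 1
    return matches / total * 100 if total else 0.0


# Hand-tagged toy data standing in for a nested list such as `win_tagged`.
toy_tagged = [
    [("we", "PRP"), ("are", "VBP"), ("the", "DT"), ("world", "NN")],
    [("every", "DT"), ("little", "JJ"), ("thing", "NN"), ("changes", "VBZ")],
]

verb_perc = tag_percentage(toy_tagged, "VB")  # 2 of 8 tokens
adj_perc = tag_percentage(toy_tagged, "JJ")   # 1 of 8 tokens
print(verb_perc, adj_perc)
```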

In [3]:
print('{} percent of the tokens in all winning songs were verbs'.format(win_verb_perc))
print('{} percent of the tokens in all winning songs were adjectives'.format(win_adj_perc))

20.969660916121356 percent of the tokens in all winning songs were verbs
8.869720404521118 percent of the tokens in all winning songs were adjectives


In [4]:
print('{} percent of the tokens in all nominated songs were verbs'.format(nom_verb_perc))
print('{} percent of the tokens in all nominated songs were adjectives'.format(nom_adj_perc))

21.864803606565545 percent of the tokens in all nominated songs were verbs
8.575556053699968 percent of the tokens in all nominated songs were adjectives


These similar proportions of verbs and adjectives across the two sets of text suggest that neither set of songs seems to express any more "doing" or "describing" in their lyrics than the other. In fact, the two sets of songs seem to employ many of the same verbs and adjectives, displayed in the matched verb and adjective frequency tables below.

In [5]:
### displaying 30 most frequent VERBS in winning and nominated songs, respectively
for idx,pair in enumerate(top_win_verb):
    print('{:<10}{:<10}{:<15}{:<10}'.format(pair[0], pair[1], top_nom_verb[idx][0], top_nom_verb[idx][1]))

be        153       be             703       
i         136       i              591       
are       109       is             557       
know      90        know           516       
make      88        do             372       
is        76        got            302       
go        67        was            277       
got       65        say            259       
dont      64        go             258       
do        55        see            258       
were      51        love           242       
have      49        are            235       
see       48        dont           227       
love      47        want           218       
get       44        let            201       
say       38        have           200       
im        38        make           200       
want      36        get            199       
take      33        tell           199       
tell      33        were           195       
had       33        been           195       
was       32        take          

In [6]:
### displaying 30 most common ADJECTIVES in winning and nominated songs, respectively

for idx,pair in enumerate(top_win_adj):
    print('{:<10}{:<10}{:<15}{:<10}'.format(pair[0], pair[1], top_nom_adj[idx][0], top_nom_adj[idx][1]))

i         142       i              819       
oh        65        oh             201       
good      37        im             147       
ive       34        ive            130       
black     33        little         104       
happy     31        good           99        
im        28        verse          92        
ill       24        new            87        
whole     23        bad            71        
beautiful 22        more           70        
own       21        old            69        
new       20        better         68        
better    20        ill            66        
single    20        dont           63        
young     18        hard           63        
bad       18        free           60        
little    17        best           60        
verse     17        true           59        
same      16        long           58        
true      16        big            57        
more      15        cant           53        
dont      15        alive         

### Searching for deeper entry points
##### Key terms
A first look at the data showed that the two sets of songs are similar in several respects. The next step was to zoom in, using keyness analysis to better choose words for closer examination. The table below displays key terms in the winning set, with the nominated set as the reference corpus (RC).

In [7]:
calculate_keyness(win_word_freq, nom_word_freq, top = 20) 

WORD                     Corpus Freq. RC Freq.  Keyness
worry                    33        10        72.914
black                    33        16        59.931
every                    78        112       58.899
watching                 23        9         46.074
happy                    31        23        43.812
are                      109       235       42.034
world                    63        107       37.511
hmm                      17        6         35.509
rolling                  17        7         33.317
songs                    27        24        33.307
whole                    23        19        30.057
carry                    17        9         29.511
single                   20        14        29.407
let's                    26        26        28.997
make                     88        208       27.485
making                   13        5         26.231
write                    23        23        25.651
man                      48        88        25.228
rose     

Looking at the keyness table, some of the most key terms in the corpus aren't distributed very widely across all the songs. Take the first word, `worry`, for example. It is associated with the 1989 winner, "Don't WORRY Be Happy" by Bobby McFerrin. Something similar happens for other key words like `rolling` (2012 winner "Rolling in the Deep" by Adele) and `single` (2010 winner "Single Ladies (Put a Ring on It)" by Beyoncé). I suspect these words appear as key because they are repeated throughout a single song, as their presence in the title implies, not because they signal a particular kind of songwriting.
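
For readers unfamiliar with keyness, the sketch below shows the log-likelihood statistic commonly used for it in corpus linguistics. The project's own `calculate_keyness` and `log_likelihood` helpers are not shown in this notebook, so whether they match this formulation exactly is an assumption. The sign convention (negative when the word is relatively rarer in the first corpus) is inferred from the negative values reported later in this notebook. The function name and the counts are made up.

```python
import math


def signed_log_likelihood(freq_a, size_a, freq_b, size_b):
    """Log-likelihood keyness; negative when the word is relatively rarer in corpus A."""
    total_freq = freq_a + freq_b
    total_size = size_a + size_b
    # Expected counts if the word were used at the same rate in both corpora.
    expected_a = size_a * total_freq / total_size
    expected_b = size_b * total_freq / total_size
    ll = 0.0
    if freq_a > 0:
        ll += freq_a * math.log(freq_a / expected_a)
    if freq_b > 0:
        ll += freq_b * math.log(freq_b / expected_b)
    ll *= 2
    # Sign the score by comparing relative frequencies in the two corpora.
    if freq_a / size_a < freq_b / size_b:
        ll = -ll
    return ll


# Made-up counts: a word used 30 times in a 1,000-token corpus
# versus 10 times in a 2,000-token corpus.
print(round(signed_log_likelihood(30, 1000, 10, 2000), 2))
```

Because the statistic depends only on total counts, a word repeated many times in one song scores the same as a word spread evenly across many songs, which is the effect described above.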

##### Normalized frequencies

I first created a normalized frequency chart (occurrences per 1,000 tokens) for the 50 most common words among the winning songs. The same effect seen in the keyness table appeared again. The abbreviated display below isolates these points of interest.

In [8]:
for word, freq in win_sel: 
    win = freq
    nom = nom_word_freq.get(word,0)
    norm_win = win/win_size * 1000
    norm_nom = nom/nom_size * 1000

    LL = 0 if nom==0 else log_likelihood(win, win_size, nom, nom_size)
    print(row_template.format(word, win, norm_win, nom, norm_nom, LL))

we             161     9.58		480       5.98	 24.66
are            109     6.48		235       2.93	 42.03
world          63      3.75		107       1.33	 37.51
make           88      5.23		208       2.59	 27.48
every          78      4.64		112       1.39	 58.90


The words with LL values that suggest occurrence in one corpus over another are associated with the title of a song and likely appear in the chorus as well. These words are not being used in a particularly meaningful way, but are being used so often that they have this effect. This is especially plausible, considering that the corpus of winning songs is so small, with only 63 entries. Let's look closer:

`we`, `are`, `world`, and `make` come from the 1986 winner, "We Are The World" performed by USA for Africa. The lyrics of the chorus go: `We are the world, We are the children, We are the ones to make a brighter day...`

`every` is likely coming from the 1984 winner, "Every Breath You Take" performed by The Police. Most of the lines in this song contain the word `every` two times.

Below, I isolated the counts of these words in their associated song from the rest of the winners.

In [9]:
usa_africa_display = ['we', 'are', 'world', 'make']
for word in usa_africa_display:
    print('"{}" occurs {} times in the song'.format(word,usa_africa.get(word)))
    percentage = (usa_africa.get(word)/win_word_freq.get(word))*100
    print('this comprises {:.5} percent of occurrences across all winning songs\n'.format(percentage))

"we" occurs 40 times in the song
this comprises 24.845 percent of occurrences across all winning songs

"are" occurs 35 times in the song
this comprises 32.11 percent of occurrences across all winning songs

"world" occurs 12 times in the song
this comprises 19.048 percent of occurrences across all winning songs

"make" occurs 23 times in the song
this comprises 26.136 percent of occurrences across all winning songs



In [10]:
print('"{}" occurs {} times in the song'.format('every', breath.get('every')))
percentage = (breath.get('every') / win_word_freq.get('every'))*100
print('this comprises {:.5} percent of occurrences across all winning songs'.format(percentage))

"every" occurs 52 times in the song
this comprises 66.667 percent of occurrences across all winning songs


Because these percentages suggest that some "words of interest" are distributed unevenly across the corpus of winning songs, I question whether this impact accurately reflects how winning songs differ from nominees. 
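
The same check can be generalized to any word by asking what share of its occurrences comes from the single song that uses it most. The sketch below assumes one `Counter` of word frequencies per song. The helper name `max_single_song_share` and the toy lyrics are hypothetical, and this is an illustration rather than a step taken in the analysis.

```python
from collections import Counter


def max_single_song_share(word, song_counters):
    """Largest percentage of a word's corpus-wide occurrences found in one song."""
    counts = [counter.get(word, 0) for counter in song_counters]
    total = sum(counts)
    if total == 0:
        return 0.0
    return max(counts) / total * 100


# Made-up token lists standing in for per-song word frequency Counters.
toy_songs = [
    Counter("every breath every move every step".split()),
    Counter("every day we love".split()),
    Counter("we are the world".split()),
]

print(max_single_song_share("every", toy_songs))  # 3 of 4 occurrences -> 75.0
print(max_single_song_share("we", toy_songs))     # 1 of 2 occurrences -> 50.0
```

A high share flags a word whose frequency is driven by one song rather than by the corpus as a whole.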

I isolated another set of terms that I thought might be interesting to investigate, related to love and time, themes from which writers might draw inspiration. Because they are common themes, the similar normalized frequencies come as no surprise.

In [11]:
for word, freq in win_entries:  # (word, frequency) pairs for the selected love/time words
    win = freq
    nom = nom_word_freq.get(word,0)
    norm_win = win/win_size * 1000
    norm_nom = nom/nom_size * 1000

    LL = 0 if nom==0 else log_likelihood(win, win_size, nom, nom_size)
    print(row_template.format(word, win, norm_win, nom, norm_nom, LL))

love           111     6.60		657       8.18	-4.59
heart          59      3.51		193       2.40	 6.04
always         26      1.55		110       1.37	 0.30
never          66      3.93		229       2.85	 4.93
forever        10      0.59		53        0.66	-0.09


#### Key words in context, sentiment scores
Because the results of the normalized frequency suggested that these words occurred at similar rates in the two sets of text, I searched through concordances for the above `love` and `time` words to again see if they are employed differently. The patterns described above still held true. Take something like Tina Turner's famous song and 1985 winner, "What's Love Got To Do With It" for example. Among the collocates three words before or after `love`, the words in the title (and chorus) of that song have a clear presence, alongside other words that make up different iterations of "I love you" expressed in lyrics. 

In [12]:
win_love_colls.most_common(20)

[('I', 31),
 ('you', 29),
 ('with', 20),
 ('love', 16),
 ('of', 15),
 ("What's", 15),
 ('in', 14),
 ('to', 14),
 ('that', 11),
 ('and', 10),
 ('the', 10),
 ('got', 10),
 ('do,', 10),
 ('we', 10),
 ('[Chorus]', 9),
 ('And', 8),
 ('second-hand', 8),
 ('way', 7),
 ('out', 7),
 ('your', 6)]
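
Collocate counts like those above can be gathered with a sliding window around each occurrence of the node word. The sketch below assumes whitespace tokenization (consistent with tokens such as `do,` and `[Chorus]` in the output) and a span of three tokens on either side. The function name `window_collocates` is hypothetical and this is not necessarily how `win_love_colls` was built.

```python
from collections import Counter


def window_collocates(tokens, node, span=3):
    """Count tokens within `span` positions before or after each occurrence of `node`."""
    collocates = Counter()
    for idx, token in enumerate(tokens):
        if token.lower() != node:
            continue
        start = max(0, idx - span)
        # Tokens before the node plus tokens after it, excluding the node itself.
        window = tokens[start:idx] + tokens[idx + 1:idx + 1 + span]
        collocates.update(window)
    return collocates


line = "What's love got to do, got to do with it"
colls = window_collocates(line.split(), "love")
print(colls.most_common())
```

Because punctuation and section labels stay attached to tokens under this tokenization, `do,` and `do` are counted separately.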

I considered sentiment scores as another way to get at this question of how topics were being discussed in the songs. Again, the two sets of songs returned similar averages. However, I am wary of relying on sentiment scores to make any conclusions about the songs because the presence of words does not reflect how they're used or how they relate to the "message" or valence of the song as a whole. 

In [13]:
print('The average compound sentiment score for winning songs was {:.4}'.format(win_sent_score))
print('The average compound sentiment score for nominated songs was {:.4}'.format(nom_sent_score))

The average compound sentiment score for winning songs was 0.5192
The average compound sentiment score for nominated songs was 0.4601


## DISCUSSION
This analysis is inherently limited in that it only captures patterns related to lyrical content. And what it *can* capture of lyrical content does not encompass the particularities of language use, such as irony. The similarities between these sets of songs do not indicate a “failed” analysis. Rather, they might suggest that final award status is influenced by other factors, such as the cultural context or commercial success at the time of release, which were both outside the scope of this investigation. And for someone with an affinity for often wordless ambient and electronic music, disregarding song lyrics in evaluations of quality is not a hard sell!

The specificity of the research question required award status as the primary dimension of analysis. However, this emphasis on the winning category fails to acknowledge that any song included in the final group of nominees (independent of award status) was already selected over numerous other entries. The intermediate tiers of selection from the date of release leading up to the award ceremony may reduce variety as similar qualities in releases are evaluated by Recording Academy members.

Adjusting the research question to expand the corpus of texts might yield more meaningful differences. A combined in-group of SOTY winners and nominees could be compared to the songs at the top of the Billboard charts for corresponding years to see if there is a threshold for recognition, to assess the level of overlap, etc. 

Additional expansions for future research might look into the overlap between SOTY and ROTY categories in an attempt to see if songs are a “complete package” or if they can be distilled into their constituent parts (writing, production, engineering, and so on). Conducting another SOTY analysis, one could examine the idea of singular “artistic genius” through the winners of the awards, as opposed to the performing artist. This might spotlight more prolific songwriters who receive recognition for songs written by them but recorded by numerous artists. It might also highlight the degree to which Grammy-recognized artists play a role in crafting their songs. I considered this for my analysis, but failed to address naming practices expressed in credits. Credits often name individuals with their “actual” name, which does not easily align with artists who perform in a group or under a different name. 
