# Comparing expected proportion of rhymes in a song to the computed proportion

In this notebook, we will use the function rhymes() from 
the evaluation submodule of lyrics_analysis to compute 
the proportion of rhyming lines.

We will use the song "Ode to the Mets" by The Strokes
as our example. The lyrics can be found in the
file example1.json in the data directory. The song 
itself doesn't contain a large number of rhymes.

Here are the pairs of rhyming words at the ends of lines,
as I identified them by hand. I included not only the rhymes
that follow precise rules, but also those that 'feel' as if
they rhyme:

(do, you), (good, should), (back, attack),
(mind, fine), (alone, alone), (alone, alone), 
(heart, right), (forgotten, bottom).

These 8 pairs cover 16 of the song's 43 lines, so the
expected proportion of rhyming lines is 16 / 43 ≈ 0.37.
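
The arithmetic behind this figure can be reproduced in a few lines. This is only a sketch: the pair list is copied from the text, the line count of 43 is taken from the fraction above, and the variable names are illustrative rather than part of lyrics_analysis.

```python
# Hand-annotated rhyme pairs, copied from the list in the text.
expected_pairs = [
    ("do", "you"), ("good", "should"), ("back", "attack"),
    ("mind", "fine"), ("alone", "alone"), ("alone", "alone"),
    ("heart", "right"), ("forgotten", "bottom"),
]
total_lines = 43  # number of lines in the song, as used in the text

# Each pair contributes two rhyming lines.
expected_rhyming_lines = 2 * len(expected_pairs)
expected_proportion = expected_rhyming_lines / total_lines

print(expected_rhyming_lines, round(expected_proportion, 2))
```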

Let's see how many rhymes the function finds:

In [1]:
# load the song
import json
from lyrics_analysis import Song

with open('../data/example1.json') as file:
    song_json = json.load(file)
song = Song(song_json["lyrics"], song_json["genre"], song_json["artist"])


In [2]:
from lyrics_analysis.evaluation import rhymes

rhymes(song.lyrics, rhyme_level=2)

0.2558139534883721

The function reports 11 rhyming lines out of 43 (about 0.26),
roughly two thirds of the 16 lines I marked by hand. Let's
define a new function that does the same thing as rhymes,
except that it also prints the index of each rhyming line,
so we can see where it disagrees.

In [3]:
from lyrics_analysis import helpers
import nltk

# CMU Pronouncing Dictionary; run nltk.download('cmudict') once if it is missing
ARPABET = nltk.corpus.cmudict.dict()
def rhymes_debug(lyrics, rhyme_level=2, max_distance=2, arpabet=ARPABET):
    last_words = [word.lower() for word in helpers._get_last_words(lyrics)]
    base_last_phonemes = helpers._get_last_n_phonemes(last_words, rhyme_level, arpabet)
    last_phonemes = []
    for base_pron in base_last_phonemes:
        for pron in helpers._get_alternative_pronunciations(base_pron):
            last_phonemes.append(pron)

    # store all rhyming lines in a dictionary
    rhyming_lines = {}
    for i, prons in enumerate(last_phonemes):
        for pron in prons:
            rhyming_lines[pron] = rhyming_lines.get(pron, [])
            rhyming_lines[pron].append(i)

    # count each line that rhymes with another line at most max_distance lines away
    rhyme_count = 0
    for i, prons in enumerate(last_phonemes):
        considered_lines = [i - d for d in range(1, max_distance + 1)] + \
                           [i + d for d in range(1, max_distance + 1)]
        for pron in prons:
            if set(rhyming_lines[pron]).intersection(set(considered_lines)):
                # added for debugging: print the index of each line
                # where a rhyme was found
                print(i)
                rhyme_count += 1
                break

    # calculate the proportion of rhyming lines
    return rhyme_count / len(last_words)

Now, use the function on the song lyrics.

In [4]:
rhymes_debug(song.lyrics)

3
5
6
7
8
9
19
20
21
22
24


0.2558139534883721

The printed numbers are zero-based line indices. The rhyming
words found on those lines were the following:
you, you, good, should, back, attack, alone, alone, alone,
alone, phone.

The rhymes that were not detected were (do, you),
(mind, fine), (heart, right) and (forgotten, bottom).
This was expected, since they only really rhyme when
sung.
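
One plausible way to read these misses is to compare the final phonemes of each pair directly. The sketch below assumes the check amounts to a plain comparison of the last rhyme_level phonemes, which may not match exactly what the helpers in lyrics_analysis do. The pronunciations are written by hand in CMUdict style and should be treated as an assumption, and `same_ending` is a hypothetical name.

```python
# Pronunciations written by hand in CMUdict (ARPABET) style.
# They are an assumption for illustration; look the words up in
# nltk.corpus.cmudict for the authoritative entries.
TOY_PRONUNCIATIONS = {
    "good": ["G", "UH1", "D"],
    "should": ["SH", "UH1", "D"],
    "back": ["B", "AE1", "K"],
    "attack": ["AH0", "T", "AE1", "K"],
    "alone": ["AH0", "L", "OW1", "N"],
    "phone": ["F", "OW1", "N"],
    "do": ["D", "UW1"],
    "you": ["Y", "UW1"],
    "mind": ["M", "AY1", "N", "D"],
    "fine": ["F", "AY1", "N"],
    "heart": ["HH", "AA1", "R", "T"],
    "right": ["R", "AY1", "T"],
    "forgotten": ["F", "ER0", "G", "AA1", "T", "AH0", "N"],
    "bottom": ["B", "AA1", "T", "AH0", "M"],
}


def same_ending(word_a, word_b, rhyme_level=2):
    """Return True if the last rhyme_level phonemes of both words match."""
    tail_a = TOY_PRONUNCIATIONS[word_a][-rhyme_level:]
    tail_b = TOY_PRONUNCIATIONS[word_b][-rhyme_level:]
    return tail_a == tail_b


pairs = [("good", "should"), ("alone", "phone"), ("do", "you"),
         ("mind", "fine"), ("heart", "right"), ("forgotten", "bottom")]
for pair in pairs:
    print(pair, same_ending(*pair))
```

Under these assumptions, the first two pairs share their last two phonemes and the other four do not, which is consistent with the output above. With rhyme_level=1 the toy check would also accept (do, you), which suggests that a lower rhyme level trades precision for recall.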

On the other hand, it detected the word phone (which rhymes
with alone). A human would probably not mark this as 
a rhyme, as the words are far from each other and,
additionally, there is a musical interlude between the
verses.
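
The set of lines that rhymes_debug compares against a given line follows directly from max_distance. The sketch below reproduces only that index computation from the function above; `neighbour_indices` is an illustrative name, not part of lyrics_analysis.

```python
def neighbour_indices(i, max_distance=2):
    """Indices of lines within max_distance of line i, excluding i itself."""
    before = [i - d for d in range(1, max_distance + 1)]
    after = [i + d for d in range(1, max_distance + 1)]
    return sorted(before + after)


# Line 24 is the last index printed by rhymes_debug above.
print(neighbour_indices(24))     # [22, 23, 25, 26]
print(neighbour_indices(24, 1))  # [23, 25]
```

Line 22 also appears in the printed output, so it lies inside the default window of line 24. If the match on line 24 is indeed with line 22 (an inference from the output, not something the function reports), calling the function with max_distance=1 would exclude it, at the cost of also missing rhymes in alternating schemes such as ABAB.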

In conclusion, the function performs decently,
detecting most proper rhymes.
