# Bigram Markov Chain Model

Now you'll build a more complex Markov chain that uses the last _two_ words (a bigram) to predict the next word. Your dict `chain` should now map a _tuple_ of two words to the list of words that appear after it. For example, one entry of this dict might be

```
chain = {
    ("it", "is"): ["the", "the", "not", "a", "a", "not", "the"],
    ...
}
```
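
Because each follower list keeps duplicates, it implicitly stores the transition probabilities out of that bigram: sampling uniformly from the list picks each word in proportion to how often it followed the bigram. A minimal sketch using the example entry above (the variable names `followers`, `counts`, and `probs` are illustrative, not part of the assignment):

```python
from collections import Counter

# Followers of the bigram ("it", "is"), copied from the example entry.
followers = ["the", "the", "not", "a", "a", "not", "the"]

# Count how often each word follows the bigram.
counts = Counter(followers)
total = len(followers)

# Relative frequency = probability of that word under uniform sampling from the list.
probs = {word: count / total for word, count in counts.items()}

for word, p in probs.items():
    print(f"P({word!r} | 'it', 'is') = {p:.3f}")
```

Here `"the"` appears 3 times out of 7, so `random.choice(followers)` would return it with probability 3/7.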

As before, also include tags that mark the beginning and end of a song, as well as line breaks. That is, a tuple might contain tags like `"<START>"`, `"<END>"`, and `"<N>"`, in addition to regular words. Because no words precede the start of a song, the first key pairs `None` with `"<START>"`. So if the song starts with the line "Is this the real life?" and ends with the line "Nothing really matters to me.", you would have a dictionary that looks like

```
chain = {
    (None, "<START>"): ["Is", ...],
    ("<START>", "Is"): ["this", ...],
    ("Is", "this"): ["the", ...],
    ("this", "the"): ["real", ...],
    ("the", "real"): ["life?", ...],
    ("real", "life?"): ["<N>", ...],
    ("<N>", "Nothing"): ["really", ...],
    ("Nothing", "really"): ["matters", ...],
    ("really", "matters"): ["to", ...],
    ("matters", "to"): ["me.", ...],
    ("to", "me."): ["<END>", ...],
    ...
}
```
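
One way to see where these keys come from is to slide a window of width three over the tagged token list: the first two tokens form the key and the third is the follower. The sketch below assumes, for illustration only, a song made of just the two quoted lines; `song`, `tokens`, and `triples` are hypothetical names.

```python
# Assumption: a two-line song built from the first and last lines quoted above.
song = "Is this the real life?\nNothing really matters to me."

# Tag line breaks, then pad with the start state and the end marker.
tokens = [None, "<START>"] + song.replace("\n", " <N> ").split() + ["<END>"]

# Each triple is (first word of bigram, second word of bigram, follower).
triples = list(zip(tokens, tokens[1:], tokens[2:]))

for first, second, follower in triples:
    print((first, second), "->", follower)
```

Appending each follower to the list stored under its `(first, second)` key would yield a dict of the shape shown above.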

In [7]:
def train_markov_chain(lyrics):
    """
    Args:
      - lyrics: a list of strings, where each string represents
                the lyrics of one song by an artist.
    
    Returns:
      A dict that maps a tuple of 2 words ("bigram") to a list of
      words that follow that bigram, representing the Markov
      chain trained on the lyrics.
    """
    
    # Initialize the beginning of our chain.
    chain = {
        (None, "<START>"): []
    }
    
    for lyric in lyrics:
        # Replace newline characters with our tag.
        lyric_newlines = lyric.replace('\n', ' <N> ')
        # Create a tuple representing the most recent (current) bigram.
        last_2 = (None, "<START>")
        for word in lyric_newlines.split():
            # Add the word as one that follows the current bigram.
            chain[last_2].append(word)
            # Shift the current bigram to account for the newly added word.
            last_2 = (last_2[1], word)
            if last_2 not in chain:
                chain[last_2] = []
        chain[last_2].append("<END>")
    
    return chain

In [8]:
# Load the pickled lyrics object that we created earlier.
import pickle
with open("lyrics.pkl", "rb") as f:
    lyrics = pickle.load(f)

# Train a Markov Chain over all of Logic's lyrics.
chain = train_markov_chain(lyrics)

# Examine the words that tend to start a song (i.e. words following the <START> tag)
print(chain[(None, "<START>")][:20])

["I've", "I've", '[Hook:', 'If', 'If', '<END>', '[Woman', "Let's", 'I', 'I', '<END>', 'Hold', 'Hey', 'Hey', 'Everything', 'Everything', '<END>', '<END>', 'Jesus,', 'Jesus,']


In [13]:
# Count the distinct bigram states in the chain.
print(len(chain))

26775


Now, let's generate new lyrics using the Markov chain you constructed above. To do this, we'll begin at the `(None, "<START>")` state and randomly sample a word from the list of first words. Then, we'll repeatedly sample the next word from the list of words that appeared after the current _bigram_ (the last two words generated) in the training data. We will continue until we sample the `"<END>"` tag. This will give us the complete lyrics of a randomly generated song!

In [9]:
import random

def generate_new_lyrics(chain):
    """
    Args:
      - chain: a dict representing the Markov chain,
               such as one generated by train_markov_chain()
    
    Returns:
      A string representing the randomly generated song.
    """
    
    # a list for storing the generated words
    words = []
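    # Generate the first word from the start state.
    word = random.choice(chain[(None, "<START>")])
    words.append(word)
    
    # The current bigram now ends with the word we just generated.
    # (Restarting from (None, "<START>") here would sample the first word twice.)
    last_2 = ("<START>", word)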
    while words[-1] != "<END>":
        # Generate the next word.
        word = random.choice(chain[last_2])
        words.append(word)
        # Shift the current bigram to account for the newly added word.
        last_2 = (last_2[1], words[-1])
    
    # Join the words together into a string with line breaks.
    lyrics = " ".join(words[:-1])
    return "\n".join(lyrics.split("<N>"))
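
The sampling walk can be sanity-checked on a tiny hand-built chain in which every state has exactly one follower, so the result is deterministic. The sketch below is standalone and only illustrative: `toy_chain`, `walk`, and the `max_words` safety cap are hypothetical and not part of the assignment.

```python
import random

# Hypothetical toy chain: each state has a single follower, so the walk is deterministic.
toy_chain = {
    (None, "<START>"): ["hello"],
    ("<START>", "hello"): ["world"],
    ("hello", "world"): ["<N>"],
    ("world", "<N>"): ["goodbye"],
    ("<N>", "goodbye"): ["<END>"],
}

def walk(chain, max_words=50):
    """Follow the chain from the start state until <END> or max_words words."""
    words = []
    state = (None, "<START>")
    while len(words) < max_words:
        word = random.choice(chain[state])
        if word == "<END>":
            break
        words.append(word)
        # Shift the bigram window forward by one word.
        state = (state[1], word)
    return words

result = walk(toy_chain)
print(result)
```

Each word should appear exactly once in the result; a duplicated first word would indicate that the start state was sampled twice.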

In [12]:
print(generate_new_lyrics(chain))

I've I got 
 And I don't know where the groupies are? 
 Bitches on my mind 
 Through bumpin' my favorite rappers I came from nothing-- so to be freeThis is our ride 
 Don't run from Trump, run against him!Hey mothafucka I'm real as shit 
 I'ma show you mothafuckas see alright? 
 Everybody love, everybody knowI been knockin' doors down like a quarter I was strollin' down the highway 
 For this rap shit 
 You can believe you're superior, fine 
 'Cause in my kitchen, cutting, selling that 
 Street's disciple 
 My mind racing, I'm sick of pacing, I feel like the brain 
 Pretend that family's my family to avoid the pain 
 With poems written in blood 
 In the hood I'm a king, who died as a black father and a devil wife and a motherfucker still the shit that don't matter and makin' a buck 
 I'ma get up in your rib 
 Got Neil on the tuck 
 I rarely went to the summet. 
 So hot they all might 
 Hold up, wait a minute, know my name until it fade away (Fade away) fade away (Fade away) 
 They gon'

+ In the day she love to smoke, yes she fade away
+ So I'm puffing on this vision, the night is my division
+ I'ma show 'em how to act, I'ma get up and then on the back
+ Praise Black Jesus now they call the cops, do it for the life that I'm puttin' on for the props
+ I pretty much knew he was born with the heat, rock more solid than concrete
+ Baby girl can I find humanity?
+ Put my everything into the street, let alone the heat  
+ This motherfucker better know the Feds is buggin'
+ My life ain't mine, I need you to save me  
+ Everybody looking for the street, let alone the heat
+ I'ma keep rapping about all of you guys? Fuck all that shit I was gone for a reason
+ Oh my, my, my, feeling this villainous vibe
+ But I take the bus from my problems, Lord help me solve them
+ Now I'm praying for somebody to save me, no matter what you believe is right
+ You got everything to lose, like a goddamn king 
+ Pawns tend to carry on with no dial tone
+ Yeah, know what? I'll make dead fuckin' presidents to represent me
+ The life that I'm puttin' on, this is a facade
+ She don't wanna cry anymore, destitute and less informed
+ I see myself at the Louvre, and I know my mind playing tricks on me
+ I feel like I'm killin' my dreams, life fade in a way
+ Why nobody wanna say I can rap
+ Like abracadabra when that magician pull up the road
+ Homies in my studio, and I was strollin' down the highway
+ This rap shit another day, another book
+ I see good people who make it rain like no other man
+ I know where to begin to make a killin'
+ Yo, I'ma keep all of this new left over residue
+ Trust me girl I won't be mad, if you heard different someone lied
+ People thinking they on his level, they ain't ready for more bottles
+ Anybody that's riding with me trynna get it like that now

### Grader's Comments

- 
- 

[This question is worth 20 points.]

In [5]:
# This cell should be modified only by a grader.
scores = [None]