# N-gram Language Modeling


Whether for transcribing spoken utterances as correct word sequences or generating coherent human-like text, language models are extremely useful.

In this assignment, you will be building your own language models powered by n-grams and RNNs.

In [2]:
!unzip data.zip

Archive:  data.zip
   creating: data/bbc/
  inflating: data/bbc/business.txt   
  inflating: data/bbc/entertainment.txt  
  inflating: data/bbc/politics.txt   
  inflating: data/bbc/sport.txt      
  inflating: data/bbc/tech.txt       
  inflating: data/bbc/tech-small.txt  
   creating: data/lyrics/
  inflating: data/lyrics/billie_eillish.txt  
  inflating: data/lyrics/ed_sheeran.txt  
  inflating: data/lyrics/green_day.txt  
  inflating: data/lyrics/taylor_swift.txt  
  inflating: data/lyrics/test_lyrics.txt  
  inflating: data/sample.txt         


## Part 1: Language Models

### Step 0: Preprocessing

In [1]:
# !pip install transformers
# !pip install requests
# !pip install torch
# !pip install tqdm

import math
import torch
import numpy as np
import torch.nn as nn
from collections import Counter
from torch.utils.data import DataLoader, Dataset

We provide you with a few functions in `utils.py` to read and preprocess your input data. Do not edit this file!

In [3]:
from utils import *

We have performed a round of preprocessing on the datasets.

- Each file contains one sentence per line.
- All punctuation marks have been removed.
- Each line is a sequence of tokens separated by whitespace.

#### Special Symbols ( Already defined in `utils.py` )
The start and end tokens act as padding for the given sentences. To make sure they are correctly defined, print them here:

In [4]:
print("Sentence START symbol: {}".format(START))
print("Sentence END symbol: {}".format(EOS))
print("Unknown word symbol: {}".format(UNK))

Sentence START symbol: <s>
Sentence END symbol: </s>
Unknown word symbol: <UNK>
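
To make the padding concrete, here is a minimal sketch of how a sentence could be wrapped in these symbols. It is an assumption inferred from the outputs shown in this notebook (lowercased tokens, `max(n - 1, 1)` start symbols, one end symbol), not the actual code in `utils.py`; `pad_sentence_sketch`, `START_SYMBOL`, and `END_SYMBOL` are hypothetical names.

```python
# Hypothetical illustration only; the real logic lives in utils.py and may differ.
START_SYMBOL = "<s>"   # same string the notebook prints for START
END_SYMBOL = "</s>"    # same string the notebook prints for EOS


def pad_sentence_sketch(sentence, n):
    """Lowercase and split a sentence, then pad it for an n-gram model."""
    tokens = sentence.lower().split()
    # Assumption: n - 1 start symbols (at least one) and a single end symbol.
    num_start = max(n - 1, 1)
    return [START_SYMBOL] * num_start + tokens + [END_SYMBOL]


print(pad_sentence_sketch("We are the ones together we are back", n=3))
```

Padding with `n - 1` start symbols gives the first real word of each sentence a full-length context, so it can be scored like any other word.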


#### Reading and processing an example file

In [5]:
# Read the sample file
sample = read_file("data/sample.txt")
print(sample)

['We are never ever ever ever ever getting back together\n', 'We are the ones together we are back']


In [6]:
# Preprocess the content to add the corresponding number of start and end tokens.
# Preprocessing example for trigrams (n=3). Try out the method with n = 2 and n = 4 as well.
sample = preprocess(sample, n=3)
for s in sample:
    print(s)

['<s>', '<s>', 'we', 'are', 'never', 'ever', 'ever', 'ever', 'ever', 'getting', 'back', 'together', '</s>']
['<s>', '<s>', 'we', 'are', 'the', 'ones', 'together', 'we', 'are', 'back', '</s>']


In [7]:
# Flattens a nested list into a 1D list.
flattened = flatten(sample)
print(flattened)

['<s>', '<s>', 'we', 'are', 'never', 'ever', 'ever', 'ever', 'ever', 'getting', 'back', 'together', '</s>', '<s>', '<s>', 'we', 'are', 'the', 'ones', 'together', 'we', 'are', 'back', '</s>']


### Step 1: N-Gram Language Model

#### TO DO: Defining `get_ngrams()`

In [9]:
#######################################
# TODO: get_ngrams()
#######################################
def get_ngrams(list_of_words, n):
    """
    Returns a list of n-grams for a list of words.
    Args
    ----
    list_of_words: List[str]
        List of already preprocessed and flattened (1D) list of tokens e.g. ["<s>", "hello", "</s>", "<s>", "bye", "</s>"]
    n: int
        n-gram order e.g. 1, 2, 3

    Returns:
        n_grams: List[Tuple]
            Returns a list containing n-gram tuples
    """

    return [
        tuple(list_of_words[i : i + n]) for i in range(len(list_of_words) - n + 1)
    ]

    # raise NotImplementedError

In [13]:
#######################################
# TEST: get_ngrams()
#######################################
sample = preprocess(read_file("data/sample.txt"), n=3)
flattened = flatten(sample)

print(get_ngrams(flattened, 3))
assert get_ngrams(flattened, 3) == [('<s>', '<s>', 'we'),
        ('<s>', 'we', 'are'),
        ('we', 'are', 'never'),
        ('are', 'never', 'ever'),
        ('never', 'ever', 'ever'),
        ('ever', 'ever', 'ever'),
        ('ever', 'ever', 'ever'),
        ('ever', 'ever', 'getting'),
        ('ever', 'getting', 'back'),
        ('getting', 'back', 'together'),
        ('back', 'together', '</s>'),
        ('together', '</s>', '<s>'),
        ('</s>', '<s>', '<s>'),
        ('<s>', '<s>', 'we'),
        ('<s>', 'we', 'are'),
        ('we', 'are', 'the'),
        ('are', 'the', 'ones'),
        ('the', 'ones', 'together'),
        ('ones', 'together', 'we'),
        ('together', 'we', 'are'),
        ('we', 'are', 'back'),
        ('are', 'back', '</s>')]

[('<s>', '<s>', 'we'), ('<s>', 'we', 'are'), ('we', 'are', 'never'), ('are', 'never', 'ever'), ('never', 'ever', 'ever'), ('ever', 'ever', 'ever'), ('ever', 'ever', 'ever'), ('ever', 'ever', 'getting'), ('ever', 'getting', 'back'), ('getting', 'back', 'together'), ('back', 'together', '</s>'), ('together', '</s>', '<s>'), ('</s>', '<s>', '<s>'), ('<s>', '<s>', 'we'), ('<s>', 'we', 'are'), ('we', 'are', 'the'), ('are', 'the', 'ones'), ('the', 'ones', 'together'), ('ones', 'together', 'we'), ('together', 'we', 'are'), ('we', 'are', 'back'), ('are', 'back', '</s>')]


#### **TO DO:** Class `NGramLanguageModel()`

*Now*, we will define our LanguageModel class.

**Some Useful Variables:**
- self.model: `dict` of n-grams and their corresponding probabilities, keys being the tuple containing the n-gram, and the value being the probability of the n-gram.
- self.vocab: `dict` of unigram vocabulary with counts, keys being the words themselves and the values being their frequency.
- self.n: `int` value for n-gram order (e.g. 1, 2, 3).
- self.train_data: `List[List]` containing preprocessed **unflattened** train sentences. You will have to flatten it to use in the language model.
- self.alpha: `float` value of the smoothing parameter.

Note that we will not be using log probabilities in this section. Store the probabilities as they are, not in log space.

**Laplace Smoothing**

There are two ways to perform this:
- Either you enumerate all possible n-grams at train time and calculate smoothed probabilities for all of them, which inflates the model (eager smoothing). You then look up the probabilities as required at test time. **OR**
- You calculate the probabilities for the **observed n-grams** at train time, using the smoothed likelihood formula. If an unseen n-gram is encountered at test time, you calculate its probability using the same smoothed likelihood formula and store it in the model for future use (lazy smoothing).

You will be implementing lazy smoothing.
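
As a rough illustration of the lazy approach, the sketch below uses the add-$\alpha$ estimate $P(w_n \mid w_1 \ldots w_{n-1}) = \frac{C(w_1 \ldots w_n) + \alpha}{C(w_1 \ldots w_{n-1}) + \alpha |V|}$ on a made-up bigram corpus. The names `smoothed`, `lazy_prob`, and the toy tokens are assumptions for this example only; they are not part of the assignment's class.

```python
from collections import Counter

# Toy corpus, made up for illustration.
tokens = ["<s>", "a", "b", "a", "b", "</s>"]
alpha = 1.0

bigram_counts = Counter(zip(tokens, tokens[1:]))
unigram_counts = Counter(tokens)
vocab_size = len(unigram_counts)


def smoothed(bigram):
    """Add-alpha estimate: (count + alpha) / (prefix count + alpha * |V|)."""
    prefix = bigram[0]
    return (bigram_counts[bigram] + alpha) / (unigram_counts[prefix] + alpha * vocab_size)


# Train time: store probabilities for observed bigrams only.
model = {bigram: smoothed(bigram) for bigram in bigram_counts}


def lazy_prob(bigram):
    """Return a cached probability, computing and caching it on first use."""
    if bigram not in model:
        model[bigram] = smoothed(bigram)
    return model[bigram]


size_before = len(model)
p_unseen = lazy_prob(("b", "b"))  # never observed in the toy corpus
print(p_unseen, len(model) - size_before)
```

The model only grows when an unseen n-gram is actually queried, which keeps it far smaller than the eager variant for realistic vocabularies.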

**Perplexity**

Steps:
1. Flatten the test data.
2. Extract ngrams from the flattened data.
3. Calculate perplexity according to the formula below. For unseen n-grams, calculate the probability using the smoothed likelihood and store it in the language model's `model` attribute:

$ppl(W_{test}) = P(W_1 W_2 \ldots W_N)^{-1/N}$

Here $P$ is the probability the model assigns to the test sequence and $N$ is the number of n-grams scored.

Tips:
- Remember that product changes to summation under `log`. Take the log of probabilities, sum them up, and then exponentiate it to get back to the original scale.
- Make sure to `flatten()` your data before creating the n_grams using `get_ngrams()`.
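
The log-space tip can be sketched as follows. This is a standalone illustration with made-up probabilities, and `perplexity_from_probs` is a hypothetical helper, not a required part of the class.

```python
import math


def perplexity_from_probs(probs):
    """Perplexity of a sequence given the probability of each of its n-grams."""
    # Sum logs instead of multiplying probabilities to avoid underflow.
    log_prob_sum = sum(math.log(p) for p in probs)
    # exp(-(1/N) * sum(log p)) equals (product of p) ** (-1/N).
    return math.exp(-log_prob_sum / len(probs))


example_probs = [0.5, 0.25]  # made-up probabilities
print(perplexity_from_probs(example_probs))
```

Note that `math.log` raises an error for a probability of exactly zero, which can only occur when no smoothing is applied.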


In [30]:
#######################################
# TODO: NGramLanguageModel()
#######################################
class NGramLanguageModel():
    def __init__(self, n, train_data, alpha=1):
        """
        Language model class.

        Args
        ____
        n: int
            n-gram order
        train_data: List[List]
            already preprocessed unflattened list of sentences. e.g. [["<s>", "hello", "my", "</s>"], ["<s>", "hi", "there", "</s>"]]
        alpha: float
            Smoothing parameter

        Other attributes:
            self.tokens: list of individual tokens present in the training corpus
            self.vocab: vocabulary dict with counts
            self.model: n-gram language model, i.e., n-gram dict with probabilities
            self.n_grams_counts: dictionary for storing the frequency of ngrams in the training data, keys being the tuple of words(n-grams) and value being their frequency
            self.prefix_counts: dictionary for storing the frequency of the (n-1) grams in the data, similar to the self.n_grams_counts
            As an example:
            For a trigram model, the n-gram would be (w1,w2,w3), the corresponding [n-1] gram would be (w1,w2)
        """
        self.n = n
        self.alpha = alpha
        self.train_data = train_data
        self.tokens = flatten(train_data)
        self.vocab = Counter(self.tokens)
        self.model = {}
        self.n_grams_counts = {}
        self.prefix_counts = {}

        self.build()

        # raise NotImplementedError

    def build(self):
        """
        Returns an n-gram dict with their smoothed probabilities. Remember to consider the edge case of n=1 as well.

        You are expected to update self.n_grams_counts and self.prefix_counts, and use those to calculate the probabilities.
        """
        # Calculate n-grams counts
        n_grams = get_ngrams(self.tokens, self.n)
        self.n_grams_counts = Counter(n_grams)
        
        # Calculate (n-1)-grams counts (prefix counts)
        if self.n > 1:
            n_minus_1_grams = get_ngrams(self.tokens, self.n - 1)
            self.prefix_counts = Counter(n_minus_1_grams)
        else:
            # Handle the edge case of unigrams
            self.prefix_counts = Counter({(): len(self.tokens)})
        
        N = len(self.vocab)
        self.model = {}
        for n_gram, count in self.n_grams_counts.items():
            if self.n == 1:
                # Handle the edge case of unigrams
                prefix = ()
            else:
                prefix = n_gram[:-1]
            prefix_count = self.prefix_counts[prefix]
            prob = (count + self.alpha) / (prefix_count + self.alpha * N)
            self.model[n_gram] = prob
        return self.model

        # raise NotImplementedError

    def get_smooth_probabilities(self, ngrams):
        """
        Returns the smoothed probability of the n-gram, using Laplace Smoothing.
        Remember to consider the edge case of  n = 1
        HINT: Use self.n_grams_counts, self.tokens and self.prefix_counts
        """
        N = len(self.vocab)
        probabilities = {}
        for ngram in ngrams:
            count = self.n_grams_counts.get(ngram, 0)
            if self.n == 1:
                # Handle the edge case of unigrams
                prefix_count = len(self.tokens)
            else:
                prefix = ngram[:-1]
                prefix_count = self.prefix_counts.get(prefix, 0)
            prob = (count + self.alpha) / (prefix_count + self.alpha * N)
            probabilities[ngram] = prob
        return probabilities
        # raise NotImplementedError

    def get_prob(self, ngram):
        """
        Returns the probability of the n-gram, using Laplace Smoothing.

        Args
        ____
        ngram: tuple
            n-gram tuple

        Returns
        _______
        float
            probability of the n-gram
        """
        # Build the model if not already built
        if not self.model:
            self.build()

        # Hint: Check if this n-gram exists in self.model, if it does simply return it!
        # Otherwise, calculate the probability similar to get_smooth_probabilities()
        if ngram in self.model:
            return self.model[ngram]
        else:
            N = len(self.vocab)
            count = self.n_grams_counts.get(ngram, 0)
            if self.n == 1:
                # Handle the edge case of unigrams
                prefix_count = len(self.tokens)
            else:
                prefix = ngram[:-1]
                prefix_count = self.prefix_counts.get(prefix, 0)
            
            # Handle division by zero when alpha=0 and prefix_count=0
            denominator = prefix_count + self.alpha * N
            if denominator == 0:
                prob = 0.0
            else:
                prob = (count + self.alpha) / denominator
            
            # Store the calculated probability for future use (lazy smoothing)
            self.model[ngram] = prob
            return prob
        # raise NotImplementedError

    def perplexity(self, test_data):
        """
        Returns perplexity calculated on the test data.
        Args
        ----------
        test_data: List[List]
            Already preprocessed nested list of sentences

        Returns
        -------
        float
            Calculated perplexity value
        """
        # Flatten the test data
        test_tokens = flatten(test_data)
        
        # Extract n-grams from the flattened data
        test_ngrams = get_ngrams(test_tokens, self.n)
        
        # Calculate the log probability sum
        log_prob_sum = 0.0
        N = len(test_ngrams)
        
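        for ngram in test_ngrams:
            prob = self.get_prob(ngram)
            if prob == 0:
                # A zero-probability n-gram (possible only when alpha=0) makes perplexity infinite
                return float("inf")
            log_prob_sum += math.log(prob)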
        
        # Calculate perplexity: exp(-1/N * sum(log(p(w_i))))
        avg_log_prob = log_prob_sum / N
        perplexity = math.exp(-avg_log_prob)
        
        return perplexity


In [31]:
#######################################
# TEST: NGramLanguageModel()
#######################################
# For the sake of understanding we will pass alpha as 0 (no smoothing), so that you gain intuition about the probabilities
sample = preprocess(read_file("data/sample.txt"), n=2)
test_lm = NGramLanguageModel(n=2, train_data=sample, alpha=0)

expected_vocab = Counter({'<s>': 2,
        'we': 3,
        'are': 3,
        'never': 1,
        'ever': 4,
        'getting': 1,
        'back': 2,
        'together': 2,
        '</s>': 2,
        'the': 1,
        'ones': 1})

expected_model = {('<s>', 'we'): 1.0,
        ('we', 'are'): 1.0,
        ('are', 'never'): 0.3333333333333333,
        ('never', 'ever'): 1.0,
        ('ever', 'ever'): 0.75,
        ('ever', 'getting'): 0.25,
        ('getting', 'back'): 1.0,
        ('back', 'together'): 0.5,
        ('together', '</s>'): 0.5,
        # ('</s>', '<s>'): 1.0, # there are 2 sentences, the bigram should appear once and </s> appear twice (i.e. prob 0.5)
        ('</s>', '<s>'): 0.5,
        ('are', 'the'): 0.3333333333333333,
        ('the', 'ones'): 1.0,
        ('ones', 'together'): 1.0,
        ('together', 'we'): 0.5,
        ('are', 'back'): 0.3333333333333333,
        ('back', '</s>'): 0.5}

assert test_lm.vocab == expected_vocab, f"Vocabulary mismatch! Expected: {expected_vocab}, but got: {test_lm.vocab}"

assert test_lm.model == expected_model, (
    f"Model mismatch! \n"
    f"Expected keys but missing: {set(expected_model.keys()) - set(test_lm.model.keys())}\n"
    f"Unexpected keys in model: {set(test_lm.model.keys()) - set(expected_model.keys())}\n"
    f"Discrepancies in probabilities: "
    f"{ {k: (expected_model[k], test_lm.model[k]) for k in expected_model if k in test_lm.model and expected_model[k] != test_lm.model[k]} }"
)

In [35]:
#######################################
# TEST smoothing: NGramLanguageModel()
#######################################
sample = preprocess(read_file("data/sample.txt"), n=2)
test_lm = NGramLanguageModel(n=2, train_data=sample, alpha=1)

expected_vocab_smoothing = Counter({'<s>': 2,
        'we': 3,
        'are': 3,
        'never': 1,
        'ever': 4,
        'getting': 1,
        'back': 2,
        'together': 2,
        '</s>': 2,
        'the': 1,
        'ones': 1})

expected_model_smoothing ={('<s>', 'we'): 0.23076923076923078,
        ('we', 'are'): 0.2857142857142857,
        ('are', 'never'): 0.14285714285714285,
        ('never', 'ever'): 0.16666666666666666,
        ('ever', 'ever'): 0.26666666666666666,
        ('ever', 'getting'): 0.13333333333333333,
        ('getting', 'back'): 0.16666666666666666,
        ('back', 'together'): 0.15384615384615385,
        ('together', '</s>'): 0.15384615384615385,
        # ('</s>', '<s>'): 0.16666666666666666, # only one occurrence over 2 unigrams
        ('</s>', '<s>'): 0.15384615384615385,
        ('are', 'the'): 0.14285714285714285,
        ('the', 'ones'): 0.16666666666666666,
        ('ones', 'together'): 0.16666666666666666,
        ('together', 'we'): 0.15384615384615385,
        ('are', 'back'): 0.14285714285714285,
        ('back', '</s>'): 0.15384615384615385}


assert test_lm.vocab == expected_vocab_smoothing, f"Vocabulary mismatch! Expected: {expected_vocab_smoothing}, but got: {test_lm.vocab}"

assert test_lm.model == expected_model_smoothing, (
    f"Model mismatch! \n"
    f"Expected keys but missing: {set(expected_model_smoothing.keys()) - set(test_lm.model.keys())}\n"
    f"Unexpected keys in model: {set(test_lm.model.keys()) - set(expected_model_smoothing.keys())}\n"
    f"Discrepancies in probabilities: "
    f"{ {k: (expected_model_smoothing[k], test_lm.model[k]) for k in expected_model_smoothing if k in test_lm.model and expected_model_smoothing[k] != test_lm.model[k]} }"
)

In [36]:
#######################################
# TEST unigram: NGramLanguageModel()
#######################################
sample = preprocess(read_file("data/sample.txt"), n=1)
test_lm = NGramLanguageModel(n=1, train_data=sample, alpha=1)

expected_vocab_unigram = Counter({'<s>': 2,
        'we': 3,
        'are': 3,
        'never': 1,
        'ever': 4,
        'getting': 1,
        'back': 2,
        'together': 2,
        '</s>': 2,
        'the': 1,
        'ones': 1})

expected_model_unigram = {('<s>',): 0.09090909090909091,
        ('we',): 0.12121212121212122,
        ('are',): 0.12121212121212122,
        ('never',): 0.06060606060606061,
        ('ever',): 0.15151515151515152,
        ('getting',): 0.06060606060606061,
        ('back',): 0.09090909090909091,
        ('together',): 0.09090909090909091,
        ('</s>',): 0.09090909090909091,
        ('the',): 0.06060606060606061,
        ('ones',): 0.06060606060606061}


assert test_lm.vocab == expected_vocab_unigram, f"Vocabulary mismatch! Expected: {expected_vocab_unigram}, but got: {test_lm.vocab}"

assert test_lm.model == expected_model_unigram, (
    f"Model mismatch! \n"
    f"Expected keys but missing: {set(expected_model_unigram.keys()) - set(test_lm.model.keys())}\n"
    f"Unexpected keys in model: {set(test_lm.model.keys()) - set(expected_model_unigram.keys())}\n"
    f"Discrepancies in probabilities: "
    f"{ {k: (expected_model_unigram[k], test_lm.model[k]) for k in expected_model_unigram if k in test_lm.model and expected_model_unigram[k] != test_lm.model[k]} }"
)

In [None]:
#######################################
# TEST: perplexity()
#######################################
test_lm = NGramLanguageModel(n=3, train_data=sample, alpha=0)
test_ppl = test_lm.perplexity(sample)
print(test_ppl)
assert test_ppl < 1.7
assert test_ppl > 0

test_lm = NGramLanguageModel(n=2, train_data=sample, alpha=1)
test_ppl = test_lm.perplexity(sample)
print(test_ppl)
# assert test_ppl < 5.0
assert test_ppl < 5.4
assert test_ppl > 0

1.2972789669802325
5.303299534750951


In [40]:
# Debug perplexity calculation
test_lm = NGramLanguageModel(n=2, train_data=sample, alpha=1)
test_tokens = flatten(sample)
test_ngrams = get_ngrams(test_tokens, 2)
print("Test tokens:", test_tokens)
print("Test ngrams:", test_ngrams)
print("Number of ngrams:", len(test_ngrams))

# Check a few probabilities
for ngram in test_ngrams[:5]:
    prob = test_lm.get_prob(ngram)
    print(f"P({ngram}) = {prob}")

Test tokens: ['<s>', 'we', 'are', 'never', 'ever', 'ever', 'ever', 'ever', 'getting', 'back', 'together', '</s>', '<s>', 'we', 'are', 'the', 'ones', 'together', 'we', 'are', 'back', '</s>']
Test ngrams: [('<s>', 'we'), ('we', 'are'), ('are', 'never'), ('never', 'ever'), ('ever', 'ever'), ('ever', 'ever'), ('ever', 'ever'), ('ever', 'getting'), ('getting', 'back'), ('back', 'together'), ('together', '</s>'), ('</s>', '<s>'), ('<s>', 'we'), ('we', 'are'), ('are', 'the'), ('the', 'ones'), ('ones', 'together'), ('together', 'we'), ('we', 'are'), ('are', 'back'), ('back', '</s>')]
Number of ngrams: 21
P(('<s>', 'we')) = 0.23076923076923078
P(('we', 'are')) = 0.2857142857142857
P(('are', 'never')) = 0.14285714285714285
P(('never', 'ever')) = 0.16666666666666666
P(('ever', 'ever')) = 0.26666666666666666


## Train the n-gram language model on the data/bbc/business.txt dataset for n = 2 and n = 3. Then do the same for the data/bbc/sport.txt dataset

In [41]:
#######################################
# TRAIN bigram: NGramLanguageModel() for business data
#######################################
business_prepro = preprocess(read_file("data/bbc/business.txt"), n=2)
train_bussi = NGramLanguageModel(n=2, train_data=business_prepro, alpha=0.5)
print(len(set(train_bussi.model.keys())))
print(len(train_bussi.n_grams_counts))
print('Vocab size: ', len(train_bussi.vocab))

83819
83819
Vocab size:  11916


In [42]:
#######################################
# TRAIN trigram: NGramLanguageModel() for business data
#######################################
business_prepro = preprocess(read_file("data/bbc/business.txt"), n=3)
train_bussi = NGramLanguageModel(n=3, train_data=business_prepro, alpha=0.5)
print(len(set(train_bussi.model.keys())))
print(len(train_bussi.n_grams_counts))
print('Vocab size: ', len(train_bussi.vocab))

141221
141221
Vocab size:  11916


In [43]:
#######################################
# TRAIN bigram: NGramLanguageModel() for sports data
#######################################
spo_prepro = preprocess(read_file("data/bbc/sport.txt"), n=2)
train_spo = NGramLanguageModel(n=2, train_data=spo_prepro, alpha=0.5)
print(len(set(train_spo.model.keys())))
print(len(train_spo.n_grams_counts))
print('Vocab size: ', len(train_spo.vocab))

77398
77398
Vocab size:  10607


In [44]:
#######################################
# TRAIN trigram: NGramLanguageModel() for sports data
#######################################
spo_prepro = preprocess(read_file("data/bbc/sport.txt"), n=3)
train_spo = NGramLanguageModel(n=3, train_data=spo_prepro, alpha=0.5)
print(len(set(train_spo.model.keys())))
print(len(train_spo.n_grams_counts))
print('Vocab size: ', len(train_spo.vocab))

135645
135645
Vocab size:  10607


How many possible 2- and 3- grams could there be, given the same vocabulary?


How do the empirical counts given above compare to the number of possible 2- and 3- grams?
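
One way to approach both questions is sketched below. It assumes that every position of an n-gram can hold any vocabulary token, which gives an upper bound of $|V|^n$ (in practice some combinations involving the start and end symbols cannot occur). `possible_ngrams` is a hypothetical helper, and the counts are copied from the business.txt outputs above.

```python
def possible_ngrams(vocab_size, n):
    """Upper bound on distinct n-grams: every position can hold any vocabulary token."""
    return vocab_size ** n


# Values copied from the business.txt outputs above.
vocab_size = 11916
observed = {2: 83819, 3: 141221}

for n, seen in observed.items():
    total = possible_ngrams(vocab_size, n)
    print(f"n={n}: {seen} observed of {total} possible ({seen / total:.2e})")
```

The same loop can be rerun with the sport.txt vocabulary size and counts to compare the two datasets.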


## Train tri-gram (n=3, smoothing=0.1) language models on collections of song lyrics from three popular artists (`data/lyrics/`) and use the models to score a new unattributed song.

In [45]:
taylor_pre = preprocess(read_file("data/lyrics/taylor_swift.txt"), n=3)
train_tay = NGramLanguageModel(n=3, train_data=taylor_pre, alpha=0.1)

green_pre = preprocess(read_file("data/lyrics/green_day.txt"), n=3)
train_green = NGramLanguageModel(n=3, train_data=green_pre, alpha=0.1)

ed_pre = preprocess(read_file("data/lyrics/ed_sheeran.txt"), n=3)
train_ed = NGramLanguageModel(n=3, train_data=ed_pre, alpha=0.1)

What are the perplexity scores of the test lyrics against each of the language models?

In [46]:
test_prepro = preprocess(read_file("data/lyrics/test_lyrics.txt"), n=3)

tay_ppl = train_tay.perplexity(test_prepro)
print('Perplexity of taylor swift: ', tay_ppl)

green_ppl = train_green.perplexity(test_prepro)
print('Perplexity of green day: ', green_ppl)

ed_ppl = train_ed.perplexity(test_prepro)
print('Perplexity of ed sheeran: ', ed_ppl)

Perplexity of taylor swift:  138.00663307990817
Perplexity of green day:  522.5401188730924
Perplexity of ed sheeran:  521.2574891234094
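
A simple way to turn these scores into an attribution is to pick the model with the lowest perplexity, since lower perplexity means the model finds the test lyrics less surprising. The sketch below applies that rule to the trigram values printed above; treating the minimum as the answer is an assumption about the intended decision rule, not something the assignment states.

```python
# Trigram perplexities copied from the output above.
perplexities = {
    "taylor_swift": 138.00663307990817,
    "green_day": 522.5401188730924,
    "ed_sheeran": 521.2574891234094,
}

# Lower perplexity means the model finds the test lyrics less surprising.
best_match = min(perplexities, key=perplexities.get)
print(best_match)
```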


## Train bi-gram (n=2, smoothing=0.1) language models on collections of song lyrics from three popular artists (`data/lyrics/`) and use the models to score a new unattributed song.

In [47]:
taylor_pre = preprocess(read_file("data/lyrics/taylor_swift.txt"), n=2)
train_tay = NGramLanguageModel(n=2, train_data=taylor_pre, alpha=0.1)

green_pre = preprocess(read_file("data/lyrics/green_day.txt"), n=2)
train_green = NGramLanguageModel(n=2, train_data=green_pre, alpha=0.1)

ed_pre = preprocess(read_file("data/lyrics/ed_sheeran.txt"), n=2)
train_ed = NGramLanguageModel(n=2, train_data=ed_pre, alpha=0.1)

In [48]:
test_prepro = preprocess(read_file("data/lyrics/test_lyrics.txt"), n=2)

tay_ppl = train_tay.perplexity(test_prepro)
print('Perplexity of taylor swift: ', tay_ppl)

green_ppl = train_green.perplexity(test_prepro)
print('Perplexity of green day: ', green_ppl)

ed_ppl = train_ed.perplexity(test_prepro)
print('Perplexity of ed sheeran: ', ed_ppl)

Perplexity of taylor swift:  90.36563845090222
Perplexity of green day:  286.3921178190898
Perplexity of ed sheeran:  298.35129478639016
