# Technical Implementation - Part 1
### For a list of names dataset, what will the vectorized version look like? The hyperparameter `maxlen` should be set to a fixed value. Why is this the case? Show your solution in a Jupyter notebook.

## Creating mappings for characters & indices

In [1]:
# Read names to use as training data
filename = "names.txt"
# Use a context manager so the file is closed after reading
with open(filename, 'r') as f:
    raw_text = f.read()

In [2]:
# Get unique characters from the training data as our vocabulary
characters = sorted(list(set(raw_text)))
print(f"Vocabulary Size (Total Characters): {len(characters)}")

Vocabulary Size (Total Characters): 53


In [3]:
# Returns the index of a given character
character_indices = dict((c, i) for i,c in enumerate(characters))

# Returns the character given an index
indices_characters = dict((i, c) for i, c in enumerate(characters))

In [4]:
character_indices

{'\n': 0,
 'A': 1,
 'B': 2,
 'C': 3,
 'D': 4,
 'E': 5,
 'F': 6,
 'G': 7,
 'H': 8,
 'I': 9,
 'J': 10,
 'K': 11,
 'L': 12,
 'M': 13,
 'N': 14,
 'O': 15,
 'P': 16,
 'Q': 17,
 'R': 18,
 'S': 19,
 'T': 20,
 'U': 21,
 'V': 22,
 'W': 23,
 'X': 24,
 'Y': 25,
 'Z': 26,
 'a': 27,
 'b': 28,
 'c': 29,
 'd': 30,
 'e': 31,
 'f': 32,
 'g': 33,
 'h': 34,
 'i': 35,
 'j': 36,
 'k': 37,
 'l': 38,
 'm': 39,
 'n': 40,
 'o': 41,
 'p': 42,
 'q': 43,
 'r': 44,
 's': 45,
 't': 46,
 'u': 47,
 'v': 48,
 'w': 49,
 'x': 50,
 'y': 51,
 'z': 52}
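
As a quick check of the two mappings, the sketch below rebuilds them on a tiny stand-in string and confirms that encoding followed by decoding returns the original text. The `sample_text` value and the `encode`/`decode` helpers are illustrative names introduced here; they are not part of the notebook.

```python
# Tiny stand-in for raw_text; the real notebook reads names.txt instead.
sample_text = "Ann\nBob\n"

# Same construction as above: sorted unique characters form the vocabulary.
characters = sorted(set(sample_text))
character_indices = {c: i for i, c in enumerate(characters)}
indices_characters = {i: c for i, c in enumerate(characters)}


def encode(text):
    """Map each character to its integer index (illustrative helper)."""
    return [character_indices[c] for c in text]


def decode(indices):
    """Map integer indices back to a string (illustrative helper)."""
    return "".join(indices_characters[i] for i in indices)


encoded = encode("Bob")
print(characters)
print(encoded)
print(decode(encoded))
```

Because the vocabulary is sorted, the newline character gets index 0 and uppercase letters come before lowercase ones, which matches the ordering in the `character_indices` output above.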

## Convert to a Set of Symbols of Fixed Length

In [5]:
maxlen = 10
step = 2

data_points = []
next_characters = []

for i in range(0, len(raw_text) - maxlen, step):
    data_points.append(raw_text[i: i+maxlen])
    next_characters.append(raw_text[i + maxlen])

In [6]:
# Print the first few elements of data_points for inspection
data_points[:3]

['Michael\nCh', 'chael\nChri', 'ael\nChrist']

In [7]:
# Print the first few elements of next_characters for inspection
next_characters[:3]

['r', 's', 'o']

In [8]:
# Print the number of observations/data points generated from the dataset
print(f"Number of data points: {len(data_points)}")

Number of data points: 66774
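
The windowing loop can be tried on a short string to see how `maxlen` and `step` interact. This is a minimal sketch: `make_windows` is a helper name introduced here for illustration, and the two names in `text` are a stand-in chosen so that the first windows match the ones printed above.

```python
def make_windows(text, maxlen, step):
    """Slide a window of length maxlen over text, advancing by step.

    Returns the windows and, for each window, the character that follows it.
    """
    windows, next_chars = [], []
    for i in range(0, len(text) - maxlen, step):
        windows.append(text[i:i + maxlen])
        next_chars.append(text[i + maxlen])
    return windows, next_chars


# Stand-in text (an assumption, not the contents of names.txt).
text = "Michael\nChristopher\n"
windows, next_chars = make_windows(text, maxlen=10, step=2)

for w, c in zip(windows, next_chars):
    print(repr(w), "->", repr(c))
```

Every window has exactly `maxlen` characters, and consecutive windows start `step` characters apart, so a smaller `step` yields more (and more heavily overlapping) data points from the same text.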


## Vectorization

In [9]:
# Convert data_points into x and characters into y
import numpy as np

x = np.zeros((len(data_points), maxlen, len(characters)), dtype=int)
y = np.zeros((len(data_points), len(characters)), dtype=int) 
# dtype = int is used above for the purposes of demonstrating the vectors, 
# however in actual implementation it should be set to dtype = np.float64

In [10]:
# Create one hot encoding of data_points and characters
for i, data_point in enumerate(data_points):
    for t, character in enumerate(data_point):
        x[i, t, character_indices[character]] = 1
    y[i, character_indices[next_characters[i]]] = 1

## Explanation

### For a list of names dataset, what will the vectorized version look like?
From the list of names dataset, we generate an array x (the observations, or data points) and an array y (the next character that follows each data point, which the model learns to predict). We look at y first:

y can be thought of as an array of arrays. Each inner array has as many elements as the vocabulary, with a '1' at the index of the target character and '0' everywhere else (a one-hot encoding). Thus, the shape of y is `the number of data points` x `the vocabulary size`.

For example, our vocabulary size (the number of unique characters in the dataset) is 53, so each inner array of y has 53 elements. When we printed `next_characters` earlier, the first next character was 'r'. Looking at `character_indices`, 'r' is at index 44. Thus, the first array in y has '1' at index 44 and '0' everywhere else. We can verify this by inspecting y below:

In [11]:
y[:3] # the first few arrays in y, showing y is an array of arrays

array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        1, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 1, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0]])

In [12]:
y[0] # the first array in y, which corresponds to 'r' (the first element in next_characters)
# 'r' is represented by '1' at index 44 and '0' everywhere else

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       1, 0, 0, 0, 0, 0, 0, 0, 0])

In [13]:
y[0][44] # show that '1' is at index 44

1

In [14]:
indices_characters[44] # show that index 44 corresponds to 'r'

'r'

In [15]:
y.shape
# The shape of y will be: the number of data points,
# the number of characters in our vocabulary (vocabulary size)

(66774, 53)

Next, we look at x. Whereas y is a 2D array, x is a 3D array: each element of y is a single one-hot array, while each element of x is itself an array of one-hot arrays.

For example, when we printed `data_points` earlier, the first element was 'Michael\nCh'. In x, this data point becomes an array of 10 arrays, one per character, each encoded the same way y encodes a single character. Looking at `character_indices`, 'M' is an array with '1' at index 13, 'i' is an array with '1' at index 35, and so on.

We can verify this by inspecting x below:

In [16]:
x[:2] # the first few elements of x, showing x is a 3D array

array([[[0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        ...,
        [1, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0]],

       [[0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        ...,
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0]]])

In [17]:
x[0] # the first element of x, which is an array of arrays corresponding to 'Michael\nCh' (the first element in data_points)
# the first array is for 'M', represented by '1' at index 13 and '0' everywhere else
# the second array is for 'i', represented by '1' at index 35 and '0' everywhere else, and so on

array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
      

In [18]:
x.shape
# The shape of x will be: the number of data points, 
# the number of characters per data point (maxlen),
# the number of characters in our vocabulary (vocabulary size)

(66774, 10, 53)
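
The shapes above can be reproduced on a toy example. The sketch below one-hot encodes two short windows and then recovers the characters with `argmax`; the four-character vocabulary and the windows are made up for illustration.

```python
import numpy as np

# Toy data (assumed for illustration): a 4-character vocabulary
# and two windows of length 3.
characters = ["a", "b", "c", "d"]
character_indices = {c: i for i, c in enumerate(characters)}
data_points = ["abc", "bcd"]
next_characters = ["d", "a"]
maxlen = 3

x = np.zeros((len(data_points), maxlen, len(characters)), dtype=int)
y = np.zeros((len(data_points), len(characters)), dtype=int)

for i, data_point in enumerate(data_points):
    for t, character in enumerate(data_point):
        x[i, t, character_indices[character]] = 1
    y[i, character_indices[next_characters[i]]] = 1

print(x.shape)  # (number of data points, maxlen, vocabulary size)
print(y.shape)  # (number of data points, vocabulary size)

# argmax along the last axis recovers the index of each encoded character.
decoded = ["".join(characters[j] for j in row) for row in x.argmax(axis=-1)]
print(decoded)
```

Each row along the last axis contains exactly one '1', so taking the `argmax` is enough to map the encoding back to the original characters.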

### The hyperparameter maxlen should be set to a fixed value. Why is this the case?
The names themselves have varying lengths (e.g., Michael has length 7 and Christian has length 9), but an MLP requires every input to have the same size, because its input layer has a fixed number of units. To resolve this, the text is split into data points of exactly `maxlen` characters, so that they all have the same dimensionality (length).

For example, `maxlen` was set to 10 above. Thus, whereas the original text has names of varying lengths, the inputs become 'Michael\nCh', 'chael\nChri', 'ael\nChrist', and so on, each of length 10. The split inputs can then be passed through the MLP model.
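
One way to see the constraint: the first layer of an MLP is a weight matrix whose input dimension is fixed when the model is created, so every flattened input must have exactly `maxlen * vocabulary size` entries. The NumPy sketch below stands in for that first layer; the hidden size of 8 and the random weights are arbitrary choices for illustration, not values from the notebook.

```python
import numpy as np

maxlen, vocab_size, hidden_size = 10, 53, 8  # hidden_size is an arbitrary choice
rng = np.random.default_rng(0)

# Weights of a dense layer: the input dimension is fixed at maxlen * vocab_size.
weights = rng.normal(size=(maxlen * vocab_size, hidden_size))

# A window of exactly maxlen characters flattens to the expected length.
window = np.zeros((maxlen, vocab_size))
hidden = window.reshape(-1) @ weights
print(hidden.shape)

# A shorter window (7 characters, like "Michael") does not fit the same weights.
short_window = np.zeros((7, vocab_size))
try:
    short_window.reshape(-1) @ weights
    fits = True
except ValueError:
    fits = False
print(fits)
```

The shorter input fails because its flattened length (7 x 53) does not match the number of rows in the weight matrix (10 x 53), which is why every data point is cut to the same `maxlen`.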

# Technical Implementation - Part 2
### Convert the Intuition for Text Generation notebook to word-level tokens (as opposed to character-level tokens) and see what gets generated.

### Import the Necessary Libraries

In [19]:
import re
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import random
import nltk
from nltk.tokenize import word_tokenize
from nltk.stem import PorterStemmer, WordNetLemmatizer
from nltk.corpus import stopwords
nltk.download('punkt')
nltk.download('wordnet')
nltk.download('stopwords')

[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\Irish\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package wordnet to
[nltk_data]     C:\Users\Irish\AppData\Roaming\nltk_data...
[nltk_data]   Package wordnet is already up-to-date!
[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\Irish\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


True

### Read a Text File and Get Raw Text

In [20]:
filename = "story.txt"
# Use a context manager so the file is closed after reading
with open(filename, 'r') as f:
    raw_text = f.read()

### Examine First 1000 Characters

In [21]:
raw_text[0:1000]

'Once upon a time, there was a boy named Mike\nHe was raised in the hood, life was never quite\nGrowing up, he faced many struggles and strife\nBut through it all, he had one love in his life\n\nRap music was his escape from reality\nThe beats and lyrics helped him to see\nA world beyond his troubled neighborhood\nHe knew that with hard work, he could change his mood\n\nHe started writing rhymes and practicing his flow\nIn the mirror, he would rap and watch himself grow\nHe knew that if he could just make it to the top\nHe could change his life, and make a better stop\n\nOne day, he had the chance to perform on stage\nIn front of a crowd, he killed it, he killed it with rage\nHe had finally made it, he had achieved his dream\nAnd now he is a successful rapper, or so it seems\n\nHe never forgot where he came from, and he never will\nHe always remember the struggles that he had to fulfill\nHe is now a role model to kids in the hood\nShowing them that with hard work, they too could\n\nRis

### Cleanup Text

In [22]:
processed_text = raw_text.lower()
processed_text = re.sub(r'[^\x00-\x7f]', r'', processed_text)

processed_text

'once upon a time, there was a boy named mike\nhe was raised in the hood, life was never quite\ngrowing up, he faced many struggles and strife\nbut through it all, he had one love in his life\n\nrap music was his escape from reality\nthe beats and lyrics helped him to see\na world beyond his troubled neighborhood\nhe knew that with hard work, he could change his mood\n\nhe started writing rhymes and practicing his flow\nin the mirror, he would rap and watch himself grow\nhe knew that if he could just make it to the top\nhe could change his life, and make a better stop\n\none day, he had the chance to perform on stage\nin front of a crowd, he killed it, he killed it with rage\nhe had finally made it, he had achieved his dream\nand now he is a successful rapper, or so it seems\n\nhe never forgot where he came from, and he never will\nhe always remember the struggles that he had to fulfill\nhe is now a role model to kids in the hood\nshowing them that with hard work, they too could\n\nris

In [23]:
word_tokens = word_tokenize(processed_text)

In [24]:
stop_words = set(stopwords.words('english'))
filtered_tokens = [token for token in word_tokens if token.lower() not in stop_words]

In [25]:
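# NOTE: set() drops duplicate tokens and does not preserve word order,
# so everything built from word_vocabulary sees unique words in arbitrary order
word_vocabulary = list(set(sorted(filtered_tokens)))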

In [26]:
lemmatizer = WordNetLemmatizer()
lemmatized_tokens = [lemmatizer.lemmatize(token) for token in word_vocabulary]

lemmatized_tokens

['better',
 'hard',
 'chance',
 'top',
 'could',
 'front',
 ',',
 'remember',
 'achieved',
 'circumstance',
 'day',
 'upon',
 'growing',
 'model',
 'grow',
 'dream',
 'finally',
 'flow',
 'mike',
 'time',
 'mirror',
 'many',
 'change',
 'watch',
 'work',
 'successful',
 'rise',
 'named',
 'crowd',
 'perform',
 'forgot',
 'practicing',
 'quite',
 'would',
 'one',
 'seems',
 'always',
 'lyric',
 'killed',
 'made',
 'never',
 'stop',
 'rhyme',
 'helped',
 'love',
 'beat',
 'knew',
 'chase',
 'rapper',
 'boy',
 'world',
 'beyond',
 'mood',
 'neighborhood',
 'life',
 'strife',
 'fulfill',
 'hood',
 'end',
 'troubled',
 'make',
 'see',
 'faced',
 'struggle',
 'escape',
 'started',
 'music',
 'came',
 'kid',
 'reality',
 'raised',
 'writing',
 'showing',
 'stage',
 'role',
 'dream',
 'rap',
 'rage']

### Create Index Mappings

* `word_indices`: Returns the index of a given word. Example: `word_indices['boy']`
* `indices_words`: Returns the word given an index. Example: `indices_words[0]`

In this case, `words` serves as your **vocabulary**.

In [27]:
words = sorted(list(set(lemmatized_tokens)))
print("Total Words: {}".format(len(words)))

Total Words: 77


In [28]:
word_indices = dict((w, i) for i,w in enumerate(words))
indices_words = dict((i, w) for i,w in enumerate(words))

### Convert to a Set of Symbols of Fixed Length

* `maxlen`: Dimensionality of each data point (the number of words per input)
* `step`: Number of tokens to skip between the starts of consecutive data points. A lower value produces more, heavily overlapping data points; a higher value produces fewer data points with less overlap.

Take note that we also capture the next word for each sentence. This allows us to set up the data so that a given sequence of words predicts the next word. In machine learning, we denote this as our `y` value or ground truth. Each `y`, however, is represented as a one-hot encoding in which a position receives a value of `1` according to the word's position in `word_indices`.

In [29]:
maxlen = 10
step = 1

sentences = []
next_words = []

for i in range(0, len(lemmatized_tokens) - maxlen, step):
    sentences.append(lemmatized_tokens[i: i+maxlen])
    next_words.append(lemmatized_tokens[i + maxlen])

In [30]:
sentences

[['better',
  'hard',
  'chance',
  'top',
  'could',
  'front',
  ',',
  'remember',
  'achieved',
  'circumstance'],
 ['hard',
  'chance',
  'top',
  'could',
  'front',
  ',',
  'remember',
  'achieved',
  'circumstance',
  'day'],
 ['chance',
  'top',
  'could',
  'front',
  ',',
  'remember',
  'achieved',
  'circumstance',
  'day',
  'upon'],
 ['top',
  'could',
  'front',
  ',',
  'remember',
  'achieved',
  'circumstance',
  'day',
  'upon',
  'growing'],
 ['could',
  'front',
  ',',
  'remember',
  'achieved',
  'circumstance',
  'day',
  'upon',
  'growing',
  'model'],
 ['front',
  ',',
  'remember',
  'achieved',
  'circumstance',
  'day',
  'upon',
  'growing',
  'model',
  'grow'],
 [',',
  'remember',
  'achieved',
  'circumstance',
  'day',
  'upon',
  'growing',
  'model',
  'grow',
  'dream'],
 ['remember',
  'achieved',
  'circumstance',
  'day',
  'upon',
  'growing',
  'model',
  'grow',
  'dream',
  'finally'],
 ['achieved',
  'circumstance',
  'day',
  'upon',
  

In [31]:
next_words

['day',
 'upon',
 'growing',
 'model',
 'grow',
 'dream',
 'finally',
 'flow',
 'mike',
 'time',
 'mirror',
 'many',
 'change',
 'watch',
 'work',
 'successful',
 'rise',
 'named',
 'crowd',
 'perform',
 'forgot',
 'practicing',
 'quite',
 'would',
 'one',
 'seems',
 'always',
 'lyric',
 'killed',
 'made',
 'never',
 'stop',
 'rhyme',
 'helped',
 'love',
 'beat',
 'knew',
 'chase',
 'rapper',
 'boy',
 'world',
 'beyond',
 'mood',
 'neighborhood',
 'life',
 'strife',
 'fulfill',
 'hood',
 'end',
 'troubled',
 'make',
 'see',
 'faced',
 'struggle',
 'escape',
 'started',
 'music',
 'came',
 'kid',
 'reality',
 'raised',
 'writing',
 'showing',
 'stage',
 'role',
 'dream',
 'rap',
 'rage']
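
Because `lemmatized_tokens` is built from `word_vocabulary` (a set), the windows above run over unique words in an arbitrary order rather than over the story as written. A minimal sketch of the alternative follows, in which the windows slide over the ordered token sequence and a set is used only to build the vocabulary. The `tokens` list is a hypothetical, already-cleaned stand-in, and `maxlen = 4` is chosen only to keep the example short.

```python
# Hypothetical, already-cleaned token sequence (stand-in for the story's tokens).
tokens = ["boy", "named", "mike", "raised", "hood", "life", "never",
          "quite", "life", "love", "rap", "music", "escape", "life"]

# The vocabulary is the set of unique tokens; sorting makes indices reproducible.
words = sorted(set(tokens))
word_indices = {w: i for i, w in enumerate(words)}

maxlen = 4
step = 1

# Windows slide over the ordered token sequence, not over the vocabulary,
# so repeated words and their original order are preserved.
sentences = []
next_words = []
for i in range(0, len(tokens) - maxlen, step):
    sentences.append(tokens[i:i + maxlen])
    next_words.append(tokens[i + maxlen])

print(len(words), len(sentences))
print(sentences[0], "->", next_words[0])
```

With this arrangement, the number of data points depends on the length of the text rather than on the vocabulary size, and each target word is the word that actually follows its window in the source.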

### Vectorization

This step converts `sentences` and `next_words` into their `x` and `y` components, respectively. Since we're using PyTorch, we then convert both to tensors. The succeeding cells check the shapes of `x` and `y` and flatten each `x` data point into a single vector.

In [32]:
print("Vectorization")
device = 'cpu'
x = np.zeros((len(sentences), maxlen, len(words)), dtype=np.float64)
y = np.zeros((len(sentences), len(words)), dtype=np.float64)

for i, sentence in enumerate(sentences):
    for t, word in enumerate(sentence):
        x[i, t, word_indices[word]] = 1
    y[i, word_indices[next_words[i]]] = 1
    
x = torch.tensor(x).float().to(device)
y = torch.tensor(y).float().to(device)

Vectorization


In [33]:
x.shape

torch.Size([68, 10, 77])

In [34]:
x = torch.flatten(x, start_dim=1)

x.shape

torch.Size([68, 770])

In [35]:
y.shape

torch.Size([68, 77])

### Utility Function for Generating Samples

In [36]:
def sample(preds, temperature=1.0):
    # helper function to sample an index from a probability array
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)
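
To see what `temperature` does inside `sample`, the sketch below isolates the reweighting step and applies it to a made-up distribution over three tokens. `reweight` is a helper name introduced here for illustration; it repeats the first lines of `sample` without the random draw.

```python
import numpy as np


def reweight(preds, temperature=1.0):
    """Return the temperature-adjusted distribution used inside sample()."""
    preds = np.asarray(preds).astype("float64")
    scaled = np.log(preds) / temperature
    exp_scaled = np.exp(scaled)
    return exp_scaled / np.sum(exp_scaled)


probs = [0.5, 0.3, 0.2]  # made-up predictions over three tokens

for temperature in (0.5, 1.0, 2.0):
    print(temperature, np.round(reweight(probs, temperature), 3))
```

A temperature below 1 sharpens the distribution toward the most likely token, a temperature of 1 leaves it unchanged, and a temperature above 1 flattens it, making less likely tokens easier to sample.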

### Callback Function

This generates sample text from a randomly chosen seed sequence and is run after every epoch of training. It returns the last `maxlen` generated words.

In [37]:
def callback(model):
    # Pick a random window of maxlen tokens to use as the seed
    start = 0
    stop = len(lemmatized_tokens) - maxlen - 1
    start_index = random.randint(start, stop)
    sentence = lemmatized_tokens[start_index: start_index + maxlen]

    for i in range(400):
        # One-hot encode the current window
        x_predictions = np.zeros((1, maxlen, len(words)))
        for t, w in enumerate(sentence):
            x_predictions[0, t, word_indices[w]] = 1

        x_predictions = torch.tensor(x_predictions).float().to(device)
        x = torch.flatten(x_predictions, start_dim=1)

        preds = model.forward(x)[0].detach().cpu().numpy()

        # Sample the next word, then slide the window forward by one word
        next_index = sample(preds)
        next_word = indices_words[next_index]

        sentence = sentence[1:]
        sentence.append(next_word)

    # After 400 steps, the window holds the last maxlen generated words
    return sentence

### MultiLayerPerceptron Model

This will be our current language model. Although originally used for classification, we can also treat it as a regression model, since our ground truth represents the next word in a sequence. Take note that this is a rather simplistic model with no mechanism for remembering previous inputs. This forces the model to treat each input as an independent observation without considering sequential behavior. In short, it won't generate good results.

In [38]:
class MultiLayerPerceptron(nn.Module):
    def __init__(self, input_dim, output_dim):
        super().__init__()

        self.hidden = nn.Linear(input_dim, 500)
        self.output = nn.Linear(500, output_dim)
        
        self.relu = nn.ReLU()
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        # f(x) = a(f(x))
        x = self.relu(self.hidden(x))
        y = self.sigmoid(self.output(x))

        return y

In [39]:
model = MultiLayerPerceptron(x.shape[1], y.shape[1]).to(device)

model

MultiLayerPerceptron(
  (hidden): Linear(in_features=770, out_features=500, bias=True)
  (output): Linear(in_features=500, out_features=77, bias=True)
  (relu): ReLU()
  (sigmoid): Sigmoid()
)

### Training Function

In [40]:
optimizer = optim.Adam(model.parameters(), lr=0.00001)
criterion = nn.CrossEntropyLoss()

def train_fn(model, optimizer, loss_fn, device):
    ave_loss = 0
    count = 0
    
    for i, data in enumerate(x):
        data = x[i]
        targets = y[i]
        
        # Forward
        predictions = model.forward(data)
        
        predictions = F.softmax(predictions, dim=-1)
        
        loss = loss_fn(predictions, targets)
        
        # Backward
        optimizer.zero_grad()
        
        loss.backward()
        
        optimizer.step()

        count += 1
        ave_loss += loss.item()
    
    ave_loss = ave_loss / count

    return ave_loss

epochs = 500

average_losses = []

for epoch in range(epochs):
    print("Epoch: {}".format(epoch))
    ave_loss = train_fn(model, optimizer, criterion, device)
    
    average_losses.append(ave_loss)
        
    print("Ave Loss: {}".format(ave_loss))
    
    generated_sentence = callback(model)
    
    print("Generated sentence:")
    print(generated_sentence)
    print("Length: {}".format(len(generated_sentence)))

Epoch: 0
Ave Loss: 4.343806855818805
Generated sentence:
['mike', 'practicing', 'struggle', 'day', 'flow', 'raised', 'troubled', 'hard', 'made', 'upon']
Length: 10
Epoch: 1
Ave Loss: 4.343782417914447
Generated sentence:
['escape', 'showing', 'achieved', 'knew', 'top', 'change', 'would', 'love', 'many', 'rapper']
Length: 10
Epoch: 2
Ave Loss: 4.343757832751555
Generated sentence:
['seems', 'boy', 'struggle', 'crowd', 'started', 'lyric', ',', 'always', 'change', 'made']
Length: 10
Epoch: 3
Ave Loss: 4.3437325744067925
Generated sentence:
['grow', 'raised', 'top', 'started', 'growing', 'faced', 'life', 'perform', 'escape', 'rage']
Length: 10
Epoch: 4
Ave Loss: 4.343707119717317
Generated sentence:
['stop', 'faced', 'named', 'troubled', 'knew', 'killed', 'boy', 'world', 'end', 'rage']
Length: 10
Epoch: 5
Ave Loss: 4.343681370510774
Generated sentence:
['quite', 'kid', 'circumstance', 'named', 'forgot', 'crowd', 'always', 'boy', 'always', 'work']
Length: 10
Epoch: 6
Ave Loss: 4.34365546002

Ave Loss: 4.342184592695797
Generated sentence:
['perform', 'never', 'quite', 'reality', 'showing', 'troubled', 'many', 'day', 'flow', 'end']
Length: 10
Epoch: 52
Ave Loss: 4.342141782536226
Generated sentence:
['strife', 'better', 'rap', 'killed', 'top', 'quite', 'rapper', 'reality', 'grow', 'role']
Length: 10
Epoch: 53
Ave Loss: 4.342098383342519
Generated sentence:
['beyond', 'kid', 'work', 'strife', 'made', 'showing', 'writing', 'grow', 'flow', 'mike']
Length: 10
Epoch: 54
Ave Loss: 4.342054465237786
Generated sentence:
['change', ',', 'circumstance', 'mirror', 'neighborhood', 'many', 'showing', 'hood', 'made', 'forgot']
Length: 10
Epoch: 55
Ave Loss: 4.342010000172784
Generated sentence:
['helped', 'started', 'better', 'crowd', 'would', 'finally', 'make', 'flow', 'upon', 'raised']
Length: 10
Epoch: 56
Ave Loss: 4.341964953085956
Generated sentence:
['lyric', 'would', 'finally', 'beat', 'successful', 'writing', 'quite', 'rapper', 'flow', 'killed']
Length: 10
Epoch: 57
Ave Loss: 4.3

Ave Loss: 4.3392497721840355
Generated sentence:
['kid', 'grow', 'raised', 'mike', 'hood', 'killed', 'came', 'chase', 'perform', 'strife']
Length: 10
Epoch: 103
Ave Loss: 4.339177440194523
Generated sentence:
['rhyme', 'strife', 'front', 'life', 'top', 'finally', 'mood', 'one', 'end', 'stage']
Length: 10
Epoch: 104
Ave Loss: 4.339104659417096
Generated sentence:
['rapper', 'watch', 'work', 'life', 'upon', 'knew', 'better', 'hard', 'rhyme', 'raised']
Length: 10
Epoch: 105
Ave Loss: 4.33903145088869
Generated sentence:
['mood', 'successful', 'chance', 'many', 'troubled', 'made', 'writing', 'many', 'reality', 'writing']
Length: 10
Epoch: 106
Ave Loss: 4.338957737473881
Generated sentence:
['raised', 'stop', 'end', 'hood', 'make', 'see', 'raised', 'named', 'end', 'love']
Length: 10
Epoch: 107
Ave Loss: 4.338883617345025
Generated sentence:
['fulfill', 'helped', 'hood', 'never', 'better', 'could', 'life', 'music', 'rap', 'mirror']
Length: 10
Epoch: 108
Ave Loss: 4.338808936231277
Generated 

Ave Loss: 4.335245980935938
Generated sentence:
['upon', 'work', 'mood', 'reality', 'rage', 'kid', 'work', 'struggle', 'lyric', 'would']
Length: 10
Epoch: 153
Ave Loss: 4.33516252040863
Generated sentence:
['work', 'day', 'strife', 'seems', 'reality', 'change', 'top', 'rap', 'flow', 'time']
Length: 10
Epoch: 154
Ave Loss: 4.335079172078301
Generated sentence:
['seems', 'struggle', 'practicing', 'flow', 'strife', 'forgot', 'showing', 'killed', 'make', ',']
Length: 10
Epoch: 155
Ave Loss: 4.334995823747971
Generated sentence:
['see', 'finally', 'forgot', 'stage', 'work', 'mirror', 'knew', 'practicing', 'neighborhood', 'troubled']
Length: 10
Epoch: 156
Ave Loss: 4.334912517491509
Generated sentence:
['rise', 'quite', 'fulfill', 'chase', 'beyond', 'beat', 'made', 'many', 'rapper', 'music']
Length: 10
Epoch: 157
Ave Loss: 4.334829337456647
Generated sentence:
['beat', 'mike', 'strife', 'stage', 'work', 'make', 'escape', 'never', 'rise', 'seems']
Length: 10
Epoch: 158
Ave Loss: 4.33474638181

Ave Loss: 4.3312301705865295
Generated sentence:
['grow', 'many', 'finally', 'beyond', 'upon', 'end', 'escape', 'time', 'end', 'finally']
Length: 10
Epoch: 204
Ave Loss: 4.3311588343452
Generated sentence:
['perform', 'upon', 'lyric', 'work', 'many', 'beat', 'upon', 'neighborhood', 'role', 'grow']
Length: 10
Epoch: 205
Ave Loss: 4.331087988965652
Generated sentence:
['finally', 'raised', 'rise', 'reality', 'beat', 'growing', 'dream', 'change', 'troubled', 'rage']
Length: 10
Epoch: 206
Ave Loss: 4.331017529263216
Generated sentence:
['love', 'raised', 'make', 'practicing', 'see', 'perform', 'end', 'rage', 'grow', 'growing']
Length: 10
Epoch: 207
Ave Loss: 4.330947378102471
Generated sentence:
['quite', 'mirror', 'dream', 'killed', 'work', 'mirror', 'flow', 'raised', 'troubled', 'many']
Length: 10
Epoch: 208
Ave Loss: 4.330877731828129
Generated sentence:
['grow', 'made', 'started', 'love', 'chase', 'raised', 'mirror', 'came', 'grow', 'stop']
Length: 10
Epoch: 209
Ave Loss: 4.33080833098

Ave Loss: 4.328073768054738
Generated sentence:
['reality', 'forgot', 'grow', 'quite', 'would', 'one', 'rage', 'always', 'stage', 'many']
Length: 10
Epoch: 255
Ave Loss: 4.328021603472092
Generated sentence:
['lyric', 'upon', 'mood', 'dream', 'life', 'would', 'see', 'hood', 'never', 'role']
Length: 10
Epoch: 256
Ave Loss: 4.327969691332648
Generated sentence:
['killed', 'mirror', 'top', 'change', 'kid', 'rise', 'troubled', 'make', 'rapper', 'helped']
Length: 10
Epoch: 257
Ave Loss: 4.327918220968807
Generated sentence:
['crowd', 'killed', 'music', 'chase', 'flow', 'day', 'mike', 'flow', 'made', 'raised']
Length: 10
Epoch: 258
Ave Loss: 4.327867122257457
Generated sentence:
['beyond', 'escape', 'practicing', 'hood', 'life', 'rise', 'fulfill', 'came', 'model', 'model']
Length: 10
Epoch: 259
Ave Loss: 4.32781640922322
Generated sentence:
['rap', 'mike', 'make', 'flow', 'fulfill', 'grow', 'rap', 'reality', 'mike', 'fulfill']
Length: 10
Epoch: 260
Ave Loss: 4.3277660187552955
Generated sent

Ave Loss: 4.32584328511182
Generated sentence:
['forgot', 'practicing', 'knew', 'dream', 'stop', 'kid', 'rise', 'troubled', 'made', 'perform']
Length: 10
Epoch: 306
Ave Loss: 4.3258077747681565
Generated sentence:
['time', 'dream', 'writing', 'growing', 'raised', 'dream', 'rage', 'music', 'end', 'knew']
Length: 10
Epoch: 307
Ave Loss: 4.325772453756893
Generated sentence:
['life', 'quite', 'rhyme', 'hood', 'quite', 'seems', 'model', 'kid', 'mike', 'rhyme']
Length: 10
Epoch: 308
Ave Loss: 4.32573749738581
Generated sentence:
['world', 'hood', 'one', 'rage', 'lyric', 'see', 'day', 'rise', 'watch', 'crowd']
Length: 10
Epoch: 309
Ave Loss: 4.325702779433307
Generated sentence:
['flow', 'raised', 'never', 'change', 'strife', 'killed', 'successful', 'named', 'rage', 'made']
Length: 10
Epoch: 310
Ave Loss: 4.325668320936315
Generated sentence:
['grow', 'knew', 'dream', 'struggle', 'knew', 'never', 'dream', 'rapper', 'upon', 'change']
Length: 10
Epoch: 311
Ave Loss: 4.325634156956392
Generated

Ave Loss: 4.324365054859834
Generated sentence:
['helped', 'beyond', 'music', 'many', 'model', 'beyond', 'always', 'writing', 'work', 'beyond']
Length: 10
Epoch: 357
Ave Loss: 4.324342194725485
Generated sentence:
['beat', 'upon', 'seems', 'stop', 'kid', 'reality', 'showing', 'end', 'never', 'work']
Length: 10
Epoch: 358
Ave Loss: 4.324319587034338
Generated sentence:
['dream', 'practicing', 'kid', 'lyric', 'end', 'flow', 'time', 'would', 'writing', 'time']
Length: 10
Epoch: 359
Ave Loss: 4.3242972107494575
Generated sentence:
['dream', 'watch', 'never', 'mike', 'day', 'stage', 'flow', 'rhyme', 'one', 'troubled']
Length: 10
Epoch: 360
Ave Loss: 4.324275044833913
Generated sentence:
['growing', 'neighborhood', 'dream', 'successful', 'flow', 'rage', 'stage', 'rhyme', 'world', 'practicing']
Length: 10
Epoch: 361
Ave Loss: 4.32425307526308
Generated sentence:
['neighborhood', 'many', 'started', 'rage', 'dream', 'love', 'knew', 'raised', 'work', 'rise']
Length: 10
Epoch: 362
Ave Loss: 4.324

Ave Loss: 4.323440867311814
Generated sentence:
['showing', 'one', 'grow', 'dream', 'model', 'started', 'lyric', 'work', 'knew', 'always']
Length: 10
Epoch: 408
Ave Loss: 4.323427017997293
Generated sentence:
['dream', 'dream', 'dream', 'model', 'watch', 'dream', 'successful', 'rise', 'writing', 'strife']
Length: 10
Epoch: 409
Ave Loss: 4.323413365027484
Generated sentence:
['rage', 'practicing', 'life', 'successful', 'flow', 'rage', 'change', 'showing', 'dream', 'escape']
Length: 10
Epoch: 410
Ave Loss: 4.323399754131541
Generated sentence:
['beat', 'never', 'stop', 'came', 'escape', 'hood', 'day', 'made', 'chase', 'crowd']
Length: 10
Epoch: 411
Ave Loss: 4.323386395678801
Generated sentence:
['rapper', 'rapper', 'watch', 'role', 'successful', 'many', 'named', 'stop', 'dream', 'perform']
Length: 10
Epoch: 412
Ave Loss: 4.3233731283861045
Generated sentence:
['watch', 'quite', 'mike', 'stop', 'crowd', 'upon', 'perform', 'one', 'crowd', 'work']
Length: 10
Epoch: 413
Ave Loss: 4.32336002

Ave Loss: 4.32289463632247
Generated sentence:
['rhyme', 'helped', 'love', 'beat', 'knew', 'stage', 'rapper', 'boy', 'world', 'beyond']
Length: 10
Epoch: 459
Ave Loss: 4.322886733447804
Generated sentence:
['rapper', 'time', 'world', 'beyond', 'mood', 'neighborhood', 'life', 'strife', 'fulfill', 'music']
Length: 10
Epoch: 460
Ave Loss: 4.322878935757806
Generated sentence:
['boy', 'world', 'beyond', 'mood', 'neighborhood', 'life', 'strife', 'fulfill', 'hood', 'end']
Length: 10
Epoch: 461
Ave Loss: 4.322871145080118
Generated sentence:
['struggle', 'started', 'started', 'music', 'music', 'came', 'reality', 'raised', 'dream', 'beat']
Length: 10
Epoch: 462
Ave Loss: 4.322863522697897
Generated sentence:
['many', 'change', 'watch', 'work', 'successful', 'rise', 'named', 'crowd', 'perform', 'forgot']
Length: 10
Epoch: 463
Ave Loss: 4.3228559143403
Generated sentence:
['reality', 'raised', 'writing', 'showing', 'stage', 'role', 'dream', 'watch', 'rage', 'hood']
Length: 10
Epoch: 464
Ave Loss
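
The logged loss starts at about 4.3438, which equals ln(77), the cross-entropy of a uniform guess over the 77-word vocabulary, and it moves very little over training. One plausible contributor (an interpretation, not something the notebook confirms) is that `nn.CrossEntropyLoss` expects raw scores and applies log-softmax internally, whereas the training loop feeds it sigmoid outputs that have already passed through `F.softmax`. The NumPy sketch below imitates that chain on made-up scores; the helper names and the example logits are assumptions for illustration and do not use PyTorch.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def softmax(z):
    z = z - np.max(z)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / np.sum(e)


def cross_entropy_from_scores(scores, target_index):
    """Cross-entropy that applies softmax to its input first,
    mirroring what a logits-based loss does internally."""
    return -np.log(softmax(scores)[target_index])


vocab_size = 77
target_index = 3

# Made-up raw scores that strongly favour the correct word.
logits = np.full(vocab_size, -5.0)
logits[target_index] = 5.0

# Loss when the raw scores are passed in directly.
loss_from_logits = cross_entropy_from_scores(logits, target_index)

# Loss when the scores go through sigmoid and softmax before the loss.
loss_after_squashing = cross_entropy_from_scores(
    softmax(sigmoid(logits)), target_index
)

print(np.log(vocab_size))
print(loss_from_logits)
print(loss_after_squashing)
```

Even with scores that clearly favour the target, the squashed version stays close to ln(77), because sigmoid confines the scores to the interval (0, 1) and the two softmax steps then leave the distribution almost uniform. Passing the output layer's raw scores to the loss avoids that ceiling.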

## Explanation
### What gets generated when the Text Generation notebook is converted to a word token level?
Whereas the original character-level Text Generation notebook outputs a string of 40 characters (since `maxlen=40` in the original), the word-level version outputs a list of 10 words (since `maxlen=10` in this notebook). This is because each element of `sentences` (i.e., each input) is a string of characters in the original notebook, whereas here each element of `sentences` is a list of words.

The model does not learn whitespace, since whitespace is not treated as a token, unlike in the original notebook. It also does not learn stop words (including articles), as these were removed during text cleanup.

The model does learn to use a mix of words and mostly avoids repeating a word consecutively (e.g., 'make', 'make'). Although the model does not inherently know parts of speech, it also seems to have learned to use a mix of nouns and verbs. These patterns arise because they are present in the `sentences` used as the model's input.