# Technical Implementation - Part 1
### For a list of names dataset, what will the vectorized version look like? The hyperparameter maxlen should be set to a fixed value. Why is this the case? Show your solution in a Jupyter notebook.

## Creating mappings for characters & indices

In [1]:
# Read names to use as training data
filename = "names.txt"

# The context manager closes the file once it has been read
with open(filename, 'r') as f:
    raw_text = f.read()

In [2]:
# Get unique characters from the training data as our vocabulary
characters = sorted(list(set(raw_text)))
print(f"Vocabulary Size (Total Characters): {len(characters)}")

Vocabulary Size (Total Characters): 53


In [3]:
# Returns the index of a given character
character_indices = dict((c, i) for i,c in enumerate(characters))

# Returns the character given an index
indices_characters = dict((i, c) for i, c in enumerate(characters))

In [4]:
character_indices

{'\n': 0,
 'A': 1,
 'B': 2,
 'C': 3,
 'D': 4,
 'E': 5,
 'F': 6,
 'G': 7,
 'H': 8,
 'I': 9,
 'J': 10,
 'K': 11,
 'L': 12,
 'M': 13,
 'N': 14,
 'O': 15,
 'P': 16,
 'Q': 17,
 'R': 18,
 'S': 19,
 'T': 20,
 'U': 21,
 'V': 22,
 'W': 23,
 'X': 24,
 'Y': 25,
 'Z': 26,
 'a': 27,
 'b': 28,
 'c': 29,
 'd': 30,
 'e': 31,
 'f': 32,
 'g': 33,
 'h': 34,
 'i': 35,
 'j': 36,
 'k': 37,
 'l': 38,
 'm': 39,
 'n': 40,
 'o': 41,
 'p': 42,
 'q': 43,
 'r': 44,
 's': 45,
 't': 46,
 'u': 47,
 'v': 48,
 'w': 49,
 'x': 50,
 'y': 51,
 'z': 52}
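The two dictionaries are inverses of each other, so any string can be turned into indices and back. The following is a minimal sketch of that round trip on a toy text; `toy_text`, `encode`, and `decode` are hypothetical names introduced for illustration and are not part of the notebook.

```python
# Toy text standing in for names.txt (an assumption for illustration).
toy_text = "Ana\nBen\n"

# Same construction as above: sorted unique characters form the vocabulary.
toy_characters = sorted(set(toy_text))
toy_character_indices = {c: i for i, c in enumerate(toy_characters)}
toy_indices_characters = {i: c for i, c in enumerate(toy_characters)}


def encode(text):
    """Map each character to its integer index."""
    return [toy_character_indices[c] for c in text]


def decode(indices):
    """Map integer indices back to a string."""
    return "".join(toy_indices_characters[i] for i in indices)


encoded = encode("Ben")
print(encoded)
print(decode(encoded))
```

Because the vocabulary is sorted before it is enumerated, the same text always produces the same indices.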

## Convert to a Set of Symbols of Fixed Length

In [5]:
maxlen = 10
step = 2

data_points = []
next_characters = []

for i in range(0, len(raw_text) - maxlen, step):
    data_points.append(raw_text[i: i+maxlen])
    next_characters.append(raw_text[i + maxlen])

In [6]:
# Print the first few elements of data_points for inspection
data_points[:3]

['Michael\nCh', 'chael\nChri', 'ael\nChrist']

In [7]:
# Print the first few elements of next_characters for inspection
next_characters[:3]

['r', 's', 'o']

In [8]:
# Print the number of observations/data points generated from the dataset
print(f"Number of data points: {len(data_points)}")

Number of data points: 66774


## Vectorization

In [9]:
# Convert data_points into x and characters into y
import numpy as np

x = np.zeros((len(data_points), maxlen, len(characters)), dtype=int)
y = np.zeros((len(data_points), len(characters)), dtype=int) 
# dtype = int is used above for the purposes of demonstrating the vectors, 
# however in actual implementation it should be set to dtype = np.float64

In [10]:
# Create one hot encoding of data_points and characters
for i, data_point in enumerate(data_points):
    for t, character in enumerate(data_point):
        x[i, t, character_indices[character]] = 1
    y[i, character_indices[next_characters[i]]] = 1

## Explanation

### For a list of names dataset, what will the vectorized version look like?
From the list of names dataset, we generate an array x (the observations/data points) and an array y (the next character to be predicted for each data point). We look at y first:

y can be thought of as an array of arrays. Each inner array has the size of the vocabulary and contains a '1' at the index corresponding to the character and '0' everywhere else (a one-hot encoding). Thus, the shape of y will be `the number of data points` x `the vocabulary size`.

For example, our vocabulary size (number of unique characters in the dataset) is 53, so each inner array of y has 53 elements. When we printed `next_characters` earlier, the first next character was 'r'. Looking at `character_indices`, 'r' is at index 44. Thus, the first array in y has '1' at index 44 and '0' everywhere else. We can verify this by inspecting y below:

In [11]:
y[:3] # the first few arrays in y, showing y is an array of arrays

array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        1, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 1, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0]])

In [12]:
y[0] # the first array in y, which corresponds to 'r' (the first element in next_characters)
# 'r' is represented by '1' at index 44 and '0' everywhere else

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       1, 0, 0, 0, 0, 0, 0, 0, 0])

In [13]:
y[0][44] # show that '1' is at index 44

1

In [14]:
indices_characters[44] # show that index 44 corresponds to 'r'

'r'

In [15]:
y.shape
# The shape of y will be: the number of data points,
# the number of characters in our vocabulary (vocabulary size)

(66774, 53)

Next, we look at x. Whereas y is a 2D array, x is a 3D array. That is, while y itself is an array of arrays, each element of x is an array of arrays.

For example, when we printed `data_points` earlier, the first element was 'Michael\nCh'. In x, this element is stored as an array of arrays: each character in 'Michael\nCh' is represented as one array, in the same way that y represents one character. Looking at `character_indices`, 'M' is an array with '1' at index 13, 'i' is an array with '1' at index 35, and so on.

We can verify this by inspecting x below:

In [16]:
x[:2] # the first few elements of x, showing x is a 3D array

array([[[0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        ...,
        [1, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0]],

       [[0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        ...,
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0],
        [0, 0, 0, ..., 0, 0, 0]]])

In [17]:
x[0] # the first element of x, which is an array of arrays corresponding to 'Michael\nCh' (the first element in data_points)
# the first array is for 'M', represented by '1' at index 13 and '0' everywhere else
# the second array is for 'i', represented by '1' at index 35 and '0' everywhere else, and so on

array([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
        0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
      

In [18]:
x.shape
# The shape of x will be: the number of data points, 
# the number of characters per data point (maxlen),
# the number of characters in our vocabulary (vocabulary size)

(66774, 10, 53)
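One way to check a one-hot encoding is to decode it back into text. The following is a small sketch on a toy vocabulary, assuming the same encoding scheme as above; `toy_text`, `window`, and `one_hot` are illustrative names, not variables from the notebook.

```python
import numpy as np

# Toy text standing in for names.txt (an assumption for illustration).
toy_text = "Ana\nBen\n"
toy_characters = sorted(set(toy_text))
toy_character_indices = {c: i for i, c in enumerate(toy_characters)}

window = "Ana\nB"  # one data point of length 5

# One row per character position, one column per vocabulary entry.
one_hot = np.zeros((len(window), len(toy_characters)), dtype=int)
for t, character in enumerate(window):
    one_hot[t, toy_character_indices[character]] = 1

# argmax along the vocabulary axis recovers the position of the '1' in each row.
recovered_indices = one_hot.argmax(axis=1)
recovered = "".join(toy_characters[i] for i in recovered_indices)

print(one_hot.shape)
print(repr(recovered))
```

The same `argmax` call applied to `x[0]` should recover the first data point, since each row of `x[0]` contains exactly one '1'.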

### The hyperparameter maxlen should be set to a fixed value. Why is this the case?
The names themselves have varying lengths (e.g. Michael has length 7, Christian has length 9). However, an MLP requires all of its inputs to have the same size, because the number of input units is fixed when the model is defined. To resolve this, the text is split using `maxlen` into data points that all have the same dimensionality (length).

For example, `maxlen` was set to 10 above. Thus, whereas the original text has names of varying lengths, the inputs become 'Michael\nCh', 'chael\nChri', 'ael\nChrist', and so on, each of length 10. The split inputs can then be passed through the MLP model.
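The splitting described above can be sketched on a short toy text. This uses the same loop as the notebook, with smaller, hypothetical values (`toy_text`, `toy_maxlen`, `toy_step`) chosen only so that the result is easy to read.

```python
# Toy text with names of different lengths (an assumption for illustration).
toy_text = "Ana\nBen\nCleo\n"
toy_maxlen = 4
toy_step = 2

windows = []
targets = []

# Slide a window of toy_maxlen characters over the text, toy_step at a time,
# and record the character that follows each window.
for i in range(0, len(toy_text) - toy_maxlen, toy_step):
    windows.append(toy_text[i: i + toy_maxlen])
    targets.append(toy_text[i + toy_maxlen])

for window, target in zip(windows, targets):
    print(repr(window), "->", repr(target))

# Every window has the same length, regardless of the name lengths.
print({len(window) for window in windows})
```

Although the names have 3 and 4 letters, every window has exactly `toy_maxlen` characters, which is what allows them to be stacked into a single array of fixed shape.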

# Technical Implementation - Part 2
### Convert the Intuition for Text Generation notebook to the word-token level (as opposed to the character-token level) and see what gets generated.
This notebook shows an end-to-end process for generating text at the word-token level. It reads the text file `story.txt` and performs the following:

* Clean up the text by lowercasing everything
* Clean up the text by removing unwanted symbols
* Create mappings from words to indices and from indices to words
* Define the x and y vectors for classification
* Pass them to a multi-layer perceptron
* Generate some text

## Required Libraries

* `nltk`
* `scikit-learn`
* `torch`

### Import the Necessary Libraries

In [19]:
import re
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import random
import nltk
from nltk.tokenize import word_tokenize
from nltk.stem import PorterStemmer, WordNetLemmatizer
from nltk.corpus import stopwords
nltk.download('punkt')
nltk.download('wordnet')
nltk.download('stopwords')

[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\Irish\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package wordnet to
[nltk_data]     C:\Users\Irish\AppData\Roaming\nltk_data...
[nltk_data]   Package wordnet is already up-to-date!
[nltk_data] Downloading package stopwords to
[nltk_data]     C:\Users\Irish\AppData\Roaming\nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


True

### Read a Text File and Get Raw Text

In [20]:
filename = "story.txt"

# The context manager closes the file once it has been read
with open(filename, 'r') as f:
    raw_text = f.read()

### Examine First 1000 Characters

In [21]:
raw_text[0:1000]

'Once upon a time, there was a boy named Mike\nHe was raised in the hood, life was never quite\nGrowing up, he faced many struggles and strife\nBut through it all, he had one love in his life\n\nRap music was his escape from reality\nThe beats and lyrics helped him to see\nA world beyond his troubled neighborhood\nHe knew that with hard work, he could change his mood\n\nHe started writing rhymes and practicing his flow\nIn the mirror, he would rap and watch himself grow\nHe knew that if he could just make it to the top\nHe could change his life, and make a better stop\n\nOne day, he had the chance to perform on stage\nIn front of a crowd, he killed it, he killed it with rage\nHe had finally made it, he had achieved his dream\nAnd now he is a successful rapper, or so it seems\n\nHe never forgot where he came from, and he never will\nHe always remember the struggles that he had to fulfill\nHe is now a role model to kids in the hood\nShowing them that with hard work, they too could\n\nRis

### Cleanup Text
* Lowercase all characters
* Remove special symbols

In [22]:
processed_text = raw_text.lower()
processed_text = re.sub(r'[^\x00-\x7f]', r'', processed_text)

processed_text

'once upon a time, there was a boy named mike\nhe was raised in the hood, life was never quite\ngrowing up, he faced many struggles and strife\nbut through it all, he had one love in his life\n\nrap music was his escape from reality\nthe beats and lyrics helped him to see\na world beyond his troubled neighborhood\nhe knew that with hard work, he could change his mood\n\nhe started writing rhymes and practicing his flow\nin the mirror, he would rap and watch himself grow\nhe knew that if he could just make it to the top\nhe could change his life, and make a better stop\n\none day, he had the chance to perform on stage\nin front of a crowd, he killed it, he killed it with rage\nhe had finally made it, he had achieved his dream\nand now he is a successful rapper, or so it seems\n\nhe never forgot where he came from, and he never will\nhe always remember the struggles that he had to fulfill\nhe is now a role model to kids in the hood\nshowing them that with hard work, they too could\n\nris

In [23]:
word_tokens = word_tokenize(processed_text)

In [24]:
stop_words = set(stopwords.words('english'))
filtered_tokens = [token for token in word_tokens if token.lower() not in stop_words]

In [25]:
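# Keep unique tokens only. Sorting before set() has no effect, since a set
# is unordered; note that the original word order is lost at this point.
word_vocabulary = list(set(filtered_tokens))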

In [26]:
lemmatizer = WordNetLemmatizer()
lemmatized_tokens = [lemmatizer.lemmatize(token) for token in word_vocabulary]

lemmatized_tokens

['life',
 'remember',
 'helped',
 'forgot',
 'escape',
 'quite',
 'chance',
 'came',
 'grow',
 'watch',
 'killed',
 'knew',
 'dream',
 'raised',
 'showing',
 'mirror',
 'stop',
 'made',
 'flow',
 'one',
 'time',
 'many',
 'seems',
 'always',
 'end',
 'better',
 'practicing',
 'reality',
 'role',
 ',',
 'perform',
 'neighborhood',
 'front',
 'fulfill',
 'achieved',
 'named',
 'beat',
 'kid',
 'world',
 'strife',
 'hard',
 'rapper',
 'successful',
 'would',
 'boy',
 'mood',
 'finally',
 'hood',
 'rap',
 'rage',
 'writing',
 'top',
 'love',
 'upon',
 'circumstance',
 'model',
 'chase',
 'crowd',
 'rhyme',
 'faced',
 'stage',
 'change',
 'struggle',
 'day',
 'lyric',
 'mike',
 'troubled',
 'growing',
 'started',
 'never',
 'make',
 'see',
 'could',
 'music',
 'dream',
 'work',
 'beyond',
 'rise']
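Because `word_vocabulary` is built from a `set`, `lemmatized_tokens` holds one entry per distinct token in arbitrary order rather than the story's original word sequence. If the goal is to learn which word tends to follow which, one option is to keep the vocabulary and the ordered token stream as two separate objects. The following is a minimal sketch of that idea, assuming an already cleaned token list; `toy_tokens` and the other `toy_` names are hypothetical.

```python
# Hypothetical, already cleaned token stream standing in for the story.
toy_tokens = ["boy", "named", "mike", "raised", "hood", "boy", "love", "rap"]

# Vocabulary: unique tokens, sorted so that the indices are reproducible.
toy_vocabulary = sorted(set(toy_tokens))
toy_word_indices = {w: i for i, w in enumerate(toy_vocabulary)}

# Sequence: the same tokens in their original order, expressed as indices.
# Repeated words (here "boy") keep every occurrence.
toy_sequence = [toy_word_indices[w] for w in toy_tokens]

print(toy_vocabulary)
print(toy_sequence)
```

Sliding windows taken over `toy_sequence` would then pair each run of words with the word that actually follows it in the text.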

### Create Index Mappings

* `word_indices`: Returns the index of a given word. Example: `word_indices['life']`
* `indices_words`: Returns the word given an index. Example: `indices_words[0]`

In this case, `words` serves as your **vocabulary**.

In [27]:
words = sorted(list(set(lemmatized_tokens)))
print("Total Words: {}".format(len(words)))

Total Words: 77


In [28]:
word_indices = dict((w, i) for i,w in enumerate(words))
indices_words = dict((i, w) for i,w in enumerate(words))

### Convert to a Set of Symbols of Fixed Length

* `maxlen`: Dimensionality of each data point (the number of words per window)
* `step`: How many words the window advances between consecutive data points. A lower step yields more data points that overlap heavily; a higher step yields fewer data points with less overlap.

Take note that we also capture the next word for each window. This allows us to set up the data so that a given sequence of words predicts the next word. In machine learning, we denote this as our `y` value or ground truth. Each `y` is represented as a one-hot encoding, where the position given by the word's index in `word_indices` receives a value of `1`.
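The effect of `step` on the amount of training data can be worked out directly from the loop bounds. The sketch below counts the windows for the 78 tokens listed above; `count_windows` is a hypothetical helper that mirrors the `range` call used in the next cell.

```python
def count_windows(n_tokens, maxlen, step):
    """Number of (window, next word) pairs produced by
    range(0, n_tokens - maxlen, step)."""
    return len(range(0, n_tokens - maxlen, step))


window_counts = {step: count_windows(78, 10, step) for step in (1, 5, 10)}

for step, count in window_counts.items():
    print(f"step={step}: {count} data points")
```

With `maxlen = 10` and `step = 5`, this gives 14 data points, which matches the first dimension of `x` and `y` further down.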

In [29]:
maxlen = 10
step = 5

sentences = []
next_words = []

for i in range(0, len(lemmatized_tokens) - maxlen, step):
    sentences.append(lemmatized_tokens[i: i+maxlen])
    next_words.append(lemmatized_tokens[i + maxlen])

In [30]:
sentences

[['life',
  'remember',
  'helped',
  'forgot',
  'escape',
  'quite',
  'chance',
  'came',
  'grow',
  'watch'],
 ['quite',
  'chance',
  'came',
  'grow',
  'watch',
  'killed',
  'knew',
  'dream',
  'raised',
  'showing'],
 ['killed',
  'knew',
  'dream',
  'raised',
  'showing',
  'mirror',
  'stop',
  'made',
  'flow',
  'one'],
 ['mirror',
  'stop',
  'made',
  'flow',
  'one',
  'time',
  'many',
  'seems',
  'always',
  'end'],
 ['time',
  'many',
  'seems',
  'always',
  'end',
  'better',
  'practicing',
  'reality',
  'role',
  ','],
 ['better',
  'practicing',
  'reality',
  'role',
  ',',
  'perform',
  'neighborhood',
  'front',
  'fulfill',
  'achieved'],
 ['perform',
  'neighborhood',
  'front',
  'fulfill',
  'achieved',
  'named',
  'beat',
  'kid',
  'world',
  'strife'],
 ['named',
  'beat',
  'kid',
  'world',
  'strife',
  'hard',
  'rapper',
  'successful',
  'would',
  'boy'],
 ['hard',
  'rapper',
  'successful',
  'would',
  'boy',
  'mood',
  'finally',
  '

In [31]:
next_words

['killed',
 'mirror',
 'time',
 'better',
 'perform',
 'named',
 'hard',
 'mood',
 'writing',
 'model',
 'stage',
 'mike',
 'make',
 'work']

### Vectorization

This step converts `sentences` and `next_words` to their `x` and `y` components, respectively. Since we're using PyTorch, we convert both to tensors. The succeeding cells check the shapes of `x` and `y`, and flatten `x` so that each data point becomes a single vector for the MLP.

In [32]:
print("Vectorization")
device = 'cpu'
x = np.zeros((len(sentences), maxlen, len(words)), dtype=np.float64)
y = np.zeros((len(sentences), len(words)), dtype=np.float64)

for i, sentence in enumerate(sentences):
    for t, word in enumerate(sentence):
        x[i, t, word_indices[word]] = 1
    y[i, word_indices[next_words[i]]] = 1
    
x = torch.tensor(x).float().to(device)
y = torch.tensor(y).float().to(device)

Vectorization


In [33]:
x.shape

torch.Size([14, 10, 77])

In [34]:
x = torch.flatten(x, start_dim=1)

x.shape

torch.Size([14, 770])

In [35]:
y.shape

torch.Size([14, 77])
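Flattening turns each `(maxlen, vocabulary size)` matrix into one long vector by concatenating its rows. The sketch below illustrates the resulting layout with NumPy's `reshape` as a stand-in for `torch.flatten(x, start_dim=1)`; both use row-major order. The `toy_` names and the two example entries are assumptions for illustration.

```python
import numpy as np

n_points, toy_maxlen, vocab_size = 14, 10, 77

toy_x = np.zeros((n_points, toy_maxlen, vocab_size), dtype=np.float32)
toy_x[0, 0, 3] = 1  # position 0 of the first window holds word index 3
toy_x[0, 1, 5] = 1  # position 1 of the first window holds word index 5

# Keep the first axis and merge the remaining two.
flat = toy_x.reshape(n_points, -1)

# Position t with word index k lands in column t * vocab_size + k.
hot_columns = np.flatnonzero(flat[0])

print(flat.shape)
print(hot_columns)
```

This is why the model's first layer has `10 * 77 = 770` input features.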

### Utility Function for Generating Samples

In [36]:
def sample(preds, temperature=1.0):
    # helper function to sample an index from a probability array
    preds = np.asarray(preds).astype('float64')
    preds = np.log(preds) / temperature
    exp_preds = np.exp(preds)
    preds = exp_preds / np.sum(exp_preds)
    probas = np.random.multinomial(1, preds, 1)
    return np.argmax(probas)
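The `temperature` argument controls how strongly sampling favors the most likely word. The sketch below isolates the rescaling step of `sample` and applies it to a small, made-up probability array; `apply_temperature` and `probs` are hypothetical names used only for this illustration.

```python
import numpy as np


def apply_temperature(preds, temperature=1.0):
    """Rescale a probability array the same way `sample` does,
    without drawing a sample."""
    preds = np.asarray(preds, dtype=np.float64)
    scaled = np.exp(np.log(preds) / temperature)
    return scaled / scaled.sum()


probs = [0.1, 0.2, 0.7]  # made-up probabilities for three words

rescaled = {t: apply_temperature(probs, t) for t in (0.5, 1.0, 2.0)}

for temperature, distribution in rescaled.items():
    print(temperature, np.round(distribution, 3))
```

A temperature below 1 sharpens the distribution toward the most likely word, a temperature of 1 leaves it unchanged, and a temperature above 1 flattens it, so sampling becomes more random.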

### Callback Function

This generates sample text from a randomly chosen seed window and is run once after every epoch of training.

In [37]:
def callback(model):
    # Pick a random seed window of maxlen words
    start = 0
    stop = len(lemmatized_tokens) - maxlen - 1
    start_index = random.randint(start, stop)
    sentence = lemmatized_tokens[start_index: start_index + maxlen]

    generated = ""

    for i in range(400):
        # One-hot encode the current window
        x_predictions = np.zeros((1, maxlen, len(words)))
        for t, w in enumerate(sentence):
            x_predictions[0, t, word_indices[w]] = 1

        x_predictions = torch.tensor(x_predictions).float().to(device)
        x = torch.flatten(x_predictions, start_dim=1)

        # Predict scores for the next word and sample an index from them
        preds = model.forward(x)[0].detach().cpu().numpy()
        next_index = sample(preds)
        next_word = indices_words[next_index]

        # Accumulate every sampled word, separated by spaces
        generated += next_word + " "

        # Slide the window: drop the oldest word and append the new one
        sentence = sentence[1:]
        sentence.append(next_word)

    # Note: only the final window of maxlen words is returned,
    # not the full text accumulated in `generated`
    return sentence

### MultiLayerPerceptron Model

This will be our current language model. Although originally used for classification, we can also treat it as a regression model, since our ground truth represents the next word in a sequence. Take note that this is a rather simplistic model with no mechanism for remembering previous input. This forces the model to treat each input as an independent observation without considering sequential behavior. In short, it won't generate good results.

In [38]:
class MultiLayerPerceptron(nn.Module):
    def __init__(self, input_dim, output_dim):
        super().__init__()

        self.hidden = nn.Linear(input_dim, 500)
        self.output = nn.Linear(500, output_dim)
        
        self.relu = nn.ReLU()
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        # f(x) = a(f(x))
        x = self.relu(self.hidden(x))
        y = self.sigmoid(self.output(x))

        return y

In [39]:
model = MultiLayerPerceptron(x.shape[1], y.shape[1]).to(device)

model

MultiLayerPerceptron(
  (hidden): Linear(in_features=770, out_features=500, bias=True)
  (output): Linear(in_features=500, out_features=77, bias=True)
  (relu): ReLU()
  (sigmoid): Sigmoid()
)

### Training Function

In [40]:
optimizer = optim.Adam(model.parameters(), lr=0.00001)
criterion = nn.CrossEntropyLoss()

def train_fn(model, optimizer, loss_fn, device):
    ave_loss = 0
    count = 0
    
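    for data, targets in zip(x, y):
        # Forward
        predictions = model.forward(data)

        # Note: the model already ends in a sigmoid, and nn.CrossEntropyLoss
        # applies log-softmax internally, so this softmax squashes the
        # scores one more time before the loss is computed.
        predictions = F.softmax(predictions, dim=-1)

        loss = loss_fn(predictions, targets)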
        
        # Backward
        optimizer.zero_grad()
        
        loss.backward()
        
        optimizer.step()

        count += 1
        ave_loss += loss.item()
    
    ave_loss = ave_loss / count

    return ave_loss

epochs = 100

average_losses = []

for epoch in range(epochs):
    print("Epoch: {}".format(epoch))
    ave_loss = train_fn(model, optimizer, criterion, device)
    
    average_losses.append(ave_loss)
        
    print("Ave Loss: {}".format(ave_loss))
    
    generated_sentence = callback(model)
    
    print("Generated sentence:")
    print(generated_sentence)
    print("Length: {}".format(len(generated_sentence)))

Epoch: 0
Ave Loss: 4.343805858067104
Generated sentence:
['make', 'music', ',', 'top', 'grow', 'successful', 'beyond', 'writing', 'beat', 'beyond']
Length: 10
Epoch: 1
Ave Loss: 4.343792847224644
Generated sentence:
['rage', 'rise', 'work', 'started', 'end', 'hood', 'escape', 'quite', 'stage', 'see']
Length: 10
Epoch: 2
Ave Loss: 4.343780653817313
Generated sentence:
['dream', 'role', 'model', 'love', 'beyond', 'showing', 'mirror', 'chance', 'could', 'hood']
Length: 10
Epoch: 3
Ave Loss: 4.34376859664917
Generated sentence:
['named', 'achieved', 'stop', 'dream', 'struggle', 'never', 'remember', 'watch', 'role', 'upon']
Length: 10
Epoch: 4
Ave Loss: 4.343756301062448
Generated sentence:
['hard', 'music', 'hood', 'quite', 'grow', 'knew', 'kid', 'rap', 'chase', 'see']
Length: 10
Epoch: 5
Ave Loss: 4.3437440395355225
Generated sentence:
['perform', 'raised', 'world', 'quite', 'love', 'raised', 'world', 'escape', 'rap', 'time']
Length: 10
Epoch: 6
Ave Loss: 4.343731948307583
Generated sente

Ave Loss: 4.343130929129464
Generated sentence:
['killed', 'role', 'rise', 'front', 'showing', 'rise', 'rise', 'writing', 'mood', 'world']
Length: 10
Epoch: 53
Ave Loss: 4.343116760253906
Generated sentence:
['writing', 'work', 'hard', 'stage', 'work', 'quite', 'rage', 'rise', 'music', 'troubled']
Length: 10
Epoch: 54
Ave Loss: 4.343102523258755
Generated sentence:
['love', 'one', 'rhyme', 'growing', 'troubled', 'change', 'growing', ',', 'growing', 'music']
Length: 10
Epoch: 55
Ave Loss: 4.343088150024414
Generated sentence:
['dream', 'achieved', 'hard', 'role', 'watch', 'could', 'raised', 'always', 'beat', 'faced']
Length: 10
Epoch: 56
Ave Loss: 4.343073776790074
Generated sentence:
['always', 'lyric', 'upon', 'knew', 'finally', 'lyric', 'model', 'hard', 'upon', 'time']
Length: 10
Epoch: 57
Ave Loss: 4.343059233256748
Generated sentence:
['stage', 'rhyme', 'remember', 'escape', 'watch', 'struggle', 'chance', 'could', 'grow', 'hood']
Length: 10
Epoch: 58
Ave Loss: 4.343044655663626
Gen

Ave Loss: 4.342285735266549
Generated sentence:
['change', 'watch', 'stop', 'kid', 'see', 'top', 'world', ',', 'mirror', 'boy']
Length: 10
Epoch: 105
Ave Loss: 4.342267036437988
Generated sentence:
['life', 'came', 'chance', 'achieved', 'quite', 'would', 'struggle', 'made', 'named', 'hood']
Length: 10
Epoch: 106
Ave Loss: 4.342248133250645
Generated sentence:
['growing', 'music', 'successful', 'killed', 'work', 'reality', 'end', 'hood', 'fulfill', 'see']
Length: 10
Epoch: 107
Ave Loss: 4.342229127883911
Generated sentence:
['growing', 'one', 'would', 'finally', 'finally', 'lyric', 'rise', 'raised', 'world', 'strife']
Length: 10
Epoch: 108
Ave Loss: 4.342209952218192
Generated sentence:
['made', 'mike', 'better', 'mirror', 'end', 'upon', 'music', 'rise', 'crowd', 'world']
Length: 10
Epoch: 109
Ave Loss: 4.342190844672067
Generated sentence:
['forgot', 'quite', 'music', 'love', 'fulfill', 'quite', 'kid', 'troubled', 'day', 'escape']
Length: 10
Epoch: 110
Ave Loss: 4.342171464647565
Gener

Generated sentence:
['growing', 'one', 'chase', 'boy', 'role', ',', 'showing', 'showing', 'hood', 'reality']
Length: 10
Epoch: 155
Ave Loss: 4.341181005750384
Generated sentence:
['beyond', 'quite', 'mirror', 'growing', 'beat', 'escape', 'successful', 'showing', 'flow', 'kid']
Length: 10
Epoch: 156
Ave Loss: 4.341156005859375
Generated sentence:
['rage', 'work', 'beyond', 'rapper', 'made', 'change', 'life', 'achieved', 'successful', 'showing']
Length: 10
Epoch: 157
Ave Loss: 4.341130903788975
Generated sentence:
['knew', 'music', 'hood', 'killed', 'never', 'better', 'mirror', 'started', 'rise', 'work']
Length: 10
Epoch: 158
Ave Loss: 4.341105733598981
Generated sentence:
['would', 'quite', 'mirror', 'killed', 'time', 'better', 'struggle', 'could', 'upon', 'seems']
Length: 10
Epoch: 159
Ave Loss: 4.3410802228110175
Generated sentence:
['front', 'mood', 'end', 'role', 'always', 'reality', 'model', 'beat', 'make', 'helped']
Length: 10
Epoch: 160
Ave Loss: 4.341054814202445
Generated sente

KeyboardInterrupt: 
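The average loss above stays close to 4.3438, which is ln(77), the cross-entropy of a uniform guess over the 77 words in the vocabulary. A likely contributor is the stacked squashing: the model ends in a sigmoid, `train_fn` applies a softmax, and `nn.CrossEntropyLoss` expects raw, unnormalized scores (logits) and applies log-softmax itself. The sketch below reproduces that chain in NumPy to estimate how low the loss can go; the helper functions are hypothetical stand-ins for the PyTorch operations, and the "best case" input is an assumption about the most favorable sigmoid output.

```python
import numpy as np


def log_softmax(z):
    """Numerically stable log-softmax."""
    z = z - z.max()
    return z - np.log(np.exp(z).sum())


def softmax(z):
    return np.exp(log_softmax(z))


n_classes = 77
target = 0

# Loss of a perfectly uniform prediction over 77 classes.
uniform_loss = np.log(n_classes)

# Assumed best case for the sigmoid: 1 for the target word, 0 elsewhere.
sigmoid_out = np.zeros(n_classes)
sigmoid_out[target] = 1.0

# The extra softmax, followed by the log-softmax inside the loss.
after_softmax = softmax(sigmoid_out)
best_stacked_loss = -log_softmax(after_softmax)[target]

# For comparison: raw logits passed straight to the loss.
logits = np.zeros(n_classes)
logits[target] = 20.0
logit_loss = -log_softmax(logits)[target]

print(f"uniform guess:      {uniform_loss:.4f}")
print(f"stacked, best case: {best_stacked_loss:.4f}")
print(f"raw logits:         {logit_loss:.8f}")
```

Under these assumptions, the stacked version can only move slightly below the uniform loss, whereas raw logits can drive the loss toward zero. Returning `self.output(x)` directly from `forward` and dropping the `F.softmax` call in `train_fn` would be one way to test this; the very small learning rate of `0.00001` may also slow progress.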