# **Third Attempt at generating text using RNN**

So far I've tried to generate Shakespeare-like text using TensorFlow. In my first few attempts I worked from the Udacity course and tried to train my own model to perform text generation. I didn't get very far, as the models I tried to train were far too complex and I saw very few results. The model from the TensorFlow guide was far simpler and trained successfully.

I thought I'd have another go at training my own text generation model, this time on something a bit more interesting...

Text Generation Model trained on [Anime Quotes](https://www.kaggle.com/datasets/tarundalal/anime-quotes)


P.S. I'm most likely going to steal some stuff from both the Udacity and TensorFlow guides.

# **Import Dependencies**

In [None]:
import tensorflow as tf
import numpy as np
import urllib.request
import csv

print(tf.__version__)


2.8.2


# **Download the dataset**

In [None]:
# i downloaded the dataset from this link
url = "https://www.kaggle.com/datasets/tarundalal/anime-quotes/download?datasetVersionNumber=1"


In [None]:
!pwd

/content


I extracted the CSV file and uploaded it to the `/content` folder.

In [None]:
# read the csvfile
anime_quotes = []

# the csv file contains Quote, character, Anime. For this task we are only
# interested in the quote so we would only get the first column from each row.
with open('AnimeQuotes.csv') as csv_file:
  csv_reader = csv.reader(csv_file, delimiter=',')
  for row in csv_reader:
    anime_quotes.append(row[0])

print(anime_quotes[:10])


['Quote', 'People’s lives don’t end when they die, it ends when they lose faith.', 'If you don’t take risks, you can’t create a future!', 'If you don’t like your destiny, don’t accept it.', 'When you give up, that’s when the game ends.', 'All we can do is live until the day we die. Control what we can…and fly free.', 'Forgetting is like a wound. The wound may heal, but it has already left a scar.', 'It’s just pathetic to give up on something before you even give it a shot.”', 'If you don’t share someone’s pain, you can never understand them.', 'Whatever you lose, you’ll find it again. But what you throw away you’ll never get back.']


In [None]:
# remove the header
anime_quotes = anime_quotes[1:]

print(anime_quotes[:10])
print(len(anime_quotes))


['People’s lives don’t end when they die, it ends when they lose faith.', 'If you don’t take risks, you can’t create a future!', 'If you don’t like your destiny, don’t accept it.', 'When you give up, that’s when the game ends.', 'All we can do is live until the day we die. Control what we can…and fly free.', 'Forgetting is like a wound. The wound may heal, but it has already left a scar.', 'It’s just pathetic to give up on something before you even give it a shot.”', 'If you don’t share someone’s pain, you can never understand them.', 'Whatever you lose, you’ll find it again. But what you throw away you’ll never get back.', 'We don’t have to know what tomorrow holds! That’s why we can live for everything we’re worth today!”']
121


# **Prepare the text**

The main task here is to generate anime quotes from our own seed text. For this we need a set of features and labels to train the model on.

<br>

**Set features and labels**   
The features and labels need to reflect the task, so the feature should be an initial piece of text and the label should be the text that follows it.

From what I've seen, there are two ways to approach this: we can create a model that predicts the next char, or one that predicts the next word. I'll try out both ways of preparing the text:
- Predicting the next char
- Predicting the next probable word


In this Colab I'll build a model to predict the next probable char.
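
To make the difference between the two approaches concrete, here is a small sketch on a single quote. The whitespace split is only a stand-in for word-level tokenization (an assumption for illustration; a real word tokenizer would also deal with punctuation and casing).

```python
quote = "If you don't take risks, you can't create a future!"

# Character-level: every character, including spaces and punctuation, is a token.
char_tokens = list(quote)

# Word-level: naive whitespace split, used here only for illustration.
word_tokens = quote.split()

print(len(char_tokens), char_tokens[:6])
print(len(word_tokens), word_tokens[:4])
```

The char-level version gives many more tokens per quote but a much smaller vocabulary, which suits a dataset as small as this one.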



In [None]:
# combine the contents of the list into a single string
all_anime_quotes = " ".join(anime_quotes)


num_char = len(all_anime_quotes)
unique_chars = set(all_anime_quotes)
vocab_size = len(unique_chars)

print(all_anime_quotes)
print(f"Unique_chars: {unique_chars}")
print(f"Total number of characters in all_anime_quotes: {num_char}")
print(f"Vocabulary size: {vocab_size}")


People’s lives don’t end when they die, it ends when they lose faith. If you don’t take risks, you can’t create a future! If you don’t like your destiny, don’t accept it. When you give up, that’s when the game ends. All we can do is live until the day we die. Control what we can…and fly free. Forgetting is like a wound. The wound may heal, but it has already left a scar. It’s just pathetic to give up on something before you even give it a shot.” If you don’t share someone’s pain, you can never understand them. Whatever you lose, you’ll find it again. But what you throw away you’ll never get back. We don’t have to know what tomorrow holds! That’s why we can live for everything we’re worth today!” Why should I apologize for being a monster? Has anyone ever apologized for turning me into one? People become stronger because they have memories they can’t forget. I’ll leave tomorrow’s problems to tomorrow’s me. If you wanna make people dream, you’ve gotta start by believing in that dream you

In [None]:
print(sorted(unique_chars))


[' ', '!', ',', '-', '.', ':', '?', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'R', 'S', 'T', 'U', 'V', 'W', 'Y', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', '\xa0', '’', '“', '”', '…']


To recap, these are the steps we are going to take for the text generation model.

**Preparing the text**
- Tokenize the text char by char, converting each char into an integer token.
- Create sequences from the tokens. Each feature is a sequence of length 100, and its label is the same sequence shifted by one position, so the label at each time step is the next char.

**Model training**
- Train an RNN model on the features and labels.

**Text generation**
- Generate text from a seed text using the trained model.
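
The text preparation steps above can be sketched in plain Python on one quote. This is only an illustration: `seq_len` is shortened to 10 so the output stays readable, and the names `char_to_id`, `id_to_char` and `pairs` are made up for this sketch.

```python
text = "When you give up, that's when the game ends."
seq_len = 10  # the notebook uses 100; shortened here for readability

# 1. Tokenization: map each unique char to an integer id
vocab = sorted(set(text))
char_to_id = {ch: i for i, ch in enumerate(vocab)}
id_to_char = {i: ch for ch, i in char_to_id.items()}
tokens = [char_to_id[ch] for ch in text]

# 2. Sequences: chunks of seq_len + 1 tokens, dropping any incomplete tail
chunk = seq_len + 1
chunks = [tokens[i:i + chunk] for i in range(0, len(tokens) - chunk + 1, chunk)]

# 3. Feature = first seq_len tokens, label = the same chunk shifted by one
pairs = [(c[:-1], c[1:]) for c in chunks]

for feature, label in pairs[:2]:
    print(repr("".join(id_to_char[t] for t in feature)))
    print(repr("".join(id_to_char[t] for t in label)))
```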

Define a dictionary to map each char to a token.

In [None]:
# define a dictionary to map the char into token
char_to_token = dict([(char, token) for token, char in enumerate(unique_chars)])
print(char_to_token)


{'p': 0, 'r': 1, 'u': 2, 'm': 3, ':': 4, 'J': 5, 'H': 6, 'L': 7, 'i': 8, 'M': 9, 'F': 10, 'B': 11, 'G': 12, '’': 13, 'A': 14, 'f': 15, 'c': 16, 'g': 17, 'I': 18, 'v': 19, 't': 20, '.': 21, 'T': 22, '?': 23, 'E': 24, 'j': 25, 'D': 26, 'K': 27, ' ': 28, 'R': 29, 'e': 30, 'h': 31, 's': 32, 'o': 33, 'Y': 34, 'x': 35, '…': 36, 'C': 37, 'a': 38, 'w': 39, 'y': 40, 'W': 41, 'l': 42, 'q': 43, 'V': 44, 'U': 45, 'b': 46, '“': 47, 'd': 48, 'S': 49, '!': 50, '”': 51, ',': 52, 'O': 53, 'k': 54, 'P': 55, 'N': 56, '-': 57, 'n': 58, 'z': 59, '\xa0': 60}


In [None]:
# create a dictionary with inverted mapping
token_to_char = dict([(token, char) for char, token in char_to_token.items()])
print(token_to_char)


{0: 'p', 1: 'r', 2: 'u', 3: 'm', 4: ':', 5: 'J', 6: 'H', 7: 'L', 8: 'i', 9: 'M', 10: 'F', 11: 'B', 12: 'G', 13: '’', 14: 'A', 15: 'f', 16: 'c', 17: 'g', 18: 'I', 19: 'v', 20: 't', 21: '.', 22: 'T', 23: '?', 24: 'E', 25: 'j', 26: 'D', 27: 'K', 28: ' ', 29: 'R', 30: 'e', 31: 'h', 32: 's', 33: 'o', 34: 'Y', 35: 'x', 36: '…', 37: 'C', 38: 'a', 39: 'w', 40: 'y', 41: 'W', 42: 'l', 43: 'q', 44: 'V', 45: 'U', 46: 'b', 47: '“', 48: 'd', 49: 'S', 50: '!', 51: '”', 52: ',', 53: 'O', 54: 'k', 55: 'P', 56: 'N', 57: '-', 58: 'n', 59: 'z', 60: '\xa0'}


In [None]:
# Sanity check
print(f"A has token {char_to_token['A']}")
print(f"{char_to_token['A']} represents {token_to_char[char_to_token['A']]}")


A has token 14
14 represents A
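
One thing to watch out for: `unique_chars` is a `set`, and the iteration order of a set of strings can change between Python sessions, so `enumerate(unique_chars)` may hand out different token ids after a runtime restart. A minimal sketch of a mapping that stays the same across sessions, assuming we simply sort the characters first (the `_sorted` names are hypothetical and not used elsewhere in this notebook):

```python
text = "If you don't like your destiny, don't accept it."

# Sorting the unique chars gives the same order in every session
vocab = sorted(set(text))
char_to_token_sorted = {ch: i for i, ch in enumerate(vocab)}
token_to_char_sorted = {i: ch for ch, i in char_to_token_sorted.items()}

print(char_to_token_sorted)
```

This matters mostly when a saved model is reloaded later, since the model only knows token ids and not the chars behind them.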


In [None]:
# Convert the text data into sequences
sequences = []
for char in all_anime_quotes:
  token = char_to_token[char]
  sequences.append(token)

print(sequences)
print(f"Length of sequence: {len(sequences)}")


[55, 30, 33, 0, 42, 30, 13, 32, 28, 42, 8, 19, 30, 32, 28, 48, 33, 58, 13, 20, 28, 30, 58, 48, 28, 39, 31, 30, 58, 28, 20, 31, 30, 40, 28, 48, 8, 30, 52, 28, 8, 20, 28, 30, 58, 48, 32, 28, 39, 31, 30, 58, 28, 20, 31, 30, 40, 28, 42, 33, 32, 30, 28, 15, 38, 8, 20, 31, 21, 28, 18, 15, 28, 40, 33, 2, 28, 48, 33, 58, 13, 20, 28, 20, 38, 54, 30, 28, 1, 8, 32, 54, 32, 52, 28, 40, 33, 2, 28, 16, 38, 58, 13, 20, 28, 16, 1, 30, 38, 20, 30, 28, 38, 28, 15, 2, 20, 2, 1, 30, 50, 28, 18, 15, 28, 40, 33, 2, 28, 48, 33, 58, 13, 20, 28, 42, 8, 54, 30, 28, 40, 33, 2, 1, 28, 48, 30, 32, 20, 8, 58, 40, 52, 28, 48, 33, 58, 13, 20, 28, 38, 16, 16, 30, 0, 20, 28, 8, 20, 21, 28, 41, 31, 30, 58, 28, 40, 33, 2, 28, 17, 8, 19, 30, 28, 2, 0, 52, 28, 20, 31, 38, 20, 13, 32, 28, 39, 31, 30, 58, 28, 20, 31, 30, 28, 17, 38, 3, 30, 28, 30, 58, 48, 32, 21, 28, 14, 42, 42, 28, 39, 30, 28, 16, 38, 58, 28, 48, 33, 28, 8, 32, 28, 42, 8, 19, 30, 28, 2, 58, 20, 8, 42, 28, 20, 31, 30, 28, 48, 38, 40, 28, 39, 30, 28, 48, 8, 3

In [None]:
# Chunk the token stream into sequences of sequence_length + 1 tokens
# (100 tokens for the feature plus 1 extra so the label can be shifted by one)
sequence_length = 100
sequences_as_tf_data = tf.data.Dataset.from_tensor_slices(sequences).batch(sequence_length+1, drop_remainder=True)


In [None]:
# display the 2 samples from the dataset
for sample in sequences_as_tf_data.take(2):
  print("".join([token_to_char[token] for token in sample.numpy()]))

People’s lives don’t end when they die, it ends when they lose faith. If you don’t take risks, you ca
n’t create a future! If you don’t like your destiny, don’t accept it. When you give up, that’s when t


In [None]:
# Split sequences into features and labels
def split_sequence(sequence):
  feature = sequence[:-1]
  label = sequence[1:]
  return feature, label


In [None]:
# try it out on the 2 samples
for sample in sequences_as_tf_data.take(2):
  (feature, label) = split_sequence(sample.numpy())
  print("\nFeature, Label pair")
  print("".join([token_to_char[token] for token in feature]))
  print("".join([token_to_char[token] for token in label]))



Feature, Label pair
People’s lives don’t end when they die, it ends when they lose faith. If you don’t take risks, you c
eople’s lives don’t end when they die, it ends when they lose faith. If you don’t take risks, you ca

Feature, Label pair
n’t create a future! If you don’t like your destiny, don’t accept it. When you give up, that’s when 
’t create a future! If you don’t like your destiny, don’t accept it. When you give up, that’s when t


In [None]:
# Apply the split_sequence function to the dataset
feature_label_data = sequences_as_tf_data.map(split_sequence)

for feature, label in feature_label_data.take(1):
  print(feature)
  print(label)


tf.Tensor(
[55 30 33  0 42 30 13 32 28 42  8 19 30 32 28 48 33 58 13 20 28 30 58 48
 28 39 31 30 58 28 20 31 30 40 28 48  8 30 52 28  8 20 28 30 58 48 32 28
 39 31 30 58 28 20 31 30 40 28 42 33 32 30 28 15 38  8 20 31 21 28 18 15
 28 40 33  2 28 48 33 58 13 20 28 20 38 54 30 28  1  8 32 54 32 52 28 40
 33  2 28 16], shape=(100,), dtype=int32)
tf.Tensor(
[30 33  0 42 30 13 32 28 42  8 19 30 32 28 48 33 58 13 20 28 30 58 48 28
 39 31 30 58 28 20 31 30 40 28 48  8 30 52 28  8 20 28 30 58 48 32 28 39
 31 30 58 28 20 31 30 40 28 42 33 32 30 28 15 38  8 20 31 21 28 18 15 28
 40 33  2 28 48 33 58 13 20 28 20 38 54 30 28  1  8 32 54 32 52 28 40 33
  2 28 16 38], shape=(100,), dtype=int32)


We have our text data prepared. The input to the model is a sequence of tokens, and the label is the same sequence shifted by one position, so each label token is the char that follows the corresponding input token.

In [None]:
# might be easier to convert sequences to strings if we define a function to convert it
def convert_sequence_to_string(sequence):
  string = "".join([token_to_char[token] for token in sequence])
  return string


In [None]:
for feature, label in feature_label_data.take(1):
  print(convert_sequence_to_string(feature.numpy()))
  print(convert_sequence_to_string(label.numpy()))

People’s lives don’t end when they die, it ends when they lose faith. If you don’t take risks, you c
eople’s lives don’t end when they die, it ends when they lose faith. If you don’t take risks, you ca


In [None]:
# create a batched dataset
batched_dataset = (feature_label_data.batch(1))

Although this seems unnecessary, considering there are really only 124 samples, RNN models expect a batch dimension.

# **Define the RNN model**

In [None]:
vocab_size = 61
Embedding_dim = 128
GRU_units = 256


In [None]:
# define a text generation model
Anime_qoutes_model = tf.keras.Sequential([tf.keras.layers.Embedding(vocab_size, Embedding_dim),
                                         tf.keras.layers.GRU(units=GRU_units, dropout=0.5, 
                                                             recurrent_dropout=0.25,
                                                             return_sequences=True),
                                         tf.keras.layers.Dense(units=vocab_size, activation="softmax")])




In [None]:
# Compile the model
Anime_qoutes_model.compile(optimizer='adam',
                           loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                           )

Anime_qoutes_model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       (None, None, 128)         7808      
                                                                 
 gru (GRU)                   (None, None, 256)         296448    
                                                                 
 dense (Dense)               (None, None, 61)          15677     
                                                                 
Total params: 319,933
Trainable params: 319,933
Non-trainable params: 0
_________________________________________________________________


Some notes that were lost from an unsaved version:
- RNN layers need a batch dimension in the data.
- Use SparseCategoricalCrossentropy for multiclass classification when the labels are integers rather than one-hot encoded.
- Use CategoricalCrossentropy for multiclass classification with one-hot encoded labels.
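
A small NumPy sketch of why the two losses are interchangeable apart from the label format. The probabilities and labels are made-up values, and the formulas are the plain textbook cross-entropy rather than the Keras implementation.

```python
import numpy as np

# Hypothetical predicted distributions for 2 samples over 3 classes
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6]])

int_labels = np.array([0, 2])           # integer labels ("sparse" form)
one_hot_labels = np.eye(3)[int_labels]  # the same labels, one-hot encoded

# Sparse form: pick the probability of the true class directly
sparse_ce = -np.log(probs[np.arange(len(int_labels)), int_labels]).mean()

# One-hot form: the one-hot vector zeroes out every other class
categorical_ce = -(one_hot_labels * np.log(probs)).sum(axis=1).mean()

print(sparse_ce, categorical_ce)
```

Both give the same number; the sparse version just avoids building the one-hot vectors.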

# **Train the model**

**Run forward pass and get the loss**

In [None]:
# get a single feature and label from the batch dataset
for batch_feature, batch_label in batched_dataset.take(1):
  print(batch_feature[0])
  print(batch_label[0])
  

tf.Tensor(
[55 30 33  0 42 30 13 32 28 42  8 19 30 32 28 48 33 58 13 20 28 30 58 48
 28 39 31 30 58 28 20 31 30 40 28 48  8 30 52 28  8 20 28 30 58 48 32 28
 39 31 30 58 28 20 31 30 40 28 42 33 32 30 28 15 38  8 20 31 21 28 18 15
 28 40 33  2 28 48 33 58 13 20 28 20 38 54 30 28  1  8 32 54 32 52 28 40
 33  2 28 16], shape=(100,), dtype=int32)
tf.Tensor(
[30 33  0 42 30 13 32 28 42  8 19 30 32 28 48 33 58 13 20 28 30 58 48 28
 39 31 30 58 28 20 31 30 40 28 48  8 30 52 28  8 20 28 30 58 48 32 28 39
 31 30 58 28 20 31 30 40 28 42 33 32 30 28 15 38  8 20 31 21 28 18 15 28
 40 33  2 28 48 33 58 13 20 28 20 38 54 30 28  1  8 32 54 32 52 28 40 33
  2 28 16 38], shape=(100,), dtype=int32)


In [None]:
# display the string
print(convert_sequence_to_string(batch_feature[0].numpy()))
print(convert_sequence_to_string(batch_label[0].numpy()))


People’s lives don’t end when they die, it ends when they lose faith. If you don’t take risks, you c
eople’s lives don’t end when they die, it ends when they lose faith. If you don’t take risks, you ca


In [None]:
# pass the feature to the untrained model and get the prediction
batch_prediction = Anime_qoutes_model(batch_feature)
print(batch_prediction)


tf.Tensor(
[[[0.01644353 0.01639114 0.01645644 ... 0.01631887 0.01625522 0.0166002 ]
  [0.01636711 0.01647367 0.01622433 ... 0.01640512 0.01645475 0.01650914]
  [0.01642953 0.01659862 0.01609304 ... 0.01649837 0.01635189 0.01639101]
  ...
  [0.01637354 0.01678303 0.01618241 ... 0.01650596 0.01639222 0.01627393]
  [0.01636053 0.01669525 0.01630641 ... 0.01667272 0.0164395  0.01614342]
  [0.01665466 0.01635142 0.01626649 ... 0.01645242 0.01649253 0.01620634]]], shape=(1, 100, 61), dtype=float32)


In [None]:
# something interesting to show
# try passing just the feature and not the batch feature.
try:
  batch_prediction = Anime_qoutes_model(batch_feature[0])
  print(batch_prediction)
except Exception as e:
  print("Sorry that's a no no")
  print(f"{e}")

Sorry that's a no no
Exception encountered when calling layer "sequential" (type Sequential).

Input 0 of layer "gru" is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: (100, 128)

Call arguments received:
  • inputs=tf.Tensor(shape=(100,), dtype=int32)
  • training=False
  • mask=None


The GRU layer expects an input with 3 dimensions: `[batch, time steps, features per time step]`.
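
The shapes involved can be traced with NumPy alone. In this sketch a random matrix stands in for the Embedding layer weights (an assumption for illustration); only the shapes matter.

```python
import numpy as np

vocab_size, embedding_dim, time_steps = 61, 128, 100
rng = np.random.default_rng(0)

# Stand-in for the Embedding layer weights: one row per token id
embedding_table = rng.normal(size=(vocab_size, embedding_dim))

single_sequence = rng.integers(0, vocab_size, size=time_steps)  # shape (100,)

# Without a batch dimension the embedding output is 2-D, which the GRU rejected
no_batch = embedding_table[single_sequence]                     # shape (100, 128)

# Adding a leading batch dimension gives the 3-D input the GRU expects
batched = single_sequence[np.newaxis, :]                        # shape (1, 100)
with_batch = embedding_table[batched]                           # shape (1, 100, 128)

print(no_batch.shape, with_batch.shape)
```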

In [None]:
# convert the prediction into a readable text
predicted_char_list = []
for prob_distribution in batch_prediction[0].numpy():
  char_id = np.argmax(prob_distribution)
  predicted_char_list.append(token_to_char[char_id])

print(predicted_char_list)


['-', '!', ':', '“', '?', 'B', 'Y', 'O', 'i', '?', 'k', 'r', 'B', 's', 'i', 'i', 'i', '-', 'f', 'a', 'i', 'B', 'l', 'l', 'i', 'i', 'i', 'B', 'l', 'i', 'a', 'i', 'B', '!', 'i', 'i', 'i', 'B', 'M', 'i', 'i', 'r', 'i', 'B', '’', 'T', 'l', 'i', 'F', 'i', 'B', 'l', 'i', 'a', 'i', 'B', '!', 'i', '?', ':', ':', 'B', 'i', '’', '’', 'k', 'a', 'i', 'Y', 'i', 'i', 'g', 'i', 'g', ':', 'i', 'i', 'i', 'i', '-', 'f', 'a', 'i', 'a', 'a', 'd', 'B', 'i', 'M', 'i', 'k', 'Y', 'o', 'F', 'i', 'g', ':', 'i', 'i', 'i']


In [None]:
predicted_text = "".join(predicted_char_list)
print(predicted_text)
print(len(predicted_text))


-!:“?BYOi?krBsiii-faiBlliiiBliaiB!iiiBMiiriB’TliFiBliaiB!i?::Bi’’kaiYiigig:iiii-faiaadBiMikYoFig:iii
100


In [None]:
# get the loss of the model
loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
example_loss = loss(batch_label, batch_prediction).numpy()
print(example_loss)


4.106488


  return dispatch_target(*args, **kwargs)


I need to make sense of the warning. I found this [thread](https://stackoverflow.com/questions/67848962/selecting-loss-and-metrics-for-tensorflow-model) on Stack Overflow that gives some better guidance on selecting the loss and metrics, and on the difference between softmax and sigmoid activations.

**TLDR Summary**
- Use **sparse_categorical_accuracy as a metric for classification tasks** where the **label is an integer** and not a one-hot encoded label.
- Similarly, use **sparse_categorical_crossentropy** as the **loss function for classification tasks** in the same scenario.

<br>

- Where the **labels are one-hot encoded vectors**, use **categorical_accuracy as a metric** and **categorical_crossentropy as a loss function**.

<br>

- **Softmax activation functions** are commonly used in the output layer for **classification tasks**. They produce a **probability distribution, so the outputs of the layer sum to 1**.
Generally, if the model outputs a probability distribution, **you'll need to set from_logits=False**.

<br>

So what are Logits???   
[Another stack overflow thread](https://stackoverflow.com/questions/34240703/what-are-logits-what-is-the-difference-between-softmax-and-softmax-cross-entrop)

**Summary**
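
As I understand it, logits are the raw, unnormalised scores that come out of the last layer before softmax turns them into probabilities. A loss with `from_logits=True` applies the softmax itself, so it should be fed raw scores, not probabilities. A minimal NumPy sketch of that idea (the helper functions and the example scores are made up for illustration and are not the Keras implementation):

```python
import numpy as np

def softmax(z):
    # Subtract the max for numerical stability; the result is unchanged
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy_from_probs(probs, label):
    return -np.log(probs[label])

def cross_entropy_from_logits(logits, label):
    # "from_logits" behaviour: apply softmax first, then take the log-loss
    return cross_entropy_from_probs(softmax(logits), label)

logits = np.array([2.0, 0.5, -1.0])  # hypothetical raw scores for 3 classes
label = 0
probs = softmax(logits)

correct = cross_entropy_from_logits(logits, label)
also_correct = cross_entropy_from_probs(probs, label)

# Treating probabilities as if they were logits applies softmax twice
mismatched = cross_entropy_from_logits(probs, label)

print(probs.sum(), correct, also_correct, mismatched)
```

The first two losses agree, while the mismatched one is different, which is the kind of mistake the `from_logits` warning is about.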



**Calculate the loss**

In [None]:
# define the loss without setting the from_logits argument to be true
loss=tf.keras.losses.SparseCategoricalCrossentropy()
example_loss = loss(batch_label, batch_prediction).numpy()
print(example_loss)


4.106488


Happy days.
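
A quick sanity check on that number: an untrained model outputs an almost uniform distribution over the 61 chars, and the cross-entropy of a perfectly uniform guess is `ln(61)`. The sketch below only computes that reference value.

```python
import math

vocab_size = 61

# Cross-entropy when every char gets probability 1 / vocab_size
uniform_loss = -math.log(1.0 / vocab_size)
print(round(uniform_loss, 4))
```

The loss of the untrained model above (4.106488) is very close to this value, which is what we would expect before any training.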

As annoying as it is, I think it's best to keep the errors in the notebook so that I, and anyone who might be reading this, can easily see the issues, lessons and solutions I came across.


## **Redefine the model and train it**

In [None]:
Anime_qoutes_model = tf.keras.Sequential([tf.keras.layers.Embedding(vocab_size,
                                                                    Embedding_dim),
                                          tf.keras.layers.GRU(units=GRU_units,
                                                              dropout=0.5,
                                                              recurrent_dropout=0.25,
                                                              return_sequences=True),
                                          tf.keras.layers.Dense(units=vocab_size, activation="softmax")])

Anime_qoutes_model.compile(optimizer='adam',
                           loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=False), #from_logits is set to False by default
                           metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])



In [None]:
# define the model call backs
model_checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(filepath="./model_checkpoints/model_epoch_{epoch}_loss_{loss}",
                                                               monitor='loss',
                                                               save_best_only=True)
early_stopping_callback = tf.keras.callbacks.EarlyStopping(monitor='loss', min_delta=0.1, patience=4)
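
To get a feel for what `min_delta=0.1` and `patience=4` mean, here is a simplified plain-Python sketch of the stopping rule as I understand it: an epoch only counts as an improvement if the loss drops more than `min_delta` below the best loss so far, and training stops after `patience` epochs in a row without an improvement. This is not the Keras source, and the loss values are made up.

```python
def early_stop_epoch(losses, min_delta=0.1, patience=4):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best - min_delta:
            # Large enough improvement: remember it and reset the counter
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None

# Hypothetical loss curve that flattens out after the third epoch
losses = [3.0, 2.5, 2.0, 1.95, 1.92, 1.91, 1.90, 1.89]
print(early_stop_epoch(losses))
```

With a `min_delta` as large as 0.1, small but steady improvements late in training do not count, so the run can stop well before the requested number of epochs.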

I increased the number of epochs from 22 to 60 to see if we can get better accuracy and better generated text.

In [None]:
history = Anime_qoutes_model.fit(batched_dataset, epochs=60,
                                 callbacks=[model_checkpoint_callback, early_stopping_callback])

Epoch 1/60
...
Epoch 32/60

I have a feeling the final model is going to overfit the dataset; it might be best to stop at the 10th epoch.

I'm not quite sure what is going on with this warning:   
- 0.3391WARNING:absl:<keras.layers.recurrent.GRUCell object at 0x7f3a90345450> has the same name 'GRUCell' as a built-in Keras object. Consider renaming <class 'keras.layers.recurrent.GRUCell'> to avoid naming conflicts when loading with `tf.keras.models.load_model`. If renaming is not possible, pass the object in the `custom_objects` parameter of the load function.

Found a related [issue](https://github.com/keras-team/keras/issues/15964). The relevant reply: **Correct, the warnings have to do with the saving only and are indeed not related to model.fit**

# **Generate text using the trained model**

In [None]:
# if you know, you know
Seed_word = "Never going to take me down"
Seed_word_id = []
for char in Seed_word:
  id = char_to_token[char]
  Seed_word_id.append(id)

print(Seed_word)
print(Seed_word_id)
print(len(Seed_word_id))


Never going to take me down
[56, 30, 19, 30, 1, 28, 17, 33, 8, 58, 17, 28, 20, 33, 28, 20, 38, 54, 30, 28, 3, 30, 28, 48, 33, 39, 58]
27


In [None]:
for i in range(100):
  # get the model prediction
  prediction = Anime_qoutes_model.predict([Seed_word_id])
  # print(prediction.shape)

  # use the final prediction
  final_prediction_distribution = prediction[0][-1]
  predicted_char_id = np.argmax(final_prediction_distribution)
  predicted_char = token_to_char[predicted_char_id]
  # print(f"predicted_char_id: {predicted_char_id}")
  # print(f"predicted_char: {predicted_char}")
  
  Seed_word_id.append(predicted_char_id)



In [None]:
print(Seed_word_id)

[56, 30, 19, 30, 1, 28, 17, 33, 8, 58, 17, 28, 20, 33, 28, 20, 38, 54, 30, 28, 3, 30, 28, 48, 33, 39, 58, 28, 41, 30, 28, 49, 8, 3, 0, 42, 30, 28, 32, 20, 30, 0, 21, 28, 41, 30, 28, 25, 2, 32, 20, 28, 8, 3, 0, 33, 1, 20, 38, 58, 20, 28, 20, 33, 28, 46, 30, 28, 31, 38, 0, 0, 40, 23, 28, 18, 28, 38, 3, 28, 20, 31, 30, 28, 20, 33, 1, 28, 20, 31, 30, 28, 42, 8, 17, 31, 20, 28, 18, 15, 28, 40, 33, 2, 28, 39, 38, 58, 20, 28, 20, 33, 28, 46, 30, 28, 31, 38, 0, 0, 40, 23, 28, 18, 28, 38, 3]


In [None]:
print("Initial seed text: ", Seed_word)
list_of_generated_char = [token_to_char[token] for token in Seed_word_id]
generated_text = "".join(list_of_generated_char)
print(generated_text)

Initial seed text:  Never going to take me down
Never going to take me down We Simple step. We just important to be happy? I am the tor the light If you want to be happy? I am


Well, that is slightly better than before, but it looks like it got stuck in a loop towards the end and keeps predicting the same text.

One more try with a different input.

In [None]:
# if you know, you know
Seed_word = "I feel like anything is possible now. I can keep fighting a bit longer. My heartbeat sounds funny. This is my peak, this is my fifth gear"
Seed_word_id = []
for char in Seed_word:
  id = char_to_token[char]
  Seed_word_id.append(id)

print(Seed_word)
print(Seed_word_id)
print(len(Seed_word_id))

I feel like anything is possible now. I can keep fighting a bit longer. My heartbeat sounds funny. This is my peak, this is my fifth gear
[18, 28, 15, 30, 30, 42, 28, 42, 8, 54, 30, 28, 38, 58, 40, 20, 31, 8, 58, 17, 28, 8, 32, 28, 0, 33, 32, 32, 8, 46, 42, 30, 28, 58, 33, 39, 21, 28, 18, 28, 16, 38, 58, 28, 54, 30, 30, 0, 28, 15, 8, 17, 31, 20, 8, 58, 17, 28, 38, 28, 46, 8, 20, 28, 42, 33, 58, 17, 30, 1, 21, 28, 9, 40, 28, 31, 30, 38, 1, 20, 46, 30, 38, 20, 28, 32, 33, 2, 58, 48, 32, 28, 15, 2, 58, 58, 40, 21, 28, 22, 31, 8, 32, 28, 8, 32, 28, 3, 40, 28, 0, 30, 38, 54, 52, 28, 20, 31, 8, 32, 28, 8, 32, 28, 3, 40, 28, 15, 8, 15, 20, 31, 28, 17, 30, 38, 1]
137


In [None]:
for i in range(200):
  # get the model prediction
  prediction = Anime_qoutes_model.predict([Seed_word_id])
  # print(prediction.shape)

  # use the final prediction
  final_prediction_distribution = prediction[0][-1]
  predicted_char_id = np.argmax(final_prediction_distribution)
  predicted_char = token_to_char[predicted_char_id]
  # print(f"predicted_char_id: {predicted_char_id}")
  # print(f"predicted_char: {predicted_char}")
  
  Seed_word_id.append(predicted_char_id)

In [None]:
print(Seed_word_id)

[18, 28, 15, 30, 30, 42, 28, 42, 8, 54, 30, 28, 38, 58, 40, 20, 31, 8, 58, 17, 28, 8, 32, 28, 0, 33, 32, 32, 8, 46, 42, 30, 28, 58, 33, 39, 21, 28, 18, 28, 16, 38, 58, 28, 54, 30, 30, 0, 28, 15, 8, 17, 31, 20, 8, 58, 17, 28, 38, 28, 46, 8, 20, 28, 42, 33, 58, 17, 30, 1, 21, 28, 9, 40, 28, 31, 30, 38, 1, 20, 46, 30, 38, 20, 28, 32, 33, 2, 58, 48, 32, 28, 15, 2, 58, 58, 40, 21, 28, 22, 31, 8, 32, 28, 8, 32, 28, 3, 40, 28, 0, 30, 38, 54, 52, 28, 20, 31, 8, 32, 28, 8, 32, 28, 3, 40, 28, 15, 8, 15, 20, 31, 28, 17, 30, 38, 1, 20, 28, 20, 31, 30, 28, 46, 30, 32, 20, 28, 38, 58, 48, 28, 32, 20, 33, 0, 32, 28, 38, 58, 48, 28, 16, 38, 58, 28, 58, 30, 19, 30, 1, 28, 31, 33, 39, 28, 0, 38, 20, 31, 30, 20, 8, 16, 28, 20, 31, 30, 28, 46, 30, 32, 20, 28, 38, 58, 48, 28, 32, 20, 33, 0, 32, 28, 38, 58, 48, 28, 16, 38, 58, 28, 58, 30, 19, 30, 1, 28, 31, 33, 39, 28, 0, 38, 20, 31, 30, 20, 8, 16, 28, 20, 31, 30, 28, 46, 30, 32, 20, 28, 38, 58, 48, 28, 32, 20, 33, 0, 32, 28, 38, 58, 48, 28, 16, 38, 58, 28,

In [None]:
print("Initial seed text: ", Seed_word)
list_of_generated_char = [token_to_char[token] for token in Seed_word_id]
generated_text = "".join(list_of_generated_char)
print(generated_text)

Initial seed text:  I feel like anything is possible now. I can keep fighting a bit longer. My heartbeat sounds funny. This is my peak, this is my fifth gear
I feel like anything is possible now. I can keep fighting a bit longer. My heartbeat sounds funny. This is my peak, this is my fifth geart the best and stops and can never how pathetic the best and stops and can never how pathetic the best and stops and can never how pathetic the best and stops and can never how pathetic the best and s


Yeah, it just ends up repeating the same text towards the end.

I wonder if it's because it's doing nothing with the states: as the model goes through the sequence, it generates an output probability distribution at each step.

# **Another attempt with a different model**

In [None]:
vocab_size = 61
Embedding_dim = 128
GRU_units = 128

In [None]:
Anime_qoutes_model_2 = tf.keras.Sequential([tf.keras.layers.Embedding(vocab_size,
                                                                    Embedding_dim),
                                          tf.keras.layers.GRU(units=GRU_units,
                                                              dropout=0.5,
                                                              recurrent_dropout=0.25,
                                                              return_sequences=True),
                                          tf.keras.layers.GRU(units=GRU_units,
                                                              dropout=0.5,
                                                              recurrent_dropout=0.25,
                                                              return_sequences=False),
                                          tf.keras.layers.Dense(units=vocab_size, activation="softmax")])

Anime_qoutes_model_2.compile(optimizer='adam',
                           loss=tf.keras.losses.SparseCategoricalCrossentropy(),
                           metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])



In [None]:
# define the model call backs
model_checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(filepath="./model_checkpoints/model_epoch_{epoch}_loss_{loss}",
                                                               monitor='loss',
                                                               save_best_only=True)
early_stopping_callback = tf.keras.callbacks.EarlyStopping(monitor='loss', min_delta=0.1, patience=4)

Since the model no longer outputs a sequence, it would fail if we tried to train it on the current batched dataset. The label the model tries to predict should no longer be another sequence but just the final character.

In [None]:
sequences_as_tf_data

def split_sequence(sequence):
  feature = sequence[:-1]
  label = sequence[-1]
  return feature, label

for sample in sequences_as_tf_data.take(2):
  (feature, label) = split_sequence(sample.numpy())
  print("\nFeature, Label pair")
  print("".join([token_to_char[token] for token in feature]))
  print(f"{token_to_char[label]}")



Feature, Label pair
People’s lives don’t end when they die, it ends when they lose faith. If you don’t take risks, you c
a

Feature, Label pair
n’t create a future! If you don’t like your destiny, don’t accept it. When you give up, that’s when 
t


In [None]:
# Apply the split_sequence function to the dataset
feature_label_data = sequences_as_tf_data.map(split_sequence)

for feature, label in feature_label_data.take(1):
  print(feature)
  print(label)
  

tf.Tensor(
[33 19 46 15 54 19 36 60 58 54  8  1 19 60 58  6 46  3 36 25 58 19  3  6
 58 32 20 19  3 58 25 20 19 29 58  6  8 19 23 58  8 25 58 19  3  6 60 58
 32 20 19  3 58 25 20 19 29 58 54 46 60 19 58 21 39  8 25 20 41 58 22 21
 58 29 46 59 58  6 46  3 36 25 58 25 39 34 19 58 31  8 60 34 60 23 58 29
 46 59 58 26], shape=(100,), dtype=int32)
tf.Tensor(39, shape=(), dtype=int32)


In [None]:
# create a batched_dataset to train the model on
batched_dataset = (feature_label_data.batch(1))

In [None]:
history = Anime_qoutes_model_2.fit(batched_dataset, epochs=15,
                                 callbacks=[model_checkpoint_callback, early_stopping_callback])

Epoch 1/15
...
Epoch 15/15


Try out this model

In [None]:
# if you know, you know
Seed_word = "I feel like anything is possible now. I can keep fighting a bit longer. My heartbeat sounds funny. This is my peak, this is gear fifth"
Seed_word_id = []
for char in Seed_word:
  id = char_to_token[char]
  Seed_word_id.append(id)

print(Seed_word)
print(Seed_word_id)
print(len(Seed_word_id))


I feel like anything is possible now. I can keep fighting a bit longer. My heartbeat sounds funny. This is my peak, this is gear fifth
[22, 58, 21, 19, 19, 54, 58, 54, 8, 34, 19, 58, 39, 3, 29, 25, 20, 8, 3, 47, 58, 8, 60, 58, 15, 46, 60, 60, 8, 2, 54, 19, 58, 3, 46, 32, 41, 58, 22, 58, 26, 39, 3, 58, 34, 19, 19, 15, 58, 21, 8, 47, 20, 25, 8, 3, 47, 58, 39, 58, 2, 8, 25, 58, 54, 46, 3, 47, 19, 31, 41, 58, 28, 29, 58, 20, 19, 39, 31, 25, 2, 19, 39, 25, 58, 60, 46, 59, 3, 6, 60, 58, 21, 59, 3, 3, 29, 41, 58, 11, 20, 8, 60, 58, 8, 60, 58, 16, 29, 58, 15, 19, 39, 34, 23, 58, 25, 20, 8, 60, 58, 8, 60, 58, 47, 19, 39, 31, 58, 21, 8, 21, 25, 20]
134


In [None]:
for i in range(100):
  # get the model prediction
  prediction = Anime_qoutes_model_2.predict([Seed_word_id])
  # print(prediction.shape)

  # use the final prediction
  final_prediction_distribution = prediction[0][-1]
  predicted_char_id = np.argmax(final_prediction_distribution)
  predicted_char = token_to_char[predicted_char_id]
  # print(f"predicted_char_id: {predicted_char_id}")
  # print(f"predicted_char: {predicted_char}")
  
  Seed_word_id.append(predicted_char_id)


In [None]:
print("Initial seed text: ", Seed_word)
list_of_generated_char = [token_to_char[token] for token in Seed_word_id]
generated_text = "".join(list_of_generated_char)
print(generated_text)

Initial seed text:  I feel like anything is possible now. I can keep fighting a bit longer. My heartbeat sounds funny. This is my peak, this is gear fifth
I feel like anything is possible now. I can keep fighting a bit longer. My heartbeat sounds funny. This is my peak, this is gear fifthKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKKK
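
A possible explanation for the run of identical characters, inferred from the code rather than verified in this notebook: the last GRU in this model has `return_sequences=False`, so `predict` should return one distribution per batch item with shape `(1, vocab_size)`. In that case `prediction[0][-1]` is a single probability rather than a distribution, and `np.argmax` of a single number is always 0, so the same token would be appended every time. The sketch below uses a random distribution as a stand-in for the model output.

```python
import numpy as np

vocab_size = 61
rng = np.random.default_rng(0)

# Stand-in for Anime_qoutes_model_2.predict(...): shape (1, vocab_size)
prediction = rng.dirichlet(np.ones(vocab_size), size=1)

# Indexing as in the generation loop gives one number, not a distribution
scalar = prediction[0][-1]
always_zero = np.argmax(scalar)

# Indexing only the batch dimension keeps the full distribution
distribution = prediction[0]
next_token = np.argmax(distribution)

print(np.ndim(scalar), always_zero)
print(distribution.shape, next_token)
```

If that is what happened, using `prediction[0]` for this model would at least let the argmax pick from all the chars.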


I've had a look at the TensorFlow guide on text generation with an RNN and the Udacity notebook on text generation.

One thing I could try, to resolve the repeated blocks of generated text, is to randomly sample the next char from the categorical distribution instead of always taking the most likely one. To avoid making this notebook too big, I'll attempt this in a different notebook.
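
As a rough idea of what that could look like, here is a minimal NumPy sketch of sampling the next token from a predicted distribution. The `temperature` parameter and the function name are my own additions for illustration: values below 1 make the choice closer to argmax, values above 1 make it more random.

```python
import numpy as np

def sample_next_token(probs, temperature=1.0, rng=None):
    """Sample a token id from a probability distribution."""
    if rng is None:
        rng = np.random.default_rng()
    # Work in log space so the temperature can rescale the distribution
    logits = np.log(np.asarray(probs, dtype=np.float64) + 1e-9) / temperature
    logits -= logits.max()  # numerical stability
    scaled = np.exp(logits)
    scaled /= scaled.sum()
    return int(rng.choice(len(scaled), p=scaled))

# Hypothetical distribution over a 3-char vocabulary
probs = [0.5, 0.3, 0.2]
rng = np.random.default_rng(0)
samples = [sample_next_token(probs, rng=rng) for _ in range(200)]
print(sorted(set(samples)))
```

In the generation loop this would replace the `np.argmax` call, so the model no longer follows exactly the same path every time it sees the same context.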