# **4th Attempt at Text Generation Using RNN**

Following on from my 3rd attempt, I thought I'd train the model on the same dataset, but this time sample the next character from the prediction distribution instead of always selecting the most probable character.
I also want to make model training faster. I think dropping the model checkpoint callback at the end of every epoch would be ideal, or at the very least reducing how often it runs.


## **Import Dependencies**

In [1]:
import numpy as np
import tensorflow as tf

print(tf.__version__)

2.8.2


## **Get the Data**

In [2]:
# I downloaded the dataset from this link
url = "https://www.kaggle.com/datasets/tarundalal/anime-quotes/download?datasetVersionNumber=1"


In [3]:
import csv

# read the csvfile
anime_quotes = []

# the csv file contains Quote, character, Anime.
with open('AnimeQuotes.csv') as csv_file:
  csv_reader = csv.reader(csv_file, delimiter=',')
  for row in csv_reader:
    anime_quotes.append(row[0])

print(anime_quotes[:10])


['Quote', 'People’s lives don’t end when they die, it ends when they lose faith.', 'If you don’t take risks, you can’t create a future!', 'If you don’t like your destiny, don’t accept it.', 'When you give up, that’s when the game ends.', 'All we can do is live until the day we die. Control what we can…and fly free.', 'Forgetting is like a wound. The wound may heal, but it has already left a scar.', 'It’s just pathetic to give up on something before you even give it a shot.”', 'If you don’t share someone’s pain, you can never understand them.', 'Whatever you lose, you’ll find it again. But what you throw away you’ll never get back.']


## **Prepare the data**

In [4]:
# remove the header
anime_quotes = anime_quotes[1:]

print(anime_quotes[:10])
print(len(anime_quotes))


['People’s lives don’t end when they die, it ends when they lose faith.', 'If you don’t take risks, you can’t create a future!', 'If you don’t like your destiny, don’t accept it.', 'When you give up, that’s when the game ends.', 'All we can do is live until the day we die. Control what we can…and fly free.', 'Forgetting is like a wound. The wound may heal, but it has already left a scar.', 'It’s just pathetic to give up on something before you even give it a shot.”', 'If you don’t share someone’s pain, you can never understand them.', 'Whatever you lose, you’ll find it again. But what you throw away you’ll never get back.', 'We don’t have to know what tomorrow holds! That’s why we can live for everything we’re worth today!”']
121


In [5]:
# Combine the contents into a single string
all_anime_quotes = "".join(anime_quotes)

# get some information about the dataset
num_char = len(all_anime_quotes)
unique_chars = set(all_anime_quotes)
vocab_size = len(unique_chars)

print(f"Number of characters in the string: {num_char}")
print(f"Number of unique characters in the string: {vocab_size}")
print(f"Unique characters in the string: {unique_chars}")


Number of characters in the string: 12381
Number of unique characters in the string: 61
Unique characters in the string: {'g', 'U', 'q', 'C', 'h', 'o', '’', 'P', '…', 'x', '-', 't', 'T', 'S', '.', 'G', 'E', ' ', 'c', '“', '?', 'z', 'J', 'w', 'A', 'f', 'i', 'H', 'K', 'B', ',', 'I', 'j', 'N', 'l', 'e', 'R', 'n', 'M', 'k', 'W', 'm', 'O', 'd', 'p', 'b', '”', 'V', 'v', 's', 'u', 'D', ':', '!', 'Y', 'r', 'y', '\xa0', 'a', 'F', 'L'}


**Define the character encoder**

In [6]:
# define a dictionary to map the chars to a token
char_to_token = dict((char, index) for index, char in enumerate(unique_chars))
print(char_to_token)


{'g': 0, 'U': 1, 'q': 2, 'C': 3, 'h': 4, 'o': 5, '’': 6, 'P': 7, '…': 8, 'x': 9, '-': 10, 't': 11, 'T': 12, 'S': 13, '.': 14, 'G': 15, 'E': 16, ' ': 17, 'c': 18, '“': 19, '?': 20, 'z': 21, 'J': 22, 'w': 23, 'A': 24, 'f': 25, 'i': 26, 'H': 27, 'K': 28, 'B': 29, ',': 30, 'I': 31, 'j': 32, 'N': 33, 'l': 34, 'e': 35, 'R': 36, 'n': 37, 'M': 38, 'k': 39, 'W': 40, 'm': 41, 'O': 42, 'd': 43, 'p': 44, 'b': 45, '”': 46, 'V': 47, 'v': 48, 's': 49, 'u': 50, 'D': 51, ':': 52, '!': 53, 'Y': 54, 'r': 55, 'y': 56, '\xa0': 57, 'a': 58, 'F': 59, 'L': 60}
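One thing to watch: the mapping above is built by iterating over a `set`, and the iteration order of a set of strings can change between Python sessions because string hashing is randomized by default. The token IDs may therefore differ on a rerun. A minimal sketch of a reproducible alternative, assuming a sorted vocabulary is acceptable (`build_vocab` and `sample_text` are hypothetical names, not part of the notebook):

```python
def build_vocab(text):
    """Return (char_to_token, token_to_char) with a reproducible ordering."""
    # sorted() fixes the order, so the same text always yields the same IDs
    unique_chars = sorted(set(text))
    char_to_token = {char: index for index, char in enumerate(unique_chars)}
    token_to_char = {index: char for char, index in char_to_token.items()}
    return char_to_token, token_to_char


# hypothetical stand-in for all_anime_quotes
sample_text = "If you don’t take risks, you can’t create a future!"
char_to_token_sorted, token_to_char_sorted = build_vocab(sample_text)

# round-trip check: encoding then decoding returns the original text
encoded = [char_to_token_sorted[char] for char in sample_text]
decoded = "".join(token_to_char_sorted[token] for token in encoded)
print(decoded == sample_text)
```

This matters mostly if the trained model is saved and reloaded in another session, since the weights only make sense with the mapping they were trained on.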


**Define the character decoder**

In [7]:
# define a dictionary to map each token back to its character
token_to_char = dict((value, key) for key, value in char_to_token.items())
print(token_to_char)


{0: 'g', 1: 'U', 2: 'q', 3: 'C', 4: 'h', 5: 'o', 6: '’', 7: 'P', 8: '…', 9: 'x', 10: '-', 11: 't', 12: 'T', 13: 'S', 14: '.', 15: 'G', 16: 'E', 17: ' ', 18: 'c', 19: '“', 20: '?', 21: 'z', 22: 'J', 23: 'w', 24: 'A', 25: 'f', 26: 'i', 27: 'H', 28: 'K', 29: 'B', 30: ',', 31: 'I', 32: 'j', 33: 'N', 34: 'l', 35: 'e', 36: 'R', 37: 'n', 38: 'M', 39: 'k', 40: 'W', 41: 'm', 42: 'O', 43: 'd', 44: 'p', 45: 'b', 46: '”', 47: 'V', 48: 'v', 49: 's', 50: 'u', 51: 'D', 52: ':', 53: '!', 54: 'Y', 55: 'r', 56: 'y', 57: '\xa0', 58: 'a', 59: 'F', 60: 'L'}


In [13]:
# Sanity check
print(char_to_token['A'])
print(token_to_char[char_to_token['A']])


24
A


**Text preparation process**
- Convert the characters into tokens
- Split the list of tokens into sequences
- Generate a batched dataset of shape `[batch size, sequence length]`; the `vocab size` dimension only appears in the model output
- Extract the feature and label from each sequence
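The steps in this list can be sketched in plain Python before involving `tf.data`. This is only an illustration on a toy string; `make_samples`, `toy_text` and `toy_vocab` are hypothetical names, and the real notebook does the same work with dataset operations.

```python
def make_samples(tokens, sequence_length):
    """Chunk tokens into windows of sequence_length + 1, then split each
    window into a (feature, label) pair shifted by one position."""
    window = sequence_length + 1
    samples = []
    for start in range(0, len(tokens), window):
        chunk = tokens[start:start + window]
        if len(chunk) < 2:
            continue  # a single token cannot form a feature/label pair
        samples.append((chunk[:-1], chunk[1:]))
    return samples


# hypothetical toy vocabulary and text, only for illustration
toy_text = "hello world"
toy_vocab = {char: index for index, char in enumerate(sorted(set(toy_text)))}
toy_tokens = [toy_vocab[char] for char in toy_text]

for feature, label in make_samples(toy_tokens, sequence_length=4):
    print(feature, label)
```

Each label is its feature shifted one token to the left, so at every position the model is asked to predict the next character.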

**Convert each character in the text into a token**

In [14]:
# create a list of tokens
all_anime_quotes_token = []

for char in all_anime_quotes:
  all_anime_quotes_token.append(char_to_token[char])

assert len(all_anime_quotes_token) == len(all_anime_quotes) 


In [15]:
print(all_anime_quotes_token)

[7, 35, 5, 44, 34, 35, 6, 49, 17, 34, 26, 48, 35, 49, 17, 43, 5, 37, 6, 11, 17, 35, 37, 43, 17, 23, 4, 35, 37, 17, 11, 4, 35, 56, 17, 43, 26, 35, 30, 17, 26, 11, 17, 35, 37, 43, 49, 17, 23, 4, 35, 37, 17, 11, 4, 35, 56, 17, 34, 5, 49, 35, 17, 25, 58, 26, 11, 4, 14, 31, 25, 17, 56, 5, 50, 17, 43, 5, 37, 6, 11, 17, 11, 58, 39, 35, 17, 55, 26, 49, 39, 49, 30, 17, 56, 5, 50, 17, 18, 58, 37, 6, 11, 17, 18, 55, 35, 58, 11, 35, 17, 58, 17, 25, 50, 11, 50, 55, 35, 53, 31, 25, 17, 56, 5, 50, 17, 43, 5, 37, 6, 11, 17, 34, 26, 39, 35, 17, 56, 5, 50, 55, 17, 43, 35, 49, 11, 26, 37, 56, 30, 17, 43, 5, 37, 6, 11, 17, 58, 18, 18, 35, 44, 11, 17, 26, 11, 14, 40, 4, 35, 37, 17, 56, 5, 50, 17, 0, 26, 48, 35, 17, 50, 44, 30, 17, 11, 4, 58, 11, 6, 49, 17, 23, 4, 35, 37, 17, 11, 4, 35, 17, 0, 58, 41, 35, 17, 35, 37, 43, 49, 14, 24, 34, 34, 17, 23, 35, 17, 18, 58, 37, 17, 43, 5, 17, 26, 49, 17, 34, 26, 48, 35, 17, 50, 37, 11, 26, 34, 17, 11, 4, 35, 17, 43, 58, 56, 17, 23, 35, 17, 43, 26, 35, 14, 17, 3, 5, 3

**Generate a dataset from the tokens**

In [16]:
anime_dataset = tf.data.Dataset.from_tensor_slices(all_anime_quotes_token)

print([sample.numpy() for sample in anime_dataset.take(4)])

[7, 35, 5, 44]


**Split the dataset into sequences**

In [17]:
SEQUENCE_LENGTH = 100
anime_dataset_sequences = anime_dataset.batch(SEQUENCE_LENGTH + 1) # account for offset

example_sample = list(anime_dataset_sequences.take(1).as_numpy_iterator())
print(example_sample)
print(len(example_sample[0]))



[array([ 7, 35,  5, 44, 34, 35,  6, 49, 17, 34, 26, 48, 35, 49, 17, 43,  5,
       37,  6, 11, 17, 35, 37, 43, 17, 23,  4, 35, 37, 17, 11,  4, 35, 56,
       17, 43, 26, 35, 30, 17, 26, 11, 17, 35, 37, 43, 49, 17, 23,  4, 35,
       37, 17, 11,  4, 35, 56, 17, 34,  5, 49, 35, 17, 25, 58, 26, 11,  4,
       14, 31, 25, 17, 56,  5, 50, 17, 43,  5, 37,  6, 11, 17, 11, 58, 39,
       35, 17, 55, 26, 49, 39, 49, 30, 17, 56,  5, 50, 17, 18, 58, 37],
      dtype=int32)]
101
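Note that `batch` keeps a shorter final batch unless `drop_remainder=True` is passed, so the last sequence will not have 101 tokens. A quick back-of-the-envelope check, using the character count printed earlier:

```python
import math

num_char = 12381          # character count printed earlier
SEQUENCE_LENGTH = 100
window = SEQUENCE_LENGTH + 1

num_sequences = math.ceil(num_char / window)   # sequences when the remainder is kept
num_full_sequences = num_char // window        # sequences of exactly 101 tokens
remainder_tokens = num_char % window           # tokens in the final, shorter sequence

print(num_sequences, num_full_sequences, remainder_tokens)
# → 123 122 59
```

A shorter last sequence is harmless with a batch size of 1, but with larger batches the samples in a batch would need matching lengths.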


**Convert each sequence into sample (feature + label pair)**

In [18]:
# define a function to convert each sequence into a sample
def create_sample(sequence):
  feature = sequence[:-1]
  label = sequence[1:]
  return feature, label


In [19]:
# try out the function on a single sample from the dataset
for sequence in anime_dataset_sequences.take(1):
  sequence = sequence.numpy()
  feature, label = create_sample(sequence)
  print(feature)
  print(label)
  assert len(feature) == 100
  assert len(label) == 100


[ 7 35  5 44 34 35  6 49 17 34 26 48 35 49 17 43  5 37  6 11 17 35 37 43
 17 23  4 35 37 17 11  4 35 56 17 43 26 35 30 17 26 11 17 35 37 43 49 17
 23  4 35 37 17 11  4 35 56 17 34  5 49 35 17 25 58 26 11  4 14 31 25 17
 56  5 50 17 43  5 37  6 11 17 11 58 39 35 17 55 26 49 39 49 30 17 56  5
 50 17 18 58]
[35  5 44 34 35  6 49 17 34 26 48 35 49 17 43  5 37  6 11 17 35 37 43 17
 23  4 35 37 17 11  4 35 56 17 43 26 35 30 17 26 11 17 35 37 43 49 17 23
  4 35 37 17 11  4 35 56 17 34  5 49 35 17 25 58 26 11  4 14 31 25 17 56
  5 50 17 43  5 37  6 11 17 11 58 39 35 17 55 26 49 39 49 30 17 56  5 50
 17 18 58 37]


In [20]:
# Apply the function to the dataset
anime_dataset_samples = anime_dataset_sequences.map(create_sample)


In [21]:
for feature, label in anime_dataset_samples.take(1):
  print(feature.numpy())
  print(label.numpy())
  

[ 7 35  5 44 34 35  6 49 17 34 26 48 35 49 17 43  5 37  6 11 17 35 37 43
 17 23  4 35 37 17 11  4 35 56 17 43 26 35 30 17 26 11 17 35 37 43 49 17
 23  4 35 37 17 11  4 35 56 17 34  5 49 35 17 25 58 26 11  4 14 31 25 17
 56  5 50 17 43  5 37  6 11 17 11 58 39 35 17 55 26 49 39 49 30 17 56  5
 50 17 18 58]
[35  5 44 34 35  6 49 17 34 26 48 35 49 17 43  5 37  6 11 17 35 37 43 17
 23  4 35 37 17 11  4 35 56 17 43 26 35 30 17 26 11 17 35 37 43 49 17 23
  4 35 37 17 11  4 35 56 17 34  5 49 35 17 25 58 26 11  4 14 31 25 17 56
  5 50 17 43  5 37  6 11 17 11 58 39 35 17 55 26 49 39 49 30 17 56  5 50
 17 18 58 37]


**Apply the batch dimension**

In [22]:
anime_dataset_batched = anime_dataset_samples.batch(1)


In [23]:
for batched_example in anime_dataset_batched.take(1):
  print(batched_example)

(<tf.Tensor: shape=(1, 100), dtype=int32, numpy=
array([[ 7, 35,  5, 44, 34, 35,  6, 49, 17, 34, 26, 48, 35, 49, 17, 43,
         5, 37,  6, 11, 17, 35, 37, 43, 17, 23,  4, 35, 37, 17, 11,  4,
        35, 56, 17, 43, 26, 35, 30, 17, 26, 11, 17, 35, 37, 43, 49, 17,
        23,  4, 35, 37, 17, 11,  4, 35, 56, 17, 34,  5, 49, 35, 17, 25,
        58, 26, 11,  4, 14, 31, 25, 17, 56,  5, 50, 17, 43,  5, 37,  6,
        11, 17, 11, 58, 39, 35, 17, 55, 26, 49, 39, 49, 30, 17, 56,  5,
        50, 17, 18, 58]], dtype=int32)>, <tf.Tensor: shape=(1, 100), dtype=int32, numpy=
array([[35,  5, 44, 34, 35,  6, 49, 17, 34, 26, 48, 35, 49, 17, 43,  5,
        37,  6, 11, 17, 35, 37, 43, 17, 23,  4, 35, 37, 17, 11,  4, 35,
        56, 17, 43, 26, 35, 30, 17, 26, 11, 17, 35, 37, 43, 49, 17, 23,
         4, 35, 37, 17, 11,  4, 35, 56, 17, 34,  5, 49, 35, 17, 25, 58,
        26, 11,  4, 14, 31, 25, 17, 56,  5, 50, 17, 43,  5, 37,  6, 11,
        17, 11, 58, 39, 35, 17, 55, 26, 49, 39, 49, 30, 17, 56,  5, 50

## **Define the model**


In [24]:
embedding_dim = 128
gru_units = 256 
vocab_size
SEQUENCE_LENGTH

100

In [31]:
# define the RNN model
anime_qoutes_model = tf.keras.Sequential([tf.keras.layers.Embedding(vocab_size, embedding_dim),
                                          tf.keras.layers.GRU(gru_units, return_sequences=True),
                                          tf.keras.layers.Dense(vocab_size)]) # no softmax here: the layer outputs logits, matching from_logits=True in the loss

anime_qoutes_model.compile(loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
                           optimizer=tf.keras.optimizers.Adam(),
                           metrics=[tf.keras.metrics.SparseCategoricalAccuracy()])

anime_qoutes_model.summary()


Model: "sequential_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding_1 (Embedding)     (None, None, 128)         7808      
                                                                 
 gru_1 (GRU)                 (None, None, 256)         296448    
                                                                 
 dense_1 (Dense)             (None, None, 61)          15677     
                                                                 
Total params: 319,933
Trainable params: 319,933
Non-trainable params: 0
_________________________________________________________________
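As a sanity check, the parameter counts in the summary can be reproduced by hand. The sketch below assumes the GRU layer uses the Keras default `reset_after=True`, which gives each of the three weight sets two bias vectors.

```python
vocab_size = 61
embedding_dim = 128
gru_units = 256

# Embedding: one embedding_dim-sized vector per vocabulary entry
embedding_params = vocab_size * embedding_dim

# GRU: three weight sets (update gate, reset gate, candidate state), each with
# an input kernel, a recurrent kernel and two bias vectors (reset_after=True)
gru_params = 3 * (embedding_dim * gru_units + gru_units * gru_units + 2 * gru_units)

# Dense: one weight per (GRU unit, vocabulary entry) pair plus one bias per entry
dense_params = gru_units * vocab_size + vocab_size

print(embedding_params, gru_params, dense_params)
# → 7808 296448 15677
print(embedding_params + gru_params + dense_params)
# → 319933
```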


## **Train the model**

In [32]:
# define the model callbacks
early_stopping_callback = tf.keras.callbacks.EarlyStopping(monitor='loss',
                                                           min_delta=0.1,
                                                           patience=4,
                                                           restore_best_weights=True)
# I skipped the ModelCheckpoint callback this time
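To get a feel for what `min_delta=0.1` and `patience=4` mean together, here is a simplified sketch of the patience logic in plain Python. It is not the Keras implementation, and the loss values are made up for illustration (`early_stopping_epoch` is a hypothetical helper).

```python
def early_stopping_epoch(losses, min_delta, patience):
    """Return the 1-based epoch at which training would stop, or None.

    Simplified sketch of the patience logic for a monitored quantity that
    should decrease: an epoch only counts as an improvement when the loss
    drops by more than min_delta below the best value seen so far.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best - min_delta:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# hypothetical loss curve, not taken from the training run
losses = [3.2, 2.9, 2.7, 2.65, 2.64, 2.63, 2.62, 2.61]
print(early_stopping_epoch(losses, min_delta=0.1, patience=4))
# → 7
```

In this toy curve the loss is still going down when training stops, because none of the last four drops exceeds 0.1. A fairly large `min_delta` can therefore end training while the model is still improving slowly.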

In [33]:
EPOCH = 50

history = anime_qoutes_model.fit(anime_dataset_batched, epochs=EPOCH,
                                 callbacks=[early_stopping_callback])


Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50


Yikes, that was really quick. Early stopping ended training after epoch 32 of 50.

## **Generate text using the model**

In [39]:
seed_word = "Never going to take me down"
seed_word_id = []
for char in seed_word:
  seed_word_id.append(char_to_token[char])


print(seed_word)
print(seed_word_id)
print(len(seed_word_id))


Never going to take me down
[33, 35, 48, 35, 55, 17, 0, 5, 26, 37, 0, 17, 11, 5, 17, 11, 58, 39, 35, 17, 41, 35, 17, 43, 5, 23, 37]
27


In [35]:
# define a function to convert a sequence of tokens back into a string

def convert_sequence_to_text(sequence):
  text = "".join([token_to_char[token] for token in sequence])
  return text


In [36]:
# try out the defined function
print(convert_sequence_to_text(seed_word_id))


Never going to take me down


In [38]:
# Apply the seed word to the model
prediction = anime_qoutes_model.predict([seed_word_id])
print(prediction.shape)


(1, 27, 61)


In [40]:
last_prediction_distribution = prediction[0][-1]
print(last_prediction_distribution)


[-1.262339   -4.9454017  -8.605406   -7.217613   -7.604553    3.8042288
  4.027885   -1.7126205  -2.5859444  -4.00648    -2.0662582   3.133666
  0.61678535 -6.9769864   5.188908   -6.385815   -7.3067923   9.912058
  0.27676857 -2.2298343   3.8000202  -0.04259861 -4.801853   -3.5337353
 -3.1620827  -0.20651534 -2.0969124  -1.1270517  -5.738397   -5.8237863
  4.7318735   0.2872066  -5.459974   -1.693542   -4.205974    8.36103
 -3.8628836   3.3215897  -3.7585382  -3.7995265  -3.9572623   0.17300925
 -5.2188177   0.9885816  -6.1593227  -3.3142662  -1.6808448  -4.6945286
 -8.299984    1.1230333  -1.1680001  -1.8657911  -2.0646071   9.69963
 -0.6555708  -2.951839    1.0693482   0.13568424  0.51982045 -4.3344088
 -2.2094095 ]
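These values are logits, not probabilities, because the final `Dense` layer has no softmax. A minimal sketch of turning logits into probabilities, applied to just three of the largest logits printed above (positions 17, 53 and 35 in the array); using only three values is an illustration, so the resulting numbers are not the model's true probabilities over all 61 characters.

```python
import math


def softmax(logits):
    """Convert unnormalized logits into probabilities that sum to 1."""
    # subtracting the max keeps exp() from overflowing; the result is unchanged
    max_logit = max(logits)
    exps = [math.exp(value - max_logit) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


# three of the largest logits printed above, keyed by token ID
top_logits = {17: 9.912058, 53: 9.69963, 35: 8.36103}
probabilities = softmax(list(top_logits.values()))

for token_id, probability in zip(top_logits, probabilities):
    print(token_id, round(probability, 3))
```

In the `token_to_char` mapping those IDs are a space, `!` and `e`, which are all plausible characters to follow "down".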


In [51]:
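# tf.random.categorical expects logits of shape [batch, num_classes], hence the extra list
predicted_char_id = tf.random.categorical([last_prediction_distribution], 1)
# drop the batch and sample dimensions to get a scalar token ID
predicted_char_id = tf.squeeze(predicted_char_id).numpy()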
print(predicted_char_id)

17
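`tf.random.categorical` treats its input as unnormalized log-probabilities and draws class indices accordingly, so likely characters are picked often but not always. A rough pure-Python equivalent, with hypothetical toy logits (`sample_from_logits` is an illustrative helper, not a TensorFlow function):

```python
import math
import random


def sample_from_logits(logits, rng):
    """Draw one index with probability proportional to exp(logit)."""
    max_logit = max(logits)
    weights = [math.exp(value - max_logit) for value in logits]
    return rng.choices(range(len(logits)), weights=weights, k=1)[0]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
toy_logits = [2.0, 0.5, -1.0]  # hypothetical logits for a 3-token vocabulary

draws = [sample_from_logits(toy_logits, rng) for _ in range(1000)]
counts = [draws.count(index) for index in range(len(toy_logits))]
print(counts)
```

The counts should roughly follow the softmax of the logits: the first index dominates, but the other two still show up.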


In [52]:
# Repeat the steps above to generate text
for i in range(100):
  # get a prediction from the model
  prediction = anime_qoutes_model.predict([seed_word_id])
  
  # get the very last categorical distribution
  last_prediction_distribution = prediction[0][-1]

  # "randomly" select a category from the unnormalized distribution
  predicted_char_id = tf.random.categorical([last_prediction_distribution], 1)
  predicted_char_id = tf.squeeze(predicted_char_id).numpy()

  # append the prediction to the initial seed word
  seed_word_id.append(predicted_char_id)


print(seed_word_id)


[33, 35, 48, 35, 55, 17, 0, 5, 26, 37, 0, 17, 11, 5, 17, 11, 58, 39, 35, 17, 41, 35, 17, 43, 5, 23, 37, 17, 50, 44, 30, 17, 11, 4, 26, 37, 39, 17, 58, 17, 43, 5, 50, 49, 37, 35, 35, 49, 17, 49, 11, 55, 50, 34, 0, 57, 25, 5, 55, 17, 56, 5, 50, 17, 11, 4, 35, 11, 17, 56, 5, 50, 6, 55, 35, 17, 37, 5, 11, 17, 11, 4, 35, 17, 45, 35, 49, 11, 17, 58, 37, 43, 17, 58, 49, 17, 18, 58, 37, 14, 17, 31, 11, 17, 32, 5, 17, 58, 18, 5, 41, 35, 17, 25, 58, 18, 11, 34, 56, 17, 58, 17, 56, 5, 50, 17, 34]


In [53]:
# convert the sequence back into a text
generated_text = convert_sequence_to_text(seed_word_id)
print(generated_text)

Never going to take me down up, think a dousnees strulg for you thet you’re not the best and as can. It jo acome factly a you l


The text doesn't seem to repeat itself, but it is not very coherent. Let's try generating a longer sequence.
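Since the same loop gets rerun with different seeds and lengths, it could be wrapped in a helper. The sketch below is pure Python so it runs on its own: `next_logits_fn` is a hypothetical callable standing in for `model.predict(...)[0][-1]`, and `toy_next_logits` is a made-up replacement for the trained model.

```python
import math
import random


def generate_tokens(next_logits_fn, seed_tokens, num_steps, rng):
    """Extend seed_tokens by num_steps sampled tokens.

    next_logits_fn maps the token list so far to the logits for the next
    token (the role played by the model in the generation loop).
    """
    tokens = list(seed_tokens)  # copy, so the caller's seed is not modified
    for _ in range(num_steps):
        logits = next_logits_fn(tokens)
        max_logit = max(logits)
        weights = [math.exp(value - max_logit) for value in logits]
        next_token = rng.choices(range(len(logits)), weights=weights, k=1)[0]
        tokens.append(next_token)
    return tokens


def toy_next_logits(tokens):
    """Stand-in for the trained model: favours the token after the last one."""
    vocab_size = 4
    favoured = (tokens[-1] + 1) % vocab_size
    return [3.0 if index == favoured else 0.0 for index in range(vocab_size)]


generated = generate_tokens(toy_next_logits, seed_tokens=[0, 1], num_steps=10,
                            rng=random.Random(0))
print(generated)
```

Copying the seed inside the helper also avoids a side effect of the loop in the notebook, which appends to `seed_word_id` in place.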

In [54]:
seed_word = "I am going to be king of the pirates"
seed_word_id = []
for char in seed_word:
  seed_word_id.append(char_to_token[char])


print(seed_word)
print(seed_word_id)
print(len(seed_word_id))

I am going to be king of the pirates
[31, 17, 58, 41, 17, 0, 5, 26, 37, 0, 17, 11, 5, 17, 45, 35, 17, 39, 26, 37, 0, 17, 5, 25, 17, 11, 4, 35, 17, 44, 26, 55, 58, 11, 35, 49]
36


In [55]:
for i in range(300):
  # get a prediction from the model
  prediction = anime_qoutes_model.predict([seed_word_id])
  
  # get the very last categorical distribution
  last_prediction_distribution = prediction[0][-1]

  # "randomly" select a category from the unnormalized distribution
  predicted_char_id = tf.random.categorical([last_prediction_distribution], 1)
  predicted_char_id = tf.squeeze(predicted_char_id).numpy()

  # append the prediction to the initial seed word
  seed_word_id.append(predicted_char_id)


print(seed_word_id)

[31, 17, 58, 41, 17, 0, 5, 26, 37, 0, 17, 11, 5, 17, 45, 35, 17, 39, 26, 37, 0, 17, 5, 25, 17, 11, 4, 35, 17, 44, 26, 55, 58, 11, 35, 49, 11, 17, 58, 37, 43, 17, 11, 4, 35, 17, 49, 26, 0, 4, 11, 17, 5, 37, 17, 56, 5, 50, 55, 17, 11, 50, 35, 49, 17, 34, 26, 48, 35, 17, 50, 37, 11, 26, 34, 17, 11, 4, 35, 17, 43, 58, 56, 17, 23, 35, 17, 43, 26, 35, 14, 17, 24, 37, 43, 17, 56, 5, 50, 17, 49, 50, 35, 34, 58, 34, 26, 37, 0, 17, 34, 5, 48, 35, 17, 25, 5, 55, 23, 58, 55, 43, 14, 17, 15, 58, 48, 35, 17, 56, 5, 50, 17, 18, 58, 37, 17, 18, 58, 34, 34, 17, 45, 35, 0, 5, 55, 26, 11, 17, 56, 5, 50, 17, 49, 58, 34, 17, 56, 5, 50, 17, 11, 5, 17, 11, 50, 35, 37, 17, 34, 26, 35, 48, 35, 17, 26, 37, 17, 11, 4, 35, 17, 49, 26, 43, 17, 26, 49, 17, 11, 5, 17, 37, 5, 11, 17, 34, 35, 11, 17, 5, 25, 17, 11, 4, 35, 17, 23, 5, 55, 34, 43, 17, 31, 49, 17, 3, 55, 50, 35, 43, 17, 12, 4, 35, 17, 60, 35, 58, 37, 17, 56, 5, 50, 17, 4, 5, 48, 35, 17, 37, 5, 11, 17, 58, 17, 0, 35, 41, 26, 49, 26, 5, 37, 49, 16, 34, 34, 

In [56]:
# convert the sequence back into a text
generated_text = convert_sequence_to_text(seed_word_id)
print(generated_text)


I am going to be king of the piratest and the sight on your tues live until the day we die. And you suelaling love forward. Gave you can call begorit you sal you to tuen lieve in the sid is to not let of the world Is Crued The Lean you hove not a gemisionsEll fly because something. But by enduring that is the truth. Theever cloies. Do


It isn't repeating itself, but the words aren't sensible either. Let's try selecting only the most probable character.

In [58]:
seed_word = "We keep moving forward, opening new doors and doing new things, because we are curious and curiosity keeps leading us down new paths"
seed_word_id = []
for char in seed_word:
  seed_word_id.append(char_to_token[char])


print(seed_word)
print(seed_word_id)
print(len(seed_word_id))

We keep moving forward, opening new doors and doing new things, because we are curious and curiosity keeps leading us down new paths
[40, 35, 17, 39, 35, 35, 44, 17, 41, 5, 48, 26, 37, 0, 17, 25, 5, 55, 23, 58, 55, 43, 30, 17, 5, 44, 35, 37, 26, 37, 0, 17, 37, 35, 23, 17, 43, 5, 5, 55, 49, 17, 58, 37, 43, 17, 43, 5, 26, 37, 0, 17, 37, 35, 23, 17, 11, 4, 26, 37, 0, 49, 30, 17, 45, 35, 18, 58, 50, 49, 35, 17, 23, 35, 17, 58, 55, 35, 17, 18, 50, 55, 26, 5, 50, 49, 17, 58, 37, 43, 17, 18, 50, 55, 26, 5, 49, 26, 11, 56, 17, 39, 35, 35, 44, 49, 17, 34, 35, 58, 43, 26, 37, 0, 17, 50, 49, 17, 43, 5, 23, 37, 17, 37, 35, 23, 17, 44, 58, 11, 4, 49]
132


In [59]:
for i in range(300):
  # get a prediction from the model
  prediction = anime_qoutes_model.predict([seed_word_id])
  
  # get the very last categorical distribution
  last_prediction_distribution = prediction[0][-1]

  # select only the most probable char
  predicted_char_id = np.argmax(last_prediction_distribution)

  # append the prediction to the initial seed word
  seed_word_id.append(predicted_char_id)


print(seed_word_id)

[40, 35, 17, 39, 35, 35, 44, 17, 41, 5, 48, 26, 37, 0, 17, 25, 5, 55, 23, 58, 55, 43, 30, 17, 5, 44, 35, 37, 26, 37, 0, 17, 37, 35, 23, 17, 43, 5, 5, 55, 49, 17, 58, 37, 43, 17, 43, 5, 26, 37, 0, 17, 37, 35, 23, 17, 11, 4, 26, 37, 0, 49, 30, 17, 45, 35, 18, 58, 50, 49, 35, 17, 23, 35, 17, 58, 55, 35, 17, 18, 50, 55, 26, 5, 50, 49, 17, 58, 37, 43, 17, 18, 50, 55, 26, 5, 49, 26, 11, 56, 17, 39, 35, 35, 44, 49, 17, 34, 35, 58, 43, 26, 37, 0, 17, 50, 49, 17, 43, 5, 23, 37, 17, 37, 35, 23, 17, 44, 58, 11, 4, 49, 17, 26, 11, 17, 49, 5, 17, 43, 5, 37, 6, 11, 17, 35, 37, 43, 17, 23, 4, 35, 37, 17, 11, 4, 35, 17, 41, 5, 41, 35, 37, 11, 17, 26, 11, 17, 11, 55, 50, 34, 35, 17, 12, 5, 17, 29, 35, 17, 16, 37, 11, 26, 55, 35, 34, 56, 17, 15, 5, 5, 43, 17, 12, 5, 17, 16, 48, 35, 55, 56, 5, 37, 35, 17, 12, 5, 17, 13, 5, 41, 35, 17, 54, 5, 50, 17, 24, 55, 35, 17, 24, 17, 29, 58, 43, 17, 7, 35, 55, 49, 5, 37, 17, 31, 11, 17, 31, 49, 17, 31, 41, 44, 5, 49, 49, 26, 45, 34, 35, 17, 12, 5, 17, 29, 35, 17, 1

In [60]:
# convert the sequence back into a text
generated_text = convert_sequence_to_text(seed_word_id)
print(generated_text)


We keep moving forward, opening new doors and doing new things, because we are curious and curiosity keeps leading us down new paths it so don’t end when the moment it trule To Be Entirely Good To Everyone To Some You Are A Bad Person It Is Impossible To Be Entirely Good To Everyone To Some You Are A Bad Person It Is Impossible To Be Entirely Good To Everyone To Some You Are A Bad Person It Is Impossible To Be Entirely Good To E


Selecting the most probable character makes the output repeat itself, whereas sampling with the random categorical function avoids the loop but still doesn't produce coherent text.
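One commonly used middle ground between these two strategies, not tried in this notebook, is to divide the logits by a temperature before sampling. The sketch below only shows how temperature reshapes the distribution; the logits are hypothetical and `softmax_with_temperature` is an illustrative helper.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Return probabilities for logits divided by temperature.

    temperature < 1 sharpens the distribution (closer to argmax);
    temperature > 1 flattens it (closer to uniform sampling).
    """
    scaled = [value / temperature for value in logits]
    max_scaled = max(scaled)
    exps = [math.exp(value - max_scaled) for value in scaled]
    total = sum(exps)
    return [value / total for value in exps]


toy_logits = [2.0, 1.0, 0.0]  # hypothetical logits
for temperature in (0.2, 1.0, 5.0):
    probabilities = softmax_with_temperature(toy_logits, temperature)
    print(temperature, [round(p, 3) for p in probabilities])
```

Whether a lower temperature would actually give more coherent quotes here is untested; with such a small dataset the model itself may be the limiting factor.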