# RNN Play Generator
We are going to use an RNN to generate text one character at a time. We will train it on an existing play so that it learns to write a version of its own.

We will do this with a character-level predictive model that takes a variable-length sequence as input and predicts the next character.

We can call the model many times in a row, feeding the output of the last prediction back in as the input for the next one, until we have a play or a good paragraph.

This guide follows the TensorFlow text generation tutorial: https://www.tensorflow.org/tutorials/text/text_generation
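
The feedback loop described above, where each prediction becomes the next input, can be sketched without any neural network. The block below is only an illustration in plain Python: it swaps the RNN for a hypothetical table of character bigram counts, and the names `fit_bigrams` and `generate` are made up for this example.

```python
import random
from collections import Counter, defaultdict

def fit_bigrams(text):
    """Count, for every character, which characters follow it."""
    follows = defaultdict(Counter)
    for current, nxt in zip(text, text[1:]):
        follows[current][nxt] += 1
    return follows

def generate(follows, start, length, rng):
    """Repeatedly sample a next character and feed it back in as the new input."""
    out = start
    current = start[-1]
    for _ in range(length):
        options = follows.get(current)
        if not options:  # dead end: this character was never followed by anything
            break
        chars, counts = zip(*options.items())
        current = rng.choices(chars, weights=counts, k=1)[0]
        out += current
    return out

toy_follows = fit_bigrams("speak, speak. hear me speak.")
toy_sample = generate(toy_follows, "s", 20, random.Random(0))
print(toy_sample)
```

The RNN built in this notebook plays the role of the bigram table, but it conditions on the whole preceding sequence through its hidden state rather than on a single character.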

#### Make sure TensorFlow is installed; if it is not, run this
!pip install tensorflow  # if you're in a notebook

In [75]:
from keras.preprocessing import sequence
import keras
import tensorflow as tf
import os
import numpy as np


## Data Set
For this example we only need one piece of data. In fact, we could write our own poem and pass that to the network for training. To keep things easy, let's use an extract from a Shakespeare play.

In [76]:
path_to_file = tf.keras.utils.get_file("shakespeare.txt", "https://storage.googleapis.com/download.tensorflow.org/data/shakespeare.txt")

#### Loading Your Own Data
To load your own data, upload a file from the dialog below, then follow the same steps as above using this new file instead.

Uncomment the code below to use it, and make sure the file you load from your machine is a .txt file.

In [77]:

#from google.colab import files
#path_to_file =  list(files.upload().keys())[0]

## Read Contents of file

In [78]:
#Read, then decode for py2 compat
text = open(path_to_file, 'rb').read().decode(encoding='utf-8')

# Length of text is the number of characters in it
print (" length of text: {} characters".format(len(text)))

 length of text: 1115394 characters


In [79]:
# first 250 Characters in text
print(text[:250])

First Citizen:
Before we proceed any further, hear me speak.

All:
Speak, speak.

First Citizen:
You are all resolved rather to die than to famish?

All:
Resolved. resolved.

First Citizen:
First, you know Caius Marcius is chief enemy to the people.



### Encoding
Since this text isn't encoded yet, we'll need to do that ourselves. We are going to encode each unique character as a different integer.

In [80]:
vocab = sorted(set(text))
#creating a mapping from unique characters to indices
char2idx = {u:i for i, u in enumerate(vocab)}
idx2char = np.array(vocab)

def text_to_int(text):
    return np.array([char2idx[c] for c in text])
text_as_int = text_to_int(text)


In [81]:
# let's look at how part of our text is encoded
print('text: ', text[:13])
print("Encoded: ",text_to_int(text[:13]))

text:  First Citizen
Encoded:  [18 47 56 57 58  1 15 47 58 47 64 43 52]


##### And here we will make a function that can convert our numeric values to text

In [82]:
def int_to_text(ints):
    try:
        ints = ints.numpy()  # convert a tensor to a NumPy array
    except AttributeError:  # already a NumPy array or a list
        pass
    return ''.join(idx2char[ints])
print(int_to_text(text_as_int[:13]))

First Citizen
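
As a quick sanity check of the encoding idea, here is a minimal round trip in plain Python on a single string. It is a sketch only: the `toy_` names are hypothetical, and because the toy vocabulary is built from 13 characters rather than the full play, the integer codes will differ from the ones printed above.

```python
sample = "First Citizen"

# Build a vocabulary from the sample alone (the notebook uses the whole text).
toy_vocab = sorted(set(sample))
toy_char2idx = {ch: i for i, ch in enumerate(toy_vocab)}
toy_idx2char = toy_vocab  # a list works as an index-to-character lookup

encoded = [toy_char2idx[ch] for ch in sample]
decoded = "".join(toy_idx2char[i] for i in encoded)

print(toy_vocab)
print(encoded)
print(decoded)
```

Encoding followed by decoding should always return the original string, which is the property `text_to_int` and `int_to_text` rely on.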


## Creating Training Examples
Remember, our task is to feed the model a sequence and have it return the next character. This means we need to split our text data from above into shorter sequences that we can pass to the model as training examples.

Each training example uses a sequence of seq_length characters as input and a sequence of seq_length characters as output, where the output is the input shifted one character to the right. For example:

input: Hell | output: ello

Our first task is to create a stream of characters from our text data.

In [83]:
seq_length = 100 # length of sequence for a training example
examples_per_epoch = len(text)//(seq_length + 1)

#Create training examples / targets
char_dataset = tf.data.Dataset.from_tensor_slices(text_as_int)

##### Next we can use the batch method to turn this stream of characters into batches of the desired length

In [84]:
sequences = char_dataset.batch(seq_length + 1, drop_remainder=True)

##### Now we take these sequences of length seq_length + 1 and split them into input and output.

In [85]:
def split_input_target(chunk): # FOR example: hello
    input_text = chunk[:-1] # hell
    target_text = chunk[1:] # ello
    return input_text, target_text # hell, ello

dataset = sequences.map(split_input_target) # we use map to apply the above function to every entry

In [86]:
for x, y in dataset.take(2):
    print("\n\nEXAMPLE\n")
    print("INPUT")
    print(int_to_text(x))
    print("\nOUTPUT")
    print(int_to_text(y))



EXAMPLE

INPUT
First Citizen:
Befor

OUTPUT
irst Citizen:
Before


EXAMPLE

INPUT
 we proceed any furt

OUTPUT
we proceed any furth
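
The chunk-then-shift step can be reproduced on a tiny string in plain Python. This is a sketch of the same idea as `batch(..., drop_remainder=True)` followed by `split_input_target`, not the tf.data implementation; `chunk`, `split_pair` and `toy_seq_length` are names invented for the illustration.

```python
def chunk(seq, size):
    """Split seq into consecutive pieces of exactly `size` items, dropping any remainder."""
    return [seq[i:i + size] for i in range(0, len(seq) - size + 1, size)]

def split_pair(piece):
    """Input is everything but the last item; target is everything but the first."""
    return piece[:-1], piece[1:]

toy_seq_length = 4
pairs = [split_pair(p) for p in chunk("Hello world", toy_seq_length + 1)]
for inp, target in pairs:
    print(repr(inp), "->", repr(target))
```

With an 11-character string and chunks of 5, the final "d" does not fill a chunk and is dropped, which mirrors what drop_remainder=True does to the tail of the play.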


##### Finally we need to make training batches

In [87]:
BATCH_SIZE = 64
VOCAB_SIZE = len(vocab) # vocab is number of unique Characters
EMBEDDING_DIM = 256
RNN_UNITS = 1024

# Buffer size to shuffle the dataset
# (TF data is designed to work with possibly infinite sequences,
#  so it doesn't attempt to shuffle the entire sequence in memory. Instead,
#  it maintains a buffer in which it shuffles elements)

BUFFER_SIZE = 10000

data = dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder = True)
print(data)

<_BatchDataset element_spec=(TensorSpec(shape=(64, 20), dtype=tf.int64, name=None), TensorSpec(shape=(64, 20), dtype=tf.int64, name=None))>
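
The buffer-based shuffle mentioned in the comment above can be sketched in plain Python. This is an illustration of the general idea under simplifying assumptions, not tf.data's actual implementation; `buffered_shuffle` is a hypothetical helper.

```python
import random

def buffered_shuffle(items, buffer_size, rng):
    """Yield items in a locally shuffled order using a fixed-size buffer."""
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            buffer.append(item)  # still filling the buffer
            continue
        # Buffer is full: emit a random element and put the new item in its slot.
        slot = rng.randrange(buffer_size)
        yield buffer[slot]
        buffer[slot] = item
    # Input exhausted: emit whatever is left, in random order.
    rng.shuffle(buffer)
    yield from buffer

shuffled = list(buffered_shuffle(range(10), buffer_size=3, rng=random.Random(0)))
print(shuffled)
```

Every element comes out exactly once, but an element can only move as far as the buffer allows. A buffer of size 1 leaves the order unchanged, and a buffer at least as large as the dataset gives a full shuffle.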


### Building the Model
Now we will build the model from an embedding layer, an LSTM layer and one dense layer that contains a node for each unique character in our training data. The dense layer gives us a score (logit) for every character, which can be turned into a probability distribution over the vocabulary.

In [88]:
# We build the model inside a function because we train on batches of 64 sequences now and rebuild it later with a batch size of 1 for generation
def build_model(vocab_size, embedding_dim, rnn_units, batch_size):
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, embedding_dim,
                                batch_input_shape=[batch_size, None]),
        tf.keras.layers.LSTM(rnn_units,
                            return_sequences= True,
                            stateful = True,
                            recurrent_initializer = 'glorot_uniform'),
        tf.keras.layers.Dense(vocab_size)

    ])
    return model
model = build_model(VOCAB_SIZE,EMBEDDING_DIM,RNN_UNITS,BATCH_SIZE)
model.summary()

Model: "sequential_4"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding_4 (Embedding)     (64, None, 256)           16640     
                                                                 
 lstm_4 (LSTM)               (64, None, 1024)          5246976   
                                                                 
 dense_4 (Dense)             (64, None, 65)            66625     
                                                                 
Total params: 5,330,241
Trainable params: 5,330,241
Non-trainable params: 0
_________________________________________________________________
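
The parameter counts in the summary can be checked by hand. The sketch below assumes the standard LSTM layout of four gates, each with input weights, recurrent weights and a bias vector; the variable names are chosen for this example only.

```python
vocab_size, embedding_dim, rnn_units = 65, 256, 1024

# One embedding vector per character in the vocabulary.
embedding_params = vocab_size * embedding_dim

# 4 gates, each with input weights, recurrent weights and a bias vector.
lstm_params = 4 * ((embedding_dim + rnn_units) * rnn_units + rnn_units)

# A weight per (LSTM unit, character) pair plus one bias per character.
dense_params = rnn_units * vocab_size + vocab_size

total = embedding_params + lstm_params + dense_params
print(embedding_params, lstm_params, dense_params, total)
# 16640 5246976 66625 5330241
```

These match the Param # column and the total reported by model.summary(), and they show that almost all of the weights sit in the LSTM layer.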


## Creating a Loss Function
We will create our own loss function. This is because our model outputs a tensor of shape (64, sequence_length, 65) that holds a score (logit) for each character in the vocabulary, at each time step, for every sequence in the batch.

Before we do this, let's first look at the input and output of our untrained model so we can understand what the model is actually giving us.

In [89]:
for input_example_batch, target_example_batch in data.take(1):
    example_batch_predictions = model(input_example_batch)  # ask our model for a prediction on our first batch of training data
    print(example_batch_predictions.shape, " #(batch_size, sequence_length, vocab_size)")# print out the output shape



(64, 20, 65)  #(batch_size, sequence_length, vocab_size)


In [90]:
# we can see that the prediction is an array of 64 arrays, one for each entry in the batch
print(len(example_batch_predictions))
print(example_batch_predictions)

64
tf.Tensor(
[[[-2.9305695e-05 -4.2956294e-03 -5.2133258e-03 ...  9.5791626e-04
   -3.4533191e-04  3.2098237e-03]
  [-1.3752125e-03 -5.8074058e-03 -5.7119508e-03 ...  2.7252191e-03
   -1.4251822e-03  1.1254540e-02]
  [ 5.8382819e-03 -6.6267042e-03 -3.8421443e-03 ...  1.3222675e-03
    3.1737301e-03  8.9204768e-03]
  ...
  [ 8.2987072e-03  1.0339039e-02  1.2308577e-02 ...  4.0844036e-03
    8.1793619e-03  2.5647180e-04]
  [-1.1709656e-03  1.1991293e-02  4.1806553e-03 ... -2.0262958e-03
    1.1298696e-02  2.6900638e-03]
  [-3.6952903e-03  1.2396922e-02  1.1973148e-02 ...  1.3705932e-03
    1.1277175e-02  1.5380629e-03]]

 [[-2.9305695e-05 -4.2956294e-03 -5.2133258e-03 ...  9.5791626e-04
   -3.4533191e-04  3.2098237e-03]
  [-5.8811123e-04 -5.6665693e-04 -4.3918639e-03 ...  2.6377849e-04
   -6.3296678e-03  4.3960810e-03]
  [ 1.1733968e-03 -3.7005169e-03 -2.3881635e-03 ...  7.7000959e-03
   -8.7662712e-03 -8.3507347e-04]
  ...
  [-9.1525670e-03  1.3997662e-02  8.5285744e-03 ... -2.1275708e

In [91]:
# Lets examine one prediction
pred = example_batch_predictions[0]
print (len(pred))
print(pred)
# Notice this is a 2D array of shape (sequence_length, vocab_size), where each inner array is the prediction for the next character at that time step

20
tf.Tensor(
[[-2.9305695e-05 -4.2956294e-03 -5.2133258e-03 ...  9.5791626e-04
  -3.4533191e-04  3.2098237e-03]
 [-1.3752125e-03 -5.8074058e-03 -5.7119508e-03 ...  2.7252191e-03
  -1.4251822e-03  1.1254540e-02]
 [ 5.8382819e-03 -6.6267042e-03 -3.8421443e-03 ...  1.3222675e-03
   3.1737301e-03  8.9204768e-03]
 ...
 [ 8.2987072e-03  1.0339039e-02  1.2308577e-02 ...  4.0844036e-03
   8.1793619e-03  2.5647180e-04]
 [-1.1709656e-03  1.1991293e-02  4.1806553e-03 ... -2.0262958e-03
   1.1298696e-02  2.6900638e-03]
 [-3.6952903e-03  1.2396922e-02  1.1973148e-02 ...  1.3705932e-03
   1.1277175e-02  1.5380629e-03]], shape=(20, 65), dtype=float32)


In [92]:
# And finally we'll look at the prediction at the first time step
time_pred = pred[0]
print(len(time_pred))
print(time_pred)
# And of course it's 65 values, one score (logit) per character in the vocabulary; a higher score means that character is more likely to come next

65
tf.Tensor(
[-2.9305695e-05 -4.2956294e-03 -5.2133258e-03 -1.5957048e-03
 -2.6199180e-03 -6.3154206e-05  2.1698875e-03 -1.5188932e-03
  1.5875272e-03  3.0522957e-03  7.5091573e-04  4.7875801e-04
  1.6347948e-03  5.2597430e-03 -1.7516541e-03 -1.1778957e-03
  2.4833458e-03 -3.6216460e-03 -1.3057920e-03 -7.4356874e-03
 -4.6790414e-03 -8.2352874e-04 -2.7526062e-04  1.1525935e-04
  8.9232239e-04  8.3908549e-04 -1.3037049e-03 -2.3174258e-03
  1.3266376e-04 -3.4023642e-03  1.4815219e-03 -5.2332985e-03
 -3.4534530e-04 -4.2564035e-03 -1.6735424e-03 -1.6155639e-03
 -2.0279887e-03  2.2832681e-03 -2.3553397e-03 -2.8063357e-04
 -2.3779168e-03  4.0577451e-04 -1.6055136e-03  9.6513890e-04
  1.8542563e-04 -1.0320484e-03  4.3390971e-04  1.9718506e-03
  4.0854369e-03 -2.8712134e-04 -1.1179373e-03 -1.1089054e-03
  5.7534967e-03 -6.3680205e-03 -3.7487708e-03  3.3414911e-03
 -1.2157069e-03  5.4902229e-03  2.0733010e-04  3.1038190e-03
 -8.7722734e-04 -7.6791632e-04  9.5791626e-04 -3.4533191e-04
  3.209823

In [93]:
# If we want to determine the predicted character we need to sample from the output distribution (pick a value based on the probabilities)
sampled_indices = tf.random.categorical(pred, num_samples=1)

# Now we can reshape that array and convert all the integers to characters to see the actual text
sampled_indices = np.reshape(sampled_indices, (1, -1))[0]
predicted_chars = int_to_text(sampled_indices)

predicted_chars # and this is what the model predicted for training sequence 1

'jwAUsXkPUf?K !SHCrvP'

##### Our model outputs a 3D array of logits, so we wrap a built-in loss to handle it
Keras's sparse_categorical_crossentropy can compare this output to the expected output and give us a numerical value representing how far apart the two are, but by default it expects probabilities. Because our model returns raw logits, we define our own loss function that sets from_logits=True.

In [94]:
def loss(labels, logits):
    return tf.keras.losses.sparse_categorical_crossentropy(labels,logits, from_logits= True)
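
To make the loss less of a black box, here is a hand-written version of sparse categorical cross-entropy on logits for a single time step, in plain Python. It is a sketch of the formula, not Keras's implementation, and `sparse_cross_entropy_from_logits` is a name made up for this example.

```python
import math

def sparse_cross_entropy_from_logits(label, logits):
    """Negative log of the softmax probability assigned to the true class."""
    peak = max(logits)  # subtract the max for numerical stability
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[label]

uniform_logits = [0.0] * 65       # an untrained model: every character equally likely
confident_logits = [0.0] * 65
confident_logits[18] = 10.0       # strongly favour class 18

loss_uniform = sparse_cross_entropy_from_logits(18, uniform_logits)
loss_right = sparse_cross_entropy_from_logits(18, confident_logits)
loss_wrong = sparse_cross_entropy_from_logits(3, confident_logits)
print(loss_uniform, loss_right, loss_wrong)
```

With 65 equally likely characters the loss is ln(65), roughly 4.17, so an untrained model whose logits are all close to zero should start near that value. The loss approaches 0 when the model is confident and correct, and grows large when it is confident and wrong.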

### Compiling the model
At this point we can think of our problem as a classification problem where the model predicts the probability of each unique character coming next.

In [95]:
model.compile(optimizer = 'adam', loss = loss)

### Creating Checkpoints
Now we are going to set up and configure our model to save checkpoints as it trains. This will allow us to load our model from a checkpoint and continue training it.

In [96]:
# Directory where the checkpoints will be saved
checkpoint_dir = './training_checkpoints'

# Name of the checkpoint files
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt_{epoch}")
checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath = checkpoint_prefix,
    save_weights_only=True
)
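
The {epoch} placeholder in the checkpoint name is filled in with the epoch number when a checkpoint is saved. The plain-Python sketch below only shows how the prefix expands; the exact files written to disk depend on the TensorFlow version and are not reproduced here.

```python
import os

checkpoint_dir = "./training_checkpoints"
checkpoint_prefix = os.path.join(checkpoint_dir, "ckpt_{epoch}")

# Expand the placeholder for a few epoch numbers to see the resulting prefixes.
expanded = [checkpoint_prefix.format(epoch=epoch) for epoch in (1, 2, 50)]
for name in expanded:
    print(name)
```

Because each epoch gets its own prefix, earlier checkpoints are kept rather than overwritten, which is what makes it possible to load a specific one later.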

## Training Model

In [97]:
history = model.fit(data, epochs = 50, callbacks= [checkpoint_callback])

Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 18/100
Epoch 19/100
Epoch 20/100
Epoch 21/100
Epoch 22/100
Epoch 23/100
Epoch 24/100
Epoch 25/100
Epoch 26/100
Epoch 27/100
Epoch 28/100
Epoch 29/100
Epoch 30/100
Epoch 31/100
Epoch 32/100
Epoch 33/100
Epoch 34/100
Epoch 35/100
Epoch 36/100
Epoch 37/100
Epoch 38/100
Epoch 39/100
Epoch 40/100
Epoch 41/100
Epoch 42/100
Epoch 43/100
Epoch 44/100
Epoch 45/100
Epoch 46/100
Epoch 47/100
Epoch 48/100
Epoch 49/100
Epoch 50/100
Epoch 51/100
Epoch 52/100
Epoch 53/100
Epoch 54/100
Epoch 55/100
Epoch 56/100
Epoch 57/100
Epoch 58/100
Epoch 59/100
Epoch 60/100
Epoch 61/100
Epoch 62/100
Epoch 63/100
Epoch 64/100
Epoch 65/100
Epoch 66/100
Epoch 67/100
Epoch 68/100
Epoch 69/100
Epoch 70/100
Epoch 71/100
Epoch 72/100
Epoch 73/100
Epoch 74/100
Epoch 75/100
Epoch 76/100
Epoch 77/100
Epoch 78

### Loading the Model
We will rebuild the model from a checkpoint using a batch_size of 1 so that we can feed one piece of text to the model and have it make a prediction.

In [98]:
model = build_model(VOCAB_SIZE, EMBEDDING_DIM, RNN_UNITS, batch_size = 1)

Once the model has finished training, we can find the latest checkpoint that stores the model weights using the following line.

In [99]:
model.load_weights(tf.train.latest_checkpoint(checkpoint_dir))
model.build(tf.TensorShape([1, None]))

We can load any checkpoint we want by specifying the exact file to load.

Use this code instead of the code above if you want to load a specific checkpoint.

In [100]:
#checkpoint_num =  10
#model.load_weights("./training_checkpoints/ckpt_" + str(checkpoint_num))
#model.build(tf.TensorShape([1, None]))

## Generating text
Now we can use this function provided by TensorFlow to generate some text using any starting string we'd like.

In [101]:
def generate_text(model, start_string):
    # Evaluation step (generating text using the learned model)

    # Number of characters to generate
    num_generate = 800

    # Converting our start string to numbers (vectorizing)
    input_eval = [char2idx[s] for s in start_string]
    input_eval = tf.expand_dims(input_eval, 0)

    # Empty list to store our results
    text_generated = []

    # Low temperatures result in more predictable text
    # Higher temperatures result in more surprising text
    # Experiment to find the best setting
    temperature = 1.0

    # Here batch size == 1
    model.reset_states()
    for i in range(num_generate):
        predictions = model(input_eval)
        # Remove the batch dimension
        predictions = tf.squeeze(predictions, 0)

        # Scale the logits by the temperature, then sample the next character
        # from the resulting categorical distribution
        predictions = predictions / temperature
        predicted_id = tf.random.categorical(predictions, num_samples=1)[-1, 0].numpy()

        # We pass the predicted character as the next input to the model
        # along with the previous hidden state
        input_eval = tf.expand_dims([predicted_id], 0)

        text_generated.append(idx2char[predicted_id])

    return start_string + ''.join(text_generated)
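
The effect of the temperature setting is easier to see on a small example. The sketch below, in plain Python, divides three made-up logits by different temperatures before applying softmax; `softmax_with_temperature` and `toy_logits` are illustrative names and not part of the notebook.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature, then normalize them into probabilities."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

toy_logits = [2.0, 1.0, 0.0]
probs_by_temperature = {t: softmax_with_temperature(toy_logits, t) for t in (0.5, 1.0, 2.0)}
for t, probs in probs_by_temperature.items():
    print(t, [round(p, 3) for p in probs])
```

A temperature below 1 concentrates probability on the highest-scoring character, so sampling becomes more predictable. A temperature above 1 flattens the distribution, so less likely characters are picked more often.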



In [103]:
inp = input("Type a starting string: ")
print(generate_text(model, inp))

Type a starting string: hello
hellows.

GONZALO:
Not in my part.

KATHARINA:
Love me one.

GRUMIO:
He wants here you you Val were a princen
my honour do thee.

GONZALO:
What, slave!

Provost:
Is the loving boar-pent not Let them blose him I take
ill smile.

LUCIO:
I pray thee, met'st me infurning him s: the arment make up myself
That thus to go, both your majesty cast out, the
horsed cloak--
For that, I have heard it but by?

BAPTISTA:
As ging eyes of golden,
Even to the stones abreath
As to leave your honour.

Citizens:'
Be safers that marks the news abruction of me;
Lone him: I challed one lies more, or again and sweat
And hag seven'd in aide. Katharina, that by this time will command.

PRINCE:
Come hither.
Good Gloucester,
Though all the weakill show
I live not with him; he cabrings; which s he shall yitwere to Curtons. 


## CONCLUSION
To improve the predictions, increase the number of epochs; a longer training text also helps.

In this setting it is hard to overtrain the model, because we want it to learn the language as thoroughly as possible for better predictions.

You want the loss to be as low as possible, so train for more epochs, use more detailed training data, or increase the number of batches by reducing the sequence length.
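
The link between sequence length and the number of batches can be checked with simple arithmetic. The sketch below uses the text length printed earlier (1115394 characters) and the batch size of 64; `batches_per_epoch` is a hypothetical helper, and the count assumes incomplete sequences and batches are dropped, as with drop_remainder=True.

```python
text_length = 1115394  # character count printed earlier in this notebook
batch_size = 64

def batches_per_epoch(seq_length):
    """Return (training sequences, full batches) for a given sequence length."""
    # Each training chunk holds seq_length + 1 characters (input plus one shifted target).
    sequences = text_length // (seq_length + 1)
    # Only full batches are kept.
    return sequences, sequences // batch_size

for seq_length in (100, 20):
    sequences, batches = batches_per_epoch(seq_length)
    print(seq_length, sequences, batches)
```

Going from a sequence length of 100 to 20 raises the number of batches per epoch from 172 to 829, so the weights are updated more often per pass over the text, at the cost of giving the model less context in each example.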